Estimating Cell-Type-Specific Fractions with Autoencoders

Shu, Xin, Computer Science - School of Engineering and Applied Science, University of Virginia
Zhang, Aidong, EN-Comp Science Dept, University of Virginia
Bekiranov, Stefan, MD-BIOC Biochem-Mole Genetics, University of Virginia

In this work, we introduce autoencoder architectures and trace their evolution toward disentangled representation learning. We apply three autoencoder architectures to single-cell RNA sequencing (scRNAseq) data. scRNAseq characterizes cellular heterogeneity by measuring the expression profiles of individual cells. However, it remains relatively expensive compared with bulk RNAseq, which averages expression profiles over many cells of various cell types and cell states and therefore cannot capture cellular heterogeneity. We propose a new deconvolution method, expDC, to estimate cell-type-specific proportions from bulk RNAseq data. Our method estimates cell-type composition in two stages: we first learn reliable, denoised representations for each cell type using scRNAseq data as a reference, then use these representations to deconvolve simulated bulk RNAseq data and infer cell-type-specific proportions. The latent codes of the autoencoders offer additional interpretability for exploring the grouping of cell types. We evaluate our method on three PBMC datasets and find that a shallow autoencoder architecture performs best in deconvolution.
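
The two-stage idea described above can be illustrated with a toy sketch. This is not the expDC implementation: per-cell-type mean profiles stand in for the learned autoencoder representations, a simple projected-gradient non-negative least squares fit stands in for the deconvolution step, and all names and numbers (mean_profiles, deconvolve, the four-gene expression values) are hypothetical.

```python
# Toy two-stage deconvolution sketch (hypothetical; NOT the expDC method).
# Stage 1 uses per-type mean profiles as a stand-in for autoencoder-learned
# representations. Stage 2 fits non-negative mixing weights to a bulk profile.


def mean_profiles(cells, labels):
    """Stage 1 stand-in: average the expression profile of each cell type."""
    sums, counts = {}, {}
    for profile, label in zip(cells, labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for g, value in enumerate(profile):
            acc[g] += value
        counts[label] = counts.get(label, 0) + 1
    return {t: [v / counts[t] for v in acc] for t, acc in sums.items()}


def deconvolve(signatures, bulk, steps=2000, lr=0.005):
    """Stage 2: non-negative least squares via projected gradient descent.

    Minimizes 0.5 * ||sum_t p_t * s_t - bulk||^2 subject to p_t >= 0,
    then rescales the weights so they sum to one.
    """
    types = sorted(signatures)
    p = [1.0 / len(types)] * len(types)
    for _ in range(steps):
        # Residual between the current mixture and the observed bulk profile.
        residual = [
            sum(p[k] * signatures[t][g] for k, t in enumerate(types)) - bulk[g]
            for g in range(len(bulk))
        ]
        for k, t in enumerate(types):
            grad = sum(r * s for r, s in zip(residual, signatures[t]))
            p[k] = max(0.0, p[k] - lr * grad)  # project onto p_t >= 0
    total = sum(p)
    return {t: p[k] / total for k, t in enumerate(types)}


# Made-up single-cell reference: two cells per type, four genes each.
cells = [
    [10, 1, 0, 2], [8, 1, 0, 2],   # "T"
    [0, 9, 1, 2], [0, 7, 1, 2],    # "B"
    [1, 0, 6, 2], [1, 0, 4, 2],    # "Mono"
]
labels = ["T", "T", "B", "B", "Mono", "Mono"]
signatures = mean_profiles(cells, labels)

# Simulated, noise-free bulk profile mixed with known fractions.
true_fractions = {"T": 0.5, "B": 0.3, "Mono": 0.2}
bulk = [
    sum(true_fractions[t] * signatures[t][g] for t in true_fractions)
    for g in range(4)
]

estimated = deconvolve(signatures, bulk)
print({t: round(f, 3) for t, f in estimated.items()})
```

Because the simulated bulk profile here is an exact, noise-free mixture of linearly independent signatures, the estimated fractions should closely match the mixing fractions used to build it. Real data would call for noise handling and a proper solver.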

MS (Master of Science)
autoencoder, single-cell RNAseq, neural network
Issued Date: