Challenging Dogmas in Biology: Leveraging Chimeric RNA Profiling to Discover Hidden Expression in Healthy Tissues and Cancer
Elfman, Justin, Biochemistry and Molecular Genetics - School of Medicine, University of Virginia
Li, Hui, MD-PATH Experimental Pathology, University of Virginia
Gene fusions and their chimeric products are commonly associated with cancer, yet recent studies have also identified chimeric transcripts in non-cancerous tissues and cell lines. Similarly, efforts to annotate structural variation have uncovered gene fusion events in healthy populations, including some capable of generating chimeric transcripts. Despite these efforts, characterization of chimeric RNAs in healthy tissues is not comprehensive, and significant gaps remain in chimeric RNA annotation. To address these gaps, this dissertation seeks to characterize and catalogue the roles of chimeric RNAs in healthy individuals, and to leverage this knowledge to explore their dysregulation in cancer.
In Chapter 2, we adopt a bottom-up approach to target population-specific chimeric RNAs, identifying 58 such instances within the GTEx cohort. We provide direct evidence for 29 additional polymorphic chimeric RNAs associated with structural variants, uncovering 13 novel rare structural variants. Additionally, using the All of Us dataset and a large cohort of clinical samples, we examine the association between the SUZ12P1-CRLF3-causing variant and patient phenotypes, demonstrating that population-specific fusion transcripts can be used to identify elusive structural variants.
In Chapter 3, we explore the expression of chimeric RNAs in a tissue-specific context, integrating data from four leading software packages. We identify 54,576 chimeric RNAs from 22,814 gene pairs and analyze them with respect to tissue specificity. Our findings reveal that chimeric RNAs exhibit higher tissue specificity than the broader transcriptome. This dataset is validated with single-molecule long-read sequencing, which provides in silico validation for 1,946 chimeric junctions, alongside in vitro validation for 25. The sequencing data are also leveraged to form the first complete transcriptome of chimeric RNAs in the testis. We annotate 8,562 unique isoforms from 1,569 genes, including 8,394 novel isoforms. Additionally, we identify high-confidence peptide matches for 1,143 fusion peptides, 733 of which are found specifically in testis or cancer. Furthermore, we explore the potential application of these data for identifying cancer-enriched chimeric RNAs, cancer-testis chimeric RNAs, and cancer-testis proteins that may serve as cancer-testis antigens. Using stringent filtering criteria, we identify 262 cancer-testis chimeric isoforms from 220 unique fusion gene partners, including 15 with protein-level evidence.
This dissertation presents a thorough examination of chimeric RNAs found within normal tissues and investigates the connection between chimeric RNA expression in testis and in cancer. We establish improved pipelines for de novo chimeric RNA prediction, identification of transcribed gene fusions, tissue-wise chimeric RNA analysis, full-length chimeric RNA annotation, and integration of chimeric RNA predictions with proteomic data. From these methods, we provide comprehensive, first-of-their-kind annotations and datasets to form a foundation for future study of chimeric RNAs. Finally, we demonstrate two avenues for the use of these data: a) categorization of polymorphic chimeric RNAs, with SUZ12P1-CRLF3 characterized as an example, and b) integration of testis-specific chimeric RNA predictions with long-read RNA-seq and proteomic data to establish a panel of chimeric cancer-testis antigen candidates.
PHD (Doctor of Philosophy)
English
All rights reserved (no additional license for public reuse)
2024/07/31