"Computational analysis of high throughput sequencing data-Applications to DNA and RNA studies"

Malhotra, Ankit, Department of Biochemistry and Molecular Genetics, University of Virginia
Dutta, Anindya, Department of Biochemistry and Molecular Genetics, University of Virginia

For a long time, the state-of-the-art DNA sequencing technology was capillary-based Sanger sequencing. However, the advent of high-throughput massively parallel sequencing (MPS) technologies in 2005 has revolutionized the field of genomics. It has given us the tools to investigate a whole genome's worth of DNA sequence in a very short period of time and at a relatively low cost. Both the cost and the amount of sequencing have kept pace with Moore's Law for the last few years, and are expected to continue to do so into the next decade, making the dream of personal genetics possible. It has been a challenge to come up with innovative and optimal solutions to the technical and bioinformatic problems of interpreting MPS data. The new molecular methods that form the basis of these technologies introduce new biases that have to be addressed in our analysis. The vast amounts of sequence data generated pose their own statistical and computational challenges. This dissertation describes my efforts to develop computational and analytical methods for analyzing sequencing data from large-scale genomic studies, with a focus on understanding the molecular basis of cancers. In the first part, I present two methods, AbLink and AbCNV, for studying genome-wide structural variation using high-throughput sequencing methodologies. AbLink and AbCNV are computational pipelines that rely on the sequencing platforms to investigate the genome and can predict chromosomal rearrangement events such as VJ recombinations, insertions, deletions and inversions, as well as genomic loci that are involved in Copy Number Variations (CNV). In the second part of the dissertation, I present a novel method to identify specific gene fusions in multiple patient samples, with application to diagnostic and prognostic analyses.
In the third part, I present a study of short RNA (sRNA) populations from prostate cancer cell lines to identify a miRNA signature for androgen independence and to identify a new class of sRNAs.

Note: Abstract extracted from PDF file via OCR

PHD (Doctor of Philosophy)
All rights reserved (no additional license for public reuse)
Issued Date: