Representation Learning of Longitudinal Electronic Health Record Data for Patient Characterization and Prediction of Health Outcomes

Author:
Zhang, Jinghe, Systems Engineering - School of Engineering and Applied Science, University of Virginia
Barnes, Laura, Department of Systems and Information Engineering, University of Virginia

The widespread implementation of electronic health record (EHR) systems facilitates the collection of large-scale health data from real clinical settings. Despite the significant increase in EHR adoption, this data remains largely unexplored, even though it is a rich source for knowledge discovery from patient health histories in tasks such as understanding disease correlations and predicting health outcomes. However, the heterogeneity, sparsity, noise, and bias in this data present many complex challenges, which make it difficult to translate the potentially relevant information into inputs that machine learning algorithms can use.

To that end, this research contributes to the interpretable representation of complex, sparse, high-dimensional data composed of various medical events, such as diagnoses, medications, and procedures. The goal of this dissertation is to propose new computational frameworks for representing longitudinal EHR data in order to improve patient characterization and develop optimized predictive models. To illustrate the utility of the proposed frameworks, the designed algorithms are applied to a variety of risk prediction problems, including the early detection of diabetes, comorbid risk prediction of chronic diseases, and prediction of hospitalization. Furthermore, the designed algorithms are evaluated against other state-of-the-art representation approaches, and the learned representations are visualized and interpreted to deepen clinical insights. In addition to assisting clinical decision making, the methods proposed in this research could be applied to other complex temporal knowledge representation tasks within and outside the healthcare domain.

PHD (Doctor of Philosophy)
Representation learning, predictive modeling, data mining, health informatics, electronic health record
All rights reserved (no additional license for public reuse)
Issued Date: