Deep Multimodal Representation Learning to Integrate Natural Language Processing with Genomic Interval Data for Tailored Biomedical Discovery; Exploring the Downfall of IBM’s Watson Health: An Actor Network Theory Perspective

Fay, Caitlyn, School of Engineering and Applied Science, University of Virginia
Laugelli, Benjamin, EN-Engineering and Society, University of Virginia
Sheffield, Nathan, EN-Biomed Engr Dept, University of Virginia

My technical work and STS research are linked by their shared focus on machine learning in healthcare. Owing to recent advances in computing power and memory storage, artificial intelligence (AI) tools have more potential than ever before. Machine learning, a subset of AI, can be used to create algorithms that analyze data more efficiently and accurately than current platforms can. Both projects focus on the potential of AI in healthcare: my STS research investigates the failure of IBM Watson Health to enhance the development of my technical project, a representation learning model for tailored biomedical discovery.
My technical project explores applications of machine learning in healthcare through the creation of multimodal representation learning models. My Capstone group has developed multiple models that process natural language and return relevant genomic regions. As new sequencing technologies have emerged in recent years, the amount of data from ATAC-seq and ChIP-seq experiments has increased drastically, creating a clear demand for more complex models that can sort through these data sets and extract relevant information. The goal of this technology is to integrate natural language (text) with genomic data by creating a predictive model that returns relevant genomic regions in response to a user-entered query. This technology will make epigenomic research easier and more precise.
My STS research paper explores machine learning through the failure of International Business Machines Corporation’s (IBM) Watson Health. Watson is a supercomputer that uses AI to combine machine learning and natural language processing. Its failure was multifaceted, stemming from issues including poor training data, weak AI programs, and unstable relationships between the human organizations working on the technology.
In this STS research paper, Actor-Network Theory, developed by scholars Michel Callon and Bruno Latour, is used to examine the human and non-human actors relevant to the collapse of Watson Health. I argue that while technical factors played a role in the downfall of IBM Watson Health, it was primarily the social relationships between groups that caused these technical issues to manifest and derail the project. The goal of this research is to examine the importance of both human behaviors and technical tools in creating a successful machine learning model.
Working on my STS research and technical project simultaneously gave me a deeper understanding of both. My STS research helped me recognize the social responsibilities involved in building an AI model for the healthcare industry, where doctors, patients, and research organizations must all be considered. This made me more attentive to the origins of the training data and to the program development behind the machine learning model I was creating. Overall, working on these projects concurrently heightened my awareness of the social implications of developing AI tools in the healthcare industry.

BS (Bachelor of Science)
Epigenomics, Machine Learning, IBM Watson

School of Engineering and Applied Science

Bachelor of Science in Biomedical Engineering

Technical Advisor: Nathan Sheffield

STS Advisor: Benjamin Laugelli

Technical Team Members: Lily Jones, Zach Mills, Peneeta Wojcik

All rights reserved (no additional license for public reuse)
Issued Date: