Morphometric Analysis of Fibroblast-to-Myofibroblast Transition for Fibrotic Diagnoses; How Training Data Shaped Bias in IBM’s Watson Health Clinical Failure

Carvalho, Alicia

Author

Carvalho, Alicia, School of Engineering and Applied Science, University of Virginia

Advisors

Earle, Joshua, EN-Engineering and Society, University of Virginia
Cai, Liheng, EN-Mat Sci & Engr Dept, University of Virginia

Abstract

Technical Project: Morphometric Analysis of Fibroblast to Myofibroblast Transition
Fibrosis is a progressive disease and an endpoint of many medical complications; in the lung, it is characterized by excessive scarring of tissue. At the cellular level, the pathological behavior is driven by fibroblast-to-myofibroblast transition (FMT), in which resting-state fibroblasts are activated into myofibroblasts: contractile, α-smooth muscle actin (α-SMA)-expressing cells that modify the behavior of the extracellular matrix. One previously identified driver of FMT is transforming growth factor-beta (TGF-β), a cytokine that promotes myofibroblast activation and downstream remodeling of the extracellular matrix. In this work, TGF-β and its inhibitor SB431542 are used to promote and suppress FMT, respectively, in primary human lung fibroblast cultures, providing an in vitro model for evaluating cytoskeletal descriptors associated with the development of fibrosis.
This capstone project focuses on two interconnected goals: optimizing the immunofluorescence laboratory protocol for staining primary human lung fibroblasts (HLFs), and developing an automated computational image analysis pipeline for morphometric analysis of FMT. Cell culture and immunofluorescence conditions for primary cell samples were modified extensively and iteratively to produce reliable, high-quality images of primary HLFs stained for α-SMA and with phalloidin, with immunofluorescence intensity and coherency serving as in vitro cytoskeletal and morphological indicators of FMT. Manual analysis of this dataset in ImageJ indicated a clear distinction between treatment groups in the 4-day and 7-day sample sets. Building on the manual analysis, a deep-learning-based automated segmentation pipeline using the Meta Segment Anything Model 2 (SAM2) segmented individual cells in the images and analyzed key morphometric features to assess the relationship between fibroblasts and activated myofibroblasts. Once verified and validated, the pipeline will be applied to training datasets drawn from additional wet-lab immunofluorescence experiments to produce a generalizable lab tool for quantifying FMT in fibroblast cell populations, which are often heterogeneous.
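The abstract does not say how intensity and coherency are computed, so the following is only a minimal sketch under stated assumptions. It uses a common structure-tensor definition of coherency (0 for isotropic texture, 1 for perfectly aligned fibers) evaluated over one segmented cell. The function names `region_coherency` and `region_mean_intensity`, and the use of a boolean mask as the segmentation output, are hypothetical and are not taken from the project.

```python
import numpy as np


def region_coherency(image, mask=None):
    """Structure-tensor coherency of the pixels selected by `mask`.

    Assumed definition (not confirmed by the source):
    sqrt((Jyy - Jxx)^2 + 4*Jxy^2) / (Jxx + Jyy),
    where Jxx, Jyy, Jxy are region averages of gradient products.
    """
    img = np.asarray(image, dtype=float)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    gy, gx = np.gradient(img)
    if mask is None:
        mask = np.ones(img.shape, dtype=bool)
    mask = np.asarray(mask, dtype=bool)

    jxx = np.mean(gx[mask] ** 2)
    jyy = np.mean(gy[mask] ** 2)
    jxy = np.mean(gx[mask] * gy[mask])

    trace = jxx + jyy
    if trace == 0:
        # A perfectly flat region has no orientation information.
        return 0.0
    return float(np.sqrt((jyy - jxx) ** 2 + 4.0 * jxy ** 2) / trace)


def region_mean_intensity(image, mask):
    """Mean fluorescence intensity of the pixels selected by `mask`."""
    img = np.asarray(image, dtype=float)
    return float(img[np.asarray(mask, dtype=bool)].mean())


if __name__ == "__main__":
    # Synthetic stand-ins for a stained cell: aligned stripes vs. random texture.
    stripes = np.sin(np.linspace(0.0, 20.0, 128))[None, :] * np.ones((128, 1))
    noise = np.random.default_rng(0).normal(size=(256, 256))
    print("stripes:", region_coherency(stripes))
    print("noise:  ", region_coherency(noise))
```

In a sketch like this, a per-cell mask from the segmentation step would be passed as `mask`, so that each cell yields one coherency value and one mean intensity value for comparison across treatment groups.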
STS Project: How Training Data Shaped Bias in IBM’s Watson Health Clinical Failure
IBM’s Watson Health for Oncology was one of the earliest and most prominent investments by a large technology company in applying artificial intelligence to clinical care. Launched following the success of Watson on the quiz show Jeopardy! in 2011, the platform was promoted as delivering personalized, evidence-based treatment recommendations for oncology cases. Despite a powerful partnership with the research and medical institution Memorial Sloan Kettering Cancer Center (MSKCC) and significant institutional investment, Watson Health for Oncology failed to demonstrate technological capability or achieve clinical adoption, and IBM divested it in 2022.
This sociotechnical thesis analyzes Watson Health’s failure through the lens of Actor-Network Theory (ANT), an ethical framework developed by Latour, Callon, and Law that examines technological inventions and outcomes as results of interconnected networks of actors, both human and non-human. In the Watson Health network, key non-human actors included the training data, the Watson algorithm, and the absence of real-world clinical image data. Key human actors in the case of Watson Health included IBM researchers and engineers, MSKCC clinicians, and clinicians within the healthcare network. 
Sociotechnical Synthesis
Together, these two projects present a tension in computational tools in medicine: the promise of data-driven tools to innovate and transform biological understanding, and the risk that these tools will fail due to flaws and inaccuracies in the data they rely on. The technical project addresses this concern at the preclinical stage by constructing an automated, quantitative pipeline for characterizing fibroblast-to-myofibroblast transition, a biological process with direct implications for the diagnosis and subsequent treatment of fibrotic diseases. By basing this analysis on morphometric features derived from the cytoskeletal structure of the cells, the work provides a computational tool that is built on biologically meaningful and reproducible data. Improvement and optimization of staining protocols, together with systematic feature extraction, make up the effort to avoid the failures that ultimately led to the divestment of Watson Health.
The STS project’s Actor-Network Theory analysis of Watson Health reveals how both human and non-human actors, particularly training data, shape patient outcomes in modern medicine. Watson’s reliance on curated cases introduced bias directly into the recommendation tool, causing it to produce outputs that were misaligned with the patient populations it was meant to serve. This case demonstrates that flawed or unrepresentative data cannot be compensated for by any amount of computational innovation. Through this lens, the automated morphometric pipeline emphasizes quantification of primary fibroblasts through treatment-controlled experimental design and comparison with established biological markers. This represents movement toward computational tools whose inputs and outputs can be interrogated, validated, and ultimately trusted by the networks they are a part of.

Degree

BS (Bachelor of Science)

Keywords

fibrosis; fibroblast-to-myofibroblast transition; image analysis pipeline

Language

English

Rights

Issued Date

2026-05-11

Persistent Link

https://doi.org/10.18130/q021-8s18

Suggested Citation

Carvalho, Alicia. Morphometric Analysis of Fibroblast-to-Myofibroblast Transition for Fibrotic Diagnoses; How Training Data Shaped Bias in IBM’s Watson Health Clinical Failure. University of Virginia, School of Engineering and Applied Science, BS (Bachelor of Science), 2026-05-11, https://doi.org/10.18130/q021-8s18.

Files

This item is restricted to UVA until 2031-05-11.