Abstract
Cancer clinical trials represent the primary pathway through which new therapies reach patients, yet only about 7 percent of adult cancer patients participate in them. The disparity cuts even deeper across demographic lines: Black patients represent approximately 13 percent of the cancer population but constitute only 4 to 6 percent of trial enrollees, while Hispanic patients face a similar gap. These are not simply logistical failures. They reflect something more fundamental about who gets counted, whose records get reviewed, and whose access to cutting-edge treatment is treated as a priority. At the University of Virginia Comprehensive Cancer Center, the manual process of matching over 300 oncology patients per week to active trials is slow and labor-intensive. The rise of large language model-based systems promises to address this bottleneck through automation. But automation carries its own risks. When an algorithmic system mediates access to potentially lifesaving therapies, the technical decisions embedded in its design are no longer neutral. They encode assumptions about clinical knowledge, patient populations, and what counts as a match. Applying AI to the enrollment problem is therefore not a straightforward solution; it is a redistribution of authority whose equity consequences remain underexamined. Both projects in this thesis portfolio address this shared problem from complementary directions: one through the construction of a more precise and temporally aware matching pipeline, and the other through a sociotechnical investigation of how such systems reshape clinical practice and perpetuate or challenge structural inequity.
The technical project is a large language model-based agentic pipeline to automate the matching of cancer patients to eligible clinical trials. The core technical challenge this work addresses is temporal reasoning. Most existing systems treat eligibility as a static determination made at a single moment, but cancer is a disease defined by continuous progression. A patient may meet eligibility only after completing a specific therapy line, or lose it when their performance status declines past a protocol threshold. Trial criteria frequently encode constraints such as “at least 28 days since last systemic therapy,” requiring the system to reason over intervals and sequences rather than snapshot states. The system combines transformer-based language models fine-tuned on oncology notes with a rules engine that encodes temporal logic for sequencing constraints. It also incorporates alert mechanisms that flag transitions into and out of eligibility windows, transforming the matching workflow from periodic batch screening into continuous monitoring.
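The interval reasoning described above can be illustrated with a minimal sketch. It is not the pipeline's actual rules engine: the names PatientSnapshot, Criteria, is_eligible, and transition are hypothetical, and the ECOG threshold of 1 is an assumed protocol value chosen only for illustration. The 28-day washout figure is taken from the example criterion quoted above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PatientSnapshot:
    """Hypothetical view of one patient's state on a given day."""
    as_of: date                       # date the snapshot describes
    last_systemic_therapy_end: date   # end date of the most recent systemic therapy
    ecog: int                         # ECOG performance status, 0 (best) to 5


@dataclass(frozen=True)
class Criteria:
    """Hypothetical protocol constraints; the values are assumptions."""
    min_washout_days: int = 28  # "at least 28 days since last systemic therapy"
    max_ecog: int = 1           # assumed performance-status threshold


def is_eligible(snapshot: PatientSnapshot, criteria: Criteria) -> bool:
    # Reason over the interval since therapy ended, not a static flag.
    days_since = (snapshot.as_of - snapshot.last_systemic_therapy_end).days
    washout_ok = days_since >= criteria.min_washout_days
    return washout_ok and snapshot.ecog <= criteria.max_ecog


def transition(prev: PatientSnapshot, curr: PatientSnapshot,
               criteria: Criteria) -> Optional[str]:
    """Return "entered" or "exited" when eligibility changes between snapshots."""
    was, now = is_eligible(prev, criteria), is_eligible(curr, criteria)
    if now and not was:
        return "entered"
    if was and not now:
        return "exited"
    return None  # no change, so no alert


criteria = Criteria()
therapy_end = date(2024, 1, 1)
day_27 = PatientSnapshot(date(2024, 1, 28), therapy_end, ecog=1)
day_28 = PatientSnapshot(date(2024, 1, 29), therapy_end, ecog=1)
declined = PatientSnapshot(date(2024, 2, 5), therapy_end, ecog=2)

print(transition(day_27, day_28, criteria))    # washout interval completes
print(transition(day_28, declined, criteria))  # performance status declines
```

Comparing consecutive snapshots, rather than evaluating a single one, is what would let a monitor raise an alert at the moment a patient crosses into or out of an eligibility window.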
The STS research paper, “Algorithmic Authority and Clinical Autonomy in AI-Driven Trial Matching,” examines how AI-driven clinical trial matching systems redistribute epistemic authority in oncology practice and which sociotechnical complexities determine whether they expand access or encode patterns of exclusion. The analysis draws on the NASSS framework, which examines technology implementation across seven interdependent domains: the clinical condition, the technology itself, the stakeholder value proposition, intended adopters, organizational context, regulatory environment, and adaptive capacity. It also employs a mutual shaping perspective that treats algorithmic systems and clinical environments as continuously reshaping one another. A secondary lens on deskilling examines what happens over time to the coordinator expertise that automated screening gradually displaces. The paper’s central finding is that AI-driven trial matching does redistribute epistemic authority, but does so under conditions of opacity, institutional misalignment, and data bias that make this shift neither neutral nor consistently beneficial. The literature reveals three problems: a reliability gap, in which clinicians cannot adequately evaluate recommendations from systems they cannot interrogate; a pipeline bias problem, in which training data representing majority populations produces models that approximate dominant trends while underserving underrepresented groups; and a deskilling concern, in which the long-term erosion of coordinator expertise silently removes the human safeguard most needed when algorithms fail. The conclusion drawn is that equitable trial matching requires treating justice as a design requirement from the start, not a consideration to be added after performance benchmarks are satisfied.
Taken together, these two projects form an argument that is stronger than either could make alone. The technical work demonstrates that more precise and temporally responsive matching is achievable and that it can meaningfully reduce the structural barriers preventing eligible patients from ever entering a trial. The STS work demonstrates that technical precision alone is not sufficient, and that building faster systems on biased data or within misaligned institutions may simply accelerate existing exclusions at greater scale. The coordination between these projects reflects a broader conviction that the most consequential problems in clinical AI are not optimization problems. They are problems of accountability that must be engaged directly and rigorously from the engineering stage forward.