Abstract
Introduction
Machine learning systems increasingly make high-stakes decisions
about people’s lives, including who advances in a hiring pipeline, who is released before
trial, and whose face a security camera flags. Organizations adopt these systems for their
efficiency and apparent objectivity. What gets left out is that this objectivity is often a
fiction. A model trained on historical data is shaped by the discrimination embedded in that
data, so instead of correcting discriminatory patterns, it encodes and automates them. Amazon’s
recruiting algorithm penalized resumes containing the word “women’s” because it had learned
from a male-dominated
workforce. HireVue’s video interviewing system generated disparate
outcomes across demographic groups and faced EEOC charges in 2025. These are the
inevitable outcomes of machine learning systems that optimize for accuracy without
raising questions about fairness. My thesis addresses this problem from two directions: a
technical project proposing a framework for evaluating fairness-aware machine learning
methods, and an STS project that uses Actor-Network Theory to analyze how ethical
responsibility for discriminatory hiring algorithms is distributed and evaded.
Technical Project
The technical portion of my thesis proposes a systematic framework for comparing bias
mitigation techniques across three pipeline stages: preprocessing, in-processing, and
postprocessing. It uses three data sources: the Adult Income dataset, a synthetic
pretrial risk assessment dataset that will be constructed from public criminal justice statistics, and
facial recognition benchmarks. The synthetic dataset is critical to the project because it gives
control over which known bias patterns are built into the data, so the
framework can test whether each method removes them, something that is impossible with proprietary
tools like COMPAS. The evaluation will use Python, scikit-learn, and IBM’s AI Fairness 360
toolkit to measure both predictive performance and fairness metrics,
including demographic parity, equalized odds, and equal opportunity. The expected result of
the project is a practical, empirically grounded understanding of the tradeoffs of
each type of method and the domains in which each works best.
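As a minimal sketch of the three fairness metrics named above, the following Python example computes them by hand for binary labels, binary predictions, and a binary protected attribute. The function names, the toy arrays, and the choice to summarize equalized odds as the larger of the true-positive-rate and false-positive-rate gaps are illustrative assumptions, not part of the proposed framework. AI Fairness 360 ships its own implementations, which the actual evaluation would presumably use.

```python
import numpy as np


def group_rates(y_true, y_pred, mask):
    """Return (selection rate, true positive rate, false positive rate)
    for the rows selected by the boolean mask."""
    yt, yp = y_true[mask], y_pred[mask]
    selection_rate = yp.mean()      # P(pred = 1 | group)
    tpr = yp[yt == 1].mean()        # P(pred = 1 | y = 1, group)
    fpr = yp[yt == 0].mean()        # P(pred = 1 | y = 0, group)
    return selection_rate, tpr, fpr


def fairness_gaps(y_true, y_pred, group):
    """Compare group 1 against group 0 (hypothetical encoding:
    0 = reference group, 1 = comparison group)."""
    y_true, y_pred, group = (np.asarray(a) for a in (y_true, y_pred, group))
    sel_0, tpr_0, fpr_0 = group_rates(y_true, y_pred, group == 0)
    sel_1, tpr_1, fpr_1 = group_rates(y_true, y_pred, group == 1)
    return {
        # Demographic parity: difference in selection rates.
        "demographic_parity_diff": sel_1 - sel_0,
        # Equal opportunity: difference in true positive rates.
        "equal_opportunity_diff": tpr_1 - tpr_0,
        # Equalized odds (assumed summary): worst of the TPR and FPR gaps.
        "equalized_odds_gap": max(abs(tpr_1 - tpr_0), abs(fpr_1 - fpr_0)),
    }


# Toy data, invented for illustration only.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

gaps = fairness_gaps(y_true, y_pred, group)
for name, value in gaps.items():
    print(f"{name}: {value:+.2f}")
```

A value of zero on any of these measures would indicate parity between the two groups on that criterion. In this toy data the comparison group is selected less often and has a lower true positive rate, so all three measures are nonzero.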
STS Research Paper
In my STS research, I applied Actor-Network Theory (ANT) to trace how
responsibility for biased hiring algorithms is formed and deflected. ANT treats human and
non-human actors symmetrically, so executives, data scientists, training datasets, and legal
frameworks are all actants that actively shape outcomes; blame therefore cannot rest on any
single node. Through case studies of Amazon’s recruiting tool, HireVue’s video system, and the
2024 Mobley v. Workday lawsuit, I mapped how accountability diffuses when no single actor
has full control of the system. Employers blame vendors, vendors claim they only built
the tool, and regulators find civil rights laws ill-suited to algorithmic decision-making. Drawing
on the work of O’Neil, Noble, and Eubanks, my research argues that this diffusion is a structural
feature of how these networks are formed and recommends three specific interventions:
mandatory algorithmic impact assessments, enforceable third-party bias audits with
public disclosure, and vendor liability reforms.
Conclusion
Taken together, these two projects illustrate a core tension: the technical work
demonstrates that fairness interventions are feasible, while the STS research shows that
technical feasibility will not matter without accountability structures that require their use.
The technical framework strengthens the argument that developers who create the algorithms and
employers who use them have an obligation to act. The STS analysis details why these tools
remain underused, as organizations have very little reason to adopt these fairness-aware methods
when the alternative carries no real legal or reputational risk. Hiring is a primary domain where
this occurs, and automating discrimination at scale behind a façade of objectivity
causes harm that is harder to detect and contest than its human equivalent. Addressing this issue
requires both the technical capability of building fairer systems and the institutional will to
require that organizations use them.