Abstract
Science depends on trust, but trust is built on an increasingly unstable system. In 2023, Hindawi, an open-access scientific publisher, retracted a record 8,000 papers across its journals after discovering evidence of paper mills, fabricated results, and other academic misconduct (Van Noorden, 2023, p. 1). This scandal revealed the “systematic manipulation of the publication and peer review process” (Van Noorden, 2023, p. 1). Among all academic disciplines, Computer Science has the joint highest rate of academic misconduct retractions (Li & Shen, 2024). In computational research, misconduct is increasingly committed through algorithmic duplication, in which recycled algorithms are disguised as novel findings. This is especially common in the optimization field of metaheuristics, which “attempts to find the best (feasible) solution out of all possible solutions to an optimization problem” (Glover, Laguna, & Marti, 2000). While there are legitimate metaheuristic approaches, there has been an explosion of highly derivative papers in the literature. My two projects address this problem from complementary perspectives. The technical project develops a framework to detect algorithmic plagiarism in metaheuristic research, while the STS research examines how automated academic misconduct detection tools reshape governance in academic publishing. Together, these projects show that protecting scientific integrity is both a technical and an ethical challenge.
The technical portion of my thesis developed an AI-driven, multistage pipeline designed to identify algorithmic plagiarism and superficial novelty in metaheuristic optimization research. The system extracts pseudocode from academic papers, converts it into a standardized canonical form, and compares algorithms through a four-level weighted scoring system. Each level captures a different dimension of similarity, and the combined score determines whether a paper should be flagged for further review. To validate the system, we built synthetic disguised cases and tested the system against papers that experts had identified as non-novel. The results showed that the system can distinguish genuinely distinct algorithms from derivative ones. Although the system is a proof of concept, it demonstrates that misconduct detection can become more systematic, scalable, and quantitative, rather than relying on time-intensive, expert-driven judgment. This mattered to me because AI is often, and deservedly, seen as a threat to scientific integrity, yet my project showed that it can also help protect integrity when it is designed properly.
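As a minimal sketch of how a weighted, multi-level score could feed a flagging decision, consider the following. It is an illustration only, not the project's implementation: the level names, the equal weights, and the threshold of 0.8 are all assumptions made for this example.

```python
# Hypothetical level names, weights, and threshold. The text above states only
# that four weighted levels are combined into one score used for flagging.
WEIGHTS = {"level_1": 0.25, "level_2": 0.25, "level_3": 0.25, "level_4": 0.25}
FLAG_THRESHOLD = 0.8


def combined_score(level_scores, weights=WEIGHTS):
    """Return the weighted average of per-level similarity scores in [0, 1]."""
    if set(level_scores) != set(weights):
        raise ValueError("level_scores must have exactly the same keys as weights")
    for name, score in level_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name} must lie in [0, 1]")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * level_scores[name] for name in weights)
    return weighted_sum / total_weight


def should_flag(level_scores, threshold=FLAG_THRESHOLD):
    """Flag a pair of algorithms for human review when the score is high."""
    return combined_score(level_scores) >= threshold
```

In this sketch the output is only a recommendation for further review; a flagged paper would still need human judgment before any conclusion about misconduct is drawn.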
My STS research examined how automated academic misconduct detection tools change the system of academia through the lens of Actor-Network Theory. Rather than treating these tools as neutral technologies, I argued that they function as political actors within a broader academic network. My research revealed that misconduct governance is already distributed across many actors, but institutions hold much more power than individual researchers. When automated tools are introduced, they shift parts of the judgment process from professional self-regulation toward technically mediated decision-making. These tools do not just improve efficiency; they also shape how misconduct is defined, what counts as evidence, and whose decisions matter most. I concluded that the central issue is not only whether these tools are accurate, but whether they are governed in ways that are transparent, contestable, and accountable. This experience was meaningful to me because I have personally been wrongly flagged for misconduct by automated tools like Turnitin, and I know how difficult it can be to challenge those decisions.
Working on both projects made the significance of this issue much more concrete to me. My technical project demonstrated that engineering can help address scientific fraud by developing tools to detect misconduct at scale. My STS research showed that those same tools can create new ethical risks by redistributing power and shaping judgment in ways that may be unfair or difficult to challenge. Together, these projects changed what I think it means to be a good engineer. They reinforced the idea that ethical engineering requires more than creating effective technology. It also requires considering the bigger picture and how a technology affects the people and institutions around it. The perspectives I learned in STS helped me understand that responsible engineering means questioning one’s own assumptions and biases to better recognize the broader impacts of technologies. In this case, that meant stepping beyond my perspective as a tool developer to consider how such a technology might affect others.
The technical work was supported by the Jefferson Trust project “Safeguarding Science: Developing Knowledge and Tools to Prevent Scientific Fraud.” My teammate Halbert Nguyen was instrumental in building the model. Professor Matthew Bolton provided expertise on metaheuristics, model building, and technical writing. Professor William Davis guided the STS research and the broader ethical framing.