A Data Solutions Internship: Enhancing Document Retrieval Systems; Contested Oils: Social Divisions over Seed Oil Guidelines
Michelitch, Sarah, School of Engineering and Applied Science, University of Virginia
Elliott, Travis, AT-Academic Affairs, EN-Engineering and Society, University of Virginia
Vrugtman, Rosanne, EN-Comp Science Dept, University of Virginia
The technical capstone report describes my work during a summer internship, where I helped improve a document retrieval system using semantic embeddings and machine learning techniques. The system was designed to return relevant documents in response to user queries, but its accuracy was inconsistent. My work focused on diagnosing these inconsistencies by creating Python-based visualization tools (using PCA and UMAP), developing automated testing pipelines with Pytest and Postman, and conducting targeted analyses of document features such as length, language, and character encoding. These efforts improved the team’s understanding of embedding behavior and laid a foundation for future enhancements to the retrieval system.
The thesis analyzes the polarized debate over seed oils in the U.S. through the Social Construction of Technology (SCOT) framework. It shows how corporate-funded groups promote seed oils as heart-healthy, while anti-seed-oil advocates challenge their dominance by promoting natural alternatives. The conflict stems from interpretive flexibility: competing social groups cite the science selectively to support their positions. The lack of closure reflects entrenched economic interests and the scalability of cheap seed oils compared with pricier alternatives. Ultimately, the paper calls for transparent nutrition research and nuanced policies to reconcile scientific contradictions and address corporate influence.
BS (Bachelor of Science)
Document retrieval systems, Query performance, Postman / Pytest, Seed oils
School of Engineering and Applied Science
Bachelor of Science in Computer Science
Technical Advisor: Rosanne Vrugtman
STS Advisor: Travis Elliott
Technical Team Members: N/A
English
All rights reserved (no additional license for public reuse)
2025/05/08