Abstract
Large computing courses face a persistent feedback problem: students need timely, individualized help while instructors must still protect the reasoning and productive struggle that computing education is meant to develop. Office hours, teaching assistants, and autograders rarely provide support at the exact moment students encounter confusion, but unrestricted AI tools can create a different problem by turning feedback into answer substitution. My portfolio addresses this broader challenge: how universities can scale programming support without weakening learning, trust, or instructor oversight. The issue is not simply whether large language models should be used in computing education, but how they should be configured, framed, and governed. Research discussed across my two projects shows that guardrails matter because they shape what kind of help students receive and what kind of cognitive work remains student-owned. In one study synthesized in my technical work, unguarded ChatGPT improved practice performance by 48% but reduced later exam performance by 17%, while a guardrailed tutor improved practice by 127% without the same exam penalty. Together, my technical and STS projects examine the same underlying problem from complementary angles: one proposes a practical feedback system design, while the other explains why social and institutional governance matters as much as the technology itself.
My technical report, Closing the Loop: LLM-Mediated Feedback for Active CS Learning—Meta-Study and Curriculum Recommendations, proposes a guardrailed LLM-mediated feedback loop for algorithm-intensive computer science courses, particularly UVA's CS 3100. The project responds to the difficulty of delivering fast, personalized feedback in courses where students must iteratively revise solutions and develop strong problem-solving habits. Rather than treating AI as a free-form answer source, I synthesize recent empirical research to design a structured system based on active learning and human oversight. The proposed architecture has four stages: an initial student attempt, LLM-scaffolded hints, a revision cycle, and a formative assessment step that flags students for instructor review when needed. The design includes answer suppression, progressive hint disclosure, cognitive-load monitoring, and automated verification of hint quality. Because this report is a design and synthesis project rather than a completed classroom trial, its results are presented as expected outcomes grounded in prior research: improved practice-task performance and revision quality, reduced instructor time spent on routine feedback, and more equitable access to help through always-available and multilingual support. The report ultimately argues that the educational value of LLMs lies not in unrestricted generation, but in carefully constrained feedback systems that preserve student thinking while expanding access to timely assistance.
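The four-stage loop described above can be illustrated with a minimal sketch. It is not the report's implementation: every name in it (FeedbackSession, passes_tests, make_hint, MAX_HINT_LEVEL) is hypothetical, the three-tier hint limit is an assumed value, and the guardrailed LLM call and the formative check are replaced by plain callables. Answer suppression, cognitive-load monitoring, and hint verification are not modeled here.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical names throughout; nothing here is taken from the report itself.
MAX_HINT_LEVEL = 3  # assumed number of progressive hint tiers


@dataclass
class FeedbackSession:
    """Tracks one student's pass through attempt -> hints -> revision -> assessment."""
    passes_tests: Callable[[str], bool]    # stand-in for the formative assessment check
    make_hint: Callable[[str, int], str]   # stand-in for the guardrailed LLM hint call
    hint_level: int = 0
    hints_given: List[str] = field(default_factory=list)
    flagged_for_instructor: bool = False

    def submit(self, attempt: str) -> str:
        """Process an initial attempt or a revision and return the next status."""
        if self.passes_tests(attempt):
            return "passed"
        if self.hint_level >= MAX_HINT_LEVEL:
            # Hints are exhausted: escalate to a human instead of revealing an answer.
            self.flagged_for_instructor = True
            return "flagged"
        # Progressive disclosure: each failed attempt unlocks one more hint tier.
        self.hint_level += 1
        self.hints_given.append(self.make_hint(attempt, self.hint_level))
        return "hint"
```

In this sketch, a student who keeps submitting failing revisions receives at most three increasingly specific hints and is then flagged for instructor review, so the loop never falls back to delivering a finished answer.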
My STS research paper, Guardrails, Trust, and Legitimate Help: Governing LLM-Mediated Programming Support in Computing Education, studies the same issue from a sociotechnical perspective. It asks how guardrails and human oversight shape what counts as legitimate help in student-facing LLM systems used in higher-education computing courses. To answer that question, I conducted a systematic review and comparative synthesis of ten empirical studies published from roughly 2023 to 2025. Across these studies, I compared explicitly guardrailed tutoring tools, more open ChatGPT use, and systems that generate feedback on submitted work. The evidence shows that students do not simply interact with "the model"; they interact with a bundle of model behavior, prompt structure, interface design, course rules, and social expectations. Guardrailed systems such as CodeHelp, Iris, and PyTutor can preserve productive struggle and improve some outcomes, especially for students with weaker foundations, but the literature also reveals limits and tensions. Some students still seek immediate assignment help while providing very little context, chatbot feedback can be unreliable in specialized domains, and overreliance can reduce self-efficacy. I argue that guardrails should be understood as governance mechanisms because they define the boundary between scaffolding and substitution and distribute responsibility among students, instructors, and AI systems.
Taken together, these projects make one central contribution to the problem of scalable programming support: both show that the key educational question is not whether AI is present, but how AI-mediated help is governed. My technical project turns that insight into a concrete feedback-loop design that keeps instructors in the loop and structures AI around revision rather than answer delivery. My STS project explains why those constraints matter socially as well as pedagogically, since they shape trust, dependence, and the meaning of "help" in computing education. I therefore conclude that the strongest path forward is neither blanket prohibition nor unrestricted adoption. Instead, computing education needs course-integrated, accountable, and guardrailed AI support systems that expand access to feedback while keeping legitimate learning work visible and student-owned. Future work should build on this portfolio by piloting such systems in live classrooms and measuring not only short-term performance, but also long-term retention, independence, and trust.