Abstract
Generative AI is a rapidly evolving technology that has profoundly changed the way that we as a global society discover, create, and learn. My capstone project investigates how to improve reasoning performance in Generative AI models through looped transformer architectures, which are highly parameter-efficient models that have demonstrated stronger reasoning capacity than traditional transformer models. The motivation for this work stems from the growing demand for Generative AI that can reason more effectively without a proportional increase in computational cost. My STS research paper examines how the introduction of Generative AI has disrupted educational environments and what criteria would define its successful integration into these environments in the future. This research is important because education is how we cultivate the next generation to advance society, and the role Generative AI plays within it will directly shape the quality of learning available to future students. Improving reasoning at lower parameter counts makes AI models more capable of assisting, at lower cost, the students and educators who are most directly affected by this technology, which has direct implications for equity concerns surrounding AI access in educational settings.
Large language models have demonstrated strong reasoning capabilities, but this performance has come at substantial computational cost, which makes these models expensive to train and deploy at scale. Looped transformer architectures offer a promising path forward due to their parameter efficiency, but they are not without their own challenges. Because of their recursive structure and parameter sharing, looped transformers typically require more loops than a comparable standard model has layers to match its performance, and they inherently perform worse on perplexity benchmarks for language modeling. My capstone project addresses these limitations through targeted architectural modifications to looped transformers, with the goal of closing this performance gap while preserving their efficiency advantages. Beyond addressing these shortcomings, my work also explores the applications of looped transformers at the frontier of reasoning research, where the field has increasingly shifted toward latent-space reasoning and world modeling. Joint Embedding Predictive Architecture (JEPA) is a prominent paradigm for this kind of latent-space reasoning, and a key question my work investigates is whether dynamic, recursive reasoning within such a framework is more effective than the fixed-length sequential reasoning employed by standard models.
The results of this research suggest that while looped transformer architectures retain their strong inductive bias toward iterative reasoning, the efficiency gap with standard transformers remains difficult to fully close. Specifically, experiments showed that looped models still require a greater number of recursive iterations to match the effective depth of comparable sequential transformers, reinforcing prior findings in related literature. Additionally, this research demonstrated that looped transformers are highly effective for latent-space reasoning under the JEPA training paradigm, empirically matching or outperforming comparable sequential transformer architectures on mathematical word problems and reasoning primitives while using significantly fewer parameters. The overall contribution of this work is therefore both a refinement of the current understanding of where looped transformers succeed and where their constraints persist, and the establishment of a strong foundation for future reasoning research under JEPA and other latent world-modeling training paradigms.
My STS research paper uses the technological momentum framework to address how the effective integration of generative AI into educational environments should be defined. Technological momentum describes how technologies become progressively embedded within societal institutions over time until reversal becomes unlikely, so that in the long term the technology shapes society more than society shapes the technology. This question is significant because changes to educational systems shape how future generations think, reason, and contribute to the world. To answer this question, this paper conducts a qualitative literature review analyzing three bodies of evidence: existing regulatory frameworks and institutional policies; stakeholder sentiment across students, educators, policymakers, and technology companies; and the documented outcomes associated with current patterns of generative AI adoption in educational settings.
The evidence analyzed in this paper shows that generative AI is already deeply integrated into educational environments. Students widely use these tools for support and efficiency, but face risks of overreliance that may weaken independent reasoning. At the same time, assessment systems are being pushed toward redesign, as traditional evaluation methods become less effective in the presence of AI-generated outputs. Institutions and policymakers have begun developing governance frameworks, but these responses remain inconsistent and often lag behind the pace of adoption, while technology companies continue to actively drive large-scale integration. These findings support the conclusion that the impact of generative AI is determined not by the technology itself, but by how its use is structured within educational systems. Viewed through the lens of technological momentum, this reinforces that integration is unlikely to be reversed, and that effective adoption must be defined by strong governance, clear expectations, and the use of AI as a tool to support, rather than replace, student learning.