Online Archive of University of Virginia Scholarship
Differential Inline Testing: Framework, Test Generation, and Application
Author
Shahane, Chaitanya Rajendra, Computer Science - School of Engineering and Applied Science, University of Virginia
Advisors
Nie, Pengyu, Computer Science, University of Waterloo
Wang, Wenxi, EN-Comp Science Dept, University of Virginia
Abstract
Inline testing was proposed to check the correctness of hard-to-reach code statements and has demonstrated effectiveness in finding single-statement bugs not covered by unit testing. The existing inline testing framework supports only explicit assertions as the test oracle, which is limiting for non-deterministic and under-specified programs (the well-known “oracle” problem). In the world of unit testing, the oracle problem has been addressed using techniques like differential testing, i.e., comparing the outputs of equivalent programs given the same test inputs.
In this work, we propose differential inline testing to bridge this gap. We extend the inline testing framework to support differential testing: instead of writing an explicit assertion, developers specify the differential variables and the output variable, and the framework automatically executes the target statement under multiple configurations and checks for divergences. Because inline testing is a new concept with limited adoption in practice, we also design and implement an LLM-based test generation framework called DiffITestGen, which can automatically generate and refine differential inline tests given a target statement. Finally, we evaluate the effectiveness of differential inline testing and DiffITestGen through a case study that applies them to a popular deep learning library, PyTorch. We experiment with two scenarios: (1) library developers applying differential inline testing to test their codebase; (2) users applying differential inline testing to test complex statements using the library’s APIs. DiffITestGen can generate valid differential inline tests with high success rates (62.5%/73.74%). The generated differential inline tests covered 1,092 lines of code that were not covered by PyTorch’s unit tests. We identified 12 bugs with the help of differential inline tests generated by DiffITestGen, of which 7 have been confirmed as previously unknown bugs by PyTorch developers.
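The idea of running one target statement under several configurations and comparing a chosen output variable can be illustrated with a minimal Python sketch. This is not the thesis's framework or its API: the function `run_differential`, its parameters, and the toy statement are all hypothetical, and a plain summation stands in for a library call.

```python
import math


def run_differential(statement, base_env, diff_var, configurations,
                     output_var, rel_tol=1e-9):
    """Run `statement` once per configuration and report diverging outputs.

    Hypothetical helper for illustration only; not the thesis's API.
    """
    outputs = {}
    for label, value in configurations.items():
        env = dict(base_env)   # fresh namespace per run (shallow copy)
        env[diff_var] = value  # bind the differential variable for this run
        exec(statement, env)   # execute only the target statement
        outputs[label] = env[output_var]

    # Treat the first configuration as the reference; compare the rest to it.
    labels = list(outputs)
    reference_label = labels[0]
    reference = outputs[reference_label]
    return [
        (reference_label, label, reference, outputs[label])
        for label in labels[1:]
        if not math.isclose(reference, outputs[label], rel_tol=rel_tol)
    ]


values = [0.1] * 10

# Two equivalent summation routines: no divergence beyond the tolerance.
agreeing = run_differential(
    "total = add(values)", {"values": values}, "add",
    {"builtin": sum, "fsum": math.fsum}, "total",
)

# A deliberately faulty routine (assumed for the demo) that drops the last item.
diverging = run_differential(
    "total = add(values)", {"values": values}, "add",
    {"builtin": sum, "drops_last": lambda xs: sum(xs[:-1])}, "total",
)

print(agreeing)
print(diverging)
```

In this sketch the differential variable is `add` and the output variable is `total`. No assertion about the expected value of `total` is written; a bug surfaces only as a disagreement between configurations, which is what lets the approach sidestep the oracle problem.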
Degree
MS (Master of Science)
Keywords
Inline Testing; Differential Testing; Automated Test Generation; Large Language Models; Deep Learning Libraries
Language
English
Rights
All rights reserved by the author (no additional license for public reuse)
Shahane, Chaitanya Rajendra. Differential Inline Testing: Framework, Test Generation, and Application. University of Virginia, Computer Science - School of Engineering and Applied Science, MS (Master of Science), 2025-11-25, https://doi.org/10.18130/9yq0-3760.
Files
This item is restricted to abstract view only until 2026-05-24.