Autonomous System Testing: Incorporating Physical Environments and Physical Semantics

Author: Hildebrandt, Carl, Computer Science - School of Engineering and Applied Science, University of Virginia
Elbaum, Sebastian, EN-Comp Science Dept, University of Virginia

As autonomous systems become increasingly common in our everyday lives, their shortcomings and failures become more apparent, making rigorous validation of their safety and reliability paramount. Since the behavior of autonomous systems is predominantly driven by software, and software validation has been highly successful for applications used by billions of people today, it seems natural to apply current software validation techniques to autonomous systems. Doing so, however, requires overcoming two key challenges introduced by the differences between traditional software and autonomous systems, namely the physical environment and the system's physical semantics. Without accounting for these differences, traditional software testing techniques struggle to cope with a large, unbounded input space and to effectively target the areas of the software that drive the behaviors of the autonomous system. This work introduces techniques grounded in traditional software analysis that overcome these challenges across the entire testing pipeline: test generation, test execution, and test adequacy assessment.

In the area of Test Generation, I investigated techniques that produce tests based on a vehicle's kinematics, ensuring the tests align with the physical semantics of the autonomous system, while using parametrizable scoring models to identify tests that stress the system. Moreover, I leveraged a vast array of existing sensor data from real-world physical environments to identify performance discrepancies across different versions of an autonomous system. The sensor data that yielded discrepancies were then compared against the autonomous system's Operational Design Domain to determine their relevance.
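To illustrate the idea, the following is a minimal sketch (not the dissertation's actual algorithm) of kinematics-aware test generation: candidate waypoint sequences are rejected if a simple, hypothetical point-mass model with velocity and acceleration limits cannot follow them, and the survivors are ranked by a parametrizable scoring model that favors tests pushing the vehicle toward its limits. All names, parameters, and the scoring choice here are illustrative assumptions.

```python
import math
import random

def kinematically_feasible(waypoints, v_max=5.0, a_max=2.0, dt=1.0):
    """Reject tests a simple point-mass model (illustrative assumption)
    could not follow: each leg must respect velocity and acceleration limits."""
    v_prev = 0.0
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        v_req = math.hypot(x1 - x0, y1 - y0) / dt  # speed needed for this leg
        if v_req > v_max or abs(v_req - v_prev) > a_max * dt:
            return False
        v_prev = v_req
    return True

def stress_score(waypoints, v_max=5.0, dt=1.0):
    """One possible parametrizable scoring model: higher when any leg
    approaches the vehicle's velocity limit."""
    legs = [math.hypot(x1 - x0, y1 - y0) / dt
            for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:])]
    return max(legs) / v_max if legs else 0.0

def generate_tests(n, rng, length=5):
    """Sample random waypoint tests, keep only kinematically feasible ones,
    and return them ranked by stress score (most stressing first)."""
    tests = []
    while len(tests) < n:
        wps = [(0.0, 0.0)]
        for _ in range(length - 1):
            x, y = wps[-1]
            wps.append((x + rng.uniform(-4, 4), y + rng.uniform(-4, 4)))
        if kinematically_feasible(wps):
            tests.append((stress_score(wps), wps))
    return sorted(tests, reverse=True)
```

Filtering by the kinematic model is what ties the generated tests to the system's physical semantics: a test the vehicle physically cannot execute tells us nothing about the software under test.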

In Test Execution, I devised a mixed-reality strategy that bridges the gap between simulation and real-world testing. Recognizing that real-world testing, while ideal, is often impractical, hazardous, and expensive, my approach integrates virtual elements into real physical environments, allowing performance and safety to be validated while reducing both cost and time. Additionally, I designed a haptic suit for drones, enabling us to test a drone's physical semantics by applying forces to it in the real world.
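A minimal sketch of the mixed-reality idea, under simplifying assumptions of my own (a 1-D range sensor and a circular virtual obstacle; the geometry and function names are illustrative, not the dissertation's implementation): virtual obstacles are rendered into the same format as real sensor returns, and each beam reports whichever return, real or virtual, is nearer.

```python
import math

def virtual_obstacle_ranges(num_beams, fov, obs_angle, obs_dist,
                            obs_width, max_range=30.0):
    """Ranges a simulated obstacle would produce on an idealized
    range sensor (hypothetical geometry for illustration)."""
    half = math.atan2(obs_width / 2, obs_dist)  # angular half-width
    ranges = []
    for i in range(num_beams):
        angle = -fov / 2 + i * fov / (num_beams - 1)
        ranges.append(obs_dist if abs(angle - obs_angle) <= half else max_range)
    return ranges

def mixed_reality_scan(real_scan, virtual_scan):
    """Fuse real and virtual readings beam by beam: the autonomous
    system sees the nearer return, so virtual hazards look real."""
    return [min(r, v) for r, v in zip(real_scan, virtual_scan)]
```

Because the system under test consumes the fused scan exactly as it would a real one, hazardous scenarios (e.g., a virtual pedestrian stepping into the path) can be exercised in a real physical environment without physical risk.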

Regarding Test Coverage, I created Physical Coverage, one of the first coverage metrics for autonomous systems, which considers both the physical environment and the physical semantics of the autonomous system. Utilizing physical reachability analysis and geometric vectorization, this metric offers a quantifiable measure of test suite effectiveness. It has proven instrumental in identifying missing scenarios and redundant tests in datasets such as the Waymo Open Perception dataset.
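The following is a minimal sketch of the vectorization idea, heavily simplified and not the dissertation's actual metric: each sensor reading is collapsed into a coarse geometric vector (nearest obstacle distance per angular sector, binned), and coverage is the fraction of distinct vectors exercised; repeated vectors flag redundant tests. Sector counts, bin counts, and function names are assumptions made for illustration.

```python
def vectorize_scan(ranges, num_sectors=4, num_bins=3, max_range=30.0):
    """Collapse a 1-D range scan into a coarse per-sector occupancy
    vector (a simplified stand-in for geometric vectorization)."""
    sector_len = max(1, len(ranges) // num_sectors)
    bin_size = max_range / num_bins
    vec = []
    for i in range(num_sectors):
        sector = ranges[i * sector_len:(i + 1) * sector_len]
        nearest = min(sector) if sector else max_range
        vec.append(min(int(nearest // bin_size), num_bins - 1))
    return tuple(vec)

def physical_coverage(scans, num_sectors=4, num_bins=3):
    """Fraction of the discretized physical space a test suite exercises,
    plus the indices of scans that revisit an already-covered cell."""
    total_cells = num_bins ** num_sectors
    seen, redundant = set(), []
    for idx, scan in enumerate(scans):
        v = vectorize_scan(scan, num_sectors, num_bins)
        if v in seen:
            redundant.append(idx)
        seen.add(v)
    return len(seen) / total_cells, redundant
```

Discretizing the physical space this way is what makes the metric quantifiable: uncovered cells point to missing scenarios, while tests that land in an already-covered cell are candidates for removal.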

By addressing these challenges across the entire testing pipeline, this dissertation takes a significant step toward creating safer and more reliable autonomous systems.

PHD (Doctor of Philosophy)
Autonomous Systems, Software Testing, Test Generation, Test Execution, Test Adequacy, Physical Environments, Physical Semantics
Sponsoring Agency:
National Science Foundation, Air Force Office of Scientific Research, U.S. Army Research Office
Issued Date: