Online Archive of University of Virginia Scholarship
Diffusion Policy for Interactive Fleet Learning with Scalable Human Supervision
Author
Shen, Weiran, Computer Science - School of Engineering and Applied Science, University of Virginia
Advisors
Kuo, Yen-Ling, EN-Comp Science Dept, University of Virginia
Abstract
In large-scale robot fleet deployment, a small number of human supervisors must oversee many robots acting in parallel. Interactive fleet learning (IFL) addresses this setting: multiple robots can query a limited number of human supervisors, and the intervention data are aggregated over time to update a policy. A central challenge is how to allocate limited supervision and how to determine which queried states should be selected for intervention and learning. Existing implementations of the IFL benchmark instantiate the imitation learner as a multi-layer perceptron (MLP), which is limited in its ability to model multimodal action distributions. Motivated by prior work that uses diffusion loss as a query signal in single-robot imitation learning, we use a diffusion policy (DP) and deploy its diffusion loss as an uncertainty signal for threshold calibration, intervention prioritization, human allocation, and continual policy improvement in online fleet learning. We evaluate the resulting system on Allegro Hand. IFL with diffusion policy achieves higher episode success rates than the MLP baseline, and our threshold diagnostic shows that outcome-based recalibration makes the diffusion-loss query rule more budget-aware under limited supervision.
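The query-and-allocate idea in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `allocate_humans`, the fixed scalar threshold, and the "highest loss first" priority rule are all assumptions made for illustration, and the per-robot losses are made-up numbers standing in for diffusion-loss values.

```python
from typing import List, Sequence


def allocate_humans(
    losses: Sequence[float], threshold: float, num_humans: int
) -> List[int]:
    """Return the indices of robots that receive a human supervisor.

    Hypothetical rule (an assumption, not the author's method): a robot
    queries for help when its uncertainty score exceeds `threshold`, and
    the limited supervisors go to the querying robots with the highest
    scores.
    """
    # Robots whose score exceeds the threshold request an intervention.
    queried = [i for i, loss in enumerate(losses) if loss > threshold]
    # Prioritize the most uncertain robots; break ties by lower index.
    queried.sort(key=lambda i: (-losses[i], i))
    # Only `num_humans` requests can be served at once.
    return queried[:num_humans]


# Illustrative values only: four robots, two supervisors.
example_losses = [0.1, 0.9, 0.5, 0.7]
assigned = allocate_humans(example_losses, threshold=0.4, num_humans=2)
print(assigned)  # [1, 3]
```

In this sketch, three robots exceed the threshold but only two supervisors are available, so the robot with the lowest qualifying score goes unserved. Raising the threshold would reduce the number of unserved queries, which is one way to read "budget-aware" in the abstract.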
Degree
MS (Master of Science)
Language
English
Rights
All rights reserved by the author (no additional license for public reuse)
Shen, Weiran. Diffusion Policy for Interactive Fleet Learning with Scalable Human Supervision. University of Virginia, Computer Science - School of Engineering and Applied Science, MS (Master of Science), 2026-04-21, https://doi.org/10.18130/q9yq-dc79.