Online Archive of University of Virginia Scholarship
Rethinking Cloud Resource Scheduling for Performance, Elasticity, and Cost Efficiency in Heterogeneous Serverless Workload
Author
Fu, Yuqi, Computer Science - School of Engineering and Applied Science, University of Virginia, ORCID: 0009-0009-6556-9162
Advisors
Cheng, Yue, DS-Faculty Affairs, University of Virginia
Abstract
Modern serverless platforms face a dual challenge: meeting the diverse and dynamic demands of user workloads, such as Function-as-a-Service (FaaS) and LLM fine-tuning applications, which range from short, bursty invocations to long-running, resource-intensive executions, while minimizing operational cost through efficient resource utilization. Existing systems fall short of this goal, relying on workload-oblivious scheduling and lacking adaptive, cost-aware orchestration across heterogeneous compute resources.
To reconcile user responsiveness with provider efficiency, we rethink serverless scheduling from the provider’s perspective and propose an adaptive scheduling framework that integrates two complementary planes of optimization. The first plane focuses on intra-function efficiency, realized in our prior works SFS (SC’22) and ALPS (ATC’24), which enhance per-worker performance through application-aware, priority-based scheduling that approximates Shortest-Remaining-Processing-Time (SRPT), mitigating interference from long-running jobs and accelerating latency-critical short functions. Building on this foundation, Allpass is an inter-function elasticity plane that extends adaptivity across heterogeneous GPU clouds. It dynamically places and migrates workloads across multiple GPU providers, seamlessly combining fast, elastic serverless GPUs with lower-cost serverful GPU instances.
We envision a unified scheduling architecture that bridges local efficiency and global elasticity, enabling serverless platforms to dynamically balance responsiveness and cost under highly variable, heterogeneous workloads.
Fu, Yuqi. Rethinking Cloud Resource Scheduling for Performance, Elasticity, and Cost Efficiency in Heterogeneous Serverless Workload. University of Virginia, Computer Science - School of Engineering and Applied Science, PHD (Doctor of Philosophy), 2026-04-25, https://doi.org/10.18130/rkkf-nz73.