Sample Efficient In-Context Learning and Fine-Tuning of Transformer-Based Language Models

Liu, Renpu

Author

Liu, Renpu, Electrical Engineering - School of Engineering and Applied Science, University of Virginia

Advisors

Yang, Jing, EN-Elec & Comp Engr Dept, University of Virginia

Abstract

Transformer-based models, large language models (LLMs), and related modular architectures have demonstrated a remarkable ability to adapt to new tasks in three ways: in context, by conditioning on a few demonstrations; through fine-tuning, by updating a small subset of parameters; or through selective routing to specialized components. However, the mechanisms that enable such sample-efficient adaptation, and principled methods for exploiting unlabeled data, personalization, and non-stationary environments, remain only partially understood. This dissertation develops a unified theoretical and algorithmic framework for sample-efficient adaptation in transformer-based and modular learning systems.

The first part studies how transformers implement learning-to-optimize (L2O) algorithms for in-context sparse recovery tasks. For LASSO-type problems, we show that a K-layer decoder-only transformer can be explicitly constructed to realize a LISTA-type algorithm with an error that decreases linearly with K. Unlike classical LISTA variants that must be trained and tested under the same measurement matrix, the resulting "LISTA with Varying Measurements" (LISTA-VM) implemented by the transformer provably generalizes across different measurement matrices and accommodates a varying number of demonstrations. Experiments with both small transformers and GPT-2 validate the theory and highlight the robustness and efficiency of the learned in-context procedure.
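
For readers unfamiliar with the algorithm family mentioned above, the following is a minimal sketch of classical ISTA for a LASSO problem, the iteration that LISTA-type methods unroll into layers with learned weights. It is not the dissertation's LISTA-VM or its transformer construction; the function names, the fixed step size, and the plain-list matrix representation are assumptions made for illustration.

```python
def soft_threshold(v, tau):
    """Elementwise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return [max(abs(x) - tau, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def ista(A, y, lam, step, num_iters):
    """Classical ISTA for min_x 0.5 * ||A x - y||^2 + lam * ||x||_1.

    Assumption: `step` is at most 1 / (largest eigenvalue of A^T A).
    Each loop iteration plays the role of one layer in an unrolled network;
    LISTA-type methods replace the fixed quantities here with learned ones.
    """
    At = transpose(A)
    x = [0.0] * len(A[0])
    for _ in range(num_iters):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = matvec(At, residual)  # gradient of the least-squares term
        x = soft_threshold(
            [xi - step * gi for xi, gi in zip(x, grad)], lam * step
        )
    return x
```

With an identity measurement matrix the iteration reduces to a single soft-thresholding of y, which makes the sketch easy to check by hand.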

The second part introduces an augmented in-context learning framework that combines a small set of labeled examples with a block of unlabeled inputs in the same prompt. For multi-class linear classification, we demonstrate that, under chain-of-thought prompting, a multi-layer encoder-based transformer can emulate an expectation-maximization (EM) style algorithm, leveraging both labeled and unlabeled data to achieve provable gains in in-context accuracy. When trained with teacher forcing, the transformer parameters converge to the desired solution at a linear rate, and the resulting classifier enjoys excess risk bounds that strictly improve over purely supervised in-context learning.
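
The idea of combining labeled and unlabeled inputs in an EM-style procedure can be illustrated with a small sketch. It assumes isotropic unit-variance Gaussian class-conditionals with equal priors, which is a simplification chosen here and not a model taken from the dissertation; the function name and data layout are likewise hypothetical. It shows only the classical estimator, not how a transformer emulates it.

```python
import math


def em_semi_supervised(labeled, unlabeled, num_classes, num_iters=10):
    """Estimate class means from labeled and unlabeled points.

    labeled:   list of (x, class_index) pairs, x a list of floats
    unlabeled: list of x
    Assumption: every class has at least one labeled example.
    """
    dim = len(labeled[0][0])
    lab_sum = [[0.0] * dim for _ in range(num_classes)]
    lab_cnt = [0.0] * num_classes
    for x, c in labeled:
        lab_cnt[c] += 1.0
        for j in range(dim):
            lab_sum[c][j] += x[j]
    # Initialize the means from the labeled examples alone.
    means = [[s / lab_cnt[c] for s in lab_sum[c]] for c in range(num_classes)]

    for _ in range(num_iters):
        tot_sum = [row[:] for row in lab_sum]
        tot_cnt = lab_cnt[:]
        for x in unlabeled:
            # E-step: soft class responsibilities for one unlabeled point.
            logits = [
                -0.5 * sum((xj - mj) ** 2 for xj, mj in zip(x, m))
                for m in means
            ]
            top = max(logits)  # subtract the max for numerical stability
            weights = [math.exp(l - top) for l in logits]
            total = sum(weights)
            for c in range(num_classes):
                r = weights[c] / total
                tot_cnt[c] += r
                for j in range(dim):
                    tot_sum[c][j] += r * x[j]
        # M-step: labeled points count fully, unlabeled points by responsibility.
        means = [
            [s / tot_cnt[c] for s in tot_sum[c]] for c in range(num_classes)
        ]
    return means
```

In this sketch the unlabeled points refine the class means beyond what the labeled points alone provide, which is the intuition behind the accuracy gains described above.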

The third part focuses on personalized reinforcement learning from human feedback (RLHF). We propose a shared low-rank adaptation approach in which Low-Rank Adaptation (LoRA) is applied in a joint parameter space of all user-specific reward functions. This construction exploits shared low-rank structure while allowing individual-specific adaptations, leading to sample-complexity guarantees for recovering both common and personalized components of human preferences. Empirical results on real-world preference datasets demonstrate that the resulting P-ShareLoRA algorithms improve preference-model accuracy and alignment quality compared with standard global or purely local LoRA baselines.

The fourth part studies continual learning under concept drift through a mixture-of-experts (MoE) framework. We model streaming data as partially observed samples drawn from a union of low-dimensional subspaces whose structure evolves over time. In the stationary setting, we show that the underlying low-rank experts are identifiable under partial observations, yielding correct expert selection and exact completion. In the online setting, we show that an MoE-style continual learner can track drifting subspaces with provable contraction of expert error over time while updating only the routed expert, thereby mitigating cross-concept interference. Synthetic repeated-drift experiments further validate the theory and demonstrate stable adaptation relative to single-subspace online baselines.
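
Expert selection and completion under partial observations can be sketched as follows. For brevity the sketch assumes each expert is a rank-one subspace (a single direction vector) and marks unobserved entries with None; both choices, and the function name, are illustrative assumptions. It covers only routing and completion, not the online update that tracks drifting subspaces.

```python
def route_and_complete(x, experts):
    """Route a partially observed vector to the best-fitting expert.

    x:       list of floats, with None marking unobserved entries
    experts: list of direction vectors, each spanning a rank-one subspace
    Returns (index of the selected expert, completed vector).
    """
    observed = [i for i, v in enumerate(x) if v is not None]
    best = None
    for k, u in enumerate(experts):
        denom = sum(u[i] * u[i] for i in observed)
        if denom == 0.0:
            continue  # this expert is invisible on the observed coordinates
        # Least-squares coefficient fitted on the observed coordinates only.
        coef = sum(u[i] * x[i] for i in observed) / denom
        resid = sum((x[i] - coef * u[i]) ** 2 for i in observed)
        if best is None or resid < best[0]:
            best = (resid, k, coef)
    if best is None:
        raise ValueError("no expert is identifiable from the observed entries")
    _, k, coef = best
    u = experts[k]
    # Keep observed entries and fill missing ones from the selected expert.
    completed = [x[i] if x[i] is not None else coef * u[i] for i in range(len(x))]
    return k, completed
```

Only the selected expert would be updated in an online variant, which is how routing limits interference between concepts.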

Collectively, these four parts provide a principled foundation for designing adaptive learning systems that efficiently leverage limited labeled data, auxiliary unlabeled samples, heterogeneous human feedback, and evolving data streams.

Degree

PHD (Doctor of Philosophy)

Keywords

Large Language Models; Transformers; In-context Learning; Reinforcement Learning; Mixture-of-Experts

Issued Date

2026-04-25

Suggested Citation

Liu, Renpu. Sample Efficient In-Context Learning and Fine-Tuning of Transformer-Based Language Models. University of Virginia, Electrical Engineering - School of Engineering and Applied Science, PHD (Doctor of Philosophy), 2026-04-25, https://doi.org/10.18130/xac6-yh38.