Abstract
Generalization is a fundamental requirement for machine learning models to function reliably in real-world environments, where test conditions often deviate from those seen during training. This challenge becomes especially acute when labeled data is limited, supervision is weak, or distributions shift across tasks and domains. These issues arise across modalities, including graph-structured data used in scientific discovery, healthcare, and neuroscience, and textual data in natural language processing. Although powerful models such as Graph Neural Networks (GNNs) and large language models (LLMs) have demonstrated strong performance in controlled settings, their generalization capacity under realistic constraints remains limited. In response, this dissertation develops generalizable learning methods that are robust to data sparsity, domain shifts, and task variability across both graph and textual modalities.
Specifically, this dissertation contributes to the advancement of machine learning through three research themes. The first theme, generalization with limited supervision, introduces frameworks such as TENT and GLITTER to enable few-shot learning for nodes and graphs, and MoD, a retrieval-based strategy for efficient LLM task adaptation using minimal input samples. The second theme, generalization under extremely weak supervision, addresses cases where even meta-training lacks sufficient labels. Here, X-FNC leverages Poisson learning and information bottlenecks to combat overfitting, while DLG and GRM enable out-of-distribution generalization through graph augmentation and generative modeling. The third theme, generalization in real-world applications, develops BrainMAP to model long-range and multi-pathway brain interactions from fMRI-derived graphs, and proposes super-relational reasoning to enhance interpretability and robustness in LLM-based knowledge graph reasoning.
Through these contributions, this dissertation advances the study of generalizable learning by relaxing traditional assumptions of abundant and homogeneous data, and by designing frameworks that remain effective under limited supervision and shifting environments. These efforts aim to bridge the gap between research models and deployment-ready systems in high-stakes, data-scarce, and dynamically evolving real-world settings.