A Journey of Personalization-Aware Collaborative Multimodal Machine Learning

Author: ORCID orcid.org/0000-0003-0217-6352
Chen, Jiayi, Computer Science - School of Engineering and Applied Science, University of Virginia
Zhang, Aidong, Computer Science, University of Virginia

There has been a surge of interest in Multimodal Machine Learning (MML) in recent years. MML focuses on building frameworks for understanding or synthesizing the multimodal world and is one of the significant fields paving the way toward artificial general intelligence. However, within MML, the practice of learning through multi-agent collaboration, namely Collaborative Multimodal Machine Learning (CoMML), remains relatively underexplored. CoMML draws inspiration from the peer-to-peer learning behavior observed in humans and aims to build frameworks that enable multiple MML agents to collaborate with each other efficiently, achieving a comprehensive understanding of the multimodal world that surpasses what any individual agent could achieve alone. This thesis conducts a systematic study of CoMML. In particular, we emphasize the importance of personalization in CoMML, which focuses on supporting the special needs and prerequisites of individual MML agents throughout all stages of collaborative learning. Achieving effective personalization while encouraging efficient collaboration in a multimodal learning system, however, poses many practical challenges. This thesis pursues the following research goals in personalization-aware CoMML. (1) First, we investigate a variety of technical challenges that user personalization brings to collaboration. The varied needs and capabilities of learning agents lead to increased knowledge diversity, which complicates the discovery of shareable knowledge among individual agents. We aim to achieve a good balance between personalization and collaboration under such knowledge heterogeneity. (2) Second, the patterns of personalization may vary dramatically depending on application scenarios, users' unique attributes and roles, locality constraints, system budgets, and other factors. This thesis investigates different personalization patterns, including modality, task, concept, and architecture preferences, explores the technical efforts each pattern requires, and proposes novel approaches for dealing with each one. (3) Third, we aim to advance the compatibility and applicability of personalization-aware CoMML frameworks in multimodal general intelligence. A broad range of modality types, including language, image, audio, video, and 3D shapes, as well as diverse types of multimodal learning tasks, are studied within our frameworks.

PHD (Doctor of Philosophy)
multimodal machine learning, collaborative learning, user personalization, heterogeneity, federated learning
All rights reserved (no additional license for public reuse)
Issued Date: