Online Archive of University of Virginia Scholarship
Informative missingness in recommender system
Author
Jin, Haiyun, Statistics - Graduate School of Arts and Sciences, University of Virginia
Advisors
Tang, Xiwei, AS-Statistics, University of Virginia
Abstract
Recommender systems have been widely adopted in areas such as electronic commerce, social media platforms, and content generators for individualized prediction and recommendation. Data sparsity is one of the main challenges in this field: usually only a very limited number of user-item interactions are observed, which leaves a large proportion of the data missing. Since users' ratings of items may depend on underlying user-specific preferences or item-specific characteristics, the missing data pattern in recommender systems is typically informative. Most existing recommender systems fail to account for this crucial information because they assume a missing-completely-at-random mechanism. In this thesis, to address this challenge, we develop new recommender system models that utilize the informative missing data pattern in two different directions. In the first direction, we propose a multi-layer matrix factorization scheme with extra layers that incorporate the informative missingness through embedding techniques. The new model combines the strengths of matrix factorization and collaborative filtering on both the user and item dimensions. Furthermore, to improve the algorithm's scalability, we present effective sampling strategies based on random walks for obtaining the embeddings of users and items with high dimensionality. Both simulation studies and real data applications show that the proposed model achieves significantly better predictive power and good computational scalability. In the second direction, we explore incorporating the missing information into the estimated propensity scores and construct an adjusted prediction of user-item ratings based on the association between the missing data mechanism and the observed ratings. Numerical studies indicate a reasonable local improvement from the missingness-based adjustment, especially for users with few observations.
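As a reader's illustration of why an informative missing data pattern matters, the following minimal sketch simulates ratings whose chance of being observed grows with the rating itself. It is not one of the models developed in the thesis. The function name `simulate`, the propensity rule `0.1 * rating`, and the sample sizes are all assumptions made for this example, and the propensities are taken as known, whereas in practice they would have to be estimated.

```python
import random


def simulate(n_pairs=10000, seed=0):
    """Compare a naive and a propensity-weighted estimate of the mean rating.

    Hypothetical setup: each user-item pair has a true rating in 1..5, and the
    pair is observed with probability 0.1 * rating, so high ratings are more
    likely to be seen (missingness is informative, not completely at random).
    """
    rng = random.Random(seed)
    all_ratings = []
    observed = []  # (rating, propensity) for pairs that end up observed
    for _ in range(n_pairs):
        rating = rng.randint(1, 5)
        propensity = 0.1 * rating  # assumed known here; estimated in practice
        all_ratings.append(rating)
        if rng.random() < propensity:
            observed.append((rating, propensity))

    true_mean = sum(all_ratings) / len(all_ratings)

    # Naive estimate: average the observed ratings, ignoring the mechanism.
    naive_mean = sum(r for r, _ in observed) / len(observed)

    # Inverse-propensity-weighted (self-normalized) estimate: each observed
    # rating is weighted by 1 / propensity to undo the selection effect.
    weight_sum = sum(1.0 / p for _, p in observed)
    ipw_mean = sum(r / p for r, p in observed) / weight_sum

    return true_mean, naive_mean, ipw_mean


true_mean, naive_mean, ipw_mean = simulate()
print(f"true={true_mean:.3f} naive={naive_mean:.3f} ipw={ipw_mean:.3f}")
```

Under these assumptions the naive average is pulled upward, because high ratings are over-represented among the observed entries, while the weighted average should land much closer to the mean over all pairs.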
Degree
PHD (Doctor of Philosophy)
Keywords
Missing Data; Embedding; Latent Factor Models; Propensity Score; Recommender System
Jin, Haiyun. Informative missingness in recommender system. University of Virginia, Statistics - Graduate School of Arts and Sciences, PHD (Doctor of Philosophy), 2022-05-06, https://doi.org/10.18130/3z4p-zh70.