Live Session
Session 4: Collaborative Filtering
Main Track
Unlocking the Hidden Treasures: Enhancing Recommendations with Unlabeled Data
Yuhan Zhao (Harbin Engineering University), Rui Chen (Harbin Engineering University), Qilong Han (Harbin Engineering University), Hongtao Song (Harbin Engineering University) and Li Chen (Hong Kong Baptist University)
Abstract
Collaborative filtering (CF) is a cornerstone of recommender systems, yet effectively leveraging the vast reservoir of unlabeled data remains a persistent challenge. Current research addresses unlabeled data by extracting a subset that closely approximates negative samples. The remaining data are overlooked, so this valuable information is never fully integrated into the construction of user preferences. To address this gap, we introduce a novel positive-neutral-negative (PNN) learning paradigm. PNN introduces a neutral class that encompasses intricate items that are difficult to categorize directly as positive or negative samples. By training a model on this triple-wise partial ranking, PNN offers a promising solution to learning complex user preferences. Through theoretical analysis, we connect PNN to one-way partial AUC (OPAUC) to validate its efficacy. Implementing the PNN paradigm is, however, technically challenging because (1) it is difficult to classify unobserved items as neutral or negative in the absence of supervisory signals, and (2) no existing loss function can handle set-level triple-wise ranking relationships. To address these challenges, we propose a semi-supervised learning method coupled with a user-aware attention model for knowledge acquisition and classification refinement. Additionally, a novel loss function and a two-step centroid ranking approach enable handling of set-level rankings. Extensive experiments on four real-world datasets demonstrate that a wide range of representative CF models consistently and significantly improve their performance when combined with PNN. Our code is publicly available at https://anonymous.4open.science/r/PNN-RecBole-4E04.
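
To make the idea of a set-level, triple-wise ranking concrete, the following is a minimal sketch and not the authors' loss function. It assumes dot-product scoring, summarizes each item set by its mean embedding (centroid), and applies two BPR-style pairwise terms so that the positive centroid scores above the neutral centroid, which in turn scores above the negative centroid. The function name `centroid_triple_loss` and all inputs are hypothetical and chosen only for illustration.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)


def centroid_triple_loss(user, pos, neu, neg):
    """Illustrative set-level loss (an assumption, not the paper's formulation).

    user: (d,) user embedding
    pos, neu, neg: (n_items, d) embeddings of the positive, neutral,
        and negative item sets for this user
    """
    # Summarize each set by its centroid.
    c_pos, c_neu, c_neg = pos.mean(axis=0), neu.mean(axis=0), neg.mean(axis=0)
    # Dot-product scores of the user against each centroid.
    s_pos, s_neu, s_neg = user @ c_pos, user @ c_neu, user @ c_neg
    # -log(sigmoid(a - b)) equals softplus(b - a); one term per ordered pair.
    return softplus(s_neu - s_pos) + softplus(s_neg - s_neu)


user = np.array([1.0, 0.0])
high = np.array([[2.0, 0.0], [2.0, 0.0]])
mid = np.array([[1.0, 0.0]])
low = np.array([[0.0, 0.0]])

ordered = centroid_triple_loss(user, pos=high, neu=mid, neg=low)
reversed_order = centroid_triple_loss(user, pos=low, neu=mid, neg=high)
print(ordered < reversed_order)
```

In this toy setup the loss is smaller when the three sets are scored in the order positive, neutral, negative than when the order is reversed, which is the behavior a triple-wise partial ranking objective is meant to reward.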