Live Session
Session 11: Optimisation and Evaluation 1
Industry
Self-Auxiliary Distillation for Sample Efficient Learning in Google-Scale Recommenders
Yin Zhang (Google DeepMind), Ruoxi Wang (Google DeepMind), Xiang Li (Google, Inc), Tiansheng Yao (Google, Inc), Andrew Evdokimov (Google, Inc), Jonathan Valverde (Google DeepMind), Yuan Gao (Google, Inc), Jerry Zhang (Google, Inc), Evan Ettinger (Google, Inc), Ed H. Chi (Google DeepMind) and Derek Zhiyuan Cheng (Google DeepMind)
Abstract
Industrial recommendation systems process billions of user feedback signals every day, and these signals are complex and noisy. Efficiently uncovering user preferences from them is crucial for high-quality recommendations. We argue that these signals are not inherently equal in informative value or learnability, which is particularly salient in industrial applications with multi-stage processes (e.g., augmentation, retrieval, ranking). With this in mind, we propose a novel self-auxiliary distillation framework that prioritizes training on high-quality labels and improves the resolution of low-quality labels through distillation, via a bilateral branch-based auxiliary task. This approach enables flexible learning from diverse labels without additional computational cost, making it highly scalable and effective for Google-scale recommenders. Our framework consistently improved key offline and online business metrics across three major Google products. Notably, self-auxiliary distillation proved highly effective in addressing the severe signal loss caused by changes such as Apple's iOS policy, and it delivered significant improvements in both offline (+17% AUC) and online metrics for a Google Apps recommendation system. This highlights the opportunity to address real-world signal-loss problems with self-auxiliary distillation techniques.
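The abstract describes the framework only at a high level. As a rough, hypothetical sketch of the idea (not the paper's actual implementation), the Python code below shows one way a shared backbone with two heads ("bilateral branches") could prioritize high-quality labels while self-distilling soft targets for low-quality ones. All names here (SelfAuxiliaryModel, self_auxiliary_loss, the quality flags) and the PyTorch setup are assumptions, not details from the paper.

    # Hypothetical sketch of self-auxiliary distillation: a shared backbone
    # with two heads. The primary head trains on hard labels, weighted toward
    # high-quality examples; the auxiliary head learns low-quality examples
    # from the primary head's detached soft predictions (self-distillation),
    # so no separate teacher model is needed.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SelfAuxiliaryModel(nn.Module):
        def __init__(self, dim_in: int, dim_hidden: int = 64):
            super().__init__()
            self.backbone = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.ReLU())
            self.primary_head = nn.Linear(dim_hidden, 1)    # hard-label branch
            self.auxiliary_head = nn.Linear(dim_hidden, 1)  # distillation branch

        def forward(self, x):
            h = self.backbone(x)
            return self.primary_head(h).squeeze(-1), self.auxiliary_head(h).squeeze(-1)

    def self_auxiliary_loss(model, x, y, quality):
        """quality: 1.0 for high-quality labels, 0.0 for low-quality ones (assumed)."""
        primary_logits, aux_logits = model(x)
        # Primary task: weight the hard-label loss toward high-quality examples.
        hard_loss = F.binary_cross_entropy_with_logits(primary_logits, y, weight=quality)
        # Auxiliary task: on low-quality examples, distill the primary head's
        # (detached) soft predictions into the auxiliary head.
        soft_targets = torch.sigmoid(primary_logits).detach()
        distill_loss = F.binary_cross_entropy_with_logits(
            aux_logits, soft_targets, weight=1.0 - quality)
        return hard_loss + distill_loss

    # Usage: one training step on a toy batch.
    model = SelfAuxiliaryModel(dim_in=16)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.randn(32, 16)
    y = torch.randint(0, 2, (32,)).float()
    quality = torch.randint(0, 2, (32,)).float()  # hypothetical label-quality flags
    loss = self_auxiliary_loss(model, x, y, quality)
    opt.zero_grad(); loss.backward(); opt.step()

Under these assumptions, the auxiliary branch reuses the backbone's forward pass and the primary head's detached outputs, so the overhead beyond the base model is a single extra linear head, which is consistent with the abstract's claim of no additional computational cost.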