Live Session
Teatro Petruzzelli
Paper
17 Oct
 
14:30
CEST
Session 16: Large Language Models 2
Add Session to Calendar 2024-10-17 02:30 pm 2024-10-17 04:20 pm Europe/Rome Session 16: Large Language Models 2 Session 16: Large Language Models 2 is taking place on the RecSys Hub. Https://recsyshub.org
Main Track

Bayesian Optimization with LLM-Based Acquisition Functions for Natural Language Preference Elicitation

View on ACM Digital Library

David Austin (University of Waterloo), Anton Korikov (University of Toronto), Armin Toroghi (University of Toronto) and Scott Sanner (University of Toronto)

View Paper PDFView Poster
Abstract

Designing preference elicitation (PE) methodologies that can quickly ascertain a user’s top item preferences in a cold-start setting is a key challenge for building effective and personalized conversational recommendation (ConvRec) systems. While large language models (LLMs) constitute a novel technology that enables fully natural language (NL) PE dialogues, we hypothesize that monolithic LLM NL-PE approaches lack the multi-turn, decision-theoretic reasoning required to effectively balance the NL exploration and exploitation of user preferences towards an arbitrary item set. In contrast, traditional Bayesian optimization PE methods define theoretically optimal PE strategies, but fail to use NL item descriptions or generate NL queries, unrealistically assuming users can express preferences with direct item ratings and comparisons. To overcome the limitations of both approaches, we formulate NL-PE in a Bayesian Optimization (BO) framework that seeks to actively elicit natural language feedback to reduce uncertainty over item utilities to identify the best recommendation. Key challenges in generalizing BO to deal with natural language feedback include determining (a) how to leverage LLMs to model the likelihood of NL preference feedback as a function of item utilities and (b) how to design an acquisition function that works for language-based BO that can elicit in the infinite space of language. We demonstrate our framework in a novel NL-PE algorithm, PEBOL, which uses Natural Language Inference (NLI) between user preference utterances and NL item descriptions to maintain preference beliefs and BO strategies such as Thompson Sampling (TS) and Upper Confidence Bound (UCB) to guide LLM query generation. We numerically evaluate our methods in controlled experiments, finding that PEBOL achieves up to 131% improvement in MAP@10 after 10 turns of cold start NL-PE dialogue compared to monolithic GPT-3.5, despite relying on a much smaller 400M parameter NLI model for preference inference.

Join the Conversation

Head to Slido and select the paper's assigned session to join the live discussion.

Conference Agenda

View Full Agenda →
No items found.