Live Session
Chamber of Commerce
Poster
15 Oct
 
8:00
CEST
Tuesday Posters
Tuesday Posters: 2024-10-15, 08:00 am to 05:30 pm (Europe/Rome). Tuesday Posters is taking place on the RecSys Hub: https://recsyshub.org
Late Breaking Results

Informed Dataset-Selection with Algorithm-Performance-Spaces

Joeran Beel (University of Siegen), Lukas Wegmeth (University of Siegen), Lien Michiels (University of Antwerp) and Steffen Schulz (University of Siegen)

Abstract

When designing recommender-systems experiments, central questions are how many and which datasets to use. So far, the community has not answered these questions. We argue that the informed selection of datasets for recommender-system research is a crucial aspect of the design of offline experiments. Ultimately, the goal of evaluating a recommender-system algorithm offline is to obtain an estimate of how well the algorithm will perform on future unknown data compared to another algorithm. In this paper, we propose one method to strategically select datasets for recommender-system experiments to obtain good generalization power to new data. Namely, we introduce the idea of "Algorithm Performance Spaces", in which datasets are plotted based on how algorithms perform on them. This makes it possible to identify diverse datasets, where "diverse" refers to how differently algorithms perform on the datasets. We do not claim to have found the final answer. We see the proposed method as one suggestion that will hopefully initiate a discussion in the community and eventually lead to accepted best practices on dataset selection.
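
To make the idea of an Algorithm Performance Space more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each dataset is represented as a point whose coordinates are the scores of two algorithms on that dataset, and it uses greedy farthest-point selection as one possible way to pick diverse datasets. The dataset names, the scores, the function name `select_diverse`, and the selection strategy itself are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical scores (e.g. nDCG) of two algorithms on five datasets.
# Each tuple is (score of algorithm 1, score of algorithm 2).
# All names and numbers are invented for illustration only.
performance = {
    "dataset_a": (0.10, 0.12),
    "dataset_b": (0.11, 0.13),
    "dataset_c": (0.40, 0.15),
    "dataset_d": (0.20, 0.45),
    "dataset_e": (0.38, 0.17),
}


def select_diverse(points, k):
    """Greedily pick up to k datasets that lie far apart in the performance space.

    This is an assumed selection strategy (farthest-point selection),
    not necessarily the one used in the paper.
    """
    names = sorted(points)
    dims = len(points[names[0]])
    # Centroid of all datasets in the performance space.
    centroid = tuple(
        sum(points[n][d] for n in names) / len(names) for d in range(dims)
    )
    # Start with the dataset that lies farthest from the centroid.
    selected = [max(names, key=lambda n: math.dist(points[n], centroid))]
    while len(selected) < min(k, len(names)):
        remaining = [n for n in names if n not in selected]
        # Add the dataset whose nearest already-selected neighbour is farthest away.
        selected.append(
            max(
                remaining,
                key=lambda n: min(math.dist(points[n], points[s]) for s in selected),
            )
        )
    return selected


chosen = select_diverse(performance, 3)
print(chosen)
```

In this toy example, datasets on which the two algorithms score almost identically (such as `dataset_a` and `dataset_b`) sit close together in the space, so a diversity-oriented selection tends to keep only one of them.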
