Title | : | Towards Plurality: Foundations for Learning from Diverse Human Preferences |
Speaker | : | Ramya Vinayak (Assistant Professor, ECE Dept., UW-Madison) |
Details | : | Mon, 20 May, 2024 11:00 AM @ RBCDSAI Seminar Hall |
Abstract | : | Large pre-trained models trained on internet-scale data are often not ready for safe deployment out of the box. They are heavily fine-tuned and aligned using large quantities of human preference data, usually elicited through pairwise comparisons. When aligning an AI/ML model to human preferences or values, it is worth asking whose preferences and values we are aligning it to. Current alignment approaches are severely limited by their inherent uniformity assumption, and the need for plurality, i.e., capturing the diversity in human preferences and values, is increasingly recognized as an important challenge in this arena. There is also a rich literature on learning preferences from human judgements using comparison queries. Such learning plays a crucial role in applications ranging from cognitive and behavioral psychology, crowdsourcing democracy, and social science surveys to recommendation systems. However, the models in this literature often focus either on learning an average preference over the population, because of the limited data available per individual, or on learning an individual's preference using a large number of queries. Furthermore, the metric, i.e., the way humans judge similarity and dissimilarity, is assumed to be known, which does not hold in practice. We aim to overcome these limitations by building mathematical foundations for learning from diverse human preferences. In this talk, I will discuss some recent results on how we can reliably capture diversity in preferences while pooling data from individuals to learn a common metric. In particular, I will talk about fundamental questions in two directions: (1) Simultaneous metric and preference learning, where the goal is to learn an unknown but shared metric from preference queries while the preferences are diverse and also unknown.
(2) Learning the distribution of preferences over a population with a single comparison query per individual. |