Title | : | Causal contextual bandits |
Speaker | : | Chandrasekar Subramanian (IITM) |
Details | : | Mon, 8 May, 2023 3:00 PM @ SSB 233 (MR-1) |
Abstract | : | We study contextual bandit settings where the agent has access to qualitative causal side information. We study two specific settings: one, where the agent can iteratively perform interventions on targeted subsets of the population, and two, where the agent has access to an offline logged dataset but can actively acquire additional data samples in one shot at a cost. These settings model a wide range of real-world scenarios better than existing methods. However, they fundamentally change the problem that the agent faces compared to standard contextual bandit settings, necessitating new techniques. Further, this is the first set of works that integrates causal side information in a contextual bandit setting, where the agent aims to learn a policy that maps contexts to arms (as opposed to just identifying one best arm). We propose new learning algorithms for both settings using novel entropy-like measures that we introduce, which help exploit the information leakage that occurs as a result of the causal graph. Using a combination of empirical evaluations and theoretical results, we show that our algorithms perform better than baselines. |