Title: Towards Better Evaluation of Natural Language Generation
Speaker: Ananya Sai B (IITM)
Details: Wed, 13 Mar 2024, 11:00 AM @ SSB-334
Abstract: Several automatic text generation, a.k.a. Natural Language Generation (NLG), applications have emerged in recent years due to advancements in machine learning research combined with the availability of large-scale data and access to powerful computing resources. It is necessary to carefully evaluate NLG models to understand the scientific progress being made. In an ideal scenario, expert humans would evaluate the outputs. However, this becomes a severe bottleneck in a rapidly developing field, since it is time-consuming and expensive. The practical alternative is to use automatic metrics. In this talk, we examine the differences between the human evaluation and automatic evaluation approaches currently in use. We propose perturbation checklists to meta-evaluate automatic metrics and provide a fine-grained analysis of their performance. Since most of these metrics are developed with an English-centric approach, we investigate potential extensions to other languages. We collect data with detailed annotations in five Indian languages and propose techniques to improve the performance and robustness of metrics. We also study the zero-shot performance of our approach to enable further extension to other related languages.