Title | : | Encode-Attend-Refine-Decode: Enriching Encoder Decoder Models with Better Context Representation for Natural Language Generation |
Speaker | : | Preksha Nema (IITM) |
Details | : | Tue, 15 May, 2018 11:00 AM @ A M Turing Hall |
Abstract | : | While off-the-shelf neural encode-attend-decode models have been applied to a wide variety of natural language generation (NLG) tasks, we show that these models can be further enriched by (i) handling specific limitations which are task agnostic, or (ii) exploiting certain characteristics which are task specific. For example, for a wide variety of NLG tasks such as machine translation and document summarization, it has been reported that neural encode-attend-decode models suffer from the problem of repeatedly generating the same phrase. We study this problem in the context of query-based abstractive summarization and propose a diversity-based attention model to alleviate the repeating-phrase problem. Using our newly introduced dataset built from Debatepedia, we demonstrate a 28% absolute improvement in ROUGE-L scores over the vanilla seq2seq model. Next, we look at exploiting task-specific characteristics to enrich neural models. Specifically, we consider the task of generating natural language descriptions from structured tables containing facts. The input has a specific structure which can be exploited when building neural models. We build models which take into consideration (i) the hierarchical organisation of facts, (ii) the need for continued attention on a fact for certain time steps, and (iii) the need to ignore facts that have already been considered. We experiment on a recently released dataset (WikiBio) which contains fact tables about people and their one-line descriptions in English. We also introduce two similar datasets for French and German. We demonstrate a 21% relative improvement over a recently proposed state-of-the-art method and a 9.9% relative improvement over the vanilla seq2seq model. |
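For readers unfamiliar with the idea of diversity-based attention mentioned in the abstract, the NumPy sketch below shows one way a diversity constraint can be imposed on attention contexts: the context vector at the current decoding step is orthogonalised against the previous step's context, discouraging the decoder from re-reading the same source content and repeating phrases. The function names and the orthogonalisation variant are illustrative assumptions, not necessarily the exact model presented in the talk.

```python
# A minimal sketch (NumPy only) of a diversity-constrained attention context.
# Illustrative assumption: dot-product attention plus Gram-Schmidt-style
# orthogonalisation against the previous context; the speaker's model may differ.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def diverse_context(enc_states, dec_state, prev_context):
    """Return attention weights and a context vector that is (nearly)
    orthogonal to the previous context.

    enc_states:   (n, d) encoder hidden states
    dec_state:    (d,)   current decoder hidden state
    prev_context: (d,)   context vector from the previous decoding step
    """
    scores = enc_states @ dec_state            # dot-product attention scores
    alpha = softmax(scores)                    # attention weights over source
    context = alpha @ enc_states               # standard context vector

    # Diversity step: remove the component of the new context that lies
    # along the previous context, so successive contexts stay distinct.
    denom = prev_context @ prev_context
    if denom > 1e-8:
        context = context - (context @ prev_context) / denom * prev_context
    return alpha, context

# Toy usage: 5 source positions, hidden size 8.
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))
s_t = rng.normal(size=8)
c_prev = rng.normal(size=8)
alpha, c_t = diverse_context(H, s_t, c_prev)
print(alpha.round(3), float(c_t @ c_prev))     # second value is ~0 by construction
```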