Title | : | On Challenges in Training Recurrent Neural Networks |
Speaker | : | Sarath Chandar (University of Montreal) |
Details | : | Thu, 12 Sep, 2019 3:00 PM @ AM Turing Hall |
Abstract | : | Modelling long-term dependencies is one of the fundamental problems in machine learning. While Recurrent Neural Networks (RNNs) can, in theory, model any long-term dependency, in practice they can only model short-term dependencies due to the problem of vanishing gradients. In the first half of the talk, I will explore the vanishing gradient problem in RNNs and propose new solutions to mitigate it. In the second half, I will discuss the challenges that arise when training RNNs in a multi-task lifelong learning setting. Specifically, I will discuss the problems of catastrophic forgetting and capacity saturation and propose a solution to overcome these challenges.
Bio | : | Sarath Chandar is a final-year Ph.D. candidate at Mila, University of Montreal, working with Yoshua Bengio and Hugo Larochelle. He is starting as an Assistant Professor at Polytechnique Montreal and Mila in Fall 2019. His research interests lie at the intersection of Deep Learning, Natural Language Processing, and Reinforcement Learning. His work includes solutions for various fundamental problems in recurrent neural networks and memory-augmented neural networks. He also works on several applications in natural language processing, including question answering and dialogue systems. Sarath is a recipient of the IBM Ph.D. Fellowship (2018-2020) and the FQRNT PBEEE scholarship (2016-2018). He has spent time at IBM Research, Twitter, and Google Brain as a research intern, and has co-organized workshops on reinforcement learning and lifelong learning at leading venues such as ICML, IJCAI, and RLDM. Sarath did his MS by Research at the Indian Institute of Technology Madras, where he received the award for the best MS thesis in Computer Science. For more details, please visit http://sarathchandar.in/
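
Note: the abstract's claim that RNNs struggle with long-term dependencies comes from the way gradients are propagated back through time. The sketch below is not from the talk; it is a minimal, illustrative NumPy example (all dimensions, scales, and the sequence length are assumed for illustration) showing how the norm of the accumulated Jacobian of a simple tanh RNN shrinks with the time lag, which is the vanishing gradient effect.

```python
# Minimal sketch (illustrative, not from the talk): vanishing gradients in a tanh RNN.
import numpy as np

rng = np.random.default_rng(0)
hidden = 50   # hidden state size (assumed)
T = 100       # sequence length (assumed)

# Recurrent weight matrix rescaled so its spectral norm is below 1,
# the regime in which back-propagated gradients shrink geometrically.
W = rng.standard_normal((hidden, hidden))
W *= 0.9 / np.linalg.norm(W, 2)

h = np.zeros(hidden)
jacobian_product = np.eye(hidden)
norms = []
for t in range(T):
    x = rng.standard_normal(hidden)
    pre = W @ h + x
    h = np.tanh(pre)
    # Jacobian of h_t with respect to h_{t-1}: diag(1 - tanh^2(pre)) @ W
    J = np.diag(1.0 - h ** 2) @ W
    jacobian_product = J @ jacobian_product
    norms.append(np.linalg.norm(jacobian_product, 2))

# This norm bounds how strongly the loss at time T can depend on the hidden
# state at time 0; it decays rapidly, so long-range dependencies are hard to learn.
print(norms[0], norms[T // 2], norms[-1])
```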