CS6370 - Natural Language Processing

Course Data:

Syllabus

  • Introduction: why NLP is hard, why NLP is useful, classical problems.
  • Words: Structure (spellcheck, morphology using FSTs).
  • Words: Semantics (Basic ideas in Lexical Semantics, WordNet and WordNet-based similarity measures, Distributional measures of similarity, Concept Mining using Latent Semantic Analysis).
  • Words: Semantics (Word Sense Disambiguation: supervised, unsupervised and semi-supervised approaches).
  • Words: Parts of Speech (POS tagging using Brill's Tagger and HMMs).
  • Sentences: Basic ideas in compositional semantics, Classical Parsing (bottom-up, top-down, Dynamic Programming: CYK parser).
  • Sentences: Parsing using Probabilistic Context Free Grammars and EM-based approaches for learning PCFG parameters.
  • Language Modelling (basic ideas, smoothing techniques)
  • Machine Translation (rule-based techniques, Statistical Machine Translation (SMT), parameter learning in SMT (IBM models) using EM)
  • Information Extraction: Introduction to Named Entity Recognition and Relation Extraction
  • Natural Language Generation: the potential of using ML for NLG
  • Additional topics: Advanced Language Modelling (including LDA), other applications such as summarization and question answering

Pre-Requisites

    None

Parameters

Credits        Type       Date of Introduction
4-0-0-0-8-12   Elective   Jun 2009

Previous Instances of the Course


© 2016 - All Rights Reserved - Dept of CSE, IIT Madras