Title | : | Studies on multi-label classification and domain adaptation for detection of phonetic features |
Speaker | : | Rupam Ojha (IITM) |
Details | : | Tue, 20 Nov, 2018 11:00 AM @ A M Turing Hall |
Abstract | : | Acoustic modeling in large vocabulary continuous speech recognition systems is commonly done by building models for subword units such as phonemes, syllables, diphones and triphones. Though the number of phoneme classes is small, phoneme recognition is a challenging task because of similarities among several phonemes. For the other types of subword units, the number of classes is very large, which makes acoustic modeling complex. Recently, approaches based on detection of phonetic features have been explored for acoustic modeling. In these approaches, phonetic features related to the speech production mechanism are used to describe each phoneme. Multi-label classification models are built to detect the phonetic features in a given speech signal. In this work, we explore deep learning models for multi-label classification for the detection of phonetic features. The detected phonetic features, along with the acoustic features, are given as input to a feedforward neural network model to perform phoneme recognition. The effectiveness of the phonetic feature detection based approach to phoneme recognition is demonstrated on benchmark datasets. A domain adaptation based technique for building phonetic feature detection models using data from more than one language is also explored. |
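
To make the multi-label framing in the abstract concrete, the following is a minimal sketch, not the speaker's implementation. The feature inventory, the phoneme-to-feature table, the logits, and all function names are hypothetical and heavily simplified. It shows how each phoneme maps to a multi-hot target over phonetic features, how independent sigmoid outputs are thresholded into detected features, and how those are concatenated with acoustic features to form the input of a phoneme recognizer.

```python
import math

# Hypothetical, simplified phonetic feature inventory (an assumption, not from the talk).
PHONETIC_FEATURES = ["voiced", "nasal", "bilabial", "alveolar", "plosive", "fricative"]

# Hypothetical phoneme-to-feature table used only for illustration.
PHONEME_FEATURES = {
    "p": {"bilabial", "plosive"},
    "b": {"voiced", "bilabial", "plosive"},
    "m": {"voiced", "nasal", "bilabial"},
    "s": {"alveolar", "fricative"},
    "z": {"voiced", "alveolar", "fricative"},
}


def multi_hot(phoneme):
    """Multi-label target: one binary entry per phonetic feature."""
    active = PHONEME_FEATURES[phoneme]
    return [1 if feature in active else 0 for feature in PHONETIC_FEATURES]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def detect_features(logits, threshold=0.5):
    """Each label gets its own sigmoid, so several features can be active at once."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def recognizer_input(acoustic, detected):
    """Concatenate acoustic features with detected phonetic features."""
    return list(acoustic) + list(detected)


# Made-up detector outputs (logits) for one frame; they happen to match /b/.
frame_logits = [2.0, -3.0, 0.5, -1.0, 4.0, -2.0]
detected = detect_features(frame_logits)
combined = recognizer_input([0.1, 0.2, 0.3], detected)
```

Unlike a softmax over phoneme classes, the per-label sigmoid lets phonemes that share features (for example /b/ and /m/, both voiced and bilabial in this toy table) share training signal for those labels.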