Title | : | Melodic Pitch Estimation in Music and Multi-pitch Estimation in Speech |
Speaker | : | Rajeev Rajan (IITM) |
Details | : | Tue, 8 Dec, 2015 2:00 PM @ BSB 361 |
Abstract | : | An important task in Music Information Retrieval (MIR) is melody extraction, which involves estimating the predominant pitch in the presence of accompaniment. Conventional pitch extraction algorithms designed for speech fail in this setting for several reasons, notably inappropriate time or frequency resolution and the presence of interfering partials. In the proposed work, we developed a modified group delay based algorithm to extract the prominent pitch from polyphonic music. The flattened power spectrum is analyzed using the modified group delay algorithm, followed by dynamic programming to ensure consistency. The results demonstrate the potential of group delay analysis in music information retrieval applications. The second task was to develop a multi-pitch estimation algorithm for concurrent speech. Most pitch tracking methods are limited to clean speech and perform worse in the presence of other speakers or noise such as channel noise. When a combination of speech utterances from two or more speakers is transmitted through a single channel, the pitch cues of the individual sources are weakened by mutual interference. The proposed method analyzes the flattened spectrum of the mixed speech using the modified group delay algorithm and detects the prominent pitch in the first pass. In the second pass, the estimated frequency component is attenuated in the flattened spectrum and the second pitch is detected. The results are on par with other state-of-the-art algorithms such as Wu’s algorithm and Jin’s algorithm. |
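
The two-pass idea in the abstract (estimate the prominent pitch, attenuate it in the spectrum, then estimate the second pitch) can be illustrated with a minimal sketch. This is not the speaker's algorithm: a simple weighted harmonic summation stands in for the modified group delay analysis, no spectral flattening is performed (the synthetic test tones are assumed to have flat harmonic amplitudes already), and there is no dynamic programming across frames. All function names, the search range, the notch width and the decay weights are hypothetical choices made for illustration only.

```python
import numpy as np


def harmonic_tone(f0, fs, n, n_harm=8, amp=1.0):
    """Synthetic source: n_harm equal-amplitude harmonics of f0 (assumed test signal)."""
    t = np.arange(n) / fs
    return amp * sum(np.sin(2 * np.pi * k * f0 * t) for k in range(1, n_harm + 1))


def harmonic_salience(mag, fs, n_fft, f0_grid, n_harm=8, decay=0.8):
    """Weighted sum of spectral magnitude at the harmonics of each candidate f0.

    Stand-in for the modified group delay analysis described in the abstract.
    """
    bin_hz = fs / n_fft
    h = np.arange(1, n_harm + 1)
    weights = decay ** (h - 1)  # decaying weights discourage sub-harmonic picks
    # Bin index of every harmonic of every candidate: shape (n_candidates, n_harm).
    bins = np.rint(np.outer(f0_grid, h) / bin_hz).astype(int)
    valid = bins < len(mag)
    bins = np.where(valid, bins, 0)
    return (mag[bins] * valid * weights).sum(axis=1)


def attenuate_harmonics(mag, fs, n_fft, f0, half_width_bins=3, gain=0.0):
    """Scale down a few bins around every harmonic of f0 up to the Nyquist frequency."""
    out = mag.copy()
    bin_hz = fs / n_fft
    n_harm = int((fs / 2) // f0)
    for k in range(1, n_harm + 1):
        centre = int(round(k * f0 / bin_hz))
        lo = max(centre - half_width_bins, 0)
        hi = min(centre + half_width_bins + 1, len(out))
        out[lo:hi] *= gain
    return out


def two_pass_pitch(frame, fs, fmin=80.0, fmax=350.0, step=1.0):
    """Return (first, second) pitch estimates in Hz for a single frame."""
    n_fft = len(frame)
    mag = np.abs(np.fft.rfft(frame * np.hanning(n_fft)))
    f0_grid = np.arange(fmin, fmax + step, step)
    # Pass 1: the most salient candidate is taken as the prominent pitch.
    first = f0_grid[np.argmax(harmonic_salience(mag, fs, n_fft, f0_grid))]
    # Pass 2: attenuate that pitch's harmonics, then search the residual again.
    residual = attenuate_harmonics(mag, fs, n_fft, first)
    second = f0_grid[np.argmax(harmonic_salience(residual, fs, n_fft, f0_grid))]
    return float(first), float(second)


if __name__ == "__main__":
    fs, n = 8000, 4096
    # Two overlapping synthetic "speakers"; the second one is weaker.
    mix = harmonic_tone(200.0, fs, n) + harmonic_tone(310.0, fs, n, amp=0.6)
    print(two_pass_pitch(mix, fs))
```

Running the script prints the two estimates for a synthetic mixture of a 200 Hz and a weaker 310 Hz harmonic tone. Real concurrent speech would additionally need the flattening step, frame-by-frame tracking and a voicing decision, none of which are sketched here.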