NLP based Voice Modulation using Mel Frequency Cepstral Coefficient (MFCC) and Convolution Network |
Author(s): |
Harsh Joshi, Dehradun Institute of Technology; Harshita Agrawal, Dehradun Institute of Technology; Ashish Chaturvedi, Dehradun Institute of Technology; Bharat Kumar Singh, Dehradun Institute of Technology; Dr. Sandeep Sharma, Dehradun Institute of Technology |
Keywords: |
Natural Language Processing, Audio Fingerprinting, MFCC, Perceptual Hash, Deep Learning, Speech-to-Text Conversion |
Abstract |
Speech processing is a challenging task because speech signals are stochastic and the data involved are high-dimensional. In this paper we propose a novel technique for extracting speech characteristics and mapping the differences in them between speakers to generate a modulation vector. Using speech-to-text conversion, natural language processing for phoneme-based segmentation, and this modulation vector, we aim to modulate the speech characteristics of the host speaker so that the voice content (audio signal) is delivered in the target speaker’s voice. |
Other Details |
Paper ID: NCILP041; Published in: Conference 1: NCIL 2015; Publication Date: 16/10/2015; Page(s): 164-169 |