
NLP based Voice Modulation using Mel Frequency Cepstral Coefficient (MFCC) and Convolution Network


Harsh Joshi, Dehradun Institute of Technology; Harshita Agrawal, Dehradun Institute of Technology; Ashish Chaturvedi, Dehradun Institute of Technology; Bharat Kumar Singh, Dehradun Institute of Technology; Dr. Sandeep Sharma, Dehradun Institute of Technology


Natural Language Processing, Audio Fingerprinting, MFCC, Perceptual Hash, Deep Learning, Speech to text conversion


Speech processing is a challenging task because speech data is stochastic and high-dimensional. In this paper we propose a novel technique that extracts speech characteristics and maps the differences in them between speakers to generate a modulation vector. Using speech-to-text conversion, natural language processing for phoneme-based segmentation, and this modulation vector, we aim to modulate the speech characteristics of a host speaker so that voice content (an audio signal) is delivered in the target speaker's voice.
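
The abstract does not define how the modulation vector is computed, so the following is only an illustrative sketch under stated assumptions: it treats the "speech characteristics" as frame-averaged MFCCs and the modulation vector as the per-coefficient difference between the target and host averages. The function names (`mean_mfcc`, `modulation_vector`, `tone`), the frame sizes, and the synthetic test tones are all hypothetical and not taken from the paper; the convolution network and the phoneme-based segmentation are not modelled here.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..N/2) of a Hamming-windowed frame."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_energies(spec, sample_rate, n_filters):
    """Apply triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = len(spec)
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    energies = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        total = 0.0
        for k in range(n_bins):
            f = k * (sample_rate / 2.0) / (n_bins - 1)  # centre frequency of bin k
            w = min((f - lo) / (mid - lo), (hi - f) / (hi - mid))
            if w > 0.0:
                total += w * spec[k]
        energies.append(total)
    return energies

def mfcc_frames(signal, sample_rate, frame_len=256, hop=128,
                n_filters=20, n_coeffs=13):
    """MFCCs per frame: power spectrum -> mel filterbank -> log -> DCT-II."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        log_e = [math.log(e + 1e-10)
                 for e in mel_energies(spec, sample_rate, n_filters)]
        frames.append([
            sum(v * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, v in enumerate(log_e))
            for c in range(n_coeffs)
        ])
    return frames

def mean_mfcc(signal, sample_rate):
    """Assumed speaker characteristic: MFCCs averaged over all frames."""
    frames = mfcc_frames(signal, sample_rate)
    return [sum(col) / len(frames) for col in zip(*frames)]

def modulation_vector(host, target, sample_rate):
    """Assumed definition: target mean MFCC minus host mean MFCC."""
    h = mean_mfcc(host, sample_rate)
    t = mean_mfcc(target, sample_rate)
    return [b - a for a, b in zip(h, t)]

def tone(freq, sample_rate=8000, n=1024):
    """Synthetic stand-in for a speaker recording (hypothetical test data)."""
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]

if __name__ == "__main__":
    host, target = tone(220.0), tone(440.0)
    print(modulation_vector(host, target, 8000))
```

Under this assumed definition, adding the vector to the host's mean MFCCs reproduces the target's mean MFCCs. Turning shifted coefficients back into audio is a separate problem that this sketch does not address.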

Other Details

Paper ID: NCILP041
Published in: Conference 1 : NCIL 2015
Publication Date: 16/10/2015
Page(s): 164-169
