NLP based Voice Modulation using Mel Frequency Cepstral Coefficient (MFCC) and Convolution Network

Author(s):

Harsh Joshi, Dehradun Institute of Technology; Harshita Agrawal, Dehradun Institute of Technology; Ashish Chaturvedi, Dehradun Institute of Technology; Bharat Kumar Singh, Dehradun Institute of Technology; Dr. Sandeep Sharma, Dehradun Institute of Technology

Keywords:

Natural Language Processing, Audio Fingerprinting, MFCC, Perceptual Hash, Deep Learning, Speech to text conversion

Abstract

Speech processing is a challenging task, owing to the stochastic and high-dimensional nature of speech data. In this paper we propose a novel technique for extracting speech characteristics and mapping the differences in them between speakers to generate a modulation vector. Using speech-to-text conversion, natural language processing for phoneme-based segmentation, and this modulation vector, we aim to modulate the speech characteristics of a host speaker so that voice content (an audio signal) is delivered in the target speaker’s voice.
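
The abstract does not define how the modulation vector is computed, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It computes MFCCs with a textbook pipeline (framing, Hamming window, power spectrum, mel filterbank, log, DCT-II) and then assumes, purely for illustration, that a modulation vector could be the difference between the mean MFCC vectors of the target and host speakers. All function names and parameter values (sample rate, frame size, number of filters) are hypothetical choices.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters whose centers are evenly spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_filters=26, n_coeffs=13):
    # Split the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    # Power spectrum of each frame, then mel-band energies on a log scale.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    log_energy = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II over the mel bands; keep the first n_coeffs.
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return log_energy @ basis.T  # shape: (n_frames, n_coeffs)

def modulation_vector(host, target, sr=16000):
    # ASSUMPTION: difference of time-averaged MFCCs (target minus host).
    # The paper's actual definition is not given in the abstract.
    return mfcc(target, sr).mean(axis=0) - mfcc(host, sr).mean(axis=0)

# Usage with synthetic tones standing in for two speakers' recordings.
t = np.arange(16000) / 16000.0
host_audio = np.sin(2 * np.pi * 120.0 * t)
target_audio = np.sin(2 * np.pi * 220.0 * t)
vec = modulation_vector(host_audio, target_audio)
```

A per-utterance mean is the simplest possible summary; the phoneme-based segmentation mentioned in the abstract suggests the authors compare speakers at a finer granularity, which this sketch does not attempt.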

Other Details

Paper ID: NCILP041
Published in: Conference 1 : NCIL 2015
Publication Date: 16/10/2015
Page(s): 164-169
