Voice Tone Analyzer Using ML
Abstract
Speech is the most natural and fastest method of communication between humans. Through speech one
can express oneself and one's ideas. Speech tone reflects the emotion, mental state, attitude, and
expression of the speaker. In recent years, the evolution of technologies such as machine learning,
computer vision, and speech analysis has equipped computers with the ability to react and respond in real
time and to interact with users much as humans interact with one another. A key factor enabling such
interactions is the machine's ability to interpret voice tone. Voice tone interpretation is a substantial
field emerging from artificial intelligence and machine learning: it reveals what the speaker actually
feels and plays a very important role in conveying expression. The versatility of the human voice and its
ability to carry a wide range of emotions make it a rich source of data, since the speech signal borne by
voice tone contains numerous linguistic and paralinguistic parameters that enable a machine to detect the
mental and emotional state of the user.
Many speech emotion recognition systems have been developed by various researchers. Speech features are
particularly useful for differentiating emotions, and they are obtained by deploying feature extraction
techniques based on Mel-Frequency Cepstral Coefficients (MFCC); each feature contributes independently to
distinguishing the user's emotion and to classifying gender. Numerous datasets are available for speech
emotions, their modeling, and their types, which aid in characterizing the speech. After feature extraction,
the next crucial step is the classification of speech emotions such as sadness, neutrality, happiness,
surprise, and anger.
The input voice tone is analyzed and the corresponding emotion is predicted. The classification task is
performed by a Convolutional Neural Network (CNN) model; a brief sketch of this pipeline is given after the
abstract. Identifying emotions through voice and speech signal analysis finds applications in supporting
dynamic human-machine interaction and in spoken-language dialogue systems such as call center conversations,
customer care, and onboard vehicle driving systems. Thus, extracting emotions by means of speech analysis
has practical validity and can help enhance human communication and persuasion skills.
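The following is a minimal illustrative sketch, not the exact implementation described in the paper. It assumes the librosa and TensorFlow/Keras libraries and shows MFCC feature extraction from an utterance followed by a small one-dimensional CNN that predicts one of several emotion classes; the number of MFCC coefficients, the emotion classes, the network layout, and the file paths in the usage comments are assumptions made for illustration only.

    # Illustrative sketch: MFCC feature extraction + a small 1-D CNN emotion classifier.
    import numpy as np
    import librosa
    import tensorflow as tf
    from tensorflow.keras import layers, models

    N_MFCC = 40        # number of MFCC coefficients per frame (assumed)
    NUM_CLASSES = 5    # e.g. sadness, neutrality, happiness, surprise, anger (assumed)

    def extract_mfcc(path, n_mfcc=N_MFCC):
        """Load an audio file and return a fixed-length MFCC feature vector."""
        signal, sr = librosa.load(path, sr=22050)
        mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
        # Average over time frames to obtain one vector per utterance.
        return np.mean(mfcc, axis=1)

    def build_model(n_mfcc=N_MFCC, num_classes=NUM_CLASSES):
        """A small 1-D CNN over the MFCC vector, ending in a softmax over emotions."""
        model = models.Sequential([
            layers.Input(shape=(n_mfcc, 1)),
            layers.Conv1D(64, kernel_size=5, activation="relu", padding="same"),
            layers.MaxPooling1D(pool_size=2),
            layers.Conv1D(128, kernel_size=5, activation="relu", padding="same"),
            layers.GlobalAveragePooling1D(),
            layers.Dense(64, activation="relu"),
            layers.Dropout(0.3),
            layers.Dense(num_classes, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model

    # Example usage with hypothetical file paths and integer emotion labels:
    # X = np.array([extract_mfcc(p) for p in wav_paths])[..., np.newaxis]
    # y = np.array(labels)
    # model = build_model()
    # model.fit(X, y, epochs=30, batch_size=32, validation_split=0.2)

Averaging the MFCCs over time keeps the example compact; a system closer to the one described in the abstract could instead feed the full time-frequency MFCC matrix to a two-dimensional CNN.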