Emotion plays a vital role in communication: emotions and gestures are said to account for about 80% of what is conveyed. Emotion recognition and classification are now used to understand human feelings in many areas, including robotics, health care, the military, home automation, hands-free computing, mobile telephony, video games, call-center systems, and marketing. Speech emotion recognition (SER) can improve interaction between humans and machines. Various algorithms, and combinations of algorithms, are available to recognize and classify audio according to the emotion it carries. In this paper, we review significant works in the field, their techniques, the impact of their approaches, and the scope for improving their results.