Human emotion recognition has become important because of its usefulness in everyday applications that require human-computer interaction. It is a complex problem because customs, traditions, and dialects differ across ethnic groups, regions, and communities. The problem is exacerbated by the difficulty of assessing emotion objectively, since emotion arises unconsciously. This research conducts an experiment to discover emotion patterns from features extracted from speech. The feature extraction method used is Mel-Frequency Cepstral Coefficients (MFCC), a method that approximates the human hearing system. The dataset used is the Berlin Database of Emotional Speech (Emo-DB). The emotions used in the experiment are happiness, boredom, neutral, sadness, and anger. For each emotion, 3 samples from Emo-DB are taken as experimental subjects. The emotion patterns become visible with the following MFCC parameter values: 25 for frame duration, 10 for frame shift, 0.97 for the pre-emphasis coefficient, 20 filterbank channels, and 12 cepstral coefficients. MFCC features are extracted and their mean values are calculated. These mean values are then plotted against time frames and examined to find the specific pattern that appears for each emotion. Keywords: Emotion, Speech, Mel-Frequency Cepstral Coefficients (MFCC).
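The first MFCC stages implied by these parameters can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the frame duration and frame shift are in milliseconds and that the audio is sampled at 16 kHz, neither of which the abstract states. The function names are hypothetical, and the filterbank and cepstral stages are omitted.

```python
def pre_emphasize(signal, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1]; the first sample passes through unchanged."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - coeff * signal[n - 1] for n in range(1, len(signal))
    ]


def split_into_frames(signal, sample_rate=16000, frame_ms=25, shift_ms=10):
    """Cut the signal into overlapping frames; a trailing partial frame is dropped.

    sample_rate, and the millisecond unit of frame_ms and shift_ms,
    are assumptions made for this sketch.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += shift
    return frames


def frame_means(vectors):
    """Mean of each per-frame vector, e.g. of 12 cepstral coefficients per frame."""
    return [sum(v) / len(v) for v in vectors]


# One second of a constant toy signal at the assumed 16 kHz rate.
toy_signal = [1.0] * 16000
toy_frames = split_into_frames(pre_emphasize(toy_signal))
```

Under these assumptions each frame holds 400 samples and consecutive frames start 160 samples apart. Plotting `frame_means` of the cepstral vectors against the frame index would give the kind of time-frame graph the abstract describes.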
<p class="Judul2"><strong><em>Abstract</em></strong></p><p class="Abstrak"><em>Speech is a highly complex signal that carries many kinds of information. The information that can be captured from speech includes the message to the interlocutor, the identity of the speaker, the language, and even the speaker's emotions, conveyed without the speaker realizing it. Speech processing is a branch of digital signal processing that aims to achieve natural interaction between humans and machines. Emotional characteristics are features contained in speech that carry the traits of the speaker's emotions. Linear Predictive Coding (LPC) is a method for extracting features in signal processing. This research uses LPC for feature extraction and the Euclidean distance method to identify emotions based on the features obtained from LPC. The study uses data for the emotions anger, sadness, happiness, neutral, and boredom. The data were taken from Berlin Emo-DB, using three different sentences spoken by different actors. The research achieved an accuracy of 58.33% for sadness, 50% for neutral, 41.67% for anger, and 8.33% for happiness, while boredom could not be recognized. Using LPC for feature extraction gave poor results in this study, because the average accuracy across all emotions was only 31.67%. Voice data with different sentences, actors, ages, and accents can influence emotion recognition, so feature extraction is very important in recognizing the speech patterns of human emotions.
The accuracy obtained in this study is still very low and could be improved by using other features such as prosodic, spectral, and voice-quality features, together with statistical parameters such as max, min, mean, median, kurtosis, and skewness. The choice of classification method can also affect the emotion recognition results.</em></p>
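A Euclidean distance classifier of the kind named in this abstract can be sketched as below. This is only an illustration under assumptions: the abstract does not say how the reference vectors were built, so the `references` values here are made-up toy numbers, and the function names are hypothetical. The last lines check that the reported per-emotion accuracies average to the stated 31.67%.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features, references):
    """Return the label whose reference vector lies closest to `features`."""
    return min(
        references,
        key=lambda label: euclidean_distance(features, references[label]),
    )


# Toy reference vectors (made up for illustration, not real LPC features).
references = {"sad": [0.0, 0.0], "angry": [1.0, 1.0]}
predicted = classify([0.1, 0.2], references)

# Per-emotion accuracies as reported in the abstract, in percent.
per_emotion_accuracy = {
    "sad": 58.33,
    "neutral": 50.0,
    "angry": 41.67,
    "happy": 8.33,
    "bored": 0.0,
}
average_accuracy = sum(per_emotion_accuracy.values()) / len(per_emotion_accuracy)
```

Because the decision depends only on distances to stored references, the quality of the extracted features largely determines the result, which fits the abstract's emphasis on feature extraction.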
Deep learning is an essential machine learning technique for classification problems, based on artificial neural networks. A general issue in deep learning is that it is data-hungry: training a model requires a large amount of data. Wayang is a shadow puppet theater art from Indonesia, especially prominent in Javanese culture. Several of its characters are hard to tell apart. In this paper, we propose steps and techniques for classifying the characters and for handling the small size of the wayang dataset, using model selection, transfer learning, and fine-tuning to obtain efficient training and high accuracy. The research used 50 images for each class, with a total of 24 wayang character classes. We collected and implemented various architectures, from early deep learning models to the latest state-of-the-art models. Transfer learning and fine-tuning produced a significant increase in both training accuracy and validation accuracy. With transfer learning, it was possible to build a deep learning model with good classification performance in a short time on a small dataset. Both EfficientNetB0 and MobileNetV3-small reached 100% training accuracy, with validation accuracies of 98.33% and 98.75%, respectively.
This study seeks to identify human emotions using artificial neural networks. Emotions are difficult to understand and hard to measure quantitatively. Emotions may be reflected in facial expressions and tone of voice. The voice has physical properties that are unique to each speaker: everyone has a different timbre, pitch, tempo, and rhythm. The geographical area where someone lives may affect how that person pronounces words and expresses certain emotions. The identification of human emotions is useful in the field of human-computer interaction. It helps in developing software interfaces for use in community service centers, banks, education, and other settings. This research proceeds in three stages, namely data collection, feature extraction, and classification. We obtain data in the form of audio files from the Berlin Emo-DB database. The files contain human voices that express five emotions: angry, bored, happy, neutral, and sad. Feature extraction is applied to all audio files using Mel-Frequency Cepstral Coefficients (MFCC). Classification uses a Multi-Layer Perceptron (MLP), which is one of the artificial neural network methods. The MLP classification proceeds in two phases, namely training and testing. The MLP classification yields good emotion recognition. Classification using 100 hidden layer nodes gives an average accuracy of 72.80%, an average precision of 68.64%, an average recall of 69.40%, and an average F1-score of 67.44%.
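The four reported metrics can be computed from a confusion matrix, as in the sketch below. The abstract does not state how the averages were taken, so this assumes macro averaging (an unweighted mean over classes) for precision, recall, and F1. The function name and the toy matrix are hypothetical.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls, f1_scores = [], [], []
    for k in range(n):
        true_pos = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = true_pos / predicted_k if predicted_k else 0.0
        recall = true_pos / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Toy two-class matrix (made up for illustration).
toy_result = macro_metrics([[2, 0], [1, 1]])
```

With five emotion classes the same function would take a 5x5 matrix. Macro averaging weights every emotion equally, which matters when some emotions have fewer recordings than others.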
Achievement is a result attained by a person in any field. In education, achievement is often tied to academic grades, which serve as the benchmark for judging whether a student is academically accomplished. Processing the data manually takes a long time, so a computational system is needed to help predict achievement. The data were taken from MAN Model Palangka Raya and cover eleven subjects: final semester exam (UAS) scores from MTs (junior secondary level) and average first-semester report card grades from MA (senior secondary level). For the backpropagation artificial neural network, the data were normalized to the narrow interval [0.1, 0.9], while the data for the fuzzy inference system were the original values multiplied by 10. Tests were then carried out with the artificial neural network and the fuzzy inference system, and the results were compared. Based on the tested data, the backpropagation artificial neural network produced a prediction percentage of 100% for student achievement, using an architecture with one hidden layer and the optimal parameters MSE = 0.0001, learning rate = 0.9, and momentum = 0.4. Prediction with the Mamdani fuzzy inference system, using the S-curve and the bell curve (PI curve), produced a percentage of 83.8%.
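Normalization to the interval [0.1, 0.9] can be sketched as follows. The abstract gives only the target interval, so the linear min-max formula used here is an assumption (it is the common choice for backpropagation networks), and the function name and sample grades are hypothetical.

```python
def scale_to_interval(values, low=0.1, high=0.9):
    """Linearly map values so that min(values) -> low and max(values) -> high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # All values identical: place them at the middle of the interval.
        return [(low + high) / 2 for _ in values]
    span = high - low
    return [low + span * (v - v_min) / (v_max - v_min) for v in values]


# Toy grades (made up for illustration).
grades = [60, 80, 100]
normalized = scale_to_interval(grades)  # approximately [0.1, 0.5, 0.9]
```

Keeping the values away from 0 and 1 is a common precaution when the network uses sigmoid activations, whose outputs approach those limits only asymptotically.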