A large number of hotel reviews on the Internet need to be evaluated to turn the data into actionable information. Deep learning is well suited to recognizing patterns in this type of data, and advances in deep learning have produced many algorithms that can be applied to sentiment analysis tasks. In this study, we compare the performance of classical machine learning algorithms, namely logistic regression (LR), naïve Bayes (NB), and support vector machine (SVM), with that of a deep learning algorithm, a convolutional neural network (CNN), using the Word2Vec model to classify hotel reviews from the Traveloka website as positive or negative. Hyperparameter tuning is applied to both kinds of learning method to determine the parameters that produce the best model. The Word2Vec model uses the skip-gram architecture, hierarchical softmax, and 100-dimensional vectors. The highest average accuracy, 98.08%, was obtained by the CNN with a dropout of 0.2, Tanh as the convolution activation, softmax as the output activation, and Adam as the optimizer. The findings demonstrate that combining the Word2Vec model with the CNN model obtains significantly better accuracy than the classical machine learning methods.
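As a rough illustration of the skip-gram setting mentioned in this abstract, the sketch below shows how skip-gram forms (center, context) training pairs from a tokenized review. The window size of 2, the sample sentence, and the function name `skipgram_pairs` are illustrative assumptions; the abstract itself specifies only skip-gram, hierarchical softmax, and 100-dimensional vectors, and it does not describe the training code.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                 # left edge of the context window
        hi = min(len(tokens), i + window + 1)   # right edge (exclusive)
        for j in range(lo, hi):
            if j != i:                          # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical review text, used only to exercise the function.
review = "the room was clean and quiet".split()
pairs = skipgram_pairs(review, window=2)
```

Each pair is one training example in which the model learns to predict the context word from the center word; the learned input weights are the word vectors that a downstream classifier would consume.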
Waste is a global problem faced by countries around the world, including Indonesia. If it is not managed properly, the growing variety and volume of waste can harm the environment and human health. Waste sorting is the first step in the various kinds of waste processing. In Indonesia, waste sorting is still performed manually, which is difficult given the very large amount of waste, so automated waste sorting is needed. This study proposes waste image classification using a Support Vector Machine (SVM) with Gray Level Co-Occurrence Matrix (GLCM) and Color Moments features, and tunes the classifier for the best performance. The TrashNet dataset is used to evaluate the proposed method, and 30 additional waste images from outside TrashNet are used as new test data. The important parameters tuned in this study are the GLCM angle orientation, the C (soft margin) parameter of the SVM, and the γ parameter of the Radial Basis Function (RBF) kernel. Data splitting is done using 10-fold cross-validation. The results show that combining GLCM features at a 45° angle orientation with Color Moments gives the best average accuracy, 78.87%, with C = 32 and γ = 4. The best single-fold result was obtained on the third fold, with an accuracy of 85.43%. The model from this fold was then used to classify the 30 waste images from outside TrashNet and achieved an accuracy of 70%.
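The two feature families named in this abstract can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it builds a GLCM for the 45° neighbor (one row up, one column to the right, at distance 1), derives one texture statistic from it, and computes the three usual color moments for a single channel. The choice of contrast as the GLCM statistic, the non-symmetric counting, the tiny 3×3 test image, and all function names are assumptions made for illustration.

```python
import math


def glcm_45(gray, levels):
    """Normalized co-occurrence matrix for the 45-degree neighbor.

    `gray` is a 2D list of integer gray levels in [0, levels) and needs
    at least two rows and two columns.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(gray), len(gray[0])
    for r in range(1, rows):
        for c in range(cols - 1):
            # Pair each pixel with its upper-right neighbor.
            counts[gray[r][c]][gray[r - 1][c + 1]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def color_moments(channel):
    """Mean, standard deviation, and skewness of one flattened color channel."""
    n = len(channel)
    mean = sum(channel) / n
    var = sum((x - mean) ** 2 for x in channel) / n
    third = sum((x - mean) ** 3 for x in channel) / n
    # Signed cube root of the third central moment.
    skew = math.copysign(abs(third) ** (1 / 3), third)
    return mean, math.sqrt(var), skew


# Hypothetical 3x3 image quantized to 3 gray levels.
gray = [[0, 0, 1],
        [1, 2, 2],
        [0, 1, 2]]
contrast = glcm_contrast(glcm_45(gray, levels=3))
mean, std, skew = color_moments([0, 0, 0, 4])
```

In a full pipeline, statistics like these would be concatenated into one feature vector per image and passed to an RBF-kernel SVM, with C and γ selected by cross-validation as the abstract describes.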
Solid waste has become a serious issue for countries around the world, since the amount of solid waste generated increases annually. As part of the effort to reduce and reuse solid waste, classification of solid waste images is needed to support automatic waste sorting. In image classification, image segmentation and feature extraction play important roles. This research applies a recent deep learning-based segmentation method, the pyramid scene parsing network (PSPNet). We also try various combinations of image features (color, texture, and shape) to find the best combination. For comparison, we also run experiments without segmentation to see the effect of PSPNet. A support vector machine (SVM) is then applied as the classification algorithm. The experimental results indicate that applying segmentation generally provides a better source for feature extraction, especially for color and shape features, and hence increases the accuracy of the classifier. Color is observed to be the most important feature in this problem, but the accuracy of the classifier increases further when additional features are introduced. The highest accuracy, 76.49%, is achieved when PSPNet segmentation is applied and all features are combined.
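One way to see why segmentation can improve color features is to compare a color statistic computed over the whole image with the same statistic restricted to the segmented object. The sketch below assumes a binary mask of the kind a segmentation model would output; it does not implement PSPNet, and the function name `masked_mean_color`, the mean-color feature, and the 2×2 example image are all hypothetical.

```python
def masked_mean_color(pixels, mask):
    """Average (R, G, B) over the pixels whose mask value is truthy."""
    selected = [px
                for row_px, row_m in zip(pixels, mask)
                for px, m in zip(row_px, row_m) if m]
    if not selected:
        raise ValueError("mask selects no pixels")
    n = len(selected)
    return tuple(sum(px[k] for px in selected) / n for k in range(3))


# Hypothetical 2x2 image: a reddish object in the left column on a white background.
pixels = [[(200, 10, 10), (255, 255, 255)],
          [(220, 30, 10), (255, 255, 255)]]
object_mask = [[1, 0],
               [1, 0]]
full_mask = [[1, 1],
             [1, 1]]

segmented = masked_mean_color(pixels, object_mask)   # object pixels only
unsegmented = masked_mean_color(pixels, full_mask)   # background included
```

With the mask applied, the feature reflects only the object's color; without it, the white background pulls every channel toward 255 and blurs the difference between waste classes.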