2019
DOI: 10.14710/jtsiskom.8.1.2020.54-58

Comparison of distance measurement on k-nearest neighbour in textual data classification

Abstract: One algorithm for classifying textual data in automatic document-organizing applications is KNN, which converts word representations into vectors. The distance calculation in the KNN algorithm is essential for measuring the closeness between data elements. This study compares four distance calculations commonly used in KNN, namely Euclidean, Chebyshev, Manhattan, and Minkowski. The dataset comprises 448 comments collected from Eminem's YouTube videos. This study showed that Euclidean or Minkowski on the KNN …
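The four distance calculations the abstract compares can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the toy vectors stand in for vectorized comment text. Note that Minkowski with p = 2 reduces to Euclidean, which is consistent with the abstract reporting the two together.

```python
def minkowski(a, b, p):
    """Generalized Minkowski distance; p=1 gives Manhattan, p=2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def euclidean(a, b):
    return minkowski(a, b, 2)

def manhattan(a, b):
    return minkowski(a, b, 1)

def chebyshev(a, b):
    """Limit of Minkowski as p grows: the largest per-axis difference."""
    return max(abs(x - y) for x, y in zip(a, b))

# Two toy term-frequency vectors standing in for vectorized comments
u = [1, 0, 2, 3]
v = [0, 1, 2, 1]

print(euclidean(u, v))  # sqrt(6) ≈ 2.449
print(manhattan(u, v))  # 4
print(chebyshev(u, v))  # 2
```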

Cited by 15 publications (12 citation statements). References 11 publications.
“…The effectiveness of the K-NN algorithm's classification results is also influenced by selecting the proper distance metric, such as Euclidean or Manhattan, since it changes how the clusters form [13]. Studies in this discussion of textual data classification show that the Euclidean distance metric yields the best performance (accuracy of 85.5%) compared to the Manhattan distance (accuracy of 85.48%) [14]. Research on stroke disease detection shows that the Manhattan distance performs better in classification than the Euclidean distance, with an accuracy of 96.03% against 95.93% [15].…”
Section: I (mentioning)
confidence: 95%
“…Before grouping the data for the detection process, the distance measure between data elements is defined first. In various applications, different distance measurement methods are used to assess the degree of similarity between data, such as the Euclidean, Manhattan (City Block), Mahalanobis, Correlation, Angle-based, Minkowski, and Squared Euclidean distances [3].…”
Section: Introduction (unclassified)
“…In the classification phase, the same features are computed for the test data (whose class is unknown). After the distances from the new vector to all training vectors are computed and the K nearest ones are selected, the classification is determined from those points (Wahyono et al., 2020). The K-Nearest Neighbor (KNN) algorithm is a method for classifying objects based on the training data closest to those objects.…”
Section: Introduction (unclassified)
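The classification phase described above (compute distances from the new vector to all training vectors, take the K nearest, decide by majority vote) can be sketched end to end. The training set and labels here are hypothetical toy data, and `math.dist` (Euclidean distance, Python 3.8+) stands in for whichever metric is chosen.

```python
import math
from collections import Counter

# Hypothetical training set: 2-D feature vectors with known classes
train = [((1.0, 1.0), "spam"), ((1.2, 0.8), "spam"),
         ((4.0, 4.0), "ham"),  ((4.2, 3.9), "ham")]

def classify(query, k=3):
    # Distance from the new vector to every training vector
    dists = [(math.dist(query, vec), label) for vec, label in train]
    # Take the K nearest and decide the class by majority vote
    nearest = sorted(dists)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

print(classify((1.1, 0.9)))  # "spam"
print(classify((3.8, 4.1)))  # "ham"
```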