Evaluasi Daftar Stopword Bahasa Indonesia

Rahutomo, Faisal; Ririd, Ariadi Retno Tri Hayati

doi:10.25126/jtiik.2019611226

Cited by 9 publications

(10 citation statements)

References 3 publications


“…At first, the stop words (meaningless words that frequently appear in a sentence) are removed for effectiveness [16]. As our dataset is mainly Indonesian, the stop words are the Indonesian ones as defined by Rahutomo and Ririd [17]. Secondly, the text is tokenized to form a sequence of words [18] to avoid trivial mismatches caused by meaningless characters.…”

Section: Methods (mentioning)

confidence: 99%

Thesis Supervisor Recommendation with Representative Content and Information Retrieval

Wijanto

Rachmadiany

Karnalim

2020

JISEBI


Background: In higher education in Indonesia, students are often required to complete a thesis under the supervision of one or more lecturers. Allocating a supervisor is not an easy task as the thesis topic should match a prospective supervisor’s field of expertise. Objective: This study aims to develop a thesis supervisor recommender system with representative content and information retrieval. The system accepts a student thesis proposal and replies with a list of potential supervisors in a descending order based on the relevancy between the prospective supervisor’s academic publications and the proposal. Methods: Unique to this, supervisor profiles are taken from previous academic publications. For scalability, the current research uses the information retrieval concept with a cosine similarity and a vector space model. Results: According to the accuracy and mean average precision (MAP), grouping supervisor candidates based on their broad expertise is effective in matching a potential supervisor with a student. Lowercasing is effective in improving the accuracy. Considering only top ten most frequent words for each lecturer’s profile is useful for the MAP. Conclusion: An arguably effective thesis supervisor recommender system with representative content and information retrieval is proposed.
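The retrieval idea described in this abstract and in the citation statement above (stop word removal, tokenization, lowercasing, top-ten frequent words, cosine similarity over a vector space model) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stop word sample, the function names, and the use of raw term frequencies as vector weights are all assumptions, and the sketch filters stop words after tokenizing rather than before.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop word sample. A real system would load a full
# Indonesian list such as the one evaluated by Rahutomo and Ririd.
STOPWORDS = {"yang", "dan", "di", "untuk", "dengan", "pada"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def profile(text, top_k=None):
    """Build a term-frequency vector, optionally keeping only the top_k terms."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    if top_k is not None:
        counts = Counter(dict(counts.most_common(top_k)))
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_supervisors(proposal, publications, top_k=10):
    """Rank supervisors by similarity between the proposal and their publications.

    `publications` maps a supervisor name to the concatenated text of that
    supervisor's publications (an assumed input format).
    """
    query = profile(proposal)
    scores = {
        name: cosine(query, profile(text, top_k))
        for name, text in publications.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_supervisors(proposal_text, {"A": "...", "B": "..."})` returns `(name, score)` pairs in descending order of similarity, which mirrors the ranked list of potential supervisors the abstract describes.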



“…Filter Stopword is the process of removing words that occur frequently but have no influence on sentiment extraction. Such words include time-indicating words and question words [12]; 4. Filter Token (By Length) is the process of removing words by character count, using the parameters min chars 4 and max chars 25 to limit words in the text to a minimum of 4 and a maximum of 25 characters [13].…”

Section: B Text Processing (unclassified)

Analisis Sentimen Terhadap Sistem Informasi Akademik Institut Teknologi Garut

Julianto

2022

Jurnal Algoritma


Sentiment analysis is a process of extracting data in text form to obtain the opinions of service users. This study aims to perform sentiment analysis on user satisfaction with the Android-based Sistem Informasi Akademik Mahasiswa (SIAM) used by Institut Teknologi Garut (ITG). The method is to collect the comments on this application from Google Play and then classify them into three sentiment categories: Positive, Negative, and Neutral. The results show that 57.14% of users expressed Positive sentiment, 37.14% expressed Negative sentiment, and the remaining 5.71% fell under Neutral sentiment.
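The two filtering steps quoted in the citation statement for this paper (Filter Stopword, then Filter Token by length with min chars 4 and max chars 25) can be sketched as below. This is only an illustration under stated assumptions: the stop word sample and the function names are hypothetical, and the citing paper does not show its own code.

```python
# Hypothetical sample of stop words: time-indicating words and question words.
STOPWORDS = {"kemarin", "sekarang", "kapan", "mengapa", "bagaimana"}


def filter_stopwords(tokens, stopwords=STOPWORDS):
    """Drop tokens that appear in the stop word list (case-insensitive)."""
    return [t for t in tokens if t.lower() not in stopwords]


def filter_by_length(tokens, min_chars=4, max_chars=25):
    """Keep tokens whose length is between min_chars and max_chars, inclusive."""
    return [t for t in tokens if min_chars <= len(t) <= max_chars]


tokens = "aplikasi ini sering error kapan diperbaiki".split()
print(filter_by_length(filter_stopwords(tokens)))
# prints ['aplikasi', 'sering', 'error', 'diperbaiki']
```

Here "kapan" is removed as a stop word and "ini" is removed because it has fewer than 4 characters.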


“…No standard stopword dictionary is available, so an index database containing a list of stop words (stopword list) is required. Several researchers have created Indonesian stopword lists, including Fadillah Z. Tala, Damian Doyle, and Yudi Wibisono [8].…”

Section: Introduction (unclassified)

Text-Preprocessing Model Youtube Comments in Indonesian

Khomsah

Aribowo

2020

RESTI


YouTube is the most widely used platform in Indonesia, reaching 88% of the country's internet users. The volume of Indonesian-language YouTube comments has grown massively, and these datasets can be used to study the polarization of public opinion on government policies. The main challenge in opinion analysis is preprocessing, especially normalizing noise such as stop words and slang words. This research aims to devise several preprocessing models for the YouTube comment dataset and then observe their effect on the accuracy of sentiment analysis. The types of preprocessing used include standard Indonesian text processing, deleting stop words and subjects or objects, and converting slang according to the Indonesian Dictionary (KBBI). Four preprocessing scenarios are designed to see the impact of each type of preprocessing on the accuracy of the model. The investigation uses two feature sets: unigrams, and a combination of unigrams and bigrams. Count-Vectorizer and TF-IDF-Vectorizer are used to extract features. The experiments show that unigram features perform better than the combination of unigram and bigram features. Transforming slang words into standard words raises the accuracy of the model. Removing stop words also contributes to increased accuracy. In conclusion, the combination of standard preprocessing, stop-word removal, and conversion of Indonesian slang to common words based on the Indonesian Dictionary (KBBI) raises accuracy by almost 3.5% on the unigram feature.
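The difference between the two feature sets compared in this abstract (unigrams only versus unigrams plus bigrams) can be shown with a small count-based sketch. It uses only the Python standard library and whitespace tokenization, both of which are assumptions. The abstract does not say which library provides its Count-Vectorizer, and the `count_features` helper and its `ngram_range` parameter are hypothetical names chosen for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_features(text, ngram_range=(1, 1)):
    """Count n-gram occurrences for every n in the inclusive ngram_range."""
    tokens = text.lower().split()
    low, high = ngram_range
    features = Counter()
    for n in range(low, high + 1):
        features.update(ngrams(tokens, n))
    return features


comment = "tidak bagus sekali"
print(count_features(comment, (1, 1)))  # unigram features only
print(count_features(comment, (1, 2)))  # unigram and bigram features
```

With `(1, 2)` the feature set gains entries such as "tidak bagus", so the vocabulary grows quickly. The abstract reports that this larger feature set did not outperform unigrams alone on its dataset.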
