2019
DOI: 10.25126/jtiik.2019621105

Kombinasi Metode Rule-Based dan N-Gram Stemming untuk Mengenali Stemmer Bahasa Bali

Abstract: The process of extracting the base word from an affixed word is known as stemming; it aims to improve recall by reducing the variants of an affixed word to its base form. Earlier research on Balinese stemming used a rule-based method, but it stripped only prefixes and suffixes and left other affix types unhandled, such as infixes, confixes, simulfixes, and affix combinations. Research on stemming using the rule-bas…

Cited by 13 publications (8 citation statements)
References 11 publications
“…In the pre-processing stage, tokenize, stopwords removal, and stemming processes are carried out. The determination of stopwords in Balinese has been studied by Putra, et al which includes the words anggen, sane, ring, miwah, puniki, and olih (Putra et al, 2016), and for the Balinese stemming process, we use the stemmer method that has been done by Subali, et al, where the stemmer method uses the rule based method and n-gram string similarity (Subali & Fatichah, 2019).…”
Section: Preprocess (mentioning)
confidence: 99%
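The pre-processing order described in this statement (tokenize, remove stopwords, then stem) can be sketched roughly as follows. This is only an illustration: the stopword set contains just the six words named in the quote, the `tokenize` and `preprocess` helpers are hypothetical names, the input string is made up, and the stemmer is an identity placeholder rather than the rule-based and n-gram stemmer of Subali & Fatichah (2019).

```python
import re

# Only the six stopwords named in the citation statement; a real list would be longer.
STOPWORDS = {"anggen", "sane", "ring", "miwah", "puniki", "olih"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def preprocess(text, stem=lambda word: word):
    """Tokenize, drop stopwords, then stem each remaining token.

    `stem` is a placeholder (identity by default); it stands in for
    whatever stemmer is actually plugged in.
    """
    tokens = tokenize(text)
    return [stem(token) for token in tokens if token not in STOPWORDS]


# Illustrative input only.
print(preprocess("Buku puniki sane becik"))  # ['buku', 'becik']
```

Passing the stemmer in as a parameter keeps the pipeline order visible without committing to any particular stemming rules.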
“…Because the words obtained did not conform to the morphology of the Massenrempulu language, the resulting test words were incorrect. Another study, by Subali [10], is titled Pengembangan Metode Stemmer untuk Bahasa Bali dengan Pendekatan Rule-Based dan N-Gram Stemming. That study aims to develop a stemmer method that strips all affix variations in Balinese by combining a rule-based approach with the N-Gram Stemming method.…”
Section: Introduction (unclassified)
“…An N-gram is classified based on n characters. In general, the n-gram is done by adding additional blanks at the beginning and at the end [13]. For example, the sentence "accord gran" is processed by n-grams, the blank is symbolized by "_", resulting in n-grams in Table 1.…”
Section: Term Weighting (mentioning)
confidence: 99%
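As a rough illustration of the padded character n-grams described in this statement, the sketch below marks every blank with "_", including one added at the start and one at the end, and then slides a window of n characters over the result. The function name `char_ngrams`, the default n = 2, and the choice to also mark the blank between words are assumptions; the citing paper's Table 1 is not shown here, so this is not claimed to reproduce it.

```python
def char_ngrams(text, n=2, pad="_"):
    """Return the character n-grams of `text`.

    Blanks are written as `pad`, and one extra `pad` is added at the
    beginning and at the end before the n-character window is applied.
    """
    padded = pad + pad.join(text.split()) + pad
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


print(char_ngrams("accord gran"))
# ['_a', 'ac', 'cc', 'co', 'or', 'rd', 'd_', '_g', 'gr', 'ra', 'an', 'n_']
```

With this padding, the first and last letters of each word appear in as many n-grams as the interior letters do, which is the usual motivation for adding the blanks.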