2009 International Conference on Knowledge and Systems Engineering (2009)
DOI: 10.1109/kse.2009.44

A Hybrid Approach to Vietnamese Word Segmentation Using Part of Speech Tags

Abstract: Word segmentation is one of the most important tasks in NLP. For Vietnamese, the language's own features make this task challenging, especially the determination of word boundaries. To tackle Vietnamese word segmentation, in this paper we propose the WS4VN system, which uses a new approach based on the Maximum matching algorithm combined with stochastic models that use part-of-speech information. The approach can resolve word ambiguity and choose the best segmentation for each input sentence. Our sys…

Cited by 31 publications (13 citation statements)
References 7 publications
“…The transition graph will show the probability among syllables to form the words in a specific text. Nguyen et al, 2003;Pham et al, 2009 used this method to segment Vietnamese word [3,4].…”
Section: B Transition Graph Methods (mentioning)
confidence: 99%
“…As the case satisfies the conditions of the rules at nodes (3), (5) and (40), it is passed to node (42), using except edges. Since the case does not satisfy the conditions of the rules at nodes (42), (43) and (45), we have the evaluation path (0)-(1)-(2)-(3)-(5)-(40)-(42)-(43)-(45) with the last fired node (40). Given another case of "In which projects is enrico motta working on", it satisfies the conditions of the rules at nodes (0), (1) and (2); as node (2) has no except child node, we have the evaluation path (0)-(1)-(2) and the last fired node (2).…”
Section: Single Classification Ripple Down Rules (mentioning)
confidence: 99%
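The traversal described in the statement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, the `evaluate` function, the three-rule toy tree and its keyword conditions are all hypothetical and are not taken from the cited paper. The sketch assumes that a node whose condition holds ("fires") passes the case to its except child, and that a node whose condition fails passes it to its if-not child.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    node_id: int
    condition: Callable[[str], bool]          # hypothetical rule condition
    except_child: Optional["Node"] = None     # followed when this node fires
    if_not_child: Optional["Node"] = None     # followed when it does not fire


def evaluate(root: Node, case: str) -> Tuple[List[int], Optional[int]]:
    """Return the evaluation path and the id of the last fired node."""
    path: List[int] = []
    last_fired: Optional[int] = None
    node: Optional[Node] = root
    while node is not None:
        path.append(node.node_id)
        if node.condition(case):
            last_fired = node.node_id
            node = node.except_child
        else:
            node = node.if_not_child
    return path, last_fired


# Toy tree (assumed for illustration): node 0 is a default rule that always fires.
node3 = Node(3, lambda c: "who" in c.lower())
node2 = Node(2, lambda c: "projects" in c.lower())
node1 = Node(1, lambda c: "which" in c.lower(), except_child=node2, if_not_child=node3)
root = Node(0, lambda c: True, except_child=node1)

print(evaluate(root, "In which projects is enrico motta working on"))
# ([0, 1, 2], 2)
```

With this toy tree, the question quoted in the statement satisfies nodes 0, 1 and 2, and node 2 has no except child, so the path ends there with node 2 as the last fired node.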
“…Depending on how the matching is done, there are different methods, such as longest matching, shortest matching, overlap matching and maximum matching (Dinh et al, 2001), (Pham et al, 2009). The accuracy of a dictionary-based method depends heavily on the size of the dictionary that is built.…”
Section: A Dictionary-Based Approach (unclassified)
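As a rough illustration of the dictionary-based family named in the statement above, here is a minimal sketch of greedy left-to-right longest matching over syllables. It is not the WS4VN method and not the code of any cited paper; the function name `longest_match_segment`, the `max_len` limit and the three-entry toy dictionary are assumptions made for the example.

```python
def longest_match_segment(syllables, dictionary, max_len=4):
    """Greedy left-to-right longest matching over a list of syllables."""
    words = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first and shrink until the dictionary has it.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            # A single syllable is always accepted as a fallback.
            if n == 1 or candidate in dictionary:
                words.append(candidate)
                i += n
                break
    return words


# Toy dictionary (assumed): real systems use far larger lexicons.
toy_dictionary = {"học sinh", "sinh học", "học"}
sentence = "học sinh học sinh học".split()

print(longest_match_segment(sentence, toy_dictionary))
# ['học sinh', 'học sinh', 'học']

print(longest_match_segment(sentence, set()))
# ['học', 'sinh', 'học', 'sinh', 'học']
```

The first call shows the kind of ambiguity that pure matching cannot resolve: the toy dictionary equally supports the reading "học sinh | học | sinh học", but the greedy pass never considers it. The second call shows the dependence on dictionary size: with no entries, every syllable is emitted as its own word.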