Wannaphong Phatthiyaphaibun scite author profile

Wannaphong Phatthiyaphaibun

5 Publications

12 Citation Statements Received

60 Citation Statements Given

Affiliations

Vidyasirimedhi Institute of Science and Technology

Publications

Domain Adaptation of Thai Word Segmentation Models using Stacked Ensemble

Limkonchotiwat, Phatthiyaphaibun, Sarwar et al. 2020

Like many Natural Language Processing tasks, Thai word segmentation is domain-dependent. Researchers have been relying on transfer learning to adapt an existing model to a new domain. However, this approach is inapplicable to cases where we can interact with only the input and output layers of the models, also known as "black boxes". We propose a filter-and-refine solution based on the stacked-ensemble learning paradigm to address this black-box limitation. We conducted extensive experimental studies comparing our method against state-of-the-art models and transfer learning. Experimental results show that our proposed solution is an effective domain adaptation method and performs comparably to the transfer learning method.

Handling Cross- and Out-of-Domain Samples in Thai Word Segmentation

Limkonchotiwat, Phatthiyaphaibun, Sarwar et al. 2021

While word segmentation is a solved problem in many languages, it is still a challenge in continuous-script or low-resource languages. Like other NLP tasks, word segmentation is domain-dependent, which can be a challenge in low-resource languages like Thai and Urdu since there can be domains with insufficient data. This investigation proposes a new solution to adapt an existing domain-generic model to a target domain, as well as a data augmentation technique to combat the low-resource problem. In addition to domain adaptation, we propose a framework to handle out-of-domain inputs using an ensemble of domain-specific models called Multi-Domain Ensemble (MDE). To assess the effectiveness of the proposed solutions, we conducted extensive experiments on domain adaptation and out-of-domain scenarios. We also propose a multi-task dataset for Thai text processing, including word segmentation. For domain adaptation, we compared our solution to the state-of-the-art Thai word segmentation (TWS) method and obtained improvements from 93.47% to 98.48% at the character level and 84.03% to 96.75% at the word level. For out-of-domain scenarios, our MDE method significantly outperformed the state-of-the-art TWS and multi-criteria methods. Furthermore, to demonstrate our method's generalizability, we applied our MDE framework to other languages, namely Chinese, Japanese, and Urdu, and obtained improvements similar to those for Thai.

Robust Fragment-Based Framework for Cross-lingual Sentence Retrieval

Trijakwanich, Limkonchotiwat, Sarwar et al. 2021

Cross-lingual Sentence Retrieval (CLSR) aims at retrieving parallel sentence pairs that are translations of each other from a multilingual set of comparable documents. The retrieved parallel sentence pairs can be used in downstream NLP tasks such as machine translation and cross-lingual word sense disambiguation. We propose a CLSR framework called Robust Fragment-level Representation (RFR) to address Out-of-Domain (OOD) CLSR problems. In particular, we improve the sentence retrieval robustness by representing each sentence as a collection of fragments. In this way, we change the retrieval granularity from the sentence level to the fragment level. We performed CLSR experiments based on three OOD datasets, four language pairs, and three well-known base sentence encoders: m-USE, LASER, and LaBSE. Experimental results show that RFR significantly improves the base encoders' performance in more than 85% of the cases.

Explosion/Spacy: V2.0.11: Alpha Vietnamese Support, Fixes To Vectors, Improved Errors And More

Honnibal, Montani, Peters et al. 2018

wannaphongcom/thai-ner 1.2.1

Phatthiyaphaibun 2019
