Thien Huu Nguyen scite author profile

Thien Huu Nguyen

5 Publications

11 Citation Statements Received

65 Citation Statements Given


Affiliations

VinUniversity, Kyushu University, Nong Lam University Ho Chi Minh City

Publications


Toward Mention Detection Robustness with Recurrent Neural Networks

Nguyen, Sil, Dinu et al. 2016

Preprint


One of the key challenges in natural language processing (NLP) is to yield good performance across application domains and languages. In this work, we investigate the robustness of the mention detection systems, one of the fundamental tasks in information extraction, via recurrent neural networks (RNNs). The advantage of RNNs over the traditional approaches is their capacity to capture long ranges of context and implicitly adapt the word embeddings, trained on a large corpus, into a task-specific word representation, but still preserve the original semantic generalization to be helpful across domains. Our systematic evaluation for RNN architectures demonstrates that RNNs not only outperform the best reported systems (up to 9% relative error reduction) in the general setting but also achieve the state-of-the-art performance in the cross-domain setting for English. Regarding other languages, RNNs are significantly better than the traditional methods on the similar task of named entity recognition for Dutch (up to 22% relative error reduction).


Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing

Nguyen, Lai, Veyseh et al. 2021

Preprint


We introduce Trankit, a light-weight Transformer-based Toolkit for multilingual Natural Language Processing (NLP). It provides a trainable pipeline for fundamental NLP tasks over 100 languages, and 90 pretrained pipelines for 56 languages. Built on a state-of-the-art pretrained language model, Trankit significantly outperforms prior multilingual NLP pipelines over sentence segmentation, part-of-speech tagging, morphological feature tagging, and dependency parsing while maintaining competitive performance for tokenization, multi-word token expansion, and lemmatization over 90 Universal Dependencies treebanks. Despite the use of a large pretrained transformer, our toolkit is still efficient in memory usage and speed. This is achieved by our novel plug-and-play mechanism with Adapters where a multilingual pretrained transformer is shared across pipelines for different languages. Our toolkit along with pretrained models and code are publicly available at: https://github.com/nlp-uoregon/trankit.


Improving Aspect-based Sentiment Analysis with Gated Graph Convolutional Networks and Syntax-based Regulation

Veyseh, Nasim, Dernoncourt et al. 2020

Preprint


Extensively Matching for Few-shot Learning Event Detection

Lai, Dernoncourt, Nguyen 2020

Preprint


MadDog: A Web-based System for Acronym Identification and Disambiguation

Veyseh, Dernoncourt, Chang et al. 2021

Preprint

