Anais Do XVIII Encontro Nacional De Inteligência Artificial E Computacional (ENIAC 2021) 2021
DOI: 10.5753/eniac.2021.18273

Universal Dependencies for Tweets in Brazilian Portuguese: Tokenization and Part of Speech Tagging

Abstract: Automatically processing natural-language User-Generated Content (UGC) is a challenging and important task, given the amount of information available on the web. In this paper we present an effort to build tokenization and Part of Speech (PoS) tagging systems for tweets in Brazilian Portuguese, following the guidelines of the Universal Dependencies (UD) project. We propose a rule-based tokenizer and the customization of current state-of-the-art UD-based tagging strategies for Portuguese, achievin…

Cited by 2 publications
References 5 publications