Proceedings of the Second Workshop on Computational Research in Linguistic Typology 2020
DOI: 10.18653/v1/2020.sigtyp-1.2

KMI-Panlingua-IITKGP @SIGTYP2020: Exploring rules and hybrid systems for automatic prediction of typological features

Abstract: This paper describes the KMI-Panlingua-IITKGP team's participation in the SIGTYP 2020 Shared Task on the prediction of typological features. The task entailed predicting missing feature values for a given language, given the name of its language family, its genus, its location (latitude and longitude coordinates and the name of the country where it is spoken), and a set of known feature-value pairs. For this task, the team submitted three kinds of system: two rule-based…

Cited by 6 publications (8 citation statements)
References 1 publication
“…Considering the past years of research, typological databases have mainly been used in the context of feature predictions. Methodologically speaking, features are typically predicted in the context of other features, and other languages (Daumé III and Campbell, 2007; Teh et al., 2009; Berzak et al., 2014; Malaviya et al., 2018; Bjerva et al., 2019c,a, 2020, 2019b; Vastl et al., 2020; Jäger, 2020; Choudhary, 2020; Gutkin and Sproat, 2020; Kumar et al., 2020). That is to say, given a language l ∈ L, where L is the set of all languages contained in a specific database, and the features of that language F_l, the setup is typically to attempt to predict some subset of features f ⊂ F_l, based on the remaining features F_l \ f.…”
Section: Model Accuracy (mentioning)
confidence: 99%
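
The setup described in this statement (hide a subset f of a language's features F_l and predict it from the remaining features F_l \ f) can be illustrated with a minimal sketch. The function name `split_features`, the `holdout_fraction` parameter, and the feature names in the usage note are hypothetical and are not taken from any of the cited papers.

```python
import random


def split_features(features, holdout_fraction=0.5, seed=0):
    """Split one language's features F_l into (observed, held_out).

    `features` maps feature name -> value. `held_out` plays the role of f,
    and `observed` plays the role of F_l \\ f. This is an illustrative
    assumption about the setup, not any team's actual data pipeline.
    """
    rng = random.Random(seed)
    names = sorted(features)  # sort for a reproducible sampling order
    # Hold out at least one feature when any exist, never more than available.
    k = min(len(names), max(1, int(len(names) * holdout_fraction)))
    held = set(rng.sample(names, k))
    observed = {n: v for n, v in features.items() if n not in held}
    held_out = {n: v for n, v in features.items() if n in held}
    return observed, held_out
```

For example, calling `split_features({"word_order": "SOV", "tone": "none", "case": "yes", "gender": "two"})` returns two disjoint dictionaries of two features each; a predictor would be given the first and scored against the second.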
“…Panlingua (Kumar et al., 2020), a team effort across KMI, Panlingua, and IIT KGP, submitted constrained systems from three approaches: two rule-based systems (one statistical, and one frequency-based baseline) and one hybrid system. Their baseline is similar to the organizers' frequency-based baseline, except that it produces the most frequent value for a feature within a genus if available, backing off to language family, and then the overall most-frequent value.…”
Section: Submissions (mentioning)
confidence: 99%
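
The back-off order described in this statement (genus, then language family, then the overall most frequent value) can be sketched as follows. This is a rough illustration under assumed data structures, not the team's actual implementation: the record layout (`genus`, `family`, `features` keys) and the function names `train_backoff` and `predict` are hypothetical.

```python
from collections import Counter, defaultdict


def train_backoff(languages):
    """Count feature values per genus, per family, and overall.

    Each language is assumed to be a dict with "genus", "family", and
    "features" (feature name -> value) keys; this layout is hypothetical.
    """
    by_genus = defaultdict(Counter)
    by_family = defaultdict(Counter)
    overall = defaultdict(Counter)
    for lang in languages:
        for feat, val in lang["features"].items():
            by_genus[(lang["genus"], feat)][val] += 1
            by_family[(lang["family"], feat)][val] += 1
            overall[feat][val] += 1
    return by_genus, by_family, overall


def predict(model, genus, family, feature):
    """Return the most frequent value, backing off genus -> family -> overall."""
    by_genus, by_family, overall = model
    candidates = (
        by_genus.get((genus, feature)),
        by_family.get((family, feature)),
        overall.get(feature),
    )
    for counts in candidates:
        if counts:  # skip levels with no observations for this feature
            return counts.most_common(1)[0][0]
    return None  # feature never observed in the training data
```

A language from an unseen genus but a known family falls through to the family-level counts, and a language from an unseen family falls through to the overall counts. How ties between equally frequent values should be broken is not stated in the source; this sketch simply takes whatever `Counter.most_common` returns first.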
“…These remarks may or may not be based on an individual's protected status or protected activities such as race, color, religion, sex, national origin, sexual orientation, or gender identity of an individual [8]. By considering abusive language as an umbrella term, that covers different types of online abuse, extensive studies have been done to address hate speech [3, 8-10, 13, 15, 16], offensive language [1,2,12], cyberbullying [33,34], aggression detection [11,29,34,35], and toxicity detection [36].…”
Section: Offensive Language Detection Techniques (mentioning)
confidence: 99%
“…Although many efforts have been dedicated to addressing the problem of hate speech and offensive language detection in high-resource languages such as English [8,9,50], concerns have recently been raised about other languages as well. Recent shared tasks and academic events such as Kaggle's Toxic Comment Classification Challenge in English, Automatic Misogyny Identification (AMI) at IberEval [17] and EVALITA [4] for Spanish and Italian respectively, identification of offensive language at GermEval [2,51] in German, identification of offensive language at SemEval-2019 [50] for English and SemEval-2020 [1] for Arabic, Danish, English, Greek, and Turkish, the proceedings of the Workshop on Trolling, Aggression and Cyberbullying [34,52], and the proceedings of the Workshop on Abusive Language Online [5-7] show the rising concern about hate speech and offensive language detection in different languages. These events and shared tasks mainly focused on different types of this phenomenon such as hate, offensive, misogyny, aggression, etc.…”
Section: Language-specific Abusive Language Detection (mentioning)
confidence: 99%