Parham Tofighi scite author profile

Parham Tofighi

2 Publications

0 Citation Statements Received

32 Citation Statements Given

Publications

Author’s Native Language Identification from Web-Based Texts

Tofighi¹, Köse², Rouka³

2012

IJCCE

Abstract: Despite the rapid growth of Internet technologies and applications, text is still the most common Internet medium. Social networking applications and web applications, for example, are mostly text based. We developed a framework to determine an anonymous author's native language from short, multi-genre texts such as those found in many Internet applications. In this framework, four types of feature sets (lexical, syntactic, structural, and content-specific features) are extracted, and three machine learning algorithms (C4.5 decision tree, support vector machine, and Naïve Bayes) are used to identify the author's native language based on the proposed features. To evaluate this framework, we used English, Persian, Turkish, and German online news texts. The experimental results showed that the proposed approach was able to identify the author's native language in web-based texts with a satisfactory accuracy of 70% to 80%. Support vector machines outperformed the other two classification techniques in our experiments.

Using Machine Translators in Textual Data Classification

Rouka¹, Köse², Tofighi³

2012

IJCCE

In this paper, the effect of machine translators on textual data classification is examined using supervised classification methods. The developed system first analyzes and classifies an input text in one language, and then analyzes and classifies the same text in another language, generated from the input text by machine translators. The results are compared to measure the effect of the translators on textual data classification. The performance of the classification methods used in this study is also measured and compared. The classification process consists of training data preparation, feature selection, and classification of the input texts with and without translation. The results show that the Multinomial Naïve Bayes method is the most successful method, and that translation has only a small effect on the attained classification accuracy.
