Tarek Kanan scite author profile

Arabic news articles in electronic collections are difficult to work with. Browsing by category is rarely supported. While machine learning methods have been applied successfully to similar problems for English news articles, little research has produced suitable solutions for Arabic news. In connection with a QNRF-funded project to build a digital library community and infrastructure in Qatar, we developed software for browsing a collection of about 237K Arabic news articles, which should be applicable to other Arabic news collections as well. We designed a simple taxonomy for Arabic news stories that is suitable for the needs in Qatar and other nations, is compatible with the subject codes of the International Press Telecommunications Council, and was enhanced with the aid of a librarian expert as well as five Arabic-speaking volunteers. We developed tailored stemming (i.e., a new Arabic light stemmer) and automatic classification methods (the best being binary SVM classifiers) to work with the taxonomy. Using evaluation techniques common in the information retrieval community, including 10-fold cross-validation and the Wilcoxon signed-rank test, we showed that our approach to stemming and classification is superior to state-of-the-art techniques.
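The abstract mentions 10-fold cross-validation as part of the evaluation. As a rough illustration of that general technique only (not the authors' code or data), the sketch below splits a collection into k folds so that each item is used for testing exactly once. The function name `k_fold_indices`, the seed, and the toy collection size of 25 are assumptions made for this example.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Toy usage: 25 stand-in documents, 10 folds.
splits = list(k_fold_indices(25, k=10))
for train, test in splits:
    # A classifier would be trained on `train` and scored on `test` here.
    pass
```

Per-fold scores from two competing methods could then be compared with a paired test such as the Wilcoxon signed-rank test named in the abstract.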


A Review of Natural Language Processing and Machine Learning Tools Used to Analyze Arabic Social Media

Kanan

Sadaqa

Aldajeh

et al. 2019


SmartCert BlockChain Imperative for Educational Certificates

Kanan

Obaidat

Al-Lahham

2019


An efficient hybrid similarity measure based on user interests for recommender systems

et al. 2019


Recommender systems suggest items to users based on their interests. They have been used widely in various domains, including online stores, web advertisements, and social networks. As part of their process, recommender systems use a set of similarity measures that assist in finding interesting items. Although many similarity measures have been proposed in the literature, they have not concentrated on actual user interests. This paper proposes a new efficient hybrid similarity measure for recommender systems based on user interests. This similarity measure is a combination of two novel base similarity measures: the user interest-user interest similarity measure and the user interest-item similarity measure. This hybrid similarity measure improves on existing work in three respects. First, it improves current recommender systems by using actual user interests. Second, it provides a comprehensive evaluation of an efficient solution to the cold start problem. Third, it works well even when no co-rated items exist between two users. Our experiments show that our proposed similarity measure is efficient in terms of accuracy, execution time, and applicability. Specifically, it achieves a mean absolute error (MAE) as low as 0.42, with 64% applicability and an execution time as low as 0.03 s, whereas the existing similarity measures from the literature achieve an MAE of 0.88 at their best. These results demonstrate the superiority of our proposed similarity measure in terms of accuracy, along with a high applicability percentage and a very short execution time.
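The accuracy figures above are reported as mean absolute error (MAE), the average absolute gap between predicted and actual ratings. The sketch below shows that standard metric only. The ratings are made-up illustration values, not data from the paper, and the function name is chosen for this example.

```python
def mean_absolute_error(predicted, actual):
    """Return the mean of |predicted - actual| over paired ratings."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Made-up ratings on a 1-5 scale, for illustration only.
predicted = [4.0, 3.5, 2.0, 5.0]
actual = [4.0, 3.0, 1.0, 4.5]

mae = mean_absolute_error(predicted, actual)
print(mae)  # 0.5
```

A lower MAE means the predicted ratings sit closer to the actual ones, which is the sense in which the abstract compares 0.42 against 0.88.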




Tarek Kanan

A survey on particle swarm optimization with emphasis on engineering and network applications

Automated Arabic text classification with P‐Stemmer, machine learning, and a tailored news article taxonomy

