Abstract:We propose a machine learning method for recognizing modern Arabic poems based on the common poetic features of modern Arabic poetry. The poetic features include: rhyming, repetition, use of diacritics and punctuations, and text alignment. The method can classify text documents as poem or non-poem documents with a very high accuracy of 99.81%.
The paper deals with problems that imbalanced and overlapping datasets often en- counter. Performance indicators as accuracy, precision and recall of imbalanced data sets, both with and without overlapping, are discussed and compared with the same performance indicators of balanced datasets with overlapping. Three popular classification algorithms, namely, Decision Tree, KNN (k-Nearest Neighbors) and SVM (Support Vector Machines) classifiers are analyzed and compared.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.