Arabic letters are used in many written languages. However, little work has been done to analyze and characterize handwritten Arabic letters comprehensively. Such characterization is important for ongoing research on the computer processing of Arabic script. We extract carefully selected features from a large database of handwritten Arabic letters, drawing on each letter's secondary components, main body, skeleton, and boundary. These features are statistically analyzed to reach the targeted characterization. Observations about the important writing-style variations are presented and statistically specified. Arabic letters take multiple forms depending on their position in the word, and comparisons among the four main letter forms (isolated, initial, medial, and final) are also presented.
Internet and smartphone penetration continues to rise, reaching large percentages of the world's population. Likewise, many Jordanians communicate actively through popular social networks and mobile phone messages. There are major questions and concerns about the characteristics and quality of the language used in these forums and how to improve it. This study addresses these issues by collecting and analyzing a large sample of text from five sources: Facebook, Twitter, news sites, blogging sites, and mobile phone short messages. We analyzed the sample comprehensively, covering the sender, context, message, channel, and code. In this paper we present the results related to the language used, alphabet, dialect, text components, and style. The study concludes that the bilingualism problem is manifested in Twitter and Facebook, with 24% and 14% of contributions in English, respectively. Moreover, 6.4% of the analyzed Arabic samples contain English words and 13.2% are written in Arabizi (Arabic written in English letters and numerals). The diglossia problem is manifested in that 55.4% of the sample is in colloquial Arabic, 36.4% in standard Arabic, and 8.2% in standard Arabic with some colloquial words.
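As a rough illustration of the alphabet dimension of such an analysis, the sketch below labels each message by the script it is written in and reports the percentage share of each label. It is not the study's procedure: the function names `label_alphabet` and `share_by_label` are hypothetical, and the character ranges are an assumption about what counts as an Arabic or English letter. A message in English letters may be either English or Arabizi; separating the two would need a lexicon or manual annotation, which this sketch does not attempt.

```python
import re
from collections import Counter

# Assumed character classes (not taken from the study):
# basic Arabic letters U+0621..U+064A, and unaccented Latin letters.
ARABIC_LETTER = re.compile(r"[\u0621-\u064A]")
LATIN_LETTER = re.compile(r"[A-Za-z]")


def label_alphabet(text: str) -> str:
    """Hypothetical helper: label a message by the script of its letters."""
    arabic = len(ARABIC_LETTER.findall(text))
    latin = len(LATIN_LETTER.findall(text))
    if arabic == 0 and latin == 0:
        return "other"   # digits, punctuation, emoji, or empty text only
    if arabic and latin:
        return "mixed"   # e.g. Arabic text containing English words
    return "arabic" if arabic else "latin"


def share_by_label(messages):
    """Return the percentage of messages carrying each label."""
    counts = Counter(label_alphabet(m) for m in messages)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}
```

Calling `share_by_label` on a list of collected messages gives a per-label breakdown comparable in shape, though not in method, to the percentages reported above.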