In December 2019, the novel coronavirus disease 2019 (COVID-19) appeared. Because the disease was highly contagious and no effective treatment was available, the only solution was to detect and isolate infected patients in order to break the chain of infection. The shortage of test kits and other drawbacks of lab tests motivated researchers to build automated diagnosis systems based on chest X-rays (CXR) and CT scans. The works reviewed in this study combine AI with radiological image processing of raw chest X-ray and CT images to train various CNN models. They use transfer learning and several kinds of binary and multi-class classification. The models are trained and validated on several datasets, the attributes of which are also discussed. The results obtained by the various algorithms are then compared using performance metrics such as accuracy, F1 score, and AUC. The major challenges in this research domain are the limited availability of COVID-19 image data and achieving high accuracy when predicting patient severity with deep learning, relative to well-established COVID-19 detection methods such as PCR tests. Automated detection systems based on CXR are reliable enough to help radiologists in the initial screening and immediate diagnosis of infected individuals. They are preferred because of their low cost, availability, and fast results.
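The three metrics named above can be made concrete with a minimal sketch. It assumes a binary task (1 = COVID-19 positive, 0 = negative); the function names and the toy labels and scores are illustrative assumptions and are not taken from any of the reviewed works.

```python
# Minimal, dependency-free sketch of accuracy, F1 score and AUC for a
# binary classifier. All names and data here are hypothetical.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 1, 0]               # toy ground truth
    scores = [0.9, 0.4, 0.3, 0.2, 0.8, 0.6]   # toy model scores
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5
    print(accuracy(y_true, y_pred), f1_score(y_true, y_pred),
          roc_auc(y_true, scores))
```

Accuracy and F1 depend on the decision threshold (0.5 here), whereas AUC is computed from the raw scores, which is one reason the three are usually reported together.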
This paper describes the creation of the new Bangor Arabic Annotated Corpus (BAAC), a Modern Standard Arabic (MSA) corpus comprising 50K words manually annotated with part-of-speech tags. To evaluate the quality of the corpus, the Kappa coefficient and a direct percent agreement for each tag were calculated; a Kappa value of 0.956 was obtained, with an average observed agreement of 94.25%. The corpus was used to evaluate the widely used Madamira Arabic part-of-speech tagger and to further investigate compression models for text compressed using part-of-speech tags. In addition, a new annotation tool was developed and employed for the annotation of BAAC.

Keywords: Arabic language; corpus; annotated corpora; analysis results

I. BACKGROUND AND MOTIVATION

The Arabic language "العربية" is acknowledged to be one of the most widely used languages, with 330 million people using it as their first language, as shown in Table I, plus 1.4 billion more using it as a secondary language [1]. The majority of its speakers are located across twenty-two nations, primarily in the Middle East, North Africa and Asia, and the United Nations recognizes Arabic as one of its six official languages. Arabic belongs to the Semitic language family, which also includes Tigrinya, Amharic, Hebrew, etc., and shares almost the same structure as those languages. It has 28 letters, two genders (feminine and masculine), and singular, dual and plural forms. Arabic is written from right to left. Its basic grammatical structure is verb-subject-object (VSO), with other structures such as VOS, VO and SVO also in use [2]-[4].

TABLE I. THE MOST UNIVERSALLY USED LANGUAGES
Rank    Language    Users (millions)