This study proposes an intelligent prediction model for distinguishing the severity of COVID-19, with the aim of providing a fairer and more reasonable reference to assist clinical diagnostic decision-making. Based on patients' necessary information, pre-existing diseases, symptoms, immune indexes, and complications, the model uses Harris hawks optimization (HHO) to optimize the fuzzy K-nearest neighbor (FKNN) classifier and is called HHO-FKNN. In HHO-FKNN, HHO is introduced to optimize FKNN's parameters and the feature subset simultaneously. Using actual COVID-19 data, we conducted a comparative experiment between HHO-FKNN and several well-known machine learning algorithms. The results show that the proposed HHO-FKNN not only obtains better classification performance and higher stability on the four evaluation indexes but also screens out the key features that distinguish severe COVID-19 from mild COVID-19. We therefore conclude that the proposed HHO-FKNN model is expected to become a useful tool for COVID-19 prediction.
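As a rough illustration of what it means to tune FKNN's parameters and a feature subset together, the sketch below implements a standard fuzzy K-nearest neighbor rule with crisp training labels and scores one candidate solution. Everything named here is an assumption made for illustration and is not taken from the paper: the function names, the parameters `k` (neighborhood size) and `m` (fuzzy strength), the binary `mask` encoding of the feature subset, and the leave-one-out accuracy used as the objective. The HHO search itself is not implemented; an optimizer such as HHO would propose candidate `(k, m, mask)` triples and keep the best-scoring one.

```python
def fknn_memberships(train_X, train_y, x, k=3, m=2.0, mask=None):
    """Fuzzy class memberships of query x (assumes crisp training labels)."""
    if m <= 1.0:
        raise ValueError("fuzzy strength m must be greater than 1")
    if mask is None:
        mask = [1] * len(x)  # keep every feature by default

    def dist2(a, b):
        # Squared Euclidean distance over the selected features only.
        return sum((ai - bi) ** 2 for ai, bi, keep in zip(a, b, mask) if keep)

    # The k training samples closest to x, as (squared distance, label).
    nearest = sorted(
        ((dist2(xi, x), yi) for xi, yi in zip(train_X, train_y)),
        key=lambda t: t[0],
    )[:k]

    eps = 1e-12  # guards against division by zero for identical points
    scores = {}
    for d2, label in nearest:
        # Weight 1 / ||x - x_j||^(2 / (m - 1)), written with squared distance.
        w = 1.0 / (d2 ** (1.0 / (m - 1.0)) + eps)
        scores[label] = scores.get(label, 0.0) + w
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


def fknn_predict(train_X, train_y, x, k=3, m=2.0, mask=None):
    """Label with the highest fuzzy membership."""
    u = fknn_memberships(train_X, train_y, x, k, m, mask)
    return max(u, key=u.get)


def loo_accuracy(X, y, k, m, mask):
    """Hypothetical fitness: leave-one-out accuracy of one (k, m, mask) candidate."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if fknn_predict(rest_X, rest_y, X[i], k, m, mask) == y[i]:
            hits += 1
    return hits / len(X)
```

In this sketch a candidate is scored by calling `loo_accuracy(X, y, k, m, mask)` on lists of feature vectors and labels; setting an entry of `mask` to 0 drops that feature from the distance computation, which is how feature selection and parameter tuning can share one objective.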
INDEX TERMS COVID-19, coronavirus, fuzzy K-nearest neighbor, Harris hawks optimization, disease diagnosis, feature selection.

The associate editor coordinating the review of this manuscript and approving it for publication was Juan Wang.

I. INTRODUCTION

Coronavirus disease 2019 (COVID-19) is a highly contagious viral disease, and the World Health Organization (WHO) declared COVID-19 an international public health emergency [1], [2]. COVID-19 was first described in December 2019 in Wuhan, Hubei Province, China. The ongoing outbreak is affecting multiple countries around the world [1]. As of March 11, 2020, 118,326 cases of COVID-19 had been diagnosed worldwide, including 80,955 cases in China and 37,371 cases outside China, and COVID-19 had caused 4,292 deaths [3]. Many countries are facing increased pressure on health care resources. To date, many studies have focused on using traditional statistical methods to identify risk factors in COVID-19 patients. For example, older age, pre-existing diseases, abnormal liver function, and T-lymphocyte count were closely correlated with COVID-19 progression and prognosis [4]-[6]. However, traditional statistical methods could not rapidly identify changes