The corona pandemic has affected the whole world and is a highly researched area in the biological sciences. Because the current pandemic has affected countries socially and economically, the purpose of this bibliometric analysis is to provide a holistic review of the corona pandemic in the field of social sciences. This study aims to highlight significant and influential aspects, research streams, and themes. We reviewed 395 journal articles related to coronavirus in the field of social sciences published from 2003 to 2020. We used 'biblioshiny', a web interface of the 'bibliometrix 3.0' package in RStudio, to conduct the bibliometric analysis and visualization. We report the influential aspects of the coronavirus literature in the social sciences. We found that the 'Morbidity and Mortality Weekly Report' is the top journal. The core article of the coronavirus literature is 'Guidelines for preventing health-care-associated pneumonia'. The most commonly used word in titles, abstracts, author keywords, and Keywords Plus is 'SARS'. The top affiliation is 'The University of Hong Kong'. Hong Kong is the leading country by citations, and the USA ranks first by total publications. We used a conceptual framework to identify potential research streams and themes in the coronavirus literature. A co-occurrence network reveals four research streams: 'Social and economic effects of epidemic disease', 'Infectious disease calamities and control', 'Outbreak of COVID 19', and 'Infectious diseases and the role of international organizations'. Finally, a thematic map is used to provide a holistic understanding by dividing significant themes into basic or transversal, emerging or declining, motor, and highly developed but isolated themes. These themes and subthemes suggest future directions and critical areas of research.
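The co-occurrence network mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the bibliometrix implementation: the function name `cooccurrence_counts` and the keyword lists are hypothetical, and the sketch only counts how often two keywords appear in the same article, which is the raw edge weight such a network is typically built from.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count how often each unordered keyword pair appears in the same article."""
    pair_counts = Counter()
    for keywords in keyword_lists:
        # Normalise case and drop duplicates within a single article.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Hypothetical keyword lists, not taken from the reviewed articles.
articles = [
    ["SARS", "epidemic", "economy"],
    ["SARS", "epidemic", "infection control"],
    ["COVID-19", "epidemic", "economy"],
]

edges = cooccurrence_counts(articles)
for (a, b), weight in edges.most_common(3):
    print(f"{a} -- {b}: {weight}")
```

Clusters of strongly connected keywords in such a graph are what a bibliometric tool would then group into research streams; the clustering step itself is not shown here.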
Medical datasets are usually imbalanced, with negative cases severely outnumbering positive cases. It is therefore essential to deal with this data skew when training machine learning algorithms. This study uses two representative lung cancer datasets, PLCO and NLST, with imbalance ratios (the ratio of the number of samples in the majority class to the number in the minority class) of 24.7 and 25.0, respectively, to predict lung cancer incidence. It compares the performance of 23 class imbalance methods (resampling and hybrid systems) with three classical classifiers (logistic regression, random forest, and LinearSVC) to identify the imbalance techniques best suited to medical datasets. Resampling includes ten under-sampling methods (RUS, etc.), seven over-sampling methods (SMOTE, etc.), and two integrated sampling methods (SMOTEENN, SMOTE-Tomek). Hybrid systems include Balanced Bagging, among others. The results show that class imbalance learning can improve the classification ability of the model. Compared with other imbalance techniques, under-sampling techniques have the highest standard deviation (SD), and over-sampling techniques have the lowest SD. Over-sampling is a stable method, and its AUC is generally higher than that of the other methods. With ROS, the random forest shows the best predictive ability and is more suitable for the lung cancer datasets used in this study.
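To make the imbalance ratio and random over-sampling (ROS) concrete, here is a minimal pure-Python sketch. It is not the study's pipeline: the helper names `imbalance_ratio` and `random_oversample` and the toy data are assumptions for illustration, and a real experiment would more likely use a dedicated resampling library.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        idx = [i for i, y in enumerate(labels) if y == cls]
        for _ in range(target - n):
            j = rng.choice(idx)  # sample with replacement from this class
            out_x.append(features[j])
            out_y.append(cls)
    return out_x, out_y


# Toy data (hypothetical): 50 negative cases and 2 positive cases.
y = [0] * 50 + [1] * 2
X = [[float(i)] for i in range(len(y))]

ratio_before = imbalance_ratio(y)
X_res, y_res = random_oversample(X, y)
ratio_after = imbalance_ratio(y_res)
print(ratio_before, ratio_after, len(y_res))
```

Resampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution.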
In our work, we have presented two widely used recommendation systems: a context-aware recommender system that filters the items associated with the user’s interests, coupled with a context-based recommender system that prescribes those items. In this study, the context-aware recommender system perceives the user’s location, time, and company. The context-based recommender system retrieves patterns from the World Wide Web based on the user’s past interactions and provides future news recommendations. We have presented different techniques to support media recommendations for smartphones, to create a framework for context awareness, to filter E-learning content, and to deliver convenient news to the user. To achieve this goal, we have used content-based filtering, collaborative filtering, and a hybrid recommender system, and implemented the Web Ontology Language (OWL). We have also used the Resource Description Framework (RDF), Java, machine learning, semantic mapping rules, and natural ontology languages that suggest items to the user related to the search. In our work, we have used E-paper to provide users with the required news. After applying the semantic reasoning approach, we concluded that it works, in some respects, similarly to a content-based recommender system, since by taking advantage of a semantic approach we can also recommend items according to the user’s interests. In a content-based recommender system, the system provides additional options or results that rely on the user’s ratings, appraisals, and interests.
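The two-stage idea in this abstract (filter by context, then rank by the user's interests) can be sketched in Python as follows. This is only an illustration under stated assumptions: the item schema, the function names, and the use of Jaccard overlap as the content score are all hypothetical and are not taken from the work described.

```python
def context_filter(items, context):
    """Keep items whose context tags do not conflict with the user's current context.

    An item that does not mention a context key is treated as compatible with it.
    """
    return [
        item for item in items
        if all(item["context"].get(key, value) == value for key, value in context.items())
    ]


def content_score(item, interests):
    """Jaccard overlap between the item's topics and the user's interests."""
    topics = set(item["topics"])
    union = topics | interests
    return len(topics & interests) / len(union) if union else 0.0


def recommend(items, context, interests, top_n=2):
    """Filter by context first, then rank the remaining items by content score."""
    candidates = context_filter(items, context)
    ranked = sorted(candidates, key=lambda item: content_score(item, interests), reverse=True)
    return [item["title"] for item in ranked[:top_n]]


# Hypothetical news items and user profile.
news = [
    {"title": "Evening football recap", "context": {"time": "evening"}, "topics": ["sports", "football"]},
    {"title": "Morning sports briefing", "context": {"time": "morning"}, "topics": ["sports"]},
    {"title": "Election analysis", "context": {}, "topics": ["politics"]},
]
user_context = {"location": "home", "time": "evening"}
user_interests = {"sports", "football"}

picks = recommend(news, user_context, user_interests)
print(picks)
```

Filtering before ranking keeps the content-based step simple, at the cost of discarding items that are relevant but tagged for a different context.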
As the financial industry has grown rapidly, the credit risk of commercial banks has become an increasingly prominent financial threat. One of the biggest threats faced by commercial banks is therefore predicting the risk of credit clients. Recent studies mostly focus on enhancing classifier performance for credit card default prediction rather than on building an interpretable model. In classification problems, handling an imbalanced dataset is also crucial to improving model performance, because most cases lie in one class and only a few examples fall in the other categories. Traditional statistical approaches are not suitable for imbalanced data. In this study, a model is developed for credit default prediction using various credit-related datasets. There is often a significant difference between the minimum and maximum values of different features, so Min-Max normalization is used to scale the features into a common range. Data-level resampling techniques are employed to overcome the data imbalance. Various undersampling and oversampling methods are used to resolve the class imbalance. Different machine learning models are also employed to obtain efficient results. We tested the hypotheses of whether models developed using different machine learning techniques differ significantly and whether resampling techniques significantly improve the performance of the proposed models. One-way analysis of variance (ANOVA), a hypothesis-testing technique, is used to test the significance of the results. A split method is used to validate the results, in which the data are split into training and test sets. On the imbalanced datasets, the results show an accuracy of 66.9% on the Taiwan clients credit dataset, 70.7% on the South German clients credit dataset, and 65% on the Belgium clients credit dataset. In contrast, our proposed methods significantly improve the accuracy to 89% on the Taiwan clients credit dataset, 84.6% on the South German clients credit dataset, and 87.1% on the Belgium clients credit dataset. The results show that classifiers perform better on the balanced datasets than on the imbalanced datasets. It is also observed that data oversampling techniques perform better than undersampling techniques. Overall, the Gradient Boosted Decision Tree method performs better than other traditional machine learning classifiers, and it gives the best results when combined with the K-means SMOTE oversampling method. Using one-way ANOVA, the null hypothesis was rejected with a p-value <0.001, confirming that the performance improvement of the proposed model is statistically significant. The interpretable model is also deployed on the web for the convenience of different stakeholders. This model will help commercial banks, financial organizations, loan institutes, and other decision-makers to predict loan defaulters earlier.
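Two of the steps named in this abstract, Min-Max normalization and the one-way ANOVA test, follow standard formulas that can be sketched in plain Python. The function names and the toy numbers below are assumptions for illustration only; they are not the study's code or data, and the sketch computes the F statistic without the p-value lookup.

```python
def min_max_scale(column):
    """Rescale values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def one_way_anova_f(groups):
    """F statistic: between-group mean square divided by within-group mean square.

    Assumes at least two groups and some variation inside the groups.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy feature column (hypothetical values).
scaled = min_max_scale([10, 20, 30])

# Toy scores for three hypothetical models, three runs each.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(scaled, f_stat)
```

A large F statistic means the variation between groups is large relative to the variation within them; the p-value is then read from the F distribution with (k - 1, n - k) degrees of freedom.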