Background: Supervised machine learning algorithms have been a dominant method in the data mining field. Disease prediction using health data has recently emerged as a potential application area for these methods. This study aims to identify the key trends among different types of supervised machine learning algorithms, and their performance and usage for disease risk prediction. Methods: In this study, extensive research efforts were made to identify those studies that applied more than one supervised machine learning algorithm to the prediction of a single disease. Two databases (i.e., Scopus and PubMed) were searched using different types of search terms. In total, we selected 48 articles for the comparison among variants of supervised machine learning algorithms for disease prediction. Results: We found that the Support Vector Machine (SVM) algorithm was applied most frequently (in 29 studies), followed by the Naïve Bayes algorithm (in 23 studies). However, the Random Forest (RF) algorithm showed comparatively superior accuracy. Of the 17 studies where it was applied, RF showed the highest accuracy in 9 of them, i.e., 53%. This was followed by SVM, which had the highest accuracy in 41% of the studies in which it was considered. Conclusion: This study provides a wide overview of the relative performance of different variants of supervised machine learning algorithms for disease prediction. This information on relative performance can be used to aid researchers in the selection of an appropriate supervised machine learning algorithm for their studies.
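The percentages in the Results are per-algorithm rates: the number of studies in which an algorithm had the highest accuracy, divided by the number of studies that applied it. The following is a minimal sketch of that arithmetic, not the authors' code. The function name `top_rate` is hypothetical, and the SVM count of about 12 is an inference from the rounded 41% figure, not a number reported in the abstract.

```python
def top_rate(times_best: int, times_applied: int) -> float:
    """Share of studies in which an algorithm had the highest accuracy,
    among the studies that applied it (hypothetical helper)."""
    if times_applied <= 0:
        raise ValueError("times_applied must be positive")
    return times_best / times_applied


# Figures stated in the abstract: RF was best in 9 of the 17 studies using it.
rf_rate = top_rate(9, 17)
print(f"RF top-accuracy rate: {rf_rate:.0%}")

# Inference only: SVM was applied in 29 studies and was best in 41% of them,
# which is consistent with roughly 12 studies (0.41 * 29 = 11.89).
implied_svm_best = round(0.41 * 29)
print(f"Implied SVM count: about {implied_svm_best} of 29")
```

Because the denominators differ (17 studies for RF, 29 for SVM), the rates compare how often each algorithm came out on top when it was tried, not how often it was tried.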
The goal of this study was to understand research trends and collaboration patterns, together with scholarly impact, within the domain of global obesity research. We developed and analysed bibliographic affiliation data collected from 117,340 research articles on the topic of obesity indexed in the Scopus database and published from 1993 to 2012. We found steady, approximately exponential growth in publication numbers. Research output in global obesity research roughly doubled every 5 years, with almost 80% of the publications and authors coming from the second decade (2003-2012). The highest publication output was from the USA: 42% of publications had at least one author from the USA. Many US institutions also ranked highly in terms of research output and collaboration. Fifteen of the top 20 institutions in terms of publication output were from the USA; however, several European and Japanese research institutions ranked more highly in terms of average citations per paper. The majority of obesity research and collaboration has been confined to developed countries, although developing countries have shown higher growth in recent times, e.g. the publication ratio between 2003-2012 and 1993-2002 for developing regions was much higher than that of developed regions (9:1 vs. 4:1). We also identified around 42 broad disciplines from authors' affiliation data, and these showed strong collaboration with one another. Overall, this study provides one of the most comprehensive longitudinal bibliometric analyses of obesity research. This should help in understanding research trends, spatial density, collaboration patterns and the complex multi-disciplinary nature of research in the obesity domain.
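The growth figures above fit together arithmetically. As a rough check, the sketch below assumes output doubles exactly every 5 years across the four 5-year periods of 1993-2012; this idealised model is an assumption for illustration, not the authors' data or method, and the variable names are hypothetical.

```python
# Relative output per 5-year period under exact doubling (idealised assumption):
# 1993-1997, 1998-2002, 2003-2007, 2008-2012
periods = [2 ** k for k in range(4)]  # [1, 2, 4, 8]

first_decade = sum(periods[:2])   # 1993-2002
second_decade = sum(periods[2:])  # 2003-2012

# Share of all output that falls in the second decade.
second_decade_share = second_decade / (first_decade + second_decade)

# Publication ratio between the second and first decades.
decade_ratio = second_decade / first_decade

print(f"Second-decade share: {second_decade_share:.0%}")
print(f"Decade ratio: {decade_ratio:.0f}:1")
```

Under this assumption the second decade accounts for 80% of output, in line with the "almost 80%" stated above, and the decade ratio is 4:1, the same as the ratio reported for developed regions. A 9:1 ratio, as reported for developing regions, would correspond to output tripling rather than doubling every 5 years under the same kind of model.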