Machine learning approaches are explored to predict the bandgaps of inorganic compounds using known compositional features, based on a dataset of 3896 compounds with experimentally measured bandgaps. In particular, alongside various existing methods, we propose a new method, a random forest with Gaussian process models as leaf nodes (RF-GP), and show its advantages. We have also investigated ensemble learning methods, which produce superior results compared with traditional machine learning methods, but at the cost of extra computational load and further reduced interpretability.
The Musculoskeletal Radiographs (MURA) dataset, released by the Stanford Machine Learning (ML) Group, contains 40,561 bone X-ray images from 14,863 studies. The images cover seven body areas of the upper extremity: wrist, elbow, finger, humerus, forearm, hand, and shoulder. Radiologists have classified the data into two classes, normal and abnormal. Six board-certified Stanford radiologists labeled the data samples by majority vote, which is considered the gold standard. The 169-layer deep model introduced by the Stanford ML Group performs on par with the gold standard except on the humerus radiographs, even though the humerus data are labeled with high accuracy. We propose comparatively shallower models, a neural network and a convolutional network with 10 hidden layers each, trained in an AdaBoost framework on the humerus data; the resulting performance is on par with, and sometimes superior to, that of the Stanford ML Group model. We evaluate the performance of our model using the validation error and Cohen’s kappa coefficient. We show that our modeling framework trains much faster than the 169-layer deep neural network introduced by the Stanford ML Group while being as accurate. We also expect the performance of our model to improve with increased computational resources.
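Cohen's kappa, one of the two evaluation metrics named above, measures agreement between two sets of labels after correcting for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the chance agreement computed from each labeler's class frequencies. The following is a minimal sketch of that standard formula, not the authors' evaluation code; the function name `cohen_kappa` and the toy labels are assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of class labels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of samples where both labelers agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each class, the product of the two marginal
    # frequencies, summed over all classes.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    classes = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in classes) / (n * n)

    # Degenerate case: both labelers always use the same single class.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Toy example (made-up labels): 1 = abnormal, 0 = normal.
reference = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
kappa = cohen_kappa(reference, predicted)
```

In this toy case the two label sets agree on three of four samples (p_o = 0.75), while chance agreement is p_e = 0.5, so kappa works out to 0.5. Values near 1 indicate agreement well beyond chance, and values at or below 0 indicate agreement no better than chance.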