Linear regression models are traditionally used to capture the relation between input and output variables. Because linear models cannot account for nonlinear relations in the data, their predictions may be inaccurate. For this reason, machine learning-based models are increasingly used. Rational estimation of dispersed-phase holdup and drop size is crucial for the modeling, design, and scale-up of rotating disc contactors (RDCs). We have employed random forest (RF) and autoencoder-RF-based models to predict dispersed-phase holdup and drop size in RDCs. Our results show that both models predict drop size quite well. For holdup, the predictions of the autoencoder-RF combination are not satisfactory, whereas those of the standalone RF model generalize very well. RF-based models can be further used to predict other variables of interest in RDCs.
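The limitation of linear models noted above can be illustrated with a minimal sketch. The data here are a toy quadratic relation chosen for illustration, not RDC measurements, and the helper names `fit_line` and `r_squared` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(ys, preds):
    """Coefficient of determination of predictions against observations."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Toy nonlinear relation (an assumption for illustration): y = x^2 on a symmetric grid.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

a, b = fit_line(xs, ys)
preds = [a + b * x for x in xs]
r2 = r_squared(ys, preds)
print(f"intercept={a:.2f}, slope={b:.2f}, R^2={r2:.2f}")
```

On this symmetric toy data the fitted slope is zero and the line explains none of the variance, although y is fully determined by x. Nonlinear learners such as random forests are intended to capture relations of this kind.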
Understanding the star-formation properties of galaxies as a function of cosmic epoch is a critical exercise in studies of galaxy evolution. Traditionally, stellar population synthesis models have been used to obtain best-fit parameters that characterise star formation in galaxies. As multiband flux measurements become available for thousands of galaxies, an alternative approach to characterising star formation using machine learning becomes feasible. In this work, we present the use of deep learning techniques to predict three important star formation properties: stellar mass, star formation rate and dust luminosity. We characterise the performance of our deep learning models through comparisons with outputs from a standard stellar population synthesis code. Deep learning is inspired by the synaptic connections of
Bacterial virulence can be attributed to a wide variety of factors, including toxins that harm the host. Pore-forming toxins are one class of toxins that confer virulence on bacteria and are promising targets for therapeutic intervention. In this work, we develop a sequence-based machine learning framework for the prediction of pore-forming toxins. For this, we use a distributed representation of the protein sequence, encoded by reduced alphabet schemes based on conformational similarity and hydropathy index, as input features to Support Vector Machines (SVMs). The choice of conformational similarity and hydropathy indices is based on the functional mechanism of pore-forming toxins. Our methodology achieves about 81% accuracy, indicating that conformational similarity, an indicator of the flexibility of amino acids, together with the hydropathy index can capture the intrinsic features of pore-forming toxins that distinguish them from other types of transporter proteins. Increased understanding of the mechanisms of pore-forming toxins can contribute to the use of such “mechanism-informed” features, which may further increase the prediction accuracy.
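The idea of a reduced-alphabet encoding can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the three hydropathy classes and their thresholds are hypothetical, and simple k-mer frequencies stand in for the distributed (embedding-based) representation described above. The hydropathy values are the Kyte-Doolittle scale.

```python
from collections import Counter
from itertools import product

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def reduce_residue(aa):
    """Map one residue to a 3-letter reduced alphabet (thresholds are assumptions)."""
    h = KD[aa]  # raises KeyError for non-standard residues such as 'X'
    if h >= 1.8:
        return "H"  # hydrophobic
    if h <= -3.2:
        return "P"  # strongly hydrophilic
    return "N"      # intermediate


def reduce_sequence(seq):
    """Re-encode a protein sequence in the reduced alphabet."""
    return "".join(reduce_residue(aa) for aa in seq.upper())


def kmer_features(seq, k=3):
    """Relative frequencies of all reduced-alphabet k-mers, in a fixed order."""
    reduced = reduce_sequence(seq)
    counts = Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))
    total = sum(counts.values())
    vocab = ["".join(p) for p in product("HNP", repeat=k)]
    return [counts[kmer] / total if total else 0.0 for kmer in vocab]


example = "MKVLA"  # hypothetical toy peptide, not a real toxin sequence
print(reduce_sequence(example))
print(len(kmer_features(example)))
```

A fixed-length vector of this kind could then be passed to a classifier such as an SVM. Reducing 20 letters to 3 shrinks the k-mer vocabulary from 8000 to 27 for k = 3, which is one motivation for reduced alphabets when training data are limited.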
Prediction of liquid holdup is important both in designing trickle bed contactors and in evaluating their performance. The present work focuses on the development of Gradient Boosting Machines (GBM) for the prediction of total and dynamic liquid holdup in trickle bed reactors. A comprehensive data set of 394 data points of total liquid holdup and 416 data points of dynamic liquid holdup, curated from the open literature, is used in this study. We built GBM models with input data sets containing 11 governing variables. GBM provided excellent predictions for both data sets. We also compared the GBM predictions with those of Random Forest (RF) and Artificial Neural Network (ANN) models. As GBM provided the best performance, we further applied SHAP (SHapley Additive exPlanations) to the black-box GBM models to obtain local and global interpretability. We also used SHAP to identify informative subsets of governing variables. This work paves the way for the use of GBM in the prediction of hydrodynamic parameters in multiphase systems.
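The Shapley values that underlie SHAP can be illustrated with a brute-force sketch. It is not the SHAP library and not the authors' GBM model: `toy_model`, the input `x`, and the all-zero `baseline` are hypothetical, and "missing" features are simply replaced by their baseline values, which is one common simplifying assumption. Enumerating every subset is only practical for a handful of features; practical tools use faster algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Features outside the coalition are set to their baseline values.
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(features):
    """Hypothetical stand-in for a black-box model: one additive term, one interaction."""
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction at `x` and at the baseline, and the interaction term `b * c` is shared equally between the two features involved. Local explanations of this kind describe a single prediction; averaging their magnitudes over many data points gives a global ranking of variables.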