Jabed H. Tomal scite author profile

Statistical detection of a rare class of objects in a two-class classification problem can pose several challenges. Because the class of interest is rare in the training data, there is relatively little information in the known class response labels for model building. At the same time the available explanatory variables are often moderately high dimensional. In the four assays of our drug-discovery application, compounds are active or not against a specific biological target, such as lung cancer tumor cells, and active compounds are rare. Several sets of chemical descriptor variables from computational chemistry are available to classify the active versus inactive class; each can have up to thousands of variables characterizing molecular structure of the compounds. The statistical challenge is to make use of the richness of the explanatory variables in the presence of scant response information. Our algorithm divides the explanatory variables into subsets adaptively and passes each subset to a base classifier. The various base classifiers are then ensembled to produce one model to rank new objects by their estimated probabilities of belonging to the rare class of interest. The essence of the algorithm is to choose the subsets such that variables in the same group work well together; we call such groups phalanxes.

show abstract

The Impact of COVID-19 on Students’ Marks: A Bayesian Hierarchical Modeling Approach

Tomal

Rahmati

Boroushaki

et al. 2021

METRON

View full text Add to dashboard Cite

Due to COVID-19, universities across Canada were forced to undergo a transition from classroom-based face-to-face learning and invigilated assessments to online-based learning and non-invigilated assessments. This study attempts to empirically measure the impact of COVID-19 on students’ marks from eleven science, technology, engineering, and mathematics (STEM) courses using a Bayesian linear mixed effects model fitted to longitudinal data. The Bayesian linear mixed effects model is designed for this application which allows student-specific error variances to vary. The novel Bayesian missing value imputation method is flexible which seamlessly generates missing values given complete data. We observed an increase in overall average marks for the courses requiring lower-level cognitive skills according to Bloom’s Taxonomy and a decrease in marks for the courses requiring higher-level cognitive skills, where larger changes in marks were observed for the underachieving students. About half of the disengaged students who did not participate in any course assessments after the transition to online delivery were in special support.

show abstract

Ecological models for estimating breakpoints and prediction intervals

Tomal

Ciborowski

2020

Ecology and Evolution

View full text Add to dashboard Cite

show abstract

Measuring statistical evidence and multiple testing

Evans

Tomal

2018

FACETS

View full text Add to dashboard Cite

The measurement of statistical evidence is of considerable current interest in fields where statistical criteria are used to determine knowledge. The most commonly used approach to measuring such evidence is through the use of p-values, even though these are known to possess a number of properties that lead to doubts concerning their validity as measures of evidence. It is less well known that there are alternatives with the desired properties of a measure of statistical evidence. The measure of evidence given by the relative belief ratio is employed in this paper. A relative belief multiple testing algorithm was developed to control for false positives and false negatives through bounds on the evidence determined by measures of bias. The relative belief multiple testing algorithm was shown to be consistent and to possess an optimal property when considering the testing of a hypothesis randomly chosen from the collection of considered hypotheses. The relative belief multiple testing algorithm was applied to the problem of inducing sparsity. Priors were chosen via elicitation, and sparsity was induced only when justified by the evidence and there was no dependence on any particular form of a prior for this purpose.

show abstract

Model and variable selection using machine learning methods with applications to childhood stunting in Bangladesh

Khan

Tomal

Raheem

2021

Informatics for Health and Social Care

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Jabed H. Tomal

Ensembling classification models based on phalanxes of variables with applications in drug discovery

The Impact of COVID-19 on Students’ Marks: A Bayesian Hierarchical Modeling Approach

Ecological models for estimating breakpoints and prediction intervals

Measuring statistical evidence and multiple testing

Model and variable selection using machine learning methods with applications to childhood stunting in Bangladesh

Contact Info

Product

Resources

About