Mariam Nassar scite author profile

White blood cell (WBC) differential counting is an established clinical routine to assess patient immune system status. The current state‐of‐the‐art method for determining WBC differential counts requires fluorescent markers and a flow cytometer. However, this process involves several sample preparation steps and may adversely disturb the cells. We present a novel label‐free approach that uses an imaging flow cytometer and machine learning algorithms to classify live, unstained WBCs. The approach achieved an average F1‐score of 97%, and two subtypes of WBCs, B and T lymphocytes, were distinguished from each other with an average F1‐score of 78%, a task previously considered impossible for unlabeled samples. We provide an open‐source workflow to carry out the procedure. We validated the WBC analysis with unstained samples from 85 donors. The presented method enables robust and highly accurate identification of WBCs, minimizing the disturbance to the cells and leaving marker channels free to answer other biological questions. It also opens the door to employing machine learning for liquid biopsy, using the rich information in cell morphology for a wide range of diagnostics of primary blood. © 2019 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of International Society for Advancement of Cytometry.
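As a rough illustration of the metric quoted in the abstract above, the sketch below computes a per-class F1-score and its unweighted (macro) average. Whether the study averaged its scores this way is an assumption, and the labels are toy values made up for the example, not data from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, treating each label one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy labels for two lymphocyte classes (hypothetical, for illustration only).
y_true = ["B", "B", "T", "T", "T", "T"]
y_pred = ["B", "T", "T", "T", "T", "B"]
print(per_class_f1(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

With a macro average every class counts equally, so a rare cell type that is classified poorly pulls the average down as much as a common one would.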


LoRAS: an oversampling approach for imbalanced datasets

Bej

Davtyan

Wolfien

et al. 2020

Mach Learn


The Synthetic Minority Oversampling TEchnique (SMOTE) is widely used for the analysis of imbalanced datasets. It is known that SMOTE frequently over-generalizes the minority class, leading to misclassifications for the majority class and affecting the overall balance of the model. In this article, we present an approach that overcomes this limitation of SMOTE, employing Localized Random Affine Shadowsampling (LoRAS) to oversample from an approximated data manifold of the minority class. We benchmarked our algorithm on 14 publicly available imbalanced datasets using three different Machine Learning (ML) algorithms and compared the performance of LoRAS with that of SMOTE and several SMOTE extensions that, like LoRAS, use convex combinations of minority class data points for oversampling. We observed that LoRAS, on average, generates better ML models in terms of F1-score and balanced accuracy. Another key observation is that while most of the SMOTE extensions we tested improve the F1-score with respect to SMOTE on average, they compromise on the balanced accuracy of the classification model. LoRAS, on the contrary, improves both the F1-score and the balanced accuracy and thus produces better classification models. Moreover, to explain the success of the algorithm, we have constructed a mathematical framework to prove that the LoRAS oversampling technique provides a better estimate of the mean of the underlying local data distribution of the minority class data space.
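The idea described in the abstract above can be sketched roughly as follows: pick a minority point, perturb its neighbours with small noise to obtain "shadowsamples", and emit a random convex combination of a few of them. This is only an illustration in the spirit of the abstract, not the authors' implementation. The function names, the Gaussian noise model, and every parameter (`k`, `n_shadow`, `sigma`, `n_combine`) are assumptions introduced here.

```python
import math
import random


def nearest_neighbours(point, points, k):
    """Return the k points closest to `point` (Euclidean), itself included."""
    return sorted(points, key=lambda q: math.dist(point, q))[:k]


def random_convex_weights(n, rng):
    """Draw n non-negative weights that sum to 1."""
    raw = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def loras_like_oversample(minority, n_new, k=3, n_shadow=4,
                          sigma=0.05, n_combine=3, seed=0):
    """Generate n_new synthetic minority points (hypothetical sketch)."""
    rng = random.Random(seed)
    dim = len(minority[0])
    synthetic = []
    while len(synthetic) < n_new:
        centre = rng.choice(minority)
        neighbourhood = nearest_neighbours(centre, minority, k)
        # Shadowsamples: noisy copies of each point in the neighbourhood.
        shadows = [
            [x + rng.gauss(0.0, sigma) for x in p]
            for p in neighbourhood
            for _ in range(n_shadow)
        ]
        # Combine a few shadowsamples with random convex weights.
        chosen = rng.sample(shadows, n_combine)
        weights = random_convex_weights(n_combine, rng)
        synthetic.append(
            [sum(w * s[d] for w, s in zip(weights, chosen)) for d in range(dim)]
        )
    return synthetic


# Toy minority class in 2-D (made-up points, for illustration only).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = loras_like_oversample(minority, n_new=5)
print(new_points)
```

By contrast, plain SMOTE interpolates between a minority point and one of its neighbours. Combining several noisy shadowsamples, as sketched here, draws each synthetic point from a small local neighbourhood instead of from a single line segment.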


LoRAS: An oversampling approach for imbalanced datasets

Bej

Davtyan

Wolfien

et al. 2019

Preprint


The Synthetic Minority Oversampling TEchnique (SMOTE) is widely used for the analysis of imbalanced datasets. It is known that SMOTE frequently overgeneralizes the minority class, leading to misclassifications for the majority class and affecting the overall balance of the model. In this article, we present an approach that overcomes this limitation of SMOTE, employing Localized Random Affine Shadowsampling (LoRAS) to oversample from an approximated data manifold of the minority class. We benchmarked our LoRAS algorithm on 28 publicly available datasets and show that drawing samples from an approximated data manifold of the minority class is the key to successful oversampling. We compared the performance of LoRAS, SMOTE, and several SMOTE extensions and observed that, for imbalanced datasets, LoRAS on average generates better Machine Learning (ML) models in terms of F1-score and balanced accuracy. Moreover, to explain the success of the algorithm, we have constructed a mathematical framework to prove that LoRAS is a more effective oversampling technique, since it provides a better estimate of the mean of the underlying local data distribution of the minority class data space.
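To show why balanced accuracy is reported alongside the F1-score for imbalanced data, the minimal sketch below compares it with plain accuracy. Balanced accuracy is taken here as the mean of the per-class recalls, which is the usual definition. The labels are made up for illustration and are not results from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class weighs the same."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Toy imbalanced problem: nine majority samples (0), one minority sample (1).
y_true = [0] * 9 + [1]
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 10

print(accuracy(y_true, y_pred))
print(balanced_accuracy(y_true, y_pred))
```

A classifier that ignores the minority class still scores 0.9 on plain accuracy in this toy case but only 0.5 on balanced accuracy, which is why the latter is the more informative figure when classes are imbalanced.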

Label‐Free Identification of White Blood Cells Using Machine Learning
