Haya Salah scite author profile

Data Curation on data streams is effective in operating and reducing costs of BIG DATA analytic. Basically, analytic preparation requires data curation of available heterogeneous data sets available in big data clusters and such analytic process becomes harder when it comes to the concept of conducting the curation process on Data-on-Motion, in order to come at actionable insights and valuable analytic on a real-time basis including the Machine Learning further analytic and processing. In our paper, we identified and surveyed the different issues and challenges among different areas that are related to the big data. In addition to investigate, the most common techniques and methods followed through the implementations including Streams Curation, the Machine Learning Different Algorithms used in such implementations and the Feature Engineering different techniques that can be considered as curation pre-processing paradigm for data streams analytic. Furthermore, our paper shows the different application areas were data curation concept plays a critical role. Finally, we draw the map between the techniques and methods that are related to the data curation field to emphasize on its main critical role among Business, Retails, Culture, Arts, Health, Medicine, Social Media, Wireless Sensor Networks, Natural Language Processing (NLP) and Automated Feature Engineering (FE). On other hand, we identified the different issues and challenges among different areas including the IoT and Media Streams Curation to help the scholars in this region accordingly.

show abstract

Predict, then schedule: Prescriptive analytics approach for machine learning-enabled sequential clinical scheduling

Salah

Srinivas

2022

Computers & Industrial Engineering

View full text Add to dashboard Cite

Performance Evaluation of DigiMesh and ZigBee Wireless Mesh Networks

Khalifeh¹,

Salah²,

Alouneh³

et al. 2018

View full text Add to dashboard Cite

Explainable machine learning framework for predicting long-term cardiovascular disease risk among adolescents

Salah

Srinivas

2022

Sci Rep

View full text Add to dashboard Cite

Although cardiovascular disease (CVD) is the leading cause of death worldwide, over 80% of it is preventable through early intervention and lifestyle changes. Most cases of CVD are detected in adulthood, but the risk factors leading to CVD begin at a younger age. This research is the first to develop an explainable machine learning (ML)-based framework for long-term CVD risk prediction (low vs. high) among adolescents. This study uses longitudinal data from a nationally representative sample of individuals who participated in the Add Health study. A total of 14,083 participants who completed relevant survey questionnaires and health tests from adolescence to young adulthood were chosen. Four ML classifiers [decision tree (DT), random forest (RF), extreme gradient boosting (XGBoost), and deep neural networks (DNN)] and 36 adolescent predictors are used to predict adulthood CVD risk. While all ML models demonstrated good prediction capability, XGBoost achieved the best performance (AUC-ROC: 84.5% and AUC-PR: 96.9% on testing data). Besides, critical predictors of long-term CVD risk and its impact on risk prediction are obtained using an explainable technique for interpreting ML predictions. The results suggest that ML can be employed to detect adulthood CVD very early in life, and such an approach may facilitate primordial prevention and personalized intervention.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Haya Salah

Consultation length and no-show prediction for improving appointment scheduling efficiency at a cardiology clinic: A data analytics approach

Data Streams Curation for Better Machine Learning Functionality and Result to Serve IoT and other Applications: A Survey

Predict, then schedule: Prescriptive analytics approach for machine learning-enabled sequential clinical scheduling

Performance Evaluation of DigiMesh and ZigBee Wireless Mesh Networks

Explainable machine learning framework for predicting long-term cardiovascular disease risk among adolescents

Contact Info

Product

Resources

About