Using K-Fold Cross Validation Proposed Models for Spikeprop Learning Enhancements

Ahmed, Falah Y. H.; Ali, Yasir Hassan; Shamsuddin, Siti Mariyam

doi:10.14419/ijet.v7i4.11.20790

Cited by 17 publications

(11 citation statements)

References 38 publications

“…Random Forests were implemented by repeatedly fitting the model to 1000 resampled subsets of the data (100 repeats of 10-fold cross-validation). For each repetition, the dataset was divided into 10 folds, of which 9 folds were used to perform an inner 10-fold cross-validation [20]. The number of trees to grow and the number of predictors randomly sampled as candidates in each split were set to default [21] (number of trees = 500; number of predictors randomly selected = 2, 19 and 36), and the optimization criterion was maximization of the area under the Receiver Operating Characteristic (ROC) curve, known as AUC [22].…”

Section: Methods (mentioning)

confidence: 99%

Antibody selection strategies and their impact in the analysis of malaria multi-sera data

Fonseca, Biecek, Cordeiro, et al. 2022

Preprint

Identifying antibody responses associated with natural immunity to malaria is key to advancing antimalarial vaccine development. With the advent of high-throughput serological assay robust pipelines that produce solid and reproducible results are crucial for the identification of the antibody responses that lead to malaria protection. Here we have developed two pipelines and assessed their predictive performance on published data from IgG antibody responses against 36 antigens derived from Plasmodium falciparum in 121 Kenyan children with ages below 10 years old. The first pipeline relied on parametric approaches while the second represented a more pragmatic approach to data analysis. The proposed pipelines enabled us to construct classifiers based on few antibodies, whose performances outperformed previous findings based on Random Forest. The best classifier overall was based on antibodies against the msp2, msp4, msp7, msp10, pf11_0373 and pf113 antigens and reached a predictive performance of 86% (AUC = 0.86; 95% CI = (0.79-0.93)) using the pragmatic approach. Concerning the parametric approach, our best achievement was a classifier against the h103, msp2, msp4, msp7 and msrp3 antigens with a predictive performance of 82% (AUC = 0.82; 95% CI = (0.75-0.90)). The good performance of our pipelines suggests their applicability in antibody data analysis intending to identify antimalarial vaccine candidates. In summary, we were able to identify several antibody responses with high predictive ability against clinical malaria. The proposed pipelines also showed promise to improve the statistical analysis of antibody data aiming to identify antimalarial vaccine candidates.
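
The resampling count in the citation statement above (100 repeats of 10-fold cross-validation giving 1000 resampled subsets) can be checked with a minimal sketch. This is not the cited authors' code: the function name `repeated_k_fold` and the seed are hypothetical, the sample size of 121 is taken from the abstract, and the Random Forest fit and the inner 10-fold cross-validation are deliberately left out.

```python
import random


def repeated_k_fold(n_samples, k, repeats, seed=0):
    """Yield (repeat, fold, train_idx, held_out_idx) for repeated k-fold CV.

    Hypothetical helper for illustration only; a model would be fitted
    (and tuned with an inner cross-validation) on each train_idx.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every repeat
        # Fold i takes every k-th shuffled index, so sizes differ by at most one.
        folds = [indices[i::k] for i in range(k)]
        for i in range(k):
            held_out = folds[i]
            train = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield r, i, train, held_out


# 121 samples as in the abstract; 100 repeats of 10 folds as in the statement.
resamples = list(repeated_k_fold(n_samples=121, k=10, repeats=100))
print(len(resamples))  # 1000
```

Each of the 1000 training subsets holds 9 of the 10 folds, so with 121 samples it contains 108 or 109 observations.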

“…K-fold Cross-Validation involves splitting the data into k subsets. One of the k subsets is used as the validation set, while the other k-1 subsets are used as the training set [22][23].…”

Section: Train/test (mentioning)

confidence: 99%

Pediatric Diabetes Prediction Using Deep Learning

El-Bashbishy, El-Bakry 2023

Preprint

The present study proposes a novel technique for the early prediction of diabetes with the utmost accuracy. Recently, the contemporary methodologies of artificial intelligence and in particular Deep Learning (DL), have proven to be expeditious in the diagnosis of diabetes. The model that is supported has been constructed with the implementation of two hidden layers and a multitude of epochs of Deep Learning Neural Network (DLNN) utilizing the Multi-Layer Perceptron (MLP) technique. We proceeded to meticulously adjust the hyperparameters within the fully automated DLNN architecture, with the aim of optimizing data pre-processing, classification and prediction. This was accomplished by a novel dataset of Mansoura University Children's Hospital Diabetes (MUCHD), which allowed for a more comprehensive evaluation of the system’s performance. The system is validated and tested on a sample of 548 patients, each exhibiting 18 significant features. Various validation metrics were employed to ensure the accuracy and reliability of the results like K-folds, leave-one-subject-out and cross-validation approaches with various statistical measures of accuracy, f-score, precision, sensitivity, specificity and dice similarity coefficient. The high-performance level of the proposed system can help clinicians to accurately diagnose health and different diabetes grades with a remarkable accuracy rate of 99.8%. According to our analysis, the implementation of this method results in a noteworthy increase of 4.15% in overall system performance when compared to the current state-of-the-art. As such, we highly recommend the utilization of this method as a promising tool for forecasting diabetes.
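
The k-fold split described in the citation statement for this entry (one of the k subsets validates, the other k-1 train) can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the cited study's pipeline: the helper name `k_fold_splits` and the seed are hypothetical, the 548 samples come from the abstract, and k = 10 is an assumption because the excerpt does not state the value used.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; indices refer to rows of a dataset.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        # The remaining k-1 folds together form the training set.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, val_idx


# 548 samples as in the abstract; k = 10 is an assumed value.
splits = list(k_fold_splits(n_samples=548, k=10))
for train_idx, val_idx in splits[:2]:
    print(len(train_idx), len(val_idx))
```

Every sample lands in exactly one validation fold, so each observation is used for validation once and for training k-1 times.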

“…The (𝑘 parameter) refers to the number of groups that the dataset will split into. In this experiment, we used 10-fold cross-validation [30], [31].…”

Section: B Dataset Split (mentioning)

confidence: 99%

Detecting Fake News in Social Media Using Voting Classifier

Elsaeed, El-Daydamony, Elmogy, et al. 2021

IEEE Access

The availability of social media, blogs, and websites to everyone creates a lot of problems. False news is a critical issue that can affect individuals or entire countries. Fake news can be created and shared all over the world. The 2016 presidential election in the United States illustrates that problem. As a result, controlling social media is essential. Machine learning (ML) algorithms help to detect fake news automatically. This article proposes a framework for detecting fake news based on feature extraction and feature selection algorithms and a set of voting classifiers. The proposed system distinguishes fake news from real news. First, we preprocessed the data taking unnecessary characters and numbers and reducing the words in the dictionary (lemmatization). Second, we extracted some important features using two feature extraction types: the term frequency-inverse document frequency (TF-IDF) technique and the DOC2VEC algorithm, a word embedding technique. Third, the extracted characteristics were reduced with the help of the chi-square algorithm and the analysis of variance (ANOVA) algorithm. We used three data sets that are published online: Media-Eval, Fake-or-Real-News, and ISOT. To evaluate the proposed framework, we used five different performance metrics: accuracy (ACC), the area under the curve (AUC), precision, recall, and f1-score. Our system achieved 94.6% of ACC for the Fake-or-Real dataset. For the Media-Eval dataset, the system achieved 92.3% of ACC. For the ISOT dataset, the system achieved 100% of ACC. We contrast the proposed framework with several other classification algorithms. The experimental results show that the proposed framework outperforms the existing works in terms of ACC by 0.2% for the ISOT dataset.

Index terms: Fake news, News classification, Voting classifier, Term frequency-inverse document frequency, Chi-square

Eman Elsaeed received the B.Sc. degree in information technology department from the Faculty of Computers and Information, Mansoura University, Egypt, in 2014. Currently, she is an M.Sc. student at the Faculty of Computers and Information, Mansoura University, Egypt. Her research interests include Artificial intelligence, machine learning, and data mining.
