A novel online supervised hyperparameter tuning procedure applied to cross-company software effort estimation

Minku, Leandro L.

doi:10.1007/s10664-019-09686-w

Cited by 33 publications

(17 citation statements)

References 54 publications


“…Many ML techniques have a number of hyperparameters that can be tuned (e.g., the learning rate, number of hidden units, or activation function) [23]. Hyperparameter tuning can have a major impact on model accuracy, and can enable significant improvements in the results of even simple ML techniques.…”

Section: RQ6 (Challenges), mentioning

confidence: 99%

Using machine learning to generate test oracles: a systematic literature review

Fontes

2021

Proceedings of the 1st International Workshop on Test Oracles


Machine learning may enable the automated generation of test oracles. We have characterized emerging research in this area through a systematic literature review examining oracle types, researcher goals, the ML techniques applied, how the generation process was assessed, and the open research challenges in this emerging field. Based on a sample of 22 relevant studies, we observed that ML algorithms generated test verdict, metamorphic relation, and (most commonly) expected output oracles. Almost all studies employ a supervised or semi-supervised approach, trained on labeled system executions or code metadata, including neural networks, support vector machines, adaptive boosting, and decision trees. Oracles are evaluated using the mutation score, correct classifications, accuracy, and ROC. Work to date shows great promise, but there are significant open challenges regarding the requirements imposed on training data, the complexity of modeled functions, the ML algorithms employed (and how they are applied), the benchmarks used by researchers, and replicability of the studies. We hope that our findings will serve as a roadmap and inspiration for researchers in this field. CCS Concepts: Software and its engineering → Software verification and validation; Computing methodologies → Machine learning.



“…In terms of online hyperparameter tuning algorithms, there are few works [17, 23-26] that use support vector machines together with batch processing, gradient solutions combined with brute force or genetic algorithms to optimise hyperparameters. Lawal and Abdulkarim [23] introduce an incremental learning-model selection method for data stream batches.…”

Section: Related Work, mentioning

confidence: 99%

“…The algorithm computes the hyperparameter gradients on the fly whenever a new datum is observed and, then, updates smoothly the hyperparameters with the average of the past and current hypergradients. Minku [25] proposed an online hyperparameter tuning method that maintains a number of model instances created from different subsets. The method applies computational brute force to find the model instance with the smallest validation error.…”

Section: Related Work, mentioning

confidence: 99%

Hyperparameter self-tuning for data streams

et al. 2021


The number of Internet of Things devices generating data streams is expected to grow exponentially with the support of emergent technologies such as 5G networks. The online processing of these data streams therefore requires the design and development of suitable machine learning algorithms, able to learn online, as data is generated. Like their batch-learning counterparts, stream-based learning algorithms require careful hyperparameter settings. However, this problem is exacerbated in online learning settings, especially with the occurrence of concept drifts, that frequently require the reconfiguration of hyperparameters. In this article, we present SSPT, an extension of the Self Parameter Tuning (SPT) optimisation algorithm for data streams. We apply the Nelder-Mead algorithm to dynamically-sized samples, converging to optimal settings in a single-pass over data, while using a relatively small number of hyperparameter configurations. In addition, our proposal automatically readjusts hyperparameters when concept drift occurs. To assess the effectiveness of SSPT, the algorithm is evaluated with three different machine learning problems: recommendation, regression, and classification. Experiments with well-known data sets show that the proposed algorithm can outperform previous hyperparameter tuning efforts by human experts. Results show that SSPT converges significantly faster and
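The brute-force idea in the citation statement about Minku [25] above (keep several model instances and use the one with the smallest validation error) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the method of the cited paper: the class names `SGDRegressor1D` and `BruteForceOnlineTuner` are hypothetical, the only tuned hyperparameter is a learning rate, and the validation error is a faded test-then-train absolute error rather than an error measured on different data subsets.

```python
import random


class SGDRegressor1D:
    """Hypothetical online model y ~ w * x + b, trained one example at a time."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate
        self.w = 0.0
        self.b = 0.0

    def predict(self, x):
        return self.w * x + self.b

    def learn(self, x, y):
        # One stochastic gradient step on the squared error.
        error = self.predict(x) - y
        self.w -= self.learning_rate * error * x
        self.b -= self.learning_rate * error


class BruteForceOnlineTuner:
    """Keeps one model per candidate learning rate (an assumed setup)."""

    def __init__(self, learning_rates, fading=0.95):
        self.models = [SGDRegressor1D(lr) for lr in learning_rates]
        self.errors = [0.0] * len(self.models)  # faded validation errors
        self.fading = fading

    def best_index(self):
        # Brute force: scan every instance for the smallest validation error.
        return min(range(len(self.models)), key=lambda i: self.errors[i])

    def predict(self, x):
        return self.models[self.best_index()].predict(x)

    def learn(self, x, y):
        for i, model in enumerate(self.models):
            # Test-then-train: score the example before the model sees it.
            abs_err = abs(model.predict(x) - y)
            self.errors[i] = (self.fading * self.errors[i]
                              + (1.0 - self.fading) * abs_err)
            model.learn(x, y)


def run_demo(n_examples=1000, seed=0):
    """Streams noiseless y = 2x + 1 and returns the selected learning rate."""
    rng = random.Random(seed)
    tuner = BruteForceOnlineTuner([0.0001, 0.01, 0.5])
    for _ in range(n_examples):
        x = rng.random()
        tuner.learn(x, 2.0 * x + 1.0)
    best = tuner.models[tuner.best_index()]
    return best.learning_rate, tuner.predict(0.5)
```

Calling `run_demo()` returns the learning rate of the currently selected instance and its prediction at `x = 0.5`. The fading factor makes older errors count less, which is one simple way to let the selection change when the best setting changes over time.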


“…In terms of how to set the parameters in real world problems, the difficulty is that the best values may change over time. Potentially, one could run multiple versions of the approach with different parameter settings [61]. The parameters 'β', 'θ' and 'Period' were analysed and their effect on prediction accuracy, ensemble size and drift detections.…”

Section: Parameters Analysis, mentioning

confidence: 99%

A heterogeneous online learning ensemble for non-stationary environments

Idrees

Minku

Stahl

et al. 2020

Knowledge-Based Systems

Self Cite


Learning in non-stationary environments is a challenging task which requires the updating of predictive models to deal with changes in the underlying probability distribution of the problem, i.e., dealing with concept drift. Most work in this area is concerned with updating the learning system so that it can quickly recover from concept drift, while little work has been dedicated to investigating what type of predictive model is most suitable at any given time. This paper aims to investigate the benefits of online model selection for predictive modelling in non-stationary environments. A novel heterogeneous ensemble approach is proposed to intelligently switch between different types of base models in an ensemble to increase the predictive performance of online learning in non-stationary environments. This approach is Heterogeneous Dynamic Weighted Majority (HDWM). It makes use of "seed" learners of different types to maintain ensemble diversity, overcoming problems of existing dynamic ensembles that may undergo loss of diversity due to the exclusion of base learners. The algorithm has been evaluated on artificial and real-world data streams against existing well-known approaches such as a heterogeneous Weighted Majority Algorithm (WMA) and a homogeneous Dynamic Weighted Majority (DWM). The results show that HDWM performed significantly better than WMA in non-stationary environments. Also, when recurring concept drifts were present, the predictive performance of HDWM showed an improvement over DWM.

