2019
DOI: 10.1007/978-3-030-32520-6_12

Preventing Overfitting by Training Derivatives

Abstract: Derivative training is a well-known method to improve the accuracy of neural networks. In the forward pass, not only the output values are computed but also their derivatives, and the deviations of these derivatives from their target values are included in the cost function, which is minimized with respect to the weights by a gradient-based algorithm. So far, this method has been implemented for relatively low-dimensional tasks. In this study, we apply the approach to the problem of image analysis. We consider the task of re…
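As a rough illustration of the cost function described in the abstract (a sketch only, not the paper's own code), the following PyTorch snippet adds a derivative-error term to an ordinary value loss. The network architecture, the weighting factor lambda_d, and the availability of target derivatives dy_target are illustrative assumptions.

# Minimal sketch of derivative training (illustrative, not the paper's code).
# The cost penalizes both the value error and the error of the input derivative,
# assuming target derivatives dy_target are known at the training points.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def derivative_training_step(x, y_target, dy_target, lambda_d=1.0):
    x = x.requires_grad_(True)                      # enable d(output)/d(input)
    y = model(x)
    # First derivative of the network output w.r.t. its input, kept in the graph
    # (create_graph=True) so the weight gradients see the derivative error too.
    dy = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                             create_graph=True)[0]
    loss = ((y - y_target) ** 2).mean() + lambda_d * ((dy - dy_target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Setting lambda_d to zero recovers ordinary value-only training, which makes the extra derivative term easy to ablate.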

Cited by 4 publications (5 citation statements). References 69 publications.
“…Over-fitting and under-fitting are common fitting problems in model training (Dahan and Keller 2021; Graham et al 2020; Yuan et al 2020). Over-fitting treats local features of the training data set as if they were features of the whole distribution; it arises when a model has so many parameters that the training error becomes small while the test error remains large (Yildirim and Ozkale 2021; Avrutskiy 2020). When under-fitting occurs, the model's learning ability is too weak to capture even the basic features of the training data set (Handelman et al 2019; Ahmed and Isa 2017; Van Calster and Vickers 2015).…”
Section: Evaluation Metrics For Machine Learning (mentioning)
confidence: 99%
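The claim above can be illustrated with a toy experiment (not taken from the cited works): fitting polynomials of increasing degree to a few noisy samples drives the training error down while the test error eventually grows, and a degree that is too low under-fits both.

# Toy illustration of the over-fitting claim: a high-degree polynomial makes the
# training error small while the test error grows; a low degree under-fits both.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 15)
x_test = np.linspace(0, 1, 200)
f = lambda x: np.sin(2 * np.pi * x)
y_train = f(x_train) + 0.1 * rng.standard_normal(x_train.size)

for degree in (1, 3, 12):                           # few vs. many parameters
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - f(x_test)) ** 2)
    print(f"degree={degree:2d}  train MSE={train_err:.4f}  test MSE={test_err:.4f}")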
“…The capacity of a model is defined by its ability to fit a variety of functions. The model architecture, the number of hidden layers, the size of the weights, and the size of the hypothesis space all affect this capacity [10,39,40]. Weight decay is a term added to the objective function to penalize large weights [10,27]. In the early stopping technique, training is stopped before the weights become large, so that the hidden units do not enter their non-linear ranges, which would result in high capacity [29].…”
Section: Adapting Capacity (mentioning)
confidence: 99%
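The two capacity-control techniques named in this statement can be sketched in a few lines of PyTorch; the model, data, and hyperparameters below are placeholders chosen only to make the snippet self-contained.

# Weight decay (an L2 penalty handled by the optimizer) and early stopping
# (halt once the validation loss stops improving). Illustrative model and data.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
x, y = torch.randn(256, 10), torch.randn(256, 1)
x_val, y_val = torch.randn(64, 10), torch.randn(64, 1)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)
loss_fn = nn.MSELoss()

best_val, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(1000):
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()
    with torch.no_grad():
        val_loss = loss_fn(model(x_val), y_val).item()
    if val_loss < best_val - 1e-6:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:                  # stop before the weights grow further
            break

Here weight decay is the optimizer's weight_decay argument, and early stopping halts training once the validation loss has not improved for "patience" epochs.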
“…Data augmentation is an effective approach for enlarging small datasets by flipping, cropping, rotating, and zooming images and by adjusting their brightness, sharpness, and contrast. These operations rely on existing samples to create new samples across the entire training dataset [10,35].…”
Section: Data Enrichment (mentioning)
confidence: 99%
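A minimal sketch of such an augmentation pipeline, assuming torchvision is available; the particular transforms and parameter values are illustrative choices, not those used in the cited works.

# Flip, crop, rotate, and brightness/contrast/sharpness jitter with torchvision.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),
    transforms.RandomRotation(degrees=15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.RandomAdjustSharpness(sharpness_factor=2, p=0.5),  # needs a recent torchvision
    transforms.ToTensor(),
])
# Each call produces a new training sample from an existing image,
# e.g. augmented = augment(pil_image)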
“…The first idea of approximating a differential equation with an MLP neural network was proposed by I. E. Lagaris using a single-hidden-layer network. Later, Avrutskiy developed the idea further by using an MLP with more than one hidden layer. The basic idea of this paper is inspired by these works.…”
Section: Neural Network Approximation Of FPE (mentioning)
confidence: 99%
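The general approach referenced here can be sketched as follows: a small network u(x) is differentiated with automatic differentiation and trained so that the residual of a differential equation vanishes on a set of collocation points. The example equation u'(x) = -u(x) with u(0) = 1 is purely illustrative and is not the equation considered in the citing paper.

# Sketch: train a one-hidden-layer MLP so that u'(x) + u(x) = 0 on [0, 1], u(0) = 1.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))  # one hidden layer
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
x = torch.linspace(0, 1, 50).reshape(-1, 1).requires_grad_(True)     # collocation points

for step in range(5000):
    u = net(x)
    du = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u),
                             create_graph=True)[0]
    residual = du + u                                # should vanish everywhere
    ic = (net(torch.zeros(1, 1)) - 1.0) ** 2         # penalty for the initial condition
    loss = (residual ** 2).mean() + ic.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()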
“…In earlier work, a single-hidden-layer network is applied to find the solution of differential equations. Recently, networks with more than one hidden layer have also been investigated, both to enhance the function-approximation power of the neural network and to find the solution of PDEs. In one of these works, a special form of the solution of the differential equation was chosen so as to satisfy the initial and boundary conditions.…”
Section: Introduction (mentioning)
confidence: 99%
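The "special form of the solution" mentioned here refers to a trial function built so that the initial and boundary conditions hold by construction, leaving the network to learn only the interior behaviour. A minimal sketch for an initial condition u(0) = u0 might look as follows (the names and the specific construction are illustrative).

# Trial solution u_t(x) = u0 + x * N(x): equals u0 at x = 0 for any network weights.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
u0 = 1.0

def trial_solution(x):
    return u0 + x * net(x)

For a two-point boundary condition u(0) = a, u(1) = b, the same idea uses a trial function a*(1 - x) + b*x + x*(1 - x)*N(x), so the network term vanishes on the boundary.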