Fiber Quality Prediction Using NIR Spectral Data: Tree-Based Ensemble Learning vs. Deep Neural Networks

Nasir, Vahid; Mohammadpanah, Ahmad; Raut, Sameen; Nabavi, Mohamad; Dahlen, Joseph; Schimleck, Laurence R.

doi:10.22382/wfs-2023-10

Cited by 7 publications

(5 citation statements)

References 57 publications


“…Training on tabular data, the general superior performance of models based on the gradient-boosted decision tree over deep learning methods has been reported, specifically where machine learning models outperformed deep learning models in regression [59]. Nasir et al (2023) [27] showed that tree-based gradient-boosting machines such as LGBMs, XGBoost, and TreeNet outperformed the ANN and CNN models when predicting fiber properties using NIR spectral data (with and without applying PCA). Thus, one might speculate that the LGBM model was so robust on the original training dataset that it did not experience significant improvement in its performance by changing the size of the training data.…”

Section: Results (mentioning)

confidence: 99%

“…ANNs are biologically inspired mathematical models that can explain variations in almost any type of dataset with a good degree of accuracy. Therefore, these are one of the most widely used deep learning neural networks for regression and classification [27]. ANNs consist of input, hidden, and output layers, with the layers consisting of neurons that are interconnected by weighted links [50].…”

Section: Methods (mentioning)

confidence: 99%

“…The LGBM technique employs a histogram-based decision-tree-learning technique that optimizes memory utilization and reduces communication overhead. Numerous applied machine learning tasks have used LGBM techniques because of its excellent predictive power, effectiveness, and capacity for handling complex datasets [27, 54]. The algorithm has several key parameters that control overfitting, complexity, and the optimization process.…”

Section: Methods (mentioning)

confidence: 99%
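
The histogram-based tree learning mentioned in the statement above can be illustrated with a toy sketch. This is not LightGBM's implementation: it assumes equal-width bins, a single feature, and a squared-error objective, and the function names `bin_feature` and `best_histogram_split` are hypothetical. The idea it shows is that split candidates are evaluated once per bin boundary, from per-bin sums, rather than once per distinct feature value.

```python
def bin_feature(values, n_bins):
    """Map each continuous value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # constant feature: everything falls into bin 0
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def best_histogram_split(values, targets, n_bins=4):
    """Return (bin_index, gain) for the best split 'bin <= bin_index'.

    Gain is the reduction in the sum of squared errors, computed from
    per-bin counts and target sums only (the histogram).
    """
    bins = bin_feature(values, n_bins)

    # Build the histogram: one pass over the data.
    count = [0] * n_bins
    total = [0.0] * n_bins
    for b, t in zip(bins, targets):
        count[b] += 1
        total[b] += t

    n_all, s_all = len(targets), sum(targets)
    best_bin, best_gain = None, 0.0
    n_left, s_left = 0, 0.0

    # Scan bin boundaries: cost depends on n_bins, not on the sample count.
    for b in range(n_bins - 1):
        n_left += count[b]
        s_left += total[b]
        n_right = n_all - n_left
        if n_left == 0 or n_right == 0:
            continue
        s_right = s_all - s_left
        gain = s_left ** 2 / n_left + s_right ** 2 / n_right - s_all ** 2 / n_all
        if gain > best_gain:
            best_bin, best_gain = b, gain
    return best_bin, best_gain


# Hypothetical data: the target jumps between the 4th and 5th sample.
feature = [0, 1, 2, 3, 4, 5, 6, 7]
target = [1, 1, 1, 1, 5, 5, 5, 5]
split_bin, split_gain = best_histogram_split(feature, target, n_bins=4)
print(split_bin, split_gain)
```

Because only the per-bin counts and sums are kept, memory use scales with the number of bins rather than with the number of distinct feature values, which is the property the quoted statement refers to.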

“…Research on using machine learning and deep learning applied to NIR spectra for wood characterization and monitoring is relatively limited. Studies on small-sized NIR spectral datasets (sample sizes ranging from 172 to 480) showed artificial neural networks (ANNs) outperformed PLS regression models [24, 25, 26, 27]. Specifically, Ayanleye et al. (2021) [26] used 240 samples to train ANN and neuro-fuzzy models to predict the MOE and MOR of western hemlock (Tsuga heterophylla) and Douglas-fir (Pseudotsuga menziesii) lumber.…”

Section: Introduction (mentioning)

confidence: 99%


Utilization of Synthetic Near-Infrared Spectra via Generative Adversarial Network to Improve Wood Stiffness Prediction

Ali, Raut, Dahlen et al. 2024, Sensors

Near-infrared (NIR) spectroscopy is widely used as a nondestructive evaluation (NDE) tool for predicting wood properties. When deploying NIR models, one faces challenges in ensuring representative training data, which large datasets can mitigate but often at a significant cost. Machine learning and deep learning NIR models are at an even greater disadvantage because they typically require larger sample sizes for training. In this study, NIR spectra were collected to predict the modulus of elasticity (MOE) of southern pine lumber (training set = 573 samples, testing set = 145 samples). To account for the limited size of the training data, this study employed a generative adversarial network (GAN) to generate synthetic NIR spectra. The training dataset was fed into a GAN to generate 313, 573, and 1000 synthetic spectra. The original and enhanced datasets were used to train artificial neural networks (ANNs), convolutional neural networks (CNNs), and light gradient boosting machines (LGBMs) for MOE prediction. Overall, results showed that data augmentation using GAN improved the coefficient of determination (R²) by up to 7.02% and reduced the error of predictions by up to 4.29%. ANNs and CNNs benefited more from synthetic spectra than LGBMs, which showed only slight improvement. All models performed best when 313 synthetic spectra were added to the original training data. Further additions did not improve model performance because the quality of the data points generated by the GAN is poor beyond a certain threshold, and one of the main reasons for this may be the size of the initial training data fed into the GAN. LGBMs performed better than ANNs and CNNs on both the original and enhanced training datasets, which highlights the significance of selecting an appropriate machine learning or deep learning model for NIR spectral-data analysis.
The results highlighted the positive impact of GAN on the predictive performance of models utilizing NIR spectroscopy as an NDE technique and monitoring tool for wood mechanical-property evaluation. Further studies should investigate the impact of the initial size of training data, the optimal number of generated synthetic spectra, and machine learning or deep learning models that could benefit more from data augmentation using GANs.
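
The improvement figures quoted in the abstract above (R² up by up to 7.02%, prediction error down by up to 4.29%) are comparisons between a model trained on the original data and one trained on the augmented data. The following is a minimal sketch of how such a comparison can be computed. It assumes the percentages are relative changes and that the error metric is RMSE, neither of which the abstract states, and all numbers below are hypothetical rather than taken from the study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error (assumed error metric)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def relative_change_pct(baseline, new):
    """Relative change from baseline to new, in percent."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical test-set values and predictions from two models.
y_test = [10.0, 12.0, 14.0, 16.0]
pred_original = [11.0, 11.0, 15.0, 15.0]   # trained on original data only
pred_augmented = [10.5, 11.5, 14.5, 15.5]  # trained on original + synthetic data

r2_gain = relative_change_pct(r_squared(y_test, pred_original),
                              r_squared(y_test, pred_augmented))
rmse_change = relative_change_pct(rmse(y_test, pred_original),
                                  rmse(y_test, pred_augmented))
print(f"R2 change: {r2_gain:.2f}%  RMSE change: {rmse_change:.2f}%")
```

A negative RMSE change corresponds to a reduction in prediction error. If the reported percentages were instead absolute differences in R², the `relative_change_pct` step would be replaced by a plain subtraction.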


“…The spectra were classified using a TreeNet gradient boosting machine, a tree-based ensemble learning model. The model could successfully handle NIR spectra without the need for prior dimensionality reduction and was shown to outperform the ANN and convolutional neural network (CNN) for fiber quality prediction using NIR data [43]. Gradient boosting machines have been used in the wood science and technology literature for wood species identification [44] and predicting the properties of wood composites [45].…”

Section: Methods (mentioning)

confidence: 99%

Quality Control of Thermally Modified Western Hemlock Wood Using Near-Infrared Spectroscopy and Explainable Machine Learning

Nasir, Schimleck, Abdoli et al. 2023, Polymers

The quality control of thermally modified wood and identifying heat treatment intensity using nondestructive testing methods are critical tasks. This study used near-infrared (NIR) spectroscopy and machine learning modeling to classify thermally modified wood. NIR spectra were collected from the surfaces of untreated and thermally treated (at 170 °C, 212 °C, and 230 °C) western hemlock samples. An explainable machine learning approach was applied using a TreeNet gradient boosting machine. No dimensionality reduction was performed, so that the feature ranking results obtained from the model could be explained more readily and could provide insight into the critical wavelengths contributing to the performance of classification models. NIR spectra in the ranges of 1100–2500 nm, 1400–2500 nm, and 1700–2500 nm were fed into the TreeNet model, which resulted in classification accuracy values (test data) of 94.35%, 89.29%, and 84.52%, respectively. Feature ranking analysis revealed that when using the range of 1100–2500 nm, the changes in wood color resulted in the highest variation in NIR reflectance amongst treatments. As a result, associated features were given higher importance by TreeNet. Limiting the wavelength range increased the significance of features related to water or wood chemistry; however, these predictive models were not as accurate as the one benefiting from the impact of wood color change on the NIR spectra. The developed framework could be applied to different applications in which NIR spectra are used for wood characterization and quality control to provide improved insights into selected NIR wavelengths when developing a machine learning model.


Health Fitness Tracker System Using Machine Learning Based on Data Analytics

Veeraiah, Ramesh, Koujalagi et al. 2024, Lecture Notes in Networks and Systems
