2017
DOI: 10.1007/978-3-319-57240-6_18
Cost-Complexity Pruning of Random Forests

Abstract: Random forests perform bootstrap aggregation by sampling the training samples with replacement. This enables the evaluation of the out-of-bag error, which serves as an internal cross-validation mechanism. Our motivation lies in using the unsampled training samples to improve each decision tree in the ensemble. We study the effect of using the out-of-bag samples to improve the generalization error, first of the decision trees and second of the random forest, by post-pruning. A preliminary empirical study on four UCI repository…
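
The procedure the abstract outlines, pruning each tree with its own out-of-bag samples, can be sketched as follows. This is an illustrative reconstruction, not the paper's exact method: the dataset, the forest size, and the rule of picking the ccp_alpha that maximizes OOB accuracy are all assumptions, and scikit-learn's minimal cost-complexity pruning path stands in for whatever pruning machinery the authors used.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)    # assumed dataset, not one of the paper's four
rng = np.random.default_rng(0)
n = len(X)

pruned_trees = []
for _ in range(25):                           # small forest, for illustration only
    boot = rng.integers(0, n, size=n)         # bootstrap: sample rows with replacement
    oob = np.setdiff1d(np.arange(n), boot)    # out-of-bag: the unsampled rows

    path = DecisionTreeClassifier(
        max_features="sqrt", random_state=0
    ).cost_complexity_pruning_path(X[boot], y[boot])

    # Choose the pruning level that maximizes accuracy on this tree's OOB rows
    # (an assumed selection rule; the paper may use a different criterion).
    best_alpha = max(
        path.ccp_alphas,
        key=lambda a: DecisionTreeClassifier(
            max_features="sqrt", random_state=0, ccp_alpha=a
        ).fit(X[boot], y[boot]).score(X[oob], y[oob]),
    )
    pruned_trees.append(
        DecisionTreeClassifier(max_features="sqrt", random_state=0,
                               ccp_alpha=best_alpha).fit(X[boot], y[boot])
    )

Each pruned tree then votes as usual; the abstract's claim is that this OOB-guided pruning can reduce the generalization error of both the individual trees and the ensemble.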

Cited by 6 publications (4 citation statements). References 13 publications.
“…Random Forest: Random Forest is an ensemble learning method that uses bagging and random feature selection to construct a multitude of decision trees during training [38], [40]. This classification algorithm is widely used in the data mining area.…”
Section: B. Baseline Comparison Methods
confidence: 99%
“…This process is parametrized by the complexity parameter, α, which indicates a particular tree size. How α is calculated is beyond the scope of this work, but more information on MCCP can be found in [6]. The more leaf nodes a tree has, the higher its complexity becomes and the lower the value of α.…”
Section: Fig. 1: Schematic of 2D FE Model (Left - L) and Multi-Spot Sen…
confidence: 99%
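
The inverse relationship the statement asserts, more leaves meaning a lower α, is easy to observe with scikit-learn's minimal cost-complexity pruning path; this is a hedged sketch with an assumed toy dataset, not the setup of the citing paper or of [6].

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # assumed toy dataset
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
for alpha in path.ccp_alphas:      # alphas are returned in ascending order
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X, y)
    print(f"alpha={alpha:.4f}  leaves={tree.get_n_leaves()}")

As α increases down the path, the printed leaf counts shrink, matching the statement.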
“…It should be noted that these methods are advantageous in a large set, but contraindicated in a moderate-size set. Thus, various optimization techniques have been employed to overcome this situation, including genetic algorithms (Ko et al., 2014; Mousavi et al., 2018; Taleb Zouggar and Adla, 2019), particle swarm optimization (Escalante et al., 2010, 2012; Taghavi et al., 2015), greedy algorithms (Guo and Fan, 2011; Dai, 2013; Dai and Li, 2015), semi-definite programming (Zhang et al., 2006), quadratic programming (Li and Zhou, 2009), hill climbing (Partalas et al., 2010; Indyk et al., 2014), localized generalization error (Pratama et al., 2018), bi-objective evolutionary optimization (Yin et al., 2014; Qian et al., 2015), and lately cost-complexity pruning (Kiran and Serra, 2017; Wang et al., 2017; Fernandes et al., 2017).…”
Section: The Optimization-Based Pruning
confidence: 99%
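
Among the families surveyed in that statement, hill-climbing (greedy forward selection) is compact enough to sketch. The following is an illustrative, assumption-laden example, not any cited paper's algorithm: the dataset, validation split, and stopping rule are all choices made here. It grows a sub-ensemble one tree at a time, keeping a tree only while majority-vote accuracy on a held-out split improves.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)  # assumed dataset
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
preds = np.array([t.predict(X_val) for t in forest.estimators_])

def vote_acc(idx):
    # Majority vote of the selected trees, scored on the validation split.
    maj = (preds[idx].mean(axis=0) > 0.5).astype(int)
    return (maj == y_val).mean()

chosen, remaining = [], list(range(len(preds)))
while remaining:
    best = max(remaining, key=lambda i: vote_acc(chosen + [i]))
    if chosen and vote_acc(chosen + [best]) <= vote_acc(chosen):
        break                    # stop once adding a tree no longer helps
    chosen.append(best)
    remaining.remove(best)

print(f"kept {len(chosen)} of {len(preds)} trees, val acc = {vote_acc(chosen):.3f}")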