2020
DOI: 10.1007/s12065-020-00379-8

Pruning of genetic programming trees using permutation tests

Abstract: We present a novel approach based on statistical permutation tests for pruning redundant subtrees from genetic programming (GP) trees that allows us to explore the extent of effective redundancy. We observe that over a range of regression problems, median tree sizes are reduced by around 20% largely independent of test function, and that while some large subtrees are removed, the median pruned subtree comprises just three nodes; most take the form of an exact algebraic simplification. Our statistically-based p…
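As a rough illustration of the general idea of using a permutation test to judge whether a subtree is redundant, the sketch below compares per-sample errors of a tree before and after a candidate prune using a paired sign-flip permutation test. This is not the paper's procedure, which the truncated abstract does not spell out; the function name `paired_permutation_pvalue`, the choice of test statistic, and the toy error values are all assumptions made for illustration.

```python
import random


def paired_permutation_pvalue(errors_a, errors_b, n_permutations=2000, seed=0):
    """Two-sided paired permutation test on the mean difference of errors.

    Hypothetical helper: under the null hypothesis that pruning makes no
    difference, the sign of each per-sample difference is exchangeable,
    so we flip signs at random and count how often the permuted statistic
    is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_permutations):
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if abs(permuted) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Toy data (assumed): the pruned tree reproduces the original errors exactly,
# so the prune would be treated as removing a redundant subtree.
original_errors = [0.10, 0.25, 0.05, 0.40, 0.15]
p_redundant = paired_permutation_pvalue(original_errors, original_errors)

# Toy data (assumed): pruning adds a constant error of 1.0 on 20 samples,
# so the prune would be rejected.
p_harmful = paired_permutation_pvalue([1.0] * 20, [0.0] * 20)
```

A large p-value means there is no evidence that the prune changed the errors, so the subtree could be treated as redundant; a small p-value suggests the subtree contributes to the fit and should be kept.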

Citations: cited by 11 publications (7 citation statements)
References: 27 publications
“…These results are thus at variance with what was seen in [7,25] where standardization (alone) was observed to reduce tree size. The speculation in [25] was that standardization reduces the range of constants that a tree needs to synthesize in order to fit the data; indeed, in a study of pruning of baseline-type GP trees [29], we observed that significant numbers of individuals terminated with subtrees of the form of a constant combined with another constant using a binary operation, presumably serving to generate constants outside the range of those available in the a priori terminal set. Thus it is intuitively reasonable to suggest that standardization may reduce a tree's need to synthesize 'larger' and 'smaller' constants leading to smaller overall tree sizes.…”
Section: Discussion and Future Work (mentioning)
confidence: 98%
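The "constant combined with another constant" subtrees described in this statement can be collapsed by a simple rule-based substitution. The sketch below is a minimal illustration, assuming a hypothetical tree encoding as nested tuples `(op, left, right)` with numeric constants and string variable names as leaves; it is not taken from the cited work. Division is left out deliberately to avoid having to handle division by zero.

```python
import operator

# Assumed operator set for illustration only.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def fold_constants(node):
    """Replace every binary operation on two constants with one constant."""
    if not isinstance(node, tuple):
        return node  # leaf: a constant or a variable name
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)
    return (op, left, right)


def size(node):
    """Count the nodes in a tree."""
    if not isinstance(node, tuple):
        return 1
    return 1 + size(node[1]) + size(node[2])


# (2.0 + 0.5) * x + (4.0 - 1.0): two 'double constant' terminations.
tree = ("add", ("mul", ("add", 2.0, 0.5), "x"), ("sub", 4.0, 1.0))
folded = fold_constants(tree)
# folded is ("add", ("mul", 2.5, "x"), 3.0): 9 nodes reduced to 5.
```

Because folding works bottom-up, chains of constant-only operations collapse in a single pass, and each folded subtree removes exactly two nodes.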
“…To illustrate this, in least-squares fitting of a straight line with a function y = (a + b)x + c , calculating precise values of both a and b is indeterminate, but finding a single, precise value for the sum (a + b) is feasible (debarring any other numerical difficulties). We have seen ample evidence of 'double constant' tree termination in [29] in a study of tree pruning. An obvious way to explore this issue (in future work) is to perform minor simplification of a tree by replacing all 'double constants' with a single constant using a simple rule-based substitution, which may remove numerical indeterminacy, and hopefully reduce slow convergence.…”
Section: Algorithm Run Times (mentioning)
confidence: 99%
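The indeterminacy in this statement can be checked numerically. In the sketch below (plain Python, with toy data and helper names chosen purely for illustration), fitting y = (a + b)x + c with separate parameters a and b gives a design matrix with two identical columns, so the normal-equations matrix is singular; fitting a single slope s = a + b is well posed.

```python
def gram(columns):
    """Return X^T X for a design matrix given as a list of columns."""
    return [[sum(u * v for u, v in zip(ci, cj)) for cj in columns]
            for ci in columns]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = s*x + c."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Toy data (assumed): y = 3x + 1 sampled without noise.
xs = [0, 1, 2, 3]
ys = [3 * x + 1 for x in xs]
ones = [1] * len(xs)

# Columns for a, b and c: the columns for a and b are identical.
singular_det = det3(gram([xs, xs, ones]))  # zero: a and b are not identifiable

# Any split of the slope fits equally well, e.g. (a, b) = (1, 2) or (-5, 8).
pred_1 = [(1 + 2) * x + 1 for x in xs]
pred_2 = [(-5 + 8) * x + 1 for x in xs]

# The combined slope s = a + b and the intercept c are uniquely determined.
slope, intercept = fit_line(xs, ys)
```

The zero determinant shows why a fitting procedure has no unique solution for a and b individually, while the sum is recovered exactly once the two constants are merged into one.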
“…In our study, trees were simple enough to be interpreted without the need of postprocessing, whereas for complex trees the result from dozens, or even hundreds, of operations can be greatly simplified (for example, when the result of a number of operations is always equal to a constant), though often this cannot be understood by visual inspection of the tree. In the case of complex trees, dedicated algorithms can support and automate the postprocessing of trees (Garcia-Almanza & Tsang, 2006; Rockett, 2020).…”
Section: Discussion (mentioning)
confidence: 99%
“…Every generation: [2], [117]-[122]
Final generation: [2], [123], [124]
At certain interval: [119], [125]-[128]
Which individuals:
  All individuals in the population: [2], [119], [122], [125]-[127]
  The best individuals in the population: [2], [121], [123], [124], [128]
  Randomly with some probability: [117]
  The parents for breeding: [118], [120]
How:
  Genotypic (Structural): [2], [117]-[119], [122], [123],…”
Section: When (mentioning)
confidence: 99%