The definition of a concise and effective testbed for Genetic Programming (GP) is a recurrent matter in the research community. This paper takes a new step in this direction, proposing a different approach to measure the quality of symbolic regression benchmarks quantitatively. The proposed approach is based on meta-learning and uses a set of dataset meta-features, such as the number of examples or output skewness, to describe the datasets. Our idea is to correlate these meta-features with the errors obtained by a GP method. These meta-features define a space of benchmarks that should, ideally, have datasets (points) covering different regions of the space. An initial analysis of 63 datasets showed that current benchmarks are concentrated in a small region of this benchmark space. We also found that the number of instances and output skewness are the most relevant meta-features for GP output error. Both conclusions can help define which datasets should compose an effective testbed for symbolic regression methods.
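The core idea of the abstract, computing per-dataset meta-features and correlating them with a GP method's errors, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy datasets, the choice of two meta-features (number of instances, output skewness), and the use of Pearson correlation are assumptions for demonstration only.

```python
import statistics

def skewness(values):
    # Population skewness: third central moment divided by sd cubed.
    n = len(values)
    mean = sum(values) / n
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# Toy "datasets": each entry is (target/output values, GP test error).
# In the paper's setting these would be the 63 benchmark datasets and
# the errors obtained by running a GP method on each.
datasets = [
    ([1.0, 2.0, 3.0, 10.0], 0.9),
    ([1.0, 2.0, 3.0, 4.0], 0.3),
    ([0.0, 0.0, 1.0, 5.0, 20.0], 1.2),
    ([2.0, 2.5, 3.0, 3.5, 4.0, 4.5], 0.2),
]

# Meta-features per dataset.
n_instances = [len(y) for y, _ in datasets]
skews = [skewness(y) for y, _ in datasets]
errors = [e for _, e in datasets]

# Correlate each meta-feature with GP error across datasets.
r_instances = pearson(n_instances, errors)
r_skew = pearson(skews, errors)
print(r_instances, r_skew)
```

Each dataset becomes one point in the meta-feature space; benchmarks that cluster in a small region of that space (as the 63 analyzed datasets do) exercise only a narrow slice of problem characteristics.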