Abstract. This paper investigates the impact of geometric semantic crossover operators in a wide range of symbolic regression problems. First, it analyses the impact of using Manhattan and Euclidean distance geometric semantic crossovers in the learning process. Then, it proposes two strategies to numerically optimize the crossover mask based on mathematical properties of these operators, instead of simply generating them randomly. An experimental analysis comparing geometric semantic crossovers using Euclidean and Manhattan distances and the proposed strategies is performed in a test bed of twenty datasets. The results show that the use of different distance functions in the semantic geometric crossover has little impact on the test error, and that our optimized crossover masks yield slightly better results. For SGP practitioners, we suggest the use of the semantic crossover based on the Euclidean distance, as it achieved similar results to those obtained by more complex operators.
The version in the Kent Academic Repository may differ from the final published version. Users are advised to check http://kar.kent.ac.uk for the status of the paper. Users should always cite the published version of record.
This chapter describes the Sequential Symbolic Regression (SSR) method, a new strategy for function approximation in symbolic regression. The SSR method is inspired by the sequential covering strategy from machine learning, but instead of sequentially reducing the size of the problem being solved, it sequentially transforms the original problem into potentially simpler problems. This transformation is performed according to the semantic distances between the desired and obtained outputs and a geometric semantic operator. The rationale behind SSR is that, after generating a suboptimal function f via symbolic regression, the output errors can be approximated by another function in a subsequent iteration. The method was tested in eight polynomial functions, and compared with canonical genetic programming (GP) and geometric semantic genetic programming (SGP). Results showed that SSR significantly outperforms SGP and presents no statistical difference to GP. More importantly, they show the potential of the proposed strategy: an effective way of applying geometric semantic operators to combine different (partial) solutions, avoiding the exponential growth problem arising from the use of these operators.
Epidemiological early warning systems for dengue fever rely on upto-date epidemiological data to forecast future incidence. However, epidemiological data typically requires time to be available, due to the application of time-consuming laboratorial tests. is implies that epidemiological models need to issue predictions with larger antecedence, making their task even more di cult. On the other hand, online platforms, such as Twi er or Google, allow us to obtain samples of users' interaction in near real-time and can be used as sensors to monitor current incidence. In this work, we propose a framework to exploit online data sources to mitigate the lack of up-to-date epidemiological data by obtaining estimates of current incidence, which are then explored by traditional epidemiological models. We show that the proposed framework obtains more accurate predictions than alternative approaches, with statistically be er results for delays greater or equal to 4 weeks.
Abstract-Dengue fever is a mosquito-borne disease present in all Brazilian territory. Brazilian government, however, lacks an accurate early warning system to quickly predict future dengue outbreaks. Such system would help health authorities to plan their actions and to reduce the impact of the disease in the country. However, most attempts to model dengue fever use parametric models which enforce a specific expected behaviour and fail to capture the inherent complexity of dengue dynamics. Therefore, we propose a new Bayesian non-parametric model based on Gaussian processes to design an accurate and flexible model that outperforms previous/standard techniques and can be incorporated into an early warning system, specially at cities from Southeast and Center-West regions. The model also helps understanding dengue dynamics in Brazil through the analysis of the covariance functions generated.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.