Flor Castillo scite author profile

The chapter summarizes the practical experience of integrating genetic programming and statistical modeling at The Dow Chemical Company. A unique methodology for using genetic programming in statistical modeling of designed and undesigned data is described and illustrated with successful industrial applications. As a result of these synergistic efforts, the model building technique has been improved, and model development cost and time can be significantly reduced. In the case of designed data, genetic programming reduced costs by suggesting transformations as an alternative to additional experimentation. In the case of undesigned data, genetic programming was instrumental in reducing model building costs by providing alternative models for consideration.


Split‐Plot Experimental Designs for Combinatorial and High‐Throughput Experimentation

Castillo, Sweeney, Margl et al. 2005. QSAR Comb. Sci.


In the last few years, high-throughput reactors have received significant attention due to the potential they offer for fast material development. While many experimental design techniques have been proposed, statistical issues related to experimentation in this type of equipment are emerging. One of the experimental design techniques needed is the split-plot approach, given the randomization restrictions imposed by the equipment. This paper presents the use of split-plot experimental designs in a high-throughput reactor. We discuss the unique error structure of these designs and the special statistical analysis that considers two different types of errors. A case study at The Dow Chemical Company is presented. The main advantage of the split-plot approach for high-throughput experimentation is that reactor-well utilization can be maximized while randomization restrictions are addressed correctly at the same time. The results obtained indicate the success of this strategy in maximizing the chance of detecting a lead and drawing the right conclusions, which is of key importance given the speed of data generation of high-throughput reactors.
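The two-level randomization behind a split-plot layout can be illustrated with a minimal sketch. It assumes, purely for illustration, that one hard-to-change factor is fixed for a whole reactor run (the whole plot) and that an easy-to-change factor varies across the wells of that run (the subplots). The function name, the factor levels, and the number of replicates are hypothetical and are not taken from the paper.

```python
import random

def split_plot_layout(whole_plot_levels, subplot_levels, replicates=2, seed=0):
    """Return rows of (run, whole_plot_level, well, subplot_level).

    Hypothetical helper: each run is one whole plot held at a single
    whole-plot level; every subplot level appears once per run.
    """
    rng = random.Random(seed)
    # One entry per whole plot: each whole-plot level is replicated.
    whole_plots = [lvl for lvl in whole_plot_levels for _ in range(replicates)]
    rng.shuffle(whole_plots)  # first randomization: order of the whole plots
    layout = []
    for run, wp_level in enumerate(whole_plots, start=1):
        wells = list(subplot_levels)
        rng.shuffle(wells)  # second randomization: wells within this run
        for well, sp_level in enumerate(wells, start=1):
            layout.append((run, wp_level, well, sp_level))
    return layout

# Illustrative levels only (assumed, not from the case study).
layout = split_plot_layout(["low", "high"], ["A", "B", "C", "D"])
for row in layout:
    print(row)
```

Because the whole-plot level is randomized once per run and the subplot level once per well, the two factors are exposed to different sources of variation, which is why the analysis described above has to consider two types of error.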


Pareto front genetic programming parameter selection based on design of experiments and industrial data

Castillo, Kordon, Smits et al. 2006


Symbolic regression based on Pareto Front GP is the key approach for generating high-performance parsimonious empirical models acceptable for industrial applications. The paper addresses the issue of finding the optimal parameter settings of Pareto Front GP, which direct the simulated evolution toward simple models with acceptable prediction error. A generic methodology based on statistical design of experiments is proposed. It includes statistical determination of the number of replicates by half-width confidence intervals, determination of the significant inputs by fractional factorial design of experiments, approaching the optimum by steepest ascent/descent, and local exploration around the optimum by Box-Behnken or central composite design of experiments. The results from applying the proposed methodology to a small-sized industrial data set show that the statistically significant factors for symbolic regression based on Pareto Front GP are the number of cascades, the number of generations, and the population size. A second-order regression model with a high R² of 0.97 includes the three parameters, and their optimal values have been defined. The optimal parameter settings were validated with a separate small-sized industrial data set. The optimal settings are recommended for symbolic regression applications using data sets with up to 5 inputs and up to 50 data points.
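The local exploration step mentioned above can be sketched for the three significant factors. The following is a minimal illustration of a three-factor Box-Behnken design in coded units (-1, 0, +1), not the authors' implementation. The function name and the number of center runs are assumptions, and the mapping from coded units to actual values of cascades, generations, and population size is left out because the abstract does not give the ranges.

```python
from itertools import combinations, product

def box_behnken(n_factors=3, n_center=3):
    """Hypothetical helper: Box-Behnken runs in coded units.

    For every pair of factors, the pair takes all four (+/-1, +/-1)
    combinations while the remaining factors stay at the center (0).
    This pairwise construction is intended for the three-factor case.
    """
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0] * n_factors
            point[i], point[j] = a, b
            runs.append(tuple(point))
    # Replicated center runs (count assumed) give an estimate of pure error.
    runs.extend([(0,) * n_factors] * n_center)
    return runs

# Column order assumed: cascades, generations, population size.
design = box_behnken()
for run in design:
    print(run)
```

With three factors this gives 12 edge-midpoint runs plus the center runs, which is enough to fit a second-order regression model of the kind the abstract reports.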



Flor Castillo

Application Issues of Genetic Programming in Industry

Using Genetic Programming in Industrial Statistical Model Building

Split‐Plot Experimental Designs for Combinatorial and High‐Throughput Experimentation

Pareto front genetic programming parameter selection based on design of experiments and industrial data
