In recent years, the techniques of the exact sciences have been applied to the analysis of increasingly complex and nonlinear systems. The associated uncertainties and the large amounts of data available have progressively exposed the limits of traditional hypothesis-driven methods based on first-principles theories. A new approach to data-driven theory formulation has therefore been developed. It is based on the manipulation of symbols with genetic computing and is meant to complement traditional procedures by exploring large datasets to find the most suitable mathematical models to interpret them. The paper reports on extensive numerical tests that have shown the potential of the new techniques to provide useful insights in various studies, ranging from the formulation of scaling laws to the original identification of the most appropriate dimensionless variables for investigating a given system. The application to some of the most complex experiments in physics, in particular thermonuclear plasmas, has proved the capability of the methodology to address real problems, including highly nonlinear and practically important ones such as catastrophic instabilities. The proposed tools are therefore being increasingly used in various fields of science and constitute a valuable set of techniques for bridging the gap between experiments, traditional data analysis and theory formulation.