<p>Symbolic regression (SR) is a function identification process, the task of which is to identify and express the relationship between the input and output variables in a mathematical model. SR is so named to emphasise its ability to find the structure and the coefficients of the model simultaneously. Genetic programming (GP) is an attractive and powerful technique for SR, since it requires no predefined model and has a flexible representation. However, GP-based SR generally has poor generalisation ability, which degrades its reliability and hampers its application to science and real-world modelling. This thesis therefore aims to develop new GP approaches to SR that evolve/learn models exhibiting good generalisation ability.</p>

<p>First, this thesis develops a novel feature selection method in GP for high-dimensional SR. Feature selection can potentially contribute not only to improving the efficiency of learning algorithms but also to enhancing their generalisation ability; however, it is seldom considered in GP for high-dimensional SR. The proposed method utilises GP’s built-in feature selection ability and relies on permutation to detect the truly relevant features and discard irrelevant/noisy ones. The results confirm the superiority of the proposed method over the other examined feature selection methods, including random forests and decision trees, at identifying the truly relevant features. Further analysis indicates that the models evolved by GP with the proposed feature selection method are more likely to contain only the truly relevant features and to have better interpretability.</p>

<p>Second, to address the overfitting issue of GP when learning from a relatively small number of instances, this thesis proposes a new GP approach that incorporates structural risk minimisation (SRM), a framework for estimating the generalisation performance of models, into GP. The effectiveness of SRM depends heavily on the accuracy of the Vapnik-Chervonenkis (VC) dimension, which measures model complexity. This thesis significantly extends an experimental method (instead of theoretical estimation) to measure, for the first time, the VC-dimension of the mixture of linear and nonlinear regression models that arises in GP. The experimental method has been conducted under both uniform and non-uniform settings and provides reliable VC-dimension values. The results show that the proposed methods achieve a markedly better generalisation gain and evolve more compact models, which have a much smaller behavioural difference from the target models than those of standard GP and GP with bootstrap; the method using the optimised non-uniform setting further improves on the one using the uniform setting.</p>

<p>Third, this thesis employs geometric semantic GP (GSGP) to tackle the unsatisfactory generalisation performance of GP for SR when no overfitting occurs. It proposes three new angle-awareness-driven geometric semantic operators (GSOs), for selection, crossover and mutation, to further explore the geometry of the semantic space and gain a greater generalisation improvement in GP for SR. The angle-awareness brings new geometric properties to these operators, which are expected to provide greater leverage for approximating the target semantics in each operation and, more importantly, to resist overfitting.</p>
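<p>As a rough illustration of the angle-awareness idea, the central quantity is the angle between the two parents’ relative semantics, i.e. the vectors pointing from each parent’s semantics (its outputs on the training cases) towards the target semantics. The following is a minimal sketch under that reading; the function name and details are illustrative, not the thesis’s implementation:</p>

<pre><code>import numpy as np

def relative_semantics_angle(sem_parent1, sem_parent2, target):
    # Relative semantics: vectors from each parent's outputs on the
    # training cases towards the desired (target) outputs.
    r1 = target - sem_parent1
    r2 = target - sem_parent2
    # Cosine of the angle between the two relative-semantics vectors;
    # the small constant guards against division by zero.
    cos = np.dot(r1, r2) / (np.linalg.norm(r1) * np.linalg.norm(r2) + 1e-12)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
</code></pre>

<p>Intuitively, parents whose relative semantics form a large angle approach the target from different directions, so a geometric (convex) combination of them can land closer to the target; preferring such pairs is one way angle-aware selection and crossover can exploit the geometry of the semantic space.</p>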
<p>The results show that, compared with two kinds of state-of-the-art GSOs, the proposed new GSOs not only drive the evolutionary process to fit the target semantics more efficiently but also significantly improve the generalisation performance. A further comparison of the evolved models shows that the new method generally produces simpler models that are much smaller in size and contain important building blocks of the target models.</p>
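<p>For concreteness, the permutation-based relevance test from the first contribution can be sketched as below. This is a minimal illustration assuming the evolved GP model is available as a callable and the features are columns of a matrix; the thesis’s actual procedure, e.g. which evolved models are examined and how significance is judged, may differ in detail:</p>

<pre><code>import numpy as np

def permutation_relevance(model, X, y, n_repeats=30, seed=0):
    # Baseline error of the evolved model on the unmodified data.
    rng = np.random.default_rng(seed)
    base_mse = np.mean((model(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            # Permuting column j destroys any real link between
            # feature j and the output while keeping its distribution.
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores[j] += np.mean((model(Xp) - y) ** 2) - base_mse
    # Mean increase in error per feature: a near-zero score suggests the
    # feature is irrelevant/noisy; a large score suggests it is relevant.
    return scores / n_repeats
</code></pre>

<p>In the GP setting described above, the candidate features would come from GP’s built-in feature selection, i.e. the features that survive in good evolved models, with permutation then used to separate the truly relevant ones from those retained by chance.</p>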