# An Exploration of Generalization and Overfitting in Genetic Programming: Standard and Geometric Semantic Approaches

### Authors

### Abstract

Computational learning refers to the task of inducing a general pattern from a provided set of examples. A learning method is expected to generalize to unseen examples of the same pattern. A common issue in computational learning is the possibility that the resulting models could be simply learning the provided set of examples, instead of learning the underlying pattern. A model that is incurring in such a behavior is commonly said to be overfitting. This dissertation explores the task of computational learning and the related concepts of generalization and overfitting, in the context of Genetic Programming (GP). GP is a computational method inspired by natural evolution that considers a set of primitive functions and terminals that can be combined without any considerable constraints on the structure of the models being evolved. This flexibility can help in learning complex patterns but it also increases the risk of overfitting.The contributions of this dissertation cover the most common form of GP (Standard GP), as well as the recently proposed Geometric Semantic GP (GSGP). The initial set of approaches relies on dynamically selecting different training data subsets during the evolutionary process. These approaches can avoid overfitting and improve the resulting generalization without restricting the flexibility of GP. Besides improving the generalization, these approaches also produce considerably smaller individuals. An analysis of the generalization ability of GSGP is performed, which shows that the generalization outcome is greatly dependent on particular characteristics of the mutation operator. It is shown that, as Standard GP, the original formulation of GSGP is prone to overfitting. The necessary conditions to avoid overfitting are presented. When such conditions are in place, GSGP can achieve a particularly competitive generalization. A novel geometric semantic mutation that substantially improves the effectiveness and efficiency of GSGP is proposed. Besides considerably improving the training data learning rate, it also achieves a competitive generalization with only a few applications of the mutation operator.

The final set of contributions covers the domain of Neural Networks (NNs). These contributions originated as an extension of the research conducted within GSGP. This set of contributions includes the definition of a NN construction algorithm based on an extension of the mutation operator defined in GSGP. Similarly to GSGP, the proposed algorithm searches over a space without local optima. This allows for an effective and efficient stochastic search in the space of NNs, without the need to use backpropagation to adjust the weights of the network. Finally, two search stopping criteria are proposed, which can be directly used in the proposed NN construction algorithm and in GSGP. These stopping criteria are able to detect when the risk of overfitting increases significantly. It is shown that the stopping points detected result in a competitive generalization.