“…Now, however, the dilemma of overfitting arises: small networks generalize well, but are slow at learning, whereas big networks learn fast (needing fewer training epochs to obtain similar precision), but generalize badly [5,4]. The way to obtain good generalization ability is to use the smallest network that can learn the input data efficiently [93,94,103]. For this reason, this operator is combined with the following one.…”
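The rule quoted above, "use the smallest network that can learn the input data", can be sketched as a simple selection over candidate hidden-layer sizes. This is only an illustration under stated assumptions, not the operator the excerpt refers to: the function name `smallest_adequate_size`, the `max_error` threshold, and the error values are all hypothetical and stand in for errors that would be measured after training each candidate.

```python
from typing import Mapping


def smallest_adequate_size(errors: Mapping[int, float], max_error: float) -> int:
    """Return the smallest hidden-layer size whose error is at most max_error.

    `errors` maps a hidden-layer size to the error measured for a network of
    that size (hypothetical values supplied by the caller).
    """
    # Keep only the sizes that learn the data well enough.
    adequate = [size for size, err in errors.items() if err <= max_error]
    if not adequate:
        raise ValueError("no candidate network reaches the required error")
    # Among adequate networks, prefer the smallest one.
    return min(adequate)


# Hypothetical errors, for illustration only (not results from the source).
example_errors = {2: 0.31, 4: 0.12, 8: 0.05, 16: 0.04, 32: 0.04}
chosen = smallest_adequate_size(example_errors, max_error=0.06)
```

With these made-up numbers the rule picks the 8-unit network: the 16- and 32-unit networks reach a slightly lower error, but the smaller network already meets the threshold and is therefore preferred.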