The ability of the human visual system to recognize and retain a clear understanding of patterns irrespective of their orientation is quite remarkable. In contrast, pattern invariance remains a common and persistent problem in intelligent recognition systems, and its importance cannot be overemphasized; indeed, one's definition of an intelligent system broadens considerably given the large variability with which the same patterns can occur. This research investigates and reviews the performance of convolutional networks, and their variant, convolutional autoencoder networks, when tasked with recognition problems involving invariances such as translation, rotation, and scale. While various patterns could be used to examine this question, handwritten Yoruba vowel characters have been used in this research. Databases of images containing patterns with the constraints of interest are collected, processed, and used to train and test the designed networks. We provide an extensive review of the architectures and learning paradigms of the considered networks, with emphasis on how built-in invariance is learned. Lastly, we provide a comparative analysis of the achieved error rates against backpropagation neural networks, denoising autoencoders, stacked denoising autoencoders, and deep belief networks.