Domain-wall-synapse-based crossbar arrays have been shown to be highly efficient, in terms of speed and energy consumption, in implementing fully connected neural network (FCNN) algorithms for simple data-classification tasks, both in inference and on-chip-learning modes. However, for more complex and realistic data-classification tasks, convolutional neural networks (CNNs) need to be trained through such crossbar arrays. In this paper, we carry out device-circuit-system co-design and co-simulation of on-chip learning of a CNN using a domain-wall-synapse-based crossbar array. For this purpose, we use a combination of micromagnetic-physics-based synapse-device modeling, SPICE simulation of a crossbar-array circuit built from such synapse devices, and system-level coding in a high-level language. In our design, each synaptic weight of the convolutional kernel is represented by 15 bits; one domain-wall-synapse crossbar array is dedicated to the 5 least significant bits (LSBs), and two crossbar arrays are dedicated to the remaining 10 bits. The crossbar arrays accelerate the matrix-vector multiplication (MVM) operation involved in the forward computation of the CNN. To achieve on-chip learning, the synaptic weights of the LSB crossbar are updated after the forward computation on every training sample, while the weights of the other crossbars are updated only after forward computation on every 10 samples. We report high classification accuracies for different machine-learning data sets using our method. We also study how the classification accuracy of our designed CNN is affected by device-to-device variations, cycle-to-cycle variations, the bit precision of the synaptic weights, and the frequency of weight updates.
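As an illustration of the weight-splitting and update-scheduling scheme described above, the following Python/NumPy sketch emulates, at the system level, a 15-bit weight decomposed across three crossbars and refreshed at two different cadences. This is a minimal sketch, not the paper's implementation: the equal 5+5+5 bit split across the three crossbars, the unsigned weight encoding, the array shapes, and the placeholder gradient are assumptions made for illustration only.

```python
import numpy as np

SEG_BITS = 5            # bits per crossbar segment (5 LSBs per the paper)
NUM_SEGS = 3            # three crossbars -> 15-bit weights (assumed 5+5+5 split)
MSB_UPDATE_PERIOD = 10  # non-LSB crossbars refreshed every 10 samples

def split_weights(w_int):
    """Split 15-bit integer weights into NUM_SEGS 5-bit segments, LSB first."""
    mask = (1 << SEG_BITS) - 1
    return [(w_int >> (SEG_BITS * k)) & mask for k in range(NUM_SEGS)]

def crossbar_mvm(segs, x):
    """Emulate the MVM: each crossbar computes a partial product on its
    bit slice, and the partial sums are recombined with binary place weights."""
    return sum((seg @ x) * (1 << (SEG_BITS * k)) for k, seg in enumerate(segs))

# Toy training loop: the LSB crossbar is written back after every sample;
# the two higher-order crossbars are refreshed only every 10th sample.
rng = np.random.default_rng(0)
w_max = (1 << (SEG_BITS * NUM_SEGS)) - 1
w_int = rng.integers(0, w_max + 1, size=(4, 8))   # hypothetical kernel shape
segs = split_weights(w_int)

for step, x in enumerate(rng.standard_normal((100, 8))):
    y = crossbar_mvm(segs, x)                     # forward pass through crossbars
    # ... backward pass would produce an integer update dw here (omitted) ...
    dw = rng.integers(-1, 2, size=w_int.shape)    # placeholder gradient
    w_int = np.clip(w_int + dw, 0, w_max)
    segs[0] = split_weights(w_int)[0]             # LSB crossbar: every sample
    if (step + 1) % MSB_UPDATE_PERIOD == 0:
        segs = split_weights(w_int)               # all crossbars: every 10 samples
```

The design intuition this sketch captures is that frequent, small weight changes mostly perturb the low-order bits, so writing only the LSB crossbar each sample keeps the costly programming of the higher-order crossbars infrequent.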