This work presents a novel approach to monitoring the aflatoxin B1 (AFB1) content of maize using deep learning models based on near-infrared (NIR) spectra, integrating Markov transition field (MTF) image coding with a convolutional neural network (CNN). Based on the data structure of NIR spectra, new one-dimensional CNN (1D-CNN) and two-dimensional MTF-CNN (2D-MTF-CNN) architectures were designed to build deep learning models for monitoring AFB1 in maize. Compared with the 1D-CNN model, the 2D-MTF-CNN model performed significantly better: its root mean square error of prediction, coefficient of determination of prediction, and relative percent deviation were 1.3591 μg·kg⁻¹, 0.9955, and 14.9386, respectively. The results indicate that the MTF is an effective data encoding technique for converting one-dimensional spectra into two-dimensional images. The encoding reflects the intrinsic characteristics of NIR spectra more intuitively and provides richer spectral information for building deep learning models, which helps to ensure the detection accuracy and generalization performance of quantitative detection models. This study offers a new perspective for the chemometric analysis of NIR spectra.
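To make the MTF encoding concrete, the following is a minimal sketch of the standard Markov transition field construction applied to a one-dimensional spectrum. It is an illustration under stated assumptions, not the authors' implementation: the function name `markov_transition_field`, the quantile binning, the choice of `n_bins=8`, and the synthetic spectrum are all hypothetical, and the abstract does not state the binning scheme or image size actually used.

```python
import numpy as np


def markov_transition_field(x, n_bins=8):
    """Encode a 1D signal as a 2D Markov transition field (illustrative sketch).

    n_bins is an assumed hyperparameter; the source does not specify it.
    """
    x = np.asarray(x, dtype=float)

    # Quantile binning: interior edges split the values into n_bins groups.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    q = np.digitize(x, edges)  # bin index of each point, in 0..n_bins-1

    # First-order transition counts between the bins of consecutive points.
    W = np.zeros((n_bins, n_bins))
    np.add.at(W, (q[:-1], q[1:]), 1.0)

    # Row-normalize counts into transition probabilities (empty rows stay 0).
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

    # MTF[i, j] is the transition probability from the bin of point i
    # to the bin of point j, for every pair of positions in the signal.
    return W[q[:, None], q[None, :]]


# Synthetic stand-in for an NIR spectrum (not real measurement data).
wavelength_index = np.linspace(0.0, 1.0, 64)
spectrum = np.exp(-((wavelength_index - 0.4) ** 2) / 0.01) + 0.3 * wavelength_index

mtf = markov_transition_field(spectrum, n_bins=8)
print(mtf.shape)
```

The resulting square array has one row and one column per spectral point and can be treated as a single-channel image for a 2D CNN. In practice the image is often downsampled (for example by block averaging) before being fed to the network; whether and how the authors did this is not stated in this section.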