Many edge/mobile devices are now able to utilize deep neural networks (DNNs) thanks to the development of mobile DNN accelerators. Mobile DNN accelerators overcame the problems of limited computing resources and battery capacity by realizing energy-efficient inference. However, its passive behavior makes it difficult for DNN to provide active customization for individual users or its service environment. The importance of onchip training is rising more and more to provide active interaction between DNN processors and ever-changing surroundings or conditions. Despite its advantages, the DNN training has more constraints than the inference such that it was considered impractical to be realized on mobile/edge devices. Recently, there are many trials to realize mobile DNN training, and a number of prior works will be summarized. Firstly, it arranges the new challenges of the DNN accelerator induced by training functionality and discusses new hardware features related to the challenges. Secondly, it explains algorithm-hardware cooptimization methods and explains why it becomes mainstream in mobile DNN training research. Thirdly, it compares the main differences between the conventional inference accelerators and recent training processors. Finally, the conclusion is made by proposing the future directions of the DNN training processor in micro-AI systems.