Training deep neural networks (DNNs) mostly relies on GPUs using the FP32 format. Although FP16 is recognized for its high computational and memory efficiency in DNN training, it must be accompanied by techniques tailored to a particular dataset. A hardware engine with a configurable bit-width is therefore desirable to cover arbitrary datasets and applications. This work proposes an adaptive bit-width and voltage scaling (ABVS) scheme for DNN training. The key idea is to increase the fraction bit-width (FB) gradually from a small value according to the current training quality (e.g., accuracy, mAP). Since a smaller FB yields shorter hardware latency, the scheme adapts bit-width and voltage scaling concurrently, which intensifies the energy reduction. Experimental results show that the ABVS scheme achieves quality comparable to FP32, with at most a 0.5% accuracy drop and up to 63% energy reduction.

Index Terms: deep learning training, floating-point, configurable bit-width, bit-width scaling, voltage scaling
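
The key idea of growing the fraction bit-width (FB) with training quality can be illustrated with a minimal software sketch. Two things here are assumptions and are not taken from this work: reduced FB is emulated by truncating the 23-bit fraction field of an FP32 value, and the hypothetical policy `next_fraction_bits` (with its `step`, `patience`, and `min_gain` parameters) widens FB once the quality metric stops improving. The actual ABVS control rule and its coupling to voltage scaling are hardware-specific and are not modeled.

```python
import struct

FP32_FRACTION_BITS = 23  # IEEE 754 single precision has a 23-bit fraction field


def truncate_fraction(x: float, fb: int) -> float:
    """Round x to FP32, then keep only the top `fb` fraction bits (truncation).

    Assumption: this only emulates a narrower fraction in software.
    NaN and infinity are not treated specially.
    """
    if not 0 <= fb <= FP32_FRACTION_BITS:
        raise ValueError("fb must be in [0, 23]")
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    drop = FP32_FRACTION_BITS - fb
    mask = (0xFFFFFFFF >> drop) << drop  # clears the `drop` lowest fraction bits
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def next_fraction_bits(fb, quality_history, step=2, patience=3, min_gain=0.001):
    """Hypothetical policy: widen FB by `step` when quality has plateaued.

    `quality_history` holds one quality value (e.g., accuracy) per evaluation.
    A plateau means the best of the last `patience` values beats the best
    earlier value by less than `min_gain`.
    """
    if len(quality_history) <= patience:
        return fb  # not enough history to judge progress
    recent_best = max(quality_history[-patience:])
    earlier_best = max(quality_history[:-patience])
    if recent_best - earlier_best < min_gain:
        return min(fb + step, FP32_FRACTION_BITS)
    return fb


print(truncate_fraction(1.75, 1))                    # 1.5
print(next_fraction_bits(4, [0.8, 0.8, 0.8, 0.8]))   # 6
```

In this sketch FB never decreases and saturates at the FP32 fraction width, which mirrors the gradual increase from a small value described above; any other schedule would need a different policy function.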