Federated learning is a distributed training framework that enables multiple Internet of Things clients to collaboratively train a global model while their data remains local. However, implementing federated learning faces many practical problems, such as the large number of training rounds needed for convergence, owing to the size of the model, and the lack of adaptivity in the stochastic gradient-based update on the client side. Moreover, the optimization process is sensitive to noise, which can degrade the performance of the final model. To address these problems, we propose Federated Adaptive learning based on Derivative Term (FedADT), which incorporates an adaptive step size and the difference of gradients into the local model update. To further reduce the influence of noise on the derivative term, which is estimated by the difference of gradients, we apply moving average decay to the derivative term. We also analyze the convergence of the proposed algorithm for non-convex objective functions: a convergence rate of 1/nT can be achieved by choosing appropriate hyper-parameters, where n is the number of clients and T is the number of iterations. Finally, image classification experiments are conducted by training widely used convolutional neural networks on the MNIST and Fashion MNIST datasets to verify the effectiveness of FedADT. In addition, the receiver operating characteristic curve is used to present the results of the proposed algorithm in predicting clothing categories on the Fashion MNIST dataset.
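
To make the three ingredients named above concrete (an adaptive step size, a derivative term estimated by the difference of gradients, and moving average decay on that term), the following is a minimal Python sketch of one client's local update followed by server-side averaging. It is not the exact FedADT update rule, which this abstract does not state. The AdaGrad-style step size, the way the terms are combined, and all names (`local_update`, `server_average`, `quadratic_grad`, `beta`, `kd`) are assumptions made for illustration only.

```python
import math


def local_update(x0, grad_fn, steps=5, lr=0.1, beta=0.9, kd=0.1, eps=1e-8):
    """Illustrative local update for one client (NOT the paper's exact rule).

    x0      : list of floats, the model received from the server
    grad_fn : callable returning the (stochastic) gradient at x as a list
    beta    : moving-average decay applied to the derivative term (assumed name)
    kd      : weight of the derivative term (assumed name)
    """
    x = list(x0)              # copy so the server's model is not mutated
    d = [0.0] * len(x)        # smoothed derivative term
    v = [0.0] * len(x)        # accumulated squared gradients (AdaGrad-style, an assumption)
    prev_g = None
    for _ in range(steps):
        g = grad_fn(x)
        if prev_g is None:
            prev_g = g        # first step: the gradient difference is zero
        for i in range(len(x)):
            diff = g[i] - prev_g[i]                      # difference of gradients
            d[i] = beta * d[i] + (1.0 - beta) * diff     # moving average decay
            v[i] += g[i] * g[i]
            step = lr / (math.sqrt(v[i]) + eps)          # adaptive step size
            x[i] -= step * (g[i] + kd * d[i])
        prev_g = g
    return x


def server_average(client_models):
    """Average the clients' local models coordinate by coordinate."""
    n = len(client_models)
    dim = len(client_models[0])
    return [sum(m[i] for m in client_models) / n for i in range(dim)]


def quadratic_grad(target):
    """Gradient of the toy loss 0.5 * ||x - target||^2 (a stand-in for a client's data)."""
    return lambda x: [xi - ti for xi, ti in zip(x, target)]
```

In one communication round, each client would call `local_update` on the current global model with its own `grad_fn`, and the server would pass the returned models to `server_average`. The toy quadratic losses stand in for the clients' local objectives; a real experiment would use mini-batch gradients of a convolutional neural network instead.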