Accurate automatic quantitative cephalometry are essential for orthodontics. However, manual labeling of cephalometric landmarks is tedious and subjective, which also must be performed by professional doctors. In recent years, deep learning has gained attention for its success in computer vision field. It has achieved large progress in resolving problems like image classification or image segmentation. In this paper, we propose a two-step method which can automatically detect cephalometric landmarks on skeletal X-ray images. First, we roughly extract a region of interest (ROI) patch for each landmark by registering the testing image to training images, which have annotated landmarks. Then, we utilize pre-trained networks with a backbone of ResNet50, which is a state-of-the-art convolutional neural network, to detect each landmark in each ROI patch. The network directly outputs the coordinates of the landmarks. We evaluate our method on two datasets: ISBI 2015 Grand Challenge in Dental X-ray Image Analysis and our own dataset provided by Shandong University. The experiments demonstrate that the proposed method can achieve satisfying results on both SDR (Successful Detection Rate) and SCR (Successful Classification Rate). However, the computational time issue remains to be improved in the future. a global-context shape model. In 2017, Ibragimov et al. added a convolutional neural network for binary classification to their conventional method. They surpass the result of Lindner's a little bit, with around 75.3% prediction accuracy within a 2-mm range [9]. In 2017, Hansang Lee et al. proposed a deep learning based method which achieved not bad results but in a small resized image. He trained two networks to regress the landmark's x and y coordinate directly [10]. In 2019, Jiahong Qian et al. proposed a new architecture called CephaNet which improves the architecture of Faster R-CNN [11,12].Its accuracy is nearly 6% higher than other conventional methods.Despite the variety of techniques available, automatic cephalometric landmark detection remains insufficient due to its limited accuracy. In recent years, deep learning has gained attention for its success in computer vision field. For example, convolutional neural network models are widely used in problems like landmark detection [13,14], image classification [15][16][17] and image segmentation [18,19]. Trained models' performances surpass that of human beings in many applications. Since direct regression of several landmarks is a highly non-linear mapping, which is difficult to learn [20][21][22][23]. In our method, we only try to detect one key point in one patch image. We learn a non-linear mapping function for only one key point. Each key point has its corresponding non-linear mapping function. So we can achieve more accurate detection of key point than other methods.In this paper, we propose a two-step method for the automatic detection of cephalometric landmarks. First, we get the coarse landmark location by registering the test image to a most similar image in...