The transformer has dominated the natural language processing (NLP) field for a long time. Recently, the transformer-based method has been adopted into the computer vision (CV) field and shows promising results. As an important branch of the CV field, medical image analysis joins the wave of the transformer-based method rightfully. In this review, we illustrate the principle of the attention mechanism, and the detailed structures of the transformer, and depict how the transformer is adopted into medical image analysis. We organize the transformer-based medical image analysis applications in a sequence of different tasks, including classification, segmentation, synthesis, registration, localization, detection, captioning, and denoising. For the mainstream classification and segmentation tasks, we further divided the corresponding works based on different medical imaging modalities. The datasets corresponding to the related works are also organized. We include thirteen modalities and more than twenty objects in our work.