Handwritten signature verification is an indispensable means of identity authentication in biometric recognition and has broad application prospects and research significance in financial, judicial, and educational systems. As signature forgery techniques advance, higher accuracy and efficiency are demanded of signature verification, and sophisticated convolutional networks capable of automatic feature extraction are gradually being applied to handwriting recognition. However, these convolutional methods still leave room for improvement in recognition ability, generalization, and accuracy. This paper proposes a novel network model, the Multi-Size Assembled-Attention Swin-Transformer network, for signature handwriting authenticity identification. The inputs to the network are signature images resized to multiple resolutions: (224, 224), (112, 112), and (56, 56). Features within the same image are extracted by the self-attention mechanism of the Swin-Transformer, while features across different images are extracted by the cross-attention mechanism of the Assembled-Attention Block, enabling signature feature information to interact both within a single image and between different images. In addition, a Regularized Dropout strategy and an adversarial training method are applied during training. As a result, our method considerably improves the identification ability for signature handwriting and achieves state-of-the-art performance, with improvements of 57.1% and 50.4% when training on CEDAR and evaluating on Bengali and Hindi, respectively. We also evaluated how the number of passes of the input images through the model affects performance and found that the network performs best with four passes.
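The cross-image interaction described above can be sketched as single-head cross-attention, where queries come from one image's patch tokens and keys/values from another image's tokens. This is a minimal NumPy illustration under our own assumptions; the token counts, embedding dimensions, and function names are illustrative and do not reflect the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, w_q, w_k, w_v):
    """Single-head cross-attention: queries from one image's tokens,
    keys and values from a different image's tokens, so feature
    information can flow between the two images."""
    q = queries @ w_q           # (n_q, d_head)
    k = keys_values @ w_k       # (n_kv, d_head)
    v = keys_values @ w_v       # (n_kv, d_head)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product
    return softmax(scores, axis=-1) @ v       # (n_q, d_head)

# Patch tokens from two resized views of a signature (e.g. a coarse and a
# fine resolution, already patch-embedded); dimensions are illustrative.
d_model, d_head = 32, 16
tokens_coarse = rng.standard_normal((49, d_model))
tokens_fine = rng.standard_normal((196, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_head)) for _ in range(3))

fused = cross_attention(tokens_coarse, tokens_fine, w_q, w_k, w_v)
print(fused.shape)  # (49, 16)
```

Each coarse-view token attends over all fine-view tokens, producing a fused representation of the same length as the query sequence; in the actual network this would sit alongside the Swin-Transformer's windowed self-attention.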