Hand motion recognition from single-modality data has been extensively explored using various contact and noncontact sensors, and each existing technology is recognized to have both strengths and limitations. As a significant motor symptom, hand tremor is commonly used for the diagnosis and evaluation of Parkinson's disease; moreover, multimodal analysis of a patient's handwriting patterns compensates for the one-sidedness of learning hand movement from a single measurement dimension. In particular, combining a variety of measurement sources shows promising performance in recognizing the handwriting patterns of Parkinson's disease. In this work, a novel spatio-temporal Siamese neural network (ST-SiamNN) is proposed to learn the handwriting differences between healthy individuals and patients with Parkinson's disease, process data from multiple sensors, and enhance the characteristics of handwriting in Parkinson's disease. It is a multilabel, multinetwork discriminative model built on a Siamese architecture and consists of four modules: a preprocessor for handwritten data enhancement, a Siamese bidirectional memory neural network (SiamBiMNN) for temporal and texture feature extraction and difference enhancement, a Siamese octave convolutional neural network (SiamOctCNN) for spatial feature extraction and difference enhancement, and a decision-making layer that re-evaluates the output features of the Siamese networks to obtain more accurate auxiliary diagnosis results. The proposed framework is verified on two handwriting datasets covering multiple modalities, i.e., images, smart pen signals, and graphics tablet signals, and is compared with several state-of-the-art studies.
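To make the Siamese principle concrete, the following minimal sketch illustrates how one shared encoder can be applied to two handwriting feature vectors so that their distance reflects the difference between them. It is an illustration under stated assumptions and not the ST-SiamNN itself: the toy tanh-linear encoder merely stands in for a branch such as SiamBiMNN or SiamOctCNN, the contrastive loss is assumed here because the text above does not specify the training objective, and the names `make_encoder`, `siamese_distance`, and `contrastive_loss` are hypothetical.

```python
import math
import random


def make_encoder(in_dim, out_dim, seed=0):
    """Build a toy encoder (hypothetical stand-in for one Siamese branch)."""
    rng = random.Random(seed)
    # A single weight matrix; both inputs of a pair pass through these same weights.
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def encode(x):
        # Linear projection followed by tanh, one output per weight row.
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return encode


def euclidean(a, b):
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def siamese_distance(encode, x1, x2):
    """Embed both samples with the SAME encoder and compare the embeddings."""
    return euclidean(encode(x1), encode(x2))


def contrastive_loss(distance, same_class, margin=1.0):
    """Contrastive loss (an assumed objective, not taken from the text).

    Pairs from the same class are pulled together; pairs from different
    classes are pushed apart until they are at least `margin` away.
    """
    if same_class:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2


if __name__ == "__main__":
    encode = make_encoder(in_dim=4, out_dim=3)
    # Made-up feature vectors, used only to exercise the functions.
    sample_a = [0.2, -0.1, 0.4, 0.0]
    sample_b = [0.9, 0.3, -0.5, 0.7]
    d = siamese_distance(encode, sample_a, sample_b)
    print("distance:", d)
    print("loss if labels differ:", contrastive_loss(d, same_class=False))
```

Because both samples pass through identical weights, identical inputs always yield zero distance, and any nonzero distance is attributable to the inputs rather than to differences between the branches. This weight sharing is the property that lets a Siamese design focus on differences between samples.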