The objectives of this study are (1) to evaluate tone production in Mandarin-speaking patients with post-stroke dysarthria (PSD) using an artificial neural network (ANN), (2) to compare the recognition performance of the ANN model with that of human listeners and of a convolutional neural network (CNN) model, and (3) to explore the rehabilitation application of artificial-intelligence recognition for lexical tone production disorder in PSD. The subjects comprised two groups of native Mandarin-speaking adults: 31 patients with PSD and, as controls, 42 normal-speaking adults (NA) in a similar age range. Each subject was recorded producing a list of 7 Mandarin monosyllables in each of the 4 tones (i.e., a total of 28 tokens). The fundamental frequency (F0) of each monosyllable was extracted using an autocorrelation algorithm. The ANN was trained on F0 data from the NA tone tokens to generate the final model. For tone production in the NA group, the recognition rates of the human listeners, the ANN model, and the CNN model were 87.78% ± 8.96% (mean ± SD), 89.11% ± 11.80%, and 65.91% ± 8.79%, respectively; for the PSD group, they were 70.28% ± 17.61%, 63.35% ± 17.40%, and 34.71% ± 6.92%, respectively. For the PSD group, the performance of the ANN model correlated significantly with that of the human listeners (r = 0.826, P < 0.001), whereas the performance of the CNN model did not (r = −0.108, P = 0.562). These results suggest that the ANN is an objective and efficient tool that could replace human listeners in the assessment of lexical tone production disorder in Mandarin-speaking patients with PSD. Furthermore, using the ANN may reduce the heterogeneity of rehabilitation evaluation among different speech therapists and may provide more accurate feedback on the outcome of rehabilitation treatment.

INDEX TERMS Lexical tone rehabilitation, post-stroke dysarthria (PSD), ANN, human listeners, application analysis.
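As an illustration of the F0 extraction step mentioned in the abstract, the following is a minimal sketch of autocorrelation-based pitch estimation for a single voiced frame. It is not the authors' implementation: the abstract does not name the software or its settings, so the function name `estimate_f0_autocorr`, the search range `f0_min`/`f0_max`, and the synthetic test tone are all assumptions made for this example.

```python
import numpy as np


def estimate_f0_autocorr(frame, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate F0 (Hz) of one voiced frame from its autocorrelation peak.

    f0_min and f0_max are assumed search bounds, not values from the study.
    Returns 0.0 when the frame is silent or too short to search.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()  # remove DC offset

    # Full autocorrelation; keep only the non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if acf[0] <= 0:
        return 0.0  # no energy in the frame

    # Convert the F0 search range into a lag range (in samples).
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(acf) - 1)
    if lag_min < 1 or lag_max <= lag_min:
        return 0.0

    # The lag with the strongest self-similarity is taken as the pitch period.
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag


# Usage on a synthetic 200 Hz tone (40 ms frame at 16 kHz).
sample_rate = 16000
t = np.arange(int(0.040 * sample_rate)) / sample_rate
tone = np.sin(2 * np.pi * 200.0 * t)
f0 = estimate_f0_autocorr(tone, sample_rate)
```

Applied frame by frame across a monosyllable, an estimator of this kind yields an F0 contour, which is the sort of input the abstract describes for the ANN. A practical system would also need voicing detection and octave-error handling, which this sketch omits.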