Despite extensive research, recognition of Persian and Arabic manuscripts remains a challenging problem because of the complicated and irregular nature of the script, the large vocabulary, and the diversity of handwriting styles. In Persian and Arabic words, letters are joined together, and signs such as dots are placed above or below the letters. In the proposed approach, words are first decomposed into their constituent subwords to improve recognition accuracy. The signs are then separated from the subwords to build a dictionary of main subwords and signs, and this dictionary is used to train a classifier. Because recognition operates on subwords stripped of their signs, the classifier may misrecognize some of the subwords of a word. To overcome this, a new subword fusion algorithm is proposed, based on the similarity of the main subwords and signs. A convolutional neural network (CNN) is used as the classifier, and an autoencoder (AE) network is employed to extract suitable features; the resulting hybrid network is named AECNN. The well-known Iranshahr dataset, which contains nearly 17,000 images of the handwritten names of 503 cities of Iran, was used to analyze and test the proposed approach. The resulting recognition accuracy is 91.09%, which indicates that the proposed approach outperforms the other methods reported in the literature.
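The decomposition step can be illustrated with a minimal sketch. It assumes, purely for illustration, that subwords and signs correspond to 8-connected components of a binarized word image and that signs are told apart from main bodies by a small pixel area. The function names, the `sign_max_area` threshold, and the toy image are hypothetical and are not taken from the proposed method.

```python
from collections import deque


def connected_components(grid):
    """Return the 8-connected foreground (value 1) regions as lists of (row, col)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def split_subwords_and_signs(grid, sign_max_area=2):
    """Assumed rule: components with area <= sign_max_area are signs (e.g. dots)."""
    bodies, signs = [], []
    for pixels in connected_components(grid):
        if len(pixels) <= sign_max_area:
            signs.append(pixels)
        else:
            bodies.append(pixels)
    # Persian and Arabic are written right to left, so order main bodies
    # by their rightmost column, from right to left.
    bodies.sort(key=lambda p: max(x for _, x in p), reverse=True)
    return bodies, signs


# Toy binarized "word": two main bodies and one dot above the left body.
toy = [
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 1, 0, 0],
]
bodies, signs = split_subwords_and_signs(toy)
print(len(bodies), "main subwords,", len(signs), "sign(s)")
```

In a real pipeline the area threshold would have to be replaced by something more robust (for example, size relative to stroke width or position relative to the baseline), since this sketch only shows the idea of separating main subwords from their signs before classification.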