Old Arabic manuscripts are highly sought-after documents, yet they remain very difficult to access. Digitization, and with it handwriting recognition, is an effective way to make these resources accessible. This paper presents an end-to-end approach to the offline recognition of ancient manuscripts. First, as a crucial pre-processing step, text lines and words are extracted by applying transfer learning to the YOLO (You Only Look Once) architecture, so that the segmentation problem is treated as a detection problem. Then, for the recognition of old handwritten words, we propose ensemble learning techniques based on recurrent neural networks with a Connectionist Temporal Classification (CTC) layer, combined with convolutional networks that use Squeeze-and-Excitation blocks. The presented approach accurately detects text lines and words, even when words overlap or touch, and correctly identifies words made up of multiple connected components. We evaluate text line detection on a collection of 20 pages. Moreover, we introduce a new, consistent, and accurate dataset for word detection and recognition. We achieve promising results, with F1-measures of 98.1% and 94.38% on text line and word detection, respectively, and a character error rate of 8.27% on word recognition.
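For readers unfamiliar with the two reported metrics, the following minimal Python sketch shows how an F1-measure and a character error rate (CER) are commonly computed. It assumes the standard definitions (F1 from true-positive, false-positive, and false-negative counts; CER as total Levenshtein edit distance divided by the total number of reference characters). The function names are hypothetical, and the exact evaluation protocol used in this work, for example how detections are matched to ground truth, is not specified here.

```python
from typing import Sequence


def f1_measure(tp: int, fp: int, fn: int) -> float:
    """F1 from detection counts; equivalent to 2PR / (P + R)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Assumed definition: summed edit distance / summed reference length."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars
```

As a usage example, `character_error_rate(["كتاب", "قلم"], ["كتب", "قلم"])` counts one missing character against seven reference characters. Python strings are compared by Unicode code point, so Arabic text with diacritics may need normalization before scoring.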