Many historically significant documents exist only as paper records, which makes text recognition a crucial problem in digital image processing. Text recognition techniques primarily aim to convert paper documents into digital files that can be easily managed in a database or other server-based system. In photos from real-world settings, variations in size, colour, font, and orientation, together with background complexity, occlusion, and illumination, make text identification more difficult. Urdu text is harder to identify than non-cursive scripts because of variations in writing style, multiple forms of the same letter, joined characters, diagonal ligatures, and condensed text. To separate the spatial correlation and appearance correlation (DSSAC) of the mapped convolutional channels, the proposed model replaces the conventional convolutional layers of the U-Net with depthwise separable convolutional layers. To capture cursive text regions, this work proposes a model called DSSAC-RSC.
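As a rough illustration of why separable layers are attractive inside a U-Net, the sketch below compares weight counts, assuming the separable layers are the usual depthwise separable convolution: a per-channel spatial filter followed by a 1 x 1 channel-mixing convolution. The function names and the layer sizes are hypothetical and are not taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias omitted)."""
    # Every output channel has its own k x k filter over every input channel,
    # so spatial and cross-channel correlations are learned jointly.
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k spatial filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
print(std, sep, round(sep / std, 4))
```

Under these assumed sizes the separable layer needs 8,768 weights against 73,728 for the standard layer, a ratio of 1/c_out + 1/k^2. The saving comes from handling spatial filtering and channel mixing in two separate steps.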