Deep learning’s emergence and continued advancement have significantly changed and redefined computer vision. Scene text identification and recognition, a significant area of computer vision research, has inevitably been affected by this wave of innovation and has consequently entered the era of deep learning. It has a wide range of practical uses, from semantic analysis of natural scenes to navigation for people with visual impairments. Background complexity, image quality, text orientation, text size, and similar factors pose challenges for scene text analysis. Our proposed model provides a solution for scene text localization, detection, and refinement by integrating feature-based and deep learning-based approaches. To obtain clear, high-quality images, our model super-resolves and deblurs the input images using an Edge Attention Network (EAN). This step is followed by a candidate region proposal module that uses Maximally Stable Extremal Regions (MSER) and the Stroke Width Transform (SWT) to generate text region proposals. The detection result is then refined by a post-processing method based on the idea of a Generative Adversarial Network (GAN). Standard datasets, including ICDAR 2013, ICDAR 2015, and Street View Text (SVT), have been used for experimental verification, which shows that our proposed model consistently produces acceptable results.
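To make the candidate region proposal stage more concrete, the following is a minimal sketch of one common heuristic associated with SWT: text strokes tend to have a nearly constant width, so a region whose stroke widths vary strongly is unlikely to be text. This is an illustration only, not the proposed model's implementation. The region format, the function names `stroke_width_cv` and `filter_text_candidates`, and all thresholds are assumptions introduced here; in practice the regions would come from an MSER detector and the stroke widths from an SWT computation.

```python
from statistics import mean, pstdev


def stroke_width_cv(widths):
    """Coefficient of variation (std / mean) of a region's stroke widths."""
    m = mean(widths)
    return pstdev(widths) / m if m > 0 else float("inf")


def filter_text_candidates(regions, max_cv=0.5, min_aspect=0.1, max_aspect=10.0):
    """Keep regions that look text-like.

    Each region is a dict with (hypothetical format):
      "box":           (x, y, width, height) of the candidate region
      "stroke_widths": stroke widths sampled inside the region
    The thresholds are illustrative assumptions, not values from the paper.
    """
    kept = []
    for region in regions:
        _, _, w, h = region["box"]
        if w <= 0 or h <= 0 or not region["stroke_widths"]:
            continue  # skip degenerate regions
        aspect = w / h
        if not (min_aspect <= aspect <= max_aspect):
            continue  # extremely elongated shapes are rarely single characters
        if stroke_width_cv(region["stroke_widths"]) <= max_cv:
            kept.append(region)  # stroke width is roughly constant
    return kept


# Hand-made example candidates (not real detector output).
candidates = [
    {"box": (10, 10, 20, 30), "stroke_widths": [3, 3, 4, 3, 3]},   # uniform strokes
    {"box": (50, 10, 40, 35), "stroke_widths": [1, 9, 2, 14, 3]},  # erratic strokes
    {"box": (0, 90, 300, 4), "stroke_widths": [4, 4, 4]},          # long thin line
]
text_like = filter_text_candidates(candidates)
```

With these made-up inputs, only the first candidate passes: the second has highly variable stroke widths and the third fails the aspect-ratio check. A real pipeline would apply such filtering to regions extracted from the enhanced image before the refinement stage.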