Graphical text in natural images plays an important role in conveying information in many fields, such as communication, education, and entertainment. Recognizing text in scene images is challenging due to the inherent complexity of the images. Text recognition in natural images involves script identification, which in turn requires text localization. Localization is not trivial for natural scene images because of the presence of disparate foreground and background components. For scene images like movie posters, the challenge is more pronounced. It is aggravated by the composite characteristics of posters, such as complex graphical backgrounds and the presence of different kinds of text, including the movie title, the names of actors, producers, and directors, and the tagline. These texts vary in font, color, size, orientation, and texture. In this work, an M-EAST (modified EAST) model is proposed for text localization, based on the EAST (efficient and accurate scene text detector) model. A novel movie title extraction technique is then used to separate the title from the extracted text pool. Finally, the title script is identified using a shallow convolutional neural network (SCNN)-based technique, chosen to ensure functionality in low-resource environments. Experiments were performed on a dataset of movie poster images from the Tollywood, Bollywood, and Hollywood industries, and the highest accuracy obtained was 99.82%. The system performed better than the reported techniques.