With advancements in computing powers and the overall quality of images captured on everyday cameras, a much wider range of possibilities has opened in various scenarios. This fact has several implications for deaf and dumb people as they have a chance to communicate with a greater number of people much easier. More than ever before, there is a plethora of info about sign language usage in the real world. Sign languages, and by extension the datasets available, are of two forms, isolated sign language and continuous sign language. The main difference between the two types is that in isolated sign language, the hand signs cover individual letters of the alphabet. In continuous sign language, entire words' hand signs are used. This paper will explore a novel deep learning architecture that will use recently published large pre-trained image models to quickly and accurately recognize the alphabets in the American Sign Language (ASL). The study will focus on isolated sign language to demonstrate that it is possible to achieve a high level of classification accuracy on the data, thereby showing that interpreters can be implemented in the real world. The newly proposed Mobile-NetV2 architecture serves as the backbone of this study. It is designed to run on end devices like mobile phones and infer signals (what does it infer) from images in a relatively short amount of time. With the proposed architecture in this paper, the classification accuracy of 98.77% in the Indian Sign Language (ISL) and American Sign Language (ASL) is achieved, outperforming the existing state-of-the-art systems.