This paper proposes a novel vision-based automated framework for estimating the pollution severity of outdoor insulators. Accurate estimation of pollution severity is important for preventing premature flashover of insulators. Existing methods for determining the degree of contamination of the insulator surface require direct contact with the insulator, which is problematic in practice. To address this limitation, the proposed framework uses infrared thermal (IRT) images for noncontact monitoring of the surface condition of outdoor insulators. A large number of IRT images corresponding to different contamination levels were captured from several porcelain disc insulators using a thermal camera. The captured IRT images were first segmented using a mask region-based convolutional neural network (Mask R-CNN) to remove the effect of the background. Then, a hybrid deep learning network combining a convolutional neural network and a bidirectional long short-term memory network (CNN-BiLSTM) was designed for automated classification of the IRT images. The proposed network achieved an accuracy of 98.86%, a specificity of 99.72%, a precision of 98.86%, and an F1 score of 98.86%. A comparative study with other deep learning models indicated that the proposed CNN-BiLSTM network delivered better performance. Hence, the proposed framework can be used in practice for noncontact condition monitoring of insulators.
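For readers less familiar with the reported figures of merit, the following is a minimal sketch of how accuracy, specificity, precision, and F1 score can be obtained from a multi-class confusion matrix. It is an illustration only: the function name `macro_metrics`, the three-class setup, and the counts in `example_cm` are hypothetical and are not taken from this work, and macro-averaging over classes is assumed because the abstract does not state the averaging scheme.

```python
from typing import Dict, List


def macro_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus macro-averaged specificity, precision and F1.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Macro-averaging is an assumption made for this sketch.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    spec_sum = prec_sum = f1_sum = 0.0
    for k in range(n):
        # One-vs-rest counts for class k.
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(cm[i][k] for i in range(n)) - tp
        tn = total - tp - fn - fp

        # Guard against empty denominators for classes with no samples.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)

        spec_sum += specificity
        prec_sum += precision
        f1_sum += f1

    return {
        "accuracy": correct / total if total else 0.0,
        "specificity": spec_sum / n,
        "precision": prec_sum / n,
        "f1": f1_sum / n,
    }


# Hypothetical counts for three contamination levels (not data from the paper).
example_cm = [
    [48, 2, 0],
    [1, 47, 2],
    [0, 1, 49],
]
metrics = macro_metrics(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Specificity is typically higher than the other metrics in a multi-class setting because the true negatives of each one-vs-rest split include all correctly handled samples of the remaining classes.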