Spammers have created a new kind of electronic mail (e-mail), called image-based spam, to bypass text-based spam filters. Unfortunately, these images can contain harmful links that may infect the user's computer system, and the messages take a long time to delete, which hampers users' productivity and security. In this paper, a hybrid deep neural network architecture is proposed to address this problem. It is based on a convolutional neural network (CNN) enhanced with the convolutional block attention module (CBAM). First, the CBAM-enhanced CNN extracts the most important features from each image-based e-mail. The resulting feature vectors are then fed to a support vector machine (SVM), which classifies each e-mail as either spam or ham. The experiments use four datasets: Image Spam Hunter (ISH), Annadatha, Chavda Approach 1, and Chavda Approach 2. The results demonstrate that our model outperforms existing state-of-the-art methods in terms of accuracy.
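
To make the attention step concrete, the following is a minimal, illustrative sketch of a CBAM-style block (channel attention followed by spatial attention) applied to a small feature map, written in plain Python. It is not the implementation used in the experiments: the tensor sizes, the reduction ratio, the 3x3 spatial kernel (chosen only to keep the toy example small), the random weights, and all function names are assumptions made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap, w1, w2):
    """fmap is C x H x W (nested lists). Returns one weight in (0, 1) per channel."""
    def mlp(v):
        # Shared two-layer MLP: C -> C/r (ReLU) -> C
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    flat = [[x for row in ch for x in row] for ch in fmap]
    avg = [sum(ch) / len(ch) for ch in flat]   # global average pooling
    mx = [max(ch) for ch in flat]              # global max pooling
    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]

def spatial_attention(fmap, kernel):
    """kernel is 2 x k x k, applied to the channel-wise [avg, max] maps."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    k = len(kernel[0])
    pad = k // 2
    avg = [[sum(fmap[c][i][j] for c in range(C)) / C for j in range(W)]
           for i in range(H)]
    mx = [[max(fmap[c][i][j] for c in range(C)) for j in range(W)]
          for i in range(H)]
    pooled = [avg, mx]
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for p in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:  # zero padding
                            s += kernel[p][di][dj] * pooled[p][y][x]
            out[i][j] = sigmoid(s)
    return out

def cbam(fmap, w1, w2, kernel):
    """Scale by channel attention first, then by spatial attention."""
    ca = channel_attention(fmap, w1, w2)
    scaled = [[[ca[c] * x for x in row] for row in ch]
              for c, ch in enumerate(fmap)]
    sa = spatial_attention(scaled, kernel)
    return [[[sa[i][j] * x for j, x in enumerate(row)]
             for i, row in enumerate(ch)] for ch in scaled]

def global_average_pool(fmap):
    """Collapse each channel to one number, giving a C-dimensional vector."""
    return [sum(x for row in ch for x in row) / (len(ch) * len(ch[0]))
            for ch in fmap]

# Toy setup (all sizes and weights are hypothetical, not taken from the paper).
random.seed(0)
C, H, W, R = 4, 5, 5, 2
fmap = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // R)]
w2 = [[random.uniform(-1, 1) for _ in range(C // R)] for _ in range(C)]
kernel = [[[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
          for _ in range(2)]

refined = cbam(fmap, w1, w2, kernel)
features = global_average_pool(refined)  # one value per channel
print(len(features))
```

In the pipeline described above, a vector such as `features` would be computed for each image and passed to the SVM stage. With scikit-learn, for instance, that stage could be an `sklearn.svm.SVC` fitted on the extracted vectors and their spam/ham labels, although the kernel and hyperparameters are not specified here. In a trained model the weights `w1`, `w2`, and `kernel` would be learned jointly with the CNN rather than drawn at random.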