The rise of deepfakes and users' susceptibility to online manipulation underscore the critical need for research on effective detection methods. Detecting multimodal deepfakes, particularly in inflammatory posts, poses unique challenges because such posts combine multiple media types to increase believability and emotional impact. To address this, we propose DEFUTE, an entropy-based framework that assesses feature consistency across images and text through four modules: deepfake detection, image similarity, text similarity, and text-image matching. The deepfake detection module uses the DamCNN algorithm to identify facial forgeries, while the image and text similarity modules analyze key visual and semantic features. The text-image matching module checks whether descriptions align with their accompanying images in order to spot discrepancies. DEFUTE demonstrates high accuracy in identifying deepfake content; future work will focus on improving generalization and integrating multimodal data for greater precision.
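
To make the idea of an entropy-based combination of the four modules concrete, the following is a minimal illustrative sketch, not the DEFUTE implementation. It assumes each module emits a score in [0, 1] that can be read as the probability that a post is manipulated, and it weights each module by one minus its binary Shannon entropy, so that confident modules count more. The function names `binary_entropy` and `fuse_scores`, the score interpretation, and the weighting rule are all assumptions introduced here for illustration.

```python
import math
from typing import Mapping


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a Bernoulli(p) variable.

    Returns 0.0 for p <= 0 or p >= 1 (no uncertainty) and 1.0 at p = 0.5.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def fuse_scores(scores: Mapping[str, float]) -> float:
    """Hypothetical entropy-weighted average of per-module scores.

    Each weight is 1 - H(p): a module whose score is near 0 or 1 is
    treated as confident; a module whose score is near 0.5 contributes little.
    """
    weights = {name: 1.0 - binary_entropy(p) for name, p in scores.items()}
    total = sum(weights.values())
    if total == 0.0:
        # Every module is maximally uncertain, so stay neutral.
        return 0.5
    return sum(weights[name] * scores[name] for name in scores) / total


if __name__ == "__main__":
    # Made-up example scores for the four modules named in the abstract.
    example = {
        "deepfake_detection": 0.95,
        "image_similarity": 0.60,
        "text_similarity": 0.55,
        "text_image_matching": 0.90,
    }
    print(f"fused manipulation score: {fuse_scores(example):.3f}")
```

In this sketch the two confident modules dominate the fused score, while the two near-0.5 scores are almost ignored. Whether DEFUTE applies entropy at the score level, at the feature level, or in some other way is not stated in this summary, so the weighting above should be read only as one plausible instance of entropy-based fusion.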