The spread of fake news has become a critical problem in recent years due to the extensive use of social media platforms. False stories can go viral quickly, reaching millions of people before they can be debunked, e.g., a false story claiming that a celebrity has died when he or she is still alive. Therefore, detecting fake news is essential for maintaining the integrity of information and for addressing misinformation, social and political polarization, media ethics concerns, and security threats. From this perspective, we propose an ensemble learning-based approach for detecting multi-modal fake news. First, it exploits the publicly available Fakeddit dataset, which consists of over 1 million news samples. Next, it leverages Natural Language Processing (NLP) techniques to preprocess the textual information of each news item. Then, it gauges the sentiment of the text of each news item. After that, it generates embeddings for the text and images of the corresponding news item by leveraging Visual Bidirectional Encoder Representations from Transformers (V-BERT). Finally, it passes the embeddings to the deep learning ensemble model for training and testing. 10-fold cross-validation is used to evaluate the performance of the proposed approach. The evaluation results show that the proposed approach outperforms the state-of-the-art approaches, with performance improvements of 12.57%, 9.70%, 18.15%, 12.58%, 0.10, and 3.07 in accuracy, precision, recall, F1-score, Matthews Correlation Coefficient (MCC), and Odds Ratio (OR), respectively.
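
As an illustration of the evaluation protocol named above, the following minimal sketch shows how the six reported metrics could be computed from a binary confusion matrix, and how sample indices could be split into 10 folds. It is not the authors' implementation. The function names `binary_metrics` and `k_fold_indices` are hypothetical, the counts in the usage example are made up, and the odds ratio is assumed here to be the diagnostic odds ratio (TP × TN) / (FP × FN), which may differ from the definition used in the paper.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumption: "fake" is the positive class, and the odds ratio is the
    diagnostic odds ratio (TP * TN) / (FP * FN).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC is defined as 0 here when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # The odds ratio is unbounded when there are no errors of one kind.
    odds_ratio = (tp * tn) / (fp * fn) if (fp * fn) else math.inf
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
        "odds_ratio": odds_ratio,
    }


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    Fold sizes differ by at most one sample; no shuffling is performed,
    so callers should shuffle the data beforehand if order matters.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    for name, value in binary_metrics(tp=40, fp=10, fn=20, tn=30).items():
        print(f"{name}: {value:.4f}")
    for fold, (train, test) in enumerate(k_fold_indices(25, k=10)):
        print(fold, len(train), len(test))
```

In a 10-fold setting, the metrics would typically be computed on each held-out fold and then averaged, so that every sample is used for testing exactly once.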