Social media datasets have been widely used in disaster assessment and management. When a disaster occurs, many users post messages in a variety of formats, e.g., image and text, on social media platforms. Useful information can be mined from these multimodal data to enable situational awareness and to support decision making during disasters. However, the multimodal data collected from social media contain a large amount of irrelevant and misleading content that needs to be filtered out. Existing work has mostly used unimodal methods to classify disaster messages, i.e., the image and textual features are treated separately. While a few methods have adopted multimodal approaches, their accuracy remains limited. This research seamlessly integrates image and text information by developing a multimodal fusion approach to identify useful disaster images collected from social media platforms. In particular, a deep learning method is used to extract the visual features from social media, and the FastText framework is then used to extract the textual features. Next, a novel data fusion model is developed to combine both visual and textual features to classify relevant disaster images. Experiments are performed on a real-world disaster dataset, CrisisMMD, and the validation results demonstrate that the method consistently and significantly outperforms the previously published state-of-the-art work by more than 3 percentage points, improving accuracy from 84.4% to 87.6%.
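To make the described pipeline concrete, the following is a minimal sketch of one possible fusion design: a CNN backbone produces visual features, precomputed FastText sentence embeddings provide textual features, and the two vectors are concatenated and passed to a small classification head. The class name, feature dimensions, ResNet-50 backbone, and two-class output are illustrative assumptions, not the exact architecture or fusion model proposed in this work.

```python
import torch
import torch.nn as nn
from torchvision import models


class LateFusionClassifier(nn.Module):
    """Concatenates image and text feature vectors, then classifies.

    Hypothetical dimensions: 2048-d visual features (ResNet-50 pooled
    output) and 300-d FastText sentence embeddings.
    """

    def __init__(self, text_dim=300, hidden_dim=256, num_classes=2):
        super().__init__()
        # CNN visual feature extractor; in practice pretrained weights
        # would be loaded (weights=None here keeps the sketch offline).
        backbone = models.resnet50(weights=None)
        backbone.fc = nn.Identity()  # expose the 2048-d pooled features
        self.visual_encoder = backbone
        image_dim = 2048

        # Fusion head: concatenated features -> hidden layer -> logits.
        self.classifier = nn.Sequential(
            nn.Linear(image_dim + text_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, images, text_embeddings):
        visual_features = self.visual_encoder(images)        # (B, 2048)
        fused = torch.cat([visual_features, text_embeddings], dim=1)
        return self.classifier(fused)                        # (B, num_classes)


# Example forward pass with random tensors standing in for a batch of
# images and precomputed FastText sentence embeddings.
model = LateFusionClassifier()
images = torch.randn(4, 3, 224, 224)
text_embeddings = torch.randn(4, 300)
logits = model(images, text_embeddings)  # shape: (4, 2)
```

This sketch illustrates late fusion by concatenation; the fusion model developed in this work may combine the modalities differently.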