Natural disasters occur as a result of natural hazards and cause financial, environmental, and human losses. They strike unexpectedly, affecting the lives of tens of thousands of people. During floods, social media sites are heavily used to disseminate information about flooded areas, rescue agencies, and food and relief centres. This work proposes an ensemble learning strategy for combining and analysing such social media data in order to close this information gap and improve the response to catastrophic situations. To enable scalability and broad accessibility, the work is designed around the dynamic streaming of multimodal social media data, namely text, image, audio, and video.

A fusion technique is employed at the decision level, based on a database of 15 characteristics for more than 300 disasters around the world (trained with the MNIST dataset of 60,000 training images and 10,000 testing images). This fusion maps the collected multimodal social media data into a common semantic space, making the prediction of individual variables easier. Each merged numerical vector (tensor) of text and audio features is passed to K-CNN, an unsupervised learning algorithm, while the image and video data are given to a deep-learning-based Progressive Neural Architecture Search (PNAS) model; illustrative sketches of these steps follow below.

The trained model acts as a predictor for future incidents, allowing the estimation of total deaths, total individuals affected, and total damage, as well as specific recommendations for food, shelter, and housing inspections. To make such a prediction, the trained model is presented with a satellite image from before the disaster together with the geographic and demographic conditions, and is expected to achieve a prediction accuracy of more than 85%.
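As a minimal sketch of what the decision-level fusion step could look like, assuming each modality branch outputs a class-probability distribution: the branch outputs, class labels, and equal fusion weights below are illustrative assumptions, not values taken from this work.

```python
import numpy as np

# Hypothetical per-modality class probabilities for one incident
# (e.g. classes: 0 = "no damage", 1 = "moderate", 2 = "severe").
# In the described pipeline these would come from the text/audio
# branch and the image/video branch; the values are placeholders.
p_text_audio = np.array([0.10, 0.30, 0.60])   # text/audio branch output
p_image_video = np.array([0.20, 0.20, 0.60])  # image/video branch output

# Decision-level fusion combines model *outputs*, not raw features.
# Equal weights are an assumption; the work does not specify them.
weights = np.array([0.5, 0.5])

fused = weights[0] * p_text_audio + weights[1] * p_image_video
fused /= fused.sum()  # renormalise to a probability distribution

print("Fused distribution:", fused)
print("Fused decision:", int(np.argmax(fused)))
```

Fusing at the decision level, rather than concatenating raw features, lets each modality keep its own specialised model and degrades gracefully when one modality is missing from a post.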
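The K-CNN algorithm named above is not a standard, widely documented method, so the sketch below substitutes scikit-learn's KMeans as a stand-in to show the unsupervised grouping of merged text-audio feature vectors. The feature dimension, cluster count, and randomly generated vectors are all illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Placeholder for merged text+audio tensors: one 32-dimensional
# vector per social media post (e.g. a text embedding concatenated
# with audio features). Real vectors would come from the pipeline.
merged_vectors = rng.normal(size=(300, 32))

# Standardise so no single feature dominates the distance metric.
X = StandardScaler().fit_transform(merged_vectors)

# Unsupervised grouping of posts; k=4 is an illustrative choice
# (e.g. flooded areas, rescue agencies, food, relief centres).
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

print("Cluster sizes:", np.bincount(kmeans.labels_))
```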
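For the final prediction stage over the 15-characteristic database of 300+ disasters, one plausible reading is a per-target supervised regressor; the sketch below uses a random forest as a hypothetical stand-in (the work does not name this model family), with entirely synthetic placeholder data and a single target, total deaths.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Placeholder for the disaster database: ~300 events with 15
# characteristics each (geographic and demographic conditions,
# hazard type, etc.). The target mirrors one of the stated outputs.
X = rng.normal(size=(300, 15))
y_total_deaths = rng.poisson(lam=50, size=300).astype(float)

X_train, X_test, y_train, y_test = train_test_split(
    X, y_total_deaths, test_size=0.2, random_state=0
)

# One regressor per target variable (total deaths shown here);
# total affected and total damage would be fitted the same way.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

print("R^2 on held-out events:", model.score(X_test, y_test))
```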