Fire Radiative Power (FRP) is a key indicator of wildfire intensity. Unlike traditional real-time fire lines or burned areas, which provide only binary information, FRP quantifies how intensely a fire is burning, so its accurate prediction is correspondingly more valuable for firefighting operations and environmental pollution assessment. To this end, we combined data from geostationary and polar-orbiting satellites to correct FRP data. Incorporating factors that affect wildfire spread, such as meteorological conditions, topography, vegetation indices, and population density, we constructed a comprehensive California wildfire spread dataset covering data from 2017 onward. We then established a deep learning framework that integrates multiple modules to analyze multimodal data and predict FRP imagery. We investigated how input sequence length and loss function design affect predictive performance, and used the findings to optimize the model. Furthermore, the model demonstrated acceptable performance in transfer learning and multi-step prediction, underscoring its practical value for wildfire prediction and management. It provides more detailed information about wildfire spread than binary fire products, showing the capability of deep learning to process multimodal data and its potential in the emerging field of real-time FRP prediction.