Remote sensing (RS) imagery has become increasingly popular for surface water extent extraction, thanks to the growing availability of RS data and advances in image processing algorithms, software, and hardware. Many studies have demonstrated that RS imagery can identify flood extent either independently or alongside other well-documented approaches. However, because images from any single RS source and independent reference data for validation are both scarce, most existing studies either mapped a single scene or, when multi-temporal/multi-spatiotemporal images were involved, failed to support their results with adequate non-RS validation. As a result, hydrosimulations still dominate flood series mapping despite requiring large amounts of data and computational resources. To close these gaps, this study investigated the efficacy of RS-based multi-spatiotemporal flood inundation mapping using multimodal RS imagery, taking advantage of improved data availability and complementary image properties. This study also proposed a Quantile-based Filling & Refining (QFR) workflow to resolve the blocking effects of dense vegetation that occur in the study areas. We tested the workflow at four lock and dam sites on the Mississippi River, downstream of the Quad City area, by comparing the RS-based flood maps with HEC-RAS simulations. Compared to the original flood extents, which went through only basic post-processing, the QFR maps were noticeably more consistent with the HEC-RAS maps. Results also showed that every step in QFR contributed to the performance improvement. Although all steps were necessary in our case, some, such as the levee step, may need to be adjusted for other study regions. Our findings demonstrate the efficacy of multimodal RS flood mapping with QFR post-processing. Owing to its simple structure, the proposed workflow has the potential to be fully automated and can benefit near-real-time and real-time applications.