The unprecedented availability of petascale, analysis-ready Earth observation data has driven a surge in demand for regional-to-global environmental studies, which exploit vast volumes of data for spatiotemporal analysis at far larger scales than ever before. Imagery mosaicking, which is critical for forming a continuous "One Map" view for large-scale climate research, has accordingly drawn significant attention. However, even with distributed data processing engines such as Spark, large-scale data mosaicking still suffers from the sheer number of remote sensing images involved, which inevitably degrades performance. The fundamental problem with traditional parallel mosaicking algorithms lies in their huge computational demand and heavy data I/O burden, caused by repeatedly shuttling massive remote sensing datasets between limited local memory and bulk external storage across the multiple processing stages. To address these issues, we propose an in-memory, Spark-enabled approach to large-scale distributed data mosaicking with geo-gridded data staging accelerated by Alluxio. It organizes enormous, unstructured remote sensing datasets into geo-encoded grid groups and indexes them with multi-dimensional space-filling-curve geo-encoding built on GeoTrellis. Each bucket of geo-gridded remote sensing data can then be loaded directly from Alluxio with data prefetching, expressed as an RDD, and executed concurrently as a grid-level mosaicking task on the Spark cluster. Notably, an in-memory data orchestration layer stages intermediate big data across the multiple mosaicking stages, largely eliminating redundant data transfer while preserving good data locality. Benefiting from parallel processing with distributed data prefetching and in-memory data staging, the proposed approach offers a far more effective way to perform large-scale data mosaicking in the context of big data. Experimental results demonstrate that it is considerably more efficient and scalable than traditional parallel implementations.
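To make the pipeline concrete, the following is a minimal sketch of the grid-keyed mosaicking idea, assuming the GeoTrellis 2.x-era Spark API and an Alluxio cluster mounted through its Hadoop-compatible alluxio:// filesystem; the master host, paths, tile size, and layer name are hypothetical, and this illustrates the general technique rather than the authors' exact implementation. Scenes are cut to a shared grid layout keyed by SpatialKey, merged per grid cell as independent tasks, and the result is staged back to Alluxio under a Z-order space-filling-curve index:

```scala
import geotrellis.raster._
import geotrellis.raster.resample.Bilinear
import geotrellis.spark._
import geotrellis.spark.io._
import geotrellis.spark.io.hadoop._
import geotrellis.spark.io.index.ZCurveKeyIndexMethod
import geotrellis.spark.tiling._
import geotrellis.vector.ProjectedExtent
import org.apache.hadoop.fs.Path
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel
import org.apache.spark.{SparkConf, SparkContext}

object GridMosaic {
  def main(args: Array[String]): Unit = {
    implicit val sc = new SparkContext(new SparkConf().setAppName("alluxio-grid-mosaic"))

    // Read source GeoTiffs through Alluxio's Hadoop-compatible filesystem so
    // repeated reads across stages are served from cluster memory
    // (the alluxio-master host and /rs/scenes path are placeholders).
    val scenes: RDD[(ProjectedExtent, Tile)] =
      HadoopGeoTiffRDD.spatial(new Path("alluxio://alluxio-master:19998/rs/scenes"))

    // Derive a fixed grid layout; each SpatialKey is one geo-encoded grid cell.
    val (_, metadata) = TileLayerMetadata.fromRdd(scenes, FloatingLayoutScheme(512))

    // Cut every scene to the shared layout: tiles from overlapping scenes now
    // carry the same SpatialKey and form one "bucket" per grid cell.
    val gridded: RDD[(SpatialKey, Tile)] =
      scenes.tileToLayout(metadata.cellType, metadata.layout, Bilinear)

    // Mosaic per grid cell: merge fills NODATA cells from overlapping tiles,
    // and each key is an independent task, so grid cells run concurrently.
    val mosaic = ContextRDD(gridded.reduceByKey(_ merge _), metadata)
    mosaic.persist(StorageLevel.MEMORY_ONLY) // keep the stage output in memory

    // Stage the result back to Alluxio, indexed by a Z-order space-filling
    // curve so spatially adjacent tiles stay adjacent in the tiered store.
    HadoopLayerWriter(new Path("alluxio://alluxio-master:19998/catalog"))
      .write(LayerId("mosaic", 0), mosaic, ZCurveKeyIndexMethod)

    sc.stop()
  }
}
```

Because each SpatialKey bucket is self-contained, the reduceByKey stage involves no cross-cell dependencies, and persisting the keyed layer in memory (here via Spark's storage level, with Alluxio as the shared tier between jobs) is what avoids re-reading tiles from bulk external storage at every stage.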