The Data Interpolating Empirical Orthogonal Functions (DINEOF) method has proven usable and accurate for filling spatial gaps in remote sensing datasets. In this study, we reconstructed chlorophyll-a concentration (Chl-a) data using a convolutional neural network model called the Data-Interpolating Convolutional Auto-Encoder (DINCAE), and compared its performance with that of DINEOF. In addition, cloud-free sea surface temperature (SST) was used as a predictor of phytoplankton dynamics in the Chl-a reconstruction. In total, four reconstruction schemes were implemented: DINCAE (Chl-a only), DINCAE (Chl-a and SST), DINEOF (Chl-a only), and DINEOF (Chl-a and SST), denoted rec1, rec2, rec3, and rec4, respectively. To evaluate the accuracy of these schemes quantitatively, both cross-validation and in situ data were used. The study domain was the Northern South China Sea (SCS) and the West Philippine Sea (WPS), bounded by 115–125°E and 16–24°N, chosen to test model performance in reconstructing Chl-a under different Chl-a controlling mechanisms. The in situ validation showed that rec1 performed best among the four schemes and that adding SST to the Chl-a reconstruction did not improve the results. In the cross-validation, however, adding SST slightly improved the spatial distribution of the root mean square error (RMSE) between the reconstructed and original data, especially over the SCS continental shelf. The predictive potential of DINCAE is also confirmed in this paper: the trained DINCAE model can be re-applied to reconstruct other missing data and, more importantly, can be re-trained using the reconstructed data, thereby further improving the reconstruction results. A further consideration is efficiency: under similar reconstruction conditions, DINCAE is 5–10 times faster than DINEOF.
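The cross-validation RMSE used to compare the schemes can be illustrated with a minimal sketch. This is not the authors' code: the function names, the withheld fraction, the synthetic field, and the temporal-mean fill are all assumptions made for illustration. The temporal-mean fill is only a stand-in for an actual DINCAE or DINEOF reconstruction. The idea is to hide some valid pixels, reconstruct the field, and score the reconstruction only on the hidden pixels.

```python
import numpy as np


def withhold_pixels(data, fraction, rng):
    """Hide a random fraction of the valid (non-NaN) pixels for validation.

    Returns the training copy (with extra NaNs) and the flat indices withheld.
    """
    valid = np.flatnonzero(~np.isnan(data))
    n_hold = max(1, int(round(fraction * valid.size)))
    held = rng.choice(valid, size=n_hold, replace=False)
    training = data.copy()
    training.flat[held] = np.nan
    return training, held


def temporal_mean_fill(training):
    """Placeholder reconstruction (assumption): fill each gap with the
    time mean of that pixel, falling back to the global mean where a
    pixel is never observed. Axis 0 is assumed to be time."""
    observed = ~np.isnan(training)
    count = observed.sum(axis=0)
    total = np.nansum(training, axis=0)
    global_mean = np.nanmean(training)
    pixel_mean = np.where(count > 0, total / np.maximum(count, 1), global_mean)
    return np.where(observed, training, pixel_mean)


def cross_validation_rmse(original, reconstructed, held):
    """RMSE between reconstructed and original values at withheld pixels."""
    diff = reconstructed.flat[held] - original.flat[held]
    return float(np.sqrt(np.mean(diff ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Synthetic (time, lat, lon) field standing in for Chl-a data.
    field = rng.gamma(shape=2.0, scale=0.5, size=(20, 8, 10))
    # Simulated cloud gaps covering roughly 30% of the pixels.
    field[rng.random(field.shape) < 0.3] = np.nan

    training, held = withhold_pixels(field, fraction=0.05, rng=rng)
    reconstructed = temporal_mean_fill(training)
    print("cross-validation RMSE:", cross_validation_rmse(field, reconstructed, held))
```

To compare schemes such as rec1 to rec4 in this setup, each reconstruction would be scored against the same set of withheld pixels, so that differences in RMSE reflect the method rather than the validation sample.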