Reproducibility and reliability are fundamental principles of scientific research. A compiling setup that includes a specific compiler version and compiler flags is an essential technical support for Earth system modeling. With the fast development of computer software and hardware, a compiling setup has to be updated frequently, which challenges the reproducibility and reliability of Earth system modeling. The existing results of a simulation using an original compiling setup may be irreproducible by a newer compiling setup because trivial round-off errors introduced by the change in compiling setup can potentially trigger significant changes in simulation results. Regarding the reliability, a compiler with millions of lines of code may have bugs that are easily overlooked due to the uncertainties or unknowns in Earth system modeling. To address these challenges, this study shows that different compiling setups can achieve exactly the same (bitwise identical) results in Earth system modeling, and a set of bitwise identical compiling setups of a model can be used across different compiler versions and different compiler flags. As a result, the original results can be more easily reproduced; for example, the original results with an older compiler version can be reproduced exactly with a newer compiler version. Moreover, this study shows that new test cases can be generated based on the differences of bitwise identical compiling setups between different models, which can help detect software bugs in the codes of models and compilers and finally improve the reliability of Earth system modeling.Published by Copernicus Publications on behalf of the European Geosciences Union.