We investigate the fundamental nonlinear process of vacuum photon emission in the presence of strong electromagnetic fields, going beyond the locally constant field approximation (LCFA), i.e., treating the spatiotemporal inhomogeneities of the external field exactly. We examine a standing electromagnetic wave formed by high-intensity laser pulses and benchmark the approximate predictions against the results of a precise approach that evaluates both the tadpole (reducible) and vertex (irreducible) contributions. We demonstrate that the previously used approximate methods may fail to properly describe the quantitative characteristics of each of the two terms. In the case of the tadpole contribution, the LCFA considerably underestimates the number of photons emitted at sufficiently high frequencies of the external field. The vertex term predicts the emission of a large number of soft photons whose spectrum is no longer isotropic, in contrast to the LCFA results. A notable difference among the photon yields along different spatial directions, which is not captured by the LCFA, represents an important signature for experimental studies of the photon emission process. Since this feature persists unless the Keldysh parameter is much larger than unity, it can also be used for indirect observation of the Schwinger mechanism.