Evaluating the impacts of observations on the skill of numerical weather prediction (NWP) is an important task. The Ensemble Forecast Sensitivity to Observation (EFSO) method provides an efficient approach to diagnosing observation impacts, quantifying how much each observation improves or degrades a subsequent forecast relative to a given verification reference. This study investigates the sensitivity of EFSO impact estimates to the choice of the verification reference, using a global NWP system consisting of the Non‐hydrostatic Icosahedral Atmospheric Model (NICAM) and the Local Ensemble Transform Kalman Filter (LETKF). The EFSO evaluates observation impacts with the moist total energy norm and with recently proposed observation‐based verification metrics. The results show that each type of observation contributes mainly to improving the forecast departures of the observed variable, possibly due to the limitation of the localization used in the EFSO. The EFSO overestimates the fraction of beneficial observations when verified against subsequent analyses, especially at shorter lead times such as 6 h. This overestimation can be mitigated to some extent by verifying against observations, against analyses from other data assimilation (DA) systems, or against analyses of an independent run of the same DA system. In addition, this study demonstrates two important issues that can lead to overestimating observation impacts. First, observation impacts would be overestimated if relaxation‐to‐prior methods were applied to the initial conditions of the ensemble forecasts in the EFSO; therefore, the ensemble forecasts in the EFSO should be kept independent of the ensemble forecasts in the DA cycle. Second, the deterministic baseline forecasts of the EFSO, which represent forecasts without DA, should be initialized from the ensemble mean of the first guess at the analysis time, not from the previous analysis.
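For orientation, a commonly used formulation of the EFSO impact (following the ensemble formulation of Kalnay et al., 2012) is sketched below; the notation is illustrative and the exact form and verification metrics used in this study may differ:

\[
\Delta e^{2} \;\approx\; \frac{1}{K-1}\,
\delta \mathbf{y}_{0}^{\mathrm{T}}\,\mathbf{R}^{-1}\,\mathbf{Y}_{0}^{a}\,
\mathbf{X}_{t|0}^{f\,\mathrm{T}}\,\mathbf{C}\,
\bigl(\mathbf{e}_{t|0}+\mathbf{e}_{t|-6}\bigr),
\]

where \(K\) is the ensemble size, \(\delta\mathbf{y}_{0}\) the innovation (observation minus first guess in observation space), \(\mathbf{R}\) the observation error covariance, \(\mathbf{Y}_{0}^{a}\) the analysis ensemble perturbations in observation space, \(\mathbf{X}_{t|0}^{f}\) the forecast ensemble perturbations, and \(\mathbf{C}\) the verification norm (e.g., the moist total energy norm). The forecast errors \(\mathbf{e}_{t|0}=\bar{\mathbf{x}}_{t|0}^{f}-\mathbf{x}_{t}^{v}\) and \(\mathbf{e}_{t|-6}=\bar{\mathbf{x}}_{t|-6}^{f}-\mathbf{x}_{t}^{v}\) are measured against the verification reference \(\mathbf{x}_{t}^{v}\), which makes explicit how the choice of reference (subsequent analyses, observations, or independent analyses) enters the impact estimate.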