Context. V838 Mon is an eruptive variable that exploded in 2002. It displayed the most spectacular light echo ever observed. However, neither the origin of the reflecting matter nor the nature of the 2002 outburst has been firmly constrained. Aims. We investigate the nature of the CO radio emission detected in the field of the light echo. In particular, we explore its connection to the echoing dust around V838 Mon. Methods. We observed the echo region in multiple CO rotational transitions. We present and analyse maps of the region obtained in the 12CO(1-0) and (3-2) lines. In addition, deep spectra at several positions were acquired in 12CO(1-0), (2-1), (3-2), and 13CO(1-0), (2-1). Radiative transfer modelling of the line intensities is performed for selected positions to constrain the kinetic temperatures and densities. We derive global parameters (e.g. mass, distance, total column density) of the emitting cloud. Results. We find that a compact molecular cloud is located within the echo region. The molecular emission is physically connected to the dusty environment seen in the optical echo, and both belong to the same translucent cloud. The interstellar nature of the cloud is confirmed by its high mass of 90-150 M⊙. We propose that the cloud consists of material remaining after the formation of the cluster to which V838 Mon belongs. This indicates that the eruptive star is young (3-10 Myr).