Grassland is the dominant vegetation type on the Mongolian Plateau. It is both a key component of the plateau's ecological environment and an important resource base for its animal husbandry. Grass yield, one of the indicators used to evaluate grassland productivity, guides the balance between grassland and livestock. However, because grass yield estimation has long depended on manual field surveys, products that estimate grass yield over large areas, at high spatial resolution, and continuously in time are scarce. Taking Mongolia as the study area, we combined Landsat 8 imagery, MODIS remote sensing data, and meteorological data with grass yield measured at field survey sample plots, and used a deep neural network to learn the relationship between measured grass yield and the normalized difference vegetation index (NDVI), surface temperature, and precipitation. On this basis, we constructed a deep neural network model for estimating grass yield that is suited to the characteristics of Mongolia, and used it to map the spatial and temporal distribution of grass yield in Mongolia from 2017 to 2021. Accuracy verification shows that the deep learning model performs well, with an RMSE of 12.14 g/m² and an estimation accuracy of 81%, and can provide a method and data reference for estimating grass yield in Mongolia's grasslands.
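
As a minimal illustration of two quantities named above, the sketch below computes NDVI from near-infrared and red reflectance (Landsat 8 OLI bands 5 and 4) and the RMSE between measured and estimated grass yield. The function names and the handling of a zero denominator are assumptions made for illustration, not the authors' implementation. The abstract does not define how the 81% estimation accuracy is calculated, so that figure is not reproduced here.

```python
import math


def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red) for one pixel.

    nir and red are surface reflectances. Returning NaN when the
    denominator is zero is an illustrative choice, not from the paper.
    """
    total = nir + red
    if total == 0:
        return float("nan")
    return (nir - red) / total


def rmse(measured, estimated):
    """Return the root-mean-square error between two equal-length
    sequences of grass yield values (for example, in g/m^2)."""
    if len(measured) != len(estimated) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(m - e) ** 2 for m, e in zip(measured, estimated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

In a workflow like the one described, NDVI would be computed per pixel and fed to the regression model together with surface temperature and precipitation, and RMSE would be evaluated on field samples held out from training.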