Characterizing molecular responses in real, complex field environments is essential for understanding how plants respond to their environment. Field transcriptomics, i.e. modelling large amounts of transcriptomic and meteorological data, is the most comprehensive method of studying gene expression dynamics in complex environments. However, it is not clear which factors influence the accuracy of field transcriptome models. In this study, we developed a novel simulation system and used it to perform a large-scale simulation that reveals the factors affecting model accuracy. We found that the factors with the greatest impact on accuracy were, in order of importance, the expression pattern of the gene, the number of samples in the training data, the diurnal coverage of the training data, and the temperature coverage of the training data. Validation using measured transcriptome data gave results similar to those of the simulations. Our simulation system and analysis results will be helpful for developing efficient sampling strategies for training data and for generating simulated data for benchmarking new modelling methods. They will also be valuable for dissecting the relative importance of the various factors behind transcriptome dynamics in the real environment.

Key message
A novel simulation system revealed how the prediction accuracy of field transcriptome models was affected by the number and diversity of training data.