The rapid development of machine learning technologies in recent years has led to the emergence of CNN-based sensors or ML-enabled smart sensor systems, which are intensively used in medical analytics, unmanned driving of cars, Earth sensing, etc. In practice, the accuracy of CNN-based sensors is highly dependent on the quality of the training datasets. The preparation of such datasets faces two fundamental challenges: data quantity and data quality. In this paper, we propose an approach aimed to solve both of these problems and investigate its efficiency. Our solution improves training datasets and validates it in several different applications: object classification and detection, depth buffer reconstruction, panoptic segmentation. We present a pipeline for image dataset augmentation by synthesis with computer graphics and generative neural networks approaches. Our solution is well-controlled and allows us to generate datasets in a reproducible manner with the desired distribution of features which is essential to conduct specific experiments in computer vision. We developed a content creation pipeline targeted to create realistic image sequences with highly variable content. Our technique allows rendering of a single 3D object or 3D scene in a variety of ways, including changing of geometry, materials and lighting. By using synthetic data in training, we have improved the accuracy of CNN-based sensors compared to using only real-life data.