To mitigate the challenges of operating through narrow incisions under image guidance, there is a desire to develop intelligent systems that assist decision making and spatial reasoning in minimally invasive surgery (MIS). In this context, machine learning-based systems for interventional image analysis are receiving considerable attention because of their flexibility and the opportunity to provide immediate, informative feedback to clinicians. It is further believed that learning-based image analysis may eventually form the foundation for semi- or fully automated delivery of surgical treatments. A significant bottleneck in developing such systems is the availability of annotated images with sufficient variability to train generalizable models, particularly the most recently favored deep convolutional neural networks or transformer architectures. A popular alternative to acquiring and manually annotating data from clinical practice is the simulation of these data from human-based models. Simulation has many advantages, including the avoidance of ethical issues, precisely controlled environments, and the scalability of data collection. Here, we survey recent work that relies on in silico training of learning-based MIS systems, in which data are generated via computational simulation. For each imaging modality, we review available simulation tools in terms of compute requirements, image quality, and usability, as well as their applications for training intelligent systems. We further discuss open challenges for simulation-based development of MIS systems, such as the need for integrated imaging and physical modeling for non-optical modalities, as well as generative patient models not dependent on underlying CT, MRI, or other patient data.
In conclusion, as the capabilities of in silico training mature with respect to sim-to-real transfer, computational efficiency, and degree of control, they are contributing toward the next generation of intelligent surgical systems.