Teratogenicity poses severe threats to patient safety. Stem-cell-based in vitro systems are promising tools for predicting human teratogenicity. However, current in vitro assays are limited because they either capture effects on a single germ layer or focus on a subset of predictive markers. Here we report the characterization and critical assessment of TeraTox, a newly developed multi-lineage differentiation assay using 3D human induced pluripotent stem cells. TeraTox probes stem-cell-derived embryoid bodies with two endpoints: one quantifies cytotoxicity, and the other infers the teratogenicity potential using gene expression as a molecular phenotypic readout. To derive teratogenicity potentials from gene expression profiles, we applied both unsupervised machine-learning tools, including factor analysis, and supervised tools, including classification and regression. To identify the best explainable predictive model for the teratogenicity potential, we systematically tested 64 machine-learning model architectures. The optimal model uses the expression of 77 representative germ-layer genes, summarized by 10 latent germ-layer factors, as input for random-forest regression. We combined measured cytotoxicity and inferred teratogenicity potential to predict concentration-dependent teratogenicity profiles of 33 approved pharmaceuticals and 12 proprietary drug candidates with known in vivo data. Compared with the mouse embryonic stem cell test, which has been in routine use for more than a decade, the TeraTox assay shows higher sensitivity, particularly towards teratogens impairing ectodermal development or stem-cell renewal, and a more balanced prediction performance. We envision that further refinement and development of TeraTox have the potential to reduce and replace animal research in drug discovery and to improve preclinical assessment of teratogenicity.
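
The model structure described above (77 gene-expression values, summarized by 10 latent factors, feeding a random-forest regressor) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses purely synthetic data, and the sample count, the target score, and all variable names are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the study): 120 samples x 77 genes,
# generated from 10 hidden factors plus a small amount of noise.
n_samples, n_genes, n_factors = 120, 77, 10
latent = rng.normal(size=(n_samples, n_factors))
loadings = rng.normal(size=(n_factors, n_genes))
expression = latent @ loadings + 0.1 * rng.normal(size=(n_samples, n_genes))

# Hypothetical target: a score in (0, 1) driven by the first hidden factor.
score = 1.0 / (1.0 + np.exp(-latent[:, 0]))

# Factor analysis compresses the 77 genes into 10 latent factors;
# the random forest then regresses the score on those factors.
model = make_pipeline(
    FactorAnalysis(n_components=n_factors, random_state=0),
    RandomForestRegressor(n_estimators=200, random_state=0),
)

# Fit on the first 100 samples, predict the remaining 20.
model.fit(expression[:100], score[:100])
predicted = model.predict(expression[100:])
```

In this sketch the factor loadings stay inspectable through `model.named_steps["factoranalysis"].components_`, which is one way a latent-factor input can keep the model explainable in terms of genes.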