Stem cell-based in vitro test systems can recapitulate specific phases of human development. In the UKK test system, human pluripotent stem cells (hPSCs) randomly differentiate into cells of the three germ layers and their derivatives. In the UKN1 test system, hPSCs differentiate into early neural precursor cells. During the normal differentiation period (14 days) of the UKK system, 570 genes [849 probe sets (PSs)] were regulated >fivefold; in the UKN1 system (6 days), 879 genes (1238 PSs) were regulated. We refer to these genes as ‘developmental genes’. In the present study, we used genome-wide expression data of 12 test substances in the UKK and UKN1 test systems to understand the basic principles of how chemicals interfere with the spontaneous transcriptional development in both test systems. The set of test compounds included six histone deacetylase inhibitors (HDACis), six mercury-containing compounds (‘mercurials’) and thalidomide. All compounds were tested at the maximum non-cytotoxic concentration, while valproic acid and thalidomide were additionally tested over a wide range of concentrations. In total, 242 genes (252 PSs) in the UKK test system and 793 genes (1092 PSs) in the UKN1 test system were deregulated by the 12 test compounds. We identified sets of ‘diagnostic genes’ appropriate for the identification of the influence of HDACis or mercurials. Test compounds that interfered with the expression of developmental genes usually antagonized their spontaneous development, meaning that up-regulated developmental genes were suppressed and developmental genes whose expression normally decreases were induced. The fraction of compromised developmental genes varied widely between the test compounds, and it reached up to 60%. To quantitatively describe disturbed development on a genome-wide basis, we recommend a concept of two indices, ‘developmental potency’ (D_p) and ‘developmental index’ (D_i), whereby D_p is the fraction of all developmental genes that are up- or down-regulated by a test compound, and D_i is the ratio of overrepresentation of developmental genes among all genes deregulated by a test compound. The use of D_i makes hazard identification more sensitive because some compounds compromise the expression of only a relatively small number of genes but have a high propensity to deregulate developmental genes specifically, resulting in a low D_p but a high D_i. In conclusion, the concept based on the indices D_p and D_i offers the possibility to quantitatively express the propensity of test compounds to interfere with normal development.

Electronic supplementary material: The online version of this article (doi:10.1007/s00204-016-1741-8) contains supplementary material, which is available to authorized users.
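The two indices defined in the abstract can be illustrated with a minimal sketch. It assumes one plausible reading of the verbal definitions: the developmental potency is the share of developmental genes that a compound deregulates, and the developmental index is an enrichment ratio, that is, the share of developmental genes among the deregulated genes divided by their share among all measured genes. The function name `developmental_indices`, its parameters, and the toy gene identifiers are hypothetical and not taken from the study; the exact formula used by the authors may differ.

```python
import math


def developmental_indices(developmental, deregulated, n_total):
    """Return (d_p, d_i) for one test compound.

    developmental: gene IDs regulated during normal differentiation
    deregulated:   gene IDs up- or down-regulated by the compound
    n_total:       number of genes measured (assumed background size)

    Assumption: d_i is computed as an enrichment ratio; this is one
    reading of "ratio of overrepresentation", not a confirmed formula.
    """
    developmental = set(developmental)
    deregulated = set(deregulated)
    if not developmental or n_total <= 0:
        raise ValueError("need at least one developmental gene and n_total > 0")

    overlap = len(developmental & deregulated)

    # Developmental potency: fraction of developmental genes hit by the compound.
    d_p = overlap / len(developmental)

    # Developmental index: observed share of developmental genes among the
    # deregulated genes, relative to their share among all measured genes.
    if not deregulated:
        return d_p, math.nan
    d_i = (overlap * n_total) / (len(deregulated) * len(developmental))
    return d_p, d_i


# Toy data (invented for illustration): 100 developmental genes out of
# 1000 measured; the compound deregulates 10 genes, 5 of them developmental.
developmental_genes = {f"dev{i}" for i in range(100)}
deregulated_genes = {f"dev{i}" for i in range(5)} | {f"other{i}" for i in range(5)}

d_p, d_i = developmental_indices(developmental_genes, deregulated_genes, 1000)
print(f"D_p = {d_p:.2f}, D_i = {d_i:.1f}")
```

In this toy case the compound touches only 5 % of the developmental genes (low D_p), yet developmental genes are five times more frequent among its targets than expected by chance (high D_i), which is the situation the abstract describes as motivating the second index.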