In human health risk assessment of chemicals and pharmaceuticals, identification of genotoxicity hazard usually starts with a standard battery of in vitro genotoxicity tests. A battery is needed to cover all genotoxicity endpoints because no individual test in it is designed to detect every endpoint. This explains why the resulting data can appear contradictory, which complicates accurate interpretation of the findings. Such interpretation could be improved through the application of mathematical modeling. One advantage of mathematical modeling is that the strengths and weaknesses of each test are taken into account. Furthermore, the generated predictions are objective and convey the associated uncertainties. This approach was explored by the working group “Predictivity of In Vitro Genotoxicity Testing,” convened in the context of the 8th International Workshop on Genotoxicity Testing (IWGT). Specifically, we applied mathematical modeling to a database of publicly available in vitro and in vivo genotoxicity data. The results indicate that a mammalian in vitro clastogenicity test and a mammalian cell gene mutation test together provide strong predictive weight‐of‐evidence for evaluating the genotoxic hazard of a substance, although they are better at predicting the absence of genotoxic potential than its presence. Remarkably, the bacterial reverse mutation (Ames) test did not significantly change these predictions when used in combination with in vitro mutagenicity and clastogenicity tests using cells of mammalian origin. However, when only data from a bacterial reverse mutation test are available for the assessment of genotoxic potential, these data do carry weight of evidence and can therefore be used. Genotoxicity assays are generally executed in tiers, with the bacterial reverse mutation test often serving as the starting point.
Thus, it is reasonable to suspect that results from bacterial reverse mutation tests performed early in development have influenced the composition of the database studied here. We performed several tests of the robustness of the database used for the analyses presented here, and the results do not indicate a strong bias. Further research comparing in vitro genotoxicity data with in vivo data for additional compounds will provide more insight into whether it is indeed time to reconsider the composition of the standard in vitro genotoxicity battery.