Leptoquarks extending the Standard Model (SM) are attracting increasing attention in the recent literature. Hence, the identification of 4D SM-like models and the classification of allowed leptoquarks from strings are important steps in the study of string phenomenology. We perform the most extensive search for SM-like models from the nonsupersymmetric heterotic string SO(16) × SO(16), resulting in more than 170,000 inequivalent promising string models from 138 Abelian toroidal orbifolds. We explore the 4D massless particle spectra of these models in order to identify all exotics besides the three generations of quarks and leptons. In this way, we learn which leptoquarks can be realized in this string setup. Moreover, we analyze the number of SM Higgs doublets, which is generically larger than one. Then, we identify SM-like models with a minimal particle content. These so-called almost SM models appear most frequently in the orbifold geometries Z_2 × Z_4 (2, 4) and (1, 6). Finally, we apply machine learning to our dataset in order to predict the orbifold geometry in which a given particle spectrum is most likely to be found.