Data-driven chemistry has attracted considerable interest alongside improvements in hardware and the development of new machine learning models. However, a notable bottleneck specific to data-driven chemistry is the difficulty of obtaining sufficiently large, accurate datasets for a desired chemical outcome. Herein, I develop a machine learning framework that makes predictions in low-data settings. First, a chemical “foundational model” is trained on a dataset of ~1 million experimental crystal structures of organic molecules, a source of big data in chemistry. A task-specific model is then stacked on top of this general model. This approach achieves state-of-the-art performance on a diverse set of tasks: toxicity prediction, yield prediction, and odor prediction. More generally, my work shows that a foundational model approach, which led to a step change in domains such as natural language, can unlock advances in chemistry.