Novel, knowledge-based models for the prediction of hydrate and solvate formation are introduced, which require only the molecular formula as input. A dataset of more than 19,000 organic, non-ionic and non-polymeric molecules was extracted from the Cambridge Structural Database. Molecules that formed solvates were compared with those that did not using molecular descriptors and statistical methods, which allowed the identification of chemical properties that contribute to solvate formation. The study was conducted for five types of solvates: ethanol, methanol, dichloromethane, chloroform and water solvates. The identified properties were all related to the size and branching of the molecules and to their hydrogen-bonding ability. The corresponding molecular descriptors were used to fit logistic regression models that predict the probability that a given molecule forms a solvate. The resulting models correctly predicted the behavior of ~80% of the data using only two descriptors each.
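As an illustration of how a two-descriptor logistic regression model turns descriptor values into a solvate-formation probability, a minimal sketch follows. The function name, the descriptor values, the coefficients, the intercept and the 0.5 decision threshold are all hypothetical placeholders chosen for the example; they are not the fitted parameters or descriptors of the models described above.

```python
import math


def solvate_probability(descriptors, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    if len(descriptors) != len(coefficients):
        raise ValueError("one coefficient is needed per descriptor")
    z = intercept + sum(b * x for b, x in zip(coefficients, descriptors))
    # Evaluate the sigmoid in a form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


# Hypothetical inputs: two descriptor values for one molecule and
# placeholder model parameters (NOT values from the study).
example_descriptors = [1.2, 3.0]
example_coefficients = [0.8, -0.5]
example_intercept = 0.1

p = solvate_probability(example_descriptors, example_coefficients, example_intercept)

# Assumed decision rule: classify as solvate former when p >= 0.5.
predicted_solvate = p >= 0.5
```

In such a sketch, one set of coefficients would be fitted per solvate type (ethanol, methanol, dichloromethane, chloroform, water), and the fraction of molecules whose predicted class matches the observed one gives the kind of accuracy figure quoted above.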