For well over 100 years, chemists have explored the relationship between chemical structure and biological activity, and dreamed of predicting activity, as well as other measurable properties, from structure. The first description of a relationship between composition and activity [1] was based on observations of correlation between specific molecular features and observable physicochemical properties [2]. With some data tabulation, it was found that structure-activity relationships could be used to quantify chemical intuition: for a small change in the molecular structure, a corresponding small change in activity could be explained by analyzing regular changes in the numerical representations of molecular structure. The power inherent in this type of relationship quickly became obvious, and its importance grew with the rapid tabulation abilities of computers. The reductionist qualities of quantitative structure-activity relationships (QSARs) have resulted in both praise and condemnation for the discipline throughout its existence [3][4][5]. Without debating the philosophical validity of reductionist views, a more practical approach is to understand how and when QSARs are applicable to relevant problems. As discussed below, there are many choices to make when matching available data with types of chemical descriptors and machine learning methodologies (Figure 2.1). Inherent in these choices are decisions that affect the level of difficulty and computational effort needed to develop a model and to establish its domain of applicability, a crucial element for managing end-user expectations of model performance. Most models are constructed using methods that project or compress information into a simpler form, and consequently represent a compromise between model interpretability and predictive power.
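The idea that regular changes in a numerical representation of structure track changes in activity can be sketched as a one-descriptor linear fit. The following is a minimal illustration only, not a method taken from this chapter: the descriptor values and activities are synthetic, the congeneric series is hypothetical, and the helper names `fit_line` and `r_squared` are introduced here purely for the example.

```python
# Illustrative sketch: all numbers below are synthetic, not measured data.

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Fraction of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical congeneric series: one descriptor value per compound,
# paired with an assumed activity on a logarithmic scale.
descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
activity = [4.1, 4.6, 5.0, 5.6, 6.0, 6.4]

slope, intercept = fit_line(descriptor, activity)
r2 = r_squared(descriptor, activity, slope, intercept)

# Estimate the activity of an untested analogue whose descriptor value
# lies inside the range spanned by the training compounds.
predicted = slope * 1.75 + intercept

print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
print(f"predicted activity at descriptor 1.75: {predicted:.2f}")
```

The query value is deliberately kept inside the range of the training descriptors; a prediction far outside that range would fall outside any reasonable domain of applicability for so simple a model.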
For any nontrivial QSAR, the importance of good chemical descriptors cannot be overstated: even the most capable machine learning methodology cannot extract signal from descriptor variance that is not monotonically related to the endpoint of interest. This is the essential Tao of building QSARs, where the ultimate goal is to construct chemically meaningful, validated models. Achievement of this goal relies
Statistical Modelling of Molecular Descriptors in QSAR/QSPR. First Edition. Edited