Associations between high-dimensional datasets, each comprising many features, can be discovered through multivariate statistical methods such as Canonical Correlation Analysis (CCA) or Partial Least Squares (PLS). CCA and PLS are widely used methods that reveal which features carry the association. Despite the longevity and popularity of CCA/PLS approaches, their application to high-dimensional datasets raises critical questions about the reliability of their solutions. In particular, overfitting can produce solutions that are not stable across datasets, which severely hinders their interpretability and generalizability. To study these issues, we developed a generative model to simulate synthetic datasets with multivariate associations, parameterized by feature dimensionality, data variance structure, and assumed latent association strength. We found that the resulting CCA/PLS associations could be highly inaccurate when the number of samples per feature is relatively small. For PLS, the profiles of feature weights exhibited a detrimental bias toward leading principal component axes. We confirmed these model trends in state-of-the-art datasets containing neuroimaging and behavioral measurements in large numbers of subjects, namely the Human Connectome Project (n ≈ 1000) and UK Biobank (n = 20000), and found that only the latter comprised enough samples to obtain stable estimates. An analysis of the neuroimaging literature using CCA to map brain-behavior relationships revealed that commonly employed sample sizes yield unstable CCA solutions. Our generative modeling framework provides a calculator of the dataset properties required for stable estimates. Collectively, our study characterizes the dataset properties needed to limit the potentially detrimental effects of overfitting on the stability of CCA/PLS solutions, and provides practical recommendations for future studies.

Significance Statement

Scientific studies often begin with an observed association between different types of measures. When datasets comprise large numbers of features, multivariate approaches such as canonical correlation analysis (CCA) and partial least squares (PLS) are often used. These methods can reveal the profiles of features that carry the optimal association. We developed a generative model to simulate data and characterized how the obtained feature profiles can be unstable, which hinders interpretability and generalizability, unless a sufficient number of samples is available to estimate them. We determined sufficient sample sizes as a function of dataset properties. We also show that these issues arise in neuroimaging studies of brain-behavior relationships. We provide practical guidelines and computational tools for future CCA and PLS studies.
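
As a rough illustration of the kind of simulation framework described above, the following sketch shows how the stability of estimated CCA feature weights can depend on sample size. It assumes Python with NumPy and scikit-learn's CCA; the single-latent-variable model, parameter names, and default values used here are simplified placeholders for illustration only, not the paper's full generative model.

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)

def simulate(n, px=50, py=50, r_true=0.3, decay=1.0):
    # One shared latent variable with between-set correlation r_true
    # (hypothetical, simplified parameterization).
    z = rng.standard_normal(n)
    z_y = r_true * z + np.sqrt(1.0 - r_true**2) * rng.standard_normal(n)
    wx = rng.standard_normal(px)
    wx /= np.linalg.norm(wx)  # true signal profile in X
    wy = rng.standard_normal(py)
    wy /= np.linalg.norm(wy)  # true signal profile in Y
    sx = np.arange(1, px + 1) ** -decay  # decaying within-set variance spectrum
    sy = np.arange(1, py + 1) ** -decay
    X = np.outer(z, wx) + rng.standard_normal((n, px)) * sx
    Y = np.outer(z_y, wy) + rng.standard_normal((n, py)) * sy
    return X, Y, wx

def weight_similarity(n, reps=20):
    # Average |cosine| between the estimated first CCA weight vector for X
    # and the true signal profile, across independently simulated datasets.
    sims = []
    for _ in range(reps):
        X, Y, wx = simulate(n)
        w_hat = CCA(n_components=1).fit(X, Y).x_weights_[:, 0]
        sims.append(abs(w_hat @ wx) / np.linalg.norm(w_hat))
    return float(np.mean(sims))

for n in (100, 1000, 10000):
    print(f"n = {n:5d}: mean weight similarity ~ {weight_similarity(n):.2f}")

In this toy setting, the similarity between estimated and true weight profiles typically increases with the number of samples per feature, mirroring the sample-size dependence characterized in the study.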