Model validation metrics have been developed to provide a quantitative measure of the agreement between predictions and observations. In engineering design, these metrics are useful for model selection when alternative models are being considered. Additionally, the predictive capability of a computational model needs to be assessed before it is used in engineering analysis and design. Because of the various sources of uncertainty in both computer simulations and physical experiments, model validation must be conducted based on stochastic characteristics. Currently, there is no unified validation metric that is widely accepted. In this paper, we present a classification of validation metrics based on their key characteristics, along with a discussion of the desired features. Focusing on stochastic validation that considers uncertainty in both predictions and physical experiments, we examine four main types of metrics, namely classical hypothesis testing, the Bayes factor, the frequentist’s metric, and the area metric, to provide a better understanding of the pros and cons of each. Using mathematical examples, a set of numerical studies is designed to answer various research questions and to study how sensitive these metrics are to the experimental data size, the uncertainty from measurement error, and the uncertainty in unknown model parameters. The insight gained from this work provides useful guidelines for choosing an appropriate validation metric in engineering applications.