Model validation is the assessment of the "correctness" of a given model relative to experimental data. The results of a model validation study can be used to quantify the model form uncertainty, to select between different models, or to improve the model (e.g., through calibration or model updating). The process of model validation is complicated by the fact that both the simulation and experimental outcomes include significant uncertainty, which can come in the form of aleatory (random) uncertainties, epistemic (lack-of-knowledge) uncertainties, and bias errors. The application of probability bounds analysis for treating mixed (i.e., containing both aleatory and epistemic) input uncertainties results in a family of cumulative distribution functions (CDFs) of the simulation outcomes, which is referred to as a "probability box" or "p-box." The fact that a family of CDFs, as opposed to a single CDF, is required to characterize the outcomes complicates the implementation of model validation techniques. In this paper, we examine the following approaches to model validation and assess their ability to handle problems with mixed uncertainties: 1) the area validation metric, 2) a modified area validation metric with a factor of safety, 3) a modified area validation metric with confidence intervals, 4) the standard validation uncertainty, and 5) the difference in simulation and experimental means. To provide a rigorous assessment of these model validation techniques, we employ the recently developed Method of Manufactured Universes (MMU), in which "true values in nature" are constructed by the analyst to reflect the behavior of a physical reality of interest. Here, MMU is applied to the compressible turbulent flow over a NACA 0012 airfoil, where the "true" values are constructed using turbulent computational fluid dynamics simulations while the "model" employs simplified lift and drag estimates based on thin airfoil theory and empirical lift and drag correlations.
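For a single simulation CDF, the area validation metric is simply the area enclosed between the simulation CDF and the experimental empirical CDF (their L1 distance). A minimal NumPy sketch, assuming both outcomes are available as sample sets (the function names here are illustrative, not from the paper):

```python
import numpy as np

def ecdf_on_grid(samples, grid):
    """Empirical CDF of `samples` evaluated at each point of `grid`
    (right-continuous step function)."""
    samples = np.sort(np.asarray(samples, dtype=float))
    return np.searchsorted(samples, grid, side="right") / samples.size

def area_validation_metric(sim, exp):
    """Area between the empirical CDFs of simulation and experimental
    samples: the integral of |F_sim(x) - S_exp(x)| over x."""
    grid = np.union1d(sim, exp)      # every jump location of either ECDF
    F_sim = ecdf_on_grid(sim, grid)
    S_exp = ecdf_on_grid(exp, grid)
    widths = np.diff(grid)           # length of each constant interval
    return float(np.sum(np.abs(F_sim[:-1] - S_exp[:-1]) * widths))
```

As a sanity check on this form of the metric: for two equal-size sample sets that differ by a constant shift c, the computed area equals |c|, matching the interpretation of the metric as an estimate of model form error in the units of the quantity of interest.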
Preliminary results suggest that both modified area validation metric implementations provide more conservative estimates of the model form uncertainty in the mean values than the baseline area validation metric, with the confidence-interval implementation also returning smaller uncertainty ratios.
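When mixed uncertainties yield a p-box rather than a single simulation CDF, one natural generalization of the area metric penalizes only the portion of the experimental empirical CDF that falls outside the p-box bounds. A hedged sketch of that idea, assuming the p-box is summarized by sample sets generating its upper and lower bounding CDFs (the representation and names are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def ecdf_on_grid(samples, grid):
    """Empirical CDF of `samples` evaluated at each point of `grid`
    (right-continuous step function)."""
    samples = np.sort(np.asarray(samples, dtype=float))
    return np.searchsorted(samples, grid, side="right") / samples.size

def pbox_area_metric(upper_samples, lower_samples, exp_samples):
    """Area by which the experimental ECDF escapes the p-box [F_lo, F_hi].

    `upper_samples` generates the upper CDF bound F_hi and `lower_samples`
    the lower bound F_lo; the metric is zero whenever the experimental
    ECDF lies entirely inside the p-box.
    """
    grid = np.union1d(np.union1d(upper_samples, lower_samples), exp_samples)
    F_hi = ecdf_on_grid(upper_samples, grid)
    F_lo = ecdf_on_grid(lower_samples, grid)
    S = ecdf_on_grid(exp_samples, grid)
    # accumulate only excursions above F_hi or below F_lo
    excess = np.maximum(0.0, S - F_hi) + np.maximum(0.0, F_lo - S)
    return float(np.sum(excess[:-1] * np.diff(grid)))
```

Under this construction the metric vanishes when the experimental evidence is consistent with the family of CDFs, which is one way the p-box formulation complicates direct comparison with the single-CDF metrics listed above.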