The predictive validity of discrete choice models is key for policy making in the transportation sector. Several approaches exist for internal validation, i.e., validation in which the model is estimated and validated on the same population. Each approach is characterised by a sampling strategy and an accuracy metric. Sampling strategies include in-sample (also referred to as apparent), split-sample, cross-validation, and bootstrapping. Accuracy metrics include McFadden's rho-squared, percentage of right classification, McFadden's proportion of right predictions, the Brier score, the polytomous discrimination index, and the hypervolume under the ROC manifold. It is widely recognised that in-sample strategies yield overly optimistic performance estimates, because the model is optimised for the sample on which it is estimated. The performance of internal validation approaches has been evaluated in clinical epidemiology using logistic regression models. This paper evaluates internal validation approaches using synthetic and real datasets on personal travel mode choice, modelled with multinomial logit. The performance of each approach is evaluated against the apparent performance in the full population. With both synthetic and real data, cross-validation produces the lowest bias for most metrics. The metric with the lowest bias is data-specific. Bootstrapping produces the lowest variability.
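To make the distinction between sampling strategy and accuracy metric concrete, the following is a minimal sketch in Python with NumPy, not the implementation used in this paper. All function names are hypothetical. It assumes that predicted choice probabilities are already available as an array with one row per observation, that chosen alternatives are coded as integers starting at 0, that the Brier score is the unnormalised multiclass form, and that rho-squared uses an equal-shares null model; other conventions exist for both metrics.

```python
import numpy as np

def brier_score(probs, chosen):
    """Multiclass Brier score: mean squared distance between the predicted
    probabilities and the one-hot observed choice (assumed convention)."""
    probs = np.asarray(probs, dtype=float)
    onehot = np.eye(probs.shape[1])[np.asarray(chosen)]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

def mcfadden_rho_squared(probs, chosen):
    """1 - LL(model) / LL(null), with an equal-shares null (assumption)."""
    probs = np.asarray(probs, dtype=float)
    chosen = np.asarray(chosen)
    n, n_alts = probs.shape
    ll_model = np.sum(np.log(probs[np.arange(n), chosen]))
    ll_null = n * np.log(1.0 / n_alts)
    return float(1.0 - ll_model / ll_null)

def kfold_indices(n, k, rng):
    """Yield (train, test) index arrays for k-fold cross-validation."""
    folds = np.array_split(rng.permutation(n), k)
    for i in range(k):
        train = np.concatenate([folds[m] for m in range(k) if m != i])
        yield train, folds[i]

def bootstrap_indices(n, rng):
    """Return (in-bag, out-of-bag) indices for one bootstrap resample."""
    inbag = rng.integers(0, n, size=n)       # sample n rows with replacement
    oob = np.setdiff1d(np.arange(n), inbag)  # rows never drawn
    return inbag, oob
```

In such a setup, the model would be re-estimated on each `train` (or in-bag) index set and scored on the matching `test` (or out-of-bag) set, whereas the in-sample strategy scores the model on the same rows used for estimation.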