The channelized Hotelling observer (CHO) has become a widely used approach for evaluating medical image quality, acting as a surrogate for human observers in early-stage research on assessment and optimization of imaging devices and algorithms. The CHO is typically used to measure lesion detectability. Its popularity stems from experiments showing that the CHO’s detection performance can correlate well with that of human observers. In some cases, CHO performance overestimates human performance; to counteract this effect, an internal-noise model is introduced, which allows the CHO to be tuned to match human observer performance. Typically, this tuning is achieved using example data obtained from human observers. We argue that this internal-noise tuning step is essentially a model training exercise; therefore, just as in supervised learning, it is essential to test the CHO with an internal-noise model on a set of data that is distinct from that used to tune (train) the model. Furthermore, we argue that, if the CHO is to provide useful insights about new imaging algorithms or devices, the test data should reflect such potential differences from the training data; it is not sufficient simply to use new noise realizations of the same imaging method. Motivated by these considerations, the novelty of this paper is the use of new model selection criteria to evaluate ten established internal-noise models, utilizing four different channel models, in a train-test approach. Though not the focus of the paper, a new internal-noise model is also proposed that outperformed the ten established models in the cases tested. The results, using cardiac perfusion SPECT data, show that the proposed train-test approach is necessary, as judged by the newly proposed model selection criteria, to avoid spurious conclusions. 
The results also demonstrate that, for some models, the optimal internal-noise parameter is very sensitive to the choice of training data; these models are therefore prone to overfitting and are unlikely to generalize well to new data. In addition, we present an alternative interpretation of the CHO as a penalized linear regression in which the penalization term is defined by the internal-noise model.
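
To make the penalized-regression reading concrete, the following is a minimal sketch under stated assumptions; it is not the formulation or code used in this work. It assumes the simplest possible internal-noise model, an additive covariance K_int = lam * I on the channel outputs, and synthetic Gaussian channel outputs in place of real SPECT data. The function names `cho_template` and `detectability_snr` and the parameter `lam` are hypothetical. Under these assumptions the CHO template w = (K + K_int)^(-1) dv is the minimizer of the quadratic objective 0.5 * w'Kw - w'dv + 0.5 * w'K_int w, so the internal-noise term acts like a ridge-type penalty.

```python
import numpy as np


def cho_template(v_signal, v_noise, lam=0.0):
    """Hypothetical helper: CHO template with an assumed internal-noise term.

    v_signal, v_noise: (n_images, n_channels) arrays of channel outputs.
    lam: assumed internal-noise variance, giving K_int = lam * I.
    """
    # Difference of class-mean channel outputs.
    dv = v_signal.mean(axis=0) - v_noise.mean(axis=0)
    # Average of the two class covariance matrices of the channel outputs.
    K = 0.5 * (np.cov(v_signal, rowvar=False) + np.cov(v_noise, rowvar=False))
    # Assumed internal-noise covariance (one simple choice among many).
    K_int = lam * np.eye(K.shape[0])
    # Template solves (K + K_int) w = dv, i.e. a ridge-like normal equation.
    w = np.linalg.solve(K + K_int, dv)
    return w, dv, K, K_int


def detectability_snr(w, dv, K, K_int):
    """Detectability SNR of the test statistic, internal noise included."""
    return float(w @ dv / np.sqrt(w @ (K + K_int) @ w))


# Synthetic channel outputs (an assumption; stands in for real image data).
rng = np.random.default_rng(0)
n_images, n_channels = 500, 4
v_noise = rng.normal(size=(n_images, n_channels))
v_signal = rng.normal(size=(n_images, n_channels)) + np.array([0.8, 0.4, 0.2, 0.1])

# Larger internal noise lowers the model observer's detectability.
snrs = {}
for lam in (0.0, 1.0, 10.0):
    w, dv, K, K_int = cho_template(v_signal, v_noise, lam)
    snrs[lam] = detectability_snr(w, dv, K, K_int)
    print(f"lam={lam:5.1f}  SNR={snrs[lam]:.3f}")
```

In a train-test setting, `lam` would be chosen on one set of human-observer data and the resulting observer then evaluated on data from a different imaging condition; the sketch above covers only the template computation, not that tuning step.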