Chi-squared tests for lack of fit are traditionally used to find evidence against a hypothesized model, and the model is accepted if the Karl Pearson statistic comparing observed and expected numbers of observations falling within cells is not 'significantly large'. However, if one really wants evidence for goodness of fit, it is better to adopt an equivalence testing approach in which small values of the chi-squared statistic are evidence for the desired model. This method requires one to define what is meant by equivalence to the desired model, and guidelines for doing so are proposed. A simple extension of the classical normalizing transformation for the non-central chi-squared distribution then places these values on a simple-to-interpret calibration scale for evidence. It is shown that the evidence can distinguish between normal and nearby models, as well as between the Poisson and over-dispersed models. Applications to the evaluation of random number generators and to the uniformity of the digits of pi are included. Sample sizes required to obtain a desired expected evidence for goodness of fit are also provided.
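
The idea that small values of the Pearson statistic count as evidence for the model can be illustrated with a minimal sketch. It is not the transformation developed in this paper: it uses a crude normal approximation to the non-central chi-squared distribution (mean df + lambda, variance 2(df + 2 lambda)) as a stand-in. The function names, the equivalence margin `eps`, the convention lambda0 = n * eps^2 for the non-centrality at the equivalence boundary, and the digit counts are all hypothetical choices made for illustration.

```python
from statistics import NormalDist


def pearson_statistic(observed, expected):
    """Pearson chi-squared statistic: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def equivalence_z(x2, df, n, eps):
    """Standardized distance of x2 BELOW the equivalence boundary.

    Assumption (not from the paper): the boundary of 'equivalence' is the
    non-centrality lambda0 = n * eps**2, and the non-central chi-squared
    distribution is approximated by a normal with the same mean and variance.
    Large positive values suggest the fit is better than the boundary case.
    """
    lam0 = n * eps ** 2
    mean = df + lam0                      # mean of non-central chi-squared
    sd = (2 * (df + 2 * lam0)) ** 0.5     # its standard deviation
    return (mean - x2) / sd


# Hypothetical counts of the digits 0-9 in a sample of 1000 digits.
observed = [98, 105, 97, 102, 101, 99, 100, 96, 103, 99]
n = sum(observed)
expected = [n / 10] * 10                  # uniform model

x2 = pearson_statistic(observed, expected)
z = equivalence_z(x2, df=len(observed) - 1, n=n, eps=0.2)

# Approximate p-value for the null hypothesis of NON-equivalence.
p_value = 1 - NormalDist().cdf(z)
print(f"X^2 = {x2:.3f}, z = {z:.3f}, p = {p_value:.5f}")
```

Note the reversal relative to the classical lack-of-fit test: here the null hypothesis is that the model is at least `eps` away from the truth, so a small statistic leads to its rejection. The normal approximation is rough for small degrees of freedom and is used only to keep the sketch free of external dependencies.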