Learning machines that have hierarchical structures or hidden variables are singular statistical models because they are nonidentifiable and their Fisher information matrices are singular. In singular statistical models, neither does the Bayes a posteriori distribution converge to the normal distribution nor does the maximum likelihood estimator satisfy asymptotic normality. This is the main reason that it has been difficult to predict their generalization performance from trained states. In this paper, we study four errors, (1) the Bayes generalization error, (2) the Bayes training error, (3) the Gibbs generalization error, and (4) the Gibbs training error, and prove that there are universal mathematical relations among these errors. The formulas proved in this paper are equations of states in statistical estimation because they hold for any true distribution, any parametric model, and any a priori distribution. Also we show that the Bayes and Gibbs generalization errors can be estimated by Bayes and Gibbs training errors, and we propose widely applicable information criteria that can be applied to both regular and singular statistical models.