The purpose of this paper is to evaluate the primary databases of zootechnical records in western Siberia. The object of the study was the records of 20,000 Holstein cows: milk productivity (milk yield over the whole lactation, milk fat content, milk protein content, amount of milk fat, amount of milk protein), duration of lactation (service period, dry period, inter-breeding period), age at first fruitful insemination, and information on origin. The validity of the raw data was assessed under the assumption that, in the absence of significant human influence, the records follow a Gaussian distribution. For this purpose, the Anderson-Darling test was applied, together with visualization using histograms and quantile-quantile plots. A list of milk production traits was compiled from the values of the Anderson-Darling criterion. The authors found that the highest values of this criterion were associated with milk fat and protein, whereas the indicator "milk yield" was practically absent from the list. These results can be explained by the fact that, in most enterprises, milk yield was higher than the appraisal threshold values. In addition to the statistical criteria, the genealogical trees of the studied breeding enterprises were analyzed. This analysis revealed that several dozen offspring had been inappropriately assigned to a single mother. Thus, the presented approach can be used to identify outliers associated with human factors. Such outliers may also be related to improper methodological support of the sampling process and to errors in the work of the laboratories of selective milk quality control during the sampling and delivery of samples.
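To illustrate the kind of normality screening described above, the following minimal Python sketch computes the Anderson-Darling statistic for a single trait. It is not the authors' implementation. The function name `anderson_darling_normal`, the simulated fat-content values, and the cut-off of 3.6 are hypothetical and chosen only for illustration. The sketch assumes the mean and standard deviation are estimated from the sample and applies Stephens's small-sample correction, for which the commonly tabulated 5% critical value is about 0.752.

```python
import math
import random
import statistics


def anderson_darling_normal(values):
    """Adjusted Anderson-Darling statistic for normality.

    Mean and standard deviation are estimated from the sample, and the
    raw statistic is multiplied by Stephens's correction factor.
    """
    n = len(values)
    if n < 8:
        raise ValueError("need at least 8 observations")
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    if std == 0:
        raise ValueError("all observations are identical")

    # Standardize, sort, and map through the standard normal CDF.
    z = sorted((v - mean) / std for v in values)
    eps = 1e-12  # keeps log() finite for extreme observations
    cdf = [min(max(0.5 * (1.0 + math.erf(x / math.sqrt(2.0))), eps), 1.0 - eps)
           for x in z]

    # A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) [ln F(z_i) + ln(1 - F(z_{n+1-i}))]
    total = sum((2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
                for i in range(n))
    a2 = -n - total / n
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)


# Hypothetical data: simulated milk fat content (%) for 500 cows.
rng = random.Random(0)
clean = [rng.gauss(3.9, 0.3) for _ in range(500)]

# Mimic manual editing: every value below a made-up cut-off is raised to it.
edited = [max(v, 3.6) for v in clean]

CRITICAL_5_PERCENT = 0.752  # assumed tabulated value for the adjusted statistic

for label, sample in (("clean", clean), ("edited", edited)):
    stat = anderson_darling_normal(sample)
    verdict = "suspect" if stat > CRITICAL_5_PERCENT else "consistent with normality"
    print(f"{label}: A*^2 = {stat:.3f} ({verdict})")
```

Raising values to a cut-off piles observations onto a single point, which inflates the statistic well beyond what an unedited Gaussian sample produces. Computing the statistic per trait and per enterprise and sorting by its value would give a ranking comparable in spirit to the list of traits mentioned in the abstract.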