The purpose of this study is to demonstrate that a deep learning model, trained on binary test results obtained under routine protocols, can quantitatively evaluate clinical findings that physicians typically judge inconsistently. The chest X-ray is the most commonly used diagnostic tool for detecting a wide range of diseases and is generally performed as part of regular medical checkups. However, for findings that are not considered disease-related and can be classified as within the normal range, the thresholds physicians apply vary to some extent. It is therefore necessary to define a new evaluation method that quantifies this variation, but implementing such a method is difficult and expensive in terms of time and labor. In this study, a total of 83,005 chest X-ray images were analyzed for the common findings of pleural thickening and scoliosis. A novel method was established for quantitatively evaluating the probability that a physician would judge an image to show these findings. The proposed method successfully quantified the variation in physicians’ findings using a deep learning model trained only on binary annotation data. It was also demonstrated that the method can be applied both to transfer learning with convolutional neural networks for general image analysis and to a newly trained deep learning model based on vector-quantized variational autoencoders, with high correlations ranging from 0.89 to 0.97.