Recently, researchers' attention has shifted from pronunciation assessment based on comparison between learners' utterances and native models to assessment based on the comprehensibility of the utterances [1, 2, 3]. In our previous studies [4, 5], native listeners' shadowing was investigated and shown to be effective for predicting the comprehensibility perceived by the listeners (shadowers). In this paper, native listeners' shadowings are viewed as spoken annotations that represent comprehensibility. In [4, 5], the comprehensibility of a non-native utterance was predicted by calculating the GOP scores of the corresponding native listeners' shadowings with a DNN-based ASR front-end. Generally speaking, annotations are prepared manually, and even when automatic techniques are used for annotation, only stable and reliable techniques should be adopted. In this paper, a simpler, more stable, and more reliable method of deriving comprehensibility annotations is proposed. After shadowing, the native listeners are asked to read aloud the sentence intended by the learner. Reading is the most prepared form of speech, whereas shadowing is probably the least prepared. DTW between the two utterances is expected to quantify and predict the comprehensibility, or shadowability, perceived by the shadowers. In experiments, DTW between shadowings and readings shows a higher correlation with perceived comprehensibility than the GOP scores of the shadowings do.
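The following is a minimal sketch, not the authors' implementation, of how such a DTW-based shadowability score could be computed: the shadowing and the shadower's own read-aloud utterance are converted to frame-level spectral features and aligned by DTW, and the normalized alignment cost is used as the score. The file names, sampling rate, 13-dimensional MFCC features, and cosine distance are illustrative assumptions, not details taken from the paper.

```python
import librosa
import numpy as np


def dtw_shadowability(shadowing_wav: str, reading_wav: str) -> float:
    """Return a DTW-based distance between a shadowing and a reading.

    Lower values indicate that the shadowing stays acoustically close to the
    carefully read version, i.e., the learner's utterance was easy to shadow.
    """
    # Load both utterances at a common sampling rate (assumed 16 kHz here).
    y_shadow, sr = librosa.load(shadowing_wav, sr=16000)
    y_read, _ = librosa.load(reading_wav, sr=16000)

    # Frame-level spectral features for both utterances.
    mfcc_shadow = librosa.feature.mfcc(y=y_shadow, sr=sr, n_mfcc=13)
    mfcc_read = librosa.feature.mfcc(y=y_read, sr=sr, n_mfcc=13)

    # Dynamic time warping between the two feature sequences.
    D, wp = librosa.sequence.dtw(X=mfcc_shadow, Y=mfcc_read, metric="cosine")

    # Normalize the accumulated cost by the warping-path length so that
    # utterance duration does not dominate the score.
    return float(D[-1, -1] / len(wp))
```

Under these assumptions, the score would be computed per sentence and then correlated with the shadowers' comprehensibility ratings, analogously to how the GOP scores of shadowings were used in [4, 5].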