Influence diagrams, produced as the learning output of the mystery method, externalize systems thinking and are therefore a valid object of research. So far, however, they have not been conceptualized in the research literature on teaching systems thinking in education for sustainable development. In this study, 31 such diagrams are compared (1) against three different expert references, (2) in two different ways, and (3) using three different scoring systems, to determine which evaluation option is both valid and easy to implement. The diagrams’ diameters serve as a benchmark, which allows statements about the quality of the diagrams in general. The results show that the quality of an influence diagram becomes measurable, depending on the combination of the variables involved in the evaluation (1, 2, 3). However, the evaluation schemes differ strongly from one another, which can be explained by each variable’s peculiarities. Overall, the tested methodology is effective but will need to be refined in the future. The results also offer starting points for future research that builds on the approach taken here.