Visual representations are prevalent in STEM instruction. To benefit from visuals, students need representational competencies that enable them to see meaningful information. Most research has focused on explicit conceptual representational competencies, but implicit perceptual competencies might also allow students to efficiently see meaningful information in visuals. Most common methods to assess students’ representational competencies rely on verbal explanations or assume explicit attention. However, because perceptual competencies are implicit and not necessarily verbally accessible, these methods are ill‐equipped to assess them. We address these shortcomings with a method that draws on similarity learning, a machine learning technique that detects visual features that account for participants’ responses to triplet comparisons of visuals. In Experiment 1, 614 chemistry students judged the similarity of Lewis structures and in Experiment 2, 489 students judged the similarity of ball‐and‐stick models. Our results showed that our method can detect visual features that drive students’ perception and suggested that students’ conceptual knowledge about molecules informed perceptual competencies through top‐down processes. Furthermore, Experiment 2 tested whether we can improve the efficiency of the method with active sampling. Results showed that random sampling yielded higher accuracy than active sampling for small sample sizes. Together, the experiments provide the first method to assess students’ perceptual competencies implicitly, without requiring verbalization or assuming explicit visual attention. These findings have implications for the design of instructional interventions that help students acquire perceptual representational competencies.