This study examines how users’ mental models (MMs) develop over time. We use behavioral data obtained from process tracing to identify key components of MMs and their relative importance. We further investigate the stability and predictability of these components as users learn through system interaction. We conducted a human-in-the-loop experiment in a dynamic geospatial environment in which six information attributes were provided to inform participants’ decisions. Partial Least Squares Regression was used to relate behavioral data to decision-making outcomes. We found that top performers initially adapt and then progressively stabilize toward a suitable model as their performance improves. In contrast, low performers lack adaptability and perform poorly. Overall, most participants become more consistent in their choices as task familiarity increases. Identifying MMs and the underlying stability and predictability trends within performance groups has implications for improving user experience and curating decision support tools for human-AI teams.