Cultural heritage virtual tourism offers users a novel digital heritage experience and has become an essential channel for cultural dissemination and preservation. However, how to stimulate users’ continuous behavioral intention remains unresolved. This study integrates Stimulus–Organism–Response (SOR) theory and experience economy theory to construct a comprehensive model of the factors influencing users’ continuous intention in cultural heritage virtual tourism. Data from 451 valid questionnaires were analyzed using structural equation modeling (SEM) and fuzzy-set qualitative comparative analysis (fsQCA). The SEM results show that (1) esthetics, entertainment, escapism, education, and connection experiences all positively affect perceived value and satisfaction; (2) all experiences except escapism positively influence cultural identity; and (3) perceived value, satisfaction, and cultural identity significantly influence continuous intention. The fsQCA results show that (1) in high continuous intention cases, perceived value, satisfaction, and cultural identity are core conditions, while esthetics, entertainment, escapism, education, and connection act as supporting conditions that enhance users’ willingness to continue engaging under different configurations; and (2) in low continuous intention cases, the absence of escapism, satisfaction, cultural identity, education, esthetics, and connection weakens users’ virtual tourism experiences, leading to a decline in continuous usage intention. This study provides theoretical and practical insights for promoting users’ continuous intention in cultural heritage virtual tourism.