Purpose: To meet the emerging demand for fine-grained annotation and semantic enrichment of cultural heritage images, this paper proposes a new approach that can transcend the boundaries of information organization theory and Panofsky's iconography theory.

Design/methodology/approach: After a systematic review of semantic data models for organizing cultural heritage images and a comparative analysis of the concepts and characteristics of deep semantic annotation (DSA) and indexing, an integrated DSA framework for cultural heritage images, along with its principles and process, was designed. Two experiments were conducted on two mural images from the Mogao Caves to evaluate the DSA framework's validity based on four criteria: depth, breadth, granularity and relation.

Findings: Results showed that the proposed DSA framework not only included image metadata but also represented the storyline contained in the images by integrating domain terminology, ontology, thesaurus, taxonomy and natural language description into a multilevel structure.

Originality/value: DSA can reveal the aboutness, ofness and isness information contained within images, and can thus meet the demand for semantic enrichment and retrieval of cultural heritage images at a fine-grained level. This method can also contribute to building a novel infrastructure for the growing scholarship in digital humanities.