Single-cell genomics is producing an ever-increasing number of datasets that, when integrated, could provide large-scale reference atlases of tissues in health and disease. Such atlases not only increase the scale and generalizability of analyses but also enable combining the knowledge generated by individual studies. Specifically, individual studies often differ in cell annotation terminology and depth, with different groups specializing in different cell type compartments. Understanding how these distinct sets of annotations relate to and complement each other would mark a major step towards a consensus-based cell type annotation that reflects the latest knowledge in the field. Although recent computational techniques, referred to as 'reference mapping' methods, facilitate the use and expansion of existing reference atlases by mapping new datasets (i.e., queries) onto an atlas, a systematic approach to harmonizing dataset-specific cell type terminology and annotation depth is still lacking. Here, we present 'treeArches', a framework to automatically build and extend reference atlases while enriching them with an updatable hierarchy of cell type annotations across different datasets. We demonstrate various use cases for treeArches, from automatically resolving relations between reference and query cell types to identifying unseen cell types not present in the reference, such as disease-associated cell states. We envision treeArches enabling data-driven construction of consensus, atlas-level cell type hierarchies, as well as facilitating efficient use of reference atlases.