The New York Times Annotated Corpus, the ACM Digital Library, and PubMed are three prototypical examples of document collections in which each document is tagged with keywords or phrases. Such collections can be viewed as high-dimensional document cubes, which browsers and search systems can explore in a manner similar to online analytical processing against data cubes. Informed by an examination of the tagging patterns in these collections, a partial materialization strategy is developed that provides efficient storage of, and access to, centroids for document subsets defined through queries over tags. With this strategy, summary measures that depend on centroids (including measures involving medoids, sets of representative documents, or sets of representative terms) can be computed efficiently. The proposed design is evaluated on the three collections and on several synthetically generated collections, confirming that it outperforms alternative storage strategies.
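To make the idea of materialized centroids concrete, the following is a minimal sketch and not the design evaluated here. It assumes that each document carries a tag set and a term-weight vector, that a query is a conjunction of tags, and that a cell stores the term-weight sum and document count from which a centroid is derived. The names `DOCS`, `materialize`, `centroid`, and `max_tags` are hypothetical, and the rule for choosing which cells to store (all tag sets of at most `max_tags` tags) is an illustrative assumption only.

```python
from itertools import combinations

# Toy collection (hypothetical data): tag set plus term-weight vector per document.
DOCS = [
    {"tags": {"health", "policy"}, "terms": {"vaccine": 2.0, "budget": 1.0}},
    {"tags": {"health"}, "terms": {"vaccine": 1.0, "clinic": 3.0}},
    {"tags": {"health", "policy", "economy"}, "terms": {"budget": 4.0}},
]


def add_into(total, terms):
    """Add one document's term weights into a running sum vector."""
    for term, weight in terms.items():
        total[term] = total.get(term, 0.0) + weight


def materialize(docs, max_tags=2):
    """Store (term-weight sum, document count) for every occurring tag set
    of size <= max_tags. This selection rule is an assumption made for
    illustration, not the strategy proposed in the paper."""
    cells = {}
    for doc in docs:
        for k in range(1, max_tags + 1):
            for combo in combinations(sorted(doc["tags"]), k):
                key = frozenset(combo)
                total, count = cells.get(key, ({}, 0))
                add_into(total, doc["terms"])
                cells[key] = (total, count + 1)
    return cells


def centroid(docs, cells, query_tags):
    """Centroid of the documents carrying all query tags.
    Uses a stored cell when one exists; otherwise scans the collection."""
    key = frozenset(query_tags)
    if key in cells:
        total, count = cells[key]
    else:
        total, count = {}, 0
        for doc in docs:
            if key <= doc["tags"]:
                add_into(total, doc["terms"])
                count += 1
    if count == 0:
        return {}
    return {term: weight / count for term, weight in total.items()}


if __name__ == "__main__":
    cells = materialize(DOCS, max_tags=2)
    # Two-tag query: answered from a stored cell.
    print(centroid(DOCS, cells, {"health", "policy"}))
    # Three-tag query: not stored, so computed by scanning.
    print(centroid(DOCS, cells, {"health", "policy", "economy"}))
```

In this sketch the two-tag query is answered from a stored sum and count, while the three-tag query falls back to a scan. That contrast is the trade-off a partial materialization has to balance: storing more cells speeds up more queries but costs more space.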