As the scale of image data grows, conventional local features such as SIFT become ineffective for representation and indexing, and more compact visual representations are required. Owing to its intrinsic mechanism, the state-of-the-art Vector of Locally Aggregated Descriptors (VLAD) has several limitations. To address them, we propose a new descriptor named Holons Visual Representation (HVR). The proposed HVR is a derivative, mutational, and self-contained combination of global and local information. It exploits both global characteristics and the statistical information of local descriptors in the image dataset. It also takes advantage of the local features of each image and computes their distribution with respect to the entire local descriptor space. Accordingly, the HVR is computed by a two-layer hierarchical scheme, which splits the local feature space to obtain raw partitions and the corresponding refined partitions. Then, according to the distances from the centroids of the partition spaces to the local features and their spatial correlation, we assign the local features to their nearest raw and refined partitions to obtain the global description of an image. Compared with VLAD, HVR preserves critical structural information and enhances the discriminative power of each individual representation at a small additional computational cost and with the same memory overhead. Extensive experiments on several benchmark datasets demonstrate that the proposed HVR outperforms conventional approaches in both scalability and retrieval accuracy for images with similar intra local information.
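
The two-layer assignment described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the array layout, the choice to sum residuals against the refined centroids within each raw partition (which keeps the output at the VLAD size of K x d), and the final L2 normalization are all assumptions made for illustration, and the spatial-correlation term mentioned in the abstract is omitted. The centroids are assumed to have been learned beforehand, for example by clustering a training set of local descriptors.

```python
import numpy as np

def nearest(points, centroids):
    """Index of the nearest centroid (squared Euclidean) for each point."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def two_layer_descriptor(local_feats, raw_centroids, refined_centroids):
    """Hypothetical two-layer aggregation (illustrative assumption only).

    local_feats:       (n, d) local descriptors of one image
    raw_centroids:     (K, d) centroids of the raw partitions
    refined_centroids: (K, R, d) centroids of the refined partitions,
                       R per raw partition
    Returns an L2-normalized vector of length K * d.
    """
    K, d = raw_centroids.shape
    desc = np.zeros((K, d))
    # Layer 1: assign each local feature to its nearest raw partition.
    raw_idx = nearest(local_feats, raw_centroids)
    for k in range(K):
        members = local_feats[raw_idx == k]
        if len(members) == 0:
            continue
        # Layer 2: within the raw partition, pick the nearest refined centroid
        # and accumulate the residual to it (assumed aggregation rule).
        ref_idx = nearest(members, refined_centroids[k])
        desc[k] = (members - refined_centroids[k][ref_idx]).sum(axis=0)
    desc = desc.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

# Usage with random stand-in data (128-D descriptors, K=4, R=3).
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 128))
raw = rng.normal(size=(4, 128))
refined = raw[:, None, :] + 0.1 * rng.normal(size=(4, 3, 128))
vec = two_layer_descriptor(feats, raw, refined)
```

Under these assumptions the descriptor has the same length as a VLAD vector built on K centroids, while each residual is measured against a finer, partition-specific centroid.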