Sub-image content-based similarity search is an important operation in current image archives, as it provides users with images that contain the query image as a part. Such a search can conveniently be implemented using the bag-of-features model, whose integral part is the construction of a visual vocabulary. Most existing algorithms for creating a visual vocabulary suffer from high computational costs (e.g., k-means) or require supervised guidance (e.g., visual-bit classifiers or sparse coding). In this paper, we propose a novel approach to visual vocabulary construction called the metric distance permutation vocabulary, which uses permutations of metric distances to create compact visual words. Its major advantage over prior techniques is the time and space efficiency of both vocabulary construction and the quantization process at query time, while achieving comparable or even better effectiveness (query result quality). Moreover, this basic concept is extended to combine multiple independent permutations. Both proposals are evaluated on well-known real-world datasets and compared with other state-of-the-art techniques.
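The core idea stated above, mapping a feature descriptor to a compact visual word through the permutation of its metric distances to a set of reference points, can be sketched as follows. This is a minimal illustration, not the paper's exact method: the use of Euclidean distance, random pivot selection, and the prefix length are all illustrative assumptions.

```python
import numpy as np

def permutation_word(descriptor, pivots, prefix_len=4):
    """Map a descriptor to a compact visual word.

    The word is a permutation prefix: the indices of the `prefix_len`
    pivots closest to the descriptor, ordered by increasing distance.
    Pivot choice, metric, and prefix length are illustrative here.
    """
    dists = np.linalg.norm(pivots - descriptor, axis=1)  # metric distances to pivots
    order = np.argsort(dists)                            # full permutation of pivot indices
    return tuple(order[:prefix_len])                     # compact visual word

# Example: 128-D SIFT-like descriptor quantized against 16 random pivots
rng = np.random.default_rng(0)
pivots = rng.normal(size=(16, 128))
descriptor = rng.normal(size=128)
word = permutation_word(descriptor, pivots)
```

Descriptors whose distance orderings to the pivots agree on the prefix fall into the same vocabulary cell, which requires no clustering or training; several such words obtained from independent pivot sets can then be combined, in the spirit of the extension mentioned above.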