“…By comparing with mean descriptors from top to down and at the same time scoring in each node, vocabulary tree based matching method can efficiently recognize objects in images due to the fact that visualization of vocabulary tree can be viewed geometrically as a nested voronoi graph, however, it is not an ideal tool to exactly match SIFT descriptors as k-d tree does. Dong et al [16] proposed an exact matching method that uses vocabulary tree to vote and then match features with selected keyframe, with which their method arrives at reliable matches even though the number of features increases. However, large amount of features in this method plays important role only in first voting stage, whereas fails to have any effect in second keyframe based matching stage, leading to the fact that it is only an image-image matching method in essence.…”