Benchmark datasets play an important role in evaluating remote sensing image retrieval (RSIR) methods. At present, several small-scale benchmark datasets for RSIR are publicly available on the web, most of them collected through the Google Map API or other desktop tools. Because the Google Map API requires users to have programming skills and the other collection tools are not publicly available, the opportunity for a wide range of volunteers to participate in generating large-scale benchmark datasets is limited. To address this challenge, we develop an open-access web-based tool, V-RSIR, that allows volunteers to easily participate in generating new benchmark datasets for RSIR. The tool not only facilitates remote sensing image labeling and cropping, but also provides image editing, review, quantity statistics, spatial distribution, and sharing, among other functions. To validate the tool, we recruit 32 volunteers to label and crop remote sensing images with it. As a result, a new benchmark dataset containing 38 classes with at least 1500 images per class is created. The new dataset is then evaluated with five handcrafted low-level feature methods and four deep learning high-level feature methods. The experimental results show that the handcrafted low-level feature methods perform worse than the deep learning methods, with the precision at top 5 of the latter reaching 94%. The evaluation results are consistent with our theoretical understanding and with experimental results on the PatternNet dataset. This indicates that our web-based tool can help users generate valid benchmark datasets for RSIR with the participation of volunteers.

INDEX TERMS Annotation tool, benchmark dataset, remote sensing image retrieval, volunteers, web-based tool.