Comparing the performance of CBIR (Content-Based Image Retrieval) algorithms is difficult. Private data sets are used so it is controversial to compare CBIR algorithms developed by different researchers. Also, the performance of CBIR algorithms is usually measured on an isolated, well-tuned PC or workstation. In a real-world environment, however, the CBIR algorithms would only constitute a minor component among the many interacting components needed to facilitate a useful CBIR application e.g., Web-based applications on the Internet. The Internet, being a shared medium, dramatically changes many of the usual assumptions about measuring CBIR performance. Any CBIR benchmark should be designed from a networked systems standpoint. Networked system benchmarks have been developed for other applications e.g., text retrieval, and relational database management. These benchmarks typically introduce communication overhead because the real systems they model are distributed applications e.g., an airline reservation system. The most common type of distributed computing architecture uses a client/server model. We present our implementation of a client/server CBIR benchmark called BIRDS-I (Benchmark for Image Retrieval using Distributed Systems over the Internet) to measure image retrieval performance over the Internet. The BIRDS-I benchmark has been designed with the trend toward the use of small personalized wireless-internet systems in mind. Web-based CBIR implies the use of heterogeneous image sets and this, in turn, imposes certain constraints on how the images are organized and the type of performance metrics that are applicable. Surprisingly, BIRDS-I only requires controlled human intervention for the compilation of the image collection and none for the generation of ground truth in the measurement of retrieval accuracy. Benchmark image collections need to be evolved incrementally toward the storage of millions of images and that scaleup can only be achieved through the use of computeraided compilation. Finally, the BIRDS-I scoring metric introduces a tightly optimized image-ranking window, which is important for the future benchmarking of large-scale personalized wireless-internet CBIR systems.