Wireless sensor networks (WSNs) have significant potential in diverse applications. In contrast to small-scale WSNs, real-world adoption of large-scale WSNs has been slow, largely because protocols at all levels lack robustness. To meet the pressing need for their experimental verification and evaluation, researchers have developed numerous WSN testbeds. While each WSN testbed contributes its own innovations, what is still missing is an analysis of the overall system architecture and methodologies that can lead to systematic advances. This paper provides a framework for reasoning about evolving WSN testbeds from an architectural perspective. We define three core requirements for WSN testbeds: scalability, flexibility, and efficiency. We then establish a taxonomy of WSN testbeds that represents the architectural design space as a hierarchy of design domains and associated design approaches. Through a comprehensive literature survey of prominent existing WSN testbeds, we examine best practices for each design approach in our taxonomy. Finally, we qualitatively evaluate how well WSN testbeds meet the core requirements by assessing the influence of each design approach on those requirements, and we suggest directions for future research.