Abstract-Research into wireless sensor networks is rapidly moving from simulations to realistic testbeds. The widely varying characteristics of these testbeds (e.g., radio hardware, number of nodes, topology) raise concerns about whether results obtained on one testbed remain valid on another. This paper presents empirical data from an experiment involving one application (Surge), two routing protocols (MultiHop and MintRoute), and two testbeds (MoteLab and MistLab). The outcome is mixed. On the positive side, when the data rate is increased, congestion causes goodput to fall off in a similar fashion on both testbeds. On the negative side, this holds only if we disregard MintRoute's poor behavior at low rates on MoteLab. Accounting for differences in communication hardware is necessary, but even then results should be taken with a grain of salt. This certainly holds for TOSSIM simulation results, which we found generally do not match those of physical testbeds.