Bridge detection in aerial images aims to determine whether a given aerial image contains one or more bridges and, if so, to locate them. However, the arbitrary orientations, extreme aspect ratios, and variable backgrounds of bridges make detection and localization challenging. In this paper, we tackle these problems by combining semantic-segmentation-based auxiliary supervision, a waterbody constraint, and instance-switching-based data augmentation. More precisely, we make three main contributions: (i) We propose an oriented bridge detection model with an auxiliary waterbody segmentation task, which serves as guidance for bridge localization. The network is designed in a cascade style to handle bridge detection and waterbody segmentation end-to-end. (ii) We use the semantic features of the waterbody as spatial attention to distinguish bridges from cluttered backgrounds, and then generate a waterbody segmentation map as the waterbody constraint, which introduces prior knowledge of bridge distribution to refine the network predictions. (iii) We propose a background-consistent instance switching method for online data augmentation to further improve the robustness of bridge detection. To verify the effectiveness of the proposed method, we introduce a dataset named BridgeDetV1, containing 5,000 well-annotated images with two kinds of bridge representations: the horizontal bounding box and the oriented bounding box. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods on this challenging benchmark. Dataset and code are available at https://github.com/whughw/BridgeDet.
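To make the idea of a waterbody constraint concrete, the following minimal sketch shows one way a predicted water mask could be used to down-weight oriented detections that have no water underneath them. It is an illustrative post-processing rule written under our own assumptions, not the mechanism used inside the proposed network: the function names `water_fraction` and `rescore`, the box encoding `(cx, cy, w, h, theta)` with `theta` in radians, and the parameters `samples`, `floor`, and `tau` are all hypothetical.

```python
import math


def water_fraction(mask, box, samples=8):
    """Fraction of in-image sample points of an oriented box that lie on water.

    mask: 2D list of 0/1 values indexed as mask[y][x] (1 = water).
    box:  (cx, cy, w, h, theta), theta in radians (assumed encoding).
    """
    cx, cy, w, h, theta = box
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    height, width = len(mask), len(mask[0])
    hits = total = 0
    for i in range(samples):
        for j in range(samples):
            # Cell-centred offsets in the box's local frame.
            u = ((i + 0.5) / samples - 0.5) * w
            v = ((j + 0.5) / samples - 0.5) * h
            # Rotate into image coordinates and snap to a pixel.
            x = math.floor(cx + u * cos_t - v * sin_t)
            y = math.floor(cy + u * sin_t + v * cos_t)
            if 0 <= x < width and 0 <= y < height:
                total += 1
                hits += mask[y][x]
    return hits / total if total else 0.0


def rescore(score, frac, floor=0.5, tau=0.25):
    """Scale a detection score by its water support.

    Boxes with at least `tau` water coverage keep their score; boxes with
    no water are scaled by `floor` (both values are illustrative).
    """
    support = min(1.0, frac / tau)
    return score * (floor + (1.0 - floor) * support)
```

For example, with a mask containing a vertical river, a box spanning the river keeps its score, while a box lying entirely on land is scaled by `floor`. A soft rescaling is used here rather than a hard filter because bridges over roads or dry riverbeds would otherwise be discarded outright.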