Synthetic aperture radar (SAR) ship detection has become increasingly important for marine monitoring in recent years. Because wide-swath SAR imagery lacks detailed information about ships, traditional methods struggle to find effective features for ship discrimination. Deep neural networks, with their strong feature representation ability, have recently made dramatic progress in object detection. However, most of them tend to miss small targets, so few can be applied directly to SAR ship detection. This paper proposes a deep hierarchical network for SAR ship detection, namely a contextual region-based convolutional neural network with multilayer fusion, which consists of a region proposal network (RPN) with high network resolution and an object detection network with contextual features. Instead of generating proposals from the low-resolution feature maps of a single layer, the RPN combines an intermediate layer with a downscaled shallow layer and an up-sampled deep layer to produce region proposals. In the object detection network, the region proposals are projected onto multiple layers with region of interest (ROI) pooling to extract the corresponding ROI features and the contextual features around each ROI. After normalization and rescaling, these features are concatenated into an integrated feature vector for the final outputs. The proposed framework fuses deep semantic features with shallow high-resolution features, which improves detection performance for small ships. The additional contextual features provide complementary information for classification and help to rule out false alarms. Experiments on the Sentinel-1 dataset, which contains twenty-seven SAR images with 7986 labeled ships, verify that the proposed method achieves excellent performance in SAR ship detection.
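The two ideas in the abstract, multilayer fusion for proposal generation and ROI features augmented with surrounding context, can be illustrated with a small NumPy sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: 2x2 max pooling for downscaling, nearest-neighbour up-sampling, channel concatenation as the way layers are "combined", a context box obtained by enlarging the ROI about its centre by a hypothetical `factor`, L2 normalization of each pooled feature as a whole, and pooling from a single fused map rather than from multiple layers. All function names are hypothetical.

```python
import numpy as np


def max_pool_2x(x):
    """Downscale a (C, H, W) map by 2 using 2x2 max pooling (H, W even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def upsample_2x(x):
    """Upsample a (C, H, W) map by 2 using nearest-neighbour repetition."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def fuse_layers(shallow, middle, deep):
    """Bring three layers to the middle layer's resolution and stack channels.

    Assumption: "combined" is modelled here as channel concatenation.
    """
    return np.concatenate(
        [max_pool_2x(shallow), middle, upsample_2x(deep)], axis=0
    )


def context_box(box, factor, height, width):
    """Enlarge an (x1, y1, x2, y2) box about its centre, clipped to the map."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w, half_h = factor * (x2 - x1) / 2.0, factor * (y2 - y1) / 2.0
    return (max(0.0, cx - half_w), max(0.0, cy - half_h),
            min(float(width), cx + half_w), min(float(height), cy + half_h))


def roi_max_pool(fmap, box, out=2):
    """Max-pool the box region of a (C, H, W) map to a (C, out, out) grid."""
    x1, y1, x2, y2 = box
    xs = np.linspace(x1, x2, out + 1).astype(int)
    ys = np.linspace(y1, y2, out + 1).astype(int)
    pooled = np.empty((fmap.shape[0], out, out))
    for i in range(out):
        for j in range(out):
            # Each bin spans at least one pixel so tiny boxes still pool.
            cell = fmap[:, ys[i]:max(ys[i + 1], ys[i] + 1),
                        xs[j]:max(xs[j + 1], xs[j] + 1)]
            pooled[:, i, j] = cell.max(axis=(1, 2))
    return pooled


def l2_normalize_and_scale(x, scale, eps=1e-12):
    """L2-normalise a feature tensor as a whole, then rescale it."""
    return scale * x / (np.linalg.norm(x) + eps)


def roi_with_context(fmap, box, factor=2.0, scale=1.0):
    """Concatenate normalised ROI and context features into one vector."""
    _, h, w = fmap.shape
    roi = roi_max_pool(fmap, box)
    ctx = roi_max_pool(fmap, context_box(box, factor, h, w))
    parts = [l2_normalize_and_scale(p, scale) for p in (roi, ctx)]
    return np.concatenate([p.ravel() for p in parts])


# Toy feature maps standing in for shallow, intermediate and deep layers.
rng = np.random.default_rng(0)
shallow = rng.random((4, 16, 16))
middle = rng.random((8, 8, 8))
deep = rng.random((16, 4, 4))

fused = fuse_layers(shallow, middle, deep)        # shape (28, 8, 8)
vec = roi_with_context(fused, (2, 2, 6, 4))       # shape (224,)
```

The normalization step matters because features pooled from different regions or layers can differ widely in magnitude; bringing each part to a common norm before concatenation keeps one part from dominating the integrated vector.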