A tidal flat is a long, narrow area along rivers and coasts with high sediment content, so the water body differs little in appearance from the background and its boundary is blurry. Existing water body extraction methods mostly target large water bodies such as rivers and lakes, and less attention has been paid to tidal flats. Accurately extracting tidal flat water bodies from high-resolution remote sensing imagery therefore remains a great challenge. To address the low accuracy of tidal flat water body extraction, we propose a fine-grained extraction method named FYOLOv3. FYOLOv3 consists of three main parts: an improved object detection network based on YOLOv3 (Seattle, WA, USA), a fully convolutional network (FCN) without pooling layers, and a similarity algorithm for water extraction. First, the improved object detection network uses 13 convolutional layers instead of Darknet-53 as its backbone, which maintains water detection accuracy while reducing time cost and alleviating overfitting. Second, the FCN without pooling layers learns semantic information to obtain accurate pixel values for the tidal flat water body. Finally, the similarity algorithm distinguishes water from non-water pixel by pixel to improve the extraction accuracy for tidal flat water bodies. Experiments show that our method extracts tidal flat water bodies from remote sensing images more accurately than other convolutional neural network (CNN) models: its IoU is 2.43% higher than that of YOLOv3 and 3.7% higher than that of U-Net (Freiburg, Germany).
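
For readers unfamiliar with the IoU (intersection over union) metric reported above, the following is a minimal sketch of how it is commonly computed for a pixel-wise water mask. It is illustrative only and is not taken from the FYOLOv3 code: the function name `iou`, the nested-list mask representation (1 = water, 0 = non-water), and the example masks are all assumptions made here.

```python
def iou(pred, truth):
    """Intersection over union of two equally sized binary masks (1 = water)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1  # pixel labelled water in both masks
            if p or t:
                union += 1         # pixel labelled water in at least one mask
    # Assumed convention: two entirely empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical 2 x 3 masks: predicted water pixels vs. reference water pixels.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

print(iou(pred, truth))  # 2 shared water pixels out of 4 in the union
```

A higher IoU means the predicted water pixels overlap the reference water pixels more closely, which is why it suits thin, blurry-edged targets where small boundary errors matter.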