Formation recognition is an important part of maritime target recognition, and automatic formation extraction and recognition support autonomous decision-making. However, few studies have addressed formation extraction as a step prior to recognition. This paper introduces a density-based spatial clustering of applications with noise (DBSCAN) method based on a Gaussian kernel to extract formation targets. On this basis, a depthwise separable convolutional neural network (DSCNN) method is proposed for formation recognition. A track simulation system is established to generate a track dataset containing three different proportions of clutter, and the formation extraction method is evaluated on this track dataset. Subsequently, an image dataset with eight different types of formation is constructed and, under various detection errors, the DSCNN method for formation recognition is compared with several typical deep learning methods. The experimental results show that the DBSCAN method based on a Gaussian kernel extracts formation targets accurately under different proportions of clutter; it is therefore highly robust and capable of effective formation extraction. Under different radar detection errors, the formation recognition accuracy of DSCNN ranges from 91.5% to 99.5%, an improvement of up to 12.5% over the other deep learning methods. The combination of DBSCAN and DSCNN can effectively realise formation extraction and recognition under different proportions of clutter in tracks and various radar detection errors.
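To make the extraction step concrete, the following is a minimal sketch of one way a Gaussian kernel could be combined with DBSCAN-style clustering to separate formation targets from clutter. It is an illustration under stated assumptions, not the method evaluated in this paper: the abstract does not specify how the kernel enters the algorithm, so here the usual neighbour count is simply replaced by a Gaussian-weighted density score. The function names (`gaussian_density`, `kernel_dbscan`), the parameters `sigma`, `eps` and `min_density`, and the toy coordinates are all hypothetical.

```python
import math


def gaussian_density(points, sigma):
    """Gaussian-weighted density score of each point (its own weight of 1 included)."""
    scores = []
    for xi, yi in points:
        score = 0.0
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            score += math.exp(-d2 / (2.0 * sigma ** 2))
        scores.append(score)
    return scores


def kernel_dbscan(points, eps, sigma, min_density):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise.

    Assumption: a point is a core point when its Gaussian density score
    reaches min_density, instead of the usual min_samples neighbour count.
    """
    n = len(points)
    density = gaussian_density(points, sigma)
    is_core = [d >= min_density for d in density]
    labels = [-1] * n

    def neighbours(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= eps]

    cluster = 0
    for i in range(n):
        if not is_core[i] or labels[i] != -1:
            continue
        # Grow a new cluster outwards from this unvisited core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            for q in neighbours(p):
                if labels[q] == -1:
                    labels[q] = cluster
                    # Only core points keep expanding; border points stop here.
                    if is_core[q]:
                        stack.append(q)
        cluster += 1
    return labels


# Toy data (hypothetical): five targets in a line formation plus four clutter points.
formation = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
clutter = [(20.0, 15.0), (-18.0, 9.0), (7.0, -22.0), (30.0, -5.0)]
points = formation + clutter

labels = kernel_dbscan(points, eps=1.5, sigma=1.0, min_density=1.5)
print(labels)  # [0, 0, 0, 0, 0, -1, -1, -1, -1]
```

In this toy case the five formation targets share one cluster label and the isolated clutter points are marked as noise (-1). The points carrying a cluster label would then be the ones passed on to the recognition stage.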