Information matching is a pivotal step in underwater environment detection and an indispensable component of collaborative detection and of research in areas such as distributed mapping. Progress on matching underwater side-scan sonar images, however, has been hindered by the low quality, intricate features, and susceptibility to distortion of commonly used side-scan sonar imagery. This article reviews advances in underwater sonar image processing and, building on the novel SchemaNet image topological-structure extraction model, introduces a feature matching model for side-scan sonar images. During training, the proposed approach uses a semantic segmentation network as a teacher model to distill a DeiT model and extracts the attention matrices of its intermediate-layer outputs, emulating SchemaNet's transformation to obtain high-dimensional topological-structure features from the image. We then construct a matching dataset from a real side-scan sonar dataset with data augmentation and train the model using a graph neural network. The resulting model performs effectively on side-scan sonar image matching tasks. These findings are significant for underwater detection and target recognition and offer useful insights and references for image processing in other domains.
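To make the pipeline concrete, the sketch below illustrates, in plain PyTorch, one possible way to turn an intermediate-layer attention matrix from a ViT/DeiT-style model into a sparse token graph and produce a graph-level embedding for matching two images. It is a minimal illustration under assumed tensor shapes, not the paper's implementation; the helper names (`attention_to_graph`, `gnn_embed`), the `top_k` edge-pruning rule, and the random placeholder tensors are all assumptions introduced for this example.

```python
# Illustrative sketch only (assumed shapes and names, not the authors' code):
# build a token graph from a DeiT intermediate-layer attention matrix and run
# one message-passing step, then score the similarity of two image embeddings.
import torch
import torch.nn.functional as F

def attention_to_graph(attn, top_k=8):
    """attn: (heads, N, N) attention weights from one intermediate layer.
    Returns a row-normalised (N, N) adjacency keeping each token's top_k edges."""
    a = attn.mean(dim=0)                        # average over heads -> (N, N)
    vals, idx = a.topk(top_k, dim=-1)           # keep strongest edges per token
    adj = torch.zeros_like(a).scatter_(-1, idx, vals)
    adj = 0.5 * (adj + adj.T)                   # symmetrise the graph
    deg = adj.sum(-1, keepdim=True).clamp(min=1e-6)
    return adj / deg                            # normalise by node degree

def gnn_embed(adj, feats, weight):
    """One graph-convolution-style step: aggregate neighbours, project, pool."""
    h = F.relu(adj @ feats @ weight)            # (N, d_out) node embeddings
    return F.normalize(h.mean(dim=0), dim=0)    # pooled graph-level embedding

# Toy usage with random tensors standing in for real DeiT attention/features.
heads, n_tokens, d = 6, 197, 384
w = torch.randn(d, 128) * 0.02
attn_a = torch.rand(heads, n_tokens, n_tokens)
attn_b = torch.rand(heads, n_tokens, n_tokens)
feat_a, feat_b = torch.randn(n_tokens, d), torch.randn(n_tokens, d)
emb_a = gnn_embed(attention_to_graph(attn_a), feat_a, w)
emb_b = gnn_embed(attention_to_graph(attn_b), feat_b, w)
print("match score:", torch.dot(emb_a, emb_b).item())
```

In practice the attention matrices would come from the distilled DeiT model rather than random tensors, and the single projection `w` would be replaced by the trained graph neural network described in the paper.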