Aerial target grouping is an important component of air battlefield situation awareness, and intelligent aerial target grouping is an urgent need for intelligent air combat. Traditional aerial target grouping methods suffer from difficulty in extracting air target features, difficulty in designing formation similarity measures, and a lack of data samples. To address these problems, this paper proposes a relation network-based aerial target grouping method that achieves end-to-end grouping under small-sample conditions. First, an adaptive clustering algorithm is proposed to preliminarily cluster the aerial targets in the air battlefield situation and obtain the groups to be tested. Then, a representation of aerial target groups is designed that unifies the temporal, spatial, and attribute features of a group. Next, based on this representation, a neural network structure combining a 3D convolutional neural network (3DCNN) and long short-term memory (LSTM) is designed to automatically extract the multi-dimensional features of groups. A relation module is then used to learn the distance metric between groups. Once the relation network is trained, the type of a group to be tested can be identified, thereby achieving grouping. A dataset of 30 types of aerial target group patterns was established through simulation. Ablation experiments were conducted to verify the performance of each module; the results show that the designed feature extraction module effectively extracts temporal sample features, and that the designed relation module performs better in extracting temporal sample information. Compared with several typical traditional aerial target grouping methods, the proposed method improves grouping accuracy by 11.2% to 13.4%, achieving an accuracy of 91.1% in the 5-way 1-shot task and 95.6% in the 5-way 5-shot task.
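To make the N-way K-shot identification step concrete, the following is a minimal sketch of how a query group could be assigned to a pattern type by relation scores. It is an illustration under stated assumptions, not the paper's implementation: the embeddings are assumed to have already been produced by a feature extractor, the support embeddings of each class are pooled by averaging (an assumption), and `stand_in_relation` is a fixed, hypothetical substitute for the learned relation module. The pattern names and vectors are invented for the example.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def class_embedding(support: Sequence[Vector]) -> Vector:
    """Pool the K support embeddings of one class by element-wise mean.

    Assumption: mean pooling is chosen here for illustration only.
    """
    k = len(support)
    return [sum(vals) / k for vals in zip(*support)]


def stand_in_relation(pair: Vector) -> float:
    """Hypothetical stand-in for a learned relation module.

    Receives the concatenation [class_embedding, query_embedding] and
    returns a score in (0, 1]; a trained network would replace this.
    """
    half = len(pair) // 2
    sq_dist = sum((a - b) ** 2 for a, b in zip(pair[:half], pair[half:]))
    return math.exp(-sq_dist)


def classify_query(
    support_sets: Dict[str, Sequence[Vector]],
    query: Vector,
    relation: Callable[[Vector], float] = stand_in_relation,
) -> str:
    """Return the class whose pooled embedding has the highest relation score."""
    scores = {
        label: relation(class_embedding(support) + query)  # list concatenation
        for label, support in support_sets.items()
    }
    return max(scores, key=scores.get)


# Toy 3-way 1-shot episode with made-up 2-D embeddings and pattern names.
support_sets = {
    "line": [[1.0, 0.0]],
    "wedge": [[0.0, 1.0]],
    "column": [[-1.0, 0.0]],
}
query = [0.9, 0.1]
predicted = classify_query(support_sets, query)
print(predicted)
```

In this toy episode the query lies closest to the "line" support embedding, so that label receives the highest score. With a learned relation module, the score would instead come from a network trained on episodes sampled from the group-pattern dataset.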