IntroductionAutomatically and accurately delineating the primary nasopharyngeal carcinoma (NPC) tumors in head magnetic resonance imaging (MRI) images is crucial for patient staging and radiotherapy. Inspired by the bilateral symmetry of head and complementary information of different modalities, a multi-modal neural network named BSMM-Net is proposed for NPC segmentation.MethodsFirst, a bilaterally symmetrical patch block (BSP) is used to crop the image and the bilaterally flipped image into patches. BSP can improve the precision of locating NPC lesions and is a simulation of radiologist locating the tumors with the bilateral difference of head in clinical practice. Second, modality-specific and multi-modal fusion features (MSMFFs) are extracted by the proposed MSMFF encoder to fully utilize the complementary information of T1- and T2-weighted MRI. The MSMFFs are then fed into the base decoder to aggregate representative features and precisely delineate the NPC. MSMFF is the output of MSMFF encoder blocks, which consist of six modality-specific networks and one multi-modal fusion network. Except T1 and T2, the other four modalities are generated from T1 and T2 by the BSP and DT modal generate block. Third, the MSMFF decoder with similar structure to the MSMFF encoder is deployed to supervise the encoder during training and assure the validity of the MSMFF from the encoder. Finally, experiments are conducted on the dataset of 7633 samples collected from 745 patients.Results and discussionThe global DICE, precision, recall and IoU of the testing set are 0.82, 0.82, 0.86, and 0.72, respectively. The results show that the proposed model is better than the other state-of-the-art methods for NPC segmentation. In clinical diagnosis, the BSMM-Net can give precise delineation of NPC, which can be used to schedule the radiotherapy.