Among primary bone cancers, osteosarcoma is the most common, peaking between the ages of a child’s rapid bone growth and adolescence. The diagnosis of osteosarcoma requires observing the radiological appearance of the infected bones. A common approach is MRI, but the manual diagnosis of MRI images is prone to observer bias and inaccuracy and is rather time consuming. The MRI images of osteosarcoma contain semantic messages in several different resolutions, which are often ignored by current segmentation techniques, leading to low generalizability and accuracy. In the meantime, the boundaries between osteosarcoma and bones or other tissues are sometimes too ambiguous to separate, making it a challenging job for inexperienced doctors to draw a line between them. In this paper, we propose using a multiscale residual fusion network to handle the MRI images. We placed a novel subnetwork after the encoders to exchange information between the feature maps of different resolutions, to fuse the information they contain. The outputs are then directed to both the decoders and a shape flow block, used for improving the spatial accuracy of the segmentation map. We tested over 80,000 osteosarcoma MRI images from the PET-CT center of a well-known hospital in China. Our approach can significantly improve the effectiveness of the semantic segmentation of osteosarcoma images. Our method has higher F1, DSC, and IOU compared with other models while maintaining the number of parameters and FLOPS.