As a special kind of signal processing technology, image processing has been developed rapidly after the appearance of convolutional neural network (CNN). At present, the semantic segmentation methods are all based on CNN and ignore the advantages of traditional image processing technology. We combine the two and make them promote each other. The high frequency component of the image represents the edge part and the low frequency represents the body part. Based on this assumption, we use Fourier transform to obtain the high and low frequency component from images. Then, a multibranch parallel network structure is designed, and the high and low frequency components are sent into two branches respectively to obtain the body and edge features. Finally, the two features are fused together through one deep feature fusion to obtain the final output. In this way, the edge information and body information are decoupled on original images and extracted separately, which not only ensures the consistency of the internal information within objects, but also strengthens the supervision of the edge part which is the most error prone area in semantic segmentation. The results on Cityscapes and KITTI fully demonstrate the effectiveness of our work.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.