It is very difficult for visually impaired people to perceive and avoid obstacles at a distance. To address this problem, a unified framework for multi-target detection, recognition, and fusion is proposed, built on a sensor fusion system comprising a low-power millimeter-wave (MMW) radar and an RGB-D sensor. In this paper, the Mask R-CNN and SSD networks are utilized to detect and recognize objects in the color images. The depth information of the obstacles is extracted from the depth images using the MeanShift algorithm. The position and velocity of multiple targets are measured by the millimeter-wave radar based on the frequency-modulated continuous-wave (FMCW) principle. Compared with using any single sensor, data fusion based on a Particle Filter yields more accurate state estimation and richer information by combining the detection results from the color images, the depth images, and the radar data. The experimental results show that the data fusion enriches the detection results and expands the effective detection range relative to using the RGB-D sensor alone. Moreover, the fusion results maintain high accuracy and stability under diverse range and illumination conditions. As a wearable system, the proposed sensor fusion system is versatile, portable, and cost-effective.
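To make the fusion step concrete, the following is a minimal one-dimensional sketch of particle-filter fusion of a radar range measurement with a depth-camera range measurement. The noise standard deviations, prior range, and particle count are illustrative assumptions, not values taken from the paper; the actual system fuses full position and velocity states.

```python
import math
import random

def gaussian_likelihood(z, x, sigma):
    # Likelihood of measurement z given true range x, Gaussian noise model
    return math.exp(-((z - x) ** 2) / (2 * sigma ** 2))

def particle_filter_fuse(z_radar, z_depth,
                         sigma_radar=0.5,   # assumed radar range noise (m)
                         sigma_depth=0.2,   # assumed depth-camera noise (m)
                         n_particles=1000,
                         prior_range=(0.0, 20.0)):
    # Draw particles uniformly over the plausible obstacle range
    particles = [random.uniform(*prior_range) for _ in range(n_particles)]
    # Weight each particle by the product of both sensor likelihoods
    weights = [gaussian_likelihood(z_radar, p, sigma_radar) *
               gaussian_likelihood(z_depth, p, sigma_depth)
               for p in particles]
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to the unweighted mean
        return sum(particles) / len(particles)
    weights = [w / total for w in weights]
    # Weighted-mean (minimum-mean-square-error) range estimate
    return sum(p * w for p, w in zip(particles, weights))

random.seed(0)
# Radar reports 5.3 m, depth camera reports 5.1 m for the same obstacle
est = particle_filter_fuse(z_radar=5.3, z_depth=5.1)
```

Because the depth camera is modeled as less noisy, the fused estimate is pulled toward its reading, which is the qualitative behavior the abstract describes: the fusion weighs each sensor by its reliability rather than averaging them blindly.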