Microscopic vision plays an important role in automated micro-assembly. However, uncertain factors in the assembly process, such as occlusion and stains, can cause errors in feature extraction. To address this problem, deep learning (DL) techniques are introduced into the feature recognition tasks, with a focus on attention mechanisms and the visualization of convolutional neural networks (CNNs) for DL-based microscopic vision. The main contributions are as follows. First, the CBAM attention mechanism is combined with the YOLOv5 algorithm to improve the accuracy and robustness of feature extraction. Micropart feature occlusion experiments show that, at a 70% degree of occlusion, YOLOv5-CBAM reaches 97.9% mAP@0.5, which is 4.6% higher than the original YOLOv5. Second, a visualization analysis of the DL-based model is conducted using Grad-CAM to make its decisions more transparent and to avoid potential visual detection risks during assembly. The matching degree between the ground-truth (GT) area and the highlighted area of the heatmap increases by 27.81% on average, which further verifies the effectiveness of the attention mechanism in micropart feature localization. Third, classification models based on ResNet50 are trained for micropart surface stains and droplet quality to replace manual sorting. The visual results are consistent with human visual judgment, confirming the reliability of part and droplet sorting.
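The abstract does not define how the matching degree between the GT area and the highlighted heatmap area is computed. As a minimal sketch, one plausible reading is the intersection over union (IoU) between the GT bounding box and the set of heatmap pixels above a threshold. The function name `matching_degree`, the 0.5 threshold, the inclusive box convention, and the toy heatmaps below are all assumptions made for illustration; they are not taken from the paper and do not reproduce its results.

```python
def matching_degree(heatmap, gt_box, threshold=0.5):
    """IoU between the GT box and the thresholded ("highlighted") heatmap region.

    heatmap: list of rows of floats normalized to [0, 1] (e.g. a Grad-CAM map).
    gt_box: (x_min, y_min, x_max, y_max) in pixel coordinates, inclusive.
    threshold: assumed cut-off above which a pixel counts as highlighted.
    """
    x_min, y_min, x_max, y_max = gt_box
    intersection = 0
    union = 0
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            highlighted = value >= threshold
            inside_gt = x_min <= x <= x_max and y_min <= y <= y_max
            if highlighted and inside_gt:
                intersection += 1
            if highlighted or inside_gt:
                union += 1
    # An empty union means there is no GT area and no highlighted pixel.
    return intersection / union if union else 0.0


# Toy 4x4 heatmaps (made-up values, for illustration only).
gt_box = (1, 1, 2, 2)
heatmap_baseline = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.6, 0.2, 0.0],
    [0.1, 0.2, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
heatmap_attention = [
    [0.0, 0.1, 0.1, 0.0],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.4, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

score_baseline = matching_degree(heatmap_baseline, gt_box)
score_attention = matching_degree(heatmap_attention, gt_box)
print(f"baseline: {score_baseline:.3f}, with attention: {score_attention:.3f}")
```

In this toy case the second heatmap concentrates its activation inside the GT box, so it scores higher. Other definitions, such as the fraction of highlighted pixels that fall inside the GT box, would be equally reasonable readings of the abstract.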