Robot control based on visual perception is an active research topic in industrial robotics, because it allows robots to perform a wider range of tasks in complex environments. However, the cluttered visual backgrounds typical of industrial settings make target recognition difficult, especially when the target is small or far from the sensor. Target recognition is therefore the first problem that a visual servo system must address. This paper considers the complex constraints common in industrial environments and proposes a machine-learning-based image processing algorithm built on a You Only Look Once Version 2 Region of Interest (YOLO-v2-ROI) neural network. The proposed algorithm combines the rapid detection of YOLO (You Only Look Once) with the effective identification provided by the ROI (Region of Interest) pooling structure, so it can quickly locate and identify different objects in different fields of view. The method also enables the robot vision system to recognize and classify a target object automatically, which improves the efficiency of the vision system, avoids blind movement, and reduces the computational load. The proposed algorithm is verified experimentally. The results show that it achieves real-time image-detection speed and demonstrates strong adaptability and recognition ability on images captured under complex conditions, such as varying backgrounds, lighting, or perspectives. In addition, the algorithm effectively identifies and locates visual targets, which improves the environmental adaptability of the visual servo system.
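
As an illustration of the ROI pooling idea referred to above, the following is a minimal sketch of generic ROI max pooling on a single-channel feature map. It is not the YOLO-v2-ROI network itself: the function name `roi_max_pool`, the end-exclusive `(row0, col0, row1, col1)` ROI convention, and the integer bin-splitting rule are assumptions made for this example only. The point it shows is that a region of arbitrary size is reduced to a fixed-size grid, so regions from different fields of view can feed the same classifier.

```python
from typing import List, Tuple

Grid = List[List[float]]


def roi_max_pool(feature_map: Grid,
                 roi: Tuple[int, int, int, int],
                 out_h: int,
                 out_w: int) -> Grid:
    """Max-pool one region of a 2-D feature map into an out_h x out_w grid.

    roi is (row0, col0, row1, col1) with exclusive end indices; this
    convention is an assumption of the sketch, not taken from the paper.
    """
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    if h < out_h or w < out_w:
        raise ValueError("ROI is smaller than the requested output grid")

    pooled: Grid = []
    for i in range(out_h):
        # Integer bin edges: the bins tile the ROI rows without gaps or overlap.
        rs = r0 + (i * h) // out_h
        re = r0 + ((i + 1) * h) // out_h
        row = []
        for j in range(out_w):
            cs = c0 + (j * w) // out_w
            ce = c0 + ((j + 1) * w) // out_w
            # Keep the strongest activation inside this bin.
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


# Toy 4 x 4 feature map with values 0..15, pooled to a fixed 2 x 2 output.
fmap = [[float(4 * r + c) for c in range(4)] for r in range(4)]
whole = roi_max_pool(fmap, (0, 0, 4, 4), 2, 2)
corner = roi_max_pool(fmap, (1, 1, 4, 4), 2, 2)
```

In this toy case a 4 x 4 region and a 3 x 3 region both produce a 2 x 2 output, which is the property that lets regions of different sizes share one downstream classification stage.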