Aerial manipulator systems offer active aerial operation capability, and incorporating additional sensors further enhances their autonomy. In this paper, we address the challenge of accurately positioning an aerial manipulator relative to its operational targets during tasks such as grasping and delivery in indoor environments without motion capture systems. We propose a vision-guided aerial manipulator system comprising a quadrotor UAV and a single-degree-of-freedom manipulator. First, the overall structure of the aerial manipulator is designed, and a hierarchical control system is established. We fuse LiDAR-based simultaneous localization and mapping (SLAM) with inertial measurement unit (IMU) data to improve the positioning accuracy of the aerial manipulator. Real-time target detection and recognition are achieved by combining a depth camera with a laser distance sensor, enabling the grasping pose of the aerial manipulator to be adjusted. Finally, we employ a segmented grasping strategy to position and grasp the target object precisely. Experimental results demonstrate that the designed aerial manipulator system maintains its attitude within ±5° during operation, and that its positional motion remains unaffected by orientation changes. The successful autonomous grasping of lightweight cylindrical objects in real-world scenarios verifies the effectiveness and soundness of the proposed system, demonstrating high operational efficiency and robust disturbance rejection.