Automated aerial refueling (AAR) poses unique challenges for computer vision systems. Aerial refueling maneuvers require high-precision, low-variance pose estimates. The performance of two stereoscopic (stereo) vision systems is quantified in ground tests specially designed to mimic AAR. In this experiment, the current vision system achieves three-dimensional (3-D) pose-estimation errors of 6 cm on a target 30 m away. Next, a novel computer vision pipeline is proposed that leverages a convolutional neural network (CNN) to efficiently generate a 3-D point cloud of the target object using stereo vision. Using the proposed approach, a high-fidelity 3-D point cloud can be generated from ultra-high-resolution imagery 11.3 times faster than with previous approaches.
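
To give a sense of why centimeter-level accuracy at 30 m is demanding for stereo vision, the following minimal sketch applies the standard rectified-stereo relations: depth Z = f·B/d and its first-order sensitivity to disparity error, ΔZ ≈ Z²/(f·B)·Δd. The focal length, baseline, and disparity error used here are hypothetical values chosen only for illustration; they are not the parameters of the systems evaluated in this work, and the function names are likewise assumptions.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Rectified pinhole stereo depth: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order depth uncertainty: dZ ~= Z**2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical rig (assumed values, NOT taken from the paper).
FOCAL_PX = 4000.0      # focal length in pixels
BASELINE_M = 0.5       # camera separation in meters
TARGET_RANGE_M = 30.0  # range quoted in the abstract

# Disparity that a point at the target range would produce on this rig.
disparity = FOCAL_PX * BASELINE_M / TARGET_RANGE_M
z = depth_from_disparity(FOCAL_PX, BASELINE_M, disparity)

# Depth error caused by an assumed 0.1 px disparity matching error.
err = depth_error(FOCAL_PX, BASELINE_M, z, 0.1)
print(f"depth = {z:.2f} m, depth error = {err * 100:.1f} cm")
```

Under these assumed values, a 0.1-pixel matching error already corresponds to roughly 4.5 cm of depth error at 30 m, and the error grows with the square of the range, which is one reason high-resolution imagery is attractive for this task.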