This paper introduces a simple method to achieve full-field and real-scale reconstruction using a movable binocular vision system (MBVS). The MBVS is composed of two cameras, one is called the tracking camera, and the other is called the working camera. The tracking camera is used for tracking the positions of the MBVS and the working camera is used for the 3D reconstruction task. The MBVS has several advantages compared with a single moving camera or multi-camera networks. Firstly, the MBVS could recover the real-scale-depth-information from the captured image sequences without using auxiliary objects whose geometry or motion should be precisely known. Secondly, the removability of the system could guarantee appropriate baselines to supply more robust point correspondences. Additionally, using one camera could avoid the drawback which exists in multi-camera networks, that the variability of a cameras’ parameters and performance could significantly affect the accuracy and robustness of the feature extraction and stereo matching methods. The proposed framework consists of local reconstruction and initial pose estimation of the MBVS based on transferable features, followed by overall optimization and accurate integration of multi-view 3D reconstruction data. The whole process requires no information other than the input images. The framework has been verified with real data, and very good results have been obtained.