The road networks provide key information for a broad range of applications such as urban planning, urban management, and navigation. The fast-developing technology of remote sensing that acquires high-resolution observational data of the land surface offers opportunities for automatic extraction of road networks. However, the road networks extracted from remote sensing images are likely affected by shadows and trees, making the road map irregular and inaccurate. This research aims to improve the extraction of road centerlines using both very-high-resolution (VHR) aerial images and light detection and ranging (LiDAR) by accounting for road connectivity. The proposed method first applies the fractal net evolution approach (FNEA) to segment remote sensing images into image objects and then classifies image objects using the machine learning classifier, random forest. A post-processing approach based on the minimum area bounding rectangle (MABR) is proposed and a structure feature index is adopted to obtain the complete road networks. Finally, a multistep approach, that is, morphology thinning, Harris corner detection, and least square fitting (MHL) approach, is designed to accurately extract the road centerlines from the complex road networks. The proposed method is applied to three datasets, including the New York dataset obtained from the object identification dataset, the Vaihingen dataset obtained from the International Society for Photogrammetry and Remote Sensing (ISPRS) 2D semantic labelling benchmark and Guangzhou dataset. Compared with two state-of-the-art methods, the proposed method can obtain the highest completeness, correctness, and quality for the three datasets. The experiment results show that the proposed method is an efficient solution for extracting road centerlines in complex scenes from VHR aerial images and light detection and ranging (LiDAR) data.