Urbanization poses significant challenges to sustainable development, disaster resilience, climate change mitigation, and environmental and resource management. Accurate urban extent datasets at large spatial scales are essential for researchers and policymakers to better understand urbanization dynamics and their socioeconomic drivers and impacts. High-resolution urban extent data products, including the Global Human Settlement Layer (GHSL), the Global Man-Made Impervious Surface (GMIS), the Global Human Built-Up and Settlement Extent (HBASE), and the Global Urban Footprint (GUF), have recently become available. Nevertheless, intermediate-resolution products, including SEDAC’s 1 km Global Rural-Urban Mapping Project (GRUMP), MODIS 1 km, and MODIS 500 m, still have many users, and a recent study has shown that data at around 500 m resolution are more appropriate for urbanization process analysis than data at higher resolutions (30 m). The objective of this study is to improve large-scale urban extent mapping at an intermediate resolution (500 m) by applying machine learning methods to complementary nighttime Visible Infrared Imaging Radiometer Suite (VIIRS) and daytime Moderate Resolution Imaging Spectroradiometer (MODIS) data, taking the conterminous United States (CONUS) as the study area. We explore the effectiveness of commonly used machine learning methods, including random forest (RF), gradient boosting machine (GBM), neural network (NN), and their ensemble (ESB).
Our results show that these machine learning methods achieve similarly high accuracies across all accuracy metrics (>95% overall accuracy, >98% producer’s accuracy, and >92% user’s accuracy), with Kappa coefficients greater than 0.90, a level not achieved by existing data products or previous studies. The ESB does not produce significantly better accuracies than the individual machine learning methods. GBM generates 14%, 16%, and 11% more total misclassifications than RF, NN, and ESB, respectively, and NN has the fewest total misclassifications. These results indicate that these machine learning methods, especially NN and RF, applied to the combination of VIIRS nighttime light and MODIS daytime normalized difference vegetation index (NDVI) data, can produce high-accuracy intermediate-resolution urban extent data products at large spatial scales. The methodology has the potential to be applied to annual continental-to-global scale urban extent mapping at intermediate resolutions.
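
To make the accuracy metrics reported above concrete, the following minimal Python sketch computes overall accuracy, producer's accuracy, user's accuracy, and the Kappa coefficient for the urban class from a two-class (urban / non-urban) confusion matrix. The function name and the pixel counts are hypothetical and chosen only for illustration; they are not taken from this study.

```python
def accuracy_metrics(tp, fn, fp, tn):
    """Accuracy metrics for the urban class from a 2x2 confusion matrix.

    tp: reference urban,     mapped urban
    fn: reference urban,     mapped non-urban (omission error)
    fp: reference non-urban, mapped urban     (commission error)
    tn: reference non-urban, mapped non-urban
    """
    n = tp + fn + fp + tn

    # Overall accuracy: share of all samples that were mapped correctly.
    overall = (tp + tn) / n

    # Producer's accuracy: share of reference urban samples mapped as urban.
    producers = tp / (tp + fn)

    # User's accuracy: share of mapped urban samples that are truly urban.
    users = tp / (tp + fp)

    # Expected chance agreement, from the row and column totals.
    p_chance = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)

    # Kappa: agreement beyond chance, scaled by the maximum possible.
    kappa = (overall - p_chance) / (1 - p_chance)

    return {
        "overall_accuracy": overall,
        "producers_accuracy": producers,
        "users_accuracy": users,
        "kappa": kappa,
    }


# Hypothetical validation counts (illustrative only, not results of this study).
metrics = accuracy_metrics(tp=490, fn=10, fp=30, tn=470)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, the overall accuracy is 0.96, the producer's accuracy is 0.98, the user's accuracy is about 0.94, and Kappa is 0.92. Total misclassifications in this setting correspond to `fn + fp`, the sum of omission and commission errors.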