The increase of built-up areas related to urban sprawl requires reliable monitoring methods, and remote sensing can be an efficient technique. Aerial surveys, with their high spatial resolution, provide detailed data for building monitoring, but archive images usually contain only visible bands. We aimed to assess the efficiency of visible-band orthophotographs and photogrammetric dense point clouds in building detection with segmentation-based machine learning (five algorithms), using visible bands, texture information, and spectral and morphometric indices in different variable sets. Random forest (RF) usually had the best overall accuracy (99.8%) and partial least squares the worst (~60%). We found that >95% accuracy can be attained even at class level. Recursive feature elimination (RFE) was an efficient variable selection tool: with six variables it performed similarly to the model using all 31 available variables. Morphometric indices yielded 82% producer’s accuracy (PA) and 85% user’s accuracy (UA), and, when combined with spectral and texture indices, they contributed most to the improvement. However, morphometric indices are not always available; in that case, adding texture and spectral indices to the red-green-blue (RGB) bands improved the PA by 12% and the UA by 6%. Building extraction from visible-band aerial surveys can be accurate, and archive images can be included in the time series of a monitoring program.
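For readers unfamiliar with the class-level metrics reported here, the following is a minimal sketch of how producer's accuracy (PA), user's accuracy (UA), and overall accuracy are conventionally derived from a confusion matrix. The function names, the class labels, and the counts are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
def producer_user_accuracy(matrix, labels):
    """Return {label: (PA, UA)} for a square confusion matrix.

    Assumed layout: matrix[i][j] is the number of reference samples
    of class i that the classifier mapped as class j.
    """
    result = {}
    for k, label in enumerate(labels):
        correct = matrix[k][k]
        reference_total = sum(matrix[k])               # row sum: all reference samples of the class
        mapped_total = sum(row[k] for row in matrix)   # column sum: all samples mapped as the class
        # PA = 1 - omission error; UA = 1 - commission error
        pa = correct / reference_total if reference_total else float("nan")
        ua = correct / mapped_total if mapped_total else float("nan")
        result[label] = (pa, ua)
    return result


def overall_accuracy(matrix):
    """Share of all samples that lie on the diagonal of the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


# Hypothetical counts for illustration only (not results from the study).
labels = ["building", "other"]
matrix = [
    [45, 5],     # reference "building": 45 mapped as building, 5 as other
    [10, 140],   # reference "other": 10 mapped as building, 140 as other
]

for label, (pa, ua) in producer_user_accuracy(matrix, labels).items():
    print(f"{label}: PA={pa:.3f}, UA={ua:.3f}")
print(f"overall accuracy: {overall_accuracy(matrix):.3f}")
```

PA reflects how many of the reference buildings were found, while UA reflects how many of the mapped buildings are real, which is why the two can differ for the same class.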