High-resolution remote sensing (HRRS) images, when used for building detection, play a key role in urban planning and other fields. Compared with the deep learning methods, the method based on morphological attribute profiles (MAPs) exhibits good performance in the absence of massive annotated samples. MAPs have been proven to have a strong ability for extracting detailed characterizations of buildings with multiple attributes and scales. So far, a great deal of attention has been paid to this application. Nevertheless, the constraints of rational selection of attribute scales and evidence conflicts between attributes should be overcome, so as to establish reliable unsupervised detection models. To this end, this research proposes a joint optimization and fusion building detection method for MAPs. In the pre-processing step, the set of candidate building objects are extracted by image segmentation and a set of discriminant rules. Second, the differential profiles of MAPs are screened by using a genetic algorithm and a cross-probability adaptive selection strategy is proposed; on this basis, an unsupervised decision fusion framework is established by constructing a novel statistics-space building index (SSBI). Finally, the automated detection of buildings is realized. We show that the proposed method is significantly better than the state-of-the-art methods on HRRS images with different groups of different regions and different sensors, and overall accuracy (OA) of our proposed method is more than 91.9%.