The binary partition tree (BPT) is a hierarchical data structure that models the content of an image in a multiscale way. In particular, a cut of the BPT of an image provides a segmentation, i.e. a partition of the image support. Building a BPT thus dramatically reduces the search space for segmentation purposes, based on intrinsic (image signal) and extrinsic (construction metric) information. A large body of literature has been devoted to the construction of such metrics, and to the associated choice of criteria (spectral, spatial, geometric, etc.) for building relevant BPTs, in particular in the challenging context of remote sensing. Surprisingly, however, few works have been dedicated to evaluating the quality of BPTs, i.e. their ability to subsequently provide a satisfactory segmentation. In this paper, we propose a framework for BPT quality evaluation, in a supervised paradigm: we assume that ground-truth segments are provided by an expert, possibly with a semantic labelling and a given uncertainty. We then describe local evaluation metrics, strategies for fitting BPT nodes to ground-truth segments, and the computation of a global quality score that takes semantic information into account, leading to a complete evaluation framework. This framework is illustrated in the context of BPT segmentation of multispectral satellite images.
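To make the notions of BPT node, cut, and node / ground-truth fitting concrete, the following minimal Python sketch may help. It is an illustration under stated assumptions, not the framework proposed in the paper: the `Node` class, the helper names (`merge`, `all_nodes`, `is_cut`, `best_fit`), and the use of the Jaccard index as local fitting metric are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class Node:
    """One BPT node: the pixel indices it covers and its two children (None for leaves)."""
    pixels: FrozenSet[int]
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def merge(a: Node, b: Node) -> Node:
    """Create the parent of two nodes; its region is the union of theirs."""
    return Node(a.pixels | b.pixels, a, b)


def all_nodes(root: Node) -> List[Node]:
    """Collect every node of the tree by depth-first traversal."""
    out, stack = [], [root]
    while stack:
        n = stack.pop()
        out.append(n)
        if n.left is not None:  # internal nodes always have two children
            stack.extend([n.left, n.right])
    return out


def is_cut(root: Node, nodes: List[Node]) -> bool:
    """Check that the given tree nodes are pairwise disjoint and cover the image support.

    Assumes every element of `nodes` belongs to the tree rooted at `root`.
    """
    covered = [p for n in nodes for p in n.pixels]
    return len(covered) == len(set(covered)) and set(covered) == set(root.pixels)


def jaccard(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Jaccard index |a & b| / |a | b| (assumed here as the local fitting metric)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def best_fit(root: Node, segment: FrozenSet[int]) -> Node:
    """Return the node whose region best matches a ground-truth segment."""
    return max(all_nodes(root), key=lambda n: jaccard(n.pixels, segment))


# Toy example: a 4-pixel image, merged pairwise into a 7-node BPT.
leaves = [Node(frozenset({i})) for i in range(4)]
n01 = merge(leaves[0], leaves[1])
n23 = merge(leaves[2], leaves[3])
root = merge(n01, n23)

# The nodes {0, 1}, {2}, {3} form a cut, hence a segmentation of the image.
print(is_cut(root, [n01, leaves[2], leaves[3]]))
# The ground-truth segment {2, 3} is matched exactly by one node of the tree.
print(sorted(best_fit(root, frozenset({2, 3})).pixels))
```

In this toy setting, a ground-truth segment such as {0, 1, 2} has no exactly matching node, which is the kind of situation a BPT quality score is meant to quantify.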