Selecting an appropriate feature set is crucial for the efficient analysis of any media collection. Feature selection generally depends strongly on the data and commonly requires expert knowledge and prior experiments in related application scenarios. Current unsupervised feature selection methods usually operate on single feature components and ignore the relationships among the components of multi-dimensional features (group features). In most applications, features carry little semantics, so it matters little whether a feature set consists of complete features or of a selection of single feature components. In some domains, however, such as content-based audio retrieval, features are designed so that they carry considerable semantic meaning as a whole. Breaking up a group feature in such application scenarios impedes the interpretability of the results. In this paper, we propose an unsupervised group feature selection algorithm based on canonical correlation analysis (CCA). Experiments in several audio and video classification scenarios demonstrate the outstanding performance of the proposed approach and its robustness across different datasets.
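
To make the idea of comparing group features as whole units concrete, the following minimal sketch computes canonical correlations between multi-dimensional feature groups with NumPy. It only illustrates the CCA building block and is not the selection algorithm proposed in this paper; the function name `canonical_correlations`, the synthetic groups, the mixing matrix, and the noise level are all assumptions introduced for this example.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between two feature groups.

    X and Y hold one sample per row and one feature component per column.
    Assumes both groups have full column rank (illustrative helper only).
    """
    # Center each group so that feature means do not affect the result.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two groups.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Synthetic data (assumption): three hypothetical group features.
rng = np.random.default_rng(0)
n = 500
group_a = rng.normal(size=(n, 3))
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.5],
                   [0.5, 0.0, 1.0]])
# group_b is almost a linear transform of group_a, so it adds little information.
group_b = group_a @ mixing + 0.01 * rng.normal(size=(n, 3))
# group_c is drawn independently of group_a.
group_c = rng.normal(size=(n, 2))

redundant = canonical_correlations(group_a, group_b)
unrelated = canonical_correlations(group_a, group_c)
print("a vs b:", redundant)
print("a vs c:", unrelated)
```

Canonical correlations close to 1 indicate that one group is largely predictable from the other, so a group-level selector could treat the pair as redundant without splitting either feature into single components.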