In this paper, we introduce a realistic and challenging multi-source, multi-room acoustic environment and an improved algorithm for estimating source-dominated microphone clusters in acoustic sensor networks. The proposed clustering method uses a single microphone per node and is based on unsupervised clustered federated learning with a lightweight autoencoder model. We present an improved clustering control strategy that takes the variability of the acoustic scene into account and allows a dynamic range of clusters to be estimated from reduced amounts of training data. The proposed approach is optimized using clustering-based measures and validated via a network-wide classification task.