Mean Shift is widely used today for mode detection and clustering. In practice, however, the technique is limited by its assumptions of isotropy and homoscedasticity. Isotropic (scalar) bandwidths tend to smooth anisotropic patterns and distort partition boundaries, while homoscedastic (global) bandwidths are inappropriate when clusters (or modes) at different scales need to be identified. We present an adaptive Mean Shift methodology that allows for anisotropic clustering through unsupervised local bandwidth selection. The bandwidth matrices evolve naturally, adapting locally through agglomeration and in turn guiding further agglomeration. The online methodology is practical for low-dimensional feature spaces, better preserving detail and clustering salience. Additionally, conventional Mean Shift either depends critically on a per-instance choice of bandwidth, or relies on offline methods that are inflexible and/or again specific to the data instance. Owing to its adaptive design, the presented approach also alleviates this issue, with a default form that generally performs well. The methodology nevertheless allows for effective tuning of results.

In the proposed approach, clusters arise on the fly as a consequence of the agglomeration of extant clusters. Each cluster is associated with a local bandwidth that evolves anisotropically at every iteration; by design, all members of a cluster converge to the same local mode. Because they evolve as a function of a cluster's aggregated trajectory points, these bandwidths are able to adapt to the underlying mode structure (shape, scale, orientation) and in turn guide future cluster trajectories and agglomeration. This results in robust mode detection with increased partition saliency (Figs. 1, 2(a)). The supplementary material presents a convergence proof for the case where anisotropic bandwidths vary between Mean Shift iterations, as they do here.
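To make the role of a full (anisotropic) bandwidth matrix concrete, the following is a minimal sketch of a Gaussian-kernel Mean Shift fixed-point iteration with a matrix-valued bandwidth. It is an illustration under stated assumptions, not the proposed method: the bandwidth is held fixed (set here to the sample covariance, an arbitrary choice), there is no agglomeration or bandwidth evolution, and the names `mean_shift_step` and `find_mode` are hypothetical.

```python
import numpy as np


def mean_shift_step(x, data, bandwidth):
    """One Gaussian-kernel Mean Shift update of x using a full bandwidth matrix."""
    inv = np.linalg.inv(bandwidth)
    diff = data - x  # shape (n, d)
    # Squared Mahalanobis distance of every data point from x.
    d2 = np.einsum("ij,jk,ik->i", diff, inv, diff)
    weights = np.exp(-0.5 * d2)
    # Kernel-weighted mean of the data is the next iterate.
    return weights @ data / weights.sum()


def find_mode(x0, data, bandwidth, max_iter=200, tol=1e-8):
    """Iterate the fixed-point update until the shift falls below tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = mean_shift_step(x, data, bandwidth)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x


# Synthetic elongated cluster (assumed data, for illustration only).
rng = np.random.default_rng(0)
data = rng.multivariate_normal([0.0, 0.0], [[4.0, 1.5], [1.5, 1.0]], size=500)

# Assumption: use the sample covariance as a fixed anisotropic bandwidth.
bandwidth = np.cov(data.T)
mode = find_mode([3.0, -1.0], data, bandwidth)
print("estimated mode:", mode)
```

Because the kernel's level sets follow the bandwidth matrix, points along the cluster's long axis receive comparable weight to points an equal Mahalanobis distance away along the short axis; a scalar bandwidth would weight by Euclidean distance instead.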
The approach runs Mean Shift fixed-point iterations at the cluster level, over a single data point per cluster. Starting out as trivial clusters (solitary data points), the clusters agglomerate between iterations. By design of the algorithm, clusters are merged only when they are tending towards the same mode. All member points of a cluster u, which will eventually converge to a common local mode, share a common bandwidth, Σ_u, referred to as the local bandwidth. This bandwidth evolves at every iteration, adapting to the structure of the local mode and, to an extent, its basin. The standard Mean Shift fixed-point iteration is reformulated, through a decomposition based on local bandwidths, as a fixed-point update over clusters:

To ascertain cluster merges, the data points in the vicinity of a cluster u's trajectory, u_τ, are considered. If a data point y in the vicinity of u_τ is ascertained to be heading to the same mode as u_τ, then, by transitivity, all the members of its parent cluster, Π(y), are heading to that mode too; the clusters u and Π(y) can then be merged. The cluster which is higher up the mode (higher density) assimilates the other cluster into itself, thus accelera...