The Survival of the Fittest is a principle which selects the superior and eliminates the inferior in the nature. This principle has been used in many fields, especially in optimization problem-solving. Clustering in data mining community endeavors to discover unknown representations or patterns hidden in datasets. Hierarchical clustering algorithm (HCA) is a method of cluster analysis which searches the optimal distribution of clusters by a hierarchical structure. Strategies for hierarchical clustering generally have two types: agglomerative with a bottom-up procedure and divisive with a top-down procedure. However, most of the clustering approaches have two disadvantages: the use of distance-based measurement and the difficulty of the clusters integration. In this paper, we propose an optimal probabilistic estimation (OPE) approach by exploiting the Survival of the Fittest principle. We devise a hierarchical clustering algorithm (HCA) based on OPE, also called OPE-HCA. The OPE-HCA combines optimization with probability and agglomerative HCA. Experimental results show that the OPE-HCA has the ability of searching and discovering patterns at different description levels and can also obtain better performance than many clustering algorithms according to NMI and clustering accuracy measures.