Opinion evolution generally leads to one of three outcomes: global neutral consensus, fragmentation into more than two distinct clusters across the opinion spectrum, or polarization into two opposing camps. However, the formation of "harmony with diversity", a widely observed state in which individuals freely express a range of viewpoints while shared values maintain social coherence and prevent an ideological split, remains unclear, as do its dynamic features, because a general modeling framework that generates this state is still lacking. To address this issue, we develop an attraction-repulsion model based on the simple assumption that individuals tend either to reach agreement with those holding similar opinions or to amplify their difference from those holding distant opinions. This allows us to take into account three core parameters: interaction strength, individuals' susceptibility, and tolerance to others' opinions. We consider both time-varying topology and fixed interactions imposed by a static social network, and we also examine the case of heterogeneous individual attributes. Remarkably, the simple model rules generate three of the phases described above (global neutral consensus, polarization, and "harmony with diversity", but not fragmentation), along with three different transitions and the triple points, regardless of whether the interactions are time-varying or fixed. We find that sufficient susceptibility, intermediate interaction strength, and high tolerance favor a balance between repulsive and attractive forces, which leads to the emergence of "harmony with diversity". However, fixed interactions can introduce a cluster-level self-reinforcing mechanism that unexpectedly promotes polarization. Heterogeneous susceptibility or tolerance turns out to be an inhibiting factor for this state.
A method that identifies the phase boundaries by computing the maximum susceptibility of the opinion entropy, confirmed by numerical simulations, allows us to build phase diagrams and to locate the triple points. These findings offer insights that can inform further empirical analysis of societal diversity and coherence.
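
As an illustration of the boundary-detection idea, the following minimal Python sketch computes an opinion entropy and a fluctuation-style susceptibility. It rests on assumptions that the abstract does not state: opinions are taken to lie in [-1, 1], the entropy is taken to be the normalized Shannon entropy of a binned opinion histogram, and the susceptibility is taken to be the number of agents times the variance of the entropy across independent runs. The function names and the bin count are hypothetical and are not the authors' definitions.

```python
import numpy as np


def opinion_entropy(opinions, bins=20, lo=-1.0, hi=1.0):
    """Normalized Shannon entropy of a binned opinion distribution.

    Assumption: opinions live in [lo, hi]; values outside are clipped.
    Returns 0 when all agents share one bin, 1 when all bins are equally filled.
    """
    x = np.clip(np.asarray(opinions, dtype=float), lo, hi)
    counts, _ = np.histogram(x, bins=bins, range=(lo, hi))
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum() / np.log(bins))


def entropy_susceptibility(entropy_samples, n_agents):
    """Assumed definition: N times the variance of the entropy across runs."""
    s = np.asarray(entropy_samples, dtype=float)
    return float(n_agents * s.var())


def locate_boundary(param_values, entropy_runs, n_agents):
    """Return the parameter value at which the susceptibility is largest.

    entropy_runs[k] holds the final entropies from independent runs
    performed at param_values[k].
    """
    chi = [entropy_susceptibility(runs, n_agents) for runs in entropy_runs]
    return param_values[int(np.argmax(chi))], chi
```

In this sketch, a phase boundary would be estimated by sweeping one model parameter, collecting the final entropy over several independent runs at each value, and passing the results to `locate_boundary`; the peak of the susceptibility marks the estimated transition.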