Graph learning is increasingly applied to image clustering to reveal intra-class and inter-class relationships in data. However, existing graph learning-based image clustering methods group images under a single view, which under-utilises the information available in the data. To address this, we propose a self-supervised multi-view image clustering technique based on contrastive heterogeneous graph learning. Our method computes a heterogeneous affinity graph for multi-view image data. It performs Local Feature Propagation (LFP) to reason over the local neighbourhood of each node, and Influence-aware Feature Propagation (IFP) from each node to its influential node to learn the clustering intention. The proposed framework pioneers the use of two contrastive objectives: the first contrasts and fuses multiple views into the overall LFP embedding, and the second maximises the mutual information between the LFP and IFP representations. We conduct extensive experiments on benchmark datasets for the problem, namely COIL-20, Caltech7 and CASIA-WebFace. Our evaluation shows that our method outperforms state-of-the-art methods, including the popular techniques MVGL, MCGC and HeCo.
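
To make the second contrastive objective more concrete, the following is a minimal sketch of an InfoNCE-style loss, a commonly used lower-bound surrogate for mutual information between two sets of embeddings. The abstract does not state which estimator is used, so the choice of InfoNCE, the function names `l2_normalise` and `info_nce`, the temperature value, and the toy vectors standing in for LFP and IFP embeddings are all assumptions made purely for illustration.

```python
import math


def l2_normalise(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def info_nce(anchors, positives, temperature=0.5):
    """Mean InfoNCE loss over nodes.

    Row i of `positives` is treated as the positive for row i of `anchors`;
    every other row of `positives` acts as a negative. The temperature is a
    hypothetical default, not a value taken from the paper.
    """
    a = [l2_normalise(v) for v in anchors]
    p = [l2_normalise(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in p]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


# Toy stand-ins for per-node LFP and IFP embeddings (illustrative values only).
lfp = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
ifp_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]
ifp_shuffled = [ifp_aligned[1], ifp_aligned[2], ifp_aligned[0]]

loss_aligned = info_nce(lfp, ifp_aligned)
loss_shuffled = info_nce(lfp, ifp_shuffled)
print(loss_aligned < loss_shuffled)
```

In this sketch the loss is lower when each node's two representations agree than when they are mismatched, which is the behaviour a mutual-information-maximising objective would be expected to reward. The first objective (contrasting and fusing views) could plausibly reuse the same loss across pairs of views, though that too is an assumption rather than something the abstract specifies.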