In this paper, we investigate the problem of unsupervised deep representation learning and propose a novel framework, called Deep Self-representative Concept Factorization Network (DSCF-Net), for clustering deep features. To improve representation and clustering ability, DSCF-Net explicitly aims to discover hidden deep semantic features, to make the deep factorization robust to noise, and to preserve the local manifold structure of the deep features. Specifically, DSCF-Net integrates robust deep concept factorization, deep self-expressive representation, and adaptive locality-preserving feature learning into a unified framework. To discover hidden deep representations, DSCF-Net uses a hierarchical factorization architecture built from multiple layers of linear transformations, in which the basis concepts of each layer are optimized so that the representation is improved indirectly. To improve robustness, DSCF-Net first performs subspace recovery to correct sparse errors and then carries out the deep factorization in the recovered visual subspace. To obtain locality-preserving representations, we also present an adaptive deep self-representative weighting strategy that uses the coefficient matrix as adaptive reconstruction weights, which keeps the locality of the representations. Extensive comparisons with several related models show that DSCF-Net delivers state-of-the-art performance on several public databases.
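
For orientation, the building block that the framework deepens can be sketched in a few lines. The code below is a minimal illustration of classical single-layer concept factorization, X ≈ X W Vᵀ with nonnegative W and V, fitted by the standard multiplicative updates. It is not the DSCF-Net algorithm: the hierarchical layers, the sparse error correction, and the self-representative weighting are all omitted, and the function names, the rank, and the iteration count are hypothetical choices made only for this sketch.

```python
import numpy as np


def concept_factorization(X, rank, n_iter=200, eps=1e-10, seed=0):
    """Single-layer concept factorization (illustrative sketch only).

    X is a nonnegative (d, n) matrix whose columns are samples.
    Returns W and V, both of shape (n, rank), such that X ~ X @ W @ V.T.
    The columns of X @ W play the role of the basis concepts, and the
    rows of V are the new representations of the samples.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    K = X.T @ X  # (n, n) Gram matrix; the updates only need inner products
    W = rng.random((n, rank))
    V = rng.random((n, rank))
    for _ in range(n_iter):
        # Multiplicative updates keep W and V nonnegative when K is nonnegative.
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W) / (V @ (W.T @ K @ W) + eps)
    return W, V


def reconstruction_error(X, W, V):
    """Frobenius norm of the residual X - X W V^T."""
    return np.linalg.norm(X - X @ W @ V.T)
```

In a clustering setting one would typically run a method such as k-means on the rows of `V`. A deep variant in the spirit of the abstract would stack several such factorizations, but the exact layer-wise formulation is defined in the body of the paper, not here.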