In extreme learning machines (ELMs), the hidden-node parameters are randomly generated and the output weights are computed analytically. To overcome the limited feature-extraction ability of the shallow ELM architecture, hierarchical ELMs have been extensively studied as deep, multilayer neural network architectures. However, the commonly used mean square error (MSE) criterion is very sensitive to the outliers and impulsive noise that are common in real-world data. In this paper, we use correntropy to improve the robustness of the multilayer ELM and to obtain sparser representations. Correntropy, a nonlinear measure of similarity, is robust to outliers and can approximate different norms (from ℓ0 to ℓ2). A new full correntropy-based multilayer extreme learning machine (FC-MELM) algorithm is proposed to classify datasets corrupted by impulsive noise or outliers. The contributions of this paper are threefold: (1) The MSE-based reconstruction loss is replaced by a correntropy-based loss function, which enhances the robustness of ELM-based multilayer algorithms. (2) The traditional ℓ1-based sparsity penalty term is replaced by a correntropy-based sparsity penalty term, which further improves performance by yielding a sparser representation of the data. Combining (1) and (2) gives the correntropy-based ELM autoencoder. (3) The FC-MELM is built by using the correntropy-based ELM autoencoder as a building block. Notably, the FC-MELM is trained in a forward manner, so no fine-tuning procedure is required. The FC-MELM therefore has a clear advantage in learning efficiency over traditional deep learning algorithms.
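To make the two ingredients named above concrete, the following is a minimal sketch and not the FC-MELM algorithm itself. It shows a basic single-layer ELM (random hidden parameters, closed-form output weights) and compares the MSE with a correntropy-induced loss. The Gaussian kernel form, the kernel width `sigma`, the `tanh` activation, the ridge term `reg`, and all function names are assumptions chosen for illustration; none of them is taken from the paper.

```python
import numpy as np


def elm_fit(X, T, n_hidden=50, reg=1e-3, seed=0):
    """Basic ELM: random hidden layer, output weights solved in closed form.

    The ridge term `reg` and the tanh activation are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.standard_normal(n_hidden)                # random biases, never trained
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    # Regularized least squares: beta = (H^T H + reg * I)^{-1} H^T T
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta


def mse_loss(e):
    """Mean square error of a vector of residuals."""
    return float(np.mean(e ** 2))


def correntropy_loss(e, sigma=1.0):
    """Correntropy-induced loss with a Gaussian kernel (assumed form).

    Each residual contributes at most 1, so a single outlier has bounded effect.
    """
    return float(np.mean(1.0 - np.exp(-(e ** 2) / (2.0 * sigma ** 2))))


# Residuals with and without one impulsive outlier.
clean = np.full(100, 0.1)
noisy = clean.copy()
noisy[0] = 1000.0

print("MSE         clean / noisy:", mse_loss(clean), mse_loss(noisy))
print("Correntropy clean / noisy:", correntropy_loss(clean), correntropy_loss(noisy))
```

In this sketch the single outlier dominates the MSE, while it can raise the correntropy-induced loss by at most 1/N for N samples. This bounded influence is the robustness property the abstract relies on.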
The effectiveness of the proposed algorithm is confirmed by experiments on well-known benchmark datasets, including the MNIST dataset, the NYU Object Recognition Benchmark dataset, and the Moore network traffic dataset. Finally, the proposed FC-MELM algorithm is applied to computer-aided cancer diagnosis. Experiments on the well-known Wisconsin Breast Cancer (Diagnostic) dataset show that the proposed FC-MELM outperforms state-of-the-art methods on this task.