Gene expression is among the most widely used features in cancer classification. The available gene expression data has a small number of samples and a relatively large number of dimensions, which makes it unsuitable for deep Convolutional Neural Network (CNN) architectures, even though these exhibit state-of-the-art performance in many fields. In this paper, we propose a lightweight CNN architecture for breast cancer classification using gene expression data downloaded from the Pan-Cancer Atlas using the "Illumina HiSeq" platform. The downloaded gene expression data is preprocessed and then transformed into 2D images. Preprocessing starts by removing outlier samples, which are identified using the Array-Array Intensity Correlation (AAIC), a symmetric square matrix of Spearman correlations. We then normalize the gene expression data so that expression levels can be inferred correctly and biases in the expression measures are avoided. Finally, filtering is applied to the data. A model selection (hyper-parameter search) strategy is used to choose the CNN hyper-parameter values that give the best performance. Our experiments show that the proposed method achieves an accuracy of 98.76%, the highest among the competing methods.

INDEX TERMS Tumor type classification, RNA-Seq, gene expression, convolutional neural network, edge detection.

MURTADA K. ELBASHIR received the B.Sc. degree (Hons.) in computer/statistics from the University of Gezira, Wad Madani, Sudan, in 2000, the M.Sc. degree (Hons.) in computer information systems from the University of the Free State, Bloemfontein, South Africa, in 2003, and the Ph.D. degree in computer science and technology
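
The AAIC-based outlier removal mentioned in the abstract can be illustrated with a minimal sketch. It assumes the expression data is a NumPy array of shape (samples, genes). The function names, the rule of flagging a sample by its mean correlation with the other samples, and the `threshold` parameter are all hypothetical choices made for illustration; the abstract states only that outliers are determined from a symmetric Spearman correlation matrix.

```python
import numpy as np


def average_ranks(x):
    """Rank a 1-D vector (1-based), giving tied values their mean rank."""
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    upper = np.cumsum(counts)             # highest rank in each tie group
    mean_rank = upper - (counts - 1) / 2  # mean rank of each tie group
    return mean_rank[inverse]


def aaic_matrix(expr):
    """Symmetric (samples x samples) matrix of Spearman correlations.

    Spearman correlation is the Pearson correlation of the ranks, so each
    sample (row) is ranked across its genes before calling np.corrcoef.
    """
    ranks = np.apply_along_axis(average_ranks, 1, expr)
    return np.corrcoef(ranks)


def flag_outliers(expr, threshold):
    """Flag samples whose mean correlation with the others is below threshold.

    Both the decision rule and the threshold are assumptions of this sketch,
    not values taken from the paper.
    """
    corr = aaic_matrix(expr)
    n_samples = corr.shape[0]
    # Exclude the diagonal (self-correlation of 1) from the mean.
    mean_corr = (corr.sum(axis=1) - 1.0) / (n_samples - 1)
    return mean_corr < threshold


if __name__ == "__main__":
    base = np.arange(50, dtype=float)
    # Five samples with the same gene ordering, one with the reverse ordering.
    expr = np.vstack([base * (i + 1) for i in range(5)] + [-base])
    keep = ~flag_outliers(expr, threshold=0.5)
    print("samples kept:", np.flatnonzero(keep))
```

In practice the threshold would need to be chosen from the distribution of correlations in the actual dataset rather than fixed in advance.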