In many application domains, such as information retrieval, computational biology, and image processing, the data dimension is usually very high. Developing effective clustering methods for high-dimensional datasets is challenging because of the curse of dimensionality. The k-means clustering algorithm is used in many practical applications, but it is computationally expensive, and the quality of the resulting clusters depends heavily on the selection of the initial centroids and on the dimension of the data. When the dimensionality of the dataset is high, the accuracy of the result may fall short of expectations, because the chosen dataset cannot be assumed to be free of noise and flaws. It is therefore necessary to reduce the dimensionality of the dataset in order to improve efficiency and accuracy. This paper proposes a new approach that improves the accuracy of the clustering results by using principal component analysis (PCA) to determine the initial centroids and to reduce the dimension of the data.
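The abstract does not spell out how PCA yields the initial centroids, so the following is only a minimal sketch under stated assumptions: the data are projected onto the leading principal components, the points are sorted along the first component and split into k equal-sized groups, and each group mean serves as a starting centroid for standard Lloyd-style k-means. The function names (`pca_reduce`, `pca_initial_centroids`, `kmeans`) and the split-by-first-component heuristic are illustrative choices, not the paper's confirmed procedure.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components (PCA via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # shape: (n_samples, n_components)

def pca_initial_centroids(Z, k):
    """Assumed heuristic: sort by the first component, split into k groups,
    and use each group's mean as an initial centroid."""
    order = np.argsort(Z[:, 0])
    groups = np.array_split(order, k)
    return np.array([Z[g].mean(axis=0) for g in groups])

def kmeans(Z, centroids, n_iter=100):
    """Plain Lloyd iterations starting from the given centroids."""
    for _ in range(n_iter):
        # Distance from every point to every centroid
        dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster becomes empty
        new_centroids = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```

A typical call would be `Z = pca_reduce(X, 2)`, then `labels, centers = kmeans(Z, pca_initial_centroids(Z, k))`. Because the starting centroids are deterministic, repeated runs on the same data give the same clustering, unlike random initialization.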
Classifying ovarian cancer types by visual inspection is very challenging for physicians. To address this problem, this article proposes a new deep learner that classifies ovarian cancer types from Computerized Tomography (CT) images. First, a Deep Convolutional Neural Network (DCNN) model based on AlexNet is proposed to categorize ovarian cancer from CT images, but its efficiency is not satisfactory. A DCNN is therefore built on the fusion of AlexNet, VGG, and GoogLeNet. The fusion is carried out at the SoftMax layer: the SoftMax values of the individual networks are combined by a weighted sum to obtain the overall classification outcome. However, overfitting can occur when the number of training images is inadequate. Thus, a Deep Semi-Supervised Generative Learning with DCNN model (DSSGL-DCNN) is proposed, which uses a Generative Adversarial Network (GAN) to augment the training samples and so mitigate overfitting. Once the augmented dataset is obtained, the fused DCNN model is trained on it to classify ovarian cancer types. The classified outcomes can then serve as a useful guideline for physicians in medical diagnosis. Finally, the experimental results show that the DSSGL-DCNN achieves higher efficiency than the other DCNN architectures.
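The weighted-sum fusion at the SoftMax layer can be illustrated with a small sketch. It assumes each network has already produced per-class logits for the same batch of images; the function names (`softmax`, `fuse_softmax`), the normalization of the weights, and the example weight values are hypothetical, since the abstract does not state how the weights are chosen.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_softmax(prob_list, weights):
    """Weighted sum of per-model softmax outputs.

    prob_list: list of arrays, each of shape (n_samples, n_classes)
    weights:   one non-negative weight per model (normalized here, by assumption)
    Returns the fused probabilities and the predicted class indices.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                              # make the weights sum to 1
    stacked = np.stack(prob_list, axis=0)        # (n_models, n_samples, n_classes)
    fused = np.tensordot(w, stacked, axes=1)     # (n_samples, n_classes)
    return fused, fused.argmax(axis=-1)
```

For three models, a call would look like `fuse_softmax([softmax(a), softmax(b), softmax(c)], weights=[1, 1, 1])`. Normalizing the weights keeps each fused row a valid probability distribution, which makes the fused scores directly comparable across images.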