Cancer is one of the deadliest diseases in the world. The International Agency for Research on Cancer (IARC) reported 14.1 million new cancer cases and 8.2 million cancer deaths in 2012. In the last few years, DNA microarray technology has increasingly been used to analyze and diagnose cancer. Analysis of microarray gene expression data allows medical experts to ascertain whether or not a person suffers from cancer. DNA microarray data is high-dimensional, which can affect both the process and the accuracy of cancer classification. Therefore, a classification scheme that includes dimension reduction is needed. In this research, a Principal Component Analysis (PCA) dimension reduction method that uses the proportion of variance to select eigenvectors was used. For classification, a Support Vector Machine (SVM) and the Levenberg-Marquardt Backpropagation (LMBP) algorithm were selected. Based on the tests performed, classification using LMBP was more stable than classification using SVM. The LMBP method achieved an average accuracy of 96.07%, while the SVM achieved 94.98%.
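The eigenvector selection step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the eigenvalues of the covariance matrix are already available and keeps the smallest number of leading eigenvectors whose cumulative proportion of variance reaches a threshold. The function name `select_components`, the threshold value, and the sample eigenvalues are hypothetical; the abstract does not state the actual cutoff or selection rule used.

```python
def select_components(eigenvalues, threshold=0.95):
    """Return how many leading eigenvectors are needed so that their
    cumulative proportion of variance reaches `threshold`.

    `threshold` is an assumed parameter, not a value from the paper.
    """
    if not eigenvalues:
        raise ValueError("eigenvalues must be non-empty")
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total variance must be positive")

    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        # Compare against threshold * total, which is equivalent to
        # comparing the cumulative proportion against the threshold.
        if running >= threshold * total:
            return k
    return len(ordered)


# Made-up eigenvalues for illustration only.
example_eigenvalues = [6.0, 3.0, 0.5, 0.5]
n_kept = select_components(example_eigenvalues, threshold=0.8)
print(n_kept)
```

In this toy case the first two eigenvalues account for 90% of the total variance, so two eigenvectors are kept at a threshold of 0.8.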
Sentiment analysis of movie reviews is increasingly in demand today. Unfortunately, the enormous number of features makes sentiment analysis slow and less sensitive. Finding the optimal combination of feature selection and classification is still a challenge. To handle an enormous number of features and provide better sentiment classification, an information-based feature selection method and a classification scheme are proposed. The proposed feature selection method removes more than 90% of the unnecessary features, while the proposed classification scheme achieves 96% accuracy in sentiment classification. From the experimental results, it can be concluded that the combination of the proposed feature selection and classification achieves the best performance so far.
The network formed between users of a social media platform can be used to encourage the spread of information among them. This research applied Social Network Analysis, which can in turn be used in social media marketing to make the marketing process more effective. According to previous research, the speed at which information spreads on social media is affected by the connections formed through users' activity, which can be represented as centrality values. The centrality value itself is strongly affected by the graph structure and weights. This research applied degree and eigenvector centrality to observe the effect of centrality values on Twitter data. The results show a significant difference among the 10 most influential users. These results will be used in future research focused on Twitter data from small and medium enterprises (SMEs).
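The two centrality measures named in the abstract above can be sketched on a toy graph. The sketch assumes an unweighted, undirected graph stored as an adjacency dictionary. The user names and edges are hypothetical, and the weighting that the abstract mentions is not modeled. Eigenvector centrality is approximated by power iteration on the adjacency matrix plus the identity, a common shift that avoids oscillation and does not change the resulting ranking.

```python
def degree_centrality(adj):
    """Fraction of the other nodes that each node is connected to."""
    n = len(adj)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}


def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Approximate eigenvector centrality by power iteration on (A + I)."""
    scores = {node: 1.0 for node in adj}
    for _ in range(iterations):
        # Each node keeps its own score and adds its neighbors' scores.
        updated = {
            node: scores[node] + sum(scores[nb] for nb in adj[node])
            for node in adj
        }
        norm = sum(value * value for value in updated.values()) ** 0.5
        updated = {node: value / norm for node, value in updated.items()}
        if max(abs(updated[node] - scores[node]) for node in adj) < tol:
            return updated
        scores = updated
    return scores


# Hypothetical follower graph: every edge is listed in both directions.
graph = {
    "user_a": ["user_b", "user_c", "user_d"],
    "user_b": ["user_a", "user_c"],
    "user_c": ["user_a", "user_b"],
    "user_d": ["user_a"],
}

degree = degree_centrality(graph)
eigen = eigenvector_centrality(graph)
ranking = sorted(graph, key=eigen.get, reverse=True)
print(ranking[0])
```

Here `user_a` ranks first under both measures, while `user_d`, connected only to `user_a`, ranks last. On real data the two measures can disagree, since eigenvector centrality rewards connections to well-connected users rather than the raw number of connections.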
News is a source of information disseminated through various types of media. To make it easier for readers to find the news they want, the news needs to be classified. The large volume of scattered news makes it difficult to classify news by topic. Therefore, the author conducted a study to automatically classify news into 12 classes (culture, economy, entertainment, law, health, life, automotive, education, politics, sports, technology, and tourism) using a dataset of 360 Indonesian news articles. In this study, several test scenarios were conducted to examine the effect of stopword removal and stemming during data preprocessing, the effect of mutual information (MI) in selecting features, and the performance of a Support Vector Machine (SVM) in classifying news data. The test results showed that using only stemming without stopword removal, together with MI feature selection and SVM classification, produced the best result of 94.24% compared with the other methods.
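As a rough illustration of the mutual information feature selection described in the abstract above, the sketch below scores a term against a class from four document counts and ranks terms by that score. It assumes the standard binary formulation (term present or absent, document in the class or not). The function name, the terms, and the counts are hypothetical and are not taken from the study.

```python
import math


def mutual_information(n11, n10, n01, n00):
    """Mutual information (in bits) between a term and a class.

    n11: documents that contain the term and belong to the class
    n10: documents that contain the term but are outside the class
    n01: documents without the term that belong to the class
    n00: documents without the term that are outside the class
    """
    n = n11 + n10 + n01 + n00
    cells = (
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    )
    score = 0.0
    for joint, term_total, class_total in cells:
        if joint > 0:  # a zero cell contributes nothing to the sum
            score += (joint / n) * math.log2(n * joint / (term_total * class_total))
    return score


# Made-up counts for three terms against a hypothetical "sports" class.
term_counts = {
    "goal": (5, 0, 0, 5),   # appears only in sports documents
    "said": (5, 5, 5, 5),   # spread evenly, carries no class information
    "match": (4, 1, 1, 4),  # mostly in sports documents
}

scores = {term: mutual_information(*counts) for term, counts in term_counts.items()}
top_terms = sorted(scores, key=scores.get, reverse=True)[:2]
print(top_terms)
```

Terms with the highest scores would be kept as features for the classifier. In this toy example "goal" and "match" are selected and "said" is dropped.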