In recent years, sequential pattern mining has been studied extensively in various domains. Most existing algorithms find patterns in transactional databases by scanning every record to check whether it contains a candidate pattern. This paper proposes a novel algorithm that mines closed sequential patterns using an inverted matrix and a prefix-based sequence element matrix. The inverted matrix minimizes the search space for discovering sequential patterns over different items. The prefix-based sequence element matrix minimizes the scans required at levels k and k+1 of the mining process. Our experimental results show that the new algorithm outperforms previous work.
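To illustrate why an inverted structure can shrink the search space, here is a minimal sketch in Python. It is not the paper's inverted matrix, whose layout the abstract does not specify. It assumes a plain item-to-sequence-id index and treats each sequence element as a single item. The names `build_inverted_index`, `is_subsequence` and `support`, and the toy data, are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(sequences):
    """Map each item to the set of ids of the sequences that contain it."""
    index = defaultdict(set)
    for sid, seq in enumerate(sequences):
        for item in seq:
            index[item].add(sid)
    return index


def is_subsequence(pattern, seq):
    """True if the items of `pattern` occur in `seq` in the same order."""
    it = iter(seq)
    # `item in it` advances the iterator, so order is enforced.
    return all(item in it for item in pattern)


def support(pattern, sequences, index):
    """Count sequences containing `pattern`, checking only candidates.

    Candidates are the sequences that contain every item of the pattern,
    found by intersecting the index entries instead of scanning all records.
    """
    if not pattern:
        return len(sequences)
    candidates = set.intersection(*(index.get(item, set()) for item in pattern))
    return sum(1 for sid in candidates if is_subsequence(pattern, sequences[sid]))


# Hypothetical toy database: four sequences of single items.
sequences = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"], ["c", "b"]]
index = build_inverted_index(sequences)
print(support(["a", "c"], sequences, index))
print(support(["b", "c"], sequences, index))
```

Only the sequences listed under every item of the pattern are order-checked, so a record that lacks any one of the items is never scanned.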
Breast cancer is the most common form of cancer among women worldwide. Lack of awareness and detection of the cancer only at an advanced stage put patients' lives at very high risk. Of the two types, non-invasive and invasive, invasive cancer has the potential to spread beyond the affected part to other parts of the body. This paper performs breast cancer data analysis using R packages. The decision tree is one of the data mining algorithms used for classification because it is fast, scalable and distributable. Among the many tools available for data analysis, R has recently become popular among data analysts for studying large, unstructured and dynamic datasets. The three classifiers studied are 'rpart', 'ctree' and 'randomForest'. The algorithms are compared on performance measures such as accuracy, precision, recall, sensitivity and specificity. Based on the results, the classification approach best suited to cancer data analytics is recommended.
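The performance measures named above all follow from a binary confusion matrix. The Python sketch below shows the standard definitions; note that recall and sensitivity are the same quantity. The `ConfusionMatrix` type, the `classification_metrics` helper and the counts are hypothetical and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # malignant cases predicted malignant
    fp: int  # benign cases predicted malignant
    tn: int  # benign cases predicted benign
    fn: int  # malignant cases predicted benign


def classification_metrics(cm):
    """Compute the standard measures; assumes no denominator is zero."""
    total = cm.tp + cm.fp + cm.tn + cm.fn
    sensitivity = cm.tp / (cm.tp + cm.fn)
    return {
        "accuracy": (cm.tp + cm.tn) / total,
        "precision": cm.tp / (cm.tp + cm.fp),
        "recall": sensitivity,  # recall is another name for sensitivity
        "sensitivity": sensitivity,
        "specificity": cm.tn / (cm.tn + cm.fp),
    }


# Hypothetical counts for illustration only.
example = ConfusionMatrix(tp=40, fp=5, tn=50, fn=5)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a screening setting, sensitivity is usually the measure to watch most closely, since a false negative means a missed cancer.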