Frequent itemset mining (FIM) is a crucial tool for identifying hidden patterns in data. FP-Growth is an FIM algorithm used to find associations. As data size increases, running FIM algorithms on a single machine becomes impractical because of memory and time consumption. For these reasons, parallel and distributed processing on platforms such as Spark is essential. Parallel frequent pattern (PFP) is the implementation of FP-Growth in Spark. The main problem with PFP is that it does not consider load balancing between cluster units. This research proposes an enhanced balanced parallel frequent pattern (EBPFP) algorithm to improve and balance PFP. EBPFP introduces two ideas. First, a load-balancing strategy between groups ensures that the items are evenly divided between the nodes and that the cluster resources are used more effectively. Second, the improved conditional pattern base (ICPB) method removes infrequent items from the conditional pattern base before the local FP-Trees are constructed. The experimental results show that the proposed EBPFP algorithm outperforms PFP; the reported differences in running time between EBPFP and PFP were 21.56% and 39.72%.
INDEX TERMS Big data, data mining, association rule analysis, frequent pattern growth algorithm, Spark, load balancing.
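The two ideas in this abstract can be illustrated with a minimal Python sketch. The abstract does not specify how EBPFP balances groups or how ICPB is implemented, so everything here is an assumption: `balance_groups` uses a generic greedy heuristic (heaviest item first, assigned to the currently lightest group) as a stand-in for the load-balancing strategy, and `prune_conditional_pattern_base` drops items whose support within a conditional pattern base falls below a hypothetical `min_support` threshold. The function names, the load estimates, and the data layout are illustrative only, and the sketch runs on a single machine rather than on Spark.

```python
import heapq
from collections import Counter


def balance_groups(item_loads, num_groups):
    """Assign items to groups so that the estimated loads stay roughly even.

    item_loads: dict mapping item -> estimated workload (an assumed input).
    Greedy heuristic: take items from heaviest to lightest and place each
    one in the group whose total load is currently smallest.
    """
    heap = [(0, g) for g in range(num_groups)]  # (total load, group index)
    heapq.heapify(heap)
    groups = [[] for _ in range(num_groups)]
    ordered = sorted(item_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for item, load in ordered:
        total, g = heapq.heappop(heap)
        groups[g].append(item)
        heapq.heappush(heap, (total + load, g))
    return groups


def prune_conditional_pattern_base(pattern_base, min_support):
    """Remove infrequent items from a conditional pattern base.

    pattern_base: list of (prefix_path, count) pairs.
    An item's support is the sum of the counts of the paths containing it;
    items below min_support are dropped before any local tree is built.
    """
    support = Counter()
    for path, count in pattern_base:
        for item in set(path):
            support[item] += count
    pruned = []
    for path, count in pattern_base:
        kept = [item for item in path if support[item] >= min_support]
        if kept:  # discard paths that become empty after pruning
            pruned.append((kept, count))
    return pruned


if __name__ == "__main__":
    loads = {"a": 8, "b": 7, "c": 6, "d": 5, "e": 4}
    print(balance_groups(loads, 2))

    base = [(["f", "c", "a"], 2), (["f", "b"], 1), (["c", "b"], 1)]
    print(prune_conditional_pattern_base(base, 3))
```

Pruning before tree construction means the local FP-Tree never allocates nodes for items that cannot appear in any frequent pattern, which is the motivation the abstract gives for ICPB.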
Colon cancer, also referred to as colorectal cancer, is a kind of cancer that starts with damage to the colon, the part of the large intestine in the last section of the digestive tract. Elderly people are typically affected, but it may occur at any age. It normally starts as small, noncancerous (benign) masses of cells called polyps that form within the colon. Over time, some of these polyps can turn into advanced malignant tumors that attack the body and become colon cancers. So far, no concrete causes have been identified, and the disease remains very difficult for doctors to detect and treat completely. Colon cancer often has no symptoms in its early stages, although it is curable when detected at that point. When colorectal cancer is diagnosed only in its final stage (stage IV), it has had the opportunity to spread to other parts of the body, it is difficult to treat successfully, and the person's chances of survival are much lower. A false diagnosis of colorectal cancer means the wrong treatment: patients who actually have colon cancer may be treated for long-term infections instead, which can lead to their death. Cancer treatment also requires considerable time and money. This paper provides a comparative study of the methodologies and algorithms used in colon cancer diagnosis and detection, which can help in proposing a prediction of colon cancer risk levels using a deep learning algorithm, the convolutional neural network (CNN).
Rehabilitation exercises reduce the demand for healthcare services over time by decreasing the number of hospital visits, lengths of stay, and readmissions. Since rehabilitation is a continuous process, it is crucial to monitor patient progress. This paper compares various machine learning classifiers which enable patients to perform exercises at home instead of visiting a physiotherapy center. The system assesses the correct performance of the exercises and tracks the patient's improvement, leading to lower rehabilitation costs. A distinct skeletal part, angle, and trajectory are required for each activity to distinguish between the workouts and assess whether they were executed correctly. Data extraction was performed using one Kinect camera, and six feature ranking algorithms were employed to construct the system, with the top features selected. Subsequently, 13 classical machine learning algorithms were implemented to identify the algorithm that produced the most accurate classification results. According to our experiments, Extra Tree Classifier, which employs feature extraction using the ReliefF technique, produces the best classification results, with an accuracy score of 99.94%.
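The abstract mentions that each exercise is characterized by a skeletal part, an angle, and a trajectory. As a minimal sketch of one such feature, the following Python function computes the angle at a joint from three 3D points. It assumes the joint positions are available as (x, y, z) tuples, for example from a skeleton-tracking camera; the function name and the sample coordinates are hypothetical, and the paper's actual feature definitions are not given in the abstract.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at joint b formed by the points a-b-c.

    a, b, c: (x, y, z) coordinates, e.g. shoulder, elbow, wrist.
    """
    ba = [p - q for p, q in zip(a, b)]  # vector from b to a
    bc = [p - q for p, q in zip(c, b)]  # vector from b to c
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0:
        raise ValueError("joint coincides with a neighbouring point")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


if __name__ == "__main__":
    # Hypothetical shoulder, elbow, and wrist positions for one frame.
    print(joint_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

Evaluating such an angle frame by frame yields a time series that a feature ranking step and a classifier could then consume.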