Association rule extraction is a widely used exploratory technique that has been exploited in different contexts (e.g., biological data, medical images). However, association rule extraction, driven by support and confidence constraints, entails either (i) generating a huge number of rules, which are difficult to analyze, or (ii) pruning rare itemsets, even if their hidden knowledge might be relevant. To address these issues, this paper presents a novel frequent itemset mining algorithm, called GenIO (GENeralized Itemset DiscOverer), which analyzes correlations among data by means of generalized itemsets. Generalized itemsets provide a powerful tool to efficiently extract hidden knowledge discarded by previous approaches. The proposed technique exploits a user-provided taxonomy to drive the pruning phase of the extraction process. Instead of extracting itemsets for all levels of the taxonomy and post-pruning them, the GenIO algorithm performs a support-driven opportunistic aggregation of itemsets: generalized itemsets are extracted only if itemsets at a lower level in the taxonomy are below the support threshold. Experiments performed in the network traffic domain show the efficiency and the effectiveness of the proposed algorithm.
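The support-driven aggregation described above can be illustrated with a minimal Python sketch. It is not the GenIO implementation: it assumes a single-level taxonomy stored as a child-to-parent dictionary, an absolute support count (`min_count`), and brute-force enumeration of k-itemsets. All function names and the toy network traffic data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def support_counts(transactions, k):
    """Count in how many transactions each k-itemset occurs."""
    counts = Counter()
    for items in transactions:
        for itemset in combinations(sorted(items), k):
            counts[itemset] += 1
    return counts


def generalize(itemset, taxonomy):
    """Replace every item with its parent; items without a parent are kept."""
    return tuple(sorted({taxonomy.get(item, item) for item in itemset}))


def opportunistic_itemsets(transactions, taxonomy, min_count, k=1):
    """Return (frequent, generalized) itemsets with their support counts."""
    counts = support_counts(transactions, k)
    frequent = {s: c for s, c in counts.items() if c >= min_count}

    # Only itemsets below the threshold trigger a climb in the taxonomy.
    candidates = set()
    for itemset, count in counts.items():
        if count < min_count:
            parent = generalize(itemset, taxonomy)
            if parent != itemset:
                candidates.add(parent)

    # A transaction supports a generalized itemset if, once the parents of
    # its items are added, it contains every generalized item.
    extended = [set(t) | {taxonomy[i] for i in t if i in taxonomy}
                for t in transactions]
    generalized = {}
    for candidate in candidates:
        count = sum(1 for t in extended if set(candidate) <= t)
        if count >= min_count:
            generalized[candidate] = count
    return frequent, generalized


# Toy network traffic records (hypothetical data).
transactions = [
    {"port:80", "ip:10.0.0.1"},
    {"port:443", "ip:10.0.0.2"},
    {"port:80", "ip:10.0.0.3"},
    {"port:80", "ip:10.0.0.1"},
]
taxonomy = {
    "port:80": "port:web",
    "port:443": "port:web",
    "ip:10.0.0.1": "subnet:10.0.0.0/24",
    "ip:10.0.0.2": "subnet:10.0.0.0/24",
    "ip:10.0.0.3": "subnet:10.0.0.0/24",
}
frequent, generalized = opportunistic_itemsets(transactions, taxonomy, min_count=3)
```

In this toy run, the individual IP addresses and `port:443` fall below the threshold, so only their parents are evaluated as generalized itemsets. `port:80` is frequent on its own and is reported at its original level.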
This paper presents a flexible framework that performs real-time analysis of physiological data to monitor people's health conditions in any context (e.g., during daily activities, in hospital environments). Given historical physiological data, different behavioral models tailored to specific conditions (e.g., a particular disease, a specific patient) are automatically learnt. A suitable model for the currently monitored patient is exploited in the real-time stream classification phase. The framework has been designed to perform both instantaneous evaluation and stream analysis over a sliding time window. To allow ubiquitous monitoring, real-time analysis can also be executed on mobile devices. As a case study, the framework has been validated in the intensive care scenario. Experimental validation, performed on 64 patients affected by different critical illnesses, demonstrates the effectiveness and the flexibility of the proposed framework in detecting different severity levels in the clinical situation of monitored patients.
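The distinction between instantaneous evaluation and analysis over a sliding time window can be sketched as follows. This is only an illustration under stated assumptions: the paper's learned behavioral models are replaced here by a hypothetical range check, the `normal_ranges` values are placeholders rather than clinical thresholds, and the class and function names are invented for the example.

```python
from collections import deque


def instantaneous_risk(sample, normal_ranges):
    """Count the signals that fall outside their (hypothetical) normal range."""
    return sum(
        1
        for name, value in sample.items()
        if not (normal_ranges[name][0] <= value <= normal_ranges[name][1])
    )


class SlidingWindowMonitor:
    """Keeps the risk values observed during the last `window_seconds`."""

    def __init__(self, window_seconds, normal_ranges):
        self.window_seconds = window_seconds
        self.normal_ranges = normal_ranges
        self.window = deque()  # (timestamp, risk) pairs, oldest first

    def update(self, timestamp, sample):
        """Return (instantaneous risk, mean risk over the current window)."""
        risk = instantaneous_risk(sample, self.normal_ranges)
        self.window.append((timestamp, risk))
        # Evict measurements that are older than the window length.
        while self.window[0][0] <= timestamp - self.window_seconds:
            self.window.popleft()
        window_risk = sum(r for _, r in self.window) / len(self.window)
        return risk, window_risk


# Placeholder ranges for illustration only, not clinical guidance.
ranges = {"heart_rate": (60, 100), "spo2": (94, 100)}
monitor = SlidingWindowMonitor(window_seconds=10, normal_ranges=ranges)
stream = [
    (0, {"heart_rate": 80, "spo2": 98}),
    (5, {"heart_rate": 120, "spo2": 98}),
    (12, {"heart_rate": 125, "spo2": 90}),
]
results = [monitor.update(ts, sample) for ts, sample in stream]
```

The windowed value smooths isolated spikes, while the instantaneous value reacts to each new measurement. A real deployment would replace the range check with the model selected for the monitored patient.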
The analysis of medical data is a challenging task for health care systems since a huge amount of interesting knowledge can be automatically mined to effectively support both physicians and health care organizations. This paper proposes a data analysis framework based on a multiple-level clustering technique to identify the examination pathways commonly followed by patients with a given disease. This knowledge can support health care organizations in evaluating the medical treatments usually adopted, and thus the incurred costs. The proposed multiple-level strategy allows clustering patient examination datasets with a variable distribution. To measure the relevance of specific examinations for a given disease complication, patient examination data has been represented in the Vector Space Model using the TF-IDF method. As a case study, the proposed approach has been applied to the diabetic care scenario. The experimental validation, performed on a real collection of diabetic patients, demonstrates the effectiveness of the approach in identifying groups of patients with a similar examination history and increasing severity in diabetes complications.
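As a rough illustration of the Vector Space Model representation mentioned above, the sketch below treats each patient as a document and each examination as a term. It assumes one common TF-IDF variant (relative term frequency multiplied by log(N / document frequency)). The paper may use a different weighting, and the patient identifiers and examination names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(patient_exams):
    """Map each patient to a TF-IDF vector over the examination vocabulary."""
    n_patients = len(patient_exams)
    vocab = sorted({exam for exams in patient_exams.values() for exam in exams})
    # Document frequency: number of patients who underwent each examination.
    df = Counter(exam for exams in patient_exams.values() for exam in set(exams))

    vectors = {}
    for patient, exams in patient_exams.items():
        counts = Counter(exams)
        total = len(exams)
        vectors[patient] = [
            (counts[exam] / total) * math.log(n_patients / df[exam]) if total else 0.0
            for exam in vocab
        ]
    return vocab, vectors


# Hypothetical examination logs.
patient_exams = {
    "p1": ["glucose", "glucose", "eye_exam"],
    "p2": ["glucose", "ecg"],
    "p3": ["glucose"],
}
vocab, vectors = tfidf_vectors(patient_exams)
```

Because every patient in this toy dataset has a `glucose` examination, its weight is zero for all of them. The rarer `eye_exam` and `ecg` examinations receive positive weights, which is what makes them informative when clustering patients.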
Itemset mining is a well-known exploratory data mining technique used to discover interesting correlations hidden in a data collection. Since it supports different targeted analyses, it is profitably exploited in a wide range of different domains, ranging from network traffic data to medical records. With the