Abstract: SLAVE is an inductive learning algorithm that uses concepts based on fuzzy logic theory. This theory has been shown to be a useful representational tool for making the learned knowledge easier to understand from a human point of view. Furthermore, SLAVE learns iteratively, using a genetic algorithm (GA) as its search algorithm. In this paper, we propose a modification of the initial iterative approach used in SLAVE. The main idea is to include more information in the process of learning each individual rule. This information enters the iterative approach through a different way of calculating the positive and negative examples of a rule. Furthermore, we propose a new fitness function and additional genetic operators that reduce the time needed for learning and make the resulting rules easier to understand.
Genetic algorithms offer a powerful search method for a variety of learning tasks, and they have been applied to learning processes through several different approaches. Structural learning algorithm on vague environment (SLAVE) is a genetic learning algorithm that uses the iterative approach to learn fuzzy rules. SLAVE can select the relevant features of the domain, but when working with large databases the search space is too large and the running time can sometimes be excessive. We propose to improve SLAVE by including a feature selection model in which the genetic algorithm works with individuals (each representing an individual rule) composed of two structures: one representing the relevance status of the variables involved in the rule, the other representing the variable/value assignments. For this general representation, we study two alternatives depending on the information coded in the first structure. When compared with the initial algorithm, this new version of SLAVE reduces the number of rules, simplifies the structure of the rules, and improves the total accuracy.
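The two-structure individual described above can be sketched as follows. This is only an illustration under stated assumptions, not the coding used in SLAVE: the first structure is assumed to hold one boolean relevance flag per variable, the second one boolean per fuzzy label of each variable, and the names `RuleIndividual` and `decode` as well as the example variables and labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RuleIndividual:
    # Structure 1 (assumed coding): relevance[i] says whether variable i
    # takes part in the rule at all.
    relevance: List[bool]
    # Structure 2 (assumed coding): values[i][j] says whether label j of
    # variable i is allowed in the antecedent.
    values: List[List[bool]]


def decode(ind: RuleIndividual,
           variables: List[str],
           labels: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Turn an individual into a readable antecedent.

    Variables flagged as irrelevant are dropped, whatever their
    value assignments say.
    """
    antecedent = {}
    for i, name in enumerate(variables):
        if not ind.relevance[i]:
            continue
        antecedent[name] = [
            lab for lab, on in zip(labels[name], ind.values[i]) if on
        ]
    return antecedent


# Hypothetical domain with two variables and three labels each.
variables = ["temperature", "pressure"]
labels = {
    "temperature": ["low", "medium", "high"],
    "pressure": ["low", "medium", "high"],
}
individual = RuleIndividual(
    relevance=[True, False],
    values=[[True, True, False], [False, True, True]],
)
rule = decode(individual, variables, labels)
```

Here `rule` keeps only `temperature`, because the relevance structure switches `pressure` off; separating the two structures lets the search discard a variable without disturbing its value assignments.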
Objective: We aimed to mine the data in the Electronic Medical Record to automatically discover patients' Rheumatoid Arthritis disease activity at discrete rheumatology clinic visits. We cast the problem as a document classification task where the feature space includes concepts from the clinical narrative and lab values as stored in the Electronic Medical Record. Materials and Methods: The Training Set consisted of 2792 clinical notes and associated lab values. Test Set 1 included 1749 clinical notes and associated lab values. Test Set 2 included 344 clinical notes for which there were no associated lab values. The Apache clinical Text Analysis and Knowledge Extraction System was used to analyze the text and transform it into informative features to be combined with relevant lab values. Results: Experiments over a range of machine learning algorithms and features were conducted. The best performing combination was linear kernel Support Vector Machines with Unified Medical Language System Concept Unique Identifier features with feature selection and lab values. The Area Under the Receiver Operating Characteristic Curve (AUC) is 0.831 (σ = 0.0317), statistically significant as compared to two baselines (AUC = 0.758, σ = 0.0291). Algorithms demonstrated superior performance on cases clinically defined as extreme categories of disease activity (Remission and High) compared to those defined as intermediate categories (Moderate and Low) and included laboratory data on inflammatory markers. Conclusion: Automatic Rheumatoid Arthritis disease activity discovery from Electronic Medical Record data is a learnable task approximating human performance. As a result, this approach might have several research applications, such as the identification of patients for genome-wide pharmacogenetic studies that require large sample sizes with precise definitions of disease activity and response to therapies.
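For readers unfamiliar with the AUC figure reported above: in the binary case it can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it that way; the function name `roc_auc` and the toy labels and scores are made up for illustration and are not data or code from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the binary ROC AUC.

    labels: 1 for positive cases, 0 for negative cases.
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores): 3 of the 4 positive/negative
# pairs are ordered correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of cases, which is fine for a worked example; a sort-based rank computation would be the usual choice for data sets of the size described above.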