Data science is currently one of the most promising fields used to support the decision-making process. Particularly, data streams can give these supportive systems an updated base of knowledge that allows experts to make decisions with updated models. Incremental Decision Rules Algorithm (IDRA) proposes a new incremental decision-rule method based on the classical ID3 approach to generating and updating a rule set. This algorithm is a novel approach designed to fit a Decision Support System (DSS) whose motivation is to give accurate responses in an affordable time for a decision situation. This work includes several experiments that compare IDRA with the classical static but optimized ID3 (CREA) and the adaptive method VFDR. A battery of scenarios with different error types and rates are proposed to compare these three algorithms. IDRA improves the accuracies of VFDR and CREA in most common cases for the simulated data streams used in this work. In particular, the proposed technique has proven to perform better in those scenarios with no error, low noise, or high-impact concept drifts.
The continuous input of data into an Information System makes it difficult to generate a reliable model when this stream changes unpredictably. This continuous and unexpected change of data, known as concept drift, is faced by different strategies depending on its type. Several contributions are focused on the adaptations of traditional Machine Learning techniques to solve these data streams problems. The decision tree is one of the most used Machine Learning techniques due to its high interpretability. This article aims to study the impact an abrupt concept change has on the accuracy of the original CART proposed by Breiman, and justify the necessity of detection and/or adaptation methodologies that update or rebuild the model when a concept drift occurs. To do that, some simulated experiences have been carried out to study several training and testing conditions in a changing data environment. According to the results, models that are rebuilt in the right moment after a concept drift occurs obtain high accuracy rates while those that are not rebuilt or are rebuilt after a change occurs, obtain considerably lower accuracies.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.