2005
DOI: 10.1243/095440605x31571
Online Discretization of Continuous-Valued Attributes in Rule Induction

Abstract: Machine learning algorithms designed for engineering applications must be able to handle numerical attributes, particularly attributes with real (or continuous) values. Many algorithms deal with continuous-valued attributes by discretizing them before starting the learning process. This paper describes a new approach for discretization of continuous-valued attributes during the learning process. Incorporating discretization within the learning process has the advantage of taking into account the bias inherent i…

Cited by 6 publications (3 citation statements) · References 37 publications
“…In the search-space pruning rules employed by RULES-6, r′ is any specialisation of rule r, and Prune(r) indicates that the children of r should not be searched. The experimental results of many studies [22,23] have indicated that the choice of a discretisation method depends on both the data to be discretised and the learning algorithm. The performance of the four discretisation methods mentioned above when used with the RULES-6 algorithm was evaluated empirically [17].…”
Section: Discretisation Methods
confidence: 99%
“…After consistent and clean data sets have been formed, data sampling and feature selection techniques are usually employed to reduce the data, thus speeding up the data mining process. Data often contain a mixture of categorical and continuous-valued attributes, and therefore continuous-valued attributes may have to be discretized first [35]. Data preparation is the most time-consuming stage in the whole data mining process.…”
Section: Overview Of Data Mining
confidence: 99%
“…[39] Instead of examining all individual values, this method examines only the boundary values of each numeric feature during learning. Split points are added where two adjacent values of the same feature belong to different classes.…”
confidence: 99%
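The boundary-value approach quoted above can be sketched in a few lines: sort a feature's values together with their class labels, then place a candidate split midway between adjacent distinct values whose classes differ. This is a simplified illustration of the general idea, not the paper's implementation; it ignores ties where equal values carry different classes.

```python
def boundary_split_points(values, labels):
    """Candidate split points for one numeric feature: midpoints between
    adjacent distinct values whose class labels differ."""
    pairs = sorted(zip(values, labels))  # order examples by feature value
    splits = []
    for (v1, c1), (v2, c2) in zip(pairs, pairs[1:]):
        if v1 != v2 and c1 != c2:  # class changes across a value boundary
            splits.append((v1 + v2) / 2)
    return splits

# Example: the class changes only between values 2.0 and 3.0
print(boundary_split_points([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"]))
# → [2.5]
```

Because only boundary points are considered, the number of candidate splits is bounded by the number of class changes along the sorted feature, rather than by the number of distinct values.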