In Libya, the National Electricity Company from time to time directs the National Electricity Grid to conduct load shedding to relieve pressure on supply at times of peak demand. This results in hours of power outages in the area covered by this study, namely the Southern Electrical Grid of Libya (SEGL). This paper discusses the results of a pattern extraction process that uses the k-means clustering algorithm to predict load shedding in this scenario. The data consist of all loads shed at 40 electrical power stations in southern Libya over the two-year period from 2009 through 2010. An experiment was conducted to assess the effectiveness of the k-means clustering algorithm in grouping (clustering) the data as a means of predicting future load shedding in the SEGL. Clusters were generated five times, using five different numbers of clusters (1, 2, 5, 7 and 10) and different seed values. The extracted patterns provided information on all attributes. The results showed that the generated clusters are suitable for the future load shedding scheduling problem in the SEGL.
Index Terms: data mining, decision support system, k-means algorithm, load shedding, clusters.
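As a rough illustration of the experiment described above, the following minimal sketch runs plain Lloyd-style k-means for several cluster counts and seed values. The records, their two attributes (load shed in MW, outage duration in hours) and the function names `sq_dist` and `kmeans` are hypothetical; the abstract does not give the actual attributes or the tool used.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed, max_iter=100):
    """Lloyd's k-means; returns (centroids, labels, within-cluster SSE)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # seed-dependent initial centroids
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda i: sq_dist(p, centroids[i]))
              for p in points]
    sse = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return centroids, labels, sse

# Hypothetical records: (load shed in MW, outage duration in hours).
records = [(5.0, 1.0), (6.0, 1.5), (7.0, 1.0),
           (50.0, 4.0), (52.0, 4.5), (54.0, 5.0)]

# Only cluster counts up to the number of toy records are possible here;
# the study itself used 1, 2, 5, 7 and 10 on the full data set.
for k in (1, 2, 5):
    for seed in (0, 1):
        _, labels, sse = kmeans(records, k, seed)
        print(f"k={k} seed={seed} labels={labels} sse={sse:.2f}")
```

Comparing the within-cluster sum of squared errors across cluster counts and seeds is one common way to judge how well a given clustering fits; whether the study used this criterion is not stated.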
This study proposes a sequential pattern mining algorithm to discover sequential patterns in Malaysian rainfall data for prediction. An Apriori-based algorithm is employed to find the sequential patterns in the time series data. The frequent episodes of rainfall sequences are discovered and classified by an expert into four main events, namely No rain, Light, Moderate and Heavy. The sequential rules from ten rainfall stations over a period of 33 years are analysed. The proposed algorithm is able to generate frequent and sequential patterns with higher confidence and support. Overall, the proposed study shows potential for producing methods that preserve important knowledge and thus reduce information loss in the weather prediction problem.
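The level-wise idea behind an Apriori-based search for frequent rainfall episodes can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the millimetre thresholds in `categorize` are hypothetical (the study relies on an expert's classification), episodes are taken to be contiguous runs of days, and support is counted as the fraction of station sequences that contain the episode.

```python
def categorize(mm):
    """Map a daily rainfall amount to one of four events.
    Thresholds are hypothetical placeholders, not the expert's values."""
    if mm == 0:
        return "N"  # No rain
    if mm < 10:
        return "L"  # Light
    if mm < 30:
        return "M"  # Moderate
    return "H"      # Heavy

def contains(seq, episode):
    """True if `episode` occurs as a contiguous run inside `seq`."""
    n = len(episode)
    return any(tuple(seq[i:i + n]) == episode
               for i in range(len(seq) - n + 1))

def frequent_episodes(sequences, min_support, max_len=3):
    """Level-wise (Apriori-style) search for frequent contiguous episodes."""
    symbols = sorted({s for seq in sequences for s in seq})
    frequent = {}
    candidates = [(s,) for s in symbols]
    for _ in range(max_len):
        level = {}
        for cand in candidates:
            support = sum(contains(seq, cand) for seq in sequences) / len(sequences)
            if support >= min_support:
                level[cand] = support
        if not level:
            break
        frequent.update(level)
        # Apriori pruning: extend only frequent episodes, and keep a
        # candidate only if its suffix is frequent as well.
        candidates = [ep + (s,) for ep in level for s in symbols
                      if (ep + (s,))[1:] in level]
    return frequent

# Hypothetical daily rainfall (mm) for three stations.
rain = [[0, 0, 5, 12, 40], [0, 3, 15, 0, 0], [0, 0, 2, 20, 35]]
sequences = [[categorize(mm) for mm in station] for station in rain]
freq = frequent_episodes(sequences, min_support=0.6)

# Confidence of the rule (L, M) -> H is support(L, M, H) / support(L, M).
confidence = freq[("L", "M", "H")] / freq[("L", "M")]
print(sorted(freq.items()), confidence)
```

A rule's confidence here is simply the ratio of two supports, so only episodes that survive the support threshold can appear in a rule.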
In this paper we propose a new approach based on Symbolic Aggregate approXimation (SAX), called improved SAX (iSAX), for efficient and accurate discovery of the important patterns in time series data. The original SAX approach allows very high-quality dimensionality reduction and lets distance measures be defined on the symbolic representation; it is based on the Piecewise Aggregate Approximation (PAA) representation for dimensionality reduction. The proposed iSAX includes the Relative Frequency and K-Nearest Neighbor (RFknn) algorithm. The main task of the algorithm is to determine a sufficient number of symbolic intervals (the alphabet size) to ensure an efficient mining process and a good knowledge model without major loss of knowledge. We show that iSAX can improve the precision of the representation without losing the symbolic nature of the original SAX representation. iSAX is compared with the original SAX and PAA representations to demonstrate its quality improvement. Ten rainfall time series data sets were used. The experimental results showed that iSAX gives a better representation and the minimum Euclidean distance.
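For orientation, the baseline that iSAX builds on can be sketched in a few lines: z-normalise the series, reduce it with PAA, then map each segment mean to a letter using equiprobable breakpoints of the standard normal distribution. This shows the original SAX only; the RFknn step that selects the alphabet size is not reproduced, and the helper names and the assumption that the series length is a multiple of the segment count are illustrative choices.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev

def znorm(series):
    """Z-normalise a series to zero mean and unit standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]

def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width
    segment. Assumes len(series) is a multiple of `segments`."""
    size = len(series) // segments
    return [mean(series[i * size:(i + 1) * size]) for i in range(segments)]

def sax(series, segments, alphabet_size):
    """Original SAX word: PAA of the z-normalised series, with each mean
    mapped to a letter via equiprobable standard-normal breakpoints."""
    breakpoints = [NormalDist().inv_cdf(i / alphabet_size)
                   for i in range(1, alphabet_size)]
    reduced = paa(znorm(series), segments)
    return "".join(chr(ord("a") + bisect_right(breakpoints, v))
                   for v in reduced)

# A step-shaped toy series: low for four readings, then high for four.
print(sax([0, 0, 0, 0, 10, 10, 10, 10], segments=2, alphabet_size=3))  # ac
```

The alphabet size is a free parameter in this sketch; choosing it well is the problem the abstract says the RFknn algorithm addresses.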
In this paper we propose a new approach to dynamic data discretization. The technique is called Frequency Dynamic Interval Class (FDIC). FDIC consists of two important phases: the dynamic interval class phase and the interval merging phase. The first phase uses a simple statistical frequency measure to obtain the initial intervals, while the second phase uses K-Nearest Neighbour to calculate the merging factor for the unknown intervals. The experimental results showed that FDIC generates more intervals per attribute and fewer rules, with comparable accuracy, on the three tested datasets. This indicates that FDIC reduces the loss of knowledge seen in several other techniques that generate very few intervals.
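The two-phase structure can be illustrated with a deliberately simplified sketch. It is not FDIC itself: the abstract does not define the frequency measure or the merging factor, so the code assumes that each distinct value starts as its own interval and that any interval with fewer than `min_count` observations is merged into the closer adjacent interval, a nearest-neighbour stand-in for the K-Nearest Neighbour step. All names and the threshold are hypothetical.

```python
from collections import Counter

def initial_intervals(values):
    """Phase 1 (assumed): one interval per distinct value, with its frequency."""
    counts = Counter(values)
    return [{"lo": v, "hi": v, "count": c} for v, c in sorted(counts.items())]

def merge_sparse(intervals, min_count):
    """Phase 2 (assumed): merge each interval holding fewer than `min_count`
    observations into whichever adjacent interval lies closer."""
    intervals = [dict(iv) for iv in intervals]  # work on a copy
    while len(intervals) > 1:
        idx = next((i for i, iv in enumerate(intervals)
                    if iv["count"] < min_count), None)
        if idx is None:
            break  # every remaining interval is dense enough
        if idx == 0:
            target = 1
        elif idx == len(intervals) - 1:
            target = idx - 1
        else:
            gap_left = intervals[idx]["lo"] - intervals[idx - 1]["hi"]
            gap_right = intervals[idx + 1]["lo"] - intervals[idx]["hi"]
            target = idx - 1 if gap_left <= gap_right else idx + 1
        a, b = sorted((idx, target))
        merged = {"lo": intervals[a]["lo"], "hi": intervals[b]["hi"],
                  "count": intervals[a]["count"] + intervals[b]["count"]}
        intervals[a:b + 1] = [merged]  # each pass removes one interval
    return intervals

values = [1, 1, 1, 2, 5, 5, 5, 9, 10, 10, 10]
for iv in merge_sparse(initial_intervals(values), min_count=3):
    print(iv)
```

Merging only the sparse intervals keeps the number of intervals relatively high, which is in the spirit of the abstract's claim that producing more intervals helps limit the loss of knowledge.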