Background: The multi-relational approach has emerged as an alternative for analyzing structured data such as relational databases, since it allows data mining to be applied to multiple tables directly, avoiding expensive join operations and semantic losses. This work proposes an algorithm that follows the multi-relational approach. Methods: To compare the performance of the traditional and multi-relational approaches to mining association rules, this paper presents an empirical study of PatriciaMine, a traditional algorithm, and its proposed multi-relational counterpart, MR-Radix. Results: The study showed performance advantages for the multi-relational approach over several tables, since it avoids the high cost of join operations across multiple tables and the associated semantic losses. MR-Radix ran faster than PatriciaMine, despite handling complex multi-relational patterns. Memory usage followed a more conservative growth curve for MR-Radix than for PatriciaMine, which indicates that the increase in demand for frequent items does not result in significant growth of memory usage in MR-Radix, as it does in PatriciaMine. Conclusion: The comparative study between PatriciaMine and MR-Radix confirmed the efficacy of the multi-relational approach in the data mining process, in terms of both execution time and memory usage. In addition, the proposed multi-relational algorithm, unlike other algorithms of this approach, is efficient for use in large relational databases.
This paper presents a big data analytics-based model developed for electric distribution utilities to forecast the demand for service orders (SOs) on a spatio-temporal basis. Fed with robust historical and location data from a database provided by an energy utility that uses this system, the algorithm automatically forecasts the number of SOs that will need to be executed in each location at several time steps (hourly, monthly and yearly). The forecast demand for emergency SOs, which is related to energy outages, is stochastically distributed, projecting the impacted consumers and their individual interruption indexes. This spatio-temporal forecast is the main input for a web-based platform for optimal base allocation, field team sizing and scheduling, implemented in the eleven distribution utilities of the Energisa group in Brazil.
Data mining algorithms that find association rules are an important tool for extracting knowledge from databases. However, these algorithms produce an enormous number of rules, many of which may be redundant or irrelevant for a specific decision-making process. These algorithms also do not take previous knowledge and hypotheses into account. In addition, most existing data mining approaches look for patterns in a single data table, ignoring the relations present in relational databases. The contribution of this paper is a multi-relational data mining algorithm based on association rules, called TBMR-Radix, which takes previous knowledge and hypotheses into account through the Templates technique. Applying this approach to two real databases, we were able to reduce the number of generated rules, use the existing knowledge about the data and reduce the waste of computational resources during processing. Our experiments show that the developed algorithm was also able to perform in a multi-relational environment, whereas MR-Radix, which does not use the Templates technique, was not.
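The idea of restricting rule generation with a template can be illustrated with a minimal sketch. This is not TBMR-Radix: it is a brute-force, single-table enumeration, and the toy transactions, the `attribute=value` item encoding, and the function name `rules_matching_template` are all assumptions made for illustration. It only shows how a template (allowed antecedent attributes plus one consequent attribute) keeps non-matching candidate rules from being generated at all.

```python
from itertools import combinations

# Hypothetical toy data: each item is an "attribute=value" string.
transactions = [
    {"age=young", "plan=basic", "churn=yes"},
    {"age=young", "plan=basic", "churn=yes"},
    {"age=old", "plan=premium", "churn=no"},
    {"age=young", "plan=premium", "churn=no"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def attribute(item):
    """Attribute name of an 'attribute=value' item."""
    return item.split("=", 1)[0]


def rules_matching_template(transactions, antecedent_attrs, consequent_attr,
                            min_support, min_confidence):
    """Enumerate rules lhs -> rhs that fit the template (illustrative only).

    Only items whose attribute is named in the template are used, so
    candidates outside the template are never built or counted.
    """
    items = sorted(set().union(*transactions))
    lhs_items = [i for i in items if attribute(i) in antecedent_attrs]
    rhs_items = [i for i in items if attribute(i) == consequent_attr]

    rules = []
    for size in range(1, len(antecedent_attrs) + 1):
        for combo in combinations(lhs_items, size):
            lhs = frozenset(combo)
            # Skip antecedents holding two values of the same attribute.
            if len({attribute(i) for i in lhs}) < len(lhs):
                continue
            lhs_support = support(lhs, transactions)
            if lhs_support < min_support:
                continue
            for rhs in rhs_items:
                rule_support = support(lhs | {rhs}, transactions)
                if rule_support < min_support:
                    continue
                confidence = rule_support / lhs_support
                if confidence >= min_confidence:
                    rules.append((lhs, rhs, rule_support, confidence))
    return rules


rules = rules_matching_template(
    transactions,
    antecedent_attrs={"age", "plan"},
    consequent_attr="churn",
    min_support=0.5,
    min_confidence=0.9,
)
for lhs, rhs, sup, conf in rules:
    print(sorted(lhs), "->", rhs, f"support={sup:.2f} confidence={conf:.2f}")
```

Because the template fixes the consequent to a single attribute, every returned rule predicts `churn`; rules with other consequents, which an unconstrained miner would also generate, are never counted.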