Abstract. Event correlation is a necessary component of systems management but is perceived as a difficult function to set up and maintain. We report on our work to develop a set of tools and techniques to simplify event correlation and thereby reduce overall operating costs. The tools prototyped are described and our current plans for future tool development outlined.Event correlation is a key component of systems management. Events from multiple resources, e.g., network elements, servers, applications, are collected and analyzed to detect problems such as component failures, security breeches and failed business processes. Management solutions require correlation for filtering and analyzing massive numbers of events, for example by removing duplicate events, or for detecting event sequences that signal a significant occurrence in the managed systems. Relevant event patterns need to be identified and formulated as rules, and mechanisms provided to map observed events into the defined patterns. Many systems allow correlation of events where the patterns are expressed as rules [1]. The difficulty lies in identifying the different and relevant patterns of events, as patterns change and new ones are introduced. Our goal is to develop tools to help operators and systems management architects to identify event patterns, to create rules to implement the patterns, test their validity, and to monitor and manage the rules during their lifecycle.
Fig. 1. Tools for automated correlation rule generationThe tool collection is shown in Figure 1. The correlation engines use installed rules to filter incoming events, which are logged (Event Log) and displayed on the event console. By event mining [2,3] or operator intervention, patterns of events that need to be filtered or in sum indicate a situation are selected. In the rule wizard and the rule