Association rule mining can be applied to discover relations among news documents. Most existing approaches may fail to extract meaningful news relations because they rely on a single association measure to rank the mined relations. This paper presents a region-based method that selectively uses different association measures for different ranking regions in order to improve the ranking mechanism for news relation discovery. In this method, the mined relations are first sorted under a preliminary criterion to form a number of regions, and the relations in each region are then scored with a different association measure. The meaningful news relations are discovered through three levels of relation, namely completely related, somehow related and unrelated, as judged by a domain expert. To evaluate the proposed region-based ranking method, a method without region construction is taken as the baseline, using a set of 1,132 news relations mined from 811 news documents. As the performance criterion, rank-order mismatch is used to compare the results of the proposed method with the human evaluation. Compared to the baseline, the region-based method significantly improves performance, with average rank-order mismatch improvements of 1.21%-28.32% for confidence and 4.83%-29.04% for conviction.
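To make the sort-then-rescore idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that regions are formed by cutting the preliminary ranking into equally sized contiguous blocks and that each block is re-ranked by one measure; the paper itself treats the choice of region boundaries and of the measure per region as open questions. The names `Relation` and `region_based_rank` are hypothetical, while confidence, lift and conviction follow their standard definitions.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Relation:
    """A mined relation A -> B described by its supports (hypothetical container)."""
    name: str
    supp_a: float   # support of the antecedent A
    supp_b: float   # support of the consequent B
    supp_ab: float  # support of A and B together


def confidence(r: Relation) -> float:
    return r.supp_ab / r.supp_a


def lift(r: Relation) -> float:
    return r.supp_ab / (r.supp_a * r.supp_b)


def conviction(r: Relation) -> float:
    conf = confidence(r)
    if conf >= 1.0:
        return math.inf  # conviction is unbounded when the rule never fails
    return (1.0 - r.supp_b) / (1.0 - conf)


Measure = Callable[[Relation], float]


def region_based_rank(relations: Sequence[Relation],
                      preliminary: Measure,
                      region_measures: Sequence[Measure]) -> List[Relation]:
    """Sort by a preliminary measure, cut the ranking into equal-size regions
    (an assumption of this sketch), then re-rank each region by its own measure."""
    if not region_measures:
        raise ValueError("at least one region measure is required")
    ordered = sorted(relations, key=preliminary, reverse=True)
    size = math.ceil(len(ordered) / len(region_measures))
    ranked: List[Relation] = []
    for i, measure in enumerate(region_measures):
        region = ordered[i * size:(i + 1) * size]
        ranked.extend(sorted(region, key=measure, reverse=True))
    return ranked


# Illustrative data only; these supports are not taken from the paper.
relations = [
    Relation("r1", supp_a=0.2, supp_b=0.5, supp_ab=0.18),
    Relation("r2", supp_a=0.1, supp_b=0.1, supp_ab=0.08),
    Relation("r3", supp_a=0.4, supp_b=0.3, supp_ab=0.20),
    Relation("r4", supp_a=0.5, supp_b=0.6, supp_ab=0.20),
]
ranking = region_based_rank(relations, confidence, [lift, conviction])
print([r.name for r in ranking])
```

In this toy run the top region (highest confidence) is re-ranked by lift and the lower region by conviction, so a relation can move only within its own region and never across a region boundary.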
N. Kittiphattanabawon et al. / Region-based association measures for ranking mined news relations

resulting in the complexity of O(N^L), where L is the length of the longest pattern. However, an inevitable issue of association rule mining is the large number of rules generated [5,22,23,27,31]. Several earlier works introduced approaches to cope with this very large number of association rules, which may be classified into filtering, constraint-based mining and ranking.

To date, most previous works have applied a single objective measure (such as support, confidence or lift), a single subjective measure (such as preference, personal value or cultural value), or a combination of these to filter, constrain, or rank the discovered rules. However, they still apply a uniform function over the whole ranking interval. Intuitively, a single objective function for the whole range of candidates may not be enough to prioritize the rules, so it is worth exploring multiple objective functions for rule prioritization. To address this issue, this paper presents a method that selectively uses different objective functions (association measures) for different ranking regions to improve the ranking mechanism. The proposed approach, called region-based ranking, sets a specific weight for each region (range) of the relation strength in order to maximize classification performance. This work also deals with two main issues: (1) how to determine suitable points at which the weight scheme changes in order to form regions, and (2) which weight scheme should be assigned to each region. In addition, when the domain is changed to other areas such as medical, enviro...