The problem of identifying meaningful patterns in a database lies at the very heart of data mining. A core objective of data mining processes is the recognition of inter-attribute correlations. Not only are correlations necessary for predictions and classifications (since rules would fail in the absence of patterns), but the identification of groups of mutually correlated attributes also expedites the selection of a representative subset of attributes, from which the others can be derived through existing mappings. In this paper, we describe a scalable, effective algorithm to identify groups of correlated attributes. This algorithm can handle non-linear correlations between attributes and is not restricted to a specific family of mapping functions, such as the set of polynomials. We show the results of our evaluation of the algorithm applied to synthetic and real-world datasets, and demonstrate that it is able to spot the correlated attributes. Moreover, the execution time of the proposed technique is linear in the number of elements and of correlations in the dataset. (E. P. M. de Sousa et al.)
This paper presents a new approach to support computer-aided diagnosis (CAD), aimed at assisting the classification and similarity retrieval of mammographic mass lesions based on shape content. We tested classical algorithms for automatic segmentation of this kind of image, but they are usually not precise enough to generate contours that allow lesion classification based on shape analysis. Thus, in this work, we use Zernike moments for invariant pattern recognition within regions of interest (ROIs), without prior segmentation of the images. A new data mining algorithm that generates statistical-based association rules is used to identify representative features that discriminate the disease classes of images. To minimize the computational effort, an algorithm based on fractal theory is applied to reduce the dimension of the feature vectors. K-nearest neighbor retrieval was applied to a database containing images excerpted from previously classified digitized mammograms presenting breast lesions. The results reveal that our approach allows fast and effective feature extraction and is robust and suitable for analyzing this kind of image.
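The k-nearest neighbor retrieval step mentioned above can be illustrated with a minimal sketch. It assumes that each ROI has already been reduced to a short numeric feature vector; the image identifiers, the feature values, the Euclidean distance, and the function name `knn_retrieve` are all hypothetical choices made for illustration, not details taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_retrieve(
    query: Sequence[float],
    database: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k database entries closest to the query, nearest first.

    Each result is (image_id, distance). This is a brute-force scan,
    which is enough for a small illustrative database.
    """
    scored = [(image_id, euclidean(query, vec)) for image_id, vec in database]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy feature vectors (assumed values, standing in for reduced shape features).
database = [
    ("roi_a", [0.10, 0.80, 0.30]),
    ("roi_b", [0.90, 0.20, 0.40]),
    ("roi_c", [0.15, 0.75, 0.35]),
    ("roi_d", [0.50, 0.50, 0.50]),
]
query = [0.12, 0.78, 0.31]

nearest = knn_retrieve(query, database, k=2)
nearest_ids = [image_id for image_id, _ in nearest]
print(nearest_ids)
```

In a similarity retrieval setting, the classes of the returned neighbors could then be inspected to suggest a class for the query image; how the paper combines the neighbors is not stated in the abstract.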
Complex networks are nowadays employed in several applications. One of them is modeling urban street networks, in particular to analyze criminal aspects of a city. Several research groups have focused on this application, but there is still no well-defined methodology for employing complex networks across the whole crime analysis process, i.e., from data preparation to a deep analysis of criminal communities. Furthermore, the "toolset" available for such work is incomplete: it lacks techniques for maintaining up-to-date, complete crime datasets as well as proper assessment measures. We therefore propose a threefold methodology for employing complex networks in the detection of highly criminal areas within a city. Our methodology comprises three tasks: (i) Mapping of Urban Crimes; (ii) Criminal Community Identification; and (iii) Crime Analysis. Moreover, it provides a proper set of assessment measures for analyzing the intrinsic criminality of communities, especially when considering different crime types. We demonstrate our methodology by applying it to a real crime dataset from the city of San Francisco, CA, USA. The results confirm its effectiveness in identifying and analyzing high-criminality areas within a city. Hence, our contributions provide a basis for further developments on complex networks applied to crime analysis.
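As a rough illustration of tasks (i) and (iii), the sketch below maps each crime record to its nearest street-network node and then counts crimes per community and crime type. It rests on several assumptions that do not come from the paper: coordinates are treated as planar points, the node-to-community assignment is given as input (task (ii) is not implemented here), and all names and data values are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def nearest_node(location: Point, nodes: Dict[str, Point]) -> str:
    """Return the id of the street-network node closest to a crime location."""
    return min(nodes, key=lambda node_id: math.dist(location, nodes[node_id]))


def crimes_per_community(
    crimes: List[Tuple[Point, str]],
    nodes: Dict[str, Point],
    community_of: Dict[str, str],
) -> Dict[str, Counter]:
    """Count crimes by type within each community.

    Each crime is first mapped to its nearest node; the node's community
    (assumed to be precomputed) then receives the count.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for location, crime_type in crimes:
        node_id = nearest_node(location, nodes)
        counts[community_of[node_id]][crime_type] += 1
    return counts


# Toy street network: node id -> planar coordinates (assumed values).
nodes = {"n1": (0.0, 0.0), "n2": (1.0, 0.0), "n3": (5.0, 5.0), "n4": (6.0, 5.0)}
# Hypothetical community assignment, standing in for the output of task (ii).
community_of = {"n1": "A", "n2": "A", "n3": "B", "n4": "B"}
# Toy crime records: (location, crime type).
crimes = [
    ((0.1, 0.1), "theft"),
    ((0.9, 0.2), "assault"),
    ((5.2, 4.9), "theft"),
    ((0.2, -0.1), "theft"),
]

counts = crimes_per_community(crimes, nodes, community_of)
for community, by_type in sorted(counts.items()):
    print(community, dict(by_type))
```

Per-type counts like these are only one possible starting point; the specific assessment measures proposed in the paper are not described in the abstract and are not reproduced here.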