This paper describes part of a content-based image retrieval (CBIR) system that has been developed for mammograms. Details are presented of the methods implemented to derive measures of similarity based upon structural characteristics and distributions of density of the fibroglandular tissue, as well as the anatomical size and shape of the breast region as seen on the mammogram. Well-known features related to shape, size, and texture (statistics of the gray-level histogram, Haralick's texture features, and moment-based features) were applied, as well as less-explored features based in the Radon domain and granulometric measures. The Kohonen self-organizing map (SOM) neural network was used to perform the retrieval operation. Performance was evaluated using precision and recall curves obtained by comparing the query and retrieved images. The proposed methodology was tested with 1,080 mammograms, including craniocaudal and mediolateral-oblique views. The precision rates obtained are in the range from 79% to 83% considering the total image set. Considering the first 50% of the retrieved images, the precision rates are in the range from 78% to 83%; the rates are in the range from 79% to 86% considering the first 25% of the retrieved images. The results obtained indicate the potential of the implemented methodology to serve as a part of a CBIR system for mammography.
KEY WORDS: Mammography, content-based image retrieval, Kohonen self-organizing map, texture features, granulometric measures, Radon transform domain, breast density
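The precision figures quoted for the total image set and for the first 50% and 25% of the retrieved images can be illustrated with a minimal sketch. It assumes each retrieved image has already been judged relevant or not relevant to the query (how relevance is decided is not specified here and is an assumption of this example), and that the list is ordered from most to least similar. The function names `precision_at_fraction` and `recall_at_fraction` and the sample ranking are hypothetical and are not taken from the system described.

```python
from typing import Sequence


def _cutoff(n_items: int, fraction: float) -> int:
    """Number of top-ranked items covered by `fraction` of the list (at least 1)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if n_items == 0:
        raise ValueError("ranking must not be empty")
    return max(1, int(n_items * fraction))


def precision_at_fraction(relevant: Sequence[bool], fraction: float) -> float:
    """Share of relevant images among the first `fraction` of the ranked list."""
    k = _cutoff(len(relevant), fraction)
    return sum(relevant[:k]) / k


def recall_at_fraction(relevant: Sequence[bool], fraction: float) -> float:
    """Share of all relevant images that appear in the first `fraction` of the list."""
    k = _cutoff(len(relevant), fraction)
    total_relevant = sum(relevant)
    if total_relevant == 0:
        return 0.0  # no relevant images exist, so nothing can be recalled
    return sum(relevant[:k]) / total_relevant


# Hypothetical ranked relevance judgments for one query (most similar first).
ranking = [True, True, False, True, False, True, False, False]

for frac in (0.25, 0.5, 1.0):
    p = precision_at_fraction(ranking, frac)
    r = recall_at_fraction(ranking, frac)
    print(f"first {frac:.0%}: precision={p:.2f}, recall={r:.2f}")
```

Sweeping the fraction from small values up to 1.0 and plotting precision against recall gives one precision-recall curve per query; averaging over queries is one common way to summarize such curves, though the averaging scheme used in this work is not assumed here.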
INTRODUCTION: BREAST CANCER AND MAMMOGRAPHY
Breast density has been shown to be a risk factor in the development of breast cancer. Wolfe 1 presented the first study relating the density and structure of breast tissues as seen on mammograms to the characteristics of breast disease: he described and illustrated cases associating patterns of parenchymal distortion with the risk of development of breast cancer. Since then, several other researchers have studied the relation between the structural composition of breast tissue and the abnormalities found in the related regions. 2 A consequence of the understanding of this relationship has been the development of systems for the description and analysis of the density patterns found in mammograms: the Breast Imaging Reporting and Data System (BI-RADS), 3 developed by the American College of Radiology, is the most important of such systems. BI-RADS contains recommendations for standardization of terms used in image-based diagnosis of breast diseases, the division of breast composition and mammographic findings into categories, and suggestions for further actions by the radiologist. Visual analysis of mammograms takes into consideration the shape and size of the breast, the conditions of the breast contour and the nipple position, and the distribution of fibroglandular tissue (degree of granularity, amount, and distribution of breast density). Notwithstanding the developments mentioned above, visual analysis of mammograms by radiologists remains sub...