Malignant effusion in invasive breast carcinoma is associated with poor prognosis. Deciphering the molecular events that lead to metastasis and identifying reliable markers for targeted therapies are crucial needs. We therefore used cDNA microarrays to delineate molecular signatures associated with metastasis and relapse in breast carcinoma effusions. Using an immunomagnetic method, we purified EpCAM-positive cells to homogeneity from 34 malignant effusions. Immunopurified cells represented as much as 10% of the whole cell fraction, and their epithelial and carcinoma features were confirmed by immunofluorescence labeling. Gene expression profiles of 19 immunopurified effusion samples were analyzed using human pan-genomic microarrays and compared with those of 4 corresponding primary tumors, 8 breast carcinoma effusion-derived cell lines, and 4 healthy mammary tissues. Principal component and multiple clustering analyses of the microarray data clearly identified distinctive molecular portraits corresponding to the 4 categories of specimens. Of particular interest, the effusion samples fell into 2 subsets on the basis of their gene expression patterns. The first subset partly shares a gene expression signature with the cell lines and overexpresses CD24, CD44 and epithelial cytokeratins 8, 18 and 19. The second subset overexpresses markers related to aggressive invasive carcinoma (uPA receptor, S100A4, vimentin, CXCR4). These findings demonstrate the importance of using pure cell fractions to accurately decipher in silico the gene expression of clinical specimens. Further studies will lead to the identification of genes of outstanding importance for diagnosing malignant effusion, predicting survival and tailoring appropriate therapies to metastatic effusion disease in breast carcinoma patients. © 2007 Wiley-Liss, Inc.