Evaluating clustering results is one of the most important issues in cluster analysis, a core task for effective information access. Measures of clustering quality are of two types: internal and external. External validity measures evaluate how well the clustering results match prior knowledge about the data, whereas internal measures need no external information and rely only on information within the data. The main drawback of external evaluation measures is therefore that they are not applicable in real-world situations, where such prior knowledge is usually unavailable. In this paper we present an experimental study to determine whether the quality of multilingual news clustering results can be predicted by means of an internal evaluation measure. Specifically, we study whether the internal measure Expected Density correlates with the external measure F-measure, the most common way of evaluating clustering results. In the experiments, we use different data collections, clustering algorithms and similarity measures in order to determine their influence on the correlation between the two measures. Since the similarity measure is another important issue in clustering, we also propose a new one, based on the Named Entities shared by two news documents, to calculate how similar the documents are. The results show that the correlation depends on several factors, such as the type of collection, the granularity of the clusters, the type of algorithm and the similarity measure.