Abstract. Traditional similarity-based retrieval of structured data, as in Case-Based Reasoning (CBR) approaches, has largely been implemented using centralized storage systems. In such systems, when the cases or records contain both numeric and symbolic attributes, similarity-based retrieval cannot exploit standard speedup techniques based on multi-dimensional indexing, and retrieval is implemented by exhaustively comparing the case to be solved with the whole set of stored cases. In this work, to improve the performance of the case retrieval step and build CBR systems that can scale up to large case bases, we propose a novel approach for storing the case base in a decentralized Peer-to-Peer environment using the notion of Unspecified Ontology. We also develop an algorithm for efficient retrieval of approximated most-similar cases that exploits inherent characteristics of the unspecified ontology. The experiments show that the algorithm successfully retrieves cases that are very close to the most-similar cases, while reducing the number of cases to be compared. Hence, it improves the performance of the retrieval step, the first stage of the CBR problem-solving cycle. Moreover, the distributed nature of our approach eliminates the need for a centralized server, which is not only a computational bottleneck but also a single point of failure.
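To make the centralized baseline concrete, the following is a minimal sketch of exhaustive retrieval over cases with mixed numeric and symbolic attributes. It is an illustration only, not the algorithm proposed in this work: the similarity measure (range-normalized distance for numeric values, exact match for symbolic values, averaged over the query's attributes), the attribute names, and the function names are all assumptions made for the example.

```python
from typing import Dict, List, Tuple, Union

Value = Union[int, float, str]
Case = Dict[str, Value]


def local_similarity(a: Value, b: Value, value_range: float) -> float:
    """Similarity in [0, 1] for one attribute (assumed measure, not from the paper)."""
    if isinstance(a, str) or isinstance(b, str):
        # Symbolic attribute: exact match only.
        return 1.0 if a == b else 0.0
    # Numeric attribute: absolute distance normalized by the attribute's range.
    return max(0.0, 1.0 - abs(a - b) / value_range)


def similarity(query: Case, case: Case, ranges: Dict[str, float]) -> float:
    """Average local similarity over the query's attributes.

    Attributes missing from the stored case contribute 0.
    """
    if not query:
        return 0.0
    total = 0.0
    for name, q_value in query.items():
        if name in case:
            total += local_similarity(q_value, case[name], ranges.get(name, 1.0))
    return total / len(query)


def retrieve_exhaustive(
    query: Case, case_base: List[Case], ranges: Dict[str, float], k: int = 1
) -> List[Tuple[float, Case]]:
    """Compare the query with every stored case and return the k most similar."""
    scored = [(similarity(query, case, ranges), case) for case in case_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical case base and query.
case_base = [
    {"price": 100, "color": "red"},
    {"price": 120, "color": "red"},
    {"price": 100, "color": "blue"},
]
ranges = {"price": 100.0}
query = {"price": 105, "color": "red"}
best = retrieve_exhaustive(query, case_base, ranges, k=1)
```

Every query touches every stored case, so the cost grows linearly with the size of the case base; this is the cost that the approach described above aims to reduce.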