Case-Base Maintenance (CBM) is of great importance when implementing a Computer-Aided Diagnostic (CAD) system using Case-Based Reasoning (CBR). Because learning must avoid degrading the case base, this work aims to build and maintain a quality case base while overcoming the difficulty of assembling labeled case bases, which are traditionally assumed to exist or are determined by human experts. The proposed approach takes advantage of large volumes of unlabeled data to select valuable cases to add to the case base, while monitoring retention to avoid performance degradation and to keep the case base compact. We use machine learning techniques to cope with this challenge: an Active Semi-Supervised Learning approach is proposed to overcome the scarcity of labeled data. To acquire a quality case base, we target its performance criterion. Case selection and retention are assessed according to three combined sampling criteria: informativeness, representativeness, and diversity. We support our approach with empirical evaluations on different benchmark data sets. In these experiments, the proposed approach achieves good classification accuracy with a small number of retained cases, using a small training set as a case base.
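
The three combined sampling criteria can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not define the criteria, so the formulas here are assumptions chosen for illustration (prediction entropy for informativeness, inverse mean distance to the unlabeled pool for representativeness, distance to the nearest retained case for diversity), and the names `score_candidates`, `select_case`, and `weights` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (assumed informativeness measure)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def score_candidates(pool, probs, case_base, weights=(1.0, 1.0, 1.0)):
    """Return one combined score per unlabeled candidate in `pool`.

    pool      -- feature vectors of the unlabeled cases
    probs     -- predicted class probabilities, one vector per candidate
    case_base -- feature vectors of the cases already retained
    weights   -- hypothetical weights for the three criteria
    """
    w_inf, w_rep, w_div = weights
    scores = []
    for i, x in enumerate(pool):
        # Informativeness: uncertain predictions carry more information.
        informativeness = entropy(probs[i])
        # Representativeness: candidates close to the rest of the pool score higher.
        others = [euclidean(x, y) for j, y in enumerate(pool) if j != i]
        representativeness = 1.0 / (1.0 + sum(others) / len(others)) if others else 0.0
        # Diversity: distance to the nearest case already in the case base.
        diversity = min((euclidean(x, c) for c in case_base), default=0.0)
        scores.append(
            w_inf * informativeness + w_rep * representativeness + w_div * diversity
        )
    return scores


def select_case(pool, probs, case_base, weights=(1.0, 1.0, 1.0)):
    """Index of the candidate with the highest combined score."""
    scores = score_candidates(pool, probs, case_base, weights)
    return max(range(len(pool)), key=scores.__getitem__)


if __name__ == "__main__":
    pool = [(0.0, 0.0), (5.0, 5.0)]
    probs = [[0.5, 0.5], [0.99, 0.01]]
    case_base = [(0.0, 0.0)]
    print(select_case(pool, probs, case_base))
```

In this sketch the three terms are on different scales, so in practice each would likely need normalizing before the weighted sum; the equal default weights are a placeholder, not a recommendation.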