Semi-supervised classifiers combine labeled and unlabeled data during the learning phase to improve the classifier's generalization capability. However, most successful semi-supervised classifiers involve complex ensemble structures and iterative algorithms that make their outcomes difficult to explain, so they behave like black boxes. Furthermore, during an iterative self-labeling process, mistakes can be propagated if no amending procedure is used. In this paper, we build upon an interpretable self-labeling grey-box classifier that uses a black box to estimate the missing class labels and a white box to make the final predictions. We propose a Rough Set-based approach for amending the self-labeling process. We compare its performance to the vanilla version of our self-labeling grey-box and to a confidence-based amending procedure. In addition, we introduce measures to quantify the interpretability of our model. The experimental results suggest that the proposed amending improves the accuracy and interpretability of the self-labeling grey-box, thus leading to superior results when compared to state-of-the-art semi-supervised classifiers.
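The self-labeling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the black box is stood in for by a k-nearest-neighbour vote, the white box by a one-feature threshold rule, and the amending step is assumed to drop pseudo-labels whose vote share falls below a hypothetical `min_conf` threshold and to weight the rest by that share. All function names and the toy data are invented for illustration.

```python
from collections import Counter


def black_box_predict(X_train, y_train, x, k=3):
    """Stand-in black box: k-NN vote. Returns (label, vote share as confidence)."""
    pairs = sorted(
        ((sum((a - b) ** 2 for a, b in zip(xi, x)), yi)
         for xi, yi in zip(X_train, y_train)),
        key=lambda p: p[0],
    )
    votes = Counter(label for _, label in pairs[:k])
    label, count = votes.most_common(1)[0]
    return label, count / k


def fit_white_box(X, y, weights):
    """Stand-in white box: weighted one-feature threshold rule (decision stump)."""
    labels = sorted(set(y))
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for lo in labels:
                for hi in labels:
                    # Weighted error of the rule "x[f] <= t -> lo, else hi".
                    err = sum(w for x, yi, w in zip(X, y, weights)
                              if (lo if x[f] <= t else hi) != yi)
                    if best is None or err < best[0]:
                        best = (err, f, t, lo, hi)
    return best[1:]  # (feature index, threshold, label below, label above)


def white_box_predict(stump, x):
    f, t, lo, hi = stump
    return lo if x[f] <= t else hi


def self_label(X_lab, y_lab, X_unl, min_conf=0.8):
    """Pseudo-label X_unl with the black box, amend, then fit the white box."""
    X_all, y_all, w_all = list(X_lab), list(y_lab), [1.0] * len(X_lab)
    pseudo = []
    for x in X_unl:
        label, conf = black_box_predict(X_lab, y_lab, x)
        # Assumed amending rule: discard low-confidence pseudo-labels and
        # weight the remaining ones by the black box's confidence.
        if conf >= min_conf:
            pseudo.append((x, label, conf))
            X_all.append(x)
            y_all.append(label)
            w_all.append(conf)
    return fit_white_box(X_all, y_all, w_all), pseudo


# Toy one-dimensional data: class "a" on the left, class "b" on the right.
X_lab = [(1.0,), (2.0,), (3.0,), (7.0,), (8.0,), (9.0,)]
y_lab = ["a", "a", "a", "b", "b", "b"]
X_unl = [(1.5,), (5.0,), (8.5,)]

stump, pseudo = self_label(X_lab, y_lab, X_unl)
print(stump)   # the learned rule is directly readable
print(pseudo)  # pseudo-labels that survived amending
```

The point of the design is that only the white box is consulted at prediction time, so the final decision rule stays inspectable even though a less transparent model supplied some of its training labels.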
Information quality and organizational transparency are relevant issues for the corporate governance and sustainability of companies, as they contribute to reducing information asymmetry, decreasing risks, and improving the conduct of decision-makers, ensuring an ethical standard of organizational control. This work uses the COBIT framework of IT governance, knowledge management, and machine learning techniques to evaluate organizational transparency, considering the maturity levels of technology processes applied in 285 companies of southern Brazil. Data mining techniques have been methodologically applied to analyze the 37 processes in four different domains: planning and organization, acquisition and implementation, delivery and support, and monitoring. Four learning techniques for knowledge discovery have been used to build a computational model that allowed us to evaluate the organizational transparency level. The results show the importance of IT performance monitoring and assessment, and of internal control processes, in enabling organizations to improve their levels of transparency. These processes depend directly on the establishment of IT strategic plans and quality management, as well as IT risk and project management; therefore, an improvement in the maturity of these processes implies an increase in the levels of organizational transparency and in their reputational, financial, and accountability impact.
In some machine learning applications, obtaining data instances is relatively easy, but labeling them can be expensive or tedious. Such scenarios lead to datasets with few labeled instances and a larger number of unlabeled ones. Semi-supervised classification techniques combine labeled and unlabeled data during the learning phase to improve the classifier's generalization capability. Regrettably, most successful semi-supervised classifiers do not allow their outcome to be explained, so they behave like black boxes. However, there is an increasing number of problem domains in which experts demand a clear understanding of the decision process. In this paper, we report on an extended experimental study presenting an interpretable self-labeling grey-box classifier that uses a black box to estimate the missing class labels and a white box to make the final predictions. Two different approaches for amending the self-labeling process are explored: the first based on the confidence of the black box and the second based on measures from Rough Set Theory. The results of the extended experimental study support the interpretability by means of transparency and simplicity of