Abstract. We address the problem of automating the process of deciding whether two data schema elements match (that is, refer to the same actual object or concept), and propose several methods for combining evidence computed by multiple basic matchers. One class of methods uses Bayesian networks to account for the conditional dependency between the similarity values produced by individual matchers that use the same or similar information, so as to avoid overconfidence in match probability estimates and improve the accuracy of matching. Another class of methods relies on optimization switches that mitigate this dependency in a domain-independent manner. Experimental results under several testing protocols suggest that the matching accuracy of the Bayesian composite matchers can significantly exceed that of the individual component matchers, and the careful selection of optimization switches can improve matching accuracy even further.
Springer Link
This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved.
Esenther, A.; Ye, X.; Shiba, M.; Takayama, S.
TR2012-050, June 2012
International Conference on Enterprise Information Systems
Abstract. We propose a method for accurately combining evidence supplied by multiple individual matchers regarding whether or not two data schema elements match (refer to the same object or concept) in the field of automatic schema matching. The method uses a Bayesian network to correctly model the statistical correlations between the similarity values produced by individual matchers that use the same or similar information, in order to avoid overconfidence in match probability estimates and improve the accuracy of matching. Experimental results under several testing protocols suggest that the matching accuracy of the Bayesian composite matcher can significantly exceed that of the individual component matchers.
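The overconfidence that the abstract refers to can be illustrated with a small numeric sketch. It is not the authors' network or data: the chain structure (match variable, then first matcher, then second matcher), the binary "high similarity" observations, and every probability below are assumptions chosen only for illustration. The sketch compares treating two near-duplicate matchers as independent with conditioning the second matcher on the first.

```python
def posterior_match(prior, factors):
    """Return P(match | evidence) from per-observation likelihood factors.

    Each factor is a pair (p_given_match, p_given_nonmatch) for one observed
    similarity value. A factor may be conditioned on earlier observations,
    which is how a dependency between matchers is expressed here.
    """
    joint_match = prior
    joint_nonmatch = 1.0 - prior
    for p_match, p_nonmatch in factors:
        joint_match *= p_match
        joint_nonmatch *= p_nonmatch
    return joint_match / (joint_match + joint_nonmatch)


# All numbers are illustrative assumptions, not values from the paper.
PRIOR = 0.1                  # assumed prior probability that two elements match

# Matcher 1 reports "high similarity":
# P(high | match) and P(high | no match).
S1 = (0.8, 0.2)

# Matcher 2 uses nearly the same information as matcher 1.
# Viewed alone, it looks just as informative as matcher 1.
S2_MARGINAL = (0.8, 0.2)

# Conditioned on matcher 1 already reporting "high", matcher 2 almost always
# agrees whether or not the elements match, so it adds little new evidence.
# These values are consistent with the marginals in S2_MARGINAL.
S2_GIVEN_S1 = (0.95, 0.90)

single = posterior_match(PRIOR, [S1])                # matcher 1 only
naive = posterior_match(PRIOR, [S1, S2_MARGINAL])    # independence assumed
network = posterior_match(PRIOR, [S1, S2_GIVEN_S1])  # dependency modeled

print(f"single matcher:      {single:.3f}")
print(f"naive combination:   {naive:.3f}")
print(f"dependency modeled:  {network:.3f}")
```

Under these assumed numbers, the naive combination reports a match probability of 0.64, while modeling the dependency gives about 0.32, only slightly above the 0.31 obtained from the first matcher alone. The second matcher is counted as fresh evidence in the naive case even though it mostly repeats what the first one already said.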