Measuring the similarity between biomedical terms or concepts is an important task in biomedical information extraction and knowledge discovery. Similarity measures and tests are tools for assessing the quality of an ontology or its resources. Semantic similarity techniques fall into three classes: those that use an ontology or taxonomy, those that use training corpora and information content, and hybrids that combine the two. Some measures are based on the path length between concept nodes together with the depth of the least common subsumer (LCS) node in the ontology tree or hierarchy; these measures assign higher similarity when the two concepts lie at the lower levels of the hierarchy. Most semantic similarity measures can be adapted for use in the biomedical (health) domain, and many experiments have been conducted to check their applicability. In this paper, we investigate measuring the semantic similarity between two concepts within a single ontology or across multiple ontologies in the UMLS Metathesaurus (MeSH, SNOMED-CT, ICD), and we compare our results with human expert scores using a correlation coefficient.
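As an illustration of the path-length and LCS-depth idea described above, the following is a minimal sketch, not the paper's implementation. It assumes a hypothetical toy is-a hierarchy (the concept names are invented and not drawn from MeSH, SNOMED-CT, or ICD), uses the well-known Wu-Palmer formula as one example of a depth-aware measure, and includes a plain Pearson correlation of the kind that could be used to compare computed scores with human ratings.

```python
import math

# Hypothetical toy is-a hierarchy (child -> parent); an assumption for
# illustration only, not taken from any UMLS source vocabulary.
PARENT = {
    "disease": None,
    "infection": "disease",
    "neoplasm": "disease",
    "viral_infection": "infection",
    "bacterial_infection": "infection",
    "influenza": "viral_infection",
    "hepatitis": "viral_infection",
}

def path_to_root(concept):
    """Return the list of nodes from the concept up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path

def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))

def lcs(a, b):
    """Least common subsumer: the deepest ancestor shared by a and b."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward, so the first hit is deepest
        if node in ancestors_a:
            return node
    raise ValueError("concepts do not share a root")

def path_length(a, b):
    """Number of edges on the shortest is-a path between a and b."""
    c = lcs(a, b)
    return (depth(a) - depth(c)) + (depth(b) - depth(c))

def path_similarity(a, b):
    """Purely path-based score: ignores where the pair sits in the tree."""
    return 1.0 / (1.0 + path_length(a, b))

def wu_palmer(a, b):
    """Depth-aware score: 2 * depth(LCS) / (depth(a) + depth(b))."""
    return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b))

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

In this toy tree, the pairs (influenza, hepatitis) and (infection, neoplasm) are both two edges apart, so the path-only score cannot tell them apart, whereas the depth-aware score rates the deeper pair as more similar. This is the behavior the abstract attributes to LCS-depth-based measures.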
Most information systems have some missing values because data are unavailable. Missing values reduce both the quality and the quantity of the classification rules generated by a data mining system. They can also influence the coverage percentage and the number of reducts generated, and they make it difficult to extract useful information from the data set. Solving the missing-data problem is therefore a high priority in data mining and knowledge discovery. Replacing missing values with a specific value should not affect the quality of the data. Four different models for dealing with missing data were studied. A framework is established that removes inconsistencies before and after filling the missing attribute values with the new expected value generated by one of the four models. Comparative results are discussed and recommendations are given.
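The remove-fill-remove pipeline described above can be sketched as follows. This is a hedged illustration only: the abstract does not name the four models, so most-common-value filling is used here as a stand-in assumption, and the function names and the representation of a decision table as a list of dictionaries (with `None` marking a missing value) are hypothetical.

```python
from collections import Counter, defaultdict

def fill_most_common(rows, attributes):
    """Replace None in each attribute with that attribute's most common
    observed value (a stand-in filling model, not one from the paper)."""
    filled = [dict(r) for r in rows]
    for attr in attributes:
        observed = [r[attr] for r in rows if r[attr] is not None]
        if not observed:
            continue  # nothing to learn a replacement from
        mode = Counter(observed).most_common(1)[0][0]
        for r in filled:
            if r[attr] is None:
                r[attr] = mode
    return filled

def remove_inconsistent(rows, condition_attrs, decision_attr):
    """Drop every row whose condition values also occur with a different
    decision value elsewhere in the table."""
    decisions = defaultdict(set)
    for r in rows:
        key = tuple(r[a] for a in condition_attrs)
        decisions[key].add(r[decision_attr])
    return [
        r for r in rows
        if len(decisions[tuple(r[a] for a in condition_attrs)]) == 1
    ]

def clean(rows, condition_attrs, decision_attr):
    """Remove inconsistencies, fill missing values, then remove the
    inconsistencies that the filling step may have introduced."""
    rows = remove_inconsistent(rows, condition_attrs, decision_attr)
    rows = fill_most_common(rows, condition_attrs)
    return remove_inconsistent(rows, condition_attrs, decision_attr)
```

The second consistency pass matters because a filled-in value can make a previously unique row collide with existing rows that carry a different decision, which is one way replacement can degrade data quality.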