Missing entity label values are one of the problems faced when applying Knowledge Graphs. Existing methods for automatically assigning entity label values still have shortcomings: they consume more resources during training, and they assign label values inaccurately because they lack entity semantics. This paper targets domain-specific Knowledge Graphs in which the initial entity label values of all triples are completely unknown, and proposes an Entity Label Value Assignment Method (ELVAM) based on external resources and entropy. ELVAM first groups triples by relationship type into Relationship Triples Clusters and randomly samples triples from each cluster to form a Relationship Triples Subset. It then collects extended semantic text for the entities in the subset from external resources and extracts the nouns. The Information Entropy and Conditional Entropy of these nouns are calculated over an Ontology Category Hierarchy Graph to obtain entity label values of moderate granularity. Finally, the Label Triples Pattern of each Relationship Triples Cluster is summarized, and each entity is assigned a label value according to that pattern. Experimental results verify the effectiveness of ELVAM in assigning entity label values in Knowledge Graphs.
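The abstract above does not give ELVAM's formulas, so the following is only a minimal sketch of the two standard quantities it names: Shannon information entropy and conditional entropy. The function names, the (category, noun) pairing, and the toy data are assumptions made for illustration, not the paper's implementation.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a frequency table such as a Counter."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )


def conditional_entropy(pairs):
    """H(Y | X) in bits for a list of (x, y) observations."""
    total = len(pairs)
    by_x = {}
    for x, y in pairs:
        by_x.setdefault(x, Counter())[y] += 1
    # Weight the entropy of Y within each X group by that group's share.
    return sum((sum(c.values()) / total) * entropy(c) for c in by_x.values())


# Hypothetical data: nouns extracted for entities, paired with an assumed
# parent category from an ontology hierarchy.
observations = [
    ("person", "actor"),
    ("person", "actor"),
    ("person", "singer"),
    ("place", "city"),
]
noun_entropy = entropy(Counter(noun for _, noun in observations))
noun_given_category = conditional_entropy(observations)
```

In this toy data the noun distribution alone carries 1.5 bits, and conditioning on the category lowers that to about 0.69 bits. A drop of this kind is one way a category level could be judged informative, although how ELVAM actually selects a label of moderate granularity is not specified in the abstract.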
Bug report-oriented program error location is well targeted and low in cost, which makes it an important direction in current research on program error location. This type of research takes bug reports and source code as input sources and locates program errors by establishing a mapping between the two through semantic mapping strategies. In fine-grained program error location scenarios, however, location accuracy is greatly reduced. Existing empirical studies analyze the difference in location accuracy from two aspects, input source data noise and semantic mapping strategy selection, but most of them take established location tools and methods as the object of evaluation, use a single type of evaluation data, and lack fine-grained analysis of the key variables from which location methods are constructed. To evaluate the influence of these key variables on location accuracy, this paper decouples the location method with a pseudo-siamese network, measures the sensitivity of location accuracy by counting the accuracy gain under different input source data types, and adds further input source data types and a variety of semantic mapping strategies. Based on an evaluation of 23808 bug reports and the corresponding source code from 7 open source projects published on JIRA, this paper provides a more detailed empirical basis for the selection and weight allocation of additional data types, and for the combined learning of multiple data types and different semantic mapping strategies, in fine-grained program error location.
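The idea of measuring sensitivity through accuracy gain can be illustrated with a small sketch. The abstract does not state which accuracy metric is used, so top-k accuracy is an assumption here, and the function names and toy rankings are hypothetical.

```python
def top_k_accuracy(ranked_lists, truths, k):
    """Fraction of bug reports with at least one true faulty unit in the top k."""
    hits = sum(
        1 for ranked, truth in zip(ranked_lists, truths) if truth & set(ranked[:k])
    )
    return hits / len(truths)


def accuracy_gain(baseline_accuracy, variant_accuracy):
    """Gain in accuracy after adding an input source data type."""
    return variant_accuracy - baseline_accuracy


# Hypothetical rankings for two bug reports: the baseline uses one input
# data type, the variant adds a second one.
truths = [{"b"}, {"z"}]
baseline_rankings = [["a", "b", "c"], ["x", "y", "z"]]
variant_rankings = [["b", "a", "c"], ["z", "x", "y"]]

baseline = top_k_accuracy(baseline_rankings, truths, k=2)
variant = top_k_accuracy(variant_rankings, truths, k=2)
gain = accuracy_gain(baseline, variant)
```

Comparing the gain across different added data types gives a simple sensitivity ranking. The paper's own procedure, which decouples the method through a pseudo-siamese network, is more involved than this.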