The Universia repository comprises more than 15 million educational resources. The lack of metadata describing these resources complicates their classification, search and retrieval. To overcome this drawback, it was decided to semantically annotate the available educational resources using the ADEGA algorithm. For this purpose, we selected DBpedia, a cross-domain linked data set of more than 3.77 million 'things' with 400 million 'facts', to ensure that the wide range of Universia topics is covered by the ontology. However, this process is extremely expensive computationally: it was estimated to require more than 1600 years of CPU time. In this paper, parallel programming techniques and distributed computing paradigms are combined to complete this semantic annotation in a reasonable time. The cornerstone of the proposal is a resource management and execution framework able to integrate the heterogeneous computing resources at our disposal (grid, cluster and cloud resources). As a result, the problem was solved in less than 180 days, demonstrating that it is perfectly feasible to exploit the advantages of these computing models in the field of linked data.

... ontology. From a semantic point of view, the use of the ADEGA algorithm was very promising in terms of precision and recall. Nevertheless, a preliminary estimate concluded that more than 1640 years of CPU time would be needed to create graphs of depth 3, and 25,000 years for graphs of depth 4 (the semantic richness of a graph is proportional to its depth). Obviously, with these computational requirements, a single computing resource (a personal computer, for instance) or a limited set of resources cannot complete the semantic annotation process. Distributed computing paradigms (grid or cloud computing, for instance) can help make the semantic annotation process computationally viable.
This approach is not entirely novel, and some interesting practical experiences can be found in the scientific literature [5][6][7]. These experiences share two features that must be emphasized. On the one hand, they present software applications or tools for semantically annotating Web documents or pages. These applications were programmed to run on a specific set of computing resources and are therefore tightly coupled to their execution environments and technologies. On the other hand, the annotation processes described are fast. This is because only small and medium-sized collections of data were semantically annotated and, additionally, each data item was annotated by means of a single instance of the selected ontology. Here, we try to take advantage of our knowledge of scientific computing and, unlike previous proposals, provide a solution that is independent of the underlying computing environment and able to handle large-scale annotation processes. In the Universia problem, the complexity of the annotation process is cau...
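The two figures quoted in the abstract (more than 1600 years of CPU time, finished in less than 180 days) imply a rough lower bound on how many cores had to be kept busy on average. The sketch below is only a back-of-the-envelope check, not the authors' method: the function name, the 365.25-day year and the assumption of perfect parallel efficiency are all ours.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# Assumptions (not from the paper): a 365.25-day year and perfect
# parallel efficiency, i.e. no scheduling, transfer or idle overhead.
DAYS_PER_YEAR = 365.25


def required_average_cores(cpu_years: float, wall_clock_days: float) -> float:
    """Average number of cores that must be busy at all times to finish
    `cpu_years` of sequential work within `wall_clock_days`."""
    if cpu_years < 0 or wall_clock_days <= 0:
        raise ValueError("cpu_years must be >= 0 and wall_clock_days > 0")
    return cpu_years * DAYS_PER_YEAR / wall_clock_days


if __name__ == "__main__":
    # 1600 CPU-years and 180 days are the bounds stated in the abstract.
    print(round(required_average_cores(1600, 180)))  # prints 3247
```

Since the CPU time is stated as "more than" 1600 years and the elapsed time as "less than" 180 days, the real average was at least this large; any overhead in the heterogeneous grid, cluster and cloud resources would push the number of cores actually needed higher still.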