An Ontology for Historical Research Documents

Adorni, Giovanni; Maratea, Marco; Pandolfo, Laura; Pulina, Luca

doi:10.1007/978-3-319-22002-4_2

Cited by 11 publications

(6 citation statements)

References 10 publications


“…it is mostly depending upon user of ontology. Domain Ontologies are constructed for Safety Risk identification to formalize the safety risk knowledge in metro construction [8], for historical documents [9], for university purpose [10].…”

Section: Related Work (mentioning)

confidence: 99%

Development of Research Proposal Selection Based on Domain Ontology using K-Means Categorical Clustering

E.*,

M.,

N.J.

2020

IJITEE


With rapid progress in research across many fields, the selection of research proposals has become an important process for research funding agencies and organizations. When few research proposals are received, clustering them is easy and the selection process is unproblematic. As the number of proposals rises, clustering and selecting them becomes complicated. In current systems, proposals are grouped manually or according to similarities in their subject disciplines, which yields irrelevant results in some cases. The main goal of this research work is to develop an enhanced system for selecting research proposals based on a domain ontology, where the ontology acts as the search criterion for the topics of research proposals. The proposed system will help to select the topics of research proposals systematically, without manual intervention. In this paper, an algorithm called Scikit-learn K-means Multiclass Document Clustering (SKMDC) is proposed to group each subject discipline according to its sub-topics and sub-domains. Here, the k-means clustering technique is applied to categorical data. Because categorical data cannot be used directly in the K-means clustering algorithm, the LabelEncoder method is used to encode the text data as numerical values, and the dimensionality of the dataset is reduced using Principal Component Analysis. This paper also overcomes the weakness of the k-means technique in specifying the number of clusters at the initial stage. This is done by determining the optimal number of clusters with the Elbow Curve method, cross-validated through Silhouette Score analysis.
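The pipeline this abstract describes (encode categorical labels as integers, run k-means, inspect inertia across cluster counts for an elbow) can be illustrated with a minimal sketch. This is not the cited paper's SKMDC implementation: it is a plain-Python stand-in for scikit-learn's LabelEncoder and KMeans, it omits the PCA and Silhouette Score steps, and the `proposals` data, function names, and parameters are all hypothetical.

```python
import random

def label_encode(values):
    """Map each distinct category to an integer, using sorted order."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, inertia)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # New center is the coordinate-wise mean of the cluster.
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    inertia = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return labels, inertia

# Hypothetical proposals described by (discipline, sub-topic) categories.
proposals = [
    ("cs", "ml"), ("cs", "ml"), ("cs", "db"),
    ("bio", "genetics"), ("bio", "genetics"), ("bio", "ecology"),
]
disc, _ = label_encode([p[0] for p in proposals])
sub, _ = label_encode([p[1] for p in proposals])
points = list(zip(disc, sub))

# Elbow inspection: inertia for each candidate number of clusters.
inertias = {k: kmeans(points, k)[1] for k in range(1, 5)}
for k, value in inertias.items():
    print(k, round(value, 3))
```

One caveat with this approach: integer label encoding imposes an arbitrary ordering and distance on categories, so the clusters depend on how the labels happen to sort. The elbow is read off by eye as the value of k after which inertia stops dropping sharply.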


“…The authors aim to create a system capable to automate the ontological-based annotation process of texts from digital libraries. The work is based on the STOLE [31], an ontology-based digital library created from documents about the history of public administration in Italy in the 19th and 20th centuries. For annotation purposes, they considered classes from STOLE to perform the experiment, Article, Event, Institution, Legal System, and Person.…”

Section: Related Work (mentioning)

confidence: 99%

Ontological Semantic Annotation of an English Corpus Through Condition Random Fields

2019


One way to increase the understanding of texts by machines is to add semantic information to lexical items by including metadata tags, a process also called semantic annotation. Several semantic aspects can be added to words, among them information about the nature of the concept denoted, through association with a category of an ontology. The application of ontologies in the annotation task can span multiple domains. However, this particular research focused on top-level ontologies because of their generalizing character. Since annotation is an arduous task that demands time and specialized personnel, much work has gone into ways of performing semantic annotation automatically. Machine learning techniques are the most effective approaches in the annotation process. Another factor of great importance for the success of training supervised learning algorithms is the use of a corpus that is sufficiently large and able to capture the linguistic variance of natural language. In this sense, this article aims to present an automatic approach to enrich documents from the American English corpus through a CRF model for semantic annotation with ontologies from the Schema.org top level. The research uses two approaches of the model, obtaining promising results for the development of semantic annotation based on top-level ontologies. Although it is a new line of research, the use of top-level ontologies for automatic semantic enrichment of texts can contribute significantly to the improvement of text interpretation by machines.


“…According to Øyvind Eide (2014) there are at least six methods to include ontologies into a TEI document using the <relation> element which allows enhancing descriptions by using RDF-OWL ontologies. Nevertheless is a sophisticated method that still needs to be tested, but the perspective of success will allow integrating into TEI documents vocabularies like FOAF, LKIF Core, Bio Vocabulary, and even experimental ontologies for historical documents (Adorni et al, 2015).…”

Section: Semantic Data Modelling (mentioning)

confidence: 99%

Jurisdictional Culture and Memory Digitization of the “Government of Justice.” Data Modeling and Digital Approach for the Legal History of Ibero-America

Gayol

Flórez

2018

Cult. Hist. Digit. J.


Can a machine retrieve the cultural meaning from a corpus of sources? This article addresses the scope and restrictions that digitization, transcription, and data modeling represent for machine-mediated readings of legal, historical records, particularly those derived from the cultural context of the Hispanic empire. It compares the dichotomy between the ambiguous language of ancient regime legal texts and the unambiguity required by machine-readable files. It also problematizes corporal reading and the strategy of distant reading and visualizations as a model for interpreting a vast bulk of textual data. We propose a strategy for segmentation and data modelling for approaching the textual logic of ancient regime legal records, based on their hierarchization, their interrelation with non-judiciary sources (theological, historical, philosophical, etc.), their internal segmentation, the non-linear logic of the normative, and the authoritative requirements of compilations and relevant legal works. It concludes that the advantages of automation depend on the ability to manipulate files without distorting the original meaning of the texts; it therefore proposes the need to develop standardized vocabularies that help to avoid anachronistic approaches to Modern Age legal sources.
