Introduction

Machine-readable knowledge bases store datasets so that the data can be accessed by computer systems. Machine-readable knowledge base construction (MRKBC) involves the automated extraction and integration of data from different sources to generate meaningful, interoperable knowledge [1]. There is a large body of research on the automatic extraction of information for MRKBC. Initially, this research focused on syntactic information extraction [2, 3], but more recently the extraction of lexical semantic information has attracted greater interest from the research community [4, 5]. Knowledge base systems (KBS) that use traditional databases are not effective due to the limited operational and analytical workload and
Abstract

Constructing an ontology-based machine-readable knowledge base system from different sources with minimum human intervention, also known as ontology-based machine-readable knowledge base construction (OMRKBC), has been a long-standing open problem. One of the issues is how to build a large-scale OMRKBC process with appropriate structural information. To address this issue, we propose Natural Language Independent Knowledge Representation (NLIKR), a method that regards each word as a concept defined by its relations with other concepts. Using NLIKR, we propose a framework for the OMRKBC process that automatically develops a comprehensive ontology-based machine-readable knowledge base system (OMRKBS) from well-built structural information. Firstly, as part of this framework, we propose formulas to discover concepts and their relations in the OMRKBS. Secondly, we resolve the challenges in obtaining rich structured information by developing algorithms and rules. Finally, rich structured information is built in the OMRKBS. OMRKBC allows the efficient search of words and supports word queries with a specific attribute. We conduct experiments on relational information extraction and analyze the results, which show that OMRKBS achieved an accuracy of 84%, higher than that of the other knowledge base systems compared, namely ConceptNet, DBpedia and WordNet.
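
The idea that a word is a concept defined by its relations with other concepts, and that such a store can answer word queries on a specific attribute, can be illustrated with a minimal sketch. This is not the NLIKR method or the OMRKBS implementation: the class `ConceptStore`, its methods, and the relation names `is_a` and `has_color` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ConceptStore:
    """Toy store in which a concept is described only by its relations.

    Hypothetical illustration; not the data model used by OMRKBS.
    """

    def __init__(self):
        # concept -> relation name -> set of related concepts
        self._relations = defaultdict(lambda: defaultdict(set))

    def add_relation(self, concept, relation, target):
        """Record that `concept` is linked to `target` by `relation`."""
        self._relations[concept][relation].add(target)
        # Register the target as a concept too, even with no relations yet.
        self._relations[target]

    def query(self, concept, relation):
        """Return the concepts linked to `concept` by `relation`, sorted."""
        if concept not in self._relations:
            return []
        return sorted(self._relations[concept].get(relation, ()))

    def concepts_with(self, relation, target):
        """Return every concept whose `relation` includes `target`, sorted."""
        return sorted(
            concept
            for concept, rels in self._relations.items()
            if target in rels.get(relation, ())
        )


# Example data (invented for this sketch).
store = ConceptStore()
store.add_relation("apple", "is_a", "fruit")
store.add_relation("apple", "has_color", "red")
store.add_relation("cherry", "has_color", "red")

print(store.query("apple", "is_a"))
print(store.concepts_with("has_color", "red"))
```

Here `query` looks up one attribute of a given word, while `concepts_with` answers the reverse question of which words carry a given attribute value. A real system would need relation extraction, disambiguation and persistent storage, none of which this sketch attempts.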