Table understanding methods extract, transform, and interpret the information contained in tabular data embedded in documents and files of different formats. Such automatic understanding makes it possible to exploit tabular information to answer queries accurately, to integrate heterogeneous repositories of information into a common knowledge base, or to exchange information among different sources. The purpose of this survey is to provide a comprehensive analysis of the research efforts devoted so far to the problem of table understanding and to describe systems that support the transformation of heterogeneous tables into meaningful information.
This article is categorized under:
Application Areas > Data Mining Software Tools
Technologies > Data Preprocessing
Technologies > Structure Discovery and Clustering
Many companies and institutions use spreadsheets to manage, publish, and share their data. Though effective, spreadsheets are designed mainly to be interpreted by humans, and automatically extracting and interpreting their content is a complex task. The task becomes even harder when tables contain various kinds of mistakes and have complex layouts. In this paper, we outline the approach that we intend to develop during the PhD to answer the research question "how to semi-automatically extract coherent semantic information from heterogeneous and complex spreadsheets?"