Table recognition with deep learning networks has attracted considerable attention. However, the lack of high-quality table datasets limits the performance of these networks. We therefore propose TableRobot, an automatic annotation method for heterogeneous tables. Specifically, a table's annotations consist of the coordinates of each item block and the mapping between item blocks and table cells. To transform the task, we design an algorithm based on the greedy approach to find the optimum solution. To evaluate TableRobot, we check the annotation data of 3000 tables collected from LaTeX documents on arXiv.com; the results show that TableRobot generates table annotation datasets with an accuracy of 93.2%. In addition, when the annotation data is fed into GraphTSR, a state-of-the-art graph neural network for table recognition, the network's F1 value increases by nearly 10% over its previous result.
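As an illustration of the two annotation components named above (item-block coordinates and the block-to-cell mapping), the following is a minimal sketch of how such a record might be represented. The abstract does not specify TableRobot's actual format, so every class and field name here (`ItemBlock`, `Cell`, `TableAnnotation`, `block_to_cell`) and the coordinate convention are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ItemBlock:
    """A text block on the page; the box convention (x0, y0, x1, y1) is assumed."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str


@dataclass(frozen=True)
class Cell:
    """A logical table cell identified by its grid position (hypothetical schema)."""
    row: int
    col: int


@dataclass
class TableAnnotation:
    """Item blocks, cells, and the mapping from block index to cell index."""
    blocks: List[ItemBlock]
    cells: List[Cell]
    block_to_cell: Dict[int, int]

    def blocks_in_cell(self, cell_index: int) -> List[ItemBlock]:
        # Return the blocks mapped to one cell, in block-index order.
        return [
            self.blocks[b]
            for b, c in sorted(self.block_to_cell.items())
            if c == cell_index
        ]


# Illustrative data only: two blocks share the first cell, one sits in the second.
example = TableAnnotation(
    blocks=[
        ItemBlock(0, 0, 10, 5, "Model"),
        ItemBlock(0, 6, 10, 11, "name"),
        ItemBlock(12, 0, 20, 5, "F1"),
    ],
    cells=[Cell(0, 0), Cell(0, 1)],
    block_to_cell={0: 0, 1: 0, 2: 1},
)
```

Keeping the mapping separate from the blocks allows several item blocks (for example, a multi-line header) to point at the same cell without duplicating cell data.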
Question Answering has recently become a prominent topic in information retrieval research, and Question Classification plays a critical role in most Question Answering systems. In this paper, a new approach to classifying questions using Profile Hidden Markov Models (PHMMs) is proposed. Generalization strategies that extract pattern instances of questions by selective substitution are discussed, and a classification method based on the structural features of these pattern instances is then investigated. Experimental results show that the PHMM-based question classifier reaches an accuracy of 92.2% and significantly outperforms most state-of-the-art systems.