Important information that relates to a specific topic in a document is often organized in tabular format to assist readers with information retrieval and comparison, which may be difficult to provide in natural language. However, tabular data in unstructured digital documents, e.g. Portable Document Format (PDF) and images, are difficult to parse into structured machine-readable format, due to complexity and diversity in their structure and style. To facilitate image-based table recognition with deep learning, we develop the largest publicly available table recognition dataset PubTabNet 1 , containing 568k table images with corresponding structured HTML representation. PubTabNet is automatically generated by matching the XML and PDF representations of the scientific articles in PubMed Central TM Open Access Subset (PMCOA). We also propose a novel attention-based encoder-dual-decoder (EDD) architecture that converts images of tables into HTML code. The model has a structure decoder which reconstructs the table structure and helps the cell decoder to recognize cell content. In addition, we propose a new Tree-Edit-Distance-based Similarity (TEDS) metric for table recognition. The experiments demonstrate that the EDD model can accurately recognize complex tables solely relying on the image representation, outperforming the state-ofthe-art by 9.7% absolute TEDS score.
No abstract
While cochlear implants have helped hundreds of thousands of individuals, it remains difficult to predict the extent to which an individual’s hearing will benefit from implantation. Several publications indicate that machine learning may improve predictive accuracy of cochlear implant outcomes compared to classical statistical methods. However, existing studies are limited in terms of model validation and evaluating factors like sample size on predictive performance. We conduct a thorough examination of machine learning approaches to predict word recognition scores (WRS) measured approximately 12 months after implantation in adults with post-lingual hearing loss. This is the largest retrospective study of cochlear implant outcomes to date, evaluating 2,489 cochlear implant recipients from three clinics. We demonstrate that while machine learning models significantly outperform linear models in prediction of WRS, their overall accuracy remains limited (mean absolute error: 17.9-21.8). The models are robust across clinical cohorts, with predictive error increasing by at most 16% when evaluated on a clinic excluded from the training set. We show that predictive improvement is unlikely to be improved by increasing sample size alone, with doubling of sample size estimated to only increasing performance by 3% on the combined dataset. Finally, we demonstrate how the current models could support clinical decision making, highlighting that subsets of individuals can be identified that have a 94% chance of improving WRS by at least 10% points after implantation, which is likely to be clinically meaningful. We discuss several implications of this analysis, focusing on the need to improve and standardize data collection.
ROUGE is one of the first and most widely used evaluation metrics for text summarization. However, its assessment merely relies on surface similarities between peer and model summaries. Consequently, ROUGE is unable to fairly evaluate summaries including lexical variations and paraphrasing. We propose a graph-based approach adopted into ROUGE to evaluate summaries based on both lexical and semantic similarities. Experiment results over TAC AESOP datasets show that exploiting the lexico-semantic similarity of the words used in summaries would significantly help ROUGE correlate better with human judgments.
Geolocated social media data provide a powerful source of information about places and regional human behavior. Because only a small amount of social media data have been geolocation-annotated, inference techniques play a substantial role to increase the volume of annotated data. Conventional research in this area has been based on the text content of posts from a given user or the social network of the user, with some recent crossovers between the text-and network-based approaches. This paper proposes a novel approach to categorize highly-mentioned users (celebrities) into Local and Global types, and consequently use Local celebrities as location indicators. A label propagation algorithm is then used over the refined social network for geolocation inference. Finally, we propose a hybrid approach by merging a text-based method as a back-off strategy into our network-based approach. Empirical experiments over three standard Twitter benchmark data sets demonstrate that our approach outperforms state-of-the-art user geolocation methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.