When searching for information, a human reader first glances over a document, spots relevant sections, and then focuses on a few sentences to resolve her information need. However, the high variance of document structure makes it difficult to identify the salient topic of a given section at a glance. To tackle this challenge, we present SECTOR, a model that supports machine reading systems by segmenting documents into coherent sections and assigning topic labels to each section. Our deep neural network architecture learns a latent topic embedding over the course of a document. This embedding can be leveraged to classify local topics from plain text and to segment a document at topic shifts. In addition, we contribute WikiSection, a publicly available dataset with 242k labeled sections in English and German from two distinct domains: diseases and cities. From our extensive evaluation of 20 architectures, we report a best score of 71.6% F1 for the segmentation and classification of 30 topics from the English city domain, achieved by our SECTOR LSTM model with bloom filter embeddings and bidirectional segmentation. This is a significant improvement of 29.5 points F1 over state-of-the-art CNN classifiers with baseline segmentation. Our source code is available under the Apache License 2.0 at https
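The abstract does not spell out the architecture; the following is a minimal sketch of how such a topic-embedding segmenter could look. The sentence encoder, dimensions, and cosine-distance segmentation heuristic are illustrative assumptions, not the published SECTOR model.

```python
# Minimal sketch of a SECTOR-style topic segmenter, not the authors' code.
# Hyperparameters, the sentence encoder, and the segmentation heuristic
# (cosine distance between adjacent latent topic vectors) are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SectorSketch(nn.Module):
    def __init__(self, sent_dim=256, hidden=128, num_topics=30):
        super().__init__()
        # A bidirectional LSTM reads the sequence of sentence embeddings and
        # produces a latent topic embedding for every sentence.
        self.lstm = nn.LSTM(sent_dim, hidden, bidirectional=True, batch_first=True)
        self.topic_head = nn.Linear(2 * hidden, num_topics)

    def forward(self, sent_embs):               # (batch, n_sents, sent_dim)
        topic_embs, _ = self.lstm(sent_embs)    # (batch, n_sents, 2*hidden)
        topic_logits = self.topic_head(topic_embs)
        return topic_embs, topic_logits

def segment_at_topic_shifts(topic_embs, threshold=0.35):
    """Place a boundary where adjacent latent topic vectors diverge."""
    sims = F.cosine_similarity(topic_embs[:-1], topic_embs[1:], dim=-1)
    return [i + 1 for i, s in enumerate(sims.tolist()) if 1.0 - s > threshold]

# Usage with random sentence embeddings standing in for a real encoder:
model = SectorSketch()
doc = torch.randn(1, 12, 256)                   # one document, 12 sentences
embs, logits = model(doc)
boundaries = segment_at_topic_shifts(embs[0])   # indices of topic shifts
topics = logits.argmax(dim=-1)                  # one topic label per sentence
```

In this reading, segmentation and topic classification share the same latent representation: boundaries fall where the topic embedding changes sharply, and each resulting section inherits the labels predicted for its sentences.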
We report results on benchmarking Open Information Extraction (OIE) systems using RelVis, a toolkit for benchmarking OIE systems. Our comprehensive benchmark comprises three data sets from the news domain and one data set from Wikipedia, with 4,522 labeled sentences and 11,243 binary or n-ary OIE relations in total. On these data sets we compared the performance of four popular OIE systems: ClausIE, OpenIE 4.2, Stanford OpenIE, and PredPatt. In addition, we evaluated the impact of five common error classes on a subset of 749 n-ary tuples. From our in-depth analysis we derive important research directions for a next generation of OIE systems.
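The abstract does not state how extractions are matched against the gold annotations; purely as a rough illustration of tuple-level benchmarking, the sketch below scores system output against gold tuples using exact matching after simple normalization, which is an assumption rather than the matching rule used in RelVis.

```python
# Hedged sketch of scoring OIE extractions against gold tuples; the matching
# criterion (exact argument match after lowercasing) is an assumption, not
# the rule used in the RelVis benchmark.
def normalize(tup):
    return tuple(part.strip().lower() for part in tup)

def score(system_tuples, gold_tuples):
    sys_set = {normalize(t) for t in system_tuples}
    gold_set = {normalize(t) for t in gold_tuples}
    tp = len(sys_set & gold_set)
    precision = tp / len(sys_set) if sys_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

# Example: one gold binary relation and two system extractions.
gold = [("Barack Obama", "was born in", "Hawaii")]
sys_out = [("barack obama", "was born in", "hawaii"),
           ("Hawaii", "is", "a state")]
print(score(sys_out, gold))  # (0.5, 1.0, 0.666...)
```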
Color is one of the most common ways to convey information in visualization applications. Color vision deficiency (CVD) affects approximately 200 million individuals worldwide and considerably degrades their ability to understand such content by creating red-green or blue-yellow ambiguities. While several content-specific methods have been proposed to resolve these ambiguities, they cannot do so effectively in many situations for content with a large variety of colors. More importantly, they cannot facilitate color identification. We propose a technique for using patterns to encode color information for individuals with CVD, in particular for dichromats. We present the first content-independent method to overlay patterns on colored visualization content that not only minimizes ambiguities but also allows color identification. Further, since overlaying patterns does not compromise the underlying original colors, it does not hamper the perception of normal trichromats. We validated our method with two user studies: one including 11 subjects with CVD and 19 normal trichromats, focusing on images that use colors to represent multiple categories; and another including 16 subjects with CVD and 22 normal trichromats, which considered a broader set of images. Our results show that overlaying patterns significantly improves the performance of dichromats in several color-based visualization tasks, making their performance nearly comparable to that of normal trichromats. More interestingly, the patterns augment color information in a positive manner, allowing normal trichromats to perform with greater accuracy.
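How the patterns are chosen and rendered is defined in the paper itself; purely to illustrate the general idea of a pattern overlay that preserves the underlying colors, the sketch below assigns each color category its own stripe orientation and draws the stripes by slightly darkening pixels. The orientation rule, spacing, and darkening factor are assumptions, not the authors' design.

```python
# Illustrative sketch (not the paper's method): overlay a distinct stripe
# pattern on each color category so dichromats can tell categories apart
# while the original colors remain visible for normal trichromats.
import numpy as np

def overlay_patterns(image, category_map, spacing=6, alpha=0.35):
    """image: (H, W, 3) float RGB in [0, 1]; category_map: (H, W) int labels.
    Each category gets stripes at its own angle (assumed: 45 degrees apart);
    stripe pixels are darkened slightly so the underlying hue is preserved."""
    h, w, _ = image.shape
    out = image.copy()
    ys, xs = np.mgrid[0:h, 0:w]
    for cat in np.unique(category_map):
        theta = np.deg2rad(cat * 45)                 # one orientation per category
        coord = xs * np.cos(theta) + ys * np.sin(theta)
        stripes = (coord.astype(int) % spacing) == 0
        mask = stripes & (category_map == cat)
        out[mask] = out[mask] * (1.0 - alpha)        # darken along stripe lines
    return out
```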