In this paper, we describe experiments on identifying a person using a novel correlated corpus of text and audio samples of that person's communication in six genres. The text genres comprise essays, emails, blogs, and chat; the audio genres, individual interviews and group discussions, were transcribed to text. For each genre, samples were collected on six topics. We show that we can identify the communicant with 71% accuracy under six-fold cross-validation, using an average of 22,000 words per individual across the six genres. For person identification in a particular genre (train on five genres, test on one), an average accuracy of 82% is achieved; for identification from topics (train on five topics, test on one), the average accuracy is 94%. We also report results on identifying a person's communication in a genre using the text genres alone as well as the audio genres alone.
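The train-on-five, test-on-one protocol above is a leave-one-group-out evaluation. The following is a minimal sketch of that protocol; the features (relative word frequencies) and the nearest-centroid author classifier are illustrative assumptions, not the paper's actual model.

```python
# Leave-one-genre-out author identification: hold out each genre in turn,
# build one word-frequency "profile" per author from the remaining genres,
# and assign each held-out sample to the author with the closest profile.
from collections import Counter

def bow(text):
    """Relative word-frequency vector for one sample."""
    words = text.lower().split()
    return {w: c / len(words) for w, c in Counter(words).items()}

def cosine(u, v):
    num = sum(u[w] * v.get(w, 0.0) for w in u)
    du = sum(x * x for x in u.values()) ** 0.5
    dv = sum(x * x for x in v.values()) ** 0.5
    return num / (du * dv) if du and dv else 0.0

def centroid(vectors):
    """Mean of a list of sparse word-frequency vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {w: x / len(vectors) for w, x in total.items()}

def leave_one_genre_out(samples):
    """samples: list of (author, genre, text). Returns overall accuracy."""
    genres = sorted({g for _, g, _ in samples})
    correct = tried = 0
    for held_out in genres:
        train = [(a, bow(t)) for a, g, t in samples if g != held_out]
        test = [(a, bow(t)) for a, g, t in samples if g == held_out]
        # one centroid per author from the five training genres
        by_author = {}
        for a, v in train:
            by_author.setdefault(a, []).append(v)
        profiles = {a: centroid(vs) for a, vs in by_author.items()}
        for true_author, v in test:
            pred = max(profiles, key=lambda a: cosine(v, profiles[a]))
            correct += pred == true_author
            tried += 1
    return correct / tried
```

The same loop evaluates the leave-one-topic-out setting by grouping samples on topic instead of genre.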
Content extraction systems can automatically extract entities and relations from raw text and use the information to populate knowledge bases, potentially eliminating the need for manual data discovery and entry. Unfortunately, content extraction is not sufficiently accurate for end-users who require high trust in the information uploaded to their databases, creating a need for human validation and correction of extracted content. In this paper, we examine content extraction errors and explore their influence on a prototype semi-automated system that allows a human reviewer to correct and validate extracted information before uploading it, focusing on the identification and correction of precision errors. We applied content extraction to six different corpora and used a Goals, Operators, Methods, and Selection rules Language (GOMSL) model to simulate the activities of a human using the prototype system to review extraction results, correct precision errors, ignore spurious instances, and validate information. We compared the simulated task completion rate of the semi-automated system model with that of a second GOMSL model that simulates the steps required for finding and entering information manually. Results quantify the efficiency advantage of the semi-automated workflow and illustrate the value of employing multidisciplinary quantitative methods to calculate system-level measures of technology utility.
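GOMS-family models predict task completion time by summing the durations of primitive operators in each task decomposition. The sketch below uses standard Keystroke-Level Model (KLM) operator durations to contrast two hypothetical decompositions; both operator sequences are illustrative assumptions, not the paper's actual GOMSL models.

```python
# Keystroke-Level Model style time estimate: total time for a task is
# the sum of the durations of its primitive operators.
OPERATOR_SECONDS = {
    "K": 0.28,  # press a key
    "P": 1.10,  # point with the mouse
    "B": 0.10,  # mouse button press or release
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def task_time(operators):
    """Predicted completion time (seconds) for a sequence of KLM operators."""
    return sum(OPERATOR_SECONDS[op] for op in operators)

# Hypothetical manual entry of one fact: locate it in the text, type
# ~20 characters into a form, then move to the mouse and click Save.
manual = ["M", "M"] + ["K"] * 20 + ["H", "P", "B", "B"]
# Hypothetical semi-automated review: inspect the extracted instance
# and click Validate.
review = ["M", "P", "B", "B"]
```

Comparing `task_time(manual)` with `task_time(review)` yields the kind of per-task efficiency ratio that, aggregated over a corpus, gives the system-level completion-rate comparison described above.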
Important information from unstructured text is typically entered into knowledge bases manually, which limits the quantity of data captured. Automated information extraction from text could assist with this process, but the technology is not yet accurate enough. The task therefore requires a suitable user interface for correcting the frequent extraction errors and validating the proposed assertions a user wants to enter into a knowledge base. In this paper, we discuss our system for semi-automatic database population and how it handles the issues that arise in content extraction and knowledge base population. The main contributions of this work are identifying the challenges in building such a semi-automated tool, categorizing extraction errors, addressing the gaps in current extraction technology that database population requires, and designing and developing a usable interface and system, FEEDE, that supports correcting content extraction output and speeds up data entry into knowledge bases. To our knowledge, this is the first effort to populate knowledge bases using content extraction from unstructured text.