Nawei Chen scite author profile

Categorization of biomedical articles is a central task for supporting various curation efforts. It can also form the basis for effective biomedical text mining. Automatic text classification in the biomedical domain is thus an active research area. Contests organized by the KDD Cup (2002) and the TREC Genomics track (since 2003) defined several annotation tasks that involved document classification, and provided training and test data sets. So far, these efforts have focused on analyzing only the text content of documents. However, as was noted in the KDD'02 text mining contest, where figure captions proved to be an invaluable feature for identifying documents of interest, images often provide curators with critical information. We examine the possibility of using information derived directly from image data, and of integrating it with text-based classification, for biomedical document categorization. We present a method for obtaining features from images and for using them, both alone and in combination with text, to perform the triage task introduced in the TREC Genomics track 2004. The task was to determine which documents are relevant to a given annotation task performed by the Mouse Genome Database curators. We show preliminary results, demonstrating that the method has strong potential to enhance and complement traditional text-based categorization methods.

A survey of document image classification: problem statement, classifier architecture and performance evaluation

Chen

Blostein

2006

IJDAR

145

Exploring a new space of features for document classification

Chen

Shatkay

Blostein

2006

Removal of hexavalent chromium in soil by lignin-based weakly acidic cation exchange resin

Chen

Qiu

Huang

et al. 2019

Chinese Journal of Chemical Engineering

Use of Figures in Literature Mining for Biomedical Digital Libraries

Chen

Shatkay

Blostein

Integrating image data into biomedical text categorization