This paper studies the problem of generating likely queries for multimodal documents with images. Our application scenario is enabling efficient "first-stage retrieval" of relevant documents by attaching generated queries to documents before indexing. We can then index this expanded text to efficiently narrow down candidate matches using an inverted index, so that expensive reranking can follow. Our evaluation results show that our proposed multimodal representation meaningfully improves relevance ranking. More importantly, our framework achieves state-of-the-art performance in first-stage retrieval scenarios.
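The first-stage retrieval scheme described above can be sketched minimally: append each document's generated queries to its text, build an inverted index over the expanded text, and use term lookups to collect candidates for later reranking. The function and variable names below are illustrative, not from the paper.

```python
from collections import defaultdict

def build_index(docs, generated_queries):
    """Append generated queries to each document's text, then build
    a simple inverted index over the expanded text."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        expanded = text + " " + " ".join(generated_queries.get(doc_id, []))
        for term in expanded.lower().split():
            index[term].add(doc_id)
    return index

def first_stage_retrieve(index, query):
    """Return candidate doc ids matching any query term (to be reranked)."""
    candidates = set()
    for term in query.lower().split():
        candidates |= index.get(term, set())
    return candidates
```

For example, a caption-only document expanded with the generated query "cute puppy picture" becomes retrievable for the term "puppy" even though that word never appears in its original text.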
SUMMARY This paper presents a new approach to word spacing problems by mining reliable words from the Web and using them as additional resources. Conventional approaches to automatic word spacing use noise-free data to train the parameters of word spacing models. However, the insufficiency and irrelevancy of training examples is always the main bottleneck in automatic word spacing. To mitigate this data-sparseness problem, this paper proposes an algorithm to discover reliable words on the Web to expand the vocabulary, and a model that uses these words as additional resources. The proposed approach is simple and practical to adapt to new domains. Experimental results show that the proposed approach achieves better performance than conventional word spacing approaches. key words: word spacing, word segmentation
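The core intuition, that a larger vocabulary of reliable words improves word spacing, can be illustrated with a simple dynamic-programming segmenter. This is a generic sketch, not the paper's model: it scores a segmentation by how few out-of-vocabulary fragments it needs, so adding mined words directly improves the output.

```python
def space_words(unspaced, vocab):
    """Insert spaces into an unspaced string by dynamic programming.
    Known (reliable) words cost 0; unknown single characters cost 1,
    so segmentations using vocabulary words are preferred."""
    n = len(unspaced)
    # best[i] = (cost, segmentation) for the prefix of length i
    best = [(float("inf"), [])] * (n + 1)
    best[0] = (0, [])
    max_len = max((len(w) for w in vocab), default=1)
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            piece = unspaced[j:i]
            if piece in vocab:
                cost = 0
            elif len(piece) == 1:
                cost = 1  # fall back to an unknown single character
            else:
                continue
            if best[j][0] + cost < best[i][0]:
                best[i] = (best[j][0] + cost, best[j][1] + [piece])
    return " ".join(best[n][1])
```

With the mined vocabulary {"word", "spacing"}, the input "wordspacing" segments into "word spacing"; without those entries it degrades to character-by-character output, which mirrors the data-sparseness problem the paper targets.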
In this paper, we propose a segment-based annotation tool providing appropriate interactivity between a human annotator and an automatic parser. The proposed annotation tool provides a preview of the complete sentence structure suggested by the parser, and updates the preview whenever the annotator cancels or selects a segmentation point. Thus, the annotator can select the proper sentence segments, maximizing parsing accuracy and minimizing human intervention. Experimental results show that the proposed tool allows the annotator to reduce human intervention by approximately 39% compared with manual annotation. The Sejong Korean treebank, one of the large-scale treebanks, was constructed with the proposed annotation tool. A treebank is a corpus annotated with syntactic information, in which the structural analysis of each sentence is represented as a tree structure. This kind of corpus serves as an extremely valuable resource for computational linguistics applications.