1999
DOI: 10.1111/0824-7935.00085

Retrieving Domain‐Specific Collocations by Co‐occurrences and Word Order Constraints

Abstract: In this paper, we describe a method for automatically retrieving collocations from large text corpora. This method comprises the following stages: (1) extracting strings of characters as units of collocations, and (2) extracting recurrent combinations of strings as collocations. Through this method, various types of domain-specific collocations can be retrieved simultaneously. This method is practical because it uses plain text with no specific language-dependent information, such as lexical knowledge and parts…

Cited by 4 publications (1 citation statement)
References: 8 publications
“…This methodology is language dependent rather than language independent, and the system requires highly specialized linguistic techniques to identify the possible candidate terms. Second, purely statistical systems extract discriminating multiword terms from the text corpora by means of association measures [5,6,7]. As they use plain text corpora and only require the information appearing in texts, such systems are highly flexible and extract relevant units independently from the domain and the language of the input text.…”
Section: Introduction (mentioning)
confidence: 99%