Please cite this article in press as: Y. Chen et al., Feature assembly method for extracting relations in Chinese, Artificial Intelligence (2015), http://dx.
AbstractThe goal of relation extraction is to detect relations between two entities in free text. In a sentence, a relation instance usually comprises a small number of words, which yields a sparse feature representation. To make better use of limited information in a relation instance, parsing trees and combined features are employed widely to capture the local dependencies of relation instances. However, the performance of parsing tree-based systems is often degraded by chunking or parsing errors. Combined features are used widely, but few studies have addressed how features can be combined to achieve optimal performance. Thus, in this study, we propose a feature assembly method for relation extraction. Six types of candidate features (head noun, POS tag, n-gram, omni-word, etc.) are employed as atomic features and six constraint conditions (singleton, position, syntax, etc.) are used to combine these features in different settings. Depending on the utilization of candidate features, different constraint conditions can be explored to achieve the optimal extraction performance. Our method is effective for capturing local dependencies and it reduces the errors caused by inaccurate parsing. We tested the proposed method using the ACE 2005 Chinese and English corpora, and it achieved state-of-the-art performance, where it was significantly superior to existing methods.