2013
DOI: 10.5121/ijwest.2013.4304

Question Classification using Semantic, Syntactic and Lexical features

Abstract: Question classification is very important for question answering. This paper presents our research work on question classification through a machine learning approach. In order to train the learning model, we designed a rich set of features that are predictive of question categories. An important component of question answering systems is question classification. The task of question classification is to predict the entity type of the answer of a natural language question. Question classification is typically don…

Cited by 48 publications
(32 citation statements)
References 19 publications
“…This is the mostly used set of features, since it usually allows obtaining the best results [6][7][8][9][10]. Unigrams are obtained from tokens by applying two operations: i) elimination of tokens labelled with some lexical tags, like DT, IN, and punctuation; ii) sometimes the remaining unigrams are transformed by the stemming into their radical form.…”
Section: A Lexical Features (mentioning)
confidence: 99%
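
The two operations described in the statement above (dropping tokens by tag, then optionally stemming) can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation: it assumes the question has already been POS-tagged with Penn Treebank tags, and the function name `extract_unigrams`, the lowercasing step, and the optional `stem` callable are assumptions introduced here.

```python
import string

# Tags named in the quoted statement as removed before building unigrams.
DROPPED_TAGS = {"DT", "IN"}


def is_punctuation(token):
    # Assumption: a token made up only of punctuation characters counts as punctuation.
    return all(ch in string.punctuation for ch in token)


def extract_unigrams(tagged_tokens, stem=None):
    """Return unigram features from (token, tag) pairs.

    `stem` is an optional callable (hypothetical hook) applied to each kept
    word, mirroring the "sometimes stemmed" step in the quoted statement.
    """
    unigrams = []
    for token, tag in tagged_tokens:
        if tag in DROPPED_TAGS or is_punctuation(token):
            continue
        word = token.lower()  # lowercasing is an assumption of this sketch
        unigrams.append(stem(word) if stem else word)
    return unigrams


tagged = [
    ("What", "WP"), ("is", "VBZ"), ("the", "DT"), ("capital", "NN"),
    ("of", "IN"), ("France", "NNP"), ("?", "."),
]
print(extract_unigrams(tagged))
```

Passing a stemmer as a callable keeps the stemming step optional, which matches the "sometimes" in the description; any function from string to string can be supplied.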
“…A benchmark dataset built by Li and Roth [3] is widely used in literature, and NLP techniques able to analyze the question are more or less well established [4]. On the other hand, different authors chose to extract from questions a broad variety of features [5], usually divided into lexical, syntactic and semantic [6,7]. Different works focused on the extraction of particular words, like whword and head-word [8], but good results were gained as long as the number of employed features was increased to a very high number [6][7][8][9][10].…”
Section: Introduction (mentioning)
confidence: 99%
“…This feature aims at individuating the most informative word of the question for classification purposes. Introduced in [8], it is widely used and recognized to be useful [3,4,6,8,20]. It is extracted here from the parse tree, by using redefined rules, similar to those proposed by Collins [21], already modified in other works [6,8].…”
Section: Features Extraction and Representation (mentioning)
confidence: 99%
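
As a rough illustration of rule-based head-word extraction from a parse tree, the sketch below walks a constituency tree top-down and, at each node, picks a child according to a per-label priority list. The rule table `HEAD_RULES` is a small hypothetical example written for this sketch; it is not the rule set of Collins or of the cited works, which the statement above does not reproduce. The tree encoding (nested tuples, with leaves as `(tag, word)`) is also an assumption.

```python
# Hypothetical head rules: label -> (search direction, child labels in priority order).
# Illustrative only; not the rules used in the cited papers.
HEAD_RULES = {
    "SBARQ": ("left", ["SQ", "S", "WHNP"]),
    "SQ": ("left", ["NP", "VP", "SQ"]),
    "WHNP": ("left", ["NP", "NN", "NNS", "WHNP"]),
    "NP": ("right", ["NN", "NNS", "NNP", "NNPS", "NP"]),
    "VP": ("left", ["NP", "VP"]),
}


def is_leaf(node):
    # A leaf is a (tag, word) pair whose second element is a string.
    return len(node) == 2 and isinstance(node[1], str)


def head_word(node):
    """Descend from `node` to a leaf by repeatedly choosing the head child."""
    if is_leaf(node):
        return node[1]
    label, children = node[0], node[1:]
    direction, priorities = HEAD_RULES.get(label, ("left", []))
    ordered = children if direction == "left" else tuple(reversed(children))
    for wanted in priorities:
        for child in ordered:
            if child[0] == wanted:
                return head_word(child)
    # No rule matched: fall back to the first child in search order.
    return head_word(ordered[0])


# Parse of "What is the capital of France ?" encoded as nested tuples.
tree = (
    "SBARQ",
    ("WHNP", ("WP", "What")),
    ("SQ",
     ("VBZ", "is"),
     ("NP",
      ("NP", ("DT", "the"), ("NN", "capital")),
      ("PP", ("IN", "of"), ("NP", ("NNP", "France"))))),
    (".", "?"),
)
print(head_word(tree))
```

With this illustrative table the traversal goes SBARQ, SQ, NP, inner NP, and ends at the noun "capital"; a different rule table would generally select different heads.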
“…This is the mostly used set of features, since it usually allows obtaining the best results [3,4,6,8]. Unigrams are obtained from the set of tagged tokens of the question, by eliminating tokens with some tags, like DT, IN, and punctuation.…”
Section: Features Extraction and Representation (mentioning)
confidence: 99%