Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications - EANL '08 2008
DOI: 10.3115/1631836.1631844

Automatic identification of discourse moves in scientific article introductions

Abstract: This paper reports on the first stage of building an educational tool for international graduate students to improve their academic writing skills. Taking a text-categorization approach, we experimented with several models to automatically classify sentences in research article introductions into one of three rhetorical moves. The paper begins by situating the project within the larger framework of intelligent computer-assisted language learning. It then presents the details of the study with very encouraging …

citations
Cited by 21 publications
(22 citation statements)
references
References 29 publications
“…The F1 score, or the harmonic mean of precision and recall, measures the overall performance of the system for a category (Van Rijsbergen, 1979) and is calculated as: [F1 = 2 × precision × recall / (precision + recall)] Table 4 shows that the move classifier predicted Move 1 and Move 3 with higher precision than Move 2. This result is in agreement with our earlier experimentation where we found that Move 2 is most difficult to identify and that it tends to be misclassified as Move 1 (Pendar & Cotos, 2008). This is not surprising since this time the training data for Move 2 was also considerably sparser than the data for the other two moves (6,039 sentences for Move 1; 1,609 for Move 2; and 2,352 for Move 3).…”
Section: Evaluation and Discussion
citation type: supporting
confidence: 82%
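As a rough illustration of the per-category F1 measure named in the statement above, here is a minimal Python sketch. The function name `f1_score` and the counts in the usage line are hypothetical and chosen only for illustration; they are not results from the cited study.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall for one category.

    Returns 0.0 when precision or recall is undefined (nothing predicted
    or nothing to find), which is a convention assumed here.
    """
    predicted = true_positives + false_positives  # sentences labeled with the category
    actual = true_positives + false_negatives     # sentences that truly belong to it
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a single move category.
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```

Because the harmonic mean is pulled toward the smaller of the two values, a category with high recall but low precision still receives a low F1.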
“…The main features used for the identification of moves and steps were sets of word unigrams and trigrams (i.e., single words and three word sequences) from the annotated corpus. In our earlier work, we found that bigrams (two word sequences) had a negative effect on the classifier (Pendar & Cotos, 2008), which is why we did not experiment with the bigrams here. We also found that our extracted features were not discipline-dependent.…”
Section: Move
citation type: mentioning
confidence: 99%
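The feature set described in the statement above (word unigrams and trigrams, with bigrams left out) could be extracted along these lines. This is only a sketch: the lowercasing, the whitespace tokenization, and the name `ngram_features` are assumptions, since the excerpt does not say how the authors tokenized their corpus.

```python
from collections import Counter


def ngram_features(sentence: str, sizes=(1, 3)) -> Counter:
    """Count word n-grams of the given sizes in one sentence.

    The default sizes (1, 3) give unigrams and trigrams and skip bigrams.
    Tokenization is plain lowercased whitespace splitting (an assumption).
    """
    tokens = sentence.lower().split()
    features = Counter()
    for n in sizes:
        # Slide a window of n tokens across the sentence.
        for start in range(len(tokens) - n + 1):
            features[" ".join(tokens[start:start + n])] += 1
    return features


# Hypothetical example sentence, not taken from the annotated corpus.
features = ngram_features("Little research has addressed this")
```

The resulting counts can then be fed to any sentence-level text classifier as a bag of n-grams.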
“…Then, with the help of preprogrammed scripts, percentages for the move distribution in the student's draft are automatically calculated and compared with the distribution of moves in the corpus of his/her academic field (see Pendar & Cotos, 2008). The classification into moves and the information about the distribution of moves, both in the student draft and in the corpus, are the sources of two forms of feedback: color-coded and numerical (see sample feedback in Appendix C).…”
Section: Intelligent Academic Discourse Evaluator (IADE)
citation type: mentioning
confidence: 99%
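The numerical feedback described in the statement above comes down to computing move percentages for a draft and comparing them with a reference distribution. A minimal sketch follows, assuming each sentence has already been labeled with a move. The function names and all numbers are hypothetical and are not taken from IADE or its corpus.

```python
from collections import Counter


def move_distribution(labels):
    """Return the percentage of sentences assigned to each move."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {move: 100.0 * counts[move] / total for move in sorted(counts)}


def compare_distribution(draft, corpus):
    """Return draft minus corpus percentage for every move in either one."""
    moves = sorted(set(draft) | set(corpus))
    return {move: draft.get(move, 0.0) - corpus.get(move, 0.0) for move in moves}


# Hypothetical sentence labels for a student draft.
draft_labels = ["Move 1", "Move 1", "Move 2", "Move 3"]
# Hypothetical reference percentages for the student's discipline.
corpus_distribution = {"Move 1": 60.0, "Move 2": 16.0, "Move 3": 24.0}

draft_distribution = move_distribution(draft_labels)
difference = compare_distribution(draft_distribution, corpus_distribution)
```

A negative difference means the draft gives a move less space than the reference corpus does, which is the kind of contrast numerical feedback can report back to the student.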
“…Using the annotated corpus data and the statistical properties of n-grams in our corpus, probabilistic language models for predicting the occurrence of moves and steps were built to generate rhetorical feedback (Babu, 2013; Cotos, Gilbert, & Sinapov, 2014). RWT's analysis engine approaches the identification of these discourse units as a supervised classification problem (see Burstein, Marcu, & Knight, 2003; Pendar & Cotos, 2008), where each sentence in a text is considered an independent unit of analysis to be classified into two categories: one corresponding to a move and the second corresponding to a step within the identified move.…”
Section: Computational Operationalization For Pedagogical Use
citation type: mentioning
confidence: 99%