Background: In our research project, we developed a scoring rubric for second language (L2) summary writing by English as a foreign language (EFL) students in Japanese universities. This study examined the applicability of our five-dimensional rubric, which features both analytic and holistic assessments, to classrooms in the EFL context. The examination focused especially on a newly added, optional overall quality dimension and two paraphrasing dimensions: paraphrase (quantity) and paraphrase (quality). Methods: Six teacher raters evaluated 16 summaries written by Japanese EFL university students using our new rubric. The scores were compared quantitatively with those from a commonly used rubric developed by the Educational Testing Service (ETS). For the qualitative examination, the teacher raters' retrospective comments on our five-dimensional rubric were analyzed. Results: The quantitative examination yielded the following positive results: (a) adding the optional overall quality dimension improved the reliability of our rubric, (b) the overall quality dimension worked well even when used alone, (c) our rubric and the ETS holistic rubric overlapped moderately as L2 summary writing assessments, and (d) the two paraphrasing dimensions covered similar but distinct aspects of paraphrasing. However, a simulation based on generalizability theory (G theory) indicated that the reliability (generalizability coefficients) of the rubric was not high when the number of raters decreased. The qualitative examination of the raters' perceptions of our rubric generally confirmed the quantitative results. Conclusion: This study confirmed the applicability of our rubric to EFL classrooms. This new type of rating scale can be characterized as a "hybrid" approach that offers the user a choice of analytic and holistic measures depending on individual purposes, which can enhance teachers' explicit instruction.
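The G theory finding in the abstract above, that reliability falls as raters are removed, can be illustrated with a minimal sketch. It assumes the simplest crossed persons-by-raters design, in which the relative generalizability coefficient is the person variance divided by the person variance plus the residual variance averaged over raters. The variance components below are hypothetical and are not taken from the study, whose actual design (with several rubric dimensions) is likely more complex.

```python
def g_coefficient(var_person: float, var_residual: float, n_raters: int) -> float:
    """Relative G coefficient for a crossed persons-by-raters design.

    var_person   -- variance due to true differences among writers
    var_residual -- person-by-rater interaction plus error variance
    n_raters     -- number of raters whose scores are averaged
    """
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    return var_person / (var_person + var_residual / n_raters)


# Hypothetical variance components, chosen only for illustration.
VAR_PERSON = 1.0
VAR_RESIDUAL = 1.0

if __name__ == "__main__":
    # Project reliability for shrinking rater panels (a simple D study).
    for n in (6, 3, 2, 1):
        print(n, round(g_coefficient(VAR_PERSON, VAR_RESIDUAL, n), 3))
```

Because the residual term is divided by the number of raters, the coefficient can only decrease as the panel shrinks, which is consistent with the pattern the abstract reports.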
Although the importance of summary writing is well documented in prior studies, few have investigated the evaluation of written summaries. Because L2 summary writing is complex, requiring the writer to read the original material and summarize its content in the L2, raters often emphasize different features when judging the quality of L2 summaries. Therefore, this study examines the ratings of English-language summaries written by Japanese university students in order to identify differences in EFL instructors' evaluations. Fifty-one Japanese EFL university students read a passage and then wrote an English summary without receiving any instructions concerning summary composition. The raters were three native English speakers (NESs) and three non-native English speakers (NNESs), who individually evaluated each summary using the Educational Testing Service's holistic rubric. Analysis of inter-rater reliability revealed a lower Cronbach's alpha coefficient for NNES raters (α = .39) than for NES raters (α = .77). Comments were collected from raters regarding the difficulty of evaluating summaries, and the causes of such difficulties were examined. Comments from NNES raters were more concerned with vocabulary use and paraphrasing, whereas the NES raters concentrated on content and language. This study also explores ways to potentially improve the holistic rubric by examining feedback from raters regarding their rating experiences.
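As a minimal sketch of how a Cronbach's alpha such as the ones reported above might be computed for a group of raters, the function below treats each rater as an "item" and each summary as a case. The score matrix is hypothetical, and the study's own computation (software, handling of missing ratings) is not described in the abstract.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha with raters as items.

    scores -- list of rows, one row per summary,
              each row holding one score per rater.
    """
    n_raters = len(scores[0])
    if n_raters < 2 or len(scores) < 2:
        raise ValueError("need at least 2 raters and 2 summaries")
    # Sample variance of each rater's scores across summaries.
    rater_vars = [variance([row[j] for row in scores]) for j in range(n_raters)]
    # Sample variance of the summed score per summary.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return n_raters / (n_raters - 1) * (1 - sum(rater_vars) / total_var)


# Hypothetical ratings: 4 summaries scored by 3 raters.
ratings = [
    [3, 3, 4],
    [2, 2, 2],
    [5, 4, 5],
    [1, 2, 1],
]

if __name__ == "__main__":
    print(round(cronbach_alpha(ratings), 3))
```

A low alpha for a rater group indicates that the raters ordered the summaries inconsistently with one another, which is how the NNES and NES coefficients in the abstract are contrasted.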
Integrated writing tasks are becoming popular in the field of language testing, but it remains unclear how teachers assess integrated writing tasks holistically and/or analytically and which is more effective. This exploratory study aims to investigate teacher-raters' holistic and analytic ratings for reliability and validity and to reveal their perceptions of grading the integrated writing task on the Test of English as a Foreign Language Internet-based Test (TOEFL iBT). Thirty-six university students completed a reading-listening-writing task. Seven raters scored the 36 compositions using both a holistic and an analytic scale, and completed a questionnaire about their perceptions of the scales. Results indicated that the holistic and analytic scales exhibited high inter-rater reliability and there were high correlations between the two rating methods. In analytic scoring, which contained four dimensions, namely, content, organization, language use, and verbatim source use, the dimensions of content and organization were highly correlated to the overall analytic score (i.e., the mean score of the four dimensions). However, the dimension of verbatim source use was found to be peculiar in terms of construct validity for the analytic scale. The analyses also indicated various challenges the raters faced while scoring. Their perceptions varied particularly regarding verbatim source use: Some raters tended to emphasize the intricate process of textual borrowing while others stressed the difficulty in judging multiple types and degrees of textual borrowing. Pedagogical implications for the selection and use of rubrics as well as the teaching and assessment of source text use are suggested.
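The relationship between the two rating methods described above can be sketched as follows: the overall analytic score is taken as the mean of the four dimension scores, as the abstract states, and is then correlated with the holistic score. Pearson's r is an assumption here, since the abstract does not name the coefficient used, and all scores below are hypothetical.

```python
from math import sqrt


def analytic_overall(dimension_scores):
    """Mean of the dimension scores for each composition."""
    return [sum(row) / len(row) for row in dimension_scores]


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary in both lists")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: content, organization, language use, verbatim source use.
analytic = [
    [4, 4, 3, 5],
    [2, 3, 2, 4],
    [5, 5, 4, 3],
    [3, 2, 3, 5],
]
holistic = [4, 2, 5, 3]

if __name__ == "__main__":
    overall = analytic_overall(analytic)
    print(round(pearson_r(overall, holistic), 3))
```

The same function could be applied to a single dimension column against the overall analytic score, which is the kind of comparison under which verbatim source use stood apart from the other dimensions.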
The aim of this paper was to clarify the characteristics of the evaluation of high school students’ free compositions by high school teachers and by university students who are teacher-candidates. This was done by comparing the results of analytic and holistic rating scales. In addition, information about the raters’ attitudes toward the evaluation and instruction of free compositions was obtained through a questionnaire. Pedagogical implications were drawn from the analyses and the questionnaire. In recent years, free compositions in English (for example, explaining a particular situation and arguing for or against a statement) have been increasingly required in university entrance examinations in Japan. Such activities are also required in actual communication. Moreover, the revised course of study (Ministry of Education, 1999) has been implemented in high schools since the 2003 academic year. Its overall objective is to develop students’ practical communication abilities, such as expressing their own ideas in written English. To meet these requirements, opportunities to teach free composition in high school classrooms will increase. In such a situation, it is important to obtain information on how teachers and teacher-candidates instruct and evaluate students’ free compositions. The survey consisted of several analyses of the teachers’ and teacher-candidates’ evaluation data and a questionnaire concerning their attitudes toward the evaluation and instruction of free compositions. The raters who evaluated the high school students’ free compositions were 10 high school teachers from a national high school and two public high schools and 6 university students who were enrolled in a teacher-training course at a national university and had practical teaching experience.
First, the raters evaluated 40 compositions written by 20 high school students in the first and second grades of a national high school in the Chugoku region of Japan (each of the 20 students wrote two kinds of compositions). The evaluations were carried out using holistic rating scales and two kinds of analytic rating scales. In this survey, two subscales from Ishida and Mori (1985), that is, the objective holistic rating scale (How good do you think the composition is?) and the subjective holistic rating scale (How do you personally like the composition?), were adopted as holistic rating scales. Two further scales, Jacobs, Zinkgraf, Wormuth, Hartfiel, and Hughey’s (1981) ESL Composition Profile and the National Institute for Educational Policy Research’s (2002) kantenbetsu-hyoka (that is, an analytic rating), were adopted as analytic rating scales. The holistic and analytic ratings were compared using descriptive statistics, Pearson’s product-moment correlation coefficients, and Cronbach’s alpha coefficients in order to address two points: 1) to clarify the characteristics of the evaluations performed by the teachers and teacher-candidates, and 2) to examine the reliability and the validity of the kantenbetsu-hyoka scale. The results suggested that the teachers’ evaluations were consistent within and across the rating scales, while those of the teacher-candidates were slightly inconsistent within and across the rating scales. The results also suggested that the validity of the kantenbetsu-hyoka scale was not higher than that of the other scales. Second, an open-ended questionnaire consisting of two questions was given to the same raters.
The first question asked what they regarded as most important when instructing students on free compositions, and the second asked what they regarded as most important when evaluating them. The questionnaire was used to address a third point: 3) to obtain information on the attitudes of teachers and teacher-candidates toward the evaluation of and instruction on free compositions. The questionnaire results suggested that the teachers were more sensitive to the conditions of high school students than the teacher-candidates were. These results led to the following pedagogical implications: 1) the validity of the kantenbetsu-hyoka scale was not higher than that of the other scales, so it would be better to use it in combination with other scales; 2) teachers should carefully consider each parameter of the analytic rating scales in order to make good use of the merits of the scale; and 3) university students who are teacher-candidates should have more opportunities to evaluate high school students’ compositions and to instruct them on writing compositions before they become teachers. This study investigated how teachers and university students actually evaluate high school students’ free English compositions by combining multiple rating scales. The objective holistic rating and the subjective holistic rating were used as holistic scales, and the ESL Composition Profile and kantenbetsu-hyoka were used as analytic scales. Ten teachers and six university students evaluated 40 free compositions written by 20 high school students. Analysis of the ratings in terms of descriptive statistics, reliability coefficients, and correlation coefficients showed that the teachers’ evaluations were more consistent than the university students’ evaluations. The reliability and validity of kantenbetsu-hyoka were also discussed, and its validity was shown to be somewhat low. An open-ended questionnaire on attitudes toward instruction and evaluation was also administered, confirming that the teachers’ views were more firmly grounded in classroom practice than those of the university students. These results were discussed, and pedagogical implications for both teachers and university students were drawn from them.
This paper reports on the results of a listening instruction intervention for Japanese EFL university students aimed at improving their ability to correctly discern the phonetic and phonological aspects of English sounds. The project is motivated by our belief that phonetic/phonological instruction is likely to be helpful even for Japanese EFL students who do not major in English linguistics or literature, although such instruction is usually offered to English majors. The goal of the study, thus, is to show that phonetics/phonology-based English teaching is effective in improving the general listening ability of Japanese EFL students. To achieve this goal, we used a set of exercises devised for a 15-week listening course (i.e., “Sound Focus for Effective Listening”; hereinafter, “Sound Focus”). Sound Focus covers six phonetic/phonological aspects of English that the authors, who were also the instructors, consider essential for improving listening ability. The participants were 331 freshmen at a national university: 254 were instructed in a CALL (computer-assisted language learning) classroom situation and 77 in a traditional classroom situation. In the CALL classroom situation, Sound Focus was delivered through a learning management system (LMS), Moodle. In the traditional classroom, the Sound Focus materials and listening exercises were provided as paper-based handouts used with a CD. To understand the effects of Sound Focus instruction on student achievement and the difference between the two classroom situations, we conducted pre- and post-listening tests and administered a Can-do-statements questionnaire and a free-description questionnaire.
The listening tests, which were based on Sound Focus, measured the improvement in students’ listening ability during the course; the Can-do-statements questionnaire evaluated their confidence in their listening ability; and the free-description questionnaire aimed to identify the aspects of the instruction that were positively or negatively received by the learners. The results of the pre- and post-listening tests and the Can-do-statements questionnaire were analyzed by two-way repeated-measures ANOVA. The free-description questionnaire was analyzed with a text-mining technique (SPSS Text Analytics for Surveys 3.0). The two-way repeated-measures ANOVA on the difference between the pre- and post-listening test scores suggested that students in each classroom situation improved their listening ability. The combined analysis of the pre- and post-test scores and the Can-do-statements questionnaire further suggested that the instruction was effective for students at all levels of confidence. We analyzed the free-description questionnaire to determine which aspects of the instruction were most effective. The results revealed that among the instructional materials, including the textbook conversations and TOEIC exercises, Sound Focus was considered by the students to be the most effective for their learning, regardless of their classroom situation. The students in the traditional classroom situation reported that the textbook conversations were also helpful. Regarding the presentation of the instructional materials, on the other hand, learners’ perceptions differed sharply: those in the CALL classroom situation received the LMS (Moodle) more positively, whereas those in the traditional classroom situation rated the projector-based presentation of the materials negatively.
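The analysis also showed that Moodle was regarded as the best activity for improving listening ability among all the classroom activities (e.g., role-play conversations, dictations, shadowing). This practice report examined the effects of teaching the phonetic and phonological features of English in an English listening course for first-year Japanese university students who were not majoring in English. The course used materials named Sound Focus and was taught in either a traditional classroom or a CALL classroom, and the evaluation also took the difference in classroom environment into account. The relationship between pre- to post-test differences in performance on perceiving phonetic and phonological features and students’ self-rated confidence in English (a Can-Do survey) was examined with a two-way ANOVA. Free-description responses collected at the end of the course were also analyzed with a text-mining technique. The ANOVA showed that phonetic and phonological instruction using Sound Focus improved first-year students’ listening performance regardless of classroom environment (traditional classroom or CALL classroom). The text-mining analysis revealed differences in how students perceived the two classroom environments.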