
Research Memorandum

ETS RM-15-06

The Association Between TOEFL iBT Test Scores and the Common European Framework of Reference (CEFR) Levels

Spiros Papageorgiou

Richard J. Tannenbaum

Brent Bridgeman

Yeonsuk Cho

August 2015

ETS Research Memorandum Series

EIGNOR EXECUTIVE EDITOR

James Carlson

Principal Psychometrician

ASSOCIATE EDITORS

Beata Beigman Klebanov

Research Scientist

Heather Buzick

Research Scientist

Brent Bridgeman

Distinguished Presidential Appointee

Keelan Evanini

Managing Research Scientist

Marna Golub-Smith

Principal Psychometrician

Shelby Haberman

Distinguished Presidential Appointee

Donald Powers

Managing Principal Research Scientist

Gautam Puhan

Principal Psychometrician

John Sabatini

Managing Principal Research Scientist

Matthias von Davier

Senior Research Director

Rebecca Zwick

Distinguished Presidential Appointee

PRODUCTION EDITORS

Kim Fryer

Manager, Editing Services

Ayleen Stellhorn

Editor

The Association Between TOEFL iBT Test Scores and the Common European Framework of Reference (CEFR) Levels

Educational Testing Service, Princeton, New Jersey

spapageorgiou@ets.org

The association between TOEFL iBT test scores and the Common European Framework of Reference (CEFR) levels

Action Editor:

Donald Powers

Reviewers:

Jonathan Schmidgall and Michael Kane

Abstract

The Common European Framework of Reference (CEFR), published by the Council of Europe (2001), is arguably one of the most influential language frameworks in the field of second language teaching and assessment, articulating a progression of language proficiency through a number of levels. Tannenbaum and Wylie (2008) mapped TOEFL iBT test scores onto the CEFR levels to help test users and decision makers interpret TOEFL iBT test scores in terms of the CEFR levels. Based on the feedback of subsequent users and decision makers, Educational Testing Service (ETS) revised the CEFR cut scores (i.e., the minimum test scores required for each CEFR level) in 2014. In this research memorandum, we present the rationale for the revision of the CEFR cut scores and offer validity evidence that the revised cut scores (a) are reasonable and (b) do not negatively impact the quality of admissions decisions.

Key words:

CEFR, cut scores, language proficiency levels, score interpretation, TOEFL iBT

Current conceptualizations of validity and the process of validation place emphasis on the interpretation of a test score, its use, and the impact of that use (Bachman, 2005; Bachman & Palmer, 2010; Kane, 2006, 2013). Scores on language tests for speakers of English as a second/foreign language (ESL/EFL) are often used to classify test takers into different categories or levels of proficiency. In academic contexts, for example, TOEFL iBT test scores are used by universities employing English as the primary mode of instruction to determine whether prospective ESL students have sufficient English-language skills to be admitted (Chapelle, Enright, & Jamieson, 2008; Cho & Bridgeman, 2012). As Tannenbaum and Cho (2014) noted, these types of decisions are criterion based, in that a defined level of language proficiency should be met. However, a test score by itself does not indicate whether the criterion has been met. One way to relate test scores to criteria is to map (i.e., associate or link) test scores with descriptions of levels of language proficiency (Tannenbaum & Cho, 2014).

The Common European Framework of Reference (CEFR; Council of Europe, 2001) is probably the most influential language framework in the field of second language teaching and assessment, articulating a progression of language proficiency through six main levels. It is not easy to establish whether, and to what extent, admission decisions into higher education are made in relation to the CEFR levels, because no uniform policy exists across institutions or educational authorities. In their study, Carlsen and Deygers (2014) argued that the B2 level is the most common requirement for admission into European universities. For example, at the time of producing this research memorandum, the UK government required evidence of English-language proficiency at the B2 level for students applying for a Tier 4 student visa to pursue an academic degree in the country.1 The same CEFR level requirement (B2) has also been reported for students at an English-medium university in Turkey. However, in North America and other parts of the world outside Europe, where TOEFL iBT test scores are used to inform admission decisions, reference to the CEFR to set score requirements seems to be much less common, with universities, for example, setting their own context-specific requirements, which can vary considerably from institution to institution (see, for example, Ling, Wolf, Cho, & Wang, 2014).

The CEFR can be a useful tool for informing decisions about levels of English-language proficiency. However, it should be kept in mind that the CEFR was designed as a generic


reference document (as its title clearly indicates) so that it can be applied in a variety of contexts (Milanovic & Weir, 2010). Although several of its language proficiency descriptors appear relevant to academic contexts, admission decisions are likely to be based on a variety of factors that go beyond a generic description of language proficiency such as the one found in the CEFR descriptors. This practice is to be expected, because setting cut scores is a context-specific, value-driven process (Kane, 2001; Tannenbaum & Katz, 2013), as two recent studies demonstrate with regard to the use of cut scores on English-language proficiency tests (Ling et al., 2014; Papageorgiou & Cho, 2014). For these reasons, users of the TOEFL iBT test are encouraged to set their own score requirements in order to better serve their local needs (Educational Testing Service [ETS], 2005). In the process of setting requirements, users are also encouraged to consult empirically derived performance descriptors that provide additional evidence about the expected English proficiency of test takers at differing TOEFL iBT test score ranges (see, for example, ETS, 2014; Garcia Gomez, Noah, Schedl, Wright, & Yolkut, 2007). For test users and decision makers who wish to interpret TOEFL iBT test scores in terms of the CEFR levels in order to inform their decisions, Tannenbaum and Wylie (2008) conducted a study that mapped TOEFL iBT test scores to these levels.

Since the time of the mapping study (Tannenbaum & Wylie, 2008), ETS has been monitoring the needs of these test users and decision makers and how they use the proposed CEFR cut scores (i.e., the minimum test scores required for each CEFR level) to inform their admissions requirements in relation to English-language proficiency. Recall that, to our knowledge, many university programs in Europe consider B2 to represent the constellation of English skills likely sufficient to cope with university instruction conducted in English and, hence, sufficient for use as one criterion for admissions. Feedback from these users and decision makers, mostly universities that use CEFR levels to define admissions standards in the UK and other European countries, suggested that the mapping of TOEFL iBT test scores to the CEFR levels might have been too rigorous, resulting in higher required test scores than perhaps needed to reflect the English skills described by the B2 level (and other levels). Moreover, as ETS assessment developers and score users obtained a better understanding of the CEFR scales and their descriptors in the intended target language use (TLU) domain (Bachman & Palmer, 2010) for the TOEFL iBT test (i.e., postsecondary academic), it was reasonable to reconsider the relationship


between test scores and the CEFR levels (see relevant discussion in Taylor, 2004). As a result of considering all the above information, and as suggested in the standard-setting literature (e.g., Geisinger & McCormick, 2010), a revised set of CEFR cut scores for the TOEFL iBT test was proposed. The rationale behind the revision is presented in this report.

Although the revised cut scores reflected, in part, the feedback received from decision makers at universities that use CEFR levels to define admissions standards (mostly universities in the UK and other European countries), the reasonableness of these revised cut scores and their impact on admissions needed to be investigated. Such an investigation is the focus of the work documented in the subsequent sections of this report. Following an argument-based approach (Kane,

2006, 2013), we aim, through the use of external, nonassessment criteria (Kane, 2001), to

provide evidence supporting two claims related to the inferences that can be made on the basis of

TOEFL iBT test scores:

Claim 1 (reasonableness of the cut scores): The revised CEFR cut scores are reasonable for making decisions about admission into higher education.

Claim 2 (impact of the cut scores): The revised CEFR cut scores do not negatively impact admissions decisions due to classification errors.
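To make the classification question behind Claim 2 concrete, the minimal Python sketch below compares the admission decisions produced by an original and a revised cut score against an external, nonassessment criterion and tallies agreements and classification errors. Every value in the sketch, including the two cut scores, the examinee scores, and the criterion judgments, is hypothetical and purely illustrative; none of it is the data or the cut scores examined in this report.

    # Hypothetical illustration (not the report's data): compare admission
    # decisions under an original and a revised B2 cut score against an
    # external criterion, and tally classification agreement and errors.
    from dataclasses import dataclass

    @dataclass
    class Examinee:
        total_score: int       # hypothetical TOEFL iBT total score (0-120)
        meets_criterion: bool  # external judgment, e.g., an instructor rating

    ORIGINAL_B2_CUT = 87  # hypothetical original cut score for B2
    REVISED_B2_CUT = 72   # hypothetical revised (lower) cut score for B2

    examinees = [
        Examinee(95, True), Examinee(80, True), Examinee(76, True),
        Examinee(88, False), Examinee(70, False), Examinee(65, False),
    ]

    def decision_table(cut_score):
        """Cross-classify the cut-score decision against the external criterion."""
        table = {"correct_admit": 0, "false_admit": 0,
                 "correct_reject": 0, "false_reject": 0}
        for e in examinees:
            admitted = e.total_score >= cut_score
            if admitted:
                key = "correct_admit" if e.meets_criterion else "false_admit"
            else:
                key = "false_reject" if e.meets_criterion else "correct_reject"
            table[key] += 1
        return table

    for label, cut in [("original", ORIGINAL_B2_CUT), ("revised", REVISED_B2_CUT)]:
        table = decision_table(cut)
        agreement = (table["correct_admit"] + table["correct_reject"]) / len(examinees)
        print(label, cut, table, "agreement = %.2f" % agreement)

Tabulating decisions against an external criterion in this way is the general logic behind examining whether a change in cut scores negatively affects admissions decisions through classification errors.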

Before discussing the analyses providing support to the above claims, we first present a brief overview of the CEFR and the process of mapping test scores to its levels.

Mapping Test Scores to the Common European Framework of Reference (CEFR)

The CEFR is one of several publications of the Council of Europe, which have been influential in second language teaching since the 1970s (Van Ek & Trim, 1991, 1998, 2001; Wilkins, 1976). According to the Council of Europe (2001), a common framework for learning, teaching, and assessment is desirable to

promote and facilitate cooperation among educational institutions in different countries; provide a sound basis for the mutual recognition of language qualifications; and assist learners, teachers, course designers, examining bodies and educational administrators in situating and coordinating their efforts. (p. 5)

Although the CEFR contains rich information about the language learning process and teaching as well as assessment in nine chapters and four appendices, its language proficiency scales2 are arguably the best-known part of the 2001 volume (Little, 2006). The CEFR scales and descriptors were primarily developed during a large research project in Switzerland (North, 2000; North & Schneider, 1998). The proficiency scales of the CEFR have gained popularity because they offer a comprehensive description of the objectives that learners can expect to achieve at different levels of language proficiency. They describe language activities and competences at six main levels: A1 (the lowest) through A2, B1, B2, C1, and C2 (the highest). They are also intended to motivate learners by describing what they can do when they use the language, rather than what they cannot do (Council of Europe, 2001, p. 205).

The CEFR proficiency scales provide a convenient structure for thinking about and communicating a progression of language proficiency and for considering where people stand in relation to that progression. Therefore, mapping language test scores onto the CEFR levels is a useful way to assign practical meaning to those scores. For example, if a score of at least 16 on a speaking test were associated with the CEFR B1 level, that would suggest that test takers with at least that score can be expected to demonstrate the abilities described for the B1 level (Council of Europe, 2001, p. 26). To further help test providers add meaning to their test scores in relation to the CEFR levels, the Council of Europe (2009) published a manual offering a recommended set of procedures for aligning both test content and test scores with the CEFR. The CEFR has become a common currency in language education: curricula, syllabuses, textbooks, and teacher training courses, not only examinations, claim to be related to it. Applications of the CEFR in these areas are illustrated by several studies presented in three edited volumes (Byram & Parmenter, 2012; Figueras & Noijons, 2009; Martyniuk, 2010) and also by North (2014).
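As a minimal illustration of how cut scores turn a scale score into a CEFR level, the short Python sketch below looks up the highest level whose minimum score an examinee meets. Only the value of 16 for B1 echoes the illustrative speaking example in the preceding paragraph; the other cut scores are invented placeholders, not the TOEFL iBT cut scores proposed by Tannenbaum and Wylie (2008) or the revised values discussed in this report.

    # Map a hypothetical speaking-section score to the highest CEFR level whose
    # minimum score (cut score) it meets. Only the B1 value of 16 echoes the
    # illustrative example in the text; the other cut scores are placeholders.
    HYPOTHETICAL_SPEAKING_CUTS = [  # (CEFR level, minimum score), highest level first
        ("C1", 25),
        ("B2", 20),
        ("B1", 16),
        ("A2", 10),
    ]

    def cefr_level(speaking_score):
        """Return the highest CEFR level whose cut score the given score meets."""
        for level, minimum in HYPOTHETICAL_SPEAKING_CUTS:
            if speaking_score >= minimum:
                return level
        return "below A2"

    print(cefr_level(16))  # "B1": a score of at least 16 is read as at least B1
    print(cefr_level(23))  # "B2"

An operational mapping would, of course, cover every test section and every CEFR level, and the cut scores themselves would come from the standard-setting procedures discussed later in this section.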

A number of studies and research projects, such as the DIALANG project (Alderson, 2005; Alderson & Huhta, 2005; Kaftandjieva & Takala, 2002), have shown that the hierarchy of the CEFR language proficiency descriptors can be consistently replicated, thus offering validity evidence for the use of those descriptors and the scales they belong to


across a variety of contexts. However, the CEFR is neither a static tool nor a prescription to be followed with one singularly correct interpretation or application for designing test content or interpreting test scores. In fact, because the CEFR is intentionally context-free to allow for a variety of applications and its language proficiency descriptors are not specific to a language, researchers have noted problems when using the CEFR to design test specifications and tasks (Alderson et al., 2006; Hasselgreen, 2012; Weir, 2005). One of the chief architects of the CEFR, Brian North, and his colleagues appropriately reminded us of the intended flexibility of the CEFR:

The CEFR is a concertina-like reference tool that . . . educational professionals can merge or sub-divide, elaborate or summarise, adopt or adapt according to the needs of their context. . . . It is for users to choose activities, competences and proficiency stepping-stones that are appropriate . . . there never will be an authorised interpretation. (North, 2014, p. 5)

The mapping of test scores to the CEFR is essential if the scores are to be interpreted in terms of levels of the CEFR. Mapping is typically accomplished through a standard-setting approach, which is based on expert judgment and informed by test data, and which links test performances to CEFR levels (Council of Europe, 2009; Papageorgiou, 2010; Papageorgiou & Tannenbaum, in press; Tannenbaum & Cho, 2014; Tannenbaum & Katz, 2013). The process of setting standards is not without criticism, however, in large part due to its inherent subjectivity (North, 2014). Skepticism is also fueled by the acknowledgment in the measurement literature that different standard-setting methods produce somewhat different results (Cizek & Bunch, 2007). However, this is much the same as the expected, and traditionally accepted, finding that a test taker taking two different forms of the same test will not likely earn the same score (Green, Trimble, & Lewis, 2003). Some ambiguity in test scores and in setting standards is inevitable. Nonetheless, North (2014) argued against the use of standard setting to establish a relationship between test scores and the CEFR, in particular when the Angoff method (Angoff, 1971), or one of its modified variants, is used. It is worth noting, however, that