Archived Content from Conference Held in March 1996
Issues in Computer Adaptive Testing of Second Language Reading Proficiency
March 20 - 22, 1996
This seminar represented a milestone in the field of second language assessment as the first international meeting to focus solely on second language computer adaptive testing (CAT). Over 80 participants from around the world came to learn about this cutting-edge area of language assessment.
The University of Minnesota has long been heralded as a leader in the arena of language proficiency testing, where its Center for Advanced Research on Language Acquisition (CARLA) has advanced the agenda of proficiency assessment through test development and research. The grant "Improving and Strengthening Proficiency-Based Testing in Foreign Language Using Computer Adaptive Testing Technologies," awarded to CARLA in the summer of 1996, enabled the CARLA research team to begin constructing computer adaptive tests for assessing and providing diagnostic information regarding students' reading proficiency in French, German, and Spanish. A principal objective of this seminar was to address key issues that will inform the construction of these tests.
This ground-breaking seminar featured leading experts in the fields of computer adaptive testing, technology, second language reading proficiency, and second language assessment. The presentations addressed theoretical issues and empirical findings involving second language reading and assessment with computer adaptive testing.
The seminar covered the following topics:
- Computer Adaptive Reading Proficiency Test Development at CARLA
- Second Language Reading Models and Research: Their Relation to Computer Adaptive Testing
- Newest Trends in Computer Adaptive Testing, Including Scoring Algorithms and Item Selection Heuristics
- Computerized Testing Technology, Including Multimedia, Simulations, Item Formats, Item Exposure, and Security
- Second Language Computer Adaptive Testing and Assessment
- Item Response Theory
- Multiple Item Pool Use for Proficiency and Diagnostic Testing
Conference Presentations
A Perspective on Computerized Second Language Testing
David Weiss, Ph.D., Professor, Dept. of Psychology, University of Minnesota
Improvements in second language testing can result from the computerized mode of administration and from various forms of adaptive administration. Some advantages of computerized administration will be discussed. The origins and approaches of adaptive testing will be described and several applications of adaptive testing to second language testing will be presented.
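The adaptive administration Weiss describes reduces to a single loop: estimate the examinee's ability, present the unused item most informative at that estimate, and stop once the estimate is precise enough. The sketch below illustrates that loop under a Rasch model; it is a minimal illustration rather than any system presented at the seminar, and the bank layout, thresholds, and function names are all assumptions.

```python
# Minimal sketch of a computer adaptive test under a Rasch model.
# All numbers and names are illustrative assumptions.
import math

def p_correct(theta, b):
    """Rasch probability of a correct response to an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item: p * (1 - p), maximal when theta == b."""
    p = p_correct(theta, b)
    return p * (1.0 - p)

def estimate_theta(responses, difficulties, theta=0.0, iterations=20):
    """Maximum-likelihood ability estimate via Newton-Raphson; clamped,
    because the MLE diverges on all-correct or all-wrong response records."""
    for _ in range(iterations):
        p = [p_correct(theta, b) for b in difficulties]
        gradient = sum(x - pi for x, pi in zip(responses, p))
        info = sum(pi * (1.0 - pi) for pi in p)
        if info < 1e-9:
            break
        theta = max(-4.0, min(4.0, theta + gradient / info))
    return theta

def adaptive_test(bank, answer_item, se_target=0.4, max_items=30):
    """Core CAT loop: administer the most informative unused item until the
    standard error of the ability estimate drops below se_target."""
    theta, asked, responses, difficulties = 0.0, set(), [], []
    while len(asked) < min(max_items, len(bank)):
        # Select the unused item most informative at the current estimate.
        item = max((i for i in range(len(bank)) if i not in asked),
                   key=lambda i: item_information(theta, bank[i]))
        asked.add(item)
        responses.append(answer_item(item))   # caller returns 1 or 0
        difficulties.append(bank[item])
        theta = estimate_theta(responses, difficulties, theta)
        test_info = sum(item_information(theta, b) for b in difficulties)
        if 1.0 / math.sqrt(test_info) < se_target:   # SE = 1 / sqrt(information)
            break
    return theta
```

To try the sketch, one can simulate an examinee of true ability 1.0 with a bank of difficulties spread over [-3, 3] and answer_item=lambda i: int(random.random() < p_correct(1.0, bank[i])).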
Computerized Testing on a Large Network: Issues for Today and Tomorrow
Charles Johnston, Ph.D., Vice President for Technology, Drake Prometric Corp.
Many institutions and organizations have moved to computerized delivery of their exams, and computer adaptive testing is a rapidly growing delivery mode. Delivery often requires a large national or global network of delivery points. This system must reflect the latest trends in both test development and psychometrics, including multimedia presentation, simulations, new item/testlet formats, expert scoring systems, and security, among others.
Exploring New Item-Types for Computerized Testing: New Possibilities and Challenges
Michael Yoes, Ph.D., President, Assessment Systems Corp.
Computerized tests are becoming more widely used, yet little consideration has been given to the opportunities for new item-types that computerization uniquely offers. Most computerized tests (including CATs) reuse item-types from printed tests, although test developers are free to consider new ones. A discussion of possible new directions and their psychometric challenges will be presented.
Learning to Read in a Foreign Language and C-A Reading Assessment
William Grabe, Ph.D., Associate Professor, Dept. of English, Northern Arizona University
This talk will first briefly outline a number of major findings from L1 reading research that have important consequences for learning to read in university foreign language (FL) contexts. The talk will then present a set of issues, or dilemmas, that influence the development of reading abilities in a university FL setting. Given these issues, and given the goals of a specific university modern languages department, the last section will consider the concerns that need to be addressed in implementing computer adaptive reading assessment.
If Reading is Reader-Based, Can There Be a Computer-Adaptive Reading Test?
Elizabeth B. Bernhardt, Ph.D., Director of Language Center & Professor of German Studies, Stanford University
This presentation reviews theories of reading in both first and second languages. In addition, it examines the data buttressing each theory, with particular emphasis on recent re-examinations of the L1/L2 literacy relationship data. The paper argues, from these individual perspectives and from their synthesis, that CAT is a potentially alien endeavor when the aim is to assess reading comprehension.
Computer Adaptive Testing: An Outsider's View
Tim McNamara, Ph.D., Associate Professor, Dept. of Linguistics and Applied Linguistics, University of Melbourne
Technologically innovative forms of assessment inevitably generate excitement, but such innovations need to be evaluated in the context of a broad range of assessment needs. What can CAT do, and what can it not do? This paper evaluates CAT from the point of view of current thinking on assessment, particularly performance assessment.
Content Considerations for Testing Reading Proficiency Via Computerized-Adaptive Tests
Jerry Larson, Ph.D., Director of Humanities Research Center & Professor of Spanish, Brigham Young University
This presentation will focus on issues related to the content of items found in computerized-adaptive tests of reading proficiency. Of particular concern is the need to provide reading passages that represent current language in a variety of language settings. CAT algorithms for achieving appropriate item selection will be demonstrated.
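The abstract does not describe the algorithms to be demonstrated, but content balancing is one standard way to keep a reading CAT drawing passages from a variety of language settings: tag each item with a content area and select from whichever area is furthest below its target share. The sketch below is hypothetical; the item fields and target proportions are invented.

```python
# Hypothetical sketch of content-balanced item selection; the item fields
# ("area", "difficulty", "id") and the target shares are invented.
import math

def rasch_information(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def select_balanced(items, asked, administered, targets, theta):
    """Pick the most informative unused item from the content area whose
    administered share lags furthest behind its target proportion."""
    total = max(1, sum(administered.get(a, 0) for a in targets))
    deficit = {a: targets[a] - administered.get(a, 0) / total for a in targets}
    for area in sorted(deficit, key=deficit.get, reverse=True):
        pool = [it for it in items
                if it["area"] == area and it["id"] not in asked]
        if pool:
            return max(pool,
                       key=lambda it: rasch_information(theta, it["difficulty"]))
    return None  # bank exhausted
```

The caller updates asked and administered after each item, so with targets such as {"news": 0.4, "literary": 0.3, "correspondence": 0.3} the selector interleaves passage types while still preferring items matched to the current ability estimate.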
Checking the Utility and Appropriacy of the Content and Measurement Models Used to Develop L2 Listening Comprehension CATs: Implications for Further Development of Comprehensive CATs
Patricia Dunkel, Ph.D., Professor and Chair, Dept. of Applied Linguistics & ESL, Georgia State University
Research and development of multimedia listening comprehension CATs in ESL and Hausa will first be described. Then the presenter will share the insights gained from developing the CATs and from trialing the item banks on examinees learning (or having learned) ESL and Hausa. The insights, derived both from observed data and from experience, will be discussed largely in relation to decisions made a priori by the CAT developers concerning the following: (1) identification of the comprehension content/task model; (2) designation of the framework used for item writing and creation of the item banks; (3) selection of the Rasch IRT model as the CAT measurement model; and (4) specification of the algorithm for item selection and for stopping the CAT.
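For readers unfamiliar with the measurement model Dunkel names, the Rasch model gives the probability of a correct response from examinee j on item i in terms of a single ability parameter and a single difficulty parameter:

\[ P(X_{ij} = 1 \mid \theta_j, b_i) = \frac{e^{\theta_j - b_i}}{1 + e^{\theta_j - b_i}} \]

The abstract does not state which stopping rule was specified; purely as an example, one common choice ends the test when the standard error of the ability estimate falls below a threshold \( \epsilon \):

\[ SE(\hat{\theta}) = \frac{1}{\sqrt{\sum_i P_i(\hat{\theta})\bigl(1 - P_i(\hat{\theta})\bigr)}} < \epsilon \]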
Towards Integrated Learning and Testing Using Structured Item Banks for CAT
John de Jong, Ph.D., Head of the Language Testing Unit, CITO-The Dutch National Institute for Educational Measurement
From a global perspective, an ample supply of instruments seems to be available for testing foreign language reading comprehension. On closer inspection, however, it appears that many of these instruments are lacking in quality and that most of them concentrate on a limited number of domains in a restricted number of languages. Taking into account the diversity of language needs in present-day society, this chaotic situation leads to the paradox that the number of tests actually available is far from sufficient. It is argued, therefore, that international collaboration in building structured item banks is crucial if education wishes to meet the market requirements and technological standards at the turn of the century. Examples and suggestions will be presented to illustrate how structured item banks can be set up for CAT.
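To make "structured item bank" concrete, a minimal record layout might look like the sketch below. Every field is an assumption offered for illustration, since the abstract does not specify a schema.

```python
# Hypothetical record layout for a structured item bank; all fields
# are assumptions, not a schema described by the presenter.
from dataclasses import dataclass

@dataclass
class BankItem:
    item_id: str
    language: str        # e.g., "French"
    domain: str          # e.g., "workplace", "academic", "everyday"
    passage: str         # the reading text or message
    difficulty: float    # Rasch difficulty calibrated onto a common scale
    skill: str           # e.g., "main idea", "inference", "detail"
```

Calibrating all items onto one common difficulty scale is what would let collaborating institutions pool their banks for CAT.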
Constructing a Reading Strength Profile with CAT
J. Michael Linacre, Ph.D., Associate Director, MESA Psychometric Laboratory, University of Chicago
CAT offers flexibility, thoroughness, diagnosis, and test security. Reading of short messages can be tested by multiple-choice paraphrases in the second language, and long texts by customized testlets of first-language multiple-choice questions. For screening use, testing time is minimized; for placement, longer tests diagnose strengths. Test theory and reports are presented.
Adaptive Assessment of Reading Comprehension for TOEFL
Daniel R. Eignor, Ph.D., Principal Measurement Specialist, Educational Testing Service
ETS is presently assessing the feasibility of introducing computer adaptive versions of each of the three sections of the TOEFL, the last of which currently measures reading comprehension. This presentation will discuss the IRT model, the item selection algorithm, and the procedure for controlling item exposure that have been chosen for the adaptive version of the TOEFL reading comprehension section, along with the reasons for these choices.
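The abstract does not name the exposure-control procedure chosen. The Sympson-Hetter method is one widely cited possibility and is sketched below purely as an illustration of the general idea; the exposure parameters k and the fallback rule are hypothetical, not details of the TOEFL design.

```python
# Hypothetical sketch of Sympson-Hetter-style exposure control; the bank,
# the exposure parameters k, and the fallback rule are illustrative.
import math
import random

def rasch_information(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def select_with_exposure_control(bank, k, asked, theta, rng=random.random):
    """Consider items in decreasing order of information at theta, but
    administer item i only with probability k[i]; items with small k[i]
    are often passed over, which caps how frequently they are seen."""
    candidates = sorted((i for i in range(len(bank)) if i not in asked),
                        key=lambda i: rasch_information(theta, bank[i]),
                        reverse=True)
    for i in candidates:
        if rng() <= k[i]:
            return i
    # If every candidate was suppressed, fall back to the most informative one.
    return candidates[0] if candidates else None
```

In the Sympson-Hetter procedure itself, the k values are calibrated in advance through simulation so that no item's administration rate exceeds a chosen ceiling.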
The Practical Utility of Rasch Measurement Models
Richard Luecht, Ph.D., Senior Psychometrician, Director of Computer Adaptive Testing, National Board of Medical Examiners
All statistical models are incomplete representations of reality; however, some models are useful. The utility of a model depends on many factors, including statistical fit, structural identifiability, parameter estimation costs, and the substantive theory underlying the selection of the model. This paper presents a comprehensive framework for evaluating the practical utility of IRT models, in general, and empirically demonstrates the overall usefulness of the rather parsimonious Rasch family of models, with a particular emphasis on CAT and reading assessment applications.
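The parsimony of the Rasch family is easiest to see next to the three-parameter logistic model, which estimates a discrimination \(a_i\), a difficulty \(b_i\), and a lower asymptote ("guessing" parameter) \(c_i\) for every item:

\[ P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}} \]

The Rasch model is the special case \(a_i = 1\), \(c_i = 0\), so only the difficulty \(b_i\) is estimated per item:

\[ P_i(\theta) = \frac{1}{1 + e^{-(\theta - b_i)}} \]

Fewer item parameters translate into smaller calibration samples and lower estimation costs, two of the factors Luecht's framework weighs.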
Visit the Computer Adaptive Testing project page for more information.