The purpose of this project is to develop alternative methods of language testing that will be appropriate for assessing communicative and task-based language learning and teaching. In this project, alternative assessment is defined as assessment with the purpose of aiding communicative learning/teaching through development of curriculum-related achievement procedures. In particular, this project focuses on performance assessments, which are being developed to test students' academic reading, writing, listening, and speaking skills in an integrated manner.
During the course of this project, performance assessments are being developed first as English language prototypes. These prototypes will be made available to any other language professionals who wish to adapt them to the foreign languages that they teach. In addition, performance tests will be written for students of Japanese and Chinese. Once produced and validated, all forms of these tests and the framework upon which they are based will be made available nationally. The project involves four main stages.
In the first stage, we have already drawn on what we learned in writing Norris, Brown, Hudson, and Yoshioka (1998) to develop prototype English language performance tests for university-level English as a second language courses (approximately in the ACTFL range of low-intermediate to superior). This development process involved adapting first language testing practices and second language assessment theory to the design of the prototypes. In the process, we created all necessary teacher/student test instructions, materials, realia, test forms, cassette tapes, answer sheets, and a student feedback questionnaire.
In the second stage, we piloted both forms of our prototype performance assessment instruments at the University of Hawai‘i with native speakers of English (n=5 for each of two forms) and are currently piloting them with University of Hawai‘i L2 speakers of English, including intermediate students (n=25) from the Hawai‘i English Language Program, advanced students (n=25) from the English Language Institute, and very advanced students (n=10) in the MA in ESL program. In addition to piloting the test items themselves, we are also developing task-dependent and task-independent scales for rating the students' performances. This last process will result in two analytic scales (task-dependent and task-independent), sets of instructions for scoring, and teacher/rater training materials.
Third, a minimum of three raters per procedure will score the English language prototypes. The scores will then be compiled, coded, and statistically analyzed along with the student feedback. On the basis of these analyses, we will revise the instruments to make them more efficient and effective. To do so, we will examine the reliability and validity of our instruments, focusing on improving those characteristics by minimizing as many sources of measurement error as possible. To that end, generalizability studies and appropriate decision studies will be set up to examine factors such as raters, topics, task types, and task difficulty levels as sources of measurement error. The relevant factors will be determined after we have observed the piloting of the instruments, because that experience will help us identify the most likely sources of measurement error. Information derived from these generalizability studies will also help us revise our assessment procedures to maximize their reliability, validity, and practicality. At the end of this stage, we will write up a complete report of our results and their implications for further test use and development in other languages; we will also describe (and append) all components of the prototype English language assessment materials.
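To illustrate the logic of the generalizability and decision studies mentioned above, the following is a minimal sketch of the simplest case: a one-facet persons × raters crossed design, with variance components estimated by the standard ANOVA method and a relative G (dependability) coefficient computed for alternative numbers of raters. The function names (`g_study`, `g_coefficient`) and the rating matrix are hypothetical illustrations, not the project's actual data or analysis code; the project's designs would add further facets such as topic and task type.

```python
def g_study(scores):
    """Estimate variance components for a persons x raters crossed
    design (one score per cell) via the ANOVA method."""
    n_p = len(scores)        # number of examinees (persons)
    n_r = len(scores[0])     # number of raters
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(scores[p][r] for p in range(n_p)) / n_p
               for r in range(n_r)]

    # Mean squares from the two-way ANOVA decomposition.
    ms_p = n_r * sum((m - grand) ** 2 for m in p_means) / (n_p - 1)
    ms_r = n_p * sum((m - grand) ** 2 for m in r_means) / (n_r - 1)
    ss_pr = sum((scores[p][r] - p_means[p] - r_means[r] + grand) ** 2
                for p in range(n_p) for r in range(n_r))
    ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

    # Solve the expected-mean-square equations; negative estimates
    # are truncated to zero, as is conventional.
    var_pr = ms_pr                          # residual (p x r + error)
    var_p = max(0.0, (ms_p - ms_pr) / n_r)  # universe-score variance
    var_r = max(0.0, (ms_r - ms_pr) / n_p)  # rater severity variance
    return var_p, var_r, var_pr

def g_coefficient(var_p, var_pr, n_raters):
    """Relative G coefficient for a D study using n_raters raters."""
    return var_p / (var_p + var_pr / n_raters)

# Hypothetical ratings: 3 examinees scored by 2 raters.
ratings = [[4, 5], [2, 3], [6, 5]]
vp, vr, vpr = g_study(ratings)
print(round(g_coefficient(vp, vpr, 2), 3))  # dependability with 2 raters
print(round(g_coefficient(vp, vpr, 4), 3))  # D study: with 4 raters
```

Comparing the coefficient across hypothetical rater counts is the essence of a decision study: it shows how many raters (or topics, or tasks, once those facets are added) would be needed to bring measurement error down to an acceptable level.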
Fourth, based on what we learn from developing and piloting the English language prototypes