Test theories, educational priorities and reliability of public examinations in England

Bibliographic Details
Main Authors: Baird, J, Black, P
Format: Journal article
Language: English
Published: 2013
Description
Summary: Much has already been written on the controversies surrounding the use of different test theories in educational assessment. Other authors have noted the prevalence of classical test theory over item response theory in practice. This Special Issue draws together articles based upon work conducted on the Reliability Programme for England's examinations regulator, the Office of Qualifications and Examinations Regulation. One strand of the work was methodological, in which we noted the advantages and assumptions of different approaches to investigating reliability of public examinations. A feature of the field that has not been well documented is why psychometrics in general does not always fit the educational, cultural and political priorities of public examinations. Psychometrics has implications for test design and reliability. Public examinations have the following features: they are curriculum-embedded; it is desirable that the domain is transparent; there are many curriculum changes (which makes item banking more problematical and costly); curriculum exposure affects performance; public examination questions might need to be kept secure until the examination is released and then made public; pre-testing is infrequent; there is non-random syllabus representation; item independence assumptions are often broken; the score distribution might not be expected to be normal; complex assessment designs compose whole qualifications; a latent trait might not be assumed; unidimensionality might not be considered important; and there might be very weakly described constructs. Implications for measurement of reliability in this context in England include less concern for internal reliability or occasion-related factors and an emphasis upon standard-setting and inter-rater reliability.
Psychometricians sometimes view this approach as technically weak and behind the times, but we raise the prospect that psychometrics could seem like an answer to somebody else's problems if the curriculum, questions and washback upon learning are the main concern of the assessment designers. We agree with other authors that the field is currently under-theorised and raise the prospect that future theory needs to take account of the educational context of public examinations. © 2013 Copyright Taylor and Francis Group, LLC.