Maintaining standards in UK high-stakes examinations using test equating

By Christopher Wheadon

Abstract

The dual processes of setting and maintaining standards in high-stakes public examinations in the UK have long used the same methodology: a combination of judgement and statistics. If it is accepted that standards are a social construct, judgement is necessary to set them, but it has been found to be a blunt instrument with built-in biases.

As such, judgement is arguably not suitable for maintaining standards once they have been set. Standard statistical cohort-referencing and predictive models for maintaining standards cannot be used in isolation because they are subjective models that do not allow for improvement over time.

This paper considers whether Item Response Theory (IRT) methods of test equating are suitable for maintaining standards over time in UK high-stakes examinations. It concludes that these methods are readily applicable to assessments that use short-response test formats, but that a full research programme is required to investigate whether IRT methods are suitable for longer-response test formats.

How to cite

Wheadon, C. (2008). Maintaining standards in UK high-stakes examinations using test equating. Manchester: AQA Centre for Education Research and Policy.

Keywords