Setting and maintaining GCSE and GCE grading standards: the case for contextualised cohort-referencing
General Certificate of Secondary Education (GCSE) and General Certificate of Education (GCE) grading standards are determined by Awarding Bodies using procedures that adhere to the Code of Practice published by the regulator, Ofqual. Grade boundary marks (cut scores) are set using the judgement of subject experts (senior examiners) about the quality of candidates’ work, informed by statistics about the candidature and the mark distribution of the examination.
This model has been called weak criterion-referencing: the requirement of (strong) criterion-referencing for evidence of specific knowledge, skills and understanding is relaxed to allow for variations in examination difficulty, requiring maintenance of only the general quality of examination performance.
This paper considers the major conceptual flaw in this model – the fact that the examiners making the judgements have insufficient information to estimate quantitatively the relative difficulty of two successive years’ examinations – as well as evidence demonstrating that experienced examiners are unable to distinguish between candidates’ work within a small range of marks, as is required to set grade boundaries.
Furthermore, examiners appear biased toward giving candidates the benefit of the doubt when deciding grade boundary marks. Combined with their imprecision, this is a recipe for ‘grade inflation’ – the lowering of the quality of work required for a particular grade – of which the steadily increasing national GCSE and GCE results, and the consequent need to introduce the A* grade, could be a symptom.
It is proposed that a form of cohort-referencing that is sensitive to changes in examination entry patterns be introduced. The proposed system would shift the weight of evidence toward statistics, using qualitative judgements as part of a check on the veracity of the statistics. It would also extinguish the annual debate as to whether increasing examination outcomes indicate an improvement in education or a decline in examination ‘standards’, and would safeguard against the need to introduce further higher grades in the future.