The standard setting process: validating interpretations of stakeholders


Bibliographic Details
Main Authors: Nele Kampa, Helene Wagner, Olaf Köller
Format: Article
Language: English
Published: SpringerOpen, 2019-02-01
Series: Large-scale Assessments in Education
Subjects:
Online Access: http://link.springer.com/article/10.1186/s40536-019-0071-8
author Nele Kampa
Helene Wagner
Olaf Köller
collection DOAJ
description Abstract
Background: Stakeholders’ interpretations of the findings of large-scale educational assessments can influence important decisions. In the context of educational assessment, standard setting remains an especially critical element because it is complex and largely unstandardized. Instruments established by means of standard-setting procedures, such as proficiency levels (PL), therefore appear to be arbitrary to some degree. Owing to the significance such results take on when they are communicated to stakeholders or the public, a thorough validation of this process seems crucial. In our study, ministry stakeholders intended to use PL established in an assessment of science abilities to obtain information about students’ strengths and weaknesses in science in general, and specifically about the extent to which students were prepared for future science studies. The aim of our study was to investigate the validity arguments regarding these two intended interpretations.
Methods: Based on a university science test administered to 3641 upper secondary students (Grade 13), a panel of nine experts set four cut scores using two variations of the Angoff method: the Yes/No Angoff method (multiple-choice items) and the extended Angoff method (complex multiple-choice items). We carried out t-tests, repeated-measures ANOVA, G-studies, and regression analyses to support the procedural, internal, external, and consequential validity elements regarding the aforementioned interpretations of the cut scores.
Results: Our t-tests and G-studies showed that the intended use of the cut scores was valid regarding the procedural and internal aspects of validity. These findings were called into question by the experts’ lack of confidence in the established cut scores. Regression analyses including the number of lessons taught and intended and pursued science-related studies showed good external but poor consequential validity.
Conclusion: The cut scores can be used as an indicator of 13th graders’ strengths and weaknesses in science. They should not be used as an indicator of preparedness for science studies at university. Since assessment formats are continually evolving, leading to more complex designs, further research needs to be conducted on the application of new standard-setting methods to meet the challenges arising from this development.
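For readers unfamiliar with the two Angoff variants named in the abstract, the core aggregation logic can be sketched as follows. This is a minimal illustration with hypothetical data and function names, not the authors' actual rating sheets or aggregation choices (real panels often use multiple rounds, medians, or weighting):

```python
from statistics import mean

def yes_no_angoff(judgments):
    """Yes/No Angoff for multiple-choice items: each panelist marks,
    per item, whether a borderline (minimally proficient) student
    would answer it correctly. A panelist's cut score is their count
    of 'yes' marks; the panel cut score is the mean across panelists."""
    return mean(sum(panelist) for panelist in judgments)

def extended_angoff(point_estimates):
    """Extended Angoff for complex (polytomous) items: each panelist
    estimates how many points a borderline student would earn per
    item. The panel cut score is the mean of the summed estimates."""
    return mean(sum(panelist) for panelist in point_estimates)

# Hypothetical judgments: three panelists, five multiple-choice items
mc_judgments = [
    [True, True, False, True, False],
    [True, False, False, True, True],
    [True, True, True, False, False],
]
print(yes_no_angoff(mc_judgments))  # 3: each panelist said 'yes' to 3 of 5 items
```

A study like the one described would repeat this once per proficiency-level boundary (here, four cut scores) and then examine rater agreement, e.g. via the G-studies mentioned above.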
format Article
id doaj.art-a2ccc842797945b7b154e696bf91e541
institution Directory Open Access Journal
issn 2196-0739
language English
publishDate 2019-02-01
publisher SpringerOpen
series Large-scale Assessments in Education
spelling The standard setting process: validating interpretations of stakeholders / Nele Kampa, Helene Wagner, Olaf Köller (all: Leibniz Institute for Science and Mathematics Education at the Christian-Albrechts-University of Kiel). Large-scale Assessments in Education, SpringerOpen, 2019-02-01. DOI: 10.1186/s40536-019-0071-8
title The standard setting process: validating interpretations of stakeholders
topic Standard setting
Validity
Science education
Extended Angoff method
Yes/No Angoff method
Large-scale assessment
url http://link.springer.com/article/10.1186/s40536-019-0071-8