The standard setting process: validating interpretations of stakeholders
Abstract. Background: Stakeholders’ interpretations of the findings of large-scale educational assessments can influence important decisions. In the context of educational assessment, standard-setting remains an especially critical element, because it is complex and largely unstandardized. Instruments...
Main Authors: | Nele Kampa, Helene Wagner, Olaf Köller |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2019-02-01 |
Series: | Large-scale Assessments in Education |
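Subjects: | Standard setting; Validity; Science education; Extended Angoff method; Yes/No Angoff method; Large-scale assessment |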
Online Access: | http://link.springer.com/article/10.1186/s40536-019-0071-8 |
_version_ | 1818302590528520192 |
---|---|
author | Nele Kampa; Helene Wagner; Olaf Köller |
author_facet | Nele Kampa; Helene Wagner; Olaf Köller |
author_sort | Nele Kampa |
collection | DOAJ |
description | Background: Stakeholders’ interpretations of the findings of large-scale educational assessments can influence important decisions. In the context of educational assessment, standard-setting remains an especially critical element, because it is complex and largely unstandardized. Instruments established by means of standard-setting procedures, such as proficiency levels (PL), therefore appear to be arbitrary to some degree. Owing to the significance such results take on when they are communicated to stakeholders or the public, a thorough validation of this process seems crucial. In our study, ministry stakeholders intended to use PL established in an assessment of science abilities to obtain information about students’ strengths and weaknesses regarding science abilities in general and specifically about the extent to which they were prepared for future science studies. The aim of our study was to investigate the validity arguments regarding these two intended interpretations. Methods: Based on a university science test administered to 3641 upper secondary students (Grade 13), a panel of nine experts set four cut scores using two variations of the Angoff method: the Yes/No Angoff method (multiple choice items) and the extended Angoff method (complex multiple choice items). We carried out t-tests, repeated measures ANOVA, G-studies and regression analyses to support the procedural, internal, external, and consequential validity elements regarding the aforementioned interpretations of the cut scores. Results: Our t-tests and G-studies showed that the intended use of the cut scores was valid regarding procedural and internal aspects of validity. These findings were called into question by the experts’ lack of confidence in the established cut scores. Regression analyses including number of lessons taught and intended and pursued science-related studies showed good external and poor consequential validity.
Conclusion: The cut scores can be used as an indicator of 13th graders’ strengths and weaknesses in science. They should not be used as an indicator of preparedness for university science studies. Since assessment formats are continually evolving and consequently leading to more complex designs, further research needs to be conducted on the application of new standard-setting methods to meet the challenges arising from this development. |
first_indexed | 2024-12-13T05:41:20Z |
format | Article |
id | doaj.art-a2ccc842797945b7b154e696bf91e541 |
institution | Directory Open Access Journal |
issn | 2196-0739 |
language | English |
last_indexed | 2024-12-13T05:41:20Z |
publishDate | 2019-02-01 |
publisher | SpringerOpen |
record_format | Article |
series | Large-scale Assessments in Education |
spelling | doaj.art-a2ccc842797945b7b154e696bf91e541; 2022-12-21T23:57:47Z; eng; SpringerOpen; Large-scale Assessments in Education; 2196-0739; 2019-02-01; 7; 1; 1; 25; 10.1186/s40536-019-0071-8; The standard setting process: validating interpretations of stakeholders; Nele Kampa (0), Helene Wagner (1), Olaf Köller (2); Leibniz Institute for Science and Mathematics Education at the Christian-Albrechts-University of Kiel (all three authors); http://link.springer.com/article/10.1186/s40536-019-0071-8; Standard setting; Validity; Science education; Extended Angoff method; Yes/No Angoff method; Large-scale assessment |
title | The standard setting process: validating interpretations of stakeholders |
topic | Standard setting; Validity; Science education; Extended Angoff method; Yes/No Angoff method; Large-scale assessment |
url | http://link.springer.com/article/10.1186/s40536-019-0071-8 |