Crowd-sourced assessment of technical skills: A novel method to evaluate surgical performance

Background: Validated methods of objective assessments of surgical skills are resource intensive. We sought to test a web-based grading tool using crowdsourcing called Crowd-Sourced Assessment of Technical Skill. Materials and methods: Institutional Review Board approval was granted to test the accura...

Bibliographic Details
Main Authors:	Chen, C, White, L, Kowalewski, T, Aggarwal, R, Lintott, C, Comstock, B, Kuksenok, K, Aragon, C, Holst, D, Lendvay, T
Format:	Journal article
Published:	2014

author	Chen, C White, L Kowalewski, T Aggarwal, R Lintott, C Comstock, B Kuksenok, K Aragon, C Holst, D Lendvay, T
collection	OXFORD
description	Background: Validated methods of objective assessments of surgical skills are resource intensive. We sought to test a web-based grading tool using crowdsourcing called Crowd-Sourced Assessment of Technical Skill. Materials and methods: Institutional Review Board approval was granted to test the accuracy of Amazon.com's Mechanical Turk and Facebook crowdworkers compared with experienced surgical faculty grading a recorded dry-laboratory robotic surgical suturing performance using three performance domains from a validated assessment tool. Assessor free-text comments describing their rating rationale were used to explore a relationship between the language used by the crowd and grading accuracy. Results: Of a total possible global performance score of 3-15, 10 experienced surgeons graded the suturing video at a mean score of 12.11 (95% confidence interval [CI], 11.11-13.11). Mechanical Turk and Facebook graders rated the video at mean scores of 12.21 (95% CI, 11.98-12.43) and 12.06 (95% CI, 11.57-12.55), respectively. It took 24 h to obtain responses from 501 Mechanical Turk subjects, whereas it took 24 d for 10 faculty surgeons to complete the 3-min survey. Facebook subjects (110) responded within 25 d. Language analysis indicated that crowdworkers who used negation words (i.e., "but," "although," and so forth) scored the performance more equivalently to experienced surgeons than crowdworkers who did not (P < 0.00001). Conclusions: For a robotic suturing performance, we have shown that surgery-naive crowdworkers can rapidly assess skill equivalent to experienced faculty surgeons using Crowd-Sourced Assessment of Technical Skill. It remains to be seen whether crowds can discriminate different levels of skill and can accurately assess human surgery performances. © 2014 Elsevier Inc. All rights reserved.
first_indexed	2024-03-06T23:42:15Z
format	Journal article
id	oxford-uuid:6fbace98-0f83-45ba-815f-7403264a07ae
institution	University of Oxford
last_indexed	2024-03-06T23:42:15Z
publishDate	2014
record_format	dspace
spelling	oxford-uuid:6fbace98-0f83-45ba-815f-7403264a07ae 2022-03-26T19:32:32Z Journal article http://purl.org/coar/resource_type/c_dcae04bc uuid:6fbace98-0f83-45ba-815f-7403264a07ae Symplectic Elements at Oxford 2014
title	Crowd-sourced assessment of technical skills: A novel method to evaluate surgical performance

