Crowdsourcing ratings for single lexical items

In this study, we investigate theoretical and practical issues connected to differentiating between core and peripheral vocabulary at different levels of linguistic proficiency using statistical approaches combined with crowdsourcing. We also investigate whether crowdsourcing second language learne...


Bibliographic Details
Main Authors:	Elena Volodina, David Alfter, Therese Lindström Tiedemann
Format:	Article
Language:	English
Published:	University of Ljubljana Press (Založba Univerze v Ljubljani) 2022-12-01
Series:	Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave
Subjects:	core vocabulary and language learning; non-expert crowdsourcing; single lexical items; CEFR levels; comparative judgment
Online Access:	https://journals.uni-lj.si/slovenscina2/article/view/11247

_version_	1797730781280337920
author	Elena Volodina, David Alfter, Therese Lindström Tiedemann
author_facet	Elena Volodina, David Alfter, Therese Lindström Tiedemann
author_sort	Elena Volodina
collection	DOAJ
description	In this study, we investigate theoretical and practical issues connected to differentiating between core and peripheral vocabulary at different levels of linguistic proficiency using statistical approaches combined with crowdsourcing. We also investigate whether crowdsourcing second language learners’ rankings can be used for assigning levels to unseen vocabulary. The study is performed on Swedish single-word items. The four hypotheses we examine are: (1) there is core vocabulary for each proficiency level, but this is only true until CEFR level B2 (upper-intermediate); (2) core vocabulary shows more systematicity in its behavior and usage, whereas peripheral items have more idiosyncratic behavior; (3) given that we have truly core items (aka anchor items) for each level, we can place any new unseen item in relation to the identified core items by using a series of comparative judgment tasks, this way assigning a “target” level for a previously unseen item; and (4) non-experts will perform on par with experts in a comparative judgment setting. The hypotheses have been largely confirmed: In relation to (1) and (2), our results show that there seems to be some systematicity in core vocabulary for early to mid-levels (A1-B1) while we find less systematicity for higher levels (B2-C1). In relation to (3), we suggest crowdsourcing word rankings using comparative judgment with known anchor words as a method to assign a “target” level to unseen words. With regard to (4), we confirm the previous findings that non-experts, in our case language learners, can be effectively used for the linguistic annotation tasks in a comparative judgment setting.
first_indexed	2024-03-12T11:49:16Z
format	Article
id	doaj.art-0c53fd68a2794b6faff535bcc2cecc8f
institution	Directory of Open Access Journals
issn	2335-2736
language	English
last_indexed	2024-03-12T11:49:16Z
publishDate	2022-12-01
publisher	University of Ljubljana Press (Založba Univerze v Ljubljani)
record_format	Article
series	Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave
spelling	doaj.art-0c53fd68a2794b6faff535bcc2cecc8f; 2023-08-31T11:24:09Z; eng; University of Ljubljana Press (Založba Univerze v Ljubljani); Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave; ISSN 2335-2736; 2022-12-01; vol. 10, no. 2; DOI 10.4312/slo2.0.2022.2.5-61; Crowdsourcing ratings for single lexical items; Elena Volodina (University of Gothenburg, Sweden); David Alfter (University of Gothenburg, Sweden; Université Catholique de Louvain, Belgium); Therese Lindström Tiedemann (University of Helsinki, Finland); https://journals.uni-lj.si/slovenscina2/article/view/11247; core vocabulary and language learning; non-expert crowdsourcing; single lexical items; CEFR levels; comparative judgment
title	Crowdsourcing ratings for single lexical items
topic	core vocabulary and language learning; non-expert crowdsourcing; single lexical items; CEFR levels; comparative judgment
url	https://journals.uni-lj.si/slovenscina2/article/view/11247
