Crowdsourcing ratings for single lexical items
Main Authors: | Elena Volodina, David Alfter, Therese Lindström Tiedemann |
---|---|
Format: | Article |
Language: | English |
Published: | University of Ljubljana Press (Založba Univerze v Ljubljani), 2022-12-01 |
Series: | Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave |
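Subjects: | core vocabulary and language learning; non-expert crowdsourcing; single lexical items; CEFR levels; comparative judgment |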
Online Access: | https://journals.uni-lj.si/slovenscina2/article/view/11247 |
_version_ | 1797730781280337920 |
---|---|
author | Elena Volodina; David Alfter; Therese Lindström Tiedemann |
collection | DOAJ |
description | In this study, we investigate theoretical and practical issues connected to differentiating between core and peripheral vocabulary at different levels of linguistic proficiency using statistical approaches combined with crowdsourcing. We also investigate whether crowdsourcing second language learners’ rankings can be used for assigning levels to unseen vocabulary. The study is performed on Swedish single-word items. The four hypotheses we examine are: (1) there is core vocabulary for each proficiency level, but this is only true until CEFR level B2 (upper-intermediate); (2) core vocabulary shows more systematicity in its behavior and usage, whereas peripheral items have more idiosyncratic behavior; (3) given that we have truly core items (aka anchor items) for each level, we can place any new unseen item in relation to the identified core items by using a series of comparative judgment tasks, thereby assigning a “target” level to a previously unseen item; and (4) non-experts will perform on par with experts in a comparative judgment setting. The hypotheses have been largely confirmed: in relation to (1) and (2), our results show that there seems to be some systematicity in core vocabulary for early to mid-levels (A1-B1), while we find less systematicity for higher levels (B2-C1). In relation to (3), we suggest crowdsourcing word rankings using comparative judgment with known anchor words as a method to assign a “target” level to unseen words. With regard to (4), we confirm the previous findings that non-experts, in our case language learners, can be effectively used for linguistic annotation tasks in a comparative judgment setting. |
first_indexed | 2024-03-12T11:49:16Z |
format | Article |
id | doaj.art-0c53fd68a2794b6faff535bcc2cecc8f |
institution | Directory Open Access Journal |
issn | 2335-2736 |
language | English |
last_indexed | 2024-03-12T11:49:16Z |
publishDate | 2022-12-01 |
publisher | University of Ljubljana Press (Založba Univerze v Ljubljani) |
record_format | Article |
series | Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave |
spelling | doaj.art-0c53fd68a2794b6faff535bcc2cecc8f; 2023-08-31T11:24:09Z; eng; University of Ljubljana Press (Založba Univerze v Ljubljani); Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave; ISSN 2335-2736; 2022-12-01; vol. 10, no. 2; DOI 10.4312/slo2.0.2022.2.5-61; Crowdsourcing ratings for single lexical items; Elena Volodina (University of Gothenburg, Sweden); David Alfter (University of Gothenburg, Sweden; Université Catholique de Louvain, Belgium); Therese Lindström Tiedemann (University of Helsinki, Finland); https://journals.uni-lj.si/slovenscina2/article/view/11247; core vocabulary and language learning; non-expert crowdsourcing; single lexical items; CEFR levels; comparative judgment |
title | Crowdsourcing ratings for single lexical items |
topic | core vocabulary and language learning; non-expert crowdsourcing; single lexical items; CEFR levels; comparative judgment |
url | https://journals.uni-lj.si/slovenscina2/article/view/11247 |