Part of speech tagging of grammatical features related to L2 Chinese development: A case analysis of Stanza in the L2 writing context

Grammatical complexity has received extensive attention in second language acquisition. Although computational tools have been developed to analyze grammatical complexity, most relevant studies investigated this construct in the context of English as a second language. In response to an increasing n...

Bibliographic Details
Main Authors:	Ge Lan, Xiaofei Pan, Yachao Sun, Yuan Lu
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2023-02-01
Series:	Frontiers in Psychology
Subjects:	part of speech tagging; SLA; corpus linguistics; language development; grammatical features; Chinese as a second language
Online Access:	https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1139703/full

author	Ge Lan, Xiaofei Pan, Yachao Sun, Yuan Lu
collection	DOAJ
description	Grammatical complexity has received extensive attention in second language acquisition. Although computational tools have been developed to analyze grammatical complexity, most relevant studies investigated this construct in the context of English as a second language. In response to an increasing number of L2 Chinese learners, it is important to extend the investigation of grammatical complexity in L2 Chinese. To promote relevant research, we evaluated the new computational tool, Stanza, on its accuracy of part-of-speech tagging for L2 Chinese writing. We particularly focused on eight grammatical features closely related to L2 Chinese development. Then, we reported the precisions, recalls, and F-scores for the individual grammatical features and offered a qualitative analysis of systematic tagging errors. In terms of the precision, three features have high rates, over 90% (i.e., ba and bei markers, classifiers, -de as noun modifier marker). For recall, four features have high rates, over 90% (i.e., aspect markers, ba and bei markers, classifiers, -de as noun modifier marker). Overall, based on the F-scores, Stanza has a good tagging performance on ba and bei markers, classifiers, and -de as a noun modifier marker. This evaluation provides research implications for scholars who plan to use this computational tool to study L2 Chinese development in second language acquisition or applied linguistics in general.
first_indexed	2024-04-10T10:06:35Z
format	Article
id	doaj.art-487950b0a74b43d78c2cb39e093ff319
institution	Directory of Open Access Journals
issn	1664-1078
language	English
last_indexed	2024-04-10T10:06:35Z
publishDate	2023-02-01
publisher	Frontiers Media S.A.
record_format	Article
series	Frontiers in Psychology
spelling	doaj.art-487950b0a74b43d78c2cb39e093ff319 | 2023-02-15T16:57:07Z | eng | Frontiers Media S.A. | Frontiers in Psychology | 1664-1078 | 2023-02-01 | 14 | 10.3389/fpsyg.2023.1139703 | 1139703 | Ge Lan (0), Xiaofei Pan (1), Yachao Sun (2), Yuan Lu (3) | (0) Department of English, City University of Hong Kong, Hong Kong, Hong Kong SAR, China; (1) Language and Culture Center, Duke Kunshan University, Suzhou, China; (2) Language and Culture Center, Duke Kunshan University, Suzhou, China; (3) Department of Asian and Slavic Languages and Literatures, The University of Iowa, Iowa City, IA, United States
title	Part of speech tagging of grammatical features related to L2 Chinese development: A case analysis of Stanza in the L2 writing context
topic	part of speech tagging; SLA; corpus linguistics; language development; grammatical features; Chinese as a second language
url	https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1139703/full
