Corpus of Mandarin Child Language: a preliminary study on the acquisition of semantic content categories in Mandarin-speaking preschoolers

Bibliographic Details
Main Authors: Tempo Po-Yi Tang, Dustin Kai-Yan Lau, Man-Tak Leung
Format: Article
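Language: English
Published: Frontiers Media S.A. 2023-11-01
Series: Frontiers in Psychology
Subjects: semantic content category; language corpus; Mandarin-speaking children; cognitive and syntactic complexity; acquisition
Online Access: https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1234525/full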
author Tempo Po-Yi Tang
Dustin Kai-Yan Lau
Man-Tak Leung
collection DOAJ
description A sizable body of research on child language acquisition has focused on form and lexical semantics. This study aims to establish a child language database annotated both syntactically, with part of speech, and semantically, with semantic content category, to supplement the study of child language acquisition in the semantic domain beyond the lexical level. The Corpus of Mandarin Child Language (CMCL), which documents the production of different semantic content categories by Mandarin-speaking children, was established. Naturalistic language samples were obtained from 82 native Mandarin-speaking children aged 25–60 months, divided into three age groups. In addition to part-of-speech annotation, the semantic content categories coded in each utterance were tagged according to previous studies. Mean length of utterance (MLU) and lexical diversity were examined, and the usage and acquisition of different semantic content categories were analyzed. The results for syntactic complexity and lexical diversity replicated the typical language acquisition pattern reported in previous studies, which supports the validity of the CMCL data. To investigate the trajectory of acquisition of the semantic content categories by age, a 90% acquisition criterion was used. The acquisition order of semantic content categories was largely in line with previous studies, with some minor differences. The observed acquisition order is largely explained by the cognitive and syntactic complexity associated with each semantic content category, with additional influence from language-specific properties and culture-specific factors of Mandarin. In addition, with tags for both part of speech and semantic content category, the CMCL potentially provides a platform for examining the form-content interface in early child language acquisition, which has significant theoretical and clinical implications.
first_indexed 2024-03-08T13:23:59Z
format Article
id doaj.art-ce460f977a044157a8312440fd9057b0
institution Directory Open Access Journal
issn 1664-1078
language English
last_indexed 2024-03-08T13:23:59Z
publishDate 2023-11-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Psychology
spelling doaj.art-ce460f977a044157a8312440fd9057b0 2024-01-17T15:24:41Z eng Frontiers Media S.A. Frontiers in Psychology 1664-1078 2023-11-01 14 10.3389/fpsyg.2023.1234525 1234525
title Corpus of Mandarin Child Language: a preliminary study on the acquisition of semantic content categories in Mandarin-speaking preschoolers
topic semantic content category
language corpus
Mandarin-speaking children
cognitive and syntactic complexity
acquisition
url https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1234525/full