Language statistics as a window into mental representations

Bibliographic Details
Main Authors: Fritz Günther, Luca Rinaldi
Format: Article
Language: English
Published: Nature Portfolio 2022-05-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-022-12027-5
author Fritz Günther
Luca Rinaldi
collection DOAJ
description Abstract Large-scale linguistic data is nowadays available in abundance. Using this source of data, previous research has identified redundancies between the statistical structure of natural language and properties of the (physical) world we live in. For example, it has been shown that we can gauge city sizes by analyzing their respective word frequencies in corpora. However, since natural language is always produced by human speakers, we point out that such redundancies can only come about indirectly and should necessarily be restricted to cases where human representations largely retain characteristics of the physical world. To demonstrate this, we examine the statistical occurrence of words referring to body parts in very different languages, covering nearly 4 billion native speakers. This is because the convergence between language and physical properties of the stimuli clearly breaks down for the human body (i.e., more relevant and functional body parts are not necessarily larger in size). Our findings indicate that the human body as extracted from language does not retain its actual physical proportions; instead, it resembles the distorted human-like figure known as the sensory homunculus, whose form depicts the amount of cortical area dedicated to sensorimotor functions of each body part (and, thus, their relative functional relevance). This demonstrates that the surface-level statistical structure of language opens a window into how humans represent the world they live in, rather than into the world itself.
first_indexed 2024-04-12T16:11:53Z
format Article
id doaj.art-713c46a17ee345f89f7b94b46dd59fdb
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-04-12T16:11:53Z
publishDate 2022-05-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-713c46a17ee345f89f7b94b46dd59fdb; 2022-12-22T03:25:52Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2022-05-01; 12 1 1 13; 10.1038/s41598-022-12027-5; Language statistics as a window into mental representations; Fritz Günther (Department of Psychology, Humboldt-Universität zu Berlin); Luca Rinaldi (Department of Brain and Behavioral Sciences, University of Pavia); https://doi.org/10.1038/s41598-022-12027-5
title Language statistics as a window into mental representations
url https://doi.org/10.1038/s41598-022-12027-5