The Relationship between Linguistic Representations in Biological and Artificial Neural Networks

Bibliographic Details
Main Author: Kauf, Carina
Other Authors: Fedorenko, Evelina
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access: https://hdl.handle.net/1721.1/157004
Description

Research in cognitive neuroscience strives to understand the representations and algorithms that support human cognition, including language. The scientific tools for investigating human-unique capacities, such as language, have long been limited. For example, we cannot learn about the neural circuits that support these capacities by studying systems simpler than the human brain, such as animal models. However, recent advances in engineering have provided new tools for studying language: artificial neural network language models (LMs), which exhibit remarkable linguistic capabilities and are fully intervenable. In this thesis, I draw on these advances to shed light on language processing in the human brain.

Comparisons between LMs and the human language system face challenges, of course. I argue that in order to evaluate the suitability of LMs as cognitive models of language processing, we need to better understand (i) how linguistic stimuli are encoded in the internal representations of LMs, (ii) how linguistic stimuli are encoded in the language-selective cortex of humans, and (iii) whether and how we can meaningfully relate linguistic representations from these two systems to each other. This thesis makes progress on all three questions by combining evidence from neuroimaging, behavioral research, and computational modeling.

First, I analyze whether LM representations of linguistic stimuli encode information about semantic plausibility. I find that LMs acquire substantial but inconsistent plausibility knowledge and that their judgments are influenced by low-level features of the input, making them good models of human language processing but unreliable models of world knowledge. Then, I use fMRI to probe the computations that drive the language network's response. I find evidence for a generalized reliance of language comprehension on syntactic processing, contra claims that comprehension relies on shallow/associative processing, and for only a superficial encoding of sentence meaning. Finally, I systematically investigate which aspects of language inputs are critical for LM-to-brain alignment. I find that LMs represent linguistic information similarly enough to humans to enable relatively accurate brain encoding, and that this alignment is driven mainly by representations of word meanings rather than sentence structure.

Taken together, this thesis provides evidence that the core language network encodes semantic information only superficially, implying that naturalistic human language processing must rely on the interaction of multiple tightly interconnected systems. It also argues that, despite their limitations, LMs can help improve our understanding of human language processing through the interplay of in-silico modeling and human experiments.
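The LM-to-brain alignment analyses summarized in the abstract rest on a standard brain-encoding approach: a regularized linear map from model representations of stimuli to voxel responses, evaluated by prediction accuracy on held-out stimuli. The sketch below illustrates that general approach on synthetic data; the array sizes, ridge penalty, and random inputs are illustrative assumptions, not values or code from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: in a real encoding analysis, X would hold LM
# representations of sentences and Y the fMRI responses of
# language-network voxels to the same sentences.
n_train, n_test, n_feat, n_vox = 200, 50, 32, 10
X = rng.standard_normal((n_train + n_test, n_feat))
B = rng.standard_normal((n_feat, n_vox))            # hypothetical true mapping
Y = X @ B + 0.5 * rng.standard_normal((n_train + n_test, n_vox))

X_tr, X_te = X[:n_train], X[n_train:]
Y_tr, Y_te = Y[:n_train], Y[n_train:]

def ridge_fit(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X'X + lam*I)^-1 X'Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

W = ridge_fit(X_tr, Y_tr)
Y_hat = X_te @ W

def voxelwise_r(A, B):
    """Pearson r between columns of A and B (one value per voxel)."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    return (A * B).sum(axis=0) / np.sqrt((A**2).sum(axis=0) * (B**2).sum(axis=0))

# Encoding performance: correlation between predicted and observed
# held-out responses, computed separately for each voxel.
r = voxelwise_r(Y_hat, Y_te)
print(f"mean held-out voxel correlation: {r.mean():.2f}")
```

In practice, encoding studies of this kind compare different stimulus representations (e.g., word-meaning versus sentence-structure features) by how well each predicts the held-out brain responses.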
Department: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Degree: Ph.D.
Date Issued: 2024-05
Author ORCID: 0000-0002-2718-1978
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0); copyright retained by author(s); https://creativecommons.org/licenses/by-nc-nd/4.0/
File Format: application/pdf