The Relationship between Linguistic Representations in Biological and Artificial Neural Networks
Main Author: | Kauf, Carina |
---|---|
Other Authors: | Fedorenko, Evelina |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2024 |
Online Access: | https://hdl.handle.net/1721.1/157004 |
_version_ | 1824458208039665664 |
---|---|
author | Kauf, Carina |
author2 | Fedorenko, Evelina |
author_facet | Fedorenko, Evelina Kauf, Carina |
author_sort | Kauf, Carina |
collection | MIT |
description | Research in cognitive neuroscience strives to understand the representations and algorithms that support human cognition, including language. The scientific tools for investigating human-unique capacities, such as language, have long been limited. For example, we do not have the option to learn about the neural circuits that support these capabilities by studying simpler systems than the human brain, such as animal models. However, recent advances in engineering have provided new tools for studying language: artificial neural network language models (LMs), which exhibit remarkable linguistic capabilities and are fully intervenable. In this thesis, I draw on these advances to shed light on language processing in the human brain.
Of course, comparisons between LMs and the human language system face challenges. I argue that in order to evaluate the suitability of LMs as cognitive models of language processing, we need to better understand (i) how linguistic stimuli are encoded in the internal representations of LMs, (ii) how linguistic stimuli are encoded in the language-selective cortex of humans, and (iii) whether and how we can meaningfully relate linguistic representations from these two systems to each other. This thesis work makes progress on all three questions by combining evidence from neuroimaging, behavioral research, and computational modeling. First, I analyze whether LM representations of linguistic stimuli encode information about semantic plausibility. I find that LMs acquire substantial but inconsistent plausibility knowledge and that their judgments are influenced by low-level features of the input, making them good models of human language processing but unreliable models of world knowledge. Then I use fMRI to probe the computations that drive the language network’s response. I find evidence for a generalized reliance of language comprehension on syntactic processing, contra claims that language comprehension relies on shallow/associative processing, and for only a superficial encoding of sentence meaning. Finally, I systematically investigate what aspects of language inputs are critical for LM-to-brain alignment. I find that LMs represent linguistic information similarly enough to humans to enable relatively accurate brain encoding and that this alignment is mainly driven by representations of word meanings rather than sentence structure.
Taken together, this thesis provides evidence that the core language network encodes semantic information only superficially, implying that naturalistic human language processing must rely on the interaction of multiple tightly interconnected systems, and argues that – in spite of their limitations – LMs can help improve our understanding of human language processing through the interplay of in-silico modeling and human experiments. |
first_indexed | 2025-02-19T04:22:14Z |
format | Thesis |
id | mit-1721.1/157004 |
institution | Massachusetts Institute of Technology |
last_indexed | 2025-02-19T04:22:14Z |
publishDate | 2024 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/1570042024-09-25T03:32:56Z The Relationship between Linguistic Representations in Biological and Artificial Neural Networks Kauf, Carina Fedorenko, Evelina Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences Research in cognitive neuroscience strives to understand the representations and algorithms that support human cognition, including language. The scientific tools for investigating human-unique capacities, such as language, have long been limited. For example, we do not have the option to learn about the neural circuits that support these capabilities by studying simpler systems than the human brain, such as animal models. However, recent advances in engineering have provided new tools for studying language: artificial neural network language models (LMs), which exhibit remarkable linguistic capabilities and are fully intervenable. In this thesis, I draw on these advances to shed light on language processing in the human brain. Of course, comparisons between LMs and the human language system face challenges. I argue that in order to evaluate the suitability of LMs as cognitive models of language processing, we need to better understand (i) how linguistic stimuli are encoded in the internal representations of LMs (ii) how linguistic stimuli are encoded in the language-selective cortex of humans, and (iii) whether and how we can meaningfully relate linguistic representations from these two systems to each other. This thesis work makes progress on all three questions by combining evidence from neuroimaging, behavioral research, and computational modeling. First, I analyze whether LM representations of linguistic stimuli encode information about semantic plausibility. I find that LMs acquire substantial but inconsistent plausibility knowledge and that their judgments are influenced by low-level features of the input, making them good models of human language processing but unreliable models of world knowledge. 
Then I use fMRI to probe the computations that drive the language network’s response. I find evidence for a generalized reliance of language comprehension on syntactic processing, contra claims that language comprehension relies on shallow/associative processing, and for only a superficial encoding of sentence meaning. Finally, I systematically investigate what aspects of language inputs are critical for LM-to-brain alignment. I find that LMs represent linguistic information similarly enough to humans to enable relatively accurate brain encoding and that this alignment is mainly driven by representations of word meanings rather than sentence structure. Taken together, this thesis provides evidence that the core language network encodes semantic information only superficially, implying that naturalistic human language processing must rely on the interaction of multiple tightly interconnected systems, and argues that – in spite of their limitations – LMs can help improve our understanding of human language processing through the interplay of in-silico modeling and human experiments. Ph.D. 2024-09-24T18:26:10Z 2024-09-24T18:26:10Z 2024-05 2024-07-11T15:34:46.121Z Thesis https://hdl.handle.net/1721.1/157004 0000-0002-2718-1978 Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) Copyright retained by author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ application/pdf Massachusetts Institute of Technology |
spellingShingle | Kauf, Carina The Relationship between Linguistic Representations in Biological and Artificial Neural Networks |
title | The Relationship between Linguistic Representations in Biological and Artificial Neural Networks |
title_sort | relationship between linguistic representations in biological and artificial neural networks |
url | https://hdl.handle.net/1721.1/157004 |
work_keys_str_mv | AT kaufcarina therelationshipbetweenlinguisticrepresentationsinbiologicalandartificialneuralnetworks AT kaufcarina relationshipbetweenlinguisticrepresentationsinbiologicalandartificialneuralnetworks |