Structured Semantic Knowledge Can Emerge Automatically from Predicting Word Sequences in Child-Directed Speech

Previous research has suggested that distributional learning mechanisms may contribute to the acquisition of semantic knowledge. However, distributional learning mechanisms, statistical learning, and contemporary “deep learning” approaches have been criticized as incapable of learning the kind of abstract and structured knowledge that many think is required for the acquisition of semantic knowledge. In this paper, we show that recurrent neural networks trained on noisy, naturalistic speech to children do in fact learn what appears to be abstract and structured knowledge. We trained two types of recurrent neural networks, a Simple Recurrent Network (SRN) and a Long Short-Term Memory (LSTM) network, to predict word sequences in a 5-million-word corpus of speech directed to children aged 0–3 years, and assessed what semantic knowledge they acquired. We found that the learned internal representations encode abstract grammatical and semantic features that are useful for predicting word sequences. Assessing the organization of semantic knowledge in terms of its similarity structure, we found evidence of emergent categorical and hierarchical structure in both models. The LSTM and the SRN learned very similar kinds of representations, but the LSTM achieved higher performance on a quantitative evaluation. We also trained a non-recurrent neural network, Skip-gram, on the same input to compare our results to the state of the art in machine learning. Skip-gram achieved performance similar to that of the LSTM, but represented words more in terms of thematic than taxonomic relations, and we offer reasons why this might be the case. Our findings show that a learning system that derives abstract, distributed representations for the purpose of predicting sequential dependencies in naturalistic language may provide insight into the emergence of many properties of the developing semantic system.
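
As an illustration of the training setup the abstract describes, the sketch below trains a small LSTM to predict the next word at every position in a sequence. It is a minimal, hypothetical sketch in PyTorch: the toy corpus stands in for the 5-million-word child-directed corpus, and the model sizes and training loop are illustrative, not the authors' actual settings.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the 5-million-word corpus of child-directed
# speech used in the paper; a real corpus would be tokenized the same way.
corpus = "look at the doggy . the doggy runs . look at the kitty .".split()
vocab = sorted(set(corpus))
stoi = {w: i for i, w in enumerate(vocab)}
ids = torch.tensor([stoi[w] for w in corpus])

class NextWordLSTM(nn.Module):
    """LSTM trained to predict the next word at every position."""
    def __init__(self, vocab_size, embed_dim=16, hidden_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        hidden, _ = self.lstm(self.embed(x))
        return self.out(hidden)  # logits over the vocabulary at each step

model = NextWordLSTM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Next-word prediction: the target sequence is the input shifted by one word.
x, y = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)
for step in range(200):
    optimizer.zero_grad()
    logits = model(x)
    loss = loss_fn(logits.reshape(-1, len(vocab)), y.reshape(-1))
    loss.backward()
    optimizer.step()
```

The SRN condition differs only in the recurrent layer: substituting nn.RNN for nn.LSTM gives an Elman-style simple recurrent network.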

Bibliographic Details
Main Authors: Philip A. Huebner (Interdepartmental Neuroscience Graduate Program, University of California, Riverside, Riverside, CA, United States), Jon A. Willits (Department of Psychology, University of California, Riverside, Riverside, CA, United States)
Format: Article
Language: English
Published: Frontiers Media S.A., 2018-02-01
Series: Frontiers in Psychology
ISSN: 1664-1078
DOI: 10.3389/fpsyg.2018.00133
Subjects: semantic development, language learning, neural networks, statistical learning
Online Access: http://journal.frontiersin.org/article/10.3389/fpsyg.2018.00133/full
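
The abstract's claim about categorical and hierarchical structure is typically assessed by comparing the learned word vectors to one another. Below is a sketch of one such analysis; the hand-built vectors are hypothetical stand-ins for rows of a trained model's embedding or hidden-state matrix (e.g., model.embed.weight in the sketch above), chosen so that the two animal words and the two utensil words pattern together.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist

# Hypothetical word vectors standing in for a trained model's representations.
vectors = {
    "doggy": [0.9, 0.8, 0.1, 0.0],
    "kitty": [0.8, 0.9, 0.2, 0.1],
    "cup":   [0.1, 0.0, 0.9, 0.8],
    "spoon": [0.0, 0.2, 0.8, 0.9],
}
words = list(vectors)
X = np.array([vectors[w] for w in words])

# Pairwise cosine distances, then average-linkage agglomerative clustering:
# words from the same category should merge before the categories join.
tree = linkage(pdist(X, metric="cosine"), method="average")
print(dendrogram(tree, labels=words, no_plot=True)["ivl"])
```

On these vectors the dendrogram merges doggy with kitty and cup with spoon before joining the two clusters, which illustrates the kind of emergent category-plus-hierarchy signature the paper looks for.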
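
For the Skip-gram comparison, the paper's third model learns word vectors by predicting nearby context words rather than the strictly next word. Here is a minimal sketch using gensim's Word2Vec with the Skip-gram objective; the sentences and hyperparameters are again illustrative stand-ins, not the authors' settings.

```python
from gensim.models import Word2Vec

# Toy child-directed utterances, one tokenized sentence per list.
sentences = [
    "look at the doggy".split(),
    "the doggy runs".split(),
    "look at the kitty".split(),
    "the kitty runs".split(),
]

# sg=1 selects the Skip-gram objective: predict words within `window`
# positions of the target, in any order, rather than the strictly next word.
sg_model = Word2Vec(sentences, vector_size=16, window=2, min_count=1, sg=1)
print(sg_model.wv.most_similar("doggy", topn=3))
```

Because Skip-gram's context is an unordered window of co-occurring words, its nearest neighbors tend to reflect thematic association (words that occur together) as much as taxonomic category (words that substitute for one another), consistent with the contrast the abstract draws between Skip-gram and the recurrent models.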