Gated Orthogonal Recurrent Units: On Learning to Forget

© 2019 Massachusetts Institute of Technology. We present a novel recurrent neural network (RNN)-based model that combines the remembering ability of unitary evolution RNNs with the ability of gated RNNs to effectively forget redundant or irrelevant information in their memory. We achieve this by extending restricted orthogonal evolution RNNs with a gating mechanism similar to that of gated recurrent unit RNNs, with a reset gate and an update gate. Our model outperforms long short-term memory, gated recurrent units, and vanilla unitary or orthogonal RNNs on several long-term-dependency benchmark tasks. We empirically show that both orthogonal and unitary RNNs lack the ability to forget, an ability that plays an important role in RNNs. We provide competitive results along with an analysis of our model on many natural sequential tasks, including question answering, speech spectrum prediction, character-level language modeling, and synthetic tasks that involve long-term dependencies such as algorithmic, denoising, and copying tasks.

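To make the gating scheme described in the abstract concrete, the sketch below shows one plausible form of such a cell in NumPy: GRU-style reset and update gates wrapped around an orthogonal hidden-to-hidden transition. This is an illustrative assumption, not the authors' reference implementation; the function and parameter names (goru_step, W_r, U, b_h, etc.), the tanh nonlinearity, and the QR-based orthogonal matrix are all stand-ins for the paper's restricted orthogonal parametrization, which is not reproduced here.

```python
# Illustrative sketch of a gated cell with an orthogonal recurrent transition.
# Not the paper's implementation: the nonlinearity, names, and the QR-based
# orthogonal matrix below are assumptions made only to keep the example runnable.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def goru_step(x, h_prev, params):
    """One step of a gated orthogonal recurrent cell (illustrative form).

    x      : (input_dim,)  current input
    h_prev : (hidden_dim,) previous hidden state
    params : dict of weights; params["U"] is kept orthogonal
    """
    W_r, U_r, b_r = params["W_r"], params["U_r"], params["b_r"]   # reset gate
    W_z, U_z, b_z = params["W_z"], params["U_z"], params["b_z"]   # update gate
    W_h, U,   b_h = params["W_h"], params["U"],   params["b_h"]   # candidate state

    r = sigmoid(W_r @ x + U_r @ h_prev + b_r)   # reset gate: how much history to use
    z = sigmoid(W_z @ x + U_z @ h_prev + b_z)   # update gate: how much to overwrite
    # Norm-preserving (orthogonal) recurrent transition, modulated by the reset gate.
    h_tilde = np.tanh(W_h @ x + r * (U @ h_prev) + b_h)
    # GRU-style convex combination of the old state and the candidate state.
    return z * h_prev + (1.0 - z) * h_tilde

# Example usage: build parameters with an orthogonal recurrent matrix U via QR.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 8
U, _ = np.linalg.qr(rng.standard_normal((hidden_dim, hidden_dim)))  # U @ U.T ~= I
params = {"U": U}
for name in ("W_r", "W_z", "W_h"):
    params[name] = 0.1 * rng.standard_normal((hidden_dim, input_dim))
for name in ("U_r", "U_z"):
    params[name] = 0.1 * rng.standard_normal((hidden_dim, hidden_dim))
for name in ("b_r", "b_z", "b_h"):
    params[name] = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
for t in range(5):                       # run a few steps on random inputs
    h = goru_step(rng.standard_normal(input_dim), h, params)
print(h.shape)  # (8,)
```

Note that in this sketch only the candidate-state transition U is norm preserving; the update-gate interpolation means the overall step is not exactly orthogonal, which is precisely what allows the state to decay, i.e. to forget.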

Bibliographic Details
Main Authors: Jing, Li; Gulcehre, Caglar; Peurifoy, John; Shen, Yichen; Tegmark, Max; Soljacic, Marin; Bengio, Yoshua
Other Authors: Sloan School of Management; Massachusetts Institute of Technology. Department of Physics
Format: Article
Language: English
Published: MIT Press - Journals, 2021
Journal: Neural Computation (DOI: 10.1162/neco_a_01174)
Online Access: https://hdl.handle.net/1721.1/135148