Gated Orthogonal Recurrent Units: On Learning to Forget

© 2019 Massachusetts Institute of Technology. We present a novel recurrent neural network (RNN)-based model that combines the remembering ability of unitary evolution RNNs with the ability of gated RNNs to effectively forget redundant or irrelevant information in their memory. We achieve this by extending restricted orthogonal evolution RNNs with a gating mechanism similar to that of gated recurrent unit RNNs, with a reset gate and an update gate. Our model outperforms long short-term memory, gated recurrent units, and vanilla unitary or orthogonal RNNs on several long-term-dependency benchmark tasks. We empirically show that both orthogonal and unitary RNNs lack the ability to forget, an ability that plays an important role in RNNs. We provide competitive results along with an analysis of our model on many natural sequential tasks, including question answering, speech spectrum prediction, character-level language modeling, and synthetic tasks that involve long-term dependencies such as algorithmic, denoising, and copying tasks.

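To make the gating scheme described in the abstract concrete, the sketch below shows one plausible form of such a cell in NumPy: GRU-style reset and update gates wrapped around an orthogonal hidden-to-hidden transition. This is an illustrative assumption, not the authors' reference implementation; the function and parameter names (goru_step, W_r, U, b_h, etc.), the tanh nonlinearity, and the QR-based orthogonal matrix are all stand-ins for the paper's restricted orthogonal parametrization, which is not reproduced here.

```python
# Illustrative sketch of a gated cell with an orthogonal recurrent transition.
# Not the paper's implementation: the nonlinearity, names, and the QR-based
# orthogonal matrix below are assumptions made only to keep the example runnable.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def goru_step(x, h_prev, params):
    """One step of a gated orthogonal recurrent cell (illustrative form).

    x      : (input_dim,)  current input
    h_prev : (hidden_dim,) previous hidden state
    params : dict of weights; params["U"] is kept orthogonal
    """
    W_r, U_r, b_r = params["W_r"], params["U_r"], params["b_r"]   # reset gate
    W_z, U_z, b_z = params["W_z"], params["U_z"], params["b_z"]   # update gate
    W_h, U,   b_h = params["W_h"], params["U"],   params["b_h"]   # candidate state

    r = sigmoid(W_r @ x + U_r @ h_prev + b_r)   # reset gate: how much history to use
    z = sigmoid(W_z @ x + U_z @ h_prev + b_z)   # update gate: how much to overwrite
    # Norm-preserving (orthogonal) recurrent transition, modulated by the reset gate.
    h_tilde = np.tanh(W_h @ x + r * (U @ h_prev) + b_h)
    # GRU-style convex combination of the old state and the candidate state.
    return z * h_prev + (1.0 - z) * h_tilde

# Example usage: build parameters with an orthogonal recurrent matrix U via QR.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 8
U, _ = np.linalg.qr(rng.standard_normal((hidden_dim, hidden_dim)))  # U @ U.T ~= I
params = {"U": U}
for name in ("W_r", "W_z", "W_h"):
    params[name] = 0.1 * rng.standard_normal((hidden_dim, input_dim))
for name in ("U_r", "U_z"):
    params[name] = 0.1 * rng.standard_normal((hidden_dim, hidden_dim))
for name in ("b_r", "b_z", "b_h"):
    params[name] = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
for t in range(5):                       # run a few steps on random inputs
    h = goru_step(rng.standard_normal(input_dim), h, params)
print(h.shape)  # (8,)
```

Note that in this sketch only the candidate-state transition U is norm preserving; the update-gate interpolation means the overall step is not exactly orthogonal, which is precisely what allows the state to decay, i.e. to forget.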

Bibliographic Details
Main Authors: Jing, Li; Gulcehre, Caglar; Peurifoy, John; Shen, Yichen; Tegmark, Max; Soljacic, Marin; Bengio, Yoshua
Other Authors: Sloan School of Management; Massachusetts Institute of Technology. Department of Physics
Format: Article
Language: English
Published: MIT Press - Journals, 2021
Journal: Neural Computation (DOI: 10.1162/neco_a_01174)
Online Access: https://hdl.handle.net/1721.1/135148