Gated Orthogonal Recurrent Units: On Learning to Forget
© 2019 Massachusetts Institute of Technology. We present a novel recurrent neural network (RNN)-based model that combines the remembering ability of unitary evolution RNNs with the ability of gated RNNs to effectively forget redundant or irrelevant information in its memory. We achieve this by extending restricted orthogonal evolution RNNs with a gating mechanism similar to gated recurrent unit RNNs with a reset gate and an update gate. Our model is able to outperform long short-term memory, gated recurrent units, and vanilla unitary or orthogonal RNNs on several long-term-dependency benchmark tasks. We empirically show that both orthogonal and unitary RNNs lack the ability to forget. This ability plays an important role in RNNs. We provide competitive results along with an analysis of our model on many natural sequential tasks, including question answering, speech spectrum prediction, character-level language modeling, and synthetic tasks that involve long-term dependencies such as algorithmic, denoising, and copying tasks.
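To make the abstract's description concrete, the following is a minimal numpy sketch of a gated orthogonal recurrent cell as the abstract characterizes it: GRU-style reset and update gates wrapped around a candidate state whose recurrent matrix is orthogonal. The class and parameter names, the tanh nonlinearity, and the QR-based orthogonal initialization are illustrative assumptions, not the paper's exact construction (the paper uses a restricted orthogonal parameterization and a modReLU-style nonlinearity).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def random_orthogonal(n, rng):
    # QR decomposition of a random Gaussian matrix yields an orthogonal Q.
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

class GORUCell:
    """Sketch of a gated orthogonal recurrent cell (illustrative naming).

    Reset gate r_t and update gate z_t follow the GRU; the candidate
    state uses an orthogonal recurrent matrix U_h, whose eigenvalues all
    have unit norm (mitigating vanishing/exploding gradients), while the
    gates give the cell the ability to forget.
    """

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_size)
        # Gate parameters (reset r and update z), as in a GRU.
        self.W_r = rng.uniform(-s, s, (hidden_size, input_size))
        self.U_r = rng.uniform(-s, s, (hidden_size, hidden_size))
        self.b_r = np.zeros(hidden_size)
        self.W_z = rng.uniform(-s, s, (hidden_size, input_size))
        self.U_z = rng.uniform(-s, s, (hidden_size, hidden_size))
        self.b_z = np.zeros(hidden_size)
        # Candidate state: recurrent matrix constrained to be orthogonal.
        self.W_h = rng.uniform(-s, s, (hidden_size, input_size))
        self.U_h = random_orthogonal(hidden_size, rng)
        self.b_h = np.zeros(hidden_size)

    def step(self, x, h):
        r = sigmoid(self.W_r @ x + self.U_r @ h + self.b_r)  # reset gate
        z = sigmoid(self.W_z @ x + self.U_z @ h + self.b_z)  # update gate
        # Reset gate modulates the orthogonally rotated previous state.
        h_tilde = np.tanh(self.W_h @ x + r * (self.U_h @ h) + self.b_h)
        # Update gate interpolates between keeping and overwriting memory.
        return z * h + (1.0 - z) * h_tilde

# Usage: run the cell over a short random input sequence.
cell = GORUCell(input_size=4, hidden_size=8)
h = np.zeros(8)
for x in np.random.default_rng(1).standard_normal((5, 4)):
    h = cell.step(x, h)
print(h.shape)  # (8,)
```

Note that this sketch fixes U_h at initialization; to train such a cell one would need a parameterization that keeps the recurrent matrix orthogonal under gradient updates, for example a composition of elementary rotations, rather than an unconstrained weight matrix.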
Main Authors: | Jing, Li, Gulcehre, Caglar, Peurifoy, John, Shen, Yichen, Tegmark, Max, Soljacic, Marin, Bengio, Yoshua |
---|---|
Other Authors: | Sloan School of Management |
Format: | Article |
Language: | English |
Published: | MIT Press - Journals 2021 |
Online Access: | https://hdl.handle.net/1721.1/135148 |
_version_ | 1826214122572742656 |
---|---|
author | Jing, Li Gulcehre, Caglar Peurifoy, John Shen, Yichen Tegmark, Max Soljacic, Marin Bengio, Yoshua |
author2 | Sloan School of Management |
author_facet | Sloan School of Management Jing, Li Gulcehre, Caglar Peurifoy, John Shen, Yichen Tegmark, Max Soljacic, Marin Bengio, Yoshua |
author_sort | Jing, Li |
collection | MIT |
description | © 2019 Massachusetts Institute of Technology. We present a novel recurrent neural network (RNN)-based model that combines the remembering ability of unitary evolution RNNs with the ability of gated RNNs to effectively forget redundant or irrelevant information in its memory. We achieve this by extending restricted orthogonal evolution RNNs with a gating mechanism similar to gated recurrent unit RNNs with a reset gate and an update gate. Our model is able to outperform long short-term memory, gated recurrent units, and vanilla unitary or orthogonal RNNs on several long-term-dependency benchmark tasks. We empirically show that both orthogonal and unitary RNNs lack the ability to forget. This ability plays an important role in RNNs. We provide competitive results along with an analysis of our model on many natural sequential tasks, including question answering, speech spectrum prediction, character-level language modeling, and synthetic tasks that involve long-term dependencies such as algorithmic, denoising, and copying tasks. |
first_indexed | 2024-09-23T15:59:52Z |
format | Article |
id | mit-1721.1/135148 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T15:59:52Z |
publishDate | 2021 |
publisher | MIT Press - Journals |
record_format | dspace |
spelling | mit-1721.1/135148 2023-01-23T17:20:32Z Gated Orthogonal Recurrent Units: On Learning to Forget Jing, Li Gulcehre, Caglar Peurifoy, John Shen, Yichen Tegmark, Max Soljacic, Marin Bengio, Yoshua Sloan School of Management Massachusetts Institute of Technology. Department of Physics © 2019 Massachusetts Institute of Technology. We present a novel recurrent neural network (RNN)-based model that combines the remembering ability of unitary evolution RNNs with the ability of gated RNNs to effectively forget redundant or irrelevant information in its memory. We achieve this by extending restricted orthogonal evolution RNNs with a gating mechanism similar to gated recurrent unit RNNs with a reset gate and an update gate. Our model is able to outperform long short-term memory, gated recurrent units, and vanilla unitary or orthogonal RNNs on several long-term-dependency benchmark tasks. We empirically show that both orthogonal and unitary RNNs lack the ability to forget. This ability plays an important role in RNNs. We provide competitive results along with an analysis of our model on many natural sequential tasks, including question answering, speech spectrum prediction, character-level language modeling, and synthetic tasks that involve long-term dependencies such as algorithmic, denoising, and copying tasks. 2021-10-27T20:10:57Z 2021-10-27T20:10:57Z 2019 2019-06-05T12:08:35Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/135148 en 10.1162/neco_a_01174 Neural Computation Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf MIT Press - Journals MIT Press |
spellingShingle | Jing, Li Gulcehre, Caglar Peurifoy, John Shen, Yichen Tegmark, Max Soljacic, Marin Bengio, Yoshua Gated Orthogonal Recurrent Units: On Learning to Forget |
title | Gated Orthogonal Recurrent Units: On Learning to Forget |
title_full | Gated Orthogonal Recurrent Units: On Learning to Forget |
title_fullStr | Gated Orthogonal Recurrent Units: On Learning to Forget |
title_full_unstemmed | Gated Orthogonal Recurrent Units: On Learning to Forget |
title_short | Gated Orthogonal Recurrent Units: On Learning to Forget |
title_sort | gated orthogonal recurrent units on learning to forget |
url | https://hdl.handle.net/1721.1/135148 |
work_keys_str_mv | AT jingli gatedorthogonalrecurrentunitsonlearningtoforget AT gulcehrecaglar gatedorthogonalrecurrentunitsonlearningtoforget AT peurifoyjohn gatedorthogonalrecurrentunitsonlearningtoforget AT shenyichen gatedorthogonalrecurrentunitsonlearningtoforget AT tegmarkmax gatedorthogonalrecurrentunitsonlearningtoforget AT soljacicmarin gatedorthogonalrecurrentunitsonlearningtoforget AT bengioyoshua gatedorthogonalrecurrentunitsonlearningtoforget |