Adaptive Semiparametric Language Models

Abstract

We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an integrated architecture. Our model uses extended short-term context by caching local hidden states, similar to Transformer-XL...
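The caching mechanism the abstract refers to can be illustrated with a short sketch. This is not the authors' code: it is a minimal single-head, single-layer illustration of Transformer-XL-style hidden-state caching in PyTorch, with causal masking, multiple heads, relative positional encodings, and the paper's episodic memory component all omitted. Every name in it is illustrative.

```python
import torch

def attend_with_cache(hidden, cache, w_q, w_k, w_v):
    """Single-head attention over the current segment's hidden states plus
    the cached hidden states of the previous segment. The cache is detached
    so no gradients flow into past segments (as in Transformer-XL)."""
    context = hidden if cache is None else torch.cat([cache.detach(), hidden], dim=0)
    q = hidden @ w_q                      # [seg_len, d_head]
    k = context @ w_k                     # [cache_len + seg_len, d_head]
    v = context @ w_v                     # [cache_len + seg_len, d_head]
    scores = (q @ k.T) / (k.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

# Process a long sequence segment by segment, carrying the cache forward.
d_model, d_head, seg_len, mem_len = 16, 16, 8, 8
w_q, w_k, w_v = (torch.randn(d_model, d_head) for _ in range(3))
cache = None
for segment in torch.randn(4, seg_len, d_model):  # four consecutive segments
    out = attend_with_cache(segment, cache, w_q, w_k, w_v)
    cache = segment[-mem_len:]                    # keep the most recent states
```

Detaching the cache is what makes the extended context cheap: each segment attends over up to mem_len extra states, but gradients never propagate into earlier segments.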

Bibliographic Details
Main Authors: Dani Yogatama, Cyprien de Masson d’Autume, Lingpeng Kong
Format: Article
Language: English
Published: The MIT Press, 2021-01-01
Series: Transactions of the Association for Computational Linguistics
Online Access: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00371/100688/Adaptive-Semiparametric-Language-Models