How to Dissect a Muppet: The Structure of Transformer Embedding Spaces

Abstract: Pretrained embeddings based on the Transformer architecture have taken the NLP community by storm. We show that they can mathematically be reframed as a sum of vector factors and showcase how to use this reframing to study the impact of each component. We provide evidence tha…
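As a rough illustration of the additive reframing the abstract mentions: each Transformer sublayer adds its output back onto a residual stream, so a token's final embedding can be unrolled into a sum of per-component vector factors. The sketch below assumes a standard residual-stream Transformer and an illustrative four-way grouping of terms; the notation is hypothetical here, and the paper defines its own factors precisely:

    e_t = i_t + h_t + f_t + c_t

where i_t would be the input embedding carried through the residual connections, h_t the accumulated multi-head attention outputs across layers, f_t the accumulated feed-forward outputs, and c_t the remaining bias and normalization correction terms.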


Bibliographic Details
Main Authors: Timothée Mickus, Denis Paperno, Mathieu Constant
Format: Article
Language: English
Published: The MIT Press, 2022-01-01
Series: Transactions of the Association for Computational Linguistics
Online Access: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00501/112915/How-to-Dissect-a-Muppet-The-Structure-of