How to Dissect a Muppet: The Structure of Transformer Embedding Spaces
Abstract: Pretrained embeddings based on the Transformer architecture have taken the NLP community by storm. We show that they can mathematically be reframed as a sum of vector factors and showcase how to use this reframing to study the impact of each component. We provide evidence tha...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | The MIT Press, 2022-01-01 |
Series: | Transactions of the Association for Computational Linguistics |
Online Access: | https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00501/112915/How-to-Dissect-a-Muppet-The-Structure-of |