Efficient Latent Space Compression for Lightning-Fast Fine-Tuning and Inference of Transformer-Based Models

This paper presents a technique to reduce the number of parameters in a transformer-based encoder–decoder architecture by incorporating autoencoders. To discover the optimal compression, we trained different autoencoders on the embedding space (encoder’s output) of several pre-trained models. The ex...
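The abstract's core idea, training an autoencoder on a pre-trained encoder's output embeddings to obtain a smaller latent representation, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions (768-dim embeddings, 128-dim latent), the single-layer architecture, and the random placeholder tensor standing in for real encoder outputs are all assumptions.

```python
import torch
from torch import nn

class EmbeddingAutoencoder(nn.Module):
    """Toy autoencoder that compresses encoder-output embeddings.

    d_model and d_latent are illustrative values, not the paper's settings.
    """
    def __init__(self, d_model=768, d_latent=128):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_latent)  # compress to latent space
        self.decoder = nn.Linear(d_latent, d_model)  # reconstruct embeddings

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Placeholder for real pre-trained encoder outputs: (batch, seq_len, d_model).
embeddings = torch.randn(8, 32, 768)

model = EmbeddingAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Brief reconstruction-loss training loop, for illustration only.
for _ in range(5):
    recon, latent = model(embeddings)
    loss = loss_fn(recon, embeddings)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(latent.shape)  # compressed representation, e.g. torch.Size([8, 32, 128])
```

A downstream decoder fine-tuned on the 128-dim latent vectors instead of the full 768-dim embeddings would then operate on a much smaller input, which is the source of the parameter reduction the abstract describes.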


Bibliographic Details
Main Authors: Ala Alam Falaki, Robin Gras
Format: Article
Language: English
Published: MDPI AG, 2023-07-01
Series: Machine Learning and Knowledge Extraction
Online Access: https://www.mdpi.com/2504-4990/5/3/45
