Distilling Monolingual Models from Large Multilingual Transformers

Although language modeling has been advancing steadily, the models available for low-resource languages are largely limited to large multilingual models such as mBERT and XLM-RoBERTa, which carry significant deployment overheads with respect to model size, inference speed, etc. We attempt to ta...


Bibliographic Details
Main Authors: Pranaydeep Singh, Orphée De Clercq, Els Lefever
Format: Article
Language: English
Published: MDPI AG, 2023-02-01
Series: Electronics
Online Access: https://www.mdpi.com/2079-9292/12/4/1022