Optimizing Small BERTs Trained for German NER
Currently, the most widespread neural network architecture for training language models is BERT (Bidirectional Encoder Representations from Transformers), which has led to improvements in various Natural Language Processing (NLP) tasks. In general, the larger the number of parameters in a BERT model, the better the results obtained in these NLP tasks...
Main Authors: |
---|---
Format: | Article
Language: | English
Published: | MDPI AG, 2021-10-01
Series: | Information
Subjects: |
Online Access: | https://www.mdpi.com/2078-2489/12/11/443
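
As context for the abstract, the following is a minimal sketch, not taken from the article, of the kind of setup it discusses: a compact pretrained German BERT-style model prepared for token classification (NER) with the Hugging Face transformers API. The checkpoint name `distilbert-base-german-cased` and the label set are illustrative assumptions; the article's own models and tag schemes may differ.

```python
# A minimal sketch, not from the article: a small German BERT-style model
# set up for NER (token classification) via Hugging Face transformers.
# The checkpoint and label set below are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-ORG", "I-ORG"]
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-german-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "distilbert-base-german-cased",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)

# Tag a German sentence; the classification head is freshly initialized,
# so outputs are meaningless until the model is fine-tuned on NER data.
inputs = tokenizer("Angela Merkel besuchte Berlin.", return_tensors="pt")
predictions = model(**inputs).logits.argmax(dim=-1)  # (1, seq_len) label ids
```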