Optimizing Small BERTs Trained for German NER

Currently, the most widespread neural network architecture for training language models is BERT, which has led to improvements in various Natural Language Processing (NLP) tasks. In general, the larger the number of parameters in a BERT model, the better the results obtained in these NLP t...


Bibliographic Details
Main Authors: Jochen Zöllner, Konrad Sperfeld, Christoph Wick, Roger Labahn
Format: Article
Language: English
Published: MDPI AG, 2021-10-01
Series: Information
Online Access: https://www.mdpi.com/2078-2489/12/11/443
