NERSkill.Id: Annotated dataset of Indonesian's skill entity recognition

NERSkill.Id is a manually annotated named entity recognition (NER) dataset focused on skill entities in the Indonesian language. The dataset comprises 418,868 tokens, each tagged according to the BIO scheme. Notably, 15.51% of these tokens represent named entities, falling in...
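To make the BIO scheme mentioned in the abstract concrete, the following is a minimal sketch of how BIO tags mark entity spans and how the share of entity tokens can be computed. The label name `SKILL`, the sample sentence, and both helper functions are hypothetical illustrations; they are not taken from the dataset or from the authors' tooling.

```python
from typing import List, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) spans.

    B-X opens a span of type X, I-X continues an open span of the same
    type, and O marks a token outside any entity. An I- tag with no
    matching open span is ignored in this sketch.
    """
    entities: List[Tuple[str, str]] = []
    span: List[str] = []
    label = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if span:
                entities.append((label, " ".join(span)))
            span, label = [token], tag[2:]
        elif tag.startswith("I-") and span and tag[2:] == label:
            span.append(token)
        else:
            if span:
                entities.append((label, " ".join(span)))
            span, label = [], None
    if span:
        entities.append((label, " ".join(span)))
    return entities


def entity_token_ratio(tags: List[str]) -> float:
    """Fraction of tokens whose tag is not O."""
    return sum(tag != "O" for tag in tags) / len(tags)


# Hypothetical example sentence; the SKILL label is an assumption.
tokens = ["Menguasai", "machine", "learning", "dan", "Python"]
tags = ["O", "B-SKILL", "I-SKILL", "O", "B-SKILL"]

print(extract_entities(tokens, tags))
print(entity_token_ratio(tags))
```

In this toy sentence three of the five tokens carry a non-O tag. The 15.51% figure in the abstract would be the same ratio taken over all 418,868 tokens of the dataset.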

Bibliographic Details
Main Authors: Meilany Nonsi Tentua, Suprapto, Afiahayati
Format: Article
Language: English
Published: Elsevier 2024-04-01
Series: Data in Brief
Online Access: http://www.sciencedirect.com/science/article/pii/S235234092400163X