EASIER corpus: A lexical simplification resource for people with cognitive impairments.

Bibliographic Details
Main Authors: Rodrigo Alarcon, Lourdes Moreno, Paloma Martínez
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0283622
author Rodrigo Alarcon
Lourdes Moreno
Paloma Martínez
collection DOAJ
description Thanks to the Internet and widely available devices, people have access to ever larger quantities of information. However, people with age-related or intellectual disabilities, non-native speakers, and others have difficulty reading and understanding information. For this reason, it is essential to provide text simplification mechanisms when accessing information. Natural Language Processing methods can be applied to simplify textual content and improve understanding. These methods often use machine learning algorithms and models, which require resources such as corpora for training and testing. This article presents the EASIER corpus, a resource that can be used to build lexical simplification methods for domain-independent Spanish texts. The EASIER corpus is composed of 260 annotated documents, with 8,155 words labelled as complex and 5,130 words with at least one proposed context-aware synonym. Linguists expert in easy-to-read and plain language guidelines annotated the corpus based on their experience adapting texts for people with intellectual disabilities. Sixteen annotation guidelines that discriminate between complex and simple words have been defined to help other groups of experts generate new annotations. Additionally, an inter-annotator agreement test was performed to validate the corpus, obtaining a Fleiss' kappa coefficient of 0.641. Furthermore, a qualitative evaluation was conducted with 45 users (including people with intellectual disabilities, elderly people, and a control audience). Complex word identification tasks achieved moderate results, but the synonyms proposed to replace complex words achieved almost perfect ratings. This resource has been integrated into the EASIER platform, a tool that helps people with cognitive impairments and intellectual disabilities read and understand texts more easily.
first_indexed 2024-04-09T17:01:52Z
format Article
id doaj.art-5445f09c1059407d8987f2673d243992
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-04-09T17:01:52Z
publishDate 2023-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-5445f09c1059407d8987f2673d243992 | 2023-04-21T05:32:15Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2023-01-01 | 18 | 4 | e0283622 | 10.1371/journal.pone.0283622 | EASIER corpus: A lexical simplification resource for people with cognitive impairments. | Rodrigo Alarcon | Lourdes Moreno | Paloma Martínez | https://doi.org/10.1371/journal.pone.0283622
title EASIER corpus: A lexical simplification resource for people with cognitive impairments.
url https://doi.org/10.1371/journal.pone.0283622