Decision Algorithm for the Automatic Determination of the Use of Non-Inclusive Terms in Academic Texts
The use of inclusive language, among many other gender equality initiatives in society, has garnered great attention in recent years. Gender equality offices in universities and public administration cannot cope with the task of manually checking the use of non-inclusive language in the documentatio...
Main Authors: | Pedro Orgeira-Crespo, Carla Míguez-Álvarez, Miguel Cuevas-Alonso, María Isabel Doval-Ruiz |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-08-01 |
Series: | Publications |
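Subjects: | inclusive language; Spanish language; natural language processing; classification algorithm; machine learning |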
Online Access: | https://www.mdpi.com/2304-6775/8/3/41 |
_version_ | 1797559920903585792 |
---|---|
author | Pedro Orgeira-Crespo; Carla Míguez-Álvarez; Miguel Cuevas-Alonso; María Isabel Doval-Ruiz |
author_sort | Pedro Orgeira-Crespo |
collection | DOAJ |
description | The use of inclusive language, among many other gender equality initiatives in society, has garnered great attention in recent years. Gender equality offices in universities and public administration cannot cope with the task of manually checking the use of non-inclusive language in the documentation that those institutions generate. In this research, an automated solution for the detection of non-inclusive uses of the Spanish language in doctoral theses generated in Spanish universities is introduced using machine learning techniques. A large dataset has been used to train and validate the algorithm and to analyze the use of inclusive language; the result is an algorithm that detects, within any Spanish text document, non-inclusive uses of the language with error, false positive, and false negative ratios slightly over 10%, and precision, recall, and F-measure percentages over 86%. Results also show how the ratio of non-inclusive usages per document evolves over time, with a pronounced reduction in the last years under study. |
first_indexed | 2024-03-10T17:53:07Z |
format | Article |
id | doaj.art-a0b823d44ccc43cca9e1397a0d3357c3 |
institution | Directory of Open Access Journals |
issn | 2304-6775 |
language | English |
last_indexed | 2024-03-10T17:53:07Z |
publishDate | 2020-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Publications |
spelling | doaj.art-a0b823d44ccc43cca9e1397a0d3357c3; 2023-11-20T09:18:36Z; eng; MDPI AG; Publications; 2304-6775; 2020-08-01; vol. 8, no. 3, art. 41; 10.3390/publications8030041; Decision Algorithm for the Automatic Determination of the Use of Non-Inclusive Terms in Academic Texts; Pedro Orgeira-Crespo (Aerospace Area, Department of Mechanical Engineering, Heat Engines and Machines, and Fluids, Aerospace Engineering School, University of Vigo, Campus Orense, 32004 Orense, Spain); Carla Míguez-Álvarez (Language Variation and Textual Categorization (LVTC), Philology and Translation School, University of Vigo, 36310 Vigo, Spain); Miguel Cuevas-Alonso (Language Variation and Textual Categorization (LVTC), Philology and Translation School, University of Vigo, 36310 Vigo, Spain); María Isabel Doval-Ruiz (Faculty of Educational Sciences, University of Vigo, Campus Lagoas Marcosende, 36310 Vigo, Spain); https://www.mdpi.com/2304-6775/8/3/41; inclusive language; Spanish language; natural language processing; classification algorithm; machine learning |
title | Decision Algorithm for the Automatic Determination of the Use of Non-Inclusive Terms in Academic Texts |
topic | inclusive language; Spanish language; natural language processing; classification algorithm; machine learning |
url | https://www.mdpi.com/2304-6775/8/3/41 |
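
The description in this record reports error ratios slightly over 10% and precision, recall, and F-measure over 86%. As a reading aid, the sketch below shows how such figures are conventionally derived from confusion-matrix counts. It is not the authors' code: the `ConfusionCounts` type, the function names, and the counts in the usage lines are hypothetical, and the record does not say which exact definitions the paper uses for its error, false positive, and false negative ratios.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for binary-classification outcome counts."""
    tp: int  # non-inclusive usages correctly flagged
    fp: int  # acceptable usages wrongly flagged
    fn: int  # non-inclusive usages missed
    tn: int  # acceptable usages correctly left alone


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 for an empty denominator instead of raising ZeroDivisionError.
    return numerator / denominator if denominator else 0.0


def precision(c: ConfusionCounts) -> float:
    """Share of flagged usages that are truly non-inclusive."""
    return _ratio(c.tp, c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    """Share of truly non-inclusive usages that were flagged."""
    return _ratio(c.tp, c.tp + c.fn)


def f_measure(c: ConfusionCounts, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    p, r = precision(c), recall(c)
    b2 = beta * beta
    return _ratio((1 + b2) * p * r, b2 * p + r)


def error_rate(c: ConfusionCounts) -> float:
    """Share of all decisions that were wrong (assumed definition)."""
    return _ratio(c.fp + c.fn, c.tp + c.fp + c.fn + c.tn)


# Made-up counts for illustration only; these are not results from the paper.
example = ConfusionCounts(tp=90, fp=10, fn=10, tn=90)
print(f"precision={precision(example):.2f} recall={recall(example):.2f} "
      f"F1={f_measure(example):.2f} error={error_rate(example):.2f}")
```

With these invented counts, precision, recall, and F1 each come out at 0.90 and the error rate at 0.10, which is the general shape of the figures quoted in the description.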