Benchmarking Natural Language Inference and Semantic Textual Similarity for Portuguese
Two sentences can be related in many different ways. Distinct tasks in natural language processing aim to identify different semantic relations between sentences. We developed several models for natural language inference and semantic textual similarity for the Portuguese language. We took advantage...
Main Authors: | Pedro Fialho, Luísa Coheur, Paulo Quaresma |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-10-01 |
Series: | Information |
Subjects: | natural language inference; semantic textual similarity; multilingual BERT; lexical features |
Online Access: | https://www.mdpi.com/2078-2489/11/10/484 |
_version_ | 1797550902291202048 |
---|---|
author | Pedro Fialho; Luísa Coheur; Paulo Quaresma |
author_sort | Pedro Fialho |
collection | DOAJ |
description | Two sentences can be related in many different ways. Distinct tasks in natural language processing aim to identify different semantic relations between sentences. We developed several models for natural language inference and semantic textual similarity for the Portuguese language. We took advantage of pre-trained models (BERT); additionally, we studied the roles of lexical features. We tested our models in several datasets—ASSIN, SICK-BR and ASSIN2—and the best results were usually achieved with ptBERT-Large, trained in a Brazilian corpus and tuned in the latter datasets. Besides obtaining state-of-the-art results, this is, to the best of our knowledge, the most all-inclusive study about natural language inference and semantic textual similarity for the Portuguese language. |
first_indexed | 2024-03-10T15:36:59Z |
format | Article |
id | doaj.art-f2824336509d43fd8dd5ddd6bdc8fdfc |
institution | Directory of Open Access Journals |
issn | 2078-2489 |
language | English |
last_indexed | 2024-03-10T15:36:59Z |
publishDate | 2020-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Information |
spelling | doaj.art-f2824336509d43fd8dd5ddd6bdc8fdfc; 2023-11-20T17:13:46Z; eng; MDPI AG; Information; 2078-2489; 2020-10-01; volume 11, issue 10, article 484; 10.3390/info11100484; Benchmarking Natural Language Inference and Semantic Textual Similarity for Portuguese; Pedro Fialho (INESC-ID, Rua Alves Redol 9, 1000-029 Lisboa, Portugal); Luísa Coheur (INESC-ID, Rua Alves Redol 9, 1000-029 Lisboa, Portugal); Paulo Quaresma (INESC-ID, Rua Alves Redol 9, 1000-029 Lisboa, Portugal); https://www.mdpi.com/2078-2489/11/10/484; natural language inference; semantic textual similarity; multilingual BERT; lexical features |
title | Benchmarking Natural Language Inference and Semantic Textual Similarity for Portuguese |
topic | natural language inference; semantic textual similarity; multilingual BERT; lexical features |
url | https://www.mdpi.com/2078-2489/11/10/484 |