Improving Sentence Representations via Component Focusing
The efficiency of natural language processing (NLP) tasks, such as text classification and information retrieval, can be significantly improved with proper sentence representations. Neural networks such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are increasingly applied to learn sentence representations and are well suited to processing sequences. Recently, bidirectional encoder representations from transformers (BERT) has attracted much attention because it achieves state-of-the-art performance on various NLP tasks. However, these standard models do not adequately address a general linguistic fact: different sentence components play different roles in the meaning of a sentence. In general, the subject, predicate, and object play the most crucial roles, as they carry the primary meaning of a sentence; in addition, the words in a sentence are related to each other by syntactic relations. To address these issues, we propose a sentence representation model, a modification of the pre-trained BERT network via component focusing (CF-BERT). The sentence representation consists of a basic part, which encodes the complete sentence, and a component-enhanced part, which focuses on the subject, predicate, object, and their relations. A weight factor is introduced to balance the two parts for the best performance. We evaluate CF-BERT on two tasks, semantic textual similarity and entailment classification, and the results show that CF-BERT yields a significant performance gain over other sentence representation methods.
Main Authors: | Xiaoya Yin, Wu Zhang, Wenhao Zhu, Shuang Liu, Tengjun Yao |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-02-01 |
Series: | Applied Sciences |
Subjects: | natural language processing; sentence representation; sentence embedding; component focusing; semantic textual similarity |
ISSN: | 2076-3417 |
DOI: | 10.3390/app10030958 |
Author Affiliations: | Xiaoya Yin, Wu Zhang, Wenhao Zhu, Shuang Liu: School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China; Tengjun Yao: The 36th Institute, China Electronics Technology Group Corporation, Jiaxing 314000, China |
Online Access: | https://www.mdpi.com/2076-3417/10/3/958 |
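
To make the architecture described in the abstract concrete, below is a minimal sketch of the component-focusing idea: the final sentence embedding combines an embedding of the complete sentence (the basic part) with a weighted embedding of its subject-predicate-object span (the component-enhanced part). This is an illustration under stated assumptions, not the authors' code; the model name, the mean pooling, the fixed weight `alpha`, and the hard-coded component spans are all assumptions.

```python
# A minimal sketch of component focusing, assuming mean-pooled BERT
# embeddings from the Hugging Face `transformers` library. Model choice,
# pooling, `alpha`, and helper names are illustrative, not the paper's.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden layer over non-padding tokens."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    mask = inputs["attention_mask"].unsqueeze(-1)   # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

def cf_embedding(sentence: str, components: str, alpha: float = 0.5) -> torch.Tensor:
    """Basic part (full sentence) plus a weighted component-enhanced part
    (the subject-predicate-object span)."""
    return embed(sentence) + alpha * embed(components)

# Usage: the component spans are hard-coded here; in practice they would be
# extracted with a dependency parser.
v1 = cf_embedding("The hungry cat quickly chased the little mouse.", "cat chased mouse")
v2 = cf_embedding("A cat pursued a mouse.", "cat pursued mouse")
print(f"cosine similarity: {torch.cosine_similarity(v1, v2).item():.3f}")
```

In the paper the weight factor is tuned for best task performance; fixing `alpha = 0.5` here only keeps the sketch short.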