Improving Sentence Representations via Component Focusing

Bibliographic Details
Main Authors: Xiaoya Yin, Wu Zhang, Wenhao Zhu, Shuang Liu, Tengjun Yao
Affiliations: School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China (Yin, Zhang, Zhu, Liu); The 36th Institute, China Electronics Technology Group Corporation, Jiaxing 314000, China (Yao)
Format: Article
Language: English
Published: MDPI AG, 2020-02-01
Series: Applied Sciences
ISSN: 2076-3417
DOI: 10.3390/app10030958
Subjects: natural language processing; sentence representation; sentence embedding; component focusing; semantic textual similarity
Online Access: https://www.mdpi.com/2076-3417/10/3/958
Description: The efficiency of natural language processing (NLP) tasks, such as text classification and information retrieval, can be significantly improved with proper sentence representations. Neural networks such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are increasingly applied to learn sentence representations and are well suited to processing sequences. Recently, bidirectional encoder representations from transformers (BERT) has attracted much attention because it achieves state-of-the-art performance on various NLP tasks. However, these standard models do not adequately address a general linguistic fact: different sentence components play different roles in the meaning of a sentence. In general, the subject, predicate, and object play the most crucial roles, as they carry the primary meaning of a sentence. Additionally, the words in a sentence are related to each other by syntactic relations. To address these issues, we propose a sentence representation model, a modification of the pre-trained BERT network via component focusing (CF-BERT). The sentence representation consists of a basic part, which refers to the complete sentence, and a component-enhanced part, which focuses on the subject, predicate, object, and their relations. A weight factor is introduced to adjust the ratio of the two parts for the best performance. We evaluate CF-BERT on two different tasks: semantic textual similarity and entailment classification. Results show that CF-BERT yields a significant performance gain compared to other sentence representation methods.
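
The abstract does not give the exact formulation, but it describes a weighted combination of a full-sentence embedding with an embedding of the focused components. Below is a minimal sketch of that idea, assuming mean-pooled BERT token embeddings; the weight factor `lam` and the helper `extract_components` are hypothetical stand-ins (the paper presumably uses a syntactic parser to pull out the subject, predicate, and object), not the authors' actual implementation.

```python
# Sketch of a component-focused sentence embedding: a weighted sum of a
# basic (full-sentence) part and a component-enhanced part, as the
# abstract describes. `lam` and `extract_components` are assumptions.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def mean_pooled_embedding(text: str) -> torch.Tensor:
    """Encode text with BERT and mean-pool the token embeddings."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # last_hidden_state: (1, seq_len, hidden); mask out non-content tokens.
    mask = inputs["attention_mask"].unsqueeze(-1)
    summed = (outputs.last_hidden_state * mask).sum(dim=1)
    return summed / mask.sum(dim=1)

def extract_components(text: str) -> str:
    """Placeholder for the paper's component extraction; a real version
    would use a dependency parse to keep subject, predicate, and object."""
    return text

def cf_embedding(text: str, lam: float = 0.5) -> torch.Tensor:
    """Weighted combination of the basic and component-enhanced parts."""
    basic = mean_pooled_embedding(text)
    enhanced = mean_pooled_embedding(extract_components(text))
    return lam * basic + (1.0 - lam) * enhanced

# Usage: score a sentence pair (as in semantic textual similarity tasks).
a = cf_embedding("The cat chased the mouse.")
b = cf_embedding("A mouse was pursued by the cat.")
print(torch.nn.functional.cosine_similarity(a, b).item())
```

In this sketch the weight factor interpolates between the two parts; the paper tunes such a factor to balance the complete sentence against its core components.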