Unsupervised Domain Adaptation Based on Style Aware

Bibliographic Details
Main Authors: NING Qiu-yi, SHI Xiao-jing, DUAN Xiang-yu, ZHANG Min
Format: Article
Language: Chinese (zho)
Published: Editorial office of Computer Science 2022-01-01
Series: Jisuanji kexue
Online Access: https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-1-271.pdf
Description
Summary: In recent years, neural machine translation has made significant progress in translation quality, but its training relies heavily on parallel bilingual sentence pairs. Parallel resources are scarce in the e-commerce domain, and cultural differences lead to stylistic differences in how product information is expressed. To address these two problems, a style-aware unsupervised domain adaptation algorithm is proposed. It makes full use of e-commerce monolingual data through a mutual training method, and introduces a quasi knowledge distillation approach to deal with style differences. We construct a non-parallel bilingual corpus from e-commerce product data, then carry out experiments on this corpus together with a Chinese-English news parallel corpus. The results show that the algorithm significantly improves translation quality over various unsupervised domain adaptation methods, gaining about 5 BLEU points over the strongest baseline system. The algorithm is further extended to the TED, Law, and Medical OPUS data, where it also achieves better translation results.
ISSN: 1002-137X