WCC-EC 2.0: Enhancing Neural Machine Translation with a 1.6M+ Web-Crawled English-Chinese Parallel Corpus

This research introduces WCC-EC 2.0 (Web-Crawled Corpus—English and Chinese), a comprehensive parallel corpus designed for enhancing Neural Machine Translation (NMT), featuring over 1.6 million English-Chinese sentence pairs meticulously gathered via web crawling. This corpus, extracted through an a...

Bibliographic Details
Main Authors: Jinyi Zhang, Ke Su, Ye Tian, Tadahiro Matsumoto
Format: Article
Language: English
Published: MDPI AG 2024-04-01
Series: Electronics
Online Access: https://www.mdpi.com/2079-9292/13/7/1381