Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation
This paper explores deep latent variable models for semi-supervised paraphrase generation, where the missing target pair for unlabelled data is modelled as a latent paraphrase sequence. We present a novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR), which performs latent sequence inference given an observed text. To leverage information from text pairs, we additionally introduce a novel supervised model we call dual directional learning (DDL), which is designed to integrate with our proposed VSAR model. Combining VSAR with DDL (DDL+VSAR) enables us to conduct semi-supervised learning. However, the combined model suffers from a cold-start problem; to address this, we propose an improved weight initialisation solution, leading to a novel two-stage training scheme we call knowledge-reinforced-learning (KRL). Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data. Furthermore, in scenarios where only a fraction of the labelled pairs are available, our combined model consistently outperforms the strong supervised model baseline (DDL) by a significant margin (p<.05; Wilcoxon test). Our code is publicly available at https://github.com/jialin-yu/latent-sequence-paraphrase.
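To make the training objective described in the abstract concrete, below is a minimal PyTorch-style sketch of how a supervised DDL term and an unsupervised VSAR-style term could be combined into one semi-supervised loss. This is an illustrative reading of the abstract, not the authors' implementation (their actual code is in the linked repository); the `model_xy`/`model_yx` encoder-decoders, their `generate` method, and the `alpha` weight are hypothetical interfaces assumed for this sketch.

```python
import torch
import torch.nn.functional as F

def ddl_loss(model_xy, model_yx, x, y, pad_id=0):
    # Supervised dual directional learning (DDL): token-level cross-entropy
    # in both directions, x -> y and y -> x, on a labelled paraphrase pair.
    logits_xy = model_xy(x, y[:, :-1])   # (batch, len-1, vocab), teacher forcing
    logits_yx = model_yx(y, x[:, :-1])
    loss_xy = F.cross_entropy(logits_xy.transpose(1, 2), y[:, 1:], ignore_index=pad_id)
    loss_yx = F.cross_entropy(logits_yx.transpose(1, 2), x[:, 1:], ignore_index=pad_id)
    return loss_xy + loss_yx

def vsar_loss(model_xy, model_yx, x, pad_id=0):
    # Unsupervised VSAR-style term: treat the missing paraphrase as a latent
    # sequence. Sample a paraphrase y_hat from the x -> y direction, then
    # score the reconstruction of x given y_hat. (The paper performs proper
    # latent sequence inference; plain sampling here is a simplification.)
    with torch.no_grad():
        y_hat = model_xy.generate(x)     # latent paraphrase sample
    logits = model_yx(y_hat, x[:, :-1])
    return F.cross_entropy(logits.transpose(1, 2), x[:, 1:], ignore_index=pad_id)

def semi_supervised_loss(model_xy, model_yx, labelled, unlabelled, alpha=1.0):
    # DDL+VSAR: supervised loss on labelled pairs plus a weighted
    # unsupervised loss on unlabelled text (alpha is a hypothetical weight).
    x_l, y_l = labelled
    return (ddl_loss(model_xy, model_yx, x_l, y_l)
            + alpha * vsar_loss(model_xy, model_yx, unlabelled))
```

Under this reading, the abstract's KRL scheme corresponds to a two-stage procedure: first train the two directional models on the labelled pairs alone (DDL only), then continue from those weights with the combined loss, which is the weight-initialisation fix for the cold-start problem.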
Main Authors: | Jialin Yu, Alexandra I. Cristea, Anoushka Harit, Zhongtian Sun, Olanrewaju Tahir Aduragba, Lei Shi, Noura Al Moubayed |
---|---|
Format: | Article |
Language: | English |
Published: | KeAi Communications Co. Ltd., 2023-01-01 |
Series: | AI Open |
Subjects: | Deep latent variable models; Paraphrase generation; Semi-supervised learning; Natural language processing; Deep learning |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666651023000025 |
ISSN: | 2666-6510 |
Volume/Pages: | AI Open, Vol. 4 (2023), pp. 19-32 |
Affiliations: | Jialin Yu: Department of Computer Science, Durham University, Stockton Road, Durham, DH1 3LE, UK, and Department of Statistical Science, University College London, Gower Street, London, WC1E 6BT, UK (corresponding author); Alexandra I. Cristea: Department of Computer Science, Durham University, Stockton Road, Durham, DH1 3LE, UK (corresponding author); Anoushka Harit, Zhongtian Sun, Olanrewaju Tahir Aduragba, Lei Shi, Noura Al Moubayed: Department of Computer Science, Durham University, Stockton Road, Durham, DH1 3LE, UK |