Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation

This paper explores deep latent variable models for semi-supervised paraphrase generation, where the missing target pair for unlabelled data is modelled as a latent paraphrase sequence. We present a novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR), which performs latent sequence inference given an observed text. To leverage information from text pairs, we additionally introduce a novel supervised model we call dual directional learning (DDL), which is designed to integrate with our proposed VSAR model. Combining VSAR with DDL (DDL+VSAR) enables us to conduct semi-supervised learning. Still, the combined model suffers from a cold-start problem. To further combat this issue, we propose an improved weight initialisation solution, leading to a novel two-stage training scheme we call knowledge-reinforced-learning (KRL). Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data. Furthermore, in scenarios where only a fraction of the labelled pairs are available, our combined model consistently outperforms the strong supervised model baseline (DDL) by a significant margin (p<.05; Wilcoxon test). Our code is publicly available at https://github.com/jialin-yu/latent-sequence-paraphrase.
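To make the training recipe the abstract describes concrete, here is a minimal, hypothetical PyTorch sketch of a semi-supervised objective of that general shape: a bidirectional supervised loss on labelled pairs plus an unsupervised reconstruction loss that treats the missing paraphrase as a latent sequence. Every name below (TinySeq2Seq, token_nll), the toy data, and the greedy stand-in for latent-sequence inference are assumptions made for illustration, not the authors' method; their actual implementation is in the repository linked above.

```python
# Illustrative only: toy stand-ins for a DDL-like supervised term (both
# directions) and a VSAR-like unsupervised latent-reconstruction term.
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    """Toy encoder-decoder over token ids; a stand-in for the paper's model."""
    def __init__(self, vocab_size=100, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, src, tgt):
        _, h = self.encoder(self.embed(src))       # encode the source sentence
        dec, _ = self.decoder(self.embed(tgt), h)  # simplified teacher-forced decode
        return self.out(dec)                       # per-token vocabulary logits

def token_nll(logits, targets):
    """Average per-token negative log-likelihood."""
    return nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1))

model = TinySeq2Seq()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Placeholder data: a labelled pair (x, y) and an unlabelled sentence x_u.
x = torch.randint(0, 100, (4, 10))
y = torch.randint(0, 100, (4, 10))
x_u = torch.randint(0, 100, (4, 10))

# Supervised term in the spirit of DDL: train both directions, x->y and y->x.
sup_loss = token_nll(model(x, y), y) + token_nll(model(y, x), x)

# Unsupervised term in the spirit of VSAR: treat the missing paraphrase of x_u
# as a latent sequence z and score the reconstruction x_u -> z -> x_u. Greedy
# argmax is a crude stand-in for the paper's latent-sequence inference.
with torch.no_grad():
    z = model(x_u, x_u).argmax(-1)
unsup_loss = token_nll(model(z, x_u), x_u)

# Joint semi-supervised objective. The paper's two-stage KRL scheme (supervised
# pre-training before adding the unsupervised term) is omitted for brevity.
opt.zero_grad()
(sup_loss + unsup_loss).backward()
opt.step()
```

The key design point the sketch mirrors is that labelled and unlabelled data contribute through a shared model: the supervised term anchors the mapping between paraphrase pairs, while the reconstruction term lets unlabelled text shape the same parameters.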

Bibliographic Details
Main Authors: Jialin Yu, Alexandra I. Cristea, Anoushka Harit, Zhongtian Sun, Olanrewaju Tahir Aduragba, Lei Shi, Noura Al Moubayed
Format: Article
Language: English
Published: KeAi Communications Co. Ltd., 2023-01-01
Series: AI Open, Volume 4 (2023), pp. 19-32
ISSN: 2666-6510
Subjects: Deep latent variable models; Paraphrase generation; Semi-supervised learning; Natural language processing; Deep learning
Online Access: http://www.sciencedirect.com/science/article/pii/S2666651023000025
Author Affiliations
Jialin Yu: Department of Computer Science, Durham University, Stockton Road, Durham, DH1 3LE, UK; Department of Statistical Science, University College London, Gower Street, London, WC1E 6BT, UK (corresponding author)
Alexandra I. Cristea: Department of Computer Science, Durham University, Stockton Road, Durham, DH1 3LE, UK (corresponding author)
Anoushka Harit, Zhongtian Sun, Olanrewaju Tahir Aduragba, Lei Shi, Noura Al Moubayed: Department of Computer Science, Durham University, Stockton Road, Durham, DH1 3LE, UK