Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation
This paper explores deep latent variable models for semi-supervised paraphrase generation, where the missing target pair for unlabelled data is modelled as a latent paraphrase sequence. We present a novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR), which performs latent sequence inference given an observed text. To leverage information from text pairs, we additionally introduce a novel supervised model we call dual directional learning (DDL), which is designed to integrate with our proposed VSAR model. Combining VSAR with DDL (DDL+VSAR) enables us to conduct semi-supervised learning. However, the combined model suffers from a cold-start problem; to address this, we propose an improved weight initialisation solution, leading to a novel two-stage training scheme we call knowledge-reinforced-learning (KRL). Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data. Furthermore, in scenarios where only a fraction of the labelled pairs are available, our combined model consistently outperforms the strong supervised model baseline (DDL) by a significant margin (p<.05; Wilcoxon test). Our code is publicly available at https://github.com/jialin-yu/latent-sequence-paraphrase.
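To make the training objective described in the abstract concrete, below is a minimal PyTorch-style sketch of how a supervised DDL term and an unsupervised VSAR-style term could be combined into one semi-supervised loss. This is an illustrative reading of the abstract, not the authors' implementation (their actual code is in the linked repository); the `model_xy`/`model_yx` encoder-decoders, their `generate` method, and the `alpha` weight are hypothetical interfaces assumed for this sketch.

```python
import torch
import torch.nn.functional as F

def ddl_loss(model_xy, model_yx, x, y, pad_id=0):
    # Supervised dual directional learning (DDL): token-level cross-entropy
    # in both directions, x -> y and y -> x, on a labelled paraphrase pair.
    logits_xy = model_xy(x, y[:, :-1])   # (batch, len-1, vocab), teacher forcing
    logits_yx = model_yx(y, x[:, :-1])
    loss_xy = F.cross_entropy(logits_xy.transpose(1, 2), y[:, 1:], ignore_index=pad_id)
    loss_yx = F.cross_entropy(logits_yx.transpose(1, 2), x[:, 1:], ignore_index=pad_id)
    return loss_xy + loss_yx

def vsar_loss(model_xy, model_yx, x, pad_id=0):
    # Unsupervised VSAR-style term: treat the missing paraphrase as a latent
    # sequence. Sample a paraphrase y_hat from the x -> y direction, then
    # score the reconstruction of x given y_hat. (The paper performs proper
    # latent sequence inference; plain sampling here is a simplification.)
    with torch.no_grad():
        y_hat = model_xy.generate(x)     # latent paraphrase sample
    logits = model_yx(y_hat, x[:, :-1])
    return F.cross_entropy(logits.transpose(1, 2), x[:, 1:], ignore_index=pad_id)

def semi_supervised_loss(model_xy, model_yx, labelled, unlabelled, alpha=1.0):
    # DDL+VSAR: supervised loss on labelled pairs plus a weighted
    # unsupervised loss on unlabelled text (alpha is a hypothetical weight).
    x_l, y_l = labelled
    return (ddl_loss(model_xy, model_yx, x_l, y_l)
            + alpha * vsar_loss(model_xy, model_yx, unlabelled))
```

Under this reading, the abstract's KRL scheme corresponds to a two-stage procedure: first train the two directional models on the labelled pairs alone (DDL only), then continue from those weights with the combined loss, which is the weight-initialisation fix for the cold-start problem.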
Main Authors: | Jialin Yu, Alexandra I. Cristea, Anoushka Harit, Zhongtian Sun, Olanrewaju Tahir Aduragba, Lei Shi, Noura Al Moubayed |
---|---|
Format: | Article |
Language: | English |
Published: | KeAi Communications Co. Ltd., 2023-01-01 |
Series: | AI Open |
Subjects: | Deep latent variable models; Paraphrase generation; Semi-supervised learning; Natural language processing; Deep learning |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666651023000025 |
ISSN: | 2666-6510 |
Volume/Pages: | AI Open, Vol. 4 (2023), pp. 19-32 |
Affiliations: | Jialin Yu: Department of Computer Science, Durham University, Stockton Road, Durham, DH1 3LE, UK, and Department of Statistical Science, University College London, Gower Street, London, WC1E 6BT, UK (corresponding author); Alexandra I. Cristea: Department of Computer Science, Durham University, Stockton Road, Durham, DH1 3LE, UK (corresponding author); Anoushka Harit, Zhongtian Sun, Olanrewaju Tahir Aduragba, Lei Shi, Noura Al Moubayed: Department of Computer Science, Durham University, Stockton Road, Durham, DH1 3LE, UK |