Automatic document summarization from social media and online news

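This dissertation proposes a new method for sentence embedding and document summarization. A topic model is used to modify the sentence embedding method SIF so that it captures information from the document itself instead of relying on an external corpus. The modification thus embeds information about the entire document into the sentence vectors, which benefits subsequent information extraction. A graph-based method then scores the sentences, and the highest-scoring sentences are selected to form the summary. The dissertation also tests the impact of different parameter settings on the model. The experimental results show that the proposed model outperforms other classic and advanced models in semantic analysis and summary extraction, with strong robustness. The datasets are drawn from social media and online news, which supports the applicability of the model to online information extraction.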
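
The pipeline described in the abstract (weighted sentence embeddings, then graph-based sentence scoring) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the record does not specify the topic-model modification, so the sketch assumes plain SIF (smooth inverse frequency) weights `a / (a + p(w))` with `p(w)` estimated from the document itself, omits SIF's common-component removal for brevity, and uses a PageRank-style power iteration over cosine similarities as one possible graph-based scorer. The helper names (`toy_vector`, `embed_sentences`, `rank_sentences`, `summarize`) and the hash-derived word vectors are hypothetical stand-ins for pretrained embeddings.

```python
import hashlib
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def toy_vector(word):
    # Hypothetical stand-in for pretrained word vectors:
    # 16 values in [-1, 1] derived from the word's MD5 digest.
    digest = hashlib.md5(word.encode("utf-8")).digest()
    return [(b - 127.5) / 127.5 for b in digest]


def embed_sentences(sentences, a=1e-3):
    """Weighted average of word vectors with SIF-style weights a / (a + p(w)).

    Assumption: p(w) is the word's relative frequency inside this document,
    not a frequency taken from an external corpus.
    """
    tokens = [tokenize(s) for s in sentences]
    counts = Counter(w for sent in tokens for w in sent)
    total = sum(counts.values())
    vectors = []
    for sent in tokens:
        vec = [0.0] * 16
        for w in sent:
            weight = a / (a + counts[w] / total)
            for i, x in enumerate(toy_vector(w)):
                vec[i] += weight * x
        n = max(len(sent), 1)
        vectors.append([x / n for x in vec])
    return vectors


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_sentences(vectors, damping=0.85, iterations=50):
    """Score sentences by power iteration on a similarity graph."""
    n = len(vectors)
    if n == 0:
        return []
    # Edge weights: cosine similarity clipped at zero, no self-loops.
    sim = [[max(cosine(vectors[i], vectors[j]), 0.0) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            row_sum = sum(sim[i])
            for j in range(n):
                # A sentence with no neighbours spreads its score uniformly.
                share = sim[i][j] / row_sum if row_sum else 1.0 / n
                new[j] += damping * scores[i] * share
        scores = new
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = rank_sentences(embed_sentences(sentences))
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]
```

Calling `summarize(list_of_sentences, k=2)` returns two sentences from the input. With the toy vectors the selection carries no semantic meaning; swapping `toy_vector` for a lookup into real pretrained embeddings is the intended substitution.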


Bibliographic Details
Main Author: Feng, Zijian
Other Authors: Mao Kezhi
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2020
Subjects: Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/141154
author Feng, Zijian
author2 Mao Kezhi
author_facet Mao Kezhi
Feng, Zijian
author_sort Feng, Zijian
collection NTU
description This dissertation proposes a new method for sentence embedding and document summarization. A topic model is used to modify the sentence embedding method SIF so that it captures information from the document itself instead of relying on an external corpus. The modification thus embeds information about the entire document into the sentence vectors, which benefits subsequent information extraction. A graph-based method then scores the sentences, and the highest-scoring sentences are selected to form the summary. The dissertation also tests the impact of different parameter settings on the model. The experimental results show that the proposed model outperforms other classic and advanced models in semantic analysis and summary extraction, with strong robustness. The datasets are drawn from social media and online news, which supports the applicability of the model to online information extraction.
first_indexed 2024-10-01T05:18:38Z
format Thesis-Master by Coursework
id ntu-10356/141154
institution Nanyang Technological University
language English
last_indexed 2024-10-01T05:18:38Z
publishDate 2020
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/141154 2023-07-04T16:35:53Z Automatic document summarization from social media and online news Feng, Zijian Mao Kezhi School of Electrical and Electronic Engineering EKZMao@ntu.edu.sg Engineering::Electrical and electronic engineering This dissertation proposes a new method for sentence embedding and document summarization. A topic model is used to modify the sentence embedding method SIF so that it captures information from the document itself instead of relying on an external corpus. The modification thus embeds information about the entire document into the sentence vectors, which benefits subsequent information extraction. A graph-based method then scores the sentences, and the highest-scoring sentences are selected to form the summary. The dissertation also tests the impact of different parameter settings on the model. The experimental results show that the proposed model outperforms other classic and advanced models in semantic analysis and summary extraction, with strong robustness. The datasets are drawn from social media and online news, which supports the applicability of the model to online information extraction. Master of Science (Signal Processing) 2020-06-04T07:59:21Z 2020-06-04T07:59:21Z 2020 Thesis-Master by Coursework https://hdl.handle.net/10356/141154 en application/pdf Nanyang Technological University
spellingShingle Engineering::Electrical and electronic engineering
Feng, Zijian
Automatic document summarization from social media and online news
title Automatic document summarization from social media and online news
topic Engineering::Electrical and electronic engineering
url https://hdl.handle.net/10356/141154