An Automatic Persian Text Summarization System Based on Linguistic Features and Regression

Considering the vast amount of existing written information and the shortage of time, optimal summarization of books, articles, news reports, etc. on the Web is a major concern of researchers. In this paper, we propose a new approach for Persian single-document summarization based on several linguis...

Bibliographic Details
Main Authors: Mahmood Soltani, Jalal Nasiri, Ehsan Asgarian
Format: Article
Language: fas
Published: Iranian Research Institute for Information and Technology 2018-09-01
Series: Iranian Journal of Information Processing & Management
Subjects: Single-Document Summarization, Linguistic Feature, Linear Regression
Online Access: http://jipm.irandoc.ac.ir/browse.php?a_code=A-10-3807-1&slc_lang=en&sid=1
_version_ 1819154104360894464
author Mahmood Soltani
Jalal Nasiri
Ehsan Asgarian
collection DOAJ
description Given the vast amount of written information available and the limited time readers have, effective summarization of books, articles, news reports, and other material on the Web is a major concern for researchers. In this paper, we propose a new approach to Persian single-document summarization based on several linguistic features of the text. After the linguistic features of each sentence are extracted, the feature weights are learned by linear regression. At each step of the algorithm, we select the sentence with the maximum score. The score of each sentence is calculated from two factors: first, the sum of its weighted features, and second, its similarity to the sentences already selected for the final summary. We use an automatic evaluation tool to compare our approach with existing approaches. The results indicate that our method improves summarization performance.
first_indexed 2024-12-22T15:15:46Z
format Article
id doaj.art-4f4e6ec7aaf6465eba9947d9c6370f9f
institution Directory Open Access Journal
issn 2251-8223
2251-8231
language fas
last_indexed 2024-12-22T15:15:46Z
publishDate 2018-09-01
publisher Iranian Research Institute for Information and Technology
record_format Article
series Iranian Journal of Information Processing & Management
spelling doaj.art-4f4e6ec7aaf6465eba9947d9c6370f9f 2022-12-21T18:21:45Z
fas
Iranian Research Institute for Information and Technology
Iranian Journal of Information Processing & Management
2251-8223, 2251-8231
2018-09-01, volume 33, issue 4, pages 1809-1828
An Automatic Persian Text Summarization System Based on Linguistic Features and Regression
Mahmood Soltani (0), Jalal Nasiri (1), Ehsan Asgarian (2)
Affiliations, in the order listed: Department of Computer Engineering, Quchan University of Technology; Iranian Research Institute for Information Science and Technology (IranDoc); Engineering Department of Ferdowsi University of Mashhad
http://jipm.irandoc.ac.ir/browse.php?a_code=A-10-3807-1&slc_lang=en&sid=1
Single-Document Summarization; Linguistic Feature; Linear Regression
title An Automatic Persian Text Summarization System Based on Linguistic Features and Regression
topic Single-Document Summarization
Linguistic Feature
Linear Regression
url http://jipm.irandoc.ac.ir/browse.php?a_code=A-10-3807-1&slc_lang=en&sid=1
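
The procedure outlined in the description field (feature weights learned by linear regression, then sentences picked one at a time by a score that combines the weighted feature sum with similarity to sentences already chosen) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names, the gradient-descent fit, the whitespace bag-of-words cosine similarity, and the choice to subtract the similarity as a penalty are all assumptions, since the record does not say how the two factors are combined. Real Persian text would also need a proper tokenizer and normalizer.

```python
import math
from collections import Counter


def fit_linear_weights(X, y, lr=0.1, epochs=5000):
    """Least-squares feature weights via batch gradient descent (no intercept).

    X: one feature vector per training sentence (assumed scaled to roughly [0, 1]).
    y: one target relevance score per training sentence.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for row, target in zip(X, y):
            err = sum(wi * xi for wi, xi in zip(w, row)) - target
            for j, xj in enumerate(row):
                grad[j] += err * xj
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w


def cosine_similarity(a, b):
    """Cosine similarity of two sentences over whitespace-token counts."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def summarize(sentences, features, weights, k, penalty=1.0):
    """Greedily pick k sentences, one per step, by maximum score.

    score = weighted feature sum - penalty * (max similarity to any
    already selected sentence). The subtraction is an assumed combination rule.
    """
    base = [sum(w * f for w, f in zip(weights, row)) for row in features]
    selected = []
    while len(selected) < min(k, len(sentences)):
        best, best_score = None, -math.inf
        for i, sentence in enumerate(sentences):
            if i in selected:
                continue
            redundancy = max(
                (cosine_similarity(sentence, sentences[j]) for j in selected),
                default=0.0,
            )
            score = base[i] - penalty * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return sorted(selected)  # restore document order for the final summary
```

For instance, with sentences ["a b c", "a b c", "d e f"], a single feature with values 1.0, 0.9, and 0.5, a weight of 1.0, and k = 2, the sketch picks the first sentence, then skips its exact duplicate in favor of the third, because the duplicate's similarity penalty outweighs its higher feature score.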