Statistical language model-based analysis of the English-Chinese corpus and political discourse

Politics and political discourse are closely related to people’s daily life, and this study aims to propose a new approach to political discourse analysis by combining English and Chinese corpora. By exploring the composition of formal language and the grammar generation process, this paper proposes...


Bibliographic Details
Main Authors: Sun Xueyu, Zhang Songsong
Format: Article
Language: English
Published: Sciendo 2024-01-01
Series: Applied Mathematics and Nonlinear Sciences
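Subjects: political discourse analysis; statistical language model; n-gram algorithm; critical metaphor analysis; convergence analysis; 97c50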
Online Access: https://doi.org/10.2478/amns.2023.2.00387
_version_ 1797340769005076480
author Sun Xueyu
Zhang Songsong
author_facet Sun Xueyu
Zhang Songsong
author_sort Sun Xueyu
collection DOAJ
description Politics and political discourse are closely tied to people’s daily lives, and this study proposes a new approach to political discourse analysis that combines English and Chinese corpora. After examining the composition of formal languages and the grammar generation process, the paper proposes an improved N-gram algorithm to address the N-gram model’s low accuracy on low-frequency words, and it introduces alternative words to alleviate data sparsity. A critical metaphor analysis of political discourse in the English-Chinese corpus is then conducted with the improved statistical language model, and the convergence of political discourse is studied along spatial and temporal dimensions. In the analysis of American presidents’ political discourse, the spatial centrality factors of “we” and “our nation” were accurately extracted, with correlations of 0.83, 0.73, 0.68, 0.51, 0.76, and 0.41, in that order. The correlations of the unqualified facsimile noun phrases in the temporal convergence of political discourse were 0.28, 0.25, 0.72, 0.68, and 0.54, respectively, and the improved N-gram model was about 28.1% more accurate than the traditional method, which the authors take as evidence that statistical language models are feasible and applicable for political discourse analysis.
first_indexed 2024-03-08T10:08:05Z
format Article
id doaj.art-c9cb8813e7354989bf0527599caf19c7
institution Directory Open Access Journal
issn 2444-8656
language English
last_indexed 2024-03-08T10:08:05Z
publishDate 2024-01-01
publisher Sciendo
record_format Article
series Applied Mathematics and Nonlinear Sciences
spelling doaj.art-c9cb8813e7354989bf0527599caf19c7 | 2024-01-29T08:52:32Z | eng | Sciendo | Applied Mathematics and Nonlinear Sciences | 2444-8656 | 2024-01-01 | 9 | 1 | 10.2478/amns.2023.2.00387 | Statistical language model-based analysis of the English-Chinese corpus and political discourse | Sun Xueyu 0 | Zhang Songsong 1 | School of Foreign Languages, Jiangsu Open University, Nanjing, Jiangsu, 210036, China | School of Foreign Languages, Jinling Institute of Technology, Jiangsu, 211169, China | Politics and political discourse are closely tied to people’s daily lives, and this study proposes a new approach to political discourse analysis that combines English and Chinese corpora. After examining the composition of formal languages and the grammar generation process, the paper proposes an improved N-gram algorithm to address the N-gram model’s low accuracy on low-frequency words, and it introduces alternative words to alleviate data sparsity. A critical metaphor analysis of political discourse in the English-Chinese corpus is then conducted with the improved statistical language model, and the convergence of political discourse is studied along spatial and temporal dimensions. In the analysis of American presidents’ political discourse, the spatial centrality factors of “we” and “our nation” were accurately extracted, with correlations of 0.83, 0.73, 0.68, 0.51, 0.76, and 0.41, in that order. The correlations of the unqualified facsimile noun phrases in the temporal convergence of political discourse were 0.28, 0.25, 0.72, 0.68, and 0.54, respectively, and the improved N-gram model was about 28.1% more accurate than the traditional method, which the authors take as evidence that statistical language models are feasible and applicable for political discourse analysis. | https://doi.org/10.2478/amns.2023.2.00387 | political discourse analysis | statistical language model | n-gram algorithm | critical metaphor analysis | convergence analysis | 97c50
spellingShingle Sun Xueyu
Zhang Songsong
Statistical language model-based analysis of the English-Chinese corpus and political discourse
Applied Mathematics and Nonlinear Sciences
political discourse analysis
statistical language model
n-gram algorithm
critical metaphor analysis
convergence analysis
97c50
title Statistical language model-based analysis of the English-Chinese corpus and political discourse
title_sort statistical language model based analysis of the english chinese corpus and political discourse
topic political discourse analysis
statistical language model
n-gram algorithm
critical metaphor analysis
convergence analysis
97c50
url https://doi.org/10.2478/amns.2023.2.00387
work_keys_str_mv AT sunxueyu statisticallanguagemodelbasedanalysisoftheenglishchinesecorpusandpoliticaldiscourse
AT zhangsongsong statisticallanguagemodelbasedanalysisoftheenglishchinesecorpusandpoliticaldiscourse
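
Illustrative note: the description field above says the paper improves an N-gram model for low-frequency words by introducing alternative words to ease data sparsity. The record does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a bigram model with add-k smoothing in which any word seen fewer than `min_count` times is replaced by a stand-in token `<unk>`. The function names, the `min_count` and `k` parameters, and the toy sentences are all hypothetical.

```python
from collections import Counter

UNK, BOS, EOS = "<unk>", "<s>", "</s>"


def train_bigram(sentences, min_count=2):
    """Count bigrams after mapping rare words to a stand-in token (assumed strategy)."""
    raw = Counter(w for sent in sentences for w in sent)
    vocab = {w for w, c in raw.items() if c >= min_count}
    contexts, bigrams = Counter(), Counter()
    for sent in sentences:
        toks = [BOS] + [w if w in vocab else UNK for w in sent] + [EOS]
        contexts.update(toks[:-1])           # every token that has a successor
        bigrams.update(zip(toks, toks[1:]))  # adjacent pairs
    return vocab, contexts, bigrams


def _normalize(word, vocab):
    """Map out-of-vocabulary words to the stand-in token; keep boundary markers."""
    return word if word in vocab or word in (BOS, EOS) else UNK


def bigram_prob(prev, word, model, k=1.0):
    """Add-k smoothed estimate of P(word | prev)."""
    vocab, contexts, bigrams = model
    prev, word = _normalize(prev, vocab), _normalize(word, vocab)
    n_outcomes = len(vocab) + 2  # vocabulary words plus <unk> and </s>
    return (bigrams[(prev, word)] + k) / (contexts[prev] + k * n_outcomes)


if __name__ == "__main__":
    toy = [["we", "the", "people"], ["we", "the", "nation"], ["our", "nation"]]
    model = train_bigram(toy)
    print(bigram_prob("we", "the", model))
    print(bigram_prob("the", "liberty", model))  # unseen word falls back to <unk>
```

Because rare and unseen words share the counts of one stand-in token, they receive a non-zero probability estimate instead of an unreliable one built from a single observation. A fuller treatment would likely choose alternative words by similarity rather than a single catch-all token.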