Statistical language model-based analysis of the English-Chinese corpus and political discourse
Politics and political discourse are closely related to people’s daily life, and this study aims to propose a new approach to political discourse analysis by combining English and Chinese corpora. By exploring the composition of formal language and the grammar generation process, this paper proposes...
Main Authors: | Sun Xueyu, Zhang Songsong |
---|---|
Format: | Article |
Language: | English |
Published: | Sciendo 2024-01-01 |
Series: | Applied Mathematics and Nonlinear Sciences |
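Subjects: | political discourse analysis; statistical language model; n-gram algorithm; critical metaphor analysis; convergence analysis; 97c50 |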
Online Access: | https://doi.org/10.2478/amns.2023.2.00387 |
_version_ | 1797340769005076480 |
---|---|
author | Sun Xueyu; Zhang Songsong |
collection | DOAJ |
description | Politics and political discourse are closely tied to people’s daily lives, and this study proposes a new approach to political discourse analysis that combines English and Chinese corpora. By examining the composition of formal language and the grammar generation process, the paper proposes an improved N-gram algorithm that addresses the N-gram model’s low accuracy on low-frequency words and introduces alternative words to alleviate data sparsity. A critical metaphor analysis of political discourse in the English-Chinese corpus is then conducted with the improved statistical language model, and the convergence of political discourse is studied in terms of space and time. In the analysis of the political discourse of American presidents, the spatial centrality factors of “we” and “our nation” were accurately extracted, with correlations of 0.83, 0.73, 0.68, 0.51, 0.76, and 0.41 in order. The correlations of the unqualified facsimile noun phrases in the temporal convergence of political discourse reached 0.28, 0.25, 0.72, 0.68, and 0.54, respectively, and the accuracy of the improved N-gram model was about 28.1% higher than that of the traditional method, which suggests that statistical language models are feasible and applicable for political discourse analysis. |
first_indexed | 2024-03-08T10:08:05Z |
format | Article |
id | doaj.art-c9cb8813e7354989bf0527599caf19c7 |
institution | Directory Open Access Journal |
issn | 2444-8656 |
language | English |
last_indexed | 2024-03-08T10:08:05Z |
publishDate | 2024-01-01 |
publisher | Sciendo |
record_format | Article |
series | Applied Mathematics and Nonlinear Sciences |
spelling | doaj.art-c9cb8813e7354989bf0527599caf19c7; 2024-01-29T08:52:32Z; eng; Sciendo; Applied Mathematics and Nonlinear Sciences; 2444-8656; 2024-01-01; 9; 1; 10.2478/amns.2023.2.00387; Statistical language model-based analysis of the English-Chinese corpus and political discourse; Sun Xueyu (School of Foreign Languages, Jiangsu Open University, Nanjing, Jiangsu, 210036, China); Zhang Songsong (School of Foreign Languages, Jinling Institute of Technology, Jiangsu, 211169, China); https://doi.org/10.2478/amns.2023.2.00387; political discourse analysis; statistical language model; n-gram algorithm; critical metaphor analysis; convergence analysis; 97c50 |
title | Statistical language model-based analysis of the English-Chinese corpus and political discourse |
topic | political discourse analysis; statistical language model; n-gram algorithm; critical metaphor analysis; convergence analysis; 97c50 |
url | https://doi.org/10.2478/amns.2023.2.00387 |
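
The description above refers to the sparse-data problem of N-gram models on low-frequency words. A minimal sketch of that problem follows, assuming a toy sentence and hypothetical helper names (`train_bigram`, `mle_prob`, `add_k_prob`). It uses standard add-k (Laplace) smoothing as a stand-in; it is not the article's alternative-word strategy, whose details this record does not give.

```python
from collections import Counter


def train_bigram(tokens):
    """Count bigram contexts, bigrams, and the vocabulary of a token list."""
    contexts = Counter(tokens[:-1])             # words that have a successor
    bigrams = Counter(zip(tokens, tokens[1:]))  # (previous word, word) pairs
    vocab = set(tokens)
    return contexts, bigrams, vocab


def mle_prob(contexts, bigrams, prev, word):
    """Unsmoothed maximum-likelihood estimate of P(word | prev)."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]


def add_k_prob(contexts, bigrams, vocab_size, prev, word, k=1.0):
    """Add-k smoothed estimate of P(word | prev); k=1 is Laplace smoothing."""
    return (bigrams[(prev, word)] + k) / (contexts[prev] + k * vocab_size)


# Toy corpus (an assumption for illustration, not data from the article).
tokens = "we will defend our nation and we will build our nation".split()
contexts, bigrams, vocab = train_bigram(tokens)

# The pair ("our", "build") never occurs, so the unsmoothed estimate is zero,
# while the smoothed estimate reserves some probability mass for it.
print(mle_prob(contexts, bigrams, "our", "build"))
print(add_k_prob(contexts, bigrams, len(vocab), "our", "build"))
```

The unsmoothed estimate assigns zero probability to any word pair absent from the training text, which is the failure mode that smoothing or substitution strategies are meant to soften.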