Applying the Bell’s Test to Chinese Texts


Bibliographic Details
Main Authors: Igor A. Bessmertny, Xiaoxi Huang, Aleksei V. Platonov, Chuqiao Yu, Julia A. Koroleva
Format: Article
Language: English
Published: MDPI AG 2020-02-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/22/3/275
Collection: DOAJ
Description: Search engines find documents that contain patterns from a query. This approach works for alphabetic languages such as English. Chinese, however, is highly dependent on context. A major problem in Chinese text processing is the absence of blanks between words, so the text must be segmented into words before any other processing. Algorithms for Chinese text segmentation should consider context; that is, the segmentation of a word depends on the surrounding ideograms. Because existing segmentation algorithms are imperfect, we consider an approach that builds the context from all possible n-grams surrounding the query words. This paper proposes a quantum-inspired approach to ranking Chinese text documents by their relevance to the query. In particular, the approach uses Bell’s test, which measures the quantum entanglement of two words within the context. The contexts of words are built using the hyperspace analogue to language (HAL) algorithm. Experiments conducted in three domains demonstrated that the proposed approach provides acceptable results.
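The context-building step mentioned in the description can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `ngram_context`, the parameters `radius` and `max_n`, and the linear distance weighting (in the spirit of HAL, where closer neighbors count more) are all assumptions made for this example. The sketch covers only the construction of a context vector from unsegmented text; it does not attempt the Bell's test step.

```python
from collections import Counter


def ngram_context(text, query, radius=4, max_n=2):
    """Build a HAL-style context vector for `query` in unsegmented `text`.

    Hypothetical helper: every character n-gram (n = 1..max_n) lying fully
    within `radius` characters to the left or right of an occurrence of
    `query` is counted, weighted by closeness (adjacent = `radius`,
    farthest = 1). No word segmentation is performed.
    """
    context = Counter()
    start = text.find(query)
    while start != -1:
        end = start + len(query)
        left_edge = max(0, start - radius)
        right_edge = min(len(text), end + radius)
        for n in range(1, max_n + 1):
            # n-grams that end at or before the start of the occurrence
            for i in range(left_edge, start - n + 1):
                distance = start - (i + n) + 1  # 1 means adjacent
                context[text[i:i + n]] += radius - distance + 1
            # n-grams that begin at or after the end of the occurrence
            for i in range(end, right_edge - n + 1):
                distance = i - end + 1  # 1 means adjacent
                context[text[i:i + n]] += radius - distance + 1
        start = text.find(query, start + 1)
    return context


if __name__ == "__main__":
    # Illustrative sentence, not taken from the paper's data sets.
    sample = "学习中文很有趣，因为中文依赖上下文。"
    for gram, weight in ngram_context(sample, "中文").most_common(5):
        print(gram, weight)
```

Counting every n-gram near the query word, rather than only segmented words, sidesteps segmentation errors at the cost of a noisier and larger context vector; the `max_n` and `radius` values shown are arbitrary choices for the illustration.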
First indexed: 2024-04-11T11:52:32Z
Record ID: doaj.art-e44afeea2ec5457390a5fd7cc5b9f7a8
Institution: Directory Open Access Journal
ISSN: 1099-4300
Last indexed: 2024-04-11T11:52:32Z
DOI: 10.3390/e22030275
Citation: Entropy, vol. 22, no. 3, article 275 (2020-02-01)
Author affiliations (in record order, one per author):
1. School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
2. School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
3. Saint Petersburg National Research University of Information Technology, Mechanics and Optics, St. Petersburg 197101, Russia
4. Beijing Institute of Technology, Beijing 100081, China
5. Saint Petersburg National Research University of Information Technology, Mechanics and Optics, St. Petersburg 197101, Russia
Subjects: text mining; content analysis and indexing; text analysis