Applying the Bell’s Test to Chinese Texts
Search engines can find documents containing patterns from a query. This approach works for alphabetic languages such as English. However, Chinese is highly dependent on context. A significant problem in Chinese text processing is the absence of blanks between words, so it is necessary...
Main Authors: | Igor A. Bessmertny, Xiaoxi Huang, Aleksei V. Platonov, Chuqiao Yu, Julia A. Koroleva |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-02-01 |
Series: | Entropy |
Subjects: | text mining; content analysis and indexing; text analysis |
Online Access: | https://www.mdpi.com/1099-4300/22/3/275 |
_version_ | 1798002458808549376 |
---|---|
author | Igor A. Bessmertny; Xiaoxi Huang; Aleksei V. Platonov; Chuqiao Yu; Julia A. Koroleva |
author_facet | Igor A. Bessmertny; Xiaoxi Huang; Aleksei V. Platonov; Chuqiao Yu; Julia A. Koroleva |
author_sort | Igor A. Bessmertny |
collection | DOAJ |
description | Search engines can find documents containing patterns from a query. This approach works for alphabetic languages such as English. However, Chinese is highly dependent on context. A significant problem in Chinese text processing is the absence of blanks between words, so the text must be segmented into words before any other action. Algorithms for Chinese text segmentation should consider context; that is, the word segmentation process depends on other ideograms. As the existing segmentation algorithms are imperfect, we have considered an approach that builds the context from all possible n-grams surrounding the query words. This paper proposes a quantum-inspired approach to rank Chinese text documents by their relevance to the query. In particular, this approach uses Bell’s test, which measures the quantum entanglement of two words within the context. The contexts of words are built using the hyperspace analogue to language (HAL) algorithm. Experiments conducted in three domains demonstrated that the proposed approach provides acceptable results. |
first_indexed | 2024-04-11T11:52:32Z |
format | Article |
id | doaj.art-e44afeea2ec5457390a5fd7cc5b9f7a8 |
institution | Directory of Open Access Journals |
issn | 1099-4300 |
language | English |
last_indexed | 2024-04-11T11:52:32Z |
publishDate | 2020-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-e44afeea2ec5457390a5fd7cc5b9f7a8; 2022-12-22T04:25:16Z; eng; MDPI AG; Entropy; 1099-4300; 2020-02-01; 22; 3; 275; 10.3390/e22030275; e22030275; Applying the Bell’s Test to Chinese Texts; Igor A. Bessmertny (0); Xiaoxi Huang (1); Aleksei V. Platonov (2); Chuqiao Yu (3); Julia A. Koroleva (4); (0) School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China; (1) School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China; (2) Saint Petersburg National Research University of Information Technology, Mechanics and Optics, St. Petersburg 197101, Russia; (3) Beijing Institute of Technology, Beijing 100081, China; (4) Saint Petersburg National Research University of Information Technology, Mechanics and Optics, St. Petersburg 197101, Russia; https://www.mdpi.com/1099-4300/22/3/275; text mining; content analysis and indexing; text analysis |
spellingShingle | Igor A. Bessmertny; Xiaoxi Huang; Aleksei V. Platonov; Chuqiao Yu; Julia A. Koroleva; Applying the Bell’s Test to Chinese Texts; Entropy; text mining; content analysis and indexing; text analysis |
title | Applying the Bell’s Test to Chinese Texts |
title_full | Applying the Bell’s Test to Chinese Texts |
title_fullStr | Applying the Bell’s Test to Chinese Texts |
title_full_unstemmed | Applying the Bell’s Test to Chinese Texts |
title_short | Applying the Bell’s Test to Chinese Texts |
title_sort | applying the bell s test to chinese texts |
topic | text mining; content analysis and indexing; text analysis |
url | https://www.mdpi.com/1099-4300/22/3/275 |
work_keys_str_mv | AT igorabessmertny applyingthebellstesttochinesetexts AT xiaoxihuang applyingthebellstesttochinesetexts AT alekseivplatonov applyingthebellstesttochinesetexts AT chuqiaoyu applyingthebellstesttochinesetexts AT juliaakoroleva applyingthebellstesttochinesetexts |
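
The description above mentions two building blocks: contexts built from all possible n-grams, and the hyperspace analogue to language (HAL) algorithm. The following is a minimal sketch of both, not the authors' implementation. It assumes the commonly described HAL weighting, in which a token that precedes the target at distance d inside a window of size K contributes K - d + 1. The function names `all_ngrams` and `hal_matrix`, the window size, and the sample string are illustrative assumptions that do not come from the record.

```python
from collections import defaultdict


def all_ngrams(text, max_n=2):
    """Return every character n-gram of length 1..max_n, shortest first.

    Assumption: n-grams are taken over raw characters, which avoids
    committing to any particular word segmentation.
    """
    grams = []
    for n in range(1, max_n + 1):
        for start in range(len(text) - n + 1):
            grams.append(text[start:start + n])
    return grams


def hal_matrix(tokens, window=2):
    """Build a HAL-style co-occurrence table.

    hal[target][neighbour] accumulates (window - d + 1) for each
    neighbour that precedes the target at distance d <= window.
    """
    hal = defaultdict(lambda: defaultdict(int))
    for i, target in enumerate(tokens):
        for d in range(1, window + 1):
            j = i - d
            if j < 0:
                break  # no more preceding tokens
            hal[target][tokens[j]] += window - d + 1
    return hal


# Illustrative usage with single characters as tokens (an assumption).
sample = "我爱北京"
hal = hal_matrix(list(sample), window=2)
bigrams = [g for g in all_ngrams(sample, max_n=2) if len(g) == 2]
```

In this sketch closer neighbours receive larger weights, so each row of the table can serve as a context vector for its target token. How the paper combines such vectors in Bell's test is not stated in this record and is not reproduced here.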