On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data Collection
The paradigm-shifting developments of cryptography and information theory have focused on the privacy of data-sharing systems, such as epidemiological studies, where agencies are collecting far more personal data than they need, causing intrusions on patients’ privacy. To study the capability of the...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2023-04-01
|
Series: | Entropy |
Subjects: | |
Online Access: | https://www.mdpi.com/1099-4300/25/4/625 |
_version_ | 1797605592727027712 |
---|---|
author | Jiale Cheng Nan Liu Wei Kang |
author_facet | Jiale Cheng Nan Liu Wei Kang |
author_sort | Jiale Cheng |
collection | DOAJ |
description | The paradigm-shifting developments of cryptography and information theory have focused on the privacy of data-sharing systems, such as epidemiological studies, where agencies are collecting far more personal data than they need, causing intrusions on patients’ privacy. To study the capability of the data collection while protecting privacy from an information theory perspective, we formulate a new distributed multiparty computation problem called privacy-preserving epidemiological data collection. In our setting, a data collector requires a linear combination of <i>K</i> users’ data through a storage system consisting of <i>N</i> servers. Privacy needs to be protected when the users, servers, and data collector do not trust each other. For the users, any data are required to be protected from up to <i>E</i> colluding servers; for the servers, any more information than the desired linear combination cannot be leaked to the data collector; and for the data collector, any single server can not know anything about the coefficients of the linear combination. Our goal is to find the optimal collection rate, which is defined as the ratio of the size of the user’s message to the total size of downloads from <i>N</i> servers to the data collector. For achievability, we propose an asymptotic capacity-achieving scheme when <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>E</mi><mo><</mo><mi>N</mi><mo>−</mo><mn>1</mn></mrow></semantics></math></inline-formula>, by applying the cross-subspace alignment method to our construction; for the converse, we proved an upper bound of the asymptotic rate for all achievable schemes when <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>E</mi><mo><</mo><mi>N</mi><mo>−</mo><mn>1</mn></mrow></semantics></math></inline-formula>. Additionally, we show that a positive asymptotic capacity is not possible when <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>E</mi><mo>≥</mo><mi>N</mi><mo>−</mo><mn>1</mn></mrow></semantics></math></inline-formula>. The results of the achievability and converse meet when the number of users goes to infinity, yielding the asymptotic capacity. Our work broadens current researches on data privacy in information theory and gives the best achievable asymptotic performance that any epidemiological data collector can obtain. |
first_indexed | 2024-03-11T05:03:16Z |
format | Article |
id | doaj.art-941e7bc0b1f34119859f530a424531e2 |
institution | Directory Open Access Journal |
issn | 1099-4300 |
language | English |
last_indexed | 2024-03-11T05:03:16Z |
publishDate | 2023-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-941e7bc0b1f34119859f530a424531e22023-11-17T19:08:44ZengMDPI AGEntropy1099-43002023-04-0125462510.3390/e25040625On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data CollectionJiale Cheng0Nan Liu1Wei Kang2National Mobile Communications Research Laboratory, Southeast University, Nanjing 211189, ChinaNational Mobile Communications Research Laboratory, Southeast University, Nanjing 211189, ChinaSchool of Information Science and Engineering, Southeast University, Nanjing 211189, ChinaThe paradigm-shifting developments of cryptography and information theory have focused on the privacy of data-sharing systems, such as epidemiological studies, where agencies are collecting far more personal data than they need, causing intrusions on patients’ privacy. To study the capability of the data collection while protecting privacy from an information theory perspective, we formulate a new distributed multiparty computation problem called privacy-preserving epidemiological data collection. In our setting, a data collector requires a linear combination of <i>K</i> users’ data through a storage system consisting of <i>N</i> servers. Privacy needs to be protected when the users, servers, and data collector do not trust each other. For the users, any data are required to be protected from up to <i>E</i> colluding servers; for the servers, any more information than the desired linear combination cannot be leaked to the data collector; and for the data collector, any single server can not know anything about the coefficients of the linear combination. Our goal is to find the optimal collection rate, which is defined as the ratio of the size of the user’s message to the total size of downloads from <i>N</i> servers to the data collector. For achievability, we propose an asymptotic capacity-achieving scheme when <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>E</mi><mo><</mo><mi>N</mi><mo>−</mo><mn>1</mn></mrow></semantics></math></inline-formula>, by applying the cross-subspace alignment method to our construction; for the converse, we proved an upper bound of the asymptotic rate for all achievable schemes when <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>E</mi><mo><</mo><mi>N</mi><mo>−</mo><mn>1</mn></mrow></semantics></math></inline-formula>. Additionally, we show that a positive asymptotic capacity is not possible when <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>E</mi><mo>≥</mo><mi>N</mi><mo>−</mo><mn>1</mn></mrow></semantics></math></inline-formula>. The results of the achievability and converse meet when the number of users goes to infinity, yielding the asymptotic capacity. Our work broadens current researches on data privacy in information theory and gives the best achievable asymptotic performance that any epidemiological data collector can obtain.https://www.mdpi.com/1099-4300/25/4/625secure multiparty computationdata privacyepidemiological data collectionasymptotic capacity |
spellingShingle | Jiale Cheng Nan Liu Wei Kang On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data Collection Entropy secure multiparty computation data privacy epidemiological data collection asymptotic capacity |
title | On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data Collection |
title_full | On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data Collection |
title_fullStr | On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data Collection |
title_full_unstemmed | On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data Collection |
title_short | On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data Collection |
title_sort | on the asymptotic capacity of information theoretic privacy preserving epidemiological data collection |
topic | secure multiparty computation data privacy epidemiological data collection asymptotic capacity |
url | https://www.mdpi.com/1099-4300/25/4/625 |
work_keys_str_mv | AT jialecheng ontheasymptoticcapacityofinformationtheoreticprivacypreservingepidemiologicaldatacollection AT nanliu ontheasymptoticcapacityofinformationtheoreticprivacypreservingepidemiologicaldatacollection AT weikang ontheasymptoticcapacityofinformationtheoreticprivacypreservingepidemiologicaldatacollection |