Channel Comparison Methods and Statistical Problems on Graphs
Initially driven by channel coding, information theory has developed a large collection of tools for measuring and comparing effectiveness of information channels. These tools have found applications in various fields such as statistics, probability, and theoretical computer science. This thesis exp...
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Published: |
Massachusetts Institute of Technology
2023
|
Online Access: | https://hdl.handle.net/1721.1/151616 https://orcid.org/0000-0003-1722-5241 |
_version_ | 1826210299387052032 |
---|---|
author | Gu, Yuzhou |
author2 | Polyanskiy, Yury |
author_facet | Polyanskiy, Yury Gu, Yuzhou |
author_sort | Gu, Yuzhou |
collection | MIT |
description | Initially driven by channel coding, information theory has developed a large collection of tools for measuring and comparing effectiveness of information channels. These tools have found applications in various fields such as statistics, probability, and theoretical computer science. This thesis explores several applications of these tools to statistical problems related to graphs.
Part I focuses on information channels and channel comparison methods, including f-divergences, strong data processing inequalities, and preorders between channels. While these theories have been well-established for binary memoryless symmetric (BMS) channels, there remains much to discover for channels with larger input alphabets. We develop a theory of q-ary input-symmetric (FMS) channels, generalizing the theory of BMS channels. We demonstrate that while FMS channels exhibit more complex behavior than BMS channels, some properties of BMS channels can be extended to FMS channels. Furthermore, we perform tight analysis on contraction properties of the Potts channels, the simplest examples of FMS channels.
In Part II, we apply the information theoretical methods established in Part I to solve problems related to random graph models with community structures. The random graph models include the stochastic block model (SBM) and its variants, which hold significance in statistics, machine learning, and network science. Central problems for these models ask about the feasibility and quality of recovering hidden community structures from unlabeled graphs. By utilizing the relationship between random graphs and random Galton-Watson trees, we demonstrate that many important problems on these graphical models can be reduced to problems on trees. We apply various channel comparison methods to solve these tree problems, demonstrating that different methods are effective for different problems and that selecting the correct tool for a problem is crucial. Problems we study include (for SBMs) weak recovery, optimal recovery algorithms, mutual information formula, and (for broadcasting on trees) reconstruction, robust reconstruction, uniqueness of belief propagation fixed points, boundary irrelevance, computation of limit information, and so on. |
first_indexed | 2024-09-23T14:47:37Z |
format | Thesis |
id | mit-1721.1/151616 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T14:47:37Z |
publishDate | 2023 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/1516162023-08-01T03:20:20Z Channel Comparison Methods and Statistical Problems on Graphs Gu, Yuzhou Polyanskiy, Yury Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Initially driven by channel coding, information theory has developed a large collection of tools for measuring and comparing effectiveness of information channels. These tools have found applications in various fields such as statistics, probability, and theoretical computer science. This thesis explores several applications of these tools to statistical problems related to graphs. Part I focuses on information channels and channel comparison methods, including f-divergences, strong data processing inequalities, and preorders between channels. While these theories have been well-established for binary memoryless symmetric (BMS) channels, there remains much to discover for channels with larger input alphabets. We develop a theory of q-ary input-symmetric (FMS) channels, generalizing the theory of BMS channels. We demonstrate that while FMS channels exhibit more complex behavior than BMS channels, some properties of BMS channels can be extended to FMS channels. Furthermore, we perform tight analysis on contraction properties of the Potts channels, the simplest examples of FMS channels. In Part II, we apply the information theoretical methods established in Part I to solve problems related to random graph models with community structures. The random graph models include the stochastic block model (SBM) and its variants, which hold significance in statistics, machine learning, and network science. Central problems for these models ask about the feasibility and quality of recovering hidden community structures from unlabeled graphs. By utilizing the relationship between random graphs and random Galton-Watson trees, we demonstrate that many important problems on these graphical models can be reduced to problems on trees. We apply various channel comparison methods to solve these tree problems, demonstrating that different methods are effective for different problems and that selecting the correct tool for a problem is crucial. Problems we study include (for SBMs) weak recovery, optimal recovery algorithms, mutual information formula, and (for broadcasting on trees) reconstruction, robust reconstruction, uniqueness of belief propagation fixed points, boundary irrelevance, computation of limit information, and so on. Ph.D. 2023-07-31T19:53:04Z 2023-07-31T19:53:04Z 2023-06 2023-07-13T14:21:16.002Z Thesis https://hdl.handle.net/1721.1/151616 https://orcid.org/0000-0003-1722-5241 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology |
spellingShingle | Gu, Yuzhou Channel Comparison Methods and Statistical Problems on Graphs |
title | Channel Comparison Methods and Statistical Problems on Graphs |
title_full | Channel Comparison Methods and Statistical Problems on Graphs |
title_fullStr | Channel Comparison Methods and Statistical Problems on Graphs |
title_full_unstemmed | Channel Comparison Methods and Statistical Problems on Graphs |
title_short | Channel Comparison Methods and Statistical Problems on Graphs |
title_sort | channel comparison methods and statistical problems on graphs |
url | https://hdl.handle.net/1721.1/151616 https://orcid.org/0000-0003-1722-5241 |
work_keys_str_mv | AT guyuzhou channelcomparisonmethodsandstatisticalproblemsongraphs |