Channel Comparison Methods and Statistical Problems on Graphs

Initially driven by channel coding, information theory has developed a large collection of tools for measuring and comparing effectiveness of information channels. These tools have found applications in various fields such as statistics, probability, and theoretical computer science. This thesis exp...

Full description

Bibliographic Details
Main Author: Gu, Yuzhou
Other Authors: Polyanskiy, Yury
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access:https://hdl.handle.net/1721.1/151616
https://orcid.org/0000-0003-1722-5241
_version_ 1826210299387052032
author Gu, Yuzhou
author2 Polyanskiy, Yury
author_facet Polyanskiy, Yury
Gu, Yuzhou
author_sort Gu, Yuzhou
collection MIT
description Initially driven by channel coding, information theory has developed a large collection of tools for measuring and comparing effectiveness of information channels. These tools have found applications in various fields such as statistics, probability, and theoretical computer science. This thesis explores several applications of these tools to statistical problems related to graphs. Part I focuses on information channels and channel comparison methods, including f-divergences, strong data processing inequalities, and preorders between channels. While these theories have been well-established for binary memoryless symmetric (BMS) channels, there remains much to discover for channels with larger input alphabets. We develop a theory of q-ary input-symmetric (FMS) channels, generalizing the theory of BMS channels. We demonstrate that while FMS channels exhibit more complex behavior than BMS channels, some properties of BMS channels can be extended to FMS channels. Furthermore, we perform tight analysis on contraction properties of the Potts channels, the simplest examples of FMS channels. In Part II, we apply the information theoretical methods established in Part I to solve problems related to random graph models with community structures. The random graph models include the stochastic block model (SBM) and its variants, which hold significance in statistics, machine learning, and network science. Central problems for these models ask about the feasibility and quality of recovering hidden community structures from unlabeled graphs. By utilizing the relationship between random graphs and random Galton-Watson trees, we demonstrate that many important problems on these graphical models can be reduced to problems on trees. We apply various channel comparison methods to solve these tree problems, demonstrating that different methods are effective for different problems and that selecting the correct tool for a problem is crucial. Problems we study include (for SBMs) weak recovery, optimal recovery algorithms, mutual information formula, and (for broadcasting on trees) reconstruction, robust reconstruction, uniqueness of belief propagation fixed points, boundary irrelevance, computation of limit information, and so on.
first_indexed 2024-09-23T14:47:37Z
format Thesis
id mit-1721.1/151616
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T14:47:37Z
publishDate 2023
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/1516162023-08-01T03:20:20Z Channel Comparison Methods and Statistical Problems on Graphs Gu, Yuzhou Polyanskiy, Yury Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Initially driven by channel coding, information theory has developed a large collection of tools for measuring and comparing effectiveness of information channels. These tools have found applications in various fields such as statistics, probability, and theoretical computer science. This thesis explores several applications of these tools to statistical problems related to graphs. Part I focuses on information channels and channel comparison methods, including f-divergences, strong data processing inequalities, and preorders between channels. While these theories have been well-established for binary memoryless symmetric (BMS) channels, there remains much to discover for channels with larger input alphabets. We develop a theory of q-ary input-symmetric (FMS) channels, generalizing the theory of BMS channels. We demonstrate that while FMS channels exhibit more complex behavior than BMS channels, some properties of BMS channels can be extended to FMS channels. Furthermore, we perform tight analysis on contraction properties of the Potts channels, the simplest examples of FMS channels. In Part II, we apply the information theoretical methods established in Part I to solve problems related to random graph models with community structures. The random graph models include the stochastic block model (SBM) and its variants, which hold significance in statistics, machine learning, and network science. Central problems for these models ask about the feasibility and quality of recovering hidden community structures from unlabeled graphs. By utilizing the relationship between random graphs and random Galton-Watson trees, we demonstrate that many important problems on these graphical models can be reduced to problems on trees. We apply various channel comparison methods to solve these tree problems, demonstrating that different methods are effective for different problems and that selecting the correct tool for a problem is crucial. Problems we study include (for SBMs) weak recovery, optimal recovery algorithms, mutual information formula, and (for broadcasting on trees) reconstruction, robust reconstruction, uniqueness of belief propagation fixed points, boundary irrelevance, computation of limit information, and so on. Ph.D. 2023-07-31T19:53:04Z 2023-07-31T19:53:04Z 2023-06 2023-07-13T14:21:16.002Z Thesis https://hdl.handle.net/1721.1/151616 https://orcid.org/0000-0003-1722-5241 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle Gu, Yuzhou
Channel Comparison Methods and Statistical Problems on Graphs
title Channel Comparison Methods and Statistical Problems on Graphs
title_full Channel Comparison Methods and Statistical Problems on Graphs
title_fullStr Channel Comparison Methods and Statistical Problems on Graphs
title_full_unstemmed Channel Comparison Methods and Statistical Problems on Graphs
title_short Channel Comparison Methods and Statistical Problems on Graphs
title_sort channel comparison methods and statistical problems on graphs
url https://hdl.handle.net/1721.1/151616
https://orcid.org/0000-0003-1722-5241
work_keys_str_mv AT guyuzhou channelcomparisonmethodsandstatisticalproblemsongraphs