Promise and Limitations of Supervised Optimal Transport-Based Graph Summarization via Information Theoretic Measures

Graph summarization is the problem of producing smaller graph representations of an input graph dataset, in such a way that the smaller compressed graphs capture relevant structural information for downstream tasks. One of the recent graph summarization methods formulates an optimal transport-based...


Bibliographic Details
Main Authors: Sepideh Neshatfar, Abram Magner, Salimeh Yasaei Sekeh
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
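Subjects: Graph classification, monotonicity, optimal transport, Shannon mutual information, supervised graph summarization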
Online Access: https://ieeexplore.ieee.org/document/10210378/
collection DOAJ
description Graph summarization is the problem of producing smaller graph representations of an input graph dataset such that the compressed graphs capture the structural information relevant to downstream tasks. One recent graph summarization method formulates an optimal transport-based framework that allows prior information about node, edge, and attribute importance to be incorporated into the summarization process. However, very little is known about the statistical properties of this framework. To address this gap, we consider the problem of supervised graph summarization, in which we use information-theoretic measures to preserve relevant information about a class label. To gain a theoretical perspective on the supervised summarization problem itself, we first formulate it as maximizing the Shannon mutual information between the summarized graph and the class label. We show an NP-hardness of approximation result for this problem, thereby constraining what one should expect from proposed solutions. We then propose a summarization method that incorporates mutual information estimates between random variables associated with sample graphs and class labels into the optimal transport compression framework. We empirically show improvements over previous work in classification accuracy and time on synthetic and certain real datasets. We also theoretically explore the limitations of the optimal transport approach for the supervised summarization problem and show that it fails to satisfy a certain desirable information monotonicity property.
first_indexed 2024-03-12T14:02:42Z
id doaj.art-5acf795e5259462080aec03c1f0f8bb9
institution Directory Open Access Journal
issn 2169-3536
last_indexed 2024-03-12T14:02:42Z
spelling doaj.art-5acf795e5259462080aec03c1f0f8bb9
timestamp 2023-08-21T23:00:31Z
volume 11
pages 87533-87542
doi 10.1109/ACCESS.2023.3302830
document number 10210378
Sepideh Neshatfar (https://orcid.org/0009-0001-8896-8109), School of Computing and Information Science, University of Maine, Orono, ME, USA
Abram Magner (https://orcid.org/0000-0002-3082-9915), College of Engineering and Applied Sciences, University at Albany, Albany, NY, USA
Salimeh Yasaei Sekeh (https://orcid.org/0000-0002-0854-5422), School of Computing and Information Science, University of Maine, Orono, ME, USA
topic Graph classification
monotonicity
optimal transport
Shannon mutual information
supervised graph summarization