Highly scalable maximum likelihood and conjugate Bayesian inference for ERGMs on graph sets with equivalent vertices.

The exponential family random graph modeling (ERGM) framework provides a highly flexible approach for the statistical analysis of networks (i.e., graphs). As ERGMs with dyadic dependence involve normalizing factors that are extremely costly to compute, practical strategies for ERGM inference genera...


Bibliographic Details
Main Authors: Fan Yin, Carter T Butts
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2022-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0273039
collection DOAJ
description The exponential family random graph modeling (ERGM) framework provides a highly flexible approach for the statistical analysis of networks (i.e., graphs). As ERGMs with dyadic dependence involve normalizing factors that are extremely costly to compute, practical strategies for ERGM inference generally employ a variety of approximations or other workarounds. Markov Chain Monte Carlo maximum likelihood (MCMC MLE) provides a powerful tool to approximate the maximum likelihood estimator (MLE) of ERGM parameters, and is generally feasible for typical models on single networks with as many as a few thousand nodes. MCMC-based algorithms for Bayesian analysis are more expensive, and high-quality answers are challenging to obtain on large graphs. For both strategies, extension to the pooled case (in which we observe multiple networks from a common generative process) adds further computational cost, with both time and memory scaling linearly in the number of graphs. This becomes prohibitive for large networks, or for cases in which large numbers of graph observations are available. Here, we exploit some basic properties of discrete exponential families to develop an approach for ERGM inference in the pooled case that (where applicable) allows an arbitrarily large number of graph observations to be fit at no additional computational cost beyond preprocessing the data itself. Moreover, a variant of our approach can also be used to perform Bayesian inference under conjugate priors, again with no additional computational cost in the estimation phase. The latter can be employed either for single graph observations or for observations from graph sets. As we show, the conjugate prior is easily specified and is well-suited to applications such as regularization. Simulation studies show that the pooled method leads to estimates with good frequentist properties, and that posterior estimates under the conjugate prior are well-behaved. We demonstrate the usefulness of our approach with applications to pooled analysis of brain functional connectivity networks and to replicated X-ray crystal structures of hen egg-white lysozyme.
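Illustrative note (not part of the published record): the abstract's claim that pooled fitting costs nothing beyond preprocessing can be motivated by a general exponential-family fact. For independent graphs sharing one parameter vector, the log-likelihood depends on the data only through the mean sufficient statistic. The Python sketch below checks this on a toy case and shows the standard conjugate hyperparameter update. Everything in it is an assumption made for illustration: the 4-node edge-plus-triangle model, the brute-force normalizing constant, and all function names. It is not the authors' algorithm or software.

```python
import itertools
import math

N = 4  # toy graph size, small enough to enumerate all 2**6 graphs
DYADS = list(itertools.combinations(range(N), 2))

def stats(edges):
    """Sufficient statistics (edge count, triangle count) of an edge list."""
    e = set(edges)
    tri = sum(1 for a, b, c in itertools.combinations(range(N), 3)
              if (a, b) in e and (a, c) in e and (b, c) in e)
    return (len(e), tri)

# Tabulate how many graphs share each statistic vector (brute force).
SUPPORT = {}
for mask in itertools.product([0, 1], repeat=len(DYADS)):
    s = stats([d for d, on in zip(DYADS, mask) if on])
    SUPPORT[s] = SUPPORT.get(s, 0) + 1

def log_z(theta):
    """Log normalizing constant, computed exactly via log-sum-exp."""
    terms = [math.log(c) + theta[0] * s[0] + theta[1] * s[1]
             for s, c in SUPPORT.items()]
    mx = max(terms)
    return mx + math.log(sum(math.exp(t - mx) for t in terms))

def mean_stats(graphs):
    """Preprocessing step: reduce m graphs to (mean statistic, m)."""
    m = len(graphs)
    all_stats = [stats(g) for g in graphs]
    return tuple(sum(s[k] for s in all_stats) / m for k in range(2)), m

def pooled_loglik(theta, mean_stat, m):
    """Log-likelihood using only the mean statistic and the graph count."""
    dot = theta[0] * mean_stat[0] + theta[1] * mean_stat[1]
    return m * (dot - log_z(theta))

def naive_loglik(theta, graphs):
    """Log-likelihood summed graph by graph, for comparison."""
    return sum(theta[0] * s[0] + theta[1] * s[1] - log_z(theta)
               for s in map(stats, graphs))

def conjugate_update(n0, t0, mean_stat, m):
    """Posterior hyperparameters for a prior proportional to
    exp(n0 * (theta . t0 - log Z(theta)))."""
    n1 = n0 + m
    return n1, tuple((n0 * a + m * b) / n1 for a, b in zip(t0, mean_stat))
```

For example, `mean_stats([[(0, 1), (1, 2), (0, 2)], [(0, 1)], []])` reduces three graphs to one mean statistic vector, after which evaluating `pooled_loglik` costs the same however many graphs were pooled. In realistic ERGMs the normalizing constant cannot be enumerated and must be approximated, which this sketch does not attempt.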
first_indexed 2024-03-13T05:09:43Z
format Article
id doaj.art-329edca1a7cf419ba68a09ab2e1e2359
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-03-13T05:09:43Z
publishDate 2022-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE