Attention for inference compilation

Bibliographic details
Main authors: Harvey, W, Munk, A, Baydin, AG, Bergholm, A, Wood, F
Material type: Conference item
Language: English
Published: SciTePress 2022
Description
Summary: We present a neural network architecture for automatic amortized inference in universal probabilistic programs which improves on the performance of current architectures. Our approach extends inference compilation (IC), a technique which uses deep neural networks to approximate a posterior distribution over latent variables in a probabilistic program. A challenge with existing IC network architectures is that they can fail to capture long-range dependencies between latent variables. To address this, we introduce an attention mechanism that attends to the most salient variables previously sampled in the execution of a probabilistic program. We demonstrate that the addition of attention allows the proposal distributions to better match the true posterior, enhancing inference about latent variables in simulators.