An <i>O</i>(log<sub>2</sub><i>N</i>) Fully-Balanced Resampling Algorithm for Particle Filters on Distributed Memory Architectures

Resampling is a well-known statistical algorithm that is commonly applied in the context of Particle Filters (PFs) in order to perform state estimation for non-linear non-Gaussian dynamic models. As the models become more complex and accurate, the run-time of PF applications becomes increasingly slo...

Full description

Bibliographic Details
Main Authors: Alessandro Varsi, Simon Maskell, Paul G. Spirakis
Format: Article
Language:English
Published: MDPI AG 2021-11-01
Series:Algorithms
Subjects:
Online Access:https://www.mdpi.com/1999-4893/14/12/342
Description
Summary:Resampling is a well-known statistical algorithm that is commonly applied in the context of Particle Filters (PFs) in order to perform state estimation for non-linear non-Gaussian dynamic models. As the models become more complex and accurate, the run-time of PF applications becomes increasingly slow. Parallel computing can help to address this. However, resampling (and, hence, PFs as well) necessarily involves a bottleneck, the redistribution step, which is notoriously challenging to parallelize if using textbook parallel computing techniques. A state-of-the-art redistribution takes <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>O</mi><mo>(</mo><msup><mrow><mo>(</mo><msub><mo form="prefix">log</mo><mn>2</mn></msub><mi>N</mi><mo>)</mo></mrow><mn>2</mn></msup><mo>)</mo></mrow></semantics></math></inline-formula> computations on Distributed Memory (DM) architectures, which most supercomputers adopt, whereas redistribution can be performed in <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>O</mi><mo>(</mo><msub><mo form="prefix">log</mo><mn>2</mn></msub><mi>N</mi><mo>)</mo></mrow></semantics></math></inline-formula> on Shared Memory (SM) architectures, such as GPU or mainstream CPUs. In this paper, we propose a novel parallel redistribution for DM that achieves an <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>O</mi><mo>(</mo><msub><mo form="prefix">log</mo><mn>2</mn></msub><mi>N</mi><mo>)</mo></mrow></semantics></math></inline-formula> time complexity. We also present empirical results that indicate that our novel approach outperforms the <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mi>O</mi><mo>(</mo><msup><mrow><mo>(</mo><msub><mo form="prefix">log</mo><mn>2</mn></msub><mi>N</mi><mo>)</mo></mrow><mn>2</mn></msup><mo>)</mo></mrow></semantics></math></inline-formula> approach.
ISSN:1999-4893