Efficient and Scalable Architecture for Multiple-Chip Implementation of Simulated Bifurcation Machines
Ising machines are specialized computers for finding the lowest energy states of Ising spin models, onto which many practical combinatorial optimization problems can be mapped. Simulated bifurcation (SB) is a quantum-inspired parallelizable algorithm for Ising problems that enables scalable multi-ch...
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
IEEE
2024-01-01
|
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10460551/ |
_version_ | 1797243356624977920 |
---|---|
author | Tomoya Kashimata Masaya Yamasaki Ryo Hidaka Kosuke Tatsumura |
author_facet | Tomoya Kashimata Masaya Yamasaki Ryo Hidaka Kosuke Tatsumura |
author_sort | Tomoya Kashimata |
collection | DOAJ |
description | Ising machines are specialized computers for finding the lowest energy states of Ising spin models, onto which many practical combinatorial optimization problems can be mapped. Simulated bifurcation (SB) is a quantum-inspired parallelizable algorithm for Ising problems that enables scalable multi-chip implementations of Ising machines. However, the computational performance of a previously proposed multi-chip architecture tends to saturate as the number of chips increases for a given problem size because computation and communication are exclusive in the time domain. In this paper, we propose a streaming architecture for multi-chip implementations of SB-based Ising machines with full spin-to-spin connectivity. The data flow in in-chip computation is harmonized with the data flow in inter-chip communication, enabling the computation and communication to overlap and the communication time to be hidden. Systematic experiments demonstrate linear strong scaling of performance up to the vicinity of the ideal communication limit determined only by the latency of chip-to-chip communication. Our eight-FPGA (field-programmable gate array) cluster can compute a 32,768-spin problem with a high pipeline efficiency of 97.9%. The performance of a 79-FPGA cluster for a 100,000-spin problem, projected using a theoretical performance model validated on smaller experimental clusters, is comparable to that of a state-of-the-art 100,000-spin optical Ising machine. |
first_indexed | 2024-04-24T18:53:49Z |
format | Article |
id | doaj.art-38cc640b75934c7785ba6250b26bb86c |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-24T18:53:49Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-38cc640b75934c7785ba6250b26bb86c2024-03-26T17:45:42ZengIEEEIEEE Access2169-35362024-01-0112366063662110.1109/ACCESS.2024.337408910460551Efficient and Scalable Architecture for Multiple-Chip Implementation of Simulated Bifurcation MachinesTomoya Kashimata0https://orcid.org/0009-0001-6337-7580Masaya Yamasaki1https://orcid.org/0000-0001-7867-9605Ryo Hidaka2https://orcid.org/0009-0000-0207-6381Kosuke Tatsumura3https://orcid.org/0000-0003-0511-443XCorporate Research and Development Center, Toshiba Corporation, Kawasaki, Saiwai-ku, JapanCorporate Research and Development Center, Toshiba Corporation, Kawasaki, Saiwai-ku, JapanCorporate Research and Development Center, Toshiba Corporation, Kawasaki, Saiwai-ku, JapanCorporate Research and Development Center, Toshiba Corporation, Kawasaki, Saiwai-ku, JapanIsing machines are specialized computers for finding the lowest energy states of Ising spin models, onto which many practical combinatorial optimization problems can be mapped. Simulated bifurcation (SB) is a quantum-inspired parallelizable algorithm for Ising problems that enables scalable multi-chip implementations of Ising machines. However, the computational performance of a previously proposed multi-chip architecture tends to saturate as the number of chips increases for a given problem size because computation and communication are exclusive in the time domain. In this paper, we propose a streaming architecture for multi-chip implementations of SB-based Ising machines with full spin-to-spin connectivity. The data flow in in-chip computation is harmonized with the data flow in inter-chip communication, enabling the computation and communication to overlap and the communication time to be hidden. Systematic experiments demonstrate linear strong scaling of performance up to the vicinity of the ideal communication limit determined only by the latency of chip-to-chip communication. Our eight-FPGA (field-programmable gate array) cluster can compute a 32,768-spin problem with a high pipeline efficiency of 97.9%. The performance of a 79-FPGA cluster for a 100,000-spin problem, projected using a theoretical performance model validated on smaller experimental clusters, is comparable to that of a state-of-the-art 100,000-spin optical Ising machine.https://ieeexplore.ieee.org/document/10460551/Cluster computingcombinatorial optimizationFPGAhardware accelerationIsing machineparallel processing |
spellingShingle | Tomoya Kashimata Masaya Yamasaki Ryo Hidaka Kosuke Tatsumura Efficient and Scalable Architecture for Multiple-Chip Implementation of Simulated Bifurcation Machines IEEE Access Cluster computing combinatorial optimization FPGA hardware acceleration Ising machine parallel processing |
title | Efficient and Scalable Architecture for Multiple-Chip Implementation of Simulated Bifurcation Machines |
title_full | Efficient and Scalable Architecture for Multiple-Chip Implementation of Simulated Bifurcation Machines |
title_fullStr | Efficient and Scalable Architecture for Multiple-Chip Implementation of Simulated Bifurcation Machines |
title_full_unstemmed | Efficient and Scalable Architecture for Multiple-Chip Implementation of Simulated Bifurcation Machines |
title_short | Efficient and Scalable Architecture for Multiple-Chip Implementation of Simulated Bifurcation Machines |
title_sort | efficient and scalable architecture for multiple chip implementation of simulated bifurcation machines |
topic | Cluster computing combinatorial optimization FPGA hardware acceleration Ising machine parallel processing |
url | https://ieeexplore.ieee.org/document/10460551/ |
work_keys_str_mv | AT tomoyakashimata efficientandscalablearchitectureformultiplechipimplementationofsimulatedbifurcationmachines AT masayayamasaki efficientandscalablearchitectureformultiplechipimplementationofsimulatedbifurcationmachines AT ryohidaka efficientandscalablearchitectureformultiplechipimplementationofsimulatedbifurcationmachines AT kosuketatsumura efficientandscalablearchitectureformultiplechipimplementationofsimulatedbifurcationmachines |