A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion
The extended GCD (XGCD) calculation, which computes Bézout coefficients ba, bb such that ba ∗ a0 + bb ∗ b0 = GCD(a0, b0), is a critical operation in many cryptographic applications. In particular, large-integer XGCD is computationally dominant for two applications of increasing interest: verifiable...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
Ruhr-Universität Bochum
2022-08-01
|
Series: | Transactions on Cryptographic Hardware and Embedded Systems |
Subjects: | |
Online Access: | https://tches.iacr.org/index.php/TCHES/article/view/9817 |
_version_ | 1797326089561833472 |
---|---|
author | Kavya Sreedhar Mark Horowitz Christopher Torng |
author_facet | Kavya Sreedhar Mark Horowitz Christopher Torng |
author_sort | Kavya Sreedhar |
collection | DOAJ |
description |
The extended GCD (XGCD) calculation, which computes Bézout coefficients ba, bb such that ba ∗ a0 + bb ∗ b0 = GCD(a0, b0), is a critical operation in many cryptographic applications. In particular, large-integer XGCD is computationally dominant for two applications of increasing interest: verifiable delay functions that square binary quadratic forms within a class group and constant-time modular inversion for elliptic curve cryptography. Most prior work has focused on fast software implementations. The few works investigating hardware acceleration build on variants of Euclid’s division-based algorithm, following the approach used in optimized software. We show that adopting variants of Stein’s subtraction-based algorithm instead leads to significantly faster hardware. We quantify this advantage by performing a large-integer XGCD accelerator design space exploration comparing Euclid- and Stein-based algorithms for various application requirements. This exploration leads us to an XGCD hardware accelerator that is flexible and efficient, supports fast average and constant-time evaluation, and is easily extensible for polynomial GCD. Our 16nm ASIC design calculates 1024-bit XGCD in 294ns (8x faster than the state-of-the-art ASIC) and constant-time 255-bit XGCD for inverses in the field of integers modulo the prime 2255−19 in 85ns (31× faster than state-of-the-art software). We believe our design is the first high-performance ASIC for the XGCD computation that is also capable of constant-time evaluation. Our work is publicly available at https://github.com/kavyasreedhar/sreedhar-xgcd-hardware-ches2022.
|
first_indexed | 2024-03-08T06:18:33Z |
format | Article |
id | doaj.art-8a382dd46a974497a061033857ce7366 |
institution | Directory Open Access Journal |
issn | 2569-2925 |
language | English |
last_indexed | 2024-03-08T06:18:33Z |
publishDate | 2022-08-01 |
publisher | Ruhr-Universität Bochum |
record_format | Article |
series | Transactions on Cryptographic Hardware and Embedded Systems |
spelling | doaj.art-8a382dd46a974497a061033857ce73662024-02-04T16:21:05ZengRuhr-Universität BochumTransactions on Cryptographic Hardware and Embedded Systems2569-29252022-08-0120224A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular InversionKavya Sreedhar0Mark Horowitz1Christopher Torng2Stanford University, Stanford, CA, USAStanford University, Stanford, CA, USAStanford University, Stanford, CA, USA The extended GCD (XGCD) calculation, which computes Bézout coefficients ba, bb such that ba ∗ a0 + bb ∗ b0 = GCD(a0, b0), is a critical operation in many cryptographic applications. In particular, large-integer XGCD is computationally dominant for two applications of increasing interest: verifiable delay functions that square binary quadratic forms within a class group and constant-time modular inversion for elliptic curve cryptography. Most prior work has focused on fast software implementations. The few works investigating hardware acceleration build on variants of Euclid’s division-based algorithm, following the approach used in optimized software. We show that adopting variants of Stein’s subtraction-based algorithm instead leads to significantly faster hardware. We quantify this advantage by performing a large-integer XGCD accelerator design space exploration comparing Euclid- and Stein-based algorithms for various application requirements. This exploration leads us to an XGCD hardware accelerator that is flexible and efficient, supports fast average and constant-time evaluation, and is easily extensible for polynomial GCD. Our 16nm ASIC design calculates 1024-bit XGCD in 294ns (8x faster than the state-of-the-art ASIC) and constant-time 255-bit XGCD for inverses in the field of integers modulo the prime 2255−19 in 85ns (31× faster than state-of-the-art software). We believe our design is the first high-performance ASIC for the XGCD computation that is also capable of constant-time evaluation. Our work is publicly available at https://github.com/kavyasreedhar/sreedhar-xgcd-hardware-ches2022. https://tches.iacr.org/index.php/TCHES/article/view/9817Extended GCDASICVerifiable delay functionClass groupsSquaring binary quadratic formsConstant-time |
spellingShingle | Kavya Sreedhar Mark Horowitz Christopher Torng A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion Transactions on Cryptographic Hardware and Embedded Systems Extended GCD ASIC Verifiable delay function Class groups Squaring binary quadratic forms Constant-time |
title | A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion |
title_full | A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion |
title_fullStr | A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion |
title_full_unstemmed | A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion |
title_short | A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion |
title_sort | fast large integer extended gcd algorithm and hardware design for verifiable delay functions and modular inversion |
topic | Extended GCD ASIC Verifiable delay function Class groups Squaring binary quadratic forms Constant-time |
url | https://tches.iacr.org/index.php/TCHES/article/view/9817 |
work_keys_str_mv | AT kavyasreedhar afastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion AT markhorowitz afastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion AT christophertorng afastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion AT kavyasreedhar fastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion AT markhorowitz fastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion AT christophertorng fastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion |