A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion

The extended GCD (XGCD) calculation, which computes Bézout coefficients ba, bb such that ba ∗ a0 + bb ∗ b0 = GCD(a0, b0), is a critical operation in many cryptographic applications. In particular, large-integer XGCD is computationally dominant for two applications of increasing interest: verifiable...

Full description

Bibliographic Details
Main Authors: Kavya Sreedhar, Mark Horowitz, Christopher Torng
Format: Article
Language:English
Published: Ruhr-Universität Bochum 2022-08-01
Series:Transactions on Cryptographic Hardware and Embedded Systems
Subjects:
Online Access:https://tches.iacr.org/index.php/TCHES/article/view/9817
_version_ 1797326089561833472
author Kavya Sreedhar
Mark Horowitz
Christopher Torng
author_facet Kavya Sreedhar
Mark Horowitz
Christopher Torng
author_sort Kavya Sreedhar
collection DOAJ
description The extended GCD (XGCD) calculation, which computes Bézout coefficients ba, bb such that ba ∗ a0 + bb ∗ b0 = GCD(a0, b0), is a critical operation in many cryptographic applications. In particular, large-integer XGCD is computationally dominant for two applications of increasing interest: verifiable delay functions that square binary quadratic forms within a class group and constant-time modular inversion for elliptic curve cryptography. Most prior work has focused on fast software implementations. The few works investigating hardware acceleration build on variants of Euclid’s division-based algorithm, following the approach used in optimized software. We show that adopting variants of Stein’s subtraction-based algorithm instead leads to significantly faster hardware. We quantify this advantage by performing a large-integer XGCD accelerator design space exploration comparing Euclid- and Stein-based algorithms for various application requirements. This exploration leads us to an XGCD hardware accelerator that is flexible and efficient, supports fast average and constant-time evaluation, and is easily extensible for polynomial GCD. Our 16nm ASIC design calculates 1024-bit XGCD in 294ns (8x faster than the state-of-the-art ASIC) and constant-time 255-bit XGCD for inverses in the field of integers modulo the prime 2255−19 in 85ns (31× faster than state-of-the-art software). We believe our design is the first high-performance ASIC for the XGCD computation that is also capable of constant-time evaluation. Our work is publicly available at https://github.com/kavyasreedhar/sreedhar-xgcd-hardware-ches2022.
first_indexed 2024-03-08T06:18:33Z
format Article
id doaj.art-8a382dd46a974497a061033857ce7366
institution Directory Open Access Journal
issn 2569-2925
language English
last_indexed 2024-03-08T06:18:33Z
publishDate 2022-08-01
publisher Ruhr-Universität Bochum
record_format Article
series Transactions on Cryptographic Hardware and Embedded Systems
spelling doaj.art-8a382dd46a974497a061033857ce73662024-02-04T16:21:05ZengRuhr-Universität BochumTransactions on Cryptographic Hardware and Embedded Systems2569-29252022-08-0120224A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular InversionKavya Sreedhar0Mark Horowitz1Christopher Torng2Stanford University, Stanford, CA, USAStanford University, Stanford, CA, USAStanford University, Stanford, CA, USA The extended GCD (XGCD) calculation, which computes Bézout coefficients ba, bb such that ba ∗ a0 + bb ∗ b0 = GCD(a0, b0), is a critical operation in many cryptographic applications. In particular, large-integer XGCD is computationally dominant for two applications of increasing interest: verifiable delay functions that square binary quadratic forms within a class group and constant-time modular inversion for elliptic curve cryptography. Most prior work has focused on fast software implementations. The few works investigating hardware acceleration build on variants of Euclid’s division-based algorithm, following the approach used in optimized software. We show that adopting variants of Stein’s subtraction-based algorithm instead leads to significantly faster hardware. We quantify this advantage by performing a large-integer XGCD accelerator design space exploration comparing Euclid- and Stein-based algorithms for various application requirements. This exploration leads us to an XGCD hardware accelerator that is flexible and efficient, supports fast average and constant-time evaluation, and is easily extensible for polynomial GCD. Our 16nm ASIC design calculates 1024-bit XGCD in 294ns (8x faster than the state-of-the-art ASIC) and constant-time 255-bit XGCD for inverses in the field of integers modulo the prime 2255−19 in 85ns (31× faster than state-of-the-art software). We believe our design is the first high-performance ASIC for the XGCD computation that is also capable of constant-time evaluation. Our work is publicly available at https://github.com/kavyasreedhar/sreedhar-xgcd-hardware-ches2022. https://tches.iacr.org/index.php/TCHES/article/view/9817Extended GCDASICVerifiable delay functionClass groupsSquaring binary quadratic formsConstant-time
spellingShingle Kavya Sreedhar
Mark Horowitz
Christopher Torng
A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion
Transactions on Cryptographic Hardware and Embedded Systems
Extended GCD
ASIC
Verifiable delay function
Class groups
Squaring binary quadratic forms
Constant-time
title A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion
title_full A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion
title_fullStr A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion
title_full_unstemmed A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion
title_short A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion
title_sort fast large integer extended gcd algorithm and hardware design for verifiable delay functions and modular inversion
topic Extended GCD
ASIC
Verifiable delay function
Class groups
Squaring binary quadratic forms
Constant-time
url https://tches.iacr.org/index.php/TCHES/article/view/9817
work_keys_str_mv AT kavyasreedhar afastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion
AT markhorowitz afastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion
AT christophertorng afastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion
AT kavyasreedhar fastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion
AT markhorowitz fastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion
AT christophertorng fastlargeintegerextendedgcdalgorithmandhardwaredesignforverifiabledelayfunctionsandmodularinversion