Memory Efficient Implementation of Modular Multiplication for 32-bit ARM Cortex-M4

Bibliographic Details
Main Author:	Hwajeong Seo
Format:	Article
Language:	English
Published:	MDPI AG 2020-02-01
Series:	Applied Sciences
Subjects:	multi-precision multiplication; multi-precision squaring; public key cryptography; ARM Cortex-M4; memory-efficient implementation
Online Access:	https://www.mdpi.com/2076-3417/10/4/1539

Description
Summary:	In this paper, we present scalable multi-precision multiplication implementation and scalable multi-precision squaring implementation for 32-bit ARM Cortex-M4 microcontrollers. For efficient computation and scalable functionality, we present optimized Multiplication and ACcumulation (MAC) techniques for the target microcontrollers. In particular, we present the 64-bit-wise MAC operation with the Unsigned Long Multiply with Accumulate Accumulate (UMAAL) instruction. The MAC is used to perform column-wise multiplication/squaring (i.e., product-scanning) with general-purpose registers in an optimal way. Second, the squaring algorithm is further optimized through an efficient doubling routine together with an optimized product-scanning method. Finally, the proposed implementations achieved a very small memory footprint and high scalability to cover algorithms ranging from well-known public key cryptography (i.e., Rivest–Shamir–Adleman (RSA) and Elliptic Curve Cryptography (ECC)) to post-quantum cryptography (i.e., Supersingular Isogeny Key Encapsulation (SIKE)). All SIKE round 2 protocols were evaluated with the proposed modular reduction implementations. The results demonstrate that the scalable implementation can achieve the smallest code size together with a reasonable performance.
ISSN:	2076-3417
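
Illustrative sketch (not taken from the article): the summary refers to a MAC built on UMAAL and to column-wise (product-scanning) multiplication. The portable C below models the arithmetic of UMAAL, which computes (hi:lo) = a * b + lo + hi without overflow, and uses it in a plain product-scanning loop over 32-bit limbs. The names umaal_model and mul_product_scanning, the little-endian limb layout, and the three-word column accumulator are assumptions made for this example; the article's actual register allocation, doubling routine, and assembly are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable model of UMAAL (hypothetical helper name):
 * (hi:lo) = a * b + lo + hi. The sum cannot overflow 64 bits because
 * (2^32 - 1)^2 + 2 * (2^32 - 1) = 2^64 - 1. */
static void umaal_model(uint32_t *lo, uint32_t *hi, uint32_t a, uint32_t b)
{
    uint64_t t = (uint64_t)a * b + *lo + *hi;
    *lo = (uint32_t)t;
    *hi = (uint32_t)(t >> 32);
}

/* Product-scanning multiplication (hypothetical helper name):
 * r[0..2n-1] = a[0..n-1] * b[0..n-1], limbs stored least significant first.
 * Each result limb r[k] is finished column by column. */
static void mul_product_scanning(uint32_t *r, const uint32_t *a,
                                 const uint32_t *b, size_t n)
{
    assert(n > 0);
    uint32_t acc0 = 0, acc1 = 0, acc2 = 0; /* 96-bit column accumulator */

    for (size_t k = 0; k < 2 * n - 1; k++) {
        size_t i_min = (k < n) ? 0 : k - n + 1;
        size_t i_max = (k < n) ? k : n - 1;

        for (size_t i = i_min; i <= i_max; i++) {
            /* acc0 += a[i] * b[k - i]; the high word comes back in carry. */
            uint32_t carry = 0;
            umaal_model(&acc0, &carry, a[i], b[k - i]);

            /* Propagate the high word into the upper accumulator words. */
            uint64_t t = (uint64_t)acc1 + carry;
            acc1 = (uint32_t)t;
            acc2 += (uint32_t)(t >> 32);
        }

        /* Column k is complete: emit one limb and shift the accumulator. */
        r[k] = acc0;
        acc0 = acc1;
        acc1 = acc2;
        acc2 = 0;
    }
    r[2 * n - 1] = acc0;
}
```

On a Cortex-M4 the body of umaal_model would correspond to a single UMAAL instruction; the model exists only so the sketch compiles on any host.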

Similar Items