Simple, fast, and exact RNS scaler for the three-moduli set {2^n - 1, 2^n, 2^n + 1}

Bibliographic Details
Main Authors: Chang, Chip-Hong; Low, Jeremy Yung Shern
Other Authors: School of Electrical and Electronic Engineering
Format: Journal Article
Language: English
Published: 2015
Online Access: https://hdl.handle.net/10356/79409
http://hdl.handle.net/10220/25604
Description
Summary: Scaling in RNS has long been regarded as a performance bottleneck, similar to the residue-to-binary conversion problem, because of the inefficient intermodulo operation. In this paper, a simple and fast scaling algorithm for the three-moduli set {2^n - 1, 2^n, 2^n + 1} RNS is proposed. The complexity of the intermodulo operation is resolved by a new formulation of scaling an integer in the RNS domain by one of its moduli. By exploiting the Chinese Remainder Theorem and the number-theoretic properties of this moduli set, the design can be readily implemented with a standard-cell-based design methodology. The low-cost VLSI architecture, which uses no read-only memory (ROM), is easier to fuse into and pipeline with the other residue arithmetic operations of an RNS-based processor to increase the throughput rate. The proposed RNS scaler has zero scaling error and a critical path delay of only 2⌈log2 n⌉ + 9 units in the unit-gate model. Besides the scaled residue numbers, the scaled integer in normal binary representation is also produced as a byproduct of this process, which eliminates the need for a residue-to-binary converter when the binary representation of the scaled integer is also required. Experimental results show that the proposed RNS scaler is smaller and faster than both the most area-efficient adder-based design and the fastest ROM-based design, and is the most power-efficient among all scalers evaluated for the same three-moduli set.
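
The idea of scaling an RNS integer by one of its moduli can be illustrated with a small software model. The sketch below is not the paper's VLSI architecture. It assumes the scaling modulus is 2^n (the summary does not say which modulus is used), and the function name scale_by_2n is hypothetical. It relies only on two standard facts: 2^n is congruent to 1 modulo 2^n - 1 and to -1 modulo 2^n + 1, so the two odd-modulus residues of the scaled value need only a subtraction, and the Chinese Remainder Theorem over those two moduli then yields the scaled integer in binary.

```python
def scale_by_2n(residues, n):
    """Scale X by 2^n given its residues modulo (2^n - 1, 2^n, 2^n + 1).

    Returns ((y1, y2, y3), y), where y = floor(X / 2^n) exactly.
    Reference model only; assumes the scaling modulus is 2^n.
    Requires Python 3.8+ for pow(base, -1, mod).
    """
    m1, m2, m3 = (1 << n) - 1, 1 << n, (1 << n) + 1
    x1, x2, x3 = residues

    # y = (X - x2) / 2^n. Since 2^n = 1 (mod m1), dividing by 2^n is a no-op.
    y1 = (x1 - x2) % m1
    # Since 2^n = -1 (mod m3), dividing by 2^n negates the difference.
    y3 = (x2 - x3) % m3

    # y < m1 * m3, so a two-moduli CRT over (m1, m3) recovers y uniquely.
    k = ((y3 - y1) * pow(m1, -1, m3)) % m3
    y = y1 + m1 * k

    # The remaining residue is just the low n bits of y.
    y2 = y % m2
    return (y1, y2, y3), y


# Example: n = 3 gives the moduli (7, 8, 9); X = 100 has residues (2, 4, 1).
print(scale_by_2n((2, 4, 1), 3))  # ((5, 4, 3), 12)
```

In this model the scaled integer y appears as an intermediate result before the middle residue is taken, which mirrors the summary's remark that the binary representation is available as a byproduct.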