This work develops efficient Montgomery and Barrett modular multipliers that reduce computation latency or save hardware area. Contributions include a new algorithm that computes the quotient and the intermediate result in parallel rather than one after the other, and a faster evaluation of expressions of the form (A+B)*C, among others.
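For context, the following is a minimal sketch of the textbook word-serial (radix-2^k) Montgomery multiplication, not the new algorithm described above. It is included only to show the dependency that such designs target: in each iteration the quotient digit `q` is derived from the low word of the intermediate result `s`, and `s` cannot be updated until `q` is known. The function names, the default `k = 16`, and the choice of word count are assumptions made for illustration.

```python
def montgomery_radix(n: int, k: int = 16) -> int:
    """Return R = 2^(k * words), the smallest power of 2^k that exceeds n."""
    words = -(-n.bit_length() // k)  # ceiling division
    return 1 << (k * words)


def montgomery_multiply(a: int, b: int, n: int, k: int = 16) -> int:
    """Return a * b * R^-1 mod n for odd n and 0 <= a, b < n.

    Textbook interleaved form: one k-bit digit of `a` is consumed per iteration.
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    if not (0 <= a < n and 0 <= b < n):
        raise ValueError("operands must lie in [0, n)")

    radix = 1 << k
    mask = radix - 1
    words = -(-n.bit_length() // k)
    # n_prime = -n^-1 mod 2^k (three-argument pow with exponent -1 needs Python 3.8+)
    n_prime = (-pow(n, -1, radix)) % radix

    s = 0  # intermediate result, kept below 2n at the end of every iteration
    for i in range(words):
        a_i = (a >> (k * i)) & mask        # i-th radix-2^k digit of a
        s += a_i * b                       # accumulate the partial product
        q = ((s & mask) * n_prime) & mask  # quotient digit: depends on the updated s
        s = (s + q * n) >> k               # s + q*n is divisible by 2^k, so the shift is exact
    return s - n if s >= n else s          # single conditional final subtraction
```

The result equals `a * b * pow(montgomery_radix(n, k), -1, n) % n`, which gives a direct way to check the sketch against Python's built-in big-integer arithmetic. The serial chain from `s` to `q` and back to `s` inside the loop is the latency bottleneck in this baseline form.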
Related work:
- B. Zhang, Z. Cheng, and M. Pedram, “High-Radix Design of a Scalable Montgomery Modular Multiplier with Low Latency,” IEEE Transactions on Computers.