Squaring (X²) is a special case of multiplication that plays an important role in several public-key cryptosystems, such as RSA and ECC. This paper proposes an efficient squaring algorithm for embedded RISC processors. To improve performance, we exploit the multiply/accumulate unit of these processors and minimize the number of external memory accesses. On the Texas Instruments TMS320C55x DSP, our squaring algorithm is 59-72% faster than Yang et al.'s for operand bit lengths from 1024 to 8192.
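To illustrate why squaring is cheaper than general multiplication, the following is a minimal sketch of textbook column-wise (product-scanning) squaring, not the algorithm proposed in this paper. It relies on the identity X² = Σ x_i² B^(2i) + 2 Σ_{i<j} x_i x_j B^(i+j), so each cross product is computed once and doubled. The 16-bit word size and the names `W`, `to_limbs`, `from_limbs`, and `square_limbs` are assumptions chosen for illustration; the variable `acc` merely stands in for a wide multiply/accumulate register.

```python
W = 16                    # assumed word size in bits (illustrative only)
MASK = (1 << W) - 1


def to_limbs(x, n):
    """Split a non-negative integer into n little-endian W-bit words."""
    return [(x >> (W * i)) & MASK for i in range(n)]


def from_limbs(limbs):
    """Recombine little-endian W-bit words into an integer."""
    return sum(v << (W * i) for i, v in enumerate(limbs))


def square_limbs(x):
    """Square a little-endian word array column by column.

    Returns (result_words, word_multiplications).
    """
    n = len(x)
    out = [0] * (2 * n)
    acc = 0               # running carry; models a wide accumulator
    muls = 0
    for k in range(2 * n - 1):
        col = 0
        i = max(0, k - n + 1)
        j = k - i
        # Cross products x[i]*x[j] with i < j and i + j == k.
        while i < j:
            col += x[i] * x[j]
            muls += 1
            i += 1
            j -= 1
        col *= 2          # each cross product occurs twice in X*X
        if i == j:        # diagonal term exists only for even k
            col += x[i] * x[i]
            muls += 1
        acc += col
        out[k] = acc & MASK   # one result word is written per column
        acc >>= W
    out[2 * n - 1] = acc
    return out, muls


if __name__ == "__main__":
    value = 0x0123456789ABCDEF
    words, muls = square_limbs(to_limbs(value, 4))
    print(from_limbs(words) == value * value, muls)
```

For n words this sketch performs n(n+1)/2 word multiplications instead of the n² needed by schoolbook multiplication, and because each column is accumulated before being stored, every result word is written exactly once.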
Proceedings of the 2008 International Computer Symposium (ICS 2008), 3 pages