# A 1.2V 500MHz 32-bit Carry-Lookahead Adder

Kuo-Hsing Cheng, Wen-Shiuan Lee and Yung-Chong Huang

Dept. of Electrical Engineering, Tamkang University, Taipei Hsien, Taiwan, R.O.C.
TEL: 886-2-26215656 Ext. 2731 FAX: 886-2-26221565 E-mail: cheng@ee.tku.edu.tw

**ABSTRACT**: In this paper, a 1.2V 32-bit carry-lookahead adder (CLA) is proposed for high-speed, low-voltage applications. The proposed adder is implemented with Non-full Voltage Swing True-Single-Phase-Clocking Logic (NSTSPC). Because the internal node of the NSTSPC circuit is non-full swing, it operates faster than conventional TSPC. Moreover, the supply voltage of the new adder is 1.2V, so the power dissipation is also reduced. The 32-bit CLA, using a 0.35um 1P4M CMOS technology with a 1.2V power supply, can operate at a 500MHz clock frequency.

## 1. INTRODUCTION

Addition is a fundamental arithmetic operation in almost any kind of processor, and improving its efficiency is a continuing research topic. High-speed adder architectures include the carry-lookahead adder [1]-[5], carry-skip adder [6]-[8], carry-select adder [9], conditional-sum adder [10], and combinations of these basic structures. A recent comparison among these adders showed that the CLA is the fastest and requires the least hardware [12].

The CLA algorithm was first introduced by Weinberger and Smith [1], and several variants have since been developed. Brent and Kung [2] opened the way to a class of carry-lookahead adders based on a binary-tree structure. Their architecture, however, requires $2(\log_2 n-1)$ logic levels. Shortening the critical path is the most common way to reduce the propagation time. We therefore use a scheme built from an extremely compact array of cells, each implementing the "•" operator [13]. Its critical path is reduced to $\log_2 n$ logic levels while the fan-out of each cell is kept to 2, as shown in Figure 1.
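The $\log_2 n$ depth mentioned above can be illustrated with a small software model. The sketch below is not the cell array of Figure 1. It assumes a Kogge-Stone-style recurrence, in which every cell combines its own (g, p) pair with the pair 1, 2, 4, ... positions below it, and it uses the "•" operator in the form given in Eq. (5). The names `dot` and `prefix_add` are hypothetical and chosen only for this illustration.

```python
def dot(hi, lo):
    """The "•" operator: (g_i, p_i) • (g_j, p_j) = (g_i + p_i g_j, p_i p_j)."""
    g_i, p_i = hi
    g_j, p_j = lo
    return (g_i | (p_i & g_j), p_i & p_j)


def prefix_add(a, b, n=32):
    """Add two n-bit integers with a parallel-prefix carry network.

    Returns (sum mod 2**n, carry_out, number_of_prefix_levels).
    Assumption: Kogge-Stone-style wiring, carry-in fixed to 0.
    """
    # Bitwise generate g = a AND b and propagate p = a XOR b.
    gp = [(((a >> i) & (b >> i)) & 1, ((a >> i) ^ (b >> i)) & 1)
          for i in range(n)]
    p_bits = [p for _, p in gp]

    levels = 0
    dist = 1
    while dist < n:
        # Each level doubles the span of bits covered by every (g, p) pair.
        gp = [dot(gp[i], gp[i - dist]) if i >= dist else gp[i]
              for i in range(n)]
        dist *= 2
        levels += 1

    # gp[i][0] is now the carry out of bit i; carries[i] is the carry into bit i.
    carries = [0] + [g for g, _ in gp]
    total = 0
    for i in range(n):
        total |= (p_bits[i] ^ carries[i]) << i
    return total, carries[n], levels


if __name__ == "__main__":
    print(prefix_add(0xFFFFFFFF, 0x00000001))  # (0, 1, 5)
```

For n = 32 the loop runs five times, which matches $\log_2 32 = 5$ prefix levels; in this wiring each intermediate result feeds at most two cells of the next level.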
In this paper, a 32-bit CLA is proposed for high-speed and low-power applications. The speed of a CLA mainly depends on the speed of the carry propagation chain. To speed up carry generation and to reduce power, a low-voltage dynamic logic, the Non-full Voltage Swing True-Single-Phase-Clocking Logic [14], is used to implement the CLA. Because the internal node of the NSTSPC has a non-full voltage swing, it is faster than conventional TSPC logic.

This paper is structured as follows: Section 2 discusses the Non-full Voltage Swing True-Single-Phase-Clocking Logic (NSTSPC) and compares it with conventional TSPC. Section 3 simulates the CLA using the NSTSPC. Section 4 gives the conclusions.

## 2. THE SCHEMATIC AND OPERATING PRINCIPLE OF THE NSTSPC

The schematic diagram of the non-full swing TSPC (NSTSPC) circuit is shown in Figure 2. The logic circuit contains two parts: (1) a non-full voltage swing NMOS dynamic logic gate; and (2) an N-type latch. The non-full voltage swing NMOS dynamic logic gate performs the logic function. The N-type latch senses the output signal of the NMOS logic gate and holds the output data at node *OUT* during the precharge phase. The operating principle of the NSTSPC circuit is as follows.

Figure 1. CLA array architecture (●: "●" operator; ○: TSPC latch)

Figure 2. NSTSPC for N-block

### 2.1 The Precharge/Hold Phase (CK = 0)

When the clock signal CK is low, the NSTSPC circuit operates in the precharge/hold phase. The node *net2* is charged to the full voltage $V_{DD}$, and the signal at node *net1* is non-full swing. The voltage of node *net1* is

$$V(net1) = V_{DD} - V_{tn} \tag{1}$$

where $V_{tn}$ is the threshold voltage of NMOS transistor *N1*. For a 1.2V supply voltage, node *net1* is precharged to 0.55V.
The PMOS transistor *P2* is turned off when the voltage of node *net1* satisfies

$$V(net1) > V_{DD} - |V_{tp}| \tag{2}$$

The turn-on voltage of *P2*, $V_{DD} - |V_{tp}|$, is below 0.45V, so the 0.55V precharge level of *net1* keeps *P2* off. Therefore, the data of the previous state is held by the N-type latch at node *OUT*.

### 2.2 The Evaluation Phase (CK = 1)

When the clock signal CK is high, the NSTSPC circuit operates in the evaluation phase. Depending on the input signals, the NSTSPC circuit has two different evaluation results: evaluation-low operation and evaluation-high operation. They are described as follows.

#### 2.2.1 Evaluation-Low Operation

If NMOS transistor *Q1* is turned off by the input signal IN during the evaluation phase, node *net2* is kept high. This turns on NMOS transistor *Q2*, which discharges the output node. Node *A* is bootstrapped by PMOS transistor *P1* to the voltage

$$V(A) = V_{DD} + \Delta V \tag{3}$$

where $\Delta V$ is the bootstrapped voltage of PMOS transistor *P1* generated by the clock signal CK. Thus the voltage of node *net1* is also bootstrapped to

$$V(net1) = V_{DD} - V_{tn} + \Delta V \tag{4}$$

This voltage is high enough to turn off PMOS transistor *P2*.

#### 2.2.2 Evaluation-High Operation

If NMOS transistor *Q1* is turned on by the input signal IN during the evaluation phase, the voltages V(net1) and V(net2) of nodes *net1* and *net2* are discharged to $V_{SS}$. Because the highest voltage of node *net1* during the previous precharge phase is only $V_{DD} - V_{tn}$, node *net1* discharges faster than node *net2*. NMOS transistor *Q2* turns off once V(net2) falls below 0.65V. Thus PMOS transistor *P2* is turned on and quickly pulls the output node *OUT* up to 1.2V.

## 3. CIRCUIT STRUCTURE AND SIMULATION OF CLA

### 3.1 CLA Structure

As shown in Figure 1, the 32-bit CLA has a parallel structure. The key element of the 32-bit CLA is the "●" operator.
The "●" operator is defined as follows:

$$(g_i, p_i) \bullet (g_j, p_j) = (g_i + p_i g_j,\; p_i p_j) \tag{5}$$

$$g = a \cdot b, \qquad p = a \oplus b \tag{6}$$

where $a$ and $b$ are the input signals.

The white cells "○" are identical to the black ones, as shown in Figure 1, but are only used as buffers to make the signal propagation uniform across the adder. Each black cell is the "●" operator, and the circuit implementations of the black cell are shown in Figures 3 to 6. By using N-P dynamic logic blocks, the 32-bit adder can be implemented as a pipelined structure.

Figure 3. The NSTSPC for N-block

Figure 4. The TSPC for N-block

Figure 5. The NSTSPC for P-block

Figure 6. The TSPC for P-block

### 3.2 CLA Cells Simulation Results

Figure 3 shows the N-block NSTSPC circuit for the CLA cell. Figure 4 shows the conventional N-block TSPC circuit for the CLA cell. Figure 5 and Figure 6 show the P-block CLA cells. The HSPICE simulation waveforms of the N-block NSTSPC and TSPC CLA cells are shown in Figure 7; the comparison shows that the NSTSPC is faster than the TSPC. The simulated waveforms of the P-block CLA cells are shown in Figure 8. The NSTSPC keeps its speed advantage there as well.

Figure 7. The simulation of NSTSPC and TSPC

Figure 8. The simulation of NSTSPC and TSPC

### 3.3 32-bit CLA Simulation Results

As mentioned earlier, the CLA cells are used to implement the "●" operator in the 32-bit CLA. To exercise the critical delay path of the carry propagation chain, we apply the pipelined inputs $(A_{31} A_{30} A_{29} \ldots A_1 A_0) + (B_{31} B_{30} B_{29} \ldots B_1 B_0)$ as follows: (000 ... 00) + (111 ... 11) and (111 ... 11) + (000 ... 01). The 32-bit CLA uses a 0.35um 1P4M CMOS technology with a 1.2V power supply. As the cell-level speed comparison shows, the NSTSPC circuit has a speed advantage over the TSPC circuit.
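The effect of the two test patterns given above, (000 ... 00) + (111 ... 11) and (111 ... 11) + (000 ... 01), on the carry chain can be checked with a short bit-serial model. This is only a sketch under the assumption of a zero carry-in; it models the carry recurrence $c_i = g_i + p_i c_{i-1}$ in software, not the circuit, and the function name `carry_chain` is hypothetical.

```python
def carry_chain(a, b, n=32):
    """Return the carry out of each bit position when adding a and b.

    Uses g = a AND b, p = a XOR b and the recurrence c_i = g_i OR (p_i AND c_{i-1}),
    with an assumed carry-in of 0.
    """
    carries = []
    c = 0
    for i in range(n):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        g = a_i & b_i       # carry generated at bit i
        p = a_i ^ b_i       # carry propagated through bit i
        c = g | (p & c)
        carries.append(c)
    return carries


if __name__ == "__main__":
    first = carry_chain(0x00000000, 0xFFFFFFFF)
    second = carry_chain(0xFFFFFFFF, 0x00000001)
    print(sum(first), sum(second))  # 0 32
```

In the first pattern every bit propagates but nothing is generated, so all carries stay at 0. In the second pattern a carry is generated at bit 0 and propagated through bits 1 to 31, so alternating the two patterns toggles every carry in the chain.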
The NSTSPC 32-bit CLA can operate at a 500MHz clock frequency with a 1.2V power supply, whereas the conventional TSPC adder cannot; its maximum is about 333MHz. The maximum operating frequency and power dissipation are compared in Table I, where the power is the dissipation at the maximum operating frequency. The normalized value Power / Max. freq. is also given. It shows that the NSTSPC dissipates less power at the same operating frequency.

|        | Max. freq. | Power dissipation | Power / Max. freq. |
|--------|------------|-------------------|--------------------|
| NSTSPC | 500MHz     | 37.98mW           | 0.07596mW/MHz      |
| TSPC   | 333MHz     | 29.96mW           | 0.08996mW/MHz      |

Table I. The comparison of power and maximum frequency

Figure 9. 1.2V 500MHz carry-lookahead adder simulation results

## 4. CONCLUSIONS

In this paper, a low-voltage, high-speed NSTSPC 32-bit CLA was designed and analyzed. The internal delay and dynamic power dissipation are reduced by the non-full voltage swing scheme. Based on the HSPICE simulation results, the NSTSPC 32-bit CLA has speed and power dissipation advantages over the conventional TSPC 32-bit adder. Thus the new adder is suitable for low-power, high-speed applications.

## 5. REFERENCES

- [1] A. Weinberger and J. L. Smith, "A logic for high speed addition," *Nat. Bur. Stand. Circ.*, 1958, pp. 3-12.
- [2] R. P. Brent and H. T. Kung, "A regular layout for parallel adders," *IEEE Trans. Comput.*, vol. C-31, 1982, pp. 280-284.
- [3] S. Waser and M. J. Flynn, *Introduction to Arithmetic for Digital System Design*, New York: CBS, 1982, ch. 3.
- [4] I. S. Hwang and A. L. Fisher, "Ultra fast compact 32-bit CMOS adder in multiple-output domino logic," *IEEE J. Solid-State Circuits*, vol. 24, 1989, pp. 358-369.
- [5] B. W. Y. Wei and C. D. Thompson, "Area-time optimal adder design," *IEEE Trans. Comput.*, vol. 39, 1990, pp. 666-675.
- [6] S. Turrini, "Optimal group distribution in carry-skip adders," in Proc.
9th Symp. Comp. Arithmetic, Sept. 1990, pp. 96-103.
- [7] P. K. Chan and M. D. F. Schlag, "Analysis and design of CMOS Manchester adders with variable carry-skip," *IEEE Trans. Comput.*, vol. 39, 1990, pp. 983-992.
- [8] A. Guyot et al., "A way to build efficient carry skip adders," *IEEE Trans. Comput.*, vol. C-36, 1987, pp. 1144-1151.
- [9] O. J. Bedrij, "Carry-select adder," *IRE Trans. Elec. Comp.*, vol. EC-11, 1962, pp. 340-346.
- [10] J. Sklansky, "Conditional-sum addition logic," *IRE Trans. Elec. Comp.*, vol. EC-9, 1960, pp. 226-231.
- [11] T. Lynch and E. E. Swartzlander, "A spanning tree carry lookahead adder," *IEEE Trans. Comput.*, vol. C-41, 1992, pp. 931-939.
- [12] T. K. Callaway and E. E. Swartzlander, "Optimizing arithmetic elements for signal processing," in *VLSI Sig. Proc.*, vol. V, K. Yao et al., Eds. New York, NY: IEEE, 1992, pp. 91-100.
- [13] D. Dozza, M. Gaddoni, and G. Baccarani, "A 3.5ns, 64 bit, carry-lookahead adder," 1996 IEEE Inter. Symp. on Circuits and Systems, vol. II, June 1996, pp. 297-300.
- [14] Kuo-Hsing Cheng and Yung-Chong Huang, "The Non-full Voltage Swing TSPC (NSTSPC) Logic Design," Proceedings of The Second IEEE Asia Pacific Conference on ASICs, Hotel Shilla Cheju, Cheju, Korea, August 28-30, 2000, pp. 37-40.