Contents lists available at SciVerse ScienceDirect







journal homepage: www.elsevier.com/locate/mejo

# LSI implementation of a low-power $4 \times 4$ -bit array two-phase clocked adiabatic static CMOS logic multiplier

Nazrul Anuar Nayan<sup>a,\*</sup>, Yasuhiro Takahashi<sup>b</sup>, Toshikazu Sekine<sup>b</sup>

<sup>a</sup> Faculty of Engineering, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia
 <sup>b</sup> Faculty of Engineering, Gifu University, Gifu, Japan


### ARTICLE INFO

Article history: Received 27 May 2011; received in revised form 27 December 2011; accepted 29 December 2011; available online 13 January 2012.

Keywords: Low-power; Adiabatic logic; Energy recovery; Multiplier

## ABSTRACT

As the density and operating speed of complementary metal oxide semiconductor (CMOS) circuits increase, dynamic power dissipation has become a critical concern in the design and development of personal information systems and large computers. Reducing the supply voltage, node capacitance, and switching activity are the common approaches used in conventional CMOS. In adiabatic switching circuits, the current flow through the transistors can be significantly reduced by ensuring uniform charge transfer over the entire available time. This paper presents simulations of this current in two-phase clocked adiabatic static CMOS logic (2PASCL) and in conventional CMOS. In SPICE simulations at transition frequencies from 1 to 12 MHz, a  $4 \times 4$ -bit array 2PASCL multiplier shows a maximum reduction in power dissipation of 77% relative to a static CMOS multiplier. The measurement results of a  $4 \times 4$ -bit array 2PASCL multiplier demonstrate a 57% reduction compared to a  $4 \times 4$ -bit array two-phase clocked adiabatic dynamic CMOS logic (2PADCL) multiplier. These results indicate that 2PASCL technology can be advantageous when applied to low-power digital devices operated at low frequencies, such as radio-frequency identification (RFID) tags, smart cards, and sensors.

© 2012 Elsevier Ltd. All rights reserved.

## 1. Introduction

Power consumption has recently become a fundamental constraint in both high-performance and portable, energy-limited systems. In conventional CMOS circuits, power dissipation occurs primarily during device switching. A sudden flow of current through the resistive channel elements results in half of the supplied energy being dissipated at each transition. Because  $E_{diss} = \frac{1}{2}C_L V_{dd}^2$  in CMOS technology, circuit designers focus on reducing  $V_{dd}$  and  $C_L$ . However, power dissipation can also be reduced by reducing the current flow into the transistors.

Low-power circuit systems achieved by implementing the concept of adiabatic switching [1] and energy recovery have been widely applied, and various energy-recovery circuits with adiabatic circuitry for ultra-low power implementation have been presented [1–12].

The essential idea of adiabatic charging is to design a circuit in which all nodes are charged or discharged at a constant current. Power dissipation is minimized by decreasing the peak current flow through the transistors, which is accomplished by using ramp-like power/clock signals [2]. The system draws some of the energy that is stored in the capacitors during a given computation step and uses this energy in subsequent computations. However, for single- and two-phase clocked circuits, diode-based families [3–9] have several disadvantages, such as output amplitude degradation and energy dissipation across the diodes in the charging path.

E-mail address: naz@eng.ukm.my (N.A. Nayan).

The pros and cons of 2PASCL compared with other proposed adiabatic logic families that are easily derived from CMOS have previously been discussed in [12]. The fundamental 2PASCL logic gates exhibit significantly lower power dissipation.

In this paper, the current flowing in the 2PASCL inverter circuit is observed and compared with that in static CMOS. SPICE simulation and measurement of a  $4 \times 4$ -bit array 2PASCL multiplier in 1.2-µm standard CMOS technology are also conducted. The 1.2-µm standard CMOS technology was chosen deliberately because this paper focuses on reducing dynamic power consumption, whereas for technologies of smaller scale than 45 nm, leakage power consumption is dominant [10]. 2PASCL technology can be advantageously applied to low-power digital devices operated at low frequencies, such as radio-frequency identification (RFID) tags, smart cards, and sensors.

The remainder of the paper is organized as follows. Section 2 describes the structure and the operation of a 2PASCL inverter. Section 3 presents a simulation of the 2PASCL inverter and the measurement results of a  $4 \times 4$ -bit array 2PASCL multiplier. Finally, Section 4 consists of the concluding remarks and recommendations for future work.

<sup>\*</sup> Corresponding author. Tel.: +60 122124974.

0026-2692/\$ - see front matter © 2012 Elsevier Ltd. All rights reserved. doi:10.1016/j.mejo.2011.12.013

## 2. 2PASCL

#### 2.1. Circuit operation

Fig. 1 shows a circuit diagram of the 2PASCL inverter alongside its CMOS equivalent. The 2PASCL implementation uses a two-phase, split-level sinusoidal power supply, wherein  $V_{\phi}$  and  $V_{\overline{\phi}}$  replace  $V_{dd}$  and ground, respectively. A circuit for generating these supply clocks was presented in [12]. The mean level of  $V_{\phi}$  exceeds that of  $V_{\overline{\phi}}$  by  $V_{dd}/2$ , as shown by Eqs. (1) and (2). The waveforms in Fig. 2 show the input, the split-level sinusoidal power supply clocks, and the output obtained from the simulation.

$$V_{\phi} = \frac{V_{dd}}{4}\sin(\omega_0 t + \theta) + \frac{3}{4}V_{dd},\tag{1}$$

$$V_{\overline{\phi}} = -\frac{V_{dd}}{4}\sin(\omega_0 t + \theta) + \frac{1}{4}V_{dd}.\tag{2}$$
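As a quick numerical check of Eqs. (1) and (2), the following Python sketch evaluates both supply clocks at a given time. The function name `split_level_clocks` and the default values (a 5.0 V $V_{dd}$ and a 20 MHz clock, taken from the values quoted elsewhere in this paper) are illustrative assumptions, not part of the original design flow.

```python
import math


def split_level_clocks(t, vdd=5.0, f_clk=20e6, theta=0.0):
    """Return (V_phi, V_phibar) in volts at time t, following Eqs. (1) and (2).

    vdd, f_clk and theta are illustrative defaults (assumptions).
    """
    s = math.sin(2.0 * math.pi * f_clk * t + theta)
    v_phi = (vdd / 4.0) * s + 0.75 * vdd       # Eq. (1)
    v_phibar = -(vdd / 4.0) * s + 0.25 * vdd   # Eq. (2)
    return v_phi, v_phibar


if __name__ == "__main__":
    period = 1.0 / 20e6
    # Sample one clock period at quarter-period steps.
    for k in range(5):
        t = k * period / 4.0
        v_phi, v_phibar = split_level_clocks(t)
        print(f"t = {t * 1e9:6.2f} ns  V_phi = {v_phi:.3f} V  V_phibar = {v_phibar:.3f} V")
```

With these defaults, $V_{\phi}$ swings between 2.5 V and 5.0 V and $V_{\overline{\phi}}$ between 0 V and 2.5 V, so each clock has a peak-to-peak amplitude of $V_{dd}/2$ and the two always sum to $V_{dd}$.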



Fig. 1. Inverter circuit implemented in (a) CMOS and (b) 2PASCL.



**Fig. 2.** Waveforms from the simulation of a 2PASCL inverter with the input  $V_X$  at 10 MHz and the power clocks  $V_{\phi}$  and  $V_{\overline{\phi}}$  at 20 MHz.

One clock is in phase while the other is inverted. By using these two split-level sinusoidal waveforms, which have peak-to-peak voltages of 2.5 V in a standard 1.2-µm CMOS process, the voltage difference between the current-carrying electrodes can be minimized. The power supply clock design still meets the requirement in [11] that the voltage between the current-carrying electrodes must be zero when the transistor switches to the on state. Consequently, power consumption is minimized. The substrates of the pMOS and nMOS transistors are connected to  $V_{dd}$  (5.0 V) and ground, respectively. From our study, the best combination of  $V_X$  and  $V_{\phi}$  (and also  $V_{\overline{\phi}}$ ) is a power-clock frequency twice the input frequency,  $f_{V_{\phi}} = 2f_{V_X}$ , as in Fig. 2. Both of the MOSFET diodes are used to recycle the charge from the output node and to improve the discharge speed of the internal signal nodes. Energy dissipation in 2PASCL is reduced by designing the charging path without diodes, so that during charging, current flows only through the transistor. Thus, the 2PASCL circuit differs from other diode-based adiabatic circuits, in which the charging current flows through both the diode and the transistor. With this circuit, a higher output amplitude and reduced energy dissipation are achievable.

In energy-recovery circuits, according to the law of energy conservation, the dissipated energy is equal to the total energy injected into the circuit,  $E_i$ , minus the energy received back from the circuit capacitance,  $E_r$ . This is shown in the energy graph in Fig. 2. Therefore, in the simulation, the power dissipated is calculated by integrating the product of voltage and current over the period of the primary input signal, *T*, and dividing by *T*, as follows:

$$P = \frac{1}{T} \int_0^T \left( \sum_{i=1}^n (V_{pi} I_{pi}) \right) dt,\tag{3}$$

where  $V_{pi}$  and  $I_{pi}$  are the voltage and current of the *i*-th power supply, and *n* is the number of power supplies [8].
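Eq. (3) can be evaluated numerically from sampled supply waveforms, for example those exported from a SPICE transient run. The sketch below uses the trapezoidal rule; the function name `average_power` and the list-based data layout are assumptions made for illustration and do not describe the authors' actual tooling.

```python
def average_power(times, supply_voltages, supply_currents):
    """Approximate Eq. (3) with the trapezoidal rule.

    times            -- sample instants covering one period T (seconds)
    supply_voltages  -- one list of samples per power supply (volts)
    supply_currents  -- one list of samples per power supply (amperes)
    A negative current sample models energy flowing back to the supply.
    """
    period = times[-1] - times[0]
    energy = 0.0
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        # Instantaneous power summed over all n supplies at both interval ends.
        p_start = sum(v[k] * i[k] for v, i in zip(supply_voltages, supply_currents))
        p_end = sum(v[k + 1] * i[k + 1] for v, i in zip(supply_voltages, supply_currents))
        energy += 0.5 * (p_start + p_end) * dt
    return energy / period


if __name__ == "__main__":
    t = [0.0, 1e-9, 2e-9, 3e-9]
    # Two hypothetical supplies: one delivering energy, one recovering part of it.
    v = [[1.0] * 4, [2.0] * 4]
    i = [[1.0] * 4, [-0.25] * 4]
    print(average_power(t, v, i))
```

Because the integrand is signed, intervals in which a supply current is negative reduce the result, which is how the recovered energy $E_r$ is accounted for.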

The circuit operation is divided into two phases, namely, *evaluation* and *hold* [12]. In the *evaluation* phase,  $V_{\phi}$  swings up and  $V_{\overline{\phi}}$  swings down. In the *hold* phase,  $V_{\overline{\phi}}$  swings up and  $V_{\phi}$  swings down. Consider the 2PASCL inverter shown in Fig. 1(b), whose operation is as follows:

- (1) Evaluation phase:
  - (a) When Y is LOW and the pMOS tree is turned ON,  $C_L$  is charged through the pMOS transistor. Hence, Y goes to the HIGH state.
  - (b) When node Y is HIGH and the nMOS is ON, discharging via M1 and M4 occurs. Hence, Y goes to the LOW state.
- (2) Hold phase:
  - (a) When the previous state of Y is HIGH and the pMOS is ON, no transition occurs.

**Fig. 3.** Simulated power dissipation of 4-inverter chains in CMOS and 2PASCL at operating frequencies from 1 to 100 MHz.

The number of dynamic switching transitions that occur during the operation of the 2PASCL circuit decreases because charging/discharging of the circuit nodes does not necessarily occur during each clock cycle. Hence, node switching activity is suppressed to a significant extent and, consequently, energy dissipation is reduced. One of the advantages of the 2PASCL circuit is that it can be made to behave as a static logic circuit. Fig. 3 compares the power dissipation of 4-inverter chains in 2PASCL and CMOS at transition frequencies from 1 to 100 MHz.

## 3. Simulation and measurement results

#### 3.1. CMOS circuits vis-a-vis 2PASCL

To model the MOSFETs in both 2PASCL and CMOS, an ideal switch is placed in series with a resistor *R* that represents the sum of the effective channel resistance of the switch and the interconnect resistance. In CMOS, when the logic level in the system is "1," there is a sudden flow of current through *R* [12]. As the power dissipation is  $p(t) = Ri^2$  and the current is  $i = C_L\,dv/dt$ , the current can be minimized by reducing the load capacitance and/or the rate of change of the supply voltage from 0 to  $V_{dd}$ .
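The effect of the supply slew rate in this switch-plus-resistor model can be illustrated with a small numerical experiment. The Python sketch below charges a capacitor through a resistor from either an abrupt step or a slow ramp and integrates $p(t) = Ri^2$ with the forward Euler method. The component values (1 kΩ, 10 fF), the ramp duration of 20 time constants, and all function names are hypothetical choices for illustration; they are not taken from the simulations reported in this paper.

```python
def dissipated_energy(source, r, c, t_end, steps=30000):
    """Energy (J) dissipated in r while c charges from 0 V, forward Euler.

    source -- function of time returning the supply voltage (V)
    """
    dt = t_end / steps
    v_c = 0.0
    energy = 0.0
    for k in range(steps):
        i = (source(k * dt) - v_c) / r   # current through the resistor
        energy += r * i * i * dt         # p(t) = R * i^2
        v_c += i * dt / c                # i = C * dv/dt
    return energy


# Hypothetical values, chosen only for illustration.
R = 1e3          # ohms
C = 10e-15       # farads
VDD = 5.0        # volts
TAU = R * C      # RC time constant
T_RAMP = 20 * TAU
T_END = 30 * TAU


def step_supply(t):
    """Conventional CMOS-like supply: full voltage applied at once."""
    return VDD


def ramp_supply(t):
    """Adiabatic-like supply: linear ramp from 0 to VDD over T_RAMP."""
    return VDD * min(t / T_RAMP, 1.0)


if __name__ == "__main__":
    e_step = dissipated_energy(step_supply, R, C, T_END)
    e_ramp = dissipated_energy(ramp_supply, R, C, T_END)
    print(f"step: {e_step:.3e} J, ramp: {e_ramp:.3e} J")
```

For the step supply the result approaches $\frac{1}{2}C_L V_{dd}^2$ regardless of *R*, whereas for the ramp the dissipation falls as the ramp is made longer relative to $RC_L$.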

Figs. 4–7 present the simulation results. Each figure shows the supply voltage, the input  $V_X$  (which oscillates at 10 MHz), and the output signal  $V_Y$  in the top graph, the drain current  $I_d$  in the second graph, the energy dissipation in the third graph, and the instantaneous power dissipation in the fourth graph.

Fig. 4 describes the conditions in the pull-up network of the 2PASCL inverter. When  $V_X$  changes from HIGH to LOW, an  $I_d$  of 144  $\mu$A flows for 2.5 ns. Then, from 2.5 to 25 ns, the circuit operates in an adiabatic manner, with a constant current of 15  $\mu$A.

In Fig. 5, the CMOS inverter simulation results during pull-up are shown for comparison. Fig. 6 shows the simulation results of the 2PASCL pull-down network, which operates in adiabatic mode. The current passing through the nMOS transistor M1 is about 15  $\mu$A. The energy dissipation depicted in the third graph also shows that only 0.63 fJ is dissipated before the recovery mode begins to operate between 30 ns and 42 ns.

Fig. 4. Simulation results of the current and energy dissipation of a 2PASCL circuit when connected to a pull-up network [13].

Fig. 5. Simulation results of the current and energy dissipation of a CMOS circuit when connected to a pull-up network [13].

**Fig. 6.** Simulation results of the current and energy dissipation of a 2PASCL circuit when connected to a pull-down network [13].

Fig. 7 describes the current flow in the nMOS portion of the CMOS logic during pull-down. As in the pull-up case, 1.5 mA of current flows to the nMOS transistor during the logic transition. The energy dissipation for CMOS, at 2.8 fJ, is about four times that of 2PASCL.

In these simulations, a significantly lower current was observed in 2PASCL than in CMOS logic, and consequently the energy consumption is reduced.

#### 3.2. $4 \times 4$ -bit array 2PASCL multiplier

As shown in Fig. 8, the  $4 \times 4$ -bit array 2PASCL multiplier consists of 16 AND gates, six full-adder logic circuits, and four half-adder logic circuits. For fabrication, 2PASCL D flip-flops are also used to capture all of the 8-bit signals when the clock is in the HIGH state. Fig. 9 shows the simulation results of the  $4 \times 4$ -bit array 2PASCL multiplier. In this simulation, 2PASCL D flip-flops are used.

**Fig. 7.** Simulation results of the current and energy dissipation of a CMOS circuit when connected to a pull-down network [13].

**Fig. 8.** Block diagram of a  $4 \times 4$ -bit array multiplier.
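The logic function of an array multiplier can be sketched behaviorally as follows. This Python model forms the 16 partial products with AND operations and accumulates them row by row with ripple-carry adder cells. It is a generic textbook arrangement written for illustration only: every adder cell is modeled as a full adder (a cell with a zero carry-in behaves as a half adder), so it does not reproduce the exact partition into full and half adders of Fig. 8, and the function names are assumptions.

```python
def half_adder(a, b):
    """Return (sum, carry) of two bits."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Return (sum, carry) of three bits, built from two half adders."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, cin)
    return s2, c1 | c2


def array_multiply_4x4(x, y):
    """Multiply two 4-bit unsigned integers using only bit-level gates."""
    xb = [(x >> i) & 1 for i in range(4)]
    yb = [(y >> j) & 1 for j in range(4)]
    # 16 AND gates: pp[j][i] is the partial product x_i AND y_j.
    pp = [[xb[i] & yb[j] for i in range(4)] for j in range(4)]

    acc = pp[0] + [0]          # running sum, 5 bits, least significant first
    product_bits = []
    for j in range(1, 4):
        product_bits.append(acc[0])   # lowest bit is final, shift it out
        upper = acc[1:]               # remaining 4 bits align with row j
        carry = 0
        row_sum = []
        for i in range(4):
            s, carry = full_adder(upper[i], pp[j][i], carry)
            row_sum.append(s)
        acc = row_sum + [carry]
    product_bits.extend(acc)          # 3 shifted-out bits + 5 bits = 8 bits
    return sum(bit << k for k, bit in enumerate(product_bits))


if __name__ == "__main__":
    print(array_multiply_4x4(13, 11))
```

Checking the model against integer multiplication for all 256 input pairs is a cheap way to validate the gate-level structure before moving to circuit simulation.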

The layout design of the  $4 \times 4$ -bit array 2PASCL multiplier is shown in Fig. 10, and an image of the fabricated chip is shown in Fig. 11. The chip was fabricated by ON-Semiconductor using a 1.2-µm standard CMOS 2-metal, 2-poly technology. The input voltage rating for this process technology is 5 V. As mentioned previously, this process technology was chosen in order to examine the dynamic power consumption rather than the leakage power dissipation.

The package is a 2.3-cm quad flat package (QFP). The chip summary is given in Table 1.

The simulation results obtained using 1.2- $\mu$ m CMOS technology for the 2PASCL and CMOS multipliers were compared to the measurement results. Fig. 12 shows the power dissipation results obtained from simulations and measurements. The transition frequency varied from 50 kHz to 5 MHz.

For the measurements, we used the following equipment: a LeCroy WaveSurfer MXs-A oscilloscope and an NF Circuit Design WF1974 function generator. A clock generator circuit (i.e., an LC oscillator) was not used.

**Fig. 9.** Simulation of a  $4 \times 4$ -bit array 2PASCL multiplier with D flip-flops using 1.2-$\mu$m CMOS technology and a transition frequency  $f_T$  of 10 MHz.

Fig. 10. Layout of the  $4 \times 4$ -bit array 2PASCL multiplier using 1.2-$\mu$m CMOS technology.

For the simulation, the power dissipation of the  $4 \times 4$ -bit array 2PASCL multiplier was compared to that of a  $4 \times 4$ -bit array CMOS multiplier and a  $4 \times 4$ -bit array 2PADCL [8] multiplier. The 2PADCL inverter is shown in Fig. 13. 2PADCL is a diode-based adiabatic logic. The main difference between 2PASCL and 2PADCL is the output voltage drop caused by the diodes. The output voltage drop of the earlier 2PADCL is  $2V_d$ , where  $V_d$  is the diode forward voltage drop, whereas that of the proposed 2PASCL is only  $V_d$ . The output noise margin of 2PASCL is therefore also improved, because the charging path passes only through the pMOS tree during a charging transition.

The results indicate a 77% reduction in power dissipation as compared to the  $4 \times 4$ -bit array CMOS multiplier. 2PASCL also shows a 57% reduction compared to the  $4 \times 4$ -bit 2PADCL multiplier.



Fig. 11. Image of the 2PASCL  $4 \times 4$ -bit array multiplier.

#### Table 1

Chip summary of the  $4 \times 4$ -bit array 2PASCL multiplier.

| Parameter                   | Value                            |
|-----------------------------|----------------------------------|
| Technology                  | 1.2-µm CMOS, 2-metal, 2-poly     |
| Power supply voltage        | 5.0 V                            |
| Core size                   | 1354 µm (W) $\times$ 997 µm (H)  |
| No. of transistors          | 992                              |
| Dynamic operating frequency | 5–20 MHz                         |
| Dynamic power dissipation   | 5.8 mW at 5 MHz                  |



Fig. 12. Power dissipation comparison of the  $4 \times 4$ -bit array 2PASCL multiplier and the  $4 \times 4$ -bit array CMOS multiplier.

The  $4 \times 4$ -bit array 2PASCL multiplier was also compared to a 2PADCL multiplier. The results indicate a 57% reduction in power dissipation relative to the 2PADCL multiplier.

Based on the simulation results, the  $4 \times 4$ -bit array 2PASCL multiplier exhibits proper logic functionality for transition frequencies of up to 20 MHz on a 1.2-$\mu$m process. We observed some signal degradation for transition frequencies above 20 MHz. This is due to the charging time *T*, which is much greater than in a conventional CMOS logic circuit. In addition, *T* is proportional to *RC*<sub>L</sub>; that is, the longer the path, the greater the required *T*. However, these input frequencies are adequate for the applications described in Section 1. Using a 0.18-$\mu$m process, the  $4 \times 4$ -bit array 2PASCL multiplier was operable up to 200 MHz [14].

Fig. 13. 2PADCL inverter.

## 4. Conclusion

The two-phase clocked adiabatic static CMOS logic (2PASCL) circuit is a novel logic family. In the present paper, simulations of the current flow through the transistors in 2PASCL and in CMOS were presented. The simulations indicate that by controlling the peak current flow through the transistors during charging and discharging, the power dissipation in the proposed adiabatic circuit can be reduced by half compared to CMOS. The simulation results of the 4 × 4-bit array 2PASCL multiplier circuit showed a 77% reduction in power dissipation compared to a CMOS multiplier. Compared to the 2PADCL multiplier, the 4 × 4-bit array 2PASCL multiplier reduced the power dissipation by 57%. The functionality, simulation, and measurement results revealed that power consumption in the 2PASCL multiplier is considerably lower than in CMOS, which is advantageous for low-power digital applications. Future work is needed to evaluate the short-circuit currents and to perform a detailed delay analysis of 2PASCL.

### Acknowledgments

The multiplier chip investigated in the present study was fabricated using the chip fabrication program of the VLSI Design and Education Center (VDEC), University of Tokyo, in collaboration with On-Semiconductor, Nippon Motorola LTD., HOYA Corporation, and KYOCERA Corporation.

#### References

- [1] W.C. Athas, L.J. Svensson, J.G. Koller, N. Tzartzanis, E.Y.-C. Chou, Low-power digital systems based on adiabatic-switching principles, IEEE Trans. Very Large Scale Integration Syst. 2 (4) (1994) 398–407.
- [2] A. Kramer, J.S. Denker, S.C. Avery, A.G. Dickinson, T.R. Wik, Adiabatic computing with the 2N-2N2D logic family, in: Proceedings of IEEE Symposium on VLSI Circuits Digest of Technical Papers, 1994, pp. 25–26.
- [3] A.G. Dickinson, J.S. Denker, Adiabatic dynamic logic, IEEE J. Solid-State Circuits 30 (3) (1995) 311–315.
- [4] Y. Moon, D.K. Jeong, An efficient charge recovery logic circuit, IEEE J. Solid-State Circuits 31 (4) (1996) 514–522.
- [5] C.L. Seitz, A.H. Frey, S. Mattison, S.D. Rabin, D.A. Speck, J.L.A. van de Snepscheut, Hot-clock NMOS, in: Proceedings of 1985 Chapel Hill Conference VLSI, 1985, pp. 1–17.
- [6] K. Takahashi, M. Mizunuma, Adiabatic dynamic CMOS logic circuit, Electron. Commun. Japan Part II 83 (5) (2000) 50–58. (IEICE Trans. Electron. J81-CII(10) (1998) 810–817).
- [7] Y. Ye, K. Roy, QSERL: Quasi-static energy recovery logic, IEEE J. Solid-State Circuits 36 (2) (2001) 239–248.
- [8] Y. Takahashi, T. Sekine, M. Yokoyama, VLSI implementation of a 4 × 4-bit multiplier in a two phase drive adiabatic dynamic CMOS logic, IEICE Trans. Electron. E90-C (10) (2007) 2002–2006.
- [9] C.-Y.A. Gong, M.-T. Shiue, C.-T. Hong, K.-W. Yao, Analysis and design of an efficient irreversible energy recovery logic in 0.18 μm CMOS, IEEE Trans. Circuits Syst. 55 (9) (2008) 2595–2607.
- [10] A. Abdollahi, F. Fallah, M. Pedram, Leakage current reduction in CMOS VLSI circuits by input vector control, IEEE Trans. Very Large Scale Integration Syst. 12 (2) (2004) 140–154.
- [11] V.I. Starosel'skii, Adiabatic logic circuits: a review, Russ. Microelectron. 31 (2002) 37–58.
- [12] N. Anuar, Y. Takahashi, T. Sekine, Two phase clocked adiabatic static CMOS logic and its logic, J. Semiconductor Technol. Sci. 10 (1) (2010) 1–10.
- [13] N. Anuar, Y. Takahashi, T. Sekine, Low-power 4 × 4-bit array two-phase clocked adiabatic static CMOS logic multiplier, in: Proceedings of the ITC-CSCC 2010, 2009, pp. 296–299.
- [14] N. Anuar, Y. Takahashi, T. Sekine, 4 × 4-bit array two-phase clocked adiabatic static CMOS logic multiplier with new XOR, in: Proceedings of the VLSI-SOC 2010, 2010, pp. 364–368.