# Low-Power Pulse Triggered Flip-Flop Design with Conditional Pulse Enhancement Method

Sudha Kousalya<sup>1</sup> & G.V.R.Sagar<sup>2</sup>

GPREC (Autonomous), Kurnool, Andhra Pradesh, India - 518007

**Abstract:** Low power has emerged as a principal theme in today's electronics industry. Over the past decade, the power consumption of VLSI chips has increased steadily. In an integrated circuit, data is passed between logic gates through flip-flops (FFs). A large portion of the on-chip power is consumed by the clock system, which consists of timing elements such as FFs and latches together with the clock distribution network. These components consume 30% to 60% of the total power in a system. To reduce power dissipation, various conventional methods have been implemented using TSPC, but they suffer from a long discharging path problem. Several methods have been proposed to shorten this discharging path. The existing signal feed-through scheme reduces the discharging problem, but its new pass transistor is exposed to the usual pitfalls of pass transistors because it charges and discharges the output directly. Its other drawbacks are a larger number of transistors and high power consumption. The proposed Low Power Pulse Triggered Flip-Flop Design with Conditional Pulse Enhanced Technique overcomes the long discharging path problem. It is composed of a pass transistor logic (PTL) based AND gate and an extra pass transistor used in the clock system. The results are based on post-layout simulation using Tanner software in CMOS 45nm technology. The proposed method improves power consumption and delay.

Index Terms: P-FFs (flip-flops), Low-Power, PTL based AND logic

#### I. Introduction

Flip-flops (FFs) are the fundamental building blocks of digital system design used in computers, communications, etc. In electronics, a flip-flop or latch is a circuit that has two stable states and can be used to store state information; a circuit that stores state in this way is described as a sequential circuit. The D flip-flop is widely used and is also known as a "DATA" or "DELAY" flip-flop. It captures the value of the D input at a definite portion of the clock cycle (such as the rising edge of the clock pulse), and that captured value becomes the Q output. At other times, the output Q does not change. The CDN (Clock Distribution Network) is one of the most power-consuming parts of an FF-based design. Because the power budget of today's portable digital circuits is severely limited, it is important to reduce the power dissipation in both clock distribution networks and flip-flops. The delay is mainly due to the clock system, and the delay of the flip-flops should be minimized for efficient performance of the circuit.

Many contemporary microprocessors selectively use master-slave and pulse-triggered flip-flops [1]. Traditional master-slave single-edge FFs, for example the transmission-gate flip-flop, are made up of two stages, one master and one slave. All hard-edged flip-flops are characterized by a positive setup time, causing large D-to-Q (data to output) delays. Alternatively, pulse-triggered FFs reduce the two stages into one stage and are characterized by the soft edge property. The pulse-triggered FF (P-FF) has been considered a popular alternative to the conventional master-slave based FFs in high-speed applications [2]-[4]. Besides the speed advantage, its circuit simplicity also helps lower the power consumption by up to 38.4% and reduce the D-Q delay [5].

Several techniques target the clock system and the internal switching of the FF. The RCSFF (Reduced Clock Swing Flip-Flop) and clock slope methods were introduced to reduce the clock system power on the CDN through a reduced clock swing and clock slope; the analysis of energy in a clock domain has shown that a smoother clock slope leads to an increase in the FF energy [6], [7]. The high-performance CDFF (Conditional Discharge Flip-Flop) reduces internal switching activity and produces fewer glitches at the output, while maintaining a negative setup time and a small D-Q delay; the CDFF method saves 39% energy [8]. The low-power CCFF (Conditional Capture Flip-Flop) reduces the delay of the internal node to obtain a statistical power reduction, and it achieves power savings of up to 61% with no impact on latency [9].

A P-FF consists of a pulse generator for generating strobe signals and a latch for data storage. The circuit complexity of a P-FF is simplified since only one latch is needed, as opposed to the two used in the conventional master-slave configuration. Depending on the method of pulse generation, P-FF designs can be classified as implicit or explicit [2], [10]. The pulse generator generates the clock pulse and combines inverters with a NAND gate [1]. This combination gives better control over the pulse width, so a very narrow pulse can be produced. The narrow pulse reduces the race-through problem and improves noise sensitivity. In an implicit-type P-FF, the pulse generator is a built-in logic of the TSPC latch design, and no explicit pulse signals are generated. In an explicit-type P-FF, the designs of the pulse generator and the latch are separate [11]. Implicit pulse generation is often considered to be more power efficient than explicit pulse generation, because the former merely controls the discharging path while the latter needs to physically generate a pulse train.
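The pulse generation described above can be illustrated with a minimal discrete-time sketch. It assumes the pulse is logically the AND of the clock with a delayed, inverted copy of itself (in gate terms, a NAND fed by the clock and an inverter chain, followed by an inverter). The function name `pulse_generator` and the sample-based delay are illustrative assumptions, not taken from the paper.

```python
def pulse_generator(clk, inverter_delay):
    """Return clk AND NOT(clk delayed by `inverter_delay` samples)."""
    pulse = []
    for i, c in enumerate(clk):
        # Delayed copy of the clock; assumed low before the first sample.
        delayed = clk[i - inverter_delay] if i >= inverter_delay else 0
        pulse.append(c & (1 - delayed))
    return pulse


# Two clock periods of 10 samples each: 5 samples low, then 5 samples high.
clk = ([0] * 5 + [1] * 5) * 2
pulse = pulse_generator(clk, inverter_delay=2)
print(pulse)
```

In this model the pulse width equals the delay of the inverter chain, which is why a short chain yields the narrow pulse that limits race-through.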

#### 1.1. Implicit Type Pulse Triggered Flip-Flop

In implicit-type flip-flops the clock pulse generation is a built-in logic, and there is no need for an external pulse generation circuit, as shown in Fig. 1. Implicit-type flip-flops consist of two parts: a pulse generator for the clock and a latch for data storage. Several low-power techniques based on a control signal can be applied to pulse-triggered flip-flops: conditional enhancement, conditional precharge, conditional capture, and conditional discharge/conditional data mapping. Because of its internal structure, the implicit type does not occupy much space and consumes less power, but it suffers from a long discharging path. As a consequence, the transistors of the pulse generation logic are often enlarged to ensure that the generated pulses are sufficiently wide to trigger the data capture of the latch.

#### 1.2. Explicit Type Pulse Triggered Flip-Flop (ep-DCO)

In an explicit-type P-FF, the pulse generator and the latch are separated, as shown in Fig. 2, and hence the power consumption is higher. Even though this type consumes more power, the separation of the pulse generation logic from the latch design gives the FF a unique speed advantage. It performs better because the height of the nMOS stack in the ep-FF is less than that in the ip-FF. Its power consumption and circuit complexity can be effectively reduced when one pulse generator is shared by a group of FFs. Hence it is suitable for low-power and high-performance designs, and we discuss only explicit-type P-FF designs here. The circuit in Fig. 2 has the drawback of higher switching power dissipation when clock and data are at logic high, but it is suitable for reducing the long discharging problem. The existing methods described in Section II were implemented to overcome this drawback.







Fig. 2. Explicit Data-Close-to-Output Flip-Flop

The pulse-triggered flip-flop (P-FF) has advantages over transmission gate (TG) and master-slave flip-flops in high-speed applications because of its single-latch design [12]. The P-FF circuit is simple to design, and its simplicity reduces the power consumption of the clock system. A P-FF consists of a pulse generator for strobe signals and a latch for data storage. On the rising edge of the clock, the data input enters the flip-flop, but the output does not reflect the input state until the falling edge of the clock. Such circuits are very sensitive to any change of the input level while the clock is still high: the input must be set up prior to the rising edge of the clock pulse and must not change before the falling edge. If the triggering pulses are sufficiently narrow, the latch acts like an edge-triggered FF. The P-FF circuit is simpler than the conventional master-slave based FF because it has only one latch instead of two [1], [12]. P-FFs also allow time borrowing across clock cycle boundaries and feature a zero or even negative setup time. Despite these advantages, the pulse generation circuitry requires delicate pulse width control to cope with possible variations in process technology and the signal distribution network. The energy consumption, delay, and area of flip-flops have been analyzed and compared in [7], [14]. The P-FF leads to a higher toggle rate for high-speed operation, a small delay and lower power consumption, but a large area is needed to implement it [12].
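The statement that a latch driven by a sufficiently narrow pulse behaves like an edge-triggered FF can be checked with a small behavioral sketch. The model below is an idealized level-sensitive latch sampled once per time step; `pulsed_latch` and the sample sequences are illustrative assumptions, not a description of any particular circuit in this paper.

```python
def pulsed_latch(d, pulse, q0=0):
    """Level-sensitive latch: Q follows D only while the pulse is high."""
    q, out = q0, []
    for d_i, p_i in zip(d, pulse):
        if p_i:
            q = d_i
        out.append(q)
    return out


d = [1, 1, 0, 0, 1, 0, 0, 1]

# Narrow pulse (one sample wide): D is captured once per pulse.
narrow = [0, 1, 0, 0, 0, 1, 0, 0]
print(pulsed_latch(d, narrow))

# Wide pulse (three samples wide): a later change of D races through.
wide = [0, 1, 1, 1, 0, 0, 0, 0]
print(pulsed_latch(d, wide))
```

With the narrow pulse the value captured at sample 1 is held until the next pulse, while the wide pulse lets the change of D at sample 2 pass to Q inside the same transparency window.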

In this paper, we present a low-power pulse-triggered flip-flop design with a conditional pulse enhanced method to shorten the long discharging path. We implement a PTL (pass transistor logic) based AND logic, constructed by connecting two transistors in parallel. The output of the AND gate is kept at logic zero most of the time, and the number of transistors in the discharging path is reduced. This method occupies less area on a chip and consumes less power with less delay.

#### II. Existing Methods

The ep-DCO flip-flop is considered one of the fastest flip-flops due to its semi-dynamic structure. Its drawback is that the internal node X is discharged on every rising edge of the clock even if the data input stays at a static 1. The existing methods described below were implemented to overcome this problem. One of them designs the P-FF with a conditional discharge method based on a control signal.



Fig. 3. Schematic of CDFF



Fig. 4. Schematic of Static CDFF

In the CDFF (Conditional Discharge Flip-Flop) method, the internal node X is held by an inverter and a pMOS transistor instead of a latch, as shown in Fig. 3. An extra nMOS transistor N3 is used as the control transistor; it is controlled by the Q-bar output signal to reduce the redundant switching power. This circuit avoids the repeated discharging when the input stays at logic '1'. The Q output is buffered to isolate the storage node from the output. The CDFF method features less switching noise generation, which is an important issue in mixed-signal circuits. Moreover, the internal node X stays HIGH (precharged) in most cases, which helps simplify the keeper structure shown in Fig. 3 and also reduces the capacitive load at node X. However, this method consumes more power than the other existing methods.
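The benefit of conditional discharge can be illustrated by counting how often node X would discharge for a given data sequence. The sketch below is a behavioral abstraction only: it assumes one data sample per rising clock edge, that an ep-DCO style node discharges whenever D is 1, and that a conditional-discharge node discharges only when D is 1 and Q-bar is 1 (Q is still 0). The function name and the data sequence are hypothetical.

```python
def count_discharges(data_at_edges, conditional, q0=0):
    """Count discharges of internal node X over a sequence of clock edges."""
    q, discharges = q0, 0
    for d in data_at_edges:
        if conditional:
            # Conditional discharge: path conducts only if D=1 and Qbar=1.
            discharges += int(d == 1 and q == 0)
        else:
            # Unconditional: X discharges on every rising edge while D=1.
            discharges += int(d == 1)
        q = d  # ideal capture: Q takes the value of D after the edge
    return discharges


data = [1, 1, 1, 1, 0, 1, 1, 0]  # mostly static 1 input
print(count_discharges(data, conditional=False))
print(count_discharges(data, conditional=True))
```

For this sequence the conditional version discharges only on the two 0-to-1 transitions, whereas the unconditional version discharges on all six edges where D is 1.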



Fig. 5. Schematic of MHLFF



Fig. 6. Schematic of TSPCFF

The Static Conditional Discharge FF (SCDFF) uses a static latch structure. A back-to-back connection of inverters forms the latch at internal node X, which avoids dynamic nodes and exempts the node from periodic precharging [1]. The D-Q delay is larger in SCDFF than in CDFF. Both methods, however, have a worst-case delay caused by the discharge path through three stacked nMOS transistors, i.e., N1-N3, as shown in Fig. 4. Overcoming this delay for better speed requires a powerful pull-down circuit, which costs extra layout area and power consumption.
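Why a taller nMOS stack slows the discharge can be seen from a first-order Elmore delay estimate. The sketch below is a textbook approximation, not the paper's analysis: it assumes identical transistors modeled as resistors `r_on`, a load capacitance `c_load` at the top of the stack, and a parasitic capacitance `c_internal` at each intermediate node. All numeric values are hypothetical placeholders.

```python
def stack_discharge_delay(n_stack, r_on, c_load, c_internal):
    """First-order 50% discharge delay through n_stack series nMOS devices."""
    # The load sees the full series resistance to ground.
    tau = n_stack * r_on * c_load
    # The k-th internal node (counted from ground) sees k resistors.
    for k in range(1, n_stack):
        tau += k * r_on * c_internal
    return 0.69 * tau


# Hypothetical device values, chosen only to show the trend.
R_ON, C_LOAD, C_INT = 10e3, 2e-15, 0.5e-15
for n in (1, 2, 3):
    print(n, stack_discharge_delay(n, R_ON, C_LOAD, C_INT))
```

Under these assumptions the delay grows faster than linearly with stack height, which is consistent with the need for a stronger (larger) pull-down network when three transistors are stacked.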

In the MHLFF (Modified Hybrid Latch Flip-Flop) method, the pulse generator is modified to use inverters and a pass transistor, as shown in Fig. 5. This method is similar to the implicit type of FF, and it employs a static latch structure [2]. Node X is no longer precharged periodically by the clock signal. In the first stage, a weak pull-up pMOS transistor P1 is controlled by the output signal Q and maintains the level of node X when Q is '0'. Despite its circuit simplicity, the MHLFF method has two drawbacks. First, since node X is not pre-discharged, a prolonged 0-to-1 delay is expected. The delay deteriorates further because a level-degraded clock pulse is applied to the discharging transistor N3. Second, node X becomes floating in certain cases, and its value may drift, causing extra dc power.

A TSPCFF (True Single Phase Clock Flip-Flop) based on a signal feed-through scheme improves the delay using a direct pass transistor [10]. This method consists of a modified TSPC latch structure incorporating a mixed design style of pass transistor and pseudo-nMOS logic. It is implemented with three major differences. First, a new pMOS transistor P1 with its gate always connected to ground is introduced to charge internal node X. Second, the pull-down network at the output is removed. Third, a new pass transistor N4, controlled by the clock pulse, is introduced between the data input and the output so that the input directly charges or discharges the output. The pull-up transistor P2 provides extra auxiliary signal driving, which reduces the data transition delay. N4 conducts only for a very short period, which limits the usual pitfalls of pass transistors. This method solves the discharging problem but consumes more power.

#### III. Proposed Method

The existing methods all encounter the same long discharging path problem. Referring to Fig. 7, the proposed P-FF with the conditional pulse enhanced method improves power consumption, area and delay compared to the previous designs. The mechanism conditionally enhances the pull-down strength, and leakage power is reduced because the transistors in the critical discharging path and in the delay inverters are shrunk. The conditional discharge not only reduces internal switching activity but also generates fewer glitches at the output. The number of transistors is reduced by using the PTL-based AND logic in the pulse generator. The proposed method is similar to the P-FF design based on a signal feed-through scheme. The gate of a weak pull-up transistor P1 is always connected to ground to keep the internal node X precharged, so the design resembles pseudo-nMOS logic.



Fig. 7. Schematic of P-FF design with conditionally pulse enhanced method

Hence the transistor P1 is always in the ON state. The gate of pass transistor N4 is controlled by the clock signal; N4 connects the data input to node Q, which is also driven by the pull-up transistor P2. On a 0-to-1 data transition, N4 provides extra driving to node Q, and on a 1-to-0 transition node Q discharges through N4. The pull-up transistor P2 provides extra auxiliary signal driving to node Q, which reduces the data transition delay. Transistor N2 acts as the control transistor: it is driven by the feedback signal Qbar so that the discharge is conditional.

The pulse generator is modified to use an inverter and a PTL (Pass Transistor Logic) based AND logic to generate the clock pulse. Transistors N3 and N5 are connected in parallel to form the AND logic, which controls the discharge transistor N6. The two inputs to the AND logic are complementary except around the clock edges, and they are connected as shown in Fig. 7. As a result, the output node is kept at zero most of the time. The node floats when both input signals equal "0", but this does not harm the circuit performance. The critical circumstance occurs only at the rising edges of the clock. When both transistors N3 and N5 are turned ON together, they pass a weak logic high to the node connected to the gate of N6, so N6 turns ON as it would with an AND gate output. This weak pulse switches transistor N6 ON for a time span equal to the delay of the inverter on the clock path, which enhances the pull-down strength. The switching power at this node is reduced because of its diminished voltage swing. The discharge control signal drives only the single transistor N6.
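The behavior of the two-transistor AND logic can be summarized as a four-row table. The sketch below assumes a common cross-coupled wiring for such a gate (one nMOS gated by input `a` passing `b`, the other gated by `b` passing `a`); this wiring is an assumption for illustration, since the exact connections are given only in Fig. 7. The string labels are likewise illustrative.

```python
def ptl_and(a, b):
    """Switch-level model of a two-nMOS pass-transistor AND node."""
    if a == 0 and b == 0:
        return "floating"  # both pass transistors are off
    if a == 1 and b == 1:
        return "weak 1"    # nMOS devices pass a degraded high level
    return "0"             # the conducting transistor passes the low input


for a in (0, 1):
    for b in (0, 1):
        print(a, b, ptl_and(a, b))
```

The two complementary input combinations both give a solid 0, which matches the statement that the node is at zero most of the time; the weak high appears only when both inputs are briefly 1 at the rising clock edge.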

The circuit operates as follows. When the clock pulse is applied to transistor N4 and no data transition occurs, N4 conducts but the output Q does not change. In that case data and Qbar are complementary and the pull-down path is off. If data transitions from 0 to 1, node X is discharged to turn on transistor P2, which pulls the output high. Direct charging and discharging through pass transistor N4 could expose the usual pitfalls of pass transistors, but the situation is different here because N4 conducts only for a very short period. On a 1-to-0 data transition, the clock pulse turns N4 ON and the output discharges through it. The loading effect on the input source is not significant, again because N4 conducts only for a very short time. Parallel conduction of the two nMOS transistors (N3 and N5) speeds up pulse generation. With this design, the number of stacked transistors along the discharging path is reduced and the transistor sizes (N1-N3, N5, and N6) are also reduced, hence the area is reduced. This speeds up pulse generation and also reduces the power consumed by the circuit.

#### IV. Simulation Results

The performance of the proposed method and the existing methods is evaluated with post-layout simulations in TSMC CMOS 45nm technology. The existing methods CDFF, Static CDFF, MHLFF and TSPCFF are compared with the proposed method. These existing methods share the same pulse generator, a combination of inverters and a NAND gate, except for MHLFF, which employs its own pulse generator as shown in Fig. 5. The transistors of the pulse generator circuit are also sized for the 45nm technology.

I. Power consumption performance of FF-methods

Table-I summarizes the performance of the flip-flops in terms of power, based on the simulation results of all FFs. The proposed method is designed with the smallest number of transistors and consumes the least average power because of the PTL-based AND structure used in the clock path.

| FFs         | No. of Transistors | Avg. Power Consumed (W) | Max. Power (W) | Min. Power (W) |
|-------------|--------------------|-------------------------|----------------|----------------|
| CDFF        | 28                 | 2.151737e-007           | 2.632299e-007  | 1.671176e-007  |
| Static CDFF | 29                 | 2.128484e-005           | 2.135757e-005  | 2.121211e-005  |
| MHLFF       | 19                 | 1.658567e-008           | 1.659122e-008  | 1.658012e-008  |
| TSPCFF      | 24                 | 1.866165e-007           | 4.801089e-009  | 7.682023e-010  |
| Proposed FF | 14                 | 1.492592e-008           | 2.856942e-008  | 1.282416e-009  |

Table-I: Comparison of Proposed FF with Existing FFs

| FFs         | D-Q delay (s) | Rise time (s) | Fall time (s) |
|-------------|---------------|---------------|---------------|
| CDFF        | 6.8973e-008   | 4.0000e-009   | 4.0000e-009   |
| Static CDFF | 9.6062e-008   | 4.0000e-009   | 4.0000e-009   |
| MHLFF       | 1.0470e-007   | 4.0000e-009   | 8.0000e-009   |
| TSPCFF      | 5.3696e-008   | 3.2000e-009   | 3.2000e-009   |
| Proposed FF | 2.9996e-007   | 4.0000e-009   | 4.7140e-009   |

Table-II: Delay and Timing Analysis

The proposed method gives the best results in terms of power compared to the four existing methods, as shown in Table-I. The AND logic used in the pulse generator contributes to this result, and the number of stacked nMOS transistors in the discharging path is also reduced when the data input is high. The area is reduced and hence the power is also reduced. The largest power is consumed by the CDFF-based methods because of their internal node structure and discharging problem. The MHLFF method also gives good results, but it has a few drawbacks that make it unsuitable for reducing the long discharge path problem.

The parameters in Table-I were obtained using T-SPICE. The power is measured between Vdd and GND by inserting the power command in T-SPICE. The PTL-based AND logic in the circuit gives the best results. The number of transistors is greatly reduced compared to the TSPCFF method, hence the area is reduced. The average power consumed by the proposed method is therefore lower than that of the TSPCFF method and close to that of the MHLFF method, as shown in the graph in Fig. 8.
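The relative average-power savings implied by Table-I can be computed directly from the tabulated values. The sketch below copies the average-power column as printed; because only ratios are taken, the result does not depend on the unit of the column. The variable names are illustrative.

```python
# Average power column of Table-I, copied as printed.
avg_power = {
    "CDFF": 2.151737e-07,
    "Static CDFF": 2.128484e-05,
    "MHLFF": 1.658567e-08,
    "TSPCFF": 1.866165e-07,
    "Proposed FF": 1.492592e-08,
}

proposed = avg_power["Proposed FF"]

# Percentage saving of the proposed FF relative to each existing FF.
savings = {
    name: 100.0 * (1.0 - proposed / power)
    for name, power in avg_power.items()
    if name != "Proposed FF"
}

for name, pct in savings.items():
    print(f"{name:12s} {pct:6.2f} % lower average power")
```

On these numbers the proposed FF has the lowest average power of the five designs, with the smallest margin against MHLFF (about 10%).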

II. Delay analysis of FF-methods

The proposed method reduces the delay compared to the TSPCFF method because fewer transistors are used. The delay, rise time and fall time are obtained using SPICE commands. The D-Q delay of the proposed method is reduced compared to the TSPCFF method but is not lower than that of the MHLFF method.
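For easier reading, the Table-II entries can be converted from exponent notation to nanoseconds. The sketch below copies the rows as printed and assumes the exponent-notation values are expressed in seconds (an assumption; under it, a rise time of 4.0000e-009 reads as 4 ns). The names `timing` and `to_ns` are illustrative.

```python
# Table-II rows as printed: (D-Q delay, rise time, fall time).
# Assumption: the exponent-notation values are in seconds.
timing = {
    "CDFF": (6.8973e-08, 4.0000e-09, 4.0000e-09),
    "Static CDFF": (9.6062e-08, 4.0000e-09, 4.0000e-09),
    "MHLFF": (1.0470e-07, 4.0000e-09, 8.0000e-09),
    "TSPCFF": (5.3696e-08, 3.2000e-09, 3.2000e-09),
    "Proposed FF": (2.9996e-07, 4.0000e-09, 4.7140e-09),
}


def to_ns(seconds):
    """Convert a time in seconds to nanoseconds."""
    return seconds * 1e9


timing_ns = {
    name: tuple(round(to_ns(v), 3) for v in row)
    for name, row in timing.items()
}

for name, (delay, rise, fall) in timing_ns.items():
    print(f"{name:12s} D-Q {delay:8.3f} ns  rise {rise:5.3f} ns  fall {fall:5.3f} ns")
```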



Fig. 8. Overall Power Analysis of FFs.



Fig. 9. Simulated waveforms of proposed method

The waveforms of the proposed method are shown in Fig. 9. The voltages are plotted for the clock signal, the D input and the Q output, displayed as (a) clock, (b) data input, (c) output Q and (d) internal node. The Q output depends on the clock and the data and is generated with a delay, as shown in Fig. 9(c). The D input signal has a period of 400ns and a pulse width of 95ns, and the clock signal has a period of 200ns and a pulse width of 95ns, as shown in Fig. 9. The rise time and fall time of these signals are 5ns. If clock and data are (1,1) the Q output is high, for (0,0) Q remains high, and for (1,0) Q stays low. The Q output deviates from the data signal because of the internal structure, as shown in Fig. 9(c). Fig. 9(d) shows that transistor N6 conducts for a very short period of time, which results in less power consumption in the proposed method. The delay, fall time and rise time of the different FFs are plotted in Fig. 10. In terms of delay, the MHLFF method reduces the delay, but whenever the data transitions from 0 to 1 there is a short-circuit problem. The proposed method greatly reduces this problem by designing the pulse generator with PTL-based AND logic.
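The stimulus described above (period, pulse width, rise and fall times) can be sketched as a trapezoidal pulse, similar in spirit to a SPICE pulse source. The sketch assumes the 95ns pulse width is the flat top between the 5ns edges and uses a normalized swing of 0 to 1, since the supply voltage is not stated; both are assumptions. All times are in nanoseconds.

```python
def pulse_source(t, period, width, rise, fall, v_low=0.0, v_high=1.0):
    """Trapezoidal pulse: rising edge, flat top of `width`, falling edge."""
    tp = t % period
    if tp < rise:
        return v_low + (v_high - v_low) * tp / rise
    if tp < rise + width:
        return v_high
    if tp < rise + width + fall:
        return v_high - (v_high - v_low) * (tp - rise - width) / fall
    return v_low


def clk(t):
    # Clock: 200 ns period, 95 ns pulse width, 5 ns rise and fall.
    return pulse_source(t, 200.0, 95.0, 5.0, 5.0)


def d_in(t):
    # Data: 400 ns period, 95 ns pulse width, 5 ns rise and fall.
    return pulse_source(t, 400.0, 95.0, 5.0, 5.0)


for t in (0.0, 2.5, 50.0, 102.5, 150.0, 250.0):
    print(t, clk(t), d_in(t))
```

Sampling at 250 ns, for example, shows the clock high in its second period while the data input is still low, one of the clock/data combinations discussed for Fig. 9.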



#### V. Conclusion

In this paper, we proposed a low-power P-FF design with a Conditional Pulse Enhanced Method. The design is implemented with a new technique at the pulse generator (clock system): a PTL-based AND logic with two transistors placed in parallel, which speeds up the clock pulse generation and reduces the number of transistors in the discharging path, hence area and power are also reduced. This flip-flop design reduces the discharging path problem. One application of this FF design is the Johnson counter.

Future Scope: In this paper we reduced the power of the circuit by avoiding the unwanted discharge of the internal node and the external node whenever the input is at a static 1. In the future, this circuit can be extended by adding clock gating. Clock gating prevents unwanted clock pulses from reaching the flip-flop whenever the input does not change.
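The effect of the clock gating mentioned above can be illustrated by counting the clock pulses that reach the flip-flop. The sketch assumes a simple data-driven scheme in which a pulse is delivered only when D differs from the stored Q; this particular gating condition is an assumption for illustration, not a design from this paper.

```python
def clock_pulses_delivered(data_at_edges, gated, q0=0):
    """Count clock pulses that reach the flip-flop over a data sequence."""
    q, delivered = q0, 0
    for d in data_at_edges:
        # Without gating every pulse is delivered; with gating only
        # pulses for which the input differs from the stored value.
        if not gated or d != q:
            delivered += 1
            q = d
    return delivered


data = [1, 1, 1, 1, 0, 0, 1, 1]
print(clock_pulses_delivered(data, gated=False))
print(clock_pulses_delivered(data, gated=True))
```

For this sequence the gated version delivers three pulses instead of eight, one per actual data change.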

#### References

- [1]. Vladimir Stojanovic and Vojin G. Oklobdzija, "Comparative analysis of master slave latches and flip flops for high performance and low power systems," IEEE Journal of Solid-State Circuits, vol. 34, no. 4, April 1999, pp. 536-548.
- [2]. S.P.Loga priya, P.Hemalatha, "Design and Analysis of low power pulse triggered flip flop," International Journal of Scientific and Research Publications, Volume 3, Issue-4, April 2013, ISSN: 2250-3153.
- [3]. Jiren Yuan and Christer Svensson, "New Single Clock CMOS Latches and Flip Flops with Improved Speed and Power Saving," IEEE Journal of Solid State Circuits, vol. 32, no. 1, Jan 1997, pp. 62-69.
- [4]. Hong-Yi Huang, Kuo-Hsing Cheng, Jim-Shyan Wang, Yuan-Hua Chu, Vain-Shun Wu, and Chung-Yu Wu, "Low voltage low power CMOS true single-phase clocking scheme with locally asynchronous logic circuits," IEEE Xplore, pp. 1572-1575.
- [5]. Yin-Tsung Hwang, Jin-Fa Lin, and Ming-Hwa Sheu, "Low-Power Pulse-Triggered Flip-Flop Design With Conditional Pulse-Enhancement Scheme," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, Feb. 2012, pp. 361-366.
- [6]. Hiroshi Kawaguchi and Takayasu Sakurai, "A Reduced Clock-swing Flip-Flop (RCSFF) for 63% Power Reduction," IEEE Journal of Solid-State Circuits, vol. 33, no. 5, May 1998, pp. 807-811.
- [7]. Massimo Alioto, Elio Consoli, and Gaetano Palumbo, "Flip-Flop Energy/Performance versus Clock Slope and Impact on the Clock Network Design," IEEE Transactions on Circuits And Systems-I: Regular Papers, vol. 57, no. 6, June 2010, pp. 1273-1286.
- [8]. Peiyi Zhao, Tarek K. Darwish, and Magdy A. Bayoumi, "High-Performance and Low-Power Conditional Discharge Flip-Flop," IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 12, no. 5, May 2004, pp. 477-484.
- [9]. Bai-Sun Kong, Sam-Soo Kim, and Young-Hyun Jun, "Conditional-Capture Flip-Flop for Statistical Power Reduction," IEEE Journal of Solid-State Circuits, vol. 36, no. 8, Aug. 2001, pp. 1263-1271.
- [10]. Jin-Fa Lin, "Low-Power Pulse-Triggered Flip-Flop Design Based on a Signal Feed-Through Scheme," IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 1, Jan. 2014, pp. 181-185.
- [11]. Jyothi Bandi, K. Rakesh, "Explicit Pulse Triggered Flip Flop Design based on a Signal Feed-Through Scheme," International Journal of Soft Computing and Engineering (IJSCE) ISSN: 2231-2307, vol. 4, Issue-5, Nov. 2014.
- [12]. Elio Consoli, Massimo Alioto, Gaetano Palumbo, Jan Rabaey, "Conditional Push-Pull Pulsed Latches with 726fJ·ps Energy-Delay Product in 65nm CMOS," IEEE Solid State Circuits Conf., Feb. 2012, pp. 482-483.
- [13]. Bo Wang, Jun Zhou, Kah Hyong Chang, Minkyu Je and Tony T. Kim, "A 0.18V Charge-Pumped DFF with 50.8% Energy-Delay Reduction for Near/Sub-threshold Circuits," IEEE Asian Solid-State Circuits Conf., (A-SSCC), April 2013, pp. 121-124.
- [14]. Massimo Alioto, Elio Consoli, and Gaetano Palumbo, "Analysis and Comparison in the Energy-Delay-Area Domain of Nanometer CMOS Flip-Flops: Part I-Methodology and Design Strategies," IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 19, no. 5, May 2011, pp. 725-736.
- [15]. Fabian Klass, Chaim Amir, Ashutosh Das, Kathirgamar Aingaran, Cindy Truong, Richard Wang, Anup Mehta, Ray Heald, and Gin Yee, "A new family of semi dynamic and dynamic flip flops with embedded logic for high performance processors," IEEE J. Solid-State Circuits, vol. 34, no. 5, May 1999, pp. 712-716.
- [16]. Peiyi Zhao, Jason B. McNeely, Pradeep K. Golconda, Soujanya Venigalla, Nan Wang, Magdy A. Bayoumi, Weidong Kuang, and Luke Downey, "Low-Power Clocked-Pseudo-NMOS Flip-Flop for Level Conversion in Dual Supply Systems," IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 17, no. 9, Sep. 2009, pp. 1196-1202.