CMOS Logic Structures

  •  Full complementary static CMOS gates may be undesirable because:
  •  Their area overhead may be too large.
  •  Their speed may be too slow.
  •  The function may not be feasible as a full complementary structure (e.g. PLA).

  •  Smaller, faster gates can be implemented at the cost of:
  •  Increased design time.
  •  Increased operational complexity.
  •  Decreased operational margin.

  •  Full complementary gates can be designed as ratioless circuits:
  •  A fixed ratio in size between pull-up and pull-down structures is not required for proper operation.

  •  This is unlike the ratioed structures considered next.

CMOS Logic Structures

  •  Pseudo-nMOS logic
  •  The gain ratio of the n-driver transistors to the p-transistor load (β_driver/β_load) is important to ensure correct operation.
  •  This is accomplished by ratioing the n and p transistor sizes.
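To make the role of the driver-to-load gain ratio concrete, the sketch below estimates the output-low voltage of a pseudo-nMOS inverter. It is a first-order illustration only: it assumes long-channel square-law devices, a grounded-gate p-load in saturation, an n-driver in its linear region, and hypothetical values for the supply and threshold voltages (5 V and 0.8 V). None of these numbers come from the slides.

```python
import math

def pseudo_nmos_vol(beta_ratio, vdd=5.0, vtn=0.8, vtp_abs=0.8):
    """Estimate V_OL of a pseudo-nMOS inverter (square-law model).

    beta_ratio is beta_driver / beta_load. The defaults for vdd, vtn and
    vtp_abs are assumed values for illustration only.

    Equating the linear-region driver current with the saturated load current:
        beta_n * ((vdd - vtn) * vol - vol**2 / 2) = (beta_p / 2) * (vdd - vtp_abs)**2
    and taking the smaller root of the resulting quadratic in vol.
    The result is only meaningful while vol < vtp_abs (load still saturated).
    """
    if beta_ratio <= 0:
        raise ValueError("beta_ratio must be positive")
    overdrive_n = vdd - vtn
    disc = overdrive_n ** 2 - (vdd - vtp_abs) ** 2 / beta_ratio
    if disc < 0:
        raise ValueError("ratio too small: the driver cannot pull the output low in this model")
    return overdrive_n - math.sqrt(disc)

# A larger driver-to-load ratio gives a lower (better) output-low level.
for ratio in (4, 8, 16):
    print(f"beta_driver/beta_load = {ratio:2d}: V_OL = {pseudo_nmos_vol(ratio):.2f} V")
```

The trend is the point rather than the absolute numbers: because the p-load is always on, the output low level is set by the size ratio, which is why this style is called ratioed logic.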

CMOS Logic Structures

  •  Dynamic CMOS Logic
  •  Pull-up time improved by virtue of the active switch (p-transistor can be much larger).
  •  Pull-down time increased due to the ground switch.

CMOS Logic Structures

  •  Dynamic CMOS Logic
  •  What is wrong with cascading these structures?
  •  (Hint: Consider the delay in the discharge of the left-most n-logic block at the start of the evaluate phase).

CMOS Logic Structures

  •  CMOS Domino Logic:
  •  These structures can be cascaded.
    •  In a cascaded set of logic blocks, each stage evaluates and causes the next stage to evaluate (in the same way a line of dominoes falls).

CMOS Logic Structures

  •  Pass-Transistor Logic:

CMOS Logic Structures

  •  Other forms of CMOS logic include:
  •  BiCMOS Logic
  •  Clocked CMOS Logic (C²MOS).
  •  NP Domino Logic (Zipper CMOS).
  •  Cascade Voltage Switch Logic (CVSL).
  •  Source Follower Pull-up Logic (SFPL).
  •  (See Weste and Eshraghian for details.)

  •  Where should one use what gate?
  •  Complementary: Best option for most cases. Safe, fast, no DC power.
  •  Pseudo-nMOS: Large fan-in NOR gates, e.g. PLAs, ROMs. Dissipates DC power.
  •  Transmission gate: Speed advantage, good for complex boolean functions.
  •  CMOS domino logic: Low-power, high speed. Requires simulation!

Clocked Systems

  •  The majority of VLSI systems are finite state machines and pipelined machines:

Clock Strategy

  •  One of the most important decisions made at the start of a design is the selection of a clocking strategy.

  •  It affects:
  •  How many transistors are used per storage element.
  •  How many clock signals need to be routed throughout the chip.

  •  Topics:
  •  Latch, Master-Slave Flip-flop and Edge-Triggered Flip-flop designs.
  •  Setup and Hold time and clock race conditions.
  •  CMOS Static and Dynamic Flip-flops.
  •  Single phase clocking, clock skew/slew.
  •  Two-phase clocking techniques.
  •  Clock generation techniques.

Latches and Flip-flops

  •  The ambiguity of a non-allowed mode, caused by both trigger pulses going active simultaneously, can be avoided by adding two feedback lines:
  •  Note that if both J and K are high when the clock pulses, the output is complemented.
    •  However, the complemented output then enables the other input, and the FF oscillates.
  •  This places stringent constraints on the clock pulse width (e.g. it must be less than the propagation delay through the FF).

  •  Synchronous circuit:
    •  Changes in the output logic states of all FFs in a design are synchronized with the clock signal, phi.

Latches and Flip-flops

  •  Note that the:
  •  T FF (toggle FF) is a special case of the JK with J and K tied together.
  •  D FF (delay FF) is a special case with J and K driven by complementary values of the D input (J = D, K = NOT D).
    •  Here the D FF generates a delayed version of the input signal, synchronized with the clock.
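The relationship between the three flip-flop types can be sketched as a small behavioral model. This is an illustration of the next-state logic only, evaluated once per active clock edge; it ignores timing, and the function names are hypothetical.

```python
def jk_next(q, j, k):
    """Next state of a JK flip-flop after one active clock edge (0/1 values)."""
    if j and k:
        return 1 - q   # both high: complement the output
    if j:
        return 1       # set
    if k:
        return 0       # reset
    return q           # both low: hold

def t_next(q, t):
    """T (toggle) FF: a JK with J and K tied together."""
    return jk_next(q, t, t)

def d_next(q, d):
    """D (delay) FF: a JK with J = D and K = NOT D."""
    return jk_next(q, d, 1 - d)

# The D FF simply copies D, whatever the current state is.
for q in (0, 1):
    for d in (0, 1):
        print(f"q={q} d={d} -> next q={d_next(q, d)}")
```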

  •  These FFs are also called latches.
    •  An FF is a latch if the gate is transparent while the clock is high (low).
    •  Any changes in the input are reflected in the output after a nominal delay.

  •  The transparent nature can cause race problems:

Master-Slave Flip-flops

Master-Slave Set/Clear Asynchronous FFs

Edge-triggered FFs

  •  Problem with master-slave approach:
    •  The circuit is sensitive to changes in the input signals as long as phi is high.
    •  If the inputs do not remain constant while the clock is high, the master follows D, which, for example, consumes power.
  •  The fix is to allow the state of the FF to change only at the rising (falling) edge of the clock.

Edge-triggered FFs

  •  The modification applied to the JK FF is shown below.
  •  Note that the inputs must be stable for some time before the clock goes low.

  •  This is also true for the master-slave D FF, but the constraints are different.

  •  Let's first define some terms.

Flip-flop Timing Definitions

  •  Timing diagram showing the terms used to define the proper operation of a Flip-flop.
  •  Tc: Clock cycle time.
  •  Ts: Setup time. The amount of time before the clock edge that the D input has to be stable.
  •  Th: Hold time. Data has to be held for this period while the clock travels to the point of storage.
  •  Tq: Clock-to-Q delay. Delay from the positive clock edge to the new value of Q.

Setup/Hold Time Violations

  •  Depending on the design, one or both of Ts and Th may have to be non-zero.
    •  For example, the master-slave D FF is likely to require a longer setup time than the edge-triggered D FF.
  •  The edge-triggered FF prevents the "master" from following the D input, so the FF's internal delay does not affect the setup time.

Setup/Hold Time Violations

  •  The hold time interval starts with the beginning of the clock transition.
    •  Clock skew and slew and other design details of the FF affect the hold time.


  •  Toggle Flip-Flop with Asynchronous Clear:

System Timing

  •  Two possible strategies to implement clocked systems:
  •  Latches are a more economical implementation strategy but are transparent on half of the clock cycle, and cannot be used in feedback systems.
  •  Also, the following constraint must be met for latches:
    •  Td < Tc/2 - Tq - Ts
    •  where Td is the worst-case propagation delay, Tc is the clock cycle time, Tq is the clock-to-Q time of latch A, and Ts is the setup time of latch B.
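The latch constraint above can be turned into a quick budget check. The sketch below is a direct transcription of the inequality; the function names and the example numbers (in nanoseconds) are hypothetical.

```python
def max_logic_delay(t_c, t_q, t_s):
    """Upper bound on the worst-case logic delay Td between two latches.

    Rearranged from Td < Tc/2 - Tq - Ts. All arguments share one time unit.
    """
    return t_c / 2 - t_q - t_s

def latch_path_ok(t_d, t_c, t_q, t_s):
    """True if the logic delay Td satisfies the latch timing constraint."""
    return t_d < max_logic_delay(t_c, t_q, t_s)

# Assumed example values in ns: 10 ns cycle, 1 ns clock-to-Q, 0.5 ns setup.
budget = max_logic_delay(t_c=10.0, t_q=1.0, t_s=0.5)
print(f"logic delay budget: {budget} ns")
print("3 ns path ok:", latch_path_ok(3.0, 10.0, 1.0, 0.5))
print("4 ns path ok:", latch_path_ok(4.0, 10.0, 1.0, 0.5))
```

Only half of the clock cycle is available to the logic, so the latch overheads Tq and Ts consume a larger share of the budget than they would with a full-cycle scheme.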

Clock Race Conditions

  •  A clock race occurs when the data input to a register does not obey the setup and hold-time constraints.

  •  Delays in the clock line to Reg B (hold-time violation).
    •  New data stored instead of previous data:

Clock Race Conditions

  •  Delays in the combinational logic that are larger than the clock cycle time (setup violation).
    •  Data arrives late at Reg B, old data retained instead of latching new data.
  •  As you can see, designers have to walk a temporal tightrope: they have to minimize clock skew while considering both the worst-case and best-case delays through the combinational logic.
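The two failure modes can be checked numerically with slack calculations. The sketch below assumes edge-triggered registers A and B, a clock skew defined as the clock arrival time at Reg B minus that at Reg A, and the usual first-order setup and hold inequalities. These inequalities and all names are assumptions for illustration; they are not taken from the slides.

```python
def check_register_path(t_c, t_q, t_s, t_h, t_d_min, t_d_max, skew=0.0):
    """Return (setup_slack, hold_slack) for a Reg A -> logic -> Reg B path.

    Assumed first-order model (all times in the same unit):
      setup: t_q + t_d_max + t_s <= t_c + skew
      hold:  t_q + t_d_min       >= t_h + skew
    A negative slack means the corresponding constraint is violated.
    """
    setup_slack = (t_c + skew) - (t_q + t_d_max + t_s)
    hold_slack = (t_q + t_d_min) - (t_h + skew)
    return setup_slack, hold_slack

# Assumed example values in ns.
base = dict(t_c=10.0, t_q=1.0, t_s=0.5, t_h=0.5, t_d_min=0.2, t_d_max=6.0)

print("nominal:        ", check_register_path(**base))
# A late clock at Reg B eats into the hold margin (new data overwrites old).
print("1 ns skew:      ", check_register_path(**base, skew=1.0))
# Slow logic eats into the setup margin (old data retained).
print("9 ns logic path:", check_register_path(**{**base, "t_d_max": 9.0}))
```

Note that positive skew helps setup and hurts hold, which is why both the worst-case and the best-case logic delays have to be considered.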

CMOS Static Flip-Flops

  •  The full complementary version of the master-slave FF requires 38 transistors!

CMOS Dynamic Flip-Flops

  •  Positive feedback is not the only means to implement a memory function.
    •  A capacitor can act as a memory element as well.

  •  In this case, a periodic refresh is required (in the millisecond range) due to leakage (hence the word dynamic).

  •  Consider the following "cheaper" (1/2 transmission gate) positive level-sensitive latch as a step toward deriving a dynamic FF:

CMOS Dynamic Flip-Flops

  •  A master-slave FF is created by cascading two of these latches and reversing the clocks.
  •  The problem with this latch is that phi 1 and its complement, phi 1-bar, might overlap, which may cause two types of failures:
  •  Node A can become undefined as it is driven by both D and B when phi 1 and phi 1-bar are both high.
  •  D can propagate through both the master and slave if both phi 1 and phi 1-bar are high simultaneously for a long enough period.

Single Phase Clock Skew/Slew

  •  Clock skew causes conflicts and transparency.

  •  Clock slew (slow rise and fall times) can also cause transparency:
  •  Clock skew is a dominant problem in current high performance designs.

CMOS Dynamic Two-Phase Flip-Flops

  •  Pseudostatic FF: The fix is to use two non-overlapping clocks, phi 1 and phi 2:
  •  A large non-overlap time t_phi12 allows proper operation even in the presence of clock skew.
  •  Note that node A floats (dynamic) during the time period t_phi12 but is driven during t_phi1 and t_phi2.
    •  Hence the name pseudostatic.

CMOS Dynamic Two-Phase Flip-Flops

  •  This version is simpler (6 transistors) and is often used in pipelined datapaths for microprocessors and signal processors.
  •  Disadvantage: 2 non-overlapping clocks are required (4 if transmission gates are used).
  •  These implementations MUST be simulated at all process corners (under worst-case conditions).

Two-Phase Clocking

  •  Clock skew/slew:

CMOS Dynamic Two-Phase Flip-Flops

  •  C²MOS: A clever method which is insensitive to clock skew:

CMOS Dynamic Two-Phase Flip-Flops

  •  C²MOS is insensitive to overlap as long as the rise and fall times of the clock edges (clock slew) are sufficiently small:

C²MOS Flip-Flop

  •  Races are just not possible since the overlaps activate either the pull-up or the pull-down networks but never both simultaneously.
    •  The inverters force 0-1 and 1-0 propagation modes only.

  •  However, if the rise and fall times of the clock are slow, there exists a time slot in which both n- and p-transistors are conducting simultaneously.

  •  Correct operation requires the rise/fall times be smaller than about 5 times the propagation delay through the FF.

  •  This is not hard to meet in practical designs, making C²MOS especially attractive in high-speed designs where avoiding clock overlap is hard.

  •  Many other latch configurations, static and dynamic, are possible; see Weste and Eshraghian.

Single Phase Local Clock Generation

  •  Clock skew is minimized but area cost is severe.

Single Phase Global Clock Generation

  •  Transistors in the inverter and pass gate should be similar in size.
    •  Keep them small and use buffers to drive the load.

  •  Note: The routing load MUST also be balanced on each of the clk lines.

Two-phase Global Clock Generation

Multi-Phase Clocking

  •  Four-phase clocking strategies are discussed in Weste and Eshraghian.

  •  Modern designs tend to minimize the number of clock phases used, due to the problem of generating and distributing multiple clocks.

  •  Single-phase schemes are used for complex, high-speed CMOS circuits.

Clock Distribution

  •  Assume all the registers in a large CMOS design result in a capacitive load of 2000 pF. What is the peak current and average dynamic power?
  •  Two techniques:
  •  A single large buffer (cascaded inverters): Use when the design has a large number of diverse modules, e.g. a microprocessor.
  •  A distributed clock tree: Use when the design is highly structured and repetitive, e.g. a datapath.
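The peak-current and power question for the 2000 pF clock load can be worked through with the first-order relations I = C·dV/dt and P = C·Vdd²·f. The slide does not state the supply voltage, clock period, or clock rise time, so the values used below (5 V, 10 ns, 1 ns) are assumptions chosen only to illustrate the arithmetic.

```python
def clock_peak_current(c_load, vdd, t_rise):
    """Peak charging current, I = C * dV/dt, assuming a linear ramp of vdd in t_rise."""
    return c_load * vdd / t_rise

def clock_dynamic_power(c_load, vdd, freq):
    """Average dynamic power, P = C * Vdd^2 * f (one charge/discharge per cycle)."""
    return c_load * vdd ** 2 * freq

C_LOAD = 2000e-12   # 2000 pF, from the slide
VDD = 5.0           # assumed supply voltage (V)
T_CYCLE = 10e-9     # assumed clock period (s), i.e. 100 MHz
T_RISE = 1e-9       # assumed clock rise/fall time (s)

i_peak = clock_peak_current(C_LOAD, VDD, T_RISE)
p_avg = clock_dynamic_power(C_LOAD, VDD, 1.0 / T_CYCLE)
print(f"peak current:  {i_peak:.1f} A")
print(f"average power: {p_avg:.1f} W")
```

Under these assumed values the clock network alone draws on the order of 10 A at each edge and dissipates about 5 W, which is why the choice of clock distribution technique matters.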