UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Techniques for clock skew reduction over intra-die process, voltage, and temperature variations Mueller, Jeff


Synchronous clock distribution continues to be the dominant timing methodology for very large scale integration circuit designs. As processes shrink, clock speeds increase, and die sizes grow, an increasingly larger percentage of the clock period is being lost to skew and jitter. New techniques to reduce clock skew and jitter must be deployed to utilize the faster clock frequencies possible with future process technologies, especially in the presence of on-chip process-voltage-temperature (PVT) variations. This dissertation first proposes a pre-silicon design modification to symmetric clock buffers of traditional clock distribution networks. Specifically, clock performance is improved by targeting the critical clock edge (the edge activating rising edge-triggered flip-flops) while relaxing the requirements of the non-critical edge. This system uses alternating, asymmetric clock buffers to focus inverter resources on only one edge of the clock pulse; hence, it is called Single Edge Clocking (SEC). A novel re-design of the traditional clock buffer is proposed as a drop-in replacement for existing clock distribution networks, yielding timing performance improvements of over 20% in latency and skew, and up to 30% in jitter; alternatively, these timing advantages could be traded off to reduce clock buffer area and power by 33% and 12%, respectively. PVT variations, especially intra-die, increasingly upset the distribution of a synchronized clock signal, even in properly balanced clock tree networks. Hence, active clock deskewing becomes necessary to tune out unwanted clock skew after chip fabrication. This dissertation proposes a post-silicon autonomous deskewing system using tunable buffers to dynamically reduce clock skew. The operation of a specially designed phase detector is described, and four such phase detectors are used to construct a stable, autonomously locking Quad Ring Tuning (QRT) configuration that effectively links together four distributed Delay-Locked Loops (DLLs) without the need for any system-level controller. This cyclic, unidirectional, self-controlled, quad-DLL ring tuning technique is then implemented hierarchically to dynamically adjust clock signal delays across an entire chip during normal circuit operation. A simple form of the two-level QRT system is presented for a generic H-tree distribution network, demonstrating stable locking behavior and more than 50% average reduction in full-chip clock skew.

Item Media

Item Citations and Data


Attribution-NonCommercial-NoDerivatives 4.0 International