Open Collections

UBC Theses and Dissertations

Application of active inductors in high-speed I/O circuits Lee, Yen-Sung Michael 2008

Application of Active Inductors in High-Speed I/O Circuits

by Yen-Sung Michael Lee
B.Eng., Carleton University, 2006

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Applied Science in THE FACULTY OF GRADUATE STUDIES (Electrical and Computer Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

October 2008

© Yen-Sung Michael Lee, 2008

ABSTRACT

This thesis explores the use of active inductors as a compact alternative to the bulky passive spiral structures in high-speed I/O circuits. A newly proposed PMOS-based topology is introduced and used in active-inductor terminations. The first prototype design, fabricated in a 90-nm CMOS process, consists of an output driver that uses active-inductor terminations to provide channel equalization and output impedance matching. Measurement results show that, compared with the case in which the active inductor is disabled, enabling the active inductors in the termination doubles the vertical eye opening at the receiver side and reduces the peak-to-peak jitter by 30% for a transmitted 10 Gb/s 2^31−1 pseudo-random binary sequence pattern over a 6-inch FR4 channel. Output impedance matching with S22 below -10 dB over a bandwidth of 20 GHz is achieved. The pair of active-inductor terminations occupies 17 × 25 µm² and has a low overhead power consumption of 0.8 mW.

In the second prototype design, a 4-stage output buffer with active-inductor loads is designed and implemented in a 65-nm CMOS process. Simulation results verify that when operating at 31.25 Gb/s, the output eye of the active-inductor load buffer compares favorably with that of the passive-inductor load buffer. For a similar eye height and 78% less timing jitter, the active-inductor load design (31.25 Gb/s) is 25% faster than the passive-resistor load design (25 Gb/s).
The active-inductor load output buffer achieves performance comparable to other reported designs using passive inductors in terms of speed, power, and output swing. Its total area is 135 × 30 µm² (including three differential active inductors), which is comparable to the size of a single passive spiral inductor with an inductance of 0.5 to 1 nH.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgment
1 Introduction
  1.1 Motivation
  1.2 Research Objectives
  1.3 Thesis Organization
2 Background
  2.1 Inductive Peaking in High-Speed I/O
  2.2 Inductor Implementation
  2.3 Basics of Active-Inductor Shunt Peaking
  2.4 Design Challenges
    2.4.1 Voltage Headroom
    2.4.2 Nonlinearity
    2.4.3 Noise
  2.5 Eye Diagram
3 Prototype Design I
  3.1 Inductive Peaking Termination
  3.2 PMOS-based Active Inductor
    3.2.1 Circuit Topology
    3.2.2 Small-Signal Analysis
    3.2.3 Large-Signal Operation
    3.2.4 Linearity Enhancement
  3.3 Output Driver with Active Inductor Load
    3.3.1 Full Schematic
    3.3.2 Biasing Voltage
    3.3.3 Testing Consideration
  3.4 Experimental Results
    3.4.1 Driver Frequency Response
    3.4.2 Channel Frequency Response
    3.4.3 Eye-Diagram Measurement
    3.4.4 Output Matching Measurement
4 Prototype Design II
  4.1 Output Buffer
    4.1.1 Introduction
    4.1.2 Design Consideration
    4.1.3 Architecture
  4.2 Active-Inductor Load
    4.2.1 Topology
    4.2.2 Biasing Voltage
  4.3 Design Procedure
    4.3.1 Main Driver
    4.3.2 Pre-Driver
    4.3.3 Tail Current
    4.3.4 Active-Inductor Load
  4.4 Simulation Results
    4.4.1 DC Transfer Characteristics
    4.4.2 Frequency Response
    4.4.3 Step Response
    4.4.4 Eye-Diagram
5 Conclusions
  5.1 Conclusion
  5.2 Contributions
  5.3 Future Work
References

LIST OF TABLES

Table 4.1. Summary of the eye-diagram simulation
Table 4.2. Performance summary of the output buffer designed with active-inductor

LIST OF FIGURES

Fig. 1.1. Trend in microprocessor I/O bandwidth.
Fig. 1.2. Frequency responses of backplane channels.
Fig. 2.1. Shunt peaking in a CS amplifier.
Fig. 2.2. Effect of inductive peaking on magnitude response.
Fig. 2.3. Inductive (RL) terminations used for equalization.
Fig. 2.4. Conventional active inductor: (a) schematic and (b) small-signal model.
Fig. 2.5. Active-inductor load: (a) more detailed model and (b) its passive RLC equivalent.
Fig. 2.6. (a) Conventional active inductor with VHIGH (b) folded active inductor.
Fig. 2.7. Construction of an eye diagram (a) a data stream (b) overlapping each bit-time segment of the data stream to form a data eye.
Fig. 2.8. Illustration of trailing edges and the effect of ISI.
Fig. 3.1. Doubly terminated I/O blocks with inductive peaking terminations.
Fig. 3.2. Schematic of the proposed active-inductor termination circuit.
Fig. 3.3. Simulated Rs and the corresponding S22 vs. ID.
Fig. 3.4. Full schematic of the CML output driver with PMOS-based active-inductor loads.
Fig. 3.5. On-chip reference voltage circuits.
Fig. 3.6. Ideal design with both transmit-side and receive-side terminations.
Fig. 3.7. Actual fabricated chip with only transmit-side termination.
Fig. 3.8. Die micrograph.
Fig. 3.9. Simulated frequency response of the output driver with active inductance turned off and turned on while sweeping VG2.
Fig. 3.10. Test setup for the measurement of channel frequency response.
Fig. 3.11. Channels used for measuring the eye-diagrams at the receiver side.
Fig. 3.12. Measured responses of a 6-inch FR4 trace and a 4-m RG-58/U cable.
Fig. 3.13. Test setup for eye-diagram measurements.
Fig. 3.14. Received eye-diagrams for a 5 Gb/s 2^31−1 PRBS pattern through a 4-m RG-58/U cable (a) active inductance off (b) active inductance on.
Fig. 3.15. Received eye-diagrams for a 10 Gb/s 2^31−1 PRBS pattern through a 6-inch FR4 trace (a) active inductance off (b) active inductance on.
Fig. 3.16. Test setup for S22 measurements.
Fig. 3.17. Measured S22 of the output driver with different DC current in the active-inductor load.
Fig. 4.1. Output buffer architecture.
Fig. 4.2. Topology of the NMOS-based active-inductor load.
Fig. 4.3. gm1 vs. VGS1, (W1/L1) = (12 µm/60 nm).
Fig. 4.4. Output buffer with transistor sizes, tail currents, CPAD and Rs and L values of active-inductor loads labeled.
Fig. 4.5. Three-stage pre-driver with active-inductor loads and transistor sizes, tail currents and Rdeg values labeled.
Fig. 4.6. Layout of the output buffer with active-inductor loads.
Fig. 4.7. Input-output characteristic (pre-driver with active-inductor loads).
Fig. 4.8. ID vs. Vin_diff (pre-driver with active-inductor loads).
Fig. 4.9. Differential Gm vs. Vin_diff (pre-driver with active-inductor loads).
Fig. 4.10. Frequency responses of the entire pre-drivers and 1st-stages designed using different loads.
Fig. 4.11. Step response (rising) of the 1st-stage pre-driver.
Fig. 4.12. Step response (falling) of the 1st-stage pre-driver.
Fig. 4.13. Step response (rising) of the entire pre-driver.
Fig. 4.14. Step response (falling) of the entire pre-driver.
Fig. 4.15. Eye-diagram of the output buffer using active-inductor loads at 31.25 Gb/s.
Fig. 4.16. Eye-diagram of the output buffer using passive-inductor loads at 31.25 Gb/s.
Fig. 4.17. Eye-diagram of the output buffer using passive-resistor loads at 31.25 Gb/s.
Fig. 4.18. Eye-diagram of the output buffer using passive-resistor loads at 25 Gb/s.

LIST OF ABBREVIATIONS

BER: Bit-Error Rate
CML: Current-Mode Logic
CS: Common-Source
DEMUX: Demultiplexer
DSM: Deep Submicron
FO: Fanout
I/O: Input/Output
ISI: Inter-Symbol Interference
MUX: Multiplexer
PRBS: Pseudo-Random Binary Sequence
PVT: Process, Voltage, and Temperature
UI: Unit Interval

ACKNOWLEDGMENT

I would like to express my deepest gratitude to my advisor, Prof. Shahriar Mirabbasi, without whom I would not have had the opportunity to come to UBC and this work* would not have been possible. I thank him for his support, guidance, and immense patience and understanding.

* This work is supported in part by Intel Corporation and the Natural Sciences and Engineering Research Council of Canada (NSERC). CAD tool support is provided by Canadian Microelectronics Corporation (CMC Microsystems).

I am also hugely indebted to my senior lab colleague, Samad Sheikhaei, who has played an invaluable role in my work. Samad often spent far more time than he had available working with me and helping me through the many obstacles that I encountered in this work. Without his input, it would have been much more difficult, if possible at all, to obtain my chip measurements, to submit the conference paper that we worked on together before the deadline, and to submit this thesis to the committee in time.

I would like to thank our test lab manager, Dr. Roberto Rosales, for teaching me how to use the test equipment and for always being so generous in offering help whenever I needed it. I want to thank Prof. David Michelson and Prof. Alireza Nojeh for agreeing to serve on my defense committee and for providing me with their valuable comments.

I was lucky that my Master's program overlapped with that of Daryl Van Vorst, who is also in Shahriar’s group. Daryl has broad and in-depth knowledge of RF design and electrical engineering in general. I usually went to him when I needed someone with an industrial background to discuss things with.
I want to thank him for reading my thesis and for the many useful comments he gave me on it. He was also the one who helped me build the FR4 board for channel measurements. I would also like to thank Li Chen (Nanoelectronics Lab) for explaining the device questions that I came across in this work, and Ben He (Communications Lab) for helping me with the coding of the Matlab scripts that were used in the eye-diagram simulations.

I also appreciate the loyal friendship and support that my bio-MEMS roommate, Michael Chen, and my race-car engineering friend, Gary Chou (who both understand the pronunciation of sesame in the Min-Nan dialect of Taiwan very well), have given me. The people who made my time in the SoC lab memorable are Lyon Lee, Danny Yoo, Arash Zargaran, Cindy Mark, Andrew Lam, Paul Teehan, Rod Froist, and Cathy Yuan.

Beyond any words at my command, I am extremely grateful for having had the love and support of Pinky Li and Lydia Lee during different periods of the past two years of my graduate study. Lastly, I want to dedicate this thesis to my most beloved grandma and parents for their unconditional love and sacrifices.

To Ah Po (grandma in the Hakka dialect of Taiwan).

1 INTRODUCTION

1.1 Motivation

Device scaling and advances in architecture design have fueled the rapid growth in on-chip processing power over the past decades. This continuing growth in computing capacity demands that off-chip bandwidth also scale in step. Following the exponential growth in the demand for higher microprocessor I/O bandwidth depicted in Fig. 1.1 [1], computing platforms that use multiple cores ([2], [3], and [4]) and high-performance graphics will require an off-chip bandwidth as high as several terabits per second in the near future to fully realize their computing power.

Fig. 1.1. Trend in microprocessor I/O bandwidth.

Another driver for higher I/O bandwidth comes from network communication systems such as Synchronous Optical NETwork (SONET) and Ethernet, which are constantly evolving in response to the ever-increasing demand for higher network service capacity. Take the Internet, for example: apart from the drastic increase in the number of Internet users in developing nations such as China and India (India’s Internet users grew by 33% between 2006 and 2007 [5]), emerging online entertainment such as real-time video streaming and multiplayer games has further increased global network traffic. Last year, the popular video site YouTube, launched only three years ago, alone consumed as much bandwidth as the entire Internet did in 2000 [5]. This rapidly increasing use of Internet services, together with the need for higher data transfer rates, is served by high aggregate-bandwidth communication technologies such as 10 Gb/s SONET for backbone or metro area networks (already developed) and other Gb/s technologies including 10 Gb/s Ethernet, the InfiniBand system, and 40 Gb/s SONET (under development) [6]. The aggregate bandwidth of the key components in these systems (e.g., routers, framers, and switches) will soon reach 800 Gb/s and beyond [7].

To deliver the large off-chip bandwidth required in high-performance computer and network communication systems, many high data-rate/pin I/Os (e.g., 10 Gb/s/pin and 40 Gb/s/pin) have to be integrated on the same chip to curb the overall cost. There are many challenges in realizing power-efficient I/O links operating at such high data rates. One of the main challenges is to maintain good signal integrity while transmitting data through physically impaired and band-limited channels. Every physical electrical channel exhibits, to varying degrees, non-ideal characteristics such as DC loss, skin effect, dielectric loss, and impedance discontinuity.
In the example of typical backplane channel responses shown in Fig. 1.2 [8], the signal loss could be as large as 30 dB at 3 GHz.

Fig. 1.2. Frequency responses of backplane channels.

Another challenge in pushing the speed envelope of I/O links is to increase the maximum data rate at which on-chip I/O circuits can operate for a given process technology while keeping the overall I/O power consumption low. Some of the most speed-critical building blocks in an I/O transceiver are the multiplexer, output buffer, decision circuit, voltage-controlled oscillator, and clock buffer, since they usually have to drive large capacitive loads while operating at the highest clock rate.

A very power-efficient way to boost the bandwidth of I/O circuits is to employ passive filtering (e.g., shunt and series peaking) [9][10], which uses inductors to trade off bandwidth against peaking in the magnitude response. From the viewpoint of minimizing power consumption, it is also of great advantage to use on-chip passive filters to provide equalization [11][12]. The main problem with deploying passive filters is the large area needed to implement the on-chip passive inductors (also known as spiral inductors). The area needed (on the low side) to realize a pair of differential spiral inductors with an inductance of around 1 to 1.5 nH is on the order of 100 μm by 100 μm [9]. The area occupied by spiral inductors usually determines the overall area of an I/O building block. Unlike active devices, spiral inductors do not scale with technology. As more and more I/Os need to be integrated on the same chip to achieve a large aggregate off-chip bandwidth, the area consumed by extensive use of spiral inductors in I/O building blocks will be prohibitively large.

A much more compact alternative for realizing on-chip inductors for bandwidth enhancement as well as equalization is to use active inductors.
Both the area and the speed of active inductors scale with technology, and they are easy to implement in standard digital CMOS processes since they can be designed using only active devices. These favorable features make their use compatible with the trend toward very high levels of integration, particularly in systems that are enabled by CMOS technology (e.g., System-on-Chip).

State-of-the-art microprocessors [13] [14] and communication SoC chips [15] [16] continue to migrate to deeper submicron CMOS process technologies (such as the 65-nm and 45-nm nodes) because of the performance and economic incentives brought by scaling. To evaluate the suitability of active inductors for the I/Os of current and future high-performance digital chips, this thesis designs and implements prototype active-inductor circuits in the most advanced CMOS technologies that we have access to.

1.2 Research Objectives

The research objectives of this work are as follows:

•  Investigate the potential and challenges of using active inductors to provide area-efficient and low-power bandwidth enhancement in I/O circuits and compensation for channel losses

•  Develop techniques and design methods to enhance the performance of active inductors when applied to I/O circuits

•  Design and implement prototype circuits in advanced deep submicron CMOS technologies and verify the performance improvement, if any, brought by the use of active inductors

1.3 Thesis Organization

The remainder of the thesis is organized as follows. Chapter 2 provides background on the basics of inductive peaking, inductor implementations, the principle and analysis of active-inductor shunt peaking, and the design challenges associated with active inductors. Chapter 3 presents the design and silicon validation of an output driver with active-inductor termination that provides output impedance matching and channel loss compensation at 5 and 10 Gb/s data rates.
Chapter 4 presents the design of a 30 Gb/s output buffer with active-inductor loads for bandwidth enhancement. The bandwidth enhancement effect of active-inductor shunt peaking is compared with passive-inductor shunt peaking and no shunt peaking. Chapter 5 concludes the thesis and discusses future work.

2 BACKGROUND

2.1 Inductive Peaking in High-Speed I/O

As the operating speed of I/O circuitry reaches well into the multi-Gb/s range, inductors are used more and more extensively in high-speed I/O building blocks such as multiplexers/demultiplexers (MUX/DEMUX), buffers, and output drivers. For example, around 400 inductors are used in the 40 Gb/s/pin serial-link transmitter presented in [17]. In such applications, inductors are primarily used to provide bandwidth enhancement at heavily capacitively loaded nodes by means of shunt or series peaking.

Fig. 2.1. Shunt peaking in a CS amplifier.

Fig. 2.1 illustrates the basic idea of how shunt peaking works in a simple common-source (CS) amplifier. By inserting an inductor $L$ in series with the resistor load $R_D$ to resonate with the load capacitor $C_L$, a zero is created in the frequency response of the amplifier, which effectively moves the band-limiting pole ($\omega_p = 1/(R_D C_L)$) to a higher frequency. In the time domain, this bandwidth extension can be explained intuitively by applying a step input to the amplifier: the inductor impedes the instantaneous change in the output current $I_D$ and acts as an open circuit, allowing all the current to flow through the capacitor rather than through the resistor. As a result, the output voltage changes faster, which enables the amplifier to operate at a higher speed.

Fig. 2.2. Effect of inductive peaking on magnitude response.

Inductive peaking is also useful in implementing passive equalization to allow higher off-chip signaling rates while keeping the power consumption at reasonable levels [1], [11], [12].
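
As a rough numerical illustration of the shunt-peaking mechanism described above, the following sketch models the load of Fig. 2.1 as the series branch R_D + sL in parallel with C_L and evaluates its -3 dB bandwidth relative to the unpeaked value 1/(R_D·C_L). The ratio m = L/(R_D²·C_L), the normalized frequency x = ω·R_D·C_L, and all function names are notation introduced here for illustration and do not appear in the thesis. The values m = √2 − 1 (maximally flat) and m = 1/√2 are standard results for this second-order network, not numbers taken from the measurements in this work.

```python
import math


def normalized_gain(m, x):
    """|Z(jw)| / R_D for the shunt-peaked load (R_D + sL) || 1/(s C_L).

    m = L / (R_D**2 * C_L) and x = w * R_D * C_L (both dimensionless).
    """
    num = 1.0 + (m * x) ** 2
    den = (1.0 - m * x * x) ** 2 + x * x
    return math.sqrt(num / den)


def bandwidth_ratio(m):
    """-3 dB bandwidth relative to the unpeaked value 1 / (R_D * C_L)."""
    if m == 0.0:
        return 1.0  # plain RC load, no inductor
    # Setting |Z/R_D|^2 = 1/2 gives a quadratic in y = x^2:
    #   m^2 * y^2 + (1 - 2m - 2m^2) * y - 1 = 0, which has one positive root.
    b = 1.0 - 2.0 * m - 2.0 * m * m
    y = (-b + math.sqrt(b * b + 4.0 * m * m)) / (2.0 * m * m)
    return math.sqrt(y)


def peak_gain(m, x_max=5.0, points=20001):
    """Largest normalized gain on a uniform grid (1.0 means no peaking)."""
    step = x_max / (points - 1)
    return max(normalized_gain(m, i * step) for i in range(points))


if __name__ == "__main__":
    for m in (0.0, math.sqrt(2.0) - 1.0, 1.0 / math.sqrt(2.0), 1.0):
        print(f"m = {m:.3f}: bandwidth x{bandwidth_ratio(m):.2f}, "
              f"peak gain {peak_gain(m):.3f}")
```

Sweeping m in this way shows the trade-off: small values extend the bandwidth without any overshoot in the magnitude response, while larger values introduce peaking, which is the regime of interest when the inductor is used for equalization rather than bandwidth extension.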
Despite the word peaking, when inductive peaking is used for on-chip bandwidth enhancement, the magnitude of the peaking is to be kept to a minimum (ideally with no overshoot) to allow a well-behaved response to random data (minimum settling time). As shown in Fig. 2.2, when $L$ is equal to $L_{opt}$, a bandwidth extension of up to 70% can be achieved without peaking. $L_{opt}$ is the value that results in a maximally flat response in a second-order RLC network [18]. When used to assist channel equalization, however, $L$ should be made larger than $L_{opt}$ to create the peaking needed to compensate for the high-frequency signal loss in the channel. In [1] it was shown that by choosing proper values of inductance and resistance in the inductive terminations of the transmitter and receiver, up to 6 dB of gain in the overall channel magnitude response can be achieved.

Fig. 2.3. Inductive (RL) terminations used for equalization.

2.2 Inductor Implementation

On-chip inductors are typically realized as passive spiral inductors. Advantages of passive spiral inductors over active inductors include lower power consumption, a lower voltage-headroom requirement, lower noise, and better linearity. However, spiral inductors typically occupy a large fraction of the silicon area and do not scale with process technology. In addition, the ever-increasing demand for higher aggregate bandwidth and the need to support different communication standards in SoC design require more I/O blocks to be integrated on the same chip. In such circuits, the overhead space needed to prevent mutual coupling (cross-talk) between the inductors of different I/O channel circuits on the same chip makes spiral inductors even more costly in terms of area. A more area-efficient alternative for realizing on-chip inductors is to use active inductors [19] [20].
Active inductors are particularly  8  suitable for implementing shunt peaking in current-mode logic (CML) circuits due to both the low-Q requirement and low-swing nature of such circuits. One of the most attractive features of active inductors is that, unlike their passive counterparts, both their area and resonant frequency scale with technology. Also, they can be implemented in a standard digital process, since they can be designed using only active devices. In addition, the tunable nature of active inductors allows for gain tuning over the frequency range of interest. This property can be taken advantage of in channel loss compensation and equalization. Furthermore, active inductors can be tuned to compensate for the effects of process, voltage, and temperature (PVT) variations in the circuit.  2.3 Basics of Active-Inductor Shunt Peaking  VDD  RG RG gm1Vgs1  M1  Cgs1  Vgs1 Vout  ZL Vin CL  Z1  Z2  ZL (a)  (b)  Fig. 2.4. Conventional active inductor: (a) schematic and (b) small-signal mode.  9  Although there exist different topologies that a shunt-peaking active inductor can assume, their characteristics and principles of operation are very similar. To illustrate how such circuits work, we start by looking at the conventional structure of an active inductor used in the load of a CS amplifier as shown in Fig. 2.4 (a) [21]. The active inductor shown inside the dashed box consists of an NMOS transistor M1 and a resistor RG connected between VDD and the gate of M1. The impedance looking into the active-inductor load is denoted by ZL. Fig. 2.4(b) shows the simplified small-signal model of the active-inductor load. If a test signal Vx is applied to the source of M1, the voltage across Cgs1 is given by 1 sCgs1 Vgs1 = Vx 1 RG + sCgs1  = Vx  (2.1)  1 . sRG Cgs1 + 1  (2.2)  The corresponding current generated that flows into the drain of M1 will be I d 1 = g m1Vgs1  = g m1Vx  (2.3)  1 . 
sRG Cgs1 + 1  (2.4)  The impedance Z2 looking into the drain, from the source, of M1 can then be obtained by Z2 =  =  =  Vx Id1  (2.5)  sRG Cgs1 + 1 g m1 sRG C gs1 g m1  +  1 . g m1  (2.6)  (2.7)  The first term of Z2, i.e., (sRGCgs1)/gm1, is linearly proportional with frequency and is what gives rise to the inductive property of the active-inductor load. Now with a more precise 10  small-signal model of the active-inductor load shown in Fig. 2.5 (a), which includes the drain-to-source conductance gds1 and drain-to-source capacitance Cds1 of M1, the impedance ZL is given by ZL =  1 + sCgs1RG s RG Cds1Cgs1 + s(Cgs1 + Cds1 + RGCgs1 gds1 ) + ( gds1 + g m1 ) 2  .  (2.8)  L  RG gm1Vgs1 Cgs1  Cp  1/gds1 Cds1  Rs  ZL  ZL (a)  (b)  Fig. 2.5. Active-inductor load: (a) more detailed model and (b) its passive RLC equivalent. For frequencies well below the resonance, the impedance looking into the active-inductor load can be approximated by Rs, L, and Cp of an RLC network as shown in Fig. 2.5 (b). Rs =  L =  1 g ds1 + g m1 RG Cgs1 g ds1 + g m1  C p = Cds1 ωo =  g ds1 + g m1 RG C gs1Cds1  (2.9)  (2.10) (2.11)  (2.12)  11  Note that from (2.9) the value of series resistance Rs of the active inductor depends on both  gm1 and gds1 which are associated with the drain current and DC bias point of M1 transistor. Also it can be seen from (2.10) the inductance of the active inductor L can be tuned independently by changing the value of RG while keeping other parameters constant. The resonant frequency (ω0) of the active inductor is approximately proportional to the square root of the fT of transistor M1, which scales with technology. L is proportional to RG as shown in (2.10), which implies that for a larger inductance the operating speed of the active inductor would be lower (since ω0 ∝ 1/ RG ). Such characteristic is also common to spiral inductors.  2.4 Design Challenges  2.4.1  Voltage Headroom  VDD  VHIGH  RG  M2  VDD  RG  M1 M1 CL  (a)  CL  (b)  Fig. 2.6 (a). 
Conventional active inductor with VHIGH (b) folded active inductor.

The conventional active inductor topology shown in Fig. 2.6 (a) requires a large DC voltage drop to ensure that VGS1 is higher than the threshold voltage of transistor M1 (Vth1), so that M1 does not turn off when the output of the amplifier experiences a large signal swing. This voltage headroom requirement is made worse by the body effect of the NMOS transistor M1, because the threshold voltage Vth of a MOS transistor is an increasing function of its source-to-body voltage VSB. In CMOS processes that use p-type substrates (as do the 90-nm and 65-nm processes used in this thesis), the body VB of all NMOS transistors has to be at the lowest potential of the entire chip (ground in this case). The NMOS transistors whose source nodes VS are not connected to the same potential as VB therefore have a higher Vth and thus require more voltage headroom (i.e., VGS and VDS) to stay properly biased. As can be seen from (2.12), the resonant frequency ω0 of the active inductor improves with transconductance (gm1) given the same device dimensions in M1. To find the expression for gm1, we consider the I-V characteristic of a short-channel MOSFET,

$$I_D = W v_{sat} C_{ox} \, \frac{(V_{GS} - V_{th})^2}{(V_{GS} - V_{th}) + E_{crit}L} \, (1 + \lambda V_{DS}), \qquad (2.13)$$

where vsat is the saturation velocity (vsat ≈ 10^7 cm/s) and Ecrit is the critical field (Ecrit ≈ 6×10^4 V/cm for electrons). gm can be found by taking the derivative of ID with respect to VGS:

$$g_m = \frac{\partial I_D}{\partial V_{GS}} = W v_{sat} C_{ox} \, \frac{(V_{GS} - V_{th})^2 + 2(V_{GS} - V_{th})E_{crit}L}{(V_{GS} - V_{th})^2 + 2(V_{GS} - V_{th})E_{crit}L + (E_{crit}L)^2} \, (1 + \lambda V_{DS}). \qquad (2.14)$$

Since EcritL is a constant term, it can be shown from (2.14) that gm increases with VGS. Thus it is advantageous to have a larger VGS1, as it allows a higher resonant frequency in the active inductor for a given M1 size. Moreover, as will be discussed in the next section, having a larger VGS1 also improves the linearity of the active-inductor load.
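As an illustrative check of (2.13) and (2.14), the following Python sketch evaluates the closed-form gm of the short-channel model and compares it with a numerical derivative of ID. The oxide capacitance, the threshold voltage, and all function names are assumptions chosen for illustration only (they are not foundry data); the device dimensions follow the 12 µm / 60 nm transistor simulated later in this thesis, and channel-length modulation is neglected (λ = 0).

```python
# Short-channel MOSFET model of Eq. (2.13), evaluated in cm-based units.
V_SAT = 1e7    # saturation velocity in cm/s (value quoted in the text)
E_CRIT = 6e4   # critical field for electrons in V/cm (value quoted in the text)


def drain_current(v_gs, v_th, w, l, c_ox):
    """I_D from Eq. (2.13) with lambda = 0; returns 0 below threshold."""
    v_ov = v_gs - v_th
    if v_ov <= 0.0:
        return 0.0
    return w * V_SAT * c_ox * v_ov ** 2 / (v_ov + E_CRIT * l)


def gm_closed_form(v_gs, v_th, w, l, c_ox):
    """g_m obtained by differentiating Eq. (2.13) with respect to V_GS."""
    v_ov = v_gs - v_th
    if v_ov <= 0.0:
        return 0.0
    el = E_CRIT * l
    return w * V_SAT * c_ox * (v_ov ** 2 + 2.0 * v_ov * el) / (v_ov + el) ** 2


def gm_numeric(v_gs, v_th, w, l, c_ox, dv=1e-4):
    """Central-difference estimate of dI_D/dV_GS, used as a cross-check."""
    upper = drain_current(v_gs + dv, v_th, w, l, c_ox)
    lower = drain_current(v_gs - dv, v_th, w, l, c_ox)
    return (upper - lower) / (2.0 * dv)


if __name__ == "__main__":
    # Assumed, illustrative device: W = 12 um, L = 60 nm (both in cm),
    # C_ox = 1.5e-6 F/cm^2 and V_th = 0.3 V are hypothetical values.
    device = dict(v_th=0.3, w=12e-4, l=60e-7, c_ox=1.5e-6)
    for v_gs in (0.6, 0.8, 1.0, 1.2):
        closed = gm_closed_form(v_gs, **device)
        numeric = gm_numeric(v_gs, **device)
        print(f"VGS = {v_gs:.1f} V: gm = {closed * 1e3:.2f} mS "
              f"(numeric {numeric * 1e3:.2f} mS)")
```

In this simplified model gm rises monotonically with VGS and approaches W·vsat·Cox for large overdrive, which is the trend the text relies on; a real device model would also include mobility degradation and other effects not captured here.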
Most current-mode logic circuits employ tail current transistors and mandate a sufficient drain-source voltage (VDS) to ensure that their core transistors (i.e., differential-pair driver) remain in saturation at all times. The limited voltage headroom that is left for a conventional active-inductor load would largely compromise its usefulness, if it works at all. The voltage headroom problem is more pronounced if the circuit is to operate from a low supply voltage, e.g., 1.2 V or 1 V, which is common in advanced deep submicron (DSM) CMOS technologies such as the 90-nm and 65-nm nodes. Two solutions have been previously proposed to mitigate such headroom constraints in conventional active inductors. The first solution [19], as shown in Fig. 2.6 (a), uses a voltage-boosting technique to provide a gate bias voltage (VHIGH) that is higher than the supply voltage (VDD). The cost of this method is the increased design complexity and area associated with the addition of voltage-boosting circuitry. Another solution, shown in Fig. 2.6 (b) [20], adopts a folded topology which allows the gate-source voltage of the NMOS transistor to be biased well above its Vth level without additional circuitry and at the same time eliminates the body effect. However, this approach consumes extra power and has a narrower bandwidth since its load is composed of the output resistance of transistor M2 in parallel with the folded active inductor.

2.4.2 Nonlinearity

As compared to a CML circuit that uses passive-inductor loads, the circuit with active-inductor loads does not have a very well-defined DC output level, owing to the nonlinear I-V characteristics of active devices. If the DC output level variations in circuits that use active-inductor loads are too large, two adverse effects might occur. One is the reduced output voltage swing if the DC output level is higher than the original design.
Another is that when the DC output level (which is also the input bias voltage to the next stage) drops lower than what is needed for the following stage circuits to remain in saturation, the gain and swing of those stages could be severely degraded.

The active inductors used in I/O circuits are intended to operate with a relatively large signal swing (i.e., up to a few hundred millivolts). In CML-based I/O circuits, the load impedance and the resonant frequency of the active inductor are bias dependent as shown in (2.9), (2.10), and (2.12) and could vary quite significantly as the output level changes with time. Such nonlinear effects could result in different rise and fall times and introduce more jitter in the output waveform. Hence the distortion caused by the impedance variations must be investigated and kept to an acceptable level for active-inductor loads to be a viable alternative to passive-resistor or passive-inductor loads.

2.4.3 Noise

Active-inductor loads have higher noise levels than their passive counterparts. The two main noise contributions come from the thermal noise associated with the channel resistance and the flicker noise due to charge trapping in transistor M1. The thermal noise also includes a high-frequency component, which is caused by the gate current and is relevant in I/O circuits operating at multi-Gb/s rates. Since MOS transistors conduct current near the surface of the silicon, where surface states act as traps that capture and release current carriers, their flicker noise components can be large. These traps capture and release carriers in a random fashion and the trapping times are distributed in a way that leads to a 1/f spectrum. Thus, the flicker noise power is mainly concentrated at the lower frequencies.
However, the magnitude of the noise introduced by active inductors is typically negligible compared to the voltage swing level in most I/O circuits, except for the front end of highly sensitive receivers [22].

2.5 Eye Diagram

Fig. 2.7. Construction of an eye diagram: (a) a data stream and (b) overlapping each bit-time segment of the data stream to form a data eye.

The eye diagram is an intuitive graphical representation of (electrical and optical) data communication signals. A data eye can be obtained by overlapping many bit-long signal traces from a random data stream onto a single bit-time interval (e.g., for 10 Gb/s, bit-time = 100 ps) as shown in Fig. 2.7 [23]. The vertical eye opening (or eye height) and horizontal eye opening (or eye width) are important characteristics of the eye diagram that aid in quantifying the signal quality. The vertical eye opening is measured at the sampling instant and is strongly related to the amount of inter-symbol interference (ISI) present in the signal. The horizontal eye opening is usually expressed as a percentage of one bit width, also known as the unit interval (UI), and is largely influenced by ISI as well. In a band-limited system (channel or electronics or both), each transition in the input logic level produces an exponential output response; the narrower the bandwidth of the system, the longer the exponential tails. The longer the output response tails, the more they corrupt the amplitude of the subsequent bits. This increases the probability of detecting the wrong logic level in the receiver. This effect is known as ISI and is illustrated in Fig. 2.8 [8]. The two rectangular step pulses represent the input bits (101) to a band-limited channel and the three curves represent the responses (100, 001, and 101) that appear at the output of the channel after a finite delay.
Note that one of the output responses (101) is the superposition of the other two responses (100 and 001). As can be seen from Fig. 2.8, when the input pattern 101 is applied to the band-limited channel, the ISI introduced by the long trailing edges between the transitions of the 1st to 2nd output bits (in 100) and the 2nd to 3rd output bits (in 001) can cause the 2nd output bit (in 101) to take a wrong value (a logic 1 instead of 0).

Fig. 2.8. Illustration of trailing edges and the effect of ISI.

By inspecting the amount of ISI present in the eye diagram at different operating speeds, a quick estimate of the bandwidth or maximum achievable data rate for a given bit-error rate (BER) can usually be obtained.

3 PROTOTYPE DESIGN I

This chapter presents the prototype design of a novel PMOS-based active inductor (first proposed by the author of this thesis [22]) which provides a compact alternative for implementing inductive termination and offers channel loss compensation through tunable peaking. The prototype chip was fabricated by STMicroelectronics (STM) in a 90-nm CMOS technology and measurement results are presented. Note that the 90-nm technology was the most advanced CMOS process node available to the author at the time.

3.1 Inductive Peaking Termination

Fig. 3.1. Doubly terminated I/O blocks with inductive peaking terminations.

Fig. 3.1 shows a doubly terminated I/O circuit using two inductive terminations with impedance ZT. In this figure, CPAD represents the lumped parasitic capacitance seen at the I/O terminals, which is typically dominated by the parasitic capacitance of the electrostatic discharge (ESD) protection circuit and the bonding pad. Z0 is the characteristic impedance of the channel. The termination impedance ZT should ideally exhibit a real part (Rs) that is equal to Z0 and constant over the bandwidth of interest.
ZT also has an imaginary part (jωL), which can be used to create a peak at the target transmission frequency so as to partially compensate for the high-frequency signal loss in the channel. If the data rate of the transmitted pulse is higher than the frequency of the pole caused by the equivalent RC low-pass filter seen at the termination node, the inductor value can also be chosen to resonate with the parasitic capacitance seen at the node to improve the I/O bandwidth.

3.2 PMOS-based Active Inductor

As discussed in the previous chapter, it is of interest to realize inductive peaking circuits using active inductors to save area and provide tuning capability. A PMOS-based active inductor that is capable of implementing the inductive termination shown in Fig. 3.1 is presented in the following sub-sections.

3.2.1 Circuit Topology

Fig. 3.2. Schematic of the proposed active-inductor termination circuit.

The proposed active inductor topology is shown inside the dashed box in Fig. 3.2. It has an active resistor (M2, which operates in the deep-triode region) through which the output node is coupled to the gate of the PMOS transistor M1 via a source follower M3. A level shifter, consisting of the source follower M3 and a current source M4, is inserted between M1 and M2 to allow a lower gate bias voltage for M1. This topology allows the active inductor to operate with low voltage headroom without the need for a separate voltage higher than VDD to bias the gate of M1, as is required in the NMOS-based active inductor [19]. By tuning VG2 to change the resistance of M2 (i.e., RG), the inductance L of the active inductor can be independently controlled, as indicated by (2.10). The proposed active inductor also consumes less power than [20] while exhibiting a low resistance (i.e., 50 Ω) for termination.
Note that [20] needs to sink an additional, relatively large bias current into its folded active inductor for it to exhibit an Rs of 50 Ω (Rs ∝ 1/gm), on top of the tail current that is already needed in the core driver transistors. The proposed topology reuses the tail current for biasing and consumes no extra current. The topology is based on a PMOS transistor (M1) which does not suffer from body effect and hence has a lower and relatively constant Vth1 (since VS1 is connected to VDD, VSB1 always remains constant) as compared to when an NMOS transistor is used. Having a lower Vth level that does not increase with the output level almost always works to the advantage of circuit operation in analog design. This is especially true when the circuit is implemented in DSM CMOS technologies, where supply voltages are expected to continue to drop. In the case of the proposed active inductor, the lower Vth level in M1 helps ease the voltage headroom requirements of the active inductor (i.e., VGS1 and VDS1) and also translates to a higher output swing tolerance for the same voltage headroom, since VD,sat is made higher (VD,sat = VGS1 - Vth1). It is worth noting that although gm3, gds3, and gmb3 of the source follower transistor M3 vary with the output level due to body effect, such variations have a limited impact on the gain of the level shifter, which is given by 1/(1+(gds4+gds3+gmb3)/gm3). The drawbacks of the proposed topology are the additional overhead power consumed by the level shifter and the fact that PMOS transistors are slower than NMOS transistors due to their lower carrier mobility. Also, the non-idealities of the level shifter, such as the parasitic capacitances and less-than-unity gain, could adversely affect the speed of the proposed topology.
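The level-shifter gain expression quoted above can be evaluated numerically to see how far it falls below unity. The sketch below is only an illustration: the function name is arbitrary and the small-signal values are hypothetical round numbers, not values extracted from the fabricated design.

```python
def level_shifter_gain(gm3, gds3, gds4, gmb3):
    """Low-frequency gain of the M3/M4 source-follower level shifter,
    1 / (1 + (gds4 + gds3 + gmb3) / gm3), as given in the text."""
    return 1.0 / (1.0 + (gds4 + gds3 + gmb3) / gm3)


if __name__ == "__main__":
    # Hypothetical small-signal parameters in siemens (illustration only).
    gain = level_shifter_gain(gm3=10e-3, gds3=0.5e-3, gds4=0.5e-3, gmb3=1.5e-3)
    print(f"level-shifter gain = {gain:.2f}")
```

Because gds3, gds4, and gmb3 enter only through their sum relative to gm3, moderate bias-dependent changes in any one of them shift the gain only slightly, which is consistent with the limited impact noted above.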
3.2.2 Small-Signal Analysis

Assuming the level shifter has a unity gain over the entire bandwidth of interest, the small-signal analysis of the proposed active inductor is the same as that derived for the conventional active inductor in Section 2.3. The termination impedance of the active inductor is thus given by (2.8), and the expressions for its passive RLC equivalent model are (2.9), (2.10), (2.11), and (2.12).

3.2.3 Large-Signal Operation

The presented active inductor is intended to operate under a relatively large signal swing at the output, and its termination impedance and resonant frequency are bias dependent as discussed in Section 2.4.2. To achieve an acceptable impedance matching, the variation of the termination impedance, in particular the real part Rs, has to be kept low. Figure 3.3 shows the change of Rs in the active-inductor load when the current through it (ID) varies from 0 to 10 mA and the corresponding effect this variation has on the output matching S22. The active inductor is designed to operate with an ID of 4 mA when the differential driver is balanced (i.e., each branch has the same current). Note that the S22 shown in Fig. 3.3 assumes an infinite output resistance r0 in the driver transistors. Low-frequency operation is also assumed in this simulation, since at high frequencies the termination impedance would have to account for the effects of its imaginary component (jωL) as well as the capacitive load seen at the pad (CPAD). From this plot, it can be seen that by confining ID between 2 mA and 6 mA, a relatively constant Rs can be obtained.

Fig. 3.3. Simulated Rs and the corresponding S22 vs. ID.
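Under the low-frequency, infinite-r0 assumptions stated for Fig. 3.3, the output match presumably reduces to the reflection coefficient of a purely resistive termination Rs against Z0. The following sketch shows that relation together with the conversion from S22 in dB to reflected power fraction. The function names and the example Rs values are illustrative assumptions and are not read from Fig. 3.3.

```python
import math


def s22_db(r_s, z0=50.0):
    """|S22| in dB for a purely resistive termination r_s against z0."""
    gamma = (r_s - z0) / (r_s + z0)
    if gamma == 0.0:
        return float("-inf")  # perfect match
    return 20.0 * math.log10(abs(gamma))


def reflected_power_fraction(s22_in_db):
    """Fraction of incident power reflected for a given S22 in dB."""
    return 10.0 ** (s22_in_db / 10.0)


if __name__ == "__main__":
    # Hypothetical Rs values in ohms (illustration only).
    for r_s in (30.0, 40.0, 50.0, 60.0, 80.0):
        print(f"Rs = {r_s:5.1f} ohm -> S22 = {s22_db(r_s):7.2f} dB")
```

With this first-order relation, any Rs between roughly 26 Ω and 96 Ω stays at or below -10 dB against a 50 Ω reference, which gives a feel for how much Rs variation the match can tolerate at low frequency.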
3.2.4 Linearity Enhancement

A method to improve the linearity of the termination impedance while fully switching the tail current is to add a common-mode degeneration resistor (Rdeg) between the differential outputs of the driver [24]. Fig. 3.4 shows the full schematic of the differential version of the output driver with a common-mode degeneration resistor Rdeg and the proposed active-inductor loads. With the addition of Rdeg, the active-inductor load should exhibit an Rs value such that the effective resistive load seen by the driver is Rs,eff = Rs // 0.5Rdeg = 50 Ω (for termination), and part of the switching current flows through Rdeg depending on the ratio of Rs and Rdeg. This way a portion of the current is always preserved in one branch of the active-inductor load so as to reduce the variation of the termination impedance while allowing complete current switching.

3.3 Output Driver with Active Inductor Load

3.3.1 Full Schematic

Fig. 3.4. Full schematic of the CML output driver with PMOS-based active-inductor loads.

As shown in Fig. 3.4, the load to the driver is a pair of inductive terminations implemented by the PMOS-based active inductors. The differential driver transistors Mdiff, upon the application of the input voltage Vin, steer the tail current Itail between the loads accordingly to create the output logic pulses needed for data transmission. The on-chip bandwidth of the driver is limited by the RC time constant seen at the output nodes. The R term (of the RC) is 25 Ω, resulting from the parallel combination of the effective termination resistance Rs,eff and the characteristic impedance Z0 of the transmission line (both are 50 Ω). The C term (of the RC) is determined by CPAD, which is dominated by the parasitic capacitances of the ESD protection device and the bond pad.
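The RC limit described above can be estimated to first order as a single-pole bandwidth. The sketch below is an assumption-laden illustration: it models only the output-node pole (it ignores the peaking inductance and every other pole in the driver, so it should not be read as the overall driver bandwidth), the function names are arbitrary, and the capacitance values are the pad-only value (on the order of 100 fF) and the pad plus the intended 700 fF ESD-emulation capacitor quoted for this design.

```python
import math


def parallel(r_a, r_b):
    """Equivalent resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)


def rc_bandwidth_hz(r_ohm, c_farad):
    """-3 dB frequency of a single-pole RC low-pass: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


if __name__ == "__main__":
    r_out = parallel(50.0, 50.0)  # Rs,eff in parallel with Z0
    for label, c_pad in (("pad only", 100e-15), ("pad + ESD emulation", 800e-15)):
        f_3db = rc_bandwidth_hz(r_out, c_pad)
        print(f"{label}: R = {r_out:.0f} ohm, C = {c_pad * 1e15:.0f} fF, "
              f"f_3dB = {f_3db / 1e9:.1f} GHz")
```

The comparison shows why the value of CPAD matters: an eightfold increase in capacitance lowers this pole by the same factor, which is the situation in which resonating the inductance with CPAD becomes useful.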
Since no ESD protection circuits were used in the design, two MIM capacitors, each with a capacitance of 700 fF, were placed at the output nodes of the driver to emulate the parasitic capacitance of the ESD protection devices. Unfortunately, the fab mistakenly excluded the MIM option and thus these two capacitors were not fabricated on the chip. As a result, the CPAD seen by the driver is mainly due to the pad, which is on the order of 100 fF. The off-chip bandwidth limits are set by the physical impairments of the channel such as skin effect, dielectric loss, and impedance discontinuity. If the bandwidth of the on-chip circuitry is higher than the bandwidth of the channel, the peaking created by the active inductor can be used to compensate for part of the high-frequency signal loss in the channel.

3.3.2 Biasing Voltage

Fig. 3.5. On-chip reference voltage circuits.

Other than the supply voltage VDD, which is provided directly from an off-chip power supply, all three biasing voltages VTAIL, VG2, and VG4 are generated on-chip using the circuits shown in Fig. 3.5. The voltages (i.e., VDD_TAIL, VDD_G2, and VDD_G4) used to power the reference voltage circuits are supplied from external power supplies to make testing easier. Note, however, that in a commercial chip these DC bias and tuning voltages (i.e., VDD_TAIL, VDD_G2, and VDD_G4) are typically provided on-chip by bandgap reference generators [25] and digital self-tuning circuitry to reduce the number of pins and the complexity of the board design. With the reference voltage circuits shown in Fig. 3.5, it is easier to tune the bias voltages on a finer scale, since the voltage drop across the diode-connected transistor changes more gradually than the external power-supply voltage. This also makes the bias voltages less sensitive to variations in the external power-supply voltage.
Moreover, since the reference-voltage circuit is also a current mirror, it is easy to estimate ITAIL and ID4 (the current through transistor M4) by considering the size ratios between MTAIL and MV_TAIL, and between M4 and MV_G4.

3.3.3 Testing Consideration

Ideally, the prototype driver should be designed to have two active-inductor terminations, one at the transmit side and one at the receive side (Fig. 3.6), to provide a higher equalization gain. The measurement setup for evaluating the equalization effect of both the transmit-side and receive-side inductive terminations on the channel response would look like that illustrated in Fig. 3.6. The data pulses transmitted by the output I/O driver (which is equipped with a pair of transmit-side terminations) should travel via the off-chip channel (typically a PCB trace, coax cable, or copper wire) to the receive-side terminations that are also implemented on the same chip as the transmitter driver. To obtain measurements of the received signals, a high-input-impedance (e.g., 1000 Ω) oscilloscope has to be connected to the receive-side termination nodes (so as not to load the received signal). However, since on-chip probes with 4 high-speed pins were not available, probing four high-speed input and four high-speed output pads on the same chip was not possible. Oscilloscopes with high input impedances that could measure signals in the GHz range were not available either. Thus, in the fabricated chip, only the transmit-side terminations are implemented, as shown in Fig. 3.7.

Fig. 3.6. Ideal design with both transmit-side and receive-side terminations.

Fig. 3.7. Actual fabricated chip with only transmit-side termination.

3.4 Experimental Results

Fig. 3.8. Die Micrograph.
The prototype active inductor circuit, used as the load of a CML output driver, is fabricated in STM’s 7-metal 90-nm CMOS process. Fig. 3.8 shows the micrograph of the die under probing. The area of the active inductor circuits measures 17 µm × 25 µm. The circuit operates from a 1 V supply. Note that the circuit only draws current from the 1 V supply, while the other off-chip DC voltages (e.g., VDD_TAIL, VDD_G2, and VDD_G4) are used for bias only. The simulation and measurement results are presented in the following sub-sections.

3.4.1 Driver Frequency Response

Due to equipment limitations, only the simulated frequency response of the circuit is presented. The simulation was done using Cadence Spectre in the 90-nm design kit with the transistor models provided by the foundry. Since the active-inductor termination is primarily designed for compensating high-frequency signal loss in the channel, the use of the active inductor for bandwidth enhancement (where the goal is ideally maximum on-chip bandwidth extension without peaking) is not discussed here.

Fig. 3.9. Simulated frequency response of the output driver with active inductance turned off and turned on while sweeping VG2.

By changing the inductance of the active-inductor load, through adjusting VG2, the peaking of the driver’s transfer function can be controlled. Fig. 3.9 shows the simulated S21, both with the active inductance on (while sweeping VG2) and with the active inductance off. The active inductance can be turned on and off by adjusting the bias of the active inductor circuit. When the active inductance is off (L=0), the peaking is removed and the response mimics the typical RC response obtained when a passive resistor is used in the load. Although both are without peaking, passive resistors are more linear and hence are expected to have somewhat less ISI than the case in which the active inductance is turned off.
The bandwidth of the driver, when the active inductance is turned on, is observed to be around 20 GHz in the simulation and is sufficient for the intended data rates in this work (5 Gb/s and 10 Gb/s). Note that this simulation is carried out with an output capacitance CPAD of 100 fF. A smaller bandwidth is expected if a larger CPAD is present. As shown in Fig. 3.9, by tuning VG2, the channel can be compensated by up to 3 dB and the peaking frequency can be varied between 2 GHz and 10 GHz.

3.4.2 Channel Frequency Response

Fig. 3.10. Test setup for the measurement of channel frequency response.

Fig. 3.11. Channels used for measuring the eye-diagrams at the receiver side: (a) a 6-inch FR4 board trace and (b) a 4-m RG-58/U cable with BNC connectors.

The test setup for measuring the channel frequency responses is depicted in Fig. 3.10. The two ports of the Agilent 8510C vector network analyzer (VNA) are connected to the two ends of the channel under test, i.e., the FR4 board trace or the RG-58/U coaxial cable shown in Fig. 3.11.

Fig. 3.12. Measured responses of a 6-inch FR4 trace and a 4-m RG-58/U cable.

Fig. 3.12 shows the measured frequency responses of the 6-inch FR4 trace and the 4-m long coaxial cable with BNC connectors. These two channels are used to demonstrate the tunable peaking that the active inductor circuit can provide to help compensate for the channel loss. For the FR4 trace and RG-58/U cable the losses are about 3.1 dB at 5 GHz and 6.5 dB at 2.5 GHz, respectively.

3.4.3 Eye-Diagram Measurement

Fig. 3.13. Test setup for eye-diagram measurements.

The test setup for the eye-diagram measurement is shown in Fig. 3.13.
The differential pseudo-random binary sequence (PRBS) with a magnitude of 300 mVp-p and a DC offset of 700 mV is generated by the pulse pattern generator (Anritsu MP1763B) and fed to the inputs of the driver via SMA cables. The four DC voltage supplies (VDD, VDD_TAIL, VDD_G2, and VDD_G4) are provided by four external power supplies. The outputs of the driver are connected to the oscilloscope input ports via the channels. The signals are terminated by the internal 50-Ω resistors of the oscilloscope.

Fig. 3.14. Received eye-diagrams for a 5 Gb/s 2^31-1 PRBS pattern through a 4-m RG-58/U cable: (a) active inductance off and (b) active inductance on.

Fig. 3.15. Received eye-diagrams for a 10 Gb/s 2^31-1 PRBS pattern through a 6-inch FR4 trace: (a) active inductance off and (b) active inductance on.

Figs. 3.14 and 3.15 show the received eye-diagrams when 5 Gb/s and 10 Gb/s 2^31-1 PRBS data patterns are transmitted by the output driver over the coaxial-cable and FR4 channels, respectively. In Figs. 3.14(a) and 3.15(a) the active inductances are turned off. In Figs. 3.14(b) and 3.15(b) the active inductances are turned on and tuned to give maximum gains at the data-rate frequencies. In Figs. 3.14(a) and 3.14(b) the vertical eye openings are 64 mV and 127 mV, respectively, with a differential swing slightly less than 300 mVp-p. The peak-to-peak jitter is about 0.41 UI in Fig. 3.14(a) and 0.285 UI in Fig. 3.14(b). The vertical eye openings in Figs. 3.15(a) and (b) are 67 mV and 133 mV with a differential swing of 300 mVp-p, and the peak-to-peak jitter values are 0.51 UI and 0.35 UI, respectively. The pair of active-inductor loads consumes 0.8 mW of overhead power compared to passive loads, due to the use of the level shifter. This overhead power is independent of the amount of tail current used by the driver.
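The measured numbers above can be turned into relative improvements and absolute jitter values with a few lines of arithmetic. The sketch below is only a convenience calculation; the helper names are illustrative, and the inputs are the eye openings and peak-to-peak jitter values quoted in the text.

```python
def unit_interval_ps(data_rate_gbps):
    """Bit time (one UI) in picoseconds for a data rate given in Gb/s."""
    return 1000.0 / data_rate_gbps


def summarize(data_rate_gbps, eye_off_mv, eye_on_mv, jitter_off_ui, jitter_on_ui):
    """Compare eye metrics with the active inductance off and on."""
    ui_ps = unit_interval_ps(data_rate_gbps)
    return {
        "eye_gain": eye_on_mv / eye_off_mv,
        "jitter_reduction": 1.0 - jitter_on_ui / jitter_off_ui,
        "jitter_off_ps": jitter_off_ui * ui_ps,
        "jitter_on_ps": jitter_on_ui * ui_ps,
    }


if __name__ == "__main__":
    cases = {
        "5 Gb/s, RG-58/U cable (Fig. 3.14)": (5.0, 64.0, 127.0, 0.41, 0.285),
        "10 Gb/s, FR4 trace (Fig. 3.15)": (10.0, 67.0, 133.0, 0.51, 0.35),
    }
    for label, values in cases.items():
        result = summarize(*values)
        print(f"{label}: eye x{result['eye_gain']:.2f}, "
              f"jitter -{result['jitter_reduction'] * 100:.0f}% "
              f"({result['jitter_off_ps']:.0f} ps -> {result['jitter_on_ps']:.0f} ps)")
```

For both channels the eye opening roughly doubles and the peak-to-peak jitter drops by about 30%, in line with the figures summarized in the abstract.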
3.4.4 Output Matching Measurement

Fig. 3.16. Test setup for S22 measurements.

The test setup for the output impedance-matching (S22) measurements is shown in Fig. 3.16. The differential inputs of the driver are connected to the same external DC supply for biasing. The other four DC bias voltages are connected in the same way as described in Section 3.4.3. One of the driver’s outputs is left open-circuited while the other connects directly to a VNA port through the DC-block capacitor. Note that the calibration kit available for this VNA at the time of measurement can only calibrate up to the 3.5-mm connector of the VNA cable. The remaining electrical connections through which the signal has to travel to reach the integrated circuit (IC), including the SMA-to-3.5-mm connectors and SMA cables, were not calibrated. Since the impedance of the active-inductor load varies with the bias current, as previously shown in Fig. 3.3, it is important to take S22 measurements with different DC currents flowing through the active inductor. When the inputs of the driver are biased by the same common-mode DC voltage, the differential pair is said to be balanced and the current through each active-inductor load is just half of the tail current ITAIL. ITAIL can be adjusted by tuning VDD_TAIL.

Fig. 3.17. Measured S22 of the output driver with different DC currents in the active-inductor load.

Figure 3.17 shows the measured S22 of the output driver when varying the currents in the active-inductor load. The driver has an ITAIL of 8 mA, and Rdeg is designed to have 2 mA flowing through it when ITAIL is completely switched by the differential drivers. This means that the current flowing through each branch of the active-inductor loads will vary between 2 and 6 mA (since the tail current is 8 mA) while generating the maximum output swing. As shown in Fig.
3.17, S22 of -10 dB or lower can be achieved when the current is confined to this range. An S22 of -10 dB signifies a 10% signal power reflection (or loss) at the output node of the circuit; in other words, 90% of the incident signal power is absorbed by the circuit. An impedance matching of -10 dB is a typical minimum requirement for most communication systems.

4 PROTOTYPE DESIGN II

The focus of this chapter is to demonstrate the bandwidth improvements and area savings (compared to the use of passive spiral inductors) brought by the use of active-inductor loads in an output buffer designed for high-speed I/O transmitters. The 4-stage cascaded output buffer with active-inductor loads was designed in STM’s 65-nm CMOS process (which was made available to the author some time after the submission of the 1st prototype chip) with a supply voltage of 1.2 V and has been taped out for fabrication. The chip is expected to return by November. Design and simulation results are presented in the following.

4.1 Output Buffer

4.1.1 Introduction

An output buffer enables internal logic circuits to drive the off-chip 50-Ω loads and the large capacitances seen at the outputs of an I/O transmitter. When incorporated into the design of high-speed multiplexers (MUX) in serial-link transmitters, output buffers can cut down the area and power consumption needed to implement the MUX. This is because the MUX can be designed using smaller circuits, since the core MUX circuit does not have to drive the off-chip 50-Ω load directly. An output buffer usually consists of a cascade of multiple driver stages that are tapered in device dimensions and bias current from the first stage to the last so as to maintain its bandwidth while delivering high output currents to its loads.
At speeds higher than 10 Gb/s, CML drivers are usually called for in the design of an output buffer, since they can achieve the highest speed of operation among all families of logic circuits owing to their current-steering and low-swing nature. CML drivers are particularly suitable for implementing active-inductor shunt peaking since the core driver only has two stacked transistors, one for the differential pair and one for the current source, which leaves voltage headroom for the active-inductor loads. Output buffers, particularly the ones used in high-speed parallel links, are excellent examples of where the area-saving benefits of using active inductors (as opposed to spiral inductors) can prove to be very significant, as a large number of such buffers (e.g., in 32-bit or 64-bit processor I/Os), each employing multiple stages of cascaded CML drivers, are typically needed.

4.1.2 Design Consideration

The power consumption and data rate are the two most important measures of performance, as well as tradeoffs, in the design of an output buffer. To achieve a reasonable balance between the two, a data rate of 30 Gb/s is targeted in this design. Output swing is another important performance metric since it directly trades with power consumption. The magnitude of the output swing is proportional to the vertical eye opening of the received signal at the receiver side of I/O links, thus a large maximum output swing is usually desired. Moreover, the output swing should ideally be controlled independently to facilitate the implementation of transmit pre-emphasis and scalable I/O’s [1]. The fanout (FO) of each stage in the output buffer is dependent on the bandwidth requirement of that stage. The number of stages needed in a tapered buffer is dependent on the total fanout ratio FOtot (ratio between the output capacitance seen by the last stage and the input capacitance of the first stage).
A larger FOtot allows the internal logic or MUX circuits preceding the output buffer to be made smaller so as to save power and area for a given output capacitive load CPAD.

4.1.3 Architecture

The architecture used in the output buffer design is illustrated in Fig. 4.1. To allow a controllable output swing, the main driver of the output buffer uses only passive resistor loads and has a separate bias voltage VTAIL2 for its tail current transistor. The pre-driver consists of three cascaded drivers, each employing active-inductor loads that are represented by a pair of resistors (Rs) and inductors (L) enclosed by dashed boxes. The three current sources in the pre-driver are biased by the same voltage VTAIL1. Note that the RC time constant seen at the output node of the main driver is relatively small (R = 25 Ω due to the parallel combination of the transmit-side and receive-side 50-Ω terminations, and a small CPAD of 250 fF), thus it has a sufficiently large bandwidth even without inductive peaking.

Fig. 4.1. Output buffer architecture.

4.2 Active-Inductor Load

4.2.1 Topology

Fig. 4.2. Topology of the NMOS-based active-inductor load.

Fig. 4.2 depicts the NMOS-based active-inductor load with VHIGH that was discussed in Section 2.4.1, except that the passive resistor RG is replaced by a PMOS active resistor M2. This active-inductor topology is chosen as the load for the three stages of the pre-driver to implement active shunt peaking. The advantages of this topology are higher speed, due to the higher charge-carrier mobility of NMOS transistors as compared to PMOS transistors, and zero power overhead.
The only power overhead is that needed to power the voltage-boosting circuitry that might be required to generate VHIGH. It is also possible to provide VHIGH from off-chip at the cost of an additional pin. SoC designs very often have multiple VDDs, and the ones used in the I/Os are often higher than those used in the logic cores. To comply with the signaling levels required by different I/O standards, supply voltages as high as 1.8 V and 3.3 V are usually available even in designs realized in more advanced CMOS processes such as the 90-nm and 65-nm nodes. In such cases voltage-boosting circuitry is not necessary and the additional power cost of using NMOS-based active inductors is further reduced.

4.2.2 Biasing Voltage

Fig. 4.3. gm1 (mS) vs. VGS1 (V) for VDS = 0.2 V, 0.4 V, and 0.6 V, with (W1/L1) = (12 µm/60 nm).

By inspecting (2.10) and (2.12), it can be found that the resonant frequency ω0 of the shunt-peaking active inductor is proportional to gm1 for a given M1 transistor size and a given inductance value L (in the active inductor). The gate biasing voltage VHIGH of the M1 transistor should therefore be chosen to optimize gm1, while ensuring that the gate-source voltage VGS of M1 (which is VHIGH - Vout) does not exceed 1.2 V (the supply voltage VDD) at any time during operation, so as not to damage the thin gate oxide of the transistor. A single NMOS transistor M1 is simulated in the 65-nm design kit using Cadence Spectre with models provided by the foundry. Fig. 4.3 plots gm1 versus VGS1 at different VDS1 for an M1 transistor with a drawn gate width and length of 12 µm and 60 nm, respectively. For all three curves, gm1 peaks at around VGS1 = 0.8 V and remains relatively constant (except for VDS = 0.6 V, where it drops a little) up to VGS1 = 1.2 V.
Note that the drop in the VDS1 = 0.6 V curve is attributed to the M1 transistor starting to enter the triode region of operation as its overdrive voltage (VGS1 - Vth1) exceeds its VDS1,sat. This figure suggests that the optimal VGS1 value is around 1 V for an output swing of ±200 mV (ΔVout) in the driver (the source of the M1 transistor is the same node as Vout of the driver). The DC output level and output swing in the pre-driver should be set to around 0.8 V and ±200 mV (which corresponds to 0.2 V ≤ VDS1 ≤ 0.6 V), respectively. This implies that a bias voltage VHIGH of 1.8 V is required to obtain the optimal VGS1.

4.3 Design Procedure

4.3.1 Main Driver

The loads of the main driver are two 50-Ω poly resistors that provide output impedance matching. In a multi-Gb/s link the receiver almost always has 50-Ω on-die terminations to minimize reflection and signal power loss. Therefore the main driver at the transmitter effectively sees a 25-Ω load at its outputs. The tail current of the main driver is determined by dividing the desired output swing by the 25-Ω load (i.e., ITAIL4 = (2×ΔVout)/25 Ω). The output swing of the main driver is designed to be maximized (±350 mV, i.e., larger than that of the pre-driver) to increase the overall signal integrity. The maximum output swing it can achieve is largely limited by the modest input drive voltage of ±200 mV provided by the pre-driver. The lower swing level in the pre-driver helps lower the overall power consumption at the cost of a reduced noise margin. To deliver such a large current to its loads, the differential pair transistors Mdiff4 in the main driver have to be sized sufficiently large. The large input capacitances associated with the main driver could limit the bandwidth of its preceding stages.

4.3.2 Pre-Driver

To drive the large input capacitances of the main driver, the pre-driver is tapered into three cascaded stages.
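The pre-driver bias point derived in Section 4.2.2 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original design flow: the function name `bias_window` and the use of integer millivolts are assumptions made here, and the drain of M1 is assumed to sit at VDD as drawn in Fig. 4.2. It takes the 0.8 V common-mode level, the ±200 mV swing, and the optimal VGS1 of about 1 V, and returns the required VHIGH together with the VGS1 and VDS1 ranges seen by M1.

```python
def bias_window(v_cm_mv, swing_mv, vgs_opt_mv, vdd_mv):
    """Return VHIGH and the VGS1/VDS1 ranges of M1, all in millivolts.

    Assumes the source of M1 is the driver output and its drain is tied to VDD.
    Integer millivolts are used to avoid floating-point rounding at the limits.
    """
    v_high = v_cm_mv + vgs_opt_mv              # gate bias that centers VGS1 on its optimum
    vout_min = v_cm_mv - swing_mv
    vout_max = v_cm_mv + swing_mv
    return {
        "v_high_mv": v_high,
        "vgs_range_mv": (v_high - vout_max, v_high - vout_min),
        "vds_range_mv": (vdd_mv - vout_max, vdd_mv - vout_min),
        # Thin-oxide constraint from the text: VGS1 must not exceed VDD.
        "oxide_safe": (v_high - vout_min) <= vdd_mv,
    }


if __name__ == "__main__":
    # Values quoted in Section 4.2.2: 0.8 V common mode, +/-200 mV swing,
    # optimal VGS1 of about 1 V, and a 1.2 V supply.
    print(bias_window(v_cm_mv=800, swing_mv=200, vgs_opt_mv=1000, vdd_mv=1200))
```

With the values from the text, this gives VHIGH = 1.8 V, VGS1 between 0.8 V and 1.2 V, and VDS1 between 0.2 V and 0.6 V, so the gate-oxide limit is met with no margin at the lowest output level.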
When N stages are cascaded as an amplifier (e.g., in limiting amplifiers), the bandwidth of each stage has to be made larger than the overall bandwidth of the amplifier. Assuming each stage has the same -3-dB bandwidth ω0 and the overall circuit has a -3-dB bandwidth of ω-3dB, the relationship between the two is given by [26]:

$\omega_{-3\mathrm{dB}} = \omega_0 \sqrt{2^{1/N} - 1}$  (4.1)

Thus, for the 3-stage pre-driver,

$\omega_{-3\mathrm{dB}} \approx 0.51\,\omega_0$  (4.2)

This implies that for the pre-driver to operate at the maximum data rate with reasonable ISI, each stage has to have a -3-dB bandwidth around twice that of the entire pre-driver circuit. To increase the overall bandwidth of the circuit, active-inductor shunt peaking is employed in all three stages. The resistance Rs in the active-inductor load of each stage is chosen based on the load capacitance CL and the bandwidth requirement of that stage. From Rs and the voltage swing (±200 mV), the tail current of each stage can be determined. The driver transistors of each stage are sized to guarantee that the desired output swing can be achieved and propagated down the driver chain. Since the gain does not need to be particularly high (slightly higher than unity in this case), the degree to which the output common-mode levels in the 3-stage pre-driver can fluctuate in the presence of PVT variations is also limited. Thus, common-mode feedback is not required. The inductance L of the active-inductor load is designed to give a maximally flat response ($L = 0.4\,R_s^2\,C_L$) [18] in each stage.

4.3.3 Tail Current

Due to the use of active-inductor loads in the pre-driver, special attention has to be given to the output common-mode level, as it is more sensitive to PVT variations, mainly because of the non-linear I-V characteristics of the active-inductor loads.
As such, the output impedance of the tail current transistors in each stage of the output buffer has to be made high (by using longer channels and allocating sufficient voltage headroom to the tail current transistors) so as to minimize the ITAIL variation caused by changes in the common-mode output level of the previous stage. The two biasing voltages for the tail current transistors, VTAIL1 and VTAIL2, are provided by the same current-mirror circuits as described in Section 3.3.2. The schematic of the pre-driver and main driver is shown in Fig. 4.4. Transistor sizes, the Rs and L values of each active-inductor load, the CPAD value, and the tail current of each driver are labeled in the schematic as well.

Fig. 4.4. Output buffer with transistor sizes, tail currents, CPAD and Rs and L values of active-inductor loads labeled. [Labeled values, pre-driver stages 1 to 3: L = 1.6 nH, 0.6 nH, 120 pH; Rs = 570 Ω, 175 Ω, 50 Ω; Mdiff W/L = 4 µm/60 nm, 15 µm/60 nm, 50 µm/60 nm; tail currents 780 µA, 2.5 mA, 8 mA. Main driver: 50-Ω loads, Mdiff W/L = 108 µm/60 nm, tail current 24 mA, CPAD = 250 fF. Supply voltage 1.2 V.]

4.3.4 Active-Inductor Load

As shown in Fig. 4.5, output degeneration resistors Rdeg are used in all three stages of the pre-driver to improve the linearity of the active-inductor loads. The width of the M1 transistor in each stage's active inductor is sized to exhibit an effective resistance Rs,eff that results in the desired Rs shown in Fig. 4.4 (i.e., Rs = Rs,eff // 0.5Rdeg). Minimum length is used for the M1 transistors of all three stages. The inductance L of the active-inductor load can be independently tuned through VG2, which controls the resistance of the active resistor implemented by the M2 transistor.
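The load values labeled in Fig. 4.4 can be checked against the maximally flat condition L = 0.4 × Rs² × CL quoted in Section 4.3.2. The sketch below is a back-of-the-envelope check rather than the design procedure actually used: the helper names are hypothetical, the implied CL is simply solved from the quoted condition, and the swing estimate assumes that the tail current is fully steered into one load of resistance Rs.

```python
# (Rs in ohms, L in henries, tail current in amperes) for pre-driver
# stages 1 to 3, as labeled in Fig. 4.4.
STAGES = [
    (570.0, 1.6e-9, 780e-6),
    (175.0, 0.6e-9, 2.5e-3),
    (50.0, 120e-12, 8e-3),
]


def implied_load_capacitance(rs, inductance):
    """Solve the maximally flat condition L = 0.4 * Rs^2 * CL for CL (farads)."""
    return inductance / (0.4 * rs ** 2)


def single_ended_swing(rs, i_tail):
    """Peak-to-peak single-ended swing if the whole tail current flows in one load."""
    return rs * i_tail


if __name__ == "__main__":
    for stage, (rs, inductance, i_tail) in enumerate(STAGES, start=1):
        cl_ff = implied_load_capacitance(rs, inductance) * 1e15
        swing_mv = single_ended_swing(rs, i_tail) * 1e3
        print(f"stage {stage}: implied CL = {cl_ff:.1f} fF, swing = {swing_mv:.1f} mV p-p")
```

Under these assumptions the implied load capacitances come out at roughly 12 fF, 49 fF, and 120 fF, and each stage swings about 400 to 450 mV peak-to-peak single-ended, which is consistent with the ±200 mV target.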
To reduce the number of pads needed for biasing voltages (since the chip will be tested using on-chip probing and the number of pins on the probe is limited), the tuning voltage VG2 is shared by all three active resistors and is provided directly from off-chip. Thus, the M2 transistors in all three stages have to be sized in proportion to their desired L values for a given VG2 voltage. Fig. 4.5 shows the three-stage pre-driver with active-inductor loads, with transistor sizes, Rdeg, and VHIGH labeled.

Fig. 4.5. Three-stage pre-driver with active-inductor loads and transistor sizes, tail currents and Rdeg values labeled. [Legible labels include VHIGH = 1.8 V, VDD = 1.2 V, a shared VG2, Rdeg values of 1300 Ω, 5000 Ω, and 400 Ω, tail currents of 870 µA, 2.7 mA, and 7.8 mA, and tail transistors of 40 µm/0.4 µm, 120 µm/0.4 µm, and 340 µm/0.4 µm.]

Fig. 4.6. Layout of the output buffer with active-inductor loads (130 µm × 30 µm).

The final layout of the entire output buffer with active-inductor loads in the pre-driver stages is shown in Fig. 4.6. It occupies an area of 130 µm × 30 µm, which is about the size of a typical spiral inductor with an inductance of 0.5 nH to 1 nH.

4.4 Simulation Results

To evaluate the effectiveness of the bandwidth enhancement gained from active-inductor shunt peaking, as opposed to passive-inductor shunt peaking or no shunt peaking, pre-drivers using passive-resistor loads and passive-inductor loads are also designed and simulated. The values of the load resistors and inductors used in the passive-load implementations are the same as those shown in Fig. 4.4. In an attempt to make the comparisons fair, all three implementations have the same tail currents, output swings, and device dimensions in the differential pairs and current source transistors, in each stage.
The only difference is that with a swing of ±200 mV, the passive implementations have an output common-mode level of 1 V while the active implementation has an output common-mode level of about 0.8 V. The additional 200 mV of voltage headroom is used to allow a more optimal biasing of the active-inductor loads so that they can achieve a higher speed and operate more linearly. The tradeoff is the reduced voltage headroom for the tail current transistors MTAIL in the pre-driver using active-inductor loads; the bias voltage for these MTAIL transistors is thus adjusted to match the tail current levels of the two pre-drivers using passive loads. All simulations are done using Cadence Spectre at the schematic level and are presented in the following.

4.4.1 DC Transfer Characteristics

To gain insight into how the use of active-inductor loads, as opposed to passive loads, changes the DC transfer characteristics of the pre-driver, the DC input voltage is swept and the output voltages and currents of the circuit are recorded.

Fig. 4.7. Input-output characteristic (pre-driver with active-inductor loads): Vout1 and Vout2 (V) vs. Vin1-Vin2 (V), with the common-mode level marked.

The input/output characteristic of the three-stage pre-driver with active-inductor loads is shown in Fig. 4.7. This graph confirms that the pre-driver has a large-signal gain of about unity. With a differential input voltage of 400 mV applied to the pre-driver, the differential swing generated at its outputs is also about 400 mV. The lowest level that the output signal can reach is limited to above 600 mV because the common-mode level is designed to be slightly higher than 800 mV.
The intent of making the CM level slightly higher is to reduce the chance that the gate-source voltage (VGS) of the M1 transistor exceeds 1.2 V in the presence of PVT variations (note that VHIGH is 1.8 V, so Vout must not go below 600 mV).

Fig. 4.8. ID vs. Vin_diff (pre-driver with active-inductor loads): ID1 and ID2 (mA) vs. Vin1-Vin2 (V).

Fig. 4.8 shows the changes in the drain currents of the M1 transistors in the active-inductor loads of the 3rd-stage pre-driver vs. the differential input swept at the 1st-stage pre-driver. The tail current in the 3rd stage is designed to be around 7.8 mA when the differential pair is balanced, that is, when the differential input is zero. As can be seen from Fig. 4.8, the tail current is not completely switched between the two active-inductor loads, even at a high level of signal input. Nonetheless, the tail current is completely switched between the driver's differential pair transistors Mdiff. This phenomenon is mainly due to the presence of the degeneration resistors Rdeg, through which a portion of the switching current detours from the higher-potential node to the lower-potential node of the differential outputs when they are unbalanced. In this way the current flowing into one of the Mdiff transistors is always the sum of the current from the active-inductor load in its branch and the current diverted from the active-inductor load of the opposite branch via Rdeg. As described in Section 3.2.3, the purpose of the output degeneration resistor is to moderate the impedance variation in the active-inductor loads.

Fig. 4.9. Differential Gm (mS) vs. Vin_diff (pre-driver with active-inductor loads).

Shown in Fig. 4.9 is the total transconductance Gm (i.e., $\partial \Delta I_{D4} / \partial \Delta V_{in1}$) of the pre-driver versus the differential input voltage to the pre-driver.
Similar to a typical differential pair with resistive loads, Gm peaks when the circuit is balanced and decays to zero as the current switches from one branch to the other.

4.4.2 Frequency Response

Fig. 4.10. Frequency responses (gain in dB vs. frequency in Hz) of the entire pre-drivers and 1st-stages designed using different loads (active inductor, passive resistor, passive inductor), with the -3-dB points marked.

As shown in Fig. 4.10, the small-signal bandwidth of the pre-driver (when every differential driver stage is balanced) gives a quick estimate of, and comparison between, the maximum speeds that the three output buffers can achieve: two with bandwidth enhancement (through active-inductor loads and passive-inductor loads) and one without (using only passive-resistor loads). The -3-dB bandwidth of the pre-driver using active-inductor loads is about 11 GHz, slightly better than the 10 GHz obtained when passive-inductor loads are used. The pre-driver designed with passive-resistor loads has a -3-dB bandwidth of 7.5 GHz. Note that the pre-driver using active-inductor loads has a low-frequency gain 1 dB lower than that of the two pre-drivers using passive loads. This does not indicate a lower transient gain because, in large-signal operation, the small-signal magnitude responses of all three cases change constantly with their bias points and thus cannot represent the actual (large-signal) transient voltage gain of the pre-drivers. For example, at bias points other than the balanced one, the pre-driver with active-inductor loads might have higher small-signal gains than the pre-drivers with passive loads, such that the overall gains of the three pre-drivers end up the same in the transient responses. The transient responses of the three buffer circuits are presented in the following sub-section.
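Eq. (4.1), ω-3dB = ω0 × sqrt(2^(1/N) - 1), offers a quick way to relate the overall bandwidths quoted above to the per-stage bandwidth that N identical cascaded stages would need. The following sketch is an estimate under that identical-stage assumption, which the tapered pre-driver only approximates; the function names are hypothetical.

```python
import math


def bandwidth_shrinkage(n_stages):
    """Ratio of overall to per-stage -3-dB bandwidth for N identical stages, per Eq. (4.1)."""
    return math.sqrt(2 ** (1 / n_stages) - 1)


def required_stage_bandwidth(overall_bw_hz, n_stages):
    """Per-stage -3-dB bandwidth needed to reach a given overall bandwidth."""
    return overall_bw_hz / bandwidth_shrinkage(n_stages)


if __name__ == "__main__":
    # Overall pre-driver bandwidths reported in Section 4.4.2.
    overall = {
        "active inductor": 11e9,
        "passive inductor": 10e9,
        "passive resistor": 7.5e9,
    }
    print(f"shrinkage factor for 3 stages: {bandwidth_shrinkage(3):.2f}")
    for load, bw in overall.items():
        per_stage_ghz = required_stage_bandwidth(bw, 3) / 1e9
        print(f"{load}: about {per_stage_ghz:.1f} GHz per stage")
```

For three stages the factor is about 0.51, so an overall bandwidth of 11 GHz corresponds to roughly 21.6 GHz per stage if all stages were identical.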
The frequency responses of the 1st stages of the pre-drivers are also plotted in Fig. 4.10. The -3-dB bandwidths when active-inductor, passive-inductor, and passive-resistor loads are used are about 20 GHz, 23 GHz, and 17 GHz, respectively. Note that under large-signal operation, the bandwidth of the drivers using active-inductor loads varies more than that of the drivers using passive loads. This is because both the resistance Rs of the active inductor and the output resistance r0 of the driver transistor change with the output level. When passive loads are used, only r0 changes. Therefore, to predict more accurately the speed of the circuits using active-inductor loads under large-signal operation, an analysis of the transient responses is helpful and is presented in the next sub-section.

4.4.3 Step Response

Fig. 4.11. Step response (rising) of the 1st-stage pre-driver: voltage (V) vs. time (ps) for the input step and the active-inductor, passive-resistor, and passive-inductor loads.

Fig. 4.12. Step response (falling) of the 1st-stage pre-driver.

To examine the differences in the large-signal transient responses between the pre-drivers designed using active-inductor, passive-resistor, and passive-inductor loads, two input step voltages of 400 mV (peak-to-peak swing), one with a rise time of 5 ps and the other with a fall time of 5 ps, are applied to the inputs of the three pre-drivers. Fig. 4.11 shows the transient responses of the 1st-stage pre-drivers in all three cases when the rising input step is applied. The 10%-90% rise times of the output signals when active-inductor and passive-resistor loads are used are about the same (~26 ps). The transient responses when a falling input step is applied are plotted in Fig. 4.12.
The 90%-10% fall time of the output response in the case of the active-inductor load is about the same as that of the passive-inductor load (~13 ps). The response of the 1st-stage pre-driver using active-inductor loads to the falling input step is about twice as fast as its response to the rising input step. This can be attributed to the reduced output RC time constant, as Rs,eff of the active-inductor load decreases when the output level swings down (i.e., r0 of the driver transistors Mdiff decreases).

Fig. 4.13. Step response (rising) of the entire pre-driver.

Fig. 4.14. Step response (falling) of the entire pre-driver.

Fig. 4.13 and 4.14 illustrate the rise-time and fall-time step responses of the three pre-drivers. Both the rise-time and fall-time performances of the active-inductor load implementation lie about midway between those of the passive-inductor and passive-resistor implementations.

4.4.4 Eye-Diagram

Fig. 4.15. Eye-diagram of the output buffer using active-inductor loads at 31.25Gb/s (vertical eye opening 710 mV, peak-to-peak jitter ~1.4 ps).

Fig. 4.16. Eye-diagram of the output buffer using passive-inductor loads at 31.25Gb/s (vertical eye opening 760 mV, peak-to-peak jitter ~1.9 ps).

Fig. 4.17. Eye-diagram of the output buffer using passive-resistor loads at 31.25Gb/s (vertical eye opening 515 mV, peak-to-peak jitter ~5.3 ps).

The eye-diagram simulations are performed by applying a 31.25 Gb/s (2^31-1) PRBS with ±200 mV magnitude, generated by a Matlab script written by the author, to the inputs of the three output buffers. The eye-diagrams of the output buffers using active-inductor, passive-inductor, and passive-resistor loads in their pre-drivers are shown in Fig.
4.15, 4.16, and 4.17, respectively. Table 4.1 summarizes the eye-diagram simulation results of the three cases as well as the relative differences between the active-inductor and passive-inductor cases and between the active-inductor and passive-resistor cases.

Table 4.1. Summary of the eye-diagram simulation.

                              Vertical Eye Opening    Jitter (p-p)
I. Active-Inductor Load       710 mV                  ~1.4 ps
II. Passive-Inductor Load     760 mV                  ~1.9 ps
III. Passive-Resistor Load    515 mV                  ~5.8 ps
I. relative to II.            -6.5%                   +36%
I. relative to III.           +37.8%                  +340%

To illustrate the speed improvement obtained when active-inductor loads are used, as compared to when passive-resistor loads are used, the output buffer using only passive-resistor loads is simulated with the same PRBS input pattern at lower data rates. It has been found that the buffer using passive-resistor loads operating at 25 Gb/s has an output eye comparable to that of the buffers using active-inductor or passive-inductor loads operating at 31.25 Gb/s. The output eye-diagram of the buffer with passive-resistor loads operating at 25 Gb/s is shown in Fig. 4.18. The vertical eye opening is about 740 mV and the peak-to-peak jitter is approximately 2.5 ps. The speed improvement of the active-inductor loads over the passive-resistor loads is about 25%, while the jitter of the buffer with active-inductor loads (~1.4 ps) is about 78% less than that of the buffer with passive-resistor loads (~2.5 ps).

Fig. 4.18. Eye-diagram of the output buffer using passive-resistor loads at 25Gb/s (vertical eye opening 740 mV, peak-to-peak jitter ~2.5 ps, 40 ps interval marked).

Table 4.2 summarizes the performance of the output buffer designed using active-inductor loads and compares the results with other output buffers reported in the literature that are used in high-speed data transmitters.

Table 4.2. Performance summary of the output buffer designed with active-inductor loads and comparisons with other reported output buffers.
                          This workα      [27]β    [28]β    [29]β
CMOS Technology           65-nm           80-nm    90-nm    0.15-µm
VDD (V)                   1.2             1        1.2      1.5
Data Rate (Gb/s)          31.25           40       40       20
Power (mW)                41.2            24       50.4     45
Differential Vout (mV)    710             660      400      900
Fanout ratioλ             41.6            23.75    8γ       *
# of spiral inductors     0               10       8        4
Area                      130µm × 30µm    *        *        *

α Simulated. β Measured. γ Estimated. * Not specified. λ Ratio between CL and Cin (the input capacitance of the 1st-stage pre-driver) of the output buffer (i.e., 250 fF/6 fF).

5 CONCLUSIONS

5.1 Conclusion

This thesis explores the use of active inductors as a compact alternative to passive spiral inductors for implementing shunt peaking to extend on-chip bandwidth and for providing equalization of channel loss through tunable peaking. A novel PMOS-based active-inductor topology that can operate with low voltage headroom and requires no voltage boosting has been proposed. The first prototype design is an output driver circuit that uses the PMOS-based active inductor as its termination load and is implemented in a 90-nm CMOS process. The peaking frequency of the active-inductor circuit and the corresponding peaking magnitude can be adjusted to facilitate channel loss compensation. Operating at 10 Gb/s over a 6-inch FR4 channel, the use of active inductors in the transmit-side termination, as compared to the case when the active-inductor structure is disabled, increases the vertical eye opening at the receiver side by a factor of two and reduces the received peak-to-peak jitter by 30%. Appropriate output impedance matching, with S22 less than -10 dB, is achieved by the active-inductor termination. The pair of active-inductor circuits occupies 17 × 25 µm2 and has a low overhead power consumption of 0.8 mW. The second prototype design is a 4-stage output buffer using NMOS-based active-inductor shunt peaking for bandwidth extension and has been taped out for fabrication in a cutting-edge 65-nm CMOS process.
Through simulations it was verified that the output buffer with active-inductor shunt peaking compares favorably with the one using passive-inductor shunt peaking. With active-inductor shunt peaking, the peak-to-peak jitter is more than three times lower and the vertical eye opening is more than 30% larger than when no shunt peaking is used. When operating at a 25% higher data rate than the buffer designed using passive-resistor loads (i.e., 31.25 Gb/s vs. 25 Gb/s), the buffer using active-inductor loads achieves a similar vertical eye opening and 78% better jitter performance. The output buffer with active-inductor shunt peaking achieves a data rate of 31.25 Gb/s, a peak-to-peak output swing of 710 mV, and a total fanout ratio of 41.6 while dissipating 41.2 mW and occupying an area of 130 × 30 µm2.

5.2 Contributions

The main contributions of this thesis are summarized as follows:

• Introduction of a novel active inductor topology
    - Low voltage headroom (200 ~ 300 mV)
    - Needs no voltage boosting and consumes little power overhead (0.8 mW)
    - Published in IEEE International Symposium on Circuits & Systems (ISCAS) 2008
• First to propose using active inductors for high-speed I/O terminations
    - Increases received voltage margin by a factor of two over a 6-inch FR4 channel
    - Achieves S22 of -10 dB and better
    - Accepted by IEEE Asian Solid-State Circuits Conference (A-SSCC) 2008
• Design of three-stage pre-driver using active-inductor loads in a 30 Gb/s output buffer
    - Fastest shunt-peaking active inductors reported to date (i.e., 40 Gb/s in each stage in simulation)
    - Speed improves by 25% while having 78% less jitter compared to the same output buffer designed using passive-resistor loads
    - Consumes no extra power compared to when passive loads are used

5.3 Future Work

For active inductors to be reliably deployed in high-speed I/O circuits, the robustness of such circuits in the presence of PVT variations has to be more
thoroughly investigated, especially when implemented in very deep-submicron (DSM) CMOS technologies (i.e., 65 nm and below) in which process variations are getting much worse. This is thus one of the most important follow-up works to be considered. Another item of future work is testing the 2nd prototype design presented here (the 4-stage output buffer using active-inductor loads) when the chip returns in November. Also of interest is applying the proposed active-inductor shunt-peaking technique to other speed-critical I/O building blocks such as MUX and DEMUX. Most MUXes and DEMUXes operating at 10 Gb/s or higher employ the tree architecture [26], in which CML latches and selectors are used extensively as constituent sub-circuits. Both the CML latch and the CML selector need three stacked transistors excluding the loads (a CML driver needs only two). It is thus expected to be more challenging to use active-inductor loads in these structures due to the reduced voltage headroom.

REFERENCES

[1] G. Balamurugan et al., “A Scalable 5–15 Gbps, 14–75 mW Low-Power I/O Transceiver in 65 nm CMOS,” IEEE Journal of Solid-State Circuits, vol. 43, pp. 1010-1019, Apr. 2008.
[2] S. Vangal et al., “An 80-Tile 1.28 TFLOPS Network-on-Chip in 65nm CMOS,” IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 98-589, Feb. 2008.
[3] J. Dorsey et al., “An Integrated Quad-Core Opteron Processor,” IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 102-103, Feb. 2008.
[4] U. Nawathe et al., “An 8-Core 64-Thread 64b Power-Efficient SPARC SoC,” IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 108-590, Feb. 2008.
[5] http://www.comscore.com/press/release.asp?press=1242.
[6] T. Suzuki et al., “A 50-Gbit/s 450-mW Full-Rate 4:1 Multiplexer with Multiphase Clock Architecture in 0.13-μm InP HEMT Technology,” IEEE Journal of Solid-State Circuits, vol. 42, pp. 637-646, 2007.
[7] H.
Takauchi et al., “A CMOS Multichannel 10-Gb/s Transceiver,” IEEE Journal of Solid-State Circuits, vol. 38, pp. 2094-2100, 2003.
[8] V. Stojanovic and M. Horowitz, “Modeling and Analysis of High-Speed Links,” Proceedings of the IEEE Custom Integrated Circuits Conference (CICC), pp. 589-594, 2003.
[9] S. Galal and B. Razavi, “40-Gb/s Amplifier and ESD Protection Circuit in 0.18µm CMOS Technology,” IEEE Journal of Solid-State Circuits, vol. 39, pp. 2389-2396, 2004.
[10] Jri Lee, “High-Speed Circuit Designs for Transmitters in Broadband Data Links,” IEEE Journal of Solid-State Circuits, vol. 41, pp. 1004-1015, 2006.
[11] R. Sun et al., “A Tunable Passive Filter for Low-Power High-Speed Equalizers,” Symposium on VLSI Circuits, Digest of Technical Papers, p. 198, 2006.
[12] Jian-Hao Lu, Chi-Lun Luo, and Shen-Iuan Liu, “A Passive Filter for 10-Gb/s Analog Equalizer in 0.18-μm CMOS Technology,” IEEE Asian Solid-State Circuits Conference (A-SSCC), pp. 404-407, 2007.
[13] B. Stackhouse et al., “A 65nm 2-Billion-Transistor Quad-Core Itanium Processor,” IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 92-598, 2008.
[14] O. Takahashi et al., “Migration of Cell Broadband Engine™ from 65nm SOI to 45nm SOI,” IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 86-597, 2008.
[15] G. Gammie et al., “A 45nm 3.5G Baseband-and-Multimedia Application Processor using Adaptive Body-Bias and Ultra-Low-Power Techniques,” IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 258-611, 2008.
[16] M. Naruse et al., “A 65nm Single-Chip Application and Dual-Mode Baseband Processor with Partial Clock Activation and IP-MMU,” IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 260-612, 2008.
[17] Jaeha Kim et al., “Circuit Techniques for a 40Gb/s Transmitter in 0.13µm CMOS,” IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 150-589, 2005.
[18] E. Säckinger, Broadband Circuits for Optical Fiber Communication, John Wiley & Sons, Inc., 2005.
[19] E. Säckinger and W. Fischer, “A 3 GHz, 32 dB CMOS Limiting Amplifier for SONET OC-48 Receivers,” IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 158-159, 2000.
[20] Chia-Hsin Wu, Jieh-Wei Liao, and Shen-Iuan Liu, “A 1V 4.2mW Fully Integrated 2.5Gb/s CMOS Limiting Amplifier using Folded Active Inductors,” Proceedings of the International Symposium on Circuits and Systems (ISCAS), pp. I-1044-7, 2004.
[21] T.H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits, Cambridge University Press, 2003.
[22] Y.-S.M. Lee and S. Mirabbasi, “Design of an Active-Inductor-Based Termination Circuit for High-Speed I/O,” Proceedings of the International Symposium on Circuits and Systems (ISCAS), pp.061-64 Vol.1; 2008.
[23] C.K. Yang, “Design of High-Speed Serial Links in CMOS,” Ph.D. Dissertation, Stanford University, Stanford, CA, Dec. 1998.
[24] K. Tam, “High-Speed Differential Logic Buffer,” U.S. Patent 7,236,011 B2, June 26, 2007.
[25] B. Razavi, Design of Analog CMOS Integrated Circuits, McGraw-Hill, 2000.
[26] B. Razavi, Design of Integrated Circuits for Optical Communications, McGraw-Hill, 2002.
[27] G. Sialm et al., “40 Gbit/s Limiting Output Buffer in 80 nm CMOS,” Electronics Letters, vol. 41, pp. 1051-1053, 2005.
[28] K. Kanda et al., “40Gb/s 4:1 MUX/1:4 DEMUX in 90 nm Standard CMOS,” IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pp. 152-590, 2005.
[29] M. Vadipour and J. Savoj, “A Low-Power 20-Gb/s CMOS 2:1 Multiplexer/Driver,” Proceedings of the European Solid-State Circuits Conference (ESSCIRC), pp. 231-234, 2002.
