RADIO FREQUENCY CMOS: FROM ULTRA-HIGH SPEED TO ULTRA-LOW POWER
by
AMIR HOSSEIN MASNADI SHIRAZI NEJAD
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (ELECTRICAL AND COMPUTER ENGINEERING)
THE UNIVERSITY OF BRITISH COLUMBIA (VANCOUVER)
MARCH 2018
© AMIR HOSSEIN MASNADI SHIRAZI NEJAD 2018

ABSTRACT
Over the last three decades, radio-frequency (RF) Complementary Metal-Oxide-Semiconductor (CMOS) electronics has had a huge impact on our world. Wireless Local Area Networks (WLANs), cellular networks, Global Positioning Systems (GPSs), and Bluetooth are a few examples where the impact of RF CMOS has led to rapid adoption and standardization of the technology. However, there still exist several challenging areas at the intersection of RF and CMOS where new paradigms must be established. This thesis summarizes the research to meet those goals, as briefly described here:
Research during the past decades provided CMOS solutions to RF applications that utilize the frequency spectrum up to 6 GHz. However, efficient system integration of mm-wave and THz in CMOS is still a challenging task. The THz spectrum is gaining interest due to its wider and less populated available spectrum, as well as its intriguing applications in molecular spectroscopy, imaging, and sensing. This band, although very useful, has been difficult to exploit in hardware because of the limitations of CMOS electronics. In the first four chapters of this thesis, we investigate the challenge of implementing signal sources at mm-wave and sub-THz frequencies using low-cost and versatile CMOS circuits, replacing the existing expensive solutions.
Demand for embedded low-power electronics for wireless connectivity is growing due to the rapid proliferation of the Internet of Things (IoT).
Although Wireless Sensor Networks (WSNs) have been around for decades, some applications such as biomedical monitoring systems require ultra-low-power (ULP) and cost-effective wireless solutions. Research on energy-harvesting systems (e.g., RF energy harvesting, thermoelectric, etc.) and integrated circuits (ICs) bears the promise of medium-reach battery-free wireless connectivity solutions. In Chapters 5 and 6 of this thesis, multiple ULP wireless connectivity solutions for both commercial standards such as Bluetooth Low Energy (BLE) and custom-designed application-specific radios are proposed and implemented in 40nm and 130nm CMOS technologies, respectively.
Finally, the application of RF electronics in power electronics is studied in the last chapter. Although power-management integrated circuits (PMICs) are a well-developed field of research, PMICs still have bottlenecks (e.g., die area and output ripple) which can be addressed with the knowledge of RF electronics. In this thesis, the feasibility of GHz-range converters is studied.

LAY SUMMARY
This thesis summarizes our efforts to push the limits of electronic design in Complementary Metal-Oxide-Semiconductor (CMOS) technology towards achieving higher speed, lower power consumption, and more compact integration. The proof-of-concept prototypes designed as part of this research on cost-effective CMOS platforms have the promise to enable: (a) very-high-frequency (30 GHz to 300 GHz) applications in imaging, sensing, and communications for medical, security, and internet systems, (b) ultra-low-power wireless connectivity for longer operation of battery-powered/battery-less sensors and links for the Internet of Things (IoT), and (c) lower cost, size, and weight of electronic systems by integrating bulky power-electronics blocks.

PREFACE
I, Amir Masnadi, am the principal contributor of all chapters of this dissertation.
Professor Shahriar Mirabbasi, Sudip Shekhar, and Thierry Taris, who supervised the research, provided technical consultation and editing assistance on the manuscript. During my PhD, we collaborated with a number of scholars and research groups, as reflected in the publication list below. I would like to thank all of them, especially my great friend and colleague, Dr. Amir Nikpaik. He is the principal contributor of [2], where I provided technical assistance for the design, fabrication, and measurement of those Voltage Controlled Oscillators (VCOs). Please find below the list of the published works which have been used as the main chapters of this thesis:
Conferences:
1. Mojtaba Sharifzadeh, Amir H. Masnadi Shirazi, Hossein M. Lavasani, Yashar Rajabi and Mazi Taghivand, “A Fully Integrated Multi-Mode High-Efficiency Transmitter for IoT Applications in 40nm CMOS”, IEEE Custom Integrated Circuits Conference (CICC), San Diego, USA, 2018.
2. Amir H. Masnadi Shirazi, Hossein M. Lavasani, Mojtaba Sharifzadeh, Yashar Rajabi, Shahriar Mirabbasi, and Mazi Taghivand, “A 980μW 5.2dB-NF Current-Reused Fully Integrated Direct-Conversion Bluetooth-Low-Energy Receiver in 40nm CMOS”, IEEE Custom Integrated Circuits Conference (CICC), Austin, USA, 2017.
3. Amir H. Masnadi Shirazi, Amir Nikpaik, Shahriar Mirabbasi, and Sudip Shekhar, “A Quad-Core-Coupled Triple-Push 295-to-301 GHz Source with 1.25 mW Peak Output Power in 65nm CMOS Using Slow-Wave Effect”, IEEE Radio Frequency Integrated Circuits Symposium (RFIC), San Francisco, USA, 2016.
4. Amir H. Masnadi Shirazi, Amir Nikpaik, Reza Molavi, Shahriar Mirabbasi, and Sudip Shekhar, “A Class-C Self-Mixing-VCO Architecture with High Tuning-Range and Low Phase-Noise for mm-Wave Applications”, IEEE Radio Frequency Integrated Circuits Symposium (RFIC), Phoenix, USA, 2015. Winner of the Best Paper Award.
Journals:
1. Amir Nikpaik, Amir H.
Masnadi Shirazi, Abdolreza Nabavi, Shahriar Mirabbasi, and Sudip Shekhar, “A 218-to-231GHz Frequency-Multiplier-Based VCO with ~3% peak DC-to-RF Efficiency in 65nm CMOS”, IEEE Journal of Solid-State Circuits (JSSC), 2017.
2. Amir H. Masnadi Shirazi, Amir Nikpaik, Reza Molavi, Hormoz Djahanshahi, Mazi Taghivand, Shahriar Mirabbasi, and Sudip Shekhar, “On the Design of mm-Wave Self-Mixing-VCO Architecture for High Tuning-Range and Low Phase-Noise”, IEEE Journal of Solid-State Circuits (JSSC), 2015 (Invited).

TABLE OF CONTENTS
ABSTRACT
LAY SUMMARY
PREFACE
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ACKNOWLEDGEMENTS
INTRODUCTION
1.1. MOTIVATION
1.2.
OBJECTIVE
1.3. CONTRIBUTIONS
1.3.1. Over Fmax signal generation
1.3.2. CMOS Bottlenecks in mm-Wave High Power Signal-Generation
1.3.3. Low-Phase-Noise mm-Wave Signal Generation
1.3.4. µW-level custom radio for highly efficient connectivity solutions
1.3.5. Ultra compact, fully integrated DC-DC converter
1.4. THESIS OUTLINE
LOW PHASE-NOISE, HIGH TUNING-RANGE, HIGH FOM, MM-WAVE SIGNAL GENERATION
2.1. INTRODUCTION AND OVERVIEW OF PREVIOUS TECHNIQUES
2.2. PN COMPARISON OF DIRECT AND INDIRECT SIGNAL SYNTHESIS
2.2.1. Comparison of On-Chip Passives
2.2.2. Comparison of ISF
2.2.3. PNexcess of a Class-B VCO For Different Up-conversion Ratios
2.3.
PROPOSED 60 GHZ 3H-VCO
2.3.1. Benefits of Using Class-C Push-Push VCO
2.3.2. Comparison of Output Signal Swing
2.4. MEASUREMENT RESULTS
2.5. CONCLUSION
HIGH-POWER OVER FMAX SIGNAL GENERATION: SLOW-WAVE 300 GHZ SOURCE
3.1. INTRODUCTION
3.2. PPV AND TPV EFFICIENCY COMPARISON
3.3. THE PROPOSED SLOW-WAVE QUAD-CORE-COUPLED TPV
3.4. IMPLEMENTATION AND MEASUREMENT RESULTS
3.5. CONCLUSION
HIGH DC-TO-RF EFFICIENCY, HIGH-OUTPUT-POWER TUNABLE SOURCE: A 220 GHZ VCO
4.1. INTRODUCTION
4.2. CONDITIONS FOR HIGH-EFFICIENCY (SUB-)THZ POWER GENERATION IN CMOS
4.2.1.
Conditions for Optimum Fundamental Oscillation
4.2.2. Conditions for Efficient Harmonic Power Generation
4.3. CHALLENGES IN CMOS HARMONIC OSCILLATORS
4.4. CONVENTIONAL PUSH-PUSH OSCILLATOR
4.4.1. Optimum Harmonic Oscillator to Extract 2f0
4.5. FREQUENCY-MULTIPLIER-BASED SOURCES IN CMOS
4.6. PROPOSED 219-TO-231 GHZ CMOS VCO
4.6.1. Class-C Frequency Doubler
4.6.2. Frequency Tuning Scheme
4.7. MEASUREMENT RESULTS
4.8. CONCLUSION
HIGHLY-EFFICIENT SUB-THRESHOLD RADIO DESIGN ON CMOS: A 500µW 2.4 GHZ RECEIVER
5.1. INTRODUCTION
5.2. POWER OPTIMIZATION IN RF BLOCKS
5.3.
POWER-EFFICIENT VCO DESIGN
5.3.1. Inductor sizing
5.3.2. Active device sizing
5.4. POWER-EFFICIENT LNA DESIGN
5.5. LOW-POWER LOW-VOLTAGE MIXER DESIGN
5.6. OVERALL DESIGN
5.7. MEASUREMENT RESULTS
5.8. CONCLUSION
HIGHLY-EFFICIENT, PVT-ROBUST, SUB-THRESHOLD RADIO: A 1 mW BLE RECEIVER ON CMOS
6.1. INTRODUCTION
6.2. PROPOSED RX ARCHITECTURE
6.3. RF FRONT-END
6.3.1. LNA and Active Balun
6.3.2.
Mixer
6.4. PROPOSED LOW POWER AND COMPACT PLL
6.4.1. VCO and Dividers
6.4.2. In-Loop Circuitry Isolation and Loop Dynamics
6.5. LOW-POWER BASEBAND AMPLIFIER
6.6. MEASUREMENT RESULTS
6.6.1. Standalone LNA + S2D
6.6.2. PLL
6.6.3. RX
6.7. CONCLUSION
HIGH FREQUENCY DC-DC CONVERTER: A 1.3 GHZ FULLY INTEGRATED BUCK CONVERTER
7.1. INTRODUCTION
7.2. CONVERTER WITH PARASITIC LOSS MITIGATION
7.3.
IMPLEMENTATION DETAILS
7.4. MEASUREMENT RESULTS
7.5. CONCLUSION
CONCLUSION
8.1. INTRODUCTION
8.2. ACHIEVEMENTS
8.2.1. High tuning-range and FoM mm-wave signal generation on CMOS
8.2.2. Highly efficient, high-power, sub-THz to THz signal generation
8.2.3. Sub-mW Radio on CMOS
8.2.4. Fully Integrated compacted DC-DC Converter with GHz-range switching
REFERENCES
APPENDIX
A. DERIVATION OF DC-TO-NF0 CURRENT EFFICIENCY
B. DERIVATION OF NOISE FACTOR FOR PROPOSED PP-LNA
1) Output noise current due to the thermal noise of Rs
2) Output noise current due to the thermal noise of M1 and M2
3) Output noise current due to the thermal noise of Rf

LIST OF TABLES
Table 2 - 1 Operating region of transistors in Class-B VCO
Table 2 - 2 Operating region of transistors in Class-C VCO
Table 2 - 3 Simulated Performance Comparison of 60 GHz VCO Topologies
Table 2 - 4 Measured Performance Summary of State-of-the-Art Millimetre-Wave VCOs
Table 3 - 1 Comparison of CPW, GCPW, and S-CPW TPV
Table 3 - 2 Performance Summary of State-of-the-Art THz VCOs
Table 4 - 1 Measured Performance Summary and Comparison with State-of-the-Art Designs operating between 200 and 300 GHz
Table 5 - 1 Component Values for Proposed Receiver
Table 5 - 2 Performance Summaries and Comparison
Table 6 - 1 LNA Performance over PVT @ 2.4 GHz
Table 6 - 2 Performance Summary of State-of-the-Art BLE Receivers
Table 7 - 1 Performance Summary and Comparison

LIST OF FIGURES
Figure 1 - 1: Some applications of sub-THz to THz band
Figure 1 - 2: fmax and ft versus gate length of transistor
Figure 1 - 3 General method of higher than fmax signal generation
Figure 1 - 4: Maximum reported output power of oscillators
Figure 1 - 5 Contributions of this thesis
Figure 1 - 6 sub-THz to THz contributions of this thesis
Figure 2 - 1 Application of H-VCO in a PLL.
Figure 2 - 2 Indirect LO synthesis techniques for H-VCOs: (a) Triple-push VCOs with 3rd-harmonic summation [27], (b) VCO followed with 3rd-harmonic generation and band-pass filtering [34], (c) VCO followed with 3rd-harmonic extraction and injection locking, and (d) SMV with 1st and 2nd-harmonic
Figure 2 - 3 (a) A 2H-VCO and (b) an F-VCO, both generating 2ω0. (c) A conventional Class-B VCO core.
Figure 2 - 4 Differential inductance (Ldiff), maximum differential quality factor (Qdiff-max), and the frequency where the quality factor reaches its maximum value (fQ-max) versus radius (R) of inductor for three different inductor widths.
Figure 2 - 5 Quality factor of (a) MOS varactors with 30% tuning range, and (b) mimcaps.
Figure 2 - 6 (a) Conceptual ISFs, ΓM, and ΓT at low frequency for a Class-B VCO. Simulated ΓM at (b) 7 GHz, (c) 21 GHz, and (d) 63 GHz, respectively. ISF is normalized to the maximum charge displacement, qmax [39].
Figure 2 - 7 Simulated ΓT of a Class-B VCO at 7 GHz, 21 GHz, and 63 GHz (single-ended noise is injected as shown in Fig. 6(a)). ISF is normalized to the maximum charge displacement, qmax [39].
Figure 2 - 8 (a) A 2H-VCO (push-push Class-B core), and (b) a 3H-VCO with SMV architecture (Class-B core).
Figure 2 - 9 Simulation results for PNexcess for N = 2 and 3. Phase noise is simulated at 1 MHz offset frequency.
Figure 2 - 10 (a) Transformer-based 3rd harmonic extraction. (b) SMV implementation [14].
Figure 2 - 11 Cross-coupled pair in Class-C and Class-B VCOs and corresponding drain currents.
Figure 2 - 12 Current Efficiency versus conduction angle.
Figure 2 - 13 Parasitic capacitances in a cross-coupled oscillator.
Figure 2 - 14 Parasitic capacitance (CD-Par) of cross-coupled pair in Class-B (with 200fF parasitic capacitance at tail) and Class-C (with Ctail of 1pF) VCO at 20 GHz.
Figure 2 - 15 Measurement setup and chip micrograph of SMV architecture.
Figure 2 - 16 Measured SMV spectrum at the buffer output at 62.48 GHz.
Figure 2 - 17 Measured frequency tuning range.
Figure 2 - 18 Measured SMV phase noise at 17.9 GHz (fundamental), 53.68 GHz (tripled, lowest frequency) and 62.49 GHz (tripled, highest frequency).
Figure 2 - 19 Measured phase noise (at 1 MHz offset) and FoM versus SMV output frequency.
Figure 3 - 1 General steps for generating a high power THz source.
Figure 3 - 2 (a) Simulated PPV and TPV with their (b) DC-to-RF efficiency in 65nm process with ideal combiner
Figure 3 - 3 (a) Proposed slow-wave quad-core-coupled TPV, (b) S-CPW structure, and (c) output matching (S11)
Figure 3 - 4 Grounded and slow-wave CPW structures
Figure 3 - 5 Q and S21 insertion loss of S-CPW, GCPW, and CPW
Figure 3 - 6 Chip micrographs for proposed TPVs
Figure 3 - 7 Measurement Setup
Figure 3 - 8 (a) Measured frequency at 300.8 GHz, and (b) frequency tuning and output power versus supply voltage
Figure 4 - 1 DC-to-RF efficiency of recently published VCOs with output frequencies beyond 200 GHz implemented in CMOS technologies
Figure 4 - 2 Generic representation of (a) a harmonic oscillator, and (b) a frequency-multiplier-based source
Figure 4 - 3 High-frequency, small-signal transistor model. Subscript i represents the intrinsic nodes.
Figure 4 - 4 Simulated (solid red line) and calculated φopt for a 15×1 µm/60 nm NMOS device with gm = 20 mA/V, RG = 14.8 Ω, Cm = 2.6 fF, Cgs = 11.8 fF, and Cgd = 7.4 fF. The simulated fmax for the transistor biased at (VDS,DC, VGS,DC) = (1.2 V, 1.2 V) is 233 GHz.
Figure 4 - 5 Cross-coupled pair used in push-push oscillator and its equivalent circuit at a specific harmonic (2f0).
Figure 4 - 6 (a) Simulation setup, (b) ηI,2 and Gs,2 for a transistor in cross-coupled pair. (c) Total DC to 2f0 power efficiency, ηp,2 (circles), drain DC to 2f0 power efficiency (dashed curve), and fundamental harmonic power (P1) generated by the transistor (solid curve).
Figure 4 - 7 Second harmonic power loss in the resonator transmission line of push-push oscillator.
Figure 4 - 8 Contours for (a) ηI,2, (b) Gs,2 and (c) DC to second-harmonic power efficiency (ηp,2) (step = 1%) and fundamental power (P1) (step = 0.3 mW) for generic harmonic oscillator. Shaded area shows device activity region.
Figure 4 - 9 (a) ηI,2, (b) Gs,2, (c) ηp,2 and (d) Pav,2 for the frequency doubler with respect to fundamental gate swing (Ag) and DC level VG,DC.
Figure 4 - 10 (a) Fundamental power efficiency plot with respect to A = Ag = Ad and gate DC voltage (VDC), (b) Fundamental power efficiency contours (max = 30% and step = 5%) for core oscillator and P1 = 0.8 mW.
Figure 4 - 11 Proposed VCO architecture
Figure 4 - 12 Detailed schematic of the proposed VCO
Figure 4 - 13 (a) Frequency doubler, with W/L = 8 µm/60 nm for M1x and M2x. (b) Gate voltage and drain current waveforms. (c) Output power (dotted curve) and power efficiency (solid curve) contours for gate signal swing A = 1.3 V at 110 GHz for four frequency doublers combined together.
Figure 4 - 14 (a) Parasitic capacitances of a MOS transistor, and different C-V characteristics for (b) Cgs, (c) Cgd, and (d) Cds, for VG = 0.35 V, 0.77 V, and 1.2 V
Figure 4 - 15 Chip micrograph
Figure 4 - 16 (a) Frequency, phase noise, and (b) power measurement setups.
Figure 4 - 17 (a) Measured spectrum for LO input of 13.9 GHz whose 16th harmonic down-converts the RF input of 230.06 GHz to 7.66 GHz.
(b) Measured tuning curve
Figure 4 - 18 Measured (a) output power, and (b) DC-to-RF efficiency
Figure 4 - 19 Measured (a) phase noise at 1 MHz offset, as the oscillation frequency is varied across the tuning range, and (b) down-converted phase noise plot at the output frequency of 227 GHz, down-converted to 1.39 GHz using the 16th harmonic of the 14.1 GHz LO input.
Figure 5 - 1 Conventional SOM, LNC and LMV architectures.
Figure 5 - 2 Proposed LMV+Filter architecture.
Figure 5 - 3 gm/ID versus gate-source voltage of an NMOS transistor in a 0.13-μm CMOS process (device gate length is 120 nm).
Figure 5 - 4 Transconductance efficiency versus inversion coefficient and transistor’s fT versus inversion coefficient
Figure 5 - 5 LC cross-coupled oscillator and equivalent model at resonance
Figure 5 - 6 FOM of NMOS transistor versus inversion coefficient
Figure 5 - 7 QL.L and SRF versus inductor value L of inductors in the 130-nm CMOS Technology
Figure 5 - 8 Systematic approach for designing a low power VCO. The last loop allows meeting the tuning range while minimizing the phase noise and power consumption (basically, maximum Q.L optimizes phase noise and power; however, if needed, we can apply the constraint on the maximum value of L to meet the tuning)
85 Figure 5 - 9 Simplified noise model of LNA, gate induced noise and flicker noise of transistors are ignored. ................................................................................................................................... 89 Figure 5 - 10 (a) Current reused push-pull feedback LNA, (b) matching network (Bias is excluded), and (c) S11 simulation of proposed push-pull low noise amplifier. Rf is 30kΩ, LG = 10nH , M1=14µ120 , M2 =14µ120 . .......................................................................... 90 Figure 5 - 11 NF simulation of proposed LNA , Minimum simulated NF is 2.5dB in range of 2.15GHz to 2.45 GHz ................................................................................................................... 91 Figure 5 - 12 Systematic approach for designing a low power LNA .......................................... 92 Figure 5 - 13 Conventional down conversion mixer (a) Gilbert-type mixer (b) Combined Gm-LO mixer (c) Folded mixer. .......................................................................................................... 93 Figure 5 - 14 Gm-Switched mixer (a) Conceptual single-ended mixer. Gm can be turned off and on by switching its supply terminal. (b) Implementation of switching-stage and gm-stage. ....... 95 Figure 5 - 15 Inductive peaking technique in proposed mixer .................................................... 96 Figure 5 - 17 (a) Inverter with dynamic-threshold-voltage NMOS (DTNMOS) (b) Inverter with dynamic-threshold-voltage PMOS (DTPMOS) (c) Inverter with DTPMOS and DTNMOS (d) Conventional inverter.................................................................................................................... 97 Figure 5 - 18 Output voltage of the inverter with Wp= 200 µm, Wn=100 µm, CL=1pF, 2.45 GHz (A) with dynamic threshold (DTMOS) inverter (B) without dynamic threshold voltage ... 
98 Figure 5 - 19 Schematic of the proposed LVM receiver ............................................................. 98 Figure 5 - 20 Chip Micrograph of (a) LNA-Mixer, (b) LNA-VCO-Filter and (c) LNA-VCO-Mixer-Filter ................................................................................................................................. 100 Figure 5 - 21 Measured buffered VCO a) Buffered spectrum b) Phase Noise. The center frequency is 2.45GHz ................................................................................................................. 102 Figure 5 - 22 Measured LNA S11 input matching .................................................................... 103 xiiFigure 6 - 1 FoMRX versus power of reported ULP radios since 2005 [1]. This work is shown with a red marker. ....................................................................................................................... 106 Figure 6 - 2Simplified block diagram of (a) sliding IF architecture with two LO stages and image rejection filter, (b) low-IF architecture with image rejection filter, (c) LNA-less passive architecture, and (d) the proposed ultra-low power direct conversion receiver ......................... 108 Figure 6 - 3 Conventional PP-LNA and proposed DBPP-LNA and S2D .................................. 111 Figure 6 - 4 Simplistic model of proposed LNA ........................................................................ 112 Figure 6 - 5 Simplistic model of proposed LNA and S2D ......................................................... 113 Figure 6 - 6 Half circuit of proposed passive mixer and CML divider....................................... 116 Figure 6 - 7 Diagram of (a) PLL, (b) VCO, (c) CML divider, and (d) shielded CPWs used for the clock path .................................................................................................................................... 
118 Figure 6 - 8 Low-power baseband amplifier and automatic offset calibrator ............................. 120 Figure 6 - 9 Chip micrograph of proposed RX, DBPP-LNA, and compact PLL ....................... 121 Figure 6 - 10 DBPP-LNA measurement results at room temperature ........................................ 122 Figure 6 - 11 Measured PLL PN in (a) free running unlocked mode and (b) PLL mode ........... 123 Figure 6 - 12 Measured Vcontrol showing the PLL settling Time ............................................. 124 Figure 6 - 13 Measured NF, input S11, RX gain, and IIP3 ........................................................ 124 Figure 6 - 14 Measured blocker performance of proposed RX .................................................. 125 Figure 6 - 15 Power breakdown in low power and high performance modes ............................ 127 Figure 7 - 1 Simplified schematics for (a) a buck converter, and (b) a multiphase ripple canceller..................................................................................................................................................... 131 Figure 7 - 2 Proposed buck converter with parasitic-loss mitigation. ........................................ 133 Figure 7 - 3 Effect of resonance in lowering switch loss ............................................................ 135 Figure 7 - 4 Power consumption and efficiency vs. switching frequency. ................................. 135 Figure 7 - 5 Simulated converter efficiency vs. switching frequency for load current of 100mA and with different Q factors ........................................................................................................ 137 Figure 7 - 6 High-density custom capacitance. ........................................................................... 138 Figure 7 - 7 Chip micrograph...................................................................................................... 139 Figure 7 - 8 Measured converter efficiency vs. 
switching frequency.
Figure 7 - 9 Measured converter efficiency vs. load power.
Figure 7 - 10 Measured switching spur and ripple with fSW = 1.5 GHz.

LIST OF ABBREVIATIONS
RF Radio Frequency
LO Local Oscillator
IF Intermediate Frequency
CMOS Complementary Metal-Oxide Semiconductor
LMV LNA-Mixer-VCO
SMV Self-Mixing-VCO
VGS Gate-to-Source Voltage
Vth Threshold Voltage
VOV Overdrive Voltage
gm Transconductance
CML Current-Mode-Logic
PN Phase Noise
VCO Voltage-Controlled Oscillator
HVCO Harmonic-VCO
FVCO Fundamental-VCO
THz Terahertz
DPL Dynamic Power Loss
CPW Coplanar Waveguide
SCPW Slow-Wave Coplanar Waveguide
GCPW Grounded Coplanar Waveguide
PLL Phase-Locked-Loop
ZVS Zero-Voltage-Switching
PMIC Power Management Integrated Circuit
FTR Frequency Tuning Range
FOM Figure of Merit

ACKNOWLEDGEMENTS
I consider the time spent on my Ph.D. to be one of the best times of my life. Beyond hard work and discipline, I think I was very lucky and fortunate in life, especially in graduate school. I would not have completed this thesis without the significant help of many great people who have made my life at UBC a pleasant and productive experience.

First and foremost, I would like to express my sincere gratitude to my supervisors, Prof. Shahriar Mirabbasi and Prof. Sudip Shekhar, for their continuous support and patience during my stay at UBC. I consider myself very fortunate to have had the opportunity to work with them.

I would like to extend special thanks to my great friends and colleagues Dr. Amir Nikpaik (Tarbiat Modares University), Dr. Mazheredin Taghivand (Stanford University and Qualcomm Atheros, San Jose), and Dr. Hormoz Djahanshahi (Microsemi, Canada). I learnt a lot from them and our collaborations resulted in a number of publications. I would like to thank Professor S. Safavi-Naeini and A.
Nabavi at the University of Waterloo and Professor A. Niknejad at BWRC, University of California, Berkeley, for providing access to measurement equipment, and Andrew Townley for his help with some THz test setups. I also thank Prof. Thierry Taris (University of Bordeaux, France), Dr. Reza Molavi (D-Wave Systems Inc., Canada), Dr. Beomsup Kim (Qualcomm Atheros), Dr. Hooman Rashtian (University of California, Davis), Dr. Mohammad Mahani (Microsemi, Canada), Dr. Hossein Miri Lavasani (Qualcomm Atheros), Dr. Abbas Sohrabpour (University of Minnesota), and Dr. Asad Kalantarian (SunDisk, San Jose) for very fruitful discussions. I also thank Dr. Roberto Rosales and Roozbeh Mehrabadi (UBC) for measurement and CAD tool support.

Finally, this work is dedicated to my wonderful parents for all their love and support. I always felt them near me, even though they were living far away during my period of study in Canada. They have been my main motivation throughout my studies. I would also like to thank my dear brothers, Mostafa and Mohammad Sadegh, for paving the way for me and enormously helping me to finish my Master's and Ph.D. in less than the expected time.

The Hypnotizing Beauty of Iranian Ceilings, Shahe-Cheragh’s monument in Shiraz

TO MY FAMILY

Sheikh-Lotfolah’s Monument, Isfahan, Iran

1 INTRODUCTION

1.1. MOTIVATION
Over the last four decades, integrated circuits (ICs) have revolutionized our everyday lives. Advances in IC design have allowed us to integrate millions of transistors in a small area of a few mm². In the context of wireless communications, almost all functionalities, from the wireless transceiver front-end to digital baseband signal processing, have been successfully integrated on a single chip. Three main challenges are still fueling research and development in academia and industry: First, the thirst for higher data rates as well as the possibility of new applications have motivated the use of higher frequency bands.
Second, smart devices and sensors are now an inseparable part of our daily life and billions of transistors are at the service of each person. The modernized 21st century lifestyle demands ultra-low-power, and even self-sufficient, battery-free electronics. And third, thanks to microelectronics, miniaturization and encapsulation have been the hallmark of the past two decades (e.g., cell-phone size was reduced by more than 3× from 1988 to 2017), and many researchers are trying to push the limits to even smaller dimensions. Although processor units such as CPUs have shrunk significantly, power management units, such as inverters and converters, did not scale proportionally and are still among the bulkiest parts, occupying more than 85% of printed circuit boards (PCBs). This increases the cost and weight of the overall system, and hence efforts to realize single-chip, fully integrated power management ICs (PMICs) are ongoing.

Despite great advances, there are still bottlenecks and less-explored, hidden aspects in the abovementioned streams that conventional, cost-effective CMOS electronics is yet to address. For example, enabling communication systems to operate in the sub-THz and mm-wave frequency range is beyond the capability of most cost-effective commercialized electronics. Likewise, most state-of-the-art battery-powered wireless/wireline connectivity solutions such as cell-phones still demand power levels for which the user regularly needs to charge or replace the battery. This thesis is an effort to address those challenges and unsolved problems and to push the boundaries of cost-effective electronics further toward higher data rates and lower power consumption.

1.2. OBJECTIVE
With the abovementioned motivations, the objective of this thesis is to study and design building blocks and systems in a cost-effective CMOS process that can be used in:
1) High-performance mm-wave communication systems such as 5G (Chapter 2).
2) Sub-THz to THz systems such as mm-wave high-resolution scanners (Chapters 3-4).
3) Ultra-low-power µWatt-level wireless communication systems (Chapters 5-6).
As a proof-of-concept, in each chapter, multiple test chips are designed, fabricated, measured, and compared with the state-of-the-art. The CMOS processes used include 130 nm, 90 nm, 65 nm, and 40 nm, which are among the most cost-effective processes for the industry.

1.3. CONTRIBUTIONS

1.3.1. Over fmax signal generation
The sub-THz (100 to 300 GHz) to THz (0.3 to 3 THz, also known as the Terahertz gap) frequency spectrum is increasingly gaining interest due to its intriguing applications in molecular spectroscopy, imaging, radar sensing, and more recently wireless high-speed communication [1]–[6]. Unlike X-ray imaging, mm-wave imaging in the sub-THz to THz band, from 94 GHz to 1.035 THz, is non-ionizing and hence is useful in non-invasive medical and dental diagnostics [1]. In addition, THz imaging allows for the detection of various objects including concealed weapons, explosives, and contraband underneath a person’s clothing. Commercial and near-commercial mm-wave sensing systems capable of fast acquisition, cm-scale resolution and long standoff distances make this band appealing for radar systems (e.g., Frequency-Modulated Continuous-Wave (FMCW) radar) and Adaptive Cruise Control (ACC) systems.

Another key advantage of this frequency band is for high-speed communication systems. Based on Shannon’s channel capacity theorem [7], the maximum rate of data transmission (C) in a particular channel depends on the bandwidth (BW) and the signal-to-noise ratio (SNR) and can be written as:

$$C = BW \cdot \log_2(1 + \mathrm{SNR}) \qquad (1\text{-}1)$$

As shown by this equation, the data rate is directly proportional to the BW and thus the mm-wave region is a good candidate for achieving high data rates.
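As a rough illustration of equation (1-1), the sketch below evaluates the capacity for two bandwidths at the same SNR. The function name and the numerical values (100 MHz, 7 GHz, 10 dB) are assumptions chosen only for illustration; they are not measurements from this work.

```python
import math


def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Capacity C = BW * log2(1 + SNR) in bit/s, with the SNR given in dB."""
    snr_linear = 10.0 ** (snr_db / 10.0)  # convert dB to a linear power ratio
    return bandwidth_hz * math.log2(1.0 + snr_linear)


# Assumed example values: identical 10 dB SNR, two different bandwidths.
c_narrow = shannon_capacity_bps(100e6, 10.0)  # 100 MHz channel
c_wide = shannon_capacity_bps(7e9, 10.0)      # 7 GHz channel

print(f"100 MHz channel: {c_narrow / 1e9:.2f} Gb/s")
print(f"7 GHz channel:   {c_wide / 1e9:.2f} Gb/s")
print(f"capacity ratio:  {c_wide / c_narrow:.1f}")
```

At a fixed SNR the capacity ratio equals the bandwidth ratio, which is the linear dependence on BW that makes wide mm-wave channels attractive.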
It has been almost a decade since commercial 60 GHz wireless links with almost 7 GHz of BW became available, and now our desire and addiction¹ for ultra-high-speed communication pushes research teams to focus on sub-THz links for such data transfers (e.g., a 110 GHz link for satellite-to-satellite communication).

Although the sub-THz and THz region supports useful applications, this band has been one of the most challenging within the electromagnetic spectrum because of the limitations of device technologies. Most of the available solutions are based on group III-V and cryogenic technology, so power consumption and cost limit many practical and portable applications. In the first part of this work (Chapters 2 to 4), we look into some of the most challenging electronic functions that will pave the way for the realization of an ultra-high-speed integrated system in this band. Our goal is to replace the expensive and bulky group III-V solutions in this frequency band with cost-effective CMOS alternatives.

¹ There is no doubt that during the past decade we have become addicted to the internet. A simple proof of this addiction is the ongoing upload of 40 million pictures per day on Facebook. A simple calculation reveals that uploading just this number of pictures requires transferring 10¹⁵ bits/day, or roughly 10-15 Gb/s.

              Freq.     BW        Modulation  Angle   Range   Resolution  Application
Short Range   24 GHz    7 GHz     Pulsed      70      10 m    <10 cm      Parking
Mid Range     24 GHz    250 MHz   FMCW        30-60   40 m    ≈1 mm       Stop & Go
Long Range    77 GHz    1 GHz     FMCW        16      150 m   ≈1 mm       ACC
Long Range    110 GHz   1 GHz     FMCW        8       300 m   ≈1 mm       ACC

Figure 1 - 1 : Some applications of sub-THz to THz band²

As mentioned earlier, there is a growing interest in signal generation in the mm-wave and terahertz frequency ranges.
Signal generation at these frequencies is a major challenge in solid-state electronics due to the limited transit frequency fT (the frequency at which the current gain is unity), the limited maximum oscillation frequency fmax (the frequency at which the power gain is unity), and the limited breakdown voltage of active devices, as well as the lower quality factor of passive components caused by ohmic and substrate loss. Figure 1-2 shows fmax and fT versus the feature size (gate length) of the technology [8][9]. For example, as can be seen, for 65 nm bulk CMOS, fmax and fT are about 250 GHz.

² Sources for pictures used in this figure: dailymail.co.uk, tempest.das.ucdavis.edu, ll.mit.edu, and sadcircuitdesigner.com

[Figure 1 - 2 plots: frequency (GHz) versus gate length (nm), for FinFET and planar devices]
Figure 1 - 2 : fmax and fT versus gate length of transistor³

Traditionally, compound semiconductors are used to implement fundamental oscillators at mm-wave and terahertz frequencies [7], [10]–[12]. Recently, SiGe and CMOS transistors have also been employed in the same frequency range using fundamental and push-push oscillators [13]. A fundamental oscillation at 346 GHz is achieved in [14] using a 35 nm InP HEMT with a maximum oscillation frequency (fmax) of 600 GHz. SiGe HBTs with an fmax of 160 GHz are used in [15] to achieve a fundamental oscillation frequency of 100 GHz. A 104 GHz fundamental oscillator is also reported in [16], which employs 90 nm CMOS transistors with an fmax of 300 GHz. As can be seen, these oscillation frequencies are limited to roughly one third to two thirds of fmax, raising the question of what sets this upper limit.
In 2011, a breakthrough investigation was reported by Momeni et al. [17], who proved theoretically that by choosing a proper topology for an oscillator, its oscillation frequency can reach very close to the fmax of the process; a 121 GHz oscillator was realized in a 0.13 μm process with an fmax of about 135 GHz. Motivated by this exciting finding, in this work we further investigate different VCO topologies that can oscillate near fmax and at the same time offer superior performance (e.g., high power efficiency and low phase noise). During this phase, different near-fmax VCO architectures were implemented and their performance was experimentally evaluated and compared.

³ This figure is an extension of the figure in [9].

1.3.2. CMOS Bottlenecks in mm-Wave High Power Signal-Generation
As far as signal generation is concerned, one of the major bottlenecks for terahertz electronics is the limited achievable power with conventional electronic sources. Typically, imaging, radar, and communication applications require about 1 to 10 mW of power (equal to 0 to 10 dBm). As previously mentioned, generating a near-fmax mm-wave signal is feasible. However, fmax itself is limited; for a 65 nm CMOS process, for example, it is around 250 GHz. This means that to enter the THz region (300 GHz to 3 THz) in this process, the VCO should oscillate at a frequency higher than the fmax of the process, a very challenging if not impossible task with active, linear devices. As shown in Figure 1-3, to address this issue, recent research has focused on using nonlinear devices to generate and extract the 2nd, 3rd, and 4th harmonics of the fundamental VCO. Among the popular methods are the push-push oscillator for extracting the 2nd harmonic and the triple-push oscillator for the 3rd.
However, due to the passive nature of harmonic generation, the output power degrades significantly and only a very small portion of the consumed DC power is translated to higher-order harmonics, which results in poor DC-to-RF efficiency. Figure 1-4 presents recently achieved signal powers versus frequency [8]. Here, the state-of-the-art DC-to-RF efficiency of a 4th-harmonic 300 GHz source with its fundamental frequency at 75 GHz is about 1% [18], meaning that generating 1 mW of output power consumes about 100 mW of DC power. In addition to this very poor efficiency, we are limited by the maximum DC current density of mm-wave interconnects, which can affect the aging performance of the generator and can even damage the circuit. In this work, novel architectures will be presented to achieve high output power with improved DC-to-RF efficiency.

[Figure 1 - 3 block diagram: N parallel branches driven by cos(ωfLO t), each consisting of a nonlinear stage followed by a filtering and phase-adjusting stage; the branch outputs β cos(α ωfLO t + φ1), β cos(α ωfLO t + φ2), ..., β cos(α ωfLO t + φn) are combined into nβ cos(α ωfLO t + φ)]
Figure 1 - 3 General method of higher than fmax signal generation

Figure 1 - 4 : Maximum reported output power of oscillators⁴

⁴ Source: http://isscc.org/wp-content/uploads/sites/10/2017/05/isscc2015_trends.pdf, retrieved November 2017

1.3.3. Low-Phase-Noise mm-Wave Signal Generation
With the recently commercialized 5G cellular network, researchers are looking to mm-wave bands, more specifically the 25-28 GHz and 60 GHz bands, aiming for up to 10 Gb/s wireless communication. Essential to any high-performance wireless communication system is signal integrity, and in particular the phase noise (PN) performance of the receiver or transmitter clock. Most previous research and analytical calculations on phase noise have focused on the 300 MHz to 8 GHz band (UHF to C band), and in some cases they make assumptions that are not practical in mm-wave bands.
For example, the effect of nonlinear transistor capacitances is ignored. Although these capacitances are insignificant and have only second-order effects on PN at low frequencies, at mm-wave frequencies any small variation in these capacitances can directly or indirectly impact PN. Thus, a detailed investigation is required for PN analysis at mm-wave.

1.3.4. µW-level custom radio for highly efficient connectivity solutions
The proliferation of wireless communication over the last decade has played a significant role in accessing and using the ever-increasing amount of data that surrounds us. The advances in the semiconductor and wireless industry have enabled a plethora of technologies, for example, a wide network of sensors to autonomously monitor biomedical and environmental conditions. Vital to the existence of such wireless sensor networks (WSNs) is the design of ultra-low-power radio-frequency (RF) transceivers. Furthermore, low power consumption is critical to reduce the burden on the battery and/or the energy harvesting unit, most notably in portable devices. As part of this thesis, we focus on lowering the power consumption of the receive (RX) path of RF transceivers using current-reuse techniques. The proposed current-reuse techniques can also be applied to the building blocks of the transmit (TX) path. To make a fully functional system, we present an alternative way of combining RX building blocks so that the overall system can achieve sub-mW power consumption with a low supply voltage (as low as 0.8 V) while providing reasonable system performance.

1.3.5. Ultra compact, fully integrated DC-DC converter
Dynamic-voltage-scaling (DVS) enables fine-grain power control by providing different voltage levels to drive a wide range of load currents in microprocessors and mobile system-on-chips (SoCs). Considerations such as system-integration cost, ease of routing, and power delivery have driven significant research on fully on-chip voltage regulators.
Specifically, inductor-based buck converters remain attractive [19]–[24] due to their relatively high efficiency and the ability to cover a wide range of voltage/power. However, apart from the efforts to increase the power efficiency, two key challenges still remain in their design: (1) the use of an off-chip inductor (Lfilter) and a large filter capacitor (Cfilter) requires a significant area, and (2) limitations of on-chip inductance and capacitance due to area constraints result in a large output ripple (e.g., 30 to 100 mV). To address the abovementioned issues, in this thesis, we present a loss-mitigation technique based on resonance that allows for GHz-range fsw while achieving 72.2% efficiency under full-load condition.

1.4. THESIS OUTLINE
This thesis is organized as follows, and as shown in Figure 1-5:
1. Investigation and implementation of highly efficient and low-phase-noise mm-wave oscillator design is presented in Chapter 2 (Figure 1-6).
2. Highly efficient sub-THz to THz tunable signal sources are presented in Chapters 3 and 4 (Figure 1-6).
3. Investigation and implementation of highly efficient and ultra-low-power connectivity solutions are presented in Chapters 5 and 6.
4. Investigation and implementation of an ultra-compact DC-DC converter with GHz-range switching is presented in Chapter 7.
[Figure 1 - 5 diagram, "New Frontiers in RF Electronics": Sub-THz to THz on CMOS (High FOM mm-Wave VCO, Chapter 2; High Power sub-THz Source, Chapter 3; High Power, Tunable, Efficient sub-THz VCO, Chapter 4); Ultra-Low-Power Wireless Connectivity Solution (Ultra Low Power Stacked LNA-VCO-Mixer, Chapter 5; Sub-mW current reused Bluetooth, Chapter 6); Ultra Compact, fully integrated DC-DC Converter (1.3 GHz Fully Integrated DC-DC Converter, Chapter 7)]
Figure 1 - 5 Contributions of this thesis

[Figure 1 - 6 diagram: Low Phase Noise and Optimum Fundamental mm-wave VCO (Chapter 2); coupled cores, each followed by a Nonlinear Stage and Filtering, for Highly Efficient and High Power Over fmax Signal Generation (Chapters 3-4); Constructive Superposition into a High Power mm-wave Signal Source]
Figure 1 - 6 sub-THz to THz contributions of this thesis

2 LOW PHASE-NOISE, HIGH TUNING-RANGE, HIGH FOM, MM-WAVE SIGNAL GENERATION¹

2.1. INTRODUCTION AND OVERVIEW OF PREVIOUS TECHNIQUES
The availability of wide bandwidth in the mm-wave portion of the frequency spectrum makes these bands attractive for high-data-rate applications such as wireless high-definition video streaming and medical imaging [25]–[28]. One of the main challenges in most communication systems operating in the 60 GHz and higher frequency bands is to synthesize an on-chip local oscillator (LO) with a high spectral purity and a large tuning range. In addition, when incorporated into a phase-locked loop (PLL), this LO signal needs a mm-wave divider, which is challenging to design and can consume more than 70% of the PLL power budget [28]. The signal synthesis techniques can be broadly classified into direct and indirect synthesis, based on whether the desired LO frequency is the same as the VCO fundamental (f0), or higher than f0, respectively.
As mentioned in the previous section, at mm-wave frequencies direct LO synthesis techniques face several design challenges to meet the desired PN and tuning-range requirements. First, as f0 approaches the maximum oscillation frequency (fmax) of the transistors in a given process technology, the available power gain of transistors degrades. Thus, excess power is required to guarantee a sustained oscillation. Second, the quality factor (Q) of passive devices (capacitors, varactors, and inductors) implemented on a lossy silicon substrate degrades significantly in the mm-wave region and adversely impacts the PN performance and start-up power requirement. Third, the parasitic capacitance of the oscillator active core, interconnects, and output buffer stage (Cpar) becomes a significant fraction of the total tank capacitance, thereby permitting only a small MOS varactor (Cvar) to be used for frequency tuning. FTR, being proportional to Cvar and inversely proportional to Cpar, is therefore significantly limited [25]–[27]. Fourth, the switched-tuning technique [28] is no longer effective for reducing the VCO gain, KVCO (defined as the derivative of the VCO output frequency with respect to its input control voltage), since switches add more parasitic capacitance and loss to the tank. Consequently, with a large KVCO, the voltage noise on the oscillator input control node results in increased amplitude noise to PN conversion (also known as AM to PM conversion) [29].

¹ The material presented in this subsection is based on [53].

To address the abovementioned challenges in cross-coupled F-VCOs, several interesting techniques have been proposed recently. In [30], the quality factor of the LC tank, Qtank, is increased by reducing the size of the varactor, and the VCO frequency is tuned by using the body effect, which can adjust the drain capacitance of the MOS device. Although a higher Qtank results in a better PN performance, FTR is limited (4.5%) due to the small size of the varactor.
An effective method of increasing FTR to cover a large range of frequencies is to use a dual-mode LC tank; up to 28% FTR is achieved in [31]. However, due to the use of a complex inductor structure, Qtank is sacrificed, which deteriorates the PN performance. To improve the PN performance while still utilizing an explicit MOS varactor, inductive peaking at the gate of MOS devices is used to increase the fmax and the transconductance of the cross-coupled pair [26]. Although this achieves a reasonable trade-off among PN, power consumption and FTR (<9%), a larger FTR (>10–15%) is still desirable.

Indirect LO synthesis techniques using H-VCOs generate and utilize higher-order harmonics. The benefits of this approach include increased FTR and ease of implementation in a PLL by relaxing the frequency constraints for the divider, as shown in Figure 2-1 [28],[32]. Figure 2-2 (a) shows a triple-push-VCO structure [27] where f0 and 3f0 generated by three VCOs are added destructively and constructively, respectively. Although this technique achieves a large FTR at 60 GHz, extensive electromagnetic simulations of the nested-inductor and tank layouts are needed, and any modelling error in the coupling factors may alter the centre frequency, degrade the Q and adversely affect the PN [27]. Furthermore, improving the PN performance requires higher selectivity at f0, thereby reducing the power of the desired 3f0 harmonic. Finally, due to the use of multiple oscillators, a relatively large DC power is consumed. Figure 2-2 (b) shows another technique [33] where a large signal swing at f0 generated by the VCO core is fed to a non-linear limiting amplifier with a load resonant at 3f0. This technique, however, suffers from the low f0-to-3f0 efficiency of the limiting amplifier at mm-wave, and has a significant trade-off between the output power at 3f0 and the DC power consumption of the non-linear limiting amplifier.

Figure 2 - 1 Application of H-VCO in a PLL.
Figure 2 - 2 Indirect LO synthesis techniques for H-VCOs: (a) Triple-push VCOs with 3rd-harmonic summation [27][34], (b) VCO followed with 3rd-harmonic generation and band-pass filtering [35], (c) VCO followed with 3rd-harmonic extraction and injection locking, and (d) SMV with 1st and 2nd-harmonic

Although several designs for indirect and direct synthesis are presented in the literature, the approach resulting in a superior performance still remains a matter of debate [28]. In this chapter, after discussing the benefits of H-VCOs versus F-VCOs, we present a self-mixing VCO (SMV), whose basic topology is shown in Figure 2-2 (d) [35]. Instead of generating a 3f0 component out of the VCO, the SMV utilizes the 2nd harmonic (2f0) from the common-mode output along with the fundamental (f0) from the differential output; we refer to this VCO as a 3H-VCO from now on. The amplitude of the 2f0 signal that can be extracted from a VCO is larger than the amplitude of the 3f0 signal, thereby making it superior to the implementation in [27]. Furthermore, the SMV architecture does not suffer from the strict matching requirements, single-ended operation and large power consumption of the triple-push VCO [27]. Finally, we propose the use of a Class-C VCO topology to further enhance the amplitude of the 2f0 component and improve the PN. Thus, the specifications for a low PN, large tuning range and low DC power can be simultaneously met.

The chapter is organized as follows: Section II compares direct and indirect signal generation with focus on the PN performance. Section III describes the proposed Class-C SMV architecture in detail. Section IV presents the measurement results of a proof-of-concept prototype SMV that is implemented in a 0.13-µm CMOS process, as well as a performance comparison with the state-of-the-art designs. Section V provides concluding remarks.

2.2. PN COMPARISON OF DIRECT AND INDIRECT SIGNAL SYNTHESIS
Consider a 2H-VCO and an F-VCO, both generating $2\omega_0$, as shown in Figure 2-3(a) and Figure 2-3(b), respectively. The core oscillator in the 2H-VCO operates at $\omega_0$ and uses an upconverter (e.g., mixer) to achieve $2\omega_0$, with the PN of the final output at $2\omega_0$ about 20log(2) ≈ 6 dB higher than that of the core oscillator, assuming the up-conversion process adds negligible AM to PM noise.

Figure 2 - 3 (a) A 2H-VCO and (b) an F-VCO, both generating 2ω0. (c) A conventional Class-B VCO core.

Based on Hajimiri’s PN theory [36], it can be shown that the PN of the F-VCO is larger than the overall PN of the 2H-VCO, with the excess PN given by:

$$PN_{excess} = 10\log\left[\frac{R_{T,2\omega_0}}{R_{T,\omega_0}}\cdot\left(\frac{A_{\omega_0}}{A_{2\omega_0}}\right)^{2}\cdot\frac{\Gamma_{T,rms,2\omega_0}^{2}+\gamma\,\Gamma_{M,rms,eff,2\omega_0}^{2}}{\Gamma_{T,rms,\omega_0}^{2}+\gamma\,\Gamma_{M,rms,eff,\omega_0}^{2}}\cdot\left(\frac{Q_{\omega_0}}{Q_{2\omega_0}}\right)^{2}\right] \qquad (1)$$

where $R_T$ represents the tank parallel loss resistance, $A$ is the oscillation amplitude, $\Gamma_{T,rms}$ is the rms value of the impulse sensitivity function (ISF) of the $R_T$ noise, $\Gamma_{M,rms,eff}$ is the rms value of the effective ISF of the thermal noise of the transistor (M1 or M2 shown in the conventional Class-B oscillator of Figure 2-3(c)), and $\gamma$ is the excess noise factor of a MOS transistor [37]. The subscripts $\omega_0$ and $2\omega_0$ in (1) are associated with the 2H-VCO and F-VCO, whose cores are operating at these frequencies, respectively. Assume that both cores of the H-VCO and F-VCO generate the same signal swing ($A_{\omega_0} = A_{2\omega_0}$), that $R_T$ is proportional to $Q$ (e.g., $R_T = LQ\omega_0$), and that $L$ is chosen to be inversely proportional to $\omega_0$. Then $R_{T,2\omega_0}/R_{T,\omega_0} = Q_{2\omega_0}/Q_{\omega_0}$, which cancels one factor of $Q_{\omega_0}/Q_{2\omega_0}$, and $PN_{excess}$ can be simplified as:

$$PN_{excess}(Q,\Gamma) = 10\log\left(\frac{\Gamma_{T,rms,2\omega_0}^{2}+\gamma\,\Gamma_{M,rms,eff,2\omega_0}^{2}}{\Gamma_{T,rms,\omega_0}^{2}+\gamma\,\Gamma_{M,rms,eff,\omega_0}^{2}}\cdot\frac{Q_{\omega_0}}{Q_{2\omega_0}}\right) \qquad (2)$$

For an NH-VCO with an up-conversion ratio of $N$ and an output frequency of $N\omega_0$, (2) can be generalized as:

$$PN_{excess}(Q,\Gamma,N) = 10\log\left(\frac{\Gamma_{T,rms,N\omega_0}^{2}+\gamma\,\Gamma_{M,rms,eff,N\omega_0}^{2}}{\Gamma_{T,rms,\omega_0}^{2}+\gamma\,\Gamma_{M,rms,eff,\omega_0}^{2}}\cdot\frac{Q_{\omega_0}}{Q_{N\omega_0}}\right) \qquad (3)$$

Although at lower frequencies Q for the two VCOs can be almost the same, as we approach the mm-wave region the difference in Q between the two oscillators becomes larger.
In addition to Q, the ISFs of both active and passive devices are predicted to be higher in the mm-wave range. The following two subsections analyse and compare the ISF and Q of the two VCO topologies to estimate $PN_{excess}$ and, therefore, the low-PN merits of H-VCOs.

2.2.1. Comparison of On-Chip Passives

The quality factor of a mm-wave LC tank can be limited by either the quality factor of the inductor, $Q_L$, or that of the capacitor and varactor, $Q_C$. The design of the tank must satisfy two considerations. First, it should operate at the maximum achievable Q, which minimizes PN and power consumption. Second, since $f_{osc} = 1/(2\pi\sqrt{LC})$, the inductance should be small enough that the tank requires additional explicit capacitance, provided by the varactor, to maximize the FTR of the VCO. Generally, $Q_L$ of a spiral inductor is limited by loss in the metal lines and the substrate, which at mm-wave frequencies is frequency dependent due to factors such as the skin effect and the proximity effect [38]. Using a double-π equivalent model [39] for inductors, it can be shown that at low frequencies $Q_L$ increases with frequency until it reaches its maximum value Qmax, beyond which it drops as the frequency-dependent loss becomes significant. One way to reduce the substrate loss and increase $Q_L$ in the mm-wave range is to reduce the effective substrate area under the inductor by decreasing the width [38] and the number of turns of the inductor. In this work, $Q_L$ of a single-turn octagonal inductor, suitable for mm-wave signal generation, is studied. The structure is shown in Figure 2-4. The inductance is adjusted by changing the radius and the width of the metal layer. For simulation, both the Sonnet and PickView 3D planar EM simulators are used.
Figure 2-4 presents the differential inductance, the maximum achievable differential quality factor, Qdiff-max, and the frequency at which $Q_L$ reaches its maximum value, fQ-max, versus the octagonal radius (R) for three different metal widths (W = 40 µm, 10 µm, and 4 µm). For a 3H-VCO with its core oscillator operating at 20 GHz, a high Qdiff-max ≈ 36 at 20 GHz is obtained with 175 pH of inductance (W = 40 µm, R = 95 µm), which requires about 350 fF of tank capacitance. For an F-VCO operating at 60 GHz, on the other hand, the designer must choose between two options. An inductor whose Qdiff-max occurs at 60 GHz gives the best PN performance but limits the FTR, because it permits only a small tank capacitance (W = 10 µm, R = 40 µm, Qdiff-max = 27.9, L = 120 pH, C = 55 fF; or W = 4 µm, R = 48 µm, Qdiff-max = 26.6, L = 180 pH, C = 38 fF). A smaller inductance (40 to 60 pH) for W = 10 µm allows a larger FTR, but at a lower Qdiff-max of 19.1 to 19.7. If the inductor limits the quality factor of the LC tank, (3) suggests that the 3H-VCO could have $10\log(Q_{3H\text{-}VCO}/Q_{F\text{-}VCO}) \approx 2.7$ dB better PN.

In the mm-wave range, Qtank is dictated by the loss in the inductors as well as in the MOS varactors and the metal-insulator-metal capacitors (mimcaps), with $1/Q_{tank} \approx 1/Q_L + 1/Q_{mimcap} + 1/Q_{moscap}$. The overall Qtank may therefore be dominated by the varactor loss; varactors often show a lower Q because lower metal layers are used to connect to the MOS device. Figure 2-5 shows how the Q of the MOS varactor and of the mimcap degrades with frequency for three different capacitor values. Qtank for this technology is clearly limited by the loss in the MOS varactors. A 3H-VCO with its core operating at 20 GHz can therefore achieve a significantly higher Qtank than an F-VCO operating at 60 GHz.
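The tank numbers quoted above can be cross-checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the resonance relation is the ideal $f_{osc} = 1/(2\pi\sqrt{LC})$ with no extra parasitics, and the capacitor Q values passed to `tank_q` are illustrative and not taken from Figure 2-5.

```python
import math


def tank_capacitance(f_hz: float, l_h: float) -> float:
    """Total tank capacitance (F) that resonates with inductance l_h at f_hz."""
    return 1.0 / ((2.0 * math.pi * f_hz) ** 2 * l_h)


def tank_q(q_l: float, q_mimcap: float, q_moscap: float) -> float:
    """Overall tank Q from 1/Qtank ~= 1/QL + 1/Qmimcap + 1/Qmoscap."""
    return 1.0 / (1.0 / q_l + 1.0 / q_mimcap + 1.0 / q_moscap)


def q_advantage_db(q_core: float, q_fvco: float) -> float:
    """PN advantage (dB) predicted by the Q ratio alone: 10*log10(Qcore/Qfvco)."""
    return 10.0 * math.log10(q_core / q_fvco)


if __name__ == "__main__":
    # 3H-VCO core: 175 pH at 20 GHz (text quotes about 350 fF).
    print(round(tank_capacitance(20e9, 175e-12) * 1e15, 1), "fF")
    # F-VCO options at 60 GHz: 120 pH and 180 pH (text quotes 55 fF and 38 fF).
    print(round(tank_capacitance(60e9, 120e-12) * 1e15, 1), "fF")
    print(round(tank_capacitance(60e9, 180e-12) * 1e15, 1), "fF")
    # Inductor-limited PN advantage for Q = 36 versus Q = 19.1 to 19.7.
    print(round(q_advantage_db(36.0, 19.1), 2), round(q_advantage_db(36.0, 19.7), 2))
    # Illustrative only: a low-Q varactor pulls the tank Q well below QL.
    print(round(tank_q(36.0, 40.0, 15.0), 1))
```

The ideal resonance formula gives values slightly above the capacitances quoted in the text, which is consistent with part of the capacitance budget being taken by parasitics that the formula ignores.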
Figure 2-4 Differential inductance (Ldiff), maximum differential quality factor (Qdiff-max), and the frequency where the quality factor reaches its maximum value (fQ-max) versus radius (R) of the inductor for three different inductor widths (40 µm, 10 µm, and 4 µm).

Figure 2-5 Quality factor versus frequency of (a) MOS varactors with 30% tuning range (60 fF, 200 fF, and 390 fF), and (b) mimcaps (60 fF, 120 fF, and 240 fF).

2.2.2. Comparison of ISF

Next, we analyse the effect of increasing the operating frequency of the VCO core on $\Gamma_{M,rms,eff}$ and $\Gamma_{T,rms}$. Consider a conventional Class-B oscillator and its associated $\Gamma_M$ and $\Gamma_T$, as shown in Figure 2-6(a). A detailed study of the Class-B topology can be found in [40]. First, let us consider the ISF for M1, namely $\Gamma_M$. Depending on the operating region of the transistors (deep triode or cut-off), the current noise of M1 finds two major paths to reach the output node and generate PN: through M2 when M2 turns on, or through the parasitic tail capacitance, Cpar, to ground when M2 is off and M1 is on. At low operating frequencies, the effect of Cpar is negligible, and the noise current reaches the output only through M2, while the switches are commutating. This results in an almost flat, zero-valued region in $\Gamma_M$ when M2 is off.
However, as the frequency increases, the effect of Cpar becomes more significant and $\Gamma_{M,rms}$ increases, as seen in the plots of $\Gamma_M$ at 7 GHz, 21 GHz, and 63 GHz in Figure 2-6(b)-(d), respectively (all simulations use the same Qtank, and the core transistors are optimised for the lowest PN). Two conclusions follow: (1) owing to its lower core frequency, the H-VCO architecture has a smaller $\Gamma_{M,rms,eff}$ than the F-VCO and therefore a better PN, and (2) to alleviate the effect of Cpar in the mm-wave range, one can use an alternative topology that is less sensitive to the tail parasitic capacitance. For example, a Class-C VCO topology, in which the transistors are in saturation and the tail capacitance improves the PN performance [36], can be attractive.

As for the tank-referred ISF, $\Gamma_{T,rms}$ is expected to increase in the mm-wave range. This is predominantly due to increased AM-to-PM noise conversion from the nonlinear capacitances of the active core transistors, which become comparable in size to the varactors at higher frequencies. Figure 2-7 shows the simulation results of $\Gamma_{T,rms}$ versus frequency (all simulations use the same Qtank and core transistors, while the frequency is changed by adjusting the value of the tank capacitance). Increasing the frequency from 20 GHz to 60 GHz increases $\Gamma_{T,rms}$ by about 26.8%.

Figure 2-6 (a) Conceptual ISFs, $\Gamma_M$ and $\Gamma_T$, at low frequency for a Class-B VCO. Simulated $\Gamma_M$ at (b) 7 GHz, (c) 21 GHz, and (d) 63 GHz. The ISF is normalized to the maximum charge displacement, qmax [36].

Figure 2-7 Simulated $\Gamma_T$ of a Class-B VCO at 7 GHz, 21 GHz, and 63 GHz (single-ended noise is injected as shown in Figure 2-6(a)). The ISF is normalized to the maximum charge displacement, qmax [36].

2.2.3. $PN_{excess}$ of a Class-B VCO for Different Up-conversion Ratios

In the previous subsection, we compared and discussed the parameters contained in $PN_{excess}(Q,\Gamma,N)$. To obtain realistic values for $PN_{excess}(Q,\Gamma,N)$, a 2H-VCO and a 3H-VCO are compared with an F-VCO at different frequencies. Here, the F-VCO is realized with a Class-B oscillator (Figure 2-8(a)), the 2H-VCO is implemented using a push-push Class-B VCO (Figure 2-8(b)), and the 3H-VCO is obtained by self-mixing the output of a Class-B VCO with the 2nd harmonic from its tail using an ideal mixer (Figure 2-8(c)). In each simulation, the LC tank and the active core are separately optimised for the target frequency to achieve the minimum PN (all inductors are designed and optimised using PickView). Figure 2-9 presents the simulation results of $PN_{excess}(Q,\Gamma,3)$ and $PN_{excess}(Q,\Gamma,2)$ for oscillators from 1 GHz to 100 GHz. As predicted, at low frequencies (e.g., below 10 GHz), where the quality factors of the passives and the ISFs of the H-VCO and the F-VCO are close to each other, $PN_{excess}$ is negligible. However, as the frequency increases, the H-VCO shows a better PN performance owing to the superior Q and lower noise sensitivity of its core oscillator. In the next section, we discuss the implementation of the proposed 3H-VCO.

Figure 2-8 (a) A 2H-VCO (push-push Class-B core), and (b) a 3H-VCO with SMV architecture (Class-B core).

Figure 2-9 Simulation results for $PN_{excess}(\Gamma,Q,N)$ versus frequency for N = 2 and N = 3. Phase noise is simulated at 1 MHz offset frequency.

2.3. PROPOSED 60 GHZ 3H-VCO

As suggested earlier, Class-C operation can alleviate the detrimental effect of the tail parasitic capacitance on $\Gamma_{M,rms}$ by changing the operating region of the transistors. We will further show that a Class-C topology also presents a lower active-core parasitic capacitance across the LC tank and hence an improved FTR.
Thus, we focus our attention on the Class-C implementation in this section. Figure 2-10 illustrates two possible architectures for achieving a Class-C 3H-VCO. In Figure 2-10(a), the 3rd harmonic of a Class-C oscillator is extracted using a transformer tuned at 60 GHz. An alternative, the SMV, is shown in Figure 2-10(b). In the first stage, the structure uses a Class-C push-push VCO to generate the f0 and 2f0 components at 20 GHz and 40 GHz, respectively. In the second stage, a single-balanced active mixer combines the f0 and 2f0 components and generates the desired LO component at 3f0 (~60 GHz) at its output. A λ/4 (at 2f0) transmission line is used for biasing the VCO and the mixer as well as to maximize the 2nd-harmonic component. This λ/4 line is ideally open (high impedance) at 2f0 and forces the second-harmonic current I2f0 to flow into the mixer. The mixer is tuned at 3f0 (~60 GHz) to provide frequency selectivity and suppress spurious components at the lower mixing sideband. To avoid transformer design, in this work we focus on the implementation of the Class-C SMV shown in Figure 2-10(b).

Figure 2-10 (a) Transformer-based 3rd-harmonic extraction. (b) SMV implementation [14].

2.3.1. Benefits of Using Class-C Push-Push VCO

The operating class of the core VCO significantly affects the amount of 2f0 component generated at its output. A smaller 2f0 component reduces the conversion gain of the mixer in the SMV, thereby necessitating a larger power consumption in the mixer to increase the signal swing of the 3f0 output signal. The Class-B push-push VCO has been frequently used for generating 2f0 [41],[42]. However, it achieves a low DC-to-2f0 efficiency because of the nature of the VCO current waveforms. Figure 2-11 shows a Class-B cross-coupled VCO, where the cross-coupled transistors (M1 and M2) operate mostly in the triode region. In contrast, in a Class-C design, with VBias-Gate < VDD, these transistors operate in the saturation region when conducting.
Also, a Class-C VCO employs a large capacitor in parallel with the tail device. In comparison to ideal Class-B operation with a square-wave drain current, a Class-C VCO shows a pulsed (Class-C) current waveform with a conduction angle of ΦC < π [43]. This has two main benefits. First, while the square-wave drain current of ideal Class-B operation has zero 2nd harmonic (although in a real implementation the current waveform is not an ideal square wave and does contain some 2nd harmonic), the Class-C drain current generates larger 1st and 2nd harmonics and consequently achieves higher DC-to-f0 and DC-to-2f0 efficiencies. An expression for the DC-to-nf0 current efficiency ($\eta_{I,n}$) for Class-C operation is derived later in the Appendix and is plotted in Figure 2-12 as a function of one-half the conduction angle for various n. The second benefit of Class-C, as shown in [36][43] and attributed to its better current efficiency, is that the Class-C drain-current waveform results in 3.9 dB lower PN at the same DC power consumption (or 50% lower DC current for the same PN [43]).

Figure 2-11 Cross-coupled pair in Class-C and Class-B VCOs and corresponding drain currents.

In addition to a better DC-to-2f0 efficiency and superior PN performance, the Class-C VCO has a lower parasitic capacitance, Ccore, across the resonator LC tank.

Figure 2-12 Current efficiency versus conduction angle (n = 1 to 5).

Table 2-1 Operating regions of transistors in a Class-B VCO

| Operating region | Transistor state | Ccore |
|---|---|---|
| I. Small swing | M1 & M2 in saturation | $C_{gs} + 4C_{gd} + C_{db} \approx \frac{2}{3}WLC_{OX} + 5WC_{ov}$ |
| II. (or IV) Medium swing | M1 off, M2 in saturation, or vice versa | $-\frac{1}{2}\,\frac{g_m^{2}\,C_{Tail}}{g_m^{2} + (C_{Tail}+C_{gs})^{2}\omega^{2}} + 4C_{gd}$ |
| III. Large swing | M1 off, M2 in triode | $\frac{1}{2}\,\frac{g_{ds}^{2}\,(2C_{gs}+C_{Tail})}{g_{ds}^{2} + (C_{Tail}+C_{gs})^{2}\omega^{2}} + 2C_{gd} + C_{db}$ |

Consider a cross-coupled pair and its associated parasitics as shown in Figure 2-13.
Here, Ccore represents the active-core parasitic capacitance and is a function of $C_{gs}$, $C_{gd}$, $C_{db}$, $g_m$, $g_{ds}$, and CTail. As shown in [44], the transistors in a Class-B core pass through four different operating regions over a complete period of the signal swing. Table 2-1 shows the different operating regions and the expression for the instantaneous Ccore in each region [44]. Here Cov is the overlap capacitance per unit channel width (W), and COX is the gate-oxide capacitance per unit area. From Table 2-1 and Figure 2-13, even though Cgs is larger than or equal to Cgd in all operating regions, the contribution of Cgd to Ccore is amplified by 4× due to the differential swing and is dominant. In operating region III, both Cgd and Cdb are large, and if CTail is made large in a Class-B design in order to reduce flicker-noise up-conversion into the output phase noise, Ccore becomes significantly large in this region. Averaged over the entire period, the large-signal parasitic capacitance (CD-par) for Class-B remains considerable, severely limiting the tuning range at mm-wave frequencies.

Figure 2-13 Parasitic capacitances in a cross-coupled oscillator.

Table 2-2 Operating regions of transistors in a Class-C VCO

| Operating region | Transistor state | Ccore |
|---|---|---|
| I. | Both M1 & M2 off | ≈ 4Cgd,off + Cgs1,off = 5WCov |
| II. (or III.) | M1 off, M2 in saturation, or vice versa | ≈ Cgs |

On the other hand, to find the instantaneous small-signal capacitance of a Class-C VCO, the oscillation period can be divided into three regions: (I) when both cross-coupled transistors are off, and (II) (or (III)) when only one device is on and operates in the saturation region.
Table 2-2 shows the operating regions in Class-C operation and the expression for the instantaneous Ccore in each region. By avoiding operation in the triode region, and with a conduction angle smaller than that of Class-B, a Class-C core has a much lower CD-par and, therefore, a larger tuning range at mm-wave frequencies. Figure 2-14 compares the simulated drain capacitance, CD-Par, for Class-C and Class-B VCOs with the same core transistor size. As can be seen, CD-Par in Class-C is almost 2/3 of that in Class-B. A Class-C VCO therefore ensures a higher FTR, especially at mm-wave frequencies.

Figure 2-14 Parasitic capacitance (CD-Par) of the cross-coupled pair versus tail current in Class-B (with 200 fF parasitic capacitance at the tail) and Class-C (with Ctail of 1 pF) VCOs at 20 GHz.

2.3.2. Comparison of Output Signal Swing

VCOs are often designed and compared in terms of PN, FTR, and power dissipation. Two popular figures of merit (FoMs) for comparing VCOs are:

$$ FoM = PN - 20\log\left(\frac{f_0}{\Delta f}\right) + 10\log\left(\frac{P_{DC}}{1\,\mathrm{mW}}\right) \quad (4) $$

$$ FoM_T = FoM - 20\log\left(\frac{FTR}{10}\right) \quad (5) $$

where PN is the phase noise at an offset Δf from the carrier frequency f0, P_DC is the DC power consumption, and FTR is expressed in percent. Although these FoMs are applicable to mm-wave VCOs, and are indeed widely used and reported [25][27], they do not directly incorporate the output signal swing of the VCOs or the power consumption of the buffers. The buffers are often designed to drive 50 Ω for test purposes, as is the case for the prototype presented in Section IV. For a mm-wave VCO to be used in a monolithic transceiver, it must have a sufficiently high voltage swing and be able to drive a capacitive load. F-VCOs usually have large signal swings, although at the expense of a severe trade-off with FTR, PN, and the power dissipation needed to start and sustain oscillation. On the other hand, H-VCOs have improved PN, FTR, and core power dissipation, as described earlier, but suffer from insufficient output swing.
To amplify the signal swing, additional power must be consumed in buffers that can drive a capacitive load. For a fair comparison between the F-VCO and H-VCOs at similar output signal swing, five different topologies are simulated, as shown in Table 2-3. These include a 60 GHz F-VCO, a 60 GHz 2H-VCO (Figure 2-3(a)) with the core operating at 30 GHz, and a 60 GHz 3H-VCO (Figure 2-10(b)) with the core operating at 20 GHz. All the cores use a Class-C topology for a fair comparison in terms of FTR and PN. For the H-VCOs, the output of the mixer is amplified using a 3-stage buffer, each stage being a common-source buffer with a 1:2 transformer-based resonant load at 60 GHz. The final stage drives a 30 fF load with a single-ended swing of at least 900 mV, assumed to be large enough to switch transistors on and off in a 130-nm CMOS process. The 1:2 transformers in the buffer are implemented with a quality factor of 10 in the primary and secondary turns. For the F-VCO, only one buffer stage is needed. Furthermore, for each H-VCO, both active (Figure 2-10(b)) and passive mixers are implemented. The main concerns for the mixer design are the capacitive loading of the core and the conversion gain, both of which are best addressed by an active mixer. However, passive mixers, implemented using transmission gates as presented in [40], do not consume static power. The common-mode input levels of the passive mixer are held at ground to reduce power consumption. No considerable difference in PN performance between the passive and active mixers is seen for the same H-VCO design. Each design operates from a 1.2 V supply.

Table 2-3 presents the simulation results for an iso-swing design comparison, from which several conclusions can be drawn:
(1) Although the F-VCO has reduced power consumption and design complexity, since it needs only one stage of buffer amplification, its performance is poor in terms of PN and FTR.
(2) Although the 2H-VCO (or F-VCO) does not need to generate any second harmonic, which suffers from a smaller voltage swing, the lower Qtank at 30 GHz (or 60 GHz) compared to 20 GHz degrades the PN significantly.
(3) The VCO core experiences a higher capacitive loading in the 2H-VCO topology than in the 3H-VCO, which lowers the FTR.
(4) VCO cores operating at higher frequencies must draw a larger current from the supply in order to ensure oscillation start-up.
We observe that, even when accounting for signal swing, the 3H-VCO with an active mixer offers the best PN and tuning range, and the best FoM once the total (core plus buffer) power consumption is included.

Table 2-3 Simulated Performance Comparison of 60 GHz VCO Topologies

| Parameters | Active-mixer 3H-VCO (This work) | Passive-mixer 3H-VCO | Active-mixer 2H-VCO | Passive-mixer 2H-VCO | F-VCO |
|---|---|---|---|---|---|
| Mixer gain (V/A = Ω) * | 152.46 | 28.60 | 150 | 28.52 | NA |
| PN (dBc/Hz) @ 1 MHz | –102.4 | –101.5 | –97.2 | –96.8 | –94.7 |
| FTR (%) | 10.3 | 10 | 8.3 | 5.8 | 4.0 |
| Single-ended voltage swing (mVpk-pk) | 900 | 900 | 900 | 900 | 900 |
| Power (core) (mW) | 18.5 | 17.4 | 21.4 | 16.2 | 16.36 |
| Power (buffer) (mW) | 10.3 | 25.8 | 10.4 | 11.2 | 4.28 |
| Total power (mW) | 28.8 | 43.2 | 31.4 | 24.4 | 20.64 |
| FoM (dBc/Hz) (core) @ 60 GHz | –185.3 | –184.6 | –179.43 | –180.26 | –178.13 |
| FoM (dBc/Hz) (core+buffer) | –183.36 | –180.7 | –177.77 | –178.08 | –177.12 |
| FoMT (dBc/Hz) (core+buffer) | –183.62 | –180.7 | –176.15 | –173.04 | –169.16 |

* Since the mixers operate in current mode, fed by the harmonic current generated by the push-push VCO at its common-mode node, the gain of the mixer is expressed in ohms.

2.4. MEASUREMENT RESULTS

The 3H-SMV shown in Figure 2-10(b) is designed and fabricated in a 0.13-μm CMOS process. The core Class-C VCO is designed to operate from 17 to 21 GHz, and the mixer and the output buffer are tuned to 60 GHz. For this prototype, a large signal swing into a capacitive load is not a design constraint.
Instead, a single-stage output buffer is designed to drive the 50-Ω load of the test equipment so that a comparison can be made with state-of-the-art designs. The VCO varactor is implemented using a thick-oxide accumulation-mode MOS varactor. Both LBias and LB are implemented using a λ/4 transmission line. All transmission lines and inductors are modelled and simulated using the Momentum planar electromagnetic software.

Figure 2-15 Measurement setup and chip micrograph of the SMV architecture.

Figure 2-15 shows the die micrograph. The active die area (excluding pads) is about 300×670 μm2. The device under test (DUT) is directly probed (using 50-67 GHz Cascade Infinity probes) and measured with a signal and spectrum analyzer (R&S FSW67). The output signal power of the SMV is -31 dBm, measured at the maximum output frequency of 62.48 GHz, as shown in Figure 2-16. Figure 2-17 shows the tuning range of the SMV, which spans 52.8 to 62.5 GHz, an FTR of 16.8%. Figure 2-18 shows the measured PN of the SMV at a carrier frequency of 53.68 GHz, with a PN of −100.6 dBc/Hz at 1 MHz offset and −124.8 dBc/Hz at 10 MHz offset. The variation in the PN, measured at 1 MHz offset from the carrier, is shown in Figure 2-19(a) across the entire tuning range of the VCO. The overall variation is about 2 dB. Figure 2-19(b) shows the corresponding variation in the FoM across the tuning range. The performance summary of the Class-C SMV architecture and its comparison with state-of-the-art designs is presented in Table 2-4.

Figure 2-16 Measured SMV spectrum at the buffer output at 62.48 GHz.

Figure 2-17 Measured frequency tuning range (output frequency versus tuning voltage; 9.7 GHz span).

Figure 2-18 Measured SMV phase noise at 17.9 GHz (fundamental), 53.68 GHz (tripled, lowest frequency) and 62.49 GHz (tripled, highest frequency).
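The two figures of merit in equations (4) and (5) are simple enough to check by hand or in code. The following is a minimal sketch with hypothetical function names; the inputs are the values reported for this work in Table 2-4 (PN of –100.57 dBc/Hz at 1 MHz offset from a 53.6 GHz carrier, 7.6 mW DC power, and 16.8% FTR), and the 1 mW reference and FTR normalization to 10% follow the definitions of the FoMs.

```python
import math


def fom_db(pn_dbc_hz: float, f0_hz: float, offset_hz: float, pdc_mw: float) -> float:
    """Equation (4): FoM = PN - 20*log10(f0/df) + 10*log10(Pdc / 1 mW)."""
    return pn_dbc_hz - 20.0 * math.log10(f0_hz / offset_hz) + 10.0 * math.log10(pdc_mw / 1.0)


def fom_t_db(fom: float, ftr_percent: float) -> float:
    """Equation (5): FoM_T = FoM - 20*log10(FTR / 10), with FTR in percent."""
    return fom - 20.0 * math.log10(ftr_percent / 10.0)


if __name__ == "__main__":
    fom = fom_db(pn_dbc_hz=-100.57, f0_hz=53.6e9, offset_hz=1e6, pdc_mw=7.6)
    fom_t = fom_t_db(fom, ftr_percent=16.8)
    print(round(fom, 2), round(fom_t, 2))
```

Evaluated with these inputs, the sketch lands within rounding of the –186.3 dBc/Hz and –190.85 dBc/Hz listed for this work, which suggests that the tabulated FoMs use the 1 MHz offset PN and the total (VCO plus mixer) DC power.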
Figure 2-19 Measured (a) phase noise (at 1 MHz offset) and (b) FoM versus SMV output frequency.

Table 2-4 Measured Performance Summary of State-of-the-Art Millimetre-Wave VCOs

| Ref. | Architecture | Freq. (GHz) | FTR (%) | PN (dBc/Hz) @ 1 MHz | PN (dBc/Hz) @ 10 MHz | VDD (V) | PDC (mW) | Buffered POUT (dBm) | CMOS Tech. (nm) | FoM (dBc/Hz) | FoMT (dBc/Hz) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| This work | H-VCO, Class-C self-mixing VCO (SMV) | 52.8-62.5 | 16.8 | –100.57 @ 53.6 GHz | –124.75 @ 53.6 GHz | 1.2 | 7.6 (4.1 VCO + 3.5 mixer) | –28 @ 52.8 GHz, –31 @ 62 GHz | 130 | –186.3 | –190.85 |
| [25] | 3H-VCO, transformer-based 3rd-order extractor | 58 | 25.4 | –100.1 | –122.3 | 1 | 13.5 | NA | 40 | –181.5 | –189.6 |
| [26] | F-VCO, Class-B, inductive peaking | 64 | 8.7 | –95 | NA | 0.6 | 3.16 | –20 @ 64 GHz | 90 | –186 | –185 |
| [27] | 3H-VCO, triple-push VCO | 67.8 | 13.6 | NA | –95 | 1.4 | 18 | –36.4 @ 63.2 GHz | 130 | –159 | –161.7 |
| [32] | F-VCO, dual-mode tank | 66.1 | 27.9 | NA | –103 | 1 | 13 | –31 @ 75 GHz | 65 | –168 | –177 |
| [38] | 3H-ILO, injection-locked | 60.5 | 8.2 | –95* | –113* | 1.2 | >33.6 (VCO+ILO) | –10 @ 60.48 GHz* | 65 | –175 | –173.4 |
| [45] | F-VCO, Class-B DCO | 57 | 11.6 | –92 | NA | 1.2 | 13.2 | NA | 65 | –176.4 | –177.7 |
| [46] | F-VCO, inductive divider feedback | 54 | 9.1 | –95 | –119.2 | 1 | 24 | NA | 65 | –179.8 | –179 |
| [32] | F-VCO, magnetically tuned multimode | 73.8 | 41 | NA | –104.6 to –112.2 | 1.2 | 8.4 to 10.8 | NA | 65 | –172 to –180 | –184 to –192.2 |
| [47] | F-VCO, inductive-division LC tank | 61.7 | 4.81 | –90 | NA | 0.43 | 1.2 | NA | 90 | –185 | –178.6 |
| [48] | F-VCO, CMOS VCO | 70.2 | 9.55 | NA | –106.14 | 1 | 5.4 | –37 @ 73 GHz | 65 | –175.76 | –175.36 |
| [49] | F-VCO, transformer-based dual-band | 67.9 | 19.8 | NA | –105.8 to –112 | 1.2 | 1.44 | NA | 65 | –173.8 to –180.4 | –180 to –187.4 |
| [50] | F-VCO, magnetically coupled | 59.3 | 39 | NA | –101.7 to –113.4 | 1 | 8.9 to 10.4 | –31 @ 75 GHz | 65 | –167.8 to –179 | –179.6 to –190.6 |

* Includes off-chip amplification.

2.5. CONCLUSION

A self-mixing-VCO architecture is described for mm-wave applications.
The structure uses a push-push Class-C VCO for low PN, high tuning range, and higher first- and second-harmonic content, and then mixes these two harmonics to generate the third harmonic as the desired output. We refer to this architecture as a 3H-VCO. Analyses and simulations confirm that this indirect LO signal generation technique has a superior tuning range and PN performance compared with direct synthesis techniques. As a proof of concept, a 60-GHz 3H-VCO prototype is designed and fabricated in a 0.13-μm CMOS process. The measured performance of the implemented 60-GHz prototype compares favourably with state-of-the-art designs.

HIGH-POWER OVER FMAX SIGNAL GENERATION: SLOW-WAVE 300 GHZ SOURCE¹

3.1. INTRODUCTION

The scaling of cost-effective CMOS processes has attracted designers to implement sub-THz and THz building blocks in recent years [5], [17], [18], [26], [51]–[56]. Ranging from 300 GHz to 3 THz, the THz spectrum lies between the domains of optics and electronics on the frequency scale. The THz band offers a wide range of applications. For wireless communication, the availability of very wide unused bandwidth in the THz spectrum makes it possible to build wireless data links with data rates in excess of tens of gigabits per second (e.g., by using a modulation scheme such as amplitude shift keying (ASK)). In addition, THz waves can penetrate clothing and detect concealed items, making them useful for high-resolution imaging. THz spectroscopy is effective in cancer detection, tooth cavity detection, and food quality control. It has been well established over the past decade that THz signal detection is quite feasible in CMOS [5]; however, efficient THz signal generation in CMOS is still a bottleneck and an open topic for research.
Most of the above-mentioned applications require a high-power (>1 mW) THz source, which has usually been implemented in costly non-silicon technologies such as HBT/HEMT oscillators, quantum cascade lasers, and group III–V based multipliers. Generating a THz signal in CMOS poses three main challenges. First, direct synthesis is impossible because of the transistor's limited maximum frequency of oscillation (fmax; for example, fmax ≈ 200 GHz in a 65-nm process [17], [51]); hence indirect signal synthesis or multiplication is required, which reduces the generated output power. As a result, high-power generation requires constructive summation of weak harmonics, which in most cases requires a fully symmetric layout; any mismatch between the cores results in poor superposition and therefore lower power. Second, the indirect signal-synthesis approach has a poor DC-to-RF efficiency (typically between 0.1% and 1.5%) [5], [17], [18], [26], [51]–[56]. Third, the FTR is limited by parasitic capacitances, and using an explicit varactor is impractical because of its poor quality factor [51].

Figure 3-1 General steps for generating a high-power THz source.

To overcome these challenges, most CMOS-based THz sources use coupled harmonic generators, such as N-push architectures, to generate and combine THz harmonics (as shown in Figure 3-1). Among them, the push-push VCO (PPV) structure has attracted great attention owing to its even-harmonic extraction capability and the ease of layout afforded by its symmetry [17], [51], [52], [57].

¹ The material presented in this subsection is based on [112].
However, it should be noted that the maximum achievable fundamental frequency (fosc-max) of a PPV is architecture dependent and in most cases is a fraction of fmax (e.g., fosc-max ≈ fmax/2 for a Class-B PPV, which gives fosc-max ≈ 100 GHz in a 65-nm CMOS process [17], [51]). Thus, 2nd-harmonic extraction is practical only for generating frequencies close to fmax, which is typically below the THz band (e.g., f2nd ≈ 200 GHz). To further increase the frequency using a push-push structure, the 4th harmonic is extracted [17]. Although this approach requires a lower fundamental frequency and benefits from the symmetric structure of the PPV, it results in a weaker output power and poor DC-to-RF efficiency (0.13% at 320 GHz [51] and 0.03% at 256 GHz [16]). For odd harmonics, a triple-push VCO (TPV) structure can be used [55], [56], [58]. It has been shown that the TPV is a good candidate for pushing fosc-max close to the fmax of a process [51], [56]. Most of the reported THz TPVs are single-stage (step 2 in Figure 3-1) and not coupled; thus the reported output powers are in the range of –10 to –6 dBm [55], [56]. One likely reason is the layout complexity of the TPV, where DRC rules force the use of an asymmetric tank (e.g., 60-degree bends are not allowed in some processes).

In this chapter, to improve the output power and DC-to-RF efficiency, we propose a quad-core passively coupled TPV that extracts the third harmonic and delivers 0.9 dBm (1.25 mW) at the third harmonic. As will be shown, compared to 4th-harmonic extraction, 3rd-harmonic extraction results in a 1.5× higher efficiency in the band of 250 to 300 GHz. To further boost the efficiency, each VCO uses a slow-wave inductor for its tank and combiner. Compared to the conventional CPW and grounded-CPW (GCPW) structures, the proposed slow-wave structure can reach a 40% higher quality factor (Q) and 2.6 dB lower insertion loss.
Measurement results confirm that, compared to CPW or GCPW, S-CPW delivers 2.6 dB higher output power (the CPW, GCPW, and S-CPW versions are all measured). The chapter is organized as follows: Section II briefly compares the efficiency of PPV and TPV structures. Section III presents the proposed slow-wave triple-push VCO. Measurement results and concluding remarks are provided in Sections IV and V, respectively.

3.2. PPV AND TPV EFFICIENCY COMPARISON

To compare the harmonic efficiency of the PPV and the TPV over frequency, the two structures shown in Figure 3-2 are designed and simulated in a 65-nm CMOS process. At each frequency, the component values of the core oscillator of each structure are adjusted to optimize the DC-to-RF efficiency. For the purpose of simulation, ideal passive components such as the RF choke (RFC) and the combiner are used. The Q of the LC tanks is chosen to be ~30 at all frequencies. In addition, both architectures use LGate to boost their effective fmax [17], [51]. Figure 3-2 plots the DC-to-RF efficiency for the 2nd, 3rd, and 4th harmonics. For the simulated PPV structure, the maximum fundamental frequency, fosc-max, is about 140 GHz, which would generate 2nd and 4th harmonics at 280 and 560 GHz. Figure 3-2 suggests that for frequencies higher than 210 GHz, 2nd-harmonic generation is not as efficient as its 3rd-harmonic counterpart. This can be attributed to the fact that the fundamental frequency is approaching the fmax of the transistor. For frequencies higher than 360 GHz, 4th-harmonic generation is preferred. In this work, our target frequency is below 360 GHz, and thus we focus on the TPV architecture, which offers superior efficiency according to Figure 3-2.

Figure 3-2 (a) Simulated PPV and TPV with their (b) DC-to-RF efficiency versus frequency in a 65-nm CMOS process with an ideal combiner. The plot annotations mark the frequency ranges in which it is more efficient to extract the 2nd harmonic (e.g., by using push-push), the 3rd harmonic (e.g., by using triple-push), or the 4th harmonic (e.g., by using push-push).

3.3. THE PROPOSED SLOW-WAVE QUAD-CORE-COUPLED TPV

Figure 3-3(a) shows the proposed TPV. Four triple-push oscillators are coupled in phase, and their third harmonics are combined and matched to 50 Ω. Each oscillator is tuned at 100 GHz, and the tank (40 pH inductor) is implemented using a slot-type floating S-CPW, as shown in Figure 3-3(b) [59]. LGate is implemented using a CPW line to control the drain-gate phase and boost the gm of the devices [17], [51]. At the third harmonic, LGate presents a high impedance (ideally an open circuit) and the generated harmonic current flows into the centre-tap (CT) node. The oscillators are coupled in phase at the fundamental frequency by coupling the drain node of each transistor to the consecutive stage. The generated 3rd harmonics are then combined with four shielded S-CPW lines and connected to the output pad. A 5-port electromagnetic (EM) simulation is carried out to match the combiner to 50 Ω. Figure 3-3(c) shows the output matching (S11) of the combiner. As will be discussed next, using a slotted slow-wave structure for the inductor and the combiner results in a higher quality factor, which in turn relaxes the start-up condition of the oscillator and reduces the insertion loss of the combiner.

Figure 3-3 (a) Proposed slow-wave quad-core-coupled TPV, (b) S-CPW structure, and (c) output matching (S11) versus frequency.

3.3.1.1. Slotted S-CPW and Comparison with Conventional CPW

Figure 3-4 illustrates the difference between GCPW and S-CPW lines. The primary goal of using patterned CPW lines is to isolate the lossy substrate from the signal path and reduce the associated eddy-current loss in the substrate.
Theoretically, GCPWs are able to fully isolate the substrate from the signal path; however, in practice, providing a true 0 V reference is impossible, and thus signal can be induced onto the ground plane, which degrades the substrate/signal isolation [59]. An alternative solution is to use slotted S-CPW. Since the shield is a good conductor, there is no electric field tangential to the strips; the voltage on the shield is therefore zero with respect to the CPW, and the S-CPW can provide a better shield than GCPW [59]. The phase velocity is given by:

$v_p = \frac{\omega}{\beta} = \frac{c_0}{\sqrt{\varepsilon_{r,eff}}} = \frac{1}{\sqrt{LC}}$,

where $\omega$ is the angular frequency, $\beta$ is the propagation constant, $c_0$ is the speed of light, $\varepsilon_{r,eff}$ is the effective substrate permittivity, and L and C are the inductance and capacitance per unit length. Using slotted strips under the CPW increases the effective C without impacting the inductance significantly. Consequently, $\varepsilon_{r,eff}$ (or $\beta$) increases. It can be shown that the quality factor of the transmission line can be written as [59]:

$Q = \frac{\beta}{2\alpha} = \frac{\omega\sqrt{\varepsilon_{r,eff}}}{2 c_0 \alpha}$,

where $\alpha$ is the attenuation constant of the line. Increasing $\varepsilon_{r,eff}$ using floating strip lines in turn boosts Q. To validate the phenomenon, the slotted S-CPW is designed using the thickest top metal (M9) with slots on the next metal layer (M8). The structure is EM simulated and compared with conventional CPW and GCPW. Figure 3 - 5 shows simulation results of the insertion loss as well as the quality factor of the structures. The S-CPW attains around 50% better quality factor and 2 dB lower insertion loss at 300 GHz.
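As a rough numerical illustration of the two relations above, the following Python sketch evaluates the phase velocity and Q = β/(2α) for two lines that differ only in effective permittivity. The permittivity and attenuation values are hypothetical placeholders chosen for illustration; they are not taken from the EM simulations of Figure 3 - 5, and the function names are our own.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum (m/s)


def phase_velocity(eps_r_eff):
    """Phase velocity v_p = c0 / sqrt(eps_r_eff), in m/s."""
    return C0 / math.sqrt(eps_r_eff)


def line_q(freq_hz, eps_r_eff, alpha_np_per_m):
    """Quality factor Q = beta / (2 * alpha) of a transmission line."""
    beta = 2.0 * math.pi * freq_hz * math.sqrt(eps_r_eff) / C0  # rad/m
    return beta / (2.0 * alpha_np_per_m)


if __name__ == "__main__":
    f = 300e9      # evaluation frequency (Hz)
    alpha = 500.0  # assumed attenuation constant (Np/m), hypothetical
    # Assumed effective permittivities, hypothetical values for illustration.
    for name, eps in (("plain CPW", 4.0), ("slow-wave CPW", 9.0)):
        print(name, phase_velocity(eps), line_q(f, eps, alpha))
```

For a fixed attenuation constant, Q scales with the square root of the effective permittivity, so in this hypothetical case the slow-wave line has a Q that is sqrt(9/4) = 1.5 times higher. In a real line the attenuation constant also changes with the shield geometry, which this sketch does not model.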
Figure 3 - 4 Grounded and slow-wave CPW structures

[Figure 3 - 5 plots: quality factor Q versus frequency (GHz) and S21 (dB) versus frequency (GHz), each for the GCPW, S-CPW, and CPW lines.]

Figure 3 - 5 Q and S21 insertion loss of S-CPW, GCPW, and CPW

Table 3 - 1 Comparison of CPW, GCPW, and S-CPW TPV
TPV Architecture | Frequency (GHz) | Peak Output Power (dBm) | DC-to-RF Efficiency (%)
Single with CPW tank | 298 | –13.9 | 0.20
Single with GCPW tank | 301 | –13.2 | 0.22
Single with S-CPW tank | 300 | –10.8 | 0.6
Quad-core with CPW Combiner | 297 | –3.2 | 0.15
Quad-core with GCPW Combiner | 301 | –1.7 | 0.21
Quad-core with S-CPW Combiner | 299 | 0.9 | 0.51

3.4. IMPLEMENTATION AND MEASUREMENT RESULTS

As a proof-of-concept, the proposed slow-wave TPV (Figure 3 - 3) is designed and implemented in a 65-nm CMOS process. Figure 3 - 6 shows chip micrographs of the single-stage and quad-core-coupled TPVs. The active die area (including pads) of the quad-core-coupled TPV is about 290×316 μm² (Figure 3 - 6(c)). To confirm the advantages of the slow-wave design, the same TPV structure is replicated using GCPW and CPW combiners and tanks (Figure 3 - 6(a) and Figure 3 - 6(b)). All passive components are simulated using the Sonnet 3D electromagnetic (EM) simulator. Figure 3 - 7 shows the test setup used for the measurements. For frequency measurements, the chip is probed and the VCO signal is down-converted using an OML M03HWD harmonic mixer (the chip is also measured using a VDI WR3.4 sub-harmonic mixer). The LO is provided by an Agilent E4448A PSA spectrum analyser, which has an added capability to map the down-converted signal back to its original frequency. The output power is measured using an Erickson PM4 power meter. Table 3 - 1 summarizes the measurement results for the different flavours of the implemented structure. As can be seen from the table, the slow-wave TPV has the best performance, with 2.6 dB higher output power and more than 2× better DC-to-RF efficiency compared to the other implementations.
The advantage of using a quad-core-coupled architecture is also apparent. Figure 3 - 8(a) shows the captured 300 GHz signal (note that the spectrum analyser has mapped the down-converted IF signal back to the RF band). The VCO is tuned by changing the supply voltage, which in turn changes the gate parasitics of the MOS devices. Figure 3 - 8(b) shows the output power and tuning range of the proposed prototype. Table 3 - 2 summarizes the performance of the proposed slow-wave TPV prototype and includes the performance of the related state-of-the-art designs for the purpose of comparison. The proposed design compares favourably with the state-of-the-art and achieves 2.4 dB higher output power and 2× better efficiency than the best performing prior design at 300 GHz [58].

[Figure 3 - 6 micrograph labels: S-CPW combiners, floating strips.]

Figure 3 - 6 Chip micrographs for proposed TPVs

Figure 3 - 7 Measurement Setup

[Figure 3 - 8(b) plot: frequency (GHz) and output power (dBm) versus supply voltage (V).]

Figure 3 - 8 (a) Measured spectrum at 300.8 GHz, and (b) frequency tuning and output power versus supply voltage

Table 3 - 2 Performance Summary of State-of-the-Art THz VCOs
Parameters | This Work | [51] JSSC 2012 | [17] JSSC 2011 | [58] JSSC 2013 | [56] JSSC 2014 | [55] JSSC 2015
Architecture | 4-core Slow-Wave Coupled TPV | 4-core GCPW Coupled PPV | Single-core TPV | 2-core Coupled TPV | Single-core TPV | Single-core Injection Locked TPV
Harmonic | 3rd | 4th | 3rd | 3rd | 3rd | 3rd
Frequency (GHz) | 295–301 | 320 | 482 | 288 | 280–303 | 504
FTR (%) | 1.7 | 2.6 | NA | NA | 8 | 4.4
PN (dBc/Hz) @ 1 MHz | –79 | –77 | –76 | –87 | –80.26 | –77.6
VDD (V) | 0.95 | 1.3 | 1 | 1.25 | 1.8 | 1.2
PDC (mW) | 235 | 339 | 61 | 275 | 105.6 | 150
Pout (dBm) | 0.9 @ 299 GHz | –3.3 @ 320 GHz | –7.9 @ 482 GHz | –1.5 @ 288 GHz | –14 @ 300 GHz | –15.3 @ 504 GHz
DC-to-RF Efficiency | 0.51% | 0.13% | 0.23% | 0.25% | 0.03% | 0.02%
Technology | 65-nm CMOS | 65-nm CMOS | 65-nm CMOS | 65-nm CMOS | 90-nm SiGe | 90-nm SiGe BiCMOS

3.5. CONCLUSION

A high-output-power, passively-coupled, tunable triple-push source is presented in this chapter. A 3rd-harmonic 295-to-301 GHz proof-of-concept prototype is designed and measured in a 65-nm CMOS process. The structure uses S-CPW to increase the quality factor of the fundamental tank and the combiner, and four cores coupled together to increase the output power. The measurement results show that the S-CPW-based design achieves 2.6 dB and 4.1 dB lower power loss as compared to the equivalent GCPW-based and CPW-based structures, respectively. The performance of the proposed design compares favorably with that of the state-of-the-art.

HIGH DC-TO-RF EFFICIENCY, HIGH-OUTPUT-POWER TUNABLE SOURCE: A 220 GHZ VCO²

4.1. INTRODUCTION

As noted earlier, CMOS promises a low-cost, portable platform for a variety of mm-wave and (sub-)THz applications such as medical imaging, non-invasive industrial testing, spectroscopy, and high-data-rate wireless communication [60]. However, as mentioned in the previous chapter, efficient power generation at these high frequencies in bulk CMOS processes faces daunting challenges on several fronts: (1) such frequencies are near or above the unity power gain frequency, fmax, of the transistors, (2) high-frequency loss mechanisms such as substrate loss and skin and proximity effects deteriorate the quality factor of passive components, and (3) CMOS scaling, though increasing the fmax of MOS transistors, reduces the maximum tolerable signal swing due to the lower supply voltage (VDD). To overcome the limited power gain of the device in CMOS technology, device nonlinearity is usually utilized to generate power at the harmonic components. DC-to-RF power efficiency, ηp, defined as the ratio of the output power at the desired radio-frequency (RF) component to the DC power consumed, is of prime importance for battery-powered systems.
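As a quick check of this definition, the Python sketch below converts an output power in dBm to milliwatts and divides it by the DC power. The input values are the "This Work" entries of Table 3 - 2 (0.9 dBm, 235 mW); the function names are our own and the script is only an illustrative calculation, not part of the measurement flow.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def dc_to_rf_efficiency_percent(pout_dbm, pdc_mw):
    """DC-to-RF efficiency in percent: 100 * Pout / PDC."""
    return 100.0 * dbm_to_mw(pout_dbm) / pdc_mw


if __name__ == "__main__":
    # Pout = 0.9 dBm and PDC = 235 mW, from Table 3 - 2 ("This Work").
    print(dc_to_rf_efficiency_percent(0.9, 235.0))
```

The result is roughly 0.5%, in line with the efficiency listed in the table; small differences are expected because the tabulated output power is rounded.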
Different circuit techniques for more efficient power extraction from a transistor at these frequencies have been developed recently [5], [17], [61], [62], and ηp for bulk CMOS voltage-controlled oscillators (VCOs) operating near or above fmax is slowly improving (Figure 4 - 1). Signal sources based on harmonic power extraction leverage either harmonic oscillators or frequency-multipliers. Harmonic oscillators, shown in Figure 4 - 2(a), contain active device(s) (M1) that are simultaneously responsible for providing the power to sustain the oscillation at the fundamental, f0, as well as at the desired harmonic, nf0. The feedback network for oscillation at f0 is represented by the Y-parameter matrix, [Yp], and the harmonic output at nf0 is extracted from the output with a matching network tuned at nf0. For example, a 2f0 component can be extracted at the center-tap of a push-push cross-coupled oscillator [3] using a matching network tuned at 2f0, or a 3f0 component can be extracted from a triple-push oscillator [17]. Harmonic oscillators with high output power and good ηp are the subject of active research [63], as shown in Figure 4 - 1. However, higher ηp still remains to be achieved. A frequency-multiplier-based source (Figure 4 - 2(b)), on the other hand, uses an active device (M1) for sustaining oscillation only at f0, and uses another active device (M2) as a non-linear harmonic extractor to give the desired output at nf0. Recent examples of CMOS frequency-multipliers, without integrated fundamental sources, have achieved high output power and high harmonic efficiencies [64]–[66]. Unfortunately, those CMOS sources which have integrated both the fundamental oscillator and the multiplier to efficiently generate harmonic power near or above fmax [64]–[66] have not achieved comparable ηp.

² This chapter is written collaboratively and the results are published in [113]. Dr. Amir Nikpaik is the first author of this paper and I am the second author.
Figure 4 - 1 DC-to-RF efficiency of recently published VCOs with output frequencies beyond 200 GHz implemented in CMOS technologies.

[Figure 4 - 1 plot: DC-to-RF efficiency (%) versus frequency (GHz). Data points: Landsberg TMTT'13, Khamaisi TMTT'12, Koo TCAS'15, Adnan ISSCC'14, Grzyb JSSC'13, Sengupta ISSCC'12, Han JSSC'13, Tousi ISSCC'12, and This Work.]

[Figure 4 - 2 block diagrams: (a) M1 with feedback network [YP] and a matching network to the output at nf0; (b) a fundamental oscillator (M1 with [YP]) driving a frequency multiplier (M2), with matching networks to the output at nf0.]

Figure 4 - 2 Generic representation of (a) a harmonic oscillator, and (b) a frequency-multiplier-based source

To provide a high ηp, the fundamental source and the frequency-multiplier must be co-designed and co-optimized. In this chapter, we discuss and compare the challenges of designing high-efficiency (sub-)THz sources in bulk CMOS processes for each of the above-mentioned methods. Optimum conditions to efficiently extract harmonic power (near or above fmax) from a MOS transistor are described in Section II, and then the impediments to fulfilling such conditions for harmonic oscillators are explained in Section III. Multiplier-based sources are then described in Section IV, where it is shown that they can achieve higher overall ηp through optimal co-design and loss minimization, even with additional active stages in comparison to harmonic oscillators. Next, a 219-to-231 GHz multiplier-based source with 2.95% peak DC-to-RF efficiency and 3 dBm peak output power is introduced in Section V. The proposed architecture is implemented in a 65-nm CMOS process, and its four fundamental oscillator cores are injection-locked in-phase at f0. Each fundamental oscillator drives a class-C frequency doubler. The output powers of these four doublers are combined and matched to the output. Section VI presents the measurement results for the proposed source and compares its performance with other state-of-the-art CMOS oscillators.

4.2. CONDITIONS FOR HIGH-EFFICIENCY (SUB-)THZ POWER GENERATION IN CMOS

4.2.1. Conditions for Optimum Fundamental Oscillation

In [16] and [55], optimum conditions for the voltage gain (K = |V2/V1|) and phase (φ = ∠V2−∠V1) between the ports of an active two-port network are derived that maximize the net fundamental power generated by the transistor. In harmonic oscillators (Figure 4 - 2(a)), there is no external load (e.g., buffer) for the fundamental signal at f0 and hence, in the steady-state, the fundamental power generated by the active device (P1) is exactly equal to the fundamental power dissipated in the passive peripheral (Ppassive). To excite the transistor's nonlinearity and generate harmonic power, it is usually desirable to increase the gate and drain voltage swings. As the voltage swings approach VDD, the active device enters the deep triode region, internal loss increases, P1 decreases, and eventually, in the steady-state, P1 equals Ppassive. Assuming that the passive circuit is linear, Ppassive increases quadratically with the voltage swing. Thus, if the optimum conditions for fundamental power generation are fulfilled, the oscillation is sustained at higher voltage swings and hence generates more harmonic power. In other words, in harmonic oscillators, the optimum condition for fundamental power generation is a necessary, but not sufficient, condition to maximize the harmonic power.

The transistor sizing and the passive feedback network should be designed such that K = Kopt and φ = φopt, and the net power delivered by the transistor at f0 equals the power dissipated in the passive peripheral circuits at f0. The optimum phase condition is [67]:

$\varphi_{opt} = \pi - \angle\left(Y_{12} + Y_{21}^{*}\right)$, (1)

where Yij (i, j = 1, 2) are the Y-parameters of the transistor. Under this condition, the maximum power generated at the fundamental harmonic will be [67]:

$P_{out,max} = \frac{A_g^2}{2}\left(-G_{11} - K_{opt}^2 G_{22} + K_{opt}\left|Y_{12} + Y_{21}^{*}\right|\right)$, (2)

where Gij denotes the real part of Yij, and Ag is the fundamental voltage swing at the gate.
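To make the phase condition (1) concrete, the Python sketch below evaluates φopt from Y12 and Y21. It assumes the approximate Y12 and Y21 expressions (4) and (5) given in the text that follows, the linearized form (8), and the device values quoted in the caption of Figure 4 - 4; treating the small-signal gm as Gm and evaluating at 110 GHz are our assumptions, and the function names are hypothetical.

```python
import cmath
import math


def y12_y21(freq_hz, gm, rg, cgs, cgd, cm):
    """Approximate Y12 and Y21 following expressions (4) and (5)."""
    w = 2.0 * math.pi * freq_hz
    shared = rg * cgd * (cgs + cgd) * w ** 2  # real term common to Y12 and Y21
    y12 = complex(-shared, -w * cgd)
    y21 = complex(gm - shared, -w * (gm * rg * (cgs + cgd) + cgd + cm))
    return y12, y21


def phi_opt_deg(y12, y21):
    """Optimum drain-to-gate phase from (1): pi - angle(Y12 + conj(Y21))."""
    return math.degrees(math.pi - cmath.phase(y12 + y21.conjugate()))


def phi_opt_linear_deg(freq_hz, gm, rg, cgs, cgd, cm):
    """Linearized form (8): pi - (RG*(Cgs+Cgd)*w + Cm*w/Gm)."""
    w = 2.0 * math.pi * freq_hz
    return math.degrees(math.pi - (rg * (cgs + cgd) * w + cm * w / gm))


if __name__ == "__main__":
    # Device values from the Figure 4 - 4 caption; 110 GHz is an assumed test point.
    dev = dict(gm=20e-3, rg=14.8, cgs=11.8e-15, cgd=7.4e-15, cm=2.6e-15)
    y12, y21 = y12_y21(110e9, **dev)
    print(phi_opt_deg(y12, y21), phi_opt_linear_deg(110e9, **dev))
```

With these inputs the two estimates come out at about 162° and 164°, inside the 140° to 180° range plotted in Figure 4 - 4. Being small-signal estimates, they need not coincide with a large-signal harmonic-balance result.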
By using the high-frequency transistor model shown in Figure 4 - 3 [68], the Y-parameters of the transistor can be calculated as below:

$Y_{11} \cong R_G (C_{gs}+C_{gd})^2\omega^2 + j\omega(C_{gs}+C_{gd})$, (3)

$Y_{12} \cong -R_G C_{gd}(C_{gs}+C_{gd})\omega^2 - j\omega C_{gd}$, (4)

$Y_{21} \cong G_m - R_G C_{gd}(C_{gs}+C_{gd})\omega^2 - j\omega\left(G_m R_G (C_{gs}+C_{gd}) + C_{gd} + C_m\right)$, (5)

$Y_{22} \cong G_{DS} + G_m R_G^2 C_{gd}(C_{gs}+C_{gd})\omega^2 + R_G C_{gd}^2\omega^2 + (R_{db}+R_{bb}) C_{db}^2\omega^2 + j\omega(C_{ds}+C_{gd}+C_{db})$, (6)

where Cm is the device transcapacitance, which captures the different effects of the drain and the gate on each other in terms of charging current. In deriving these formulas, we have assumed that at the frequency of interest, $(R_G(C_{gs}+C_{gd})\omega)^2 \ll 1$, and that Rs, Rd, Lg, Ls, and Ld are negligible. Also, the impacts of Rsb and Csb are neglected. Using (1), (4), and (5) we have

$\varphi_{opt} \cong \pi - \tan^{-1}\left(\frac{G_m R_G (C_{gs}+C_{gd})\omega + C_m\omega}{G_m - 2 R_G C_{gd}(C_{gs}+C_{gd})\omega^2}\right)$ (7)

$\cong \pi - \left(R_G (C_{gs}+C_{gd})\omega + \frac{C_m\omega}{G_m}\right)$. (8)

[Figure 4 - 3 schematic: intrinsic transistor model (Gm·Vgs, −jωCm·Vgs, Gmb·Vbs, Gds, Cgs, Cgd, Cds) between the intrinsic nodes Gi, Di, Si, and Bi, surrounded by the extrinsic elements Rg, Rs, Rd, Lg, Ls, Ld, Rbb, Rsb, Rdb, Cdb, and Csb.]

Figure 4 - 3 High-frequency, small-signal transistor model. Subscript i represents the intrinsic nodes.

Equation (8) presents a simple expression for designing the passive network, where the parameters on the right-hand side of (8) can be readily obtained using a DC operating point simulation. Figure 4 - 4 shows that the simulated φopt matches well with the approximations obtained using (7) and (8). Using (2)–(6), and assuming $G_m \gg R_G((C_{gs}+C_{gd})\omega)^2$ and $G_m \gg C_m\omega$, Pout,max can be approximated as:

$P_{out,max} \cong \frac{A_g^2}{2}\left(G_m K_{opt} - K_{opt}^2 G_{DS}\right) - \frac{A_g^2\omega^2}{2}\left(R_G (C_{gs}+C_{gd})^2 + K_{opt}^2 R_G C_{gd}^2\left(1 + \frac{G_m R_G (C_{gs}+C_{gd})}{C_{gd}}\right) + K_{opt}^2 (R_{db}+R_{bb}) C_{db}^2\right)$ (9)

The first parenthesis in (9) is the frequency-independent part of Pout,max. As Ag increases, Pout,max increases first.
Beyond a certain threshold, however, as Ag increases further, the transistor operates in the triode region for a greater portion of the oscillation period; the effective Gm degrades, the effective GDS increases, and hence Pout,max drops. Pout,max also degrades with increasing frequency because the power dissipation at the gate (due to RG) and in the extrinsic part of the transistor (due to Rdb and Rbb) rises quadratically with frequency. Therefore, careful layout of the transistor is important to minimize RG.

[Figure 4 - 4 plot: φopt (degree) versus frequency (GHz); curves: simulated, approximation (7), and approximation (8).]

Figure 4 - 4 Simulated (solid red line) and calculated φopt for a 15×1 µm/60 nm NMOS device with gm = 20 mA/V, RG = 14.8 Ω, Cm = 2.6 fF, Cgs = 11.8 fF, and Cgd = 7.4 fF. The simulated fmax for the transistor biased at (VDS,DC, VGS,DC) = (1.2 V, 1.2 V) is 233 GHz.

4.2.2. Conditions for Efficient Harmonic Power Generation

Consider a cross-coupled pair as shown in Figure 4 - 5. When operating as a push-push oscillator, the resonance circuit (not shown) attenuates higher voltage harmonics so that vg(t) and vd(t) of the transistors can be approximated by their fundamental components. At a specific harmonic, a transistor operating under a large-signal time-varying regime may be modeled by an equivalent circuit³ [69]. The available output power of this source is

$P_{av,n} = \frac{I_{s,n}^2}{8 G_{s,n}}$, (10)

where Is,n is the harmonic component of the drain current at nf0 and Gs,n is the effective conductance seen at the drain of the transistor at nf0. In general, Is,n and Gs,n are functions of the gate signal waveform (vg(t)) and the drain signal waveform (vd(t)). Therefore, Is,n and Gs,n will be functions of the gate and drain signal amplitudes (Ag and Ad, respectively), the drain-to-gate phase difference (φ), the gate bias voltage (Vg,DC), and the operating frequency (f0). Both Is,n and Gs,n can be calculated using harmonic balance simulations. The DC power consumption of the transistor is PDC = IDC×VDD, where IDC is the transistor DC current. The DC-to-nf0 power efficiency of the device is written as

$\eta_{p,n} = \frac{P_{av,n}}{P_{DC}} = \frac{\eta_{I,n}^2}{8 G_{s,n}} \times \frac{I_{DC}}{V_{DD}}$, (11)

where ηI,n = Is,n / IDC is the drain current efficiency at nf0. To improve ηp,n for a specific power consumption, one should increase ηI,n and simultaneously decrease Gs,n.

[Figure 4 - 5 schematic: cross-coupled pair M1/M2 and its equivalent circuit at 2f0, consisting of the current source IS,2 in parallel with GS,2 and jBS,2, with Pav,2 = I²S,2/(8GS,2).]

Figure 4 - 5 Cross-coupled pair used in push-push oscillator and its equivalent circuit at a specific harmonic (2f0).

³ In this model, the impact of the presence of the nth harmonic on Is,n and Gs,n is not included. For the range of frequencies of our interest, nf0 falls near or above the device fmax, and therefore the above approximation usually provides an acceptable estimation for Pav,n.

4.3. CHALLENGES IN CMOS HARMONIC OSCILLATORS

Unfortunately, in harmonic oscillators, ηI,n and Gs,n cannot be optimized simultaneously. In this Section, we first consider a push-push oscillator, and then extend our discussion to a generic harmonic oscillator.

4.4. CONVENTIONAL PUSH-PUSH OSCILLATOR

As mentioned, in a conventional push-push oscillator, it can be shown that the optimum phase condition does not hold, because the drain-to-gate phase difference is equal to 180°. Therefore, the necessary condition for efficient harmonic extraction is not fulfilled. Next, we investigate ηI,n and Gs,n. Consider a simulation setup for the cross-coupled pair shown in Figure 4 - 6(a), where the drains of the transistors M1 and M2 are set to VDD = 1.2 V and superimposed with two sinusoidal voltage sources with opposite phases at f0 = 110 GHz and a voltage swing of A.
In such a push-push oscillator [70], as the oscillation amplitude rises, Is,2 and ηI,2 increase (Figure 4 - 6(b)). However, large voltage swings at the gate and drain nodes push the transistor into the deep-triode region, increasing the output conductance at 2f0, namely, Gs,2. In addition, the second harmonic power generated at the drain node is partially dissipated in the gate resistance (RG), because the drain node is cross-connected to the gate of the transistor pair. Moreover, in push-push architectures, the device transconductance (Gm) affects Gs,2. Indeed, at even harmonics, each transistor in the cross-coupled pair can be considered as a diode-connected device. Neglecting the parasitic capacitance of the device, Gs,2 will be equal to Gm+GDS [69]. Increasing the effective Gm to achieve a higher oscillation amplitude and to improve Is,2 and ηI,2 inevitably results in a higher Gs,2 and hence lowers the second harmonic power. As a result, according to (10) and (11), both Pav,2 and ηp,2 degrade (Figure 4 - 6(c)). In other words, there is a fundamental trade-off between power generation at the fundamental and at a higher harmonic frequency. In Figure 4 - 6(c), as A approaches VDD the net power generated at the fundamental reduces and eventually, at A = 1.05 V, it crosses 0 mW. Beyond this point, though ηp,2 increases significantly, no oscillation is possible because the cross-coupled pair is unable to deliver any power at the fundamental. Consequently, neglecting all passive losses, the maximum achievable ηp,2 (i.e., at A = 1.05 V) in this 65-nm CMOS technology for the push-push oscillator is about 1.3% at 220 GHz. Repeating the simulations for different transistor widths indicates that ηp,2 does not change noticeably with the device dimensions.
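The dependence of the harmonic efficiency on ηI,n and Gs,n in (10) and (11) can be sketched numerically as follows. The current, conductance, and supply values below are hypothetical round numbers chosen only to show the trend; they are not read from Figure 4 - 6, and the function names are our own.

```python
def available_harmonic_power(i_sn, g_sn):
    """Available power at the n-th harmonic, (10): Is,n^2 / (8 * Gs,n), in watts."""
    return i_sn ** 2 / (8.0 * g_sn)


def harmonic_efficiency(eta_i, i_dc, g_sn, vdd):
    """DC-to-harmonic efficiency, (11): eta_I^2 * IDC / (8 * Gs,n * VDD)."""
    return eta_i ** 2 * i_dc / (8.0 * g_sn * vdd)


if __name__ == "__main__":
    i_dc, vdd = 4e-3, 1.2  # assumed DC current (A) and supply (V), hypothetical
    eta_i = 0.5            # assumed drain current efficiency, hypothetical
    for g_sn in (12e-3, 24e-3):  # assumed output conductances (S), hypothetical
        p_av = available_harmonic_power(eta_i * i_dc, g_sn)
        print(g_sn, p_av, 100.0 * harmonic_efficiency(eta_i, i_dc, g_sn, vdd))
```

Doubling Gs,n at a fixed ηI,n halves both the available harmonic power and the efficiency, which is the penalty incurred when a larger swing pushes the device into the triode region.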
[Figure 4 - 6 panels: (a) cross-coupled pair M1/M2 with drains biased at VDD and driven by sinusoidal sources of amplitude A at 0° and 180°, with its equivalent circuit at 2f0; (b) ηI,2 (%) and GS,2 (mA/V) versus voltage swing (V); (c) ηp,2 (%) and P1 (mW) versus voltage swing (V).]

Figure 4 - 6 (a) Simulation setup, (b) ηI,2 and Gs,2 for a transistor in a cross-coupled pair, and (c) total DC to 2f0 power efficiency, ηp,2 (circles), drain DC to 2f0 power efficiency (dashed curve), and fundamental harmonic power (P1) generated by the transistor (solid curve).

An elegant solution to mitigate the impact of Gm is proposed in [69] (self-feeding oscillator), where the feedback between the gate and drain nodes at 2f0 is broken by exploiting a quarter-wavelength transmission line (T-line). Although the impact of Gm on Gs,2 is alleviated in the self-feeding oscillator, the gate node still experiences a harmonic voltage component at 2f0, leading to harmonic loss at the gate due to RG. More importantly, due to the large fundamental voltage swing at the drain node, the influence of GDS on Gs,2 still persists. Another factor affecting both the output power and ηp is the loss in the passive components. To achieve the desired power level at such high frequencies, typically, the harmonic power generated at different oscillator cores is combined. Unfortunately, the loss of the output combining/matching network directly degrades the output power as well as the harmonic power efficiency. For example, consider a push-push (or n-push) structure. In this architecture, the second (or nth) harmonic generated at the drain of the transistor must inevitably travel along the tank's transmission line (T-line) or inductor to reach node C (Figure 4 - 7), where the fundamental power vanishes and the desired harmonic power is combined. Due to the impedance mismatch at node C in Figure 4 - 7, multiple reflections occur, which add significant loss to the harmonic signal [71].
As an example, in the design proposed in [64], the second harmonic power generated at the drain of each transistor (at about 250 GHz) must traverse 1.3 mm of T-lines with different characteristic impedances to reach the output pad. In an optimum design, to avoid this loss, harmonic power combining should take place right beside the transistor.

[Figure 4 - 7 schematic: push-push pair M1/M2 with resonator T-lines TLd to VDD, common node C, and a matching network to a 50 Ω load; the second-harmonic power P2f0 generated at each drain arrives at node C as P2f0 − PLoss,2f0 because of multiple-reflection loss.]

Figure 4 - 7 Second harmonic power loss in the resonator transmission line of a push-push oscillator.

4.4.1. Optimum Harmonic Oscillator to Extract 2f0

Consider a simulation setup similar to Figure 4 - 6(a) for the generic harmonic oscillator of Figure 4 - 2(a). Assume that two 110 GHz sinusoidal voltage sources are applied again to the drain and the gate of the NMOS transistor with the same sizing (W/L = 8 µm / 60 nm) as in Section III-A. If perfect harmonic isolation is hypothetically assumed between the gate and drain terminals, no second harmonic power is dissipated at the gate. Under this condition, simulations will give an upper limit for Pav,2 and ηp,2 [71]. The phase of the gate source is set to zero and that of the drain is set to φopt = 158°, which is obtained using large-signal harmonic-balance (HB) simulations. The fundamental voltage swings at both the gate and the drain are swept, and ηI,2, Gs,2, and the contours for P1 and ηp,2 are plotted in Figure 4 - 8(a)-(c), respectively. As the signal swing and the DC gate voltage increase toward VDD, ηp,2 increases. Again, neglecting passive power losses, an upper limit for ηp,2 is attained at point A, where the contour for P1 = 0 mW touches ηp,2 = 4.5%. However, because of the passive losses (at both f0 and 2f0) and imperfect isolation between the drain and the gate terminals, the reported ηp,2 for CMOS harmonic oscillators proposed to date is much lower than this upper limit.
Figure 4 - 8(c) also shows the fundamental trade-off between the fundamental power generation and the harmonic power generation for the generic harmonic oscillator. As shown in Figure 4 - 8(c), the harmonic power efficiency keeps increasing as Ag and Ad increase. On the other hand, the fundamental power P1 is maximized at a completely different point and the direction of its variation is also totally different from that of the harmonic power. As evident in Figure 4 - 8(c), passive loss at f0 results in a lower oscillation amplitude and hence a lower ηp,2 (point B in Figure 4 - 8(c)). If a wider transistor is used to increase P1 and the oscillation amplitude, then, to keep the oscillation frequency constant, a smaller tank inductance must inevitably be used. Smaller inductors result in a higher Ppassive (= A² / (2LQω)), if Q is assumed to be constant. Consequently, ηp,2 is again degraded. To summarize, efficient harmonic power generation using harmonic oscillators suffers from the following issues: i) harmonic power generated at the drain of the transistor is partially dissipated at the gate, ii) increasing the oscillation amplitude to increase ηI,2 can significantly degrade the effective output conductance of the transistor at 2f0 (or nf0) (Figure 4 - 8(a), (b)), and iii) the harmonic power generated at the drain of a MOS transistor is significantly attenuated because of insertion loss and multiple reflection loss in the passive network at 2f0. The biggest limitation of harmonic oscillators is the fact that a single transistor simultaneously generates harmonic power and restores the fundamental passive loss.

[Figure 4 - 8 panels: contours over Ag (V) and Ad (V); panel (c) marks the region P1 > 0 W and the points A and B.]

Figure 4 - 8 Contours for (a) ηI,2, (b) Gs,2, and (c) DC to second-harmonic power efficiency (ηp,2) (step = 1%) and fundamental power (P1) (step = 0.3 mW) for the generic harmonic oscillator. Shaded area shows device activity region.
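The inductance argument above can be illustrated with a short Python sketch based on Ppassive = A² / (2LQω) and the resonance condition ω² = 1/(LC). The capacitance, swing, and Q values are hypothetical; doubling the total tank capacitance is used here as a stand-in for a wider transistor, which is our simplifying assumption.

```python
import math


def tank_inductance(freq_hz, c_total):
    """Inductance that resonates with c_total at freq_hz: L = 1 / (w^2 * C)."""
    w = 2.0 * math.pi * freq_hz
    return 1.0 / (w ** 2 * c_total)


def passive_loss(swing_v, l_tank, q, freq_hz):
    """Power dissipated in the tank: A^2 / (2 * L * Q * w), in watts."""
    w = 2.0 * math.pi * freq_hz
    return swing_v ** 2 / (2.0 * l_tank * q * w)


if __name__ == "__main__":
    f0, q, swing = 110e9, 20.0, 1.0  # assumed frequency, tank Q, and swing (hypothetical)
    for c_total in (20e-15, 40e-15):  # assumed tank capacitances (F), hypothetical
        l_tank = tank_inductance(f0, c_total)
        print(c_total, l_tank, passive_loss(swing, l_tank, q, f0))
```

At a fixed frequency, swing, and Q, doubling the capacitance halves the required inductance and doubles the passive loss, so the extra fundamental power from the wider device is partly consumed by the tank.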
[Figure 4 - 9 panels: surfaces over Ag (V) and VGS,DC (V) for (a) ηI,2 (%), (b) Gs,2 (S), (c) ηp,2 (%), and (d) harmonic power (W).]

Figure 4 - 9 (a) ηI,2, (b) Gs,2, (c) ηp,2 and (d) Pav,2 for the frequency doubler with respect to the fundamental gate swing (Ag) and DC level VG,DC.

[Figure 4 - 10 panels: (a) efficiency versus A and VG,DC (V); (b) contours over Ag (V) and Ad (V), with the P1 = 0.8 mW contour and the ηp,1 = 30% contour marked.]

Figure 4 - 10 (a) Fundamental power efficiency plot with respect to A = Ag = Ad and gate DC voltage (VDC), (b) Fundamental power efficiency contours (max = 30% and step = 5%) for the core oscillator and P1 = 0.8 mW.

4.5. FREQUENCY-MULTIPLIER-BASED SOURCES IN CMOS

The above-mentioned issues of harmonic oscillators can be addressed in multiplier-based signal sources. Consider the generic multiplier-based source shown in Figure 4 - 2(b), where M2 is used for frequency multiplication. As the drain node of M2 does not necessarily experience a high swing at f0, M2 can operate without being pushed into the triode region for large voltage swings at the gate node, and hence Gs,2 does not degrade. Besides, in comparison to a cross-coupled pair, there is neither negative feedback from the drain to the gate, nor any harmonic power being dissipated at the gate. As a result, there is no severe trade-off between the harmonic current efficiency ηI,2 and Gs,2. Furthermore, unlike harmonic oscillators, where both the fundamental power (at f0) and the harmonic power (at 2f0) generated at the drain of the transistor suffer from the loss of the T-line in the core oscillator tank (see Figure 4 - 7), only the fundamental power (at f0) suffers from such loss here. The harmonic power (at 2f0) at the drain of M2 can be directly delivered to the output matching/combining network. Therefore, a transistor employed in a multiplier can extract harmonic power more efficiently than a transistor used in a harmonic oscillator.
The main drawback of the (sub-)THz multiplier-based signal source is that a separate high-power fundamental signal source is needed to drive the frequency-multiplier. The overall ηp of the system will therefore be strongly dependent on the power efficiency of the fundamental oscillator. Next, we quantitatively compare the maximum achievable harmonic power efficiency of the CMOS multiplier-based source to that of the harmonic oscillator. Consider M2, used as a frequency doubler in Figure 4 - 2(b), to have a size of (W/L) = (8 µm/60 nm) and a 110 GHz sinusoidal source applied to its gate only. The drain terminal is simply connected to the supply voltage. Both the gate DC voltage (VG,DC) and the voltage swing (A) are swept, and then ηI,2, Gs,2, ηp,2 and Pav,2 are plotted in Figure 4 - 9(a)-(d), respectively. In these simulations, the drain node of M2 does not see any fundamental swing. As shown in Figure 4 - 9(c) and Figure 4 - 9(d), both Pav,2 and ηp,2 are maximized as VG,DC approaches 0 V. Thus, by choosing VG,DC = 0 V, the conduction angle of the current waveform is reduced and a higher second harmonic current is generated (Figure 4 - 9(a)). As shown in Figure 4 - 9(c), ηp,2 can be as high as 12% at (VG,DC, A) = (0 V, 1.5 V), which is significantly higher than the corresponding upper limit of ηp,2 = 4.5% for the harmonic oscillator. Near ηp,2 = 12%, VG,DC = 0 V, and therefore the conduction angle of the doubler transistor is lower than 180° and it is operating in class-C mode. The next step is designing a fundamental oscillator to drive the doubler. At (VG,DC, A) = (0 V, 1.2 V), which corresponds to ηp,2 = 11.5% and Pav,2 = 320 µW, the fundamental power that flows into the gate of the doubler transistor M2 is 0.8 mW. The fundamental oscillator should deliver 0.8 mW to the frequency doubler with minimum DC power consumption. To increase ηp,1 for the fundamental oscillator, we can decrease the VG,DC of the core transistor (M1 in Figure 4 - 2(b)), without any consideration of ηp,2.
Figure 4 - 10(a) shows the simulated ηp,1 with respect to VG,DC and the voltage swing A = Ad = Ag when φ = φopt. From this figure, it is evident that ηp,1 is maximized around VG,DC = 0.5 V. Note that in harmonic oscillators, on the other hand, decreasing VG,DC improves ηp,1 but degrades ηp,2. To achieve the highest fundamental power and efficiency, φ is set again to φopt and VG,DC is reduced to 0.5 V, and the ηp,1 contours with respect to Ag and Ad (Ag ≠ Ad) are plotted in Figure 4 - 10(b). The highest ηp,1 contour that intersects with the P1 = 0.8 mW contour gives ηp,1 = 30%. Interestingly, at (VG,opt, φopt) = (0.55 V, 158°) in Figure 4 - 10(a), the transistor delivers a power efficiency of 28%, which is more than double what it delivers in a standard cross-coupled configuration at (VG, φ) = (1.2 V, 180°). By inspecting these figures, one can conclude that in order to optimize the output power and the power efficiency of the fundamental oscillator, Ag, Ad, φ, and VG,DC should be carefully tailored. In general, the passive circuits embedding the transistor(s) determine the above-mentioned conditions. The fundamental oscillator consumes 2.66 mW to deliver 0.8 mW of power at 110 GHz to the frequency doubler. With ηp,2 = 11.5% and Pav,2 = 320 µW, the multiplier consumes 2.78 mW, leading to an overall DC power consumption of 5.44 mW and a DC-to-RF efficiency of 5.9%. One can therefore conclude that, within the frequency range of our interest in 65-nm CMOS, the oscillator-doubler combination extracts 2f0 power from DC more efficiently than a harmonic oscillator (ηp,2 = 4.5%).⁴ As mentioned earlier, the maximum ηp,2 = 4.5% for a harmonic oscillator is obtained by assuming perfect harmonic isolation between the gate and drain terminals and by neglecting the passive loss at f0 and 2f0. Due to these facts, in practice, ηp,2 for harmonic oscillators will be significantly lower than this limit.
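The efficiency bookkeeping for the oscillator-doubler cascade quoted above can be reproduced with a few lines of Python. The inputs are the values stated in the text (0.8 mW drive at ηp,1 = 30%, Pav,2 = 320 µW at ηp,2 = 11.5%); the function name and structure are our own, offered only as a worked check.

```python
def cascade_efficiency(p_drive_mw, eta_osc, p_harm_mw, eta_doubler):
    """Return (oscillator DC, doubler DC, total DC, overall efficiency).

    Powers are in mW; efficiencies are fractions (0.30 means 30%).
    """
    p_dc_osc = p_drive_mw / eta_osc       # DC power drawn by the fundamental oscillator
    p_dc_dbl = p_harm_mw / eta_doubler    # DC power drawn by the doubler
    p_dc_total = p_dc_osc + p_dc_dbl
    return p_dc_osc, p_dc_dbl, p_dc_total, p_harm_mw / p_dc_total


if __name__ == "__main__":
    osc, dbl, total, eta = cascade_efficiency(0.8, 0.30, 0.32, 0.115)
    print(osc, dbl, total, 100.0 * eta)
```

The computed DC powers are about 2.67 mW and 2.78 mW, and the overall efficiency is about 5.9%, consistent with the figures quoted in the text up to rounding.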
In multiplier-based sources, by contrast, the maximum ηp,2 = 5.9% is less affected by passive losses and imperfect gate-drain harmonic isolation than in harmonic oscillators. Repeating the above procedure for other transistors and harmonic power levels leads to a similar conclusion.

⁴ As mentioned before, the impact of the second harmonic has not been included in our discussion. The presence of the second-harmonic voltage at the drain of the transistor can move the P1 = 0 contour upward in Figure 4 - 8(c), thereby improving ηp,2 for harmonic oscillators. In multiplier-based sources, the presence of second-harmonic components improves the fundamental power generation efficiency of the core oscillator as well as the harmonic power generation efficiency of the doubler. Thus, the conclusion remains valid even if the impact of the second-harmonic components is taken into account.

4.6. PROPOSED 219-TO-231 GHZ CMOS VCO

Because a multiplier-based source generates second-harmonic power at 219-to-231 GHz more efficiently in the 65-nm CMOS process, it is used in the proposed VCO architecture, shown in Figure 4 - 11. The structure comprises four differential oscillator cores, each oscillating at f0 (~110 GHz) and injection-locked to one another so that they operate in phase at f0. The oscillator core at the fundamental frequency is designed using the procedure presented in Section IV. The cores are passively coupled through T-lines (TLC) based on grounded co-planar waveguides (GCPW) [72]. Note that the TLC lines are part of the resonator and affect the oscillation frequency. This passive coupling scheme does not dissipate additional DC power, and is therefore attractive for overall efficiency. The output of each core feeds an active doubler designed to efficiently extract the second-harmonic (2f0) power by operating in class-C mode. To increase the output power, the harmonic power of the doublers should be combined.
As the fundamental differential signals driving the doublers are in phase, the second-harmonic components at the outputs of the doublers also remain in phase and add constructively, while the fundamental tones cancel. Care must be taken in the layout, because any mismatch between the cores results in imperfect cancellation of the fundamental tone and a reduction in the output harmonic power. To avoid mismatch, the layout is designed to have four-way symmetry. Coupling N oscillators reduces the phase noise by 10log(N) dB; with N = 4 cores, the phase noise of the overall source is therefore 6 dB lower than that of a single core.

The detailed schematic of the proposed VCO is shown in Figure 4 - 12. For the oscillator core, LG, LD, LC, and CC are chosen such that the optimum conditions for the voltage swings and phases at the gate and drain terminals are fulfilled. A design procedure to find their optimal values is described in the Appendix. For the cores, as adjacent transistors should operate in opposite phases, RF blocking resistors Rg (~2 kΩ) have been added to the center tap of the TLC to prevent unwanted even-mode oscillation, whose frequency is far below that of the desired differential-mode oscillation at f0. Simulations show that these resistors slightly improve the power efficiency of the core oscillators. Slow-wave GCPWs are used to reduce loss for a given electrical length, and their ground planes are implemented using the two lowermost metal layers, represented by the shaded gray areas in Figure 4 - 12. Short T-lines at the gates of the frequency doublers (LG) perform impedance matching and enhance the signal swing at the gate. Recall from Figure 4 - 9(c) that ηp,2 is highly dependent on the fundamental voltage swing. These T-lines boost the fundamental swing from the optimum gate swing for the core oscillator to the optimum input swing for the doublers. The DC bias of the core transistors, VG,DC, is provided through the center tap of the TLC lines.
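The 10log(N) phase-noise improvement quoted for coupled cores is easy to evaluate for other core counts. The snippet below is a small illustrative sketch; the function name is hypothetical, and it assumes ideal, identical cores with uncorrelated noise, which is the condition under which the 10log(N) rule applies.

```python
import math


def coupled_phase_noise_improvement_db(n_cores):
    """Ideal phase-noise improvement (in dB) from coupling n_cores identical oscillators."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return 10.0 * math.log10(n_cores)


for n in (1, 2, 4, 8):
    print(f"N = {n}: {coupled_phase_noise_improvement_db(n):.2f} dB")
```

For the four-core structure described above, the function returns approximately 6 dB, in line with the text.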
Next, we describe the frequency doubler and the tuning scheme for the proposed VCO.

Figure 4 - 11 Proposed VCO architecture. [Block diagram: four coupled oscillator cores (OSC1 to OSC4) at f0, each driving a doubler whose 2f0 output feeds the power combining and matching network.]

Figure 4 - 12 Detailed schematic of the proposed VCO.

4.6.1. Class-C Frequency Doubler

Figure 4 - 13(a) shows the frequency doubler used in the proposed VCO. In this circuit, the differential pair (M1x and M2x), the tail transistor (MTx), and the large tail capacitor CT (~300 fF) form an amplitude peak detector in which the DC value of the common-source node increases as the signal swing at the gates of M1x and M2x increases. The tail capacitor is charged during the oscillation build-up to a voltage level slightly below the input DC voltage of the differential pair. Consequently, the conduction angle (ϕC) of the drain current waveform becomes smaller than π radians, driving the transistors M1x and M2x into class-C mode. Simulated drain current and gate voltages are shown in Figure 4 - 13(b). In this circuit, there is no large fundamental signal swing at the drain node. Thus, the transistors operate in the saturation region and their effective output conductance remains at its minimum, which according to (11) maximizes the second-harmonic power generation efficiency. When the multiplier is operating in class-C mode, it can be shown that for a conduction angle of ~75°, ηI,2 could be as high as 100% [53]. If the fundamental swing A at the gate of the device increases, the conduction angle decreases and the DC-to-RF efficiency increases. Using the high-frequency transistor model [68] shown in Figure 4 - 3, (6) yields an approximation for the output conductance of the class-C harmonic extractor, which can be written as:

$$G_{S,2f_0} \cong G_{S0} + 4\pi^2 (2f_0)^2 \left[ R_d (C_{ds} + C_{db})^2 + R_G C_{gd}^2 + (R_{db} + R_{bb}) C_{db}^2 \right] \qquad (12)$$

where GS0 is the effective low-frequency output conductance of a transistor operating in the class-C regime, and is approximately equal to ϕC×gds/2π.
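To build intuition for why a smaller conduction angle favours second-harmonic generation, the harmonic content of an idealized class-C current pulse can be computed numerically. The sketch below rests on stated assumptions: the drain current is modelled as a truncated cosine (a textbook idealization, not the simulated waveform of Figure 4 - 13(b)), and the returned quantity is the ratio of the second-harmonic current amplitude to the DC current. That ratio is used here only as a proxy; the thesis's ηI,2 is defined earlier in the chapter and may be normalized differently. The function name and its parameters are hypothetical.

```python
import math


def harmonic_to_dc_ratio(conduction_angle_deg, harmonic=2, samples=20000):
    """Ratio of the n-th harmonic amplitude to the DC value of a truncated cosine.

    The current is modelled as i(theta) = max(cos(theta) - cos(phi/2), 0),
    where phi is the total conduction angle (0 < phi <= 360 degrees).
    Fourier integrals are evaluated with a midpoint rule over one period.
    """
    half_angle = math.radians(conduction_angle_deg) / 2.0
    threshold = math.cos(half_angle)
    step = 2.0 * math.pi / samples
    dc_sum = 0.0
    harm_sum = 0.0
    for k in range(samples):
        theta = -math.pi + (k + 0.5) * step
        current = max(math.cos(theta) - threshold, 0.0)
        dc_sum += current
        harm_sum += current * math.cos(harmonic * theta)
    i_dc = dc_sum * step / (2.0 * math.pi)   # average (DC) current
    i_h = harm_sum * step / math.pi          # amplitude of the n-th harmonic
    return i_h / i_dc


for angle in (360, 180, 120, 75):
    print(f"conduction angle {angle:3d} deg -> I2/IDC = {harmonic_to_dc_ratio(angle):.3f}")
```

Under this idealization the ratio is zero for a full-cycle (class-A) waveform and grows as the conduction angle shrinks, which matches the qualitative trend described above.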
According to (11) and (12), the available power of the class-C frequency doubler decreases as the operating frequency increases. Furthermore, RG should be minimized in the layout to reduce the output conductance. Output power and efficiency contours for the frequency doubler, obtained from load-pull simulations, are shown in Figure 4 - 13(c). To simplify the output matching and power combining networks, the transistors are sized such that all doubler blocks provide 50 Ω matching when combined, and their capacitances are tuned out using output inductors, Lout.

Figure 4 - 13 (a) Frequency doubler, with W/L = 8 µm/60 nm for M1x and M2x. (b) Gate voltage and drain current waveforms. (c) Output power (dotted curve) and power efficiency (solid curve) contours for gate signal swing A = 1.3 V at 110 GHz for four frequency doublers combined together. [Contour annotations: efficiency maximum 8.5% with 2% steps; output power maximum 2.4 mW with 0.4 mW steps.]

4.6.2. Frequency Tuning Scheme

In the proposed design, explicit varactors are avoided because of their high loss at mm-wave frequencies. MOS parasitic capacitances are instead employed for frequency tuning. Figure 4 - 14 illustrates the capacitance-voltage (C-V) characteristics of the parasitic capacitances across the three distinct operating regions (cut-off/saturation/triode) that the transistor spans during one oscillation period. As the effective parasitic capacitance depends on the gate DC bias voltage, VG, this voltage can be used as the control voltage. One drawback of such an intrinsic tuning scheme is that the power efficiency of the source degrades with increasing VG (refer to Figure 4 - 9 and Figure 4 - 10(a)).
This can be remedied in a future design by decoupling the DC bias of the core fundamental oscillator from that of the frequency doubler. In this way, the gate biases of the core oscillator and the doubler can be chosen separately, and hence the overall DC-to-RF efficiency will improve.

Figure 4 - 14 (a) Parasitic capacitances of a MOS transistor, and C-V characteristics for (b) Cgs, (c) Cgd, and (d) Cds, for VG = 0.35 V, 0.77 V, and 1.2 V. [Axes: capacitance (fF) versus A (V).]

4.7. MEASUREMENT RESULTS

Figure 4 - 15 shows the micrograph of the chip, implemented in a 65-nm CMOS process with an area of 725 × 725 μm². The output of the VCO is taken near the center of the layout for the purpose of testing this proof-of-concept chip. In real applications, the layout of the source should be tailored to factors such as whether the source drives an antenna or other circuits, the floorplan of the whole system, the number of coupled cores, and the required output power. All passives are carefully simulated using the Sonnet electromagnetic simulator. The ground layer underneath the output signal pad is removed, with a 45 μm opening, to reduce the parasitic capacitance of the pad from 22 fF to about 13 fF. The parasitic capacitance of the pad and
the doubler block is tuned out by adding a short T-line between the signal and ground pads, which realizes impedance matching at the output.

Figure 4 - 15 Chip micrograph (725 µm × 725 µm).

Figure 4 - 16 (a) Frequency, phase noise, and (b) power measurement setups. [Setup (a), frequency, phase noise, and indirect power measurement: Cascade I-325 GSG probe with bias tee, WR3.4 waveguide and bend, VDI WR3.4 SHM, AMC-08-RFH00 multiplier, Anritsu MG3690C signal source, Agilent E4440A spectrum analyzer for frequency measurement, Agilent E5052B signal source analyzer for phase noise measurement. Setup (b), direct power measurement: Cascade I-325 GSG probe with bias tee, WR3.4 bend, WR-3.4 to WR-10 taper, Erickson PM4 power meter with sensor head.]

Figure 4 - 16(a) shows the experimental setup used for frequency measurement. The output pads of the VCO are connected to a Cascade i325-GSG infinity probe with a built-in bias tee that feeds the DC voltage to the drains of the doublers. Other DC voltages are provided by wirebonds that connect the DC pads to the test printed-circuit board (PCB). For frequency measurement, a VDI WR3.4 subharmonic mixer (SHM) is used to down-convert the signal. This SHM mixes the signal at its RF port with the signal at twice the LO frequency. The LO signal of the SHM at 90-to-140 GHz is obtained by multiplying the output of an 11.25-to-17.5 GHz signal generator with an 8× frequency multiplier (AMC-08-RFH00). Figure 4 - 17(a) shows the spectrum of the IF signal, obtained through down-conversion by the 16th harmonic of the LO. The tuning range, measured with respect to VG, is shown in Figure 4 - 17(b). To measure the output power accurately [73], the setup shown in Figure 4 - 16(b), which incorporates an Erickson PM4 power meter, is used. The losses of all components of the measurement setup, including probes, tapered waveguides, and connection cables, are de-embedded. The probe loss is provided by the manufacturer (Cascade Microtech), and it varies between 4.4 and 5 dB within the frequency range of the DUT. The losses of the WR10 waveguide and the WR3-WR10 taper are 0.2 and 0.4 dB, respectively.
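The LO chain and the loss de-embedding described above amount to simple bookkeeping, sketched below. The helper names are hypothetical, and the power-meter reading used in the example is an arbitrary placeholder rather than a measured value; only the generator range, the 8× multiplication factor, and the quoted losses come from the text.

```python
def lo_range_ghz(gen_min_ghz, gen_max_ghz, multiplier=8):
    """LO frequency range after an ideal frequency multiplier."""
    return gen_min_ghz * multiplier, gen_max_ghz * multiplier


def deembedded_power_dbm(reading_dbm, losses_db):
    """Add the known setup losses (in dB) back onto a power-meter reading (in dBm)."""
    return reading_dbm + sum(losses_db)


lo_min, lo_max = lo_range_ghz(11.25, 17.5)
print(f"SHM LO range: {lo_min:.0f} to {lo_max:.0f} GHz")

# Placeholder reading of -10 dBm; losses are probe (4.4 dB), WR10 waveguide (0.2 dB), taper (0.4 dB).
example_dbm = deembedded_power_dbm(-10.0, [4.4, 0.2, 0.4])
print(f"de-embedded power for the placeholder reading: {example_dbm:.1f} dBm")
```

Because the probe loss varies between 4.4 and 5 dB across the band, a frequency-dependent loss value would be used at each measurement point in practice.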
Figure 4 - 18 shows the measured output power and ηp with respect to the output frequency. As shown, the maximum output power, achieved at 227 GHz, is about 3 dBm. The variation of the output power is less than 4 dB within the tuning range. The total DC power consumption varies from 45 mW to 101 mW as VG is changed. For higher VG, which results in lower output frequencies, the fundamental voltage swing increases, and therefore the power consumption increases. Since the conduction angle of both the core oscillator and the frequency doubler increases with VG, their power efficiencies degrade. The peak ηp is 2.95%, and ηp remains higher than 1% across the tuning range. Phase noise at 1 MHz offset varies from −94 dBc/Hz to −90 dBc/Hz within the tuning range, as shown in Figure 4 - 19(a). Figure 4 - 19(b) shows the phase noise plot at an output frequency of 227 GHz. The performance summary of the proposed VCO architecture and its comparison with state-of-the-art CMOS signal sources operating between 200 and 300 GHz are presented in Table 4 - 1.

Figure 4 - 17 (a) Measured spectrum for an LO input of 13.9 GHz, whose 16th harmonic down-converts the RF input of 230.06 GHz to 7.66 GHz. (b) Measured tuning curve. [Axes: output frequency (GHz) versus VG (V).]

Figure 4 - 18 Measured (a) output power, and (b) DC-to-RF efficiency. [Axes: output power (dBm) and DC-to-RF efficiency (%) versus output frequency (GHz).]

Figure 4 - 19 Measured (a) phase noise at 1 MHz offset as the oscillation frequency is varied across the tuning range, and (b) phase noise plot at the output frequency of 227 GHz, down-converted to 1.39 GHz using the 16th harmonic of a 14.1 GHz LO input.
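The reported numbers can be cross-checked with two short conversions. The sketch below uses hypothetical helper names; it assumes that the 68 mW DC power listed in Table 4 - 1 corresponds to the operating point of peak output power, which the text does not state explicitly, so the efficiency value is only an approximate consistency check.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def dc_to_rf_efficiency_percent(p_out_dbm, p_dc_mw):
    """DC-to-RF efficiency in percent for an output power in dBm and a DC power in mW."""
    return 100.0 * dbm_to_mw(p_out_dbm) / p_dc_mw


def downconverted_if_ghz(f_rf_ghz, f_lo_in_ghz, harmonic=16):
    """IF frequency when the RF tone mixes with the given harmonic of the LO input."""
    return abs(f_rf_ghz - harmonic * f_lo_in_ghz)


print(f"3 dBm = {dbm_to_mw(3.0):.2f} mW")
print(f"efficiency at 3 dBm and 68 mW: {dc_to_rf_efficiency_percent(3.0, 68.0):.2f} %")
print(f"IF for 230.06 GHz RF and 13.9 GHz LO input: {downconverted_if_ghz(230.06, 13.9):.2f} GHz")
```

The efficiency obtained this way is close to the 2.95% peak value quoted above, and the computed IF agrees with the 7.66 GHz given in the caption of Figure 4 - 17.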
Table 4 - 1 Measured performance summary and comparison with state-of-the-art designs operating between 200 and 300 GHz.

| Reference | [17] | [58] | [69] | [74] | This Work |
| --- | --- | --- | --- | --- | --- |
| Center Freq. [GHz] | 256 | 288 | 260 | 256 | 225 |
| Tuning Range [%] | NA | 1.4 a | 1.4 | 4.3 (6.5 a) | 5.33 |
| DC Power [mW] | 71 | 275 | 800 | 227 | 68 |
| Peak DC-to-RF Efficiency [%] | 0.03 | 0.3 | 0.33 | 1.14 | 2.95 |
| VDD [V] | 1.25 | 1.25 | 1.2 | 1.6 | 1.2 |
| Phase Noise @ 1 MHz [dBc/Hz] | −88 | −87 | −78 | −94 | −94 @ 227 GHz |
| Peak Output Power [dBm] | −17 | −1.5 | 0.5 | 4.1 | 3 |
| Source Type | Harmonic oscillator | Harmonic oscillator | Harmonic oscillator | Harmonic oscillator | Multiplier-based |
| Technology | 130-nm CMOS | 65-nm CMOS | 65-nm CMOS | 65-nm CMOS | 65-nm CMOS |

a) Including frequency tuning by change of supply voltage.
b) VCO uses a 1.8 V supply and doubler employs a 1.2 V supply voltage.
c) Excluding the DC power consumption of the driving fundamental source.
d) Excluding the DC power consumption of the external locked reference.

4.8. CONCLUSION

Impediments to enhancing the output power and DC-to-RF power efficiency of VCOs operating near or above the fmax of MOS transistors are investigated. The influence of the drain current waveform, the transistor's large-signal output conductance, and the loss of the output power combining/matching network are studied. To improve the DC-to-RF power efficiency for a specific power consumption, the DC-to-RF current efficiency must be increased while the output conductance is decreased. This is difficult to realize in harmonic oscillators because of a fundamental trade-off between power generation at the fundamental and at a higher harmonic. A VCO architecture based on a frequency multiplier is presented that ameliorates this trade-off for efficient power generation. The use of a distinct class-C harmonic generator block and a power-efficient fundamental oscillator to drive it is proposed. Implemented in a 65-nm bulk CMOS process and operating at 219 to 232 GHz, the VCO attains a peak output power and DC-to-RF efficiency of 3 dBm and 3%, respectively.
Future work will investigate the impact of second harmonics [32], [33] in further improving the DC-to-RF efficiency of multiplier-based sources.

HIGHLY-EFFICIENT SUB-THRESHOLD RADIO DESIGN ON CMOS: A 500 µW 2.4 GHZ RECEIVER¹

5.1. INTRODUCTION

The proliferation of wireless communication over the last decade has played a significant role in accessing and using the ever-increasing amount of data that surrounds us. Advances in the semiconductor and wireless industries have enabled a plethora of technologies, for example, wide networks of sensors that autonomously monitor biomedical and environmental conditions. Vital to the existence of such wireless sensor networks (WSNs) is the design of ultra-low-power radio-frequency (RF) transceivers. Furthermore, low power consumption is critical to reduce the burden on the battery and/or the energy harvesting unit, most notably in portable devices. In this chapter, we focus on lowering the power consumption of the receive (RX) path of RF transceivers; however, the presented current-reuse technique can also be applied to the building blocks of the transmit (TX) path.

Broadly speaking, there are three general approaches for designing low-power RX paths. The first approach is the conventional technique of independently optimizing the power consumption of each block in the RX path, e.g., LNA, VCO, and mixer [75]–[78]. The second approach is to combine and co-design two or more of the receiver blocks and thus save power by reusing their bias currents [79]. The third approach is to extract the information from the received RF signal without using traditional low-noise amplification and down-conversion techniques. Popular solutions in the third class include envelope detectors [80], super-regenerative receivers [81], and injection-locked demodulators [82].
However, these elaborate techniques are application-specific and may only be appropriate for specific modulation schemes. In this work, we focus on the second approach, which potentially offers more power saving than conventional methods and is compatible with a broad range of modulation schemes.

¹ The material presented in this subsection is based on [86].

Figure 5 - 1 Conventional SOM, LNC and LMV architectures.

Figure 5 - 2 Proposed LMV+Filter architecture.

Three popular schemes for combining the receiver blocks are shown in Figure 5 - 1. These are the combination of the LNA and the mixer [83], namely the low-noise converter (LNC); the combination of the voltage-controlled oscillator (VCO) and the mixer [84], namely the self-oscillating mixer (SOM); and the stacked LNA, mixer and VCO (LMV) [85]. The LNC offers a relatively high gain with a moderate noise figure (NF) [86]. The SOM architecture typically provides moderate voltage gain and NF. The LMV structure [85] and its variant, the quadrature LMV (QLMV) [87], also achieve reasonable gain and NF. Despite the promising performance of these structures, their low-voltage implementation is challenging because of the number of stacked stages. Furthermore, the power consumption of the reported state-of-the-art combined receivers is still in the mW range and beyond.

In this chapter, we present an alternative way of combining the LNA, mixer, and VCO in the RX path that can achieve sub-mW power consumption (510 μW) with a low supply voltage (0.8 V) while providing reasonable gain, NF, and phase noise performance. In the proposed structure, as shown in Figure 5 - 2, the VCO is stacked with the cascade of the LNA and mixer. Therefore, the currents of the LNA and the mixer are added and reused by the VCO.
In contrast to the conventional methods of stacking the building blocks, where all stages reuse the same current, in the proposed structure the bias current of the VCO is higher than the bias currents of the LNA and mixer. Increasing the bias current of the VCO reduces its phase noise and increases its output power, which in turn improves the switching performance of the mixer (a higher peak-to-peak voltage in the VCO can result in a higher mixer gain). Furthermore, in the proposed structure the number of stacked transistors is reduced, and thus the supply voltage and power consumption can be lowered further. Since the output of the VCO is a relatively large voltage signal (compared with the output of the LNA), special attention is paid to the stacking of the LNA and VCO, as LO-to-RF leakage can deteriorate the linearity and compromise the performance of the overall system. To minimize this leakage, an inter-stage LC filter is included between the VCO and the cascade of the LNA and mixer to improve the isolation.

As a proof of concept, a 2.4 GHz receiver with the proposed architecture is designed and fabricated in 0.13-µm CMOS. The receiver achieves an RF-to-IF gain (S21) of 30.1 dB, input matching (S11) of −16 dB, VCO phase noise of −119.4 dBc/Hz at 1 MHz offset (carrier frequency of 2.4 GHz), and an NF of 8.3 dB. The receiver consumes 510 µW from a 0.8-V supply.

The organization of the chapter is as follows: a general power optimization technique for RF building blocks is discussed in Section II. The technique is applied to the LNA (Section III) and the VCO (Section IV). Section V presents an architecture for ultra-low-power and low-voltage CMOS mixers. Section VI describes the proposed approach for combining the LNA, the VCO, and the mixer. Section VII presents the measurement results and a comparison with state-of-the-art low-power receivers. Section VIII provides concluding remarks.

5.2. POWER OPTIMIZATION IN RF BLOCKS

Many recent RF circuit design techniques rely on optimizing the transistor transconductance efficiency, i.e., the ratio of the transistor transconductance to its drain current [88], [89]. This technique, namely the gm/ID approach, when employed for low-power analog circuit design, typically pushes the operating point of the transistors to the vicinity of the threshold voltage. Figure 5 - 3 shows the plot of gm/ID versus the gate-source voltage of an NMOS device in the 0.13-µm CMOS process used in this work. Note that the transconductance efficiency of submicron MOS devices has a similar profile across different CMOS processes. As shown in the figure, the transconductance efficiency is low when the device operates in strong inversion. gm/ID increases in moderate inversion and reaches a plateau in the sub-threshold region. Thus, from the power efficiency point of view, operating the device in the subthreshold region is attractive. Traditionally, the subthreshold region was not favoured for RF applications, mainly because a low bias current adversely impacts the transit frequency (fT) of the device. However, owing to the continuous increase in the fT of transistors in advanced technologies, operation in moderate inversion (close to the threshold voltage) and weak inversion (subthreshold) has become a feasible solution for RF applications.

Figure 5 - 3 gm/ID versus gate-source voltage of an NMOS transistor in a 0.13-μm CMOS process (device gate length is 120 nm).

Figure 5 - 4 Transconductance efficiency versus inversion coefficient and transistor's fT versus inversion coefficient.
[Annotations in Figure 5 - 3: W/L aspect ratio from 7 to 700; VTH = 415 mV; L = 120 nm; subthreshold and superthreshold regions marked.]

Figure 5 - 5 FOM of an NMOS transistor versus inversion coefficient.

Classical VGS-based expressions do not accurately describe the current of transistors in the weak inversion region, and are hence not suitable for designing circuits that operate close to the threshold voltage. An alternative approach is to describe the drain current using the inversion coefficient (IC), which is a measure of the level of channel inversion [90]. IC is defined as the drain current normalized to a specific current Ispec, as defined in (1) and (2):

$$IC = \frac{I_D}{I_{spec}} \qquad (1)$$

$$I_{spec} = 2 n \mu_0 C_{ox} U_T^2 \frac{W}{L} = I_{spec,\square} \frac{W}{L} \qquad (2)$$

In (2), Cox is the gate-oxide capacitance per unit area, UT is the thermal voltage, µ0 is the low-field surface mobility, and n is a slope factor (with a value ranging from 1.3 in the weak inversion region to 1.6 in the strong inversion region).

For a transistor operating in the saturation region, one can show that for IC ≤ 0.1 the transistor is in weak inversion (WI), for 0.1 ≤ IC ≤ 10 the device is in moderate inversion (MI), and for 10 ≤ IC the transistor is in strong inversion (SI). Figure 5 - 4 shows the plots of transconductance efficiency and fT versus IC. Note that gm/ID exhibits the same behavior as in Figure 5 - 3. Unlike VGS-based descriptions, the IC-based approach normalizes the drain current ID to Ispec, and thus the results are independent of the technology parameters and transistor dimensions. A useful figure of merit (FOM) for low-power RF design that accounts for both transconductance efficiency (gm/ID) and transit frequency (fT) is defined as [91], [92]:

$$FOM = f_T \cdot \frac{g_m}{I_D} \qquad (3)$$

Figure 5 - 5 shows the plot of FOM versus the inversion coefficient.
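The definitions in (1) to (3) translate directly into a few helper functions, sketched below. All function names are hypothetical. The numerical values of µ0·Cox and UT in the example are placeholder assumptions chosen only to exercise the code and are not parameters of the process used in this work. The text's region boundaries overlap at IC = 0.1 and IC = 10; the sketch arbitrarily assigns both boundary values to moderate inversion.

```python
def specific_current(n, mu0_cox, u_t, w_over_l):
    """Specific current per (2): Ispec = 2 * n * mu0 * Cox * UT^2 * (W/L)."""
    return 2.0 * n * mu0_cox * u_t ** 2 * w_over_l


def inversion_coefficient(i_d, i_spec):
    """Inversion coefficient per (1): IC = ID / Ispec."""
    return i_d / i_spec


def inversion_region(ic):
    """Classify the inversion level using the IC boundaries 0.1 and 10."""
    if ic < 0.1:
        return "weak"
    if ic <= 10.0:
        return "moderate"
    return "strong"


def rf_fom(f_t_hz, gm_over_id):
    """Figure of merit per (3): FOM = fT * (gm / ID)."""
    return f_t_hz * gm_over_id


# Placeholder device values (assumptions, not taken from the thesis).
i_spec = specific_current(n=1.3, mu0_cox=300e-6, u_t=0.02585, w_over_l=120.0)
i_d = 7.0 * i_spec  # drain current that yields IC = 7 for this Ispec
ic = inversion_coefficient(i_d, i_spec)
print(f"IC = {ic:.1f} -> {inversion_region(ic)} inversion")
```

Because IC is a normalized quantity, the classification step does not depend on the placeholder process values; only the absolute drain current does.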
As can be seen from the figure, the optimum bias point of a transistor with respect to this FOM occurs when the device is biased for an IC of approximately 7, where the transistor operates in moderate inversion. For the 0.13-µm CMOS process used in this work, the slope factor at IC = 7 lies in the range of 1.3 to 1.6, and hence the drain current can be written as:

$$7 = \frac{I_D}{2 n \mu_0 C_{ox} \frac{W}{L} U_T^2} \;\Rightarrow\; I_D = 14\, n\, \mu_0 C_{ox} \frac{W}{L} U_T^2 \;\Rightarrow\; 18.2\, \mu_0 C_{ox} \frac{W}{L} U_T^2 \le I_D \le 22.4\, \mu_0 C_{ox} \frac{W}{L} U_T^2$$

For example, if we consider a transistor with W/L = 120 in the 0.13-µm CMOS process used in this work, then for optimum FOM the approximate range for the drain current would be 86 μA ≤ ID ≤ 107 μA. Considering the nominal 1.2-V supply voltage for the process, such a bias current results in a power consumption of 103.2 to 128.5 µW. Achieving the same FOM = fT·gm/ID and IC with a lower power consumption is possible provided that the supply voltage is scaled down or the aspect ratio of the transistor is decreased (which results in the same inversion coefficient but a lower drain current) [89], [91]. In the proposed design, the power consumption of the LNA and VCO is reduced further by scaling the supply voltage and the transistor sizes, and that of the mixer is lowered by supply voltage scaling and by using dynamic power instead of static power. In the subsequent sections, the design of each block is discussed in more detail.

5.3. POWER-EFFICIENT VCO DESIGN

The inherently better phase noise of LC oscillators in comparison with their ring oscillator counterparts has made them the architecture of choice in wireless applications. The VCO used in this work is based on the standard cross-coupled differential LC-VCO topology [93] with MOSCAP varactors. Figure 5 - 6 illustrates the generic schematic and the equivalent model of such a cross-coupled LC-VCO with a CMOS varactor. The negative transconductance, −gm, is the equivalent transconductance of the cross-coupled active devices.
In the steady state, the tank parallel loss, Rp, and the negative active resistance, −1/gm, cancel each other so that the remaining circuit is ideally a lossless LC tank with a variable capacitor (varactor). In CMOS implementations, the tank loss is typically dominated by the inductor loss [20]. To sustain the oscillation, one can show that the following inequality should hold [93]:

$$g_{m1} \ge \frac{1}{Q_L \cdot L \cdot 2\pi \cdot f_{osc}} \;\Rightarrow\; g_{m1,min} = \frac{1}{Q_L \cdot L \cdot 2\pi \cdot f_{osc}} \qquad (4)$$

Equation (4) can be rewritten as (5):

$$I_{bias} \ge \frac{2}{Q_L \cdot L \cdot 2\pi \cdot f_{osc} \cdot \left( \frac{g_{m1}}{I_{D1}} \right)} \qquad (5)$$

where QL is the quality factor of the inductor L and gm1 is the small-signal transconductance of transistor M1 (or M2). Equation (5) specifies the lower bound of the bias current (Ibias = 2·ID1) required to sustain the oscillation, and highlights that the minimum Ibias is inversely proportional to the product of the inductor value, L, its quality factor, QL, and the transconductance efficiency, gm1/ID1.

Figure 5 - 6 LC cross-coupled oscillator and equivalent model at resonance.

5.3.1. Inductor sizing

According to (5), assuming that gm1/ID1 and the oscillation frequency are set, the product QL.L must be maximized in order to minimize Ibias. For the 0.13-μm CMOS process used in this work, the product QL.L versus the inductance of the inductor is plotted in Figure 5 - 7. The plot of the self-resonance frequency (SRF) of the inductor is also shown. The trends shown in these two plots are representative of the behavior of spiral inductors in CMOS processes. Based on the figure, as the value of L increases, QL.L also increases, while the SRF of the inductor decreases. Thus, there is a trade-off between the value of QL.L and the SRF of the inductor. Note that at frequencies close to the SRF, the impedance of the inductor switches from inductive to capacitive, which adversely affects the performance of the oscillator.
Therefore, the frequency of operation of the oscillator, fosc, is typically set much lower than the SRF of the tank inductor. Targeting an oscillation frequency of 2.4 GHz, we take SRFmin as 5 GHz, which corresponds to an inductor value of 10.2 nH and a QL.L of 124 nH. Note that, as shown in the proposed flowchart for the systematic design of an LC VCO (Figure 5 - 8), if the resulting inductance does not meet the required tuning range, the optimization step has to be repeated.

Figure 5 - 7 QL.L and SRF versus inductor value L for inductors in the 130-nm CMOS technology. [Axes: L (nH); Q.L; SRF (GHz).]

Figure 5 - 8 Systematic approach for designing a low-power VCO. The last loop allows the tuning range to be met while minimizing the phase noise and power consumption (maximum Q.L optimizes phase noise and power; if needed, a constraint on the maximum value of L can be applied to meet the tuning range). [Flowchart boxes: specifications (maximum frequency fH, minimum frequency fL, power consumption goal); maximize L×QL (consider constraints); find the minimum required fT of the transistor (3×fH); find the maximum gm/ID (with the fT constraint) and ICopt; find the minimum gm (typical corner) to start the oscillation; check for ICopt and adjust Idc if needed; oscillation sustains over PVT? (if not, increase gm by increasing W/L of the core transistors); LC tank meets the desired tuning range (fH − fL)? (if not, reduce L).]

5.3.2. Active device sizing

As shown in (4), increasing the value of gm/ID reduces the required Ibias. From Figure 5 - 4, the maximum value of gm/ID is achieved when the transistor is biased in the weak inversion region, i.e., IC < 0.1. However, the transit frequency of M1, fT, is typically chosen to be about 3 times the oscillation frequency to ensure proper switching operation [93].
For this design, a minimum fT of 7.5 GHz is chosen so that the cross-coupled pair provides enough gm at the frequency of oscillation, i.e., fosc = 2.4 GHz. This requires IC to be larger than 0.5, which corresponds to a gm/ID of 20 V⁻¹. Given gm/ID, IC, QL.L, and fosc, one can estimate the minimum required bias current from (4). To guarantee oscillation start-up under all conditions, e.g., process, supply voltage, and temperature (PVT) variations, the transconductance gm1 is usually chosen to be larger than the minimum gm1 required for sustained oscillation (typically 2 to 3 times larger [93]). To increase gm1 (and gm2) while keeping gm/ID intact, as outlined in the design flowchart in Figure 5 - 8, one can increase W/L of the cross-coupled devices and proportionally increase the tail current such that gm/ID stays the same. In this work, we have used gm1 = 2.5×gm1,min to ensure proper oscillation start-up at different PVT corners.

The flowchart in Figure 5 - 8 summarizes the proposed design methodology for low-power LC VCOs. It depicts the design steps for finding the proper inductor and transistor sizes. The first step is to set the center frequency, the required tuning range, and the power consumption target, which are typically dictated by the application and/or the specifications of the communication standard being used. The next step is to use the highest operating frequency, fH, to find the minimum required transit frequency, fT,min, as ~3×fH; the minimum required inversion coefficient, ICmin, can then be read from the graph of fT versus IC (Figure 5 - 4). Then, using the graph of transconductance efficiency versus IC, the corresponding maximum gm/ID can be found. Next, on the passive side, by considering the minimum required SRF, the maximum QL.L and the required inductor value can be found from Figure 5 - 7.
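The bias-current estimate based on (4) and (5) can be sketched numerically using the values quoted in this section (QL.L = 124 nH, fosc = 2.4 GHz, gm/ID = 20 V⁻¹, and a start-up margin of 2.5). This is a minimal sketch under stated assumptions: the function names are hypothetical, the quoted QL.L is assumed to apply at the oscillation frequency, and the tank loss is assumed to be dominated by the inductor. The results are first-order estimates and not the final simulated bias of the fabricated VCO.

```python
import math


def tank_parallel_resistance(ql_times_l_h, f_osc_hz):
    """Equivalent parallel tank resistance Rp = QL * L * 2*pi*fosc (inductor-dominated loss)."""
    return ql_times_l_h * 2.0 * math.pi * f_osc_hz


def min_transconductance(ql_times_l_h, f_osc_hz):
    """Minimum gm1 for sustained oscillation per (4): gm1,min = 1 / Rp."""
    return 1.0 / tank_parallel_resistance(ql_times_l_h, f_osc_hz)


def min_bias_current(ql_times_l_h, f_osc_hz, gm_over_id):
    """Lower bound on the tail current per (5): Ibias >= 2 / (Rp * gm1/ID1)."""
    return 2.0 / (tank_parallel_resistance(ql_times_l_h, f_osc_hz) * gm_over_id)


QL_TIMES_L = 124e-9    # H, value quoted in the text
F_OSC = 2.4e9          # Hz
GM_OVER_ID = 20.0      # 1/V
STARTUP_MARGIN = 2.5   # gm1 = 2.5 x gm1,min, as used in this work

r_p = tank_parallel_resistance(QL_TIMES_L, F_OSC)
gm_min = min_transconductance(QL_TIMES_L, F_OSC)
i_min = min_bias_current(QL_TIMES_L, F_OSC, GM_OVER_ID)

print(f"Rp                 : {r_p:.0f} ohm")
print(f"gm1,min            : {gm_min * 1e3:.3f} mS")
print(f"Ibias,min          : {i_min * 1e6:.1f} uA")
print(f"Ibias with margin  : {STARTUP_MARGIN * i_min * 1e6:.1f} uA")
```

Scaling the tail current by the same factor as gm1 keeps gm/ID, and therefore the inversion coefficient, unchanged, which is the step the flowchart applies when oscillation is not sustained over PVT corners.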
The tank resistance can then be calculated, and the minimum required current can be estimated from (3) and (4). From the current and the transconductance efficiency, the gm and the size of the cross-coupled transistors can be calculated, and the oscillator can be simulated over PVT corners. If the VCO cannot sustain oscillation over the desired corners, the cross-coupled transistor sizes can be increased to produce a higher gm, with the tail current increased correspondingly to keep gm/ID constant. After finalizing the transistor sizes, the tuning range should be checked, as it may have changed due to the additional parasitic capacitance associated with the larger transistors. In case of any deviation from the specification, the inductor size can be adjusted. This procedure should be repeated until the design specifications are met.

5.4. POWER-EFFICIENT LNA DESIGN

In the receive path, the LNA must satisfy three critical performance considerations: 1) The input impedance of the LNA should be matched to the impedance of the antenna (typically 50 Ω) across the desired frequency range of operation. 2) The noise figure (NF) of the LNA should be minimized since it directly adds to the total noise of the receiver. 3) A larger LNA gain is desirable as it reduces the effect of the noise from the subsequent blocks [94].

Figure 5 - 10(a) depicts a feedback transimpedance stage that uses a CMOS inverter (push-pull structure) as its gain stage. The use of the PMOS transistor allows this topology to achieve high gain due to gm boosting through current reuse. However, the parasitic capacitance and noise of the PMOS transistor impact the performance.

For the majority of LNA topologies, such as the popular source-degenerated LNA [95], the input matching depends on the operating condition of the input MOS device. That is, changing the bias current, or equivalently the gm of the devices, impacts the matching. Additionally, as mentioned earlier, the maximum fT·gm/ID of a transistor occurs at IC ≈ 7 to 10, which means operation in the moderate inversion region. In this region, the channel charge transport mechanism changes smoothly from drift to diffusion, and hence $g_m = \partial I_D / \partial V_{GS}$ is more sensitive to the gate-source voltage as VGS approaches the threshold voltage of the device. This further exacerbates the dependence of input matching on the bias conditions. The push-pull LNA employed in this work is an attractive candidate in that respect [96], [97], and with special care the sensitivity of Zin to gm can be lowered significantly.

The schematic of the LNA with its resistive feedback is shown in Figure 5 - 10. Here, Cload models the input impedance of the stage that follows the LNA (e.g., the mixer in this work) and its value is between 100 and 300 fF. CGS is the equivalent gate-source capacitance of transistors M1 and M2. Assuming that gmn ≈ gmp, that the feedback resistor RF is large (e.g., RF > 10 kΩ), and that the transistors operate in the subthreshold region (e.g., gm < 1 mS), Zin can be approximated as:

$$Z_{in}(s) \approx \frac{C_L + C_{GS}}{R_F\,C_L\,C_{GS}^{2}\,\omega^{2}} - j\,\frac{1}{C_{GS}\,\omega} \approx Z_{in,Real} - j\,Z_{in,Imag} \qquad (6)$$

As can be seen from (6), Zin,Real is independent of gm, and Zin,Imag can be tuned out using the π-matching network shown in Figure 5 - 10(b). To check the feasibility of this matching, the LNA structure is simulated in Cadence, and Figure 5 - 10(c) shows the result for S11. In this design, the supply voltage of the LNA is 0.55 V and the power consumption is 96 µW. As can be seen, S11 is below −10 dB in the vicinity of 2.4 GHz.

Figure 5 - 9 shows the simplified model for the noise calculation of the LNA. When operating at the resonance frequency, $f_0 = \frac{1}{2\pi\sqrt{L_G C_{in}}}$, the noise factor can be shown (please refer to the Appendix) to be approximately:

$$F(\omega_0) \approx 1 + \frac{(1 + 2 g_m R_s)\,R_s\,(C_{in}\omega_0)^{2}}{2 R_F\,g_m^{2}} + \frac{\gamma\,R_s\,(C_{in}\omega_0)^{2}}{g_m\,\alpha} \qquad (7)$$

Figure 5 - 9 Simplified noise model of LNA, gate induced noise and flicker noise of transistors are ignored.
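As a quick sanity check on the resonance condition used for the noise analysis, the standard LC relation f0 = 1/(2π√(L·C)) can be inverted to find the capacitance that resonates with a given inductor. The sketch below is hedged: it treats the input as a simple LC resonance rather than the full π-matching network, and the 10 nH value is the gate inductor LG quoted in the LNA figure caption. The function names are hypothetical.

```python
import math


def resonant_capacitance(f0_hz, l_h):
    """Capacitance that resonates with inductance l_h at f0_hz: C = 1 / ((2*pi*f0)^2 * L)."""
    return 1.0 / ((2.0 * math.pi * f0_hz) ** 2 * l_h)


def resonance_frequency(l_h, c_f):
    """Resonance frequency of an ideal LC pair: f0 = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_h * c_f))


# LG = 10 nH (figure caption) resonating at 2.4 GHz.
c_in = resonant_capacitance(2.4e9, 10e-9)
print(f"C_in for resonance: {c_in * 1e15:.0f} fF")
print(f"round trip: {resonance_frequency(10e-9, c_in) / 1e9:.3f} GHz")
```

Under this simplification the equivalent input capacitance is on the order of a few hundred femtofarads, the same order as the 100 to 300 fF load range mentioned for the following stage. The actual value depends on the matching network and pad parasitics.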
Figure 5 - 10 (a) Current-reused push-pull feedback LNA, (b) matching network (bias is excluded), and (c) S11 simulation of the proposed push-pull low-noise amplifier. Rf is 30 kΩ, LG = 10 nH, M1 = ¯°µ†¯±²³†, M2 = ¯°µ†¯±²³†. [Plot (c): S11 (dB) versus frequency, 0.5 GHz to 4.5 GHz.]

In (7), γ is the excess noise factor and α is the channel conduction coefficient [97]. Thus, assuming that the LNA is operating at the resonance frequency, the noise factor (or equivalently, the noise figure) can be reduced by increasing gm and RF. Figure 5 - 11 shows the simulated NF of the LNA for RF = 30 kΩ. The supply voltage and power consumption are 0.55 V and 96 µW, respectively.

Figure 5 - 12 shows a flowchart for the systematic design of the active components of the proposed push-pull LNA. Note that the matching network can be sized after choosing the active components. The design process begins with finding the desired inversion coefficient, ICopt, which maximizes fT·gm/ID. Next, the required bias current of the LNA is found from $I_{LNA} = I_{spec}\,(W/L)\,IC_{opt}$, where Ispec is the specific current of the technology for W/L = 1. To find the initial estimate for ILNA, W/L can be set to 1 in the first iteration. Then, using W/L and ILNA, the transconductance and gain can be determined. If the transconductance is not sufficient, W/L is increased and ILNA is adjusted until the design achieves the required transconductance. Once the design meets the specifications, the input can be matched to the antenna by adjusting Ctune (which sets Zin,Real) and LG (which tunes out Zin,Imag).

It should be noted that in a subthreshold inverter LNA, the performance can vary in different corners, especially in SS-Cold and FF-Hot. This is mainly due to the change in the bias current of the LNA. Robustness to PVT can be achieved by using a dynamic biasing circuit that adjusts the bias of the LNA (e.g., by adjusting the gate voltage of the NMOS device) based on the supply, temperature and process corner of the design.
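The sizing loop described above (start from W/L = 1, compute the bias current at the chosen inversion coefficient, then widen the device until the transconductance target is met) can be sketched as follows. This is only an illustration: the numerical values of the specific current, gm/ID and the gm target are hypothetical placeholders, not values from this design, and the push-pull pair is lumped into a single effective device.

```python
def size_lna(gm_target_s, ic_opt, i_spec_a, gm_over_id,
             w_over_l=1.0, step=1.0, max_iter=100000):
    """Increase W/L at a fixed inversion coefficient until gm reaches the target."""
    for _ in range(max_iter):
        # Bias current at the chosen inversion coefficient: I = Ispec * (W/L) * ICopt.
        i_lna_a = w_over_l * i_spec_a * ic_opt
        # gm follows from the transconductance efficiency, which is fixed by ICopt.
        gm_s = gm_over_id * i_lna_a
        if gm_s >= gm_target_s:
            return w_over_l, i_lna_a, gm_s
        w_over_l += step
    raise ValueError("gm target not reached; check the inputs")


# Hypothetical inputs: Ispec = 0.5 uA per square, ICopt = 8, gm/ID = 8 1/V, gm target 0.9 mS.
w_over_l, i_lna, gm = size_lna(gm_target_s=0.9e-3, ic_opt=8.0,
                               i_spec_a=0.5e-6, gm_over_id=8.0)
print(f"W/L = {w_over_l:.0f}, I_LNA = {i_lna * 1e6:.0f} uA, gm = {gm * 1e3:.3f} mS")
```

Because the inversion coefficient is held constant, gm scales linearly with W/L in this sketch, so the loop could be replaced by a closed-form division. The iterative form is kept to mirror the flowchart, where gain and matching are re-checked after every change.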
In the current prototype the biasing is provided off-chip; in a robust design, however, it can be provided through a more elaborate biasing circuit, such as a dynamic/digital temperature-monitoring bias circuit.

Figure 5 - 11 NF simulation of the proposed LNA. The minimum simulated NF is 2.5 dB in the range of 2.15 GHz to 2.45 GHz. [Plot: NF (dB) versus frequency, 0.5 GHz to 4 GHz.]

Figure 5 - 12 Systematic approach for designing a low-power LNA. [Flowchart blocks: Specifications (Re(Zin) = 50 Ohm, Gm-min, NFmin); Find the ICopt where (gm.fT/ID) is maximum; ILNA = (Wini/Lini)·Ispec·ICopt; Find VGS; Gm = Gm-min? If not, increase W/L; Match input; Find Re(Zin); Re(Zin) = 50 Ohm? If not, adjust Ctune and adjust LG.]

5.5. LOW-POWER LOW-VOLTAGE MIXER DESIGN

In conventional current-commutating mixers, operating from a low supply voltage (VDD < 0.8 V) while consuming low power (PDC < 500 µW) and achieving reasonable performance is a fundamentally challenging task. Gilbert-type designs are widely used due to their simplicity, reasonable conversion gain, noise figure, and linearity [97], [98]. However, Gilbert-type structures are not amenable to low-voltage operation, mainly because the switching stage, the transconductance stage, and the load of the mixer are stacked. Such stacked mixers require a large supply voltage to bias all stages in their proper operating region. As shown in Figure 5 - 13(b), an alternative mixing structure has been proposed which utilizes the body effect to mix RF and LO [99]. Such architectures can combine the LO stage and the Gm stage of the mixer and improve headroom; however, due to the use of the body effect, they typically have a low conversion-gain-to-power ratio and poor linearity. To reduce the number of stacked transistors, folded architectures (Figure 5 - 13(c)) have been proposed [100].
These architectures offer a wideband response and are robust to mismatches; however, in addition to typically requiring bulky RF chokes for biasing, they provide a low conversion gain at low supply voltages.

Figure 5 - 13 Conventional down conversion mixer (a) Gilbert-type mixer (b) Combined Gm-LO mixer (c) Folded mixer.

A survey of current-commutating mixers and an alternative mixing solution is presented in [101], where, instead of current commutation, Gm-switching is used for mixing the LO and RF signals. Figure 5 - 14 shows the basic concept of the Gm-switched mixer. By choosing a proper size and transconductance for transistors M1 and M2, the switching-stage driver (which can be as simple as an inverter) can operate almost independently of the Gm-stage. Considering Figure 5 - 14(b), there is rail-to-rail switching (between VDD and GND) at nodes X and Y and, in contrast with conventional stacked Gilbert-type mixers, the Gm-stage and the LO-stage are cascaded in this structure. Thus, the supply voltage can be lowered significantly, and the mixer is functional with supply voltages as low as the threshold voltage of the transistor [101]. From a power consumption perspective, the dynamic power of the switches is dominant in such mixers and the overall power is proportional to $P \propto C_{sw}\, f_{LO}\, V_{DD}^{2}$, where Csw is the total parasitic capacitance at nodes X and Y, fLO is the frequency of the local oscillator, and VDD is the supply voltage of the mixer. In many applications fLO is given (e.g., 2.45 GHz in this work); however, VDD and Csw can be lowered to achieve low power consumption. Given the low-power nature of the Gm-switched mixer structure, a modified version of this mixer is used in the proposed receiver.

Figure 5 - 14 Gm-Switched mixer (a) Conceptual single-ended mixer. Gm can be turned off and on by switching its supply terminal. (b) Implementation of switching-stage and gm-stage.
Another critical consideration for low-power mixer-based receivers is the sensitivity of the mixer to the LO power. The output voltage swing of a typical cross-coupled VCO depends directly on its bias current. Due to this constraint, it is desirable that the mixer be able to function with a low LO amplitude, which lowers the power consumption of the VCO as well as of the overall receiver. Here, we use two techniques, namely inductive peaking and dynamic threshold switching, to lower the sensitivity of the Gm-switched mixer to the LO amplitude.

Figure 5 - 15 Inductive peaking technique in proposed mixer

Figure 5 - 15 shows the inductive peaking technique used in the proposed mixer, where the resonance frequency of the LC tank, $\frac{1}{2\pi\sqrt{L_P C_S}}$, is equal to fLO. Since the sources of M2 and M1 are dynamically biased, they have variable transconductance (as shown in Figure 5 - 16). When M2 is off (or equivalently VY = 0), Gmp(t) is ~0 and thus 1/Gmp(t) is very large. In this situation, the LC tank has a large quality factor, which we denote as Qstart. In the next half cycle, when the inverter toggles the voltage of node Y to VDD, VY may initially overshoot VDD because of the large Qstart, which facilitates switching M2 from the "off" state to the "on" state. In addition, such overshoot makes it possible to reduce the supply voltage of the inverter, as the voltage at node X can be passively amplified by the LC tank. The same applies to M1.

Figure 5 - 16 (a) Inverter with dynamic-threshold-voltage NMOS (DTNMOS) (b) Inverter with dynamic-threshold-voltage PMOS (DTPMOS) (c) Inverter with DTPMOS and DTNMOS (d) Conventional inverter

Figure 5 - 17 Output voltage of the inverter with Wp = 200 µm, Wn = 100 µm, CL = 1 pF, 2.45 GHz (A) with dynamic threshold (DTMOS) inverter (B) without dynamic threshold voltage

In this work, due to the low-power operation of the VCO, the single-ended peak-to-peak voltage of the LO signal is limited to 400 mV.
On the other hand, considering the proposed LVM architecture in Figure 5 - 1, the supply voltage of the LNA and the mixer are shared. Thus, as the LNA is designed to operate from a 0.55 V supply, the mixer should also be functional from the same supply voltage.

Figure 5 - 18 Schematic of the proposed LVM receiver

With a 2.45 GHz oscillator signal that has a peak-to-peak voltage of 400 mV, and a 0.55 V supply, it is challenging to design a CMOS inverter for the switching stage of the proposed mixer. To obtain a functional high-speed inverter at a low supply voltage (VDD < 0.5 V) and with a low switching threshold, [102] presents a dynamic-threshold-MOS (DTMOS) inverter. Figure 5 - 16 shows the conventional inverter along with different combinations for the DTMOS inverter. To understand the dynamic threshold technique, consider the simple DTNMOS inverter (or equivalently the DTPMOS inverter) in Figure 5 - 16(a), in which the threshold voltage of the NMOS can be written as:

$$V_{th,NMOS} = V_{th0} + \gamma\left(\sqrt{2\phi_F + V_{SB}} - \sqrt{2\phi_F}\right) \qquad (8)$$

where Vth0 is the threshold voltage for zero substrate bias, 2φF is the surface potential, VSB is the source-to-body voltage, and γ is the body effect parameter. According to this equation, when the LO signal is low, namely when CL is charged and node A is equal to VDD, Vth,NMOS becomes lower than Vth0 due to the body effect. This means that in the next half cycle, when LO becomes high, the NMOS transistor can turn on and discharge node A faster. Thus, compared to the conventional inverter, the DTNMOS (or equivalently DTPMOS) inverter is functional with lower LO power as well as a lower supply voltage. Figure 5 - 17 compares the transient response of the DTMOS inverter with that of the conventional one. In this simulation the input signal is set to 2.45 GHz, the supply voltage is 0.5 V, and the load of the inverter is a 1-pF capacitance.
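The body-effect relation in (8) can be evaluated numerically to see how much a forward body bias lowers the threshold. The sketch below uses the standard body-effect expression; the device parameters (Vth0 = 0.35 V, γ = 0.4 √V, 2φF = 0.8 V) are hypothetical round numbers chosen for illustration and are not taken from the 130-nm process used in this work.

```python
import math


def vth_body_effect(vth0_v, gamma_sqrt_v, two_phi_f_v, v_sb_v):
    """Threshold voltage with body effect: Vth0 + gamma*(sqrt(2phiF + VSB) - sqrt(2phiF))."""
    arg = two_phi_f_v + v_sb_v
    if arg < 0.0:
        raise ValueError("2*phiF + VSB must be non-negative for this model")
    return vth0_v + gamma_sqrt_v * (math.sqrt(arg) - math.sqrt(two_phi_f_v))


# Hypothetical device parameters (illustration only).
VTH0, GAMMA, TWO_PHI_F = 0.35, 0.4, 0.8

vth_zero_bias = vth_body_effect(VTH0, GAMMA, TWO_PHI_F, 0.0)
# Body raised 0.4 V above the source, i.e. VSB = -0.4 V (forward body bias).
vth_forward = vth_body_effect(VTH0, GAMMA, TWO_PHI_F, -0.4)

print(f"Vth at VSB = 0 V:    {vth_zero_bias:.3f} V")
print(f"Vth at VSB = -0.4 V: {vth_forward:.3f} V")
```

With these placeholder values the threshold drops by roughly 0.1 V under forward body bias, which illustrates why a dynamic-threshold inverter can switch with a smaller LO swing. The simple square-root model ignores forward conduction of the body diode, so it is only meaningful for modest bias levels.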
As can be seen, the dynamic threshold technique is effective in improving the performance of the switching stage.

Figure 5 - 19 Chip Micrograph of (a) LNA-Mixer, (b) LNA-VCO-Filter and (c) LNA-VCO-Mixer-Filter

Table 5 - 1 Components Value for Proposed Receiver
Component | Value | Responsibility
M1, M2 | ÉÊ8F&k8, FËÊ8F&k8 | LNA
M3, M4 | 116μ120 | VCO
M5, M6 | FÌÊ8F&k8 , Í&Ê8F&k8 | Gm-Switched Mixer
LRes | 4 nH | LNA-VCO Isolation
LS | 7 nH | Reduce Capacitive Load on LNA
LPeak | 15 nH | Inductive Peaking in Mixer; Reduces Capacitive Load on LNA
L0 | 4 nH | Input Matching
LV | 4.2 nH | VCO Tank
CDC-Large | 20 pF | Bias for LNA and Mixer

As for the LO-to-IF leakage, the proposed mixer shows better isolation than the conventional Gilbert mixer. This is because the LO can reach the output mainly via the small drain-source capacitance of the Gm-stage (CDS-M1 or CDS-M2), whereas in most Gilbert-type mixers the LO can couple to the IF via a larger gate-drain capacitance, namely CGD [97].

5.6. OVERALL DESIGN

Figure 5 - 18 shows the schematic of the proposed low-power low-voltage combined LVM receiver. CDC-Large is a 20 pF capacitor that acts as the supply node for the mixer and LNA blocks. At the VCO frequency of 2.4 GHz, CDC-Large presents a low impedance and is effectively an AC short to ground. The inductor LRes is designed to resonate with the parasitic capacitances at the tail of the VCO, namely Cparasitic; it creates an equivalent open circuit at resonance and improves the isolation between the LNA and the VCO. To reduce the sensitivity of the LNA to its capacitive output load, LS is used to resonate with the input capacitance of the mixer and hence to nullify the effect of the parasitic capacitances of the mixer at 2.4 GHz. LPeak is chosen to improve the switching performance of the mixer. Table 5 - 1 summarizes the values of the main components of the proposed LVM receiver.

5.7. MEASUREMENT RESULTS

The RX shown in Figure 5 - 18 is implemented in a 0.13-µm CMOS process and occupies 0.88 mm2, excluding pad areas.
To facilitate the characterization of individual blocks, the combinations LNA-VCO (without the mixer) and LNA-mixer (without the VCO) are also included on the test chip (as shown in the chip micrograph in Figure 5 - 19). The RF frequency is 2.45 GHz and the IF frequency is 50 MHz. The overall chip operates from a 0.8 V supply while consuming 510 µW. Figure 5 - 20 shows the spectrum and phase noise of the implemented VCO; the phase noise is −119.4 dBc/Hz at a 1 MHz offset. Comparing the phase noise of the VCO with that of the state-of-the-art, as shown in Table 5 - 2, the proposed VCO achieves lower phase noise while consuming sub-mW power. Figure 5 - 21 shows the measurement results for S11. Based on this figure, the effective bandwidth of the LNA, where S11 < −10 dB, extends from 2.25 GHz to 2.53 GHz. The measured conversion gain from the input of the receiver to the output is 30.1 dB, of which 18.6 dB is provided by the LNA and the remaining 11.5 dB by the mixer. The performance summary and comparison with the state-of-the-art are provided in Table 5 - 2. Compared to the LMV structure [103], which is also in a 0.13-µm CMOS process, the power consumption of the proposed design is lower by almost an order of magnitude and the supply voltage is reduced by 33% (i.e., 0.4 V). As can be seen from the table, the performance of the proposed RX compares favorably with that of the state-of-the-art.

Figure 5 - 20 Measured buffered VCO a) Buffered spectrum b) Phase Noise.
The center frequency is 2.45 GHz.

Figure 5 - 21 Measured LNA S11 input matching. [Plot: S11 (dB) versus frequency, 0 to 5 GHz; S11 < −10 dB from 2.25 to 2.53 GHz.]

Table 5 - 2 Performance Summaries and Comparison
Parameters | This Work | [104] ISSCC, 2013 | [103] JSSC, 2010 | [87] JSSC, 2010
Architecture | LMV+Filter | Blixer | LMV | LMV
Frequency (GHz) / Application | 2.4 / WSN | 2.4 / ZigBee | 2.4 / ZigBee | 1.57 / GPS
VDD (V) | 0.8 | 1.2 | 1.2 | 1
PDC (mW) | 0.51 | 2.7 | 3.6 | 6.4
NF (dB) | 8.3 | 9 | 9 | 6.5
Gain (dB) | >30.1 | 55* | 75* | 42.5*
S11 Matching < –10 dB (GHz) | 2.25 to 2.53 | 2.25 to 3.55 | 1.5 to 1.7 | 1.55 to 1.65
VCO Phase Noise (dBc/Hz) | <–119.4 @ 1 MHz | - | –116 @ 3.5 MHz | –110 @ 1 MHz
VCO Tuning Range (GHz) | 2.28 to 2.51 | - | - | 1.45 to 1.85
IIP3 (dBm) | –9.3 | –6 | –12.5 | –30
CMOS Technology | 0.13 µm | 65 nm | 90 nm | 0.13 µm
* Includes IF post amplification

5.8. CONCLUSION

A systematic design technique for ultra-low-power ultra-low-voltage CMOS receivers is presented. Based on the proposed design methodology, an LMV architecture is designed which can operate with sub-1 mW power consumption. The proposed structure uses a push-pull current-reuse LNA stage with resistive feedback to achieve a reasonable trade-off between the NF and input matching. A low-power switching mixer, which uses a dynamic threshold scheme to control its switching, completes the receive path. The VCO is stacked on top of the cascaded LNA-mixer design. To alleviate coupling issues, special attention is paid to the isolation amongst the RF blocks. A proof-of-concept prototype LMV is designed and fabricated in a 0.13-μm CMOS technology and achieves a conversion gain of 30.1 dB and a combined NF of 8.3 dB while consuming only 510 μW from a 0.8 V supply.

HIGHLY-EFFICIENT, PVT-ROBUST, SUB-THRESHOLD RADIO: A 1-mW BLE RECEIVER ON CMOS

6.1. INTRODUCTION

Ultra-low-power (ULP) wireless communication is one of the key enabling technologies for many smart networking applications including the Internet of Things (IoT).
Over the last few years, advanced energy-efficient low-data-rate ubiquitous sensor and medical connectivity applications have become increasingly popular, requiring low-cost sub-mW radio transceivers [1]-[10]. Academic and industrial research activities on sub-mW connectivity solutions have resulted in two distinct classes of transceivers. The first class comprises low-data-rate (<100 kb/s), µW-level, mid-range radios, which typically require a custom design utilizing a simple modulation scheme, e.g., on-off keying (OOK) or binary phase-shift keying (BPSK), and very few timing constraints, e.g., non-coherent detection [5]-[8]. These ULP radios are commonly used in smart biomedical connectivity through smart body area networks (BAN) [10] and in autonomous wireless sensor networks (WSNs) [12]. The second class includes higher-data-rate radios that use more sophisticated modulation schemes, allowing for operation over a longer distance with reasonable power consumption (<10 mW) [2]-[4]. While most of these systems are designed to be compatible with the BLE standard, recently adopted protocols such as IEEE 802.11ah allow for further power saving without sacrificing the data-rate by switching to lower carrier frequencies below 1 GHz.

To better observe the power/performance tradeoff of a receiver, it is insightful to define the following figure of merit (FoM), with the tacit assumption of maintaining the same bit-error-rate (BER) between the radios: FoMRX= | RXSensitivity − 10log(ϻлѻÐÒFÓÔÕÖ ) + 10log(Õ×ØÒÑF ºÙ ) | (1).

Figure 6 - 1 shows FoMRX versus power consumption of various published ULP radios since 2005 [1]. For a given communication channel bandwidth, the data-rate increases with a spectrally efficient modulation scheme. However, for a spectrally efficient modulation the receiver requires a higher SNR to satisfy a given BER, which increases the RX power consumption needed to meet the more stringent linearity and noise requirements.
This work demonstrates a sub-mW BLE-compatible receiver with a sensitivity of better than −95 dBm at a data-rate of 1 Mb/s while improving its FoMRX (red marker in Figure 6 - 1).

Figure 6 - 1 FoMRX versus power of reported ULP radios since 2005 [1]. This work is shown with a red marker. [Plot: FoMRX (dB) versus power consumption (µW, logarithmic axis); legend: OOK, FSK, BFSK, AM, QPSK, BPSK, GFSK, other scheme, this work.]

While Bluetooth 5.0 is emerging as a unified standard for low-power wireless communication among smart objects [13], BLE remains the most popular ULP wireless connectivity solution in industry [2]-[6]. Most BLE transceivers sacrifice performance to conserve power. However, by incorporating innovative low-power circuit design techniques and the right choice of radio architecture, it is possible to drastically reduce the power consumption while delivering competitive performance [6,10].

Several low-power circuit design techniques have been explored by researchers to realize low-power CMOS radios [14]-[15]. Conventional designs typically take advantage of gain and bandwidth enhancement techniques to produce a low-power transceiver [16-18]. Despite delivering excellent performance in the sub-10 GHz range, these techniques are usually inefficient for µW-level systems. Block stacking is noted as a solution that fundamentally reduces the power consumption by re-using the current from one block in another. However, the limited voltage headroom complicates the deployment of this technique in CMOS radios, since the devices are forced to operate in subthreshold. Improved mobility and reduced device parasitics in advanced CMOS nodes have made subthreshold operation useful for multi-GHz operation [14].

The remainder of this chapter is organized as follows. Section II overviews the state-of-the-art ULP RX solutions and introduces the proposed architecture.
Section III elaborates on the design of the RF front-end (RFFE) including the low-noise amplifier (LNA), the single-ended-to-differential converter (S2D), and the mixer. Section IV presents the design and implementation of the compact ultra-low-power integer-N phase-locked loop (PLL). Section V discusses the implementation of the baseband filter and the DC-offset cancellation loop. Implementation details, receiver measurement results, and concluding remarks are provided in Section VI and Section VII, respectively.

6.2. PROPOSED RX ARCHITECTURE

Circuits employing current re-use and subthreshold techniques not only suffer from performance degradation over PVT variations but are also sensitive to the choice of receiver architecture. Receiver architectures such as discrete-time (DT) low-IF and continuous-time (CT) sliding IF (SIF) have been used for low-power BLE radios (Figure 6 - 2 (a) and (b)) [1]-[6]. The low-IF and SIF topologies suffer from a problematic image that must be rejected by power-hungry multi-stage active filters and downconverters (local oscillators and mixers). This results in high power consumption for receivers which target sensitivities better than −95 dBm. There may also be a need to use high-Q off-chip (or on-chip) filters to further reject the image. Furthermore, multiple LO frequencies and associated spurs could desensitize the RX through reciprocal mixing [19, 20]. In addition, using off-chip components to match the first stage, i.e., the LNA, to 50 Ω increases the cost and degrades the NF [12]. Using a direct-conversion architecture (DCR), on the other hand, eliminates the image problem, potentially obviating the need for high-Q filtering with minimal power overhead.
Figure 6 - 2 Simplified block diagram of (a) sliding IF architecture with two LO stages and image rejection filter, (b) low-IF architecture with image rejection filter, (c) LNA-less passive architecture, and (d) the proposed ultra-low power direct conversion receiver. [Block labels in (d): PP-LNA, S2D, stacked TIA and IF amplifiers, I and Q automatic offset calibrators, ADC and BB processor, 4.8 GHz PLL with 48 MHz XO, RX gain and mode control (gain = 18 dB to 31 dB), master bias, serial interface.]

Another attractive low-power approach is to use an all-passive architecture. In CMOS front-ends, designers have attempted to replace the active LNA with a passive voltage booster, such as an inductive-peaking network or a transformer, to minimize the power consumption of the RFFE [10] (Figure 6 - 2 (c)). Although using a passive LNA improves the linearity and brings the power consumption of the RFFE near zero, its noise contribution to the RX can be prohibitively high. Moreover, it requires large devices in the mixer and baseband amplifier to minimize the noise, which in turn pushes the power consumption of the LO generator toward the milliwatt range [10]. A compromise solution involves using LNAs operating in subthreshold to save power with minimal impact on the NF of the system.

This work presents a low-power fully differential direct-conversion receiver, employing current-reuse and subthreshold design techniques, with an integer-N PLL (Figure 6 - 2 (d)). The PLL uses a low-power double-frequency VCO that eliminates the intermediate down-conversion steps and the associated circuitry, thus pushing the RX power below 1 mW. Compared to other RX architectures, the DCR has minimal issues with reciprocal mixing of the LO harmonics and spurs.
However, fully differential operation is needed to mitigate the DC-offset and I/Q mismatch issues [6]-[7].

6.3. RF FRONT-END

The RFFE consists of a combined LNA-S2D followed by a downconverter (Figure 6 - 2(d)). Having a power-efficient and high-gain LNA is pivotal in realizing a low-power and highly sensitive RX. Unlike in conventional designs, downconverters and baseband amplifiers in ULP radios contribute significantly to the overall NF of the system. Having a high-gain LNA helps suppress the noise contribution from these blocks. Increasing the gain usually comes at the cost of power. Moreover, the need to support differential operation further constrains the power consumption. A combination of a high-gain LNA and a stacked active S2D with inductive peaking goes a long way toward addressing this concern with minimal power penalty. Assisted by µW-level feedback loops in the biasing network, the design exhibits robustness to PVT variations.

6.3.1. LNA and Active Balun

A simplified schematic of the proposed LNA is shown in Figure 6 - 3. The LNA uses a subthreshold current-reuse push-pull structure (PP-LNA) with a single-inductor input matching network. The effective transconductance of the LNA can be written as gmp+gmn and hence, compared to an NMOS-only design, it achieves similar gain with lower power [14]. The push-pull design also improves the linearity of the LNA [21]. The conventional biasing technique for the PP-LNA uses a resistive feedback, namely RF, to set the output DC voltage of the LNA to ~VDD/2 (Figure 6 - 3). Although using RF simplifies the design, due to the insufficient voltage headroom available in the stacked topology, the output DC voltage varies across PVT, which adversely impacts the gain and input matching (S11) of the subthreshold-biased LNA.
The proposed dynamic-biased PP-LNA (DBPP-LNA), however, resolves the PVT variation issue by using a dynamically biased feedback loop for transistors M2 and M1, which forces the output DC level of the LNA to an adjustable voltage, Vref, generated from the bandgap voltage (Figure 6 - 3). The DC current of the LNA is then fixed by the NMOS device (M1) relative to a bandgap current (Iref). The gate bias of M1 is generated by passing Iref through a replica of M1, namely M1-R, whose drain voltage tracks Vref using a similar loop.

Figure 6 - 3 Conventional PP-LNA and proposed DBPP-LNA and S2D. [Schematic labels: conventional PP-LNA with feedback resistor RF; proposed DBPP-LNA with M1 replica current source, drain DC-fix loop, cap bank and gain control; proposed S2D (M3 to M6, LD); histogram of the S2D phase difference: 181.3°, SD 0.43°.]

Figure 6 - 4 Simplistic model of proposed LNA. [Schematic labels: DBPP-LNA with sliced input device M1,1 to M1,3 and gain-control switches G1 to G3, replica M1-R, dynamic bias (VREF, IREF), cap bank, current-bank compensator, 20 pF CDC-Filter, LMatch, CT1, CT2, CPad; S2D with M3 to M6, LD, VGS5 = −VGS6, outputs OUT and OUTB.]

To perform single-ended-to-differential conversion while increasing the overall gain of the RFFE, an active S2D converter consisting of two input common-source (CS) buffers followed by a gm-boosted active balun is used (Figure 6 - 3). The CS stages boost the LNA signal and provide a differential VGS for M5 and M6 (VGS5 = −VGS6), resulting in a very small phase imbalance with a standard deviation of 0.43° in the presence of device mismatches (Figure 6 - 3). The amplitude mismatch is negligible due to the high gain of the active balun. The inductor LD resonates out the input capacitance of the mixer near 2.4 GHz, allowing the current to flow through the mixer. To save power, the current of the S2D converter is reused in the LNA by stacking the two stages on top of each other (Figure 6 - 4).
A 20 pF decoupling capacitor (CDC-Filter) provides a high-frequency AC short for both the LNA supply and the S2D ground.

Figure 6 - 5 Simplistic model of proposed LNA and S2D. [Model labels: RS, LM, RM, CPad, CT1, CT2, Cd, Cgd, CL, Gm12·vgs, ro, Zin for the LNA; gm3·vgs3, 2gm5·vgs5, 1/(2gm5), LD, RD, Cmixer for the S2D.]

Table 6 - 1 LNA Performance over PVT @ 2.4GHz
Corner | Parameter | DBPP-LNA 1.05V | DBPP-LNA 1V | PP-LNA 1V | PP-LNA 1.05V
TT | Gain (dB) | 32.3 | 32.3 | 31.2 | 31.5
TT | NF (dB) | 2.8 | 2.8 | 2.8 | 2.7
TT | S11 (dB) | -28.6 | -28.6 | -9.9 | -11.6
FF | Gain (dB) | 32.8 | 32.8 | 32 | 32.2
FF | NF (dB) | 2.74 | 2.74 | 2.8 | 2.75
FF | S11 (dB) | -27.4 | -27.4 | -11.5 | -11.3
SS | Gain (dB) | 29.1 | 29.4 | 23.4 | 24.2
SS | NF (dB) | 2.95 | 2.95 | 4.3 | 4.2
SS | S11 (dB) | -14 | -18 | -6.4 | -6.5

Using a simplistic model for the LNA (Figure 6 - 5), the gain and noise factor (F) near the matching frequency (i.e., close to the resonance frequency of the input LC-tank) can be expressed as:

$$A_v \approx Q_{in}\, G_{m12}\, g_{m3} \left(\frac{L_D}{C_L}\right) \qquad (2)$$

$$F \approx 1 + \frac{R_M}{R_S} + \frac{\gamma_{m12}}{2 R_S\,\alpha\,G_{m12}\,Q_{in}} + \frac{\gamma_{m3}}{g_{m3}\,\alpha\,A_v} \qquad (3)$$

where Qin is the quality factor of the input LC-tank, Gm12 is the effective transconductance of M1 and M2 (e.g., gm1+gm2), γ is the excess noise factor, and α is the channel conduction coefficient. Assuming the drain-source resistances (ro) are comparably large, the input impedance can be estimated as $Z_{in}(s) = Z_{in,Real} - j\omega\, Z_{in,Imag}$, where:

$$Z_{in,Real} \approx R_{in}\left(\frac{1}{\left(R_{in}(C_d + C_{T1})\right)^{2}\omega^{2} + 1}\right) \approx \begin{cases} R_{in}, & \omega \ll \omega_{3dB} \\ \dfrac{R_{in}}{\left(\omega/\omega_{3dB}\right)^{2} + 1}, & \text{else} \end{cases} \qquad (4)$$

$$Z_{in,Imag} \approx \frac{R_{in}^{2}\,(C_d + C_{T1})}{\left(R_{in}(C_d + C_{T1})\right)^{2}\omega^{2} + 1} = \frac{R_{in}/\omega_{3dB}}{\left(\omega/\omega_{3dB}\right)^{2} + 1} \qquad (5)$$

where $R_{in} = \frac{C_L}{G_{m12}\,C_d}$ and $\omega_{3dB} = \frac{1}{R_{in}(C_d + C_{T1})}$. Considering the matching inductor, LM, the real part seen by the antenna is Zin,Real + RM, and Zin,Imag can be tuned out by LM. If the gate-drain impedance is negligible and the feedback is weak (e.g., the input transistor is small), Zin,Real is small and input matching can be done by adjusting the frequency-dependent resistance of the input inductor (namely, $R_M \approx \omega L_{in}/Q_{in}$). Qin should be chosen such that RS = RLin. To have an ideal 50-Ω match at 2.4 GHz for a 15 nH inductor (used in the proposed LNA), the metal width (or layer resistivity) should be chosen to have Qin ≈ 4.5.
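The quoted value of Qin ≈ 4.5 follows directly from the series-loss relation of an inductor, R = ωL/Q. The short sketch below reproduces it; the function names are hypothetical, and the inductor is treated as an ideal inductance in series with a single loss resistance.

```python
import math


def inductor_q_for_series_resistance(f_hz, l_h, r_series_ohm):
    """Quality factor at which an inductor's series loss equals r_series_ohm: Q = w*L / R."""
    return 2.0 * math.pi * f_hz * l_h / r_series_ohm


def series_resistance_for_q(f_hz, l_h, q):
    """Series loss resistance of an inductor with quality factor q: R = w*L / Q."""
    return 2.0 * math.pi * f_hz * l_h / q


# 15 nH matching inductor at 2.4 GHz, loss resistance equal to a 50-ohm source.
q_in = inductor_q_for_series_resistance(2.4e9, 15e-9, 50.0)
print(f"Qin for a 50-ohm series loss: {q_in:.2f}")
print(f"series loss at Q = 4.5: {series_resistance_for_q(2.4e9, 15e-9, 4.5):.1f} ohm")
```

The reactance of 15 nH at 2.4 GHz is about 226 Ω, so a 50 Ω series loss corresponds to a quality factor of roughly 4.5. Since the series loss scales as 1/Q, any increase in Q reduces the share of the real impedance that the inductor itself provides.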
Although this method of matching is feasible and simple, a low Qin negatively impacts gain and NF. In addition, if the antenna impedance has a different temperature coefficient than the inductor's metal layer, the matching degrades over temperature. Hence, in this work, Qin is maximized, with the inductive loss providing only half of the required real impedance (RLin ≈ RS/2 ≈ 25 Ω). The remaining half is provided by the transistor's feedback impedance, Zin,Real. As shown in Figure 6 - 4, the LNA gain is controlled by splitting transistor M1 into slices and switching them in or out, which effectively changes the transconductance of the input device. To keep the current of the stacked S2D stage constant in all gain modes, a current compensator is added. Since adding or removing slices of M1 changes the effective capacitive loading, CL and CT2 are tuned accordingly to maintain proper matching at the frequency of operation. Table 6 - 1 compares the post-layout PVT simulation results of the DBPP-LNA with the conventional resistive-feedback PP-LNA. As can be seen from the table, while the conventional subthreshold PP-LNA fails to maintain its performance in some corners, the proposed stacked design delivers robust performance across PVT.

6.3.2. Mixer

The RX uses a double-balanced current-driven passive I/Q mixer with 50% duty cycle. The down-converted I/Q current is absorbed by low-input-impedance TIAs (the half circuit is shown in Figure 6 - 6). The DC current of the mixer switches and their DC gate-drain voltage VGD are zero, and hence their flicker noise contribution is minimized [22]. The LO for the I/Q mixer is provided by halving the 4.8 GHz PLL clock (namely, LOPLL in Figure 6 - 6) using a current-mode logic (CML) frequency divider. At 2.4 GHz, every 1 fF of gate parasitic capacitance of the mixer translates into about 2.4 µW of dynamic power from a 1-V supply.
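The 2.4 µW per femtofarad figure is consistent with the usual C·V²·f estimate of dynamic power. A minimal sketch, assuming each gate capacitance is charged and discharged once per LO cycle (the function name and the `n_switches` parameter are illustrative):

```python
def dynamic_power_w(cap_f, vdd_v, freq_hz, n_switches=1):
    """Dynamic switching power, P = n * C * V^2 * f."""
    return n_switches * cap_f * vdd_v ** 2 * freq_hz

# 1 fF of gate parasitic switched at 2.4 GHz from a 1-V supply
per_ff = dynamic_power_w(1e-15, 1.0, 2.4e9)
print(f"{per_ff * 1e6:.1f} uW per fF per switch")
```

Because the estimate is linear in capacitance and in the number of switches, every femtofarad removed from the LO routing saves the same amount of power on each switch it drives.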
Given that 8 switches are used for the I/Q mixers, the estimated dynamic power is 19.2 µW/fF. To minimize interconnect parasitic capacitance, the CML buffers are interdigitated with the mixer switches. This results in a 20% reduction in wiring parasitics. The downconverter consumes about 230 µW.

Figure 6 - 6 Half circuit of proposed passive mixer and CML divider (annotation: Zin-TIA ≈ 1/(RTIA×A))

6.4. PROPOSED LOW POWER AND COMPACT PLL

6.4.1. VCO and Dividers

An integer-N PLL operating at twice the carrier frequency is used for LO generation (Figure 6 - 7(a)). The PLL operates at 2× the carrier frequency to minimize pulling through the inductor used in the input matching network of the LNA and to alleviate the LO self-mixing issue in DCR systems [19]. A 4.8 GHz CMOS Class-C VCO with variable PMOS gate bias is implemented and followed by a chain of dividers and buffers to drive the downconverter (Figure 6 - 7(b) and (c)). By adjusting PBias, the conduction angle of the VCO Gm-core can be controlled and optimized for maximum power efficiency [23]. To minimize the VCO power consumption, Qtank × Ltank, that is, the product of the VCO inductance and its Q, is maximized. The VCO tuning range is from 4.55 to 5.15 GHz. The PLL consumes 280 µW, of which 150 µW is used in the VCO. The output frequency of the VCO is divided by two using a quadrature CML divider with resistive load (Figure 6 - 7(c)). The VCO design is optimized to achieve low phase noise at an offset frequency of 3 MHz to alleviate reciprocal mixing of the interferers in adjacent channels. To reduce the silicon area, the PLL building blocks are physically placed inside the 3-turn VCO inductor. This approach reduces the Q by ~6% but saves 25% of the PLL area and 8% of the total chip area (Figure 6 - 7(a)).
Figure 6 - 7 Diagram of (a) PLL, (b) VCO, (c) CML divider, and (d) shielded CPWs used for the clock path

6.4.2. In-Loop Circuitry Isolation and Loop Dynamics

The frequency generation unit in this prototype uses an integer-N PLL architecture. The integrated PLL core inside the inductor operates at frequencies significantly lower than half of the VCO frequency. To reduce EM disturbance of the circuitry inside the inductor, the 1/N divider, phase-frequency detector (PFD), and charge pump (CP) are implemented using the metal-1 to metal-3 layers, and a stripped metal-4 layer is used to shield the PLL's active core from the inductor loop. Moreover, the clock paths between the CML divider, the 1/N divider, and the PFD are isolated using shielded coplanar waveguides (CPWs) (Figure 6 - 7(d)).

6.5. LOW-POWER BASEBAND AMPLIFIER

The down-converted current is absorbed by a current-driven baseband filter followed by a voltage amplifier (Figure 6 - 8). To maximize the gain, a differential regulated-cascode (RGC) transimpedance amplifier (TIA) is employed as the input stage (Figure 6 - 8). Using an active load (M3), the input impedance is independent of the bias current, Itail, and can be written as:

$$Z_{in,TIA} \approx \frac{1}{(r_{o1} \,\|\, r_{o3}) \times g_{m1} \cdot g_{m2}} \propto (\lambda_n + \lambda_p) \cdot (\ldots) \tag{6}$$

where gm, ro, and λ are the transconductance, output impedance, and channel-length modulation of the transistors. Following the TIA, the second stage is a programmable differential voltage amplifier which employs a tunable shunt-shunt feedback with 12 dB of tuning range. The baseband is capable of producing over 40 dBΩ of gain when interfaced with the mixer.
The gain of the baseband section can be written as:

$$G_{BB} \approx R_L \cdot \frac{R_2}{R_1} \tag{7}$$

Figure 6 - 8 Low-power baseband amplifier and automatic offset calibrator

A common-mode feedback loop is included within each differential stage. The input DC offset of the baseband amplifier is a potential concern for the system performance, as the DC signal experiences a large gain before reaching the output. To address this issue, an analog DC offset cancellation feedback loop is used in which a sense amplifier balances the output DC levels by injecting a current into the input of the TIA (Figure 6 - 8).

6.6. MEASUREMENT RESULTS

The ULP BLE receiver is fabricated in a 40-nm LP CMOS process and occupies 0.7 mm² (Figure 6 - 9). A photograph of the test printed circuit board (PCB) is also shown in Figure 6 - 9. A standalone LNA-S2D structure is also fabricated on a separate chip to better characterize the effectiveness of the stacked design (Figure 6 - 9).
Figure 6 - 9 Chip micrograph of proposed RX, DBPP-LNA, and compact PLL

Figure 6 - 10 DBPP-LNA measurement results at room temperature

Figure 6 - 11 Measured PLL PN in (a) free running unlocked mode and (b) PLL mode

Figure 6 - 12 Measured Vcontrol showing the PLL settling time

Figure 6 - 13 Measured NF, input S11, RX gain, and IIP3 (annotations: 10 kHz to 1 MHz integrated NF = 5.2 dB; P1dB = −27.2 dBm; IIP3 = −19.7 dBm)

Figure 6 - 14 Measured blocker performance of proposed RX

6.6.1. Standalone LNA + S2D

The LNA-S2D test structure occupies 470 µm × 370 µm and is characterized using RF probes and a vector network analyzer (VNA). The proposed DBPP-LNA design delivers over 30 dB of gain with an NF of less than 3.2 dB and S11 better than −15 dB at 2.4 GHz while consuming 400 µW from a 1 V supply (Figure 6 - 10).

6.6.2. PLL

The PLL is characterized independently using a separate transmitter (TX) chip which uses an identical PLL for frequency generation. The open-loop phase noise of the PLL is shown in Figure 6 - 11a. The VCO achieves a spot phase noise of −108.6 dBc/Hz at 1 MHz offset, implying a FoM of −184.4 dBc/Hz [18]. When the loop is closed, the PLL achieves an integrated phase noise (1 kHz to 1 MHz) of 0.83° and a spot phase noise of −119.9 dBc/Hz at 3 MHz offset (Figure 6 - 11b). The measured locking time of the PLL is less than 12 µs (Figure 6 - 12).

6.6.3. RX

Figure 6 - 13 shows the measured S11, NF, and linearity performance of the RX when consuming 980 µA from a 1 V supply.
At 980 µW, the receiver achieves an NF of 5.2 dB integrated from 10 kHz to 1 MHz (Figure 6 - 13a) and a sensitivity of −95.8 dBm. The RX shows an in-band S11 < −15 dB, indicating a good match to 50 Ω (Figure 6 - 13b). The gain can be tuned over 25 dB, from 72 dB down to 47 dB, by adjusting the gain of the RFFE and the baseband. At the gain setting of 47 dB, the system P1dB and IIP3 are −27.2 dBm and −19.7 dBm, respectively (Figure 6 - 13c and 13d). Figure 6 - 14 shows the measured blocker performance of the RX at the same gain setting, plotted against the BLE compliance mask. Owing to the high selectivity of the on-chip matching network of the LNA and the narrowband nature of the LC-tank load of the S2D, any interferer is significantly attenuated before reaching the high-gain baseband section. The integration of the PLL loop filter inside the VCO inductor has reduced the total chip area to 0.7 mm². The BLE receiver performance is compared with that of state-of-the-art BLE radios in Table 6 - 2. The power breakdown for the RX is shown in Figure 6 - 15. The receiver achieves a figure of merit [3] with an absolute value of 95.9 dB which, to the best of the authors' knowledge, is the highest reported |FoM| among BLE receivers published in the literature. The design also offers lower NF with comparable linearity and smaller chip area than advanced BLE radios.
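The quoted |FoM| can be reproduced from the sensitivity and the DC power. The sketch below assumes the receiver FoM is defined as |sensitivity in dBm + 10·log10(PDC / 1 mW)|, which is consistent with the FoM values listed in Table 6 - 2; the function name is illustrative, and the high-performance-mode inputs are taken from that table.

```python
import math

def rx_fom_db(sensitivity_dbm, p_dc_mw):
    """Assumed receiver FoM: |sensitivity (dBm) + 10*log10(P_DC / 1 mW)|."""
    return abs(sensitivity_dbm + 10 * math.log10(p_dc_mw))

low_power = rx_fom_db(-95.8, 0.98)   # low-power mode
high_perf = rx_fom_db(-98.2, 1.48)   # high-performance mode
print(f"Low-power mode:        {low_power:.1f} dB")
print(f"High-performance mode: {high_perf:.1f} dB")
```

With this definition, each halving of the DC power at constant sensitivity improves the FoM by about 3 dB, which is why sub-mW operation dominates the comparison.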
Figure 6 - 15 Power breakdown in low power and high performance modes (chart labels: Mixer Driver 23%, LNA + S2D 41%, VCO + PLL 29%, IF 7%; and Mixer Driver 19%, LNA + S2D 42%, VCO + PLL 30%, IF 9%; total power 1.48 mW in high performance mode and 970 µW in low power mode)

Table 6 - 2 Performance Summary of State-of-the-Art BLE Receivers

Reference                [105]              [106]          [107]          [108]         This Work*         This Work**
Data Rate & Modulation   1-Mbps GFSK        1-Mbps GFSK    1-Mbps GFSK    1-Mbps GFSK   1-Mbps GFSK        1-Mbps GFSK
Architecture             Direct Conversion  TD Sliding IF  TD Sliding IF  DT Low IF     Direct Conversion  Direct Conversion
Integrated NF (dB)       N/A                6.5            N/A            6.5           5.2                4.9
RX Gain (dB)             N/A                N/A            N/A            48            47 to 72           47 to 75
RX Sensitivity (dBm)     −94.5              −94.5          −94            −95           −95.8              −98.2
RX IIP3 (dBm)            N/A                N/A            N/A            −19 @ 48 dB   −19.7 @ 47 dB      −18.3 @ 47 dB
Integrated PN (°)        N/A                N/A            N/A            0.87          0.83               0.79
PN @ 3 MHz (dBc/Hz)      N/A                N/A            N/A            N/A           −119.9             −121.1
VDD (V)                  0.9 to 3.3         1.1            1              1             1                  1.05
PDC (mW)                 11.2               6.3            3.3            2.75          0.98               1.48
Area (mm²)               2.9                1.1            1.3            1.84          0.7 (RX)           0.7 (RX)
FoM, FoMRX (dB)***       84, 144            86.5, 146.5    88.8, 148.8    90.1, 150.1   95.9, 155.9        96.5, 156.9
CMOS                     55 nm              40 nm          40 nm          28 nm         40 nm              40 nm

* Low power mode
** High performance mode
*** FoM = | RX Sensitivity (dBm) + 10·log10(PDC / 1 mW) |, FoMRX = | RX Sensitivity (dBm) + 10·log10(PDC / 1 mW) − 20·log10(Data Rate / 1 kb/s) |

6.7. CONCLUSION

An ultra-low-power, PVT-robust, and compact direct-conversion receiver suitable for BLE applications is presented in 40-nm CMOS. The RFFE uses a novel subthreshold DBPP-LNA stacked with an active-balun S2D, with NF ≈ 3.2 dB, to achieve a high gain at low power and suppress the noise coming from subsequent stages in the RX chain. A compact all-in-inductor PLL operating at twice the carrier frequency is implemented to perform down-conversion while minimizing the VCO leakage to the LNA. The down-converted signal passes through dedicated I/Q current-driven baseband filters, each consisting of a TIA and a voltage amplifier with programmable gain.
The RX achieves NF ≈ 5.2 dB, with IIP3 and P1dB near −19.7 dBm and −27.2 dBm, respectively, at 47 dB gain. The results imply a receiver sensitivity of −95.8 dBm for a 1 MHz GFSK signal, indicating a FoM of −95.9 dB.

HIGH FREQUENCY DC-DC CONVERTER: A 1.3 GHZ FULLY INTEGRATED BUCK CONVERTER

7.1. INTRODUCTION

Dynamic-voltage-scaling (DVS) enables fine-grain power control by providing different voltage levels to drive a wide range of load currents in microprocessors and mobile system-on-chips (SoCs). Considerations such as system-integration cost, ease of routing, and power delivery have driven significant research on fully on-chip voltage regulators. Specifically, inductor-based buck converters remain attractive [19]–[24] due to their relatively higher efficiency and their ability to cover a wide range of voltage/power. However, apart from the efforts to increase the power efficiency, two key challenges still remain in their design: (1) the use of an on-chip inductor (Lfilter) and a large filter capacitor (Cfilter) requires a significant silicon area, and (2) limitations of on-chip inductance and capacitance due to area constraints result in a large output ripple (e.g., 30 to 100 mV in [69], [71]). The magnitude of the output ripple for the buck converter shown in Figure 7 - 1(a) can be approximated by [1]:

$$\text{Ripple} \approx \frac{D(1-D)\,V_{DD}}{8 \cdot L_{filter} \cdot C_{filter} \cdot f_{sw}^{2}} \tag{1}$$

where D is the pulse-width modulation (PWM) duty cycle, VDD is the input supply voltage, and fsw is the switching frequency. Clearly, increasing fsw reduces the ripple. For example, a recent design in 22-nm CMOS achieves a ripple of 10 mV with fsw = 500 MHz [20]. In addition, a higher fsw facilitates denser integration with a smaller Cfilter and Lfilter, which in turn allows for implementation of the fully integrated converter on a single chip. However, increasing fsw also increases the dynamic switching loss in the switch-driver, thereby degrading the overall efficiency.
In addition, a higher fsw increases the simultaneous occurrence of high current and voltage, known as hard-switching, which in turn adds to the switch loss. Zero-voltage-switching (ZVS) is an effective way of alleviating the hard-switching effect [109]. Low ripple can be obtained using a multi-phase switching DC-DC converter as shown in Figure 7 - 1(b). Beyond low ripple, a multi-phase converter can also provide high output power and high efficiency over a span of load currents [21]. However, the use of multiple inductors makes the design unsuitable for a single-chip fully-integrated implementation. Furthermore, multi-phase converters employ multiple complex control logic blocks that consume large power for proper operation. Hysteretic controllers, although simpler to implement, have nonlinear behavior and can in fact result in large ripple [21]. An elegant way to increase efficiency and reduce dynamic loss is to use switch-scaling and pulse frequency modulation (PFM) techniques. Under a low load current condition, where the switch losses (dynamic loss) are dominant and significantly impact the efficiency, switch and frequency scaling effectively reduce the number of switches and the associated dynamic loss [19]. However, the PFM technique substantially adds to the design complexity, requires an additional control loop, and may result in an increased output ripple.

To address the abovementioned issues, in this work, we present a loss-mitigation technique based on resonance that allows for GHz-range fsw while achieving 72.2% efficiency under full-load condition. As a proof-of-concept, a monolithic prototype is implemented in a 0.13-µm CMOS process. Operating at a GHz-range switching frequency, in turn, significantly lowers the output ripple (<5 mV) and facilitates the integration of the converter in an area of 0.46 mm² (including the filter capacitor and inductor). The rest of this chapter is organized as follows.
Section 7.2 elaborates on the proposed loss-mitigation technique, which boosts the efficiency of the converter. Section 7.3 presents the design and implementation of the proposed converter. Measurement results and conclusions are provided in Section 7.4 and Section 7.5, respectively.

Figure 7 - 1 Simplified schematics for (a) a buck converter, and (b) a multiphase ripple canceller

7.2. CONVERTER WITH PARASITIC LOSS MITIGATION

Figure 7 - 2 shows the block diagram of the implemented converter, where all the components including the switch-driver, the filter inductor (Lfilter), and the capacitor (Cfilter) are integrated on the same die. In a buck converter, the efficiency can be written as:

$$\eta = \frac{P_{Load}}{P_{Load} + P_{Loss}} \approx \frac{P_{Load}}{P_{Load} + P_{SW\text{-}Loss} + P_{ind\text{-}Loss} + P_{other\text{-}Loss}} \tag{2}$$

where PLoad is the power delivered to the load, PSW-Loss is the loss in the switches (either dynamic or static), Pind-Loss is the power loss in the inductor due to its resistive loss, which depends on the quality factor of the inductor, and Pother-Loss accounts for losses in the controller and package-related leakage, which are comparably negligible [19], [110]. The dynamic switching loss power (DSLP) has two main components, DSLPIN and DSLPOUT, due to the loss in the input (Cinput) and output (Cpar) parasitic capacitance of the switch-driver stage, respectively. For converters that must support large output powers, e.g., in excess of 100 mW, large switch-driver transistors are required, leading to a significant Cpar. Therefore, the second loss term drastically degrades the power efficiency of the converter. In conventional designs, Lfilter is usually dictated by the minimum inductance required for continuous conduction and by area constraints for monolithic implementation. In most conventional mathematical modeling and optimizations of buck converters, the dynamic loss in Cpar is modeled as Cpar·fsw·VDD² and is independent of Lfilter [19], [110].
Although correct at low switching frequencies, a better approximation of this loss at higher switching frequencies is Lfilter dependent, and in this work we demonstrate an optimal combination of Lfilter and fsw such that the effect of Cpar is nullified.

Figure 7 - 2 Proposed buck converter with parasitic-loss mitigation.

In this design, the equivalent impedance of Cfilter is rather small (~35 mΩ) over the frequency range of operation (fsw = 1 to 1.5 GHz), and Vout may be considered virtually an AC ground. Thus, the parallel combination of Lfilter and Cpar resonates at:

$$f_r = \frac{1}{2\pi\sqrt{L_{filter} \cdot C_{par}}} \tag{3}$$

Switching at this frequency, i.e., fsw ≈ fr, improves the performance and significantly lowers DSLPOUT. In order to observe and confirm the impact of the aforementioned resonance on the performance of the converter, a simplified model of the converter, shown in Figure 7 - 3, is simulated in Spectre. To better understand the output current behavior, the MOS transistors are replaced with a Verilog model incorporating their series loss component (with a switch on-resistance of ~10 Ω). The parasitic capacitance of the MOS devices is also modeled using an explicit capacitance, Cpar, as shown in the schematic. The simulated current waveforms of Lfilter and Cpar at resonant and non-resonant frequencies are also depicted in Figure 7 - 3. When operating at a non-resonant frequency, the currents of Lfilter and Cpar are dissimilar, with the difference flowing through the switches and resulting in a switching loss. At resonance, however, the currents of Lfilter and Cpar are roughly the same, which minimizes the amount of current going through the switches and thus lowers their associated switching loss. In effect, at resonance Cpar is exclusively charged and discharged through Lfilter. Figure 7 - 4 shows the power consumption in the converter as fsw is increased from fmin, where fmin is the minimum frequency for continuous-current conduction.
The maximum value of fsw is limited by the system clock, above which the power consumption of generating a higher-frequency clock must also be included in the overall efficiency calculations. For a constant Lfilter, duty cycle, and load current, the power delivered to the load (PLOAD) remains constant and DSLPIN increases linearly (DSLPIN ∝ Cinput·fsw·VDD²), whereas DSLPOUT shows a frequency-dependent behavior governed by the parallel resonant tank. In the vicinity of fsw ≈ fr, DSLPOUT reaches its minimum value. DSLPIN can be minimized using the switch-scaling technique [19], and therefore the overall efficiency is optimal when fsw ≈ fr. A direct consequence of a higher fsw is that the output ripple is attenuated. Note that increasing fsw beyond fr is not only inefficient [24] but also needlessly complicates the controller design, especially for fully digital implementations [20]; for analog implementations [22], dominant-pole compensation must be properly analyzed for stability.

Figure 7 - 3 Effect of resonance in lowering switch loss

Figure 7 - 4 Power consumption and efficiency vs. switching frequency.

7.3. IMPLEMENTATION DETAILS

The proposed converter, shown in Figure 7 - 2, with switch-scaling, ZVS, and parasitic-loss mitigation techniques, is designed in a 0.13-μm CMOS process. Figure 7 - 5 shows the simulation results for the efficiency of the converter versus frequency for a 1.2 nH Lfilter and a 3.5 nF Cfilter. This figure confirms that, for the given driver and inductor size, the optimum switching frequency is in the vicinity of the resonance frequency, fr. Based on this figure, an efficiency of 72.2% is obtained by operating at the resonance frequency. The series parasitic resistance of the inductor can significantly impact the efficiency of the converter by introducing ohmic loss at DC. To minimize this loss, a 25 µm line-width inductor, Lfilter = 1.2 nH, is designed with the three top thick-metal layers stacked together.
The inductor is designed with a 3D electromagnetic simulator (Sonnet) and provides a Q of 13.2 at 1.3 GHz and a self-resonance frequency (SRF) of 16.2 GHz. In the proposed converter, two parameters are critical for the filter capacitor: first, it should have a high value at DC to minimize the ripple, and second, it should have a higher SRF than the designated converter resonance frequency (e.g., 1.3 GHz in the proposed design) while reaching a low impedance at this frequency to provide an ideal virtual ground node for the inductor. To achieve these two goals, a high-density filter capacitor, Cfilter = 3.5 nF, is implemented using a parallel combination of a MOS-cap, metal capacitors made by M1-M3 and M2-M4, and a high-density dual-MIM-cap, all stacked vertically as shown in Figure 7 - 6. The 3.5 nF Cfilter occupies an area of 0.16 mm² and shows an SRF of 1.95 GHz with a quality factor of 2.1 at 1.3 GHz. At the desired switching frequency of 1.3 GHz, Cfilter shows an impedance of 80 mΩ.

Figure 7 - 5 Simulated converter efficiency vs. switching frequency for load current of 100mA and with different Q factors (L = 1.2 nH, inductor Q from 1.8 to 20)

As shown in Figure 7 - 2, to control and set the GHz-range PWM of the switching signal, a 4-bit controllable delay is implemented for the input 300-to-1200 MHz clock, generated by an on-chip ring voltage-controlled oscillator (VCO). To control the duty cycle, this delayed signal is then XORed with the original clock to generate a 600-to-2400 MHz controllable PWM, with the XOR also acting as a frequency doubler. The PWM signal is then passed through a chain of buffers with a fan-out of two and drives the switch-scaled load drivers. The driver uses minimum-length regular-Vth devices, and its width can be scaled from 0.8 mm to 2.4 mm (resulting in an overall drain parasitic capacitance of 11.4 pF to 12.6 pF).

Figure 7 - 6 High-density custom capacitance.

7.4. MEASUREMENT RESULTS

A proof-of-concept converter is fabricated in a 0.13-μm CMOS process. The chip micrograph is shown in Figure 7 - 7. The chip occupies 0.46 mm² and 0.74 mm², without and with pads, respectively. Figure 7 - 8 shows the measured overall efficiency of the converter versus fsw, confirming that for the given driver and inductor size, the optimum fsw is in the vicinity of the tank resonance frequency, fr. The measured power efficiency is plotted versus PLOAD in Figure 7 - 9. Note that, because of the switch-scaling technique, the output parasitic capacitance of the driver depends slightly on the number of active drivers, and fsw should be tuned accordingly (fr changes from 1.28 GHz to 1.35 GHz). In this work, the VCO frequency is manually controlled to set fsw ≈ fr for the different driver scales. The output voltage is tuned from 0.55 V to 1 V (for an input voltage of 1.2 V) by manually adjusting the PWM through the 4 bits of the delay controller. The measured output ripple and spur at fsw = 1.5 GHz under 100 mA of load current and an output voltage of 0.98 V are shown in Figure 7 - 10. The performance summary and comparison with state-of-the-art integrated CMOS converters are provided in Table 7 - 1. Each design in Table 7 - 1, including this work, assumes the switching frequency is provided by the system clock. Accordingly, the reported efficiency of this work includes the power consumed in the clock buffers, but not in the VCO core (≈18 mW). The proposed converter compares favorably with the state-of-the-art designs in terms of peak efficiency while achieving lower ripple and smaller area.

Figure 7 - 7 Chip micrograph.

Figure 7 - 8 Measured converter efficiency vs. switching frequency.

Figure 7 - 9 Measured converter efficiency vs. load power.

Figure 7 - 10 Measured switching spur and ripple with fSW = 1.5 GHz.
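The design values quoted in this chapter can be cross-checked against the ideal resonance and ripple expressions. The sketch below assumes f_r = 1/(2π√(Lfilter·Cpar)) and ripple ≈ D(1−D)·VDD/(8·Lfilter·Cfilter·fsw²), takes D ≈ Vout/VDD for an ideal buck stage, and ignores layout parasitics and capacitor series resistance; the function names are illustrative.

```python
import math

def resonance_hz(l_filter_h, c_par_f):
    """Parallel LC resonance, f_r = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2 * math.pi * math.sqrt(l_filter_h * c_par_f))

def ripple_v(duty, vdd_v, l_filter_h, c_filter_f, f_sw_hz):
    """Ideal buck output ripple, D*(1-D)*VDD / (8*L*C*f_sw^2)."""
    return duty * (1 - duty) * vdd_v / (8 * l_filter_h * c_filter_f * f_sw_hz ** 2)

L_FILTER = 1.2e-9   # filter inductor (H)
C_FILTER = 3.5e-9   # filter capacitor (F)
VDD = 1.2           # input supply (V)

# Drain parasitic range of the switch-scaled driver
for c_par in (11.4e-12, 12.6e-12):
    f_r = resonance_hz(L_FILTER, c_par)
    print(f"C_par = {c_par * 1e12:.1f} pF -> f_r = {f_r / 1e9:.2f} GHz")

duty = 0.98 / VDD   # ideal-buck assumption: D = Vout / VDD
ripple = ripple_v(duty, VDD, L_FILTER, C_FILTER, 1.3e9)
print(f"Ideal ripple at 1.3 GHz: {ripple * 1e3:.1f} mV")
```

The computed resonance range lands within a few percent of the measured 1.28 GHz to 1.35 GHz tuning of fr, and the ideal ripple is a few millivolts, the same order as the measured value; the ideal formula is expected to be optimistic because it neglects the finite quality factor of Cfilter.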
Table 7 - 1 Performance Summary and Comparison

Reference              [19] JSSC 2011  [20] VLSI 2014  [22] JSSC 2008  [23] TPE 2011  [24] ISSCC 2007  This Work
fSW (GHz)              0.2 – 0.3       0.5             0.170           0.225          3                1 – 1.5
Ripple (mV)            30 – 40         10              40              110            25               4 – 5
Peak Efficiency* (%)   70 – 75         68              77.9            58             50               72.2
Max. PLOAD (mW)        266             250             315             800            100              200
Input Voltage (V)      1.2             1.5             1.2             2.6            1                1.2 – 1.5
Output Range (V)       0.3 – 0.88      0.7 – 1.2       0.9             2 – 2.6        0.5 – 0.7        0.55 – 1
Lfilter (nH)           2               1.5             2 × 2.0         4 × 3.9        0.32             1.2
Cfilter (nF)           5               10              5.2             5.2            0.35             3.5
Area (mm²)             1.59            1.5             1.5             3.76           0.27             0.456
Process (nm)           130             22              130             130            90               130

* Excluding the power consumed in generating the system clock

7.5. CONCLUSION

Given the availability of a high-frequency system clock, a GHz-range, fully integrated DC-DC converter is proposed which utilizes a resonance loss-mitigation technique to improve efficiency. GHz-range operation of the converter lowers the output ripple to as low as 4-5 mV and eases the integration of the LC filter (1.2 nH, 3.5 nF), so that the total converter including the filter is implemented in a comparably small area of 0.456 mm². A proof-of-concept fully integrated buck converter is implemented and measured in a 0.13-μm CMOS process, and the measurement results show that the proposed converter compares favorably with the state-of-the-art.

CONCLUSION

8.1. INTRODUCTION

This thesis was an effort to study and address some of the major bottlenecks in CMOS design for realizing (1) mm-wave and sub-THz to THz systems, (2) ultra-low-power sub-mW wireless connectivity solutions, and (3) ultra-compact fully integrated DC-DC converters. Some of the main achievements are summarized below.

8.2. ACHIEVEMENTS

8.2.1. High tuning-range and FoM mm-wave signal generation on CMOS

As discussed in Chapter 2, due to the poor quality factor of passives in the mm-wave frequency range as well as the vulnerability of mm-wave CMOS VCOs to intrinsic parasitics, realizing a highly efficient signal generator with a high tuning range as well as low phase noise is challenging. In this thesis, we proposed a few techniques to address these bottlenecks. We showed that the proposed signal generation method can simultaneously improve tuning range and PN, which results in a higher FoM than the state-of-the-art designs. The proposed solutions are as follows:

• Self-Mixing-VCO (SMV): We analytically showed that a harmonic-VCO (HVCO) with SMV architecture is an effective method for generating mm-wave signals. Compared to a fundamental-VCO (FVCO) at the same center frequency, the HVCO results in a higher tuning range as well as a superior PN performance. A proof-of-concept prototype showed that at mm-wave frequencies near 60 GHz, the HVCO results in approximately 3 dB better PN as well as ~50% better tuning range.

• Class-C mm-wave VCO: Class-C operation of a VCO in the mm-wave range is demonstrated and shown to result in a better tuning range compared to a class-B VCO. Since the class-C VCO core devices operate in saturation, we analytically proved that the LC-tank sees less parasitic capacitance, resulting in a better tuning range.

8.2.2. Highly efficient, high-power, sub-THz to THz signal generation

In Chapters 3 and 4, we tried to answer two major questions about the generation and extraction of sub-THz to THz signals on CMOS. First, what is the optimum strategy for generating high-power sub-THz to THz harmonics? Next, after generating the desired harmonic, what is the optimum way of extracting the generated harmonic and delivering it to the load (e.g., an antenna)?
The major findings on these questions are:

• Separating the harmonic-extraction stage from the fundamental-frequency generation stage significantly improves system efficiency. In other words, we analytically showed that by optimizing the harmonic-extraction stage and the fundamental-frequency stage separately, stronger harmonics can be extracted, which results in better DC-RF efficiency and PN compared to the state-of-the-art.

• Keeping the routing lengths short in the combiner stage of a sub-THz source is essential for lowering the loss. We showed that the slow-wave coplanar waveguide (SCPW) is a good candidate for miniaturizing passives and results in ~3 dB lower loss compared to a CPW structure.

8.2.3. Sub-mW Radio on CMOS

In Chapters 5 and 6, we tried to answer the question: what is the optimum system architecture for lowering the power consumption of a CMOS radio frontend? We demonstrated that sub-threshold operation as well as current reuse in the major blocks (e.g., LNA, VCO, mixer) is an effective way of lowering power consumption. However, realizing such a system with robustness across PVT corners is still challenging. Next, we showed that this challenge can be mitigated by utilizing µW-level adaptive bias and feedback control loops. As a proof of concept, multiple adaptive control loops have been applied in a sub-threshold current-reused direct-conversion BLE receiver, which passes BLE specifications over PVT corners. The 40-nm BLE prototype achieved the highest reported receiver FoM and ~10x lower power compared to a highly efficient commercial BLE receiver.

8.2.4. Fully Integrated compacted DC-DC Converter with GHz-range switching

In the last chapter of this thesis, we studied how to integrate bulky power-electronics blocks. A GHz-range, fully integrated DC-DC converter is proposed which utilizes a resonance loss-mitigation technique to improve efficiency.
The GHz range operation of the converter significantly lowers the output ripple as well as the size of the required filter inductance and capacitance. It is shown that resonance-loss mitigation technique is an effective way for lowering the dynamic-loss of converter’s switch. This technique can improve the efficiency of converter by ~20%. 144 REFERENCES [1] P. Siegel, “Terahertz technology,” IEEE Trans. Microw. Theory Tech., vol. 50, no. 3, pp. 910–928, 2002. [2] M. Tonouchi, “Cutting-edge terahertz technology,” Nature Photonics, vol. 1, no. 2. pp. 97–105, 2007. [3] E. Seok, D. Shim, C. Mao, and R. Han, “Progress and challenges towards terahertz CMOS integrated circuits,” IEEE J. Solid-State Circuits, vol. 45, no. 8, pp. 1554–1564, 2010. [4] R. Han et al., “A 280-GHz Schottky diode detector in 130-nm digital CMOS,” in IEEE Journal of Solid-State Circuits, 2011, vol. 46, no. 11, pp. 2602–2612. [5] R. Han et al., “280GHz and 860GHz image sensors using Schottky-barrier diodes in 0.13μm digital CMOS,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2012, vol. 55, pp. 254–255. [6] A. Tang, N. Chahat, and E. Decrossas, “CMOS THz communication links for wireless applications: Where do they fit into mobile access and fixed access?,” in International Conference on Infrared, Millimeter, and Terahertz Waves, IRMMW-THz, 2014. [7] M. Seo et al., “>300GHz fixed-frequency and voltage-controlled fundamental oscillators in an InP DHBT process,” in IEEE MTT-S International Microwave Symposium Digest, 2010, pp. 272–275. [8] “ISSCC Trends,” 2016. [Online]. Available: http://isscc.org/isscc-in-the-news/trends/. [9] H. Shimomura, “A Study on High-Frequency Performance in MOSFETs Scaling,” 2011. [10] V. Radisic et al., “Demonstration of a 311 GHz fundamental oscillator using InP HBT technology,” IEEE Trans. Microw. Theory Tech., vol. 55, no. 11, pp. 2329–2335, 2007. [11] V. 
Radisic et al., “A 330-GHz MMIC oscillator module,” in IEEE MTT-S International Microwave Symposium Digest, 2008, pp. 395–398.
[12] Z. Lao, J. Jensen, K. Guinn, and M. Sokolich, “80-GHz differential VCO in InP SHBTs,” in IEEE Microwave and Wireless Components Letters, 2004, vol. 14, no. 9, pp. 407–409.
[13] Q. J. Gu et al., “Generating terahertz signals in 65nm CMOS with negative-resistance resonator boosting and selective harmonic suppression,” in IEEE Symposium on VLSI Circuits, Digest of Technical Papers, 2010, pp. 109–110.
[14] V. Radisic, X. Mei, W. Deal, and W. Yoshida, “Demonstration of sub-millimeter wave fundamental oscillators using 35-nm InP HEMT technology,” IEEE Microw., 2007.
[15] S. T. Nicolson et al., “Design and scaling of W-band SiGe BiCMOS VCOs,” in IEEE Journal of Solid-State Circuits, 2007, vol. 42, no. 9, pp. 1821–1832.
[16] B. Heydari, M. Bohsali, E. Adabi, and A. M. Niknejad, “Millimeter-wave devices and circuit blocks up to 104 GHz in 90 nm CMOS,” in IEEE Journal of Solid-State Circuits, 2007, vol. 42, no. 12, pp. 2893–2903.
[17] O. Momeni and E. Afshari, “High power terahertz and millimeter-wave oscillator design: A systematic approach,” IEEE J. Solid-State Circuits, 2011.
[18] M. Adnan and E. Afshari, “A 247-to-263.5GHz VCO with 2.6mW peak output power and 1.14% DC-to-RF efficiency in 65nm bulk CMOS,” Dig. Tech. Pap. - IEEE Int. Solid-State Circuits Conf., vol. 57, pp. 262–263, 2014.
[19] S. S. Kudva and R. Harjani, “Fully-integrated on-chip DC-DC converter with a 450X output range,” in IEEE Journal of Solid-State Circuits, 2011, vol. 46, no. 8, pp. 1940–1951.
[20] H. K. Krishnamurthy et al., “A 500 MHz, 68% efficient, fully on-die digitally controlled buck Voltage Regulator on 22nm Tri-Gate CMOS,” in IEEE Symposium on VLSI Circuits, Digest of Technical Papers, 2014, pp. 1–2.
[21] S. J. Kim, R. K. Nandwana, Q. Khan, R. Pilawa-Podgurski, and P. K.
Hanumolu, “A 1.8V 30-to-70MHz 87% peak-efficiency 0.32mm2 4-phase time-based buck converter consuming 3μA/MHz quiescent current in 65nm CMOS,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2015, vol. 58, pp. 216–217.
[22] J. Wibben and R. Harjani, “A high efficiency DC-DC converter using 2nH on-chip inductors,” in IEEE Symposium on VLSI Circuits, Digest of Technical Papers, 2007, pp. 22–23.
[23] M. Wens and M. Steyaert, “An 800mW fully-integrated 130nm CMOS DC-DC step-down multi-phase converter, with on-chip spiral inductors and capacitors,” in 2009 IEEE Energy Conversion Congress and Exposition, ECCE 2009, 2009, pp. 3706–3709.
[24] M. Alimadadi, S. Sheikhaei, G. Lemieux, S. Mirabbasi, and P. Palmer, “A 3GHz switching DC-DC converter using clock-tree charge-recycling in 90nm CMOS with integrated output filter,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2007, pp. 532–620.
[25] Z. Zong, M. Babaie, and R. B. Staszewski, “A 60 GHz 25% tuning range frequency generator with implicit divider based on third harmonic extraction with 182 dBc/Hz FoM,” in Digest of Papers - IEEE Radio Frequency Integrated Circuits Symposium, 2015, vol. 2015–Novem, pp. 279–282.
[26] L. Li, P. Reynaert, and M. Steyaert, “A 60-GHz CMOS VCO using capacitance-splitting and gate–drain impedance-balancing techniques,” IEEE Trans., 2011.
[27] B. Catli and M. Hella, “Triple-push operation for combined oscillation/division functionality in millimeter-wave frequency synthesizers,” IEEE J. Solid-State Circuits, 2010.
[28] S. Kang, J. Chien, and A. Niknejad, “A W-band low-noise PLL with a fundamental VCO in SiGe for millimeter-wave applications,” IEEE Trans., 2014.
[29] A. Kral, F. Behbahani, and A. Abidi, “RF-CMOS oscillators with switched tuning,” in Proceedings of the IEEE Custom Integrated Circuits Conference, 1998, pp. 555–558.
[30] F. Pepe, A. Bonfanti, S. Levantino, C. Samori, and A. L.
Lacaita, “Analysis and minimization of flicker noise up-conversion in voltage-biased oscillators,” IEEE Trans. Microw. Theory Tech., vol. 61, no. 6, pp. 2382–2394, Jun. 2013.
[31] H. K. Chen, H. J. Chen, D. C. Chang, Y. Z. Juang, and S. S. Lu, “A 0.6 V, 4.32 mW, 68 GHz low phase-noise VCO with intrinsic-tuned technique in 0.13 μm CMOS,” IEEE Microw. Wirel. Components Lett., vol. 18, no. 7, pp. 467–469, 2008.
[32] C. Hung and R. Gharpurey, “A 57-to-75 GHz dual-mode wide-band reconfigurable oscillator in 65nm CMOS,” Radio Freq. Integr. Circuits, 2014.
[33] R. Molavi, S. Mirabbasi, and H. Djahanshahi, “A low-power technique to boost the output amplitude of multi gigahertz push-push LC VCOs,” Microw. Opt. Technol. Lett., vol. 55, no. 7, pp. 1581–1584, 2013.
[34] Yu-Lung Tang and Huei Wang, “Triple-push oscillator approach: theory and experiments,” IEEE J. Solid-State Circuits, vol. 36, no. 10, pp. 1472–1479, 2001.
[35] M. Danesh, F. Gruson, P. Abele, and H. Schumacher, “Differential VCO and frequency tripler using SiGe HBTs for the 24 GHz ISM band,” in IEEE Radio Frequency Integrated Circuits (RFIC) Symposium, 2003, pp. 277–280.
[36] A. Hajimiri and T. H. Lee, “A general theory of phase noise in electrical oscillators,” IEEE J. Solid State Circuits, vol. 33, no. 2, pp. 179–194, 1998.
[37] C. C. Wang, Z. Chen, and P. Heydari, “W-band silicon-based frequency synthesizers using injection-locked and harmonic triplers,” in IEEE Transactions on Microwave Theory and Techniques, 2012, vol. 60, no. 5, pp. 1307–1320.
[38] W. L. Chan and J. R. Long, “A 56–65 GHz injection-locked frequency tripler with quadrature outputs in 90-nm CMOS,” in IEEE Journal of Solid-State Circuits, 2008, vol. 43, no. 12, pp. 2739–2746.
[39] A. Musa, R. Murakami, T. Sato, W. Chaivipas, K. Okada, and A. Matsuzawa, “A low phase noise quadrature injection locked frequency synthesizer for mm-wave applications,” in IEEE Journal of Solid-State Circuits, 2011, vol. 46, no. 11, pp. 2635–2649.
[40] A.
Shirazi, A. Nikpaik, and R. Molavi, “A Class-C self-mixing-VCO architecture with high tuning-range and low phase-noise for mm-wave applications,” (RFIC), 2015 IEEE, 2015.
[41] T. Dickson et al., “30-100-GHz inductors and transformers for millimeter-wave (Bi) CMOS integrated circuits,” IEEE Trans., 2005.
[42] Y. Cao, R. Groves, and X. Huang, “Frequency-independent equivalent-circuit model for on-chip spiral inductors,” IEEE J. solid, 2003.
[43] E. Hegazi, J. Rael, and A. Abidi, The designer’s guide to high-purity oscillators. 2006.
[44] L. Fanori and P. Andreani, “Highly efficient class-C CMOS VCOs, including a comparison with class-B VCOs,” IEEE J. Solid-State Circuits, 2013.
[45] W. Wu, X. Bai, R. B. Staszewski, and J. R. Long, “A 56.4-to-63.4GHz spurious-free all-digital fractional-N PLL in 65nm CMOS,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2013, vol. 56, pp. 352–353.
[46] T. Xi, S. Guo, P. Gui, J. Zhang, and K. Kenneth, “Low-phase-noise 54GHz quadrature VCO and 76GHz/90GHz VCOs in 65nm CMOS process,” Radio Freq., 2014.
[47] L. Li, P. Reynaert, and M. S. J. Steyaert, “Design and analysis of a 90 nm mm-wave oscillator using inductive-division LC tank,” IEEE J. Solid-State Circuits, vol. 44, no. 7, pp. 1950–1958, Jul. 2009.
[48] D. D. Kim et al., “A 70GHz manufacturable complementary LC-VCO with 6.14GHz tuning range in 65nm SOI CMOS,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2007, pp. 540–620.
[49] Y. Chao and H. C. Luong, “Transformer-based dual-band VCO and ILFD for wide-band mm-Wave LO generation,” in Proceedings of the Custom Integrated Circuits Conference, 2013, pp. 1–4.
[50] H. Jia, B. Chi, L. Kuang, and Z. Wang, “A 47.6-71.0-GHz 65-nm CMOS VCO based on magnetically coupled π-type LC network,” IEEE Trans. Microw. Theory Tech., vol. 63, no. 5, pp. 1645–1657, May 2015.
[51] Y. M. Tousi, O. Momeni, and E. Afshari, “A novel CMOS high-power terahertz VCO based on coupled oscillators: Theory and implementation,” IEEE J. Solid-State Circuits, vol. 47, no. 12, pp. 3032–3042, Dec. 2012.
[52] K. Sengupta and A. Hajimiri, “A 0.28 THz power-generation and beam-steering array in CMOS based on distributed active radiators,” IEEE J. Solid-State Circuits, vol. 47, no. 12, pp. 3013–3031, 2012.
[53] A. H. M. Shirazi et al., “On the Design of mm-Wave Self-Mixing-VCO Architecture for High Tuning-Range and Low Phase Noise,” IEEE J. Solid-State Circuits, vol. 51, no. 5, pp. 1210–1222, May 2016.
[54] Y. Zhang, R. Zhou, W. Rhee, and Z. Wang, “A 1.9-mW 750-kb/s 2.4-GHz F-OOK Transmitter with Symmetric FM Template and High-Point Modulation PLL,” IEEE J. Solid-State Circuits, vol. 52, no. 10, pp. 2627–2635, 2017.
[55] T. Chi, J. Luo, S. Hu, and H. Wang, “A multi-phase sub-harmonic injection locking technique for bandwidth extension in silicon-based THz signal generation,” IEEE J. Solid-State, 2015.
[56] P. Y. Chiang, Z. Wang, O. Momeni, and P. Heydari, “A silicon-based 0.3 THz frequency synthesizer with wide locking range,” IEEE J. Solid-State Circuits, vol. 49, no. 12, pp. 2951–2963, 2014.
[57] A. H. M. Shirazi, A. Nikpaik, R. Molavi, S. Mirabbasi, and S. Shekhar, “A Class-C self-mixing-VCO architecture with high tuning-range and low phase-noise for mm-wave applications,” in Digest of Papers - IEEE Radio Frequency Integrated Circuits Symposium, 2015, vol. 2015–Novem, pp. 107–110.
[58] J. Grzyb, Y. Zhao, and U. R. Pfeiffer, “A 288-GHz lens-integrated balanced triple-push source in a 65-nm CMOS technology,” IEEE J. Solid-State Circuits, vol. 48, no. 7, pp. 1751–1761, 2013.
[59] T. S. D. Cheung, “Shielded passive devices for silicon-based monolithic microwave and millimeter-wave integrated circuits,” in IEEE Journal of Solid-State Circuits, 2006, vol. 41, no. 5, pp. 1183–1200.
[60] Y. M. Tousi, O. Momeni, and E. Afshari, “A novel CMOS high-power terahertz VCO based on coupled oscillators: Theory and implementation,” IEEE J. Solid-State Circuits, vol. 47, no. 12, pp. 3032–3042, Dec. 2012.
[61] D.
Huang et al., “Terahertz CMOS frequency generator using linear superposition technique,” in IEEE Journal of Solid-State Circuits, 2008, vol. 43, no. 12, pp. 2730–2738.
[62] B. Razavi, “A 300-GHz fundamental oscillator in 65-nm CMOS technology,” IEEE J. Solid-State Circuits, vol. 46, no. 4, pp. 894–903, Apr. 2011.
[63] H. Koo, C. Y. Kim, and S. Hong, “Design and analysis of 239 GHz CMOS push-push transformer-based VCO with high efficiency and wide tuning range,” IEEE Trans. Circuits Syst. I Regul. Pap., vol. 62, no. 7, pp. 1883–1893, Jul. 2015.
[64] B. Cetinoneri, Y. A. Atesal, A. Fung, and G. M. Rebeiz, “W-band amplifiers with 6-dB noise figure and milliwatt-level 170-200-GHz doublers in 45-nm CMOS,” IEEE Trans. Microw. Theory Tech., vol. 60, no. 3 PART 2, pp. 692–701, Mar. 2012.
[65] J. Sharma, T. Dinc, and H. Krishnaswamy, “A 134 GHz +4 dBm Frequency Doubler at fmax in 130 nm CMOS,” IEEE Microw. Wirel. Components Lett., vol. 24, no. 11, pp. 784–786, Nov. 2014.
[66] O. Momeni and E. Afshari, “A broadband mm-wave and terahertz traveling-wave frequency multiplier on CMOS,” in IEEE Journal of Solid-State Circuits, 2011, vol. 46, no. 12, pp. 2966–2976.
[67] O. Momeni and E. Afshari, “High power terahertz and millimeter-wave oscillator design: A systematic approach,” IEEE J. Solid-State Circuits, vol. 46, no. 3, pp. 583–597, Mar. 2011.
[68] A. M. Niknejad and H. Hashemi, mm-Wave Silicon Technology 60 GHz and Beyond. 2008.
[69] R. Han and E. Afshari, “A CMOS high-power broadband 260-GHz radiator array for spectroscopy,” IEEE J. Solid-State Circuits, vol. 48, no. 12, pp. 3090–3104, Dec. 2013.
[70] E. Seok et al., “A 410GHz CMOS push-push oscillator with an on-chip patch antenna,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2008, vol. 51, pp. 472–629.
[71] R. Han et al., “A SiGe Terahertz Heterodyne Imaging Transmitter with 3.3 mW Radiated Power and Fully-Integrated Phase-Locked Loop,” IEEE J. Solid-State Circuits, vol. 50, no.
12, pp. 2935–2947, Dec. 2015.
[72] W. Steyaert and P. Reynaert, “A 0.54 THz signal generator in 40 nm bulk CMOS with 22 GHz tuning range and integrated planar antenna,” IEEE J. Solid-State Circuits, vol. 49, no. 7, pp. 1617–1626, Jul. 2014.
[73] P. Y. Chiang, O. Momeni, and P. Heydari, “A 200-GHz Inductively Tuned VCO with -7 dBm Output Power in 130-nm SiGe BiCMOS,” IEEE Trans. Microw. Theory Tech., vol. 61, no. 10, pp. 3666–3673, Oct. 2013.
[74] M. Adnan and E. Afshari, “A 247-to-263.5GHz VCO with 2.6mW peak output power and 1.14% DC-to-RF efficiency in 65nm bulk CMOS,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2014, vol. 57, pp. 262–263.
[75] T. Song, H. S. Oh, E. Yoon, and S. Hong, “A low-power 2.4-GHz current-reused receiver front-end and frequency source for wireless sensor network,” in IEEE Journal of Solid-State Circuits, 2007, vol. 42, no. 5, pp. 1012–1021.
[76] R. Fiorelli, E. J. Peralías, and F. Silveira, “LC-VCO Design Optimization Methodology Based on the gm/Id Ratio for Nanometer CMOS Technologies,” IEEE Trans. Microw. Theory Tech., vol. 59, no. 7, pp. 1822–1831, 2011.
[77] H. Lee and S. Mohammadi, “A 500uW 2.4GHz CMOS subthreshold mixer for ultra low power applications,” in Digest of Papers - IEEE Radio Frequency Integrated Circuits Symposium, 2007, pp. 325–328.
[78] H. Kraimia, T. Taris, J. B. Begueret, and Y. Deval, “A 2.4GHz ultra-low power current-reuse bleeding mixer with resistive feedback,” in 2011 18th IEEE International Conference on Electronics, Circuits, and Systems, ICECS 2011, 2011, pp. 488–491.
[79] T. Taris, H. Rashtian, A. H. Masnadi Shirazi, and S. Mirabbasi, “A low-power 2.4-GHz combined LNA-VCO structure in 0.13-μm CMOS,” in 2013 IEEE 11th International New Circuits and Systems Conference (NEWCAS), 2013, pp. 1–4.
[80] N. M. Pletcher, S. Gambini, and J. M.
Rabaey, “A 2GHz 52uW wake-up receiver with -72dBm sensitivity using uncertain-IF architecture,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2008, vol. 51.
[81] M. Vidojkovic, S. Rampu, K. Imamura, P. Harpe, G. Dolmans, and H. De Groot, “A 500uW 5Mbps ULP super-regenerative RF front-end,” in ESSCIRC 2010 - 36th European Solid State Circuits Conference, 2010, pp. 462–465.
[82] H. Yan et al., “A 120μW fully-integrated BPSK receiver in 90nm CMOS,” in Digest of Papers - IEEE Radio Frequency Integrated Circuits Symposium, 2010, pp. 277–280.
[83] M. A. Abdelghany, R. K. Pokharel, H. Kanaya, and K. Yoshida, “Low-Voltage Low-Power Combined LNA-Single Gate Mixer for 5GHz Wireless Systems,” (RFIC), 2011 IEEE, pp. 4–7, 2011.
[84] T.-P. Wang et al., “A low-power oscillator mixer in 0.18-μm CMOS technology,” IEEE Trans. Microw. Theory Tech., vol. 54, no. 1, pp. 88–95, 2006.
[85] M. Tedeschi, A. Liscidini, and R. Castello, “Low-power quadrature receivers for ZigBee (IEEE 802.15.4) applications,” in IEEE Journal of Solid-State Circuits, 2010, vol. 45, no. 9, pp. 1710–1719.
[86] A. H. Masnadi Shirazi, H. Rashtian, R. Molavi, T. Taris, H. M. Lavasani, and S. Mirabbasi, “On the design of combined LNA-VCO-mixer for low-power and low-voltage CMOS receiver front-ends,” Microelectronics J., vol. 57, pp. 34–47, 2016.
[87] K. W. Cheng, K. Natarajan, and D. J. Allstot, “A current reuse quadrature GPS receiver in 0.13 μm CMOS,” IEEE J. Solid-State Circuits, vol. 45, no. 3, pp. 510–523, 2010.
[88] C. Enz, F. Krummenacher, and E. Vittoz, “An Analytical MOS Transistor Model Valid in All Regions of Operation and Dedicated to Low-Voltage and Low-Current Applications,” Analog integrated circuits and. 1995.
[89] D. M. Binkley, M. Bucher, and D.
Foty, “Design-oriented characterization of CMOS over the continuum of inversion level and channel length,” in Proceedings of the IEEE International Conference on Electronics, Circuits, and Systems, 2000, vol. 1, pp. 161–164.
[90] C. C. Enz, F. Krummenacher, and E. A. Vittoz, “An analytical MOS transistor model valid in all regions of operation and dedicated to low-voltage and low-current applications,” Analog Integr. Circuits Signal Process., vol. 8, no. 1, pp. 83–114, Jul. 1995.
[91] P. Dautriche, “Analog design trends and challenges in 28 and 20nm CMOS technology,” in 2011 Proceedings of the European Solid-State Device Research Conference (ESSDERC), 2011.
[92] R. Molavi, S. Mirabbasi, and M. Hashemi, “A Wideband CMOS LNA Design Approach,” in Proceedings - IEEE International Symposium on Circuits and Systems, 2005, pp. 5107–5110.
[93] K. Schrodinger, J. Stimma, and M. Mauthe, “A fully integrated CMOS receiver front-end for optic Gigabit Ethernet,” IEEE J. Solid-State Circuits, vol. 37, no. 7, pp. 874–880, Jul. 2002.
[94] D. K. Shaeffer and T. H. Lee, “A 1.5-V, 1.5-GHz CMOS Low Noise Amplifier,” IEEE J. Solid-State Circuits, vol. 32, no. 5, pp. 745–759, 1997.
[95] R. Molavi, S. Mirabbasi, and M. Hashemi, “A Wideband CMOS LNA Design Approach,” in Proceedings - IEEE International Symposium on Circuits and Systems, 2005, pp. 5107–5110.
[96] T. Taris, H. Rashtian, and A. Shirazi, “A low-power 2.4-GHz combined LNA–VCO structure in 0.13-µm CMOS,” Analog Integr. Circuits, 2014.
[97] B. Razavi, RF Microelectronics. 2011.
[98] G. Han and E. Sánchez-Sinencio, “CMOS Transconductance Multipliers: A Tutorial,” IEEE Trans. Circuits Syst. II Analog Digit. Signal Process., vol. 45, no. 12, pp. 1550–1563, 1998.
[99] K. H. Liang, H. Y. Chang, and Y. J. Chan, “A 0.5-7.5 GHz ultra low-voltage low-power mixer using bulk-injection method by 0.18-μm CMOS technology,” IEEE Microw. Wirel. Components Lett., vol. 17, no. 7, pp. 531–533, Jul. 2007.
[100] K.
Choi, D. H. Shin, and C. P. Yue, “A 1.2-V, 5.8-mW, ultra-wideband folded mixer in 0.13-μm CMOS,” in Digest of Papers - IEEE Radio Frequency Integrated Circuits Symposium, 2007, pp. 489–492.
[101] A. H. M. Shirazi and S. Mirabbasi, “An ultra-low-voltage ultra-low-power CMOS active mixer,” Analog Integr. Circuits Signal Process., vol. 77, no. 3, pp. 513–528, 2013.
[102] F. Assaderaghi, D. Sinitsky, S. A. Parke, J. Bokor, and P. K. Ko, “Dynamic threshold-voltage MOSFET (DTMOS) for ultra-low voltage VLSI,” IEEE Trans. Electron Devices, vol. 44, no. 3, pp. 414–422, 1997.
[103] M. Tedeschi, A. Liscidini, and R. Castello, “Low-power quadrature receivers for ZigBee (IEEE 802.15.4) applications,” in IEEE Journal of Solid-State Circuits, 2010, vol. 45, no. 9, pp. 1710–1719.
[104] Z. Lin, P. I. Mak, and R. Martins, “A 1.7mW 0.22mm2 2.4GHz ZigBee RX exploiting a current-reuse blixer + hybrid filter topology in 65nm CMOS,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2013, vol. 56, pp. 448–449.
[105] J. Prummel et al., “A 10 mW Bluetooth Low-Energy Transceiver with On-Chip Matching,” IEEE J. Solid-State Circuits, vol. 50, no. 12, pp. 3077–3088, 2015.
[106] T. Sano et al., “A 6.3mW BLE transceiver embedded RX image-rejection filter and TX harmonic-suppression filter reusing on-chip matching network,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2015, vol. 58, pp. 240–241.
[107] Y. H. Liu et al., “A 3.7mW-RX 4.4mW-TX fully integrated Bluetooth Low-Energy/IEEE802.15.4/proprietary SoC with an ADPLL-based fast frequency offset compensation in 40nm CMOS,” in Digest of Technical Papers - IEEE International Solid-State Circuits Conference, 2015, vol. 58, pp. 236–237.
[108] F. Kuo, S. Ferreira, M. Babaie, and R.
Chen, “A Bluetooth low-energy (BLE) transceiver with TX/RX switchable on-chip matching network, 2.75 mW high-IF discrete-time receiver, and 3.6 mW all-digital transmitter,” VLSI Circuits (VLSI-, 2016.
[109] M. Schlecht and L. Casey, “Comparison of the square-wave and quasi-resonant topologies,” Appl. Power Electron. Conf., 1987.
[110] S. Musunuri and P. L. Chapman, “Optimization of CMOS transistors for low power DC-DC converters,” in PESC Record - IEEE Annual Power Electronics Specialists Conference, 2005, vol. 2005, pp. 2151–2157.
[111] T. H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits. 2004.
[112] A. H. M. Shirazi, A. Nikpaik, S. Mirabbasi, and S. Shekhar, “A quad-core-coupled triple-push 295-to-301 GHz source with 1.25 mW peak output power in 65nm CMOS using slow-wave effect,” in Digest of Papers - IEEE Radio Frequency Integrated Circuits Symposium, 2016, vol. 2016–July, pp. 190–193.
[113] A. Nikpaik, A. H. Masnadi Shirazi, A. Nabavi, S. Mirabbasi, and S. Shekhar, “A 219-to-231GHz Frequency-Multiplier-Based VCO With ~3% Peak DC-to-RF Efficiency in 65-nm CMOS,” IEEE J. Solid-State Circuits, 2017.
APPENDIX
A. DERIVATION OF DC-TO-NF0 CURRENT EFFICIENCY
Consider the drain current of a short-channel device operating in the saturation region [111]:
$$I_D = \frac{1}{2}\mu_n C_{ox} W E_{sat}\,(V_{GS} - V_{th}) = K_n W\,(V_{GS} - V_{th})$$ (A1)
where $\mu_n$ is the carrier mobility, $C_{ox}$ is the gate-oxide capacitance per unit area, $W$ is the channel width, $E_{sat}$ is the electric field at which the carrier velocity drops to half the value extrapolated from weak-field mobility, and $V_{th}$ is the threshold voltage. $I_D(t)$ for a transistor working in a Class-C core can be written as:
$$I_D(t) = \begin{cases} \frac{1}{2}\mu_n C_{ox} W E_{sat}\left(V_{GS,DC} + A\cos(\omega_0 t) - V_{th}\right), & 2k\pi - \varphi \le \omega_0 t \le 2k\pi + \varphi \\ 0, & \text{elsewhere} \end{cases}$$ (A2)
where $A$ is the oscillation amplitude and $\Phi_C = 2\varphi$ is the conduction angle, given by:
$$\Phi_C = 2\cos^{-1}\!\left(\frac{V_{th} - V_{GS,DC}}{A}\right).$$
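(A3)
Using the Fourier expansion of $I_D(t)$, the DC term can be calculated as:
$$I_{DC} = \frac{1}{2\pi}\int_{-\varphi}^{\varphi}\frac{1}{2}\mu_n C_{ox} W E_{sat} A\left(\cos\alpha - \cos\varphi\right)d\alpha = \frac{1}{2\pi}\mu_n C_{ox} W E_{sat} A\left[\sin\varphi - \varphi\cos\varphi\right]$$ (A4)
Similarly, the magnitude of the $nf_0$ harmonic component is given by:
$$I_n = \frac{1}{\pi}\int_{-\varphi}^{\varphi}\frac{1}{2}\mu_n C_{ox} W E_{sat} A\left(\cos\alpha - \cos\varphi\right)\cos(n\alpha)\,d\alpha = \frac{1}{2\pi}\mu_n C_{ox} W E_{sat} A\left[\frac{\sin((n+1)\varphi)}{n+1} + \frac{\sin((n-1)\varphi)}{n-1} - \frac{2\cos\varphi\,\sin(n\varphi)}{n}\right]$$ (A5)
Thus, the DC-to-$nf_0$ current efficiency is calculated as:
$$\left|\frac{I_n}{I_{DC}}\right| = \left|\frac{\mathrm{sinc}\!\left(\frac{(n-1)\varphi}{\pi}\right) + \mathrm{sinc}\!\left(\frac{(n+1)\varphi}{\pi}\right) - 2\cos\varphi\;\mathrm{sinc}\!\left(\frac{n\varphi}{\pi}\right)}{\mathrm{sinc}\!\left(\frac{\varphi}{\pi}\right) - \cos\varphi}\right|$$ (A6)
where the normalized sinc function is defined as $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$.
B. DERIVATION OF NOISE FACTOR FOR PROPOSED PP-LNA
Considering the LNA shown in Figure 5-10, the following terms contribute to the total noise at the output:
1) Output noise current due to the thermal noise of $R_S$:
$$\overline{I_{n,out,R_S}^2} = \left|\frac{e_n\,(g_{mn} + g_{mp})}{\left(R_f \parallel \frac{1}{j\omega C_{in}}\right) + j\omega L_g + R_S}\cdot\left(R_f \parallel \frac{1}{j\omega C_{in}}\right)\right|^2$$ (A7)
where $e_n$ is the equivalent noise voltage of $R_S$. Since $R_f$ is large in the proposed design, this noise power can be simplified as:
$$\overline{I_{n,out,R_S}^2} = \frac{(g_{mn} + g_{mp})^2\,4kTR_S\,\Delta f}{(1 - \omega^2 C_{in} L_g)^2 + (R_S C_{in}\omega)^2}$$ (A9)
2) Output noise current due to the thermal noise of M1 and M2:
$$\overline{I_{n,M1}^2} = \frac{4kT\gamma_{mn}\,g_{mn}\,\Delta f}{\alpha_{mn}}, \qquad \overline{I_{n,M2}^2} = \frac{4kT\gamma_{mp}\,g_{mp}\,\Delta f}{\alpha_{mp}}$$ (A10)
where $\gamma$ is the excess noise factor and $\alpha$ is the channel conduction coefficient [97].
3) Output noise current due to the thermal noise of $R_f$:
$$\overline{I_{n,out,R_f}^2} \approx \left[\frac{\left(1 + (g_{mn} + g_{mp})R_S(1 + Q^2)\right)R_f}{R_f + R_S(1 + Q^2)}\right]^2\,\overline{I_{n,R_f}^2}$$ (A10)
where $Q$ is the quality factor of the matching network and $\overline{I_{n,R_f}^2}$ is the noise current of the feedback resistor.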
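As a rough numerical illustration of the three contributions listed above, the sketch below evaluates each mean-square output noise current and the ratio of their sum to the source term. It is not taken from the thesis: the function name `output_noise_budget`, the parameter names (`g_mn`, `g_mp`, `c_in`, `l_g`), the use of one γ and one α for both devices, the thermal noise current 4kTΔf/R_f assumed for the feedback resistor, the expression used for Q in the usage example, and every component value are assumptions chosen only to make the arithmetic concrete.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def output_noise_budget(g_mn, g_mp, r_s, r_f, c_in, l_g, q,
                        gamma=1.0, alpha=1.0, freq_hz=2.4e9,
                        temp_k=290.0, bandwidth_hz=1.0):
    """Return the mean-square output noise currents (A^2) and their ratio.

    All argument names are hypothetical labels for the quantities in the
    list above: device transconductances, source and feedback resistance,
    input capacitance, series inductance and matching-network Q.
    """
    omega = 2.0 * math.pi * freq_hz
    four_kt_df = 4.0 * K_BOLTZMANN * temp_k * bandwidth_hz
    g_total = g_mn + g_mp

    # 1) Source resistance, using the large-R_f simplification.
    shaping = (1.0 - omega**2 * c_in * l_g)**2 + (r_s * c_in * omega)**2
    i2_rs = g_total**2 * four_kt_df * r_s / shaping

    # 2) Channel thermal noise of the two devices.
    i2_m1 = four_kt_df * gamma * g_mn / alpha
    i2_m2 = four_kt_df * gamma * g_mp / alpha

    # 3) Feedback resistor: assumed noise current 4kT*df/R_f, scaled by the
    #    current gain from the resistor to the output.
    r_src = r_s * (1.0 + q**2)
    gain = (1.0 + g_total * r_src) * r_f / (r_f + r_src)
    i2_rf = gain**2 * four_kt_df / r_f

    total = i2_rs + i2_m1 + i2_m2 + i2_rf
    return {"rs": i2_rs, "m1": i2_m1, "m2": i2_m2, "rf": i2_rf,
            "noise_factor": total / i2_rs}


if __name__ == "__main__":
    f0 = 2.4e9                      # assumed operating frequency
    c_in = 0.5e-12                  # assumed input capacitance
    l_g = 1.0 / ((2.0 * math.pi * f0)**2 * c_in)   # resonate with c_in at f0
    q = 1.0 / (2.0 * math.pi * f0 * 50.0 * c_in)   # assumed series-RLC Q
    budget = output_noise_budget(g_mn=1e-3, g_mp=1e-3, r_s=50.0, r_f=100e3,
                                 c_in=c_in, l_g=l_g, q=q,
                                 gamma=1.0, alpha=0.8, freq_hz=f0)
    for name, value in budget.items():
        print(f"{name}: {value:.3e}")
```

Sweeping `r_f` or the two transconductances in such a script is one quick way to see which term dominates for a given bias point before working with the closed-form expression.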
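By assuming $R_p = R_S(1 + Q^2)$ and collecting the contributions of all noise sources, the overall noise factor can be estimated as:
$$F(\omega) \approx \frac{\overline{I_{n,out,R_S}^2} + \overline{I_{n,out,R_f}^2} + \overline{I_{n,M1}^2} + \overline{I_{n,M2}^2}}{\overline{I_{n,out,R_S}^2}} = 1 + \frac{\dfrac{\left(1 + (g_{mn} + g_{mp})R_p\right)^2 R_f}{(R_f + R_p)^2} + \dfrac{\gamma_{mn}\,g_{mn}}{\alpha_{mn}} + \dfrac{\gamma_{mp}\,g_{mp}}{\alpha_{mp}}}{\dfrac{R_S\,(g_{mn} + g_{mp})^2}{(1 - \omega^2 C_{in} L_g)^2 + (R_S C_{in}\omega)^2}}$$ (A11)
Because a large $R_f$ is used in this design ($R_f \gg R_p$), and further assuming that $\gamma_{mp} \approx \gamma_{mn} = \gamma_m$, $\alpha_{mn} \approx \alpha_{mp} = \alpha$, and $g_{mn} \approx g_{mp} = g_m$, $F$ can be simplified as:
$$F(\omega) \approx 1 + \frac{\dfrac{(1 + 2g_m R_p)^2}{R_f} + \dfrac{2\gamma_m g_m}{\alpha}}{\dfrac{4R_S\,g_m^2}{(1 - \omega^2 C_{in} L_g)^2 + (R_S C_{in}\omega)^2}}$$ (A12)
$$F(\omega) \approx 1 + \frac{(1 + 2g_m R_p)^2\left[(1 - \omega^2 C_{in} L_g)^2 + (R_S C_{in}\omega)^2\right]}{4R_S R_f\,g_m^2} + \frac{\gamma_m\left[(1 - \omega^2 C_{in} L_g)^2 + (R_S C_{in}\omega)^2\right]}{2R_S\,g_m\,\alpha}$$ (A13)
By operating at the resonance frequency, $\omega_0 = 1/\sqrt{L_g C_{in}}$, the noise factor expression is further reduced to:
$$F(\omega_0) \approx 1 + \frac{(1 + 2g_m R_p)^2\,R_S\,(C_{in}\omega_0)^2}{4R_f\,g_m^2} + \frac{\gamma_m\,R_S\,(C_{in}\omega_0)^2}{2g_m\,\alpha}$$ (A16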