Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Energy spectra informed performance of clocked quantum-dot cellular automata Retallick, Jacob 2020

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.

Item Metadata


24-ubc_2021_may_retallick_jacob.pdf [ 11.31MB ]
JSON: 24-1.0395406.json
JSON-LD: 24-1.0395406-ld.json
RDF/XML (Pretty): 24-1.0395406-rdf.xml
RDF/JSON: 24-1.0395406-rdf.json
Turtle: 24-1.0395406-turtle.txt
N-Triples: 24-1.0395406-rdf-ntriples.txt
Original Record: 24-1.0395406-source.json
Full Text

Full Text

Energy Spectra Informed Performance of ClockedQuantum-dot Cellular AutomatabyJacob RetallickB.A.Sc., The University of British Columbia, 2014A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFDoctor of PhilosophyinTHE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES(Electrical and Computer Engineering)THE UNIVERSITY OF BRITISH COLUMBIA(Vancouver)December 2020© Jacob Retallick, 2020The following individuals certify that they have read, and recommend to the Facultyof Graduate and Postdoctoral Studies for acceptance, the thesis entitled:Energy Spectra Informed Performance of Clocked Quantum-dot Cel-lular Automatasubmitted by Jacob Retallick in partial fulfillment of the requirements forthe degree of Doctor of Philosophyin Electrical and Computer EngineeringExamining Committee:Konrad Walus, Electrical and Computer Engineering, UBCSupervisorSteve Wilton, Electrical and Computer Engineeringi, UBCSupervisory Committee MemberLukas Chrostowski, Electrical and Computer Engineering, UBCUniversity ExaminerRobert Raussendorf, Physics and Astronomy, UBCUniversity ExaminerAdditional Supervisory Committee Members:John Madden, Electrical and Computer Engineering, UBCSupervisory Committee MemberiiAbstractUnderstanding the dynamic behaviour of nanoscale quantum-dot cellular automata(QCA) networks involves the simulation of large numbers of QCA devices, withthe complexity of the full quantum treatment exponential in the network size. Pre-vious attempts to limit this complexity introduce simplifying assumptions withknown inaccuracies. In this thesis we investigate an alternate approach, extractingperformance metrics through analysing the low energy eigenspectrum of a clockednetwork. We make two major contributions.In the first part of this thesis, we study the use of silicon dangling bonds(SiDBs) as a platform for combinatorial logic, and ultimately nanoscale QCA. Wepresent models for understanding the preferred configurations and dynamics ofcharges in these structures. We consider the clocking of SiDB-based QCA wires,and reveal a complicated trajectory of charge states that serve as a challenge forQCA operation. By studying SiDB-based QCA from the framework of the famil-iar 3-state model, in which these preferred charge states translate to eigenstates of asystem Hamiltonian, we determine conditions for which SiDB-QCA wires can cor-rectly operate when clocked. These conditions are potentially impractical unlessnet-neutral SiDB arrangements can be achieved.The remaining bulk of the thesis revolves around the link between QCA clock-ing and quantum annealing. We first investigate the adiabaticity of simple 2-stateQCA networks under zone clocking. We present upper bounds on the clocking fre-quency beyond which adiabaticity falls below a 99% threshold, and demonstratehow we can efficiently estimate clocking performance using only a few of the en-ergy eigenstates. Due to a natural mapping between QCA cells and superconduct-ing flux qubits, the potential for investigating performance using a physical quan-iiitum annealer is explored. Methods for embedding QCA networks onto the annealerare discussed and a selection of annealing results are analyzed. Finally, we estab-lish a method for decomposing the system Hamiltonian into contributions fromgiven components, and a means to identify meaningful components which criti-cally affect clocking performance. This framework reveals a heuristic algorithmfor approximating the low energy eigenspectra of large QCA networks, enablingfuture investigations into the performance of networks well beyond previous sizelimitations.ivLay SummaryThere is increasing interest in technologies which extend beyond the expected scal-ing limitations of silicon transistors. Quantum-dot cellular automata (QCA) is apotential approach which can represent binary information by the arrangements ofelectrons among a collections of atoms or within single molecules. At this smallscale, understanding the behaviour of QCA circuits requires modelling large num-bers of devices as they are being clocked. This requires an exponential amount ofcomputational resources.In this work, we make two major contributions. First, we assess one potentialarchitecture for implementing nanoscale QCA using silicon dangling bonds. Sec-ond, we develop an approach to understanding the behaviour of QCA arrangementsby looking at the rate of transitions between a few of the lowest energy states of thesystem, and present an efficient method for finding these low energy states even forlarge QCA circuits.vPrefaceThis work was supervised by Dr. Konrad Walus and is focused around tworelated research efforts. First, a collaboration with the research group of Dr. RobertA. Wolkow at the University of Alberta to understand and model the behaviour ofnetworks of silicon dangling bonds (SiDBs) on hydrogen passivated silicon. Thesecond effort is the main contribution of my research, a study of the behaviour ofquantum-dot cellular automata (QCA) networks near the adiabatic limit employingknowledge of the low energy eigenspectrum. Several parts of this thesis containcontent which has previously been published or is currently in submission.The first three chapters present a review of models and simulation techniquesfor QCA devices and performance, including a discussion on quantum annealingand its link to QCA clocking. Chapter 4 presents on overview of published workregarding SiDB structures with content from the following:[1] M. Rashidi, W. Vine, T. Dienel, L. Livaderu, J. Retallick, T. Huff, K. Walus,and R. A. Wolkow, “Initiating and monitoring the evolution of single elec-trons within atom-defined structures,” Phys, Rev. Lett., vol. 121, no. 16,p. 166801, Oct. 2018.[2] S. S. H. Ng, J. Retallick, H. N. Chiu, R. Lupoiu, L. Livadaru, T. Huff,M. Rashidi, W. Vine, T. Dienel, R. A. Wolkow, and K. Walus, “SiQAD:A design and simulation tool for atomic silicon quantum dot circuits,” IEEETransactions on Nanotechnology, vol. 19, pp. 137-146,[3] H. N. Chiu, S. S. H. Ng, J. Retallick, and K. Walus, “PoisSolver: a tool formodelling silicon dangling bond clocking networks,” in 2020 IEEE 20th In-ternational Conference on Nanotechnology (IEEE-NANO). IEEE, Jul. 2020,pp. 134-139The experimental work by Wolkow’s group discussed in [1] was the motivat-ing for the models and tools in [2–4] that followed. My main contribution to themanuscript for [1] was in discussion of interpreting the experimentally observedslow dynamics, the potential influence of the tip on charge transitions, and heuris-tic models which would be used in later work. An effort in implementing SiQAD,a CAD tool for designing and simulating SiDB structures is discussed in [2]. Ideveloped the initial version of the graphical interface, as well as much of the lowlevel code for design definitions and tools. In additional, I was responsible forthe models used both for defining low energy metastable charge configurations inSiDB arrangements as well as the hopping model and its Python implementation.The solver for the general Poisson equation and the resulting manuscript [3] wasprimarily the work of Nathan Chiu. My main contribution was in early discus-sion on scalable strategies for solving arbitrary electrode arrangements as well ascommenting the manuscript.For all remaining content, I was the lead researcher, prepared the manuscripts,and produced all code, data, and figures. Dr. Walus provided insight for directingthe investigations and advice for manuscript development. A version of Chapter5 has been published in [4]. The discussion has been slightly expanded to betterintegrate with the rest of this thesis.[4] J. Retallick and K. Walus, “Population congestion in 3-state quantum-dot cel-lular automata,” Journal of Applied Physics, vol. 127, no. 24, p. 244301,2020.Chapter 6 has been published in [5]. New results regarding better estimates ofperformance using the low energy spectrum have been added.[5] J. Retallick and K. Walus, “Limits of adiabatic clocking in quantum-dot cellu-lar automata,” Journal of Applied Physics, vol. 127, no. 5, p. 054502, 2020.viiChapter 7 contains work published in a number of forms. The embedding al-gorithm was based on earlier work by Michael Babcock, Miguel Aroca-Ouellette,and Shane McNamara; however, the more recent implementation and its improve-ments were my work, as well as all results discussed in the manuscript [6]. Anunpublished expanded discussion can be found on the arXiv [7]. Annealing resultsusing the quantum annealer developed by D-Wave Systems Inc. were done withthe assistance of Aidan Roy, and a sample of these results previously presented atthe ACS Spring Meeting in 2017 [8].[6] J. Retallick, M. Babcock, M. Aroca-Ouellette, S. McNamara, S. Wilton,A. Roy, M. Johnson, and K. Walus, “Embedding of quantum-dot cellularautomata circuits onto a quantum annealing processor,” in 2014 Conferenceof Optoelectronic and Microelectronic Materials & Devices. IEEE, Dec.2014.[8] J. Retallick and K. Walus, “Investigation of quantum-dot cellular automatanetworks using a quantum annealing processor,” Presented at the 253rd ACSNational Meeting, San Francisco, USA, Apr. 2017A version of Chapter 8 has been submitted and conditionally accepted byIEEE Transactions on Nanotechnology. Code for extracting performance metricsfrom the heuristic estimate of the low energy spectra has since been implemented.Appropriate results are included in Section 8.4 which are not currently included inthe submitted work.[9] J. Retallick and K. Walus, “Low-energy eigenspectrum decomposition (LEED)of quantum-dot cellular automata networks,” submitted for publication, June2020viiiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiLay Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ixList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiiList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiiiList of Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviiiAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xx1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Background and Motivation . . . . . . . . . . . . . . . . . . . . 11.2 Thesis Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 52 Modelling and Simulation of QCA Devices . . . . . . . . . . . . . . 62.1 Reduced Basis Representations . . . . . . . . . . . . . . . . . . . 72.2 Dynamics Using Reduced Correlation Models . . . . . . . . . . . 113 QCA Clocking as Quantum Annealing . . . . . . . . . . . . . . . . . 203.1 Quantum Annealing . . . . . . . . . . . . . . . . . . . . . . . . . 213.2 QCA Clocking . . . . . . . . . . . . . . . . . . . . . . . . . . . 25ix4 Dangling Bonds on Hydrogen Passivated Silicon . . . . . . . . . . . 264.1 Charge States of SiDB Arrangements . . . . . . . . . . . . . . . 274.2 Clocking SiDB Structures . . . . . . . . . . . . . . . . . . . . . . 344.3 Dynamic Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . 344.4 SiQAD: Simulation and Design of SiDB Structures . . . . . . . . 394.5 SiDB Implementations of QCA . . . . . . . . . . . . . . . . . . . 404.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455 Population Congestion in 3-State QCA . . . . . . . . . . . . . . . . . 465.1 Full 3-State Hamiltonian . . . . . . . . . . . . . . . . . . . . . . 475.2 Considered Devices . . . . . . . . . . . . . . . . . . . . . . . . . 485.3 Ground State Characterisation . . . . . . . . . . . . . . . . . . . 535.4 Wave Clocking of Wires . . . . . . . . . . . . . . . . . . . . . . 655.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 696 Limits of Adiabatic Clocking for 2-State QCA Networks . . . . . . . 716.1 Choosing a Clocking Schedule . . . . . . . . . . . . . . . . . . . 726.2 Coherent Behaviour of QCA Components . . . . . . . . . . . . . 826.3 Dissipative Behaviour . . . . . . . . . . . . . . . . . . . . . . . . 916.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1027 Simulating 2-State QCA on Quantum Annealing Hardware . . . . . 1037.1 Embedding Problems on the QPU . . . . . . . . . . . . . . . . . 1057.2 Annealing Results . . . . . . . . . . . . . . . . . . . . . . . . . . 1147.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1218 Eigenspectrum Decomposition of QCA Networks . . . . . . . . . . . 1228.1 Decomposing the QCA Network Hamiltonian . . . . . . . . . . . 1238.2 System Energy-Based Decomposition . . . . . . . . . . . . . . . 1268.3 Heuristic Solver for Large Networks . . . . . . . . . . . . . . . . 1328.4 Estimating Transitions in the Component Mode Basis . . . . . . . 1378.5 Component Substitution . . . . . . . . . . . . . . . . . . . . . . 1408.6 Application to Alternate QCA Models . . . . . . . . . . . . . . . 1408.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143x9 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . 1449.1 SiDBs as a Platform for Nanoscale QCA . . . . . . . . . . . . . . 1449.2 Estimating QCA Performance using the Low Energy Spectrum . . 145Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148A Algorithm Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161A.1 SiDB HoppingDynamics Engine . . . . . . . . . . . . . . . . . . 161A.2 Dense Placement Algorithm . . . . . . . . . . . . . . . . . . . . 164A.3 Component Mode Solver . . . . . . . . . . . . . . . . . . . . . . 169B Additional Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . 172B.1 Analytical Estimate of the 3-State QCA Cell Ground State Energy 172B.2 Clocking Field Range for 3-State QCA Wires . . . . . . . . . . . 173B.3 Y Spin Invariance in the 2-State Approximation . . . . . . . . . . 174C Supplemental Details . . . . . . . . . . . . . . . . . . . . . . . . . . 175C.1 Pegasus Topology . . . . . . . . . . . . . . . . . . . . . . . . . . 175C.2 Additional Embedding Results . . . . . . . . . . . . . . . . . . . 175D Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180D.1 Gell-Mann Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 180D.2 Equations for All Single and Two Point Correlations . . . . . . . 180xiList of TablesTable 5.1 Worst-case errors over all (h,C) for different u parameter ap-proximations. . . . . . . . . . . . . . . . . . . . . . . . . . . 55Table 5.2 Upper limits on ξ for in-order population of a wire in the weaktunneling limit. . . . . . . . . . . . . . . . . . . . . . . . . . . 58Table 6.1 F-parameters in Eq. (6.4) from Eq. (6.5) for the simulated devices. 74Table 6.2 Fit parameters for the hyperbolic approximation of the mini-mum gap of the inverter for the different clocking schedules. . 77Table 6.3 Comparison of fit and analytical estimates of the Landau-Zenerν parameter for wires. . . . . . . . . . . . . . . . . . . . . . . 85Table 6.4 Maximum clocking frequencies relative to f0, for our basic QCAcircuits based on different metrics. . . . . . . . . . . . . . . . 88Table 6.5 Maximum clocking frequencies computed using different adia-baticity estimation methods. . . . . . . . . . . . . . . . . . . . 90Table 6.6 Threshold β value for high performance in the relaxed regime. 95Table 6.7 Comparison of β ∗ values for the two temperature dependentsteady states. . . . . . . . . . . . . . . . . . . . . . . . . . . . 100Table 7.1 Benchmark circuits and the number of non-driver cells . . . . . 111Table C.1 Power law fit parameters for the characteristic circuit size, µ ,and run time t(µ). . . . . . . . . . . . . . . . . . . . . . . . . 178xiiList of FiguresFigure 1.1 Schematics and polarizations for common QCA devices . . . 2Figure 1.2 Common QCA building blocks used in larger QCA circuits. . 2Figure 1.3 Schematic of 4-phase clocking . . . . . . . . . . . . . . . . . 3Figure 1.4 Schematic for zone and dynamic wave clocking protocols. . . 4Figure 1.5 Example electrode arrangements for 4-phase clocking. . . . . 4Figure 2.1 2-state QCA device configuration states . . . . . . . . . . . . 7Figure 2.2 3-state QCA device configuration states . . . . . . . . . . . . 9Figure 2.3 Majority gate with uneven inputs under the ICHA. . . . . . . 11Figure 2.4 Conversion rules for generating 2-state CVF equations. . . . . 14Figure 2.5 Graph representations of the interdependence of correlationtensors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15Figure 2.6 Simulation of a five cell QCA wire using a few correlationmodels excluding any dissipation. . . . . . . . . . . . . . . . 16Figure 2.7 Majority gate using a single multi-cell Hamiltonian. . . . . . 18Figure 2.8 Two majority gates in series with potential multi-cell groups. . 19Figure 3.1 Schematic avoided level crossing described in the Landau-Zenerapproximation. . . . . . . . . . . . . . . . . . . . . . . . . . 23Figure 4.1 Schematic of the H-Si(100)-2×1 surface. . . . . . . . . . . . 27Figure 4.2 Schematic of the SiDB charge states and the band structure ofthe silicon substrate. . . . . . . . . . . . . . . . . . . . . . . 28Figure 4.3 Experimentally observed charge configurations for an SiDBOR gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30xiiiFigure 4.4 Simulated charge configurations of the OR gate . . . . . . . . 33Figure 4.5 Demonstration of hopping behaviour in a symmetric structurestudied in Ref. [1] . . . . . . . . . . . . . . . . . . . . . . . . 35Figure 4.6 Schematic of hopping rates in the hopping model. . . . . . . . 36Figure 4.7 Dynamics simulation of the system in Fig. 4.5 . . . . . . . . . 38Figure 4.8 Experimental demonstration of a 4-dot SiDB cell. . . . . . . . 40Figure 4.9 SiDB implementation of an inverting chain and the equivalent2-dot QCA representation. . . . . . . . . . . . . . . . . . . . 41Figure 4.10 Sequence of lowest energy metastable configurations of theSiDB inverting chain. . . . . . . . . . . . . . . . . . . . . . . 41Figure 4.11 SiDB implementation of a non-inverting wire and the equiva-lent 4-dot QCA representation. . . . . . . . . . . . . . . . . . 42Figure 4.12 Sequence of lowest energy metastable configurations of theSiDB wire. . . . . . . . . . . . . . . . . . . . . . . . . . . . 42Figure 4.13 Dynamics simulation of an SiDB inverting chain under 4-phasewave clocking. . . . . . . . . . . . . . . . . . . . . . . . . . 44Figure 5.1 Schematics for 3-state Fc+FcC2B−9 devices. . . . . . . . . . . 49Figure 5.2 Kink and congestion energies for both Fc+FcC2B−9 device con-figurations. . . . . . . . . . . . . . . . . . . . . . . . . . . . 50Figure 5.3 SiDB device geometries and interaction energies. . . . . . . . 51Figure 5.4 Normalized clocking wave profiles. . . . . . . . . . . . . . . 52Figure 5.5 Spectra of the 3-state Cell Hamiltonian . . . . . . . . . . . . 54Figure 5.6 3-state cell responses in the two limiting cases of polarizationbias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55Figure 5.7 Arrangements for driven 3-state wires. . . . . . . . . . . . . . 56Figure 5.8 Driven inverting wire in the weak tunneling limit after success-fully populating the first n cells. . . . . . . . . . . . . . . . . 57Figure 5.9 Ground state polarizations and populations of biased 3 cell wires 59Figure 5.10 Ground state polarizations and populations of a biased 3-dotinverting wire with γ = Ek0/10. . . . . . . . . . . . . . . . . . 60Figure 5.11 Maximum congestion ratios for 3 cell wires using devices forvarying tunneling rates. . . . . . . . . . . . . . . . . . . . . . 61xivFigure 5.12 Ground states of Maj-101 using different congestion ratios. . . 62Figure 5.13 Ground states, energy gaps, and transition parameters for Maj-101 with non-zero tunneling. . . . . . . . . . . . . . . . . . . 64Figure 5.14 Increases in the maximum congestion ratios of 3 cell wires forwave clocking at various tunneling rates . . . . . . . . . . . . 66Figure 5.15 Dependence of the minimum required clocking fields on thenull site depth for a 3-dot wire using a 200 nm trapezoidal wave 68Figure 6.1 QCA device dimensions and computed kink energies relativeto E . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74Figure 6.2 Schedule energies for the Linear, Quasilinear, and Sinusoidalclocking schedules. . . . . . . . . . . . . . . . . . . . . . . . 76Figure 6.3 Comparison of classical performances for the different clock-ing schedules. . . . . . . . . . . . . . . . . . . . . . . . . . . 78Figure 6.4 Initial coherent oscillations in a simulation of Maj-101 usingthe Quasilinear schedule. . . . . . . . . . . . . . . . . . . . . 79Figure 6.5 Classical performance of Maj-101 for different α0 values be-fore and after smoothing. . . . . . . . . . . . . . . . . . . . . 79Figure 6.6 Classical performance of a wire and inverter for different α0values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80Figure 6.7 Low energy spectra of the simulated devices. . . . . . . . . . 81Figure 6.8 Landau-Zener parameters as a function of the wire length. . . 83Figure 6.9 Performance and maximum characteristic rates of wires of dif-ferent lengths. . . . . . . . . . . . . . . . . . . . . . . . . . . 84Figure 6.10 Performance metrics for the QCA building blocks. . . . . . . 86Figure 6.11 Low energy spectrum and performance metrics of Maj-110. . 87Figure 6.12 Transition parameters for the QCA devices . . . . . . . . . . 89Figure 6.13 Logical performance of Maj-101 for different spectral steadystates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93Figure 6.14 Three operational regimes for steady state relaxation models. . 94Figure 6.15 Logical performance in the relaxed regime for QCA compo-nents with the Boltzmann distribution steady state. . . . . . . 94xvFigure 6.16 Maximum clocking frequencies relative to f0 for different spec-tral steady states. . . . . . . . . . . . . . . . . . . . . . . . . 96Figure 6.17 ICHA: Maximum clocking frequencies relative to f0 for thedifferent spectral steady states. . . . . . . . . . . . . . . . . . 97Figure 6.18 Oscillations in the ICHA simulation of Wire-5. . . . . . . . . 98Figure 6.19 Logical performance in the relaxed regime for the mean fieldrelaxation with the ICHA. . . . . . . . . . . . . . . . . . . . 100Figure 6.20 Maximum clocking frequencies relative to f0 for mean fieldrelaxation using the ICHA. . . . . . . . . . . . . . . . . . . . 101Figure 7.1 Qubit layout for one tile of the QPU . . . . . . . . . . . . . . 104Figure 7.2 Additional couplers in the Pegasus topology. . . . . . . . . . 104Figure 7.3 Minimal embeddings for K4 on the Chimera and Pegasus topolo-gies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105Figure 7.4 Schematic of 4-dot QCA device connectivity. . . . . . . . . . 107Figure 7.5 Schematic of 2-dot QCA device connectivity. . . . . . . . . . 107Figure 7.6 Connectivity graph representations for a selection of simpleQCA components. . . . . . . . . . . . . . . . . . . . . . . . 108Figure 7.7 Minimal embeddings for the 4-dot majority gate. . . . . . . . 109Figure 7.8 Minimal embeddings for the 4-dot inverter. . . . . . . . . . . 109Figure 7.9 Number of physical qubits used for benchmark circuits . . . . 111Figure 7.10 Comparison of physical qubit usage and vertex model sizes forboth algorithms and adjacencies . . . . . . . . . . . . . . . . 112Figure 7.11 Average number of physical qubits needed to embed generatedQCA circuits of differetn sizes . . . . . . . . . . . . . . . . . 113Figure 7.12 Average single trial embedding success probabilities . . . . . 113Figure 7.13 Annealing schedule for the D-Wave 2X . . . . . . . . . . . . 114Figure 7.14 Annealing results for wires of different lengths with variousannelaing times. . . . . . . . . . . . . . . . . . . . . . . . . 115Figure 7.15 Change in the spectra of embedded QCA networks . . . . . . 117Figure 7.16 Comparison of transition parameters before and after embedding118Figure 7.17 Proportion of annealing results which produce the ground stateas a function of the annealing time . . . . . . . . . . . . . . . 120xviFigure 8.1 Sample decomposition of the 2-dot majority gate with binaryinputs 101. . . . . . . . . . . . . . . . . . . . . . . . . . . . 130Figure 8.2 Example decomposition tree of a large QCA network. . . . . 134Figure 8.3 Circuit diagram and the equivalent 4-dot QCA network for theconsidered 49 cell XOR gate, not counting fixed input cells. . 135Figure 8.4 Sample decompositions of the XOR gate. . . . . . . . . . . . 136Figure 8.5 Estimates of the transition parameters for the 49 cell XOR gate. 139Figure 8.6 Change in the spectra and transition parameters associated withsubtituting the inverting element in the XOR gate. . . . . . . . 141Figure A.1 Cohopping schematic . . . . . . . . . . . . . . . . . . . . . . 163Figure A.2 Example embedding of a serial adder QCA circuit . . . . . . 166Figure A.3 Occurrence probability of qubit chain lengths and vertex-modelsizes for the Dense Placement and Minorminer algorithms . . 167Figure A.4 Schematic of a chain between two end nodes . . . . . . . . . 168Figure C.1 Tile connectivity of the Pegasus topology. . . . . . . . . . . . 176Figure C.2 Connectivity between qubits in different Chimera sub-graphs. 176Figure C.3 Average run times for generated circuits on the 512 qubit pro-cessor for both algorithms with best fits. . . . . . . . . . . . . 177Figure C.4 Metrics for the 50% circuit size and characteristic run time forboth embedding algorithms. . . . . . . . . . . . . . . . . . . 178Figure C.5 Relative decrease in the embeddable circuit size as a functionof the proportion of qubits disabled. . . . . . . . . . . . . . . 179xviiList of AbbreviationsAFM atomic force microscope.AQC adiabatic quantum computing.CAD computer-aided design.CM Component Mode solver.CMOS complementary metal-oxide semiconductor.CVF coherence vector formalism.DMRG density matrix renormalization groups.DP Dense Placement embedding algorithm.FEM finite element method.FPGA field-programmable gate array.ICHA Intercellular Hartree-Fock Approximation.LES low energy eigenspectrum.MVM mixed valence molecule.ODE ordinary differential equation.xviiiQA quantum annealing.QCA quantum-dot cellular automata.QPU quantum processing unit.QUBO quadratic unconstrained binary optimization.SA simulated annealing.SiDB silicon dangling bond.SiQAD Silicon Quantum Atomic Designer.STM scanning tunneling microscope.TISG transverse Ising spin glass.USE Universal, Scalable, and Efficient clocking scheme.VRH variable range hopping.xixAcknowledgmentsI would like to thank my graduate supervisor, Dr. Konrad Walus, for his guidance,support, and the many insightful discussions around campus. I would also liketo thank my lab mates, Samuel Ng and Nathan Chiu, for their camaraderie andfor bringing humour and compelling discourse to what might otherwise have beenlong hours in the lab. Finally, I would like to thank my family for their years ofsupport throughout my education.This work was supported by the Natural Sciences and Engineering Research Coun-cil of Canada under Grant STPGP 478838-15.xxChapter 1Introduction1.1 Background and MotivationIn recent years, there has been great interest in technologies that extend beyondthe projected scale limits of conventional CMOS, ranging from new transistor de-signs with alternate channels [10, 11] to entirely novel computational architec-tures [12–14]. Quantum-dot cellular automata (QCA) encodes binary informationin the distribution of charges in devices or cells composed of arrays of quantumdots [15, 16]. Schematics of some common QCA devices are shown in Fig. 1.1.Coulombic interactions between occupying charges facilitate coupling between thecharge states of neighbouring cells. Arrangements of these cells can be designedwith ground states that encode familiar logic gates [17]. Inverters and majoritygates form the fundamental basis of logic functions for the vast majority of QCAdesigns. Suitable arrangements for these functions are shown in Fig. 1.2. Earlyproof-of-concept implementations have been fabricated and tested using metallicisland devices [17–21], and nanomagnetics [22–24]. More recent efforts have fo-cused on the potential for nanoscale implementations of QCA. Among the mostpromising candidates are mixed-valence molecular devices [25, 26] and patterneddangling bonds on hydrogen passivated silicon [14, 27, 28]. Each QCA cell occu-pies only a few nm2 in area, potentially offering high device densities of 1014 cm−2.Significant challenges must be solved for any realistic QCA implementation, suchas limiting device power at high density using reversible gates [29, 30], designing1(a) 2-State (b) 3-StateFigure 1.1: Schematics and polarizations for common QCA devices. The shadedquantum dots show the locations of mobile charges. Inter-dot tunneling paths areindicated by the lines.(a) Wire / Shift Register (b) Inverter (c) Majority GateFigure 1.2: Common QCA building blocks used in larger QCA circuits. Redand blue shaded cells indicate inputs with +1 or -1 polarization respectively. Therightmost yellow cells produce the logical output. In this work, we will denote by“Wire-N” a wire of N cells and “Maj-ABC” a majority gate with binary inputs asindicate: ex Maj-101 shown.robust wire crossings [31] and clocking networks [32, 33], and interfacing with theexisting CMOS architecture.The operation of QCA networks requires the generation of multi-phase clock-ing fields which control information flow by sequentially activating QCA devices[20]. One common approach is to segment the network into clock zones, regionswhich each share a clocking phase. Fig. 1.3 shows a schematic of this clockingprotocol. This approach is used in QCADesigner [34], a popular QCA design andsimulation tool, and is ubiquitous in the design literature. The complexity of com-puting the ground state of a clock zone is exponential in the number of cells. Mean2RELAXRELEASE SWITCHLATCH(a) Schematic of 4-phase clocking (b) XOR1000Figure 1.3: (a) Schematic of the 4-phase clocking protocol. Regions of QCAdevices are driven between unpolarized relaxed and polarized latched states: eg. bymodulating their inter-dot tunneling energies, γ . Adjacent clock zones are pi/2 phaseshifted to enforce directional information flow. (b) An example XOR gate withsuitable clock zones phases indicated by the colours. We are primarily interestedin the switching phase, in which the polarization ground state needs to be achieved.field methods are often employed to approximate the ground state using manage-able resources [15, 35]. These methods are known to produce incorrect groundstates for certain QCA arrangements [36]. QCA designs and selections of clockzones are often influenced by what these approximation methods can reliable han-dle rather than any expectation of real behaviour. An alternative approach is toemploy electrodes, each addressed with one of the clock phases, to generate a dy-namic clocking wave which can be swept over the network [30, 37]. Instead ofclock zones being activated in sequence, devices are then activated within the ris-ing edge of the travelling clocking wave. A schematic comparison of zone andwave clocking is shown in Fig. 1.4. Example electrode arrangements appropriatefor either zone or wave clocking are shown in Fig. 1.5. Columnar or 2DWave [38]floor plans are suitable for feed-forward combinatorial networks but cannot imple-ment feedback. A scheme like the USE floor plan proposed in [32], or similar [33],can be used for more general QCA networks.Current nanofabrication limitations constrain the minimum separation for clock-ing electrodes to tens of nm [39]. Not only does this restrict possible electrode ge-31 2 3 4 1Zone Clocking 4-Phase Wave ClockingFigure 1.4: Schematics of both clocking protocols. For zone clocking, all devicessee the same clocking field. By using 4-phase buried electrodes, we a generate atravelling wave where computation is done within the rising edge.1 2 3 4(a) Columnar1 2 3 4111222333444(b) 2DWave [38]1 2 3 4111222333444(c) USE [32]Figure 1.5: Example electrode arrangements for 4-phase clocking. The phase ofeach electrode is indicated. Arrows show the direction of information flow. All areinfinitely tileable.ometries, but it results in a lower bound on the size of a region of a QCA networkthat has non-trivial dynamics. In zone clocking, our clock zones have minimumdimensions and cannot be arbitrarily assigned based on simulation requirements;for wave clocking, we must expect multiple QCA devices to occupy the region ofthe rising edge. It is not yet clear how much computation can be reliably achievedwithin a single clock zone at some desired operational frequency. Such a questionrequires a greater capacity for estimating the behaviour of larger QCA networks.Work by To´th and Lent demonstrated that the inaccuracies in the mean fieldmodels resulted from missing inter-cell correlations [36, 40]. Their proposed in-termediate model which incorporates a subset of these correlations has proven suc-4cessful at producing more accurate results; however, there remain challenges inapplying this approach to general arrangements of QCA cells [41, 42]. In this the-sis, we consider an alternate approach to studying the dynamic behaviour of QCAnetworks in the fully coherent regime. We will investigate QCA clocking fromwithin the framework of quantum annealing, in which performance is governed byparticular details of the low energy spectrum.1.2 Thesis OverviewThis thesis is organized as follows. In Chapter 2, a brief overview of the variousmodels and simulation methods used in the study of QCA devices is given, withemphasis on reduced basis approximations of the system Hamiltonian and dynam-ics. In Chapter 3, we discuss quantum annealing, adiabatic evolution, and presentthe link to QCA clocking. Chapters 4 and 5 contain a study of silicon danglingbonds and their potential applications for combinatorial logic and nanoscale QCA.In Chapter 6, we perform a detailed investigation of the performance of small 2-state QCA networks under zone clocking, extracting meaningful metrics from thelow energy spectrum. In Chapter 7, we discuss one potential approach to studyinglarger QCA networks: direct hardware simulation on a quantum annealing pro-cessor. Finally, in Chapter 8, we develop a methodology for understanding thespectra of QCA networks in terms of contributions from network components, andpresent a novel method for approximating the low energy spectrum of large QCAnetworks.5Chapter 2Modelling and Simulation ofQCA DevicesFor an arrangement of N devices, each composed of some number of quantumdots with a finite set of possible charge and spin states, the Hamiltonian can beexpressed as a Hubbard model in second quantization [43]:Hˆ =∑m,iE0nˆmi,σ − ∑m,i 6= jti jaˆ†mi,σ aˆm j,σ +∑m,iEQnˆmi,↑nˆmi,↓+Kc ∑m,〈i j〉nˆmi,σ nˆm j,σ ′∣∣~rmi−~rm j∣∣ +Kc ∑〈mk〉,i, j nˆmi,σ nˆk j,σ ′∣∣~rmi−~rk j∣∣ , (2.1)where aˆ†mi,σ and aˆmi,σ are the creation and annihilation operators for a spin state σat the i’th dot of the m’th device, nˆmi,σ = aˆ†mi,σ aˆmi,σ is the corresponding numberoperator, and the~rmi give the positions of each dot. The first term accounts for theon-site energy of a charge occupying a given dot. The second term represents theenergy associated with tunneling between two dots within a device. Intuitively, aparticle with a given σ is first annihilated at one dot and then created at the other.The third term represents the energy of two charges with differing σ occupying thesame dot. Finally, the remaining two terms define the Coulombic interactions ofcharges respectively within and between devices, where Kc = q2e/4piε for electroncharge qe and electrical permittivity ε . Einstein summation is employed over the62-Dot Devices 4-Dot DevicesFigure 2.1: Example polarization configurations for 2-state QCA devices.spin states σ ,σ ′ for conciseness. This model excludes terms associated with chargehopping between devices.2.1 Reduced Basis RepresentationsThe Hubbard model produces a Hilbert space which is impractically large in di-mension for most QCA networks of interest. For example, among the most com-mon model devices considered in QCA is a 4 dot cell containing two charges ofopposite spin [43]. In this case, there are 16 possible charge configurations in thesingle cell Hamiltonian. For N cells, the full Hilbert space is then 16N is size. Evenfinding the ground state of such a system is computationally intractable beyond ahandful of devices. Using Lanczos iteration [44], which can make use of the spar-sity and Hermitian structure of the Hamiltonian, eigenvalue decomposition is atleast as complex as matrix-vector multiplication. In the basis of charge states, theHamiltonian contains 16N diagonal elements, and each row of the matrix contains6N off-diagonal non-zero elements. The off-diagonal elements correspond to statesthat can be reached by hopping one charge to one of its potential alternate sites in acell: 2N charges, 3 locations per charge. In total then, matrix-vector multiplicationrequires 16N(1+ 6N) multiplications. At 5 devices, we already have about 32.5million operations per Lanczos iteration. In practice, we assume each device canbe modelled using a subset of the full set of charge states.2.1.1 Two-State ApproximationFor the 4-dot devices discussed, it has been shown that the ground state can beapproximately described within a subspace of the 16 dimension Hilbert space givenby the 2 polarization states shown in Fig. 2.1 [43]. The projection of the system7Hamiltonian into this reduced basis is known as the two-state approximation. Wecan express the Hamiltonian of a single 2-state QCA device in the {|+1〉, |−1〉}basis asHˆi(t) =−12γi(t)σˆx+ 12 hiσˆz =12[hi −γi(t)−γi(t) −hi], (2.2)where γi(t) is the tunneling energy with γi/} a measure of the rate of tunnelingbetween the two polarization states, hi is some polarization bias, and the σˆµ forµ ∈ {x,y,z} are the Pauli matrices, included for convenience:σˆx =(0 11 0), σˆy =(0 −ii 0), σˆz =(1 00 −1).Devices in this model are clocked by manipulating the tunneling rates γi. In thecase of metal island QCA, tunneling barriers between the sites could be directlycontrolled, thereby enabling a simple mechanism for clocking [18, 20]; however, itis not yet clear if this model of clocking is appropriate for nanoscale QCA. For Ncells, we incorporate coupling between polarization states of neighbouring devicesvia the so-called kink energy, Eki j, the cost of two cells having opposite polariza-tions. The Hamiltonian takes the formHˆ(t) =−12∑iγi(t)σˆ ix+ 12[∑ihiσˆ iz−∑〈i j〉Eki jσˆizσˆjz], (2.3)where here hi = ∑D EkiDPD is understood to arise from interactions with fixed-polarization driver or input cells, and the σˆ iµ operate on device i. These Paulioperators are constructed by replacing the i’th of N identity operators in a Kro-necker product with the corresponding Pauli matrix:σˆ iµ = Iˆ⊗·· ·⊗ σˆµ ⊗·· ·⊗ Iˆ.In this form, the polarization of cell i in the ground state is given as Pi = 〈σˆ iz〉.Matrix-vector multiplication in this approximation involves a somewhat more man-ageable 2N(1+N) terms. This allows us to handle networks approximately 4 timeslarger using the same computational resources. In practice, simulations without83-Dot Devices 6-Dot DevicesFigure 2.2: Example polarization configurations for 3-state QCA devices.further approximation are feasible for networks of at most 20 or so devices. Notethat while Tougaw and Lent also demonstrated that the dynamical behaviour ofa driven 3 cell QCA wire in this approximation shows close similarity to resultsobtained using the full basis [43], the general robustness of the two-state approxi-mation is likely strongly dependent on the particular QCA implementation.2.1.2 Three-State ApproximationAnother important model for QCA devices is the three-state approximation. Inthis case, we consider devices which accommodate both two polarization statesas well as a third null or inactive state. This null state produces no polarization-like interaction with neighbouring devices. An example of such a device and itsconfigurations are shown in Fig. 2.2. It is common to exclude tunneling betweenpolarization states, in which case the Hamiltonian of a 3-state cell can be expressedin the {|+1〉, |−1〉, |0〉} basis asHˆi(t) =−γiΓˆ+hiPˆ −Ci(t)(Nˆ − Iˆ)= hi 0 −γi0 −hi −γi−γi −γi Ci(t). (2.4)Here γi defines the tunneling between the polarized and null states, hi is againa polarization bias, and Ci describes a modulated clocking field controlling theenergy of the null state. The appropriate operators can be expressed asΓˆ =0 0 10 0 11 1 0, Pˆ =1 0 00 −1 00 0 0, Nˆ =1 0 00 1 00 0 0.9It is clear that Γˆ takes the role of σˆx from Eq. (2.2) and Pˆ the role of σˆz. Thepolarization is similarly defined as P = 〈Pˆ〉. Another useful quantity is the acti-vation or population, N = 〈Nˆ〉. For theoretical manipulations, it can be useful toexpress these operators in terms of the Gell-Mann matrices: Γˆ = λˆ4+ λˆ6, Pˆ = λˆ3,and Nˆ = 13(2Iˆ+√3λˆ8). The Gell-Mann matrices are included in Appendix D.1.Details of this method, including considerations for the N cell Hamiltonian, will bediscussed later in Chapter 5. For completeness, the number of non-zero elementsin the 3-state Hamiltonian is 3N(1+ 43 N).2.1.3 Intercellular Hartree-Fock ApproximationWhile the reduced basis approximations somewhat limit the computational com-plexity when compared to the full Hubbard model, the space of states remainsexponential. There are a number of strategies we might employ to further reducecomplexity. Perhaps the most extreme is the Intercellular Hartree-Fock Approxi-mation (ICHA), a ground state approximation method employing the Hartree-Fockmethod on the N cell Hamiltonian [15, 35]. It attempts to find the lowest energystate which can be expressed as a product state: |ψ0〉 =⊗Ni=1|ϕi〉. For the 2-stateHamiltonian, Eq. (2.3), the |ϕi〉 end up being the ground states of effective cellHamiltoniansHˆi =−12γiσˆ ix+ 12 h˜iσˆ iz, (2.5)where h˜i = hi−∑ j 6=i Eki jP j is the effective polarization bias for the current cell po-larizations,P j = 〈ϕ j|σˆz|ϕ j〉. These polarizations must be computed self-consistently.For these simple 2×2 cell Hamiltonians, the ground state can be computed analyt-ically and an expression for Pi found:Pi = −h˜i√γ2i + h˜2i. (2.6)The ICHA is known to be highly susceptible to being trapped in higher energymetastable configurations [36], the most well-known of which being a majoritygate with uneven input wire lengths. An example of this scenario is shown inFig. 2.3. If we initially assume all cells to be unpolarized, the polarization of the10(a) Majority Gate schematic (b) ICHA ground stateFigure 2.3: A majority gate with uneven input lengths exhibits incorrect behaviourusing the ICHA. If we start with all cells unpolarized, the shortest input reaches theoutput cell in fewer iterations of the self-consistency method. The output cell thenin turn biases the center cell which blocks the other inputs from contributing. Thedarker shaded cells here are fixed polarization inputs with P =+1 (red) or P =−1(blue). In (b), a typical ground state produced with the ICHA is shown. The lightershading is mainly to help with interpreting the cell polarizations. The two cellsadjacent to the cross from the longer inputs tend to be weaker in polarization dueto backpressure.shortest input will reach and polarize the output cell in fewer iterations. Fromthat point, the center cell will always see two equally polarized neighbours andbe resistant to flipping polarization, even when the other inputs eventually arrive.A common but non-ideal solution to this is to define clock zones such that, forexample, all majority gates have inputs of the same length. This imposes ruleson QCA design which have nothing to do with performance but instead merelylimitations in the simulation tools.2.2 Dynamics Using Reduced Correlation ModelsIf we ignore any decoherence mechanism, the dynamics of a network are given bythe Schro¨dinger equation:i}ddt|ψ〉= Hˆ(t)|ψ〉.11It is useful to express the dynamics in dimensionless formf˜dds|ψ〉=−iHˆ(s)|ψ〉, (2.7)where s = f t for some frequency scale f , Hˆ( f t) = EHˆ(s) for energy scale E , andf˜ is a dimensionless characteristic rate defined by 2pi f˜ = f/ f0 with f0 = E/h =E × 241.8THz/eV. The relevant energy scale is defined by the kink energies ofthe particular QCA implementation, on the order of tens to hundreds of meV fornanoscale QCA. Regardless of our choice in ordinary differential equation (ODE)solver, we are limited again by the matrix-vector product complexity of a Hamilto-nian which is exponential in the QCA network size. When including decoherenceeffects, it is common to apply the Liouville-von Neumann equation with a relax-ation term [29]. It can be expressed in dimensionless form asf˜ddsρˆ(s) =−i[Hˆ(s), ρˆ(s)]− ξ˜ [ρˆ(s)− ρˆss(s, ρˆ)] (2.8)where ρˆss is the steady state density operator and 2piξ˜ = 1/τ f0 defines the dimen-sionless relaxation rate for some characteristic time τ that it takes the system torelax to its steady state. In general, we allow ρˆss to be a function of both time andthe current system state.2.2.1 Coherence Vector FormalismDue to suspected limited correlations in QCA networks, it has been proposed toexpress the dynamics in terms of the coherence vector formalism (CVF) [36, 42].The density operator can be expressed in the basis of the generators of SU(2N):ρˆ(s) =12N[Iˆ+∑iλ iaσˆia+∑〈i j〉Ki jabσˆiaσˆjb + · · ·], (2.9)where we employ Einstein summation for the subscripts over the set {x,y,z}. Thedynamics of ρˆ are equivalent to those of the real-valued coefficients: λ ia = 〈σˆ ia〉,Ki jab = 〈σˆ iaσˆ jb 〉, etc. These coefficients are classified by the number of cells theyconsider: each cell is assigned a coherence vector λ i containing the 3 single-point12expectation values λ ia; each pair of cells gets a two-point correlation tensor K i jcontaining the 32 two-point correlations Ki jab; and so on. The power of this formal-ism comes in the capacity to limit the number of terms used to approximate ρˆ . Thedynamics of any one of these terms can be computed asf˜ddsΛk(s) = i〈[Hˆ(s), Λˆk]〉− ξ˜ [Λk(s)−Λssk (s, ρˆ)], (2.10)with Λˆk any of the operators in Eq. (2.9), Λk its expected value, and Λssk = Tr ρˆssΛˆk.Only the lowest order approximation of the dynamics is usually considered. Thisproduces a dynamic equivalent of the ICHA, including only the coherence vectors,and excluding two-point and higher correlations. The resulting system of equationsis given byf˜ddsλ i = Γi×λ i− ξ˜ (λ i−η i), (2.11)with Γi =[−γi, 0, 12(hi−∑n6=i Ekniλ nz)]and η i the local dissipation vector. Ab-sent dissipation, this has the nice property that each cell can be seen to evolveunder an instantaneous Hamiltonian dependent on the λ iz values of neighbouringcells. These are precisely our ICHA single-cell Hamiltonians:Hˆi =−12γiσˆx+ 12(hi−∑j 6=iEki jλiz)σˆz, (2.12)where we note Pi = λ iz by definition. This guarantees unitary evolution of thesystem state.2.2.2 Higher Order ModelsBy excluding all higher order correlations, the coherence vector formalism suffersall the same issues as the self-consistent ICHA approach for the ground state. As anattempt to resolve this issue, To´th et al. detail an approach employing a restrictedset of the correlation tensors [40]. There are two main concepts to apply. First,we consider the equations obtained from Eq. (2.10) for higher order correlationtensors. Expressions of i〈[Hˆ, Λˆ]〉 can be found for any given Ki jab, Ki jkabc, etc. byapplying a set of rules summarized in Fig. 2.4. These rules translate patterns in13X Y Z+- +-(a) ia→ ibX Y+-(b) i jaz→ ibX Y+-(c) ia→ inbzFigure 2.4: Super and subscript pattern conversion rules for generating equationsfor Eq. (2.10). Each matching pattern in a given Ki jab, Ki jkabc, etc. produces a singleterm in i〈[Hˆ, Λˆ]〉. The order of superscript-subscript pairs is arbitrary. Each arrowdescribes a conversion of a subscript a→ b which produces a term with a givensigned weight. In (c), the sum is over all cells which are not in the original term.the super-subscript pairs of Ki jk···abc··· into corresponding terms in Eq. (2.10). Take Ki jxzas an example, which includes components from all rule-sets. From Fig. 2.4(a),we have Ki jxz→−hiKi jyz and Ki jxz →−γiKi jxy; from (b), we have Ki jxz → Eki jλ iy; finally,from (c) we get the sum, Ki jxz→∑n6∈{i j}EkniKi jnyzz . We arrive at the dynamic equationf˜ddsKi jxz =−γiKi jxy−hiKi jyz+Eki jλ iy+ ∑n6∈{i j}EkniKi jnyzz − ξ˜(Ki jxz− Ki jxz∣∣ss).Appropriate definitions for the steady state term Ki jxz|ss and similar can be foundin Ref. [45]. The equations for all the two-point correlations are summarized inAppendix D.2. These rules make it trivial to automatically generate the relevantequations for any order of included correlations. The second important conceptlies in assessing the so-called correlation tensor propers:Mi jab =〈(σˆ ia−λ ia)(σˆ jb −λ jb )〉= Ki jab−λ iaλ jb , (2.13a)Mi jkabc =〈(σˆ ia−λ ia)(σˆ jb −λ jb )(σˆ kc −λ kc )〉= Ki jkabc−(Ki jabλkc +Kikacλjb +Kjkbcλia)+2λ iaλjbλkc ,(2.13b)and so on. These terms relate to the strength of specific correlations. If we eitherset or assume these terms to be zero, we can approximate higher order correlationsas combinations of lower order terms: e.g. Ki jab = Mi jab+λiaλjb ≈ λ iaλ jb . In Fig. 2.5,we see the dependence of the tensors for two-point interactions, before and after14xzzxxzxyzyyzyzzzxyxxxyyy zzxzyz(a) Full Correlation Setxxxyyy zzxzyzzxy(b) Reduction of Ki jkabcFigure 2.5: Graph representations of the interdependence of correlation tensors.All terms related to two-point correlations are shown in (a). Edges indicate a de-pendence of the dynamics of one node on the current values of the other. Thestructure of the conversion rules makes all dependencies symmetric. If three-pointcorrelations are neglected, additional dependencies are introduced. Each arrow in(b) represents a new direct dependence of the root node: e.g. xz is now depen-dent on zz. These dependencies tend not to be symmetric. The new single-pointcorrelation dependences are not shown.applying Eq. (2.13) to approximate Ki jkabc. Similar constructions can be made forcorrelations in 3-state QCA.To´th demonstrated that the difficulties experienced with the ICHA are primar-ily due to missing correlations in critical circuit components such as majority gates[36]. By reintroducing even just the two-point correlations, much of these issuesare resolved. There are, however, caveats when attempting to apply this method.2.2.3 Challenges with the CVFIt has previously been noted that calculated values of Mxy and Myz during simula-tions of 2-state dynamics tend to be much smaller than other two-point correlations[36, 41, 42]. It was argued by Karim that Kxy and Kyz are therefore good candidatesfor approximation with λa values. We need to be careful making such arguments,for reasons that follow. From Eq. (2.11), we see that the cell polarizations, λ iz ,are essentially just the integrals of γi(s)λ iy(s). As the values of γi are independent150.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s0. Valuexyz(a) All correlations0.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s0. Valuexyz(b) ICHA: Excluding all Mab0.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s0. Valuexyz(c) Excluding only Mxy and Myz0.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s3210yValues: ×103(a)(c)(d) Comparison of λy values.Figure 2.6: Simulation of a five cell QCA wire using a few correlation modelsexcluding any dissipation. Each of (a-c) gives the coherence vector values for eachcell in the wire throughout a clocking protocol discussed in Chapter 6; the detailsare otherwise unimportant here. In (d), we show that the arbitrary exclusion ofMxy and Myz results in smaller amplitude λy, and thus a failure of the wire to fullypolarize. Including all two-point correlations produces a result similar to (a). Thevalue of f˜ in these results is set at 1× 10−3, slow enough to be approximatelyadiabatic.16of f˜ , we see that for a cell to go from unpolarized to fully polarized over a clockcycle, the scale of the λy values must be effectively proportional to f˜ : i.e. if theclock takes twice as long, the values of λ iy must be roughly half as large. In fact,it is trivial to show (see Appendix B.3) that for any energy eigenstate of a 2-stateHamiltonian, the expected value of any product of Pauli operators with an oddnumber of σˆy terms is exactly zero. It follows that any Mabc··· term with an oddnumber of y subscripts will be small for a sufficiently slowly clocked system. Thevalues of λ iy, Mi jxy, and Mi jyz are all small for precisely the same reasons and underthe same circumstances. The same argument that leads to negating Mxy would alsojustify setting λ iy to zero, neglecting all dynamics entirely. The same discussion ap-plies to the 3-state Hamiltonian where σˆy is replaced by any linear combination of{λˆ2, λˆ5, λˆ7}with real coefficients. In Fig. 2.6, simulations for a 5-cell wire usingdifferent included correlations demonstrate this result.There is a more general issue when identifying correlations which may beexcluded. As Ki jk···abc··· is the expected value of a product of Pauli operators, thevalues must satisfy certain constraints. For example, as each of the Pauli oper-ators have eigenvalues of ±1, it follows that for any properly normalized state,−1 ≤ Ki jk···abc··· ≤ 1. More complicated relationships are satisfied by groups of theKi jk···abc···. The simplest relates the λia for any given device: (λ ix)2+(λ iy)2+(λ iz)2 ≤ 1.Whatever the values of Mi jk···abc···, they must satisfy such normalization constraints.Arbitrarily assigning values without additional considerations can easily lead tophysically non-viable states.2.2.4 Multicellular ICHA ApproachOne well behaved approach to incorporating additional necessary correlations whichmaintains all normalisation constraints is to use a generalized variation of the ICHAemploying multi-cell Hamiltonians [36, 41]. A general interpretation of the ICHAcan be seen as two approximations: (1) the system Hamiltonian is a sum, in partic-ular a Kronecker sum, of cell Hamiltonians; and (2), the system state is a productstate of individual cell states. These approximations can be summarized by the17(a) Multi-cell selection (b) Corrected ground stateFigure 2.7: Describing the entire 5-cell cross using a single multi-cell Hamilto-nian, the errors observed using the ICHA can be avoided.following equations:Hˆ =M⊕m=1Hˆm, |ψ〉=M⊗m=1|ϕm〉, f˜ dds |ϕm〉=−iHˆm|ϕm〉, (2.14)where, for example,⊕3m=1 Hˆm = Hˆ1⊗ Iˆ⊗ Iˆ+ Iˆ⊗Hˆ2⊗ Iˆ+ Iˆ⊗ Iˆ⊗Hˆ3 denotes themulti-term Kronecker sum, and⊗3m=1|ϕm〉 = |ϕ1〉⊗ |ϕ2〉⊗ |ϕ3〉 is the Kroneckerproduct. For the traditional ICHA, each Hˆm is a single cell Hamiltonian given byEq. (2.5) and M = N is the number of non-fixed-polarization cells. In general,Hˆm can describe any group of cells rather than a single cell. In Fig. 2.7, we showone multi-cell choice which solves the issues observed in the majority gate withuneven input lengths. Each group Hamiltonian is treated as a normal multi-cellQCA network where, as with the ICHA, instantaneous cell polarizations produceadditional contributions to the effective h biases in neighbouring Hˆm. More detailon a conceptually similar approach will be seen in Chapter 8. In the language ofthe previous discussion, this multi-cell ICHA neglects all Mi jk···abc··· involving cellsin different groups; all correlations within groups are maintained. In addition, aseach |ϕm〉 simply evolves under a Schro¨dinger equation for an effective Hamil-tonian, the evolution is unitary and all normalisation-like constraints are satisfiedautomatically.18Figure 2.8: Two majority gates in series with potential multi-cell groups shown.Here it is necessary to combine the two majority gates into a single multi-cell groupin order to avoid incorrect behaviour.This approach is effective at resolving issues with individual majority gates,and similar cases such as certain inverters and fan-outs; however, there are knownproblems. As an example, Karim describes two majority gates in series [41]. Asimilar arrangement is depicted in Fig. 2.8. If the multi-cell groups are chosensimply to be the two majority gates, then for certain input lengths and polarizationswe arrive at a similar problem to that of the ICHA. The order of events is roughlyas follows. The rightmost negative input initially polarizes the second gate beforeits corresponding positive input arrives. The now negative second gate biases theoutput of the first gate. If this bias arrives before both of the positive inputs, thefirst gate will eventually remain in a situation where it sees two negative and twopositive inputs. The output of the first gate remains negative, as will then be theoutput of the second gate. This can be corrected by increasing the distance betweenthe majority gates, but we then again would be constraining our design choices towhat a solver can handle. Alternatively, if both majority gates, as well as the con-necting wire, are included in a single multi-cell group, then the correct behaviouris achieved. There remain two issues here: (1) it is not necessarily trivial to iden-tify the set of groups which avoid all problems; and (2), these larger combinedgroups are constrained in size by the largest Hamiltonian size we can handle formatrix-vector multiplication. These issues are partially addressed in Chapter 8.19Chapter 3QCA Clocking as QuantumAnnealingAdiabatic quantum computing (AQC) describes an approach to finding the groundstate of a complicated system by slowly evolving an initially simple Hamiltonianinto one of interest [46, 47]. In practice, the system of interest is typically con-structed so that its ground state encodes the solution to some NP problem. Intheory, the ground state of any such system is guaranteed to be found if the sys-tem is isolated from its environment and the evolution is sufficiently slow [48];however, the exact meaning of sufficiently slow relates to details of the low energyeigenspectrum (LES): see Section 3.1.1. Quantum annealing (QA) is a heuristicapproach which employs quantum evolution to attempt to find the lowest energyconfiguration of a given energy function, without strictly adhering to the conditionsrequired for AQC [49]. QA can be implemented numerically as in simulated quan-tum annealing [50–52], or in hardware [53, 54]. It can be interpreted as an attemptto realise the ideal of AQC with the understanding that real systems cannot alwaysbe completely isolated, that the strict constraints on annealing speed demanded byAQC are not necessarily practical for all problems, and that often a sufficientlylow energy approximation of the true optimum is good enough. In this chapter, abrief overview of the QA protocol is presented, and the link between QA and QCAclocking is demonstrated.203.1 Quantum AnnealingOur goal is to find the ground state of a Hamiltonian of interest, HˆP. We beginwith some simple initial Hamiltonian, HˆI , with an easy to achieve ground state.Often, this ground state is an equal superposition of all the eigenstates of HˆP. Thetime-dependent Hamiltonian of the system can be expressed asHˆ(s) = A(s)HˆI +B(s)HˆP, (3.1)where A(s) and B(s) define the annealing schedule satisfying A(0)/B(0) 1 andB(1)/A(1) 1. The quantity A(s)/B(s) plays a similar role in QA to the temper-ature in simulated annealing (SA).3.1.1 The Adiabatic TheoremThe guarantee of finding the ground state is provided by the quantum adiabatictheorem. A complete proof is not presented here, but it is informative to see themain idea. Following the approach of Born and Fock [48], we express the quantumstate of the system in the basis of instantaneous eigenstates of Hˆ(s):|ψ(s)〉=∑ncn(s)|ϕn(s)〉e−iθn(s)/ f˜ , (3.2)where Hˆ(s)|ϕn(s)〉 = En(s)|ϕn(s)〉 and θn(s) =∫ s0 En(s′)ds′. The dynamics of thecoefficients cn can be found by differentiating Eq. (3.2):dds cn =−cn〈ϕn| dds |ϕn〉− ∑m 6=ncm〈ϕn| ddsHˆ|ϕm〉Em−En e−i(θm−θn)/ f˜ . (3.3)The first term can be set to zero by adding an arbitrary phase eiφn(s) to each eigen-state: see Ref. [55]. If we assume that we start in the ground state, c0(0) =1,cn6=0(0) = 0, and integrate in the limit of small f˜ , we eventually obtaincn6=0(s) = i f˜[Tn(0)− e−i(θn(s)−θ0(s))/ f˜ Tn(s)]+O( f˜ 2) (3.4)21where we define the transition parametersTn(s) =〈ϕn| ddsHˆ|ϕ0〉(En−E0)2 . (3.5)Here the Hˆ, En, and hence also Tn are assumed to be dimensionless. In full units,the Tn are scaled by the ratio f/E of the frequency and energy scales. In Eq. (3.4),these scale factors are ultimately absorbed into f˜ . Though not strictly necessaryfor this discussion, it is useful to note|cn6=0(s)|= f˜ |Tn(s)|[1+O(|Tn(0)|/|Tn(s)|)]+O( f˜ 2) (3.6)and that typically either Tn(0) is small as a result of the choice in annealing sched-ule, or at least Tn(0)/Tn(s) is small when cn(s) is significant. The probability ofbeing in the ground state is simply P0(s) = |c0(s)|2 = 1−∑n6=0 |cn(s)|2, and theevolution is adiabatic if P0(s) remains sufficiently close to 1. That is, the |cn(s)| areall sufficiently small. We then arrive at the usual condition for adiabaticity:maxs,n6=0|cn(s)| ∝ f˜ maxs,n6=0|〈ϕn| ddsHˆ|ϕ0〉|(En−E0)2 < δ (3.7)For some small δ  1. This in turn can be reinterpreted as an upper bound on thecharacteristic rate:f˜ < f˜max ∼ mins,n6=0(En−E0)2∣∣〈ϕn| ddsHˆ|ϕ0〉∣∣ . (3.8)This condition is often simplified to the softer bound f˜max = O(∆20/σmax) where∆0 =mins(E1−E0) is the smallest energy gap between the ground and first excitedstate and σmax = maxs || ddsHˆ||2 is the largest L2 matrix norm of ddsHˆ(s) [56].3.1.2 Landau-Zener Approximation for Diabatic TransitionsDetermining a specific value for f˜max requires additional information about the par-ticular eigenspectrum. The simplest model with an analytic solution was separatelypresented by Landau [57] and Zener [58]. The Landau-Zener approximation is asimplification of a two state avoided level crossing, shown in Fig. 3.1, with twoassumptions: (1) the energy difference between the two states is a linear function22Figure 3.1: Schematic avoided level crossing described in the Landau-Zener ap-proximation. It is assumed that the ground and excited states swap during thecrossing. The dashed lines show the energies of |1〉 and |2〉. The coupling termopens a gap at the point of crossing, producing a hyperbolic energy gap in theeigenenergies, shown with the solid lines.of time: 〈2|Hˆ|2〉−〈1|Hˆ|1〉= α · (s− s0); and (2), the coupling term is a non-zeroconstant determined by the minimum gap: 〈1|Hˆ|2〉=±12∆0. Under these assump-tions, the energy gap between the two eigenstates takes the form of a hyperbola:∆(s) = ∆0√1+ 1W 2 (s− s0)2, (3.9)where W is a measure of the width of the crossing. The probability of a diabatictransition out of the ground state can be expressed analytically asPg→e = e−2pi∆0W/4 f˜ , (3.10)and the appropriate bound on the annealing rate for a defined adiabaticity, PA, isf˜max =pi∆0W−2ln(1−PA) (3.11)233.1.3 Special Case of Quadratic OptimizationOne important class of NP problems are those that can be expressed in terms ofquadratic unconstrained binary optimization (QUBO). A large set of challengingand important problems can be mapped, either exactly or as an approximation, toQUBO problems: graph partitioning and colouring [59], Hamiltonian cycles andtravelling salesman [60], protein folding [61], boolean satisfiability (k-SAT) [62],integer factorization [63], and more [62, 64, 65]. In general, QUBO involves theminimization of a quadratic energy functionE(x) =∑ibixi+∑〈i j〉Qi jxix j (3.12)with variables xi in either {0,1} or {±1}, exchangeable under a linear transfor-mation, and bi and Qi j some real-valued weights. The optimal solution exactlycorresponds to the ground state of an Ising model, with HamiltonianHˆIsing =∑ihiσˆ iz +∑〈i j〉Ji jσˆ izσˆjz . (3.13)If the QUBO problem is expressed in the {±1} basis, then hi = bi and Ji j = Qi jup to some positive scaling factor. As a QA problem, it is practical to beginin a superposition of all possible solutions of Eq. (3.12): |ψ(0)〉 = ∑n|xn〉/2N/2,summed over all 2N possible binary vectors. This corresponds to the ground state ofHˆI =−∑i σˆ ix, which defines a suitable initial Hamiltonian. The full time-dependentHamiltonian can be expressed asHˆ = 12 A(s)HˆI + 12 B(s)HˆIsing=−12 A(s)∑iσˆ ix+ 12 B(s)[∑ihiσˆ iz +∑〈i j〉Ji jσˆ izσˆjz],(3.14)with the factors of 1/2 added for convention. This is the so-called transverse Isingspin glass (TISG) or sometimes transverse field Ising model: see for example Ref.[66]. This model is, by far, the most widely studied application of QA. It is si-multaneously simple to implement as well as powerful due to its link to QUBO.It also describes the operational Hamiltonian provided by the quantum processing24unit (QPU) developed by D-Wave Systems Inc. [54, 67].3.2 QCA ClockingAs discussed in Chapter 1, there are two main approaches considered for clockingQCA networks: zone clocking, and dynamic wave clocking. The relevant terms forclocking are the tunneling rates γi(s) and null state energies Ci(s) for 2 and 3-stateQCA respectively. The distinction between zone and wave clocking is fundamen-tally the same in both models; we consider 2-state clocking for simplicity.3.2.1 Zone ClockingIn zone clocking, all cells within a single clock zone satisfy γi(s) = γφ (s) for one ofthe clock phases. By comparing Eq. (2.3) and Eq. (3.14), we see that zone clockingin the 2-state approximation exactly corresponds to QA with a TISG model. Asuitable mapping is given byA(s) = γφ (s), B(s) = 1, hi = hQCAi , Ji j =−Eki j. (3.15)3-state QCA zone clocking is somewhat different and will be discussed in moredetail in Chapter 5. It can, however, still be understood as a QA problem, wherethe system is initialized with all cells in the null state, |ψ(0)〉=⊗i|0〉, and slowlytransitioned towards a fully polarized final state.3.2.2 Wave ClockingFor wave clocking, each cell experiences a clock dependent on its position: γi(s) =γ(s,~ri). The exact dependence is determined by the floor-plan of the clockingelectrodes: see Fig. 1.5. In general, wave clocking cannot be expressed in the formof Eq. (3.1); however, the adiabatic theorem, as well as the concepts of avoidedlevel crossings and diabatic transitions, still apply.25Chapter 4Dangling Bonds on HydrogenPassivated SiliconSo far, we have left ambiguous the actual structure of any physical QCA implemen-tation. In this chapter we discuss a recent and promising candidate architecture fornanoscale QCA. Work by Wolkow et al. has made possible the precise fabricationof arrangements of silicon dangling bonds (SiDBs) on hydrogen passivated siliconsubstrate: H-Si(100)-2×1 [68–70]. A schematic of the relevant surface structure isshown in Fig. 4.1. Each SiDB is constructed by applying a voltage pulse througha scanning tunneling microscope (STM) tip positioned over a surface hydrogen,desorbing the hydrogen and leaving exposed a dangling bond [70–73]. The pos-sible locations of SiDBs are constrained by the lattice geometry. The case forH-Si(100)-2×1 is shown in Fig. 4.1(b); however, alternate geometries such as thehexagonal lattice of H-Si(111) and its surface reconstructions could be of potentialinterest [74]. It has been demonstrated that particular patterns of SiDBs are capa-ble of reproducing logic functions, such as the binary wires and OR gate describedand investigated by Huff et al. [28]. The potential application of SiDBs as an ar-chitecture for nanoscale QCA has also been previously proposed [14, 27], with asimple biased QCA cell comprised of 4 SiDBs studied. Further details on fabrica-tion can be found in the literature [68, 70, 75]. This chapter is structured in orderto highlight both the properties of the SiDB architecture as well the capabilitiesof the simulation methods developed by members of Walus Lab. These methods26H DBSi(a) H-Si(100)-2×1 surface2.34 7.683.84(b) SiDB positionsFigure 4.1: Hydrogen atoms cap the top-most layer in 2×1 reconstructed Si(100).By applying a current through STM tip, surface hydrogen is desorbed exposing adangling bond with charge states in the band gap. The dimensions and possiblepositions for created SiDBs are indicated in (b).are included in the design tool discussed in Ref. [2]. Wherever appropriate, themodels and simulated results are presented immediately following the theoreticaldiscussion and experimental results.4.1 Charge States of SiDB ArrangementsThe core idea behind implementing logic using arrangements of the SiDB struc-tures is relatively simple. Each SiDB can contain some number of electrons andtake an appropriate charge state. The electrons in neighbouring SiDBs experienceCoulombic repulsion, coupling the charge states and making certain charge con-figurations more favourable than others. Tunneling between the SiDBs as well astransport between the surface and available states in the bulk result in a capacityfor the system to tend towards a low energy configuration. Here we discuss thenature of the SiDB charge states, experimental observations of the charge configu-rations of multi-SiDB arrangements, as well as a simple model for predicting theseconfigurations.4.1.1 The Preferred Population of an SiDBDepending on the doping of the substrate, the distribution of charge in the nearbysurface, and any applied bias in an STM or atomic force microscope (AFM) tip,SiDBs have been demonstrated to take on one of three charge states [27, 75]: an27DB-DB0DB+0/-+/0Transition Levelsn-dopingVACUUM SILICONVBCBEnergyDepth(a) Unbiased band structureDB-DB0DB+SILICONVBCBEnergyDepth0/-+/0(b) Upward band-bending-24.7 Hz -54.0 HzDB0DB-(c) AFM ScansFigure 4.2: Schematic of the SiDB charge configurations and the band structureof the silicon substrate. The charge transition levels are known to lie in the gapbetween the valence and conduction bands. The bulk Fermi level, ESiF is a functionof the doping and establishes the charge state of an isolated SiDB absent othereffects. Additional interactions at the surface are taken both to shift the transitionlevels relative to ESiF as well as induce band-bending, shown in (b). In (c), we showAFM scans of an isolate SiDB at different surface-tip biases.unoccupied positive state (DB+), singly occupied neutral state (DB0), and a dou-bly occupied negative state (DB-). We use population to describe the number ofcharges in an arrangement of SiDBs. The charge transition levels between thesestates are known to lie in the band gap of the bulk silicon [27]. It is argued thatthis isolates the surface charge states from the bulk, allowing for discrete and sta-ble charge population in the surface [76]. A schematic of the band structure of thesilicon bulk with the appropriate charge transition levels can be seen in Fig. 4.2.Absent all other influences, a single isolated SiDB takes on a charge state de-termined by the position of the Fermi energy of the doped silicon bulk with respectto the transition levels [28]. For example, if the Fermi energy lies between thetwo transition levels, than a double occupied DB- SiDB should eventually lose oneelectron to the bulk, and any unoccupied DB+ SiDB should eventually take on anelectron from the bulk. In this way, the “default” charge state is tunable by con-trolling the doping during fabrication. Further band-bending can result in a shiftof these transition levels with respect to the Fermi energy, and thereby change the28preferred charge state. A number of factors can contribute to band-bending nearthe surface, such as tip induced effects [1, 77] or charge interactions at the sur-face [78]. Understanding how these interactions adjust the energy landscape of thesurface is essentially to predicting the behaviour of these devices.4.1.2 Stable Configurations of Interacting SiDBsThe experimental results of Huff et al. regarding their OR gate implementation areincluded in Fig. 4.3. In these, and all other experimental results, the doping ofthe substrate produces a Fermi level above the (0/-) transition level, resulting in apreferred DB- charge state. Constant current STM is employed to visualize the lo-cations of created SiDBs, whereas non-contact AFM is suitable for distinguishingbetween DB0 and DB- states. See Ref. [1] for a discussion on the conditions underwhich AFM imaging can be expected to influence the surface configuration. Thereare two qualitative observations that can be made: (1) not all SiDBs are doublyoccupied; and (2), those that are seem to be fairly spread out.Non-neutral SiDBs interact through Coulombic repulsion. These interactionsresult in shifts in the transition levels of each SiDB with DB- or DB+ neighbours.In Ref. [28], measurements were made of the shift in the (0/-) transition level in atest SiDB at various distances from a nearby DB- SiDB. The amount of shift wasfound to be consistent with a screened Coulomb potential of the form∆E(r)≡ qeV = Kcr e−r/λT F , (4.1)with Kc ·εr = 1.44eVnm, ri j the SiDB separation, and both the relativity permittiv-ity εr and Thomas-Fermi screening wavelength λT F fit to experimental data. Valuesof εr = 5.6 and λT F = 5nm were reported. In Ref. [78], Huff et al. measured thisshift in the presence of various surface defects, each having its own fit parame-ters. For SiDB-SiDB interactions, the appropriate values were εr = 4.1± 0.2 andλT F = 1.8± 0.1. These two sets of values were extracted from experiments ondifferent regions of the H-Si(100)-2×1 surface and are significantly different. Wemust allow then that for different samples, and potentially different locations oneach sample, the physical parameters necessary for modelling may vary.If we approximate a DB- state as a single electron at the center of a host silicon29STMAFMSchematicFigure 4.3: Experimentally observed charge configurations for the OR gate de-signs by Huff et al. [28]. In the top row, constant current STM scans reveal thelocations of the exposed dangling bonds. The inputs of the gate are defined bythe presence or absence of perturber SiDBs at the top: see the schematic repre-sentations on the bottom row for clarity. In the middle row, non-contact AFMreveal the individual charge states. Adapted with permission from Springer NatureCustomer Service Center GmbH: Springer Nature, Nature Electronics, [28] (Huffet al. , Binary atomic silicon logic), ©2018.30atom, then we can associate the level shift with an electric potential Vi j = ∆Ei j/qe.A DB- SiDB raises the transition levels of its neighbours, potentially raising the(0/-) level above the Fermi energy. As two DB- SiDBs are placed closer together,their mutual repulsion raises their transition levels until they pass the Fermi energy.At this point, only one additional charge can exist between them. The number ofcharges in the surface is not simply a function of the number of SiDBs, but dependssignificantly on the particular arrangement and spacing of those SiDBs.4.1.3 Predicting Stable ConfigurationUsing this observation of screened charge interactions, we can understand the taskof finding stable charge configurations of any arrangement of SiDBs as a con-strained energy optimisation problem. We assume the total energy of an arrange-ment of charged SiDBs can we expressed asE(n) =∑iV exti ni+∑〈i j〉Vi jnin j, (4.2)where ni ∈ {−1,0,+1} is the net charge at each SiDB, Vi j is the Coulombic inter-action energy given by Eq. (4.1), and V exti is the local potential at each site arisingfrom any external sources such as electrodes or nearby impurities or surface in-homogeneities. Each SiDB experiences a local potential arising from both theexternal sources as well as the instantaneous population of all other sites:V loci (n) =Vexti +∑j 6=iVi jn j. (4.3)The amount of upward band-bending experienced by a given SiDB is −V loci . Weimpose two metastability criteria. Any electron in a DB- site should prefer to tunnelor hop to a nearby DB0 site if doing so reduces the total energy. We refer to thisas configuration stability. This can be understood from the perspective of the localpotentials. The change in energy associated with a charge ni at site i moving to apreviously uncharged site j can be found to be∆Ei j =−[V loci − (V locj −Vi jni)]ni, (4.4)31where the V loc are computed prior to the change. In the case of a DB- site, we haveni =−1, and thus the energy decreases if V locj −Vi jni >V loci . Our configuration sta-bility criterion is equivalent to the statement that all DB- SiDBs, after subtractingoff their own influences Vi jni from all DB0 sites, should not see a higher V locj .As discussed, a stable charge configuration requires that the transition levelsbe appropriately positioned with respect to the Fermi energy. We define µ− =ET L − EF to be the energy difference between the (0/-) transition level and theFermi energy for an isolated SiDB without any external bias. After band-bending,the (0/-) transition levels of every DB0 and DB- state must be above and belowthe Fermi energy respectively. We then require V loci ≥ µ− for all DB- sites andV loci ≤ µ− for DB0 sites. This we refer to as population stability. Note that weimpose configuration stability even if the resulting change in state would violatepopulation stability. Based on expected in-surface [79, 80] and bulk-surface [81]tunneling rates, it is assumed that the in-surface dynamics are much faster than thebulk-surface ones. A simulated annealing algorithm was implemented by SamuelNg to find solutions to this problem [2, 82]. It makes use of a number of physicallyinspired heuristics to both efficiently explore the space of possible charge statesand enforce metastability. Discussion on the criteria for DB+ states can be in [2].In Fig. 4.4, we show simulated results for the OR gate from Fig. 4.3. As dis-cussed in Ref. [2, 82], the particular physical parameters relevant to the originalOR gate presented by Huff et al. are uncertain. Further, there is strong evidenceof tip influence in the AFM scans, particularly in the case of the 11 input with 5DB- sites in close proximity. Using the values of εr and λT F reported in Ref. [28],we fail to produce either the experimental results or even suitable OR gate outputs;however, there exists a small range of parameters which produce the correct be-haviour. An example set is shown in the middle row of Fig. 4.4. In Ref. [2], wedemonstrated that a slight modification of the original design, simply moving thebottom-most SiDB to the lower site of the dimer, produces a much more robust ORgate with a large range of operating parameters.32MODIFIEDHUFF011011011000111000111000111111000000Figure 4.4: Simulated charge configurations of SiDB OR gates, showing the low-est energy metastable configurations for all inputs. The top row shows the re-sults for the OR gate in Fig. 4.3, using the reported εr = 5.6, λT F = 5nm, andchoosing µ− = −280meV. The middle row uses a parameter set which repro-duces the correct OR gate behaviour: εr = 10, λT F = 10nm, µ− = −250meV. Inthe bottom row, we shift the perturber on the output down from the upper to thelower site of the dimer, weakening the repulsion on the output SiDB. This mod-ification functions using the experimental parameters: εr = 5.6, λT F = 5nm, andµ−=−280meV. Filled circles indicate DB- sites. Half-filled circles indicate SiDBpairs with a charge degeneracy: a single electron occupying either site without anydifference in energy.334.2 Clocking SiDB StructuresThe dopant concentration establishes the default charge state through setting theFermi energy; however, we now understand that band-bending at the surface canresult in a shift in the preferred charge states of SiDBs. This observation leadsto a potential scheme for clocking SiDB arrangements by manipulating the band-bending through an externally controlled surface potential, thereby populating anddepopulating the SiDBs. This effect was experimentally verified through use ofa titanium silicide “island” created on the silicon surface with an array of SiDBsplaced in proximity [83]. The island behaves like a Schottky contact. When ap-plying an electrical bias to the silicide, STM measurements of the surface revealedupward band-bending. This allowed the controllable depopulation of all SiDBsnear the island. In [2], we proposed a more general approach to clocking usingeither buried or suspended electrodes. These electrodes would be used to control-lably induce band-bending at the surface, allowing the activation and deactivationof regions of the surface by raising or lowering the effective local (0/-) transitionlevels relative to the Fermi energy.4.3 Dynamic BehaviourPairs of SiDBs have been investigated as a potential candidate for charge qubits[79, 80]. The tunneling and decoherence rates of a shared charge were estimatedto be on the order of terahertz, with the precise value rapidly decaying with theseparation: from 103 THz at 3.84 A˚, to 101 THz at 3× 3.84A˚. We should expectany observations of dynamics at this scale to be limited to very specific methodsof investigation, such as time-resolved pump-probe scanning tunneling microscopy[81]. Observations of actual charge states of SiDB arrangements have thus far beenlimited to relatively slow AFM scans. However, recent results demonstrate that incertain circumstance much slower dynamics can be observed [1, 28].4.3.1 Degenerate HoppingIn Fig. 4.5, one of the results of Rashidi et al. is reproduced [1]. An arrangement ofsix SiDBs was designed such that the two innermost sites share a single electron.Due to the symmetry of the arrangement, there is no preferred location absent34363330f (Hz)0 9 18Time (min)(a) (b)Figure 4.5: Symmetric structure studied in Ref. [1], with six SiDBs and a chargedegeneracy between the two inner sites: (a) AFM scan at low tip-surface separa-tion, revealing all SiDBs as DB- sites; (b) 800 AFM line scans, back and forth overthe structure over about 18 minutes with a large tip-surface separation to preventtip-induced band-bending. Adapted from [1] with permission, copyright from theAmerican Physical Society.additional bias. Repeated AFM scans were made, passing the tip back and forthover all SiDBs, observing the evolution of the charge state over a period of about18 minutes. If the electron were distributed between the two degenerate sites, andassuming the interaction with the tip to be non-perturbative, we might expect tosee both SiDBs as partially charged with ∆ f between the DB0 and DB- values.Alternatively, we might expect that, as the tip approaches, the charge is fixed toone side or the other. In that case the location would be randomized betweenscans due to the expected fast tunneling. Instead, we see that the charge statepersists over multiple scans. This stabilization is attributed to lattice relaxation [1,28], which serves as an additional energy barrier to charge dynamics that becomessignificant in the cases of degeneracy. This barrier is potentially overcome in theseexperiments through a combination of thermal fluctuations and tip influence.351 24 3(a) SiDB-SiDB HoppingBULKSURFACE(b) Surface-Bulk HoppingFigure 4.6: Schematic of hopping rates in the hopping model. For surface re-configurations, rates are computed only from DB- to DB0 sites using Eq. (4.6).Population/depopulation is modelled by surface-bulk hopping, where each SiDB ispaired with a state in the reservoir with an energy of µ−. The appropriate energiesare shown in blue.4.3.2 Hopping Rate Model for Charge DynamicsA number of models were considered for approximating charge dynamics. An im-portant qualifier for any chosen model should be that the metastable configurationsdefined in Section 4.1.3 be stable under the dynamics. One simple approach is tointerpret all charge reconfigurations as sequences of hopping events. In particular,over some small time step dt, we allow some probability of a charge at a DB- siteto hop to a DB0 site: pi→ j(t;dt) = νi j(t)dt where νi j is the hopping rate. In thisdiscussion, we will address our two metastability criteria separately.Configuration StabilityIf a hop is associated with a decrease in Eq. (4.2) we should require a large νi j;an increase in energy should be heavily penalized. A schematic of this scenario isshown in Fig. 4.6(a). Two suitable models for the hopping rate were implemented:Mott or Miller-Abrahams variable range hopping (VRH) [84], and Marcus Hop-ping [85]. The hopping rate between a DB- and DB0 site is expressed asνi j ∝ e−2αri jη(∆Gi j) (4.5)where ri j is the distance between the SiDBs, α is the spatial decay of the hoppingrate, ∆Gi j is the change in Gibbs free energy associated with the hop, and η is36some function of ∆Gi j which depends on the choice of hopping model. In thecurrent implementation, we equate ∆Gi j to the change in electrostatic energy, ∆Ei jfrom Eq. (4.2), associated with the hop, plus an optional self-trapping energy, ρi,intended to approximate any expected effect of lattice relaxation. It is convenientto use a calibrated form for νi j:VRH : νi j = ν0e−2α(ri j−r0)e−∆Ei j/kBT (4.6a)Marcus : νi j = ν0e−2α(ri j−r0)e−∆Ei j[∆Ei j+2(ρi−λ0)]/4λ0kBT (4.6b)where here ν0 is the experimentally observed hopping rate between two degenerate(∆Ei j = 0) SiDBs at a separation of r0, λ0 is the reorganization energy in Marcustheory, and kBT is the thermal energy. Note that for VRH, assuming the sameself-trapping energy for all DB- sites, ρi is absorbed into ν0. For the results inFig. 4.5, we have at least 23 hops over 18 minutes and a separation of 5 dimers:ν0 ≈ 21mHz, r0 = 19.2A˚. In all results, a spatial decay of α = 1nm−1 and a ther-mal energy of kBT = 0.36meV at 4.2K is assumed. Note that for non-degeneratehopping, the ∆Ei j are typically on the order of tens of meV. In this case, eventhough ν0 is slow, the νi j may be on the order of GHz or faster.Population StabilityFor each SiDB, we require that its local potential from Eq. (4.3) be compatible withits charge state: including the self-trapping energy, a DB- site should depopulate ifV loci + µ− ≤ ρi, and a DB0 site should populate if V loci + µ− ≤ 0. For DB- states,we define a depopulation rate νi,B, associated with a hop between the surface SiDBand some charge reservoir, here understood to be the valence band of the bulk. Wecan define a similar population rate, νB,i, between the reservoir and each DB0site. We set the energy of charges in the reservoir to be at µ−. A schematic ofsurface-bulk hopping is shown in Fig. 4.6(b) with the appropriate energies. Weassume self-trapping is present only for DB- sites and thus it is ignored for νB,i.It is not clear that the form of these rates should be anything like those of surfacehopping. We anticipate that there is some maximum surface-bulk tunneling rateνB and invent some function f : ∆E→ [0,1], which translates an energy difference370 9 18 012Time (min)Charges(a) (b)Figure 4.7: Simulation of the system in Fig. 4.5: (a) lowest energy metastableconfiguration, including the observed degenerate pair; (b) simulated degeneratehopping using calibrated VRH. The same tip trajectory used in [1] is simulated:800 line scans over 18 minutes. This figure first appeared in [2].between the surface-bulk states into a scaled hopping rate. We can writeνi,B = νB f (−V loci +ρi−µ−), (4.7a)νB,i = νB f (µ−+V loci ). (4.7b)The function f should increase from f (−∞) = 0 to f (∞) = 1. In this work, weopted to use a sigmoid of the form f (∆E) = 1/(1+ exp(−∆E/kBT )), consistentwith a state of thermal equilibrium between the surface and bulk sites. In Ref.[81], expected values for bulk to surface tunneling rates were measured. At atemperature of 4K, values range from a few kHz to MHz. In our results we useνB = 1MHz. In Fig. 4.7, we show a simulation of the same system studied in Ref.[1]. In these results, the trajectory of a tip was simulated travelling at 8.9nm/sin order to achieve 800 line scans over 18 minutes to match the AFM scan. Thecharge of each SiDB was recorded as the “tip” passed over. No tip-induced band-bending or other interactions were considered. Further results using the hoppingmodel will be presented in Section 4.5. Details on the implementation hoppingmodel can be found in Appendix A.1.384.4 SiQAD: Simulation and Design of SiDB StructuresSilicon dangling bonds show promise as a platform for nanoscale computing; how-ever, exploration of this design space requires tools for the rapid design and simula-tion of arbitrary SiDB arrangements. To this end, a computer-aided design (CAD)tool was developed, with the code base and simulation tools developed by membersof Walus Lab, and experimental calibration and additional theoretical discussionfrom members of Wolkow’s group.Silicon Quantum Atomic Designer (SiQAD) is an open source CAD tool thatprovides both functionality for the rapid design of arrangements of SiDBs as wellas a number of simulation suites to approximate the behaviour of those arrange-ments [2]. The graphical user interface (GUI) is designed using the Qt C++ frame-work [86]. A custom application programming interface (API) is provided for themodular integration of any simulation tools. This API also offers limited capacityfor future efforts in design automation; an external tool can provide a list of com-mands to, for example, create or remove SiDBs, or specify electrode placementbased on the current design. There are currently three simulation engines includedwith SiQAD:SimAnneal: a simulated annealing algorithm implemented by Samuel Ngwhich attempts to efficiently find the lowest energy metastable configurations asdefined in Section 4.1.3. For a detailed discussion on the implementation of SimAn-neal, as well as a comparison against other potential solvers, see Refs. [2, 82].HoppingDynamics: a graphical tool which allows users to run the hopping-based dynamics model in Section 4.3.2 and Appendix A.1 on designs made inSiQAD. Dynamics can be computed and observed in “real-time”, up to a time-scaling factor, with model and clocking parameters adjustable without interruptingor resetting the simulation. In the current implementation, it is not possible to addor remove SiDBs during the simulation, except perturbers with fixed charge.PoisSolver: a finite element method (FEM) solver for the generalized Poissonequation developed by Nathan Chiu allowing for arbitrary arrangements of buriedor suspended electrodes. Details of this method can be found in Ref. [3].For an expanded discussion on the design and capabilities of SiQAD, see Refs.[2, 82].39(a) STM scan1.302 nm0.768 nm(b) SiDB layoutFigure 4.8: Constant current STM scan of a biased 4-dot QCA cell composed of4 SiDBs designed by Haider et al. [27]. Adapted with permission from Ref. [27].Copyrighted by the American Physical Society.4.5 SiDB Implementations of QCAHaider et al. demonstrated a simple 4 SiDB design which, when biased by twoadditional DB- sites, behaved like a 4-dot QCA cell. The relevant STM scan andSiDB layout are reproduced in Fig. 4.8. Beyond this result, there have been no fur-ther experimental investigations into the behaviour of larger arrays of SiDB-QCAcells. In his thesis, Ng presented a design for a 4-dot QCA cell made using SiDBsin a slightly different geometry. Calculations using SimAnneal were presented forsome some simple components: a wire, an inverter, a 90◦ wire bend, and a majoritygate [82]. Ng demonstrated that, at a certain value of µ−, these arrangements havethe correct metastable configurations. While an important result, this leaves openthe question of whether they would operate correctly when clocked. SimAnnealcannot directly tell us anything about clocking performance; however, we can at-tempt to use it to find the sequence of lowest energy metastable configurations asthe applied clocking field is swept. If we assume the physical system evolves totrack these configurations, this sequence gives us an idea of the trajectory of chargestates a clocked arrangement must go through.In Fig. 4.9, we present an SiDB-based inverting chain with dimensions consis-tent with Ng’s design. Each vertical pair of SiDBs can be understood as a 2-dotQCA cell. We add an input perturber to the left with the same separation as the402.304 nm3.072 nm3 rows8 columns(a) SiDB layout(b) Equivalent QCA schematicFigure 4.9: SiDB implementation of an inverting chain and its equivalent QCArepresentation. The leftmost SiDB serves as the input, and the rightmost SiDBsapply back-pressure without any polarisation bias. This emulates the effect of therest of the circuit. The orange cells in (b) represent cells with fixed charge distri-bution. Important dimensions and a sub-region of the lattice are shown.-0.113-0.180-0.096-0.093-0.054-0.083-0.030-0.016Figure 4.10: Sequence of lowest energy metastable configurations of the invertingchain as the value of µ− is swept. Perturber sites are excluded for simplicity.The numbers indicate the smallest magnitude µ−, in eV, necessary for the shownconfiguration. The intended final state is highlighted. Values shown are for εr =5.6, λT F = 5nm.412.304 nm2.304 nm3.072 nm3 rows8 columns6 columns(a) SiDB layout(b) Equivalent QCA schematicFigure 4.11: SiDB implementation of a non-inverting wire and its equivalent QCArepresentation. Note here that without drawing in the boundaries of each QCA cell,the distinction between four 4-dot QCA cells forming a wire, and eight 2-dot QCAcells forming an inverting chain is made non-arbitrary only by the differences ininter- and intra-cell distances.-0.011-0.021-0.038-0.060-0.094-0.099-0.122-0.131-0.138-0.214Figure 4.12: Sequence of lowest energy metastable configurations of the wire asthe value of µ− is swept. Bias sites are excluded for simplicity. The numbersindicate the smallest magnitude µ−, in eV, necessary for the shown configuration.The correct state is highlighted. Values shown are for εr = 5.6, λT F = 5nm.42cell-cell spacing. The two SiDBs at the right are there to apply back-pressure tothe chain without adding any polarization bias. These make up for the missingband-bending due to truncating the chain. In Fig. 4.10, we sweep the value of µ−from slightly below zero1 down until we first observe a lowest energy metastablestate with one too many charges. The first value of µ− at which each state is ob-serve was recorded. When clocking a QCA wire, the idea is to propagate the inputpolarization from cell to cell. This is not what we see here. The charge repul-sion means that the wire populates effectively from the middle outwards. We willstudy this sort of population order more in the next chapter. We show very similarbehaviour for a non-inverting wire of 4-SiDB cells in Fig. 4.11 and Fig. 4.12.This picture of clocking, where we sweep the value of µ− shared by all SiDBs,is consistent with applying a uniform external voltage at the surface. This could beachieved over a finite region using an electrode like the zone clocking arrangementshown in Fig. 1.4. It is clear that some of the issues might be reduced by usingwave clocking, which in this case would afford better control over where popula-tion/depopulation events occur. In Fig. 4.13, we show a simulation of an invertingwire under wave clocking using HoppingDynamics. An important quantity hereis the ratio between the clocking wavelength, λ , and the inter-cell separation. Weused a wavelength of 48nm in order to demonstrate that, in principle, with a smallenough wavelength these population order issues can be resolved; however, currentlimitations in nanofabrication techniques constrain the minimum electrode pitch,and hence also the clocking wavelength. A wavelength on the order of 200nmis more realisable, four times the contacted gate pitch in current FinFET fabrica-tion processes [39]. If we used a 200nm clocking wave, we would require cellseparations of about 12nm in order to produce similarly well behaved clocking.This is roughly twice the Thomas-Fermi screening distance at 5.6nm, significantlyweakening the cell interactions. We will see in the next chapter that the inclusionof a tunneling energy within the SiDB-QCA cell model further helps here, poten-tially making wires practical; however, higher density regions like the center of amajority gate would still be a challenge.1If we start at zero, the perturbers themselves are unpopulated. We start the sweep at the weakestµ− needed to populate the perturbers.433.072 nmInput(a) Inverting chain layout (b) 4-phase electrode schematic(c) Charge states at pi/2 increments in clocking phase.Figure 4.13: HoppingDynamics simulation of an SiDB inverting chain under 4-phase wave clocking. Clocking was implemented using buried electrodes, 100nmbelow the surface with a clocking wavelength of λ = 48nm. At that depth the fieldat the surface is approximately a travelling sinusoid, chosen to have a peak-to-peakamplitude of 200mV. Values of V ext(x, t) are indicated by the colorbar. In (c), weshow the charge states in a dynamic simulation at time slices corresponding to pi/2phase shifts. With such a small clocking wavelength, the bit-packet propagationbehaves as expected. Between φ = pi/2 and φ = pi , we changed the input state inorder to demonstrate that multiple wave packets can be propagated independently.The colorbars show V ext(x, t). A version of this figure first appeared in [2].444.6 SummaryThis chapter gave an overview of the potential of silicon dangling bond structuresas a platform for binary logic, and in particular, their candidacy as a means of im-plementing nanoscale QCA. We demonstrated that experimentally observed chargestates of arrangements of SiDBs can be understood as a constrained energy min-imization problem. A simulated annealing solver, SimAnneal, was developed bySamuel Ng for the purpose of efficiently solving this problem. Despite expectedrapid tunneling between neighbouring SiDBs, experimental results suggest signifi-cant lattice relaxation in H-Si(100)-2×1 which enforces charge localisation amongdegenerate SiDBs pairs over time scales on the order of seconds to minutes. Thisbehaviour motivated a hopping model for the dynamics and ultimately a simulationtool, HoppingDynamics. We briefly introduced SiQAD, a CAD tool for developingand simulating SiDBs arrangements which makes use of SimAnneal, HoppingDy-namics, and PoisSolver, our in-house FEM solver for finding the time-dependentelectric potential at the surface due to any arbitrary arrangements of clocking elec-trodes. Finally, while it is possible to design SiDBs arrangements which reproducethe functionality of simple QCA networks, we discussed the potential challengeto clocking that arises due to the complex charge state trajectories that result fromout-of-order population of even simple wires. In the next chapter, we will raise thisissue of population order from the perspective of conventional QCA analysis.45Chapter 5Population Congestion in 3-StateQCAWhen designing QCA networks, the predominant focus is on the coupling betweenpolarized states, arising from the energy cost of neighbouring cells having the sameor opposite polarizations. For 3-state mixed valence molecule (MVM) devices, it isknown that additional interactions between the null and polarized states of neigh-bouring devices lead to an increase in the required clocking field needed to activatea cell [87, 88]. It has been proposed that complexities arising from these interac-tions can be mitigated by employing a dynamic clocking wave with a sufficientlylarge maximum field strength. In the previous chapter, we observed that for SiDB-based QCA implementations, the order of cell activations, in this case SiDB popu-lation events, can be complicated. We will use the term congestion to refer to thiseffect of strong population repulsion. SiDB devices pose an additional challengein that a large applied clocking field can easily produce too large of a surface popu-lation. It is therefore unclear whether SiDB based QCA devices could be operatedin a regime of clocking parameters for which this congestion is mitigated.In this chapter, we investigate the conditions under which these congestioninteractions contribute significantly to 3-state QCA devices and the behaviour ofthese devices in the regime of strong congestion. In Section 5.1, we discuss the Ncell 3-state QCA Hamiltonian including congestion and polarization interactions.In Section 5.2, we consider two potential MVM and SiDB-QCA devices and ap-46proximate the dependence of the interaction energies on the device geometries.In Section 5.3, we study the ground states of simple QCA networks. We presentan analytical estimate of the single cell ground state and calculate the maximumcongestion interaction strength beyond which a QCA network displays irregularground states. Finally, in Section 5.4, we discuss the extent to which dynamicwave clocking addresses these challenges for SiDB-QCA devices.5.1 Full 3-State HamiltonianThe Hamiltonian of a single 3-state QCA device in the presence of some clockingfield was given in Eq. (2.4), repeated here for convenience:Hˆi(t) =−γiΓˆ+hiPˆ −Ci(t)(Nˆ − Iˆ)The cell population/activation, Ni = 〈Nˆi〉, is the probability of a device being ineither of the polarized states, or equivalently, not in the null state. For multipledevices, the full Hamiltonian will take the formHˆ(t) = −∑i γiΓˆi+∑i hiPˆi−∑i(Ci(t)−µi)(Nˆi− Iˆ)−12 ∑〈i j〉Eki jPˆiPˆ j +∑〈i j〉Di jNˆiNˆ j (5.1)where charges in the upper sites both interact via their polarizations through Eki jand inhibit population of neighbouring devices with congestion energy Di j. Wealso introduce a term µi which represents a population/congestion bias. Individualeffective device Hamiltonians from the 3-state ICHA are given byHˆi =−γiΓˆ+ h˜iPˆ − (Ci− µ˜i)(Nˆ − Iˆ), (5.2)with h˜i = hi− 12 ∑ j 6=i Eki jP j as with the 2-state ICHA and instantaneous effectivecongestion biases µ˜i = µi +∑ j 6=iDi jN j for current populations N j. We can arguefor a suitable value of Di j in the same way we define Eki j. Suppose we had a pairof 3-state devices with γi = hi = µi = Ci = 0. The pair then has a HamiltonianHˆi j =−12 Eki jPˆ1Pˆ2+Di jNˆ1Nˆ2. (5.3)47The eigenstates with non-zero energy are those with both devices in a polarizationstate. They have energies|−1,−1〉 :−12 Eki j +Di j |−1,+1〉 : 12 Eki j +Di j|+1,−1〉 : 12 Eki j +Di j |+1,+1〉 :−12 Eki j +Di jThe kink energy is then the change in energy when one of the polarizations isswitched, and the congestion energy is the average. That is, if E++ and E+− arethe energies of |+1.+1〉 and |+1,−1〉 absent all other devices and biases, thenEki j = E+−−E++, Di j = 12(E+−+E++). (5.4)5.2 Considered DevicesWe consider two interesting candidates for atomic scale QCA: zwitterionic nidocarborane (Fc+FcC2B−9 ) [26] and the discussed SiDBs [14, 28]. We assume E+−and E++ can be expressed as a sum of point charge interactions between devices:E(n) = ∑i∈Ω1∑j∈Ω2V (|~ri−~r j|)nin j (5.5)where ni is the number of electrons at site i located at ~ri, V gives the distancedependence of the charge interaction, and Ωi is the set of sites in the i’th device.5.2.1 Zwitterionic Nido CarboraneA number of mixed-valence species have been considered as candidates for molec-ular QCA. Of particular interest in recent studies are diferrocenyl acetylene, havingtwo Fe centers serving as quantum dots, and self-doping zwitterionic nido carbo-rane. The latter contains two Fe centers as well as a carborane cage, providingthree quantum dots, one fixed electron and a mobile hole [89]. The schematics ofthe appropriate 3-dot and 6-dot devices using Fc+FcC2B−9 are shown in Fig. 5.1.We will focus on wire arrangements with device centers separated by some dis-tance s≥ d. If the device separation is sufficiently large, we can find approximate483-Dot Device Geometry6-Dot Device GeometryFigure 5.1: Schematics for 3-state Fc+FcC2B−9 devices. Devices always remainnet neutral. Parallel 3-dot devices produce inverting wire chains. 6-dot devices aremade from pairs of 3-dot devices with upper sites forming a square.expressions for our devices. Assuming no screening, V (r) = Kc/r, we arrive atEk3-dot ≈−Kc2d(d/s)3[1− 38(d/s)2], (5.6a)D3-dot ≈ Kcv (v/s)3, (5.6b)for the 3-dot devices. Both parameters are cubic in inverse distance indicative of adipole-dipole interaction between devices. For the 6-dot device, we getEk6-dot ≈6Kcd(d/s)5, (5.7a)D6-dot ≈ 4Kcv (v/s)3[1+ 32(d/s)2]. (5.7b)Interestingly, while the kink energies fall off as (1/s)5, the congestion energy decayremains cubic. From the point of view of congestion, there is no distinction be-tween the two polarized states. In this view, both the 3-dot and 6-dot devices areeffectively dipoles with some positive charge in the upper sites, negative chargein the lower site(s), and a length scale of v. For the 6-dot device this has the un-fortunate outcome that the congestion persists between devices at separations wellbeyond the kink energies. Fig. 5.2 compares the exact Ek and D values computed496-Dot Devices3-Dot Devices1 2 3 4 5Cell Separation, (s/d)100101102Interaction Energy, meV100|Ek|/Ek1 2 3 4 5Cell Separation, (s/d)100101102Interaction Energy, meV100|Ek|/EkFigure 5.2: Kink and congestion energies for both Fc+FcC2B−9 device configura-tions. Dashed lines show the values computed from Eqs. (5.6) and (5.7). The rightaxis gives values for the D/Ek ratio. For the 6-dot device, s < 2d is excluded as ithas dot-dot separations between cells which are smaller than intra-cell separations.Values are shown for d = 1nm, h = 0.5nm, εr = 1. The red gradient indicates thethermal energy at 300K for reference.from Eq. (5.5) to the approximations. Device separations must be kept small tokeep the kink energies above the thermal energy; however, at these separations thecongestion energy is also significant. A higher clocking field must be applied inorder to guarantee the eventual population of all devices.5.2.2 Silicon Dangling BondsDimensions for SiDB devices are shown in Fig. 5.3(a-b). We use the screenedpotential discussed in Section 4.1.2: V (r) = Kc exp(−r/λT F)/r, and opt to usethe fit values given in Ref. [78]: εr = 4.1, λT F = 1.8nm. These values are morefavourable for the D/Ek ratio. Absent evidence that doubly occupied and unoc-cupied states can be present at the same dopant concentration, we should expectQCA devices made of SiDBs to be either net negative or net positive. This hintsat the susceptibility to congestion observed in employing them for logic gates. Forthe 2-dot device, we approximate the interaction strengths asEk2-dot ≈−Kc2de−s/λT F F1(s/λT F)(d/s)3, (5.8a)D2-dot ≈ Kcs e−s/λT F[1− 14 F1(s/λT F)(d/s)2], (5.8b)501nm1.15 nm2-Dot Devices 4-Dot Devices(a) (b)(c) (d)1 2 3 4 5Cell Separation, (s/d)100101102Interaction Energy, meV100101|Ek|/Ek1 2 3 4 5Cell Separation, (s/d)100101102Interaction Energy, meV100101|Ek|/EkFigure 5.3: SiDB device geometries and separations are constrained by theH-Si(100)-2×1 surface lattice shown in (a). In (b) we define the relevant dimen-sions for 2-dot and 4-dot SiDB devices. The corresponding interactions energiesare shown in (c) and (d) with d = 1nm and w = 1.15nm. At 4.2 K, the thermalenergy is only about 0.36 meV.where F1(x) = 1+ x is the first of a set of polynomials which describe the screen-ing dependence of the dominant terms in the expansion. Similarly, for the 4-dotdevices:Ek4-dot ≈Kc2se−s/λT F F3(s/λT F)(d/s)2(w/s)2, (5.9a)D4-dot ≈ Kcs e−s/λT F[4−F1(s/λT F)(d/s)2+F2(s/λT F)(w/s)2], (5.9b)with F2(x) = 2F1(x) + x2 and F3(x) = 5F2(x) + 2F1(x) + x3. The calculated in-teraction energies are shown in Fig. 5.3(c) and (d). Even with screening, we seethat congestion dominates. We will see that for such large congestion energies 3-state QCA operation is a potential challenge. We conclude with a summary of our510.0 0.5 1.0Normalized Time, s101Clocking Field0.0 0.5 1.0Normalized Time, s101Clocking FieldTrapezoidal Wave Sinusoidal WaveFigure 5.4: Simple normalised wave profiles. In our case, observe that the maxi-mum clock occurs as s = 1/2.approximations of the ξ =D0/Ek0 ratios, keeping only the most dominant term:ξ3 ≈ 2(v/d)2, ξ6 ≈ 23(v/d)2(d/s)−2,ξ2 ≈ 2F1(s/λT F)(d/s)−2, ξ4 ≈ 23F1(s/λT F)(d/s)−2(w/s)−2.(5.10)5.2.3 ClockingWe will consider both zone and wave clocking: refer to Fig. 1.4. We write theclocking field in either case in the formZone Clocking : Ci(t) = C0+ ∆C2 ×C¯( f t), (5.11a)Wave Clocking : Ci(t) = C0+ ∆C2 ×C¯( f t− xi/λC). (5.11b)where C0 and ∆C define the average and peak-to-peak amplitude of the clockingfield at the surface, f is the frequency of one of the clocking phases, xi are the de-vices locations, λX is the wavelength of the clocking field, and C¯ is some functionwith a period of 1 satisfying−1≤ C¯(s)≤ 1 which defines the clocking profile. Forwave clocking, we disregard any complexities of the generated field and merelyassume it to be some wave translated in xˆ. For the clocking profiles, C¯, we consideronly trapezoidal and sinusoidal waveforms, shown in Fig. 5.4 for completeness.We note also that the interpretation of the clocking field is slightly different be-tween MVM and SiDB devices. For the 3-dot and 6-dot devices, we are interested52in the difference in electrical potential between the null and upper sites. This isapproximately determined by the zˆ component of the electric field and the nullsite depth, v: Ci(t) ≈ ~E(xi, t) · vzˆ. For the 2-dot and 4-dot devices, the clockingfield is just the applied electrical potential at each site associated with the clockingelectrodes: Ci(t) =V exti (t) [2].There are two main constraints we need to impose on our clocking fields. First,we set a minimum clocking wavelength of 200nm consistent with current nanofab-rication limitations: see Section 4.5. Second, we must consider practical upperbounds on the clocking fields for our two architectures. For MVM devices, thecombinations of large kink energies and small v gives rise to large electric fields.A required clocking field equal to Ek necessitates an electric field on the order of102 or 103 meV/nm. While a few V/nm is realisable [88], larger electric fields arepotentially a challenge. For SiDB devices, it is relatively easy to populate both ofthe SiDBs in a 2-dot device [2]. The population of one of the SiDBs raises the tran-sition level of the other by an amount equal to V (d), approximately 200 meV forour chosen parameters. The difference between a sufficiently large clocking fieldfor full population and the case of over-population is then only about 200meV. Wewill see that this constraint limits the benefits of wave clocking for SiDB devices.5.3 Ground State CharacterisationHere we investigate the ground states of 3-state QCA devices. First, we look indetail at how a single device responds to all our parameters. Then we consider theconditions for which the ground state of simple QCA components indicate correctdevice operation.5.3.1 Single Device AnalysisIt will be useful to fully understand the ground state of a single 3-state device underdifferent parameters. We take the single-cell Hamiltonian given in Eq. (2.4) wherewe understand µ to merely act as a shift in the clock C. The population N andpolarization P can be expressed in terms of the ground state energy, E0:N = C−E0(C+hP˜)−2E0, P˜ = PN =2E0hE20 +h2; (5.12)53100 0 100Clocking Field,50050Energy:E/2100 0 100Clocking Field,50050Energy:E/2(a) (b)Figure 5.5: Spectra of the 3-State Cell Hamiltonian: (a) γ = 10meV, h = 0meV;(b) γ = 10meV, h = 50meV. Dotted lines indicate E = C,0,±h. We subtract C/2from the energies to highlight symmetries in the spectrum. Energies are in meV.however, E0 is non-trivial to express analytically. In Fig. 5.5, we show spectra forthe two limiting cases of |h|: small and large with respect to γ . In both of thesecases, the ground state energy is approximately the lower half of a hyperbola withcenter (C,E− C/2) = (−|h|,−|h|/2). It is natural then to assume a general form ofE0 = 12(C−|h|)− 12√α2+(C+ |h|)2. (5.13)This is exact up to some parameter α(γ,h,C). In Appendix B.1, we discuss ourapproach to estimating α2. Here, it is sufficient to note that 4γ2≤α2≤ 8γ2 with theminimum and maximum values achieved at lim|h|→∞ and h = 0. The assumptionof a hyperbolic form for E0 is equivalent to assuming α is independent of C. Weuse a general approximationα2(γ,h,C) = 4γ2[2+u2−u√2+u2], (5.14)with a parameter u that depends on the desired accuracy. In Table 5.1, we givenumerically achieved upper bounds on the errors relating to different forms of u.These bounds are independent of γ , justified as follows. From Eq. (5.12), we seethat if E0/γ is a function only of (h/γ,C/γ) then so areN and P˜ . We can also rewriteEq. (2.4) in the form Hˆ(γ,h,C) = γH¯(h/γ,C/γ). It follows thatE0(γ,h,C) = γE¯0(h/γ,C/γ). (5.15)54-10 -5 0 5 10Clocking Field, Values| |-10 -5 0 5 10Clocking Field, Values| |(a) (b)Figure 5.6: Polarization and population responses for the two limiting cases ofbiases: (a) h = γ/10; (b) h = 5γ . The population is approximately a sigmoid withcenter C =−|h| (indicated) and a width of α ∝ γ .Table 5.1: Worst-case errors over all (h,C) for different u parameter approxima-tions.Approximation ∆E0/(E0−C/2) ∆P˜ ∆Nu = 1/2⇔ α2 = 6γ2 13.4% 0.037 0.039u = |h|/γ 4.7% 0.1061 0.019u = 3|h|/(√9γ2+C2−C) 1.1% 0.006 0.013This form also holds for Eq. (5.13) if the u parameter depends only on (h/γ,C/γ), asis the case for our approximations. The worst cases for the absolute errors ofN andP˜ and the relative error of any linear combination of (E0,h,C) then occur at specific(h/γ,C/γ) pairs and are otherwise independent of the choice in γ . In Fig. 5.6, weshow the typical cell response for low and high |h|/γ. A fair approximation of thepopulation is the sigmoidN ≈[1+ e− 2(C+|h|)α(γ,h,−|h|)]−1(5.16)obtained by expanding Eqs. (5.12) and (5.13) about C = −|h|. This sigmoid ap-proximation gives a maximum error of ∆N . 0.06 for u = 1/2. If the polarizationbias is strong, the degree of polarization, |P|, is essentially given by the population;1In the case of u = |h|/γ, the large error in P˜ occurs in the regime where both C is large and|h|  γ . This corresponds to the presence of some kink at a populated cell. For C < 0, the error in P˜is less than 0.014.55Inverting Wire Non-Inverting WireFigure 5.7: Wire arrangements with a driver cell to the left with fixed polarizationand population of 1. Wires shown in their ideal fully populated configuration.otherwise, the polarization will tend to lag behind the population as the clockingfield is increased.5.3.2 Populating a Wire near the Classical LimitThe influence of the congestion energy comes into effect only when multiple de-vices are being considered. In the ICHA picture the situation is intuitive: the in-stantaneous polarizations act to shift the biases, hi, of each of the devices, andthe populations shift the effective clocking fields, Ci. The net effect is to shift thepopulation sigmoid to a new center, which we will refer to as the activation energy:Cai = µi+∑j 6=iDi jN j−∣∣∣∣∣hi− 12∑j 6=i Eki jP j∣∣∣∣∣= µ˜i−|h˜i|, (5.17)neatly expressed in terms of the instantaneous effective biases from the ICHAHamiltonian. We should expect a device to have a population of about 1/2 whenthe local clocking field is equal to the instantaneous activation energy. Using thisidea, we can estimate bounds on the minimum clocking field needed to operate afunctioning biased QCA wire as shown in Fig. 5.7. In Appendix B.2, we discussbounds on the smallest and largest expected activation energies, Calo and Cahi, duringwire clocking. Note that for wires which populate out-of-order a larger range ofactivation energies may be possible. Using u = 1/2 in Eq. (5.16), we expect thatwith a clocking field 5γ above Cai we get a population of Ni ≈ 0.98. We extendour expected lower and upper bounds on the activation energy by this 5γ factor toguarantee sufficient population and depopulation during clocking. We then obtainvalues for the average clock and clock swing:C0 = 12(Calo+Cahi), ∆C = Cahi−Calo+10γ. (5.18)56n+1 n+2Figure 5.8: Driven inverting wire in the weak tunneling limit after successfullypopulating the first n cells. In-order population would be violated if cell n+2 hasa lower activation energy than n+1.In the absence of congestion, wires tend to populate in-order. For zone clock-ing, this simply meansNn+1 ≤Nn for all devices, n. In wave clocking, we are onlyconcerned with the devices within the rising edge. We will consider the constraintson ξ which produce in-order population. The result is simple if we can ignore tun-neling: γ = 0. We suppose that some number of the cells have populated in-order.This scenario is depicted in Fig. 5.8. The wire will continue to populate in order ifCan+2 ≥ Can+1. From Eq. (5.17), this corresponds to the constraintξ ≤ (|h˜n+1|− |h˜n+2|)/Ek0(µ˜n+1− µ˜n+2)/D0 . (5.19)In our case, the contributions to Ca from congestion fall-off more slowly than frompolarization. As a wire populates, we should then expect this bound to decrease.We assume then that there are only two cases of interest: (1) the first cell to populatein an unpopulated wire; and (2), the next cell to populate in an infinite wire. Ifwe consider only the dominant term in our estimates of the kink and congestionenergies, we can make a useful approximation of the cell interaction in a wire:|Eki j| ≈ Ek0 |i− j|−p, Di j ≈D0α |i− j|−1s |i− j|−q, (5.20)where αs = e−s0/λT F is the Thomas-Fermi screening factor for nearest neighbourcell-separation s0, and p and q are some powers determined by the dependence ons in our interaction energy estimates. Note that we keep αs only in the case of thecongestion energies in SiDBs as all of our other energies decay at least as fast ass−3. Contributions from next-nearest neighbours are effectively negligible. As an57Table 5.2: Upper limits on ξ for in-order population of a wire in the weak tunnel-ing limit. Values for αs correspond to device separations of 1.15nm and 2.30nm.Device Geometry p q αs Eq. (5.22) Eq. (5.23)3-dot 3 3 1 0.500 0.4026-dot 5 3 1 0.554 0.5002-dot 3 1 0.527 0.594 0.4024-dot 5 1 0.278 0.563 0.500example, for a wire of 3-dot devices, we getEki j ≈−Kc2d(d/si j)3 =−Kc2d(d/s0)3|i− j|−3. (5.21)We would then have Eki j ≈−Ek0 |i− j|−3. Assuming interactions of the form givenby Eq. (5.20), the first populated cell in a wire gives the constraintξ ≤(2p−12q−αs)2q−p−1. (5.22)For the case of the infinite wire, it depends on whether the wire is inverting ornon-inverting:ξ ≤(1− 1/2p−1)ζ (p)− 12 , Inverting12 , Non-Inverting(5.23)where ζ is the Riemann zeta function. The resulting constraints on ξ are sum-marized in Table 5.2. We can combine these constraints with our ratio estimatesin Eq. (5.10) to obtain approximate requirements on our device geometries. Forexample, for a 3-dot device of the form presented in Fig. 5.1(a) to achieve in-orderpopulation of a wire using zone clocking with γ = 0, the device height must sat-isfy v ≤ 0.45d. In Fig. 5.9, we show the polarizations and populations given bythe ground states of biased 3 cell wires using zone clocking in the weak tunnelinglimit. We see that in this limit even the 3-dot and 6-dot wires would show out-of-order population for v = 0.5d. Similar ideas about out-of-order wire populationhave been seen in work by Karim et al. , although their focus was on identifyingmaximum required clocking fields [90]. Note that out-of-order population of non-581231230.16 0.12 0.08 0.04Clocking Field, /|Ek0|1011230.05 0 0.05 0.101230.06 0.00 0.061230.05 0.15 0.25123(a) (b) (c)(d) (e) (f)0.150.12 0.30Clocking Field, /|Ek0|101Clocking Field, /|Ek0|101Clocking Field, /|Ek0|101Figure 5.9: Ground state polarizations and populations of biased 3 cell wires: (a-c) show an inverting 3-dot wire with cells labelled and congestion ratios of (b)0.35 and (c) 0.45; (d-f) show a non-inverting 6-dot wire with ratios (e) 0.45 and (f)0.55. The polarizations for the indicated cells are shown by the bar color with thesolid black line showing the population. The half-populated regimes in (c) and (f)indicate a degeneracy between populating cells 1 and 2.inverting wires will not result in incorrect polarizations and thus presents less of anobstacle for properly functioning QCA networks. While this may be the case forwires, this argument will not apply in general as we will see in Section 5.3.4. For a2-dot SiDB device, the appropriate constraint is (s/d)2 ≤ 0.201[1+(d/λT F)(s/d)]. Inorder to find a solution with s≥ d, we require that d be at least 2.5λT F . The regimein which SiDB-based QCA wires operate in the classical limit using zone clockingcorresponds to a case where the SiDBs barely interact at all.590.4 0.0 0.4 0.8Clocking Field, /|Ek0|1011230.4 0.0 0.4 0.8Clocking Field, /|Ek0|1010.4 0.0 0.4 0.8 1.2Clocking Field, /|Ek0|1010.50 0.0 0.50 1.00 1.50Clocking Field, /|Ek0|101123123123(a) (b)(c) (d)Figure 5.10: Ground state polarizations and populations of a biased 3-dot invertingwire with γ = Ek0/10 and congestion ratios of (a) 0.5, (b) 0.6, (c) 0.7, and (d) 0.8.The solid green line shows Nn−Nn−1 with the dashed line indicating zero.5.3.3 Congestion for Nonzero TunnelingFor non-zero γ , the population of a cell does not switch suddenly. In this case,we must consider a suitable condition for ordering. Previously, it was sufficientthat within the region of clocking we hadNn+1 ≤Nn; however, we might not nec-essarily be concerned if a later cell becomes slightly populated. As an example,Fig. 5.10 shows the ground states of the 3-dot inverting wire for different conges-tion ratios at γ = Ek0/10, a value often seen in the literature. Beyond ξ ≈ 0.5, cell 3takes on a slight incorrect polarization associated with the same out-of-order popu-lation seen in Fig. 5.9c. However, this effect is initially minimal. We might definea more forgiving in-order metric as something likeNn+1−Nn ≤N∗ ∀n,C (5.24)603-Dot6-Dot2-Dot4-Dot0.000 0.025 0.050 0.075 0.100Scaled Tunneling, /|Ek0| RatiomaxFigure 5.11: Maximum congestion ratios for 3 cell wires using devices for varyingtunneling rates. The dashed lines show the values for the self-consistent ICHA.Values computed for a wire of length 3. For such a short wire, the γ = 0 limit liesbetween the estimates made by Eqs. (5.22) and (5.23).for a threshold value N∗ which is somewhat arbitrary. At ξ = 0.6, the out-of-order population is relatively clear with an observed maximum value of N3−N2of about 0.33. We take N∗ = 0.3 to be suitable. In Fig. 5.11, we show the maxi-mum values of ξ which satisfy this criterion for all our considered devices over arange of γ . We compare the trends for the ground state of the full Hamiltonian withthose achieved using the ICHA. These are obtained by self-consistently computingthe polarizations and populations using Eq. (5.12), initialized with those from theprevious clocking fields. Beyond about γ = 0.075Ek0 for inverting and γ = 0.03Ek0for non-inverting interactions, all the trends are approximately linear. For suffi-ciently high ξ , we should expect congestion to dominate. In this case, the groundstate would depend primarily on D0/γ = ξ/(γ/Ek0) as observed. Note that the arbitraryselection of N∗ changes the slope of this linear regime.We conclude this discussion by observing that even without employing waveclocking, correct operation of the 3-dot and 6-dot wires is achievable with modesttunneling rates. In the case of the highest ξ for 3-dot devices at 1 nm separation,we note that correct operation is possible above γ ≈ 0.15Ek0 . For larger separations,γ = 0.1Ek0 is sufficient. The 2-dot and 4-dot SiDB devices, however, would requiremuch higher tunneling rates on the order of γ ≈ 0.5Ek0 .610.4 0.2 0.0 0.2 0.4Clocking Field, /|Ek0|1010.4 0.2 0.0 0.2 0.4 0.4 0.2 0.0 0.2 0.41234 5123451234512345(a) (b)(c) (d)Clocking Field, /|Ek0|101Clocking Field, /|Ek0|101Figure 5.12: Ground states of the majority gate shown in (a) with congestion ratiosof (b) 0.15, (c) 0.20, and (d) 0.25. The first three cells are adjacent to the inputs,cell 4 experiences the highest congestion at the center of the cross, and cell 5 is theoutput, expected to take on a polarization of +1. By ξ = 0.2 we already observeundesirable behaviour in the output polarization.5.3.4 The Case of the Majority GateHere we consider the case of a majority gate with inputs (+1,−1,+1), a challeng-ing case for zone clocking [5]. A schematic of the network as well as the groundstates for a number of ξ values is shown in Fig. 5.12. It is not clear that in-orderpopulation should matter here. It is important then that we identify exactly thesort of behaviour that suggests a challenge to clocked operation. In Section 3.1.1,we determined an upper bound on the maximum characteristic rate of the dy-namics that relates to transition parameters Tn(s) =∣∣〈ϕn| ddsHˆ|ϕ0〉∣∣/(En−E0)2. InFig. 5.13, we show the ground states, energy gaps, and transition parameters for themajority gate using two different γ values. Here ddsHˆ(s) =−∑i ddsCi(s)(Nˆi− Iˆ)=− ddsC(s)∑i(Nˆi− Iˆ) for zone clocking. We assume a trapezoidal clocking profile,62which gives ddsC = 4∆C. We will expand on the application of this transition pic-ture in the next chapter. It suffices here to note that the characteristic rate of thedynamics for adiabatic evolution is limited by the inverse of the maximum Tn(s).When tunneling is weak, each change in the ground state that occurs as theclocking field is swept produces a local minimum in the energy gap. If γ is suf-ficiently small the avoided level crossings are hyperbolic and well approximatedby a Landau-Zener model: see for example Ref. [91]. The values of the minimacan be roughly categorized based on the number of population-depopulation eventsthey involve. For example, the population or depopulation of a single cell wouldproduce a gap of roughly 2γ as in Fig. 5.5(b). The first minima in Fig. 5.13(b) cor-responds to the simultaneous population of two cells, and has a gap smaller than2γ . The remaining three minima can each be seen to describe three simultaneousevents: a single depopulation and two populations. As we have assumed no tunnel-ing between the upper sites, we decompose the change of the polarization of cell5 as a depopulation immediately followed by a population. As we increase γ , it isthese polarization-flip transitions which yield the smallest gaps and most stronglyconstrain the dynamics. We can see this in the transition parameter results, wherethe final transition dominates. We see also that the value of γ will have a signif-icant effect on the potential operating speed of the 3-state devices. Increasing γfrom 0.02Ek0 to 0.05Ek0 yielded a three order of magnitude reduction in the diabatictransition rates. Accounting then for non-zero tunneling, we define the majoritygate to display poor behaviour when the ground state contains a change in cellpolarization after first achieving an incorrect polarization of magnitude |P| ≥ 0.3,keeping with out earlier defined N∗Returning then to Fig. 5.12, we see that at ξ = 0.15 the ground state is wellbehaved; however, even at ξ = 0.20 we observe undesirable behaviour of cell 5,the output. It is an interesting exercise justifying the order of events here. For ex-ample, in Fig. 5.12(c), the ground state is correct until C ≈ −0.1Ek0 . At this point,we have cells 1, 3, and 4 in the correct polarization. As soon as cell 2 is populatedthere is too much congestion at cell 4, which depopulates in favour of the outputcell; however, the output cell is now under inverting diagonal interactions from thepopulated cells 1 and 3 and takes on the wrong polarization. A different sequenceof populations occurs at ξ = 0.25. The inclusion of non-zero tunneling for the ma-63101 1234512345Clocking Field, /|Ek0|Transition ParametersTransition ParametersClocking Field, /|Ek0|0.4 0.2 0.0 0.2 0.4 0.2 0.0 0.21010.000.050.10Energy Gap, Gap,Figure 5.13: Ground states, energy gaps, and transition parameters for the majoritygate in Fig. 5.12(a) for ξ = 0.20 and non-zero tunneling. In the left column, weshow the case for weak tunneling, γ = 0.02Ek0 , where each change in the groundstate produces a clear hyperbolic gap. For the stronger tunneling, γ = 0.05Ek0 , onthe right, we see that the gaps corresponding to simple changes in cell populationincrease significantly, leaving the most crucial gap corresponding to the change inpolarization of cell 5. Vertical dotted lines are added to help correlate the minimato features in the ground states. We see in the transition parameters that the finalchange in the ground state is critical, and that the value of γ will significantlyinfluence the maximum rate of the dynamics.64jority gate does not increase ξ as much as for the wire. At γ = Ek0/10, the onset ofpoor ground state behaviour increases from ξ = 0.18 to ξ = 0.23. To operate themajority gate using 6-dot devices, we require either a larger value of γ or a smallervalue of v. For example, using γ = 0.1Ek0 requires that we satisfy v < 0.3d. Alter-natively, using v = 0.5d, we require γ > 0.92Ek0 . For net-negative 4-dot devices, azone clocked majority gate of this form requires γ > 0.85Ek0 . Large γ/Ek0 can berealised by increasing the device separation but may cause other complications dueto the resulting weak Ek0 . Note that the OR gate design in Ref. [28] and additionalgates in Ref. [2] demonstrate that alternate gate designs involving fewer charges arepossible. If we cannot otherwise mitigate these congestion interactions, we shouldtake this result as inspiration to consider such alternatives for SiDB devices.5.4 Wave Clocking of WiresBy applying a spatially varying clocking field, we can overcome some of the chal-lenges observed in zone clocking. The discussion is most easily understood forwires. From the perspective of activation energies, the effect of wave clocking isrelatively simple. By populating a wire under a clocking gradient, the conditionfor in-order population becomesCan+1 < Can +(Cn+1−Cn)≈ Can +∇Cn∆xn (5.25)for clocking field gradient ∇Cn at site n and cell separation ∆xn. In the weak tun-neling limit, the result is to increase the constraint on ξ by an amount∆ξ = minn−∂xCn∆xnµ˜n+1− µ˜n . (5.26)For a trapezoidal clocking wave, and assuming a wire has populated in-order, thisbecomes∆ξ = 4∆CEk0s0λ. (5.27)For a sinusoidal clocking wave it depends on the gradient of the clock when thecell is populating. We can, however, set an upper bound given by the maximum650.00 0.05 0.05 0.05 0.05 in Ratio,Increase in Ratio,Scaled Tunneling, /|Ek0| Scaled Tunneling, /|Ek0|Scaled Tunneling, /|Ek0| Scaled Tunneling, /|Ek0|Increase in Ratio,Increase in Ratio,3-Dot6-Dot2-Dot4-Dot3-Dot6-Dot2-Dot4-Dot3-Dot6-Dot2-Dot4-Dot3-Dot6-Dot2-Dot4-Dot(a) (b)(c) (d)Figure 5.14: Increases in the maximum congestion ratios of 3 cell wires for waveclocking at various tunneling rates: (a) and (b) respectively show results for atrapezoidal and sinusoidal wave with λ = 200nm; (c) and (d) show results forλ = 100nm. Minimum device separations are used.gradient∆ξ ≤ pi ∆CEk0s0λ. (5.28)We will begin this discussion first in the absence of any adjustment in ∆C, an ap-proach discussed in Section Dependence on Device DimensionsIn Fig. 5.14, we show ∆ξ for each of our devices as a function of the tunnelingrate using the minimum ∆C from our zone clocking considerations. As discussed,clocking wavelengths below 200 nm are currently a challenge. Values for λ =100nm are shown simply to demonstrate that the ∆ξ would approximately double66as expected. We see also that the sinusoidal wave results are approximately pi/4times those of the trapezoidal wave, indicating that the cells happen to populatewhen the clocking gradient is strongest. Perhaps most importantly, we see that ∆ξis small for λ = 200nm. This should not be surprising, as for nanoscale devicess0/λ is on the order of 10−2 and our zone clocking ∆C/Ek0 is of order 1. To enableSiDB wire operation we need an order of magnitude further increase of ∆ξ . Onepotential approach is to increase s0/λ either by decreasing λ or increasing s0. Theformer we know to be effective. The applicability of changing s0 is more involved.We see in Figs. 5.2 and 5.3 , that increasing s0 decreases Ek0 and may signifi-cantly increase ξ . Decreasing Ek0 has a number of consequence. First, the limitingfrequency scale of clocking is approximately proportional to the energy scale ofthe system [5]. By decreasing Ek0 we likely decrease the operating speed of ourdevices. Secondly, we risk thermal excitations out of the ground state if the en-ergy scale of the system approaches the thermal energy. For the 1 nm scale MVMdevices, separations greater than ∼ 3nm decrease Ek0 to below the room tempera-ture thermal energy. For SiDB devices at liquid helium temperatures, the thermalenergy scale is reached at 5.1 nm and 5.4 nm separations for the 2 and 4-dot de-vices respectively. At room temperature and ignoring other necessary alterations,the thermal energy is achieved at around 1.6 nm and 2.6 nm. These considerationlimit how high we can set s0. The most significant consequence, however, comesfrom the relative increase in the strength of γ/Ek0 which grows faster than ξ . Theapproximations in Section 5.2 suggest (γ/Ek0)/ξ is proportional to s30 for MVMdevices and s0 for SiDB devices. We know also from Fig. 5.11 that the maximumξ/Ek0 for zone clocking is linear in γ . As an example, for 2-dot devices and somefixed γ , doubling s0 will increase ξ by a factor of 4, γ/Ek0 by a factor of 8, and thusξmax will grow faster than ξ . This increase tends to be more significant than the∆ξ from wave clocking.5.4.2 Raising the Maximum Clocking StrengthWe are free to increase the clock sweep by increasing the maximum clockingstrength: i.e. replacing Cahi in Eq. (5.18) with some larger value. As discussedin Section 5.2.3, there are limits to how strong the clocking field can be. For MVM670.1 0.5 1.0 1.5Null Site Depth, nm100101Electric Field, V/nm0.100.200.30Figure 5.15: Dependence of the minimum required clocking fields on the null sitedepth for a 3-dot wire using a 200 nm trapezoidal wave. Three choices of γ/Ek0 areshown. The horizontal line shows 3 V/nm for reference: s0 = 1nm.devices, the electric fields can be quite large. As an example, for a 3-dot invertingwire with s0 = 1nm and v = 0.5nm using a 200 nm trapezoidal clocking wave, theminimum required clocking field for γ = 0.2Ek0 is about Chi = 1.92Ek0 ≈ 880meV.This corresponds to an electric field strength of Cmax/v= 1.76V/nm. The requiredelectric field needed to achieve a given C can be reduced by increasing the nullsite depth, v. However, for MVM devices, ξ is proportional to v2, increasing therequired clocking field for correct device operation. In Fig. 5.15, we show theminimum required electric field needed to operate a 3 cell wire of 3-dot devicesin-order as a function of v for a few different choice of tunneling strength. We seethat increasing v is only a benefit in cases where v is small. There is an interestingregime over which the required field is independent of the choice in v, indicatingthat the required C is proportional to v. We finally note that there is some maximumv beyond which the required fields quickly increase.Unfortunately, this approach has limited application for SiDB devices due tothe discussed problem of populating additional dangling bonds in our devices. Wetake the maximum allowable clocking field to be about 200 meV above the largestobserved activation energy. As an example, we consider 2-dot devices and set γ =6.2meV, equal to Ek0/10 for s0 = 1.15nm. For a 200 nm trapezoidal wave, a 3 cellinverting wire correctly operates for maximum clocking fields above 63Ek0 ≈ 3.9V,68and the highest observed activation energy is about 5Ek0 ≈ 0.3V. This differenceis much larger than our allowed 200 meV. We can again reduce the clocking fieldsby increasing the device separation. At s0 = 1.92nm, leaving 4 hydrogen betweeneach SiDB device, Ek0 is reduced to 14.1 meV and thus γ = 0.44Ek0 . This γ isnearly large enough to allow operation under zone clocking, yet it still requires aclocking field 350 meV above the largest activation energy. At the next possibledevice separation of s0 = 2.30nm, we no longer need to increase the clockingfield above the zone clocking estimate with Ek0 = 7.8meV and a large γ valueof 0.8Ek0 . Similar results are observed for larger values of γ: the constraints onthe maximum clocking field simply do not easily allow a significant increase inthe clock sweep without operating in a regime in which Ek0 is small and γ/Ek0 islarge. We should not necessarily exclude the possibility that wires can still functionwithin the rising edge of the clock despite overpopulation elsewhere. In this case,it may be acceptable to allow higher clocking fields. Further investigation intothe dynamic behaviour of the SiDB devices in a more complete charge basis isrequired. If we must avoid overpopulation, these results suggest that SiDB basedQCA may not significantly benefit from wave clocking unless λ can be furtherreduced or a relatively high γ/Ek0 is realisable.5.5 SummaryThe observed prominence of congestion interactions in SiDB devices raises ques-tions about the feasibility of certain higher density arrangements such as logic gatesor even simple QCA devices. These interactions are relatively easily overcome formolecular QCA devices such as zwitterionic nido carborane by employing dynamicwave clocking with an increased maximum clocking field. We have investigatedthe role of these congestion interactions in 3-state QCA, identifying limits for theamount of congestion allowable before the ground state displays undesirable be-haviour, both for zone clocking as well as wave clocking. For zone clocking, wefind reasonable parameter ranges for molecular QCA which enable both wire andmajority gate operation. We find that the high congestion energies of net-negativeSiDB structures makes even wire operation under zone clocking challenging un-less the inter-dot tunneling rate is allowed to be quite large: roughly half the kink69energy or more. Such large values may pose other challenges. For majority gateoperation, it may be necessary to employ architecture specific gates, such as thosepresented in Ref. [28] and Ref. [2], rather than the common 5 cell cross usuallyconsidered in QCA design. We argue that the capacity for overpopulation of theseSiDB devices limits the extent to which wave clocking can mitigate congestion in-teractions for current constraints on clocking wavelengths. These large congestionenergies emerge due to the use of net-negative SiDB devices. If a net-neutral vari-ant could be demonstrated, the congestion energies would decay similarly to thoseof molecular QCA devices and SiDB-based QCA operation would be more easilyachieved.70Chapter 6Limits of Adiabatic Clocking for2-State QCA NetworksWe will now look more closely at the link between QCA clocking and quantumannealing introduced in Chapter 3. It was established that zone clocking in 2-state QCA could be understood as quantum annealing with a transverse field Isingmodel. We will use this understanding to investigate the adiabaticity and maximumclocking frequency of a subset of the basic building blocks of QCA networks. Inparticular, we consider the frequency below which a QCA logic gate gives the cor-rect output with at least 99% likelihood. The selection of building blocks, withthe relevant naming schemes, are shown in Fig. 1.2. The analysis is divided intothree contributions. In Section 6.1, we discuss how the choice in clocking sched-ule influences overall performance, including how basic knowledge about the lowenergy spectrum can inform a better choice in schedule. In Section 6.2, we studythe performance of our building blocks in the coherent limit, excluding any envi-ronmental interactions. Finally, in Section 6.3, we incorporate a simple model ofthermal dissipation, and discuss the extent to which dissipation changes the perfor-mance estimates.It is necessary to emphasize the distinction between two different rates that willbe included in this discussion. The operation in QCA zone clocking which paral-lels quantum annealing corresponds to the switching phase in Fig. 1.3. It is naturalto restrict our normalized time s to this phase, which represents only a quarter of71the clocking period. As a consequence, the characteristic rate of the dynamics asdefined in Section 2.2 does not directly correspond to the actual clocking frequencyof the network. This clocking frequency, expressed relative to the f0, can be ob-tained as f˜c = f/4 f0 = pi f˜/2 where the frequency scale f should be understood tobe the inverse of the switching period.6.1 Choosing a Clocking ScheduleThe choice of annealing schedule has a significant effect on the ultimate perfor-mance of a quantum annealing process: see, for example, [92]. From Section 3.2and Eq. (3.14), the Hamiltonian of interest for 2-state zone clocking isHˆ(s) =−12 A(s)∑iσˆ ix+ 12 B(s)[∑ihiσˆ iz−∑〈i j〉Eki jσˆizσˆjz]. (6.1)For now we allow B(s) to be time dependent and define the initial transverse andclassical Hamiltonians HˆX = −∑i σˆ ix and Hˆcl = ∑i hiσˆ iz−∑〈i j〉Eki jσˆ izσˆ jz . We willuse the term clocking schedule to describe the profile of A(s) used for a given clockzone. QCADesigner, uses a sinusoidal schedule which is approximately linear overthe switching regime [34]. Linear clocking schedules have also been consideredfor the 3-state QCA model [93]. In this section, we make some considerations forhow details of the shape of the clocking schedule influence the behaviour of thesystem during clocking.6.1.1 Performance MetricsThe first task is to better establish metrics for QCA performance. As we will beconsidering dissipation, it is natural to use the density operator, ρˆ(s), to define thestate of the QCA network, with dynamics calculated with Eq. (2.8). We make the72following definitions, explained in the following discussion:Adiabaticity: MA(s) = Tr ρˆ(s)Pˆg(s) (6.2a)Classical Performance: Mcl = Tr ρˆ(1)Pˆcl (6.2b)Logical Performance: ML = 12|Ω|∏i∈Ω(1+λ iz(1)P cli ) (6.2c)For adiabatic clocking, the system should remain near the ground state at all times.MA(s) describes the overlap between the system state and the space of potentiallydegenerate ground states of Hˆ(s), given as the expected value of the ground stateprojection operator Pˆg(s) = ∑d |ψdg 〉〈ψdg |. Clocking ideally results in the systemreaching the ground state of the classical Hamiltonian Hˆcl. We assume a non-degenerate classical ground state with a projection operator Pˆcl = |ψcl〉〈ψcl|. Mcldescribes the overlap with this state. Absent correlations in ρˆ(1),Mcl simplifiestoM˜cl ≈ 12NN∏i=1(1+λ iz(1)P cli ), (6.3)with P cli = Tr Pˆclσˆ iz the polarization of each cell in the classical ground state. Inprinciple, the polarizations of certain output cells may be logically correct even ifthe state fails to reach the ground state. By restricting the product in Eq. (6.3) toa finite subset, Ω, of the cells, we can define a logical performance, ML. In thisinvestigation, we consider a “high performance” target ofMcl ≥ Figures of Merit for the Ground StateFor typical QCA networks, we will have a non-degenerate ground state. We cantherefore use non-degenerate perturbation theory to determine important figures ofmerit for the initial and final ground states:M0 = 〈PˆX〉(0) = 1−α−20 F0+O(α−30), (6.4a)M1 = 〈Pˆcl〉(1) = 1−α21 F1+O(α31), (6.4b)where PˆX is the projection operator onto the ground state of HˆX , expectation valuesare taken for the ground state of Hˆ(s), α0 and α1 are the initial and final values of73ss2s1 1-0.223-0.2230.031(a) Device dimensions (b) Kink energiesFigure 6.1: QCA Device dimensions and computed kink energies relative to E .We ignore the 0.03 and weaker interactions.Table 6.1: F-parameters in Eq. (6.4) from Eq. (6.5) for the simulated devices.Device Wire-N Inverter Maj-111 Maj-101 Maj-110F0 116(N+3)0.581 1.012 1.012 1.012F1 2.217 1.141 2.259 1.735the ratio α(s) = A(s)/B(s), and F0 and F1 are network-specific parameters dependentonly on (hi,Eki j). In the limit of slow clocking,Mcl approachesM1. It is necessarythen to choose α1 such thatM1 > 0.99. For Hˆcl, we findF0 = 14N∑i=1|hi|2+ 116N∑〈i j〉|Eki j|2 , F1 = 14N∑i=11h˜2i, (6.5)with h˜i = hi−∑ j 6=i Eki jP cli calculated for the polarizations of the ground state ofHˆcl. For our performance target, we obtain a necessary constraint on α1:α−11 ≥max√F1/1−M∗, (6.6)with someM∗ > 0.99. Schematics of the device interactions are shown in Fig. 6.1with the computed F0 and F1 shown in Table 6.1. Our constraint is approximatelyα1 ≤ 1/15. If we can assume thatMA(1) is independent of α1 when α1 is small,then it can be shown that [5]Mcl( f˜ ,α0,α1)≈Mcl( f˜ ,α0,0)M1(α1). (6.7)74We will use QCADesigner’s default value of α1 = 1/20, givingM∗ = .9943 whichsatisfies our 0.99 performance constraint.6.1.3 Candidate Clocking schedulesAn optimal choice of clocking schedule will depend on details of the environmentalinteraction and will not be addressed here. Instead, we introduce a schedule whichmimics the standard linear schedule used in quantum annealing studies. We assumethe following: (1) the kink energies are fixed by the network geometry and henceB(s) = 1; (2) the tunneling energies cannot be made infinite or zero; and (3) the rateof change of the eigenstates for the linear schedule is a good choice for quantumannealing. The eigenstates of Eq. (6.1) depend only on the ratio α(s). We defineour clocking schedules by their initial and final ratios α0 and α1. We consider forcomparison an appropriate Linear schedule with B(1) = 1:AL(s) = 1− (1−α1)s, (6.8a)BL(s) = 1− (1− 1/α0)(1− s). (6.8b)Imposing B(s) = 1, we define an analogous schedule with the same eigenstates:AQ(s) = AL(s)/BL(s),AQ(s) = 1+(α0−1) 1−ks1+(α0−1)s , (6.9)which we refer to as the Quasilinear schedule, as it is perhaps the closest wecan get to the linear schedule under the given constraints. The k here satisfiesAL(1/k) = BL(1/k) and has value k = 1+(1−α1)/(1−α−10 ). Finally, the Sinu-soidal schedule in QCADesigner has the formAS(s) =α0−α1√2cos(pi2(s+ 1/2))+α0+α12. (6.10)Schematics of these clocking schedules and the corresponding low energy eigen-spectra of an inverter are shown in Fig. 6.2. If we assume a simple Landau-Zenermodel for the minimum energy gap, we can make a first order performance esti-mate. The gap was fit to Eq. (3.9), with the Landau-Zener parameters summarized750.0 0.2 0.4 0.6 0.8 0.2 0.4 0.6 0.8 0.23Schedule EnergySpectrum, Normalized Time, s Normalized Time, s(a) Linear: Eq. (6.8)0.0 0.2 0.4 0.6 0.8 1.0012345 A(s)B(s)0.0 0.2 0.4 0.6 0.8 0.36Schedule EnergySpectrum, Normalized Time, sNormalized Time, s(b) Quasilinear: Eq. (6.9)0.0 0.2 0.4 0.6 0.8 1.0012345 A(s)B(s)0.0 0.2 0.4 0.6 0.8 0.36Schedule EnergySpectrum, Normalized Time, s Normalized Time, s(c) Sinusoidal: Eq. (6.10)Figure 6.2: Schedule energies for the considered clocking schedules with α0 =0.5, α1 = 1/20. Spectra are shown for the QCA inverter, including only the 10lowest energy eigenvalues with the ground state subtracted. The location and sizeof the minimum gap is indicated for comparison. The gap was fit using Eq. (3.9)within ∆s= 0.07 of the minimum gap. The fit is shown in the inset over the shadedregion representing ±W about the minimum.76Table 6.2: Fit parameters for the hyperbolic approximation of the minimum gap ofthe inverter for the different clocking schedules. Errors indicate 2σ estimates fromthe fit covariance matrix where significant.Clocking schedule ∆0 W ∆0WLinear 0.233 0.162 0.037Quasilinear 0.362 0.168±0.005 0.062±0.002Sinusoidal 0.362 0.087±0.002 0.030±0.001in Table 6.2. If the avoided crossing were well described by the Landau-Zenermodel we should expect the characteristic rate needed to achieve a givenMcl tobe proportional to ∆0W ; however, we have not established that the eigenstates in-volved in the avoided crossing satisfy the conditions inherent to the Landau-Zenerapproximation. Further, the hyperbolic fit is not particularly convincing for theQuasilinear schedule and we are ignoring other details of the spectrum. Neverthe-less, this result suggests that the Quasilinear schedule might outperform both theLinear and Sinusoidal schedules. The classical performance was computed for oursimple QCA components using Eq. (2.8) for the different clocking schedules, withthe system initialized in the ground state of Hˆ(0). Performance comparisons areshown in Fig. 6.3, including the maximum characteristic rate at whichMcl = 0.99was achieved. In all cases, the Quasilinear schedule performed at least as well asthe Linear and Sinusoidal schedules; in most cases, it was a significant improve-ment. Unless stated otherwise, it will be used for all results that follow.6.1.4 Initial Coherent OscillationsEven if we initialize the system in the ground state of Hˆ(0), we will observe oscilla-tions induced by the clocking field. These oscillations manifest in our performancemetrics as can be seen in Figs. 6.4 and 6.5. We can approximate the magnitude ofoscillations using time dependent perturbation theory for the ground state and firstexcited state of Hˆ(0), ignoring degeneracy for this simple analysis. If A(s) has aninitial linear component, we get a transition probability into the excited state forsmall s ofPg→e(s) ∝| dds A(0)|2A40(f˜A0)sin2(ωrs2), (6.11)7710 3 10 2 10 Rate, Classical Performance(a) Wire-510 3 10 2 10 Rate, Classical Performance(b) Inverter10 3 10 2 10 Rate, Classical Performance(c) Maj-11110 3 10 2 10 Rate, Classical Performance(d) Maj-101Figure 6.3: Comparison of classical performances for the different clocking sched-ules: α0 = 5, α1 = 1/20. Dashed lines mark the maximum characteristic rate, belowwhichMcl ≥ 0.99. Schedule smoothing employed: see Section 6.1.4.where A0 = A(0) and ωr ≈ A0/f˜ . If A(s) has no initial linear component, we calcu-latePg→e(s) ∝| d2ds2 A(0)|2A40(f˜A0)3sin2(ωrs2). (6.12)These oscillations are “Rabi-like” in that they match what we would expect if wereplaced the clocking field with a weak oscillation given by the ωr component ofits frequency decomposition. Importantly, if f˜/A0  1 we can significantly reducethese initial oscillations by modifying the initial clocking schedule in order to re-move any linear component. We employ a smoothing procedure in which we applythe maps′ = s(1− e−s2/2σ2). (6.13)780.0 0.2 0.4 0.6 0.8 Time, sExpectation Value(a) Coherence Vector0.0 0.2 0.4 0.6 0.8 1.00.9880.9900.9920.9940.9960.9981.000Normalized Time, sAdiabaticity, (b) AdiabaticityFigure 6.4: Coherence vector and adiabaticity for a simulation of Maj-101 usingthe Quasilinear schedule with f˜ = 3×10−2, α0 = 5, α1 = 1/20. Initial coherent os-cillations arise due to the changing Hamiltonian. These oscillations tend to weakenduring clocking; however, a lowered adiabaticity when approaching the criticalregime of avoided level crossings can propagate to a reduced final performance.10 2 2 × 10 2 4 × 10 2 6 × 10 20.950.960.970.980.991.0024681020Characteristic Rate, Classical Performance(a) Unmodified0.950.960.970.980.991.002468102010 2 2 × 10 2 4 × 10 2 6 × 10 2Characteristic Rate, Classical Performance(b) SmoothedFigure 6.5: Classical performance of Maj-101 for different α0 values (a) beforeand (b) after initial smoothing. Results use the Quasilinear schedule with α1 = 1/20.The dashed line indicates theMcl = 0.99 performance threshold, with the dottedline the effective threshold for α1 = 0 using Eq. (6.7). Note the oscillations in (a)are more prominent for small α0.790.950.960.970.980.991.002468102010 2 2 × 10 2 4 × 10 2 6 × 10 2Characteristic Rate, Classical Performance(a) Wire-50.950.960.970.980.991.002468102010 2 2 × 10 2 4 × 10 2 6 × 10 2Characteristic Rate, Classical Performance(b) InverterFigure 6.6: Classical performance for different α0 values with α1 = 1/20. In caseswhere there is a clear avoided level crossing in the LES, smaller α0 values yieldhigher performance. We use the left-most intercept when defining maximum char-acteristic rate to account for potential performance oscillations. The lower F0 valueof the wire gives it a higherM1 upper bound.This map has a few useful properties: (1) it leaves the initial value of the scheduleunchanged; (2) it cancels the linear component of the initial schedule; and (3), itonly affects the schedule over a period of∼ 2σ , meaning we can remove the initialoscillations without affecting the later schedule. We set σ = 4pi f˜/A0 to smooth overtwo periods of the oscillations. The effect is clear in Fig. 6.5b.6.1.5 Choosing the Initial Clocking FieldFor Maj-101 in Fig. 6.5, the choice of α0 did not significantly influence the clas-sical performance. This will not generally be the case. In Fig. 6.6, we see theperformance for both the wire and inverter for different initial clocking values. Inthese cases, the lower the value of α0, the higher the performance. If we assumethe only important feature of the clock to be a single Landau-Zener-like avoidedlevel crossing, the maximum characteristic rate that allows a given adiabaticity isf˜max ≈ pi∆0W−2ln(1−MA) , (6.14)800.0 0.2 0.4 0.6 0.8 0.50Spectrum, Normalized Time, s(a) Wire-50.0 0.2 0.4 0.6 0.8 0.36Spectrum, Normalized Time, s(b) Inverter0.0 0.2 0.4 0.6 0.8 0.56Spectrum, Normalized Time, s(c) Maj-1110.0 0.2 0.4 0.6 0.8 0.45Spectrum, Normalized Time, s(d) Maj-101Figure 6.7: Low energy spectra of the simulated devices relative to the groundstate energy with the location and size of the minimum gap indicated. Only 10energy levels are shown for the inverter.where the gap width, W , is inversely proportional to the slope of the clocking fieldnear the minimum gap. For the Quasilinear schedule, the slope satisfiesddsAQ(s) =α20[1+(α0−1)s]2ddsAQ(1), (6.15)where dds AQ(1) = −(1− α1/α0) ≈ −1 for small α1. We then expect f˜max to beproportional to [s∗+(1− s∗)/α0]2 with s∗ the location of the minimum gap. Thespectra for most of the simulated devices are shown in Fig. 6.7. For majority gates,the minimum gap occurs in the classical limit, s∗ = 1, hence we expect no α0dependence. If the minimum gap occurs at some earlier s∗, we expect f˜max to81decrease with increasing α0 as observed.QCADesigner uses a value of α0 = 5 and from these considerations we mightnaı¨vely assume we should use an even smaller value of α0; however, we are ig-noring some important caveats. We initialize the system near the ground state ofHˆ(0). If α0 is small, the initial state already has a significant projection onto theclassical ground state. In that sense, much of the work the clocking is supposed toachieve must have been done by whatever process set up the initial state. Indeedfrom Eq. (6.4) we see that α0 = 5 gives a projection onto the ground state of HˆXof onlyM0 = 0.96 in the worst case of a majority gate. In addition, the cell polar-izations, λ iz , will have initial values on the order of α−10 , as high as 0.2 in Fig. 6.4,pointing to a second issue: with small α0, “deactivated” cells may remain partiallypolarized, perhaps enough that the next clock zone may bias our outputs. Both ofthese issues are resolved by using a higher value of α0; however, we see in Fig. 6.6that there is only minimal decrease in performance beyond α0 = 5. We concludethen that we cannot justify using a smaller value of α0, nor would we expect usinga larger value to influence performance within the scope of our analysis.6.2 Coherent Behaviour of QCA ComponentsHere we consider the behaviour of our QCA components excluding any dissipation.In all results that follow, we use clocking ratios of α0 = 5 and α1 = 1/ Analysis of QCA WiresIn Fig. 6.9, we consider the performance of wires of various lengths. For the specialcase of a left-driven wire as in Fig. 1.2, the Hamiltonian is of the formHˆW (s) =−12A(s)N∑i=1σˆ ix+12B(s)[σˆ1z −N∑〈i j〉σˆ izσˆjz]. (6.16)Using a slight modification to the Jordan-Wigner transform approach used for theunbiased wire [94], we can obtain an analytic description of the low energy spec-trum for biased wires in terms of the set of eigenenergiesεk =√B2(s)sin2(qk)+ [A(s)−B(s)cos(qk)]2, (6.17)822 4 6 8 10 1210 1100 0WWire Length, N(a) Landau-Zener parameters0.0 0.2 0.4 0.6 0.8 0.28Spectrum, Normalized Time, s(b) Wire-10 spectrumFigure 6.8: Landau-Zener parameters as a function of the wire length: (a) com-parison of Eq. (6.18) to values extract from fitting the lowest energy gap in thespectrum; (b) example wire spectrum with a hyperbolic fit of the gap. As the wirelength increases, the gap becomes more hyperbolic.with k ∈ {1, · · · ,N} and pseudo-momenta qk = kpi/(N+1). Each energy level ofHˆW (s) is found by summing over a subset of the eigenenergies εk. For large N, eachof the εk can be fit to the hyperbolic form Eq. (3.9) with the minimum occurringfor A(s∗) = B(s∗) and parameters:∆0 ≈ B(s∗)qk, W = ∆0∆m ≈B(s∗)qk∆m, (6.18)where ∆m = dds |A−B|(s∗) is the difference in schedule rates at the gap. Fig. 6.8compares Eq. (6.18) with ∆0 and W parameters fit to computed wire spectra. Asthe wire length increases, the lowest energy gap becomes increasingly hyperbolicand Eq. (6.18) becomes more accurate. For a wire of length N, both ∆0 and W areapproximately proportional to qk ∝ (N + 1)−1. The Landau-Zener approximationthen predicts an adiabaticity ofMA(N, f˜ )≈ 1− e−2piν/4 f˜ (N+1)2 . (6.19)If there were only one excited state, ν/(N+1)2 would correspond to ∆0W for k = 1and large N; however, we see from Fig. 6.8(b) that there are multiple potentiallyrelevant excited states. We then take ν to be some parameter dependent on the8310 2Classical PerformanceCharacteristic Rate, (a) Classical Performance2 4 6 810 210 1LinearQuasilinearSinusoidalWire LengthMax Char. Rate,(b) Maximum characteristic ratesFigure 6.9: (a) Performance of wires of different length as a function of the char-acteristic rate using the Quasilinear schedule. Maximum characteristic rates areindicated by the vertical lines. In (b), we summarize these rates for all clockingschedules. Dashed lines are predictions from Eq. (6.20) using fit ν .clocking schedule. Using Eq. (6.7) and solving Eq. (6.19) for f˜ , the maximumcharacteristic rate for 99% classical performance is thenf˜max(N)≈− piν2log(1− 0.99/Q1(N))(N+1)2 , (6.20)Simulations were run for wires of various lengths and their maximum character-istic rates extracted. The results are shown in Fig. 6.9. These rates were fit withEq. (6.20) for N ≥ 4 to obtain νL ≈ 1.60, νQ ≈ 2.93, and νS ≈ 1.18 for our respec-tive Linear, Quasilinear, and Sinusoidal schedules. It is clear that the Landau-Zenermodel gives a good estimate of wire performance, at least beyond N = 3. UsingEq. (6.18) and considering only the first excited state, we can estimate ν ≈ ν1:ν1 = (N+1)2 ∆0W |k=1 ≈pi2∆mB2(s0). (6.21)The other excited states serve as additional channels for diabatic transitions andhence we should expect ν < ν1. For our schedules, we can find expressions for ν1:84Table 6.3: Comparison of fit and analytical estimates of the Landau-Zener ν pa-rameter for wires. Errors indicate 2σ deviations from the fit covariance matrix.Clocking schedule ν f it ν1Linear 1.60±0.02 1.80Quasilinear 2.93±0.07 3.19Sinusoidal 1.18±0.04 1.99νL1 = pi2 (1− α1/α0)2[2− (α1+ 1/α0)]3= 1.80, (6.22a)νQ1 = pi2k+(α0−1)k2(α0−1) = 3.19, (6.22b)νS1 =4pi√(α0−α1)2+4(α0−1)(1−α1)= 1.99. (6.22c)For comparison, the fit ν and analytical estimates of ν1 are listed in Table 6.3.We see that ν < ν1 as expected. Further, with the exception of the Sinusoidalschedule, the first excited state gives a good estimate for ν . We arrive at twoimportant observations: (1) QCA wires seem to adhere to the simple Landau-Zenermodel for adiabaticity, even considering only the first excited state; and (2), in theabsence of additional factors, the (N + 1)−2 scaling of f˜max means the maximumcharacteristic rate of any QCA network under zone-clocking will quickly be limitedby the longest wire to be clocked. The Landau-Zener character of wires, in additionto the ease of calculating the energy spectrum, suggests an obvious target for futureinvestigation of an optimised wire clocking schedule.6.2.2 QCA Building BlocksWe consider now the coherent performance of our set of basic QCA components.We transition from using the classical performance to using the logical perfor-mance, Eq. (6.2c), for the component output cells. This is perhaps a more mean-ingful metric for real device performance and will also allow more direct compari-son with the ICHA results in the next section. Fig. 6.10 shows all the performancemetrics for each device as a function of the characteristic rate. There are a num-8510 3 10 2 10 10.9800.9850.9900.9951.000ClassicalLogicalAdiabaticPerformanceCharacteristic Rate, (a) Wire-510 3 10 2 10 10.9800.9850.9900.9951.000ClassicalLogicalAdiabaticCharacteristic Rate, Performance(b) Inverter10 3 10 2 10 10.9800.9850.9900.9951.000ClassicalLogicalAdiabaticCharacteristic Rate, Performance(c) Maj-11110 3 10 2 10 10.9800.9850.9900.9951.000ClassicalLogicalAdiabaticCharacteristic Rate, Performance(d) Maj-101Figure 6.10: Performance metrics for the QCA building blocks. 99% thresholdsfor the different metrics are indicated.ber of things to note: (1) in the limit of high adiabaticity, or slow f˜ , the metricsapproximately differ only by the ratios determined in Section 6.1.2:Mcl/MA ≈M1, ML/MA ≈ 1− 14α21/|h˜n|2, (6.23)with h˜n the effective bias for the output cell in the classical limit; (2) the remainingfeatures and oscillations in the performance arise later in the simulations and arenot a consequence of the initial Rabi oscillations discussed in Section 6.1.4; (3) thepredicted maximum characteristic rate for wires doesn’t significantly depend onthe choice of metric, meaning our discussion in the previous section also appliesto the logical performance; and (4), we observe that in certain cases the logicalperformance gives significantly higher maximum characteristic rates than either of8610 3 10 2 10 1Switching Rate, f0.980.991.00PerformanceClassicalLogicalAdiabatic(a) Maj-110 spectrum and performance metrics0.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s0. Spectrum,/= 0.45(b) Ground State (c) Excited StateFigure 6.11: (a) and (b) show the low energy spectrum and performances of Maj-110. Both the classical ground state (c) and the first excited state (d) have thesame logical output. This results in an increasedML metric overMcl and highercharacteristic rates. These states differ only where indicated.the other metrics. Note that a network is logically correct either when in the groundstate or when in an excited state which happens to have correct outputs. This is thecase for Maj-101 and Maj-110; the latter is illustrated in more detail in Fig. 6.11.A summary of the maximum clocking frequencies for the different devices andmetrics is included in Table 6.4. Due to their simplicity, we should expect wires tobe the highest performing devices of a given size. Our inverters have an input-to-output path length of 5 cells and maximum clocking frequencies similar but slightlylower than those of Wire-5. Majority gates have a path length of 3 cells from each87Table 6.4: Maximum clocking frequencies, pi f˜max/2, relative to f0, for our basicQCA circuits based on different metrics. We include also Wire-3 which has a com-parable size scale to the majority gates. Minimum gaps are included for reference.Device Adiabatic Classical Logical ∆0Wire-5 4.7×10−2 4.4×10−2 4.5×10−2 0.500Inverter 2.9×10−2 2.2×10−2 3.0×10−2 0.362Wire-3 9.8×10−2 9.4×10−2 9.5×10−2 0.707Maj-111 8.8×10−2 7.8×10−2 8.1×10−2 0.556Maj-101 5.1×10−2 3.7×10−2 7.6×10−2 0.449Maj-110 5.7×10−2 4.5×10−2 8.8×10−2 0.448input, which we can compare against the results for Wire-3. This suggests ourbounds for wires found in the previous section may serve as upper bounds on morecomplicated components through an effective input-to-output path length. Anothernecessary observation is that majority gates, unlike wires and inverters, do not havea prominent hyperbolic minimum gap. In such cases, or when there are multipleminima in the spectrum which might significantly contribute, the minimum gaptends not be a good predictor for performance. A more detailed analysis of the lowenergy spectrum would yield a better prediction.6.2.3 Estimating Performance from the LESHere we will use the formulation discussed in Section 3.1.1 to construct efficientmethods of investigating adiabaticity either directly from the LES or by simulatingdynamics using a reduced subset of the energy eigenstates. We determined thatin the limit of small f˜ , the adiabaticity can be approximated in terms of transitionparameters Tn = 〈ϕn| ddsHˆ|ϕ0〉/(En−E0)2. It is possible to use this simple observa-tion to construct a rough estimate of the maximum characteristic rate directly fromthe Tn. We take P0(s) = 1−∑n6=0 |cn(s)|2, and impose our adiabaticity condition:P0(s)≥ 0.99∀s. Applying our approximation of cn(s) from Eq. (3.6), we arrive atf˜max ≈√1−0.99maxs∑n6=0 |Tn(s)|2. (6.24)880.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s012345Transition Parameters(a) Wire-50.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s02468Transition Parameters(b) Inverter0.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s0. Parameters(c) Wire-30.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s0. Parameters(d) Maj-1110.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s0. Parameters(e) Maj-1010.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s0. Parameters(f) Maj-110Figure 6.12: Transition parameters for all our QCA devices. When there is onlyone primary peak, there is good agreement between Eq. (3.8) and the measuredclocking frequencies. In the case of the majority gates, multiple peaks produce ahigher overall diabatic transition probability and hence a lower maximum clockingfrequency.89Table 6.5: Maximum clocking frequencies computed using different adiabaticityestimation methods. The exact values are taken from Table 6.4. For Eq. (6.25), thenumber of included excited states is listed.Eq. (6.25)Device Table 6.4 Eq. (6.24) 1 Excited 2 ExcitedWire-5 4.7×10−2 2.94×10−2 4.84×10−2 4.76×10−2Inverter 2.9×10−2 1.70×10−2 3.07×10−2 2.98×10−2Wire-3 9.8×10−2 6.50×10−2 9.90×10−2 9.82×10−2Maj-111 8.8×10−2 7.80×10−2 9.25×10−2 8.79×10−2Maj-101 5.1×10−2 3.54×10−2 8.48×10−2 5.28×10−2Maj-110 5.7×10−2 4.47×10−2 4.71×10−2 5.06×10−2There are a number of inaccuracies in this approximation. First, Eq. (3.6) requiresthat f˜ be sufficiently small. We are not necessarily in a suitable regime near f˜max.Second, we are still ignoring all interactions between excited states. Nevertheless,in Fig. 6.12, we calculate |Tn(s)| from the LES of all our devices. The resultingmaximum clocking frequencies are included in Table 6.5. In this case, Eq. (6.24)always underestimates the maximum clocking frequency of our devices: at a maxof about 40% in the case of the inverter. Another potential approach is to directlysimulate the dynamics of cn(s) through Eq. (3.3). After excluding the 〈ϕn| dds |ϕn〉term, we havedds cn =− ∑m6=0cm〈ϕn| ddsHˆ|ϕm〉Em−En e−i(θm−θn)/ f˜ , (6.25)with c0(0) = 1 and cn6=0(0) = 0. The maximum clocking frequency is the lowestpi f˜/2 such that |c0(1)|2 < 0.99. In Fig. 6.12, we see that only a few of the eigen-states sufficiently couple to the ground state to be worth including. The simplestmodel includes only the ground state and a single excited state. The result max-imum clocking frequencies are also included in Table 6.5. This model producesgood agreement with the exact values for the wires and inverter, a fairly good esti-mate for Maj-111 and Maj-110, but a poor estimate of Maj-101. Looking again atFig. 6.12, we can make some justification. Transitions in the wires and inverter aredominated by the first excited state. Additional excited states, primarily the secondexcited state, are non-negligible for all majority gates, with the contribution actu-90ally overcoming that of the first excited state in Maj-101. Even including just thissecond excited state, we get much improved accuracy.6.3 Dissipative BehaviourWe will consider two approaches to modelling a simple dissipation mechanismfor environmental interactions: spectral relaxation, in which the density operatorrelaxes to some ρˆss(s) dependent only on the eigenspectrum of Hˆ(s) as in [95];and mean field relaxation, where the coherence vectors of each cell relax to a localdissipation vector given by the instantaneous state of the network [45]. For spectralrelaxation, we consider three different potential steady states:Boltzmann [95] : ρˆβss(s) = 1Zβ e−βHˆ(s) (6.26a)Ground : ρˆGss(s) = 1Zg Pˆg(s) (6.26b)Classical [96] : ρˆCss(s) = 1ZC Pˆcl (6.26c)where the Z are appropriate normalization constants such that Tr ρˆss(s) = 1. Eachof these spectral steady states has a local dissipation vector:η ia = Tr ρˆssσˆia. (6.27)For the ICHA, a mean field steady state is usually employed of the form [29],η ia =− tanh(β2 |Γi|) Γia|Γi| , (6.28)with Γi as in Eq. (2.11). While it is possible to construct a steady state densityoperator from the local dissipation vectors by defining the higher order steady stateelements [45], we have found this process to be prohibitively slow. We will presentresults for mean field relaxation only via the ICHA. All performance results arenow in terms of the logical performance. Dynamics using ρˆ(s) are computed usingEq. (2.8), while ICHA results use Eq. (2.11).916.3.1 Spectral Relaxation of the Density OperatorWe consider the steady states specified in Eq. (6.26) that depend only on the energyeigenspectrum. As an example, Fig. 6.13 shows the logical performance for Maj-101 over a range of both the characteristic rate, f˜ , and the relaxation rate, ξ˜ . Thereare three regimes of interest illustrated in Fig. 6.14. If the characteristic rate is largewith respect to the relaxation rate, f˜ & 10ξ˜ , then the dynamics are approximatelycoherent. In this regime, the choice of steady state does not significantly affect theperformance. If the relaxation rate is large with respect to the characteristic rate,ξ˜ & 10 f˜ , then the system tracks the steady state and the performance is entirelygoverned by whether the steady state has the correct logic. For small α1, both theground state and classical steady states give the correct logic and thus have highperforming relaxed regimes. For a Boltzmann distribution the behaviour is morecomplicated. The logical performance in the relaxed regime has a lower boundapproximately defined by the first excited state with incorrect output logic. To firstorder for sufficiently small thermal energies,ML > 1−d1e−β∆1 , (6.29)with ∆1 the gap between the ground state and the lowest energy incorrect state andd1 its degeneracy. We should expect a region of high performance in the relaxedregime for β above approximatelyβ ∗ ≈ [log(d1)− log(1−0.99)]/∆1. (6.30)For Maj-101 we get ∆1 = 0.554, d1 = 2, and an estimated transition at β ∗ ≈ 9.6.In Fig. 6.13(c-f) we observe a domain with ML above 0.99 emerge somewherebetween β = 5 and 20. The transition actually occurs just above β = 10.0 (seeFig. 6.13(e)) which is fairly close to our estimate. The exact value ofML in thelimit of infinite ξ˜ for a Boltzmann steady state is obtained asML(β ) = 12(1+ |Tr ρˆβss(1)σˆnz |) (6.31)for output cell n. In Fig. 6.15, we showML(β ) for all the QCA components. Ta-ble 6.6 compares the β ∗ estimates using Eq. (6.30) with the values from Fig. 6.15.9210 3 10 2 10 1 10010 310 210 1100Relaxation Rate,0.000.500.700.800.900.950.980.991.0010 3 10 2 10 1 10010 310 210 1100Relaxation Rate,0.000.500.700.800.900.950.980.991.0010 3 10 2 10 1 10010 310 210 1100Relaxation Rate,0.000.500.700.800.900.950.980.991.0010 3 10 2 10 1 10010 310 210 1100Relaxation Rate,0.000.500.700.800.900.950.980.991.0010 3 10 2 10 1 10010 310 210 1100Relaxation Rate,0.000.500.700.800.900.950.980.991.0010 3 10 2 10 1 10010 310 210 1100Relaxation Rate,0.000.500.700.800.900.950.980.991.00(a) (b)(c) (d)(e) (f)Characteristic Rate, Characteristic Rate, Characteristic Rate, Characteristic Rate, Characteristic Rate, Characteristic Rate, Figure 6.13: Logical performance of Maj-101 for different spectral steady states:(a) ρˆGss ; (b) ρˆCss; (c-f) ρˆβss for β = 1,5,10+,20. The dashed line represents equalcharacteristic and relaxation rates and the solid contour represents a performanceof 0.99. (e) shows the behaviour immediately after the critical inverse temperature:10+= 10.19310 3 10 2 10 1 10010 310 210 1100TRANSITIONRELAXEDCOHERENTRelaxation Rate,Characteristic Rate, Figure 6.14: There are three regions of interest: the coherent regime, in which thesystem is governed by coherent dynamics; the relaxed regime, in which the systemclosely tracks the steady state; and the transition between these two regimes.4 6 8 10 12 14Inverse Temperature,0.900.920.940.960.981.00Relaxed PerformanceWire-5InverterMaj-111Maj-101Maj-110Figure 6.15: Logical performance in the relaxed regime for QCA components withthe Boltzmann distribution steady state.ML = 0.99 thresholds are indicated.The most significant difference occurs for the inverter, which has a second incor-rect excited state with a slightly higher energy gap of 0.554. The contribution ofthis state is ignored in our estimate. If the characteristic and relaxation rates areof the same order, the behaviour depends on the interplay between the coherentdynamics and the relaxation. Of the steady states considered, the most interestingbehaviour occurs for ρˆCss, with a band of low performance along ξ˜ ≈ f˜ . This resultis easily explained by observing that the classical ground state will generally be ahigh energy configuration of Hˆ(0) and very near the ground state of Hˆ(1). Ini-94Table 6.6: Threshold β value for high performance in the relaxed regime. Param-eters for Eq. (6.30) are estimated from the spectrum of 12HˆclDevice d1 ∆1 Eq. (6.30) Fig. 6.15Wire-5 5 1 6.2 6.3Inverter 1 0.416 11.1 12.5Maj-111 1 0.554 8.3 8.7Maj-101 2 0.554 9.6 10.0Maj-110 5 1 6.2 6.2tially, the dissipation acts to excite the network out of the ground state; later in theclock, the dissipation acts to help drive the network back down in energy. Increas-ing ξ˜ initially hurts performance until the relaxation is strong enough to overcomethese initial excitations.We summarize the performance for each QCA component by extracting themaximum clocking frequency as a function of ξ˜ for each choice of steady state.The results are shown in Fig. 6.16. Each set of data is obtained by first findingf˜max in the coherent limit and then tracking the ML = 0.99 contour as ξ˜ is in-creased. We first mention a few unsurprising results: (1) the clocking frequency isapproximately independent of the choice of steady state in the coherent limit; (2)the trends become parallel to f˜ = ξ˜ in the relaxed regime, meaning performance isguaranteed as long as ξ˜/f˜ is sufficiently large; and (3), for Boltzmann steady statesthe performance is improved as the temperature is decreased. There are apparentdiscontinuities in some of the trends. These correspond to cases like Fig. 6.13(e),where we observe two regions of high performance. Interestingly, unless ξ˜ is suf-ficiently large, we observe a decrease in maximum clocking frequency for any ofour spectral steady states.6.3.2 Spectral Relaxation with the ICHAIn Fig. 6.17 we repeat the analysis using the ICHA. The behaviour in the relaxedregime matches what we observe in Fig. 6.16. On inspection, the limits in the co-herent regime differ slightly, with the maximum clocking frequency slightly higherfor wires and inverters and lower for majority gates. In the transition regime,however, we observe significantly different results, particularly for Wire-5 and95Clocking Frequency,Clocking Frequency,Clocking Frequency,Clocking Frequency,Clocking Frequency,Figure 6.16: Maximum clocking frequencies relative to f0 for different spectralsteady states. A sample set of β values are indicated for the Boltzmann distribution.Classical steady state results are cropped for ease of view. The shaded region anddashed line indicate the approximate edges of the transition regime.96Clocking Frequency,Clocking Frequency,Clocking Frequency,Clocking Frequency,Clocking Frequency,Figure 6.17: ICHA: Maximum clocking frequencies relative to f0 for the differentspectral steady states. A sample set of β values are indicated for the Boltzmanndistribution.9710 3 10 2 10 10.960.970.980.991.00PerformanceClassicalLogical10 3 10 2 10 1 10010 310 210 1100Relaxation Rate,0.000.500.700.800.900.950.980.991.0010 3 10 2 10 1 10010 310 210 1100Relaxation Rate,0.000.500.700.800.900.950.980.991.000.0 0.2 0.4 0.6 0.8 1.0Normalized Time, s0. Valuexyz(a) Coherence Vector: (b) Performance Metrics(c) (d)Characteristic Rate, Characteristic Rate, Characteristic Rate, Figure 6.18: (a) Oscillations in the ICHA simulation of Wire-5: f˜ = 2× 10−2.These oscillations manifest in the coherent performance metrics (b) and produceinteresting features in the logical performance contours. We compare the groundstate relaxation using the density operator (c) and ICHA (d).Maj-101. To understand this behaviour, we need to consider the dynamic equa-tions of the ICHA. Absent any dissipation, Eq. (2.11) is stationary only if λ i ∝ Γi;otherwise, coherent oscillations occur as has been previously studied [95]. Thiscan be seen by taking the derivative of the λ iy equation in Eq. (2.11):f˜ 2d2ds2λ iy =−(h˜2i + γ2i )λ iy+O(γnλ ny ), (6.32)98where the remaining terms tend to be small near the end of the switching phase.For sufficiently fast f˜ , the λ iy can remain non-zero and will then oscillate withangular frequencies ν2i = (h˜2i +γ2i )/ f˜ 2. The λ ix and λ iz are driven to oscillate by theremaining equations:f˜ddsλ ix =−h˜iλ iy, f˜ddsλ iz =−γiλ iy. (6.33)Fig. 6.18 shows the coherence vector for Wire-5 using the ICHA for a particularchoice of f˜ . In this case, these coherent oscillations are clear. From Eq. (6.32), wecan predict an oscillation frequency for Wire-5 in non-dimensional units:fosc = 12pi f˜[|λ 4z |2+α21 ]1/2. (6.34)Taking |λ 4z | ≈ 1, we get fosc ≈ 8.0 which matches the observed oscillation. Thisleads to oscillations in the performance both in the coherent limit (Fig. 6.18(b)) aswell as when dissipation is added (Fig. 6.18(d)). In Fig. 6.18(d), we see also a bandof slightly lower performance along ξ˜ ≈ f˜ . That this effect appears significant isentirely a consequence of our arbitrary choice of the 0.99 threshold. This region oflower performance also exists in Fig. 6.18(c); however, we observe slightly higherML values in the low f˜ regime when using the density operator approach, keepingthis region above Mean Field Relaxation with the ICHAWe conclude by considering the commonly used mean field method: Eq. (6.28).As this method is β dependent it is natural to compare it with our analysis of theBoltzmann steady state. We can determine the β ∗ sufficient to guarantee a highperforming relaxed regime by making two observations: (1) in the limit of highξ˜ we have λ nz = ηnz and hence ML = 12(1+ |ηnz |) for output cell n; and (2), inthis limit the Γi in Eq. (6.28) becomes a function of the vector η z of all the η izvalues. We can then obtain η z as a fixed point of Eq. (6.28) by iteration startingwith the ground state polarizations of Hˆ(1). The results of this process are shownin Fig. 6.19 with a comparison of the extracted β ∗ to those of the Boltzmann steadystate shown in Table 6.7. With the exception of Maj-111, we get lower β ∗ values994 6 8 10 12 14Inverse Temperature,0.900.920.940.960.981.00Relaxed PerformanceWire-5InverterMaj-1x1Maj-110Figure 6.19: Logical performance in the relaxed regime for the mean field relax-ation with the ICHA. Maj-101 and Maj-111 are indistinguishable for sufficientlylarge β .Table 6.7: Comparison of β ∗ values for the two temperature dependent steadystates.Device Wire-5 Inverter Maj-111 Maj-101 Maj-110Boltzmann 6.3 12.5 8.7 10.0 6.2Mean Field 4.7 12.0 8.7 8.7 4.9using the mean field.In Fig. 6.20, we show the maximum clocking frequencies for the mean fieldsteady state. We use spectral relaxation to the ground state as reference. Qualita-tively, there is much in common with the Boltzmann steady state, keeping in mindthe slightly different β ∗ transitions. One significant feature that was not observedin our previous results is that the mean field can give maximum clocking frequen-cies higher than what can be achieved using spectral relaxation to the true groundstate. This is likely a result of the tendency for the mean field to reinforce the po-larizations of cells, increasing the output cell polarization and henceML. See, forexample, the cell response in Ref. [41].100Clocking Frequency,Clocking Frequency,Clocking Frequency,Clocking Frequency,Clocking Frequency,Figure 6.20: Maximum clocking frequencies relative to f0 for mean field relax-ation using the ICHA. The results for the ground state are shown for reference andcomparison to Fig. 6.17.1016.4 SummaryBy interpreting QCA clocking as quantum annealing, we arrived at an understand-ing of some of the mechanisms by which properties of the low energy spectruminfluence dynamics and, ultimately, clocking performance. We identified criteriafor choosing a clocking schedule, and arrived at an improved schedule allowingfor high performance, defined as a 99% likelihood of observing the correct groundstate, at higher clocking frequencies. In the case of wires, this new schedule can beoperated roughly 2.5 times faster than the existing schedule in QCADesigner. Inthe coherent limit, we observed upper bounds on the clocking frequency 1-2 ordersof magnitude below the intrinsic frequency f0 = E/h. Earlier work by Timler et al.established similar bounds at 1-3 orders of magnitude below f0 [29]. Their analy-sis was based on constraining the expected amount of power dissipated by deviceoperation in a given area. This suggests that while considerations of power areimportant for evaluating QCA implementations, we should carefully consider theconditions under which QCA networks can be expected to produce correct outputs.Using an analytical solution for driven wires, we determined that they are wellapproximated by a simple Landau-Zener model for adiabaticity, having a maxi-mum clocking frequency that falls off with the square of the wire length. Thissuggests an unforgiving trade-off between operational frequency and maximumclock zone size unless alternative decoherence mechanisms can be invoked [97].We also observed for this small dataset that wires present an upper bound for morecomplicated networks through an effective longest input-to-output length. Furtherwork would be needed to determine whether this is of general use.We investigated a simple thermal bath relaxation model for decoherence. Whilethe choice in steady state does influence performance, the effect is relatively smallunless the system is operated in a regime where the rate of relaxation dominates.Outside of this regime, we observe either of two cases. If the steady state has thecorrect logic in the relaxed regime, then we observe at worst an approximate fac-tor of 2 decrease in maximum clocking frequency compared to the coherent limit;otherwise, there is a maximum allowable relaxation rate beyond which high per-formance is not achievable. The bounds in this near-coherent regime are thereforewithin a factor of 2 of those in the coherent limit.102Chapter 7Simulating 2-State QCA onQuantum Annealing HardwareIn the previous chapter, it was demonstrated that through understanding QCAclocking as a QA process, information about the LES could be used both to char-acterize the choice in clocking schedule as well as predict clocking performance.Without further development, this method is practical only for small networkswhere the LES can be easily computed; however, the link to QA suggests thatadditional insight might be gained by investigating an analogous QA hardware ar-chitecture, in which numerical simulation is not a limitation. Work by D-WaveSystems Inc. in Burnaby, Canada, to realize a large scale quantum annealing plat-form has made it possible to explore physical quantum annealing in real-world ap-plications including large numbers of variables [67, 98]. Most of the discussion inthis chapter focusses on the previous D-Wave 2000Q QPU, which offers 2048 po-tential physical superconducting flux qubits in a 16×16 grid of 8-qubit tiles. Onechallenge to running the QPU on a given problem lies in the sparse connectivitybetween qubits. The 2000Q uses the “Chimera” topology, summarized in Fig. 7.1,with each qubit connected to at most six neighbours. The more recent AdvantageQPU offers more than 5000 qubits and uses a higher connectivity topology, termed“Pegasus” [99, 100]. Pegasus introduces four additional couplers within each tile,shown in Fig. 7.2, as well as additional inter-tile couplings. Each qubit in the Pe-gasus topology is connected to at most 15 qubits. Details on the tile connections103(a) Qubit schematic (b) Chimera graphFigure 7.1: Qubit layout for one tile of the QPU. There are 4 vertical and 4 horizon-tal qubits with couplers at the intersections. Couplers between tiles are indicatedby the dashed lines. In the graph representation qubits are represented by nodesand couplers by the connecting edges. Each qubit has at most 6 adjacent qubits.Figure 7.2: The Pegasus topology adds 4 internal couplers within each 8-qubit tile.This enables the mapping of 3-cycles, allowing for more compact embeddings.are given in Appendix C.1. This connectivity constraint ultimately means that itis not possible to embed a given QCA network onto the QPU using a one-to-onemapping between individual QCA cells and qubits. Additional qubits are neededto facilitate all interactions, with multiple qubits representing a single QCA cell.The choice of these multi-qubit groups defines the embedding of a QCA circuit.In this chapter, we discuss the challenge of embedding and the steps necessary forsimulating QCA networks on the D-Wave QPU.1041234(a) K4 QUBO graph (b) Chimera (c) PegasusFigure 7.3: Minimal embeddings for K4 on the Chimera and Pegasus topologies.Colours show the corresponding node in the K4 problem graph. Coloured edgesare internal to the vertex model with J = −1. Dashed edges are optional. Theembeddings are not unique; in Chimera for example, any horizontal qubit in a tilecan we swapped with any other, and similarly for the vertical qubits.7.1 Embedding Problems on the QPUIn general, any QUBO or equivalent problem can be mapped to the D-Wave QPU.To avoid confusion between the hi and Ji j terms of an Ising model and those of theQPU, we will use the bi and Qi j notation from QUBO. We can represent a givenproblem by a problem graph, with the bi values mapped to node weights, and thecoupling terms, Qi j, to edge weights. Similarly, we can understand the potentialqubits and couplers on the QPU as a hardware connectivity graph, where eachqubit is a node, and each coupler an edge between the corresponding qubit nodes.A single tile of the Chimera graph is shown in Fig. 7.1(b). In the literature, nodesin the problem graph tend to be called logical qubits and nodes in the hardwaregraph are called physical qubits. Ideally, each logical qubit would be mapped to asingle physical qubit; however, the edges in the hardware graph are limited by theQPU connectivity. The solution is to map each logical qubit to a set Ω, called avertex model, of physical qubits. This task is known as minor embedding [101].As an example, embeddings for the complete graph of 4 nodes, K4, which use thefewest number of physical qubits are shown in Fig. 7.3. There are a few rules forassigning parameters to physical qubits in vertex models [102]:• Couplers between physical qubits in a vertex model should attempt to forcethe qubits to be of the same spin in the ground state. As parameter ranges on105the QPU are constrained to |hi|, |Ji j| ≤ 1, this translates to Ji j = −1 withinthe vertex model.• Edges and the corresponding Qi j between logical qubits must map to edgesbetween the corresponding vertex models. If multiple edges are availablebetween two vertex models, the edges must satisfy ∑n∈Ωi ∑m∈Ω j Jnm = Qi j.In this work, we set one of the Ji j to Qi j and leave the rest disabled.• Node weights within a vertex model must satisfy ∑n∈Ωi hn = bi. We chooseto set hn = bi/|Ωi| ∀n ∈Ωi.For a real QPU, some of the qubits or couplers may be disabled due to issues infabrication or calibration [103]. This does not fundamentally change the task, onlyfurther restrict the hardware connectivity graph. Finding efficient embeddings isknown to be NP-hard [101], with deterministic algorithms having complexity ex-ponential in the size of problem graph [104]. Fortunately, polynomial time heuris-tic algorithms exists for attempting to find graph minors [104, 105]. Results forembedding larger problems will be summarized in Section 7.1.3; however, we firstneed to establish the problem graph structure of 2-state QCA networks.7.1.1 QCA Connectivity Graph RepresentationsIt is natural to discuss the interactions between QCA cells in terms of a connectivitygraph. There are two species of 2-state QCA cell often considered in the literature:the 4-dot cell with quadrapole-quadrapole like interactions; and the 2-dot cell withdipole-dipole interactions. Typical device geometries, polarization states, and Ji jterms are shown in Figs. 7.4 and 7.5. Importantly, we can consider each cell as avertex in a connectivity graph, with edge weights given by Ji j. Due to the rapidfall-off of cell interactions, we can ignore most interactions, and thus the degreeof such graphs is often relatively small. A few important examples are shown inFig. 7.6. The 4-dot majority gate and inverter often represent the highest densityof cells in 4-dot networks, and have degrees of only 4 and 5 respectively. Thesomewhat more complicated 2-dot majority gate has a higher degree of either 10or 12 depending on the choice of included interactions. This is representative ofthe highest device density typically observed in 2-dot networks. Higher density106-10.2230.007 -1-0.03(a)(b)(c)Figure 7.4: Schematic of the 4-dot cell. Intra- and inter-cell distances are definedin (a). In (b), we define the two polarization states given by antipodal chargeoccupation of the dots; here each cell has two holes (red) and two electrons (blue).In (c), we show computed Ji j parameters for different cell separations, scaled to thestrongest observed interaction. The colours aid in identifying the sign of Ji j withdotted lines showing weak interactions which are usually ignored. Values shownfor d = 1nm and s = h = 2nm.-10.670.103-0.159-0.10-0.0650.008(a)(b)(c)Figure 7.5: Schematic of the 2-dot cell: (a) intra- and inter-cell distances; (b),polarization states; (c), scaled Ji j parameters. Here the next-to-nearest interactionsare sufficiently strong that their contribution cannot necessarily be ignored. Valuesshown for d = 1nm, s = 1.5nm, and h = 2nm.107(a) (b)(c)Figure 7.6: Connectivity graph representations of basic components in QCA net-works: (a) 4-dot majority gate, (b) 4-dot inverter, and (c) 2-dot majority gate. Thecolored edges in (c) are added to remove clutter and highlight slightly weaker in-teractions. The weaker -0.065 interaction is neglected for simplicity.devices can be speculated, such as multi-wide wires [106, 107] or 5-input majoritygates [108]; however, these are not currently common in QCA design. Larger QCAnetworks are usually sparsely connected arrangements of these or similar relativelydense sub-networks. Embeddings for the majority gate and inverter are shown inFigs. 7.7 and Summary of Embedding AlgorithmsEarly embedding experiments were targeted for the 512 qubit D-Wave Two hard-ware graph. Due to the sparsity of QCA connectivity graphs, an in-house algorithmwas developed for comparison against D-Wave’s heuristic minorminer [105]. A1084 5312(a) Majority Gate (b) Chimera (c) PegasusFigure 7.7: Minimal embeddings for the majority gate. The extra intra-tile cou-plers in the Pegasus topology do not reduce the number of required physical qubits;however, they do offer additional choices for edges, which is potentially useful ifsome couplers/qubits are not functional on the QPU.431 2567(a) Inverter (b) Chimera (c) PegasusFigure 7.8: Minimal embeddings for the inverter. In this case, a perfect embeddingis possible in the Pegasus topology.brief overview of the two algorithms is given here for comparison.Dense Placement AlgorithmThe Dense Placement (DP) algorithm uses a simultaneous “place-and-route” ap-proach commonly used in field-programmable gate array (FPGA) design [109]. Itattempts to place every logical qubit onto a single corresponding assigned physi-cal qubit. Chains of physical qubits are routed between the assigned qubits if noedge exists. A number of strategies are used for selection and assignment of thefirst logical qubit, order of logical qubits to embed, maintenance and reservationof available neighbours for future routings, and recovery when no neighbours areavailable. If an embedding is found, a linear programming approach is used to109convert this “assigned qubit and interaction chain model” into an equivalent set ofvertex models. Details regarding the DP algorithm can be found in Appendix A.2or in published work [6, 7].D-Wave’s Heuristic Algorithm: MinorminerFor each node in the problem graph, assign an initial vertex model as follows. Ifnone of the current node’s neighbours have been assigned a vertex model, randomlyselect a physical qubit as the vertex model. Otherwise, find the qubit with thesmallest total path cost to all of the adjacent vertex models and assign this qubitas the root of the vertex model. Initially, the same qubit can be used by multiplepaths at a high cost. Paths to the adjacent vertex models are distributed betweenvertex models to minimize the largest vertex model size. After initial placement,iteratively loop through each problem node, forget the previously assigned vertexmodel and try to find a better one. The metrics for improvement are defined as thenumber of times a single qubit is used in the set of all vertex models (must be nomore than once for a successful embedding), the sum of vertex model sizes, andthe largest vertex model size. A detailed discussion on minorminer can be foundin Ref. [105].Recent AdvancementsSince our initial investigation, more recent efforts by Pinilla and Wilton [110] haveintroduced layout aware embedding of QCA networks. In some cases, this ap-proach has been shown to produce embeddings requiring fewer physical qubitsthan either the DP algorithm or minorminer. In addition, minorminer has under-gone a number of revisions, improving performance. We include here a limitedsample of the original embedding results in order to demonstrate general trends.7.1.3 Embedding ResultsWe consider 4-dot QCA using two potential levels of connectivity: Full Adja-cency, where we include both nearest neighbour (Ji j = −1) as well as diagonal(Ji j = 0.233) interactions; and Limited Adjacency, where we exclude diagonal in-teractions except those on the output of an inverter. Refer to Fig. 7.4, for context.110sr-flip-flopsel-oscxor-gatefull-addmem-loopser-add4b-mux4b-accDense PlacementMinorminer0 100 200 300 40050 150 250 350Physical Qubits(a) Limited Adjacency0 100 200 30050 150 250sr-flip-flopsel-oscxor-gatefull-addmem-loopser-addDense PlacementMinorminerPhysical Qubits(b) Full AdjacencyFigure 7.9: Number of physical qubits used for each benchmark circuit. Smallerdots are individuals embeddings, with 100 attempts per circuit. The large markersshow the average. The 4 bit 2-1 multiplexer and accumulator could not be embed-ded onto the 512 qubit QPU using full adjacency.Table 7.1: Benchmark circuits and the number of non-driver cellsCircuit Cells Circuit CellsSR Flip Flop 20 Memory Loop 120Selectable Oscillator 31 Serial Adder 126XOR Gate 73 4 Bit 2-1 MUX 192Full Adder 98 4 Bit Accumulator 273Both algorithms were run on two categories of QCA networks: a set of benchmarkcircuits of differing complexity and number of cells, and randomly generated QCAconnectivity graphs. We assume here a Chimera topology without any disabledqubits or couplers. For performance of the algorithms with disabled qubits, seeAppendix C. Benchmark CircuitsThe list of considered benchmark circuits and the corresponding number of cellsis included in Table 7.1. For each benchmark, we attempted 100 embeddings us-ing both algorithms and adjacency types. The results are summarized in Fig. 7.9.The DP algorithm required fewer physical qubits for limited adjacency, with mi-norminer requiring fewer for full adjacency. In Fig. 7.10, we compare the average1110501001502002503003504000 100 200 30050 150 250Number of CellsAverageQubitUsageDP: LimitedDP: FullMinorminer: LimitedMinorminer: Full(a) Average physical qubit usage23456780 100 200 30050 150 250Number of CellsAverage Maximum Vertex ModelDP: LimitedDP: FullMinorminer: LimitedMinorminer: Full(b) Average largest vertex modelFigure 7.10: Comparison of physical qubit usage and vertex models sizes for bothalgorithms and adjacencies. In (a), we show the averages from Fig. 7.9, with theblack line indicating a one-to-one embedding, and coloured lines showing linearfits. The average largest vertex model sizes are shown in (b).physical qubit usage and maximum vertex model size for all cases. We see that thenumber of required qubits is approximately proportional to the number of cells.There is less of a trend for the vertex model sizes, which likely depend more on thecircuit complexity.7.1.5 Generated CircuitsA stochastic circuit generator was implemented to investigate the range of QCAcircuits that can be embedded onto the QPU. Each generated circuit connectivitygraph is comprised of a random set of QCA components: inverters, majority gates,fixed polarization inputs. These components are wired together by generating ran-dom trees from the component outputs to random sets of available inputs. Wegenerated 2000 circuits in both adjacencies and attempted to find embeddings. Foreach circuit, ten attempts were made with a maximum allowed runtime of 30 sec-onds, after which the embedding was assumed to have failed. The average requiredphysical qubits for all successfully embedded circuits are shown in Fig. 7.11. Wesee similar trends to what was observed for the benchmark circuits. In Fig. 7.12,we show the single trial success probabilities as a function of the number of cells.These values correspond to the number of generated circuits of a given size which112Number of Cells050100150200250300350Full AdjacencyLimited AdjacencyAverageQubitUsage0 100 20050 150(a) Dense Placement050100150200250300350400 Full AdjacencyLimited Adjacency0 100 20050 150 250Number of CellsAverageQubitUsage(b) MinorminerFigure 7.11: Average number of physical qubits needed to embed generated QCAcircuits of different sizes. The solid black lines show power function fits which areapproximately linear. Full AdjacencyLimited Adjacency100 200 30050 150 250Number of CellsSingle Trial Success Probability(a) Dense Placement0. AdjacencyLimited Adjacency100 200 30050 150 250Number of CellsSingle Trial Success Probability(b) MinorminerFigure 7.12: Average single trial embedding success probabilities using both al-gorithms. Points are obtained by counting the number of generated circuits withsizes in bins of 5 cells that were successfully embedded. Sigmoid fits are shown.were successfully embedded within ten trials. We see that minorminer is muchmore likely to succeed in a given embedding attempt.Further embedding results can be found in Appendix C.2, including averagerun times for finding embeddings of generated circuits, characteristic run times forlarger QPUs, and results when random sets of disabled qubits are considered.1130.0 0.2 0.4 0.6 0.8 1.0Annealing Time, s024681012Schedule Energy (GHz)A(s)B(s)(a) Schedule Energies0.0 0.2 0.4 0.6 0.8 1.0Annealing Time, s10 410 2100102Clocking Ratio, A(s)/B(s)(b) α(s) = A(s)/B(s)Figure 7.13: Annealing schedule for the D-Wave 2X, used in all presented anneal-ing results and calculations. For reference, 1GHz≡ 4.1356µeV7.2 Annealing ResultsHere we present a number of results run on the 1152 qubit D-Wave 2X. The an-nealing schedule for the QPU is shown in Fig. 7.13. The energy scale on the QPUis comfortably expressed in GHz. This conversion relates to the energy of a photonwith a given frequency f : E = h f , for the Planck constant h = 4.1356µeV/GHz.When running problems on the QPU, users have control of a number of parame-ters. Relevant here are the number of annealing trials per gauge transformation,an arbitrary redefinition of the spin conventions of the qubits, and the annealingtime, ta, which defines our characteristic rate: f˜ = 10−3 GHz · µs/E · ta. Stayingconsistent with the framework in Chapter 6, we set E = B(1) ≈ 9.5GHz. The an-neal times are typically on the order of 101 µs to 103 µs, meaning our characteristicfrequency is on the order of 10−5 to 10−7. These are much slower than the 10−1to 10−2 sufficient for adiabatic evolution as determined in Chapter 6. This poses apotential issue for investigating the performance of small QCA devices if the goalis to study diabatic transitions near the coherent limit. At such relatively slow an-nealing speeds we should expect all our devices to run adiabatically. Any diabatictransitions likely arise due to other details of the evolution or QPU implementation,such as thermally assisted excitations [103, 111, 112].1140. 15 20 25 30 35 40 45 50Wire Length, NGround State Probability510204010020040010002000(a) Measured adiabticity0 10 20 30Wire Length, N100Minimum Gap,0(GHz)(b) Minimum gap vs. Wire lengthFigure 7.14: Annealing results for wires of different lengths with various anneal-ing times. In (a), we show the proportional of trials returning the correct groundstate. Each mark represents an annealing run of 50,000 trials. Two wires of eachlength were embedded using different sets of qubits. The lines pass through the av-erage. Lengths of 5, 10, 15, 20, 30, and 50 were used; the markers are staggered forease of view. The wire length at which the minimum gap falls below the thermalenergy at 15mK is indicated: N ∼ 18.45. In (b), we present the minimum energygap as a function of the wire length. The black dashed line gives the approxima-tion from Eq. (6.18): ∆0(N) ≈ B(s∗)pi/(N + 1), with A(s∗) = B(s∗) = 1.934GHzat s∗ = 0.413.7.2.1 Adiabaticity of WiresAn important test for the QPU behaviour is simple wires of various lengths. Theyhave both an analytical spectrum, Eq. (6.17), as well as a guaranteed one-to-oneembedding. In Fig. 7.14, we present the measured adiabaticity, approximated bythe number of annealing trials which return the correct ground state, for a selectionof wires. We saw in Section 6.2.1 that long QCA wires are well approximated by asimple Landau-Zener model; however, the results here fall off far too quickly. Us-ing Eqs. (6.18) and (6.20), we can calculate that the largest wire we should be ableto run for a given adiabaticity, PA, should be Nmax ≈√−20.2GHz · ta/ ln(1−PA).Even for the shortest annealing time, ta = 5µs, we should have an adiabaticity of0.99 at Nmax ≈ 147. This approximation ignores both environmental interactionsas well as any additional details of the qubit behaviour. We see in Fig. 7.14 thatthe minimum energy gap of the wire becomes smaller than the thermal energy be-115tween N = 18 and N = 19. This roughly corresponds to the start of the fall-off inthe adiabaticity curves. This raises the possibility that what we are seeing is theresult of thermal excitations.7.2.2 Changes in the Spectrum after EmbeddingBefore looking more closely at the dependence on the annealing time, ta , it is nec-essary to discuss a potential issue. While our choice in embedding and parameterassignment guarantees that the ground states in the classical limit (s= 1) before andafter embedding correspond, we have not yet considered what happens to the restof the spectrum during annealing. In Fig. 7.15, we present the low energy spectrafor a few simple QCA components that require only one additional physical qubitfor embedding. In the case of the adapter, which connects normal 4-dot QCA cellsto a rotated variant, we see barely any change to the spectrum after embedding,with the exception of some of the higher energy states and a slight decrease in theminimum gap. For Maj-101, we again see a slight decrease in the minimum gapbut also significant change among the first three excited states. The most promi-nent change occurs for the inverter, with a large decrease in the minimum gap. Infact, at s = 1 the first excited state for the embedded inverter has a kink within thevertex model, describing a state which has no analog in the original network. Itwould be possible to avoid this new state by increasing the relative strength of theJi j within the vertex models, achieved by decreasing all other hi and Ji j by somescaling factor; however, this would effectively increase the thermal energy, makingthermal excitations more likely.7.2.3 Transition ParametersIt is informative to look at the transition parameters associated with these spectra,shown in Fig. 7.16. Unlike previous cases, we are not expressing our spectra indimensionless units. The transitions parameters are then given in units of inverseenergy: GHz−1. We will primarily focus our analysis on the first excited state.For the adapter, we observed a 21.7% decrease in the minimum gap. If only theeigenenergies have changed, we should expect the transition parameter T1 , in-versely proportional to the gap squared, to have increased by 63%. We observe1160.2 0.3 0.4 0.5 0.6 0.70123Energy:E1E 0(GHz)sNormalized Time,(a) Adapter: ∆0 = 0.327GHz0.2 0.3 0.4 0.5 0.6 0.70123Energy:E1E 0(GHz)sNormalized Time,(b) Adapter-Embedded: ∆0 = 0.256GHz0.2 0.3 0.4 0.5 0.6 0.70123Energy:E1E 0(GHz)sNormalized Time,(c) Maj-101: ∆0 = 1.180GHz0.2 0.3 0.4 0.5 0.6 0.70123Energy:E1E 0(GHz)sNormalized Time,(d) Maj-101-Embedded: ∆0 = 1.013GHz0.2 0.3 0.4 0.5 0.6 0.70123Energy:E1E 0(GHz)sNormalized Time,(e) Inverter: ∆0 = 0.803GHz0.2 0.3 0.4 0.5 0.6 0.70123Energy:E1E 0(GHz)sNormalized Time,(f) Inverter-Embedded: ∆0 = 0.180GHzFigure 7.15: Change in the spectrum for the most efficient embeddings. Depend-ing on the device and the embedding, the low energy spectrum can be significantlyeffected, as with the inverter, or barely affected at all. Minimum gaps are includedin the sub-figure captions. The thermal energy at 15mK is indicated by the reddashed line.1170.2 0.3 0.4 0.5 0.6 0.7Normalized Time, s10010124.0Transition Parameters(a) AdapterTransition Parameters0.2 0.3 0.4 0.5 0.6 0.7Normalized Time, s10010139.4(b) Adapter-Embedded0.2 0.3 0.4 0.5 0.6 0.7Normalized Time, s1001013.4Transition Parameters(c) Maj-101Transition Parameters0.2 0.3 0.4 0.5 0.6 0.7Normalized Time, s1001014.7(d) Maj-101-Embedded0.2 0.3 0.4 0.5 0.6 0.7Normalized Time, s100101 10.9Transition Parameters(e) Inverter0.2 0.3 0.4 0.5 0.6 0.7Normalized Time, s10010164.1Transition Parameters(f) Inverter-EmbeddedFigure 7.16: Transition parameters for the spectra in Fig. 7.15. The vertical scaleis the same for all plots to help with comparison. The slight jaggedness is due tovery small steps in the B schedule which causes spikes in dds B(s) and henceddsHˆ.The values of the maximum peaks are given.118an increase of 64%. Similarly for Maj-101, we should expect an increase in T1 ofabout 36%, and we observe 38%. We infer from this that the embedding did notsignificantly change 〈ϕ1| ddsHˆ|ϕ0〉. It might in principle be possible then to correctfor this increase in T1 by reducing the annealing speed near the minimum gap, de-crease the transition rate and thereby recover some estimate of the dynamics of theoriginal network, at least if we anneal slow enough that the rest of the eigenstatescan be ignored. The same argument does not apply for the inverter. For a 77.5%reduction in the gap, we should expect T1 to increase by nearly 20×, yet we ob-served only about 6×. This means that 〈ϕ1| ddsHˆ|ϕ0〉 is significantly different nearthe minimum gap. This is consistent with our observation that the first excited stateat s = 1 is incompatible with any state in the original system. Interestingly, we seethat despite having a much smaller gap, the second excited state ends up having ahigher transition parameter near the end of the anneal, suggesting a much strongercoupling if not for the larger gap. This second excited state ends up correspondingto the original first excited state, which can be seen in the spectra.7.2.4 Dependence on the Annealing TimeIn Fig. 7.17, we show the ground state probabilities as a function of the annealingtime for both a subset of the wires as well as the adapter, majority gate, and inverter.Unlike in the Landau-Zener model, where running things slower always improvesadiabaticity, here we see an optimum annealing time for each given embedding.That time differs even for the same problem embedded using different qubits. Thedecline in performance with the increase in annealing time has been attributedto coupling to the environment [111, 112]. It is argued that for fast annealing,the dynamics are similar to diabatic transitions seen in the Landau-Zener model,or perhaps other descriptions of closed system phase transitions (see Ref. [112]).Such transitions are inverse to the annealing time. Due to the presence of thermalnoise, excitations into energy levels with gaps on the order of the thermal energyare possible [111]. The longer the system spends near the minimum gap, wheresuch transitions are easiest, the more likely it is for the system to be excited out ofthe ground state.119101 102 103Annealing Time ( s)0.9850.9900.995Ground State Probability10(a) Wire-10101 102 103Annealing Time ( s)0.6250.6500.6750.7000.7250.7500.775Ground State Probability(b) Adapter101 102 103Annealing Time ( s)0.840.860.880.900.920.940.96Ground State Probability20(c) Wire-20101 102 103Annealing Time ( s)0.9880.9900.9920.9940.996Ground State Probability(d) Maj-101101 102 103Annealing Time ( s)0.500.550.600.650.700.75Ground State Probability50(e) Wire-50101 102 103Annealing Time ( s)0.540.560.580.600.620.640.660.68Ground State Probability(f) InverterFigure 7.17: Proportion of annealing results which produce the ground state forour sample embeddings as a function of the annealing time. The colours indi-cate different embeddings for each network, with 50,000 annealing trials for eachmarker. The dashed lines show a simple quadratic fit for each embedding.1207.3 SummaryIn this chapter, we have discussed the potential for applying a physical implemen-tation of quantum annealing as a means of investigating the dynamic behaviourof QCA networks, in particular the QPU developed by D-Wave Systems Inc. Forthe particular qubit connectivity available in the Chimera or Pegasus topology, itis not possible to map all QCA networks to the QPU without the use of addi-tional qubits. Schemes for embedding a given QCA network onto the QPU werediscussed, where each QCA cell is mapped to a connected set of physical qubitscalled a vertex model. We found that the use of these addition qubits results ina difference between the spectra of the original and embedded networks. In thesimple case, we observe a slight decrease in the minimum gap without any signif-icant change in 〈ϕ1| ddsHˆ|ϕ0〉. The embedding of an inverter introduced a new firstexcited state which did not correspond to a polarization state. For this embedding,it is questionable whether any meaningful estimate of performance could be ex-tracted. For larger networks, not only is it a challenge to determine the spectrum inorder to check the consequences of the embedding, but we typically end up havingmultiple vertex models containing more than two physical qubits. This suggeststhat additional work would be needed in order to better understand how to map theannealing results back to performance estimates of the original system.We saw from the annealing results that there are two competing mechanismsfor diabatic transitions on the QPU. For fast annealing rates, transitions consistentwith closed system dynamics dominate. These transitions decrease as the annealingtime is increased. As the annealing time is increased further, transitions associatedwith coupling to the environment become prominent. This results in an optimumannealing time which produces the largest adiabaticity. It is not clear how thesetransition mechanism on the QPU map to those of any particular QCA implemen-tation.121Chapter 8Eigenspectrum Decomposition ofQCA NetworksIn Chapter 6, we concluded that the low energy spectrum was a powerful tool forunderstanding QCA performance, but that its applicability was limited by the in-tractable diagonalization of the N-cell Hamiltonian for larger QCA networks. Inthe previous chapter, we noted that extracting performance metrics for larger QCAnetworks embedded on the QPU would require a better understanding of how em-bedding effects the spectrum. We will conclude by addressing two ideas related toQCA spectra. First, we investigate a means to decompose larger QCA networksinto meaningful subcomponents, and how important features in the low energyspectrum of the full network can be attributed to those components. Secondly, wediscuss how to estimate the spectrum for large QCA networks. In Section 8.1, wediscuss how the system Hamiltonian in the 2-state approximation can be decom-posed into meaningful sub-Hamiltonians given some partition or decomposition ofa QCA network into components. These sub-Hamiltonians give estimates of thereduced density operators of each component. In Section 8.2, we establish an ob-jective function which gives a measure of how well a given decomposition capturesthe low energy characteristics of the full network. Using this metric, we identifythe problem of network decomposition as a graph partitioning problem with signededge weights. Strategies for finding the optimal network decomposition for smallerQCA networks with exactly solvable ground states are discussed. We show how122the low energy spectra of the component Hamiltonians describe features of the fullspectrum. In Section 8.3, we introduce a heuristic method based on the componentHamiltonian construction which allows for the study of larger networks. In Sec-tion 8.4, we discuss the computation of transition parameters. In Section 8.5, wedemonstrate one potential application of this formalism, the meaningful compari-son of alternate variations of network components. Finally, in Section 8.6, we givea few additional details relating to extensions of the component model to 3-stateQCA and for dynamic wave clocking.8.1 Decomposing the QCA Network HamiltonianWhen designing QCA networks, it is practical to conceptually decompose the net-work into smaller interconnected components, typically logic gates. It is mean-ingful then to ask whether it is possible to understand properties such as the per-formance of a given QCA network in terms of contributions from subsets of thenetwork. The first step is in understanding how to decompose the Hamiltonian.We can define a network partition {Sk} to be any set of non-empty components,where each QCA cell belongs to exact one of the components Sk. Note that for ourmodels of QCA, the Hamiltonian only ever contains operators that affect at mosttwo cells. Then for a given partition, we can always decompose Hˆ asHˆ =∑kHˆk+ ∑〈k`〉Mˆk`, (8.1)where Hˆk only contains terms within Sk and Mˆk` contains all the terms betweencomponents Sk and S`. A good choice for the Hˆk and Mˆk` will be discussedlater. In QCA, we are mostly interested in quantities that relate to either individualcells or pairs of cells. If we assume the system is in thermal equilibrium, then theexpectation value of any operator Oˆk that influences only cells within Sk can beexpressed as〈Oˆk〉= TrOˆkρˆ = 1Z TrOˆke−β Hˆ, (8.2)with inverse temperature β = 1/kBT and normalization constant Z = Tre−β Hˆ. AsOˆk depends only on Sk, we can take the trace over the complement of Sk to obtain123the reduced density operator ρˆk = Trck ρˆ and get〈Oˆk〉= Trk Oˆkρˆk (8.3)8.1.1 The Ideal Component HamiltonianFor any finite temperature and reduced density operator ρˆk, there always exists aHamiltonian Hˆk that captures the physics of all operators Oˆk. In particular, weexpress ρˆk in two forms: (1) its own eigenbasis(Wαk , |uαk〉)with Wαk ≥ 0; and (2),a Boltzmann distribution:ρˆk =∑αWαk |uαk〉〈uαk|= 1Zk e−β Hˆk . (8.4)Every eigenstate |uαk〉 is also an eigenstate of Hˆk with eigenvalue Eα =−β−1 logWαk −β−1 logZk. We can writeHˆk =−β−1∑αlogWαk |uαk〉〈uαk|−β−1 logZkIˆ. (8.5)Ideally, we would equate these Hˆk to those in Eq. (8.1); unfortunately, computingHˆk in this way requires knowledge of the full system density operator and is thusnot practical for larger networks. Instead, we will employ mean-field theory toattempt to estimate Hˆk.8.1.2 Mean-Field Theory ApproximationThe approach here is to estimate properties of the reduced density operators in theabsence of correlation terms between the components. We begin with the Callen-Suzuki identity [113]:〈Oˆk〉 ≈ 1Z Tr Eˆk(Oˆk)e−β Hˆ, (8.6)whereEˆk(Oˆk) = Trk Oˆke−βH¯kTrk e−βH¯k, (8.7)124is an operator on the complement of Sk for the expected value of Oˆk over thedegrees of freedom of Sk. The Hamiltonian H¯k here contains all terms of Hˆ thataffect Sk. In particular, for the terms defined in Eq. (8.1),H¯k = Hˆk+ ∑6`=kMˆk` (8.8)We will use the 2-state approximation as a specific case study. From Eq. (2.3), wecan write H¯k explicitly asH¯k =−12 ∑i∈Skγiσˆ ix+ 12[∑i∈Skhˆiσˆ iz− ∑〈i j〉∈SkEki jσˆizσˆjz](8.9)where hˆi = hiIˆ−∑ 6`=k∑ j∈S` Eki jσˆ zj is a biasing operator containing all interactionsinvolving site i excluding those internal to Sk. We make the ansatz that, for i and jin different components, we can exchange the σˆ zj term with some constant m j: i.e.σˆ jz σˆ iz → m jσˆ iz. These mi would typically be called magnetisations; however, wewill simply called them the mean fields. Then H¯k represents an Ising spin-glassmodel on Sk with hˆi ≈ h˜iIˆ for a constant:h˜i = hi− ∑6`=k∑j∈S`Eki jm j. (8.10)As this form of H¯k satisfies [H¯k, Hˆ−H¯k] = 0, it follows that this approximationis equivalent to assuming a decomposable density operator given by the Kroneckerproduct of the reduced density operators: ρˆ =⊗k ρˆk. Here we again assume ρˆis given by a Boltzmann distribution. Conventional mean field theory tells us weshould use [113]mi = Trk σˆ izρˆk =1Zk Tr σˆize−βH¯k . (8.11)We can now define our estimate of the terms in Eq. (8.1):Hˆk =− 12 ∑i∈Skγiσˆ ix+ 12[∑i∈Skh˜iσˆ iz− ∑〈i j〉∈SkEki jσˆizσˆjz](8.12a)Mˆk` =− 12 ∑i∈Sk∑j∈S`Eki j(σˆ iz−mi)(σˆ jz −m j)+Ek` (8.12b)125where Ek` = 12 ∑i∈Sk∑ j∈S` Eki jmim j is just a constant. The necessary identity oper-ators that accompany all constants are suppressed for conciseness. In practice, weneed to compute Hˆk self-consistently with mi via Eq. (8.11).8.2 System Energy-Based DecompositionWhen operating near the adiabatic limit, the low energy eigenstates of a QCA net-work produce important information about network performance [5]. We considerthe relevant case of determining the energy of a QCA network. Observe that inthe limit of zero thermal energy, ρˆ becomes equal to the ground state of Hˆ. Inthis case, the mi are analogous to the instantaneous ground state polarizations usedin the multi-cell ICHA discussed in Section 2.2.4. If the thermal energy is on theorder of the energy gap between the ground and first excited states, then ρˆ and theρˆk contain contributions only from a few of the lowest energy eigenstates. If wewant to estimate the energy of the system, we haveE = 〈Hˆ〉=∑k〈Hˆk〉+ ∑k6=`〈Mˆk`〉. (8.13)We define 〈Aˆ〉r as our estimate of the expected value of Aˆ using only the reduceddensity operators, and ∆(Aˆ)= 〈Aˆ〉r−〈Aˆ〉. We can then split the error in our estimateof the system energy into two components: the error internal to each component,arising from any inaccuracy in the estimation of ρˆk, and the error in estimatinginter-component interactions. If we used the exact ρˆk defined in Eq. (8.4), theformer would be exactly zero; however, for our mean field approximation we areleft with an error:∆Eint =−12∑iγi∆(σˆ ix)+ 12[∑ihi∆(σˆ iz)−∑k∑〈i j〉∈SkEki j∆(σˆizσˆjz )]. (8.14)The error arising from inter-component couplings is given by∆Eext =−12 ∑〈k`〉∑i∈Sk∑j∈S`Eki j∆(σˆizσˆjz ). (8.15)126As discussed, our definition of the mean fields, mi, produces a density operator ofthe form ρˆ =⊗k ρˆk. It follows that the ∆(σˆ izσˆjz ) terms in Eq. (8.15) are given by∆(σˆ izσˆjz ) =−Mi jzz +∆(σˆ iz)〈σˆ jz 〉+ 〈σˆ iz〉∆(σˆ jz ) (8.16)plus an additional term ∆(σˆ iz)∆(σˆjz ) taken to be small. The correlation properMi jzz = 〈σˆ izσˆ jz 〉− 〈σˆ iz〉〈σˆ jz 〉 is a measure of the loss of information about the cou-pling between sites i and j due to neglecting inter-component correlations [36, 42].The latter two terms can be absorbed into Eq. (8.14) by subtracting Eki j〈σˆ jz 〉 termsfrom the hi. This leaves∆E = ∆Eint({mi})+ 12 ∑〈k`〉∑i∈Sk∑j∈SkEki jMi jzz. (8.17)Here ∆Eint is now a function of the 〈σˆ iz〉, or equivalently the mean fields. Bydefinition, a good approximation of the reduced density operators produces small∆ terms for operators internal to the components. We assume then that ∆Eint willbe small relative to the second term in Eq. (8.17). We are tempted at this point toidentify the decomposition of QCA networks as a graph partitioning problem.8.2.1 Energy-Optimal Component DecompositionIf our goal is to identify a decomposition which best reproduces the reduced densityoperators when the system operates nearly adiabatically, i.e. to capture informationabout the lowest energy eigenstates of the approximate sub-Hamiltonians, we seethat the best choice of components in a decomposition should minimize the errorterm in Eq. (8.17) arising from the correlation propers. That is, we should minimizea cost function of the formC({Sk}) =∣∣∣∣∣∑〈k`〉 ∑i∈Sk ∑j∈S` Eki jMi jzz∣∣∣∣∣. (8.18)We assume for now that the correlation propers are known from the ground stateof the full system. In Section 8.3, we discuss an approach for approximating thesevalues for systems too large to solve exactly. Using the connectivity graph structure127defined in Section 7.1.1, we see that QCA network decomposition becomes a graphpartitioning problem with signed edge weights and an objective function C. Inparticular, the partitioning takes the form of a minimum k-cut problem [114, 115].There are two challenges here. First, common graph partitioning libraries employan objective function given as a linear combination of edge weights. To addressthis, we could define two objective functions C+ and C−:C±({Sk}) = ∑〈k`〉∑i∈Sk∑j∈S`±Eki jMi jzz. (8.19)We can then find the optimal partitions, {Sk}±, for both of these objective func-tions. Whichever produces the smallest absolute value of their corresponding C±optimises C. The second challenge is due to the signed edge weights wi j = Eki jMi jzz .Available solvers require non-negative, typically integer-valued, edge weights. Wewill not attempt to solve the signed partitioning problem in this work. Instead, theresults we present are achieved in two steps. We first consider an upper bound onEq. (8.18):C({Sk})≤ C˜({Sk}) = ∑〈k`〉∑i∈Sk∑j∈S`∣∣Eki jMi jzz∣∣. (8.20)This objective can be optimized using a standard solver such as METIS [116] withedge weights wi j = |Eki jMi jzz|. Where integer weights are required, simple quantiza-tion is used. We take the resulting partition as an initial guess of the true optimum.Local adjustments are then made manually in order to further reduce the values ofEq. (8.18). This refinement process could be automated in future.8.2.2 Decompositions for Clocked NetworksNetwork clocking manifests as some time varying parameter in the Hamiltonian.The decomposition which optimizes Eq. (8.18) then may vary with time. Thisis not particularly appropriate for our purposes. We instead consider a differentobjective function. In particular, we define the optimal decomposition as that whichminimizes the highest observed value of C over a set of sample times, {tn}. We128define the dynamic objective functionCd({Sk}) = maxtn∣∣∣∣∣∑〈k`〉 ∑i∈Sk ∑j∈S` Eki j(tn)Mi jzz(tn)∣∣∣∣∣. (8.21)Again, we can find an initial partition using the upper bounded variant with edgeweights wi j = maxtn |Eki j(tn)Mi jzz(tn)|. Of interest during QCA clocking are the in-tervals over which the gap between the ground and first excited state is minimum[5]. We use these mimima as our tn.8.2.3 Decomposition of a Solvable SystemLet us consider the 2-dot majority gate in Fig. 7.6(c) where we fix the inputs tohave respective polarizations of (+1,−1,+1) or binary 101. The Hamiltonian ofthis 19 cell network is too large for exact eigenvalue decomposition in a reasonabletime; however, using sparse matrix methods we can successfully employ Lanczositeration to reliably approximate the spectrum [44]. In Fig. 8.1 we show a few pos-sible decompositions of the majority gate and the resulting energy eigenspectra.These spectra are achieved as follows. First we find the k lowest eigenpairs of thefull system Hamiltonian Hˆ(γ) using a Python wrapper of the ARPACK package[117, 118]. It is assumed that all cells share a clock: γi = γ . Using the com-puted ground states, we construct the low-temperature mean fields mi(γ) = 〈σˆ iz〉.We compute the component Hamiltonians using the instantaneous effective biasesh˜i(γ) from Eq. (8.10), and then the lowest energy eigenstates in the componentspectra. We can make a few interesting observations: (1) the best choice of net-work decomposition is not typically the one which evenly divides the network intoequally sized components; (2), when the components are chosen well, the com-ponent spectra seem to reproduce important features of the full spectrum; and (3),even a slight deviation from the optimal decomposition can drastically change thecomponent spectra and result in a failure to predict important features. Any de-composition that splits up the two high density columns fails to reproduce the firstminima in the component spectra.It is worth considering what the component spectra actually describe. We con-sider the basis of product states of the eigenstates of the component Hamiltonians.1290. Spectrum,Clocking Field, Spectrum,Clocking Field, Spectrum,Clocking Field,(a)(b)(c)Figure 8.1: Sample decomposition of the 2-dot majority gate with binary inputs101. The orange shaded devices have fixed polarization and simply provide h bi-ases to their neighbours. In this case, the network is decomposed into 2 componentsindicated with red and blue. The corresponding energy eigenspectra are shown asthe clocking parameter, γ , is swept. Ground state energies are subtracted to high-light features in the spectra. Here the black dashed line shows the spectrum of thefull system. The colored spectra are obtained from the component Hamiltoniansgiven by Eq. (8.12a) with the mean fields given by the polarization of the full sys-tem ground state. In (a), we use a naı¨ve balanced decomposition, where we simplysplit the network in half. The decomposition based on our dynamic objective func-tion, Eq. (8.21), is shown in (b). We see that the component spectra closely repro-duce the features of the true spectrum. In (c), we see that a slight deviation from theoptimal decomposition can drastically change the component spectra, which fail topredict the first minimum. Only the few lowest energy excited states are shown.Values of Cd are shown for the set γ ∈ {1.196,0.11}.130By subtracting the ground state energies, a given component spectrum describesproduct states where all other components are in their ground state. This followsfrom the observation that for our definition of the mean fields and given that onlyone component is excited, at least one of σˆ iz−mi and σˆ jz −m j will have an expectedvalue of 0 for i and j in different components. If a feature in the true spectrummatches one of the component spectra, this suggests that both the ground state andthe corresponding excited state must be approximately such product states. Thisholds for the ground state when the instantaneous value of Eq. (8.18) is small,and thus the ground state energy is approximated by the component ground states.Our choice of decomposition does not directly consider whether this is true for theexcited states; however, by choosing our sample times {tn} to correspond to theminima in the energy gap, we attempt to guarantee at least one of the necessarycriterion for the full and component spectra to agree.8.2.4 Additional Details on Finding PartitionsIn Fig. 8.1, we can get a decomposition with a slightly lower cost then our presentedoptimal if we simply use the cell adjacent to the 0 input as the first component. Forthe given inputs, that cell has very little influence on the rest of the network anddoes not significantly contribute. This decomposition gives Cd = 0.043. The com-ponent spectrum of the rest of the network is effectively that of the full network.Such decompositions, however, are not particularly informative as they do not at-tribute features in the spectrum to individual network components. There are anumber of ways we might avoid this scenario. Firstly, we have said nothing abouthow many components should be used in a decomposition. We are free to increasethis number to attempt to force the partitioning algorithm to identify non-trivialcomponents. Additionally, solvers like METIS attempt to minimise the objectivefunction, in our case the so-called edge-cut function, while trying to keep the com-ponent sizes balanced. A constant ε is defined such that for K components, thelargest component contains at most a fraction (1+ε)/K of the cells. For all resultspresented in this work, initial partitioning was done using METIS 5.1.0 set fork-way multilevel partitioning using the “cut” objective type and set to force con-tiguous partitions. The value of ε , through the “ufactor” parameter, was set to 0.3131by default but adjusted for each network if unsuitable components were produced.8.3 Heuristic Solver for Large NetworksWhen the true ground state of the system is known, we can determine the Mi jzzexactly and attempt to find an optimal decomposition. However, the size of theHilbert space for 2-state networks scales as 2N for N cells, meaning the complexityof exact eigenvalue decomposition is at least as hard as O(4N). This is solvablefor practical hardware and time scales for only a dozen or so cells. Using sparseheuristic solvers which employ methods such as Lanczos iteration [44, 119], wecan extend this to 20 or so cells. Beyond this limit, we must begin to make ap-proximations. One approach would be to employ something like density matrixrenormalization groups (DMRG) [120]. Interestingly, the component decomposi-tion approach we have established introduces a potential related solution. In iden-tifying the optimal decomposition, we used information about the ground stateto construct mean fields which produce useful component Hamiltonians. We canconsider the inverse problem. Given an approximation of the component Hamilto-nians, we attempt to efficiently produce an estimate of the ground state, or indeedadditional states, of the full system. We will employ a variational procedure basedon basis reduction, similar in concept to DMRG. This approach will be referred toas the Component Mode (CM) solver.8.3.1 Component Mode FormulationWe suppose we are given some initial partition and a guess of the mean fields.We assume for now that each component is small enough to be solved exactly:|Sk| / 20. We can now construct our estimates of each of the component Hamil-tonians, Eq. (8.12a). Recall that our mean field approximation was equivalent toassuming a density operator of the form ρˆ =⊗k ρˆk. If this were truly the case, theground state of the full system would be the product state of the ground states ofeach of our component Hamiltonians. We observe that we must assume a numberof inaccuracies: (1) the initial decomposition will generally be sub-optimal; (2),the given mean fields may be inaccurate; and (3), the mean-field model itself is im-perfect. For simplicity we assume there are only two components. From Eq. (8.12),132the Hamiltonian of the full system can be expressed up to a constant asHˆ = Hˆ1⊕ Hˆ2− 12 ∑i∈S1∑j∈S2Eki j(σˆ iz−mi)⊗ (σˆ jz −m j), (8.22)Where here we understand the Hˆn and σˆ iz to be of dimension 2|Sn| and explicitly de-note the Kronecker sum and product. Our approach will be to select a subset of theeigenstates of each of the component Hamiltonians. In DMRG, we would find thedominant eigenpairs of the reduced density operators, those with the largest eigen-values [121]. As we have assumed our reduced density operators to be Boltzmanndistributions, these dominant eigenstates are exactly the lowest energy eigenstatesof the component Hamiltonians. For each component, we select some number, dn,of the lowest energy eigenstates. We call these states the component modes, whichform the columns of the transformation operators Un ∈ C2|Sn |×dn . We then obtainreduced basis representations of the relevant component operators of size dn×dn:H n =U†n HˆnUn, σiz =U†n σˆizUn. (8.23)The reduced basis representation of the full system Hamiltonian is then constructedsimply asH = H 1⊕H 2− 12 ∑i∈S1∑j∈S2Eki j(σ iz−mi)⊗ (σ jz−m j), (8.24)with a dimension of d = Πndn. For decomposition into K components we maketwo adjustments: (1) we replace Hˆ1⊕ Hˆ2 with ⊕n Hˆn, the Kronecker sum overall component Hamiltonians; and (2), we have a coupling term for each pair ofcomponents, made easier by first constructing the d ×d representations of the σ iz.We can then estimate the ground state of H , |ϕ 0〉, or even additional low energyeigenstates. Using the reduced basis representation of σˆ iz, we can get new estimatesof the mean fields and correlation propers:mi = 〈ϕ 0|σ iz|ϕ 0〉, Mi jzz = 〈ϕ 0|σ izσ jz |ϕ 0〉−mim j, (8.25)13310000100001100000 001 1010Figure 8.2: Example decomposition tree of a large QCA network. The cell coloursindicate the components at that level of the decomposition. Orange cells havefixed polarization and belong to either of two classes. Dark orange cells indicatefixed inputs in the full network. Faded orange cells indicate representative biasesarising from inter-component interactions. These should be understood as cellswith fixed polarizations defined by the current mean field estimates. Note that thefixed polarization cells do not get split between the components. They are copiedto all components which they influence.again using the full d ×d dimensional version of σ iz. The mi can then be fed backinto Eq. (8.22) after updating the Hˆn and the ground state solved self-consistently.In addition to estimating the low energy spectrum of the full system, note that thismethod necessarily produces the component spectra in finding the reduced basis.8.3.2 Larger NetworksIf the components themselves are too large for exact diagonalization, we can em-ploy this heuristic method recursively. We observe that the component formulationproduces a useful reduced basis representation of the system Hamiltonian. Note134100001Figure 8.3: Circuit diagram and the equivalent 4-dot QCA network for the consid-ered 49 cell XOR gate, not counting fixed input cells.also that there is nothing in Eq. (8.22) that requires Hˆn or σˆ iz to be in their fullrepresentations, only that the σˆ iz be in the same representation as their correspond-ing Hˆn. Then for each component, we either apply exact diagonalization if thecomponent is sufficiently small, or recursively call the CM solver to construct anappropriate reduced basis representation. This requires two additions to the CMsolver: First, we must now provide both the component spectra as well as the re-duced representations of the σˆ iz as computed in Eq. (8.23). Second, we now requirea tree of decompositions which inform the solver of how each component is furtherbroken down into sub-components. An example of such a tree is shown in Fig. 8.2.We can apply our existing understanding of identifying decompositions here, ei-ther automatically using Eq. (8.20), or by manually specifying the decomposition.Details on the implementation of the CM solver are discussed in Appendix A. Example NetworkOf primary interest are relatively compact QCA networks, 10-20 cells across andperhaps 40-100 cells in total, that can be fit within a single clock zone. Most largernetworks are typically just sparsely wired together combinations of such networks.As an example, we consider the 49 cell exclusive-or gate shown in Fig. 8.3 withinputs 0 and 1. Even though simple, this network is beyond exact diagonalizationand of sufficient complexity for our purposes. In Fig. 8.4, we show a selectionof decompositions and the corresponding low-energy spectra and objective costs.We compare against the energy gap between the ground and first excited states ascomputed using the DMRG implementation in ITensor v3.1.3, an open-source ten-sor network library [122]. For each value of γ , the ground and excited state were135100001100001100001Spectrum, Field,,Clocking Field,,Clocking Field,(a)(b)(c) 8.4: Sample decompositions of the XOR gate. Colours indicate both com-ponent cells and the corresponding spectra. The low energy spectrum computedusing the CM solver is indicated by the dashed lines. The crosses show the firstexcited state as computed using a DMRG. The first 2-component decompositionin (a) is achieved purely using the kink energies as weights: wi j = |Eki j|. The bluecomponent fairly accurately predicts the location and value of the lowest minima.A second minima is seen in the full spectrum but is not obviously associated witheither component. The ground states at the minima, γn ∈ {0.673,0.295}, were usedto compute Cd , from which we achieved the optimal 2-component decompositionshown in (b). We see that the adjusted red component contributes two features: theγ→ 0 limit, as well as part of the first minima observed in the DMRG results. Thismotivates a search for a 3-component decomposition. The optimum is shown in(c). The particular choice of DMRG accuracy parameters used were often unableto correctly track the ground state for small γ , leading to the scattered values. Aself-consistency limit of 0.001 was used for the CM solver, hence values of Mi jzzand thus Cd are only reliable to that order.136retained to initialise the DMRG at the next instance. Instability in the DMRG re-sults is apparent near the γ → 0 limit; however, we will see that the second lowerγ minimum gap is expected to contribute far less towards transitions in the dy-namics than the higher γ minimum. Verification against the DMRG results in thisregime is therefore less critical. Further refining of the accuracy parameters mayhave resolved the instability. We observe that even using the simplest 2-componentdecomposition shown in Fig. 8.4(a), much of the features of the first excited stateare captured assuming the DMRG results to be accurate. That being said, it isnot clear that the red component spectrum predicts anything of interest. Outsideof the noted instability, the ground states for both methods agreed to two decimalplaces. Using the minima from the CM solver as not to bias the choice in γn usingthe DMRG results, we identified the optimal 2-component decomposition shownin Fig. 8.4(b). The predicted minima did not change. The two observed minimain the red component spectrum motivates an optimized 3-component decomposi-tion shown in Fig. 8.4(c). Not only do we observe close agreement between thepredicted features and the DMRG results, we see clearly how each componentcontributes to the spectrum.8.4 Estimating Transitions in the Component Mode BasisThe CM solver produces estimates of a limited subset of the low energy eigenstates.It is natural then to consider whether it possible to compute the transition param-eters in order to get some idea of clocking performance. The Tn(s) require thatwe evaluate both 〈ϕn| dds Hˆ|ϕ0〉 as well as the energy gaps, En−E0, for each state.While we can produce both an estimate of the eigenvalues of the system as wellas the corresponding eigenstates, these eigenstates are expressed in a somewhatnon-intuitive basis: a recursively defined product space of component eigenstates.It would be very useful if we could express dds Hˆ in this same basis. Fortunately, thecomponent mode construction both allows and requires that we maintain represen-tations of σˆ iz in the component basis. It is trivial, to keep track of σˆ ix as well. The137component basis representation of dds Hˆ is given by :dds H =−12∑iddsγiσix+12[∑idds hiσiz−∑〈i j〉dds Eki jσizσjz]. (8.26)If we assume clocking via γi(s) only, then we have the simple expression〈ϕm| dds Hˆ|ϕn〉=−12∑iddsγi(s)〈ϕm|σ ix|ϕ n〉. (8.27)The computed transition parameters are shown in Fig. 8.5. We can also considertransitions from the perspective of the component Hamiltonians.Tkn (s) =〈ϕkn | dds Hˆk|ϕk0 〉(Ekn −Ek0 )2(8.28)where Hˆk|ϕkn 〉 = Ekn |ϕkn 〉 for the n’th eigenstate of the k’th component Hamilto-nian, and 〈ϕkn | dds Hˆk|ϕk0 〉 ≈−12 ∑i∈Sk ddsγi(s)〈ϕkn |σ ix|ϕk0 〉. The results are includedin Fig. 8.5(b). We see that the transition parameters computed using the compo-nent spectra almost perfectly reproduce those of the full spectrum. This should notbe surprising as we have chosen our decomposition such that the first excited stateis approximately a product state with all but one of the components in the groundstate. Transitions from the ground state to the excited state should exactly cor-respond to transition from the ground state of that component to its excited state.The only exceptions occur near crossings between the component spectra, in whichcase the full system eigenstates tend to be superpositions of the crossing states.We do not include dynamics calculations here; however, it is possible alsoto directly compute the dynamics using the reduced basis approach used in Sec-tion 6.2.3. In the component mode basis, we havedds cn ≈− ∑m 6=ncm〈ϕ n| dds H |ϕm〉Em−En e−i(θm−θn)/ f˜ (8.29)1380.00.51.0051015202530Transition ParameterClocking Field,(a) Full SpectrumClocking Field, Parameter100001(b) Component SpectraFigure 8.5: Estimates of the transition parameters for the 49 cell XOR gate. As-sumes a linear schedule with 12ddsγi(s) =−1 The multiple crossings among higherenergy states in the full spectrum lead to rapid changes in some of the parameters.In (b), we consider the transition parameters computed using only the componentspectra. Values for the lowest energy states in (a) are overlaid for comparison.1398.5 Component SubstitutionFor a given network decomposition, we can consider the effect of altering the de-sign of one of the components. As an example, suppose we substituted the invertercomponent in the XOR gate with another symmetric variant, as in Fig. 8.6(a).There are a few points to consider here. Firstly, if the component substitutiondoes not significantly change the mean fields of the cells at the component inter-faces throughout the full range of clocking parameters, then the component spectraof all other components should remain approximately unchanged. Secondly, ifthe decomposition remains of high quality, having a small value of Cd , after thesubstitution, then any changes to the component spectrum for the new invertershould correspond to changes in the full spectrum. We can see both of these effectsin the computed spectra in Fig. 8.6(b), with the transition parameters computedin Fig. 8.6(c). This simple observation suggests a potential strategy for targetingweaknesses in designs and making systematic improvements.8.6 Application to Alternate QCA ModelsThe effectiveness of the decomposition method and heuristic solver arise due tothe sparsity of the network connectivity graphs and expected limited correlations.There is nothing special about our focus on 2-state QCA or even our use of zoneclocking. Here we briefly discuss the other common model for QCA devices, the3-state QCA, as well as an important consideration for dynamic wave clocking.8.6.1 Modifications for 3-State QCAFor the N device Hamiltonian in the 3-state approximation, Section 5.1, a similaranalysis to Section 8.1 will reveal that we should replace hi and µi in the componentHamiltonians with effective parametersh˜i =hi− ∑6`=k∑j∈S`Eki jP j, (8.30a)µ˜i =µi+ ∑6`=k∑j∈S`Di jN j, (8.30b)140100001100001(a) Substitution of the inverting component in the XOR gate.Spectrum, Field,, Field, Resulting change in the low energy spectrum and component spectra.Clocking Field, Parameter100001Clocking Field, Parameter100001(c) Change in the transition parametersFigure 8.6: Change in the low energy spectra and transition parameters associatedwith substituting the inverting element in the green component with an alternatevariant. Regions of interest are highlighted. In this particular case, there is a slightincrease in the minimum gap associated with the blue component. This results in adecrease in the corresponding transition peak.141where Pi = 〈Pˆi〉 and Ni = 〈Nˆi〉 are the mean fields. The component couplingoperators are justMˆk` =−12 ∑i∈Sk∑j∈S`Eki j(Pˆi−Pi)(Pˆ j−P j)+ ∑i∈Sk∑j∈S`Di j(Nˆi−Ni)(Nˆ j−N j)+ const. (8.31)For component decomposition, the appropriate objective function isC({Sk}) =∣∣∣∣∣∑〈k`〉 ∑i∈Sk ∑j∈S`(Eki jMi jPP−2Di jMi jNN)∣∣∣∣∣, (8.32)with Mi jPP = 〈PˆiPˆ j〉−PiP j and Mi jNN = 〈NˆiNˆ j〉−NiN j. The details of networkdecomposition and the heuristic method are otherwise analogous.8.6.2 Dynamic Wave ClockingThe dynamic objective function Eq. (8.21) applies perfectly well in the case wherethe clocking fields are cell specific. The primary challenge here is that wave clock-ing does not employ clock zones, and importantly lacks the discrete boundariesbetween those zones. In zone clocking, we can limit the set of cells to be con-sidered to those within a single clock zone. Cells in adjacent zones either experi-ence relatively high or relatively low clocking fields, and thus do not significantlychange their state as the current zone is clocked. As a result, they are effectivelyjust biases. Neither the CM solver not DMRG will cope well with potentially thou-sands of cells in the entire QCA network for some of the larger designs. For waveclocking, we can still keep the idea of a finite set of relevant cells; however, that setevolves as the clocking wave travels. Efficiently computing the spectra then likelyrequires a strategy that tracks the set of cells within the rising edge of the clockingwave that cannot be assumed to serve only as biases. One simple approach wouldbe to include cells once their clocking fields surpass some threshold value, and setthem as biases once their polarization in the estimated ground state surpasses somemagnitude. Further investigation would be needed.1428.7 SummaryIn this chapter, we have produced two significant results. The main goal of ourinvestigation was to demonstrate that important features of the low energy spec-trum can be attributed to individual components of the network. We establishedone way to identify these critical components when the spectrum is known. Thisform of decomposition is useful from a design perspective as it allows us to di-rectly link overall weaknesses in the network performance with individual choicesin components. Future research may provide new design rules based on select-ing components which optimize features in the spectrum. The second result wasthe formulation of the CM solver, which naturally emerges from the componentdecomposition. It has proved promising for approximating the low energy spec-trum of larger networks, where exact diagonalization of the system Hamiltonianis infeasible. We were able to extract the necessary parameters, 〈ϕn| ddsHˆ|ϕm〉, forestimating diabatic transitions efficiently in the component mode basis. This po-tentially allows estimation of clocking performance for large QCA networks.143Chapter 9Conclusion and Future WorkNanoscale implementations of QCA offer the potential for high frequency, lowpower computation beyond the limits of conventional CMOS. However, at suchsmall dimensions, additional challenges emerge. One such challenge is the needto simulate the behaviour of large numbers of devices. In the limit of full coher-ence, this can mean manipulating quantum states with exponentially many vari-ables, seemingly possible only for small networks. The contributions of this the-sis can be broadly split into two related categories: (1) modelling the behaviourof clocked SiDB arrangements and investigating their suitability as a platform fornanoscale QCA, and (2), a study of QCA clocking from the perspective of quantumannealing, ultimately determining how details about the low energy eigenspectrumduring clocking relate to QCA performance.9.1 SiDBs as a Platform for Nanoscale QCAIn Chapter 4, we discussed simple models which reproduce observed behaviour infabricated SiDBs. This included an understanding of the preferred charge statestaken on by arrangements of SiDBs as a constrained energy optimization prob-lem, as well as a hopping model for the slow dynamics observed when chargesare shared between degenerate sites. SiQAD, a CAD tool for designing and sim-ulating SiDB arrangements, emerged as an attempt both to understand existingexperimental results, as well as to predict and study new designs, such as potential144SiDB-based QCA implementations. This tool was made open source with the aimto promote efforts in exploring the design space, as well as to develop additionalsimulation engines and design tools. Using clocking electrodes, the electric poten-tial at the surface can be manipulated, producing controllable band-bending in re-gions of SiDBs. This allows for a population-based clocking scheme, where SiDBdevices are deactivated by inducing upward band-bending. When this method wasapplied to SiDB-based QCA wires, a complicated trajectory of preferred configu-rations was revealed, with devices populating out-of-order.In Chapter 5, we studied this out-of-order population from the familiar terrainof 3-state QCA, with preferred charge configurations becoming ground states ofa 3-state Hamiltonian with strong congestion interactions. In this framework, achange in the preferred configuration translates to an avoided level crossing be-tween the current ground and first excited states. We ultimately determined that ifwe assume SiDB-based QCA devices to be net-negative when activated, and onlyconsider tunneling between the SiDBs and null sites (bulk), then even wire oper-ation may be a challenge. Familiar designs for logic gates such as majority gateswould likely need to be replaced with SiDB-specific arrangements. If, in the future,it is demonstrated that net-neutral SiDB devices can be implemented, potentiallyusing both DB- and DB+ states, then SiDB-based QCA may be more feasible.Inclusion of a tunneling energy between the SiDBs may also be of interest.9.2 Estimating QCA Performance using the Low EnergySpectrumMuch progress has been previously made in understanding the dynamics of QCAnetworks. Use of the ICHA, though computationally efficient, results in an ap-proach to QCA design and clock zone selection which is not practical for the rela-tive dimensions of clocking electrodes in nanoscale QCA. Intermediate models in-cluding additional inter-cell correlations have shown some success in more closelyreproducing the behaviour of the full quantum treatment; however, there remainchallenges which have yet to be resolved. In this thesis we took a different ap-proach by establishing the importance of the low energy spectrum in understand-ing the behaviour of QCA networks. Previously, the spectrum has been used to145justify reduced basis approximations [43], and more recently in discussing adia-batic switching of individual QCA cells [93]. This work aims to position the studyof the energy spectrum as an essential tool for assessing the behaviour of QCAdesigns.Chapter 6 served as a demonstration for some useful applications of the spec-trum for small QCA networks. First, we established criteria for the shape of theclocking profile in order to increase adiabaticity. These ideas were based on prop-erties of the resulting ground state and early dynamics. An analytical model forbiased QCA wires in the 2-state approximation was presented and used to demon-strate that zone-clocked QCA wires beyond a few cells adhere to a simple Landau-Zener model. We compared estimates of threshold clocking frequencies beyondwhich performance suffered. Merely looking at the eigenenergies was not suffi-cient to accurately predict performance with the exception of the wire; however, byincorporating the values of 〈ϕm| ddsHˆ|ϕn〉, we achieved high accuracy using only thelowest few energy eigenstates. This demonstrates that once the spectrum is known,the dynamics can be efficiently simulated. This left the challenge of efficientlycomputing the spectrum.The parallels between QCA zone clocking and quantum annealing on the QPUdeveloped by D-Wave Systems Inc. presented an obvious potential method fordirectly studying the performance of clocked QCA without being limited by com-putational resources; however, a number of as yet unresolved challenges becameclear. The main obstacle is in understanding how the particular choice of em-bedding changes the spectrum, and ultimate the effective dynamics of the originalQCA network. In the cases where this change was minimal, we observe evidenceof diabatic transitions likely resulting from coupling with the environment. Thishighlights an important point which should be addressed. The claim is often madein the QCA literature that as the nearest-neighbour kink energies for nanoscaleQCA lie above 25meV, operation at room temperature should be possible withoutsuffering from thermal excitations. There are a couple of contentions we see fromthe experimental annealing results. First, the kink energies may give a measureof the gap between the ground and first excited states in the classical limit, butwe see from the spectra that the actual minimum gap during clocking can easilybe an order of magnitude smaller. Further, even in the case where the minimum146gaps lie above the thermal energy, as is the case for the embedded majority gatein Fig. 7.15, as the annealing time is increased we still see evidence of weak butnon-negligible thermal excitations. More work is needed, both in order to improvethe agreement between the spectra of the original and embedded network but alsoin order to determine if there is any mapping between the diabatic mechanismsexhibited by the QPU and those in any particular implementation of QCA.The final contribution was in attempting to address the need for understandingthe spectra of large QCA networks. This investigation simultaneously touched onthe question of how we might study the behaviour of large QCA circuits throughknowledge of individual components. We established a method for decomposingthe system Hamiltonian into contributions from a given set of components, and ameans to identify components which contribute important features to the spectrum.This result is primarily significant in that it will allow researchers to identify im-portant sub-circuits which significantly impact performance. This may stimulateresearch into design rules which optimize these critical components. The compo-nent formulation naturally led to a heuristic approach for estimating the low energyspectrum of large QCA networks, particularly in such a way that the terms neededfor estimating dynamics using a limited subset of the energy eigenstates are effi-cient to compute. There are a number of open questions here. First, the current im-plementation leaves the decomposition only semi-automated, still requiring somefamiliarity and guesswork in order to converge on the optimal decomposition. Amore robust method is needed, potentially also addressing the necessary challengeof signed graph partitioning. Dynamic wave clocking remains a conceptual chal-lenge from the perspective of analysing the low energy spectrum. It is clear thatonly a relatively small subset of the network contributes to the dynamics at any givetime; however, the exact approach to efficiently implement either the componentmode solver or the decomposition method is not immediately clear.147Bibliography[1] M. Rashidi, W. Vine, T. Dienel, L. Livadaru, J. Retallick, T. Huff, K. Walus,and R. A. Wolkow, “Initiating and monitoring the evolution of single elec-trons within atom-defined structures,” Phys. Rev. Lett., vol. 121, p. 166801,Oct 2018.[2] S. S. H. Ng, J. Retallick, H. N. Chiu, R. Lupoiu, L. Livadaru, T. Huff,M. Rashidi, W. Vine, T. Dienel, R. A. Wolkow, and K. Walus, “SiQAD:A design and simulation tool for atomic silicon quantum dot circuits,” IEEETransactions on Nanotechnology, vol. 19, pp. 137–146, 2020.[3] H. N. Chiu, S. S. H. Ng, J. Retallick, and K. Walus, “PoisSolver: a toolfor modelling silicon dangling bond clocking networks,” in 2020 IEEE 20thInternational Conference on Nanotechnology (IEEE-NANO). IEEE, Jul.2020, pp. 134–139.[4] J. Retallick and K. Walus, “Population congestion in 3-state quantum-dotcellular automata,” Journal of Applied Physics, vol. 127, no. 24, p. 244301,2020.[5] J. Retallick and K. Walus, “Limits of adiabatic clocking in quantum-dotcellular automata,” Journal of Applied Physics, vol. 127, no. 5, p. 054502,2020.[6] J. Retallick, M. Babcock, M. Aroca-Ouellette, S. McNamara, S. Wilton,A. Roy, M. Johnson, and K. Walus, “Embedding of quantum-dot cellularautomata circuits onto a quantum annealing processor,” in 2014 Conferenceon Optoelectronic and Microelectronic Materials & Devices. IEEE, Dec.2014.[7] J. Retallick, M. Babcock, M. Aroca-Ouellette, S. McNamara, S. Wilton,A. Roy, M. Johnson, and K. Walus, “Algorithms for embedding quantum-dot cellular automata networks onto a quantum annealing processor,” 2017.148[8] J. Retallick and K. Walus, “Investigation of quantum-dot cellular automatanetworks using a quantum annealing processor,” Presented at the 253rd ACSNational Meeting, San Francisco, USA, Apr. 2017.[9] J. Retallick and K. Walus, “Low-energy eigenspectrum decomposition(LEED) of quantum-dot cellular automata networks,” submitted for publi-cation, 2020.[10] A. Nourbakhsh, A. Zubair, R. N. Sajjad, A. Tavakkoli K. G., W. Chen,S. Fang, X. Ling, J. Kong, M. S. Dresselhaus, E. Kaxiras, K. K. Berggren,D. Antoniadis, and T. Palacios, “MoS2 field-effect transistor with sub-10 nmchannel length,” Nano Letters, vol. 16, no. 12, pp. 7798–7806, Dec 2016.[11] C. Qiu, Z. Zhang, M. Xiao, Y. Yang, D. Zhong, and L.-M. Peng, “Scalingcarbon nanotube complementary transistors to 5-nm gate lengths,” Science,vol. 355, no. 6322, pp. 271–276, 2017.[12] D. Bhowmik, L. You, and S. Salahuddin, “Spin hall effect clocking of nano-magnetic logic without a magnetic field,” Nature Nanotechnology, vol. 9,pp. 59 EP –, Nov 2013.[13] X. Fong, Y. Kim, K. Yogendra, D. Fan, A. Sengupta, A. Raghunathan, andK. Roy, “Spin-transfer torque devices for logic and memory: Prospects andperspectives,” IEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems, vol. 35, no. 1, pp. 1–22, Jan 2016.[14] R. A. Wolkow, L. Livadaru, J. Pitters, M. Taucer, P. Piva, M. Salomons,M. Cloutier, and B. V. C. Martins, Silicon Atomic Quantum Dots EnableBeyond-CMOS Electronics. Berlin, Heidelberg: Springer Berlin Heidel-berg, 2014, pp. 33–58.[15] C. S. Lent, P. D. Tougaw, W. Porod, and G. H. Bernstein, “Quantum cellularautomata,” Nanotechnology, vol. 4, no. 1, pp. 49–57, jan 1993.[16] P. D. Tougaw and C. S. Lent, “Logical devices implemented using quantumcellular automata,” Journal of Applied Physics, vol. 75, no. 3, pp. 1818–1825, 1994.[17] I. Amlani, A. O. Orlov, G. Toth, G. H. Bernstein, C. S. Lent, and G. L.Snider, “Digital logic gate using quantum-dot cellular automata,” Science,vol. 284, no. 5412, pp. 289–291, 1999.149[18] A. O. Orlov, I. Amlani, G. H. Bernstein, C. S. Lent, and G. L. Snider, “Real-ization of a functional cell for quantum-dot cellular automata,” Science, vol.277, no. 5328, pp. 928–930, 1997.[19] G. L. Snider, A. O. Orlov, I. Amlani, G. H. Bernstein, C. S. Lent, J. L. Merz,and W. Porod, “Quantum-dot cellular automata: Line and majority logicgate,” Japanese Journal of Applied Physics, vol. 38, no. Part 1, No. 12B, pp.7227–7229, Dec. 1999.[20] G. To´th and C. S. Lent, “Quasiadiabatic switching for metal-islandquantum-dot cellular automata,” Journal of Applied Physics, vol. 85, no. 5,pp. 2977–2984, 1999.[21] A. O. Orlov, R. Kummamuru, R. Ramasubramaniam, C. S. Lent, G. H. Bern-stein, and G. L. Snider, “Clocked quantum-dot cellular automata shift regis-ter,” Surface Science, vol. 532-535, pp. 1193–1198, Jun. 2003.[22] R. P. Cowburn, “Room temperature magnetic quantum cellular automata,”Science, vol. 287, no. 5457, pp. 1466–1468, Feb. 2000.[23] G. Csaba, A. Imre, G. Bernstein, W. Porod, and V. Metlushko, “Nanocom-puting by field-coupled nanomagnets,” IEEE Transactions on Nanotechnol-ogy, vol. 1, no. 4, pp. 209–213, Dec. 2002.[24] A. Imre, “Majority logic gate for magnetic quantum-dot cellular automata,”Science, vol. 311, no. 5758, pp. 205–208, Jan. 2006.[25] C. S. Lent, B. Isaksen, and M. Lieberman, “Molecular quantum-dot cellularautomata,” Journal of the American Chemical Society, vol. 125, no. 4, pp.1056–1063, 2003, pMID: 12537505.[26] J. A. Christie, R. P. Forrest, S. A. Corcelli, N. A. Wasio, R. C. Quardokus,R. Brown, S. A. Kandel, Y. Lu, C. S. Lent, and K. W. Henderson, “Synthesisof a neutral mixed-valence diferrocenyl carborane for molecular quantum-dot cellular automata applications,” Angewandte Chemie International Edi-tion, vol. 54, no. 51, pp. 15 448–15 451, 2015.[27] M. B. Haider, J. L. Pitters, G. A. DiLabio, L. Livadaru, J. Y. Mutus, andR. A. Wolkow, “Controlled coupling and occupation of silicon atomic quan-tum dots at room temperature,” Phys. Rev. Lett., vol. 102, p. 046805, Jan2009.150[28] T. Huff, H. Labidi, M. Rashidi, L. Livadaru, T. Dienel, R. Achal, W. Vine,J. Pitters, and R. A. Wolkow, “Binary atomic silicon logic,” Nature Elec-tronics, vol. 1, no. 12, pp. 636–643, 2018.[29] J. Timler and C. S. Lent, “Power gain and dissipation in quantum-dot cel-lular automata,” Journal of Applied Physics, vol. 91, no. 2, pp. 823–831,2002.[30] C. S. Lent, M. Liu, and Y. Lu, “Bennett clocking of quantum-dot cellularautomata and the limits to binary logic scaling,” Nanotechnology, vol. 17,no. 16, pp. 4240–4251, aug 2006.[31] D. Abedi, G. Jaberipur, and M. Sangsefidi, “Coplanar full adder in quantum-dot cellular automata via clock-zone-based crossover,” IEEE Transactionson Nanotechnology, vol. 14, no. 3, pp. 497–504, May 2015.[32] C. A. T. Campos, A. L. Marciano, O. P. Vilela Neto, and F. S. Torres, “USE:A universal, scalable, and efficient clocking scheme for qca,” IEEE Transac-tions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35,no. 3, pp. 513–517, March 2016.[33] E. Blair and C. Lent, “Clock topologies for molecular quantum-dot cellularautomata,” Journal of Low Power Electronics and Applications, vol. 8, no. 3,2018.[34] K. Walus, T. J. Dysart, G. A. Jullien, and R. A. Budiman, “QCADesigner: arapid design and simulation tool for quantum-dot cellular automata,” IEEETransactions on Nanotechnology, vol. 3, no. 1, pp. 26–31, March 2004.[35] C. S. Lent and P. D. Tougaw, “Lines of interacting quantum-dot cells: Abinary wire,” Journal of Applied Physics, vol. 74, no. 10, pp. 6227–6233,1993.[36] G. To´th and C. S. Lent, “Role of correlation in the operation of quantum-dotcellular automata,” Journal of Applied Physics, vol. 89, no. 12, pp. 7943–7953, 2001.[37] K. Hennessy and C. S. Lent, “Clocking of molecular quantum-dot cellu-lar automata,” Journal of Vacuum Science & Technology B: Microelectron-ics and Nanometer Structures Processing, Measurement, and Phenomena,vol. 19, no. 5, pp. 1752–1755, 2001.151[38] V. Vankamamidi, M. Ottavi, and F. Lombardi, “Two-dimensional schemesfor clocking/timing of QCA circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 27, no. 1, pp. 34–44,Jan. 2008.[39] E. Sicard, “Introducing 14-nm FinFET technology in Microwind,” Jun.2017.[40] G. To´th, “Correlation and coherence in quantum-dot cellular automata,”Ph.D. dissertation, 2000, copyright - Database copyright ProQuest LLC;ProQuest does not claim copyright in the individual underlying works; Lastupdated - 2019-10-17.[41] F. Karim, “Investigation of the correlated dynamics of quantum-dot cellu-lar automata circuits and systems,” Ph.D. dissertation, University of BritishColumbia, 2014.[42] F. Karim and K. Walus, “Efficient simulation of correlated dynamics inquantum-dot cellular automata (qca),” IEEE Transactions on Nanotechnol-ogy, vol. 13, no. 2, pp. 294–307, March 2014.[43] P. D. Tougaw and C. S. Lent, “Dynamic behavior of quantum cellular au-tomata,” Journal of Applied Physics, vol. 80, no. 8, pp. 4722–4736, 1996.[44] C. Lanczos, “An iteration method for the solution of the eigenvalue prob-lem of linear differential and integral operators,” Journal of Research of theNational Bureau of Standards, vol. 45, no. 3, pp. 255–282, 10 1950.[45] G. Mahler and V. A. Weberruß, Quantum Networks. Springer Berlin Hei-delberg, 1998.[46] E. Farhi, J. Goldstone, S. Gutmann, J. Lapan, A. Lundgren, and D. Preda,“A quantum adiabatic evolution algorithm applied to random instances ofan NP-complete problem,” Science, vol. 292, no. 5516, pp. 472–475, Apr.2001.[47] E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser, “Quantum computationby adiabatic evolution,” 2000.[48] M. Born and V. Fock, “Beweis des adiabatensatzes,” Zeitschrift fu¨r Physik,vol. 51, no. 3, pp. 165–180, Mar 1928.[49] W. Vinci and D. A. Lidar, “Non-stoquastic hamiltonians in quantum anneal-ing via geometric phases,” npj Quantum Information, vol. 3, no. 1, Sep.2017.152[50] T. Kadowaki and H. Nishimori, “Quantum annealing in the transverse isingmodel,” Physical Review E, vol. 58, no. 5, pp. 5355–5363, Nov. 1998.[51] G. E. Santoro, R. Martonˇa´k, E. Tosatti, and R. Car, “Theory of quantumannealing of an ising spin glass,” Science, vol. 295, no. 5564, pp. 2427–2430, 2002.[52] A. Das and B. K. Chakrabarti, “Colloquium: Quantum annealing and analogquantum computation,” Rev. Mod. Phys., vol. 80, pp. 1061–1081, Sep 2008.[53] J. Brooke, T. F. Rosenbaum, and G. Aeppli, “Tunable quantum tunnellingof magnetic domain walls,” Nature, vol. 413, no. 6856, pp. 610–613, Oct.2001.[54] M. W. Johnson, M. H. S. Amin, S. Gildert, T. Lanting, F. Hamze, N. Dick-son, R. Harris, A. J. Berkley, J. Johansson, P. Bunyk, E. M. Chapple, C. En-derud, J. P. Hilton, K. Karimi, E. Ladizinsky, N. Ladizinsky, T. Oh, I. Per-minov, C. Rich, M. C. Thom, E. Tolkacheva, C. J. S. Truncik, S. Uchaikin,J. Wang, B. Wilson, and G. Rose, “Quantum annealing with manufacturedspins,” Nature, vol. 473, no. 7346, pp. 194–198, May 2011.[55] S. Morita and H. Nishimori, “Mathematical foundation of quantum anneal-ing,” Journal of Mathematical Physics, vol. 49, no. 12, p. 125210, Dec.2008.[56] W. van Dam, M. Mosca, and U. Vazirani, “How powerful is adiabatic quan-tum computation?” in Proceedings 42nd IEEE Symposium on Foundationsof Computer Science. IEEE, 2001.[57] L. D. Landau, “Zur theorie der energieu¨bertragung. II,” PhysikalischeZeitschrift der Sowjetunion, vol. 2, pp. 46–51, 1932.[58] C. Zener and R. H. Fowler, “Non-adiabatic crossing of energy levels,” Pro-ceedings of the Royal Society of London. Series A, Containing Papers ofa Mathematical and Physical Character, vol. 137, no. 833, pp. 696–702,1932.[59] G. A. Kochenberger, F. Glover, B. Alidaee, and C. Rego, “An unconstrainedquadratic binary programming approach to the vertex coloring problem,”Annals of Operations Research, vol. 139, no. 1, pp. 229–241, Oct. 2005.[60] R. Martonˇa´k, G. E. Santoro, and E. Tosatti, “Quantum annealing of thetraveling-salesman problem,” Physical Review E, vol. 70, no. 5, Nov. 2004.153[61] A. Perdomo-Ortiz, N. Dickson, M. Drew-Brook, G. Rose, and A. Aspuru-Guzik, “Finding low-energy conformations of lattice protein models byquantum annealing,” 2012.[62] A. Lucas, “Ising formulations of many NP problems,” Frontiers in Physics,vol. 2, 2014.[63] S. Jiang, K. A. Britt, A. J. McCaskey, T. S. Humble, and S. Kais, “Quantumannealing for prime factorization,” Scientific Reports, vol. 8, no. 1, Dec.2018.[64] G. Kochenberger, J.-K. Hao, F. Glover, M. Lewis, Z. Lu¨, H. Wang, andY. Wang, “The unconstrained binary quadratic programming problem: asurvey,” Journal of Combinatorial Optimization, vol. 28, no. 1, pp. 58–81,Apr. 2014.[65] M. Anthony, E. Boros, Y. Crama, and A. Gruber, “Quadratic reformulationsof nonlinear binary optimization problems,” Mathematical Programming,vol. 162, no. 1-2, pp. 115–144, Jun. 2016.[66] R. B. Stinchcombe, “Ising model in a transverse field. i. basic theory,” Jour-nal of Physics C: Solid State Physics, vol. 6, no. 15, pp. 2459–2483, aug1973.[67] M. W. Johnson, P. Bunyk, F. Maibaum, E. Tolkacheva, A. J. Berkley, E. M.Chapple, R. Harris, J. Johansson, T. Lanting, I. Perminov, E. Ladizinsky,T. Oh, and G. Rose, “A scalable control system for a superconducting adi-abatic quantum optimization processor,” Superconductor Science and Tech-nology, vol. 23, no. 6, p. 065004, Apr. 2010.[68] T. R. Huff, H. Labidi, M. Rashidi, M. Koleini, R. Achal, M. H. Salomons,and R. A. Wolkow, “Atomic white-out: Enabling atomic circuitry throughmechanically induced bonding of single hydrogen atoms to a silicon sur-face,” ACS Nano, vol. 11, no. 9, pp. 8636–8642, Jul. 2017.[69] N. Pavlicˇek, Z. Majzik, G. Meyer, and L. Gross, “Tip-induced passivationof dangling bonds on hydrogenated si(100)-2 × 1,” Applied Physics Letters,vol. 111, no. 5, p. 053104, Jul. 2017.[70] R. Achal, M. Rashidi, J. Croshaw, D. Churchill, M. Taucer, T. Huff,M. Cloutier, J. Pitters, and R. A. Wolkow, “Lithography for robust and ed-itable atomic-scale silicon devices and memories,” Nature Communications,vol. 9, no. 1, Jul. 2018.154[71] J. W. Lyding, T.-C. Shen, J. S. Hubacek, J. R. Tucker, and G. C. Abeln,“Nanoscale patterning and oxidation of h-passivated si(100)-2×1 surfaceswith an ultrahigh vacuum scanning tunneling microscope,” Applied PhysicsLetters, vol. 64, no. 15, pp. 2010–2012, Apr. 1994.[72] T. C. Shen, C. Wang, G. C. Abeln, J. R. Tucker, J. W. Lyding, P. Avouris, andR. E. Walkup, “Atomic-scale desorption through electronic and vibrationalexcitation mechanisms,” Science, vol. 268, no. 5217, pp. 1590–1592, Jun.1995.[73] E. T. Foley, A. F. Kam, J. W. Lyding, and P. Avouris, “Cryogenic UHV-STMstudy of hydrogen and deuterium desorption from si(100),” Physical ReviewLetters, vol. 80, no. 6, pp. 1336–1339, Feb. 1998.[74] M. Smeu, H. Guo, W. Ji, and R. A. Wolkow, “Electronic properties ofsi(111)-7× 7 and related reconstructions: Density functional theory calcu-lations,” Phys. Rev. B, vol. 85, p. 195315, May 2012.[75] P. Scherpelz and G. Galli, “Optimizing surface defects for atomic-scale elec-tronics: Si dangling bonds,” Phys. Rev. Materials, vol. 1, p. 021602, Jul2017.[76] M. Rashidi, E. Lloyd, T. R. Huff, R. Achal, M. Taucer, J. J. Croshaw, andR. A. Wolkow, “Resolving and tuning carrier capture rates at a single siliconatom gap state,” ACS Nano, vol. 11, no. 11, pp. 11 732–11 738, Nov. 2017.[77] H. Labidi, M. Taucer, M. Rashidi, M. Koleini, L. Livadaru, J. Pitters,M. Cloutier, M. Salomons, and R. A. Wolkow, “Scanning tunneling spec-troscopy reveals a silicon dangling bond charge state transition,” New Jour-nal of Physics, vol. 17, no. 7, p. 073023, jul 2015.[78] T. R. Huff, T. Dienel, M. Rashidi, R. Achal, L. Livadaru, J. Croshaw, andR. A. Wolkow, “Electrostatic landscape of a hydrogen-terminated siliconsurface probed by a moveable quantum dot,” ACS Nano, vol. 13, no. 9, pp.10 566–10 575, 2019, pMID: 31386340.[79] L. Livadaru, P. Xue, Z. Shaterzadeh-Yazdi, G. A. DiLabio, J. Mutus, J. L.Pitters, B. C. Sanders, and R. A. Wolkow, “Corrigendum: Dangling-bondcharge qubit on a silicon surface (2010 new j. phys. 12 083018),” New Jour-nal of Physics, vol. 19, no. 11, p. 119501, Nov. 2017.[80] Z. Shaterzadeh-Yazdi, “Coherence of coupled dangling-bond pairs on thesilicon surface,” Ph.D. dissertation, University of Calgary, 2014.155[81] M. Rashidi, J. A. J. Burgess, M. Taucer, R. Achal, J. L. Pitters, S. Loth, andR. A. Wolkow, “Time-resolved single dopant charge dynamics in silicon,”Nature Communications, vol. 7, no. 1, Oct. 2016.[82] S. S. H. Ng, “Computer-aided design of atomic silicon quantum dotsand computational applications,” Master’s thesis, University of BritishColumbia, 2020.[83] J. L. Pitters, I. A. Dogel, and R. A. Wolkow, “Charge control of surfacedangling bonds using nanoscale schottky contacts,” ACS Nano, vol. 5, no. 3,pp. 1984–1989, Feb. 2011.[84] N. Apsley and H. P. Hughes, “Temperature-and field-dependence of hoppingconduction in disordered systems,” Philosophical Magazine, vol. 30, no. 5,pp. 963–972, Nov. 1974.[85] R. A. Marcus, “Electron transfer reactions in chemistry. theory and experi-ment,” Reviews of Modern Physics, vol. 65, no. 3, pp. 599–610, Jul. 1993.[86] Qt C++ framework. The Qt Company.[87] E. P. Blair, M. Liu, and C. S. Lent, “Signal energy in quantum-dot cel-lular automata bit packets,” Journal of Computational and TheoreticalNanoscience, vol. 8, no. 6, pp. 972–982, 2011.[88] Y. Lu, M. Liu, and C. Lent, “Molecular quantum-dot cellular automata:From molecular structure to circuit dynamics,” Journal of applied physics,vol. 102, no. 3, p. 034311, 2007.[89] E. Blair, “Electric-field inputs for molecular quantum-dot cellular automatacircuits,” IEEE Trans. Nanotechnol., vol. 18, p. 453–460, May 2019.[90] F. Karim, K. Walus, and A. Ivanov, “Analysis of field-driven clocking formolecular quantum-dot cellular automata based circuits,” Journal of com-putational electronics, vol. 9, no. 1, pp. 16–30, 2010.[91] S. S. Pidaparthi and C. S. Lent, “Exponentially adiabatic switching inquantum-dot cellular automata,” Journal of Low Power Electronics and Ap-plications, vol. 8, no. 3, 2018.[92] S. Morita, “Faster annealing schedules for quantum annealing,” Journal ofthe Physical Society of Japan, vol. 76, no. 10, p. 104001, Oct. 2007.156[93] S. S. Pidaparthi and C. S. Lent, “Exponentially adiabatic switching inquantum-dot cellular automata,” Journal of Low Power Electronics and Ap-plications, vol. 8, no. 3, 2018.[94] J. Dziarmaga, “Dynamics of a quantum phase transition: exact solution ofthe quantum ising model.” Physical Review Letters, vol. 95 24, p. 245701,2005.[95] M. Taucer, F. Karim, K. Walus, and R. A. Wolkow, “Consequences of many-cell correlations in clocked quantum-dot cellular automata,” IEEE Transac-tions on Nanotechnology, vol. 14, no. 4, pp. 638–647, July 2015.[96] F. Karim and K. Walus, “Calculating the steady-state polarizations of quan-tum cellular automata (qca) circuits,” Journal of Computational Electronics,vol. 13, no. 3, pp. 569–584, Sep 2014.[97] E. P. Blair and C. S. Lent, “Environmental decoherence stabilizes quantum-dot cellular automata,” Journal of Applied Physics, vol. 113, no. 12, p.124302, 2013.[98] R. Harris, M. W. Johnson, T. Lanting, A. J. Berkley, J. Johansson, P. Bunyk,E. Tolkacheva, E. Ladizinsky, N. Ladizinsky, T. Oh, F. Cioata, I. Perminov,P. Spear, C. Enderud, C. Rich, S. Uchaikin, M. C. Thom, E. M. Chapple,J. Wang, B. Wilson, M. H. S. Amin, N. Dickson, K. Karimi, B. Macready,C. J. S. Truncik, and G. Rose, “Experimental investigation of an eight-qubitunit cell in a superconducting optimization processor,” Physical Review B,vol. 82, no. 2, Jul. 2010.[99] N. Dattani, S. Szalay, and N. Chancellor, “Pegasus: The second connectivitygraph for large-scale quantum annealing hardware,” 2019.[100] K. Boothby, P. Bunyk, J. Raymond, and A. Roy, “Next-generation topologyof d-wave quantum processors,” 2020.[101] N. Robertson and P. Seymour, “Graph minors .XIII. the disjoint paths prob-lem,” Journal of Combinatorial Theory, Series B, vol. 63, no. 1, pp. 65–110,Jan. 1995.[102] V. Choi, “Minor-embedding in adiabatic quantum computation: I. the pa-rameter setting problem,” Quantum Information Processing, vol. 7, no. 5,pp. 193–209, Sep. 2008.157[103] J. Marshall, E. G. Rieffel, and I. Hen, “Thermalization, freeze-out, andnoise: Deciphering experimental quantum annealers,” Physical Review Ap-plied, vol. 8, no. 6, Dec. 2017.[104] V. Choi, “Minor-embedding in adiabatic quantum computation: II. minor-universal graph design,” Quantum Information Processing, vol. 10, no. 3,pp. 343–353, Oct. 2010.[105] J. Cai, W. G. Macready, and A. Roy, “A practical heuristic for finding graphminors,” 2014.[106] A. Fijany and B. N. Toomarian, “New design for quantum dots cellularautomata to obtain fault tolerant logic gates,” Journal of Nanoparticle Re-search, vol. 3, no. 1, pp. 27–37, 2001.[107] C. S. Lent, K. W. Henderson, S. A. Kandel, S. A. Corcelli, G. L. Snider,A. O. Orlov, P. M. Kogge, M. T. Niemier, R. C. Brown, J. A. Christie, N. A.Wasio, R. C. Quardokus, R. P. Forrest, J. P. Peterson, A. Silski, D. A. Turner,E. P. Blair, and Y. Lu, “Molecular cellular networks: A non von neumannarchitecture for molecular electronics,” in 2016 IEEE International Confer-ence on Rebooting Computing (ICRC). IEEE, Oct. 2016.[108] M. Balali, A. Rezai, H. Balali, F. Rabiei, and S. Emadi, “A novel designof 5-input majority gate in quantum-dot cellular automata technology,” in2017 IEEE Symposium on Computer Applications Industrial Electronics(ISCAIE), 2017, pp. 13–16.[109] L. McMurchie and C. Ebeling, “PathFinder,” in Proceedings of the 1995ACM third international symposium on Field-programmable gate arrays -FPGA '95. ACM Press, 1995.[110] J. P. Pinilla and S. J. E. Wilton, “Layout-aware embedding for quantum an-nealing processors,” in High Performance Computing, M. Weiland, G. Juck-eland, C. Trinitis, and P. Sadayappan, Eds. Cham: Springer InternationalPublishing, 2019, pp. 121–139.[111] N. G. Dickson, M. W. Johnson, M. H. Amin, R. Harris, F. Altomare, A. J.Berkley, P. Bunyk, J. Cai, E. M. Chapple, P. Chavez, F. Cioata, T. Cirip,P. deBuen, M. Drew-Brook, C. Enderud, S. Gildert, F. Hamze, J. P. Hilton,E. Hoskinson, K. Karimi, E. Ladizinsky, N. Ladizinsky, T. Lanting, T. Ma-hon, R. Neufeld, T. Oh, I. Perminov, C. Petroff, A. Przybysz, C. Rich,P. Spear, A. Tcaciuc, M. C. Thom, E. Tolkacheva, S. Uchaikin, J. Wang,158A. B. Wilson, Z. Merali, and G. Rose, “Thermally assisted quantum an-nealing of a 16-qubit problem,” Nature Communications, vol. 4, no. 1, May2013.[112] P. Weinberg, M. Tylutki, J. M. Ro¨nkko¨, J. Westerholm, J. A. A˚stro¨m, P. Man-ninen, P. To¨rma¨, and A. W. Sandvik, “Scaling and diabatic effects in quan-tum annealing with a d-wave device,” Physical Review Letters, vol. 124,no. 9, Mar. 2020.[113] J. Strecka and M. Jasˇcˇur, “A brief account of the ising and ising-like models:Mean-field, effective-field and exact results,” Acta Physica Slovaca, vol. 65,p. 235, 08 2015.[114] O. Goldschmidt and D. S. Hochbaum, “A polynomial algorithm for the k-cutproblem for fixed k,” Mathematics of Operations Research, vol. 19, no. 1,pp. 24–37, 1994.[115] R. G. Downey, V. Estivill-Castro, M. Fellows, E. Prieto, and F. A.Rosamund, “Cutting up is hard to do: The parameterised complexity ofk-cut and related problems,” Electronic Notes in Theoretical Computer Sci-ence, vol. 78, pp. 209 – 222, 2003, CATS’03, Computing: the AustralasianTheory Symposium.[116] G. Karypis and V. Kumar, “A fast and high quality multilevel scheme for par-titioning irregular graphs,” SIAM Journal on scientific Computing, vol. 20,no. 1, pp. 359–392, 1998.[117] P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cour-napeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, S. J. van derWalt, M. Brett, J. Wilson, K. Jarrod Millman, N. Mayorov, A. R. J. Nel-son, E. Jones, R. Kern, E. Larson, C. Carey, I˙. Polat, Y. Feng, E. W. Moore,J. Vand erPlas, D. Laxalde, J. Perktold, R. Cimrman, I. Henriksen, E. A.Quintero, C. R. Harris, A. M. Archibald, A. H. Ribeiro, F. Pedregosa, P. vanMulbregt, and SciPy 1.0 Contributors, “SciPy 1.0: Fundamental Algorithmsfor Scientific Computing in Python,” Nature Methods, 2020.[118] R. B. Lehoucq, D. C. Sorensen, and C. Yang, ARPACK: Solution of LargeScale Eigenvalue Problems by Implicitly Restarted Arnoldi Methods, Avail-able from, 1997.[119] I. Ojalvo and M. Newman, “Vibration modes of large structures by an auto-matic matrix-reduction method,” Aiaa Journal - AIAA J, vol. 8, pp. 1234–1239, July 1970.159[120] S. R. White, “Density matrix formulation for quantum renormalizationgroups,” Physical Review Letters, vol. 69, no. 19, p. 2863, 1992.[121] S. R. White, “Density-matrix algorithms for quantum renormalizationgroups,” Phys. Rev. B, vol. 48, pp. 10 345–10 356, Oct 1993.[122] ITensor Library (version 3.1.3) AAlgorithm DetailsA.1 SiDB HoppingDynamics EngineThe in-text discussion for HoppingDynamics related the dynamics to hopping rates,νi j. In practise, it can be inefficient to run simulations by time-stepping using therate equations: in some cases the rates can be unchanged for relatively large peri-ods of time; in others, multiple hopping events can occur within a single time stepand it can be tricky to determine the correct final state. Here we will discuss a moreappropriate interpretation of the hopping model, as well as some additional detailson limiting complexity.A.1.1 Hopping IntervalsRather than calculating hopping probabilities in time steps pi j(t;dt) = νi j(t)dt,we can reinterpret the hopping model in terms of hopping intervals, τi. Theseintervals define the amount of time before an electron at a DB- site performs a hop,and satisfy P(τi ≤ t) = 1− exp(−∫ t0 ν¯i(t ′)dt ′), where ν¯i(t) = ∑ j 6=i νi j(t)+ νi,B(t)is the net outgoing hopping rate. Here we include only SiDB-SiDB rates and thesurface-bulk hopping; however, any additional channels can be included. It is notsensible to generate a hopping interval as a random variable whose distributiondepends on future values of the tunneling rates. To resolve this, we generate asimple exponential random variable τ¯i ∼ Exp(λ = 1) which satisfies P(τ¯i ≤ t) =1− exp(−t), define a depletion rate ddt τ¯i(t) =−ν¯i(t), and set the hopping interval161τi as the time for τ¯i to reach 0. Then τ¯i =∫ τi0 ν¯i(t′)dt ′ and P(τi ≤ t) ≡ P(τ¯i ≤∫ t0 ν¯i(t ′)dt ′) = 1− exp(−∫ t0 ν¯i(t ′)dt ′) as required. This is a useful trick, as we cangenerate τ¯i every time a charge arrives at a new site, and τ¯i will reach zero over ahopping interval consistent with the future rates.This approach is particularly useful when there are no time-varying externalfields V exti (t). In this case, the hopping rates only change when there is a recon-figuration of charge in the surface, or a surface-bulk hop. The next event that willhappen will occur in ∆τ = mini τ¯i/ν¯i seconds. We can advance the system by thisamount, decrement the τ¯i by ν¯i∆τ , determine and perform the appropriate hoppingevent (see next section), update the νi j, and continue. In generating Fig. 4.7, we donot time step over the full 18 minutes of simulated hopping events, we simply cal-culate all hopping events that will occur over that interval, compare that list againstthe times when the tip will be over the SiDBs, and read off the charge states. Ifdynamic fields are used, we step time in increments of ∆t over which we assumethe fields to be constant. Within each increment, we compute all hopping events.A.1.2 Event SelectionAfter τi seconds, the charge at site i hops to some target j. We set the probability ofa given target being selected to be proportional to the corresponding tunneling rate:Pi→ j(t) = νi j(t)/ν¯i(t). The target j here can also be the bulk for a depopulationevent.A.1.3 CohoppingIn certain circumstances, lower energy configurations exist which, if only singlecharge hops are allowed, would require passing through a higher energy configu-ration to reach. In this case, it may be justified to allow for cohopping, where twocharges hop simultaneously. An example and schematic of cohopping is shown inFig. A.1. In the current version of HoppingDynamics, cohopping is used primarilyas a mechanism for escaping high energy metastable configurations, rather thantaken as physically justified. For a cohopping event from sites (i, j) to (k, l), weassign a rateνi j→kl = ν0e−αCH ri jχi j→klη(∆Gi j→kl) (A.1)162(a) Metastable Configurations1 24 31 24 31 24 3(b) Hopping TrajectoriesFigure A.1: Cohopping schematic. In (a), we show two metastable configurationsfor a biased pair of SiDB QCA cells. In order to reach the lower energy state on thebottom, both of the charges in the right cell must hop. Using only single hoppingevents, the least expensive way trajectory is to first hop the rightmost charge atsite 3. Due to the repulsion from the other charge, this would increase the systemenergy. If cohopping is enabled, we can hop both charges simultaneously withouthaving to overcome the temporary barriers.where ν0 is the calibrated pre-factor from Eq. (4.6), αCH is a cohopping spatialattenuation factor which restricts cohopping to nearby sites, χi j→kl describes thedependence on the distances between the involved sites, and η defines the energydependence for the chosen hopping model with ∆Gi j→kl = ∆Ei j→kl +ρi +ρ j. Wedefine χi j→kl in terms of the spatial components of the single charge hopping rates:χi j where νi j = ν0χi jη(∆Gi j):χi j→kl = 12(χikχ jl +χilχ jk), (A.2)We can understand a cohopping event from i j→ kl as either of two trajectories:i→ k and j→ l, or i→ l and j→ k. If we assume two hopping events to be simul-taneous but independent, the probability would satisfy pi→k, j→l(t;dt) = νikν jldtdtwith a spatial component χikχ jl . Our choice in Eq. (A.2) is simply the average overboth trajectories.163A.1.4 Restricted Range HoppingThere are potentially O(N2) possible νi j and O(N4) possible νi j→kl . Realistically,hopping events between sufficiently distant SiDBs are unlikely to ever contribute.Similarly for cohopping, two charges which are far apart are unlikely to be suf-ficiently correlated to justify them as candidates for cohopping. We define twodistance bounds: the maximum hopping distance Rhop, and a maximum cohoppingpair separation RCH . For each DB− site i, we only keep track of νi j for DB0 sitesj within a radius of Rhop. For each DB− pair (i, j) with ri j < RCH we track νi j→klfor DB0 sites satisfying rik < Rhop and r jl < Rhop. For all results shown, we useRhop = RCH = 5nm.A.2 Dense Placement AlgorithmThe Dense Placement algorithm uses a simultaneous “place-and-route” approachcommonly used in FPGA design [109]. It attempts to place every node in the QCAconnectivity graph onto a corresponding assigned qubit in the Chimera graph. In-teractions between cells are facilitated either by direct coupling or by chains ofqubits routed between these assigned qubits.Initial SeedWe begin by selecting an initial seed, the first QCA cell and its correspondingqubit, as follows. We select QCA cell i with probability proportional to Api whereAi is the number of neighbours of the cell and p is a constant. In this paper, p wasselected to be 3 to prioritize high adjacency cells but still allow some chance forlower adjacency cells. To select the seed qubit, we first select a tile according toa Gaussian distribution about the center of the QPU with width σ . The qubits ofthat tile are then randomly ordered and iterated through until a qubit is found withenough neighbours to facilitate the interactions of the selected seed cell. If no suchqubit is found, a new tile is selected. In this work, σ was chosen to be 1 to stronglybias initial placement near the center of the Chimera graph.164Cell PlacementBuilding from the initial seed, for each iteration of the main loop we look at the setof unplaced cells adjacent to the set of placed cells and sort by decreasing numberof unplaced connections. For each cell to be placed, we aim to select a qubitwith sufficiently many available neighbouring qubits to facilitate all interactions.Such a qubit is referred to as suitable. The chosen qubit is that which is bothsuitable and yields the shortest total path to all qubits assigned to adjacent cells.This is achieved by simultaneously running a lowest-cost-path search from eachsuch qubit until a suitable qubit is reached by all search trees. The cost of each edgein the search contains two components: a contribution from the type of couplingto the new node (internal or external to a tile), and an edge proximity contributionwhich weighs nodes in tiles closer to the edge of the processor more strongly. Fordense placement, internal tile placements are preferred so internal couplers havelower cost than external couplers. The external coupler cost was chosen to be 1.9times the internal coupler cost such that one external coupler was preferable totwo internal couplers. The edge proximity cost increases linearly as the minimumdistance to the processor edge decreases. An increase in cost of 0.5 per tile wasused. The chosen parameters tightly pack long chains of low adjacency cells andreduce the risk of running out of room for embedding.RoutingThe routing algorithm is based on negotiated congestion, a well-established pathfinding technique used in mapping netlists to FPGAs [109]. This method aims tominimize the path length for each route by use of a cost scheme that penalizes theusing a qubit for more than one path.Seam OpeningIn the case that there is no suitable qubit available, the translational symmetry of theChimera graph allows an avenue of qubits and couplers to be “opened” by shiftingall qubits after or before a row or column. We interpret this as opening a seambetween two rows or columns. As specific qubits or couplers may be inactive dueto manufacturing yield, some shifted qubits may require remapping. Such qubits165(a) Serial adder QCA cell layout. Drivercells are shaded.(b) Sample EmbeddingFigure A.2: Example embedding of a serial adder QCA circuit. Here qubits arerepresented by line segments with active couplers shown by black circles. Darkershaded qubits indicate those assigned to QCA cells with lighter shaded qubits in-dicating virtual qubits. Only tiles with used qubits are shown.are said to “conflict.” Following such a shift, remapping is achieved by findingnew placements for cells with conflicting assigned qubits and rerouting conflictingpaths. The new cell placements can result in recursive calls to seam opening andremapping. We select the seam to open according to a simple metric. For the cellbeing placed, we consider all seams adjacent to assigned qubits of the cell’s placedneighbours. For each of these seams and each opening direction, we check thatthere is a free row or column to shift into. If so, we assign to that seam a cost, C,given byC = cqnq+ cpnp+ cdd, (A.3)where nq and np are the number of conflicting assigned qubits and paths, d is theseam distance defined as the minimum distance between the seam and the meanlocation of all assigned qubits, and cq, cp, and cd are the relevant cost factors. Inthis work, the cost factors were chosen as cq = 3, cp = 4, and cd = 3. Typically,fewer qubits were needed for replacing a small number of qubits than rerouting anumber of paths.1660 2 4 6 8 10 12Chain/Model Size10-310-210-1100Occurrence probabilityDense PlacementHeuristicFigure A.3: Occurrence probability of qubit chain lengths and vertex-model sizesfor the Dense Placement and Heuristic algorithms respectively for a sample trial ofthe serial adder with full adjacency. Chain lengths are measured by the number ofcouplers used with models measured by the number of qubits. Due to the differentinterpretations of embedding used by the two algorithms, the distributions repre-sent how each model allocates virtual qubits. Note the occurrence of long qubitchains for the Dense Placement algorithm.The Dense Placement algorithm is good at keeping most routes between mappedqubits to only one or two couplers. However, as the placement builds outwardsfrom the center and does not currently account for how the final cells to be placedare connected, there are often long chains of ten or more qubits about the perimeterof the embedding needed to facilitate the final connections. The QCA cell layoutand a sample embedding for a serial adder are shown in Fig. A.2 with the distribu-tion of qubit chain lengths shown in Fig. A.3. Note the long chains that are aroundthe perimeter as discussed.A.2.1 Parameter Assignment and Model ConversionBetween each pair of qubits assigned to adjacent QCA cells there can exist a chainof virtual qubits. The interaction term Ji j between the two cells can be assigned toany coupler along this chain without changing the effective interaction between theassigned qubits, given that all other connectors are assigned Jc = −1. Parameterassignment then takes the form of finding an effective vertex model for each cell by167chain kFigure A.4: Schematic of a chain between two end nodes. Squares and circlesrepresent assigned and virtual qubits respectively. The filled squares are the endnodes terminating chain k and the unfilled squares are internal nodes within chaink. Dotted boundaries indicate the end-node vertex models.deciding where along each virtual qubit chain to assign the interaction term. Oncethe vertex models are found, we can proceed as above to assign bias and interactionparameters.In [6], we argue that the energy of the first excited state in a system of twointeraction wire, serving as analogs for vertex models, is maximized if the wiresare of similar size. We assume then that be should aim to minimize the largestvertex model in our embedding. One can formulate the allocation of virtual qubitswith the aim to minimize the maximum of all vertex-model sizes as a linear op-timization problem. First, to maintain the correct number of connectors betweenvertex models observe that any cell with greater or less than 2 neighbours mustcontain in its vertex model its assigned qubit. It is then useful to reinterpret theQCA circuit connectivity graph as a set of “end node” (Ai 6= 2) cells connectedby chains of “internal node” (Ai = 2) cells. The Dense Placement algorithm theneffectively constructs chains containing virtual and internal nodes between the endnodes. One such chain is illustrated in Fig. A.4. Note that the internal nodes couldbe placed anywhere along these chains without changing the embedding as long astheir order is maintained. Qubit allocation then reduces to determining the numberof qubits at the end of each chain assigned to the vertex models of the end nodes.The remaining qubits of each chain are divided evenly between the internal-nodecells. Only chains containing virtual qubits need to be considered.The kth chain contains Nk internal nodes and total of Mk qubits, excluding theend nodes. Of these qubits, nk and mk qubits are assigned to the vertex models of168the end nodes at the beginning and end of the chain respectively. Then the kth chainhas an average internal-node vertex-model size, s¯k, ofs¯k =Mk−nk−mkNk. (A.4)If S` is the sum of all ni, m j associated with the `th end node, then each end nodehas a vertex-model size, s` ofs` = 1+S`. (A.5)The linear programming problem is then given as minimizing the maximum of theset of all s¯k and s` with constraints ni,mi ≥ 0∀i and ni+mi ≤Mi−Ni ∀i. This is aninteger programming problem and hence is potentially computationally intensive;however, the number and allowed range of parameters is typically small so thiswas not a concern. If necessary, linear programming (LP) relaxation can be used,removing the integer constraint on ni and mi, with some consideration to roundingin post-processing. This method optimizes the maximum vertex-model size butdoes not consider the distribution of the remaining vertex models. Better resultscould be obtained from a nonlinear optimization which considers all vertex-modelsizes.A.3 Component Mode SolverHere we discuss a number of details relevant to the implementation and applicationof the CM solver for the results in this work.Initial ConditionsAs a variation approach, the CM solver cannot guarantee convergence to the opti-mal ground state from some arbitrary initial condition. The solver is best applied asa means of tracking the low energy spectrum of a QCA network as some parameteris continuous changed. This is appropriate for studying QCA clocking, in whichthe mean fields in the depolarized regime are simple: mi ≈ 0. Assuming that themi change sufficiently continuously as the clocking fields are adjusted, the solveris provided with a good estimate.169Selecting the Reduced BasisIn general, the reduced basis representations of the σˆ iz are dense, and thus we needto keep d , the dimension of H , reasonable if we want to have fast diagonaliza-tion. In particular, we define a maximum dimension, dthresh, allowed for H , chosenbased on constraints on available resources and the desired trade-off between accu-racy and speed. We then require that d = Πndn ≤ dthresh. The remaining questionis choosing how many states from each component to included. We employ an ap-proach which effectively translates dthresh to a temperature constraint. If the densityoperator is taken to be a Boltzmann distribution, the energy eigenstates contributewith weight pn ∝ e−β (En−E0). We might consider the important eigenstates to bethose which satisfy pn ≥ p∗ or equivalently En−E0 ≤ kBT ∗ for some temperaturebound T ∗. We add component eigenstates to the basis in order of increasing energyafter subtracting off the component ground state energies. States are added untilthe next addition would push d beyond dthresh. For all results shown, a value ofdthresh = 5000 was used.Order of Mean Field UpdatesThere are a number of schemes that could be considered for when to update themean fields during the self-consistency and recursive steps. In the case whererecursion is not necessary, we update the mean fields upon a new estimate of the fullsystem ground state. However, when each component is solved during recursionwe obtain a new and potentially better estimate of the fields for that component. Wecould self-consistently solve each component before constructing its reduced basis;however, this makes the self-consistency loop potentially exponential in the depthof the decomposition tree. In addition, despite a significant increase in runtime, wehave not observed an cases where this method gave different results than simplywaiting for the full system ground state. In this work, convergence was achievedwhen maxi |∆mi| ≤ 0.001 and typically only one or two iterations were required ateach value of γ .170Sub-DecompositionFurther decompositions of the components were done automatically using METISwith weights wi j = |Eki j|. At each level of the decomposition tree, componentslarger than 16 cells were split into two connected sub-components with the largercontaining at most 2/3 of the cells: ε = 1/3.Decomposition UpdatesOptimal decomposition is a question both of identifying the set of minima usedin the dynamic objective function as well as accurately computing the Mi jzz . Inexperimenting with the CM solver, we found that the accuracy of the first excitedstate is largely dependent on the choice of decomposition. Roughly speaking, thepoorer the choice in components, the more component modes are needed and thuseither the slower or less accurate the result. However, unless the initial choice isparticularly poor, the ground state itself can be computed with speed and accuracycomparable to DMRG. In addition, even a rough estimate of the locations of theminima seems to be suitable for identifying good decompositions.Note on ComplexityFor a given choice of accuracy parameters, both the CM solver and DMRG re-quire O(N) diagonalization steps per iteration. We should expect then that bothhave runtime complexity which scale comparably with the number of devices. Thechallenge in characterizing complexity comes from understanding how the neededaccuracy parameters depend on the complexity of the network. Further investiga-tion is needed to address this. As an example, the considered XOR gate resultsincluded 200 γ values, and took about 2-3 minutes for both the CM solver andthe DMRG implementation on a single core: roughly one second per update. Nei-ther our Python implementation of the CM solver nor our choice in parameters forITensor’s DMRG were particularly optimized.171Appendix BAdditional DerivationsB.1 Analytical Estimate of the 3-State QCA Cell GroundState EnergyWe give exact expressions for the population and polarization of a 3-state QCA cellin Eq. (5.12) given the ground state energy E0 of the formE0 = 12(C−|h|)− 12√α2+(C+ |h|)2We solve the for the ground state exactly in a few limits:E0(γ,0,C) =12C− 12√8γ2+C2 (B.1a)E0(γ,h,0) =−√2γ2+h2 (B.1b)lim|h|→∞E0(γ,h,C) =12(C−|h|)− 12√4γ2+(C+ |h|)2 (B.1c)from which we obtain a suitable form for α:α2(γ,h,C) = 4γ[2+u2−u√2+u2]172with u(γ,h,0) = |h|/γ. After numerical experimentation, we obtain a better approx-imation by noting that u(γ,h,C) nearly satisfiesu(γ,h,C) = u(γ,h,0)/ f (C/γ) (B.2)where f (x) is approximately the hyperbola√1+ x2/9− x/3.B.2 Clocking Field Range for 3-State QCA WiresIn a completely depopulated wire, the first device to populate is adjacent to thefully polarized driver cell and has an activation energy ofCaloEk0= ξ − 12 (B.3)This defines a lower limit on the clocking field. For a fully populated wire, takento be infinite, a cell in the middle experiences maximum congestion and has anactivation energy ofCahiEk0=2ξ ·ζ (q,αs)− (1− 1/2p−1)ζ (p), Inverting2ξ ·ζ (q,αs)−ζ (p), Non-Inverting (B.4)where ζ (q,α) = α−1 Liq(α) with Liq(α) = ∑k αk/kq the polylogarithm functionwhich satisfies Liq(1) = ζ (q) and Li1(α) = − ln(1−α). Note that the activationenergy is slightly higher for an inverting wire as the contributions to the polariza-tion bias do not all add. If ξ is sufficiently small, it is possible for the polarizationbias to have a larger contribution to the activation energy than the congestion, re-sulting in a negative activation energy. We observe in this case that the last cell inthe wire experiences half of the congestion and polarization bias of a cell in themiddle, and hence has half the activation energy. In this special case, we shouldincrease the negative Cahi to half of its value. These cases account for all resultsshown in this work.173B.3 Y Spin Invariance in the 2-State ApproximationIn the 2-state approximation, we are free to choose the “up” conventions for eachof our spin directions. In particular, we are free to select which of the eigenspinors1√2[1−i]and 1√2[1i]in the z-spin basis correspond to |+y〉. Swapping between thetwo y-spin conventions corresponds to taking the complex conjugate of all spinoperators: (σˆx, σˆy, σˆz)→ (σˆx,−σˆy, σˆz). Further, any state |ψ〉 in one conventionbecomes its conjugate |ψ〉∗ in the other. From these results, we make two importantobservations. (1) As there are no σˆy terms in the Hamiltonian, Hˆ given in Eq. (2.3),an energy eigenpair (E, |ψ〉) of Hˆ is also an eigenpair of Hˆ∗. It follows that in thez-spin basis all energy eigenstates have real coefficients: |ψ〉 = |ψ〉∗. (2) For anyproduct Oˆ of σˆ iµ operators, 〈ψ|Oˆ|ψ〉= (−1)ny〈ψ∗|Oˆ|ψ∗〉 with ny the number of σˆ iyterms. For energy eigenstates, we then have 〈Oˆ〉= (−1)ny〈Oˆ〉. The expected valueof any product of Pauli spin operators with any energy eigenstate is zero if thereare an number of σˆy terms. All out considered steady states are of the form:〈Oˆ〉ss = Tr ρˆssOˆ =∑krk〈φk|Oˆ|φk〉 (B.5)for real rk and energy eigenstates |φk〉. Hence 〈Oˆ〉ss = 0 for all Oˆ containing an oddnumber of σˆy terms.174Appendix CSupplemental DetailsC.1 Pegasus TopologyThe Pegasus topology is effectively three nested Chimera graphs with added edges.Four additional couplers are added within each 8-qubit tile, shown in Fig. 7.2.In Fig. C.1, we show how the tiles in Pegasus are connected. Tiles within aChimera sub-graph have the same connections as in Chimera. We can call thisintra-Chimera coupling. Tiles in different sub-graphs are connected in a triangularlattice. This we call inter-Chimera coupling. In Fig. C.2, we show the specificinter-Chimera couplings between qubits in each tile. In total, each qubit has 5internal couplers within each tile, 2 intra-Chimera couplers, and 8 inter-Chimeracouplers.C.2 Additional Embedding ResultsC.2.1 Embedding TimeFor such a specific class of embedding problems, it is not meaningful to discusstheoretical worst-cases for time complexity. Instead, we consider only a best-fit tothe average run times for the generated circuits. More specifically, we quantize thealgorithm run times by the value of the fitted model at the size µ where the singletrial embedding success probability is 50%: see Fig. 7.12. In Fig. C.3, we show the175(a) Intra-Chimera Coupling (b) Inter-Chimera CouplingFigure C.1: Tile connectivity of the Pegasus topology. The tiles can be dividesinto three nested Chimera graphs, indicated by the colors. All the Chimera-likecouplers between tiles in each Chimera sub-graph still exist. Coupling betweensub-graphs can be described by a triangular lattice.Figure C.2: Connectivity between qubits in different Chimera sub-graphs. Thougheach qubit has 8 inter-Chimera couplings, the distribution depends on the choiceof qubit in the tile.1760. (s)Full AdjacencyLimited Adjacency0 100 20050 150 250Number of Cells(a) Dense Placement051015202530Run-time (s)Full AdjacencyLimited Adjacency0 50 100 150 200 250010203040500 100 20050 150 250Number of Cells(b) MinorminerFigure C.3: Average run times for generated circuits on the 512 qubit processorfor both algorithms with best fits. The inset of (b) shows the maximum run timesfor bins of 20 cells with a power fit.average run times for the generated circuits for both algorithms and both adjacencytypes. Run times for the Dense Placement algorithm are approximately linear withthe circuit size due to the local nature of cell placement. For minorminer, run timeswere best fit by an exponential. Note the significant difference in time scale be-tween the two algorithms, particularly for large circuits. Minorminer is reportedin [105] to have a worst case time complexity of O(nHnGeH(eG + nG log(nG))with n and e the number of nodes and edges in the source graph, H, and thetarget graph, G. In our case H is the QCA circuit connectivity graph and G theChimera graph. Depending on the complexity of a circuit of Ncells cells and therange of included interactions, the number of edges in H is between Ncells−1 and12 Ncells(Ncells− 1). Hence the worst case time complexity of minorminer shouldbe between O(N2cellsN2qbits log(Nqbits)) and O(N3cellsN2qbits log(Nqbits)) for a processorcontaining Nqbits qubits. For a given processor, we should then expect the run timesto have an upper bound of the form tmax(N) ∝ Nα with 2 ≤ α ≤ 3. The inset inFig. C.3b shows the maximum run times within bins of 20 cells and the associatedpower law fits. The trends for limited and full adjacency were found to have pow-ers of 2.21± .16 and 2.73± .21, which lie within expected values. All embeddingswere done using a single core of an Intel Core i7-2670QM processor.1770 2 4 6 8 10 12 14 16Processor Size N=M05010015020025030035050% Circuit Size,DP: LimitedDP: FullMM: LimitedMM: Full(a) Embeddable circuit size2 4 6 8 10 12 14 16Processor Size N=M0510152025303540 DP: LimitedDP: FullMM: LimitedMM: Full2 4 6 8 10 12 14 Runtime,(s)(b) Characteristic runtimeFigure C.4: Metrics for the 50% circuit size and characteristic run time for bothalgorithms and adjacencies as a function of the processor size.Table C.1: Power law fit parameters for the characteristic circuit size, µ , and runtime t(µ).Method Aµ (cells) bµ At (ms) btDP-L 11.6±1.0 1.26±0.04 1.15±0.20 2.85±0.08DP-F 5.5±0.6 1.34±0.06 0.84±0.19 2.91±0.11MM-L 10.3±0.2 1.53±0.01 0.41±0.11 4.89±0.14MM-F 8.0±0.2 1.57±0.02 0.31±0.06 4.95±0.11C.2.2 Scaling PerformanceHere we investigate the trend of the 50P circuit size, µ(N), and the characteristicrun time, t(µ,N), of both algorithms over a range of processor sizes. We consideronly square processors. That is, we assume there are N rows and M = N columnsof 8 qubit tiles. Due to its exponential run-time performance, we only tested mi-norminer up to processors of size N = 10. The Dense Placement algorithm wastested on processors up to size N = 16. The estimated values of µ(N) and associ-ated run times t(µ,N) are shown in Fig. C.4 with power law fits. The fit parametersare included in Table C.1 where µ(N) = AµNbµ and t(µ,N) = AtNbt . The inset inFig. C.4b has the y-range expanded to better see the Dense Placement algorithmperformance. While the power law fits for µ(N) appear accurate up to N = 10,there is a notable fall-off for the Dense Placement algorithm for larger processor1780.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.400.µ/µ0DP: LimitedDP: FullMM: LimitedMM: FullProportion of Qubits DisabledFigure C.5: Relative decrease in the embeddable circuit size as a function of theproportion of qubits disabled. Fits are shown.sizes, likely due to the need to coordinate the discussed long chains for the finalcell placements.C.2.3 Resilience to Processor YieldSo far, we have assumed all of the qubits and couplers in the processor to be avail-able. Here we investigate the performance of both algorithms for generated circuitswith different percentages of the available qubits disabled. A new set of disabledqubits is uniformly generated for each embedding trial for each circuit. Fig. C.5shows the relative µ values for different percentages of disable qubits on a 512qubit processor. Here µ0 indicates the value of µ when no qubits are disabled. Wefit a model of the formµ(ndis) = µ0[1−αnβdis](C.1)where 0 ≤ ndis ≤ 1 is the proportion of qubits disabled. As typical values of ndisare small (∼ 0.05), of importance is the observation that µ(ndis) for the DensePlacement algorithm falls off quickly with ndis. In contrast, minorminer seems tobe much more resilient to yield.179Appendix DEquationsD.1 Gell-Mann Matricesλˆ1 =0 1 01 0 00 0 0 λˆ2 =0 −i 0i 0 00 0 0 λˆ3 =1 0 00 −1 00 0 0λˆ4 =0 0 10 0 01 0 0 λˆ5 =0 0 −i0 0 0i 0 0λˆ6 =0 0 00 0 10 1 0 λˆ7 =0 0 00 0 −i0 i 0 λˆ8 = 1√31 0 00 1 00 0 −2D.2 Equations for All Single and Two Point CorrelationsFor completeness, we include here the i〈[Hˆ, Λˆ]〉 components of the dynamics equa-tion up to two-point correlations.The single-point correlations are easily obtained180asλ ix :−hiλ iy+∑n6=iEkniKinyz (D.1a)λ iy : γiλiz +hiλix−∑n6=iEkniKinxz (D.1b)λ iz :−γiλ iy (D.1c)Making use of the equality, Ki jab = Kjiba, we obtain six two-point correlations withequationsKi jxx :−hiK jixy−h jKi jxy+ ∑n6∈{i j}[EkniKjinxyz +Ekn jKi jnxyz](D.2a)Ki jxy : γ jKi jxz−hiKi jyy+h jKi jxx+ ∑n6∈{i j}[EkniKi jnyyz−Ekn jKi jnxxz](D.2b)Ki jxz :−γ jKi jxy−hiKi jyz+Eki jλ iy+ ∑n6∈{i j}EkniKi jnyzz (D.2c)Ki jyy : γiKjiyz+ γ jKi jyz+hiKi jxy+h jKjixy− ∑n6∈{i j}[EkniKi jnxyz +Ekn jKjinxyz](D.2d)Ki jyz : γiKi jzz− γ jKi jyy+hiKi jxz−Eki jλ ix− ∑n6∈{i j}EkniKi jnxzz (D.2e)Ki jzz :−γiKi jyz− γ jK jiyz (D.2f)181


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items