Theory and application of Eigenvalue independent partitioning in theoretical chemistry 1977
Item Metadata
Title | Theory and application of Eigenvalue independent partitioning in theoretical chemistry |
Creator |
Sabo, David Warren |
Date Created | 2010-02-25 |
Date Issued | 2010-02-25 |
Date | 1977 |
Description | This work concerns the description of eigenvalue independent partitioning theory, and its application to quantum mechanical calculations of interest in chemistry. The basic theory for an m-fold partitioning of a hermitian matrix H (2 ≤ m ≤ n, where n is the dimension of the matrix) is developed in detail, with particular emphasis on the 2 x 2 partitioning, which is the most useful. It consists of the partitioning of the basis space into two subspaces: an n[sub A]-dimensional subspace (n[sub A] ≥ 1), and the complementary n - n[sub A] = n[sub B]-dimensional subspace. Various n[sub A]- (or n[sub B]-) dimensional effective operators, and projections onto n[sub A]- (or n[sub B]-) dimensional eigenspaces of H, are defined in terms of a mapping, f, relating the parts of eigenvectors lying in each of the partitioned subspaces. This mapping is shown to be determined by a simple nonlinear operator equation, which can be solved by iterative methods exactly, or by using a perturbation expansion. Properties of approximate solutions, and various alternative formulas for effective operators, are examined. The theory is developed for use with both orthonormal and non-orthonormal bases. Being a generalization of well known one-dimensional partitioning formalisms, this eigenvalue independent partitioning theory has a number of important areas of application. New and efficient methods are developed for the simultaneous determination of several eigenvalues and eigenvectors of a large hermitian matrix, which are based on the construction and diagonalization of an appropriate effective operator. Perturbation formulas are developed both for effective operators defined in terms of f, and for projections onto whole eigenspaces of H.
The usefulness of these formulas, especially when the zero order states of interest are degenerate, is illustrated by a number of examples, including a formal uncoupling of the four component Dirac hamiltonian to obtain a two component hamiltonian for electrons only, the construction of an effective nuclear spin hamiltonian in ESR theory, and the derivation of perturbation series for the one-particle density matrix in molecular orbital theory (in both Hückel-type and closed shell self-consistent field contexts). A procedure is developed for the direct minimization of the total electronic energy in closed shell self-consistent field theory in terms of the elements of f, which are unconstrained and contain no redundancies. This formalism is extended straightforwardly to the general multi-shell single determinant case. The resulting formulas, along with refinements of the basic conjugate gradient minimization algorithm, which involve the use of scaled variables and frequent basis modification, lead to efficient, rapidly convergent methods for the determination of stationary values of the electronic energy. This is illustrated by some numerical calculations in the closed shell and unrestricted Hartree-Fock cases. |
Subject |
Eigenvalues; Chemistry, Physical and theoretical |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | Eng |
Collection |
Retrospective Theses and Dissertations, 1919-2007 |
Series | UBC Retrospective Theses Digitization Project |
Date Available | 2010-02-25 |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0060960 |
Degree |
Doctor of Philosophy - PhD |
Program |
Chemistry |
Affiliation |
Science, Faculty of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
URI | http://hdl.handle.net/2429/20930 |
Aggregated Source Repository | DSpace |
Digital Resource Original Record | https://open.library.ubc.ca/collections/831/items/1.0060960/source |
Full Text
THEORY AND APPLICATION OF EIGENVALUE INDEPENDENT PARTITIONING IN THEORETICAL CHEMISTRY

by

DAVID WARREN SABO
B.Sc. (Hons.), University of Alberta, 1972

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in the Department of CHEMISTRY

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
July, 1977
© David Warren Sabo, 1977

In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Chemistry
The University of British Columbia
Vancouver, British Columbia, Canada, V6T 1W5

ABSTRACT

This work concerns the description of eigenvalue independent partitioning theory, and its application to quantum mechanical calculations of interest in chemistry. The basic theory for an m-fold partitioning of a hermitian matrix H ($2 \le m \le n$, where n is the dimension of the matrix) is developed in detail, with particular emphasis on the 2 x 2 partitioning, which is the most useful. It consists of the partitioning of the basis space into two subspaces: an $n_A$-dimensional subspace ($n_A \ge 1$), and the complementary $n - n_A = n_B$-dimensional subspace.
Various $n_A$- (or $n_B$-) dimensional effective operators, and projections onto $n_A$- (or $n_B$-) dimensional eigenspaces of H, are defined in terms of a mapping, f, relating the parts of eigenvectors lying in each of the partitioned subspaces. This mapping is shown to be determined by a simple nonlinear operator equation, which can be solved by iterative methods exactly, or by using a perturbation expansion. Properties of approximate solutions, and various alternative formulas for effective operators, are examined. The theory is developed for use with both orthonormal and non-orthonormal bases.

Being a generalization of well known one-dimensional partitioning formalisms, this eigenvalue independent partitioning theory has a number of important areas of application. New and efficient methods are developed for the simultaneous determination of several eigenvalues and eigenvectors of a large hermitian matrix, which are based on the construction and diagonalization of an appropriate effective operator. Perturbation formulas are developed both for effective operators defined in terms of f, and for projections onto whole eigenspaces of H. The usefulness of these formulas, especially when the zero order states of interest are degenerate, is illustrated by a number of examples, including a formal uncoupling of the four component Dirac hamiltonian to obtain a two component hamiltonian for electrons only, the construction of an effective nuclear spin hamiltonian in ESR theory, and the derivation of perturbation series for the one-particle density matrix in molecular orbital theory (in both Hückel-type and closed shell self-consistent field contexts).
A procedure is developed for the direct minimization of the total electronic energy in closed shell self-consistent field theory in terms of the elements of f, which are unconstrained and contain no redundancies. This formalism is extended straightforwardly to the general multi-shell single determinant case. The resulting formulas, along with refinements of the basic conjugate gradient minimization algorithm, which involve the use of scaled variables and frequent basis modification, lead to efficient, rapidly convergent methods for the determination of stationary values of the electronic energy. This is illustrated by some numerical calculations in the closed shell and unrestricted Hartree-Fock cases.

TABLE OF CONTENTS

Abstract
List of Tables
List of Figures
Acknowledgements

Chapter 1  Eigenvalue Independent Partitioning, An Introduction

Chapter 2  2 x 2 Partitioning Theory
  2.1 Basic Theory
    2.1.a The f-operator
    2.1.b The Defining Condition for f
    2.1.c Rederivation From a Projection Point of View
    2.1.d The Relationship Between T and the Eigenprojections: Covariant and Contravariant Representations
    2.1.e Variational Formulation of D(f) = 0
    2.1.f Relation Between o and D(f): Eigenvalue Dispersion
    2.1.g Transformation of f Under a Change of Basis
  2.2 Effective Operators
    2.2.a Basic Definitions
    2.2.b Eigenvectors and Eigenvalues of the Effective Operators
    2.2.c Relationships With Other Formulations
  2.3 Generalization to a Non-orthonormal Basis

Chapter 3  The Effective Hamiltonians: Practical Considerations
  3.1 Alternative Formulas
  3.2 Implications of Inexact Solutions
  3.3 Perturbation Theory for $\hat{H}_A$, $G_A$, and $\tilde{H}_A$
    3.3.a The $\hat{H}_A$ Scheme
    3.3.b The $G_A$ Scheme
    3.3.c The $\tilde{H}_A$ Scheme

Chapter 4  Multiple Partitioning Theory
  4.1 The m x m Partitioning Formalism
    4.1.a Basic Theory
    4.1.b The Defining Condition on the $f_{JI}$
    4.1.c Variational Formulation of the Equations for the $f_{JI}$
    4.1.d Transformation of the $f_{JI}$ Under a Change of Basis
  4.2 Effective Operators
    4.2.a Basic Definitions
    4.2.b Eigenvalues and Eigenvectors of the Effective Operators
  4.3 Generalization to a Non-orthonormal Basis
  4.4 Practical Considerations
    4.4.a Alternative Formulas
    4.4.b Implications of Inexact Solutions

Chapter 5  Exact Determination of T
  5.1 The Calculation of a Few Eigenvalues of a Large Hermitian Matrix
  5.2 2 x 2 Partitioning: Orthonormal Basis
    5.2.a General Considerations
    5.2.b Methods Based on $D^{(1)}(f)$
    5.2.c Methods Based on $D^{(2)}(f)$
    5.2.d Solution of the Newton-Raphson Equations by Descent Methods
    5.2.e Extremizing the Trace
    5.2.f Minimization of the Norm of D
    5.2.g Test Calculations
  5.3 Generalization to a Non-orthonormal Basis: 2 x 2 Case
    5.3.a General Considerations
    5.3.b Methods Based on $G_{BA}$ and $g_{BA}$: A Generalization of Algorithm SDNR
    5.3.c Methods Based on $D^{(1)}(f)$: Generalized Nesbet Procedures
    5.3.d Other Methods
    5.3.e Choice of an Initial Estimate, and Improvement of Convergence Rates
    5.3.f Test Calculations With Overlap
  5.4 Multiple Partitioning

Chapter 6  Perturbation Theory
  6.1 Introduction
  6.2 2 x 2 Partitioning: Orthonormal Basis
    6.2.a General Discussion
    6.2.b A-states Degenerate
    6.2.c A-states Non-degenerate
  6.3 Examples
    6.3.a The Dirac Equation
    6.3.b Derivation of a Spin Hamiltonian: Strong Field Case
  6.4 Non-orthonormal Basis: 2 x 2 Partitioning

Chapter 7  Eigenvalue Independent Partitioning and Molecular Orbital Theory
  7.1 Introduction
  7.2 Perturbation of the Density Matrix: Orthonormal Basis
    7.2.a General Theory
    7.2.b Hückel Molecular Orbital Theory
    7.2.c Numerical Example: Hückel Theory
  7.3 Perturbation of the Density Matrix: Non-orthonormal Basis
    7.3.a General Theory
    7.3.b Extended Hückel Molecular Orbital Theory
  7.4 Self-Consistent Perturbation Theory
    7.4.a General Theory
    7.4.b Coupled Hartree-Fock Perturbation Theory
    7.4.c Uncoupled Hartree-Fock Perturbation Theory

Chapter 8  Direct Minimization Self-consistent Field Theory
  8.1 Introduction
  8.2 Closed Shell Systems
    8.2.a Orthonormal Basis
    8.2.b Non-orthonormal Basis
    8.2.c Results of Test Calculations
  8.3 Unrestricted Hartree-Fock Theory
    8.3.a Energy Derivatives
    8.3.b Test Calculations and Computational Refinements
    8.3.c Use of Scaled Variables
  8.4 Theory for the General Single Determinant Case
    8.4.a The Basic Variables of the Calculation
    8.4.b The Energy Variation and First Derivatives
    8.4.c The Second Derivatives
    8.4.d Incorporation of the Intershell Orthogonality Constraints
    8.4.e Example: The Two Shell System

Bibliography

Appendices
  Appendix 1  Proofs of Alternative Formulas: 2 x 2 Partitioning
  Appendix 2  The 3 x 3 and 4 x 4 Case: Orthonormal Basis
  Appendix 3  Proofs of Alternative Formulas: Multiple Partitioning
  Appendix 4  Description of Algorithms: 2 x 2 Case
  Appendix 5  Rates of Convergence and Asymptotic Error Constants
  Appendix 6  Algorithms for the Determination of T: Multiple Partitioning Case
    A6.1 Methods Based on $D^{(1)}(T) = 0$
    A6.2 Methods Based on $D^{(2)}(T) = 0$
    A6.3 Methods Based on the Simultaneous Solution of $G_{JI}(T) = 0$ and $g_{JI}(T) = 0$
    A6.4 Methods Based on $D(T) = 0$
  Appendix 7  Additional Perturbation Series: Orthonormal Basis
  Appendix 8  Non-relativistic Approximation of the Dirac Hamiltonian
  Appendix 9  Additional Perturbation Series: Non-orthonormal Basis
  Appendix 10  Self-consistent Perturbation Theory When F is not Block Diagonal
  Appendix 11  Minimization Algorithms
    A11.1 Method of Conjugate Gradients
    A11.2 The Newton-Raphson Method
  Appendix 12  Derivatives With Respect to Real and Imaginary Parts of f
  Appendix 13  Covariant and Contravariant Representations: the General Case

List of Tables

Table 5.1 Linear Convergence Rates of the Algorithms in Selected Calculations
Table 5.2 Rates of Convergence
Table 5.3 Rates of Convergence
Table 5.4 Rates of Convergence
Table 6.1 $D^{(n)}(f)$
Table 6.2 ^ ( f )
Table 6.3
Table 6.4
Table 6.5 $H_A^{(n)}$
Table 6.6 $f^{(n)}$ (A-states degenerate)
Table 6.7 $f^{(n)}$ (A-states degenerate)
Table 6.8 $\hat{H}_A^{(n)}$: Reduced Formulas (A-states degenerate)
Table 6.9 (A-states degenerate)
Table 6.10 $G_A^{(n)}$ (A-states degenerate)
Table 6.11 $H_A^{(n)}$ (A-states degenerate)
Table 6.12 $f^{(n)}$
Table 6.13 S{ n )
Table 6.14 c | n )
Table 6.15 $\hat{H}_A^{(n)}$
Table 6.16 $D^{(n)}(f)$: Non-orthonormal Basis
Table 6.17 ftj^ ( f ): Non-orthonormal Basis
Table 6.18 $H_A^{(n)}$: Non-orthonormal Basis
Table 6.19 $G_A^{(n)}$: Non-orthonormal Basis
Table 6.20 $g_A^{(n)}$: Non-orthonormal Basis
Table 6.21 $f^{(n)}$: Non-orthonormal Basis (A-states degenerate)
Table 6.22 $H_A^{(n)}$: Non-orthonormal Basis (A-states degenerate)
Table 6.23 Non-orthonormal Basis (A-states degenerate)
Table 6.24 $H_A^{(n)}$: Non-orthonormal Basis (A-states degenerate)
Table 6.25 $f^{(n)}$: Non-orthonormal Basis
Table 6.26 $H_A$: Non-orthonormal Basis
Table 6.27 $G_A^{(n)}$: Non-orthonormal Basis
Table 6.28 fij^1*: Non-orthonormal Basis
Table 7.1 C^AA*: Molecular Orbital Basis
Table 7.2 ( P ^ ) ^: Molecular Orbital Basis
Table 7.3 $(P_A)_{BB}$: Molecular Orbital Basis
Table 7.4 $E^{(n)}$: Molecular Orbital Basis
Table 7.5 $E^{(n)}$: Molecular Orbital Basis
Table 7.6 ( P A 0 ) ( i ) for the $A_6$ System ( K ^ - 0)
Table 7.7 ( P A 0 ) ( i ) for A N A * A , System ( H ^ = -1)
Table 7.8 (PA0)*1* for an A ^ System (H^ = 0)
Table 7.9 ( P ^ ) ^: Non-orthonormal Basis
Table 7.10 (PJ)^: Non-orthonormal Basis
Table 7.11 (̂BB̂: Non-orthonormal Basis
Table 7.12 $E^{(n)}$: Non-orthonormal Basis
Table 7.13 E ^: Non-orthonormal Basis
Table 8.1 Closed Shell Case: Test Calculations
Table 8.2 Details of Direct Minimization Calculations, CN Molecule (r = 2.0 a.u.)
Table 8.3 Details of Direct Minimization Calculations, CN Molecule (r = 2.2 a.u.)
Table A7.1 g ^™**
Table A7.2 g j ^ ( n )
Table A7.3
Table A7.4 g A ^ ( n i )
Table A7.5 gj^^
Table A7.6 g ^ 1 ^
Table A7.7 H^ n ) in Terms of the g^ n ) and fij^K-
Table A7.8 H*n* in Terms of the g ^ and GJ n *
Table A7.9 G^
Table A8.1 Pauli Hamiltonian (adapted from DeVries (1970))
Table A8.2 $g_A$: Non-relativistic Approximation
Table A8.3 $g_A$: Non-relativistic Approximation
Table A8.4 gj^: Non-relativistic Approximation
Table A8.5 Eriksen Hamiltonian (adapted from DeVries (1970))
Table A8.6 Transformation Connecting $H_{\mathrm{Pauli}}$ and Hg^ (adapted from DeVries (1970))
Table A9.1 Non-orthonormal Basis
Table A9.2 g ^ 1 ^: Non-orthonormal Basis
Table A9.3 g j [ ^ n ^: Non-orthonormal Basis
Table A9.4 Non-orthonormal Basis
Table A9.5 Sj[n^: Non-orthonormal Basis
Table A9.6 Non-orthonormal Basis

List of Figures

Figure 5.1 Algorithm SDNRS
Figure 5.2 Algorithm SDNRS
Figure 7.1 $P_{11}$ vs. $H_{11}$ for the $A_6$ System
Figure 7.2 $P_{12}$ vs. $H_{11}$ for the $A_6$ System
Figure 7.3 Variation of $P_{11}$ with $H_{11}$ for the A^B^ System, H|J* = -1.0
Figure 7.4 $P_{21}$ vs. $H_{11}$ for the A^B^ System, H|J* = -1.0
Figure 7.5 Variation of $P_{11}$ with $H_{11}$ for the A^B^ System, H|J* = 0.0
Figure 7.6 Variation of $P_{21}$ with $H_{11}$ for the A^B^ System, H|J* = 0.0
Figure 8.1 Total electronic energy as a function of iteration number for the CN molecule, bond length = 2.0 a.u.
Figure 8.2 Total electronic energy as a function of iteration number for the CN molecule, bond length = 2.0 a.u.
Figure 8.3 Total electronic energy as a function of iteration number for the CN molecule, bond length = 2.0 a.u.
Figure 8.4 Total electronic energy as a function of iteration number for the CN molecule, bond length = 2.2 a.u.
Figure 8.5 Total electronic energy as a function of iteration number for the CN molecule, bond length = 2.2 a.u.
Figure 8.6 Total electronic energy as a function of iteration number for the CN molecule, bond length = 2.2 a.u.

ACKNOWLEDGEMENTS

I would like to take this opportunity to express my gratitude and sincere thanks to Dr. John A. R.
Coope for his guidance and many helpful suggestions during my time as a graduate student at U.B.C., and especially during the preparation of this thesis. His accessibility and willingness to become involved in my research, and to demonstrate how to communicate the results of that work, have made this a rewarding and productive time.

I would like to thank my wife, Marlene, not only for transcribing the figures and hand drawn tables in this thesis, but also for her support, and her patience, tolerance, and understanding of a husband often obsessed with writing a Ph.D. thesis. My parents also deserve much credit for their support and wise counsel over the many years of my education.

I would like to thank the National Research Council of Canada, the H. R. MacMillan Foundation, and the University of British Columbia, for financial support, without which these studies would not have been possible.

I would also like to acknowledge the contribution of the U.B.C. Computing Center to the more practical aspects of this work. Their extensive program library, and their extensive and powerful hardware facilities have made this part of the work far less painful than could be expected at many other institutions.

CHAPTER 1

EIGENVALUE INDEPENDENT PARTITIONING, AN INTRODUCTION

"The average Ph.D. thesis is nothing but a transference of bones from one graveyard to another." (J. Frank Dobie, A Texan in England, 1945)

Matrix partitioning is a well established technique in linear algebra, and such techniques have been found to be very useful in quantum chemistry. In a series of papers, Löwdin (1968, and references cited therein) has demonstrated the power and generality of a one-dimensional partitioning formalism, which contains, as special cases, many conventional methods used in quantum mechanical calculations.
Through the partitioning of the basis space into two subspaces (a one-dimensional space spanned by a chosen reference function, and the complementary n-1 dimensional space) he obtains an expression for the eigenvalues, $\epsilon_a$, of the matrix H as

$\epsilon_a = \bar{H}_{aa}(\epsilon_a) \equiv H_{aa} + H_{ab}\,(\epsilon_a 1_b - H_{bb})^{-1} H_{ba}$, (1.1)

where $\bar{H}_{aa}$ is a function not only of the elements of H, but also of $\epsilon_a$ itself. Further development of the formalism leads to a variety of perturbation formulas (including, among others, the Rayleigh-Schrödinger and Brillouin-Wigner types), iterative methods for determining a single eigenvalue, formulas for upper and lower bounds to eigenvalues, and many other useful results.

The function $\bar{H}_{aa}(\epsilon)$ in eq. (1.1) can be regarded as a one-dimensional effective operator which depends implicitly on the eigenvalues $\epsilon_a$ of H. A number of attempts have been made to construct effective operators without implicit eigenvalues (see Klein (1974) and references cited therein), one of which is the eigenvalue independent partitioning of Coope (1970), which has some similarities to a non-canonical approach to the construction of effective operators in elementary particle theory, first formulated by Okubo (1954).

This thesis is primarily concerned with the development of this partitioning formalism, and its application in quantum mechanical calculations. The basic theory is described in considerable detail in chapters 2 - 4. In the simplest (2 x 2) case, the basis space is partitioned into two subspaces, an $n_A$-dimensional subspace and the complementary $n - n_A = n_B$ dimensional subspace, where $1 \le n_A \le n-1$, but now the fundamental quantity is taken to be a mapping, f, relating the parts of the eigenvectors lying in these two subspaces. It is possible to define a variety of $n_A$-dimensional (and also $n_B$-dimensional) effective operators in terms of this mapping.
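For contrast with the eigenvalue independent approach just outlined, the implicit eigenvalue dependence of eq. (1.1) can be seen in a small numerical sketch. It assumes the standard form of Löwdin's expression, ε = H_aa + H_ab (ε 1 - H_bb)^(-1) H_ba, applied to a real symmetric 2 x 2 matrix so that every block is a single number. The matrix entries, the function name and the simple fixed-point iteration are illustrative choices, not taken from the thesis.

```python
import math


def lowdin_eigenvalue(h_aa, h_ab, h_bb, eps0, tol=1e-12, max_iter=100):
    """Solve eps = h_aa + h_ab * (eps - h_bb)**-1 * h_ab by fixed-point iteration.

    The right-hand side is the one-dimensional effective operator of eq. (1.1);
    it cannot be evaluated without a value for the eigenvalue eps itself.
    """
    eps = eps0
    for _ in range(max_iter):
        eps_new = h_aa + h_ab * h_ab / (eps - h_bb)
        if abs(eps_new - eps) < tol:
            return eps_new
        eps = eps_new
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical 2 x 2 symmetric matrix [[1.0, 0.5], [0.5, 3.0]].
h_aa, h_ab, h_bb = 1.0, 0.5, 3.0

# Start from the reference diagonal element, as in a perturbation treatment.
eps = lowdin_eigenvalue(h_aa, h_ab, h_bb, eps0=h_aa)

# Lower root of the characteristic polynomial, for comparison.
exact = 2.0 - math.sqrt(5.0) / 2.0
print(eps, exact)
```

The effective operator has to be re-evaluated at every step because it contains the unknown eigenvalue; the mapping f introduced above is designed to remove exactly this dependence.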
The set of eigenvalues of these effective operators forms a subset of the eigenvalues of the matrix H, but the effective operators themselves no longer depend explicitly or implicitly on these eigenvalues. Also, the corresponding eigenvectors of the full matrix H are obtained straightforwardly from those of the effective operators using the mapping f.

Löwdin and Goscinski (1971) are quite correct in pointing out that implicitness of some sort is unavoidable in a partitioning formalism, and that this eigenvalue independent partitioning formalism could be described, in a particular sense, as an eigenvector implicit partitioning. This implicitness is basically a result of the fact that the eigenvalues (and through them, the eigenvectors) of a matrix are nonlinear functions of the elements of the matrix. As indicated by Coope (1970), the one-dimensional partitioning formalism of Löwdin can be obtained as a special case of this eigenvalue independent partitioning formalism when $n_A = 1$ (as is, in fact, also demonstrated, but not emphasized, by Löwdin and Goscinski (1971)).

The adoption of this more general point of view, in which the partitioning theory is formulated in terms of a mapping between the partitioned spaces rather than in terms of the eigenvalues and eigenvectors of the matrix, leads to new and important areas of application. In particular, it is especially suitable when groups of eigenvalues or eigenvectors are to be treated simultaneously. In chapter 2, it is shown that the mapping f can be used to define projections onto whole eigenspaces of H. The condition defining f can be formulated variationally, and is also seen to be related to measures of errors in such eigenprojections. It is also shown that f transforms nonlinearly under a linear transformation of the basis vectors, and that this has important practical implications. The simplest (2 x 2) case of the eigenvalue independent partitioning described above is straightforwardly generalized to partitioning of the basis space (and eigenvector space) into m ($2 \le m \le n$) subspaces, as is demonstrated in chapter 4.

There are two main areas of application of this partitioning formalism. One of them is in the construction of effective operators in $n_A$-dimensional spaces, with $n_A > 1$. For eigenvalues which are well separated from all others, one-dimensional partitioning formalisms, as in eq. (1.1), are useful, but when degeneracy or near degeneracy occurs, these formulas become ill-conditioned. Traditionally, multi-dimensional effective operators have been constructed using a canonical procedure,

$\tilde{H} = U^{\dagger} H U$, (1.2)

requiring the calculation of a unitary transformation, U, which uncouples the desired eigenspace of the operator from the rest of the eigenvector space (for example, Van Vleck perturbation theory (Van Vleck, 1929); also, see Tani (1954) and Klein (1974)). The unitarity of U is commonly ensured by writing it as

$U = e^{iS}$, (1.3)

where S is a hermitian operator. Thus, in obtaining the desired uncoupled operator $\tilde{H}$, one must determine the exponential operator, $e^{iS}$. This can be done straightforwardly using a perturbation formalism when that is appropriate, but it is very difficult, in general, to calculate S exactly otherwise. On the other hand, the mapping f, in this partitioning formalism, is defined by a much simpler, though still nonlinear, equation, which can not only be solved using a perturbation expansion, when appropriate, but can also be solved iteratively in a very
straightforward manner to obtain f to any desired level of accuracy. Methods for the iterative determination of f, and its generalization in a multi-partitioning formalism, are given in chapter 5 and accompanying appendices. The particular application to the calculation of a small number of eigenvalues and eigenvectors of a large hermitian matrix is considered in detail, and test calculations demonstrate the usefulness of this new approach to the problem.

Because of the simple algebraic form of the condition defining f, compared to those defining the operator S in eq. (1.3), perturbation formulas for f and for effective operators defined in terms of f are obtained straightforwardly for arbitrary order, unlike the involved step by step procedure required in the canonical approach. Certain of the more useful series are developed in chapter 6. Two examples are included to demonstrate the scope and ease of use of these formulas. It is shown that a formal uncoupling of the four-component Dirac equation, to obtain a two-component relativistic wave equation for electrons, is obtained by a particularly simple application of the basic formulas derived in the early part of chapter 6. Also, a nuclear spin hamiltonian for the strong field case is derived to second order. In all cases, the presence of degeneracy in zero order is of no concern as long as all degenerate or nearly degenerate levels are treated at the same time.

Another major application of this eigenvalue independent partitioning formalism is in the use of the mapping operators to describe projections onto particular eigenspaces. As shown in chapter 2, projections onto eigenspaces can be written in terms of f in a form which is automatically idempotent and self-adjoint for any value of f. Because the elements of f are required to satisfy only a single simple defining condition, perturbation formulas to arbitrarily high order are again obtained straightforwardly.
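The claim that such a projection is idempotent and self-adjoint for any value of f, not only for the exact solution, can be checked with a short sketch. It uses the block form P_A = [1; f] g_A^(-1) [1, f^T] with g_A = 1 + f^T f, which is eq. (2.10) of chapter 2 specialised to real matrices, for n_A = 2 and n_B = 1. The numerical value of f and the helper names are arbitrary, and exact rational arithmetic is used only so that the checks hold without rounding.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def projection_from_f(f):
    """Build P_A from a real n_B x 2 matrix f (real case of eq. (2.10))."""
    ftf = matmul(transpose(f), f)
    g_a = [[F(int(i == j)) + ftf[i][j] for j in range(2)] for i in range(2)]
    identity = [[F(1), F(0)], [F(0), F(1)]]
    column = identity + f  # block column [1_A; f]
    return matmul(matmul(column, inv2(g_a)), transpose(column))


# An arbitrary f, deliberately not the solution for any particular H.
f = [[F(3), F(-1, 2)]]
P = projection_from_f(f)

is_idempotent = matmul(P, P) == P
is_symmetric = P == transpose(P)
trace = sum(P[i][i] for i in range(3))
print(is_idempotent, is_symmetric, trace)
```

Because the two properties hold identically in f, an approximate f still yields a genuine projection of rank n_A, which is the feature exploited for the density matrix in the applications described next.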
In chapter 7, perturbation formulas for such projections are developed with reference to molecular orbital theory. In particular, perturbation formulas for the density matrix in Hückel, extended Hückel, and closed shell self-consistent field theory are produced.

The density matrix (the projection onto the occupied orbitals) in closed shell self-consistent field theory can be written solely in terms of the operator f corresponding to a partitioning of the eigenvectors of the Fock operator, F, into two sets, consisting, respectively, of the occupied and the unoccupied orbitals, and thus the total electronic energy is completely specified by f. The application of this partitioning formalism in self-consistent field theory represents a generalization of the simple matrix partitioning described above, in that the operator, F, to be brought to block diagonal form, itself depends on the partitioning operator f through its dependence on the density matrix R. Since the matrix elements of f are not constrained in any way and do not contain any redundancy (see section 8.2), they are a particularly suitable set of variables in terms of which to determine the stationary values of the energy directly. The derivatives of the Hartree-Fock energy with respect to these variables are given very compactly using the columns of the density matrix and its complement. This formalism is extended straightforwardly to the general multi-shell single determinant case using the multi-partitioning formalism described in chapter 4. Some numerical calculations in the closed shell and unrestricted Hartree-Fock cases are described in chapter 8, and they indicate that refinements involving the use of scaled variables and the adoption of bases which nearly diagonalize the Fock matrices result in practical procedures which are superior to the Roothaan procedure and to other currently available direct minimization self-consistent field procedures.
CHAPTER 2

PARTITIONING THEORY

"The White Rabbit put on his spectacles. 'Where shall I begin, please your Majesty?' he asked. 'Begin at the beginning,' the King said gravely, 'and go on till you come to the end: then stop.'"
(Alice's Adventures in Wonderland, Lewis Carroll)

2.1 Basic Theory

2.1.a The f-operator

Consider the matrix eigenvalue equation,

$$H X = X \epsilon, \tag{2.1a}$$
$$X^\dagger X = 1, \tag{2.1b}$$

where $H$ is an $n \times n$ hermitian matrix, $X$ is the $n \times n$ unitary matrix whose columns are the orthonormal eigenvectors of $H$, and $\epsilon$ is the $n \times n$ diagonal matrix whose elements are the corresponding real eigenvalues of $H$. Suppose the $n$-dimensional basis set being used is partitioned into two subsets spanning spaces $S_A$ and $S_B$ of dimensions $n_A$ and $n_B$, respectively, and the eigenvectors of $H$ are similarly partitioned into two sets $X^{(A)}$ and $X^{(B)}$, spanning eigenspaces $\mathcal{S}_A$ and $\mathcal{S}_B$, also of dimensions $n_A$ and $n_B$, respectively. Then the matrix $X$ above can be written in the block form,

$$X = [X^{(A)} \; X^{(B)}] = \begin{bmatrix} X_{AA} & X_{AB} \\ X_{BA} & X_{BB} \end{bmatrix} = \begin{bmatrix} 1_A & h \\ f & 1_B \end{bmatrix} \begin{bmatrix} X_{AA} & 0 \\ 0 & X_{BB} \end{bmatrix} = T \hat{X}. \tag{2.2}$$

Formally, one has,

$$f = X_{BA} X_{AA}^{-1}, \tag{2.3a}$$

and,

$$h = X_{AB} X_{BB}^{-1}. \tag{2.3b}$$

The operator $f$ maps the part of an eigenvector $x_r^{(A)}$ lying in $S_A$ into the part lying in $S_B$. It can be considered as a generalization of the operator $f(E)$, defined by Löwdin (1962), in connection with a partitioning formalism with $n_A = 1$ (that is, with the space $S_A$ one-dimensional). The function of the space $S_A$ here is analogous to that of the so-called reference function in one-dimensional partitioning formalisms. Similarly, the operator $h$ maps the part of an eigenvector $x_r^{(B)}$ lying in $S_B$ into the part lying in $S_A$. From eqs. (2.3), it is seen that $f$ exists if the matrix block $X_{AA}$ is non-singular, while $h$ exists if the matrix block $X_{BB}$ is non-singular.
Since the eigenvectors of a hermitian matrix are orthogonal if the basis functions are linearly independent, the above conditions on $X_{AA}$ and $X_{BB}$ are satisfied simultaneously for at least one partitioning of the basis functions.

The orthonormality condition, (2.1b), on $X$ can be used to show that

$$h = -f^\dagger. \tag{2.4}$$

Thus, in the simple 2 x 2 case,

$$T = \begin{bmatrix} 1_A & -f^\dagger \\ f & 1_B \end{bmatrix}. \tag{2.5}$$

The operator $f$ is the fundamental quantity in this 2 x 2 partitioning formalism. Because of (2.4), it completely determines the projection operators, $P_A$ and $P_B$, onto the two eigenspaces $\mathcal{S}_A$ and $\mathcal{S}_B$. One has

$$P_A = X^{(A)} X^{(A)\dagger} = \begin{bmatrix} 1_A \\ f \end{bmatrix} (X_{AA} X_{AA}^\dagger) \begin{bmatrix} 1_A & f^\dagger \end{bmatrix}.$$

However, from the orthonormality condition, (2.1b), on $X$, one can write,

$$X^\dagger X = \hat{X}^\dagger g \hat{X} = 1_n, \tag{2.6}$$

where,

$$g = T^\dagger T = \begin{bmatrix} 1_A + f^\dagger f & 0 \\ 0 & 1_B + f f^\dagger \end{bmatrix} = \begin{bmatrix} g_A & 0 \\ 0 & g_B \end{bmatrix}. \tag{2.7}$$

The matrices $g_A$ and $g_B$ define metrics with respect to which the truncated eigenvectors in $X_{AA}$ and $X_{BB}$ are orthonormal. That is,

$$X_{AA}^\dagger g_A X_{AA} = 1_A, \tag{2.8a}$$

and

$$X_{BB}^\dagger g_B X_{BB} = 1_B. \tag{2.8b}$$

These truncated eigenvectors are not orthonormal with respect to unity unless $f = 0$. Since $\hat{X}$ is invertible, from (2.6) or (2.8), one has,

$$g_A = (X_{AA} X_{AA}^\dagger)^{-1} = 1_A + f^\dagger f, \tag{2.9a}$$

and,

$$g_B = (X_{BB} X_{BB}^\dagger)^{-1} = 1_B + f f^\dagger. \tag{2.9b}$$

Using (2.9a), the projection $P_A$ can now be written,

$$P_A = \begin{bmatrix} 1_A \\ f \end{bmatrix} g_A^{-1} \begin{bmatrix} 1_A & f^\dagger \end{bmatrix} = \begin{bmatrix} g_A^{-1} & g_A^{-1} f^\dagger \\ f g_A^{-1} & f g_A^{-1} f^\dagger \end{bmatrix}. \tag{2.10}$$

In a similar manner, the projection $P_B$ onto the eigenspace $\mathcal{S}_B$ can be written solely in terms of $f$ as,

$$P_B = \begin{bmatrix} -f^\dagger \\ 1_B \end{bmatrix} g_B^{-1} \begin{bmatrix} -f & 1_B \end{bmatrix} = \begin{bmatrix} f^\dagger g_B^{-1} f & -f^\dagger g_B^{-1} \\ -g_B^{-1} f & g_B^{-1} \end{bmatrix}. \tag{2.11}$$

It is easily verified that $P_A + P_B = 1$. The operators $P_A$ and $P_B$ above are manifestly self-adjoint. Furthermore, using the definitions of $g_A$ and $g_B$ in terms of $f$, given in eqs. (2.9), these matrices can be shown to be idempotent by direct matrix multiplication. Finally,

$$\mathrm{tr}\, P_A = \mathrm{tr}\, g_A^{-1} + \mathrm{tr}\, f g_A^{-1} f^\dagger = \mathrm{tr}\, g_A^{-1}(1_A + f^\dagger f) = \mathrm{tr}\, 1_A = n_A, \tag{2.12a}$$

and similarly,

$$\mathrm{tr}\, P_B = \mathrm{tr}\, 1_B = n_B, \tag{2.12b}$$

where the cyclic property of the trace has been used. Thus, for arbitrary $f$, the operators $P_A$ and $P_B$ satisfy all the criteria necessary to be orthogonal projection operators. The usefulness of the formulation in terms of the operator $f$ is essentially that, while the operators $P_A$ and $P_B$ must satisfy a complicated set of general constraints in order to be projections onto (any) spaces $\mathcal{S}_A$ and $\mathcal{S}_B$, the partitioning operator $f$ is not constrained in any way.

The eigenprojection $P_A$ is completely specified by the $n_A n$ complex components of the vectors $x_r^{(A)}$, $(r = 1, \ldots, n_A)$, spanning $\mathcal{S}_A$. However, the space $\mathcal{S}_A$ is also spanned by any other set of $n_A$ vectors $y_r^{(A)}$ related to the $x_r^{(A)}$ by a non-singular $n_A \times n_A$ linear transformation. This transformation corresponds merely to a change of basis in $\mathcal{S}_A$. Therefore, there are $n_A^2$ arbitrary, or redundant, complex parameters present in the specification of $\mathcal{S}_A$ using $X^{(A)}$. Thus only $n_A(n - n_A) = n_A n_B$ complex parameters are necessary to specify the eigenspace $\mathcal{S}_A$. But this is exactly the number of degrees of freedom (or matrix elements) in $f$. Thus the operator $f$ represents the minimum amount of information necessary to specify a projection onto the eigenspace $\mathcal{S}_A$ (and therefore, also onto $\mathcal{S}_B$, which is the complement of $\mathcal{S}_A$) of $H$. This partitioning formalism is therefore particularly useful in situations in which only eigenspaces have significance, rather than specific eigenvectors.

2.1.b The Defining Condition for f

The matrices $f$ and $h$, defined in the previous subsection, can be obtained by diagonalizing $H$ to get its eigenvectors, $X$, and then applying the formulas (2.3) directly.
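The direct route just described can be sketched numerically. The snippet below is a minimal illustration, assuming NumPy and a random hermitian test matrix. The helper names (`random_hermitian`, `f_from_eigenvectors`, `projection_A`) and the choice of the `n_a` lowest eigenvectors as the set $X^{(A)}$ are assumptions made for this example only. It forms $f$ from eq. (2.3a) and $P_A$ from eq. (2.10), then checks the projection properties discussed in section 2.1.a.

```python
import numpy as np


def random_hermitian(n, seed=0):
    """Random n x n hermitian test matrix (illustrative only)."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (a + a.conj().T) / 2


def f_from_eigenvectors(H, n_a):
    """f = X_BA X_AA^{-1}, eq. (2.3a), taking the n_a lowest eigenvectors as set A."""
    _, X = np.linalg.eigh(H)          # columns are orthonormal eigenvectors
    return X[n_a:, :n_a] @ np.linalg.inv(X[:n_a, :n_a])


def projection_A(f):
    """P_A = [1_A; f] g_A^{-1} [1_A, f^+], eq. (2.10); defined for any f."""
    n_a = f.shape[1]
    g_A = np.eye(n_a) + f.conj().T @ f        # metric g_A of eq. (2.7)
    col = np.vstack([np.eye(n_a), f])         # block column [1_A; f]
    return col @ np.linalg.inv(g_A) @ col.conj().T


n, n_a = 6, 2
H = random_hermitian(n)
f = f_from_eigenvectors(H, n_a)
P_A = projection_A(f)

print(np.allclose(P_A @ P_A, P_A))            # idempotency
print(np.allclose(P_A, P_A.conj().T))         # self-adjointness
print(np.isclose(np.trace(P_A).real, n_a))    # trace equals n_A, eq. (2.12a)
print(np.allclose(H @ P_A, P_A @ H))          # commutes with H for the exact f
```

The first three checks hold for any matrix `f` passed to `projection_A`; only the last one singles out the `f` that belongs to an eigenspace of `H`.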
However, it is possible to formulate a system of equations for $f$ and $h$ which does not require knowledge of the eigenvectors of $H$. The eigenvalue equation (2.1a) is rewritten as

$$H T = T \hat{H}, \tag{2.13}$$

where

$$\hat{H} = \hat{X} \epsilon \hat{X}^{-1} = \begin{bmatrix} \hat{H}_A & 0 \\ 0 & \hat{H}_B \end{bmatrix} \tag{2.14}$$

is to be block diagonal. The diagonal blocks of eq. (2.13) give expressions for $\hat{H}_A$ and $\hat{H}_B$ in terms of $f$, $h$, and $H$,

$$\hat{H}_A = H_{AA} + H_{AB} f, \tag{2.15a}$$

and,

$$\hat{H}_B = H_{BA} h + H_{BB}. \tag{2.15b}$$

If these expressions are substituted back into eq. (2.13), the two off-diagonal blocks become nonlinear matrix equations,

$$D(f) = H_{BA} + H_{BB} f - f \hat{H}_A(f) = H_{BA} + H_{BB} f - f H_{AA} - f H_{AB} f = 0, \tag{2.16}$$

and,

$$D'(h) = H_{AA} h + H_{AB} - h \hat{H}_B(h) = H_{AB} + H_{AA} h - h H_{BB} - h H_{BA} h = 0. \tag{2.17}$$

Equations (2.16) and (2.17) are both systems of $n_A n_B$ simultaneous nonlinear equations, the first for the matrix elements of $f$, and the second for the matrix elements of $h$. It is noteworthy that the two systems are not coupled, and thus can be solved independently. Of course, in this case, it is not necessary to solve both (2.16) and (2.17), because if one or the other has been solved, the solution of the remaining system is given by eq. (2.4). In fact, it can easily be seen that $D'^\dagger$ is of the same form in $-h^\dagger$ as $D(f)$ is in $f$, implying eq. (2.4) without explicitly making use of orthogonality (the hermiticity of $H$ is used, and this, of course, implies the orthogonality condition anyway).

In the 2 x 2 partitioning formalism, eq. (2.16) is the fundamental equation determining the operator $f$, if an orthonormal basis is used. A number of efficient iterative techniques for the exact solution of (2.16) will be detailed later. The quantity $D(f)$ is closely related to other more commonly used quantities in the determination of eigenprojections. In particular, $D(f)$ will be seen to be related in several ways to the error in an eigenprojection.
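As a small numerical check of the defining condition, the sketch below evaluates $D(f)$ of eq. (2.16) for an $f$ built directly from eigenvectors. It assumes NumPy and a random hermitian test matrix; the function name `D` and the test setup are illustrative choices, not part of the original text. It does not implement any of the iterative techniques referred to above.

```python
import numpy as np


def D(H, f):
    """D(f) = H_BA + H_BB f - f H_AA - f H_AB f, eq. (2.16)."""
    n_a = f.shape[1]
    H_AA, H_AB = H[:n_a, :n_a], H[:n_a, n_a:]
    H_BA, H_BB = H[n_a:, :n_a], H[n_a:, n_a:]
    return H_BA + H_BB @ f - f @ H_AA - f @ H_AB @ f


rng = np.random.default_rng(0)
n, n_a = 6, 2
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (a + a.conj().T) / 2                            # hermitian test matrix

eps, X = np.linalg.eigh(H)                          # eigenvalues in ascending order
f = X[n_a:, :n_a] @ np.linalg.inv(X[:n_a, :n_a])    # eq. (2.3a)
h = X[:n_a, n_a:] @ np.linalg.inv(X[n_a:, n_a:])    # eq. (2.3b)

residual = np.linalg.norm(D(H, f))                  # vanishes for the exact f
H_hat_A = H[:n_a, :n_a] + H[:n_a, n_a:] @ f         # eq. (2.15a), non-hermitian
eig_A = np.sort(np.linalg.eigvals(H_hat_A).real)

print(residual)
print(np.allclose(h, -f.conj().T))                  # eq. (2.4)
print(np.allclose(eig_A, eps[:n_a]))                # spectrum of H_hat_A is a subset of that of H
```

Perturbing `f` away from the exact solution makes `D(H, f)` non-zero, which is what makes it usable as a residual.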
2.1.c Rederivation From a Projection Point of View

An alternative approach to this partitioning formalism can be made via the projection operators themselves. The objective is to determine the eigenprojection $P_A$ onto a space $\mathcal{S}_A$ spanned by $n_A$ eigenvectors of $H$, in terms of some minimal set of variables which number $n_A n_B$, as shown previously. It is useful to examine this approach in some detail, not only because it provides a different point of view, but also because the projections themselves are manifestly basis independent.

The conditions that $P_A$ be an eigenprojection of $H$ are that $P_A$ commute with $H$,

$$[H, P_A] = 0, \tag{2.18}$$

and that $P_A$ be a projection operator, that is,

$$P_A^2 = P_A, \qquad P_A^\dagger = P_A, \qquad \mathrm{tr}\, P_A = n_A. \tag{2.19}$$

It is convenient to define a partitioning of the basis functions into two sets, spanning spaces $S_A$ and $S_B$ of dimensions $n_A$ and $n_B$, respectively. Projections onto the spaces $S_A$ and $S_B$ are given by,

$$P_A^{0} = \begin{bmatrix} 1_A & 0 \\ 0 & 0 \end{bmatrix}, \qquad P_B^{0} = \begin{bmatrix} 0 & 0 \\ 0 & 1_B \end{bmatrix}. \tag{2.20}$$

This partitioning of the basis functions implies that the projection $P_A$ can be written in block form,

$$P_A = \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix}, \tag{2.21a}$$

where,

$$P_{AA} = P_A^{0} P_A P_A^{0}, \quad P_{BA} = P_B^{0} P_A P_A^{0}, \quad P_{AB} = P_A^{0} P_A P_B^{0}, \quad P_{BB} = P_B^{0} P_A P_B^{0}. \tag{2.21b}$$

In terms of the partitioned matrix, (2.21), the idempotency condition, $P_A^2 = P_A$, is equivalent to the three block equations,

$$P_{AA}^2 - P_{AA} + P_{AB} P_{BA} = 0,$$
$$P_{BA} P_{AA} + P_{BB} P_{BA} - P_{BA} = 0, \tag{2.22}$$

and

$$P_{BB}^2 - P_{BB} + P_{BA} P_{AB} = 0,$$

the remaining block equation being just the adjoint of the second one. Since there are only $n_A n_B$ independent variables in $P_A$, it is possible, in principle, to express $P_{AA}$ and $P_{BB}$ in terms of $P_{BA}$. However, the equations are nonlinear, and while formal general solutions can be written down,

$$P_{AA} = \tfrac{1}{2}\left[ 1_A \pm (1_A - 4 P_{AB} P_{BA})^{1/2} \right], \tag{2.23a}$$

and

$$P_{BB} = \tfrac{1}{2}\left[ 1_B \pm (1_B - 4 P_{BA} P_{AB})^{1/2} \right], \tag{2.23b}$$

they are seen to contain the ambiguity in the square root, and are generally difficult to evaluate.

A more useful result is reached by a different route. The matrix $P_A$ is of rank $n_A$, because the projection operator onto an $n_A$-dimensional space has precisely $n_A$ non-zero eigenvalues, corresponding to the $n_A$ eigenvectors of $P_A$ which span the image space $\mathcal{S}_A$. This means that there is at least one $n_A \times n_A$ submatrix of $P_A$ which is non-singular. It will be assumed that the partitioning of the basis set, (2.20), is carried out so that $P_{AA}$ is such a submatrix, that is, $\det(P_{AA}) \neq 0$. With this assumption, the first equation of (2.22) can be rewritten as

$$P_{AA} = P_{AA} \left( 1_A + P_{AA}^{-1} P_{AB} P_{BA} P_{AA}^{-1} \right) P_{AA}. \tag{2.24}$$

The quantity inside the brackets in (2.24) will be greatly simplified if $P_{BA}$ is written as some factor times $P_{AA}$, that is, if

$$P_{BA} = f P_{AA}, \qquad P_{AB} = P_{AA} f^\dagger, \tag{2.25a}$$

where $f$ is an $n_B \times n_A$ matrix, and thus represents a suitable quantity in terms of which the matrix $P_A$ could be expressed. The existence of $f$ is assured by the invertibility of $P_{AA}$:

$$f = P_{BA} P_{AA}^{-1}. \tag{2.25b}$$

Now, (2.24) yields

$$P_{AA} = (1_A + f^\dagger f)^{-1}. \tag{2.26}$$

From (2.25),

$$P_{BA} = f (1_A + f^\dagger f)^{-1}. \tag{2.27}$$

Finally, substituting (2.25) into the second of eqs. (2.22), and multiplying from the right by $f^\dagger (f f^\dagger)^{-1}$, yields,

$$P_{BB} = f (1_A + f^\dagger f)^{-1} f^\dagger. \tag{2.28}$$

Equation (2.18) can now be used to derive an equation defining the operator $f$.
Expansion of the commutator again yields three unique block equations,

$$(EQ)_{AA} = (1_A + f^\dagger f)^{-1}(H_{AA} + f^\dagger H_{BA}) - (H_{AA} + H_{AB} f)(1_A + f^\dagger f)^{-1} = 0,$$
$$(EQ)_{BA} = (1_B + f f^\dagger)^{-1}(f H_{AA} + f f^\dagger H_{BA}) - (H_{BA} + H_{BB} f)(1_A + f^\dagger f)^{-1} = 0, \tag{2.29}$$

and,

$$(EQ)_{BB} = (1_B + f f^\dagger)^{-1}(f H_{AB} + f f^\dagger H_{BB}) - (H_{BA} f^\dagger + H_{BB} f f^\dagger)(1_B + f f^\dagger)^{-1} = 0.$$

Here, use has been made of the relations,

$$f (1_A + f^\dagger f)^{-1} = (1_B + f f^\dagger)^{-1} f,$$

and

$$(1_A + f^\dagger f)^{-1} f^\dagger = f^\dagger (1_B + f f^\dagger)^{-1}, \tag{2.30}$$

to move all of the inverse operators to the outside of each term. It is then seen that

$$-f (1_A + f^\dagger f)(EQ)_{AA}(1_A + f^\dagger f) + (1_B + f f^\dagger)(EQ)_{BA}(1_A + f^\dagger f) = -(1_B + f f^\dagger)\left[ H_{BA} + H_{BB} f - f H_{AA} - f H_{AB} f \right] = 0, \tag{2.31}$$

and also,

$$(1_B + f f^\dagger)(EQ)_{BA}(1_A + f^\dagger f) + (1_B + f f^\dagger)(EQ)_{BB}(1_B + f f^\dagger) f = -D(f)(1_A + f^\dagger f) = 0,$$

where the quantity $D(f)$ has been defined in eq. (2.16). Since the matrices $1_A + f^\dagger f$ and $1_B + f f^\dagger$ are non-singular, both relations require $D(f) = 0$. That is, the operator $f$ defined in eq. (2.25) is of the same size and satisfies the same defining equation as the partitioning operator $f$ described in the previous two subsections. This result re-emphasizes the fact that this partitioning formalism is based on the idea of defining an eigenspace of a hermitian operator, rather than individual eigenvectors.

The 'pull-through' relations, (2.30), are used extensively in the 2 x 2 partitioning formalism. They are most usefully written as

$$f g_A^{-1} = g_B^{-1} f, \tag{2.32a}$$

and

$$g_A^{-1} f^\dagger = f^\dagger g_B^{-1}, \tag{2.32b}$$

in the notation established in the previous subsections.

2.1.d The Relationship Between T and the Eigenprojections: Covariant and Contravariant Representations

The columns of the partitioning matrix $T$, of eq. (2.2) or (2.5), can be regarded as a set of non-orthonormal basis vectors spanning the original $n$-dimensional basis space.
These vectors will be denoted here by $e_r$, $(r = 1, \ldots, n)$, that is

$$[e_1, e_2, \ldots, e_n] = T = \begin{bmatrix} 1_A & -f^\dagger \\ f & 1_B \end{bmatrix}. \tag{2.33}$$

The metric defined by the scalar products of these vectors, $g_{rs} = e_r^\dagger e_s$, is given by

$$g = T^\dagger T = \begin{bmatrix} g_A & 0 \\ 0 & g_B \end{bmatrix}, \tag{2.34}$$

using the notation developed in eq. (2.7). Using the inverse metric, $g^{rs} = (g^{-1})_{rs}$, a set of contragredient basis vectors $e^r$, $(r = 1, \ldots, n)$, can be defined by

$$e^r = \sum_{s=1}^{n} g^{rs} e_s, \tag{2.35}$$

or

$$[e^1, e^2, \ldots, e^n] = T g^{-1} = \begin{bmatrix} g_A^{-1} & -f^\dagger g_B^{-1} \\ f g_A^{-1} & g_B^{-1} \end{bmatrix}. \tag{2.36}$$

On comparison with eq. (2.10), the first $n_A$ of these vectors $e^r$ can be identified as the first $n_A$ columns of the projection $P_A$ onto $\mathcal{S}_A$. Similarly, the last $n_B$ of the $e^r$ are the last $n_B$ columns of $P_B = (1 - P_A)$, the projection onto the complementary subspace $\mathcal{S}_B$. Thus the two sets of $n_A$ vectors

$$e_A = \begin{bmatrix} 1_A \\ f \end{bmatrix}, \qquad e^A = \begin{bmatrix} (P_A)_{AA} \\ (P_A)_{BA} \end{bmatrix}, \tag{2.37}$$

are dual (contragredient), both spanning the eigenspace $\mathcal{S}_A$, while the two sets of $n_B$ vectors,

$$e_B = \begin{bmatrix} -f^\dagger \\ 1_B \end{bmatrix}, \qquad e^B = \begin{bmatrix} (P_B)_{AB} \\ (P_B)_{BB} \end{bmatrix}, \tag{2.38}$$

are also dual, both sets spanning the eigenspace $\mathcal{S}_B$.

From a different point of view, a metric $\Delta$ can be defined, with respect to which the $e_r$ are orthonormal, namely,

$$e_r^\dagger \Delta\, e_s = \delta_{rs}, \qquad (r, s = 1, \ldots, n). \tag{2.39}$$

That is,

$$\Delta = (e e^\dagger)^{-1}. \tag{2.40}$$

Here $\Delta$ is numerically the same as $g^{-1}$ above. Similarly, the $e^r$ are orthonormal with respect to $\Delta^{-1}$, which is the same numerically as $g$ above. It should be noted that $\Delta$ and $g^{-1}$, and $\Delta^{-1}$ and $g$, as denoted here are, in principle, quite different quantities. They happen to be numerically identical here because $e e^\dagger = e^\dagger e$ (and likewise for the contragredient set). Such is not the case, however, if the original basis is non-orthonormal.

These sets of contragredient vectors are very useful for writing a number of important relations, to be developed later, in a very compact manner.

2.1.e Variational Formulation of D(f) = 0

The expectation value of an operator with respect to one of its eigenvectors is stationary with respect to arbitrary small variations in that eigenvector. As a result, if $P_A$ is a projection onto the eigenspace $\mathcal{S}_A$ of the operator $H$, then the expectation value of $H$ over $\mathcal{S}_A$, given by

$$E = \mathrm{tr}\, P_A H, \tag{2.41}$$

will be stationary with respect to arbitrary small variations in $P_A$. That is,

$$E(P_A + \delta P_A) - E(P_A) = \mathrm{tr}\,[P_A(f + \delta f) - P_A(f)] H = \mathrm{tr}\, \delta P_A H + O(\delta^2), \tag{2.42}$$

must vanish to first order in the infinitesimals. It is assumed here that $H$ is independent of $P_A$ or $f$. From eq. (2.10), to first order,

$$(\delta P_A)_{AA} = -g_A^{-1}\, \delta g_A\, g_A^{-1},$$
$$(\delta P_A)_{BA} = \delta f\, g_A^{-1} - f g_A^{-1}\, \delta g_A\, g_A^{-1} = (\delta P_A)_{AB}^\dagger, \tag{2.43a}$$

and

$$(\delta P_A)_{BB} = -f g_A^{-1}\, \delta g_A\, g_A^{-1} f^\dagger + \delta f\, g_A^{-1} f^\dagger + f g_A^{-1}\, \delta f^\dagger,$$

where, to first order,

$$\delta g_A = \delta f^\dagger f + f^\dagger \delta f. \tag{2.43b}$$

Substitution of (2.43a,b) into eq. (2.41), followed by use of the cyclic property of the trace, and the 'pull-through' relations (2.32) for $f$ and $f^\dagger$, results in an expression of the form

$$\delta E = \mathrm{tr}\, \delta f^\dagger \mathcal{D} + \mathrm{tr}\, \delta f\, \mathcal{D}^\dagger + O(\delta^2), \tag{2.44}$$

where

$$\mathcal{D} = g_B^{-1} D(f)\, g_A^{-1}. \tag{2.45}$$

Because $\delta f$ and $\delta f^\dagger$ are arbitrary variations in $f$ and $f^\dagger$, the condition that $\delta E$ vanish in first order is that the matrix $\mathcal{D}$ vanish. The matrices $g_A^{-1}$ and $g_B^{-1}$ are positive definite, however, and thus $\mathcal{D}$ can vanish only if $D(f)$ itself vanishes. Thus, the condition that the expectation value of $H$ over the image space of the projection $P_A$ be stationary is equivalent to the condition $D(f) = 0$, eq. (2.16).

It is also interesting to note here that the quantity $\mathcal{D}$ in eq. (2.45) is the BA block of the hamiltonian $H$, in the basis of contragredient, non-orthonormal vectors $e^r$ of eq. (2.36). Thus one can write

$$\mathcal{D} = [(1 - P_A) H P_A]_{BA}, \tag{2.46a}$$

or

$$\mathcal{D} = H_{e^B e^A} = e^{B\dagger} H e^{A}, \tag{2.46b}$$

and the rate of change of the expectation value $E$ with an element $f_{\sigma r}$ of $f$ is seen to be proportional to the corresponding element of the off-diagonal block of the hamiltonian in this particular basis.

2.1.f Relation Between $\sigma^2$ and D(f): Eigenvalue Dispersion

In the study of matrix eigenvalue problems, it is useful to define the variance $\sigma^2$, which is a measure of the error in an approximate eigenvector, $x$, of a matrix $H$, given by

$$\sigma^2 = \frac{(Hx - \lambda x)^\dagger (Hx - \lambda x)}{x^\dagger x}. \tag{2.47}$$

If the approximate eigenvalue, $\lambda$, is calculated as the Rayleigh quotient of $H$ with respect to $x$,

$$\lambda = \frac{x^\dagger H x}{x^\dagger x}, \tag{2.48}$$

then eq. (2.47) becomes

$$\sigma^2 = \frac{x^\dagger H^2 x}{x^\dagger x} - \left( \frac{x^\dagger H x}{x^\dagger x} \right)^2, \tag{2.49}$$

which is in the form of the usual definition of variance. In terms of the projection $P_x = x x^\dagger$ onto the one-dimensional space spanned by the normalized vector $x$, eq. (2.49) can be written as

$$\sigma^2 = \mathrm{tr}\, H (1 - P_x) H P_x. \tag{2.50}$$

Equation (2.50) suggests a generalization of the concept of the variance $\sigma^2$ to apply to projections $P_A$ onto a multidimensional space spanned by several approximate eigenvectors of $H$. Substitution of eq. (2.10) for $P_A$ into eq. (2.50), and use of (2.16), gives

$$\sigma^2 = \mathrm{tr}\, H (1 - P_A) H P_A = \mathrm{tr}\, g_A^{-1} D(f)^\dagger g_B^{-1} D(f) = \left\| g_B^{-1/2} D(f)\, g_A^{-1/2} \right\|^2, \tag{2.51}$$

where $\|A\| = \left( \sum_{r,s} |A_{rs}|^2 \right)^{1/2}$ denotes the Hilbert-Schmidt norm of the matrix $A$. This may also be written in the form

$$\sigma^2 = -\tfrac{1}{2}\, \mathrm{tr}\left( [H, P_A]^2 \right). \tag{2.52}$$

If $P_A$ is an exact eigenprojection of $H$, the variance $\sigma^2$ in (2.51) must vanish, because then $[H, P_A] = 0$. Since $g_A^{-1}$ and $g_B^{-1}$ are positive definite matrices, $\sigma^2$ can vanish only if $D(f) = 0$. In this case, $D(f)$ is seen to give a quantitative measure of the error in $P_A$, rather than merely a criterion for the presence or absence of error.
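The equivalence of the three expressions for the variance, eqs. (2.51) and (2.52), can be checked numerically for an arbitrary trial $f$. The following is a minimal sketch assuming NumPy; the random test matrix, the scale factor 0.3 and the variable names are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_a = 6, 2
n_b = n - n_a
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (a + a.conj().T) / 2                                  # hermitian test matrix
# Arbitrary (inexact) trial f, so that the variance is non-zero.
f = 0.3 * (rng.standard_normal((n_b, n_a)) + 1j * rng.standard_normal((n_b, n_a)))

g_A = np.eye(n_a) + f.conj().T @ f                        # eq. (2.9a)
g_B = np.eye(n_b) + f @ f.conj().T                        # eq. (2.9b)
col = np.vstack([np.eye(n_a), f])
P_A = col @ np.linalg.inv(g_A) @ col.conj().T             # eq. (2.10)

H_AA, H_AB = H[:n_a, :n_a], H[:n_a, n_a:]
H_BA, H_BB = H[n_a:, :n_a], H[n_a:, n_a:]
D = H_BA + H_BB @ f - f @ H_AA - f @ H_AB @ f             # eq. (2.16)

# Three routes to the same variance.
var_proj = np.trace(H @ (np.eye(n) - P_A) @ H @ P_A).real            # tr H(1-P_A)HP_A
var_D = np.trace(np.linalg.inv(g_A) @ D.conj().T
                 @ np.linalg.inv(g_B) @ D).real                      # eq. (2.51)
comm = H @ P_A - P_A @ H
var_comm = -0.5 * np.trace(comm @ comm).real                         # eq. (2.52)

print(var_proj, var_D, var_comm)
```

All three numbers agree to rounding error, and each is strictly positive here because the trial `f` does not satisfy eq. (2.16).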
In terms of matrix elements, one has

$$\sigma^2 = \sum_{p,t} \left| \langle \phi_p | g_B^{-1/2} D(f)\, g_A^{-1/2} | \phi_t \rangle \right|^2 = \sum_{p,t} \left| \langle \phi_p^{\circ} | H | \phi_t^{\circ} \rangle \right|^2, \tag{2.53}$$

where $\phi_p$, $(p = 1, \ldots, n_B)$, and $\phi_t$, $(t = 1, \ldots, n_A)$, are basis elements in the subspaces $S_B$ and $S_A$, respectively. The $\phi_p^{\circ}$, $(p = 1, \ldots, n_B)$, and the $\phi_t^{\circ}$, $(t = 1, \ldots, n_A)$, are the orthogonalized transformed basis vectors,

$$T g^{-1/2} = \begin{bmatrix} g_A^{-1/2} & -f^\dagger g_B^{-1/2} \\ f g_A^{-1/2} & g_B^{-1/2} \end{bmatrix}, \tag{2.54}$$

in the basis of the $\phi_t$ and $\phi_p$ above. Thus $\sigma^2$ is seen to be a measure of the smallness of the elements of the off-diagonal block of $H$ in this basis. Using the closure relation $\sum_p |p\rangle\langle p| = 1 - \sum_t |t\rangle\langle t|$, eq. (2.53) can be rewritten as

$$\sigma^2 = \sum_{t \in \mathcal{S}_A} \langle \phi_t^{\circ} | H^2 | \phi_t^{\circ} \rangle - \sum_{t,s \in \mathcal{S}_A} \left| \langle \phi_t^{\circ} | H | \phi_s^{\circ} \rangle \right|^2, \tag{2.55}$$

where $\phi_t^{\circ}$, $\phi_s^{\circ}$, in these summations run over vectors in the space $\mathcal{S}_A$ only. On transforming these vectors to a new set $\psi_r$, $(r = 1, \ldots, n_A)$, which diagonalizes $H$ in $\mathcal{S}_A$, eq. (2.55) becomes

$$\sigma^2 = \sum_{n=1}^{n_A} \left[ \langle \psi_n | H^2 | \psi_n \rangle - \langle \psi_n | H | \psi_n \rangle^2 \right] = \sum_{n=1}^{n_A} \sigma_n^2. \tag{2.56}$$

If $f$ is an exact solution of (2.16), uncoupling the $\phi_t^{\circ}$, $(t = 1, \ldots, n_A)$, in $\mathcal{S}_A$, and the $\phi_p^{\circ}$, $(p = 1, \ldots, n_B)$, in $\mathcal{S}_B$, exactly, then the $\psi_n$ in (2.56) are exact eigenvectors of $H$, and each $\sigma_n$ is identically zero. If $f$ is not exact, then the $\psi_n$ will be only approximate eigenvectors of $H$, and $\sigma_n^2$ is the variance of $H$ with respect to the single approximate eigenvector $\psi_n$. Thus $\sigma^2$ is the sum over these individual variances, and is useful not only as a quantitative measure of the accuracy of $f$, but also as an upper bound to the individual $\sigma_n^2$.

2.1.g Transformation of f Under a Change of Basis

The quantity $f$ defined by eq. (2.2) is clearly dependent upon the basis set being used. Because of eq. (2.3), it does not transform linearly under a linear transformation of the basis vectors. Consider the linear transformation,

$$\phi_i' = \sum_j \phi_j (V^{-1})_{ji}, \tag{2.57}$$

of the basis vectors, so that the eigenvectors of $H$, referred to the new basis $[\phi_i']$, have coefficients,

$$X' = V X. \tag{2.58}$$

In the new basis, partitionings of eigenvectors and basis vectors similar to those described in section 2.1.a can be carried out, yielding,

$$X' = \begin{bmatrix} X_{AA}' & X_{AB}' \\ X_{BA}' & X_{BB}' \end{bmatrix} = \begin{bmatrix} 1_A & h' \\ f' & 1_B \end{bmatrix} \begin{bmatrix} X_{AA}' & 0 \\ 0 & X_{BB}' \end{bmatrix} = T' \hat{X}', \tag{2.59a}$$

where

$$f' = X_{BA}' X_{AA}'^{-1}, \tag{2.59b}$$

analogous to eqs. (2.2) and (2.3). To obtain the relationship between $f$ and $f'$, we proceed as follows. From (2.2),

$$X' = V X = V T \hat{X} = T' \hat{X}',$$

or

$$T' = V T \hat{X} \hat{X}'^{-1}, \tag{2.60}$$

where the right hand side of eq. (2.60) is independent of $f'$, but does depend on the truncated new eigenvectors $\hat{X}'$. However, from the AA block equation of (2.60), it follows that,

$$X_{AA}' = (V_{AA} + V_{AB} f) X_{AA}. \tag{2.61}$$

Substitution of this equation into the BA block equation of (2.60) gives $f'$ in terms of $f$ and $V$ only,

$$f' = (V_{BA} + V_{BB} f)(V_{AA} + V_{AB} f)^{-1}. \tag{2.62}$$

While such a complicated transformation can be very inconvenient in some cases, it is also a feature which can be usefully exploited. In calculations in which the elements of $f$ are acting as coordinates, the metric character of the object function can be radically altered by a simple basis change, because of the nonlinear dependence of $f'$ on $V$. For quantities transforming linearly in $V$, such a basis change merely results in a rotation of the object function. This point is discussed in greater detail in chapters 5 and 8.

If $f$ is small, the inverse matrix in (2.62) can be expanded as

$$(V_{AA} + V_{AB} f)^{-1} = V_{AA}^{-1} (1_A + V_{AB} f V_{AA}^{-1})^{-1} = V_{AA}^{-1} - V_{AA}^{-1} V_{AB} f V_{AA}^{-1} + \ldots, \tag{2.63}$$

and thus, to first order in $f$,

$$f' = V_{BA} V_{AA}^{-1} + (V_{BB} - V_{BA} V_{AA}^{-1} V_{AB}) f V_{AA}^{-1} + \ldots \tag{2.64}$$

Thus, if $f$ is small, the transformation (2.62) is nearly linear, although not homogeneous.

2.2 Effective Operators

2.2.a Basic Definitions

The primary application of the partitioning formalism just described is in the construction of effective operators. In this context, such operators are defined in either of the subspaces $S_A$ or $S_B$ of the full basis space, but their eigenvalues form a subset of the eigenvalues of the original operator in the full basis space, and the corresponding eigenvectors are related in some way to those of the original operator. There are two ways of regarding the matrices of such effective operators. They can be regarded as the matrix of a transformed operator in the old basis (active sense), or, alternatively, as the matrix of the old operator in a transformed basis (passive sense). Both points of view are equivalent, but in what follows, the former will be emphasized.

The simplest set of such effective operators for the matrix $H$ has already been defined in equations (2.14) and (2.15). In $S_A$, we have the operator

$$\hat{H}_A = H_{AA} + H_{AB} f, \tag{2.65a}$$

with the eigenvalue equation

$$\hat{H}_A X_{AA} = X_{AA} \epsilon^{(A)}, \tag{2.65b}$$

and in $S_B$,

$$\hat{H}_B = H_{BB} - H_{BA} f^\dagger, \tag{2.66a}$$

with the eigenvalue equation,

$$\hat{H}_B X_{BB} = X_{BB} \epsilon^{(B)}. \tag{2.66b}$$

Both $\hat{H}_A$ and $\hat{H}_B$ are non-hermitian in general, although their eigenvalues $\epsilon^{(A)}$ and $\epsilon^{(B)}$ are real, since they are subsets of the eigenvalues of the hermitian operator $H$. The eigenvectors $X_{AA}$ and $X_{BB}$ are not orthonormal in general, because they are truncations of the orthonormal eigenvectors $X$ of the full hamiltonian $H$.

It is possible to derive a pair of self-adjoint effective operators directly from the eigenvalue equation (2.1a). Premultiplication by $T^\dagger$, and use of eq. (2.17), yields,

$$G_A X_{AA} = g_A X_{AA} \epsilon^{(A)}, \tag{2.67a}$$

where

$$G_A = (T^\dagger H T)_{AA} = H_{AA} + H_{AB} f + f^\dagger H_{BA} + f^\dagger H_{BB} f, \tag{2.67b}$$

and,

$$G_B X_{BB} = g_B X_{BB} \epsilon^{(B)}, \tag{2.68a}$$

where

$$G_B = (T^\dagger H T)_{BB} = H_{BB} - H_{BA} f^\dagger - f H_{AB} + f H_{AA} f^\dagger. \tag{2.68b}$$

The off-diagonal block of $T^\dagger H T$ is given by,

$$G_{BA} = H_{BA} + H_{BB} f - f H_{AA} - f H_{AB} f, \tag{2.69}$$

which is just the quantity $D(f)$, defined in eq. (2.16). When $G_{BA} = 0$, it can be shown that,

$$G_A = g_A \hat{H}_A, \qquad G_B = g_B \hat{H}_B, \tag{2.70}$$

using eqs. (2.9), and the definitions of the effective operators presented above. Thus, when $f$ is known exactly, the self-adjoint effective operators $G_A$ and $G_B$ could be considered to be obtained from the non-selfadjoint effective operators $\hat{H}_A$ and $\hat{H}_B$ by orthonormalizing the eigenvectors of the latter.

It is also possible to obtain self-adjoint effective operators in $S_A$ and $S_B$ by orthonormalizing the truncated eigenvectors. The effective operators $\hat{H}_A$, $\hat{H}_B$, and $G_A$ and $G_B$ above, are uniquely determined once particular partitionings of the basis and eigenvector spaces are defined. The self-adjoint effective operators obtained by orthonormalization are not unique, however, in that they depend on the particular orthonormalization procedure employed. The symmetrical orthogonalization procedure of Löwdin (1970) and others has the feature that the new orthonormalized vectors resemble the initial vectors as closely as possible, in a particular sense.¹ Applied to the present case, the new orthonormal eigenvectors are given by

$$C_{AA} = g_A^{1/2} X_{AA}, \tag{2.71a}$$

in $S_A$, and,

$$C_{BB} = g_B^{1/2} X_{BB}, \tag{2.71b}$$

in $S_B$. Thus one has,

$$C_{AA}^\dagger C_{AA} = X_{AA}^\dagger g_A X_{AA} = 1_A, \qquad C_{BB}^\dagger C_{BB} = X_{BB}^\dagger g_B X_{BB} = 1_B, \tag{2.72}$$

by eq. (2.8).

¹ In the notation used above, the difference between the two sets is measured by $\sum_{i,j} |C_{ij} - X_{ij}|^2$, which is minimized if $C$ is given by eq. (2.71) (Löwdin, 1970).

The eigenvalue equation in $S_A$ is obtained either by premultiplying (2.65b) by $g_A^{1/2}$ or (2.67a) by $g_A^{-1/2}$ to get,

$$\tilde{H}_A C_{AA} = C_{AA} \epsilon^{(A)}, \tag{2.73}$$

where

$$\tilde{H}_A = g_A^{1/2} \hat{H}_A\, g_A^{-1/2} \tag{2.74a}$$
$$\phantom{\tilde{H}_A} = g_A^{-1/2} G_A\, g_A^{-1/2}. \tag{2.74b}$$

Similarly, premultiplication of (2.66b) by $g_B^{1/2}$ or (2.68a) by $g_B^{-1/2}$ gives the equation

$$\tilde{H}_B C_{BB} = C_{BB} \epsilon^{(B)}, \tag{2.75}$$

where

$$\tilde{H}_B = g_B^{1/2} \hat{H}_B\, g_B^{-1/2} \tag{2.76a}$$
$$\phantom{\tilde{H}_B} = g_B^{-1/2} G_B\, g_B^{-1/2}. \tag{2.76b}$$

It is also possible to define effective operators in either $S_A$ or $S_B$ for any other operator defined in the unpartitioned space.
For some operator $M$,

$$X^{(A)\dagger} M X^{(A)} = X_{AA}^\dagger \bar{M}_A X_{AA}, \tag{2.77}$$

where

$$\bar{M}_A = \begin{bmatrix} 1_A & f^\dagger \end{bmatrix} \begin{bmatrix} M_{AA} & M_{AB} \\ M_{BA} & M_{BB} \end{bmatrix} \begin{bmatrix} 1_A \\ f \end{bmatrix} = M_{AA} + M_{AB} f + f^\dagger M_{BA} + f^\dagger M_{BB} f. \tag{2.78}$$

$\bar{M}_A$ has the same form in $M$ as $G_A$ defined in (2.67b) has in $H$. Here $\bar{M}_A$ has the same expectation values for the truncated eigenvectors $X_{AA}$ as the operator $M$ has for the full eigenvectors $X$. An effective operator with the same properties with respect to the orthonormalized eigenvectors is clearly given by,

$$\tilde{M}_A = g_A^{-1/2} \bar{M}_A\, g_A^{-1/2}, \tag{2.79}$$

which is analogous to $\tilde{H}_A$ defined in eq. (2.74). The analogue of the effective operator $\hat{H}_A$ of eq. (2.65a) can be obtained by premultiplying $\bar{M}_A$ by $g_A^{-1}$, following eq. (2.67b). Effective operators for $M$ restricted to $S_B$, analogous to (2.77) - (2.79), can be obtained in a similar manner.

2.2.b Eigenvectors and Eigenvalues of the Effective Operators

In order to amplify the material in the immediately preceding subsection, the connection between the eigenvalues and eigenvectors of the operator $H$ and those of the effective operators $\hat{H}$, $G$, and $\tilde{H}$ will be illustrated here from a different point of view.

The full operator $H$ has the eigenvalue equation

$$H \Psi_i = \epsilon_i \Psi_i. \tag{2.80}$$

Once the two basis spaces, $S_A$ and $S_B$, are defined, each eigenvector can be written as a sum of two parts,

$$\Psi_i = \Psi_{iA} + \Psi_{iB}, \tag{2.81}$$

one part in $S_A$ and one in $S_B$. The eigenvectors are themselves divided into two sets, $\Psi_i^{(A)}$, $(i = 1, \ldots, n_A)$, and $\Psi_i^{(B)}$, $(i = 1, \ldots, n_B)$, where $n_A + n_B = n$, according as they lie in $\mathcal{S}_A$ or $\mathcal{S}_B$. The basic property of $f$ is to map the part of $\Psi_i^{(A)}$ in $S_A$ into the part in $S_B$, according to

$$\Psi_{iB}^{(A)} = f\, \Psi_{iA}^{(A)}. \tag{2.82a}$$

Similarly, one has,

$$\Psi_{iA}^{(B)} = (-f^\dagger)\, \Psi_{iB}^{(B)}. \tag{2.82b}$$

Combination with eq. (2.81) yields,

$$\Psi_i^{(A)} = \Psi_{iA}^{(A)} + \Psi_{iB}^{(A)} = \Psi_{iA}^{(A)} + f\, \Psi_{iA}^{(A)} = (1_A + f)\, \Psi_{iA}^{(A)}, \tag{2.83a}$$

and,

$$\Psi_i^{(B)} = (1_B - f^\dagger)\, \Psi_{iB}^{(B)}. \tag{2.83b}$$

In the notation used here and throughout this subsection, the operator $f$ is to be regarded, when necessary, as embedded in the $n$-dimensional basis space, but will be denoted by the same symbol as before.

The eigenvalue equation for $\hat{H}_A$ is

$$\hat{H}_A \Psi_{iA}^{(A)} = \epsilon_i^{(A)} \Psi_{iA}^{(A)}, \qquad (i = 1, \ldots, n_A), \tag{2.84a}$$

where the eigenvectors satisfy

$$\langle \Psi_{iA}^{(A)} | g_A | \Psi_{jA}^{(A)} \rangle = \delta_{ij}, \qquad (i, j = 1, \ldots, n_A). \tag{2.84b}$$

For the effective operator $G_A$, it is

$$G_A \Psi_{iA}^{(A)} = \epsilon_i^{(A)} g_A \Psi_{iA}^{(A)}, \qquad (i = 1, \ldots, n_A), \tag{2.85}$$

with the same orthonormality conditions (2.84b). Finally, for the effective operator $\tilde{H}_A$, the eigenvalue equation is

$$\tilde{H}_A \chi_i^{(A)} = \epsilon_i^{(A)} \chi_i^{(A)}, \qquad (i = 1, \ldots, n_A), \tag{2.86a}$$

where,

$$\chi_i^{(A)} = g_A^{1/2} \Psi_{iA}^{(A)}, \tag{2.86b}$$

and,

$$\langle \chi_i^{(A)} | \chi_j^{(A)} \rangle = \delta_{ij}. \tag{2.86c}$$

In terms of the eigenvectors of $\tilde{H}_A$, the eigenvectors of the original operator $H$ are

$$\Psi_i^{(A)} = (1_A + f)\, g_A^{-1/2} \chi_i^{(A)}. \tag{2.87}$$

In all of these equations, the eigenvalues $\epsilon_i^{(A)}$, $(i = 1, \ldots, n_A)$, are exactly the $n_A$ eigenvalues of the original operator $H$ corresponding to the eigenvectors $\Psi_i^{(A)}$, $(i = 1, \ldots, n_A)$. The eigenvalue equations for the effective operators $\hat{H}_B$, $G_B$, and $\tilde{H}_B$, defined in $S_B$, are of the same form as those given above for the corresponding effective operators in $S_A$.

Finally, consider the projections $P_A$ and $P_B$ onto the eigenspaces $\mathcal{S}_A$ and $\mathcal{S}_B$, respectively. For $P_A$,

$$P_A = \sum_{i=1}^{n_A} | \Psi_i^{(A)} \rangle \langle \Psi_i^{(A)} | = \sum_{i=1}^{n_A} (1_A + f)\, | \Psi_{iA}^{(A)} \rangle \langle \Psi_{iA}^{(A)} |\, (1_A + f^\dagger). \tag{2.88a}$$

Here

$$\sum_{i=1}^{n_A} | \Psi_{iA}^{(A)} \rangle \langle \Psi_{iA}^{(A)} | = g_A^{-1}, \tag{2.88b}$$

defines an embedding of the inverse of the metric $g_A$ in the $n$-dimensional basis space. Similarly,

$$P_B = \sum_{i=1}^{n_B} | \Psi_i^{(B)} \rangle \langle \Psi_i^{(B)} | = \sum_{i=1}^{n_B} (1_B - f^\dagger)\, | \Psi_{iB}^{(B)} \rangle \langle \Psi_{iB}^{(B)} |\, (1_B - f), \tag{2.89a}$$

where,

$$\sum_{i=1}^{n_B} | \Psi_{iB}^{(B)} \rangle \langle \Psi_{iB}^{(B)} | = g_B^{-1}, \tag{2.89b}$$

is an embedding of the inverse of the metric $g_B$ in the full $n$-dimensional basis space.
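The relations among the three effective operators in $S_A$ can be illustrated with a short numerical sketch. It assumes NumPy, a random hermitian test matrix, and an exact $f$ obtained from eigenvectors; the helper `herm_power` and all variable names are hypothetical choices for this example.

```python
import numpy as np


def herm_power(g, power):
    """g**power for a hermitian positive definite matrix, via its eigendecomposition."""
    w, v = np.linalg.eigh(g)
    return (v * w**power) @ v.conj().T


rng = np.random.default_rng(0)
n, n_a = 6, 2
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (a + a.conj().T) / 2
eps, X = np.linalg.eigh(H)
f = X[n_a:, :n_a] @ np.linalg.inv(X[:n_a, :n_a])          # exact f, eq. (2.3a)

T_A = np.vstack([np.eye(n_a), f])                          # first n_A columns of T
g_A = T_A.conj().T @ T_A                                   # 1_A + f^+ f
H_hat_A = H[:n_a, :n_a] + H[:n_a, n_a:] @ f                # eq. (2.65a)
G_A = T_A.conj().T @ H @ T_A                               # eq. (2.67b)
g_inv_half = herm_power(g_A, -0.5)
H_til_A = g_inv_half @ G_A @ g_inv_half                    # eq. (2.74b)

eig_hat = np.sort(np.linalg.eigvals(H_hat_A).real)
eig_G = np.sort(np.linalg.eigvals(np.linalg.solve(g_A, G_A)).real)  # G_A x = eps g_A x
eig_til, chi = np.linalg.eigh(H_til_A)

# Rebuild full eigenvectors of H from those of H_til_A, eq. (2.87).
Psi = T_A @ g_inv_half @ chi

print(eig_hat, eig_G, eig_til, eps[:n_a])
print(np.allclose(H @ Psi, Psi * eig_til))
```

Of the three operators, only `H_til_A` is hermitian with an ordinary (unit-metric) eigenvalue problem, which is why `eigh` can be applied to it directly.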
2.2.c Relationships With Other Formulations

Many of the quantities defined or derived above have appeared in one form or another in the literature, usually in connection with the calculation of effective operators in a perturbation formalism. The treatment by Friedrichs (1965) of an isolated part of the spectrum of an operator $H$ is particularly interesting in this regard. Several interrelations between the current non-canonical formulation and the more commonly used unitary methods are illustrated by rewriting some of the quantities introduced in that treatment, using the block notation employed here.

Following Friedrichs, the aim here is to obtain an expression for a projection operator $P_\epsilon$ onto a space spanned by a set of eigenvectors which correspond to an isolated part of the spectrum of some perturbed operator $H_\epsilon$. Rather than requiring that the projection $P_\epsilon$ be orthogonal (that is, that the operator $P_\epsilon$ be hermitian), or explicitly idempotent, it is required only that

$$P_\epsilon P_0 = P_\epsilon, \qquad P_0 P_\epsilon = P_0, \tag{2.90}$$

where $P_0$ is a projection onto the corresponding eigenspace of the unperturbed operator $H_0$. These linear conditions, (2.90), imply idempotency,

$$P_\epsilon^2 = P_\epsilon P_0 P_\epsilon = P_\epsilon P_0 = P_\epsilon,$$

and

$$P_0^2 = P_0 P_\epsilon P_0 = P_0 P_\epsilon = P_0,$$

thus verifying that $P_0$ and $P_\epsilon$ are projections. However, by themselves, they do not imply that $P_\epsilon^\dagger = P_\epsilon$, or that $P_0^\dagger = P_0$. Equations (2.90) represent the minimal conditions for $P_\epsilon$ to be a projection, without prescribing any information about the internal structure of its image space.

In a basis adapted to the solution of the zero order problem, that is, with the matrix representation,

$$P_0 = \begin{bmatrix} 1_A & 0 \\ 0 & 0 \end{bmatrix}, \tag{2.91}$$

where the subscript A denotes the space spanned by the zero order eigenvectors of interest, the form of the matrix representation of $P_\epsilon$ is restricted by (2.90) to

$$P_\epsilon = \begin{bmatrix} 1_A & 0 \\ f_\epsilon & 0 \end{bmatrix}, \tag{2.92}$$

where $f_\epsilon$ is a matrix undetermined by (2.90).

It is now possible to define mappings, $U_\epsilon^{+}$ $(\mathcal{S}_0 \to \mathcal{S}_\epsilon)$ and $U_\epsilon^{-}$ $(\mathcal{S}_\epsilon \to \mathcal{S}_0)$, between the space $\mathcal{S}_\epsilon$ spanned by the eigenfunctions of the perturbed operator, and the space $\mathcal{S}_0$ spanned by the corresponding eigenfunctions of the unperturbed operator. In terms of the projections $P_0$ and $P_\epsilon$, these mappings are

$$U_\epsilon^{\pm} = 1 \pm (P_\epsilon - P_0), \tag{2.93}$$

as given by Friedrichs. It then follows that the operator

$$\hat{H}_\epsilon = U_\epsilon^{-} H_\epsilon U_\epsilon^{+} \tag{2.94}$$

is from $\mathcal{S}_0$ to $\mathcal{S}_0$, but has the same spectrum as $H_\epsilon$, the perturbed operator. That is, $\hat{H}_\epsilon$ is an effective operator in the space $\mathcal{S}_0$.

In the matrix notation introduced above, the mappings $U_\epsilon^{\pm}$ are

$$U_\epsilon^{\pm} = \begin{bmatrix} 1_A & 0 \\ \pm f_\epsilon & 1_B \end{bmatrix}, \tag{2.95}$$

where the subscript B denotes the space of all eigenvectors of $H_0$ except those of interest. Thus, in the notation developed in the previous sections,

$$\hat{H}_\epsilon = \begin{bmatrix} \hat{H}_A(f_\epsilon) & H_{AB} \\ D(f_\epsilon) & \hat{H}_B^\dagger(f_\epsilon) \end{bmatrix}. \tag{2.96}$$

It is possible to define a new set of mappings, $U_\epsilon^{\pm 1}$, which map between $\mathcal{S}_0$ and $\mathcal{S}_\epsilon$ and vice versa, and which are unitary up to normalization by the metric $g$ of eq. (2.7), as

$$U_\epsilon^{\pm 1} = \begin{bmatrix} 1_A & \mp f_\epsilon^\dagger \\ \pm f_\epsilon & 1_B \end{bmatrix}. \tag{2.97}$$

Using (2.97) instead of (2.95) in eq. (2.94), a new transformed perturbed operator is obtained,

$$\begin{bmatrix} G_A(f_\epsilon) & D(f_\epsilon)^\dagger \\ D(f_\epsilon) & G_B(f_\epsilon) \end{bmatrix}, \tag{2.98}$$

which is self-adjoint. The operators $G_A$ and $G_B$ are given by eqs. (2.67b) and (2.68b). These results are in accord with the fact that the non-selfadjointness of the operators $\hat{H}_A$ and $\hat{H}_B$, introduced in the previous section, is associated with the fact that the mappings between the two spaces $S_A$ and $S_B$ are not unitary (that is, they do not leave the inner product unchanged).

We point out that the transformed operator (2.98) is block diagonal when the matrix block $f_\epsilon$ of $U_\epsilon^{\pm 1}$ satisfies $D(f_\epsilon) = 0$, eq. (2.16). It is interesting to note that choosing the matrix block $f_\epsilon$ in $U_\epsilon^{\pm}$ of eq. (2.95) to satisfy (2.16) is equivalent to a partial reduction of $H_\epsilon$ toward the upper Hessenberg form, the result of a non-unitary procedure used in numerical matrix diagonalization. However, $\hat{H}_\epsilon$ is not exactly upper Hessenberg even if $D(f_\epsilon)$ vanishes, because the diagonal blocks $\hat{H}_A$ and $\hat{H}_B^\dagger$ are not upper triangular in general.
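The block structure of eq. (2.96) can be confirmed with a few lines of linear algebra. This is a minimal sketch assuming NumPy, a random hermitian matrix standing in for $H_\epsilon$, and an arbitrary matrix standing in for $f_\epsilon$; none of the variable names come from Friedrichs or from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_a = 5, 2
n_b = n - n_a
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (a + a.conj().T) / 2                                    # stands in for H_eps
f = rng.standard_normal((n_b, n_a)) + 1j * rng.standard_normal((n_b, n_a))  # arbitrary f_eps

P_0 = np.zeros((n, n), dtype=complex)
P_0[:n_a, :n_a] = np.eye(n_a)                               # eq. (2.91)
P_eps = P_0.copy()
P_eps[n_a:, :n_a] = f                                       # eq. (2.92)

U_plus = np.eye(n) + (P_eps - P_0)                          # eq. (2.93)
U_minus = np.eye(n) - (P_eps - P_0)
H_hat = U_minus @ H @ U_plus                                # eq. (2.94)

H_AA, H_AB = H[:n_a, :n_a], H[:n_a, n_a:]
H_BA, H_BB = H[n_a:, :n_a], H[n_a:, n_a:]
D = H_BA + H_BB @ f - f @ H_AA - f @ H_AB @ f               # eq. (2.16)

print(np.allclose(P_eps @ P_0, P_eps), np.allclose(P_0 @ P_eps, P_0))   # eq. (2.90)
print(np.allclose(U_minus @ U_plus, np.eye(n)))             # U- is the inverse of U+
print(np.allclose(H_hat[n_a:, :n_a], D))                    # BA block of eq. (2.96)
print(np.allclose(H_hat[:n_a, :n_a], H_AA + H_AB @ f))      # AA block of eq. (2.96)
```

Because `U_minus` is the exact inverse of `U_plus`, the transformation is a similarity transformation, which is why the spectrum is preserved for any `f`, not just for a solution of eq. (2.16).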
Finally, note that Friedrichs introduces an operator (P^Pg)" 1, which is defined only irr the image space S Q of P Q. In the matrix notation used abover P J P € = lk * f j f 6 = ( g i ) A . U.99) Thus* the orthogonal, projection onto Ŝ ,, given by Friedrichs as G A ( f e ) D ( f € ) T D(f €) G B ( f € ) 41. is written in matrix notation* here as ,-1 P = linr fe a-0 (g c) - l o 0 €'A €'A 0 0 a (2.100) which is identical to the projection P A of eq. (2.10). Finally, we also point out that the operator H, defined by symmetrical orthogonalization i n eqs. (2.74) and (2.76), coincides with operators of Sz.-Nagy (1946/47? see also Riesz and Sz.-Nagy, 1955m §136),, Primas (1961, 1963)r and also? Kato (1966, Remark 4.4 of chapter 2)» 42. 2.3 Generalization to a Nonorthonormal Basis Set The formalism presented i n the f i r s t part of t h i s chapter can e a s i l y he generalized to the s i t u a t i o n i n which the basis functions 0^, ( i * 1. •»•>,, ra)* being used, are not orthonormal, Ira t h i s case, the eigenvalue equation; has the form H X - S X E, (2.101a)) with normalization! X'S X - i n „ (2.101b) where the elements of the matrix S are the inner products of the basis functions, S l j 5 - <0 il0 j>. The p a r t i t i o n i n g of the basis set, and of the eigenvectors of H into two sets of dimensions n A and ng:, respectively, i s car r i e d out exactly as before, leading to eq. (2.2), X = B AA BK = T X , where f and h are again formally given by (2.3). However, as a r e s u l t of the more complicated normalization condition (2.101b), the simple r e l a t i o n (2.4) i s now replaced by |;).. (2.102) h = -<sAA • t's^rhs^ • f f s B B / Because of the complexity of (2.102), i t i s convenient here to r e t a i n the notation h and f throughout, rather than eliminate h e n t i r e l y , as was done for the orthonormal case. The metric matrices for the truncated eigenvectors, as i h eq. (2.8), are given by the diagonal blocks of the product 43. 
$T^\dagger S T$,

$$g_A = S_{AA} + S_{AB} f + f^\dagger S_{BA} + f^\dagger S_{BB} f, \tag{2.103a}$$

and

$$g_B = S_{BB} + S_{BA} h + h^\dagger S_{AB} + h^\dagger S_{AA} h. \tag{2.103b}$$

The projection $P_A$ still has the form (2.10), but the projection $P_B$ must here be written,

$$P_B = \begin{pmatrix} h g_B^{-1} h^\dagger & h g_B^{-1} \\ g_B^{-1} h^\dagger & g_B^{-1} \end{pmatrix}. \tag{2.104}$$

These projections are self-adjoint, but now the idempotency conditions become $(P_A S)^2 = P_A S$, and $(P_B S)^2 = P_B S$, as can be verified by direct matrix multiplication. Also, it can easily be shown that $\mathrm{tr}\, P_A S = n_A$, and $\mathrm{tr}\, P_B S = n_B$.

The defining conditions on $f$ and $h$ can be obtained from the analogue of eq. (2.13), namely

$$H T = S T \hat H, \tag{2.105a}$$

where

$$\hat H = \hat X \epsilon \hat X^{-1} \tag{2.105b}$$

is to be block diagonal. The non-selfadjoint effective operators $\hat H_A$ and $\hat H_B$ are given by the diagonal blocks of (2.105a) as

$$\hat H_A = (S_{AA} + S_{AB} f)^{-1} (H_{AA} + H_{AB} f), \tag{2.106}$$

and

$$\hat H_B = (S_{BA} h + S_{BB})^{-1} (H_{BA} h + H_{BB}). \tag{2.107}$$

With these definitions, the eigenvalue equations for these effective operators have exactly the same form as in the orthonormal case.

Alternatively, the inverse matrices in (2.106) and (2.107) could be transferred to the right hand sides of the eigenvalue equations for $\hat H_A$ and $\hat H_B$, respectively, and be regarded as effective overlap matrices, giving eigenvalue equations of the form

$$\bar H_A X_{AA} = \bar S_A X_{AA} \epsilon^{(A)}, \tag{2.108}$$

and

$$\bar H_B X_{BB} = \bar S_B X_{BB} \epsilon^{(B)}, \tag{2.109}$$

where now $\bar H_A$ and $\bar H_B$ are given formally by eq. (2.15), as

$$\bar H_A = H_{AA} + H_{AB} f, \tag{2.110}$$

and

$$\bar H_B = H_{BB} + H_{BA} h. \tag{2.111}$$

The operators $\bar S_A$ and $\bar S_B$ are of the same form in $S$,

$$\bar S_A = S_{AA} + S_{AB} f, \qquad \bar S_B = S_{BA} h + S_{BB}. \tag{2.112}$$

Equations (2.108) and (2.109) are generalized eigenvalue equations for a non-selfadjoint operator.

Using (2.106) and (2.107) in the off-diagonal blocks of eq. (2.105a), the defining equations for $f$ and $h$, analogous to (2.16) and (2.17), are now found to be

$$D(f) = H_{BA} + H_{BB} f - (S_{BA} + S_{BB} f)(S_{AA} + S_{AB} f)^{-1}(H_{AA} + H_{AB} f) = 0, \tag{2.113}$$

and

$$H_{AB} + H_{AA} h - (S_{AA} h + S_{AB})(S_{BA} h + S_{BB})^{-1}(H_{BA} h + H_{BB}) = 0. \tag{2.114}$$

As for an orthonormal basis, the equations for $f$ here are not coupled to those for $h$.
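The relations of this section can be checked numerically on a small example. The following is only a minimal sketch under stated assumptions: the matrices `H` and `S`, the dimensions, and all variable names are hypothetical choices made for illustration, the quantities are taken to be real (so that the adjoint is the transpose), and the generalized eigenproblem is solved by a Cholesky reduction, which is merely one convenient way of obtaining $X$. It verifies eq. (2.102), the spectrum of $\hat H_A$ from eq. (2.106), and the defining condition (2.113).

```python
import numpy as np

rng = np.random.default_rng(0)
n, nA = 6, 2
A, B = slice(0, nA), slice(nA, n)

# Hypothetical test problem: a nearly diagonal symmetric H and an overlap S close to unity.
V = rng.standard_normal((n, n))
W = rng.standard_normal((n, n))
H = np.diag([1.0, 2.0, 5.0, 6.0, 7.0, 8.0]) + 0.05 * (V + V.T)
S = np.eye(n) + 0.025 * (W + W.T)

# Solve H X = S X E with X^T S X = 1 by reducing to a standard symmetric problem.
L = np.linalg.cholesky(S)
Linv = np.linalg.inv(L)
E, Y = np.linalg.eigh(Linv @ H @ Linv.T)
X = Linv.T @ Y

# Partitioning operators taken directly from the eigenvector blocks.
f = X[B, A] @ np.linalg.inv(X[A, A])   # f = X_BA X_AA^{-1}
h = X[A, B] @ np.linalg.inv(X[B, B])   # h = X_AB X_BB^{-1}

# Eq. (2.102): h expressed through f and the overlap blocks.
h_from_f = -np.linalg.solve(S[A, A] + f.T @ S[B, A], S[A, B] + f.T @ S[B, B])

# Eq. (2.106): non-selfadjoint effective operator in the A space.
HA_hat = np.linalg.solve(S[A, A] + S[A, B] @ f, H[A, A] + H[A, B] @ f)
eig_HA = np.sort(np.linalg.eigvals(HA_hat).real)

# Eq. (2.113): defining condition on f, which should vanish for the exact f.
D = H[B, A] + H[B, B] @ f - (S[B, A] + S[B, B] @ f) @ HA_hat

print("max |h - h(f)|     :", np.abs(h - h_from_f).max())
print("max |D(f)|         :", np.abs(D).max())
print("eig(H_A) vs exact  :", eig_HA, E[:nA])
```

In this sketch the eigenvalues of the small $n_A \times n_A$ operator reproduce the lowest $n_A$ eigenvalues of the full generalized problem, which is the sense in which $\hat H_A$ acts as an effective operator.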
Equations (2.113) and (2.114) are not the only useful equations defining $f$ and $h$. An alternative approach is used in some detail in the next chapter.

Self-adjoint effective operators can again be obtained by premultiplying the eigenvalue equation, (2.101a), by $T^\dagger$. The resulting operator, $G_A$, in $S_A$ is given by eq. (2.67b), but the corresponding effective operator in $S_B$ must now be written

$$G_B = H_{BB} + H_{BA} h + h^\dagger H_{AB} + h^\dagger H_{AA} h \tag{2.115a}$$

$$\phantom{G_B} = g_B \hat H_B, \tag{2.115b}$$

with (2.115b) holding only if eqs. (2.113) and (2.114) are satisfied. The eigenvalue equations for these effective operators are as in eqs. (2.67a) and (2.68a), applicable also in an orthonormal basis. The $BA$ block of $T^\dagger H T$ is

$$G_{BA} = H_{BA} + H_{BB} f + h^\dagger (H_{AA} + H_{AB} f), \tag{2.116}$$

which becomes identical to $D(f)$ if $h$ is given by (2.102).

The effective operators $\tilde H_A$ and $\tilde H_B$ are given by eqs. (2.74) and (2.76), respectively, in this case. Their eigenvalue equations are given by (2.73) and (2.75).

Sets of contragredient vectors can be defined here in terms of the columns of $P_A$ and $(1 - P_A S)$, and their reciprocal vectors. These are useful in writing various quantities in a compact manner when a nonorthonormal basis is used. These vectors are considerably more complicated in this case than those given in section 2.1.d. Their detailed examination will be deferred until some motivation has been provided for defining them.

CHAPTER 3

THE EFFECTIVE HAMILTONIANS: PRACTICAL CONSIDERATIONS

"So I prophesied as I was commanded: and as I prophesied, there was a noise, and behold a shaking, and the bones came together, bone to his bone. And when I beheld, lo, the sinews and the flesh came up upon them, and the skin covered them above: but there was no breath in them."

Ezekiel 37: 7,8 (KJV)
3.1 Alternative Formulas

The purpose of this section is to examine some of the interrelationships between the effective operators described in sections 2.2 and 2.3, in somewhat greater detail, especially when $f$ is known only approximately. As has been pointed out before, the two alternative expressions for the operators $G_A$ and $G_B$ given in eqs. (2.67b), (2.68b), and (2.70), are equivalent only if $f$ satisfies $D(f) = 0$. If $f$ satisfies $D(f) = 0$ only approximately, it is possible to distinguish two types of approximate effective operators, $\hat H_A$ and $\hat H_B$, namely,

$$\hat H_A^{(1)} = H_{AA} + H_{AB} f, \tag{3.1a}$$

$$\hat H_B^{(1)} = H_{BB} - H_{BA} f^\dagger, \tag{3.1b}$$

and

$$\hat H_A^{(2)} = g_A^{-1} G_A, \tag{3.2a}$$

$$\hat H_B^{(2)} = g_B^{-1} G_B. \tag{3.2b}$$

These two types of operators are related by

$$\hat H_A^{(2)} = \hat H_A^{(1)} + g_A^{-1} f^\dagger D^{(1)}(f), \tag{3.3a}$$

and

$$\hat H_B^{(2)} = \hat H_B^{(1)} - g_B^{-1} f\, D^{(1)}(f)^\dagger, \tag{3.3b}$$

where the notation $D^{(1)}(f)$ is defined below in eq. (3.4). Thus the two sets of formulas, (3.1) and (3.2), are equivalent only if $D^{(1)}(f) = 0$. In effect, $\hat H_A^{(2)}$ and $\hat H_B^{(2)}$ here are generalizations of the Rayleigh quotient $\langle \psi | H | \psi \rangle / \langle \psi | \psi \rangle$ for a single eigenfunction. The operators $\hat H_A^{(1)}$ and $\hat H_B^{(1)}$ correspond to the use of an intermediate normalization, involving writing the expectation value of $H$ as $\langle \phi | H | \psi \rangle / \langle \phi | \psi \rangle$, where $\phi$ is some arbitrary reference function. The Rayleigh quotient is accurate to second order in the error in $\psi$, while this intermediate normalization is accurate only to first order.

In terms of the operators $\hat H_A^{(1)}$ and $\hat H_A^{(2)}$, eq. (2.16) can be written in one of two forms,

$$D^{(1)}(f) = H_{BA} + H_{BB} f - f \hat H_A^{(1)} = 0, \tag{3.4}$$

and

$$D^{(2)}(f) = H_{BA} + H_{BB} f - f \hat H_A^{(2)} = 0. \tag{3.5}$$

These two equations are equivalent in that they both have the same solutions. However, their detailed forms are quite different away from this solution.
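Since relations of this kind are algebraic identities, they hold for any trial $f$ and can be checked numerically. The sketch below is only an illustration under stated assumptions: a random real symmetric `H`, an arbitrary trial `f` that does not solve $D(f) = 0$, and hypothetical variable names. It checks that $\hat H_A^{(2)} - \hat H_A^{(1)} = g_A^{-1} f^\dagger D^{(1)}(f)$, and that in an orthonormal basis $D^{(2)}(f) = g_B^{-1} D^{(1)}(f)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, nA = 5, 2
A, B = slice(0, nA), slice(nA, n)

M = rng.standard_normal((n, n))
H = (M + M.T) / 2                              # real symmetric test matrix
f = 0.3 * rng.standard_normal((n - nA, nA))    # arbitrary trial f, NOT a solution of D(f) = 0

gA = np.eye(nA) + f.T @ f                      # metric in the A space
gB = np.eye(n - nA) + f @ f.T                  # metric in the B space

H1 = H[A, A] + H[A, B] @ f                               # "intermediate normalization" form
GA = H1 + f.T @ (H[B, A] + H[B, B] @ f)                  # A block of T^T H T
H2 = np.linalg.solve(gA, GA)                             # generalized Rayleigh quotient form

D1 = H[B, A] + H[B, B] @ f - f @ H1
D2 = H[B, A] + H[B, B] @ f - f @ H2

err_H = np.abs(H2 - (H1 + np.linalg.solve(gA, f.T @ D1))).max()
err_D = np.abs(D2 - np.linalg.solve(gB, D1)).max()
print("residual of the H(2)-H(1) relation:", err_H)
print("residual of the D(2)-D(1) relation:", err_D)
```

Both residuals should be at rounding level even though $D^{(1)}(f)$ itself is far from zero for this trial $f$.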
Equation (3.5) can be obtained directly by requiring that $T^{-1} H T$, rather than $T^\dagger H T$, be block diagonal, the latter being implicit in the derivation of (2.16). It can be shown that the relationship between these two quantities is given by

$$D^{(2)}(f) = g_B^{-1} D^{(1)}(f), \tag{3.6}$$

in the case of an orthonormal basis.

It is also possible to distinguish between three different formulas for calculating operators of the type designated $\tilde H_A$, depending on which form of eq. (2.74) and also which form of $\hat H_A$ is used. Only one such form is useful, and for practical purposes, is given by either eq. (2.74b) or by

$$\tilde H_A = g_A^{1/2} \hat H_A^{(2)} g_A^{-1/2}. \tag{3.7}$$

In the case of a nonorthonormal basis, the situation is considerably more complicated. Because the orthogonality condition, (2.101b), is no longer simple, it is necessary to allow for the possibility that if $f$ and $h$ do not exactly satisfy eqs. (2.113) and (2.114), they may also fail to satisfy eq. (2.102). As a result, the off-diagonal blocks of both the matrices,

$$G = T^\dagger H T = \begin{pmatrix} G_A & G_{AB} \\ G_{BA} & G_B \end{pmatrix}, \tag{3.8a}$$

and

$$g = T^\dagger S T = \begin{pmatrix} g_A & g_{AB} \\ g_{BA} & g_B \end{pmatrix}, \tag{3.8b}$$

must be considered to be potentially nonzero in what follows.

Two pairs of operators $\hat H_A$ and $\hat H_B$ are again defined in this case,

$$\hat H_A^{(1)} = (S_{AA} + S_{AB} f)^{-1} (H_{AA} + H_{AB} f), \tag{3.9a}$$

and

$$\hat H_B^{(1)} = (S_{BA} h + S_{BB})^{-1} (H_{BB} + H_{BA} h), \tag{3.9b}$$

identical to eqs. (2.106) and (2.107), and

$$\hat H_A^{(2)} = g_A^{-1} G_A, \tag{3.10a}$$

and

$$\hat H_B^{(2)} = g_B^{-1} G_B. \tag{3.10b}$$

These two sets of operators are shown in Appendix 1 to be related by the equations,

$$\hat H_A^{(2)} = \hat H_A^{(1)} + g_A^{-1} f^\dagger D^{(1)}(f), \tag{3.11}$$

and

$$\hat H_B^{(2)} = \hat H_B^{(1)} + g_B^{-1} h^\dagger D^{(1)}(h), \tag{3.12}$$

where $D^{(1)}(f)$, given formally by eq. (3.4), is the quantity in eq. (2.113) defining $f$, and $D^{(1)}(h)$ is the corresponding quantity in eq. (2.114) defining $h$.
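The relation between the two forms of $\hat H_A$ in a nonorthonormal basis can be checked in the same spirit. The sketch below rests on assumptions made only for illustration: a random real symmetric `H`, a positive definite overlap `S`, an arbitrary trial `f`, and hypothetical variable names. It verifies numerically that $g_A^{-1} G_A$ and $(S_{AA} + S_{AB} f)^{-1}(H_{AA} + H_{AB} f)$ differ by $g_A^{-1} f^\dagger D^{(1)}(f)$, with $D^{(1)}(f)$ taken as the left-hand side of eq. (2.113).

```python
import numpy as np

rng = np.random.default_rng(4)
n, nA = 5, 2
A, B = slice(0, nA), slice(nA, n)

M = rng.standard_normal((n, n))
W = rng.standard_normal((n, n))
H = (M + M.T) / 2                               # real symmetric test matrix
S = np.eye(n) + 0.1 * (W @ W.T)                 # symmetric positive definite overlap
f = 0.3 * rng.standard_normal((n - nA, nA))     # arbitrary trial f, not an exact solution

SA_bar = S[A, A] + S[A, B] @ f                  # effective overlap S_AA + S_AB f
HA_bar = H[A, A] + H[A, B] @ f                  # H_AA + H_AB f
H1 = np.linalg.solve(SA_bar, HA_bar)            # type-1 operator

gA = SA_bar + f.T @ (S[B, A] + S[B, B] @ f)     # A block of T^T S T
GA = HA_bar + f.T @ (H[B, A] + H[B, B] @ f)     # A block of T^T H T
H2 = np.linalg.solve(gA, GA)                    # type-2 operator

D1 = H[B, A] + H[B, B] @ f - (S[B, A] + S[B, B] @ f) @ H1

err = np.abs(H2 - (H1 + np.linalg.solve(gA, f.T @ D1))).max()
print("residual of the nonorthonormal H(2)-H(1) relation:", err)
```

Only the blocks involving $f$ enter here, so the check does not depend on any particular choice of $h$.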
Thus, these two pairs of operators are identical only when both $D^{(1)}(f) = 0$ and the corresponding quantity for $h$, the left-hand side of eq. (2.114), vanishes.

The two operators $\hat H_A^{(1)}$ and $\hat H_A^{(2)}$ give rise to two different defining conditions on $f$, given by

$$D^{(1)}(f) = H_{BA} + H_{BB} f - (S_{BA} + S_{BB} f) \hat H_A^{(1)}, \tag{3.13}$$

and

$$D^{(2)}(f) = H_{BA} + H_{BB} f - (S_{BA} + S_{BB} f) \hat H_A^{(2)}. \tag{3.14}$$

In this case, the relationship between the two quantities $D^{(1)}(f)$ and $D^{(2)}(f)$ is

$$D^{(2)}(f) = \left[ S_{BB} + S_{BA} h - (S_{BA} + S_{BB} f) g_A^{-1} g_{AB} \right] \left( g_B - g_{BA} g_A^{-1} g_{AB} \right)^{-1} \left( 1_B - g_{BA} g_A^{-1} f^\dagger \right) D^{(1)}(f). \tag{3.15a}$$

When $g_{BA} = 0$, this reduces to

$$D^{(2)}(f) = (S_{BB} + S_{BA} h)\, g_B^{-1} D^{(1)}(f), \tag{3.15b}$$

or

$$D^{(2)}(f) = \left[ 1_B - (S_{BA} + S_{BB} f) g_A^{-1} f^\dagger \right] D^{(1)}(f). \tag{3.15c}$$

The derivations of eqs. (3.12) - (3.15) are quite long, and have been outlined in Appendix 1.

3.2 Implications of Inexact Solutions

Consider an approximate solution, $f^{\mathrm{approx}}$, to eq. (2.16), given by

$$f^{\mathrm{approx}} = f + \delta f, \tag{3.16}$$

where $f$ is an exact solution of (2.16). If the effective operators $\hat H_A$, $G_A$, and $\tilde H_A$ are calculated using $f^{\mathrm{approx}}$, the error $\delta f$ will result in errors in the effective operators at some order in $\delta f$. Starting with the operator $G_A$, and writing

$$G_A^{\mathrm{approx}} = G_A + \delta G_A, \tag{3.17a}$$

where $G_A$ is exact, it is easily verified, using (2.16), that

$$\delta G_A = (\delta f^\dagger f) \hat H_A + \hat H_A^\dagger (f^\dagger \delta f) + O(\delta^2), \tag{3.17b}$$

to first order in the errors. Here the operator $\hat H_A$ is exact. Thus the error in $G_A^{\mathrm{approx}}$ is first order in $\delta f$. Similarly, from eq. (2.70),

$$G_A + \delta G_A = (g_A + \delta g_A)(\hat H_A^{(2)} + \delta \hat H_A^{(2)}),$$

or

$$\delta \hat H_A^{(2)} = g_A^{-1} \left[ \delta G_A - \delta g_A \hat H_A \right] + O(\delta^2). \tag{3.18}$$

Since

$$\delta g_A = \delta f^\dagger f + f^\dagger \delta f + O(\delta^2), \tag{3.19}$$

eq. (3.18) then yields

$$\delta \hat H_A^{(2)} = g_A^{-1} \left[ \hat H_A^\dagger (f^\dagger \delta f) - (f^\dagger \delta f) \hat H_A \right] + O(\delta^2). \tag{3.20}$$

On the other hand, from (3.1a), one has

$$\delta \hat H_A^{(1)} = H_{AB}\, \delta f, \tag{3.21}$$

exactly. Except for $n_A = 1$, both these errors, $\delta \hat H_A^{(1)}$ and $\delta \hat H_A^{(2)}$, are first order in $\delta f$.
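The order of these errors can be illustrated numerically in an orthonormal basis. The following sketch is an illustration only, under assumptions not taken from the text: a small nearly diagonal real symmetric `H`, an exact `f` built from its eigenvectors, a random perturbation direction, and hypothetical function names. It checks that the error in the type-1 operator equals $H_{AB}\,\delta f$ exactly, and that the error in the type-2 operator, after the first order term $g_A^{-1}[\hat H_A^\dagger (f^\dagger \delta f) - (f^\dagger \delta f)\hat H_A]$ is subtracted, shrinks by about a factor of four when $\delta f$ is halved, as expected for a second order remainder.

```python
import numpy as np

rng = np.random.default_rng(3)
n, nA = 5, 2
A, B = slice(0, nA), slice(nA, n)

V = rng.standard_normal((n, n))
H = np.diag([1.0, 2.0, 5.0, 6.0, 7.0]) + 0.05 * (V + V.T)

# Exact f from the eigenvectors belonging to the lowest nA eigenvalues.
_, X = np.linalg.eigh(H)
f = X[B, A] @ np.linalg.inv(X[A, A])
gA = np.eye(nA) + f.T @ f
HA = H[A, A] + H[A, B] @ f                      # exact effective operator

def H_type1(ft):
    """Type-1 operator H_AA + H_AB f for a trial f."""
    return H[A, A] + H[A, B] @ ft

def H_type2(ft):
    """Type-2 operator g_A^{-1} G_A for a trial f."""
    g = np.eye(nA) + ft.T @ ft
    G = H[A, A] + H[A, B] @ ft + ft.T @ H[B, A] + ft.T @ H[B, B] @ ft
    return np.linalg.solve(g, G)

direction = rng.standard_normal((n - nA, nA))

def remainder(t):
    """Norm of the type-2 error left after removing its first order term."""
    df = t * direction
    first_order = np.linalg.solve(gA, HA.T @ (f.T @ df) - (f.T @ df) @ HA)
    return np.linalg.norm(H_type2(f + df) - HA - first_order)

t = 1e-3
df = t * direction
type1_exact = np.allclose(H_type1(f + df) - HA, H[A, B] @ df, atol=1e-14)
ratio = remainder(t) / remainder(t / 2)
print("type-1 error equals H_AB df:", type1_exact)
print("remainder ratio on halving df (about 4 expected):", ratio)
```

The choice of a nearly diagonal `H` simply keeps the block $X_{AA}$ well conditioned, so that the exact $f$ exists and is small.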
However, (3.20) consists of the difference of two very similar terms, which actually vanishes for $n_A = 1$. This corresponds to the familiar property that the error in an eigenvalue calculated as the Rayleigh quotient of an approximate eigenvector is second order in the error in the eigenvector. For $n_A > 1$, the first order error, (3.20), does not vanish in general, but, as will be shown presently, the first order correction to the eigenvalues does vanish.

Using eq. (3.20), and the result

$$\delta (g_A^{-1/2}) = -g_A^{-1/2}\, \delta (g_A^{1/2})\, g_A^{-1/2} + O(\delta^2), \tag{3.22}$$

it is easy to show from (3.7) that

$$\delta \tilde H_A = \left[ \tilde H_A,\; g_A^{-1/2} f^\dagger \delta f\, g_A^{-1/2} - \delta (g_A^{1/2})\, g_A^{-1/2} \right] + O(\delta^2). \tag{3.23}$$

This also vanishes in first order when $n_A = 1$, but is in general non-vanishing when $n_A > 1$.

For a non-orthonormal basis, eqs. (3.17b) and (3.23) remain the same because the formula for the operators $G_A$ and $\tilde H_A$ used in deriving these results does not contain the overlap matrix explicitly. However, for the two operators $\hat H_A^{(1)}$ and $\hat H_A^{(2)}$, the form of the errors caused by an error in $f$ does differ from (3.20) and (3.21). From eq. (3.9a),

$$\delta \hat H_A^{(1)} = (S_{AA} + S_{AB} f)^{-1} \left( H_{AB}\, \delta f - S_{AB}\, \delta f\, \hat H_A^{(1)} \right) + O(\delta^2). \tag{3.24}$$

From (3.10a), and using the same procedure as was used to obtain eq. (3.20), one obtains,

$$\delta \hat H_A^{(2)} = g_A^{-1} \left[ \hat H_A^\dagger (S_{AB} + f^\dagger S_{BB})\, \delta f - (S_{AB} + f^\dagger S_{BB})\, \delta f\, \hat H_A \right] + O(\delta^2). \tag{3.25}$$

Except for (3.25) vanishing when $n_A = 1$, both (3.24) and (3.25) are first order in $\delta f$. Although the first order term in (3.24) now involves a difference between two terms, the two terms are not very similar, as is the case in (3.25), and therefore $\hat H_A^{(2)}$ is still expected to be inherently more accurate for a non-orthonormal basis than $\hat H_A^{(1)}$.

The first order variation in the eigenvalues of $G_A$ due to some variations $\delta G_A$, $\delta g_A$, in the operators $G_A$ and $g_A$, respectively, is given by

$$\delta \epsilon_i = \langle \psi_i^{(A)} | \delta G_A - \epsilon_i\, \delta g_A | \psi_i^{(A)} \rangle + O(\delta^2). \tag{3.26}$$

The functions $\psi_i^{(A)}$ are the eigenfunctions of the exact operator $G_A$, and eq. (3.26) follows directly from the eigenvalue equation, (2.85), for $G_A$. But upon substitution of (3.17b) and (3.19) into (3.26), and using the eigenvalue equation (2.84a) for $\hat H_A$, it is seen that $\delta \epsilon_i$ of (3.26) vanishes in the first order in $\delta f$.

In the case of the operator $\hat H_A^{(1)}$, the first order error in the eigenvalues is given by

$$\delta \epsilon_i^{(1)} = \langle \psi_i^{(A)} | g_A\, \delta \hat H_A^{(1)} | \psi_i^{(A)} \rangle + O(\delta^2) = \langle \psi_i^{(A)} | g_A H_{AB}\, \delta f | \psi_i^{(A)} \rangle + O(\delta^2), \tag{3.27}$$

which clearly does not vanish in general. On the other hand, one has,

$$\delta \epsilon_i^{(2)} = \langle \psi_i^{(A)} | g_A\, \delta \hat H_A^{(2)} | \psi_i^{(A)} \rangle + O(\delta^2) = \langle \psi_i^{(A)} | \hat H_A^\dagger (f^\dagger \delta f) - (f^\dagger \delta f) \hat H_A | \psi_i^{(A)} \rangle + O(\delta^2) = O(\delta^2). \tag{3.28}$$

The operator $\hat H_A^{(2)}$ is thus inherently more accurate than the operator $\hat H_A^{(1)}$, when evaluated using an inexact $f$.

The first order errors in the eigenvalues of the operators $\tilde H_A$ are just given as the expectation values of the first order error operator, (3.23), with respect to the eigenfunctions of the exact operator $\tilde H_A$, defined in eqs. (2.86b). Since $\delta \tilde H_A$ can be written as a commutator to first order, its expectation value will be zero, and therefore,

$$\delta \epsilon_i = \langle \chi_i^{(A)} | \delta \tilde H_A | \chi_i^{(A)} \rangle + O(\delta^2) = O(\delta^2). \tag{3.29}$$

For a nonorthonormal basis, only the expectation values of $\delta \hat H_A^{(1)}$ and $\delta \hat H_A^{(2)}$ are different in nature from those given above for an orthonormal basis. From (3.24), one obtains,

$$\delta \epsilon_i^{(1)} = \langle \psi_i^{(A)} | g_A (S_{AA} + S_{AB} f)^{-1} ( H_{AB} - \epsilon_i S_{AB} )\, \delta f | \psi_i^{(A)} \rangle + O(\delta^2), \tag{3.30}$$

which does not vanish in first order in $\delta f$ under any obvious general conditions. However, from (3.25),

$$\delta \epsilon_i^{(2)} = \langle \psi_i^{(A)} | g_A\, \delta \hat H_A^{(2)} | \psi_i^{(A)} \rangle + O(\delta^2) = \langle \psi_i^{(A)} | \hat H_A^\dagger (S_{AB} + f^\dagger S_{BB})\, \delta f - (S_{AB} + f^\dagger S_{BB})\, \delta f\, \hat H_A | \psi_i^{(A)} \rangle + O(\delta^2) = O(\delta^2). \tag{3.31}$$

Therefore the eigenvalues of $\hat H_A^{(2)}$ in a non-orthonormal basis are affected only in second order by errors in $f$.

3.3 Perturbation Theory For $\hat H_A$, $G_A$, and $\tilde H_A$

The purpose of this section is to outline perturbation formulas for the eigenvalues and eigenvectors of the effective operators $\hat H_A$, $G_A$, and $\tilde H_A$, defined in the space $S_A$. For the most part, the formulas presented below are not new; however, those for $\hat H_A$ and $G_A$ are not well known. These formulas are necessary if the eigenvalues and eigenvectors of the full operator are to be calculated via a perturbation procedure based on this partitioning formalism. The formulas for $\hat H_A$ will be derived in some detail, because they are unusual in that $\hat H_A$ is a non-selfadjoint operator (but with real eigenvalues). Those for $G_A$ and $\tilde H_A$ will then just be summarized.

3.3.a The $\hat H_A$ Scheme

The eigenvalue equation in this case is written,

$$\hat H_A \psi_i = \epsilon_i \psi_i, \qquad \langle \psi_i | g_A | \psi_j \rangle = \delta_{ij}, \tag{3.32}$$

where the subscripts and superscripts "A" on the $\epsilon_i$ and the $\psi_i$ have been suppressed, and will be throughout this section. The metric matrix $g_A$ is selfadjoint. We have

$$\hat H_A = \sum_{n=0}^{\infty} \hat H_A^{(n)}, \quad g_A = \sum_{n=0}^{\infty} g_A^{(n)}, \quad \epsilon_i = \sum_{n=0}^{\infty} \epsilon_i^{(n)}, \quad \psi_i = \sum_{n=0}^{\infty} \psi_i^{(n)}, \tag{3.33}$$

where the superscript is to indicate the order of the term in the perturbation parameter (or parameters), and the solution of the zero order eigenvalue equation

$$\hat H_A^{(0)} \psi_i^{(0)} = \epsilon_i^{(0)} \psi_i^{(0)}, \qquad \langle \psi_i^{(0)} | g_A^{(0)} | \psi_j^{(0)} \rangle = \delta_{ij}, \tag{3.34}$$

is known. The terms in the series for $\hat H_A$ and $g_A$ are given, and the terms in the series for $\epsilon_i$ and $\psi_i$ are to be calculated.

Consider the first order term of the eigenvalue equation and normalization condition (3.32), given by

$$(\hat H_A^{(1)} - \epsilon_i^{(1)}) \psi_i^{(0)} + (\hat H_A^{(0)} - \epsilon_i^{(0)}) \psi_i^{(1)} = 0, \tag{3.35a}$$

and

$$2 \langle \psi_i^{(0)} | g_A^{(0)} | \psi_i^{(1)} \rangle + \langle \psi_i^{(0)} | g_A^{(1)} | \psi_i^{(0)} \rangle = 0, \tag{3.35b}$$

when all quantities are real. The first order eigenvalue is obtained by premultiplying (3.35a) by $(g_A^{(0)} \psi_i^{(0)})^\dagger$, and integrating, to give

$$\epsilon_i^{(1)} = \langle \psi_i^{(0)} | g_A^{(0)} \hat H_A^{(1)} | \psi_i^{(0)} \rangle. \tag{3.36}$$

No contribution is obtained from the second term of (3.35a), since from (3.34),

$$\langle \psi_i^{(0)} | g_A^{(0)} \hat H_A^{(0)} = \epsilon_i^{(0)} \langle \psi_i^{(0)} | g_A^{(0)},$$

cancelling the rest of the term. The first order wavefunction is obtained in a similar manner.
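Premultiplying (3.35a) by $(g_A^{(0)} \psi_k^{(0)})^\dagger$, and integrating, gives

$$(\epsilon_i^{(0)} - \epsilon_k^{(0)}) \langle \psi_k^{(0)} | g_A^{(0)} | \psi_i^{(1)} \rangle = \langle \psi_k^{(0)} | g_A^{(0)} (\hat H_A^{(1)} - \epsilon_i^{(1)}) | \psi_i^{(0)} \rangle.$$

Writing $\psi_i^{(1)}$ here as

$$\psi_i^{(1)} = \sum_k a_{ki}^{(1)} \psi_k^{(0)}, \tag{3.37}$$

then gives

$$a_{ki}^{(1)} = \frac{\langle \psi_k^{(0)} | g_A^{(0)} \hat H_A^{(1)} | \psi_i^{(0)} \rangle}{\epsilon_i^{(0)} - \epsilon_k^{(0)}}, \qquad (k \ne i). \tag{3.38a}$$

The coefficient $a_{ii}^{(1)}$ is obtained from eq. (3.35b) as

$$a_{ii}^{(1)} = -\tfrac{1}{2} \langle \psi_i^{(0)} | g_A^{(1)} | \psi_i^{(0)} \rangle, \tag{3.38b}$$

and thus

$$\psi_i^{(1)} = \sum_{k \ne i} \frac{\langle \psi_k^{(0)} | g_A^{(0)} \hat H_A^{(1)} | \psi_i^{(0)} \rangle}{\epsilon_i^{(0)} - \epsilon_k^{(0)}}\, \psi_k^{(0)} - \tfrac{1}{2} \langle \psi_i^{(0)} | g_A^{(1)} | \psi_i^{(0)} \rangle\, \psi_i^{(0)}. \tag{3.39}$$

The second order terms of eqs. (3.32) are

$$(\hat H_A^{(2)} - \epsilon_i^{(2)}) \psi_i^{(0)} + (\hat H_A^{(1)} - \epsilon_i^{(1)}) \psi_i^{(1)} + (\hat H_A^{(0)} - \epsilon_i^{(0)}) \psi_i^{(2)} = 0, \tag{3.40a}$$

$$2 \langle \psi_i^{(2)} | g_A^{(0)} | \psi_i^{(0)} \rangle + 2 \langle \psi_i^{(1)} | g_A^{(1)} | \psi_i^{(0)} \rangle + \langle \psi_i^{(1)} | g_A^{(0)} | \psi_i^{(1)} \rangle + \langle \psi_i^{(0)} | g_A^{(2)} | \psi_i^{(0)} \rangle = 0. \tag{3.40b}$$

The approach here is the same as that in the first order case. Premultiplication of (3.40a) by $(g_A^{(0)} \psi_i^{(0)})^\dagger$, and integration, leads to

$$\epsilon_i^{(2)} = \langle \psi_i^{(0)} | g_A^{(0)} \hat H_A^{(2)} | \psi_i^{(0)} \rangle + \langle \psi_i^{(0)} | g_A^{(0)} (\hat H_A^{(1)} - \epsilon_i^{(1)}) | \psi_i^{(1)} \rangle. \tag{3.41a}$$

Substitution of eqs. (3.36) and (3.39) into (3.41a) to eliminate $\epsilon_i^{(1)}$ and $\psi_i^{(1)}$ from the latter results in a formula somewhat reminiscent of the usual Rayleigh-Schrödinger second order energy formula,

$$\epsilon_i^{(2)} = \sum_{j \ne i} \frac{\langle \psi_i^{(0)} | g_A^{(0)} \hat H_A^{(1)} | \psi_j^{(0)} \rangle \langle \psi_j^{(0)} | g_A^{(0)} \hat H_A^{(1)} | \psi_i^{(0)} \rangle}{\epsilon_i^{(0)} - \epsilon_j^{(0)}} + \langle \psi_i^{(0)} | g_A^{(0)} \hat H_A^{(2)} | \psi_i^{(0)} \rangle. \tag{3.41b}$$

The second order wavefunction $\psi_i^{(2)}$ is expanded in terms of the zero order wavefunctions,

$$\psi_i^{(2)} = \sum_k a_{ki}^{(2)} \psi_k^{(0)}, \tag{3.42}$$

and the coefficients $a_{ki}^{(2)}$ determined from eqs. (3.40a,b) in the same manner as was used in the first order case. The final result is

$$\psi_i^{(2)} = \sum_{k \ne i} \frac{\langle \psi_k^{(0)} | g_A^{(0)} \hat H_A^{(2)} | \psi_i^{(0)} \rangle + \langle \psi_k^{(0)} | g_A^{(0)} (\hat H_A^{(1)} - \epsilon_i^{(1)}) | \psi_i^{(1)} \rangle}{\epsilon_i^{(0)} - \epsilon_k^{(0)}}\, \psi_k^{(0)} - \tfrac{1}{2} \left[ \langle \psi_i^{(0)} | g_A^{(2)} | \psi_i^{(0)} \rangle + 2 \langle \psi_i^{(1)} | g_A^{(1)} | \psi_i^{(0)} \rangle + \langle \psi_i^{(1)} | g_A^{(0)} | \psi_i^{(1)} \rangle \right] \psi_i^{(0)}. \tag{3.43}$$

The pattern is now clear. The $n$th order terms of eq. (3.32) can be written as

$$\sum_{j=0}^{n} (\hat H_A^{(j)} - \epsilon_i^{(j)}) \psi_i^{(n-j)} = 0, \tag{3.44a}$$

and

$$\sum_{j=0}^{n} \sum_{k=0}^{n-j} \langle \psi_i^{(j)} | g_A^{(n-j-k)} | \psi_i^{(k)} \rangle = 0. \tag{3.44b}$$

Premultiplying (3.44a) by $(g_A^{(0)} \psi_i^{(0)})^\dagger$, and integrating, gives

$$\epsilon_i^{(n)} = \langle \psi_i^{(0)} | g_A^{(0)} \hat H_A^{(n)} | \psi_i^{(0)} \rangle + \sum_{j=1}^{n-1} \langle \psi_i^{(0)} | g_A^{(0)} (\hat H_A^{(j)} - \epsilon_i^{(j)}) | \psi_i^{(n-j)} \rangle. \tag{3.45}$$

The $n$th order wavefunction is expanded as a linear combination of the zero order wavefunctions, and the expansion coefficients are deduced from eqs. (3.44a,b). The result is

$$\psi_i^{(n)} = \sum_{k \ne i} \frac{\langle \psi_k^{(0)} | g_A^{(0)} \hat H_A^{(n)} | \psi_i^{(0)} \rangle + \sum_{j=1}^{n-1} \langle \psi_k^{(0)} | g_A^{(0)} (\hat H_A^{(j)} - \epsilon_i^{(j)}) | \psi_i^{(n-j)} \rangle}{\epsilon_i^{(0)} - \epsilon_k^{(0)}}\, \psi_k^{(0)} - \tfrac{1}{2} \Bigg[ \sum_{j=0}^{n-1} \sum_{\substack{k=0 \\ k \ne n}}^{n-j} \langle \psi_i^{(j)} | g_A^{(n-j-k)} | \psi_i^{(k)} \rangle \Bigg] \psi_i^{(0)}. \tag{3.46}$$

No attempt has been made in any of these formulas to eliminate higher order quantities in terms of lower order ones, because that leads to computationally less efficient formulas. All formulas above are given in terms of the eigenfunctions of $\hat H_A$. To obtain formulas applicable in a matrix notation, the functions $\psi_i^{(j)}$ are replaced by column vectors $x_i^{(j)}$, and all operators by their matrix representations.

3.3.b The $G_A$ Scheme

The eigenvalue equation in this case is written,

$$G_A \psi_i = \epsilon_i g_A \psi_i, \qquad \langle \psi_i | g_A | \psi_j \rangle = \delta_{ij}. \tag{3.47}$$

Both $G_A$ and $g_A$ are selfadjoint. We have

$$G_A = \sum_{n=0}^{\infty} G_A^{(n)}, \quad g_A = \sum_{n=0}^{\infty} g_A^{(n)}, \quad \epsilon_i = \sum_{n=0}^{\infty} \epsilon_i^{(n)}, \quad \psi_i = \sum_{n=0}^{\infty} \psi_i^{(n)}, \tag{3.48}$$

where the $G_A^{(n)}$ and the $g_A^{(n)}$ are given, and the $\epsilon_i^{(n)}$ and the $\psi_i^{(n)}$ are to be calculated. It is assumed that the zero order eigenvalue equation,

$$G_A^{(0)} \psi_i^{(0)} = \epsilon_i^{(0)} g_A^{(0)} \psi_i^{(0)}, \qquad \langle \psi_i^{(0)} | g_A^{(0)} | \psi_j^{(0)} \rangle = \delta_{ij}, \tag{3.49}$$

has been solved. The $n$th order term of (3.47) is then

$$\sum_{j=0}^{n} \Big[ G_A^{(j)} - \sum_{k=0}^{j} \epsilon_i^{(k)} g_A^{(j-k)} \Big] \psi_i^{(n-j)} = 0, \tag{3.50a}$$

and

$$\sum_{j=0}^{n} \sum_{k=0}^{n-j} \langle \psi_i^{(j)} | g_A^{(n-j-k)} | \psi_i^{(k)} \rangle = 0. \tag{3.50b}$$

The $n$th order eigenvalue can be obtained by premultiplying (3.50a) by $\psi_i^{(0)\dagger}$ and integrating to give

$$\epsilon_i^{(n)} = \sum_{j=1}^{n-1} \langle \psi_i^{(0)} | G_A^{(j)} - \sum_{k=0}^{j} \epsilon_i^{(k)} g_A^{(j-k)} | \psi_i^{(n-j)} \rangle + \langle \psi_i^{(0)} | G_A^{(n)} - \sum_{k=0}^{n-1} \epsilon_i^{(k)} g_A^{(n-k)} | \psi_i^{(0)} \rangle. \tag{3.51}$$

The $n$th order wavefunction is expanded in terms of the $\psi_k^{(0)}$, and the expansion coefficients deduced from eqs. (3.50a,b). The final result is

$$\psi_i^{(n)} = \sum_{k \ne i} \sum_{j=1}^{n} \frac{\langle \psi_k^{(0)} | G_A^{(j)} - \sum_{l=0}^{j} \epsilon_i^{(l)} g_A^{(j-l)} | \psi_i^{(n-j)} \rangle}{\epsilon_i^{(0)} - \epsilon_k^{(0)}}\, \psi_k^{(0)} - \tfrac{1}{2} \Bigg[ \sum_{j=0}^{n-1} \sum_{\substack{k=0 \\ k \ne n}}^{n-j} \langle \psi_i^{(j)} | g_A^{(n-j-k)} | \psi_i^{(k)} \rangle \Bigg] \psi_i^{(0)}. \tag{3.52}$$

These formulas can be shown to be equivalent to those derived in the $\hat H_A$ scheme, by expressing the $G_A^{(n)}$ in terms of the $g_A^{(j)}$ and $\hat H_A^{(j)}$, according to eq. (2.70). These results also agree with those given by A. Imamura (1968), in a different notation, to second order. The first order formulas here are

$$\epsilon_i^{(1)} = \langle \psi_i^{(0)} | G_A^{(1)} - \epsilon_i^{(0)} g_A^{(1)} | \psi_i^{(0)} \rangle, \tag{3.53a}$$

$$\psi_i^{(1)} = \sum_{j \ne i} \frac{\langle \psi_j^{(0)} | G_A^{(1)} - \epsilon_i^{(0)} g_A^{(1)} | \psi_i^{(0)} \rangle}{\epsilon_i^{(0)} - \epsilon_j^{(0)}}\, \psi_j^{(0)} - \tfrac{1}{2} \langle \psi_i^{(0)} | g_A^{(1)} | \psi_i^{(0)} \rangle\, \psi_i^{(0)}. \tag{3.53b}$$

The second order formulas are

$$\epsilon_i^{(2)} = \sum_{j \ne i} \frac{\left| \langle \psi_j^{(0)} | G_A^{(1)} - \epsilon_i^{(0)} g_A^{(1)} | \psi_i^{(0)} \rangle \right|^2}{\epsilon_i^{(0)} - \epsilon_j^{(0)}} + \langle \psi_i^{(0)} | G_A^{(2)} - \epsilon_i^{(0)} g_A^{(2)} - \epsilon_i^{(1)} g_A^{(1)} | \psi_i^{(0)} \rangle, \tag{3.54a}$$

and

$$\psi_i^{(2)} = \sum_{k \ne i} (\epsilon_i^{(0)} - \epsilon_k^{(0)})^{-1} \left[ \langle \psi_k^{(0)} | G_A^{(1)} - \epsilon_i^{(0)} g_A^{(1)} - \epsilon_i^{(1)} g_A^{(0)} | \psi_i^{(1)} \rangle + \langle \psi_k^{(0)} | G_A^{(2)} - \epsilon_i^{(0)} g_A^{(2)} - \epsilon_i^{(1)} g_A^{(1)} | \psi_i^{(0)} \rangle \right] \psi_k^{(0)} - \tfrac{1}{2} \left[ \langle \psi_i^{(0)} | g_A^{(2)} | \psi_i^{(0)} \rangle + 2 \langle \psi_i^{(1)} | g_A^{(1)} | \psi_i^{(0)} \rangle + \langle \psi_i^{(1)} | g_A^{(0)} | \psi_i^{(1)} \rangle \right] \psi_i^{(0)}. \tag{3.54b}$$

Here, eq. (3.53b) was used to eliminate $\psi_i^{(1)}$ from the expression for $\epsilon_i^{(2)}$. The resulting formula is longer, but the first term is now of the more familiar form for a second order energy formula. The extra terms here compared to the usual Rayleigh-Schrödinger formula are due to the presence of $g_A^{(1)}$ and $g_A^{(2)}$.

3.3.c The $\tilde H_A$ Scheme

The eigenvalue equation in this case is

$$\tilde H_A \chi_i = \epsilon_i \chi_i, \qquad \langle \chi_i | \chi_j \rangle = \delta_{ij}, \tag{3.55}$$

where $\chi_i = g_A^{1/2} \psi_i$, and the $\psi_i$ are the eigenfunctions considered in the $\hat H_A$ and $G_A$ schemes. We have

$$\tilde H_A = \sum_{n=0}^{\infty} \tilde H_A^{(n)}, \quad \chi_i = \sum_{n=0}^{\infty} \chi_i^{(n)}, \quad \epsilon_i = \sum_{n=0}^{\infty} \epsilon_i^{(n)}. \tag{3.56}$$

The solution of the zero order eigenvalue equation,

$$\tilde H_A^{(0)} \chi_i^{(0)} = \epsilon_i^{(0)} \chi_i^{(0)}, \qquad \langle \chi_i^{(0)} | \chi_j^{(0)} \rangle = \delta_{ij}, \tag{3.57}$$

is assumed known. The $\tilde H_A^{(n)}$ are given, and the $\chi_i^{(n)}$ and $\epsilon_i^{(n)}$ are to be calculated. Since $\tilde H_A$ is selfadjoint, this is just the usual Rayleigh-Schrödinger perturbation theory. The $n$th order term of (3.55) is

$$\sum_{j=0}^{n} (\tilde H_A^{(j)} - \epsilon_i^{(j)}) \chi_i^{(n-j)} = 0, \tag{3.58a}$$

and

$$\sum_{j=0}^{n} \langle \chi_i^{(j)} | \chi_i^{(n-j)} \rangle = 0, \qquad (n \ne 0). \tag{3.58b}$$

Premultiplying (3.58a) by $\chi_i^{(0)\dagger}$ and integrating gives the $n$th order eigenvalue,

$$\epsilon_i^{(n)} = \langle \chi_i^{(0)} | \tilde H_A^{(n)} | \chi_i^{(0)} \rangle + \sum_{j=1}^{n-1} \langle \chi_i^{(0)} | \tilde H_A^{(j)} - \epsilon_i^{(j)} | \chi_i^{(n-j)} \rangle. \tag{3.59}$$

The $n$th order eigenfunction is expanded in terms of the zero order eigenfunctions,

$$\chi_i^{(n)} = \sum_k \chi_k^{(0)} a_{ki}^{(n)}, \tag{3.60}$$

where

$$a_{ki}^{(n)} = \sum_{j=1}^{n} \frac{\langle \chi_k^{(0)} | \tilde H_A^{(j)} - \epsilon_i^{(j)} | \chi_i^{(n-j)} \rangle}{\epsilon_i^{(0)} - \epsilon_k^{(0)}}, \qquad (k \ne i), \tag{3.61a}$$

and

$$a_{ii}^{(n)} = -\tfrac{1}{2} \sum_{j=1}^{n-1} \langle \chi_i^{(j)} | \chi_i^{(n-j)} \rangle. \tag{3.61b}$$

The coefficient $a_{ii}^{(1)}$ vanishes, but, in general, the $a_{ii}^{(n)}$ for $n > 1$ are not zero. Equations (3.59) and (3.61) can be written in terms of the eigenfunctions $\psi_i^{(j)}$ used in the $\hat H_A$ and $G_A$ schemes using

$$\chi_i^{(j)} = \sum_{k=0}^{j} (g_A^{1/2})^{(k)} \psi_i^{(j-k)}. \tag{3.62}$$

The terms in the series

$$g_A^{1/2} = \sum_{j=0}^{\infty} (g_A^{1/2})^{(j)} \tag{3.63}$$

are given below, in chapter 6.

CHAPTER 4

MULTIPLE PARTITIONING THEORY

"Great fleas have little fleas upon their back to bite 'em, and little fleas have lesser fleas, and so ad infinitum. The great fleas themselves in turn have greater fleas to go on, while these again have greater still, and greater still, and so on."

(quoted in C. E. Fröberg, Introduction to Numerical Analysis (1969))

In this chapter, the possibilities of generalizing the formulas derived in the preceding two chapters to a more extensive partitioning will be examined. Such $m \times m$ partitioning formalisms, for $m > 2$, have a number of applications in the construction of effective operators and in the derivation of perturbation formulas in eigenvalue problems in which it is convenient or necessary to divide the eigenvalues and their eigenvectors into several distinct sets. The limiting partitioning formalism is that in which $m = n$, that is, the $n$-dimensional space spanned by the basis functions and by the eigenvectors is partitioned into $n$ one-dimensional spaces. This is the ordinary eigenvalue problem.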
4.1 The $m \times m$ Partitioning Formalism

4.1.a Basic Theory

For the present, it is assumed that the basis set used consists of orthonormal functions, so that the eigenvalue equation to be examined is

$$H X = X \epsilon, \qquad X^\dagger X = 1_n, \tag{4.1}$$

where $H$ is hermitian, $X$, the matrix of the eigenvectors of $H$, is unitary, and $\epsilon$ is the real diagonal matrix of the eigenvalues of $H$. The set of basis functions is now divided into $m$ subsets, each spanning one of $m$ subspaces, $S_1, S_2, \ldots, S_m$, of dimensions $n_1, n_2, \ldots, n_m$, respectively. Here, $\sum_{I=1}^{m} n_I$ is equal to $n$, the dimension of the full space. Similarly, the set of $n$ eigenvectors of $H$, which are represented as the columns of $X$ above, are divided into $m$ subsets, $X^{(1)}, X^{(2)}, \ldots, X^{(m)}$, each spanning one of $m$ eigenspaces $S'_1, S'_2, \ldots, S'_m$, of the same respective dimensions $n_1, n_2, \ldots, n_m$. Because of this double partitioning, the matrices $H$ and $X$ can be written in an $m \times m$ block form,

$$H = \begin{pmatrix} H_{11} & H_{12} & \cdots & H_{1m} \\ H_{21} & H_{22} & \cdots & H_{2m} \\ \vdots & \vdots & & \vdots \\ H_{m1} & H_{m2} & \cdots & H_{mm} \end{pmatrix}, \qquad X = \begin{pmatrix} X_{11} & X_{12} & \cdots & X_{1m} \\ X_{21} & X_{22} & \cdots & X_{2m} \\ \vdots & \vdots & & \vdots \\ X_{m1} & X_{m2} & \cdots & X_{mm} \end{pmatrix}, \tag{4.2}$$

where the symbols $H_{IJ}$ and $X_{IJ}$ represent $n_I \times n_J$ dimensional matrix blocks. Let the diagonal part of the eigenvector matrix be denoted by

$$\hat X = \begin{pmatrix} X_{11} & & & \\ & X_{22} & & \\ & & \ddots & \\ & & & X_{mm} \end{pmatrix}. \tag{4.3}$$

The basic quantities in this $m \times m$ partitioning formalism are now obtained as the off-diagonal blocks of the operator $T$, defined, as for a $2 \times 2$ partitioning, by the equation,

$$X = T \hat X. \tag{4.4}$$

In the notation to be adopted, one has

$$T_{II} = 1_I, \qquad T_{IJ} = f_{IJ}, \quad J \ne I, \qquad (I, J = 1, \ldots, m), \tag{4.5}$$

where $1_I$ is the identity matrix in the space $S_I$, and $f_{IJ}$ is an $n_I \times n_J$ matrix given by

$$X_{JI} = f_{JI} X_{II}, \tag{4.6a}$$

or

$$f_{JI} = X_{JI} X_{II}^{-1}. \tag{4.6b}$$

The operators $f_{IJ}$ are straightforward generalizations of the two operators $f$ and $h$ defined for the $2 \times 2$ partitioning (where $f = f_{21}$, and $h = f_{12}$). The specific operator $f_{KL}$ maps the part of an eigenvector, $x_r^{(L)}$, lying in the space $S_L$, into the part lying in the space $S_K$, where the eigenvector $x_r^{(L)}$ is in the eigenspace $S'_L$.

It is seen from eq. (4.6) that all the $f_{IJ}$, ($I, J = 1, \ldots, m$, $I \ne J$), exist only if the matrix $\hat X$ of eq. (4.3) is nonsingular, or alternatively, only if the diagonal blocks $X_{II}$, ($I = 1, \ldots, m$), are nonsingular. However, if the full eigenvector matrix $X$ is itself nonsingular (as it must be if $H$ is hermitian, since then $X$ is a unitary matrix, with inverse given by $X^\dagger$), then there is at least one partitioning of the basis functions for which $\hat X$ is nonsingular. A particular block $X_{II}$ will be singular only if at least one of the eigenvectors $x_r^{(I)}$, ($r = 1, \ldots, n_I$), is orthogonal to the basis subspace $S_I$.

The blocks $f_{IJ}$ of the partitioning operator $T$ in eq. (4.4) are not entirely independent. From the orthonormality condition (4.1), one has

$$X^\dagger X = \hat X^\dagger T^\dagger T \hat X = 1_n,$$

or

$$T^\dagger T = (\hat X \hat X^\dagger)^{-1} = g, \tag{4.7}$$

which is to be block diagonal. Thus the blocks of $T$ are related by the equations,

$$g_{JK} = f_{KJ}^\dagger + f_{JK} + \sum_{\substack{L=1 \\ L \ne J,K}}^{m} f_{LJ}^\dagger f_{LK} = 0, \qquad (J, K = 1, \ldots, m,\ J \ne K). \tag{4.8}$$

Since $g$ is hermitian, eqs. (4.8) represent $\tfrac{1}{2} m(m-1)$ unique matrix block equations, involving the $m(m-1)$ different off-diagonal blocks of $T$. Equations (4.8) could be used to eliminate half of the elements of the off-diagonal blocks of $T$ in favour of the remaining half. While this procedure leads to the very simple result $f_{12} = -f_{21}^\dagger$ in the $2 \times 2$ case, when $m$ is larger than 2, the increase in complexity of eqs. (4.8) makes it impossible in practice to incorporate these equations explicitly into the general formalism. As a result, in what follows, a notation involving all $m(m-1)$ off-diagonal blocks of $T$ will be used (though eqs. (4.8) are implicit, at least when the $f_{IJ}$ are exact). For the cases $m = 3$ and $4$, eqs. (4.8) are examined in somewhat greater detail in Appendix 2.
Equations (4.8) express the orthogonality of eigenvectors of $H$ belonging to different sets $X^{(J)}$ and $X^{(K)}$. Thus it is not necessary to impose them explicitly, since if $H$ is hermitian, this orthogonality is automatic. As a result, in many applications, the increasing complexity of eqs. (4.8) with increasing $m$ is of no practical concern.

The diagonal blocks $g_I$ of the matrix $g$ of (4.7) are metrics, with respect to which the corresponding truncated eigenvectors, $X_{II}$, are orthonormal. That is,

$$X_{II}^\dagger g_I X_{II} = 1_I, \tag{4.9}$$

where

$$g_I = (T^\dagger T)_{II} = 1_I + \sum_{\substack{J=1 \\ J \ne I}}^{m} f_{JI}^\dagger f_{JI}. \tag{4.10}$$

The projections $P_I$ onto the eigenspaces $S'_I$ can be written solely in terms of the $f_{JI}$, ($J = 1, \ldots, m$, $J \ne I$), for each $I$. Using (4.7) and $P_I = X^{(I)} X^{(I)\dagger} = T^{(I)} g_I^{-1} T^{(I)\dagger}$, where $T^{(I)}$ denotes the $I$th block column of $T$, it is seen that

$$(P_I)_{KL} = f_{KI}\, g_I^{-1} f_{LI}^\dagger, \tag{4.11}$$

with the convention $f_{II} = 1_I$. The definition, (4.10), of $g_I$ can be used to establish the idempotency of $P_I$, and if eqs. (4.8) are satisfied, it is easy to show that $P_I P_J = 0$ for $I \ne J$. Equation (4.10) alone is sufficient to establish that $\mathrm{tr}\, P_I = n_I$. Thus, in expressing the projection operators $P_I$, ($I = 1, \ldots, m$), in terms of the $f_{JI}$ as in (4.11), it is necessary to constrain the $m(m-1)$ blocks $f_{JI}$ only if the $P_I$ are to be mutually orthogonal. Furthermore, this formalism provides an apparatus via eqs. (4.8), however tedious it may be, to express the $P_I$ in terms of a minimum number of unconstrained variables.

The minimum number of variables required to describe the eigenspaces $S'_1, \ldots, S'_m$ is of considerable importance here, as in the $2 \times 2$ case.
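The construction of the blocks $f_{JI}$, the metrics $g_I$ and the projections $P_I$ can be illustrated on a small example. The sketch below is only an illustration under stated assumptions: a hypothetical nearly diagonal real symmetric `H`, an arbitrary choice of block dimensions, and variable names that do not come from the text. It builds $T = X \hat X^{-1}$ from the eigenvectors, and checks that $T^\dagger T$ is block diagonal, that the projections built from one block column of $T$ agree with $X^{(I)} X^{(I)\dagger}$, that their traces equal the block dimensions, and that they are mutually orthogonal.

```python
import numpy as np

rng = np.random.default_rng(2)
dims = [2, 1, 3]                       # hypothetical block dimensions n_1, n_2, n_3
m, n = len(dims), sum(dims)
edges = np.cumsum([0] + dims)
blocks = [slice(edges[i], edges[i + 1]) for i in range(m)]

V = rng.standard_normal((n, n))
H = np.diag(np.arange(1.0, n + 1)) + 0.02 * (V + V.T)   # nearly diagonal symmetric matrix
eps, X = np.linalg.eigh(H)             # eigenvector sets follow the same blocks as the basis

# Block diagonal part of X, and the partitioning matrix T = X Xhat^{-1}.
Xhat = np.zeros_like(X)
for b in blocks:
    Xhat[b, b] = X[b, b]
T = X @ np.linalg.inv(Xhat)            # off-diagonal blocks are the f_JI, diagonal blocks are 1
g = T.T @ T                            # should be block diagonal when the f_JI are exact

# Largest element of g outside its diagonal blocks.
offdiag = max(np.abs(g[blocks[J], blocks[K]]).max()
              for J in range(m) for K in range(m) if J != K)

# Projections built from a single block column of T and the metric g_I.
P = [T[:, b] @ np.linalg.solve(g[b, b], T[:, b].T) for b in blocks]
proj_err = max(np.abs(P[I] - X[:, b] @ X[:, b].T).max() for I, b in enumerate(blocks))
traces = [np.trace(p) for p in P]
cross = max(np.abs(P[I] @ P[J]).max() for I in range(m) for J in range(m) if I != J)

print("max off-diagonal element of g :", offdiag)
print("max error in projections      :", proj_err)
print("traces of the projections     :", traces)
print("max element of P_I P_J, I != J:", cross)
```

Each projection here depends only on the blocks in one block column of $T$, which is the property exploited later when the defining equations for the $f_{JI}$ are solved one block column at a time.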
The projection $P_I$ onto the eigenspace $\mathcal{S}_I$ is completely specified by the $n\, n_I$ complex components of the eigenvectors $X^{(I)}$ which span $\mathcal{S}_I$. However, the space is equally well spanned by any set of $n_I$ vectors obtained from the $x_r^{(I)}$, $(r = 1, \ldots, n_I)$, by a nonsingular linear transformation. Thus, there are $n_I^2$ complex variables in $X^{(I)}$ which serve only to specify a particular basis in $\mathcal{S}_I$. Furthermore, the orthogonality constraints in (4.1), written

$$X^{(I)\dagger} X^{(J)} = 0, \qquad I < J,$$

can be used to eliminate a further $\sum_{I=2}^{m} n_I \sum_{J=1}^{I-1} n_J$ complex variables from all of the $X^{(I)}$. The remaining number of non-redundant and unconstrained variables required to simply specify the eigenspaces $\mathcal{S}_1, \ldots, \mathcal{S}_m$ is thus

$$\sum_{I,J=1}^{m} n_I n_J - \sum_{I=1}^{m} n_I^2 - \sum_{I=2}^{m} \sum_{J=1}^{I-1} n_I n_J = \sum_{I=2}^{m} \sum_{J=1}^{I-1} n_I n_J. \tag{4.12}$$

This is just the number of elements in the upper (or lower) block triangle of $\hat T$, and is also the number of independent variables left in $\hat T$ when eqs. (4.8) are explicitly incorporated into the formalism.

This multiple partitioning formalism can be defined completely from the point of view of the determination of the eigenprojections $P_I$, $(I = 1, \ldots, m)$, in a manner analogous to that used in section 2.1.e. From eq. (4.11), it is easily seen that

$$f_{KL} = (P_L)_{KL}\, (P_L)_{LL}^{-1}. \tag{4.13}$$

One of the difficulties in manipulating quantities in this multipartitioning formalism arises from the fact that there is no counterpart here to the "pull-through" relations, (2.32), which were used extensively in the 2 x 2 case to simplify the various expressions arising. In fact, in this case, the analogue of eqs. (2.33) is

$$\hat T\, g^{-1}\, \hat T^\dagger = 1_n,$$

which, for $J \ne K$, gives

$$f_{JK}\, g_K^{-1} = -g_J^{-1} f_{KJ}^\dagger - \sum_{L \ne J,K} f_{JL}\, g_L^{-1} f_{KL}^\dagger, \qquad (J, K = 1, \ldots, m;\ J \ne K). \tag{4.14}$$

Equations (4.14) are not of great use in general because of the summation term on the right hand side. In the 2 x 2 case, this term does not occur, leaving eqs. (2.32).

4.1.b The Defining Conditions on the $f_{JI}$

As in the 2 x 2 case, the off-diagonal blocks of the partitioning matrix $\hat T$ can be determined by diagonalizing the matrix $H$ to obtain its eigenvectors, $X$, and then using eqs. (4.6) directly. However, it is again possible to formulate systems of nonlinear equations which can be solved to obtain the $f_{JI}$ directly, thus making it unnecessary to fully diagonalize $H$. Consider first the eigenvalue equation (4.1), written, using (4.4), as

$$H \hat T = \hat T\, \hat X \epsilon \hat X^{-1} = \hat T \hat H, \tag{4.15}$$

where the matrix $\hat H$, given as

$$\hat H = \hat X \epsilon \hat X^{-1} = \begin{pmatrix} \hat H_1 & & 0 \\ & \ddots & \\ 0 & & \hat H_m \end{pmatrix}, \tag{4.16}$$

is to be block diagonal. Equation (4.15) is valid only if the diagonal block part, $\hat X$, of $X$ is nonsingular, which is exactly the condition that must be satisfied if the $f_{JI}$ are to exist. The diagonal block parts of (4.15) define the operators $\hat H_I$ and the off-diagonal block parts provide equations for the $f_{JI}$. Thus, one has

$$\hat H_I = H_{II} + \sum_{J \ne I} H_{IJ} f_{JI}. \tag{4.17}$$

The equations determining the $f_{JI}$ are then

$$D_{JI}(\hat T) = 0 = H_{JI} + \sum_{K \ne I} H_{JK} f_{KI} - f_{JI} \hat H_I = H_{JI} + \sum_{K \ne I} H_{JK} f_{KI} - f_{JI} H_{II} - f_{JI} \sum_{K \ne I} H_{IK} f_{KI}, \qquad (I, J = 1, \ldots, m;\ I \ne J). \tag{4.18}$$

Equations (4.18) consist of $\sum_{I \ne J} n_I n_J$ coupled nonlinear equations in the matrix elements of the $f_{JI}$. The solutions of these equations will automatically also satisfy eqs. (4.8) because of the hermiticity of $H$. Explicit incorporation of eqs. (4.8) into (4.18) could be used to reduce the total number of equations and variables by a factor of two, but at the expense of greatly increasing the complexity of the equations to be solved.

In eqs. (4.18), coupling occurs only between $f_{JI}$ in the same block column of $\hat T$. Thus, the $(n - n_I)\, n_I$ equations $D_{JI}(\hat T) = 0$, $(J = 1, \ldots, m;\ J \ne I)$, can be solved for the elements of the $I$th block column of $\hat T$, namely, the $f_{JI}$, $(J = 1, \ldots, m;\ J \ne I)$, without having to determine any of the $f_{KL}$ for $L \ne I$.

A somewhat different set of equations for the off-diagonal blocks of $\hat T$ results if the eigenvalue equation is rewritten as

$$G = \hat T^\dagger H \hat T = \hat T^\dagger \hat T\, \hat X \epsilon \hat X^{-1}, \tag{4.19a}$$

and

$$g = \hat T^\dagger \hat T = (\hat X \hat X^\dagger)^{-1}, \tag{4.19b}$$

where the second equality in (4.19a) is obtained using (4.15). Both $G$ and $g$ are to be block diagonal, and the condition that their off-diagonal blocks vanish provides equations for the $f_{JI}$. Since both $G$ and $g$ are hermitian, the vanishing of their off-diagonal blocks can each provide only $\tfrac12 \sum_{I \ne J} n_I n_J$ unique equations, and thus both of (4.19a) and (4.19b) must be used together to determine all the $f_{JI}$. This results in a set of coupled nonlinear equations of the form

$$G_{JI} = 0 = H_{JI} + \sum_{L \ne I} H_{JL} f_{LI} + \sum_{K \ne J} f_{KJ}^\dagger H_{KI} + \sum_{K \ne J,\, L \ne I} f_{KJ}^\dagger H_{KL} f_{LI}, \tag{4.20}$$

and

$$g_{JI} = 0 = f_{IJ}^\dagger + f_{JI} + \sum_{L \ne I,J} f_{LJ}^\dagger f_{LI}, \qquad (I, J = 1, \ldots, m;\ J < I), \tag{4.21}$$

where eqs. (4.21) have appeared before in (4.8). These equations effectively couple all of the off-diagonal blocks of $\hat T$, and therefore, the entire matrix $\hat T$ must be determined at once if (4.20) and (4.21) are used. As a result, while the system of equations (4.20)-(4.21) has the same solutions as the system (4.18), the two systems must be treated quite differently from a computational point of view.

4.1.c Variational Formulation of the Equations for the $f_{JI}$

In this multiple partitioning procedure, it is also possible to show that eqs. (4.18), determining the $f_{JI}$, are equivalent to a variational criterion, in that the vanishing of the quantities $D_{KI}(\hat T)$, $(K = 1, \ldots, m;\ K \ne I)$, implies that the trace of the operator $H$ over the image space of the projection operator $P_I$ is stationary. This stationarity implies that $P_I$ is an eigenprojection of $H$. The algebra required to demonstrate this is considerably more tedious here than in the case of a 2 x 2 partitioning.

The objective is to obtain an expression for the first order variation of the quantity

$$E_I = \mathrm{tr}\, P_I H = \sum_{K,J} \mathrm{tr}\, (P_I)_{KJ} H_{JK}, \tag{4.22}$$

with respect to small variations in the $f_{KI}$, $(K = 1, \ldots, m;\ K \ne I)$. From (4.11), one obtains

$$\delta (P_I)_{KJ} = \delta f_{KI}\, g_I^{-1} f_{JI}^\dagger + f_{KI}\, \delta g_I^{-1} f_{JI}^\dagger + f_{KI}\, g_I^{-1} \delta f_{JI}^\dagger + O(\delta^2), \tag{4.23a}$$

where

$$\delta g_I^{-1} = -g_I^{-1}\, \delta g_I\, g_I^{-1} + O(\delta^2) = -g_I^{-1} \sum_{L \ne I} \left( \delta f_{LI}^\dagger f_{LI} + f_{LI}^\dagger \delta f_{LI} \right) g_I^{-1} + O(\delta^2). \tag{4.23b}$$

Substitution of eqs. (4.23a,b) into the equation

$$\delta E_I = \mathrm{tr} \sum_{K,J} (\delta P_I)_{KJ} H_{JK}, \tag{4.24}$$

valid when $H$ is independent of the $f_{JI}$, leads to the rather complicated expression

$$\delta E_I = \mathrm{tr} \sum_{P \ne I} \delta f_{PI}\, g_I^{-1} \left[ (\hat T^\dagger H)_{IP} - (\hat T^\dagger H \hat T)_{II}\, g_I^{-1} f_{PI}^\dagger \right] + \mathrm{tr} \sum_{P \ne I} \delta f_{PI}^\dagger \left[ (H \hat T)_{PI} - f_{PI}\, g_I^{-1} (\hat T^\dagger H \hat T)_{II} \right] g_I^{-1} + O(\delta^2). \tag{4.25}$$

The passage from (4.24) to (4.25) uses the cyclic property of the trace. Consider the coefficient of $\delta f_{PI}^\dagger$ in (4.25). From (4.18), one has

$$(H \hat T)_{PI} - f_{PI}\, g_I^{-1} (\hat T^\dagger H \hat T)_{II} = D_{PI}(\hat T) + f_{PI}\, g_I^{-1} \left[ g_I \hat H_I - (\hat T^\dagger H \hat T)_{II} \right] = D_{PI}(\hat T) + f_{PI}\, g_I^{-1} \sum_{K \ne I} f_{KI}^\dagger \left[ f_{KI} \hat H_I - (H \hat T)_{KI} \right] = D_{PI}(\hat T) - f_{PI}\, g_I^{-1} \sum_{K \ne I} f_{KI}^\dagger D_{KI}(\hat T).$$

Consequently, the vanishing of all the $D_{KI}(\hat T)$, $(K = 1, \ldots, m;\ K \ne I)$, implies the vanishing of the coefficients of $\delta f_{PI}^\dagger$ in eq. (4.25).
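The stationarity property described above can be illustrated numerically. The sketch below is an assumption-laden check rather than part of the derivation: it uses NumPy, a real symmetric test matrix, and hypothetical helper names (`partition_matrix`, `residual_column`, `trace_energy`). It evaluates the stacked residuals $D_{JI}$ of eq. (4.18) for one block column, then perturbs the $f_{JI}$ slightly and compares the change in $E_I$ of eq. (4.22) with the change in the residuals.

```python
import numpy as np

def partition_matrix(H, sizes):
    """Return (T, blocks) with T = X X_hat^{-1} built from the eigenvectors of H."""
    edges = np.concatenate(([0], np.cumsum(sizes)))
    blocks = [slice(edges[i], edges[i + 1]) for i in range(len(sizes))]
    _, X = np.linalg.eigh(H)
    X_hat = np.zeros_like(X)
    for b in blocks:
        X_hat[b, b] = X[b, b]
    return X @ np.linalg.inv(X_hat), blocks

def residual_column(H, T_I, b):
    """Stack of D_JI for fixed I: H T_I - T_I H_hat_I, with H_hat_I = (H T)_II."""
    HT = H @ T_I
    return HT - T_I @ HT[b, :]

def trace_energy(H, T_I):
    """E_I = tr(P_I H) = tr(g_I^{-1} T_I^t H T_I)."""
    return np.trace(np.linalg.solve(T_I.T @ T_I, T_I.T @ H @ T_I))

rng = np.random.default_rng(1)
sizes = [2, 2, 3]
n = sum(sizes)
A = rng.standard_normal((n, n))
H = np.diag(np.arange(1.0, n + 1)) + 0.05 * (A + A.T)
T, blocks = partition_matrix(H, sizes)

I = 1
b = blocks[I]
T_I = T[:, b]
D_exact = residual_column(H, T_I, b)

# Perturb only the off-diagonal blocks f_JI of this block column.
Z = rng.standard_normal(T_I.shape)
Z[b, :] = 0.0
delta = 1e-4
dE = trace_energy(H, T_I + delta * Z) - trace_energy(H, T_I)
D_perturbed = residual_column(H, T_I + delta * Z, b)
```

With the exact $f_{JI}$, `D_exact` vanishes to rounding error; after the perturbation the residuals change at first order in `delta`, while `dE` changes only at second order.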
Since the coefficients of $\delta f_{PI}$ in (4.25) are just the adjoints of those of $\delta f_{PI}^\dagger$, they vanish also, causing $\delta E_I$ to vanish to first order in the infinitesimals.

The vanishing of $\delta E_I$ to first order implies the converse, namely, that all $D_{KI}$, $(K = 1, \ldots, m;\ K \ne I)$, vanish. This follows from the fact that the rank of the matrix $(1 - P_I)$ must be $n - n_I$ if it is to project onto the $(n - n_I)$-dimensional complement of the eigenspace $\mathcal{S}_I$ of $H$. Because of this, the set of linear systems (one for each column of the $D_{KI}$), written compositely as

$$\sum_{K \ne I} (1 - P_I)_{JK}\, D_{KI}(\hat T) = 0,$$

has only the trivial solution $D_{KI}(\hat T) = 0$, $(K = 1, \ldots, m;\ K \ne I)$.

4.1.d Transformation of the $f_{JI}$ Under a Change of Basis

Under a linear transformation of the basis functions $\{\phi_i\}$ (see section 2.1.g), the eigenvectors of $H$ become (eq. (2.50))

$$X' = V X = V \hat T \hat X, \tag{4.27}$$

where [...] and $(I, J = 1, \ldots, m)$. (4.28)

The off-diagonal blocks of $\hat T'$ in the new basis and those of $\hat T$ in the old basis are related by

$$\hat T' = X' \hat X'^{-1} = V \hat T \hat X \hat X'^{-1},$$

from which

$$f'_{JI} = (V \hat T)_{JI}\, X_{II}\, X_{II}'^{-1}. \tag{4.29}$$

But, from (4.27), one has

$$X'_{II} = (V \hat T)_{II}\, X_{II},$$

and thus

$$f'_{JI} = (V \hat T)_{JI}\, (V \hat T)_{II}^{-1} = \Big( V_{JI} + \sum_{K \ne I} V_{JK} f_{KI} \Big) \Big( V_{II} + \sum_{K \ne I} V_{IK} f_{KI} \Big)^{-1}, \tag{4.30}$$

which gives the $f'_{JI}$ solely in terms of the transformation coefficients $V_{JI}$ and the $f_{JI}$ in the old basis.

Again, the transformation for the $f_{JI}$ under a linear basis change is complicated and nonlinear in both the coefficients of the transformation and the old variables. While such a complicated transformation is doubtless disadvantageous under some circumstances, it can also be usefully exploited, as pointed out in section 2.1.g.
Note also that, writing

$$\Big( V_{II} + \sum_{K \ne I} V_{IK} f_{KI} \Big)^{-1} = \Big( 1 + V_{II}^{-1} \sum_{K \ne I} V_{IK} f_{KI} \Big)^{-1} V_{II}^{-1} = \Big( 1 - V_{II}^{-1} \sum_{K \ne I} V_{IK} f_{KI} + O(f^2) \Big) V_{II}^{-1},$$

it is seen that

$$f'_{JI} = V_{JI} V_{II}^{-1} + \sum_{K \ne I} \left( V_{JK} - V_{JI} V_{II}^{-1} V_{IK} \right) f_{KI}\, V_{II}^{-1} + O(f^2), \tag{4.31}$$

which, for small $f$, is nearly linear, but not homogeneous.

4.2 Effective Operators

4.2.a Basic Definitions

As in the 2 x 2 partitioning formalism, one of the primary applications of this multiple partitioning formalism is in the construction of effective operators. Such operators would be defined in one of the subspaces, $S_I$, of the full basis space, but would have as eigenvalues a particular subset of the eigenvalues of the original operator $H$ in the full space. Those eigenvalues correspond to the eigenvectors of $H$ spanning the space $\mathcal{S}_I$. Since they are restricted to the subspaces in which the effective operators are defined, the corresponding eigenvectors of these operators are simple, or orthonormalized, truncations of eigenvectors of the operator $H$ in the full space. However, given the matrix $\hat T$ and the eigenvectors of the effective operators, those of the original operator $H$ in the full space can be obtained straightforwardly.

The types of effective operators arising here are analogous to those defined previously in the 2 x 2 case. The simplest set of effective operators has been defined already in eq. (4.17). These operators,

$$\hat H_I = H_{II} + \sum_{J \ne I} H_{IJ} f_{JI}, \qquad (I = 1, \ldots, m), \tag{4.32}$$

are defined in the corresponding subspaces $S_I$, and have the eigenvalue equations

$$\hat H_I X_{II} = X_{II}\, \epsilon^{(I)}, \qquad (I = 1, \ldots, m), \tag{4.33}$$

as seen from eq. (4.15). Here, $\epsilon^{(I)}$ is the $I$th diagonal block of the matrix of eigenvalues, $\epsilon$, in eq. (4.1). The operators $\hat H_I$ are non-selfadjoint, in general. However, their eigenvectors, $X_{II}$, are orthonormal with respect to the non-unit metrics $g_I$, according to eqs. (4.9).
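As an illustration of eqs. (4.32) and (4.33), the sketch below builds one effective operator $\hat H_I$ and recovers full-space eigenvectors from its truncated eigenvectors. It assumes NumPy and a real symmetric test matrix; variable names such as `H_eff` and `X_full` are hypothetical, and $\hat T$ is taken from a full diagonalization purely for demonstration.

```python
import numpy as np

rng = np.random.default_rng(2)
sizes = [3, 2, 2]
n = sum(sizes)
edges = np.concatenate(([0], np.cumsum(sizes)))
blocks = [slice(edges[i], edges[i + 1]) for i in range(len(sizes))]

A = rng.standard_normal((n, n))
H = np.diag(np.arange(1.0, n + 1)) + 0.05 * (A + A.T)
eps, X = np.linalg.eigh(H)

X_hat = np.zeros_like(X)
for b in blocks:
    X_hat[b, b] = X[b, b]
T = X @ np.linalg.inv(X_hat)

I = 0
b = blocks[I]
H_eff = (H @ T)[b, b]                 # eq. (4.32): H_II + sum_J H_IJ f_JI
g_I = (T.T @ T)[b, b]                 # metric of eq. (4.10)

# H_eff is not symmetric in general, so use the general eigen-solver.
vals, vecs = np.linalg.eig(H_eff)
order = np.argsort(vals.real)
vals, vecs = vals.real[order], vecs.real[:, order]

# Normalise each truncated eigenvector with respect to the metric g_I,
# then rebuild the full-space eigenvectors from the I-th block column of T.
norms = np.sqrt(np.einsum("ij,jk,ki->i", vecs.T, g_I, vecs))
X_II = vecs / norms
X_full = T[:, b] @ X_II
```

The eigenvalues `vals` coincide with the subset `eps[b]` of the eigenvalues of $H$, and the columns of `X_full` are eigenvectors of $H$ in the full space.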
The corresponding basic set of self-adjoint effective operators are those defined by the diagonal blocks of eq. (4.19a), namely,

$$G_I = (\hat T^\dagger H \hat T)_{II}, \tag{4.34}$$

with the eigenvalue equation

$$G_I X_{II} = g_I X_{II}\, \epsilon^{(I)}, \qquad (I = 1, \ldots, m), \tag{4.35}$$

where

$$g_I = (\hat T^\dagger \hat T)_{II}. \tag{4.36}$$

In detail, one has

$$G_I = H_{II} + \sum_{J \ne I} \left( f_{JI}^\dagger H_{JI} + H_{IJ} f_{JI} \right) + \sum_{J,K \ne I} f_{JI}^\dagger H_{JK} f_{KI}. \tag{4.37}$$

If eqs. (4.18) are satisfied, this can also be written as

$$G_I = \sum_J (\hat T^\dagger)_{IJ} (H \hat T)_{JI} = \sum_J (\hat T^\dagger)_{IJ} (\hat T \hat H)_{JI} = (\hat T^\dagger \hat T)_{II} \hat H_I = g_I \hat H_I. \tag{4.38}$$

As in the 2 x 2 case, other sets of self-adjoint effective operators can be obtained by orthogonalizing the truncated eigenvectors by other procedures. Löwdin's (1970) symmetrical orthogonalization (see section 2.2.a) leads to the set of orthonormal eigenvectors

$$C_{II} = g_I^{1/2} X_{II}, \qquad (I = 1, \ldots, m), \tag{4.39}$$

where $C_{II}^\dagger C_{II} = X_{II}^\dagger g_I X_{II} = 1_{n_I}$, by eq. (4.9). These new vectors satisfy the eigenvalue equation

$$\tilde H_I C_{II} = C_{II}\, \epsilon^{(I)}, \tag{4.40}$$

where

$$\tilde H_I = g_I^{1/2} \hat H_I\, g_I^{-1/2} \tag{4.41a}$$

$$\phantom{\tilde H_I} = g_I^{-1/2} G_I\, g_I^{-1/2}, \tag{4.41b}$$

the equivalences here being based implicitly on the assumption that the partitioning operator $\hat T$ used is known exactly.

Effective operators in the spaces $S_1, \ldots, S_m$ can be defined for any other operator in the full space, using the definitions (4.4) and (4.39). In particular, matrix elements of some operator $O$ can be written

$$X^{(I)\dagger} O X^{(I)} = X_{II}^\dagger O_I X_{II}. \tag{4.42a}$$

Here

$$O_I = (\hat T^\dagger O \hat T)_{II} \tag{4.42b}$$

is an operator confined to the subspace $S_I$, but possessing the same expectation values with respect to the $X_{II}$ as the original operator does with respect to the full eigenvectors $X^{(I)}$. A second type of effective operator,

$$\tilde O_I = g_I^{-1/2} O_I\, g_I^{-1/2}, \tag{4.43}$$

will give the same matrix elements with respect to the orthonormalized vectors $C_{II}$ of (4.39) as the original operator $O$ does with respect to the $X^{(I)}$. Here $O_I$ has the same form in $O$ as the operator $G_I$, eq. (4.34), does in $H$, and $\tilde O_I$ is of the same form as $\tilde H_I$ given by (4.41b).

4.2.b Eigenvalues and Eigenvectors of the Effective Operators

Up to this point, the m x m partitioning formalism has been presented almost totally in matrix notation. It is instructive, however, to re-examine some of the relationships quoted previously from the point of view of the actual eigenfunctions of the operator $H$ and the derived effective operators. The eigenvalue equation, (4.1), for $H$ is written as

$$H \psi_i = \epsilon_i \psi_i, \qquad (i = 1, \ldots, n). \tag{4.44}$$

If a partitioning of the basis space into $m$ subspaces $S_1, \ldots, S_m$ is carried out, these eigenfunctions of $H$ can be written as a sum of parts,

$$\psi_i = \psi_{i1} + \psi_{i2} + \cdots + \psi_{im} = \sum_{J=1}^{m} \psi_{iJ}, \tag{4.45}$$

with $\psi_{iJ}$ being the part of $\psi_i$ lying in the subspace $S_J$. The partitioning of the eigenvectors of $H$ into $m$ sets, spanning eigenspaces $\mathcal{S}_1, \ldots, \mathcal{S}_m$, merely divides the $\psi_i$ into $m$ sets, the notation $\psi_i^{(J)}$, $(J = 1, \ldots, m;\ i = 1, \ldots, n_J)$, now denoting the $i$th member of the $J$th such set. The basic equations, (4.6), of the partitioning formalism then are

$$\psi_{iK}^{(J)} = f_{KJ}\, \psi_{iJ}^{(J)}, \qquad (K \ne J). \tag{4.46}$$

This means that the eigenfunctions of $H$ in the full space can be written as

$$\psi_i^{(J)} = \Big( 1 + \sum_{K \ne J} f_{KJ} \Big) \psi_{iJ}^{(J)}. \tag{4.47}$$

In the notation used in this section, the symbol $f_{KJ}$ represents an embedding of the mapping $f_{KJ}$, $(S_J \to S_K)$, in the whole $n$-dimensional basis space. It is a simple matter to write down the eigenvalue equations for the effective operators in this notation.
The counterpart of eq. (4.33) for the operators $\hat H_I$ is

$$\hat H_I\, \psi_{iI}^{(I)} = \epsilon_i^{(I)} \psi_{iI}^{(I)}, \qquad \langle \psi_{iI}^{(I)} | g_I | \psi_{jI}^{(I)} \rangle = \delta_{ij}, \qquad (i, j = 1, \ldots, n_I;\ I = 1, \ldots, m). \tag{4.48}$$

The eigenvalue equation, (4.35), for the operators $G_I$ becomes

$$G_I\, \psi_{iI}^{(I)} = \epsilon_i^{(I)} g_I\, \psi_{iI}^{(I)}, \qquad (i = 1, \ldots, n_I;\ I = 1, \ldots, m), \tag{4.49}$$

with the same orthonormality condition as in (4.48). The eigenfunctions obtained from the $\psi_{iI}^{(I)}$ by the symmetric orthogonalization procedure are given by

$$\chi_i^{(I)} = g_I^{1/2}\, \psi_{iI}^{(I)}. \tag{4.50}$$

Thus, the eigenvalue equation, (4.40), for the operators $\tilde H_I$ is

$$\tilde H_I\, \chi_i^{(I)} = \epsilon_i^{(I)} \chi_i^{(I)}, \qquad \langle \chi_i^{(I)} | \chi_j^{(I)} \rangle = \delta_{ij}, \qquad (i, j = 1, \ldots, n_I;\ I = 1, \ldots, m). \tag{4.51}$$

The full eigenfunctions of $H$ are given in terms of the eigenfunctions, (4.50), of the operators $\tilde H_I$, by

$$\psi_i^{(I)} = \Big( 1 + \sum_{K \ne I} f_{KI} \Big) g_I^{-1/2} \chi_i^{(I)}. \tag{4.52}$$

Finally, for the eigenprojections, $P_I$, it is seen that

$$P_I = \sum_{i=1}^{n_I} |\psi_i^{(I)}\rangle \langle \psi_i^{(I)}| = \sum_{i=1}^{n_I} \Big( 1 + \sum_{K \ne I} f_{KI} \Big) |\psi_{iI}^{(I)}\rangle \langle \psi_{iI}^{(I)}| \Big( 1 + \sum_{K' \ne I} f_{K'I}^\dagger \Big) = \Big( 1 + \sum_{K \ne I} f_{KI} \Big) g_I^{-1} \Big( 1 + \sum_{K' \ne I} f_{K'I}^\dagger \Big), \tag{4.53}$$

where

$$g_I^{-1} = \sum_{i=1}^{n_I} |\psi_{iI}^{(I)}\rangle \langle \psi_{iI}^{(I)}| \tag{4.54}$$

defines an embedding of (the inverse of) the metric $g_I$ in the full $n$-dimensional basis space.

4.3 Generalization to a Non-orthonormal Basis

The generalization of the multiple partitioning formalism to the case of a non-orthonormal basis is straightforward. The eigenvalue equation is now

$$H X = S X \epsilon, \tag{4.55a}$$

with

$$X^\dagger S X = 1_n, \tag{4.55b}$$

as in eq. (2.90), where $S$ is the matrix of overlap integrals of the basis functions. The set of basis functions, and the eigenvectors, $X$, of $H$, are each partitioned into $m$ subsets, exactly as described in section 4.1.a, making it possible to write the eigenvector matrix $X$ in the partitioned form (4.2). The full matrix $X$ can then be written in terms of some matrix $\hat T$ and the diagonal block part, $\hat X$, of $X$, as given in eq. (4.4),

$$X = \hat T \hat X. \tag{4.56}$$

The matrix elements of $\hat T$ are given here also by eqs. (4.5). The conditions under which eqs. (4.56) will be valid are identical to those under which (4.4) are valid, namely, that the partitioning of the basis functions must be so defined that $\hat X$ is invertible. While $X$ is no longer a unitary matrix, the hermiticity of $H$ implies that the columns of $X$ are linearly independent (except possibly if $S$ is singular), and thus there will be at least one partitioning of the basis functions for which $\hat X$ is invertible.

The $m(m-1)$ off-diagonal blocks of the matrix $\hat T$ are not all independent, as can be demonstrated using the orthogonality condition, (4.55b). The analogue of eq. (4.7) is

$$\hat T^\dagger S \hat T = (\hat X \hat X^\dagger)^{-1} = g, \tag{4.57}$$

which must be block diagonal if the orthogonality condition is to be satisfied. This implies the equations

$$g_{IJ} = S_{IJ} + \sum_{L \ne J} S_{IL} f_{LJ} + \sum_{L \ne I} f_{LI}^\dagger S_{LJ} + \sum_{K \ne I,\, L \ne J} f_{KI}^\dagger S_{KL} f_{LJ} = 0, \qquad (I, J = 1, \ldots, m;\ I \ne J). \tag{4.58}$$

These equations could be used in specific cases to eliminate half of the elements of the off-diagonal blocks of $\hat T$ from the formalism. However, they are considerably more complicated than the corresponding equations, (4.8), for an orthonormal basis. Therefore, the remarks following eqs. (4.8) apply here with even greater emphasis. From a practical point of view, such an elimination procedure is not to be recommended.

The diagonal blocks of $g$, given by

$$g_I = S_{II} + \sum_{L \ne I} S_{IL} f_{LI} + \sum_{L \ne I} f_{LI}^\dagger S_{LI} + \sum_{K,L \ne I} f_{KI}^\dagger S_{KL} f_{LI}, \tag{4.59}$$

serve as metrics for the truncated eigenvectors $X_{II}$, as indicated in eq. (4.9). Because of the explicit presence of the overlap matrix, $S$, the leading term of $g_I$ here is $S_{II}$, rather than a unit matrix of the same dimension, as occurs with an orthonormal basis, eq. (4.10).

The defining conditions on the off-diagonal blocks of $\hat T$ are obtained in a manner similar to that employed with an orthonormal basis.
Direct substitution of (4.56) into (4.55), and use of the fact that $\hat X$ is invertible, leads to

$$H \hat T = S \hat T\, (\hat X \epsilon \hat X^{-1}) = S \hat T \hat H, \tag{4.60}$$

where $\hat H$ is to be block diagonal, as in eq. (4.16). The diagonal blocks of (4.60) give $\hat H$ in terms of $H$, $S$, and $\hat T$, as

$$\hat H_I = \left[ (S \hat T)_{II} \right]^{-1} (H \hat T)_{II} = \Big( S_{II} + \sum_{J \ne I} S_{IJ} f_{JI} \Big)^{-1} \Big( H_{II} + \sum_{J \ne I} H_{IJ} f_{JI} \Big). \tag{4.61}$$

If the overlap matrix is a unit matrix, the inverse matrix in (4.61) reduces to an identity matrix, and eq. (4.17) for an orthonormal basis is recovered. The expressions (4.61) for the effective operators $\hat H_I$, $(I = 1, \ldots, m)$, are of the same form as eqs. (2.95) and (2.96), given for the operators $\hat H_A$ and $\hat H_B$ in the 2 x 2 partitioning formalism. From (4.60), it is seen that the eigenvalue equations for these $\hat H_I$ are given by

$$\hat H_I X_{II} = X_{II}\, \epsilon^{(I)}, \qquad (I = 1, \ldots, m), \tag{4.62}$$

exactly as in (4.33) for an orthonormal basis. As pointed out in chapter 2, however, a new set of effective operators $\bar H_I$ could be defined by

$$\bar H_I = H_{II} + \sum_{J \ne I} H_{IJ} f_{JI}, \qquad (I = 1, \ldots, m), \tag{4.63}$$

which is identical to (4.17), but leads to an effective eigenvalue equation of the form

$$\bar H_I X_{II} = \bar S_I X_{II}\, \epsilon^{(I)}, \tag{4.64}$$

where

$$\bar S_I = S_{II} + \sum_{J \ne I} S_{IJ} f_{JI} \tag{4.65}$$

can be regarded as an effective overlap matrix. Equations (4.64) and (4.62) are simply restatements of the same eigenvalue equation, and one typical way of actually solving (4.64) is by using (4.62) as an intermediate.

Defining conditions on the $f_{JI}$ are now obtained from the off-diagonal blocks of eq. (4.60), after substitution of (4.61). The result is

$$D_{LJ}(\hat T) = (H \hat T)_{LJ} - (S \hat T)_{LJ} \hat H_J = H_{LJ} + \sum_{K \ne J} H_{LK} f_{KJ} - \Big( S_{LJ} + \sum_{K \ne J} S_{LK} f_{KJ} \Big) \hat H_J$$

$$\phantom{D_{LJ}(\hat T)} = H_{LJ} + \sum_{K \ne J} H_{LK} f_{KJ} - \Big( S_{LJ} + \sum_{K \ne J} S_{LK} f_{KJ} \Big) \Big( S_{JJ} + \sum_{K \ne J} S_{JK} f_{KJ} \Big)^{-1} \Big( H_{JJ} + \sum_{K \ne J} H_{JK} f_{KJ} \Big) = 0, \qquad (L \ne J). \tag{4.66}$$

Clearly, the presence of the overlap matrix severely complicates the determination of the $f_{JI}$. It is seen that these equations still retain the property of being separately soluble for individual block columns of $\hat T$. Despite the complexity of eqs. (4.66), it is still possible to devise efficient iterative schemes for their solution.

An alternative set of defining conditions, analogous to eqs. (4.19), is obtained by premultiplying the eigenvalue equation, (4.60), by $\hat T^\dagger$, to give

$$G = \hat T^\dagger H \hat T = (\hat T^\dagger S \hat T)\, \hat X \epsilon \hat X^{-1}, \tag{4.67a}$$

where

$$\hat T^\dagger S \hat T = (\hat X \hat X^\dagger)^{-1} = g \tag{4.67b}$$

must be block diagonal. Thus $G$ itself must be block diagonal, and its diagonal blocks form a second set of effective operators, when this is so, with the eigenvalue equations

$$G_I X_{II} = g_I X_{II}\, \epsilon^{(I)}, \qquad (I = 1, \ldots, m). \tag{4.68}$$

This is identical in form to eqs. (4.35), the effects of the presence of the overlap matrix being buried in the detailed form of $g_I$. The operator $G_I$ here is identical in form with the corresponding quantity for an orthonormal basis. When eqs. (4.66) are satisfied, implying that the second equality in (4.67a) is satisfied, it is seen that

$$G_I = g_I \hat H_I, \qquad (I = 1, \ldots, m). \tag{4.69}$$

The matrices $f_{JI}$ can therefore also be determined by the condition that the off-diagonal blocks of $G$ and $g$ vanish, eqs. (4.20) and (4.58). Since both $G$ and $g$ are hermitian, both eqs. (4.67a) and (4.67b) are required to determine all the $f_{JI}$. These equations effectively couple all of the off-diagonal blocks of $\hat T$, which must therefore be completely determined simultaneously, rather than block column-wise, as is possible using (4.66). This drawback in using eqs. (4.67) is probably more than compensated for by the much simpler form of these equations.

Two of the three types of effective operators which were defined for an orthonormal basis have been introduced above in eqs. (4.62) and (4.68) for the present case. The third type of effective operator, namely, the $\tilde H_I$, $(I = 1, \ldots, m)$, given by eqs. (4.41), are identical in form here because the overlap matrix does not appear explicitly in their definitions.
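The non-orthonormal relations (4.57), (4.61), (4.67a) and (4.69) can be checked on a small example. The following sketch assumes NumPy, real symmetric $H$ and $S$, and a two-block partitioning; it solves the generalized eigenvalue problem through a Cholesky factorization of $S$, which is simply a convenient choice for the illustration. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
sizes = [2, 3]
n = sum(sizes)
edges = np.concatenate(([0], np.cumsum(sizes)))
blocks = [slice(edges[i], edges[i + 1]) for i in range(len(sizes))]

A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
H = np.diag(np.arange(1.0, n + 1)) + 0.05 * (A + A.T)
S = np.eye(n) + 0.02 * (B + B.T)        # symmetric positive definite overlap

# Solve H X = S X eps with X^t S X = 1 via the Cholesky factor S = L L^t.
L = np.linalg.cholesky(S)
L_inv = np.linalg.inv(L)
eps, Y = np.linalg.eigh(L_inv @ H @ L_inv.T)
X = L_inv.T @ Y

X_hat = np.zeros_like(X)
for b in blocks:
    X_hat[b, b] = X[b, b]
T = X @ np.linalg.inv(X_hat)            # eq. (4.56)

g = T.T @ S @ T                         # eq. (4.57)
G = T.T @ H @ T                         # eq. (4.67a)
ST, HT = S @ T, H @ T
# Effective operators of eq. (4.61), one per diagonal block.
H_eff = [np.linalg.solve(ST[b, b], HT[b, b]) for b in blocks]
```

Both `g` and `G` come out block diagonal, each `H_eff[I]` reproduces the eigenvalues `eps[blocks[I]]`, and the diagonal blocks satisfy $G_I = g_I \hat H_I$.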
The corresponding eigenvalue equations are given by (4.40), with the eigenfunctions of $\tilde H_I$ being related to those of $\hat H_I$ and $G_I$ by (4.39).

4.4 Practical Considerations

4.4.a Alternative Formulas

In the development of iterative procedures for the determination of the $f_{JI}$, either from eqs. (4.18) or (4.20), (4.21), or their counterparts in the case of a non-orthonormal basis, it is necessary to take into account the manner in which a given $f$-dependent quantity is evaluated. This point has been explored in detail in section 3.1 for a 2 x 2 partitioning. The purpose of this section is to outline the corresponding (more complicated) results for a multiple partitioning formalism.

Consider first the case of an orthonormal basis. The operators $G_I$ are given by eq. (4.37) as

$$G_I = (\hat T^\dagger H \hat T)_{II}. \tag{4.70}$$

When the $D_{JI}(\hat T)$ are not all zero, it is possible to distinguish two distinct forms for the operators $\hat H_I$, namely,

$$\hat H_I^{(1)} = H_{II} + \sum_{J \ne I} H_{IJ} f_{JI}, \qquad (I = 1, \ldots, m), \tag{4.71}$$

as in eq. (4.17), and

$$\hat H_I^{(2)} = g_I^{-1} G_I, \qquad (I = 1, \ldots, m), \tag{4.72}$$

from eq. (4.69). It is a relatively simple matter to demonstrate that (see Appendix 3)

$$\hat H_I^{(2)} = \hat H_I^{(1)} + g_I^{-1} \sum_{J \ne I} f_{JI}^\dagger D_{JI}^{(1)}, \qquad (I = 1, \ldots, m), \tag{4.73}$$

where the $D_{JI}^{(1)}$, defined below, are essentially the conditions (4.18), defining the $f_{JI}$. Thus, if all $D_{JI}^{(1)}$, $(J = 1, \ldots, m;\ J \ne I)$, vanish for a particular value of $I$, the two operators $\hat H_I^{(1)}$ and $\hat H_I^{(2)}$ (for that particular value of $I$) are identical.

Given the two forms of the operator $\hat H_I$, it is possible to write the first form of the defining conditions, (4.18), on the $f_{JI}$ in one of two alternative ways, namely,

$$D_{JI}^{(1)}(\hat T) = H_{JI} + \sum_{K \ne I} H_{JK} f_{KI} - f_{JI} \hat H_I^{(1)}, \tag{4.74}$$

and

$$D_{JI}^{(2)}(\hat T) = H_{JI} + \sum_{K \ne I} H_{JK} f_{KI} - f_{JI} \hat H_I^{(2)}. \tag{4.75}$$

These two forms are equivalent in the sense that they both have the same zeros, by virtue of (4.73). In detailed form, they are quite different, however, away from a zero. Substitution of (4.73) into (4.75) gives

$$D_{JI}^{(2)} = D_{JI}^{(1)} - f_{JI}\, g_I^{-1} \sum_{K \ne I} f_{KI}^\dagger D_{KI}^{(1)}, \tag{4.76}$$

verifying that (4.74) and (4.75) are only equal where they vanish, and that they do have all their zeros in common. Equation (4.76) is the generalization of eq. (3.6) in the 2 x 2 partitioning formalism.

In the 2 x 2 case, it was shown that the conditions $D^{(2)} = 0$ also arose out of the requirement that $\hat T^{-1} H \hat T$ be block diagonal. In the present multiple partitioning case, this is no longer true, because of the increased complexity of the orthogonality conditions, (4.8). Since $\hat T^\dagger \hat T = g$, one has

$$\hat T^{-1} = g^{-1} \hat T^\dagger, \tag{4.77}$$

and therefore,

$$\hat T^{-1} H \hat T = g^{-1} \hat T^\dagger H \hat T = g^{-1} G. \tag{4.78}$$

In the 2 x 2 case, $g$ could easily be made block diagonal, and in so doing, $G_{BA}$ became identical to $D^{(1)}(f)$, leading to eq. (3.6). In the present case, the general block diagonalization of $g$ is not possible, and therefore, a result similar to that of the former case cannot be obtained.

Here, as in the 2 x 2 case, it is possible to calculate the operators $\tilde H_I$ using one of three different formulas, in terms of $\hat H_I^{(1)}$, $\hat H_I^{(2)}$, and $G_I$, respectively. The first form, in terms of $\hat H_I^{(1)}$, as indicated in (4.41a), is not of practical interest, because it represents only a partial re-normalization of the truncated eigenvectors. The latter two formulas,

$$\tilde H_I = g_I^{1/2} \hat H_I^{(2)} g_I^{-1/2} \tag{4.79a}$$

$$\phantom{\tilde H_I} = g_I^{-1/2} G_I\, g_I^{-1/2}, \tag{4.79b}$$

are effectively identical from a practical point of view.

Consider now the case of a non-orthonormal basis. Many of the results presented earlier, for the simple 2 x 2 partitioning, have analogues in the present m x m partitioning formalism which are too complicated to be likely to be useful. Again, it is useful to distinguish two sets of effective operators of the $\hat H_I$-type, namely,

$$\hat H_I^{(1)} = \left[ (S \hat T)_{II} \right]^{-1} (H \hat T)_{II}, \tag{4.80}$$

and

$$\hat H_I^{(2)} = g_I^{-1} G_I = g_I^{-1} \Big[ H_{II} + \sum_{J \ne I} \left( f_{JI}^\dagger H_{JI} + H_{IJ} f_{JI} \right) + \sum_{J,K \ne I} f_{JI}^\dagger H_{JK} f_{KI} \Big], \tag{4.81}$$

with $I = 1, \ldots, m$ in both cases.
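The identities (4.73) and (4.76) of the orthonormal case hold for any trial $f_{JI}$, not only at a solution, which makes them easy to verify numerically. The sketch below assumes NumPy, a real symmetric test matrix, and a randomly chosen (deliberately inexact) block column of $\hat T$; the names `H1`, `H2`, `D1`, `D2` are hypothetical labels for $\hat H_I^{(1)}$, $\hat H_I^{(2)}$, $D^{(1)}$ and $D^{(2)}$.

```python
import numpy as np

rng = np.random.default_rng(4)
sizes = [2, 2, 2]
n = sum(sizes)
edges = np.concatenate(([0], np.cumsum(sizes)))
blocks = [slice(edges[i], edges[i + 1]) for i in range(len(sizes))]

A = rng.standard_normal((n, n))
H = np.diag(np.arange(1.0, n + 1)) + 0.05 * (A + A.T)

# An arbitrary (inexact) I-th block column of T: unit diagonal block,
# random off-diagonal blocks f_JI.
I = 0
b = blocks[I]
T_I = 0.1 * rng.standard_normal((n, sizes[I]))
T_I[b, :] = np.eye(sizes[I])

HT = H @ T_I
g_I = T_I.T @ T_I                     # eq. (4.10)
G_I = T_I.T @ HT                      # eq. (4.70)
H1 = HT[b, :]                         # eq. (4.71)
H2 = np.linalg.solve(g_I, G_I)        # eq. (4.72)

D1 = HT - T_I @ H1                    # eq. (4.74); rows of block I vanish
D2 = HT - T_I @ H2                    # eq. (4.75), stacked over all row blocks

# Right hand sides of eqs. (4.73) and (4.76). The J = I rows of D1 are zero,
# so summing over all rows equals the sum over J != I.
correction = np.linalg.solve(g_I, T_I.T @ D1)
H2_from_H1 = H1 + correction
D2_from_D1 = D1 - T_I @ correction
```

Away from a solution `D1` and `D2` differ, but `H2_from_H1` and `D2_from_D1` reproduce `H2` and `D2` to rounding error.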
The relationship between these two sets of operators is found to be (see Appendix 3)

$$\hat H_I^{(2)} = \hat H_I^{(1)} + g_I^{-1} \sum_{J \ne I} f_{JI}^\dagger D_{JI}^{(1)}, \tag{4.82}$$

exactly as for an orthonormal basis. The two types of effective operators, $\hat H_I^{(1)}$ and $\hat H_I^{(2)}$, lead to two different sets of defining conditions for the $f_{JI}$ of the type (4.18). They are written

$$D_{JI}^{(1)} = H_{JI} + \sum_{K \ne I} H_{JK} f_{KI} - \Big( S_{JI} + \sum_{K \ne I} S_{JK} f_{KI} \Big) \hat H_I^{(1)}, \tag{4.83}$$

and

$$D_{JI}^{(2)} = H_{JI} + \sum_{K \ne I} H_{JK} f_{KI} - \Big( S_{JI} + \sum_{K \ne I} S_{JK} f_{KI} \Big) \hat H_I^{(2)}, \tag{4.84}$$

where, in both (4.83) and (4.84), $I, J = 1, \ldots, m$, $I \ne J$. Direct application of eq. (4.82) to eq. (4.84) gives the relationship between these two types of quantities,

$$D_{JI}^{(2)} = D_{JI}^{(1)} - \Big( S_{JI} + \sum_{K \ne I} S_{JK} f_{KI} \Big) g_I^{-1} \sum_{K \ne I} f_{KI}^\dagger D_{KI}^{(1)}. \tag{4.85}$$

Equation (4.85) is the generalization of eq. (3.15c) to the multiple partitioning case. A generalization of (3.15b) can also be obtained here, but only at the expense of a great deal of tedious algebra. The final result contains many additional terms not appearing in (3.15b), and thus is not likely to be useful.

4.4.b Implications of Inexact Solutions

The purpose of this section is to examine the errors in the effective operators $\hat H_I$, $G_I$, and $\tilde H_I$ arising from the use of inexact $f_{JI}$, $(I, J = 1, \ldots, m;\ I \ne J)$. These results closely correspond to those given in section 3.2, and thus only a brief summary is required here.

Consider an approximate solution to eqs. (4.18) or (4.20), (4.21), written as

$$f_{JI}^{\mathrm{approx}} = f_{JI} + \delta f_{JI}, \qquad (I, J = 1, \ldots, m;\ I \ne J), \tag{4.86}$$

where the $f_{JI}$, here, are to represent an exact solution to those equations. The error $\delta f_{JI}$ in $f_{JI}$ gives rise to errors in the effective operators $\hat H_I$, $G_I$, and $\tilde H_I$. The only complicating factor here, compared to a 2 x 2 partitioning, is that the errors in several $f_{JI}$ will contribute to the overall error in a given effective operator. From eq. (4.37), it is seen that

$$\delta G_I = \sum_{J \ne I} \left( \delta f_{JI}^\dagger f_{JI} \hat H_I + \hat H_I^\dagger f_{JI}^\dagger \delta f_{JI} \right) + O(\delta^2). \tag{4.87}$$

Similarly, from (4.71),

$$\delta \hat H_I^{(1)} = \sum_{J \ne I} H_{IJ}\, \delta f_{JI} + O(\delta^2). \tag{4.88}$$

Using the equation

$$(g_I + \delta g_I)(\hat H_I^{(2)} + \delta \hat H_I^{(2)}) = G_I + \delta G_I,$$

obtained from the definition (4.72) of $\hat H_I^{(2)}$, the error in $\hat H_I^{(2)}$ induced by the above errors in the $f_{JI}$ is given by

$$\delta \hat H_I^{(2)} = g_I^{-1} \left[ \delta G_I - \delta g_I\, \hat H_I^{(2)} \right] + O(\delta^2). \tag{4.89}$$

In obtaining eq. (4.89), eq. (4.10) has been used to write

$$\delta g_I = \sum_{J \ne I} \left( \delta f_{JI}^\dagger f_{JI} + f_{JI}^\dagger \delta f_{JI} \right) + O(\delta^2).$$

The corresponding error in $\tilde H_I$,

$$\delta \tilde H_I = \delta \left( g_I^{1/2} \hat H_I^{(2)} g_I^{-1/2} \right), \tag{4.90}$$

follows using eq. (4.79). A similar, though not identical, form for $\delta \tilde H_I$ is obtained if the formula (4.79) for $\tilde H_I$ in terms of $G_I$ is used.

In eqs. (4.87)-(4.90), all quantities on the right hand sides which are not incremental are exact. The formulas (4.87)-(4.90) exhibit substantial similarities with the corresponding formulas for a 2 x 2 partitioning. In fact, in most cases, it is seen that terms involving the single block $f$ in the 2 x 2 case here contain sums over similar terms for each block $f_{JI}$ in the $I$th block column of $\hat T$.

As for a 2 x 2 partitioning, the error expressions (4.87)-(4.90) are all first order in the errors $\delta f_{JI}$. However, the errors $\delta G_I$ of (4.87), $\delta \hat H_I^{(2)}$ of (4.89), and $\delta \tilde H_I$ of (4.90) have a vanishing expectation value in first order with respect to the exact eigenvectors of these operators. The error $\delta \hat H_I^{(1)}$ of (4.88) does not have such an expectation value which vanishes in first order in the errors in the $f_{JI}$. For this reason, the $\hat H_I^{(1)}$ can be considered as inherently less accurate than the former three effective operators when inexact values for the elements of $\hat T$ are used.

CHAPTER 5

EXACT DETERMINATION OF T

"'Why,' said the Dodo, 'the best way to explain it is to do it.' (And, as you might like to try the thing yourself some winter day, I will tell you how the Dodo managed it.)"

(Alice's Adventures in Wonderland, Lewis Carroll)
Several sets of simultaneous non-linear equations, defining the off-diagonal blocks of the partitioning operator $\hat T$, have been derived. In general, these equations can only be solved numerically. Some numerical iterative techniques are described in this chapter, and some assessment of their efficiency and reliability is made. A number of additional ways of defining $\hat T$ are also discussed, together with the numerical procedures they suggest.

The methods described can be applied in a wide range of quantum mechanical calculations. They are particularly useful when only a small number of the eigenvalues and eigenvectors, or only a projection onto a whole eigenspace (rather than the individual eigenvectors) of a hermitian operator are desired. The techniques described below represent new and practical approaches to such calculations.

5.1 The Calculation of a Few Eigenvalues of a Large Hermitian Matrix

The choice of algorithms to determine $f$ depends to some extent on the nature of the applications which are anticipated. One important application of the methods of this chapter is the calculation of a small number of the lowest (or highest) eigenvalues, and corresponding eigenvectors, of a large hermitian matrix. Such applications arise in the determination of electronic wavefunctions for the lower lying energy levels of atoms and molecules in large scale configuration interaction calculations, and in a variety of calculations in applied mathematics and physics. The matrices arising may have dimensions up to tens of thousands (Roos, 1975).

Algorithms for the partial diagonalization of large matrices must satisfy a number of conditions to be practical.
With a matrix so large that it must be stored on some auxiliary device, rather than in the central computer memory, only small sections are available to random access at one time. Techniques which involve many successive modifications to the original matrix thus become very inefficient, and their vulnerability to significant cumulative round-off error increases with the dimension of the matrix. Further, in techniques in which the entire matrix must be brought to some standard form before the calculation of a single eigenvalue and eigenvector, the calculation of a small number of eigenvalues and eigenvectors may require nearly as much work as the calculation of all of them.

In iterative techniques, on the other hand, these difficulties can be minimized. With proper organization, small sections of the matrix can be used sequentially, and the work per iteration can be made proportional to the actual number of eigenvalues being calculated. For large matrices, this work should then also be roughly proportional to the square of the dimension of the matrix, rather than the third power.

Most iterative techniques now available¹ for the partial diagonalization of large matrices are based on the calculation of successive corrections to some starting vector, to obtain a sequence of vectors converging to a single eigenvector. Since these techniques typically use the maximization or minimization of the Rayleigh quotient with respect to the approximate eigenvector as the criterion for the calculation of the appropriate corrections, the single eigenvector obtained usually corresponds to the largest or smallest eigenvalue of the matrix.
To find other eigenvalues and eigenvectors of the matrix, the same procedure is repeated, but convergence onto previously calculated eigenvectors is prevented using one of several techniques (Shavitt, 1973).

¹ See Shavitt et al. (1973); Shavitt (1970); Nesbet (1965); and Feler (1974).

A different approach to the partial diagonalization of a large hermitian matrix by iterative methods is provided by this eigenvalue independent partitioning formalism. If a matrix $f$, corresponding to the uncoupling of an $n_A$-dimensional subspace spanned by the desired eigenvectors, can be determined, then the calculation of these $n_A$ eigenvalues and eigenvectors reduces to the construction and solution of an $n_A$-dimensional eigenvalue equation, to get the truncated eigenvectors $X_{AA}$ only, followed by the matrix multiplication $X_{BA} = f X_{AA}$ (see eq. (2.3)). The $n_A$ eigenvalues and eigenvectors are determined simultaneously, and thus no error-prone and time-consuming deflation or eigenvalue shifting procedures need be employed to obtain eigenvalues greater than the smallest one. If the accuracy of the elements of $f$ is uniform, the accuracy of the $n_A$ eigenvalues and eigenvectors calculated should be uniform, rather than slowly deteriorating in the order in which they are calculated. These methods are especially useful when the desired eigenvalues are nearly, or exactly, equal, but well separated from the remaining eigenvalues of the matrix. Existing procedures which consist of successive calculation of the desired eigenvalues, one at a time, may perform very poorly in such a situation.

The major part of the procedures described here involves the calculation of $f$. In developing suitable algorithms for the iterative determination of $f$, two criteria were satisfied whenever possible, namely, that the amount of computation per iteration be proportional to $n_A n_B^2$, and that the columns or rows of $H_{BB}$ be required only sequentially. With $n_B \gg n_A$, manipulations of $n_B \times n_B$ matrices (such as inversion, or the evaluation of the product of two of them) require of the order of $n_B^3$ computational operations, which is of the same order as the amount of work required to completely diagonalize the entire matrix by traditional methods.

To maximize their accuracy, given $f$ to some accuracy, the eigenvalues and eigenvectors should be computed from one of the effective operators $\hat H_A^{(2)}$ or $\tilde H_A$, rather than from $\hat H_A^{(1)}$, even though the latter is easier to calculate. The computed eigenvalues will then be accurate to second order in the error in $f$ (see section 3.2). For $n_B \gg n_A$, the calculation of $G_A$ requires of the order of $n_B^2 n_A$ computational operations. The remainder of the calculation, including the calculation of $\hat H_A^{(2)}$ or $\tilde H_A$, if desired, the diagonalization of the $n_A \times n_A$ effective operator, and the determination of $X_{BA}$, all represent negligible additional computation.

5.2 2 x 2 Partitioning: Orthonormal Basis

5.2.a General Considerations

This section is concerned with the determination of $f$ by solution of eq. (2.16),

$$D(f) = H_{BA} + H_{BB} f - f \hat H_A = 0. \tag{5.1}$$

This matrix equation represents a system of $n_A n_B$ simultaneous nonlinear equations for the individual matrix elements $f_{\sigma r}$. A general solution can be written down in only two special cases. If the hamiltonian is already block diagonal, then, clearly, $f = 0$. If the diagonal blocks of $H$ vanish, so that $H$ is block off-diagonal or "alternant", then (5.1) reduces to

$$H_{BA} - f H_{AB} f = 0, \tag{5.2}$$

which has the solution

$$f = (H_{BA} H_{AB})^{-1/2} H_{BA} = H_{BA} (H_{AB} H_{BA})^{-1/2}, \tag{5.3}$$

as can be verified by direct substitution.

When $H$ does not have one of the special forms mentioned above, some iterative procedure or perturbation method must be used to solve (5.1). Iterative methods to successively correct the approximation to a solution are considered here.
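As a concrete check of eq. (5.1), the following minimal Python sketch evaluates $D(f)$ and the effective operator $H_{AA} + H_{AB} f$ for a 2 x 2 matrix with $n_A = 1$. The helper names (`matmul`, `partition`, `effective_operator`, `residual`) and the example matrix are illustrative assumptions, not part of the thesis; the form $H_{AA} + H_{AB} f$ of the effective operator is assumed from chapter 2.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def partition(h, n_a):
    """Split H into the blocks H_AA, H_AB, H_BA, H_BB."""
    h_aa = [row[:n_a] for row in h[:n_a]]
    h_ab = [row[n_a:] for row in h[:n_a]]
    h_ba = [row[:n_a] for row in h[n_a:]]
    h_bb = [row[n_a:] for row in h[n_a:]]
    return h_aa, h_ab, h_ba, h_bb


def effective_operator(h, f, n_a):
    """Effective operator H_AA + H_AB f (assumed form of H_A)."""
    h_aa, h_ab, _, _ = partition(h, n_a)
    hf = matmul(h_ab, f)
    return [[h_aa[i][j] + hf[i][j] for j in range(n_a)] for i in range(n_a)]


def residual(h, f, n_a):
    """D(f) = H_BA + H_BB f - f H_A, as in eq. (5.1)."""
    _, _, h_ba, h_bb = partition(h, n_a)
    h_a = effective_operator(h, f, n_a)
    hbf = matmul(h_bb, f)
    fha = matmul(f, h_a)
    return [[h_ba[s][r] + hbf[s][r] - fha[s][r] for r in range(n_a)]
            for s in range(len(f))]


# Illustrative 2 x 2 example: n_A = 1, so f is a 1 x 1 matrix and
# D(f) = 1 + 2 f - f**2.  The root nearest zero is 1 - sqrt(2).
H = [[1.0, 1.0], [1.0, 3.0]]
f_exact = [[1.0 - math.sqrt(2.0)]]
```

With this `f_exact`, `residual(H, f_exact, 1)` vanishes to rounding error and `effective_operator(H, f_exact, 1)` returns the lower eigenvalue $2 - \sqrt{2}$ of the example matrix, which illustrates how a solution of (5.1) reduces the eigenvalue problem to one of dimension $n_A$.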
Perturbation methods are discussed in the following chapter.

Among the simplest iterative techniques to apply are those in which eq. (5.1) is rewritten as a fixed point problem,

$$f = F(f) \equiv \mathcal{A}^{-1}\,[\,D(f) + \mathcal{A} f\,], \tag{5.4}$$

where $\mathcal{A}$ is some non-singular, possibly $f$-dependent superoperator. Successive substitutions, $f_{m+1} = F(f_m)$, starting from some initial guess $f_0$, give the scheme

$$\delta f_{m+1} = \mathcal{A}^{-1} D(f_m), \tag{5.5a}$$

$$f_{m+1} = f_m + \delta f_{m+1}, \qquad m = 0, 1, 2, \ldots \tag{5.5b}$$

hopefully convergent to a solution of (5.1). If the sequence $\{f_m\}$ converges, the rate of convergence will be linear in general if $\mathcal{A}$ is independent of $f$. Iterative procedures with better than linear convergence invariably involve the use of an $f$-dependent operator $\mathcal{A}$. The Newton-Raphson procedure is the simplest of this type. The generalized Newton-Raphson equations,

$$J(f_m)\,\delta f_{m+1} = -D(f_m), \tag{5.6}$$

are a special case of (5.5a), in which $\mathcal{A}$ is the negative of the Jacobian matrix, $J(f)$, which consists of the first derivatives of the elements of $D(f)$ with respect to the elements of $f$. Iteration on eqs. (5.6) and (5.5b) results in a second order convergent sequence $\{f_m\}$. That is, the error in the estimate, $f_m$, of $f$ after the $m$-th iteration is given as a linear combination of second order products of the errors in $f_{m-1}$, the result of the previous iteration (in the sense described in Appendix 5), so that convergence becomes very rapid as the solution is approached. (See Rall (1969), especially section 12; Traub (1964); and also Appendix 5 of this thesis.) For eq. (5.6), as for any iteration formula of order greater than one, convergence will always occur if a sufficiently accurate initial approximation, $f_0$, can be obtained.
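The difference between a fixed operator in (5.5a) and the Newton-Raphson choice can be seen in the smallest possible case. The sketch below is an illustration only: it uses the hypothetical 2 x 2 matrix $H = [[1, 1], [1, 3]]$ with $n_A = 1$, for which the residual is the scalar $D(f) = 1 + 2f - f^2$, and the function names are assumptions made for this example.

```python
import math


def d(f):
    """Scalar residual D(f) = 1 + 2 f - f**2 for H = [[1, 1], [1, 3]]."""
    return 1.0 + 2.0 * f - f * f


def jacobian(f):
    """First derivative dD/df = 2 - 2 f."""
    return 2.0 - 2.0 * f


def iterate(step, f0=0.0, tol=1e-12, max_iter=200):
    """Apply f <- f + step(f) until |D(f)| < tol; return (f, steps taken)."""
    f = f0
    for m in range(max_iter):
        if abs(d(f)) < tol:
            return f, m
        f += step(f)
    return f, max_iter


# Fixed operator A = H_11 - H_22 = -2: correction A^{-1} D(f), as in (5.5a).
linear_f, linear_steps = iterate(lambda f: d(f) / (1.0 - 3.0))

# Newton-Raphson: A is the negative of the Jacobian, so the correction
# is -D(f) / J(f).
newton_f, newton_steps = iterate(lambda f: -d(f) / jacobian(f))
```

Both iterations start from $f_0 = 0$ and approach the same root $1 - \sqrt{2}$; the Newton-Raphson version needs far fewer steps, at the price of re-evaluating the Jacobian at every step.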
For linear iteration functions, there need not be any initial estimate of $f$ which will lead to convergence.

A set of related iterative procedures with high order convergence properties can be generated according to the scheme

$$\delta f_m^{(1)} = -J^{-1}(f_{m-1})\,D(f_{m-1}),$$

$$\delta f_m^{(2)} = -J^{-1}(f_{m-1})\,D\bigl(f_{m-1} + \delta f_m^{(1)}\bigr), \tag{5.7}$$

$$\vdots$$

$$\delta f_m^{(j)} = -J^{-1}(f_{m-1})\,D\Bigl(f_{m-1} + \sum_{k=1}^{j-1} \delta f_m^{(k)}\Bigr).$$

It can be shown that the error in $f_m = f_{m-1} + \sum_{k=1}^{j} \delta f_m^{(k)}$ is a linear combination of $(j+1)$-th order products of errors in $f_{m-1}$ (Traub, 1964). The advantage of using an iteration formula of the type (5.7) is that the Jacobian matrix, which is typically of large dimension ($n_A n_B \times n_A n_B$ here), need be constructed and inverted only once for each cycle of the type (5.7).

Iteration schemes with second order convergence require the evaluation and manipulation of the $(n_A n_B)^2$ first derivatives of $D(f)$. Similarly, third order convergent iteration schemes generally require the evaluation and manipulation of the $\tfrac{1}{2}(n_A n_B)^3$ second derivatives of $D(f)$. Algebraic expressions for these sets of derivatives are easily obtained. Third and higher order derivatives of $D(f)$, eq. (2.16), with respect to $f$ are zero.

For the particular application to large matrices, these iteration schemes with better than linear convergence involve the manipulation of unacceptably large amounts of information. The solution of eq. (5.6) for $\delta f_{m+1}$ involves of the order of $(n_A n_B)^3$ computational operations, and for $n_B \approx n \gg n_A$, this is comparable to the amount of work required to diagonalize $H$ completely. For $n_B \gg n_A$, a third order formula involves of the order of $n^4$ operations per iteration, equivalent to the complete diagonalization of the matrix $n$ times over. For large matrices, it is therefore necessary to concentrate on computationally efficient, linearly convergent iteration procedures.
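To give a feeling for the operation counts quoted above, the short sketch below tabulates the leading-order estimates for one illustrative case. The function name and the choice $n = 250$, $n_A = 5$ are assumptions for this example; constant factors are ignored, so the numbers indicate orders of magnitude only.

```python
def operation_counts(n, n_a):
    """Leading-order operation counts (constant factors ignored)."""
    n_b = n - n_a
    return {
        # direct solution of the Newton-Raphson equations (5.6)
        "newton_solve": (n_a * n_b) ** 3,
        # complete diagonalization of the n x n matrix
        "full_diagonalization": n ** 3,
        # target cost of one sweep of a linearly convergent scheme
        "linear_sweep": n_a * n_b ** 2,
    }


counts = operation_counts(250, 5)
```

For these values a single Newton-Raphson solve costs of the order of $1.8 \times 10^9$ operations, against about $1.6 \times 10^7$ for a complete diagonalization and about $3 \times 10^5$ for one sweep of a scheme whose work is proportional to $n_A n_B^2$.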
When $H$ is diagonally dominant, with the diagonal elements of $H_{AA}$ closely grouped about the value $\lambda_A$, the simple choice

$$\mathcal{A} = (\lambda_A 1_B - H_{BB}^{(d)}) \otimes 1_A \tag{5.8}$$

(direct product notation) suggests itself. Here $H_{BB}^{(d)}$ is the diagonal part of $H_{BB}$. This gives an iteration scheme based on the correction

$$\delta f_{\sigma r} = \frac{D_{\sigma r}(f)}{\lambda_A - H_{\sigma\sigma}}, \tag{5.9}$$

closely related to degenerate perturbation theory. In eq. (5.9), and throughout the treatment of the 2 x 2 case, Greek letters refer to basis elements in $S_B$, and Roman letters to basis elements in $S_A$. The iteration index $m$ will be dropped wherever the context does not require it.

More generally, for diagonally dominant matrices, the simple choice

$$\mathcal{A}_{\sigma r,\rho t} = (H_{rr} - H_{\sigma\sigma})\,\delta_{\sigma\rho}\,\delta_{rt} \tag{5.10}$$

leads to the corrections

$$\delta f_{\sigma r} = \frac{D_{\sigma r}(f)}{H_{rr} - H_{\sigma\sigma}}, \tag{5.11}$$

also closely related to perturbation theory. The procedure based on (5.11) will be designated as the "Simple Perturbation" (SP) algorithm. Numerical calculations indicate that it converges well only when the diagonal elements of $H$ are ordered monotonically, and when the diagonal elements of $H_{AA}$ are well separated from those of $H_{BB}$. Details of test calculations, using this and other algorithms, are given in section 5.2.g.

A better approach is to base the choice of $\mathcal{A}$ on approximations to the appropriate Newton-Raphson equations. As demonstrated in Appendix 5, these methods are still linearly convergent, but hopefully exhibit some of the stability of the Newton-Raphson equations over a range of problems. Different approximations to (5.6) lead to algorithms exhibiting different rates of linear convergence. In assessing the computational efficiency of such algorithms, however, it is necessary to consider both the amount of computation per iteration and the number of iterations required to obtain the desired accuracy.

During the iterative solution of (5.1), the required $f$-dependent quantities must be evaluated using the current approximation to $f$.
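A minimal sketch of the SP corrections (5.11) is given below. It is not the thesis implementation (that is specified in Appendix 4): it assumes a real symmetric matrix held fully in memory, the effective operator $H_{AA} + H_{AB} f$, the starting guess $f = 0$, and that each correction is applied as soon as it is computed. All function names are hypothetical.

```python
def nesbet_matrix(diagonal):
    """Matrix with the given diagonal and all off-diagonal elements unity."""
    n = len(diagonal)
    return [[float(diagonal[i]) if i == j else 1.0 for j in range(n)]
            for i in range(n)]


def residual_element(h, f, n_a, s, r):
    """Element D_{sigma r} of (5.1); s labels S_B rows, r labels S_A columns."""
    n_b = len(h) - n_a
    row = h[n_a + s]
    value = row[r] + sum(row[n_a + p] * f[p][r] for p in range(n_b))
    for t in range(n_a):
        # (H_A)_{tr} with H_A = H_AA + H_AB f
        h_a_tr = h[t][r] + sum(h[t][n_a + p] * f[p][r] for p in range(n_b))
        value -= f[s][t] * h_a_tr
    return value


def residual_norm(h, f, n_a):
    """Hilbert-Schmidt norm of D(f)."""
    n_b = len(h) - n_a
    return sum(residual_element(h, f, n_a, s, r) ** 2
               for s in range(n_b) for r in range(n_a)) ** 0.5


def simple_perturbation(h, n_a, sweeps=500, tol=1e-10):
    """Sweeps of the corrections (5.11); returns (f, sweeps used)."""
    n_b = len(h) - n_a
    f = [[0.0] * n_a for _ in range(n_b)]
    for sweep in range(sweeps):
        if residual_norm(h, f, n_a) < tol:
            return f, sweep
        for s in range(n_b):
            for r in range(n_a):
                d_sr = residual_element(h, f, n_a, s, r)
                f[s][r] += d_sr / (h[r][r] - h[n_a + s][n_a + s])
    return f, sweeps
```

For example, `simple_perturbation(nesbet_matrix([1, 3, 5, 7]), 1)` drives the residual to zero for a small matrix of the type used in section 5.2.g. The sketch recomputes each $D_{\sigma r}$ from scratch for clarity, which is wasteful for large matrices.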
Thus, the considerations in sections 3.1 and 3.2 are relevant here, and it is useful to classify the algorithms developed below according to the way in which the $f$-dependent quantities involved are evaluated.

5.2.b Methods Based on $D^{(1)}(f)$

If $f_0$ is an approximation to the solution of eq. (5.1), and $\delta f$ is the exact correction, so that $f = f_0 + \delta f$ is the exact solution of $D^{(1)}(f) = 0$, then it follows from the definition, (3.4), of $D^{(1)}(f)$ that

$$\hat H_B^{(1)}(f_0)^{\dagger}\,\delta f - \delta f\,\hat H_A(f) = -D^{(1)}(f_0). \tag{5.12}$$

This is an exact equation for $\delta f$. The Newton-Raphson equations for the system $D^{(1)}(f) = 0$ are

$$\hat H_B^{(1)}(f_0)^{\dagger}\,\delta f - \delta f\,\hat H_A^{(1)}(f_0) = -D^{(1)}(f_0), \tag{5.13}$$

the matrix elements of the Jacobian in this case being

$$J^{(1)}_{\sigma r,\rho t} = \frac{\partial D^{(1)}_{\sigma r}}{\partial f_{\rho t}} = (\hat H_B^{(1)\dagger})_{\sigma\rho}\,\delta_{rt} - \delta_{\sigma\rho}\,(\hat H_A^{(1)})_{tr}. \tag{5.14}$$

Equation (5.13) differs from the exact equation (5.12) only in that the exact operator $\hat H_A(f)$ appearing in (5.12) is replaced by the current approximation $\hat H_A^{(1)}(f_0)$ in (5.13).

Despite the sparseness of the Jacobian matrix here, the Newton-Raphson method is still computationally inefficient. A non-iterative method, such as Gaussian elimination, for solution of (5.13), does not easily allow proper exploitation of the blocked structure of the Jacobian. Straightforward application of the Gauss-Seidel method and its refinements to the determination of $\delta f$ from (5.13), with $J^{(1)}(f)$ and $D^{(1)}(f)$ fixed, does allow the sparseness of the Jacobian matrix to be exploited. However, such a procedure is inefficient in that it does not make use of all the information available about $f$ at all times if $J^{(1)}$ and $D^{(1)}$ are held fixed during the iteration to determine $\delta f$. Thus, a modified Gauss-Seidel procedure applied to (5.13) is required.

The simplest linear iteration formula based on the Newton-Raphson equations, (5.13), is one in which the operator $\mathcal{A}$ in (5.4) is taken as the negative of the diagonal part of the Jacobian matrix.
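The step from the definition of $D^{(1)}$ to the exact equation (5.12) can be written out as follows. This is a sketch under the assumption that $\hat H_A^{(1)}(f) = H_{AA} + H_{AB} f$ and $\hat H_B^{(1)}(f)^{\dagger} = H_{BB} - f H_{AB}$; these forms are consistent with the operators being linear in $f$, but they are not quoted from chapter 3.

```latex
\begin{align*}
D^{(1)}(f_0+\delta f)
  &= H_{BA} + H_{BB}(f_0+\delta f)
     - (f_0+\delta f)\,\hat H_A^{(1)}(f_0+\delta f) \\
  &= \bigl[H_{BA} + H_{BB} f_0 - f_0\,\hat H_A^{(1)}(f_0)\bigr]
     + \bigl[H_{BB} - f_0 H_{AB}\bigr]\,\delta f
     - \delta f\,\hat H_A^{(1)}(f_0+\delta f) \\
  &= D^{(1)}(f_0) + \hat H_B^{(1)}(f_0)^{\dagger}\,\delta f
     - \delta f\,\hat H_A(f).
\end{align*}
% Setting the left side to zero gives eq. (5.12).
% Since \hat H_A(f) = \hat H_A^{(1)}(f_0) + H_{AB}\,\delta f, replacing
% \hat H_A(f) by \hat H_A^{(1)}(f_0) amounts to dropping the single
% second-order term  -\delta f\, H_{AB}\, \delta f,  which gives (5.13).
```

Because the neglected term is the only one of second order, $D^{(1)}$ is exactly quadratic in $f$, in line with the vanishing of its third and higher derivatives.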
The successive corrections to $f_{\sigma r}$ are then given by

$$\delta f_{\sigma r} = \frac{D^{(1)}_{\sigma r}}{(\hat H_A^{(1)})_{rr} - (\hat H_B^{(1)\dagger})_{\sigma\sigma}}. \tag{5.15}$$

In view of the simplicity of the matrices involved, the most efficient computational procedure is to change only one element of $f$ at a time, calculating $D^{(1)}_{\sigma r}$ at that time, and updating $\hat H_A^{(1)}$ and the diagonal part of $\hat H_B^{(1)\dagger}$ continually. After a change in a single $f_{\sigma r}$, these quantities are easily updated because they are linear in $f$:

$$(\delta \hat H_A^{(1)})_{sr} = H_{s\sigma}\,\delta f_{\sigma r}, \qquad (s = 1, \ldots, n_A), \tag{5.16a}$$

and

$$(\delta \hat H_B^{(1)\dagger})_{\sigma\sigma} = -\delta f_{\sigma r}\,H_{r\sigma} = -(\delta \hat H_A^{(1)})_{rr}. \tag{5.16b}$$

Calculation of $D^{(1)}_{\sigma r}$ as required involves the same number of computational operations per sweep through $\delta f$ as the continual updating of $D^{(1)}$ (for which $D^{(1)}$ must be stored), but there is a likelihood of significant accumulation of round-off error as the solution is approached if such an updating procedure is used for $D^{(1)}$. Where the diagonal elements of $H_{AA}$ are fairly well separated from those of $H_{BB}$, the usual starting approximation is $f = 0$. In this case, the starting approximations to $\hat H_A^{(1)}$ and $\hat H_B^{(1)\dagger}$ are simply $H_{AA}$ and $H_{BB}$. The iterative scheme based on (5.15) and (5.16) will be referred to here as the "Simple Diagonal Newton-Raphson" (SDNR) algorithm. A precise statement of computational details is given in Appendix 4.

The idea of the correction $\delta f_{\sigma r}$, calculated in (5.15), is that it should reduce the corresponding $D^{(1)}_{\sigma r}$ approximately to zero. This may be far from true early in the calculation if $\delta f_{\sigma r}$ is large. The change $\delta f_{\sigma r}$ required to reduce $D^{(1)}_{\sigma r}$ exactly to zero can be determined from (5.12). The result is a quadratic equation in $\delta f_{\sigma r}$, namely,

$$H_{r\sigma}\,\delta f_{\sigma r}^2 + \bigl[(\hat H_A^{(1)})_{rr} - (\hat H_B^{(1)\dagger})_{\sigma\sigma}\bigr]\,\delta f_{\sigma r} - D^{(1)}_{\sigma r} = 0. \tag{5.17}$$

The iterative scheme based on this equation will be referred to as the "Quadratic Diagonal Newton-Raphson" (QDNR) algorithm. Precise details are given in Appendix 4. If (5.17) has two real roots, the desired correction is the one of smallest magnitude. When $(\hat H_A^{(1)})_{rr} - (\hat H_B^{(1)\dagger})_{\sigma\sigma}$ is much greater than either or both of $D^{(1)}_{\sigma r}$ or $H_{r\sigma}$, this correction differs negligibly from that given by (5.15). As the solution of (5.1) is approached, and the magnitude of $D^{(1)}_{\sigma r}$ becomes progressively smaller compared to the other coefficients in (5.17) (which are constant, or effectively constant once a reasonable approximation to $f$ is achieved), it is necessary to use the formula for the root of a quadratic equation with a rationalized numerator to avoid serious round-off error, that is,

$$\delta f_{\sigma r} = \frac{2 D^{(1)}_{\sigma r}}{x + \operatorname{sgn}(x)\,\bigl[x^2 + 4 H_{r\sigma} D^{(1)}_{\sigma r}\bigr]^{1/2}}, \tag{5.18a}$$

where

$$x = (\hat H_A^{(1)})_{rr} - (\hat H_B^{(1)\dagger})_{\sigma\sigma}, \tag{5.18b}$$

when all coefficients are real. This equation can be used instead of eq. (5.15) in cases where difficulty is experienced in establishing convergence.

If diagonal elements of $\hat H_A^{(1)}$ and $\hat H_B^{(1)\dagger}$ become very nearly equal at some stage of the iterative calculation, eq. (5.15) may lead to divergence. Such diverging tendencies may be damped if (5.18) is used. On the other hand, situations occur in which eqs. (5.18) accelerate the divergent process. The results of some numerical calculations using both of these algorithms are included in section 5.2.g and Table 5.1.

If diagonal elements of $H_{AA}$ and $H_{BB}$ are equal, it is necessary either to use a non-zero starting approximation for $f$, or to use algorithm QDNR initially, since application of the SDNR algorithm may lead to a division by zero early in the calculation.
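The element-by-element procedure described above can be sketched as follows. This is an illustration, not the Appendix 4 implementation: it assumes a real symmetric matrix stored in memory, the starting guess $f = 0$, and a fallback to the linear correction (5.15) when the quadratic (5.17) has no real root, which is a choice made only for this sketch. Function names are hypothetical.

```python
import math


def correction(d_sr, x, h_rs, quadratic=False):
    """Correction to one element of f.

    x is (H_A)_rr - (H_B^dagger)_ss.  The linear form is eq. (5.15); the
    quadratic form is the smallest-magnitude root of (5.17), written with
    a rationalized numerator to limit round-off error.
    """
    if quadratic:
        disc = x * x + 4.0 * h_rs * d_sr
        if disc >= 0.0:
            return 2.0 * d_sr / (x + math.copysign(math.sqrt(disc), x))
    return d_sr / x


def diagonal_newton_raphson(h, n_a, quadratic=False, sweeps=500, tol=1e-10):
    """SDNR (quadratic=False) or QDNR (quadratic=True) from f = 0.

    Returns (f, H_A^(1), sweeps used).
    """
    n_b = len(h) - n_a
    f = [[0.0] * n_a for _ in range(n_b)]
    h_a = [row[:n_a] for row in h[:n_a]]                 # H_AA at f = 0
    hb_diag = [h[n_a + s][n_a + s] for s in range(n_b)]  # diag of H_BB
    for sweep in range(sweeps):
        norm_sq = 0.0
        for s in range(n_b):
            row = h[n_a + s]
            for r in range(n_a):
                # D_{sigma r}, computed when needed from the current f
                d_sr = (row[r]
                        + sum(row[n_a + p] * f[p][r] for p in range(n_b))
                        - sum(f[s][t] * h_a[t][r] for t in range(n_a)))
                norm_sq += d_sr * d_sr
                x = h_a[r][r] - hb_diag[s]
                delta = correction(d_sr, x, h[r][n_a + s], quadratic)
                f[s][r] += delta
                for t in range(n_a):                     # update (5.16a)
                    h_a[t][r] += h[t][n_a + s] * delta
                hb_diag[s] -= delta * h[r][n_a + s]      # update (5.16b)
        if math.sqrt(norm_sq) < tol:
            return f, h_a, sweep + 1
    return f, h_a, sweeps
```

For $n_A = 1$ the returned `h_a[0][0]` is the approximate eigenvalue. The convergence test uses the residual elements met during the sweep, which is a simplification of the norm criterion described in section 5.2.g.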
It is unlikely that either of these stratagems will lead to a rapidly converging calculation, however, unless a reasonable separation is soon established between the diagonal elements of $\hat H_A^{(1)}$ and $\hat H_B^{(1)\dagger}$.

In the limit $n_B \gg n_A$, the quadratic algorithm, QDNR, requires effectively the same amount of computation per sweep through $\delta f$ as the linear algorithm SDNR. In both cases, the time consuming part of the calculation is the evaluation of $D^{(1)}_{\sigma r}$, and possibly the updating of $\hat H_A^{(1)}$ and $\hat H_B^{(1)\dagger}$, rather than the calculation of $\delta f_{\sigma r}$ from either (5.15) or (5.18).

Iteration on (5.15) for $\delta f_{\sigma r}$, while updating $(\hat H_A^{(1)})_{rr}$ but keeping $(\hat H_B^{(1)\dagger})_{\sigma\sigma}$ and $D^{(1)}_{\sigma r}$ fixed, is equivalent, if convergent, to using (5.18). This is not necessarily an efficient procedure, however.

5.2.c Methods Based on $D^{(2)}(f)$

The operator $\hat H_A^{(1)}(f)$ appearing in $D^{(1)}(f)$ must be considered to have errors of the same order as those in $f$ itself. As shown in chapter 3, however, the error in $\hat H_A^{(2)}$ is smaller, in some sense, the eigenvalues being unaffected in first order by a first order error in $f$. This insensitivity can be exploited in iterative procedures for solving the equation

$$D^{(2)}(f) = H_{BA} + H_{BB} f - f \hat H_A^{(2)} = 0, \tag{5.19}$$

which has the same solutions as does eq. (5.1).

Because of the inverse operator $g_A^{-1}$ in $\hat H_A^{(2)}$, the exact Jacobian matrix of $D^{(2)}(f)$ is no longer simple:

$$J^{(2)}_{\sigma r,\rho t} = H_{\sigma\rho}\,\delta_{rt} - (\hat H_A^{(2)})_{tr}\,\delta_{\sigma\rho} - \sum_s f_{\sigma s}\,\frac{\partial (\hat H_A^{(2)})_{sr}}{\partial f_{\rho t}}. \tag{5.20a}$$

Here, one has

$$\frac{\partial (\hat H_A^{(2)})_{sr}}{\partial f_{\rho t}} = (g_A^{-1} W_{BA}^{\dagger})_{s\rho}\,\delta_{rt} - (g_A^{-1} f^{\dagger})_{s\rho}\,(\hat H_A^{(2)})_{tr}. \tag{5.20b}$$

Because of the term involving the derivatives of the elements of $\hat H_A^{(2)}$, the exact Jacobian matrix is not at all sparse in general, unlike $J^{(1)}$ of eq. (5.14). However, since $\hat H_A^{(2)}$ varies slowly with $f$ near the solution of (5.19), it is expected that those elements of $J^{(2)}$ arising solely from the third term of (5.20a) will be relatively smaller than the remaining non-zero ones. On neglecting this term in (5.20a), the approximation

$$\frac{\partial D^{(2)}_{\sigma r}}{\partial f_{\rho t}} \approx H_{\sigma\rho}\,\delta_{rt} - (\hat H_A^{(2)})_{tr}\,\delta_{\sigma\rho} \tag{5.21}$$

is obtained. This gives the simple equation

$$H_{BB}\,\delta f - \delta f\,\hat H_A^{(2)} = -D^{(2)}(f) \tag{5.22}$$

as an approximation to the Newton-Raphson equations for the system (5.19). In contrast to (5.13), this equation involves the original $H_{BB}$ only, and not some modified $n_B \times n_B$ matrix. On the other hand, it is more complicated to update $\hat H_A^{(2)}$ than $\hat H_A^{(1)}$. For any change $\delta f$ in $f$, the change in $\hat H_A^{(2)}$ is given exactly by

$$\delta \hat H_A^{(2)} = g_A^{-1(\mathrm{new})} G_A^{(\mathrm{new})} - g_A^{-1(\mathrm{old})} G_A^{(\mathrm{old})} = g_A^{-1(\mathrm{new})}\bigl[\delta G_A - \delta g_A\,\hat H_A^{(2)}\bigr]$$

$$\phantom{\delta \hat H_A^{(2)}} = g_A^{-1(\mathrm{new})}\bigl[\delta f^{\dagger}\bigl(D^{(2)} + H_{BB}\,\delta f - \delta f\,\hat H_A^{(2)}\bigr) + W_{BA}^{\dagger}\,\delta f - f^{\dagger}\,\delta f\,\hat H_A^{(2)}\bigr], \tag{5.23}$$

where

$$W_{BA} = H_{BA} + H_{BB} f. \tag{5.24}$$

All quantities on the right side of (5.23) are before updating, except where explicitly indicated.

Since an $n_A \times n_A$ matrix inversion is required for each updating of $\hat H_A^{(2)}$, the use of (5.22) is efficient only if groups of elements of $f$ are changed simultaneously before updating $\hat H_A^{(2)}$. In application to large matrices, it is most efficient to change entire $n_A$-dimensional rows of $f$ at one time. For $n_B \gg n_A$, this leads to an algorithm requiring comparable work, per iterative sweep through $\delta f$, to algorithm SDNR (that is, of the order of $n_A n_B^2$ computational operations per sweep). As in SDNR, only single columns of the block $H_{BB}$ are required at one time.

Two iterative methods based on eq. (5.22) appear useful. The first is the simplest diagonal approximation, which corresponds to taking $\mathcal{A}$ of eq. (5.4) as the negative of the diagonal part of $J^{(2)}$. This leads to the iteration formula

$$\delta f_{\sigma r} = \frac{D^{(2)}_{\sigma r}}{(\hat H_A^{(2)})_{rr} - H_{\sigma\sigma}}, \qquad (r = 1, \ldots, n_A). \tag{5.25}$$

When $\delta f_{\sigma r}$ is given by this equation, the expression (5.23) for $\delta \hat H_A^{(2)}$ simplifies somewhat to

$$\delta \hat H_A^{(2)} = g_A^{-1(\mathrm{new})}\bigl[-(f^{(\mathrm{new})\dagger})_{A\sigma}\,\delta f_{\sigma A}\,\hat H_A^{(2)} + (W_{BA}^{\dagger})_{A\sigma}\,\delta f_{\sigma A} + (\delta f^{\dagger})_{A\sigma}\,\delta f_{\sigma A}\,\hat H_A^{(2)d}\bigr], \tag{5.26}$$

where $\hat H_A^{(2)d}$ is the diagonal part of $\hat H_A^{(2)}$, and where $(W_{BA}^{\dagger})_{A\sigma}$, $(\delta f^{\dagger})_{A\sigma}$, $(f^{(\mathrm{new})\dagger})_{A\sigma}$, and $(\delta f)_{\sigma A}$ denote, respectively, the $\sigma$th columns of $W_{BA}^{\dagger}$, $\delta f^{\dagger}$, and $f^{(\mathrm{new})\dagger}$, and the $\sigma$th row of $\delta f$.

The second method is to treat the $n_A$ equations in (5.22) for each fixed $\sigma$ as a system of simultaneous linear equations for the $\delta f_{\sigma r}$, $(r = 1, \ldots, n_A)$. This corresponds to taking $\mathcal{A}$ to be block diagonal, each diagonal block being the negative of the diagonal block of $J^{(2)}$ referring to a row of $\delta f$. The resulting iteration formula can be written

$$\delta f_{\sigma A} = D^{(2)}_{\sigma A}\,\bigl(\hat H_A^{(2)} - H_{\sigma\sigma} 1_A\bigr)^{-1}, \tag{5.27}$$

which, in practice, involves the solution of a system of $n_A$ simultaneous linear equations in $n_A$ unknowns. For this change $\delta f$, the first term of (5.23) vanishes, so that the updating formula for $\hat H_A^{(2)}$ reduces to

$$\delta \hat H_A^{(2)} = g_A^{-1(\mathrm{new})}\bigl[(W_{BA}^{\dagger})_{A\sigma}\,\delta f_{\sigma A} - (f^{\dagger})_{A\sigma}\,\delta f_{\sigma A}\,\hat H_A^{(2)}\bigr]. \tag{5.28}$$

This method involves somewhat more computation per sweep through $\delta f$ than the preceding one, but may be expected to converge in fewer overall iterations in certain cases where the off-diagonal elements of $H_{AA}$ are large.

The two procedures described above will be referred to as the "Diagonal Generalized Nesbet" (DGN), and the "Full Generalized Nesbet" (FGN) algorithms, respectively.
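The exact updating formula for $\hat H_A^{(2)}$ quoted in (5.23) can be checked with a few lines of algebra. The sketch below assumes the definitions $g_A = 1_A + f^{\dagger} f$, $G_A = H_{AA} + H_{AB} f + f^{\dagger} H_{BA} + f^{\dagger} H_{BB} f$ and $\hat H_A^{(2)} = g_A^{-1} G_A$, which come from earlier chapters and are not restated in this section; unlabelled quantities are those before the update.

```latex
\begin{align*}
g_A^{(\mathrm{new})}\,\delta\hat H_A^{(2)}
  &= G_A^{(\mathrm{new})} - g_A^{(\mathrm{new})}\hat H_A^{(2)}
   = \delta G_A - \delta g_A\,\hat H_A^{(2)}, \\
\delta g_A &= \delta f^{\dagger} f + f^{\dagger}\delta f
              + \delta f^{\dagger}\delta f, \\
\delta G_A &= \delta f^{\dagger} W_{BA} + W_{BA}^{\dagger}\,\delta f
              + \delta f^{\dagger} H_{BB}\,\delta f,
  \qquad W_{BA} = H_{BA} + H_{BB} f, \\
\delta G_A - \delta g_A\,\hat H_A^{(2)}
  &= \delta f^{\dagger}\bigl[\,(W_{BA} - f\hat H_A^{(2)})
       + H_{BB}\,\delta f - \delta f\,\hat H_A^{(2)}\bigr]
     + W_{BA}^{\dagger}\,\delta f - f^{\dagger}\,\delta f\,\hat H_A^{(2)} .
\end{align*}
% W_{BA} - f \hat H_A^{(2)} is D^{(2)}(f), so the square bracket is
% D^{(2)} + H_{BB}\,\delta f - \delta f\,\hat H_A^{(2)}.  It vanishes
% row by row when \delta f is taken from (5.27), which is why the
% first term drops out of the FGN update.
```

No approximation is made in these steps, so the formula holds for corrections of any size, not only near convergence.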
A precise statement of computational details is given in Appendix 4. In the case $n_A = 1$, they both reduce to an algorithm of Nesbet (1965). There are also certain similarities to that of Davidson (1975). Test calculations using them are described in section 5.2.g.

5.2.d Solution of the Newton-Raphson Equations by Descent Methods

The approximation of the full Newton-Raphson equations by much simpler equations, to avoid prohibitively costly calculations, reduces both the rate of convergence and the range of calculations for which convergence occurs. A major factor in non-convergence of any of the algorithms SDNR, QDNR, DGN, or FGN must be the neglect of some or all of the coupling between elements of $\delta f$ in the Newton-Raphson equations. The successive correction of individual (or at most a few) elements of $f$ can lead to very slow convergence ("spiralling"), and also divergence, in the case of systematic over-estimation of the elements of $\delta f$.

It is desirable to vary all of the elements of $f$ simultaneously, but using methods which are less costly than solving the Newton-Raphson equations exactly. The Gauss-Seidel method applied to the full Newton-Raphson equations, with updating of $J$ and $D$ only after one or more sweeps through $\delta f$ have been completed, is one possible way to approximate the coupling between the elements of $\delta f$. However, in this subsection, alternative methods based on the minimization of the residual, $J\,\delta f + D$, of the full Newton-Raphson equations will be examined.

When $\delta f$ is real, the solution of the Newton-Raphson equations is equivalent to determination of the stationary points of

$$Q(\delta f) = \tfrac{1}{2}\,\delta f^{\dagger} J\,\delta f + \delta f^{\dagger} D, \tag{5.29}$$

considered as a function of $\delta f$. Further, if $J$ is positive definite, the solutions of the Newton-Raphson equations are equivalent to local minima of (5.29), so that $\delta f$ can be determined using a gradient minimization technique. The eigenvalues of $J^{(1)}$, of eq. (5.14), are evidently the differences between the eigenvalues of $\hat H_B^{(1)}$ and $\hat H_A^{(1)}$. Thus, as long as all the eigenvalues of $\hat H_B^{(1)}$ are larger than all the eigenvalues of $\hat H_A^{(1)}$, the Jacobian $J^{(1)}$ will be positive definite. The eigenvalues of the Jacobian matrix $J^{(2)}$ are less easy to deduce because of the terms involving the derivatives of $\hat H_A^{(2)}$ in eq. (5.20a). If these derivatives are sufficiently small, then $J^{(2)}$ will be positive definite as long as some minimum separation is maintained between the largest eigenvalue of $\hat H_A^{(2)}$ and the smallest eigenvalue of $H_{BB}$. The condition that the Jacobian be positive definite implies that it is the lowest eigenvalues which are sought.

If the Jacobian matrix is not positive definite, the solution of the Newton-Raphson equations is equivalent to minimization of the functional

$$Q(\delta f) = (J\,\delta f + D)^{\dagger}(J\,\delta f + D) = D^{\dagger} D + D^{\dagger} J\,\delta f + \delta f^{\dagger} J^{\dagger} D + \delta f^{\dagger} J^{\dagger} J\,\delta f, \tag{5.30}$$

which is more difficult to handle than (5.29) because of the generally large dimension of $J$.

When $J$ is positive definite, the use of an iterative gradient minimization technique (such as the method of steepest descents or the method of conjugate gradients) to calculate $\delta f$ by minimizing (5.29) while holding $J$ and $D$ fixed involves modifying $\delta f$ as a whole by successive amounts $\alpha_i v_i$, where $\alpha_i$ is a scalar step length chosen to minimize $Q$ along the search direction $v_i$. The search directions $v_i$ are chosen equal to, or related to, the directions along which $Q$ changes most rapidly. Computational details of the application of these minimization techniques to quadratic forms like (5.29) are given by Ralston (1965, pp. 439-445).
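The two gradient minimization techniques named above can be sketched for a quadratic form of the type (5.29). This is a generic textbook formulation for a real, symmetric, positive-definite $J$ stored as a small dense matrix, with $\delta f$ flattened into a vector; it is not the procedure of Ralston (1965) or of the thesis verbatim, and the function names are assumptions.

```python
def matvec(j, v):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in j]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def steepest_descent(j, d, iterations):
    """Minimize Q(x) = x.J.x / 2 + x.d along successive negative gradients."""
    x = [0.0] * len(d)
    for _ in range(iterations):
        v = [-(g + di) for g, di in zip(matvec(j, x), d)]  # v = -(J x + d)
        jv = matvec(j, v)
        vjv = dot(v, jv)
        if vjv == 0.0:           # zero gradient: already at the minimum
            break
        alpha = dot(v, v) / vjv  # step length minimizing Q along v
        x = [xi + alpha * vi for xi, vi in zip(x, v)]
    return x


def conjugate_gradients(j, d, iterations):
    """Minimize the same Q using mutually J-conjugate search directions."""
    x = [0.0] * len(d)
    r = [-di for di in d]        # residual -(J x + d) at x = 0
    v = r[:]
    for _ in range(iterations):
        rr = dot(r, r)
        if rr == 0.0:
            break
        jv = matvec(j, v)
        alpha = rr / dot(v, jv)
        x = [xi + alpha * vi for xi, vi in zip(x, v)]
        r = [ri - alpha * ji for ri, ji in zip(r, jv)]
        beta = dot(r, r) / rr
        v = [ri + beta * vi for ri, vi in zip(r, v)]
    return x
```

With the illustrative data `J = [[4.0, 1.0], [1.0, 3.0]]` and `D = [-1.0, -2.0]`, two conjugate gradient steps already reproduce the minimizer (1/11, 7/11) to rounding error, while two steepest descent steps are only close to it. Each step needs one product of $J$ with a vector, which is the dominant cost when $J$ is large.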
If the steepest descent method is used with the Newton-Raphson equations based on $D^{(1)}(f)$, the most costly part of the minimization iteration is the determination of the step lengths $\alpha_i$, which involves evaluation of the scalar product

$$v_i^{\dagger} J^{(1)} v_i = \sum_{\sigma,\rho,r} (v_i)^{*}_{\sigma r}\,(\hat H_B^{(1)\dagger})_{\sigma\rho}\,(v_i)_{\rho r} - \sum_{\sigma,r,t} (v_i)^{*}_{\sigma r}\,(v_i)_{\sigma t}\,(\hat H_A^{(1)})_{tr}. \tag{5.31}$$

For $n_B \gg n_A$, this requires of the order of $n_A n_B^2$ computational operations. In the conjugate gradient method, an additional $n_A n_B^2$ operations are required to evaluate the product $v_i^{\dagger} J D$, necessary in determining $v_{i+1}$. Thus, if $m$ minimization iterations are carried out, the calculation of $\delta f$, including the initial evaluation of $D^{(1)}$ and $\hat H_B^{(1)}$, requires of the order of $(m+2)\,n_A n_B^2$ operations using steepest descents, and a minimum of $(2m+2)\,n_A n_B^2$ operations for conjugate gradients. This is roughly equivalent to $m+2$ and $2m+2$ iterations, respectively, of the algorithms discussed in the previous three subsections. The advantage here is that a very good estimate of $\delta f$ may be obtained for $m$ small, because the first few iterations in such minimization techniques frequently result in the greatest movement towards the minimum. Coupling between the elements of $\delta f$ is taken into account here, while the computation per iteration is still proportional to $n_A n_B^2$ for $n_B \gg n_A$, as is desired.

While such descent methods are not expected to be of much use in application to large matrices, they are very useful in the slightly more complicated self-consistent field problem in molecular orbital theory (see chapter 8), where it is considerably more costly to update $D(f)$ and $J(f)$, because of the complicated dependence on $f$ of the matrix being block diagonalized.

5.2.e Extremizing the Trace

Another alternative to eq. (5.1) is to determine $f$ such that the trace of the matrix $H$ over the eigenspace specified by $f$ is stationary (see section 2.1.e), that is, such that

$$E(f+\delta f) - E(f) = \delta E = \operatorname{tr}\,[P_A(f+\delta f) - P_A(f)]\,H = \delta \operatorname{tr} P_A H \tag{5.32}$$

vanishes to first order in $\delta f$. From eq. (2.39), this is equivalent to the vanishing of the quantity

$$\mathcal{D} = \nabla_f E = g_B^{-1}\,D^{(1)}(f)\,g_A^{-1}, \tag{5.33}$$

$\mathcal{D}_{\sigma r}$ being the derivative $\partial E / \partial f^{*}_{\sigma r}$. The derivatives of $\mathcal{D}_{\sigma r}$ with respect to the elements of $f^{*}$ and of $f$ are, respectively,

$$\frac{\partial \mathcal{D}_{\sigma r}}{\partial f^{*}_{\rho t}} = -\mathcal{D}_{\sigma t}\,(f g_A^{-1})_{\rho r} - (g_B^{-1} f)_{\sigma t}\,\mathcal{D}_{\rho r} \tag{5.34a}$$

and

$$\frac{\partial \mathcal{D}_{\sigma r}}{\partial f_{\rho t}} = -(g_B^{-1})_{\sigma\rho}\,(g_A^{-1} G_A g_A^{-1})_{tr} + (g_A^{-1})_{tr}\,(g_B^{-1} G_B g_B^{-1})_{\sigma\rho}. \tag{5.34b}$$

Thus, the Newton-Raphson equations for the system $\mathcal{D} = 0$ can be written

$$g_B^{-1}\bigl[\hat H_B^{(2)\dagger}\,\delta f - \delta f\,\hat H_A^{(2)}\bigr] g_A^{-1} - \bigl[\mathcal{D}\,\delta f^{\dagger} f\,g_A^{-1} + g_B^{-1} f\,\delta f^{\dagger}\,\mathcal{D}\bigr] = -\mathcal{D}. \tag{5.35}$$

On multiplying from the right by $g_A$ and from the left by $g_B$, these become

$$\hat H_B^{(2)\dagger}\,\delta f - \delta f\,\hat H_A^{(2)} = -D^{(1)} + \bigl[D^{(1)} g_A^{-1}\,\delta f^{\dagger} f + f\,\delta f^{\dagger} D^{(2)}\bigr]. \tag{5.36}$$

If $D^{(1)}$ and $D^{(2)}$ are considered to be of the same order as $\delta f$ as the solution is approached, then the last term of (5.36) is of higher order than the remaining three terms. If this term is neglected, the resulting equation is of the type (5.13) with the operators $\hat H_A^{(1)}$ and $\hat H_B^{(1)}$ replaced, respectively, by $\hat H_A^{(2)}$ and $\hat H_B^{(2)}$. Because the difference between $\hat H_A^{(2)}$, $\hat H_B^{(2)}$ and $\hat H_A^{(1)}$, $\hat H_B^{(1)}$ is of the same order as the term neglected, the resulting approximate equation is not necessarily an improvement over (5.13), despite the presence of the more accurate effective operators.

The exact equations, (5.35) and (5.36), could be significant for gradient minimization techniques, which can be set up so that divergence cannot occur. However, the evaluation of $\hat H_B^{(2)}$ involves the inversion of the $n_B \times n_B$ matrix $g_B$, as well as the formation of the product $g_B^{-1} G_B$. For $n_B \gg n_A$, these two computations could be prohibitive.

5.2.f Minimization of the Norm of D

A further alternative to determining $f$ by solution of (5.1) is to minimize the square of the Hilbert-Schmidt norm of $D$,

$$\|D\|^2 = \sum_{\sigma,r} |D_{\sigma r}|^2, \tag{5.37}$$

with respect to $f$. The required gradients are

$$\frac{\partial \|D\|^2}{\partial f_{\tau s}} = 2 \sum_{\sigma,r} D_{\sigma r}\,\frac{\partial D_{\sigma r}}{\partial f_{\tau s}} = 2\,\bigl(\hat H_B D - D \hat H_A^{\dagger}\bigr)_{\tau s}. \tag{5.38}$$

This approach is attractive because it involves a suitable convergence criterion directly. If a gradient minimization technique is used, it is easy to ensure that a maximal rate of convergence is maintained. By minimizing $\|D\|^2$ itself along some search direction in each iteration, problems of overshoot and undershoot can largely be avoided.

5.2.g Test Calculations

The algorithms described in sections 5.2.a - 5.2.c have been applied to a series of matrices based on that considered by Nesbet (1965). The off-diagonal elements of these matrices are all unity, and the diagonal elements are some combination of the first $n$ odd integers, 1, 3, 5, .... Matrices with dimensions up to 250 x 250 were considered, this being sufficient for testing. The calculations were carried out on an IBM 370/168 computer using double precision arithmetic.

The convergence criterion was based on the Hilbert-Schmidt norm, $\|D\| = (\operatorname{tr} D^{\dagger} D)^{1/2}$, of the particular form of $D(f)$ used in each method. A criterion based on the maximum change $\delta f_{\sigma r}$ in the elements of $f$ during an iterative sweep can also be useful. In all examples, the basis space $S_A$ is defined by the first $n_A$ basis functions in order, so that re-ordering the diagonal elements of $H$ is equivalent to varying $S_A$.

For convergent calculations, it was found, except for the first few iterations in some cases, that $\log \|D\|$ is usually well approximated as a linear function of the iteration number. That is, convergence was linear once the calculation stabilized, with the value of $\|D\|$ decreasing on the average by some constant factor for each iteration. This factor can be regarded as an average asymptotic error constant.
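The average asymptotic error constant can be estimated from a recorded sequence of norms by a least-squares fit of $\log \|D\|$ against iteration number. The sketch below is one straightforward way to do this; the function name and the `skip` argument (used to ignore an initial period before linear convergence sets in) are assumptions for this example.

```python
import math


def convergence_factor(norms, skip=0):
    """Average factor by which the norm changes per iteration.

    Fits log(norm) = a + b * m by least squares over the iterations
    m >= skip and returns exp(b).
    """
    points = [(m, math.log(v)) for m, v in enumerate(norms) if m >= skip]
    count = len(points)
    mean_m = sum(m for m, _ in points) / count
    mean_y = sum(y for _, y in points) / count
    sxx = sum((m - mean_m) ** 2 for m, _ in points)
    sxy = sum((m - mean_m) * (y - mean_y) for m, y in points)
    return math.exp(sxy / sxx)
```

For an exactly geometric sequence such as `[3.0 * 0.25 ** m for m in range(10)]` the function returns 0.25; smaller values mean faster convergence.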
Table 5.1 gives these error constants (or convergence rates) for a number of examples, to illustrate the effects of varying the size of the matrix, varying the differences between the diagonal elements of $H_{AA}$ and $H_{BB}$, and varying the ordering of the diagonal elements of the full matrix to change $S_A$. For comparison, Nesbet's algorithm (Nesbet, 1965) was used to obtain a single eigenvalue of each of the matrices considered. The square root, $\sigma$, of the variance for the approximate eigenvector, as defined in eq. (2.49), was used as the convergence criterion in this case, and $\log \sigma$ was also found to be a linear function of the iteration number. Note that the smallest numbers in Table 5.1 represent the fastest convergence.

For the basic Nesbet matrix, with $S_A$ the space corresponding to the $n_A$ smallest (or largest) diagonal elements of $H$, all methods converge to give the eigenspace of the $n_A$ smallest (or largest) eigenvalues. The rate of convergence varies little with $n_A$, either increasing or decreasing slightly as $n_A$ increases. When the largest eigenvalues are sought (or equivalently, when the off-diagonal elements are -1), convergence is considerably poorer for $n_A = 1$ than for $n_A = 5$, except for algorithm DGN. The five algorithms tested have rates of convergence generally comparable to the Nesbet method in cases where convergence is straightforward, as it is when the diagonal elements of $H$ are ordered monotonically and are well separated.

TABLE 5.1 Linear(a) Convergence Rates of the Algorithms in Selected Calculations.

[Table body not recoverable from the scanned source. Legible column headings: diagonal elements of the matrix, and method (SP, ..., Nesbet).]

(a) The tabulated numbers represent the average factor by which the norm $\|D\|$ is decreased per iteration, once a linear convergence rate is established. $S_A$ is spanned by the first $n_A$ basis functions. All off-diagonal elements are unity. The numbers are obtained by a least squares calculation of the slope of $\log \|D\|$ as a function of iteration number.
(b) The number of iterations before linear convergence is established is indicated in brackets to the right of the convergence factor when not zero.
(c) The eigenvalues of this matrix are 0.386, 2.461, 4.519, 6.573, 8.629, 10.691, 12.766, 14.868, 17.037, 22.072.
(d) Converges to the eigenvalues 0.386, 2.461, 4.519, 6.573, 10.691.
(e) Converges to the eigenvalues 0.386, 2.461, 4.519, 8.629, 10.691.
(f) Converges to the eigenvalues 0.386, 2.461, 4.519, 10.691, 12.766.
(g) Apparently converges to the eigenvalues 0.386, 2.461, 4.519, 10.691, 14.868.
(h) Converges to the eigenvalues 10.691, 12.766, 14.868, 17.037, 22.072.
(i) Converges to the eigenvalues 8.629, 12.766, 14.868, 17.037, 22.072.
(j) $\|D\|$ is oscillatory.
(k) $\|D\|$ is apparently divergent.
(l) $\|D\|$ becomes constant (approximately 4.34) after 25 iterations.
(m) $\sigma$ becomes constant or increases very slowly after about 50 iterations.
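Reference eigenvalues for test matrices of this kind can be generated independently of any of the iterative algorithms. The sketch below uses a standard property of a diagonal matrix plus a rank-one term, which is an addition here and not a method taken from the thesis: a matrix with distinct diagonal elements $h_i$ and all off-diagonal elements unity has eigenvalues satisfying $\sum_i 1/(\lambda - c_i) = 1$ with $c_i = h_i - 1$, one root in each interval between successive $c_i$ and one above the largest. The function name is hypothetical.

```python
def nesbet_eigenvalues(diagonal, iterations=200):
    """Eigenvalues of a matrix with distinct diagonal elements `diagonal`
    and all off-diagonal elements unity, by bisection on the secular
    equation sum_i 1 / (lam - c_i) = 1 with c_i = diagonal_i - 1."""
    c = sorted(d - 1.0 for d in diagonal)
    n = len(c)

    def g(lam):
        return sum(1.0 / (lam - ci) for ci in c) - 1.0

    roots = []
    for i in range(n):
        lo = c[i]
        # the largest root lies below c_max + n
        hi = c[i + 1] if i + 1 < n else c[-1] + n
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:
                break
            if g(mid) > 0.0:     # g decreases from +infinity on each interval
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots
```

Calling `nesbet_eigenvalues(range(1, 20, 2))` gives the spectrum of the 10 x 10 matrix with diagonal 1, 3, ..., 19, which can be compared with the eigenvalues listed in the footnotes to Table 5.1.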
Frequently, large matrices arising in various applications, for which only a few of the lowest eigenvalues and their eigenvectors are required, have diagonal elements arranged in roughly increasing order, with variation in the diagonal elements large compared to individual off-diagonal elements. As seen from the results in the first part of the table, these algorithms are well suited for such calculations.

The simple perturbation (SP) algorithm generally exhibits the poorest convergence rates in these examples, as may be expected, since it represents the crudest approximation to the exact Newton-Raphson equations. The algorithm DGN works relatively poorly in two cases for $n_A = 5$. Presumably, one of the diagonal elements of $\hat H_A^{(2)}$ approaches one of $H_{BB}$ too closely during the calculation. The algorithm QDNR has convergence rates identical to SDNR, because these two calculations differ significantly only at initial stages before linear convergence is established.

The effect of varying the spaces $S_A$ and $S_B$, as defined by the associated diagonal elements of $H$, is illustrated by the third part of Table 5.1. Rates of convergence deteriorate markedly when one or more diagonal elements of $H_{AA}$ exceed at least one diagonal element of $H_{BB}$. The algorithms DGN and FGN sometimes converge to different eigenspaces $S_A$ than do SDNR and QDNR. It is noteworthy that QDNR gives no improvement over SDNR in these examples, and is actually non-convergent in one case where SDNR converges well. The uncertain convergence is presumably due to one of the differences $(\hat H_B^{(1)\dagger})_{\sigma\sigma} - (\hat H_A^{(1)})_{rr}$, or $H_{\sigma\sigma} - (\hat H_A^{(2)})_{rr}$, appearing in the denominators of the iteration formulas, becoming small or changing sign.
The iteration formulas become ill-conditioned or even singular under such circumstances. The presence of an "induction period" before linear convergence is established is presumably associated with an initial uncertainty in the selection of a space $S_A$, when the diagonal elements of the approximate A-space and B-space effective operators are not well separated.

In principle, the eigenspace $S_A$, specified by the calculated $f$, may correspond to any group of $n_A$ eigenvalues of the matrix $H$. Thus, in principle, any subset of $n_A$ eigenvalues, none of whose eigenvectors are orthogonal to the subspace $S_A$ of the full basis space, can be calculated without previous determination of any of the other eigenvalues. However, the first three sections of Table 5.1 show that the iterative calculation converges best when the eigenspace $S_A$ corresponds to the $n_A$ lowest (or highest) eigenvalues of the matrix, and the basis subspace $S_A$ to the smallest (or largest) diagonal elements of the matrix. Deviations from this arrangement entail considerable risk of poor convergence or no convergence at all. If $n_A$ of the lowest $m$ eigenvalues ($m > n_A$) are desired, these convergence problems can be avoided by prediagonalizing a block of $H$ containing the $m$ smallest diagonal elements, as described in section 5.3.e.

The last section of Table 5.1 shows the superiority of the algorithms developed here over Nesbet's algorithm when the lowest few diagonal elements of the matrix are nearly equal, but well separated from the remaining ones. Generally, the f-operator calculated must correspond to the space $S_A$ spanned by the eigenvectors belonging to all of the nearly equal eigenvalues if good convergence rates are to be obtained.
However, a surprising feature of the results is that the algorithm SDNR performs very well, even when a diagonal element of $H_{BB}$ is relatively close to one of $H_{AA}$.

These computations do not indicate any clear-cut superiority of one algorithm in all cases. When convergence is straightforward, all converge effectively equally rapidly. When convergence is not straightforward, any one of the methods may be more stable or rapidly convergent than the others. However, the algorithm DGN appears to be less successful than SDNR and FGN, generally. The simple diagonal Newton-Raphson procedure, based on $D^{(1)}(f) = 0$, is somewhat easier to program efficiently for $n_A > 1$ than the methods based on $D^{(2)}(f)$, and from this standpoint is particularly attractive. In fact, in most cases, the rates of convergence for this method compare very favourably with those of the other, more complex, methods. The extra computation involved in using QDNR rather than SDNR appears to be of little value, in general, even though this represents a negligible amount of additional work as $n_B$ becomes very large. While SDNR yields only the approximation $\hat H_A^{(1)}$ directly, a calculation of $\hat H_A^{(2)}$ at the end of the iterative sequence takes only of the order of the time of one iteration, and can be carried out if the eigenvalues and eigenvectors of $H$ corresponding to $S_A$ are desired. For $n_A = 1$, SDNR offers an alternative to Nesbet's method, of comparable efficiency.

5.3 Generalization to a Non-orthonormal Basis - 2 x 2 Case

5.3.a General Considerations

When the basis is not orthonormal, the relation of the two off-diagonal blocks of the partitioning operator is more complicated than before (see section 2.3). The off-diagonal blocks, $f$ and $h$, of $T$, eq.
(2.2), can be defined by the pair of simultaneous equations representing the vanishing of the off-diagonal blocks of $G = T^\dagger H T$ and $g = T^\dagger S T$, namely,

$$G_{BA} = H_{BA} + H_{BB} f + h^\dagger H'_A = 0, \tag{5.39a}$$

and

$$g_{BA} = S_{BA} + S_{BB} f + h^\dagger S'_A = 0, \tag{5.39b}$$

where $H'_A = H_{AA} + H_{AB} f$, and $S'_A = S_{AA} + S_{AB} f$. Alternatively, these equations may be combined to give separate equations for $f$ and $h$,

$$D_{BA}(f) = H_{BA} + H_{BB} f - (S_{BA} + S_{BB} f)\hat H_A = 0, \tag{5.40a}$$

and

$$D_{AB}(h) = H_{AB} + H_{AA} h - (S_{AA} h + S_{AB})\hat H_B = 0, \tag{5.40b}$$

as in eqs. (2.113) and (2.114). Here, one has,

$$\hat H_A = S'^{-1}_A H'_A, \tag{5.41a}$$

and

$$\hat H_B = S'^{-1}_B H'_B, \tag{5.41b}$$

with $H'_B = H_{BB} + H_{BA} h$, and $S'_B = S_{BB} + S_{BA} h$.

Algorithms to calculate $f$ and $h$, or either separately, have again been founded on approximations of the full Newton-Raphson equations by less costly linearly convergent iteration schemes.

5.3.b Methods Based on $G_{BA}$ and $g_{BA}$ - A Generalization of Algorithm SDNR

A direct generalization of the SDNR algorithm, based on the equation $D^{(1)}_{BA}(f) = 0$, does not lead to an efficient computational scheme in a non-orthonormal basis. Because of the inverse matrix $S'^{-1}_A$, a change in even a single element of $f$ changes all of the elements of $\hat H_A^{(1)}$, making updating costly. However, a very simple procedure can be based on the simultaneous solution of eqs. (5.39), this procedure reducing to algorithm SDNR when the overlap matrix $S$ is replaced by $1_n$ and $h$ by $-f^\dagger$.

The Newton-Raphson equations corresponding to the system (5.39) can be written as the pair

$$H'^\dagger_B\,\delta f + \delta h^\dagger H'_A = -G_{BA}, \tag{5.42a}$$

and

$$S'^\dagger_B\,\delta f + \delta h^\dagger S'_A = -g_{BA}. \tag{5.42b}$$

These represent $2 n_A n_B$ equations for the $n_A n_B$ elements of $f$ and the $n_A n_B$ elements of $h$. The diagonal parts of these equations are of the form

$$\begin{pmatrix} (H'^\dagger_B)_{\sigma\sigma} & (H'_A)_{rr} \\ (S'^\dagger_B)_{\sigma\sigma} & (S'_A)_{rr} \end{pmatrix} \begin{pmatrix} \delta f_{\sigma r} \\ \delta h^*_{r\sigma} \end{pmatrix} = -\begin{pmatrix} G_{\sigma r} \\ g_{\sigma r} \end{pmatrix}, \tag{5.43}$$

with the solution,

$$\delta f_{\sigma r} = \frac{-G_{\sigma r}(S'_A)_{rr} + g_{\sigma r}(H'_A)_{rr}}{\Delta_{\sigma r}}, \tag{5.44a}$$

and

$$\delta h^*_{r\sigma} = \frac{-(H'^\dagger_B)_{\sigma\sigma}\, g_{\sigma r} + (S'^\dagger_B)_{\sigma\sigma}\, G_{\sigma r}}{\Delta_{\sigma r}}, \tag{5.44b}$$

where

$$\Delta_{\sigma r} = (H'^\dagger_B)_{\sigma\sigma}(S'_A)_{rr} - (S'^\dagger_B)_{\sigma\sigma}(H'_A)_{rr}. \tag{5.45}$$

A computational procedure based on these equations involves roughly double the work per iteration as algorithm SDNR. The quantities $G_{\sigma r}$ and $g_{\sigma r}$ are calculated as required, while $H'_A$, $S'_A$, and the diagonal elements of $S'^\dagger_B$ and $H'^\dagger_B$ are stored. These latter matrices are easily updated, because they are linear in $f$ and $h$. For a change $\delta f_{\sigma r}$ in $f$ and $\delta h^*_{r\sigma}$ in $h^\dagger$, one has

$$(\delta H'_A)_{sr} = H_{s\sigma}\,\delta f_{\sigma r}, \qquad (\delta S'_A)_{sr} = S_{s\sigma}\,\delta f_{\sigma r}, \qquad (s = 1, \ldots, n_A), \tag{5.46a}$$

and

$$(\delta H'^\dagger_B)_{\sigma\sigma} = \delta h^*_{r\sigma} H_{r\sigma}, \qquad (\delta S'^\dagger_B)_{\sigma\sigma} = \delta h^*_{r\sigma} S_{r\sigma}. \tag{5.46b}$$

Precise computational details of this procedure, designated as the "Simple Diagonal Newton-Raphson with Overlap" (SDNRS) algorithm, are given in Appendix 4.

Again, a quadratic generalization (here QDNRS) can be obtained. Equations for the exact corrections $\delta f$ and $\delta h^\dagger$, required to reduce $G_{BA}$ and $g_{BA}$ exactly to zero, are obtained from (5.39) in the form,

$$H'^\dagger_B\,\delta f + \delta h^\dagger H'_A + \delta h^\dagger H_{AB}\,\delta f = -G_{BA}, \tag{5.47a}$$

and

$$S'^\dagger_B\,\delta f + \delta h^\dagger S'_A + \delta h^\dagger S_{AB}\,\delta f = -g_{BA}. \tag{5.47b}$$

The corresponding "diagonal equations",

$$(H'^\dagger_B)_{\sigma\sigma}\,\delta f_{\sigma r} + \delta h^*_{r\sigma}(H'_A)_{rr} + \delta h^*_{r\sigma} H_{r\sigma}\,\delta f_{\sigma r} = -G_{\sigma r}, \tag{5.48a}$$

and

$$(S'^\dagger_B)_{\sigma\sigma}\,\delta f_{\sigma r} + \delta h^*_{r\sigma}(S'_A)_{rr} + \delta h^*_{r\sigma} S_{r\sigma}\,\delta f_{\sigma r} = -g_{\sigma r}, \tag{5.48b}$$

can be combined to give $\delta f_{\sigma r}$ and $\delta h^*_{r\sigma}$ as the roots of quadratic equations. The correction to $f$ is the smallest root of

$$A\,\delta f_{\sigma r}^2 + B\,\delta f_{\sigma r} + C = 0, \tag{5.49}$$

where

$$A = (S'^\dagger_B)_{\sigma\sigma} H_{r\sigma} - S_{r\sigma}(H'^\dagger_B)_{\sigma\sigma}, \qquad B = -\Delta_{\sigma r} - S_{r\sigma} G_{\sigma r} + H_{r\sigma} g_{\sigma r}, \qquad C = g_{\sigma r}(H'_A)_{rr} - G_{\sigma r}(S'_A)_{rr}. \tag{5.50}$$

The correction to $h^\dagger$ is then

$$\delta h^*_{r\sigma} = \frac{-G_{\sigma r} - (H'^\dagger_B)_{\sigma\sigma}\,\delta f_{\sigma r}}{(H'_A)_{rr} + H_{r\sigma}\,\delta f_{\sigma r}}. \tag{5.51}$$

Precise computational details again appear in Appendix 4.
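The diagonal solves of eqs. (5.43) - (5.45) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SDNRS program of Appendix 4: it treats real symmetric $H$ and $S$ (so that $h^\dagger$ is simply the transpose of $h$), and it rebuilds $T^\dagger H T$ and $T^\dagger S T$ from scratch for every element instead of using the inexpensive updates of eqs. (5.46). The function names are hypothetical.

```python
def matmul(A, B):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def partition_products(H, S, f, h, nA):
    """Return (G, g, HT, ST) for T = [[1, h], [f, 1]], with real symmetric H and S."""
    n, nB = len(H), len(H) - nA
    T = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for s in range(nB):
        for r in range(nA):
            T[nA + s][r] = f[s][r]   # lower-left block: f (n_B x n_A)
            T[r][nA + s] = h[r][s]   # upper-right block: h (n_A x n_B)
    Tt = [list(col) for col in zip(*T)]
    HT, ST = matmul(H, T), matmul(S, T)
    return matmul(Tt, HT), matmul(Tt, ST), HT, ST


def sdnrs_sweep(H, S, f, h, nA):
    """One sweep of the 2 x 2 'diagonal' solves, updating f and h in place."""
    nB = len(H) - nA
    for s in range(nB):
        for r in range(nA):
            G, g, HT, ST = partition_products(H, S, f, h, nA)
            G_sr, g_sr = G[nA + s][r], g[nA + s][r]
            hB, hA = HT[nA + s][nA + s], HT[r][r]   # (H'_B^dagger)_ss and (H'_A)_rr
            sB, sA = ST[nA + s][nA + s], ST[r][r]   # (S'_B^dagger)_ss and (S'_A)_rr
            delta = hB * sA - sB * hA               # determinant of the 2 x 2 system
            f[s][r] += (-G_sr * sA + g_sr * hA) / delta
            h[r][s] += (-hB * g_sr + sB * G_sr) / delta


if __name__ == "__main__":
    H = [[1.0, 1.0], [1.0, 3.0]]
    S = [[1.0, 0.2], [0.2, 1.0]]
    f, h = [[0.0]], [[0.0]]
    for _ in range(6):
        sdnrs_sweep(H, S, f, h, nA=1)
    print(f[0][0], h[0][0])
```

For this 2 x 2 example the blocks are single numbers, so the "diagonal" equations coincide with the full Newton-Raphson equations. With $S$ set to the unit matrix and $h = -f^\dagger$, the block $g_{BA}$ returned by `partition_products` vanishes identically, which is the reduction to the orthonormal case noted in the text.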
5.3.c Methods Based on $D^{(2)}(f)$ - Generalized Nesbet Procedures

As explained in section 3.2, equation (5.40a) can be understood either as $D^{(1)}(f) = 0$, or $D^{(2)}(f) = 0$, according as $\hat H_A$ is taken as the approximation $\hat H_A^{(1)}$ or $\hat H_A^{(2)}$. In either case, the Jacobian matrix has the elements

$$\frac{\partial D^{(i)}_{\sigma r}}{\partial f_{\rho t}} = H_{\sigma\rho}\,\delta_{rt} - S_{\sigma\rho}(\hat H_A^{(i)})_{tr} - \left[(S_{BA} + S_{BB} f)\,\frac{\partial \hat H_A^{(i)}}{\partial f_{\rho t}}\right]_{\sigma r}. \tag{5.52}$$

Even without the last term involving the derivatives of $\hat H_A^{(i)}$, this matrix is no longer sparse, and the convergence of iterative methods based on "diagonal" approximations may be adversely affected. For $D^{(1)}(f)$, this Jacobian matrix can be written,

$$\frac{\partial D^{(1)}_{\sigma r}}{\partial f_{\rho t}} = \left[H_{\sigma\rho} - Y_{\sigma A} S'^{-1}_A H_{A\rho}\right]\delta_{rt} - \left[S_{\sigma\rho} - Y_{\sigma A} S'^{-1}_A S_{A\rho}\right](\hat H_A^{(1)})_{tr}, \tag{5.53}$$

with $Y_{BA} = S_{BA} + S_{BB} f$. This is considerably more complicated than before, and must be handled in a similar way to the $D^{(2)}$ methods.

The $D^{(2)}$ (generalized Nesbet) methods are extended to a non-orthonormal basis straightforwardly. As before, it is reasonable to neglect the derivatives of $\hat H_A^{(2)}$, giving,

$$\frac{\partial D^{(2)}_{\sigma r}}{\partial f_{\rho t}} = H_{\sigma\rho}\,\delta_{rt} - S_{\sigma\rho}(\hat H_A^{(2)})_{tr}. \tag{5.54}$$

For some change $\delta f$ in $f$, the operator $\hat H_A^{(2)}$ is updated according to

$$\delta \hat H_A^{(2)} = g_A^{-1}(\text{new})\left[\delta f^\dagger\left(D^{(2)} + H_{BB}\,\delta f - S_{BB}\,\delta f\,\hat H_A^{(2)}\right) + W^\dagger_{BA}\,\delta f - Y^\dagger_{BA}\,\delta f\,\hat H_A^{(2)}\right], \tag{5.55}$$

where $W_{BA} = H_{BA} + H_{BB} f$ and $Y_{BA} = S_{BA} + S_{BB} f$.

The "Diagonal Generalized Nesbet with Overlap" (DGNS) iteration formula is

$$\delta f_{\sigma r} = \frac{D^{(2)}_{\sigma r}}{S_{\sigma\sigma}(\hat H_A^{(2)})_{rr} - H_{\sigma\sigma}}, \qquad (r = 1, \ldots, n_A), \tag{5.56}$$

for which the updating formula for $\hat H_A^{(2)}$, eq. (5.55), becomes

$$\delta \hat H_A^{(2)} = g_A^{-1}(\text{new})\left[(W^\dagger_{BA})_{A\sigma}\,\delta f_{\sigma A} - (Y^\dagger_{BA})_{A\sigma}\,\delta f_{\sigma A}\,\hat H_A^{(2)} - S_{\sigma\sigma}\,\delta f^\dagger_{\sigma A}\,\delta f_{\sigma A}\left(\hat H_A^{(2)} - \hat H_A^{(2)d}\right)\right], \tag{5.57}$$

where $\hat H_A^{(2)d}$ denotes the diagonal part of $\hat H_A^{(2)}$.

The "Full Generalized Nesbet with Overlap" (FGNS) iteration formula is

$$\delta f_{\sigma A} = D^{(2)}_{\sigma A}\left[S_{\sigma\sigma}\hat H_A^{(2)} - H_{\sigma\sigma} 1_A\right]^{-1}, \tag{5.58}$$

which again, in practice, is treated as a system of $n_A$ simultaneous linear equations. The first term of eq. (5.55) now vanishes, so that the updating formula for $\hat H_A^{(2)}$ becomes

$$\delta \hat H_A^{(2)} = g_A^{-1}(\text{new})\left[(W^\dagger_{BA})_{A\sigma}\,\delta f_{\sigma A} - (Y^\dagger_{BA})_{A\sigma}\,\delta f_{\sigma A}\,\hat H_A^{(2)}\right]. \tag{5.59}$$

For both algorithms DGNS and FGNS, approximately twice the computation is required per sweep through $f$, as for their counterparts in an orthonormal basis. A precise statement of computational details is given in Appendix 4.

5.3.d Other Methods

For a non-orthonormal basis, the gradient of the trace of the matrix $H$ over the image space, $S_A$, of the projection $P_A$ is given in terms of a matrix $D$.

[Equations (5.60) and (5.61), which give this gradient and define $D$, are illegible in the source.]

As this trace is stationary if and only if $S_A$ is an eigenspace of $H$ (section 2.1.e), one way to determine $f$ is to solve the equation $D = 0$. Using eqs. (3.11) - (3.15), eq. (5.61) can be transformed to

$$D = D^{(2)}(f)\, g_A^{-1}, \tag{5.62}$$

which vanishes, as it should, when $D^{(1)}(f)$ and $D^{(2)}(f)$ vanish. This equation reduces to eq. (2.45) in an orthonormal basis. Algebraic expressions for the derivatives of $D$ with respect to the elements of $f$ can be obtained without difficulty, and give the Newton-Raphson equations for the system $D = 0$.

[Equation (5.63), the Newton-Raphson equations for $D = 0$, is illegible in the source.]

These are somewhat more complicated than the previous eq. (5.35), but not hopelessly so.

The Newton-Raphson equations, (5.42), or those arising from $D^{(1)}(f) = 0$, or $D^{(2)}(f) = 0$, can be solved for $\delta f$ using descent methods as described in section 5.2.d. As before, the costly part of the minimization iteration, the evaluation of products like $v^T J v$, generally requires of the order of $n_A^2 n_B^2$ computational steps, in addition to the work required in calculating the Jacobian, and the other vectors entering the product. For the Newton-Raphson equations, (5.42), reduction in computation by a factor of $\tfrac{1}{2} n_A$ results if the blocked structure of the Jacobian is explicitly taken into account, yielding,

$$v^T J v = \sum_{\sigma,\rho,r}\left[(v_f)_{\sigma r}(H'^\dagger_B)_{\sigma\rho}(v_f)_{\rho r} + (v_h)_{r\sigma}(S'^\dagger_B)_{\sigma\rho}(v_f)_{\rho r}\right] + \sum_{\sigma,r,t}\left[(v_f)_{\sigma r}(H'_A)_{tr}(v_h)_{t\sigma} + (v_h)_{r\sigma}(S'_A)_{tr}(v_h)_{t\sigma}\right], \tag{5.64a}$$

where the search vector, $v$, has been divided into an $f$ part and an $h$ part,

$$v = \begin{pmatrix} v_f \\ v_h \end{pmatrix}. \tag{5.64b}$$

Equation (5.64a) represents of the order of $2 n_A n_B^2$ computational operations for $n_B \gg n_A$. As for an orthonormal basis, then, the approximate calculation of $\delta f$ (or $\delta f$ and $\delta h$) from the Newton-Raphson equations using a gradient minimization procedure is as costly as several iterations in algorithms SDNRS, QDNRS, DGNS, or FGNS.

A third alternative is to determine $f$ by minimizing the Hilbert-Schmidt norm of $G_{BA}$ and $g_{BA}$, or of $D^{(1)}(f)$ or $D^{(2)}(f)$ directly. Only the scheme based on $G_{BA}$ and $g_{BA}$ is considered here because the derivatives of $G_{BA}$ and $g_{BA}$ with respect to $f$ and $h$ are particularly simple, and because the form of the quantity to be minimized is not as simple as (5.37). Since $G_{BA}$ has dimensions of energy, whereas $g_{BA}$ is dimensionless, the quantity to be minimized should be of the form

$$N = \|G_{BA}\|^2 + a^2\|g_{BA}\|^2 = \sum_{\sigma,r}\left(|G_{\sigma r}|^2 + a^2 |g_{\sigma r}|^2\right), \tag{5.65}$$

where $a$ is a constant scale factor with dimensions of energy. The first derivatives of $N$ are

$$\frac{\partial N}{\partial f_{\rho s}} = 2\left(H'_B G_{BA} + a^2 S'_B g_{BA}\right)_{\rho s}, \tag{5.66a}$$

and

$$\frac{\partial N}{\partial h^\dagger_{\rho s}} = 2\left(G_{BA} H'^\dagger_A + a^2 g_{BA} S'^\dagger_A\right)_{\rho s}. \tag{5.66b}$$

Actual test calculations are required to develop criteria for the choosing of $a$. It is desirable to choose $a$ in some way which maximizes the rate of convergence, but such a criterion is not easily translated into an algebraic condition on $a$. The interpretation of $a$ as an average energy scaling factor suggests $a = n_A^{-1}\,\mathrm{tr}\,G_A$.
5.3.e Choice of an Initial Estimate, and Improvement of Convergence Rates

When the off-diagonal elements of $H$ are small compared to differences between diagonal elements in $H_{AA}$ and $H_{BB}$, the matrix elements of $f$ are small, and a reasonable (and practical) starting approximation in an iterative calculation is $f_0 = 0$. An improved starting approximation may be provided by the solution of a similar problem when available, or more easily calculated. For example, a possible starting estimate of $f$ for a non-orthonormal basis is an approximate solution of the corresponding problem with $S$ replaced by a unit matrix (orthonormal basis). Similarly, the operator,

$$f_0 = \begin{pmatrix} f_{\bar m} \\ 0 \end{pmatrix}, \tag{5.67a}$$

for the matrix $H_0$, related to $H$ by

$$H_0 = \begin{pmatrix} H_{mm} & 0 \\ 0 & H_{\bar B\bar B} \end{pmatrix} = \begin{pmatrix} H_{AA} & H_{A\bar m} & 0 \\ H_{\bar m A} & H_{\bar m\bar m} & 0 \\ 0 & 0 & H_{\bar B\bar B} \end{pmatrix}, \tag{5.67b}$$

with $m > n_A$, $\bar m = m - n_A$, and $n_{\bar B} = n_B - \bar m$, will also be an improved initial estimate for $f$, especially when $H_{\bar m A}$ contains the most significant elements of $H_{BA}$. If $m$ is not too large, the $(m - n_A) \times n_A$ block $f_{\bar m}$ is easily calculated from the eigenvectors of the block $H_{mm}$, eq. (5.67b), using eq. (2.3). The idea here is to improve the initial estimate of the larger elements of $f$.

The consideration of asymptotic error constants and rates of convergence (Appendix 5) implies that an improved initial estimate of $f$ may make the difference between convergence and divergence, but, in general, will have little effect on the rate of convergence eventually established. This has been borne out in test calculations.

Generally, the rate of linear convergence in these algorithms is inversely related to the ratio between the off-diagonal elements of $H$ and the denominators occurring in the iteration formulas.
Thus, the rate of convergence will be increased if these ratios are decreased by carrying out a linear transformation to reduce the size of the off-diagonal elements of $H$, and perhaps increase the size of the denominators in the iteration formulas. Therefore, a partial diagonalization of $H$ to reduce to zero those off-diagonal elements which are coefficients of the potentially largest errors in the error formulas given in Appendix 5, followed by the iterative calculation of $f$ (with $f_0 = 0$) in this new basis, will result in improved rates of convergence. The desired mapping, $f$, in the original basis is obtained using the transformation equations given in section 2.1.f. Typically, this prediagonalization would involve an $m \times m$ block of $H$ ($m > n_A$) containing $H_{AA}$ in addition to that part of the remainder of $H$ with the strongest coupling to $H_{AA}$.

This prediagonalization is especially useful when some denominators in the iteration formulas are small (implying that $H_{AA}$ and $H_{BB}$ have some nearly equal or equal diagonal elements), since in the new basis, these denominators may be much larger, and rates of convergence correspondingly become significantly improved. If the diagonal elements of $H_{AA}$ and $H_{BB}$ are initially well separated, the effect of prediagonalizing a small block of a relatively larger matrix may not be as noticeable.

It must be emphasized that this procedure is not the same thing as the prediagonalization procedure described earlier to obtain the starting approximation $f_0$ of eq. (5.67a). The linear basis transformation corresponds to a nonlinear transformation on the elements of $f$, and the metric properties of the iteration formula are changed, thereby changing the entire character of the iterative calculation.

It is easily seen that for $n_B \gg m > n_A$,
the transformation of $H$ to the new basis, given by the columns of the matrix $V$ relative to the old basis, and the subsequent back-transformation of $f$ requires at most of the order of $n_A n_B$ operations, because the greater part of the forward transformation matrix is a unit matrix, that is

$$V = \begin{pmatrix} V_{mm} & 0 \\ 0 & 1_{\bar B} \end{pmatrix} = \begin{pmatrix} V_{AA} & V_{A\bar m} & 0 \\ V_{\bar m A} & V_{\bar m\bar m} & 0 \\ 0 & 0 & 1_{\bar B} \end{pmatrix}. \tag{5.68}$$

Here $V_{mm}$ is the $m \times m$ matrix of the eigenvectors of the $m \times m$ block of $H$, $1_{\bar B}$ is an $(n-m) \times (n-m)$ unit matrix, and $\bar m = m - n_A$. The transformed matrix $\tilde H$ is then

$$\tilde H = V^\dagger H V = \begin{pmatrix} V_{mm}^\dagger H_{mm} V_{mm} & V_{mm}^\dagger H_{m\bar B} \\ H_{\bar B m} V_{mm} & H_{\bar B\bar B} \end{pmatrix}. \tag{5.69a}$$

[Equation (5.69b), the blocked form of $\tilde H$, and equation (5.70), the reverse transformation for $f$, are illegible in the source.]

The eigenvectors in $V_{mm}$ are normalized with respect to the corresponding $m \times m$ block of $S$, that is, $V_{mm}^\dagger S_{mm} V_{mm} = 1_m$, and therefore, the inverse of the transformation $V^\dagger$ is

$$(V^\dagger)^{-1} = \begin{pmatrix} S_{mm} V_{mm} & 0 \\ 0 & 1_{\bar B} \end{pmatrix}. \tag{5.71}$$

Using this, the transformation (5.70) is rewritten in terms of blocks of the product $SV$, with the operator $\tilde f$ in the partially diagonalizing basis divided into its $\bar m$ and $\bar B$ parts.

[Equation (5.72) is illegible in the source.]

The evaluation of the right hand side of eq. (5.72) requires of the order of $n_A^2 n_B$ operations when $n_B \gg n_A$. No direct handling or manipulation of an $n \times n$ matrix is required.

5.3.f Test Calculations With Overlap

A series of test calculations has been carried out using algorithms SDNRS, QDNRS, DGNS, and FGNS. In the model problems examined, the basic matrix $H$ was the same as that used previously in the calculations without overlap, namely, with diagonal elements equal to the first $n$ odd integers, and the off-diagonal elements all unity.
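The model problem of this section is easy to set up numerically. The sketch below builds the matrix $H$ just described, together with an overlap matrix under the assumption $S_{ij} = a^{|i-j|}$, which is how eq. (5.73) is read here. It then checks positive definiteness by attempting a Cholesky factorization. All function names are hypothetical, and the Cholesky test is only one convenient way of checking definiteness, not a procedure taken from the thesis.

```python
import math


def model_hamiltonian(n):
    """Diagonal elements = first n odd integers, all off-diagonal elements 1."""
    return [[float(2 * i + 1) if i == j else 1.0 for j in range(n)] for i in range(n)]


def model_overlap(n, a):
    """Assumed form of the overlap matrix: S_ij = a**|i - j|."""
    return [[a ** abs(i - j) for j in range(n)] for i in range(n)]


def cholesky_pivots(S):
    """Squared diagonal entries of the Cholesky factor of S.

    All pivots are positive exactly when S is positive definite; the loop
    stops at the first non-positive pivot.
    """
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    pivots = []
    for j in range(n):
        d = S[j][j] - sum(L[j][k] ** 2 for k in range(j))
        pivots.append(d)
        if d <= 0.0:
            break  # not positive definite; avoid the square root
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            L[i][j] = (S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return pivots


if __name__ == "__main__":
    print(cholesky_pivots(model_overlap(6, 0.5)))  # every pivot positive
    print(cholesky_pivots(model_overlap(6, 1.0)))  # second pivot vanishes
```

For this assumed form of $S$ the pivots after the first all equal $1 - a^2$, so they shrink to zero as $a$ approaches unity and the factorization breaks down at $a = 1$, where $S$ becomes the rank-one matrix of ones.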
The overlap matrices were of the form

$$S = \begin{pmatrix} 1 & a & a^2 & \cdots & a^{n-1} \\ a & 1 & a & \cdots & a^{n-2} \\ \vdots & & \ddots & & \vdots \\ a^{n-1} & a^{n-2} & \cdots & a & 1 \end{pmatrix}. \tag{5.73}$$

This matrix is positive definite for all $a < 1$. It resembles the overlap matrix for a linear chain of atoms, with overlap falling off with distance ($S_{ij} = a^{|i-j|}$), but it also serves to model a configuration interaction calculation in which overlaps fall off with energy differences. For $a = 0$, the orthonormal case is recovered, while as $a$ approaches the maximum value unity, the eigenvalue equation, (2.101), becomes ill-conditioned. At $a = 1$, all but one of the eigenvalues of $S$ vanish, and the eigenvalue equation is singular. (Footnote: For large $n$, the eigenvalues of (5.73) will not differ significantly from those for the corresponding "circulant" matrix (Rutherford, 1949) of the same dimension. For such a matrix, it can be shown generally that the eigenvalues range between $(1-a)/(1+a)$ and $(1+a)/(1-a)$, with the greatest concentration of eigenvalues near the lower end as $a$ approaches unity.)

All other computational details are the same as for the calculations of Table 5.1. As before, $S_A$ is the space of the basis functions corresponding to the first $n_A$ diagonal elements of $H$. The results of three series of calculations are given in Tables 5.2 - 5.4, which include information on the effect of varying the initial approximations to $f$ and $h$, and of prediagonalization. As before, the rates of convergence decreased only slowly with increasing dimension of the eigenvalue problem.

Table 5.2 shows how convergence rates vary as the overlap integral $a$ increases from zero to 0.9, with $n_A$ and $n$ held constant. It is seen that all the calculations diverge between $a = 0.4$ and $a = 0.6$, except those with prediagonalization, for which the upper limit overlap is between 0.8 and 0.9. In this case, the rate of convergence of the algorithms SDNRS and QDNRS at first deteriorates only slowly, but changes abruptly to divergence between $a = 0.8$ and $a = 0.9$. For DGNS and FGNS, the deterioration of convergence rates and onset of divergence is more gradual.

Initializations in this series of calculations tend to favour convergence to eigenspaces corresponding to the $n_A$ lowest eigenvalues. Except where noted, convergence in all cases was to the space $S_A$ corresponding to the $n_A$ lowest eigenvalues. The only other combination of eigenvalues obtained from a convergent calculation consisted of the $n_A - 1$ lowest eigenvalues together with the largest eigenvalue. A possible explanation of this is that the iterative corrections are reflecting the situation which would occur if $a$ actually were greater than one. As $a$ increases through unity, the largest eigenvalue increases to $+\infty$, re-emerges from $-\infty$, and becomes the new lowest eigenvalue, while the corresponding eigenvector direction presumably changes little.

TABLE 5.2

[The caption and body of Table 5.2 are illegible in the source and are not reproduced.]

(a) $f_0 = 0$, no basis change.
(b) $f_0$ calculated from eigenvectors of upper diagonal 10 x 10 block.
(c) $f_0 = 0$, iterative calculation carried out in basis diagonalizing upper diagonal 10 x 10 block.
(d) $f_0 = f$ calculated for $a = 0$ (non-overlap case).
(e) Blank spaces indicate no calculation was carried out; bracketed numbers indicate number of iterations before linear convergence is established (error constants were determined neglecting these points).
(f) Convergence to lowest four eigenvalues, and largest eigenvalue, $\|D\|$ slowly decreasing.
(g) $\|D\| = 0.34$ after 50 iterations, but convergence apparently is to lowest 5 eigenvalues.
(h) $\|D\|$ slowly increasing; calculation may be divergent.
(i) $\|D\| \approx 10$ after 50 iterations and is oscillatory.
(j) Calculation divergent if iterative scheme restarted in original basis.
(k) $\|D\|$ diverges slowly in partially diagonalizing basis, but begins to converge in original basis.
(l) Calculation is possibly converging slowly after 28 iterations.
(m) Indicates that iterative calculation is divergent, $\|D\|$ increases over most of first 50 iterations.

Table 5.2 shows that the rate of convergence, once linear convergence is established, is effectively independent of the starting $f$. An improvement of the initial $f$ may slightly reduce the overall number of iterations, but does not increase the rate of convergence.

These results also illustrate the substantial improvement in convergence rates (as well as the substantially wider range of values of $a$ over which convergence is obtained) resulting from prediagonalization of a small block of $H$. This improvement is not due to the improved starting approximation, but to the change of basis.

Table 5.3 gives rates of convergence for a set of calculations in which the basis space $S_A$ does not correspond to the $n_A$ smallest diagonal elements of $H$. It is seen that convergence rates are very poor indeed, and that a large proportion of the calculations did not converge at all. When convergent, DGNS and FGNS did not give the lowest $n_A$ eigenvalues in these calculations. However, except with prediagonalization, SDNRS and QDNRS still give the lowest $n_A$ eigenvalues,
the rates of convergence being far superior to those of DGNS and FGNS in these cases.

TABLE 5.3

[The caption and body of Table 5.3 are illegible in the source and are not reproduced.]

(a) Bracketed numbers indicate the number of iterations before linear convergence is established. These points are ignored in calculating the rates of convergence. Convergence is to the lowest $n_A$ eigenvalues in all calculations, unless otherwise noted.
(b) $f_0 = 0$, no basis change.
(c) $f_0$ calculated from eigenvectors of upper diagonal 10 x 10 block.
(d) $f_0 = 0$, iterative calculation carried out in basis diagonalizing upper diagonal 10 x 10 block.
(e) $f_0 = f$ calculated for $a = 0$ (non-overlap case).
(f) Converges to eigenvalues #1, 2, 3, 4, and 6.
(g) Slow convergence (or possibly slow divergence), eigenvalues after 50 iterations apparently #1, 2, 3, 5, and 6.
(h) Slow oscillation in $\|D\|$, eigenvalues after 50 iterations are #1, 2, 3, 5, and 6.
(i) Convergence apparently to eigenvalues #1, 2, 3, 6, and 7; modification (d) unstable after transformation back to original basis.
(j) Convergence to eigenvalues #1, 2, 3, 6, and 7; convergence continues after back transformation.

On the other hand, comparatively good rates of convergence to eigenspaces $S_A$ not corresponding to the lowest $n_A$ eigenvalues of $H$ were obtained if the iterative calculation was carried out after prediagonalization ($m = 2 n_A$ in Table 5.3). In fact, it is clear that prediagonalization is necessary if higher eigenvalues are to be obtained reliably and efficiently.
If the resulting back-transformed f-operator was used as an initial approximation for a calculation in the original basis, the calculation diverged rapidly in a number of cases, even though this initial approximation to $f$ yielded values for $\|D^{(2)}\|$ or $[\sum_{\sigma,r}(G_{\sigma r}^2 + g_{\sigma r}^2)]^{1/2}$ which were less than $10^{-12}$. This indicates that in certain cases, no improvement in the starting approximation for $f$ (without also changing the basis) will lead to convergence: the asymptotic error constants defined in Appendix 5 must be predominantly greater than one, leading to an increase in the errors $\epsilon_{\sigma r}$ in $f_{\sigma r}$, regardless of how small the $\epsilon_{\sigma r}$ are initially. By transforming to the partially diagonalizing basis, the most important of these error constants are reduced to zero, and convergence occurs.

The calculations using the generalized Nesbet algorithms, DGNS and FGNS, frequently consisted of a few initial iterations during which $\|D^{(2)}\|$ changed relatively rapidly, either increasing, or decreasing, or both, followed by a region of apparent convergence in which $\|D^{(2)}\|$ decreased extremely slowly. In such cases, it was not unusual for $\|D^{(2)}\|$ to decrease by only one part in some power of ten per iteration (the exponents are illegible in the source). In many of these calculations where convergence was very slow, certain of the $n_A$ eigenvalues of the effective operator $G_A$ were surprisingly accurate in view of the large value of $\|D^{(2)}\|$. In several cases, with final $\|D^{(2)}\|$ in the range 0.1 - 0.2, those eigenvalues of $G_A$ belonging to the lowest $n_A$ of $H$ were obtained accurate to eight or more figures, whereas the remaining eigenvalues of $G_A$ were much less accurate.
The poor convergence is thus apparently associated with determining that part of $S_A$ corresponding to eigenvalues not among the lowest $n_A$.

For convergent calculations, also, the plots of $\log(\|G_{BA}\|^2 + \|g_{BA}\|^2)^{1/2}$, or $\log\|D^{(2)}\|$, as a function of iteration number often exhibited "induction periods" before linear convergence was established. Figures 5.1 and 5.2 show such plots for two groups of calculations. The shape and length of these induction periods depend strongly on the initial $f$. Typically, only 5 - 10 iterations are involved; the example in Fig. 5.2 is an extreme case in which over 30 iterations are required before convergence finally occurs. As indicated in Table 5.3, the two converging calculations in Fig. 5.2 are to different eigenspaces $S_A$. Figure 5.1 illustrates clearly the independence of the rate of convergence on the starting approximation of $f$.

Table 5.4 gives rates of convergence for a series of calculations in which the first $n_A$ or $n_A + 1$ diagonal elements of $H$ are nearly equal. When the first $n_A$ diagonal elements of $H$ are well separated from the rest, convergence is rapid.

FIGURE 5.1 Algorithm SDNRS. $H_{ii}$ = 1, 3, 5, 7, 11, 9, 13, 15, ..., 39; $n_A$ [value illegible], $n = 20$, $a = 0.2$.
1. $f_0 = 0$, iterative calculation in original basis.
2. $f_0$ calculated from eigenvectors of upper 10 x 10 block of $H$.
3. $f_0 = 0$, iterative calculation carried out in basis diagonalizing upper diagonal 10 x 10 block.
4. $f_0 = f$ calculated for $a = 0$ (non-overlap case).

FIGURE 5.2 Algorithm SDNRS. $H_{ii}$ = 1, 3, 5, 11, 13, 7, 9, 15, 17, ..., 39; $n_A = 5$, $n = 20$, $a = 0.2$.
1. $f_0 = 0$, iterative calculation in original basis.
2. $f_0$ calculated from eigenvectors of upper 10 x 10 block of $H$.
3. $f_0 = 0$, iterative calculation carried out in basis diagonalizing upper diagonal 10 x 10 block.
4. $f_0 = f$ calculated for $a = 0$ (non-overlap case).
re Uj O o O vn -1 3 §5 §£ Is s$ §S u» CS Oi O d Q d d O © d ti d d o d d d d 6 o* ti 6 0' o' 0 © d d * S S? ti 2£ | s •z v!I A o* Q' o cj o* d e> d ti d ti ti © ci d ° ® ti ti o ti ti o ti d * ?S $ 7 s 5 a 3 g d •< d d*-' ©' d d o S z A CJ 5 t ^ ^ Is * cr ti 6 d d - d d o d © t i d d d o ^ d 6 d d d ti d © © «> ® •£ $P ' ^ »- r~ — *•» ' i* ;J . r t eo iyj In ti ti d d © d d ti «n or 1 ) «X «*> w> ti ti ti d""" ti o o ti © «• £ $ ^ S S c> d d d o' d d o' «0 e*~ 5 S" ^- "* Cr © ti ti ti d o * d ti~ •a e » > S $ 1 * ^ IT d ci ti d o d ti © « <"« C f r r * •* tt c i d o 1 Ci 6 Ci O - _> - o 3: ^ « . < r-L co . « * » « * > • > . > >* c.- 3 «*> *C «o <ns rr," 3 . 3 i * c,* a" 2 F & 3 r t & $ 3 $ 3 V n w > v n V A \ A i n i A V A 156. TABLE 5 » k ( cont inued) a A l l c a l c u l a t i o n s converge t o the lowest 5 e igenvalues u n l e s s otherwise n o t e d — b r a c k e t e d numbers i n d i c a t e the number o f i t e r a t i o n s before l i n e a r convergence r a t e s are e s t a b l i s h e d * **f * 0 , no b a s i s change. c f 0 c a l c u l a t e d from e i g e n v e c t o r s o f upper d i a g o n a l 10 x 10 b l o c k . d f 0 * 0, i t e r a t i v e c a l c u l a t i o n c a r r i e d out i n b a s i s d i a g o n a l i z i n g upper d i a g o n a l 1 0 x 1 0 b l o c k . e f Q = f c a l c u l a t e d f o r a « 0 ( n o n - o v e r l a p c a s e ) . f converges t o second lowest e i g e n v a l u e — i n cases w i t h l o n g i n d u c t i o n p e r i o d , t h e r e i s a s h a l l o w minimum i n II DU a f t e r between 10 and 20 i t e r a t i o n s , a f t e r which i t i n c r e a s e s t o a maximum before d e c r e a s i n g a g a i n . A p p a r e n t l y converg ing t o e i g e n v a l u e s #1, 2 . 3, 4,, 9 | a f t e r 50 i t e r a t i o n s , JlDll * 14.8. a p p a r e n t l y c o n v e r g i n g t o e i g e n v a l u e s #1 , -2 , 3, 4, and 8j a f t e r 50 i t e r a t i o n s , RDII • 5 . 9 . 
i Apparently converging to eigenvalues #1, 2, 3, 4, and 7; after 50 iterations, $\|D\| = 5.8$.

The denominators in the iteration formulas are large, so that the asymptotic error constants are small, and the iterative calculations well-conditioned. When the first diagonal element of $H_{BB}$ is close to diagonal elements of $H_{AA}$ (and $a = 0.2$), the rates of convergence of the algorithms SDNRS and QDNRS are virtually unaffected, whereas those of DGNS and FGNS deteriorate to a much greater extent. The greater the number of diagonal elements of $H_{BB}$ near those of $H_{AA}$, the slower the rate of convergence, as evidenced by the poor convergence here when $n_A = 1$ (Nesbet algorithm). The Nesbet algorithm is apparently converging here to the second lowest eigenvalue of H at the 50th iteration, but the convergence is very slow. In the calculations reported in Table 5.4, convergence is normally to the space $S_A$ corresponding to the lowest $n_A$ eigenvalues of H. The only exceptions are the most poorly converging generalized Nesbet calculations, in which the space $S_A$ corresponds to the lowest $n_A - 1$ plus the $(n_A+1)$st, or $(n_A+2)$nd eigenvalues of H. Generally, the rates of convergence shown in Table 5.4 decrease as the overlap a increases.

From Tables 5.2 - 5.4, it is seen that the two algorithms, SDNRS and QDNRS, are generally more reliable than the generalized Nesbet algorithms. However, when convergence occurs, the rates of all algorithms are similar. Again, algorithm SDNRS is easier to program efficiently than the others, and, unless the size of the problem makes the extra storage to hold h a critical factor, this is probably the algorithm of choice.
While the eigenvalue problem is more difficult with overlap, these methods are useful, especially with prediagonalization. It appears that attempts to obtain improved starting approximations for f are not of great value, and in particular, the solution of the corresponding problem for an orthonormal basis was frequently the worst starting approximation tried in these calculations.

5.4 Multiple Partitioning

Finally, the possibilities for efficient, practical iteration procedures for solving the m x m partitioning equations of chapter 4 are considered. The basic strategy is again to obtain efficient linearly convergent iterative schemes by approximating the second order convergent Newton-Raphson equations corresponding to the nonlinear system to be solved.

Three sets of equations were introduced in chapter 4 for the determination of the off-diagonal blocks of $\hat T$ in an orthonormal basis. They are:

(1)
$$D^{(1)}_{JI}(\hat T) = H_{JI} + \sum_{K\neq I} H_{JK} f_{KI} - f_{JI}\hat H^{(1)}_{I} = 0, \qquad (I,J = 1,\dots,m,\ I\neq J), \tag{5.74}$$

(2)
$$D^{(2)}_{JI}(\hat T) = H_{JI} + \sum_{K\neq I} H_{JK} f_{KI} - f_{JI}\hat H^{(2)}_{I} = 0, \qquad (I,J = 1,\dots,m,\ I\neq J), \tag{5.75}$$

and the pair of systems,

(3)
$$G_{JI}(\hat T) = H_{JI} + \sum_{K\neq I} H_{JK} f_{KI} + \sum_{K\neq J} f^{\dagger}_{KJ} H_{KI} + \sum_{K\neq J}\sum_{L\neq I} f^{\dagger}_{KJ} H_{KL} f_{LI} = 0, \tag{5.76a}$$
$$g_{JI}(\hat T) = f^{\dagger}_{IJ} + f_{JI} + \sum_{L\neq I,J} f^{\dagger}_{LJ} f_{LI} = 0, \qquad (I < J = 1,\dots,m). \tag{5.76b}$$

A fourth set of equations, intermediate between (5.74), (5.75) and eqs. (5.76), is
$$D^{(3)}_{JI}(\hat T) = (g^{-1}G)_{JI} = 0, \qquad (I,J = 1,\dots,m,\ I\neq J), \tag{5.77}$$
which arise from the condition that $\hat T^{-1}H\hat T$ be block diagonal. These four sets of equations have the same solutions; however, they lead to algorithms which are quite different. While the iteration schemes derived from eqs. (5.74) - (5.76) are very similar to those developed for a 2 x 2 partitioning, a major difference is that the complexity of the orthogonality conditions (4.8) makes it a practical impossibility to explicitly eliminate half the elements of the off-diagonal blocks of $\hat T$ before solving one of these sets of equations for the remaining elements.
In all algorithms, therefore, it is assumed that the elements of all m(m-1) off-diagonal blocks of $\hat T$ are to be determined. A description of possible algorithms for solving the four sets of equations above is given in Appendix 6. No numerical testing of these algorithms has been carried out.

The determination of the matrix elements of the off-diagonal blocks of $\hat T$ for an m x m partitioning in a non-orthonormal basis involves complications only in detail due to the presence of the overlap matrix. Equations for this case have been given in chapter 4, and they may be handled in essentially the same way as those defining equations given above.

CHAPTER 6

PERTURBATION THEORY

"O polish'd perturbation! golden care!
That keep'st the ports of slumber open wide
To many a watchful night!"
(Shakespeare, King Henry IV, Part II)

"I wouldn't lose any sleep over it"
(wise old saying)

6.1 Introduction

Only the simplest problems in quantum mechanics can be solved exactly. As a result, perturbation expansions of some sort are involved in many quantum mechanical calculations. Perturbation series for effective operators are useful in treating a set of degenerate, or nearly degenerate, levels with one or more perturbation parameters, especially when the degeneracy is not split in first order. Effective operator perturbation series are also useful in developing physical pictures, as, for example, in uncoupling the Dirac equation to obtain equations for electrons only.

In this chapter, perturbation series are developed for the effective operators $\hat H_A$, $G_A$, and $\tilde H_A$, defined for a 2 x 2 partitioning in terms of the operator f. These series can be derived straightforwardly because of the relatively simple algebraic form of the relations defining the operators.
The absence of constraints or auxiliary conditions on f makes possible efficient computational schemes for automatic sequential calculation of the terms in the perturbation series to arbitrarily high order. The perturbation formulas are not complicated by degeneracies at any order, so long as all eigenfunctions in a given degenerate set in zero order are partitioned into the same space. In fact, as will be seen below, the presence of degeneracy tends to simplify the use of these perturbation series.

Two examples are presented to illustrate the perturbation formulas derived. These are the uncoupling of the Dirac equation for a spin-1/2 particle, and the construction of a nuclear spin hamiltonian in esr theory. Perturbation of the projection $P_A$, which, in molecular orbital theory, becomes the one-particle density operator, is considered in the following chapter, and the formalism is extended there to the related, but more complicated, self-consistent field molecular orbital problem.

6.2 2 x 2 Partitioning: Orthonormal Basis

6.2.a General Discussion

A perturbation formalism based on the material presented in chapter 2, and in particular, on the eigenvalue equation (2.1), will be considered first. It is assumed that the hamiltonian H can be written as an infinite series
$$H = \sum_{n=0}^{\infty} H^{(n)}, \tag{6.1}$$
where the perturbation parameter or parameters are to be considered to be included implicitly in symbols like $H^{(n)}$, which is of order n in the perturbation. The operator f is written as an infinite series,
$$f = \sum_{n=0}^{\infty} f^{(n)}. \tag{6.2}$$
Substitution of these two series into the condition $D(f) = 0$, eq. (2.16), defining f, yields the series
$$D(f) = \sum_{n=0}^{\infty} D^{(n)}(f) = 0, \tag{6.3a}$$
where
$$D^{(n)}(f) = H_{BA}^{(n)} + \sum_{j=0}^{n}\left(H_{BB}^{(n-j)} f^{(j)} - f^{(j)} H_{AA}^{(n-j)}\right) - \sum_{j=0}^{n}\sum_{i=0}^{n-j} f^{(j)} H_{AB}^{(n-j-i)} f^{(i)} \tag{6.3b}$$
$$\phantom{D^{(n)}(f)} = H_{BA}^{(n)} + \sum_{j=0}^{n}\left(H_{BB}^{(n-j)} f^{(j)} - f^{(j)} \hat H_{A}^{(n-j)}\right). \tag{6.3c}$$
The series for $\hat H_A$ is given below.
Since (6.3a) is implicitly a power series in one or more arbitrary perturbation parameters, $D(f)$ will vanish as a whole only if each term vanishes. Thus, a hierarchy of perturbation equations is obtained,
$$D^{(n)}(f) = 0, \qquad n = 0, 1, 2, \dots \tag{6.4}$$
from which the $f^{(n)}$ can be determined. The zero order condition is formally
$$D^{(0)}(f) = H_{BA}^{(0)} + H_{BB}^{(0)} f^{(0)} - f^{(0)} H_{AA}^{(0)} - f^{(0)} H_{AB}^{(0)} f^{(0)} = 0, \tag{6.5}$$
which is just the original condition defining f for the zero order operator $H^{(0)}$. Unless $H^{(0)}$ is block diagonal, $f^{(0)}$ will not vanish, and consequently, the $D^{(n)}$ will depend on $f^{(0)}$, taking the form
$$D^{(n)}(f) = \left(H_{BB}^{(0)} - f^{(0)} H_{AB}^{(0)}\right) f^{(n)} - f^{(n)}\left(H_{AA}^{(0)} + H_{AB}^{(0)} f^{(0)}\right) + A^{(n)}\!\left(f^{(n-1)}, \dots, f^{(0)}\right) = 0, \tag{6.6}$$
where $A^{(n)}$ is a quantity depending on terms in the series for f of order n-1 or lower. General solutions for the $n_A n_B$-dimensional system of simultaneous linear equations, (6.6), cannot usually be written down, and the $f^{(n)}$ must therefore be determined by numerical methods. If $H^{(0)}$ is block diagonal, then $f^{(0)} = 0$, and eqs. (6.6) become
$$H_{BB}^{(0)} f^{(n)} - f^{(n)} H_{AA}^{(0)} = -A^{(n)}\!\left(f^{(n-1)}, \dots, f^{(1)}\right), \tag{6.7}$$
which is again a system of $n_A n_B$ simultaneous linear equations, which, in general, must also be solved numerically. However, these equations are considerably simpler than eqs. (6.6). Finally, if $H^{(0)}$ is diagonal, eqs. (6.7) reduce to
$$\left(H^{(0)}_{\sigma\sigma} - H^{(0)}_{rr}\right) f^{(n)}_{\sigma r} = -A^{(n)}_{\sigma r}, \tag{6.8a}$$
and in this case, the solution can be given explicitly,
$$f^{(n)}_{\sigma r} = \frac{A^{(n)}_{\sigma r}}{H^{(0)}_{rr} - H^{(0)}_{\sigma\sigma}}. \tag{6.8b}$$
Here, again, Greek letters refer to basis elements in the subspace $S_B$, and Roman letters to basis elements in $S_A$. In general, for $f^{(0)} = 0$, the $A^{(n)}$ are given by
$$A^{(n)} = H_{BA}^{(n)} + \sum_{j=1}^{n-1}\left(H_{BB}^{(n-j)} f^{(j)} - f^{(j)} H_{AA}^{(n-j)}\right) - \sum_{j=1}^{n-2}\sum_{i=1}^{n-j-1} f^{(j)} H_{AB}^{(n-j-i)} f^{(i)}, \tag{6.9}$$
which is obtained by deleting terms depending on $f^{(0)}$ and $f^{(n)}$ from eq. (6.3b).
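As an illustration of how eqs. (6.8b) and (6.9) can be applied in sequence, the following minimal Python sketch builds the $f^{(n)}$ for a small test matrix $H = H^{(0)} + H^{(1)}$ with diagonal $H^{(0)}$, and checks that the residual $D(f)$ shrinks as more terms of the series are summed. The test matrix, the helper names (`f_series`, `residual`, `partial_sum`) and the use of plain nested lists are assumptions made for this example only; they are not taken from the calculations reported in this work.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b, sign=1.0):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

def blocks(h, n_a):
    """Split a square matrix into its AA, AB, BA and BB blocks."""
    aa = [row[:n_a] for row in h[:n_a]]
    ab = [row[n_a:] for row in h[:n_a]]
    ba = [row[:n_a] for row in h[n_a:]]
    bb = [row[n_a:] for row in h[n_a:]]
    return aa, ab, ba, bb

def f_series(h0_diag, h1, n_a, max_order):
    """Return [f^(0), ..., f^(max_order)] for H = H0 + H1, H0 diagonal."""
    n_b = len(h0_diag) - n_a
    aa, ab, ba, bb = blocks(h1, n_a)
    f = [zeros(n_b, n_a)]                       # f^(0) = 0
    for n in range(1, max_order + 1):
        # A^(n) of eq. (6.9), specialised to a first order perturbation only
        a_n = ba if n == 1 else zeros(n_b, n_a)
        a_n = add(a_n, matmul(bb, f[n - 1]))
        a_n = add(a_n, matmul(f[n - 1], aa), -1.0)
        for j in range(1, n - 1):
            a_n = add(a_n, matmul(matmul(f[j], ab), f[n - 1 - j]), -1.0)
        # eq. (6.8b): element-wise division by the zero order energy gaps
        f.append([[a_n[s][r] / (h0_diag[r] - h0_diag[n_a + s])
                   for r in range(n_a)] for s in range(n_b)])
    return f

def partial_sum(terms, order):
    total = zeros(len(terms[0]), len(terms[0][0]))
    for term in terms[:order + 1]:
        total = add(total, term)
    return total

def residual(h, n_a, f):
    """Largest element of D(f) = H_BA + H_BB f - f H_AA - f H_AB f."""
    aa, ab, ba, bb = blocks(h, n_a)
    d = add(ba, matmul(bb, f))
    d = add(d, matmul(f, aa), -1.0)
    d = add(d, matmul(matmul(f, ab), f), -1.0)
    return max(abs(x) for row in d for x in row)

# Hypothetical test problem: two nearly degenerate A-states, three B-states.
N_A = 2
H0_DIAG = [0.0, 0.1, 2.0, 3.0, 4.0]
H1 = [[0.00, 0.05, 0.10, 0.08, 0.02],
      [0.05, 0.00, 0.04, 0.10, 0.06],
      [0.10, 0.04, 0.00, 0.07, 0.03],
      [0.08, 0.10, 0.07, 0.00, 0.05],
      [0.02, 0.06, 0.03, 0.05, 0.00]]
H = [[H1[i][j] + (H0_DIAG[i] if i == j else 0.0) for j in range(5)]
     for i in range(5)]

F_TERMS = f_series(H0_DIAG, H1, N_A, 8)
RESIDUALS = [residual(H, N_A, partial_sum(F_TERMS, n)) for n in range(9)]
for order, value in enumerate(RESIDUALS):
    print(order, value)
```

In this sketch the two A-state energies (0.0 and 0.1) are nearly degenerate, yet only energy differences between $S_A$ and $S_B$ appear in the denominators, so the recursion is unaffected by the near-degeneracy within $S_A$.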
When the series for H contains only a few terms (for example, when $H = H^{(0)} + H^{(1)}$ only) it is more useful to group terms in the $D^{(n)}$ and $A^{(n)}$ according to the order of the hamiltonian, H, rather than f, in the term. For $A^{(n)}$, this gives
$$A^{(n)} = H_{BA}^{(n)} + \sum_{k=1}^{n-1}\left[H_{BB}^{(k)} f^{(n-k)} - f^{(n-k)} H_{AA}^{(k)} - \sum_{j=1}^{n-k-1} f^{(j)} H_{AB}^{(k)} f^{(n-k-j)}\right] \tag{6.10a}$$
$$\phantom{A^{(n)}} = H_{BA}^{(n)} + \sum_{k=1}^{n-1}\left(H_{BB}^{(k)} f^{(n-k)} - f^{(n-k)} \hat H_{A}^{(k)}\right). \tag{6.10b}$$
Table 6.1 lists the first few members of the perturbation hierarchy, $D^{(n)}(f)$, for the case $f^{(0)} = 0$. Explicit formulas, in the format of eq. (6.10a), for low order $A^{(n)}$ are obtainable from Table 6.1 by deleting the $f^{(n)}$ dependent terms in the $D^{(n)}$.

Perturbation formulas, in terms of the $f^{(n)}$ and $H^{(n)}$, for all of the other quantities defined in sections 2.1 and 2.2 follow directly from their definitions. The formulas for them presented in the remainder of this section apply when $f^{(0)} = 0$. Using eq. (2.65a), the series for the effective operator $\hat H_A$ is found to be
$$\hat H_A = \sum_{n=0}^{\infty} \hat H_A^{(n)}, \tag{6.11a}$$
where
$$\hat H_A^{(0)} = H_{AA}^{(0)}, \qquad \hat H_A^{(1)} = H_{AA}^{(1)},$$
and
$$\hat H_A^{(n)} = H_{AA}^{(n)} + \sum_{j=1}^{n-1} H_{AB}^{(n-j)} f^{(j)}, \qquad (n > 1). \tag{6.11b}$$
For $G_A$, given in eq. (2.67b), one obtains
$$G_A = \sum_{n=0}^{\infty} G_A^{(n)}, \tag{6.12a}$$
where
$$G_A^{(0)} = H_{AA}^{(0)}, \qquad G_A^{(1)} = H_{AA}^{(1)},$$
and
$$G_A^{(n)} = H_{AA}^{(n)} + \sum_{j=1}^{n-1}\left(H_{AB}^{(n-j)} f^{(j)} + f^{(j)\dagger} H_{BA}^{(n-j)}\right) + \sum_{j=0}^{n-2}\sum_{i=1}^{n-j-1} f^{(n-j-i)\dagger} H_{BB}^{(j)} f^{(i)}, \qquad (n > 1). \tag{6.12b}$$
The metric $g_A$ has the very simple series
$$g_A = \sum_{n=0}^{\infty} g_A^{(n)}, \tag{6.13a}$$
where
$$g_A^{(0)} = 1_A, \qquad g_A^{(1)} = 0,$$
and
$$g_A^{(n)} = \sum_{j=1}^{n-1} f^{(j)\dagger} f^{(n-j)}, \qquad (n > 1). \tag{6.13b}$$
If the hierarchy, (6.4), is used to explicitly eliminate the terms in $H_{BB}^{(n)}$ from eqs. (6.12b), the resulting series for $G_A$ will be identical to that obtained by expansion of the relation $G_A = g_A \hat H_A$, eq. (2.70), that is
$$G_A^{(n)} = \sum_{j=0}^{n} g_A^{(j)} \hat H_A^{(n-j)}.$$
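(6.14)

That the two expressions, (6.12) and (6.14), for $G_A$ are equivalent if and only if the equations of the hierarchy (6.4) are satisfied, is in accord with the fact (section 3.1) that the two definitions, $G_A = (T^{\dagger}HT)_{AA}$ and $G_A = g_A\hat H_A$, are equivalent if and only if $D(f) = 0$. An advantage of eqs. (6.12) is that they are the same whether or not the basis is orthonormal, whereas any formulas incorporating the relations $D^{(n)}(f) = 0$ explicitly must be different in a non-orthonormal basis, since the condition $D(f)$ depends explicitly on the overlap matrix in that case.

Perturbation series for the powers, $g_A^{\pm 1/2}$, of $g_A$, can be obtained in several ways outlined in Appendix 7. Given these, the series for the effective operator $\tilde H_A$ can be written down from eqs. (2.74) as
$$\tilde H_A = \sum_{n=0}^{\infty} \tilde H_A^{(n)}, \tag{6.15a}$$
where
$$\tilde H_A^{(n)} = \sum_{i=0}^{n}\sum_{j=0}^{n-i} \left(g_A^{1/2}\right)^{(i)} \hat H_A^{(n-i-j)} \left(g_A^{-1/2}\right)^{(j)}, \tag{6.15b}$$
or
$$\tilde H_A^{(n)} = \sum_{i=0}^{n}\sum_{j=0}^{n-i} \left(g_A^{-1/2}\right)^{(i)} G_A^{(n-i-j)} \left(g_A^{-1/2}\right)^{(j)}. \tag{6.15c}$$
Explicit expressions for the lower order $\hat H_A^{(n)}$, $G_A^{(n)}$, and $g_A^{(n)}$, according to eqs. (6.11), (6.12), and (6.13), and for the $\tilde H_A^{(n)}$, are given in Tables 6.2 - 6.5. All three effective operators, $\hat H_A$, $G_A$, and $\tilde H_A$, are identical in zero and first order. In general, they differ in second and higher order.

TABLE 6.1 $D^{(n)}(f)$, for $f^{(0)} = 0$
$$\begin{aligned}
D^{(1)}(f) &= H_{BB}^{(0)}f^{(1)} - f^{(1)}H_{AA}^{(0)} + H_{BA}^{(1)} \\
D^{(2)}(f) &= H_{BB}^{(0)}f^{(2)} - f^{(2)}H_{AA}^{(0)} + H_{BB}^{(1)}f^{(1)} - f^{(1)}H_{AA}^{(1)} + H_{BA}^{(2)} \\
D^{(3)}(f) &= H_{BB}^{(0)}f^{(3)} - f^{(3)}H_{AA}^{(0)} + H_{BB}^{(1)}f^{(2)} - f^{(2)}H_{AA}^{(1)} + H_{BB}^{(2)}f^{(1)} - f^{(1)}H_{AA}^{(2)} + H_{BA}^{(3)} - f^{(1)}H_{AB}^{(1)}f^{(1)} \\
D^{(4)}(f) &= H_{BB}^{(0)}f^{(4)} - f^{(4)}H_{AA}^{(0)} + H_{BB}^{(1)}f^{(3)} - f^{(3)}H_{AA}^{(1)} + H_{BB}^{(2)}f^{(2)} - f^{(2)}H_{AA}^{(2)} + H_{BB}^{(3)}f^{(1)} - f^{(1)}H_{AA}^{(3)} + H_{BA}^{(4)} \\
&\quad - f^{(1)}H_{AB}^{(1)}f^{(2)} - f^{(2)}H_{AB}^{(1)}f^{(1)} - f^{(1)}H_{AB}^{(2)}f^{(1)} \\
D^{(5)}(f) &= H_{BB}^{(0)}f^{(5)} - f^{(5)}H_{AA}^{(0)} + H_{BB}^{(1)}f^{(4)} - f^{(4)}H_{AA}^{(1)} + H_{BB}^{(2)}f^{(3)} - f^{(3)}H_{AA}^{(2)} + H_{BB}^{(3)}f^{(2)} - f^{(2)}H_{AA}^{(3)} + H_{BB}^{(4)}f^{(1)} - f^{(1)}H_{AA}^{(4)} + H_{BA}^{(5)} \\
&\quad - f^{(1)}H_{AB}^{(1)}f^{(3)} - f^{(2)}H_{AB}^{(1)}f^{(2)} - f^{(3)}H_{AB}^{(1)}f^{(1)} - f^{(1)}H_{AB}^{(2)}f^{(2)} - f^{(2)}H_{AB}^{(2)}f^{(1)} - f^{(1)}H_{AB}^{(3)}f^{(1)}
\end{aligned}$$

TABLE 6.2 $\hat H_A^{(n)}$
$$\begin{aligned}
\hat H_A^{(0)} &= H_{AA}^{(0)} \\
\hat H_A^{(1)} &= H_{AA}^{(1)} \\
\hat H_A^{(2)} &= H_{AA}^{(2)} + H_{AB}^{(1)}f^{(1)} \\
\hat H_A^{(3)} &= H_{AA}^{(3)} + H_{AB}^{(2)}f^{(1)} + H_{AB}^{(1)}f^{(2)} \\
\hat H_A^{(4)} &= H_{AA}^{(4)} + H_{AB}^{(3)}f^{(1)} + H_{AB}^{(2)}f^{(2)} + H_{AB}^{(1)}f^{(3)} \\
\hat H_A^{(5)} &= H_{AA}^{(5)} + H_{AB}^{(4)}f^{(1)} + H_{AB}^{(3)}f^{(2)} + H_{AB}^{(2)}f^{(3)} + H_{AB}^{(1)}f^{(4)} \\
\hat H_A^{(6)} &= H_{AA}^{(6)} + H_{AB}^{(5)}f^{(1)} + H_{AB}^{(4)}f^{(2)} + H_{AB}^{(3)}f^{(3)} + H_{AB}^{(2)}f^{(4)} + H_{AB}^{(1)}f^{(5)}
\end{aligned}$$

TABLE 6.3 $G_A^{(n)}$
$$\begin{aligned}
G_A^{(0)} &= H_{AA}^{(0)} \\
G_A^{(1)} &= H_{AA}^{(1)} \\
G_A^{(2)} &= H_{AA}^{(2)} + H_{AB}^{(1)}f^{(1)} + f^{(1)\dagger}H_{BA}^{(1)} + f^{(1)\dagger}H_{BB}^{(0)}f^{(1)} \\
G_A^{(3)} &= H_{AA}^{(3)} + H_{AB}^{(2)}f^{(1)} + f^{(1)\dagger}H_{BA}^{(2)} + H_{AB}^{(1)}f^{(2)} + f^{(2)\dagger}H_{BA}^{(1)} + f^{(1)\dagger}H_{BB}^{(1)}f^{(1)} + f^{(1)\dagger}H_{BB}^{(0)}f^{(2)} + f^{(2)\dagger}H_{BB}^{(0)}f^{(1)} \\
G_A^{(4)} &= H_{AA}^{(4)} + H_{AB}^{(3)}f^{(1)} + f^{(1)\dagger}H_{BA}^{(3)} + H_{AB}^{(2)}f^{(2)} + f^{(2)\dagger}H_{BA}^{(2)} + H_{AB}^{(1)}f^{(3)} + f^{(3)\dagger}H_{BA}^{(1)} \\
&\quad + f^{(1)\dagger}H_{BB}^{(2)}f^{(1)} + f^{(1)\dagger}H_{BB}^{(1)}f^{(2)} + f^{(2)\dagger}H_{BB}^{(1)}f^{(1)} + f^{(1)\dagger}H_{BB}^{(0)}f^{(3)} + f^{(2)\dagger}H_{BB}^{(0)}f^{(2)} + f^{(3)\dagger}H_{BB}^{(0)}f^{(1)}
\end{aligned}$$

TABLE 6.4 $g_A^{(n)}$
$$\begin{aligned}
g_A^{(0)} &= 1_A \\
g_A^{(1)} &= 0 \\
g_A^{(2)} &= f^{(1)\dagger}f^{(1)} \\
g_A^{(3)} &= f^{(2)\dagger}f^{(1)} + f^{(1)\dagger}f^{(2)} \\
g_A^{(4)} &= f^{(3)\dagger}f^{(1)} + f^{(2)\dagger}f^{(2)} + f^{(1)\dagger}f^{(3)} \\
g_A^{(5)} &= f^{(4)\dagger}f^{(1)} + f^{(3)\dagger}f^{(2)} + f^{(2)\dagger}f^{(3)} + f^{(1)\dagger}f^{(4)} \\
g_A^{(6)} &= f^{(5)\dagger}f^{(1)} + f^{(4)\dagger}f^{(2)} + f^{(3)\dagger}f^{(3)} + f^{(2)\dagger}f^{(4)} + f^{(1)\dagger}f^{(5)}
\end{aligned}$$

TABLE 6.5 $\tilde H_A^{(n)}$
$$\begin{aligned}
\tilde H_A^{(0)} &= H_{AA}^{(0)} \\
\tilde H_A^{(1)} &= H_{AA}^{(1)} \\
\tilde H_A^{(2)} &= H_{AA}^{(2)} + H_{AB}^{(1)}f^{(1)} + \tfrac{1}{2}\left[f^{(1)\dagger}f^{(1)},\, H_{AA}^{(0)}\right] \\
\tilde H_A^{(3)} &= H_{AA}^{(3)} + H_{AB}^{(2)}f^{(1)} + H_{AB}^{(1)}f^{(2)} + \tfrac{1}{2}\left[f^{(1)\dagger}f^{(2)} + f^{(2)\dagger}f^{(1)},\, H_{AA}^{(0)}\right] + \tfrac{1}{2}\left[f^{(1)\dagger}f^{(1)},\, H_{AA}^{(1)}\right]
\end{aligned}$$
[fourth order entry illegible in source]

Several alternative formulas for the terms of the perturbation series of these effective operators can be given, the usefulness of a given set depending on the situation. The formulas given above are not particularly well suited in some cases for the calculation of high order terms. Procedures for deriving alternative series are given in Appendix 7, along with a tabulation of some alternative formulas.

The perturbation series (6.9) through (6.15) are rather general in that they give the various effective operators ultimately in terms of the $f^{(n)}$ and $H^{(n)}$. However, as indicated in eq.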
(6.8), if $H^{(0)}$ is diagonal, the $f^{(n)}$ can be written explicitly in terms of the matrix elements of the $H^{(n)}$. Expansions corresponding to each of eqs. (6.8) through (6.15) in terms of only the perturbed operator H will be given in the next two sections.

6.2.b A-states Degenerate

Explicit perturbation formulas are especially simple when the eigenvalues of $H^{(0)}$ corresponding to the subspace $S_A$ are all equal, say, to $\epsilon_A$. In its eigenbasis, $H_{AA}^{(0)}$ is just a multiple of the unit matrix, and the $f^{(n)}$ are defined by
$$\left(H_{BB}^{(0)} - \epsilon_A 1_B\right) f^{(n)} = -A^{(n)}, \tag{6.16a}$$
with the solution
$$f^{(n)} = L A^{(n)}, \tag{6.16b}$$
where
$$L = \left(\epsilon_A 1_B - H_{BB}^{(0)}\right)^{-1} \tag{6.17}$$
is the reduced resolvent matrix evaluated at $\epsilon_A$ and restricted to $S_B$. The useful point here is that L is a matrix, not a supermatrix. If $H_{BB}^{(0)}$ is diagonal, the $f^{(n)}$ are given simply as the products of the matrices $A^{(n)}$ with the $n_B \times n_B$ diagonal matrix L, and thus, relatively simple matrix expressions can be given in terms of H only, for the various perturbation series derived in the previous subsection. Perturbation formulas of this type are given in Tables 6.6 - 6.11. When $H_{BB}^{(0)}$ is not diagonal, the $f^{(n)}$ must be determined by solving a system of $n_A n_B$ simultaneous equations.

Substitution of eq. (6.16b) into eqs. (6.10) yields
$$L^{-1} f^{(n)} = A^{(n)} = H_{BA}^{(n)} + \sum_{k=1}^{n-1}\left(H_{BB}^{(k)} f^{(n-k)} - f^{(n-k)} \hat H_{A}^{(k)}\right), \tag{6.18}$$
which can be used to eliminate high order $f^{(n)}$ from perturbation expressions for the effective operators. Substitution of eqs. (6.18) for the off-diagonal blocks of $H^{(j)}$, (j = 1, ..., n-1), in the formulas for $\hat H_A^{(2n)}$ and $\hat H_A^{(2n+1)}$, and simplification of the resulting expressions using the formulas for $L^{-1}f^{(n)}$ given in Table 6.6, results in the so-called "reduced" formulas for the $\hat H_A^{(n)}$ given in Table 6.8. It is seen that, in the formulas for $\hat H_A^{(2n)}$ and $\hat H_A^{(2n+1)}$, all terms containing $f^{(n+1)}$ through $f^{(2n)}$ are in the form of commutators with a lower order term in the expansion of $\hat H_A$. When $n_A = 1$, these commutators vanish, and one obtains a 2n+1 rule in the sense that f through order n is sufficient to determine $\hat H_A$ correct through order 2n+1. Similar explicit results have not been obtained for the operators $G_A$ and $\tilde H_A$, but the discussion of section 3.2 implies that errors in the eigenvalues of these two operators, when calculated from f correct through order n, should be of order 2n+2. For $n_A \neq 1$, none of these effective operators can be given correct to order n+2 or higher solely in terms of $f^{(1)}$ through $f^{(n)}$.

In this case, $\hat H_A^{(2)}$ and $\tilde H_A^{(2)}$ are identical. In general, all three effective operators are different in third and higher order here.

TABLE 6.6 $L^{-1}f^{(n)}$ (A-states degenerate) [entries illegible in source]

TABLE 6.7 $f^{(n)}$ (A-states degenerate) [entries illegible in source]

TABLE 6.8 $\hat H_A^{(n)}$, reduced formulas (A-states degenerate) [entries illegible in source]

TABLE 6.9 $\hat H_A^{(n)}$ (A-states degenerate) [entries illegible in source]

TABLE 6.10 $G_A^{(n)}$ (A-states degenerate) [entries illegible in source]

TABLE 6.11 $\tilde H_A^{(n)}$ (A-states degenerate), including formulas for $H = H^{(0)} + H^{(1)}$ only [entries illegible in source]

6.2.c A-states Non-degenerate

If the eigenvalues of $H^{(0)}$ corresponding to the subspace $S_A$ are not all equal, then the factorization (6.16a) is not possible, and it is necessary to calculate the matrix elements of the $f^{(n)}$ via eq. (6.
8), which can be written formally as
$$f^{(n)} = \mathcal{L}\left(A^{(n)}\right). \tag{6.19}$$
Here, $\mathcal{L}$ is a superoperator, which, when acting on the operator $A^{(n)}$, produces the operator $f^{(n)}$. The superoperator $\mathcal{L}$ can be represented as a four index matrix, so that (6.19) becomes
$$f^{(n)}_{\sigma r} = \sum_{\rho, t} \mathcal{L}_{\sigma r,\rho t}\, A^{(n)}_{\rho t}. \tag{6.20a}$$
If $H^{(0)}$ is diagonal, with eigenvalues $\epsilon^{0}$, then
$$\mathcal{L}_{\sigma r,\rho t} = \frac{\delta_{\sigma\rho}\,\delta_{rt}}{\epsilon^{0}_{r} - \epsilon^{0}_{\sigma}}. \tag{6.20b}$$
In this case, the perturbation formulas are for single matrix elements. Tables 6.12 - 6.15 give some low order formulas of this type. One application of these formulas is in molecular orbital theory. The application to the derivation of Coulson-Longuet-Higgins type Hückel theory is outlined in the next chapter.

TABLE 6.12 $f^{(n)}_{\sigma r}$ [entries illegible in source]

TABLE 6.13 $(\hat H_A^{(n)})_{rs}$ [entries illegible in source]

TABLE 6.14 $(G_A^{(n)})_{rs}$ [entries illegible in source]

TABLE 6.15 $(\tilde H_A^{(n)})_{rs}$ [entries illegible in source]

6.3 Examples

6.3.a The Dirac Equation

A particularly simple application of the perturbation formalism just developed is to the formal uncoupling of the Dirac equation for a spin-1/2 particle in an electric and magnetic field. Not only are the A-states ($E^{(0)} = mc^2$) degenerate in zero order here, but the B-states ($E^{(0)} = -mc^2$) are degenerate as well. Historically, much effort has been expended on the problem of obtaining a two component effective operator describing the behaviour of a spin-1/2 particle in an electromagnetic field, from the four component Dirac hamiltonian. In several cases, special algebraic properties of the Dirac hamiltonian were used to construct the desired effective operators, so that it appeared that such operators were unique, in some sense, to the Dirac equation, and not necessarily analogous to effective operators constructed in other contexts. In this section and the accompanying Appendix 8, it is shown that the perturbation formulas tabulated in section 6.2.b yield the desired effective operators immediately.
The Dirac equation is special in that if only a magnetic field is present, the condition $D(f) = 0$ can be solved exactly. Other methods for the exact uncoupling of the Dirac equation in the absence of an electric field have, of course, been known for many years (for example, Foldy and Wouthuysen, 1950).

The Dirac hamiltonian, including electromagnetic interactions, will be written here as
$$H = H^{(0)} + H^{(1)}, \tag{6.21}$$
where
$$H^{(0)} = \begin{pmatrix} mc^2 & 0 \\ 0 & -mc^2 \end{pmatrix}, \tag{6.22a}$$
and
$$H^{(1)} = \begin{pmatrix} e\phi & c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi} \\ c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi} & e\phi \end{pmatrix}. \tag{6.22b}$$
Here $\phi$ is the electric potential, $(\sigma_x, \sigma_y, \sigma_z)$ are the Pauli spin matrices, $\boldsymbol{\pi} = \mathbf{p} - (e/c)\mathbf{A}$ is the mechanical momentum of the system, and m, e, c are the mass and charge of the electron, and the velocity of light. With the perturbation defined in (6.22b), the implicit perturbation parameter is 1/m. This is strictly not the usual non-relativistic approximation, in which the terms of the series are ordered by powers of v/c, and which will be dealt with later. Both ordering schemes eventually give the same terms in the perturbation series, but the order in which specific terms occur may be different.

The hamiltonian (6.22a,b) is blocked according to the partitioning of the basis space into the subspaces $S_A$ ($E^{(0)} = mc^2$), and $S_B$ ($E^{(0)} = -mc^2$). The reduced resolvent (6.17) is just a multiple of the unit matrix because both the A-states and the B-states are degenerate, that is,
$$L = \frac{1}{2mc^2}\,1. \tag{6.23}$$
Referring now to Table 6.8, it is seen that
$$\hat H_A^{(0)} = mc^2, \qquad \hat H_A^{(1)} = e\phi, \qquad \hat H_A^{(2)} = \frac{1}{2m}(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2,$$
and
$$\hat H_A^{(3)} = \frac{e}{4m^2c^2}\left[(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\,\phi\,(\boldsymbol{\sigma}\cdot\boldsymbol{\pi}) - (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2\phi\right]. \tag{6.24}$$
Standard methods* can be used to transform these expressions into the more explicit forms,
$$\hat H_A^{(2)} = \frac{\boldsymbol{\pi}\cdot\boldsymbol{\pi}}{2m} - \frac{e\hbar}{2mc}\,\boldsymbol{\sigma}\cdot\mathbf{H},$$
and
$$\hat H_A^{(3)} = -\frac{e\hbar}{4m^2c^2}\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\boldsymbol{\pi}) - \frac{e\hbar^2}{4m^2c^2}\,\nabla\cdot\mathbf{E} - \frac{ie\hbar}{4m^2c^2}\,\mathbf{E}\cdot\boldsymbol{\pi}. \tag{6.25}$$
Here $\mathbf{H} = \nabla\times\mathbf{A}$ is the magnetic field, and $\mathbf{E} = -\nabla\phi$ is the electric field. The various terms in (6.24) are readily identifiable.
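$\hat H_A^{(0)}$ is the rest energy of the system in the absence of any fields, and $\hat H_A^{(1)}$ is the electrostatic energy. $\hat H_A^{(2)}$ includes both the kinetic energy of the system, and the magnetic dipole interaction. The non-hermiticity of the operator is seen to appear first in third order. The first term in $\hat H_A^{(3)}$, eq. (6.25), is the spin-orbit interaction, and the second term is the so-called Darwin term. Equations (6.24) are identical to the results obtained using the Pauli elimination method to uncouple the Dirac hamiltonian. The correction to the kinetic energy due to the relativistic variation of mass arises out of the term $-\frac{1}{8m^3c^2}(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^4$, appearing in $\hat H_A^{(4)}$.

*Using the well known commutation properties of both the Pauli matrices and of differential operators, one obtains,
$$[\phi, \boldsymbol{\pi}] = i\hbar\nabla\phi = -i\hbar\mathbf{E},$$
$$(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\,\phi\,(\boldsymbol{\sigma}\cdot\boldsymbol{\pi}) = -\hbar\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\boldsymbol{\pi}) - \frac{e\hbar}{c}\,\phi\,\boldsymbol{\sigma}\cdot\mathbf{H} + i\hbar\,\mathbf{E}\cdot\boldsymbol{\pi} + \phi\,\boldsymbol{\pi}\cdot\boldsymbol{\pi},$$
$$[\boldsymbol{\sigma}\cdot\mathbf{E},\, \boldsymbol{\sigma}\cdot\boldsymbol{\pi}] = 2i\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\boldsymbol{\pi}) + i\hbar\,\nabla\cdot\mathbf{E},$$
$$(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2\phi = -\frac{e\hbar}{c}\,\boldsymbol{\sigma}\cdot\mathbf{H}\,\phi + \boldsymbol{\pi}\cdot\boldsymbol{\pi}\,\phi,$$
$$[\phi,\, \boldsymbol{\pi}\cdot\boldsymbol{\pi}] = \hbar^2\nabla^2\phi - 2i\hbar\,\mathbf{E}\cdot\boldsymbol{\pi}.$$

Similarly, using Table 6.11, one obtains,
$$\tilde H_A^{(0)} = mc^2, \qquad \tilde H_A^{(1)} = e\phi, \qquad \tilde H_A^{(2)} = \frac{1}{2m}(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2,$$
and
$$\tilde H_A^{(3)} = -\frac{e}{8m^2c^2}(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2\phi + \frac{e}{4m^2c^2}(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\,\phi\,(\boldsymbol{\sigma}\cdot\boldsymbol{\pi}) - \frac{e}{8m^2c^2}\,\phi\,(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2. \tag{6.26}$$
The second and third order terms can be rewritten
$$\tilde H_A^{(2)} = \frac{\boldsymbol{\pi}\cdot\boldsymbol{\pi}}{2m} - \frac{e\hbar}{2mc}\,\boldsymbol{\sigma}\cdot\mathbf{H},$$
and
$$\tilde H_A^{(3)} = -\frac{e\hbar}{4m^2c^2}\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\boldsymbol{\pi}) - \frac{e\hbar^2}{8m^2c^2}\,\nabla\cdot\mathbf{E}. \tag{6.27}$$
Except for the fourth order relativistic correction to the kinetic energy, which will appear here in $\tilde H_A^{(4)}$, this is effectively the result quoted by DeVries (1970), which was obtained from a perturbation series to fourth order in v/c calculated via the Foldy-Wouthuysen procedure. The point of eqs.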
(6.24)-(6.27) is that, except for some algebraic manipulations necessary to obtain the effective hamiltonians in a more familiar form, these expressions could be written down without any other calculation, using the tabulated formulas in section 6.2. The first and second order terms in the expansion of f are particularly simple here also, being given by
$$f^{(1)} = \frac{1}{2mc}\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi},$$
and
$$f^{(2)} = \frac{e}{4m^2c^3}\,[\phi,\, \boldsymbol{\sigma}\cdot\boldsymbol{\pi}]. \tag{6.28}$$
That is, each is made up of only one term. These two terms are sufficient to determine $\hat H_A$ and $\tilde H_A$ to fourth order. Equations (6.28) are also useful for calculating effective operators for other properties of the system.

For an expansion in powers of v/c (the non-relativistic approximation), the Dirac hamiltonian is usually rewritten as (DeVries, 1970)
$$H = H^{(0)} + H^{(1)} + H^{(2)}, \tag{6.29a}$$
where $H^{(0)}$ is as in (6.22), but now
$$H^{(1)} = \begin{pmatrix} 0 & c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi} \\ c\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi} & 0 \end{pmatrix}, \tag{6.29b}$$
and
$$H^{(2)} = \begin{pmatrix} e\phi & 0 \\ 0 & e\phi \end{pmatrix}. \tag{6.29c}$$
Actual expressions for $\hat H_A$, $G_A$, and $\tilde H_A$ to sixth order, based on eqs. (6.29), are given in Appendix 8 in a somewhat more abstract notation. Because of the particular form of the first and second order perturbations here (the first order perturbation couples two states only if they are in different subspaces $S_A$ and $S_B$, whereas the second order perturbation couples states in the same subspace only), the perturbation series for these effective operators have only even order terms nonzero, while the series for f has only odd order terms nonzero. To sixth order, the operator $\hat H_A$ is exactly equal to the result obtained to sixth order in v/c using the Pauli elimination method to obtain a non-relativistic approximation. Similarly, to sixth order, $\tilde H_A$ is identical to the results of a canonical uncoupling of the Dirac hamiltonian, such as that carried out by Eriksen (1958).
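The low order terms quoted above can be checked against a case that is trivially solvable. The sketch below is an illustration only, under the following assumptions that are not part of the treatment above: a free particle ($\phi = 0$, $\mathbf{A} = 0$), with $\boldsymbol{\sigma}\cdot\boldsymbol{\pi}$ replaced by a number p, so that the Dirac hamiltonian reduces to the 2 x 2 matrix with diagonal elements $\pm mc^2$ and off-diagonal elements $cp$; arbitrary units; and a hypothetical function name `free_particle_check`. It compares the exact positive eigenvalue with $mc^2 + p^2/2m - p^4/8m^3c^2$, and the exact ratio of eigenvector components with $f^{(1)} = p/2mc$ from eq. (6.28).

```python
import math

def free_particle_check(m, c, p):
    """Exact versus low order results for H = [[m c^2, c p], [c p, -m c^2]]."""
    rest = m * c ** 2
    exact_energy = math.sqrt(rest ** 2 + (c * p) ** 2)   # positive eigenvalue
    # Eigenvector (1, f): the second row reads c p - m c^2 f = E f
    exact_f = c * p / (exact_energy + rest)
    second_order = rest + p ** 2 / (2.0 * m)             # rest + kinetic energy
    fourth_order = second_order - p ** 4 / (8.0 * m ** 3 * c ** 2)
    first_order_f = p / (2.0 * m * c)                    # eq. (6.28)
    return exact_energy, second_order, fourth_order, exact_f, first_order_f

E, E2, E4, F, F1 = free_particle_check(m=1.0, c=10.0, p=0.5)
print(abs(E - E2), abs(E - E4), abs(F - F1))
```

For these (arbitrary) values the error in the energy drops markedly when the fourth order mass correction is included, and $f^{(1)}$ differs from the exact f only by terms of higher order in p/mc.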
DeVries (1970) demonstrates that the Pauli hamiltonian is related to the Eriksen hamiltonian to sixth order by a fourth order similarity transformation defined in the space of positive energy states only. Such a relationship is evident from the definition of $\tilde H_A$, eq. (2.74a), in terms of $\hat H_A$, namely that
$$\tilde H_A = g_A^{1/2}\,\hat H_A\, g_A^{-1/2}. \tag{6.30}$$
Thus, the required similarity transformation matrix is just $g_A^{1/2}$. Because $H_{AA}^{(0)}$ here is a multiple of the unit matrix, the terms in $g_A^{(6)}$ in eq. (6.30) exactly cancel if $\tilde H_A$ is desired to sixth order only. Since $g_A^{(5)} = 0$, as seen from the tabulations in Appendix 8, the similarity transformation $g_A^{1/2}$ need be known only to fourth order to determine $\tilde H_A$ to sixth order.

A treatment of the Dirac equation with some similarity to the above application of the partitioning formalism has been given by Morpurgo (1960). In the course of a rather complicated derivation of a unitary transformation to bring the Dirac hamiltonian to an uncoupled form, it becomes convenient for Morpurgo to define an operator G, eq. (6.31), in terms of $U_{BB}$ and the inverse of $U_{AB}$, where the quantities $U_{BB}$ and $U_{AB}$ are blocks of the unitary transformation matrix when partitioned in the same manner as H in eq. (6.22). It is not difficult to show from the defining condition given by Morpurgo that, for the Dirac hamiltonian only,
$$G = (-f^{\dagger})^{-1}. \tag{6.32}$$
It is not known if the quantity G has useful generalizations in other contexts, as does the operator f. Certainly, the relationship (6.32) can possibly hold only in cases where $S_A$ and $S_B$ have identical dimensions (so that $U_{AB}$ has an inverse).

If an electric field is not present, these effective operators can be calculated exactly, since the perturbation is nonzero only in off-diagonal blocks.
This was, in fact, the basis of Foldy and Wouthuysen's free particle calculation (1950). The equation $D(f) = 0$, defining the operator f, becomes
$$\boldsymbol{\sigma}\cdot\boldsymbol{\pi} - 2mc\,f - f\,(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\,f = 0. \tag{6.33}$$
Multiplication by $\boldsymbol{\sigma}\cdot\boldsymbol{\pi}$ from the right yields a quadratic equation for $f(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})$. The desired solution is
$$f = \frac{\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{mc + [m^2c^2 + (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2]^{1/2}}, \tag{6.34}$$
since the root with the plus sign in the denominator leads to the expansion (6.28). Given this exact expression for f, all other quantities defined in chapter 2 can be written, exactly, in terms of mc and $\boldsymbol{\sigma}\cdot\boldsymbol{\pi}$. The operator $\hat H_A$ is particularly simple,
$$\hat H_A = H_{AA} + H_{AB}f = mc^2 + \frac{c\,(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2}{mc + [m^2c^2 + (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2]^{1/2}}. \tag{6.35}$$
The operators $G_A$ and $\tilde H_A$ are obtained in the same way, but the expressions are much more complicated. It is also possible to write down an exact expression for the projection onto the space $S_A$, spanned by the eigenvectors of the perturbed hamiltonian which have zero order energy $E^{(0)} = mc^2$. Since
$$g_A = 1_A + f^{\dagger}f = \frac{\left[mc + [(mc)^2 + (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2]^{1/2}\right]^2 + (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2}{\left[mc + [(mc)^2 + (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2]^{1/2}\right]^2}$$
by eq. (2.9a), one obtains, from eq. (2.10),
$$P_A = \left\{\left[mc + R\right]^2 + (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2\right\}^{-1}\begin{pmatrix} [mc + R]^2 & [mc + R]\,(\boldsymbol{\sigma}\cdot\boldsymbol{\pi}) \\ (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\,[mc + R] & (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2 \end{pmatrix}, \qquad R = [(mc)^2 + (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2]^{1/2}. \tag{6.36}$$

6.3.b Derivation of a Spin Hamiltonian: Strong Field Case

Consider the hamiltonian operator
$$H = \mathbf{h}\cdot\mathbf{S} + \mathbf{t}\cdot\mathbf{S} + \mathbf{S}\cdot\mathsf{D}\cdot\mathbf{S}, \tag{6.37}$$
where $\mathbf{h}$ is the effective magnetic field, $\beta\,\mathsf{g}\cdot\mathbf{H}$, and where
$$\mathbf{t} = \sum_j \mathbf{I}^{(j)}\cdot\mathsf{A}^{(j)}, \tag{6.38}$$
the $\mathsf{A}^{(j)}$ being hyperfine tensors. This hamiltonian describes the interaction of a system of nuclear spins with an electronic spin. In the strong field case, the term $\mathbf{h}\cdot\mathbf{S}$ (the electronic Zeeman interaction) is large enough that the energy separation between levels of different electronic spin is greater than the energy separations between nuclear spin levels.
Therefore, a perturbation expansion with respect to H^0^ = h»S, i s appropriate i n examining the c h a r a c t e r i s t i c s of the nuclear spin system. In t h i s subsection, a nuclear spin hamiltonian i s constructed from (6.37)» i n which the electron spin quantum numbers are present only as parameters. In the strong f i e l d case, the el e c t r o n i c spin i s quantized i n the f i e l d d i r e c t i o n , taken to be the z-axis i n the notation adopted here. Thus, the zero order hamiltonian i s H ( 0 ) = hS z, (6.39) where h « |h,|. I t i s convenient to expand the perturbation H ( 1^ = i n the form, H ( D . a S • S + • i * + S _ Z Z (6.40) • D + 2 S ? + D * l ( S - S z + S z S - ) + D o ( S H s 2 ) + D - l ( S * S z + S z S + ) + D - 2 192. Here S ± = S t S are the usual s h i f t operators f o r the elec t r o n i c x y spin, and $+ = 3 1 3„ are of the same form i n the components — x y of 1 • The c o e f f i c i e n t s D„ are q Do = 2 D z z » D i l " *<Dxz * L V* I 6 A L ) and D±2 S *C*<Dxx - D y y) t i D x y ] . The zero order l e v e l s having S 2 = m define the space for the e f f e c t i v e nuclear spin hamiltonian ^ " m. Because the perturbation has matrix elements with Am = 0, i l , ±2 only, there are only a small number of nonzero matrix elements i n f^\. which can be written down d i r e c t l y using Table 6.6 and eq. (6.40). These are and # 2 . . " ^ D . 2 [ S 2 - B ( m . l ) ] * [ S 2 - ( l » + l ) ( m + 2 ) ] * 4^2.. ' ^> + 2 Ca 2 - ^ » - l ) ] *C5 2 - ( » - l)(^2)]* • (6.1*2) The A-states are degenerate, and so i s i d e n t i c a l to to second order. Therefore, Tables 6.8, 6.9» or 6.11 y i e l d (to second order), 193. " "AA and # J 2 > * X ^ Z ^ t ^ ^ m ^ l l - ^ D ^ D ^ L ^ - S m - l ] 2 2 ( 6 , 4 3 ) This derivation appears to assume a s p e c i a l , and inconvenient, coordinate system. 
However, a l l reference to s p e c i a l coordinate directions (x and y) perpendicular to the f i e l d d i r e c t i o n disappears on developing the terms i n ( 6 .43 ) . I f h, s p e c i f i e s a unit vector i n the d i r e c t i o n of h, then the e f f e c t i v e hamil- tonian to second order can be written - Cm + n ' £ A U ) . I ( j ) • E (6.44) 0 i» J Here C i s m 0, - hm • D 0 ( „ ^ 2 ) * fffi^ - kg*] [2 5 2-2m 2-l] -xCS*2z-£ - (E*H ) 2]t"§ z- 8 m- 13« ( 6-'» 5 a ) the l a t t e r two terms giving the o v e r a l l second order s h i f t of the endor l e v e l s . The other e f f e c t i v e parameters are 2(j)=i4(j)-h+|(3ra2-S2)D.(l4h)«4(j) + ^ (S 2-m 2)A ( ; 5 ), (6.45b) and A (J }.(1 - S h ) . ^ ^ . (6.45c) 194. Also, D and A are the cofactor matrices, D * D" 1det(D), A » A - 1 d e t ( A ) . (6.46) £6 SSL S. 9 C £5. 2& The t h i r d order jty-^ has been obtained i n a s i m i l a r way. m 195. 6". 4 Non-orthonormal Basis — 2 x 2 P a r t i t i o n i n g Many quantum mechanical calculations are c a r r i e d out using a non-orthonormal basis. In such s i t u a t i o n s , i t may be inconven- ient or undesirable,, (or even impossible for c e r t a i n kinds of perturbation) to transform to an orthonormal basis i n order to carry out a perturbation c a l c u l a t i o n . This section outlines perturbation expansions based on the formalism i n section 2.3,, applicable i n a non-orthonormal basis. It i s assumed that the given perturbed hamiltonian and overlap matrices have the expansions H = ? H ( n ) f l S = t S ( n ) , (6.47) n=0 n=0 where H^n^ and S^n^ are i m p l i c i t l y of order n i n the perturba- t i o n parameter or parameters. As observed i n section 2 .3 * there are two a l t e r n a t i v e types of conditions defining the off-diagonal blocks of the p a r t i t i o n i n g operator T, i n this case, the f i r s t being eqs. (2.113) and (2.114). 
I f perturbation expansions are desired only for e f f e c t i v e operators i n the A-space, then only eq. (2.113) need be considered, D(f) = H B A - H B B f - ( S B A + S ^ f J S ^ S ; (6.48a) • ^ + W - « S B A + S B B F ^ S A A + W ) " 1 < H A A + H A B F ^ ° - (6.48b) Substitution of the series, (6.4?), for H and S, and expansion of f i n a s i m i l a r series then leads, as before, to a hierarchy 196. of equations, D ( n ) ( f ) = 0 , n « 0 , 1 , 2 ( 6 . 4 9 ) I t w i l l be assumed i n the remainder of t h i s section that and are at le a s t block diagonal, so that f ^ = 0 . Formulas for the D ^ ( f ) i n terms of H, S, and f are then obtained i n stages. Writing ^ V ^ ^ ^ " 1 ? (sU '^siS-^f"))]. (6 .51) n=2 j=l one obtains A AA k = 1 AA AA ( 6 . 5 2 ) *—1 from which low order terms i n the perturbation ser i e s f o r SA can be obtained. Then for the operator HA= S A HA, one has K " » i n ) » ( 6 . 5 3 a ) A n=0 A where s ( n ) . j S - K J ) S : ^ J ) - ? ST^J^H^-^^T 'H^^^^) - ( 6 . 5 3 b ) j= 0 A A A k=l A a Given ( 6 . 5 2 ) and ( 6 . 5 3 ) » eq. (6.48) can be expanded s t r a i g h t - forwardly, and the hierarchy (6.49) can f i n a l l y be written 19? 4B ) f ( n )- SBB ) f < n ) sM )" l HAA ) * " A ( n ) * n S B l ' 2 ( 6 ' 5 4 ) where A(»).,4n) +^ 1^l) f(J).^ 1( S<n-J) +^ J s(»-l-W f(W )5(J) (6.55) The equations (6.54) f o r the f ^ n ^ are more complicated than those for an orthonormal basis because of the blocks Sgg^and SAA^ - 1 appearing on the l e f t hand side. This l i n e a r system can be solved for f ^ n ^ using numerical methods, but a general solution i n terms of S and H cannot be given. The expansions (6.51) - (6.55) become much simpler when H and S are diagonal i n zero order. Then S^°^=l n (so that SBB^ = XB» SAA^" 1 = 1k^* a n d t h e e a . u a t i o n s defining the f ( n ^ become i d e n t i c a l i n f o r m to eqs. (6 .8 ) , A(n) • H ( 0 ) ° r H(0) • <«•*> r r oo where the A are now given by (6 .55) . 
Low order terms i n the expansions (6.48) and (6.53) are given i n Tables 6.16 and 6.1? for the case S^°^=l • In terms n of f and H, the e f f e c t i v e operator G^ i s independent of S, and therefore the expansion (6.12) s t i l l holds (Table 6 .3 ) . Low order terms f o r the metric g A have been l i s t e d i n Table A9.1 of Appendix 9. The formal d e f i n i t i o n of H A, eq. (2 .74) , i n ±4 terms of g A 2 i s also unaffected. However, the series f o r g A 198. TABLE 6.16 p("'(f) — Non-orthonormal Basis n ( l ) . „(0),(1) , ( 1 ) „ ( 0 ) + h ( 1 ) q(l ) „ ( 0 ) D " BB f " f AA BA BA AA D " BB f " f AA t l ) (2).„(l) f(l) , s ( l } . f | l ) U l | ( l ) , .(1) H (0), f . ( 2 ) „ ( 1 ) , ( 1 ) , „ ( 0 ) BA BB f * ( S B A + f AA AA AA ' - l S B A + SBB f ' AA + H B ^ H B B V 1 > + H B B V 2 > - ( s ( A ) + f ( ^ ) [ „ { 2 ) . „ ( B ) f ( i > - s < A ' H U ) - ( s A l ) + s ( B ) f ^ ) ) „ ^ ) ^ ) 2 H A A 0 > ] - C s ^ V ^ s ^ 2 ' , ^ D<*> - H B°)f(«-f^»H|°» + H B i t ) t " B B ) f ( l J + H B B , f ( 2 ) + H B B ) f < 3 > - C S B 3 ) + s < | ) f ( l ) + s ( B ) f ( 2 ) + f ( 3 ) ) ( H ( l ) . s ( l ) H ( 0 ) ) . ( s ^ ) + s B 3) f(i ) + s B | ) f ( 2 ) + s B|)f ( 3 ) ) H A 0 ) 199. TABLE 6.17 H A n ^ ( f ) ~ Non-orthonormal Basis „ ( D o ( D W ( 0 ) AA AA AA . ( s ( l ) « » ) , C i ) ) „ U , ^ ( i ) 2 H ( i ) s ( D 3 H ( o ) AA AA TABLE 6.18 H A n^ — Non-orthonormal Basis • H i ! ' ' AA AB f + * L f ' BA + f '' AA J- 200. TABLE 6.19 GJ'}' — Non-orthonormal Basis J O l - I D . J D t 10).„(1) BB f * h AA aA „ ( 0),O ) . h ( 3)t H ( 0 ) BB f + h AA + H ( 3 > + H ^ f ( l ) + H ( t l ) f ( 2 ' + h < l ) t ( H A A » + H { l » f < 1 ) ) + h ' 2 » t H A A ) < ) t H ( 3 ) f ( l ) + „ ( | ) f < 2 ) + „ ( l ) f O ) t h < 3 , t H A i ) TABLE 6»20 g^ n^ —- Non-orthonormal Basis f ( 3 ) + h ( 3 ) t «»>«g>f< 1>«»>f< 2W l>t(s A 2>*<|>f« 1>)*h< 2»ts Ai) + S B A ) + S B B ) f < 1 , + S B H f ( 2 ) t S B B ) f ( 3 > + h ( 1 ) t ( s < 3 ) . s ( ? 
' f ( 1 ) + s ( i ) f ( 2 ) ) + h ' 2 ) U s ( 2 ^ s < i > f ( 1 ) ) AA AB AB AA AB + h <3>t s U> 201 TABLE 6.21 f ^ — Non-orthonormal Basis (A-states degenerate) , ( D ,(2) ^ B A ^ A 3 ^ H H -«°S L H ( A ) + L H ^ I ( H < i » - € A S ^ ) ) ^ ) * L < ) - € ^ < i ) ) ] ( K < i > . € j s A A ) ) + € A 1 L S B A + SBB L ( H B A A BA ' J .(3) + ^ ( s ( A ) + s U ) L n ( A ) , ] S A i ) - 6 o [ s O ) + s ( | ) ^ u ) t S ( i ) l [ H u ) . H u ) j u ) . ( s a ) + J a ) ) 5 ( 1 ) AA now depends on S, and has a f i r s t order term, so that, while eqs. (6.15) hold here also, the formulas i n Tables 6.5, A7.7» and A7.8 are no longer v a l i d . E x p l i c i t expressions f o r low order terms i n the series f o r H*A i n terms of H, S, and f are given i n Table 6.18. As i n an orthonormal basis, the perturbation ser i e s f o r the e f f e c t i v e operators H^, G^, and H^, are much more compact when the A-states are degenerate i n zero order. Equations 2 0 2 , TABLE 6 . 2 2 k[n^ — Non-orthonormal Basis A • 1 1 (A-states degenerate) H{ 0 ) - H<j>> - € j l A - ( 1 ) _ H ( l ) . 0 . ( 1 ) _ HA ~ AA " A A A " AA HA " AA AB L HBA - i B , ^ « B i ) < ) ^ ) - ( s ^ , ^ B A > ) a ^ ) « 0 ( s B i ) + s < B ) s B i ' ) ] - « ) ^ I , ^ B i , ^ i , * B A , < « B i , - ^ B i , ^ B i ) ) 5 1 i , "Afc AA ' :A WBA T*BB ""Bk •S ( 1 ) 2 H ( 1 ) e o s ( l ) 3 o r s ( l ) s ( 2 ) + s ( l ) L 5 ( D l 'AA AA A AA + 6 A l S A A »* AA AB L r tBA •* • ( 6 . 1 6 ) and ( 6 . 1 7 ) apply here, with the modified k { n ) of eq. ( 6 . 5 5 ) » A and can be used to obtain e x p l i c i t matrix formulas f o r f, H A, G A , and H A, s o l e l y i n terms of H and S. The lower order terms for these expansions are given i n Tables 6.21 - 6.24. I f the A-states are not degenerate i n zero order, use of eqs. ( 6 . 5 6 ) y i e l d s formulas f o r the i n d i v i d u a l matrix elements of the operators f, H A, G A, and H A. Low order formulas of t h i s type are given i n Tables 6 . 2 5 - 6.28. 
The second set of conditions defining f and h ar i s e from the requirements 203. TABLE 6.23 o[ n^ — Non-orthonormal Basis (A-states degenerate) HX2)+ H(i)*(i) +2(iWi)+S(i)T„(o)TS(i) AA AB L KBA AE ̂ BA AB L H B B L HBA H , ( 3 ) + H ( 2 ) L 2 , ( 1 ) + S ( 1 ) L H ( 2 ) AA AB- L HBA AB L HBA X' L HBH ) ] C^JU ) AB LHBB LLHBA BB LHBA MBBA +LHBA ' AA + < : o T K ( 2 ) . q ( l ) * ( l ) n +€AL(SBA +SBB: LHBA }-> * ( 1 ) L H ( 1 ) « ( 1 ) AB L HBB: L HBA TABLE 6.24 HJ"̂ — Non>orthonormal Basis (A-states degenerate) »AA ) - * A W ( D - o s ( i i - _ a(D AA " A AA " AA H (2). H l(l) Tg(l) *o,„(2) + s(l) T8(lK i f s ( D H ( l ) 7 AA AB; L HBA " €A ( SBA + SAB L HBA AA " AA > + + A AA 201*. and «BA S ( r S $ ) B A " SBA + W + h ^ S A A + S A B f ) = °< ( 6 ^ 7 b ) When and are diagonal, expansion of these equations yields 4 ° ) f ( n ) + h ( n ) t H A ° ) - -B< n ). (6.58a) and f ( n ) + h.(n)t = - f iXn)^ (6.58b) where e x p l i c i t use has been made of the con d i t i o n S ^ = l . Here, one has, B < rri = H « n) n- j ) , ( j } ) t„(n- j)+ 1 BA flit J- ., AA J " 1 j 1 (6.59a) ^ z " 1 toC«tH(«-l-*)rC3>t i=l j«l and J ' 1 ' (6.59b.) i - i j - i m Equations (6.58) can be solved simultaneously f o r the matrix elements of f ^ and h/ n^, which are given by t M = 1 o r S - 2 2£ (6.60a) or €° e° r " o and n ( n ) % ->1 'or 'or ( 6 . 6 o b ) or c ° c° r o 205 where the €^ are the eigenvalues of I f the overlap matrix i s not perturbed,, the quantities B ^ a l l vanish, and eq. (6.60a) reduces to eq. (6.9), and (6.60b) implies eq. (2.4). I f eqs. (6.57) are to be used as the basis of a perturbation formalism here, the serie s f o r h and f must both be considered simultaneously. This complication i s o f f s e t by the simpler fo r of the expansions (6.59)• Several low order terms i n the series f o r G'ĝ and g f i A are given i n Tables 6.19 and 6.20. 
E x p l i c i t expressions for the corresponding quantities J i j n ^ and; Bg11^ can be obtained from these tables by deleting the terms i n h ( n ) and f ( n ) . I f the A-states are degenerate, eqs.. (6.58) can be w r i t t e n as Lr(0)-.(n) c o . ( n ) t _ R ( n ) BB "* € A h ~ ~ & 1 » and (6.61) f ( n ) + hi(.n).t = - B | n ) t which can be solved as a system of two matrix equations in. two unknown matrices. The s o l u t i o n i s f ( n ) = L [ B ( n ) _ € o B | n ) ] f and < 6' 6 2> h < n ) t = (€°L - l ) B < n ) - LB< n ). where L i s given ineq.. (6.17)• I f these equations are used to obtain expressions for the f ^ s o l e l y i n terms of H and S„ the expressions obtained are na t u r a l l y i d e n t i c a l to those froim (6.16) and (6.17) with (6.55). However, eqs. (6.62) provide a more 206 e f f i c i e n t computational scheme for the calculation! of high order terms i n the series f o r f • A c o l l e c t i o n of al t e r n a t i v e formulas for the terms i n the seri e s f o r H^ along with formulas f o r the metric g A and related quantities have been given i n Appendix 9» This type of perturbation theory i s useful, f o r example, i n extended Huckel molecular o r b i t a l theory* 207. TABLE 6.25 — Non-orthonormal Basis 4V - <£0i) or IT or €° - €° r o H o r } + £ <»ej } / J J ( 1) € r * V ; €o - € ° S ( 1 ) ) + 6 ° S < 2 ) r or r / s d ) • o t ' € t S o t ) ( I I(1) . € o s ( 1 ) ) 1 - J ( S o t + ToTo } ( H t r " € r S t r ' eo o TABLE 6.26 HJn^ — Non-orthonormal Basis w ( 0 ) H r s H d ) e o s ( l ) H r s " € s S r s HS2) + 2 rs p r P o „ d h ( H g s € s S P s > s rp ' € s " €<° ^ AA AA ; r s - <4V s rs + €r* SAA >rs 208, TABLE 6.27 G^n^ Non-orthonormal Basis H ( 0 ) H r s H. (1) rs H. 
(2) + L n ( l ) ( H P S - g s S P s > + £ 1*TP " € r S r P J H ( l ) rs € s €P , H(1) f o q ( l ) w w ( l ) e ° < 5 ( 1 ) ) + L € ° (Hry> ~ € r S r P M H P S 6 S S P S > TABLE 6.28 H*[n^ — Non-orthonormal Basis H H (0) rs (1) rs - i ( € ° + € ° ) S ( 1 ) s' rs H'il* + 2 H rs p rp / H ( l ) e o q ( l ) x ( l ) C H P s g s S P s } € s " €P +*<€°+e°> 1 r s p rp r rp S ( D + _£s a Pa -O -o ps c O - O s ( 2 ) + E S ( 1 ) ( H ^ VVs > rs P rp fcs ^ 209. CHAPTER 7 EIGENVALUE INDEPENDENT PARTITIONING AND MOLECULAR ORBITAL THEORY "•We have applied, the same process,' Mein Herr continued, not notic i n g Bruno*s question, *to many other purposes. We have gone on se l e c t i n g walking-sticks-- always keeping those that walk b e s t — t i l l we have obtained some, that can walk by themselves! ,,,.,M (Sylvie and Bruno Concluded. Lewis C a r r o l l r 210 7.1 Intro due t i ore The eigenvalue independent p a r t i t i o n i n g formalism developed i n chapter 2 i s p a r t i c u l a r l y suited to situations i n which only the whole space spanned by a subset of eigenvectors of some operator has s i g n i f i c a n c e , rather than the i n d i v i d u a l eigenvectors themselves,. The mapping f i s s u f f i c i e n t to determine the p r o j - ection. P A„ eq, (2,10),, onto the subspace of i n t e r e s t , so that,, im p r i n c i p l e , a l l relevant properties can be determined once f has been calculated. One of the more important areas of quantum) chemistry i n which these aspects of the p a r t i t i o n i n g formalism can be exploited i s im molecular o r b i t a l theory. In molecular o r b i t a l theory,, a closed s h e l l system containing 2n A electrons i s represented by a S l a t e r determinant made up fronc reA doubly occupied o r b i t a l s . 
Since this determinantal wavefunction changes by at most a complex scalar factor under an arbitrary linear transformation of these occupied orbitals, the individual orbitals have no direct significance. In an n-dimensional basis space, the $n_A$ occupied molecular orbitals are specified by $n_A n = n_A(n_A + n_B)$ complex numbers, the LCAO (linear combination of atomic orbitals) coefficients. Since these $n_A$ molecular orbitals are arbitrary up to an $n_A \times n_A$ linear transformation, so that $n_A^2$ of these complex numbers must be redundant, there are only $n_A n_B$ independent complex variables in the problem, which is exactly equal to the number of Brillouin conditions that must be satisfied. This is also the number of (complex) matrix elements in the mapping f defined in eq. (2.2), arising out of a partitioning of the eigenvector space of the hamiltonian into an $n_A$-dimensional subspace spanned by the occupied molecular orbitals, and an $n_B$-dimensional subspace spanned by the unoccupied orbitals.¹ Thus, not only is the mapping f sufficient to determine the projection onto the space of the occupied orbitals, but it also represents the minimum amount of information required to specify that projection. The matrix elements of f contain no redundancies, and are subject to no constraints. These two properties of f are of considerable practical importance. This chapter is primarily concerned with the derivation of perturbation formulas for the projection onto the space of the occupied molecular orbitals.
This projection is also frequently referred to as the one-particle density matrix in molecular orbital theory, and is equal to the charge-bond order matrix except for a factor of two. Both the simple matrix (Hückel theory) and the self-consistent field cases are considered. The latter is more general than the matrix uncoupling considered hitherto, in that the operator to be block diagonalized by f itself depends on f. This chapter is restricted to consideration of closed shell systems only.

¹ The detailed nature of the partitioning of the basis space is not of central importance here, as long as f exists. Nevertheless, particular partitionings may be of special interest in certain cases because the elements of f then have a particular physical significance. One example is a basis made up of localized bond, lone pair, and antibond orbitals. When this space is partitioned into an $n_A$-dimensional subspace, $S_A$, spanned by the bond and lone pair orbitals, and an $n_B$-dimensional subspace, $S_B$, spanned by the antibonding orbitals, then the elements of f measure the delocalization of the bond and lone pair orbitals through the mixing in of antibond orbitals. In the same way, in a self-consistent field calculation carried out in a Hückel basis, which diagonalizes the hamiltonian in the absence of explicit electron-electron interaction, and partitioned into occupied and unoccupied orbitals, the elements of f represent the magnitude of the mixing of these initially occupied and unoccupied orbitals because of the electron repulsion terms.
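The counting argument of section 7.1 can be checked numerically. The following is a minimal sketch and not part of the original text: it assumes NumPy, a random hermitian test matrix, and that f is the $n_B \times n_A$ matrix taking the A-part of each occupied eigenvector to its B-part. The names `projector_from_f`, `x_occ`, and `f_mixed` are hypothetical, introduced only for illustration.

```python
import numpy as np

def projector_from_f(f):
    """Projection built from the n_B x n_A mapping f, with g_A = 1_A + f^dagger f."""
    n_a = f.shape[1]
    g_inv = np.linalg.inv(np.eye(n_a) + f.conj().T @ f)
    return np.block([[g_inv,     g_inv @ f.conj().T],
                     [f @ g_inv, f @ g_inv @ f.conj().T]])

rng = np.random.default_rng(0)
n_a, n_b = 3, 4
n = n_a + n_b

# Random hermitian "hamiltonian" (an assumption made for the test only).
m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
h = (m + m.conj().T) / 2

# Take the n_a lowest eigenvectors as the "occupied orbitals".
_, x = np.linalg.eigh(h)
x_occ = x[:, :n_a]

# f maps the A-part of each occupied vector onto its B-part.
f = x_occ[n_a:, :] @ np.linalg.inv(x_occ[:n_a, :])

p = projector_from_f(f)
p_ref = x_occ @ x_occ.conj().T          # projection formed directly from the orbitals

# An arbitrary invertible mixing of the occupied orbitals leaves f unchanged.
t = rng.normal(size=(n_a, n_a)) + 1j * rng.normal(size=(n_a, n_a))
y = x_occ @ t
f_mixed = y[n_a:, :] @ np.linalg.inv(y[:n_a, :])
```

In this sketch `f` holds exactly $n_A n_B$ complex numbers, and the projection rebuilt from it agrees with the one formed from the orbitals themselves, whichever linear combination of occupied orbitals is used.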
7.2 Perturbations of the Density Matrix - Orthonormal Basis

7.2.a General Theory

Consider a partitioning of the basis space into two subspaces, $S_A$ and $S_B$, spanned by the orbitals occupied and unoccupied, respectively, in zero order. The projection $P'_A$ onto a subspace $S'_A$, spanned by the occupied perturbed orbitals, can be written (eq. (2.10)) as

$$P'_A = \begin{pmatrix} g_A^{-1} & g_A^{-1} f^\dagger \\ f g_A^{-1} & f g_A^{-1} f^\dagger \end{pmatrix}, \tag{7.1}$$

where $g_A = 1_A + f^\dagger f$. The perturbation series for f therefore determines series for each of the blocks of $P'_A$, given by

$$(P'_A)^{(n)}_{AA} = g_A^{-1(n)}, \qquad (P'_A)^{(n)}_{BA} = \sum_{j=1}^{n} f^{(j)} g_A^{-1(n-j)}, \tag{7.2}$$

and

$$(P'_A)^{(n)}_{BB} = \sum_{i=1}^{n-1} \sum_{j=1}^{n-i} f^{(i)} g_A^{-1(n-i-j)} f^{(j)\dagger},$$

when the zero order hamiltonian is at least block diagonal (that is, when $f^{(0)} = 0$). In the simple matrix case, the terms in the perturbation series for f are determined from the hierarchy of conditions $D^{(n)}(f) = 0$, eqs. (6.4). In a self-consistent field formalism, the equations defining the $f^{(n)}$ may be more complicated. They are considered in some detail in section 7.4. From eqs. (7.2), it is seen that the first two terms in the series for $P'_A$ are given by

$$P'^{(0)}_A = \begin{pmatrix} 1_A & 0 \\ 0 & 0 \end{pmatrix} = P_A, \tag{7.3}$$

and

$$P'^{(1)}_A = \begin{pmatrix} 0 & f^{(1)\dagger} \\ f^{(1)} & 0 \end{pmatrix}, \tag{7.4}$$

where $P_A$ is the projection onto the basis space $S_A$. The blocks of the second and higher order terms of $P'_A$ are all non-vanishing in general. The special form of $P'^{(1)}_A$, in which the only non-vanishing matrix elements are those between the zero order occupied and unoccupied spaces, is a consequence of the absence of a first order contribution to the metric $g_A$. Using the formulas given in section 6.2, the perturbation series for $P'_A$ can be expressed solely in terms of the perturbed hamiltonian H. Tables 7.1 - 7.3 give these formulas for the elements of $(P'_A)^{(n)}_{AA}$,
$(P'_A)^{(n)}_{BA}$, and $(P'_A)^{(n)}_{BB}$, for n = 0, 1, 2, and 3. The case in which the A-states are all degenerate is not of great importance in molecular orbital theory, and no formulas for $(P'_A)^{(n)}$ applicable to that case are included here. The formulas in Tables 7.1 - 7.3 give the matrix elements of the perturbed density matrix in the basis of the zero order orbitals. These, in turn, may be known in terms of some more primitive basis functions, for example, as a linear combination of atomic orbitals. The coefficients with respect to such a basis (e.g. the LCAO coefficients) will be denoted here as the columns of a unitary matrix C. The perturbed density matrix in the original basis will be denoted by R. The terms in the perturbation series for R are given by

$$R^{(n)} = C\, P'^{(n)}_A\, C^\dagger,$$

or

$$[R^{(n)}]_{ij} = \sum_{r}^{\rm occ} \sum_{s}^{\rm occ} C_{ir} [(P'_A)^{(n)}_{AA}]_{rs} C^*_{js} + \sum_{\sigma}^{\rm unocc} \sum_{\rho}^{\rm unocc} C_{i\sigma'} [(P'_A)^{(n)}_{BB}]_{\sigma\rho} C^*_{j\rho'} + \sum_{r}^{\rm occ} \sum_{\sigma}^{\rm unocc} \left\{ C_{ir} [(P'_A)^{(n)}_{AB}]_{r\sigma} C^*_{j\sigma'} + C_{i\sigma'} [(P'_A)^{(n)}_{BA}]_{\sigma r} C^*_{jr} \right\}, \tag{7.5}$$

where the primes on the summation indices indicate that they are referring to basis elements in $S_B$ (that is, numerically, $\sigma' = \sigma + n_A$ in $C_{i\sigma'}$). In the simple matrix case (Hückel theory), the energy of the system described by the determinantal wavefunction made up of orbitals which span the image space of $P'_A$ is given by

$$E = \nu\, \mathrm{tr}\, P'_A H, \tag{7.6a}$$

$\nu$ being an occupation number for the orbitals. Using eqs. (7.2), a perturbation series for E in terms of f and H, and ultimately in terms of H only, can be derived. The general formula is

$$E^{(n)} = \nu \sum_{j=0}^{n} \mathrm{tr} \left[ (P'_A)^{(j)}_{AA} H^{(n-j)}_{AA} + (P'_A)^{(j)}_{AB} H^{(n-j)}_{BA} + (P'_A)^{(j)}_{BA} H^{(n-j)}_{AB} + (P'_A)^{(j)}_{BB} H^{(n-j)}_{BB} \right]. \tag{7.6b}$$

Formulas for $E^{(n)}$ in terms of $f^{(n)}$ and $H^{(n)}$ are given in Table 7.4 through fifth order. By using the conditions $D^{(n)}(f) = 0$, eqs. (6.4), it is possible to obtain a 2n+1 rule here, in that $E^{(2n)}$ and
The general formula 1 Is E ( n ) = v Z t r [ " ( p ' ) ( ; 3 ) H ( r r ^ +-(p,)U').H('n-j') + /p 1) (0 )H(n«- j) - t o A ^ A A B B A A B A rtAB J — u "A'BB "BB Formulas f o r E ^ im terms of f ̂ and H ^ are given i n Table) 7 . 4 through f i f t h order. By using the conditions D ^ ( f ) = 0 , eqs. ( 6 . 4 ) , i t i s possible to obtain, a 2n+l rule here, i n that Ê 2N) and Ê 2*1*1* can be written i n terms of f ( 1 ^ through f ^ only, as i s done i n the formulas i n Table 7.4. Table 7*5 gives (n) formulas for the EV im terms of the perturbed hamiltonian: only,, through t h i r d order. The formalism presented above corresponds to some extent to that developed by McWeeny (1962),, for the perturbation; of the density matrix i n the context of self-consistent f i e l d theory. Since self-consistency terms are not indicated e x p l i c i t l y im much of that derivation 1,, the r e s u l t i n g formulas: correspond c l o s e l y to those derived above. The procedure used by McWeeny to derive a perturbation'series f o r P^ was to expand 2 the equations 1 PA [H„ p'] = 0„ (7.7a) and i f " PA ' PA f " PA • i n perturbation series, and then successively isclye the hierarchy of simultaneous equations e f f e c t i v e l y f o r the blocks of P̂ ,, as i t i s partitioned i n eq. (7.1). The series obtained by McWeeny for P^ i s i d e n t i c a l to that obtained here — only the derivation: i s d i f f e r e n t . Here P A has been wri t t e n im terms of a matrix f" ire such a way that eq. (7.7b) i s automatically ( rt) s a t i s f i e d . The hierarchy of equations, Dv (f) = 0,, defining the series for f„ i s equivalent to the hierarchy r e s u l t i n g from eq. (7.?a)„ as showm im section. 2»-3. In his derivation, p McWeeny refers to the one-particle density matrix,, denoted as #> P A here-, by the symbol p im his 1962 paper. 216. McWeeny e f f e c t i v e l y takes the elements of (P^BA. a s t h e independent variables to be determined in. 
the calculation,, ' (n) and calculates the elements of the other blocks of (P A) from themi. TABLE 7.1 ( p A ) ^ Molecular: O r b i t a l Basis ( p ' ) ( 0 ) = 1 ^ A'AA LA A AA u nw K ( D W ( 1 ) [tp;> 2 ) ] - - - ^ - s * -A'AA -"rs a = 1 ( €o,_ € o ) ( € o . ,o } %T nv, w ( l ) w ( l ) / n t , H ( 1 ) K ( 1 ) \ C ( P ; ) ( 3 ) ] . . Z B „ H(i) Z B _ ( / ) H ( i > -H (1) or n. + H ^ ¥ 2 ) + H ( 2 ) H ( 1 ) or OS- or os WAV t - l (€°- ] H (1) OS (€°-€°)(€°-€°) s o r o 217/, TABLE 7*2 ( P ^ ) ^ — Molecular O r b i t a l Basis ( p ' ) ( 0 ) = 0 * A BA U L* rA'BA J a r H (1) or v r a' *- ̂ A' BA J o r nSn. H ( l ) H ( l ) or Z a t t r P=l (€°- €°) t=l <€°- €°) L v r A ; B A J o r H ( 3 ) + n B H ( 2 ) H ( l ) n A : H ( l ) H ( 2 ) *op "pr lot " t r or P=l (€°- 6g) t=l (€°- €°) nfe K ( 1 ) %=1 <€j-€2> V ̂ Y=l (€°- € ° ) f l ( € ° - $ ) E [H ( 2 )+ E g p p t Z H o s H s r } H t r + E E g t t<° P=l t=l(€°.€°)(€°-€°) - E E H ( 1 ) H ( 1 ) H ( D as sP Pr 3=1 «o=l (€°-€°)(€^€°)(€°-€ p°) 218. TABLE 7.3 ( P ^ ) ! ^ — M o l e c u l a r O r b i t a l B a s i s 1 A BB u n, ff(l)H(l) A B B ^ r = l (€°- €?>(€£- €°) raA r- nw H ( 1 ) H ( 1 ) rrp"^3)n . r A L d W 2 ) + K < 2 ) H ( 1 ) + H ( 1 ) v py V ^ V B B J A / > - ^ I H o r V H « r V- Y = i €°) •~— r Y 219. TABLE 7*4 E ( n ^ — Molecular O r b i t a l Basis E(o) = V t r H<0) AA E ( l ) = V t r E (2) = V t r E (3) = V t r 'AB * -> - ( D t ^ d J u d ) . ^ 1 ) ^ 1 ) ^ 1 ) AA BB * t ^ W u M - f ( 1 ) f ( 1 ) t f ( 1 ) H J l > + K A ^ ) f ( 2 ) t f < 2 ) - f ( 2 ) f ( 2 ) t H < ° J ] + f ( 2 ) t H ( 3 ) + f ( 2 ) H ( 3 ) + ( f ( 2 ) f ( l ) t + f d ) f ( 2 ) t ) H ( 2 ) . ( f ( 2 ) t f ( l ) + f ( l ) t f ( 2 ) ) H ( 2 ) _ f ( 2 ) t r ( 2 ) H ( l ) + f ( 2 ) f ( 2 ) ^ d ) AA AA AA _ f ( l ) t f ( 2 ) f ( l ) t H ( l ) _ f ( l ) f ( 2 ) t f ( l ) H ( l ) BA AB + ( f ( l ) f ( l ) t ) 2 H d ) . 
( f d ) t f d ) ) 2 H d ) + f ( l ) t f ( l ) f ( 2 ) t K ( 0 ) f ( l ) + f u ) t f d ) f d ) t H ( o ) f ( 2 > _ f d ) f d ) t f ( 2 ) H ( o ) f d ) BB AA . f ( D f ( l ) t f ( l ) H ( 0 ) f ( 2 ) t - | 220 TABLE 7.5 E v " ' -- Molecular O r b i t a l Basis ,(0) _ = v E € r=l r .(1) . = v E H .(2) . • (3) . (1) = v r - 1 r r r n4 E I H E H r % H ( l ) t r ( l ) 2 r g or n A nB H ^ H ^ H ^ 2 ro os s r 8-1 c l (€^€°)(€°-€°) nv, H ( 1 ) H ( 2 ) 2 Hr-g H o r nB • v E o=l r B; „ os sp pa P-l 8-1 < € ° - € ° ) ( € ° - ^ n A H ( D H ( D 3 = 1 ( < - «°> 221. 7.2.D Huckel Molecularr O r b i t a l Theory As an i l l u s t r a t i o n of the straightforward way i n which the tabulated perturbation formulas for P A can be used,, expressions for the bond-bond,, bond-atom, atom-bond, and atom- atom; p o l a r i z a b i l i t i e s , . as defined by Coulson and Longuett- Higgins (19^7) In Huckel molecular o r b i t a l theory,, w i l l be derived. These quantities are proportional to the f i r s t order response of diagonal and off-diagonal elements of the charge- bond order matrix,, P = 2R, where R denotes the density matrix i n the atomic o r b i t a l basis, (or the second order response of the energy, ( 7 . 6 ) ) , , to a perturbation of diagonal or o f f - diagonal elements of the hamiltonian, H^0^,. i n the atomic o r b i t a l basis. Thus, the re s u l t s below are determined by combining the formulas i n Tables 7..1 - 7«3 with eq. ( 7 » 5 ) . F i r s t , consider the single center perturbation given by representing a change i n H^0^ by an amount 6a^ . . On trans- forming to the zero order molecular o r b i t a l basis,, t h i s becomes (HjWO^ij; = 6 a t C t i C t ( j „ ( i , j = 1, n), (7.8b) assuming the are a l l real.. S u b s t i t u t i o n of (7.8b) into the f i r s t order formulas f o r P A In Tables 7.1 - 7»3t and back- transformation to the atomic o r b i t a l basis v i a (7*5) then gives the r e s u l t s t = - L - ( R ( D ) „ (7 .9a) 3 H t t 6 a t * 222.. 
where

$$\frac{1}{\delta\alpha_t} R^{(1)}_{ij} = \sum_{r}^{\rm occ} \sum_{\sigma}^{\rm unocc} \frac{C_{tr} C_{t\sigma'}}{\epsilon^0_r - \epsilon^0_{\sigma'}} \left[ C_{i\sigma'} C_{jr} + C_{ir} C_{j\sigma'} \right], \tag{7.9b}$$

and, in particular, for the diagonal elements,

$$\frac{1}{\delta\alpha_t} R^{(1)}_{ii} = 2 \sum_{r}^{\rm occ} \sum_{\sigma}^{\rm unocc} \frac{C_{t\sigma'} C_{tr} C_{i\sigma'} C_{ir}}{\epsilon^0_r - \epsilon^0_{\sigma'}}. \tag{7.9c}$$

These quantities are respectively the atom-bond and atom-atom polarizabilities ($\pi_{ij,t}$ and $\pi_{i,t}$) to within a factor of two, as defined by Coulson and Longuet-Higgins (1947). One may obtain second derivatives of the elements of R with respect to one or more diagonal elements of H in an analogous manner. A summation over all $\delta\alpha_t$ is incorporated into eq. (7.8), allowing for a simultaneous perturbation of all diagonal elements of $H^{(0)}_{AO}$. The second derivatives of elements of R with respect to two diagonal elements, $H_{pp}$ and $H_{qq}$, of $H^{(0)}_{AO}$ are obtained by isolating the coefficient of $\delta\alpha_p \delta\alpha_q$ in $R^{(2)}$,

$$\frac{\partial^2 R_{ij}}{\partial H_{pp}\, \partial H_{qq}} = -\sum_{r,s}^{\rm occ} \sum_{\sigma}^{\rm unocc} \frac{C_{p\sigma'} C_{pr} C_{qs} C_{q\sigma'} C_{ir} C_{js}}{(\epsilon^0_r - \epsilon^0_{\sigma'})(\epsilon^0_s - \epsilon^0_{\sigma'})} + \sum_{\sigma,\rho}^{\rm unocc} \sum_{r}^{\rm occ} \frac{C_{p\sigma'} C_{pr} C_{qr} C_{q\rho'} C_{i\sigma'} C_{j\rho'}}{(\epsilon^0_r - \epsilon^0_{\sigma'})(\epsilon^0_r - \epsilon^0_{\rho'})} + \sum_{r}^{\rm occ} \sum_{\sigma}^{\rm unocc} \frac{C_{i\sigma'} C_{jr} + C_{ir} C_{j\sigma'}}{\epsilon^0_r - \epsilon^0_{\sigma'}} \left[ \sum_{\rho}^{\rm unocc} \frac{C_{p\sigma'} C_{p\rho'} C_{q\rho'} C_{qr}}{\epsilon^0_r - \epsilon^0_{\rho'}} - \sum_{s}^{\rm occ} \frac{C_{p\sigma'} C_{ps} C_{qs} C_{qr}}{\epsilon^0_s - \epsilon^0_{\sigma'}} \right] + (p \leftrightarrow q). \tag{7.10}$$

While this is considerably more complicated than the first order formula, it is nevertheless obtained quite straightforwardly from the formulas tabulated for $P'_A$. This procedure can be continued to arbitrary order, but the explicit formulas rapidly become more complex and less useful. The computation of high order terms can be done more efficiently by successively calculating the $f^{(n)}$ and $g_A^{-1(n)}$, evaluating the $P'^{(n)}_A$ in terms of these quantities numerically using eqs. (7.2), and then transforming to the atomic orbital basis. The attractive feature of the derivation of (7.9) and (7.10) above is that the various summations in the formulas for the $R^{(n)}$
appear automatically as being either over occupied orbitals or over unoccupied orbitals. This is not so when conventional perturbation formulas, based on the perturbation series for the occupied orbitals, are used, for which the derivation and simplification of formulas for $R^{(n)}$ for n > 1 becomes very laborious. For a two-center perturbation given by $(H^{(1)}_{AO})_{pq} = (H^{(1)}_{AO})_{qp} = \delta\beta_{pq}$, the matrix of the perturbation in the molecular orbital basis is

$$(H^{(1)}_{MO})_{ij} = \delta\beta_{pq} \left( C_{pi} C_{qj} + C_{qi} C_{pj} \right), \qquad (i, j = 1, \ldots, n). \tag{7.11}$$

The formulas in Tables 7.1 - 7.3 and eq. (7.5) yield immediately the bond-bond and bond-atom polarizabilities, $\pi_{ij,pq}$ and $\pi_{i,pq}$ (to within a factor of 2), respectively, where

$$\frac{\partial R_{ij}}{\partial H_{pq}} = \frac{1}{\delta\beta_{pq}} R^{(1)}_{ij}, \tag{7.12a}$$

and

$$\frac{1}{\delta\beta_{pq}} R^{(1)}_{ij} = \sum_{r}^{\rm occ} \sum_{\sigma}^{\rm unocc} \frac{\left[ C_{p\sigma'} C_{qr} + C_{q\sigma'} C_{pr} \right] \left[ C_{i\sigma'} C_{jr} + C_{ir} C_{j\sigma'} \right]}{\epsilon^0_r - \epsilon^0_{\sigma'}}, \tag{7.12b}$$

$$\frac{1}{\delta\beta_{pq}} R^{(1)}_{ii} = 2 \sum_{r}^{\rm occ} \sum_{\sigma}^{\rm unocc} \frac{C_{i\sigma'} C_{ir}}{\epsilon^0_r - \epsilon^0_{\sigma'}} \left[ C_{p\sigma'} C_{qr} + C_{q\sigma'} C_{pr} \right]. \tag{7.12c}$$

Higher order formulas are obtained straightforwardly here also, but even in second order they are lengthy, and none will be given here.

7.2.c Numerical Example - Hückel Theory

To obtain some information on the nature and usefulness of high order perturbation series for the charge-bond order matrix, P = 2R, a number of numerical calculations were carried out, based on three Hückel-like hamiltonians. The first two of them were

$$H^{(0)}_{A_6} = \begin{pmatrix} 0 & -1 & 0 & 0 & 0 & -1 \\ -1 & 0 & -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 & 0 & -1 \\ -1 & 0 & 0 & 0 & -1 & 0 \end{pmatrix}, \tag{7.13}$$

which could represent a six-membered ring of identical atoms in Hückel theory (e.g. benzene), and

$$H^{(0)}_{A_4B_4} = \begin{pmatrix} -1 & -1 & 0 & 0 & 0 & 0 & 0 & -1 \\ -1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & -1 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & -1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & -1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & -1 & -1 & -1 \\ -1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \end{pmatrix}, \tag{7.14}$$

representing an eight-membered ring around which two kinds of atoms alternate (e.g. $P_4N_4$).
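The zero and first order quantities for the two ring hamiltonians (7.13) and (7.14) can be reproduced with a few lines of code. The sketch below is an illustration only and is not taken from the thesis: it assumes NumPy, builds the first order term of the projection from $f^{(1)}_{\sigma r} = H^{(1)}_{\sigma r}/(\epsilon^0_r - \epsilon^0_\sigma)$ (the first order solution of $D^{(1)}(f) = 0$ for a diagonal zero order hamiltonian), and back-transforms as in eq. (7.5). The function names `ring`, `charge_bond_order`, and `first_order_response` are hypothetical.

```python
import numpy as np

def ring(diagonal, beta=-1.0):
    """Hueckel matrix of a ring with the given diagonal and nearest-neighbour beta."""
    n = len(diagonal)
    h = np.diag(np.asarray(diagonal, dtype=float))
    for i in range(n):
        j = (i + 1) % n
        h[i, j] = h[j, i] = beta
    return h

def charge_bond_order(h, n_occ):
    """P = 2R, with R the projection onto the n_occ lowest orbitals."""
    _, c = np.linalg.eigh(h)
    c_occ = c[:, :n_occ]
    return 2.0 * c_occ @ c_occ.T

def first_order_response(h, n_occ, dh):
    """First order change of P = 2R per unit strength of the AO perturbation dh."""
    e, c = np.linalg.eigh(h)
    v = c.T @ dh @ c                                   # perturbation in the MO basis
    # f1[sigma, r] = v[sigma, r] / (e_r - e_sigma)
    f1 = v[n_occ:, :n_occ] / (e[None, :n_occ] - e[n_occ:, None])
    p1 = np.zeros_like(h)
    p1[n_occ:, :n_occ] = f1                            # BA block
    p1[:n_occ, n_occ:] = f1.T                          # AB block
    return 2.0 * c @ p1 @ c.T                          # back to the AO basis

h_a6 = ring([0.0] * 6)            # eq. (7.13)
h_a4b4 = ring([-1.0, 0.0] * 4)    # eq. (7.14)

def unit_h11(n):
    """Single-centre perturbation of unit strength on atom 1."""
    dh = np.zeros((n, n))
    dh[0, 0] = 1.0
    return dh

p0_a6 = charge_bond_order(h_a6, 3)
p1_a6 = first_order_response(h_a6, 3, unit_h11(6))
p0_a4b4 = charge_bond_order(h_a4b4, 4)
p1_a4b4 = first_order_response(h_a4b4, 4, unit_h11(8))
```

Under these assumptions the (1,1) and (1,2) elements of `p0_*` and `p1_*` correspond to the zero and first order coefficients listed for $P_{11}$ and $P_{12}$ in Tables 7.6 and 7.7. Because only the projection is back-transformed, the result does not depend on how the degenerate orbitals are chosen within each level.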
The third example, denoted as $H^{(0)}_{A_5B_3}$, is obtained by setting the (1,1) element of $H^{(0)}_{A_4B_4}$ to zero. Series for $P_{11}$ and $P_{12}$ for a single-center perturbation ($H_{11}$ varying) only will be described. They can be written as

$$P_{1j}(H_{11}) = \sum_{n} P_{1j}^{(n)}\,\left( H_{11} - H_{11}^{(0)} \right)^n, \qquad j = 1, 2, \qquad \text{(7.15)}$$

and coefficients in each case for n = 0, 1, 2, 3, and 4 are given in Tables 7.6 - 7.8. Because of the symmetry of $H^{(0)}_{A_6}$, the series for $P_{11}$ contains only odd order terms, while that for $P_{12}$ contains only even order terms. For the $A_4B_4$ and $A_5B_3$ systems, none of the coefficients in the series for $P_{11}$ and $P_{12}$ through fourth order are zero. The coefficients in the series obtained decrease in magnitude quite rapidly, with the fourth order coefficient being smaller than the zero order term by a factor of up to several hundred. The coefficients given in Tables 7.6 - 7.8 are both positive and negative, but no pattern in sign is recognizable to fourth order. Plots of exact values of $P_{11}$ and $P_{12}$ as functions of $H_{11}$, along with plots of the first through fourth order series approximating these quantities, are given in Figures 7.1 - 7.6.

The two matrices, $H^{(0)}_{A_4B_4}$ and $H^{(0)}_{A_5B_3}$, can be considered as alternative zero order terms when the (1,1) element only is to be perturbed. Thus, the exact quantities $P_{11}$ and $P_{12}$, considered as functions of $H_{11}$, are identical in both cases. The alternative series given in Tables 7.7 and 7.8 are then seen to be power series expansions of the functions $P_{11}(H_{11})$ and $P_{12}(H_{11})$ around two different values of $H_{11}$.

Two potential pitfalls in the use of high order perturbation series, which warrant some emphasis, are illustrated by the results here. These are rather obvious dangers which apply quite generally to the use of any truncated power series expansion, which is what such finite perturbation approximations actually are.
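The general behaviour of truncated power series referred to here can be sketched with a toy function. The example below is not taken from the thesis: it assumes the function $1/(1+x)$ and its Taylor series about $x = 0$ purely for illustration, and compares the error of the partial sums through fourth order for a small and a large value of the expansion variable.

```python
def partial_sum(x, order):
    """Taylor partial sum of 1/(1+x) about x = 0, through the given order."""
    return sum((-x) ** k for k in range(order + 1))


def truncation_error(x, order):
    """Absolute error of the truncated series relative to the exact value."""
    return abs(partial_sum(x, order) - 1.0 / (1.0 + x))


small, large = 0.5, 1.5  # illustrative "perturbation" sizes (assumed values)
errors_small = [truncation_error(small, n) for n in range(5)]
errors_large = [truncation_error(large, n) for n in range(5)]

for n in range(5):
    print(n, errors_small[n], errors_large[n])
```

For the small argument each additional order reduces the error, while for the large argument each additional order makes the approximation worse, which is the qualitative situation described in the text.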
Firstly, as the size of the perturbation increases, the error in a higher (but finite) order partial sum eventually becomes larger than the errors in the lower order truncations of the series, although, by the time this occurs, none of the partial sums of lower order may be sufficiently accurate to be useful. Thus, while the inclusion of the next higher order term in a given series will generally increase the accuracy of the approximation when the perturbation is small, it may substantially decrease the accuracy if the perturbation is large.

Secondly, the range of acceptable accuracy of an approximation to a given order depends significantly on the zero order approximation used. As seen from Figures 7.4 and 7.6, the first order approximation for $P_{12}(H_{11})$ has a considerably wider range of usefulness around $H_{11}^{(0)} = -1$ than around $H_{11}^{(0)} = 0$. If an approximation for $P_{12}$ as a function of $H_{11}$ is desired for $-1 < H_{11} < 0$, it is clear that the zero order hamiltonian $H^{(0)}_{A_4B_4}$ is superior to $H^{(0)}_{A_5B_3}$.

TABLE 7.6 $(P_{AO})^{(n)}$ for $A_6$ System ($H_{11}^{(0)} = 0$)

  n    $P_{11}^{(n)}$    $P_{12}^{(n)}$
  0    ...               0.666667 (2/3)
  1    -0.398148         0.0
  2    0.0               -0.053626
  3    0.031875          0.0
  4    0.0               0.006489

TABLE 7.7 $(P_{AO})^{(n)}$ for an $A_4B_4$ System ($H_{11}^{(0)} = -1$)

  n    $P_{11}^{(n)}$    $P_{12}^{(n)}$
  0    1.477301          0.575869
  1    -0.273157         0.104447
  2    -0.068263         -0.009937
  3    -0.001732         -0.011540
  4    0.005201          -0.002410

TABLE 7.8 $(P_{AO})^{(n)}$ for an $A_5B_3$ System ($H_{11}^{(0)} = 0$) (digits shown as ? are illegible in the source)

  n    $P_{11}^{(n)}$    $P_{12}^{(n)}$
  0    1.140825          0.65?296
  1    -0.3?7179         0.045056
  2    -0.0301?7         -0.0?8380
  3    0.027841          -0.008863
  4    0.005568          0.00?3...

7.3 Perturbation of the Density Matrix -- Non-orthonormal Basis

7.3.a General Theory

In this section, perturbation formulas for the density matrix are developed for the case in which the primitive (atomic orbital) basis is initially non-orthonormal, and in which the overlap between these orbitals may itself be perturbed. Such a situation would arise, for example, for a perturbation involving a bond length change in a Hückel-type molecular orbital formalism. The major complications here are the explicit presence of the perturbed overlap matrix, and the fact that the transformation between the zero order molecular orbitals and the atomic orbital basis is now non-unitary. The projection, $P_A'$, onto the space of occupied orbitals, is still given by eq. (7.2), and therefore, the formal expansions, (7.3), still hold. However, now the formulas for the $f^{(n)}$ and $g_A^{-1(n)}$ must be obtained from section 6.4.

The initial perturbation series are calculated in a basis of zero order molecular orbitals, with coefficients relative to the atomic orbital basis denoted here as the columns of the (generally non-unitary) matrix C. That is, in the calculation of the $f^{(n)}$ and $g_A^{-1(n)}$, the perturbations are

$$H_{MO}^{(n)} = C^\dagger H_{AO}^{(n)} C, \qquad S_{MO}^{(n)} = C^\dagger S_{AO}^{(n)} C, \qquad (n = 0, 1, 2, \ldots), \qquad \text{(7.16)}$$

where $H_{MO}^{(0)}$ and $S_{MO}^{(0)}$ are to be at least block diagonal (so that $f^{(0)} = 0$). When $C^\dagger S_{AO}^{(0)} C = S_{MO}^{(0)} = 1$, the transformation of the density matrix, $P_A'$, from the molecular orbital basis to the original atomic orbital basis can be written as

$$R = C P_A' C^\dagger. \qquad \text{(7.17)}$$

When $H_{MO}^{(0)}$ is diagonal, explicit formulas for the elements of $P_A'$ (and R), in terms of those of H, S, and C only, can be written down.
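The role of the non-unitary matrix C in eqs. (7.16) and (7.17) can be sketched for the smallest possible case. The example below assumes two atomic orbitals with a hypothetical overlap s = 0.25 and one occupied bonding orbital; the numbers and helper names are illustrative and not taken from the thesis. It checks that $C^\dagger S_{AO} C = 1$ and that the resulting R satisfies the projection conditions appropriate to a non-orthonormal basis, $R\,S_{AO}\,R = R$ and $\mathrm{tr}\,(R\,S_{AO}) = n_A$.

```python
import math


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


s = 0.25                                   # assumed overlap of the two atomic orbitals
S_AO = [[1.0, s], [s, 1.0]]
bonding = 1.0 / math.sqrt(2.0 + 2.0 * s)   # normalization including overlap
antibonding = 1.0 / math.sqrt(2.0 - 2.0 * s)
C = [[bonding, antibonding],               # columns: bonding, antibonding orbital
     [bonding, -antibonding]]
P_A = [[1.0, 0.0], [0.0, 0.0]]             # zero order projection in the MO basis

S_MO = matmul(transpose(C), matmul(S_AO, C))   # should be the unit matrix
R = matmul(C, matmul(P_A, transpose(C)))       # eq. (7.17) for real C
RSR = matmul(R, matmul(S_AO, R))
trace_RS = sum(matmul(R, S_AO)[i][i] for i in range(2))
print(R, trace_RS)
```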
It will be assumed that $H_{MO}^{(0)}$ is diagonal and that $S_{MO}^{(0)} = 1$ in what follows. The zero order term of $P_A'$ is still given by eq. (7.3). However, the first order term is now

$$P_A'^{(1)} = \begin{pmatrix} -(S_{MO}^{(1)})_{AA} & f^{(1)\dagger} \\ f^{(1)} & 0 \end{pmatrix}. \qquad \text{(7.18)}$$

The matrix elements between zero order occupied orbitals appear as a result of the perturbation of the overlap matrix. Explicit formulas for the matrix elements of the blocks of $P_A'^{(n)}$ in terms of $H_{MO}$ and $S_{MO}$ only are given in Tables 7.9, 7.10, and 7.11 for n = 0, 1, and 2.

A perturbation series for the Hückel energy E (eq. (7.6)) can again be obtained using eq. (7.7). Expressions for the $E^{(n)}$ in terms of f, H, and S only are given in Table 7.12 for n = 0, 1, 2, and 3. No difficulty is encountered in eliminating $f^{(2)}$ from the expression obtained via (7.7) for $E^{(3)}$. However, no attempt was made to verify that $E^{(4)}$ and $E^{(5)}$ can be written down solely in terms of $f^{(1)}$ and $f^{(2)}$ by using the conditions defining the $f^{(n)}$, as was done for the case of an orthonormal basis. Formulas for the $E^{(n)}$ in terms of the elements of H and S only are given for n = 0, 1, and 2 in Table 7.13.

TABLE 7.9 $(P_A')_{AA}$ -- Non-orthonormal Basis

$$(P_A')^{(0)}_{AA} = 1_A$$
$$(P_A')^{(1)}_{AA} = -S^{(1)}_{AA}$$
$$(P_A')^{(2)}_{AA,rs} = \cdots$$

TABLE 7.10 $(P_A')_{BA}$ -- Non-orthonormal Basis

$$(P_A')^{(0)}_{BA} = 0$$
$$(P_A')^{(1)}_{BA,\sigma r} = \frac{H^{(1)}_{\sigma r} - S^{(1)}_{\sigma r}\,\epsilon_r^0}{\epsilon_r^0 - \epsilon_\sigma^0}$$
$$(P_A')^{(2)}_{BA,\sigma r} = \cdots$$
TABLE 7.11 $(P_A')_{BB}$ -- Non-orthonormal Basis

$$(P_A')^{(0)}_{BB} = 0$$
$$(P_A')^{(1)}_{BB} = 0$$
$$(P_A')^{(2)}_{BB,\sigma\rho} = \sum_{r=1}^{n_A} \frac{\left( H^{(1)}_{\sigma r} - S^{(1)}_{\sigma r}\,\epsilon_r^0 \right)\left( H^{(1)}_{r\rho} - S^{(1)}_{r\rho}\,\epsilon_r^0 \right)}{(\epsilon_r^0 - \epsilon_\sigma^0)(\epsilon_r^0 - \epsilon_\rho^0)}$$

TABLE 7.12 $E^{(n)}$ -- Non-orthonormal Basis

$$E^{(0)} = \nu\,\mathrm{tr}\,H^{(0)}_{AA}$$
$$E^{(1)} = \cdots$$
$$E^{(2)} = \cdots$$
$$E^{(3)} = \cdots$$

TABLE 7.13 $E^{(n)}$ -- Non-orthonormal Basis

$$E^{(0)} = \nu \sum_{r=1}^{n_A} \epsilon_r^0$$
$$E^{(1)} = \nu \sum_{r=1}^{n_A} \left( H^{(1)}_{rr} - \epsilon_r^0\,S^{(1)}_{rr} \right)$$
$$E^{(2)} = \cdots$$

7.3.b Extended Hückel Molecular Orbital Theory

The formulas just developed for use with a non-orthonormal basis will now be used to derive explicit expressions for the first order response of the elements of the density matrix R (in the atomic orbital basis), equal to the charge-bond order matrix divided by 2, under a perturbation of both the hamiltonian and overlap matrices. These formulas are analogous to those of section 7.2.b for ordinary Hückel theory, and would be applicable, for example, in an extended Hückel formalism. The increased complexity of the formulas due to the presence of the overlap matrix probably accounts in part for the lack of a detailed treatment of this problem in the literature, although a number of low order formulas have appeared in connection with particular applications (Fujimoto et al., 1974; Coope, 1956; Libit and Hoffmann, 1974).

For single center perturbations, we have

$$(H_{AO}^{(1)})_{pq} = \sum_{t=1}^{n} \delta\alpha_t\,\delta_{pt}\,\delta_{qt} \quad \text{and} \quad (S_{AO}^{(1)})_{pq} = \sum_{t=1}^{n} \delta S_{tt}\,\delta_{pt}\,\delta_{qt}, \qquad \text{(7.19a)}$$

so that

$$(H_{MO}^{(1)})_{ij} = \sum_{t=1}^{n} \delta\alpha_t\,C^*_{ti} C_{tj} \quad \text{and} \quad (S_{MO}^{(1)})_{ij} = \sum_{t=1}^{n} \delta S_{tt}\,C^*_{ti} C_{tj}. \qquad \text{(7.19b)}$$

The derivatives $\partial R_{ij}/\partial (H_{AO})_{pp}$ and $\partial R_{ij}/\partial (S_{AO})_{pp}$ are given by
the coefficients of $\delta\alpha_p$ and $\delta S_{pp}$, respectively, in the expression obtained for $R_{ij}^{(1)}$ by inserting the first order formulas in Tables 7.9 - 7.11, with eq. (7.19b), in eq. (7.17). In detail,

$$R_{ij}^{(1)} = -\sum_{t=1}^{n} \delta S_{tt} \sum_{r}^{\mathrm{occ}} \sum_{s}^{\mathrm{occ}} C_{ir} C^*_{tr} C_{ts} C^*_{js} + \sum_{t=1}^{n} \sum_{r}^{\mathrm{occ}} \sum_{\sigma}^{\mathrm{unocc}} \frac{\left[ \delta\alpha_t - \epsilon_r^0\,\delta S_{tt} \right]}{\epsilon_r^0 - \epsilon_\sigma^0}\,C_{tr} C_{t\sigma}\,\left[ C_{i\sigma} C^*_{jr} + C_{ir} C^*_{j\sigma} \right], \qquad \text{(7.20)}$$

where the $\epsilon_r^0$ are the eigenvalues of $H_{MO}^{(0)}$, and it has been assumed that $H_{MO}^{(0)}$ is diagonal and $S_{MO}^{(0)} = 1$.

In the same way, for two-center perturbations, $(H_{AO}^{(1)})_{pq} = (H_{AO}^{(1)})_{qp} = \delta\beta_{pq}$ and $(S_{AO}^{(1)})_{pq} = (S_{AO}^{(1)})_{qp} = \delta S_{pq}$, for all p, q ($p \neq q$), which implies that

$$(H_{MO}^{(1)})_{ij} = \sum_{p<q=1}^{n} \delta\beta_{pq}\,\left[ C^*_{pi} C_{qj} + C^*_{qi} C_{pj} \right] \quad \text{and} \quad (S_{MO}^{(1)})_{ij} = \sum_{p<q=1}^{n} \delta S_{pq}\,\left[ C^*_{pi} C_{qj} + C^*_{qi} C_{pj} \right], \qquad \text{(7.21)}$$

one obtains

$$R_{ij}^{(1)} = -\sum_{p<q=1}^{n} \delta S_{pq} \sum_{r}^{\mathrm{occ}} \sum_{s}^{\mathrm{occ}} C_{ir}\,\left[ C^*_{pr} C_{qs} + C^*_{qr} C_{ps} \right] C^*_{js} + \sum_{p<q=1}^{n} \sum_{r}^{\mathrm{occ}} \sum_{\sigma}^{\mathrm{unocc}} \frac{\left( \delta\beta_{pq} - \epsilon_r^0\,\delta S_{pq} \right)}{\epsilon_r^0 - \epsilon_\sigma^0}\,\left[ C_{pr} C_{q\sigma} + C_{p\sigma} C_{qr} \right]\left[ C_{i\sigma} C^*_{jr} + C_{ir} C^*_{j\sigma} \right], \qquad \text{(7.22)}$$

from which the first derivatives of the elements of R with respect to off-diagonal elements of $H_{AO}$ and $S_{AO}$ may be obtained.

Formulas for higher order terms in the series for R can be obtained here in the same way. However, they are long and tedious, and not very informative by themselves. Nevertheless, using formulas developed in section 6.4, it is possible to compute these higher order terms numerically for specific applications, in a relatively efficient manner.

7.4 Self-consistent Perturbation Theory

The object of this section is to develop a perturbation formalism for the one-particle density matrix in closed shell Hartree-Fock theory. The formulas developed here allow a more rigorous calculation of various properties of atoms and molecules than those given in previous sections of this chapter, because electron repulsion terms are included explicitly.
The entire effect of the self-consistency terms is buried in the detailed calculation of the perturbation series for f, and therefore, formulas for the density matrix and related quantities in terms of f, which were derived for the simple matrix case, will apply here also.

7.4.a General Theory

In this case, the perturbation series for the operator f is obtained by expansion of the equation

$$D(f) = F_{BA} + F_{BB} f - f\,(F_{AA} + F_{AB} f) = 0, \qquad \text{(7.23)}$$

leading to a hierarchy of equations determining the $f^{(n)}$. This hierarchy is formally identical to that obtained in the simple matrix case, except that now, the matrix F (the closed shell Fock matrix) itself depends on f through its dependence on the density matrix, $P_A$:

$$F(P_A) = H + G(P_A). \qquad \text{(7.24)}$$

Here H is the core hamiltonian, and the two-electron part, $G(P_A)$, representing electron repulsion, is given by

$$G(P_A)_{rs} = 2 \sum_{t,u=1}^{n} (P_A)_{tu}\,\left( [rs\|ut] - \tfrac{1}{2}[rt\|us] \right), \qquad \text{(7.25a)}$$

with

$$[rs\|ut] = \iint \phi_r^*(1)\,\phi_s(1)\,r_{12}^{-1}\,\phi_u^*(2)\,\phi_t(2)\,d\tau_1\,d\tau_2, \qquad \text{(7.25b)}$$

the $\phi_r$ being elements of the zero order molecular orbital basis used for the calculation. Direct iterative solution of eq. (7.23) for f, without making a perturbation expansion, is equivalent to solving the Hartree-Fock equations exactly.

We will consider here only those perturbations which can be introduced as perturbations of the core hamiltonian,

$$H = \sum_{n=0}^{\infty} H^{(n)}. \qquad \text{(7.26)}$$

This perturbation will induce changes in the electron distribution described by $P_A$, and through that, the two-electron part of the Fock matrix is perturbed. Thus, the $n^{\mathrm{th}}$ order term in the perturbation series for the Fock matrix consists not only of the $n^{\mathrm{th}}$ order term, $H^{(n)}$, in (7.26), but also includes an $n^{\mathrm{th}}$ order two-electron term. The exact form of this $n^{\mathrm{th}}$ order two-electron term depends on the manner in which eq.
(7.23) is expanded into a hierarchy of equations determining the terms of the series for f, as is explained below.

It is convenient, but not strictly necessary, to require that $F^{(0)}(f)$ be at least block diagonal, so that f itself is at least a first order quantity, $f = \sum_{n=1}^{\infty} f^{(n)}$. In some applications, it may be desirable to relax this requirement, and the modifications which must be made in such a case to the formalism given below are indicated in Appendix 10. However, in this section, and sections 7.4.b and 7.4.c following, it will be assumed that we are working in a basis in which $F^{(0)}$ is at least block diagonal. Formal substitution of the series for f and F into (7.23) gives

$$D^{(n)} = F_{BA}^{(n)} + \sum_{j=1}^{n} \left( F_{BB}^{(n-j)} f^{(j)} - f^{(j)} F_{AA}^{(n-j)} \right) - \sum_{i=1}^{n-1} \sum_{j=1}^{n-i} f^{(i)} F_{AB}^{(n-i-j)} f^{(j)} = F_{BA}^{(n)} + F_{BB}^{(0)} f^{(n)} - f^{(n)} F_{AA}^{(0)} + A^{(n)} = 0, \qquad n = 0, 1, 2, \ldots \qquad \text{(7.27)}$$

Here, one has

$$A^{(n)} = \sum_{j=1}^{n-1} \left( F_{BB}^{(n-j)} f^{(j)} - f^{(j)} F_{AA}^{(n-j)} \right) - \sum_{i=1}^{n-1} \sum_{j=1}^{n-i} f^{(i)} F_{AB}^{(n-i-j)} f^{(j)}, \qquad \text{(7.28)}$$

which does not depend explicitly or implicitly (through $F^{(n)}$) on $f^{(n)}$. The quantity (7.28) is not the same as the analogous quantity in eq. (6.9) in the simple matrix case, because $F^{(n)}$ may now depend on $f^{(n)}$, and therefore, for the purposes of solving (7.27), it must appear explicitly. The extension of these equations to a non-orthonormal basis will lead to equations similar to (6.54) and (6.55) in place of (7.27) and (7.28).
Despite the similarity of the basic equations, the determination of the perturbation series for f is more complicated here than in the simple matrix case when the term $F^{(n)}$ in (7.27) is considered to depend on $f^{(n)}$, that is, when the $n^{\mathrm{th}}$ order Fock matrix is considered to be

$$F^{(n)} = H^{(n)} + G(P_A^{(n)}). \qquad \text{(7.29)}$$

When (7.29) is incorporated into (7.27), the so-called "coupled Hartree-Fock" perturbation scheme results, and the equations for the $f^{(n)}$ in this case are derived in the following subsection. Formalisms in which the dependence of $F^{(n)}$ on $P_A^{(n)}$ is partially or completely neglected, leading to schemes referred to as "uncoupled Hartree-Fock" perturbation theory, are discussed in section 7.4.c.

7.4.b Coupled Hartree-Fock Perturbation Theory

In the coupled Hartree-Fock perturbation scheme, $F^{(n)}$ is considered to be dependent on $f^{(n)}$ as indicated in eq. (7.29). That is, the two-electron integrals are considered to be order neutral. It is convenient to write the $n^{\mathrm{th}}$ order density matrix, $P_A^{(n)}$, in the form

$$P_A^{(n)} = \begin{pmatrix} 0_{AA} & f^{(n)\dagger} \\ f^{(n)} & 0_{BB} \end{pmatrix} + \tilde{P}_A^{(n)}, \qquad \text{(7.30)}$$

where $\tilde{P}_A^{(n)}$ depends on $f^{(j)}$ for $j \le n-1$ only. Thus, eqs. (7.27) become

$$D^{(n)}(f) = G_{BA}(f^{(n)}) + F_{BB}^{(0)} f^{(n)} - f^{(n)} F_{AA}^{(0)} + F_{BA}(\tilde{P}_A^{(n)}) + A_{BA}^{(n)} = 0, \qquad n = 0, 1, 2, \ldots, \qquad \text{(7.31)}$$

where

$$F(\tilde{P}_A^{(n)}) = H^{(n)} + G(\tilde{P}_A^{(n)}). \qquad \text{(7.32)}$$

The last two terms of (7.31) are now independent of $f^{(n)}$. The important feature of eqs. (7.31), however, is that the first three terms, those which are dependent on $f^{(n)}$, are of a particularly simple form, which is the same for all values of n. From eq. (7.25a),

$$G_{BA}(f^{(n)})_{\tau s} = \sum_{\sigma}^{\mathrm{unocc}} \sum_{r}^{\mathrm{occ}} f^{(n)}_{\sigma r}\,\left( 2[\tau s\|r\sigma] - [\tau\sigma\|rs] \right) + \sum_{\sigma}^{\mathrm{unocc}} \sum_{r}^{\mathrm{occ}} f^{(n)*}_{\sigma r}\,\left( 2[\tau s\|\sigma r] - [\tau r\|\sigma s] \right). \qquad \text{(7.33)}$$

If all quantities are real, this reduces to

$$G_{BA}(f^{(n)})_{\tau s} = \sum_{\sigma}^{\mathrm{unocc}} \sum_{r}^{\mathrm{occ}} f^{(n)}_{\sigma r}\,\left( 4[\tau s\|\sigma r] - [\tau\sigma\|rs] - [\tau r\|\sigma s] \right) = \sum_{\sigma}^{\mathrm{unocc}} \sum_{r}^{\mathrm{occ}} f^{(n)}_{\sigma r}\,A_{\tau s,\sigma r}. \qquad \text{(7.34)}$$

The four index quantity, $A_{\tau s,\sigma r}$, has sometimes been referred to as the Nesbet supermatrix. Thus, if the zero order Fock matrix is block diagonal, the $n^{\mathrm{th}}$ order equation (7.31) can be written

$$D^{(n)}_{\tau s} = \sum_{\sigma}^{\mathrm{unocc}} \sum_{r}^{\mathrm{occ}} B_{\tau s,\sigma r}\,f^{(n)}_{\sigma r} + \left[ F_{BA}(\tilde{P}_A^{(n)}) + A_{BA}^{(n)} \right]_{\tau s} = 0, \qquad (\tau = 1, \ldots, n_B;\ s = 1, \ldots, n_A), \qquad \text{(7.35a)}$$

where

$$B_{\tau s,\sigma r} = (F_{BB}^{(0)})_{\tau\sigma}\,\delta_{rs} - (F_{AA}^{(0)})_{rs}\,\delta_{\tau\sigma} + A_{\tau s,\sigma r}. \qquad \text{(7.35b)}$$

Where the zero order Fock matrix is diagonal, eq. (7.35a) becomes

$$(\epsilon_\tau^0 - \epsilon_s^0)\,f^{(n)}_{\tau s} + \sum_{\sigma}^{\mathrm{unocc}} \sum_{r}^{\mathrm{occ}} A_{\tau s,\sigma r}\,f^{(n)}_{\sigma r} + \left[ F_{BA}(\tilde{P}_A^{(n)}) + A_{BA}^{(n)} \right]_{\tau s} = 0, \qquad (\tau = 1, \ldots, n_B;\ s = 1, \ldots, n_A), \qquad \text{(7.36)}$$

where the $\epsilon^0$ are the eigenvalues of the zero order Fock matrix. In either case, the calculation of $f^{(n)}$ reduces to the solution of a system of $n_A n_B$ simultaneous linear equations.

Even when the zero order Fock matrix is diagonal, it is no longer possible to obtain a closed formula for the elements of $f^{(n)}$, because of the self-consistency term. However, only the terms $F_{BA}(\tilde{P}_A^{(n)})$ and $A_{BA}^{(n)}$ need be calculated for each value of n, since the coefficients of the $f^{(n)}_{\sigma r}$ in $D^{(n)}(f)$ are the same for every value of n. The calculation of these two quantities is easily done automatically, and therefore, the formalism above can be used to calculate high order perturbation series for $P_A$ without having to derive and use explicit perturbation equations.

Because of the potentially large dimension of the matrix B in eq. (7.35a), non-iterative methods of solution (such as Gaussian elimination) may not be practical in application to that linear system, especially if B is a sparse matrix.
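An iterative solution of a linear system of the form (7.35a) can be sketched with a Gauss-Seidel sweep. The example below is a minimal illustration only: the 3 x 3 matrix stands in for B with the compound index $(\tau s)$ flattened to a single index, and the right-hand side stands in for $-[F_{BA}(\tilde{P}_A^{(n)}) + A_{BA}^{(n)}]$. All numbers are hypothetical, chosen to be diagonally dominant so that the iteration converges.

```python
def gauss_seidel(B, rhs, tol=1e-12, max_sweeps=500):
    """Solve B x = rhs by Gauss-Seidel sweeps, using each updated
    component immediately within the same sweep."""
    n = len(rhs)
    x = [0.0] * n
    for sweep in range(max_sweeps):
        largest_change = 0.0
        for i in range(n):
            off_diagonal = sum(B[i][j] * x[j] for j in range(n) if j != i)
            new_value = (rhs[i] - off_diagonal) / B[i][i]
            largest_change = max(largest_change, abs(new_value - x[i]))
            x[i] = new_value
        if largest_change < tol:
            return x, sweep + 1
    raise RuntimeError("Gauss-Seidel did not converge")


# Hypothetical stand-ins: diagonal ~ orbital energy differences,
# off-diagonal ~ two-electron coupling terms.
B = [[3.0, 0.5, 0.2],
     [0.5, 4.0, 0.3],
     [0.2, 0.3, 5.0]]
rhs = [1.0, -2.0, 0.5]

f_flat, sweeps = gauss_seidel(B, rhs)
residual = max(abs(sum(B[i][j] * f_flat[j] for j in range(3)) - rhs[i])
               for i in range(3))
print(f_flat, sweeps, residual)
```

Because the coefficient matrix is the same for every order n, only the right-hand side would change from one order to the next in such a scheme.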
Perhaps the simplest iterative technique is the Gauss-Seidel procedure, with the iteration formula

$$f^{(n)[m+1]}_{\tau s} = -\frac{1}{B_{\tau s,\tau s}} \left\{ \left[ F_{BA}(\tilde{P}_A^{(n)}) + A_{BA}^{(n)} \right]_{\tau s} + \sum_{(\sigma r) < (\tau s)} B_{\tau s,\sigma r}\,f^{(n)[m+1]}_{\sigma r} + \sum_{(\sigma r) > (\tau s)} B_{\tau s,\sigma r}\,f^{(n)[m]}_{\sigma r} \right\}, \qquad \text{(7.37)}$$

where [m] labels the iteration. This procedure has been found to be satisfactory in the small number of calculations we have done, although no data has been obtained on actual rates of convergence. Many other efficient techniques are available for the iterative solution of large linear systems. We shall not explore this aspect further here, however.

7.4.c Uncoupled Hartree-Fock Perturbation Theory

The term "uncoupled Hartree-Fock perturbation theory" has been applied to a number of related approximations, proposed over a period of years, in order to simplify the solution of (7.27) (for example, Langhoff, Karplus, and Hurst, 1965; Musher, 1967), usually only in first order. The complicated coupling term in eqs. (7.35a) or (7.36) arises directly from the requirement that self-consistency be maintained in all orders in the perturbation. However, in a situation in which the perturbation is expected to distort the electronic distribution only slightly, it may be possible to obtain acceptable results by relaxing this self-consistency requirement somewhat. In first order, this amounts to ignoring the dependence of $F_{BA}^{(1)}$ on $f^{(1)}$, leading to the same result as in the simple matrix case,

$$f^{(1)\,\mathrm{unc}}_{\sigma r} = \frac{H^{(1)}_{\sigma r}}{\epsilon_r^0 - \epsilon_\sigma^0}, \qquad \text{(7.38)}$$

when $F^{(0)}$ is diagonal.

A degree of ambiguity enters if this formalism is to be extended to higher order, however. It is not clear whether one should ignore just the $f^{(n)}$ dependent part of $P_A^{(n)}$, or all of $P_A^{(n)}$, in the $n^{\mathrm{th}}$ order equation, (7.27).
There may be an accumulation of non-self-consistency as one proceeds to higher orders, depending on the exact form of the approximations employed, and this may cast doubt on the validity of these higher order terms.

An internally consistent and unambiguous perturbation formalism does result if the two-electron integrals are considered to be first order quantities except where they enter implicitly in $F^{(0)}$. Then

$$F^{(n)} = H^{(n)} + G(P_A^{(n-1)}), \qquad \text{(7.39)}$$

and no self-consistency term in $f^{(n)}$ will occur in the $n^{\mathrm{th}}$ order equation of (7.27). In fact, except for the implicit dependence of the $F^{(n)}$ in $A^{(n)}$ on lower order $f^{(j)}$, the resulting hierarchy of equations determining the $f^{(n)}$ will be identical to that in the simple matrix case. Only by actual calculations can the validity of the assumption (7.39) be assessed, however.

CHAPTER 8

DIRECT MINIMIZATION SELF-CONSISTENT FIELD THEORY

"'In that case', said the Dodo solemnly, rising to its feet, 'I move that the meeting adjourn, for the immediate adoption of more energetic remedies--'"

"'Would you tell me, please, which way I ought to go from here?'
'That depends a good deal on where you want to get to', said the Cat.
'I don't much care where--', said Alice.
'Then it doesn't matter which way you go', said the Cat.
'--so long as I get somewhere', Alice added as an explanation.
'Oh, you're sure to do that', said the Cat, 'if you only walk long enough.'
Alice felt that this could not be denied, so she tried another question. 'What sort of people live about here?'
'In that direction', said the Cat, waving its right paw around, 'lives a Hatter: and in that direction', waving the other paw, 'lives a March Hare. Visit either you like: they're both mad.'"

(Alice's Adventures in Wonderland, Lewis Carroll)
8.1 Introduction

In the Hartree-Fock approximation, the total electronic energy of an atomic or molecular system described by a single determinant wavefunction, $\Psi$, can be written in a given basis as

$$E = \sum_{i=1}^{m} \nu_i \sum_{r,s=1}^{n} R^{(i)}_{sr} h_{rs} + \tfrac{1}{2} \sum_{i,j=1}^{m} \nu_i \nu_j \sum_{r,s=1}^{n} \sum_{t,u=1}^{n} R^{(i)}_{sr} R^{(j)}_{tu}\,\left( [rs\|ut] - a_{ij}[rt\|us] \right)$$

$$\phantom{E} = \sum_{i=1}^{m} \nu_i \sum_{r,s=1}^{n} \sum_{\alpha} X^{(i)}_{s\alpha} X^{(i)*}_{r\alpha} h_{rs} + \tfrac{1}{2} \sum_{i,j=1}^{m} \nu_i \nu_j \sum_{r,s=1}^{n} \sum_{t,u=1}^{n} \sum_{\alpha,\beta} X^{(i)}_{s\alpha} X^{(i)*}_{r\alpha} X^{(j)}_{t\beta} X^{(j)*}_{u\beta}\,\left( [rs\|ut] - a_{ij}[rt\|us] \right). \qquad \text{(8.1)}$$

Here h is the core hamiltonian for the system, and the $[rs\|ut]$ are two-electron integrals defined in eq. (7.25b). The summation indices, i, j, refer to electronic shells. The $X^{(i)}_{s\alpha}$ are expansion (lcao) coefficients, expressing the occupied normalized orbitals as linear combinations of the given basis functions. The $\nu_i$ are occupation numbers for these orbitals, and the $a_{ij}$ are constants determined according to the values of $\nu_i$ and $\nu_j$. The operator $R^{(i)}$ (the one-particle density matrix for the $i^{\mathrm{th}}$ shell) is a projection onto the space of the $i^{\mathrm{th}}$ shell occupied orbitals,

$$R^{(i)} = X^{(i)} X^{(i)\dagger}. \qquad \text{(8.2)}$$

The electronic energies of the stationary states of the system are approximated by the stationary values of E, eq. (8.1), considered as a function of a suitable set of variables, such as the $X^{(i)}$ or $R^{(i)}$.

The traditional and still, at present, the most commonly used procedure for determining the stationary values of E has been by solving the corresponding Hartree-Fock equations (Roothaan, 1951),

$$F^{(i)} X^{(i)} = S X^{(i)} \epsilon^{(i)}, \qquad (i = 1, \ldots, m). \qquad \text{(8.3)}$$

The matrices $F^{(i)}$ depend on all of the occupied orbitals $X^{(j)}$, (j = 1, ..., m).
An initial estimate of the $X^{(i)}$ is used to construct approximations to the $F^{(i)}$, which are then diagonalized to yield, it is hoped, an improved estimate of the $X^{(i)}$, which can be used to obtain a further improved approximation for the $F^{(i)}$. This iterative procedure is continued until self-consistency is achieved. It is conceptually very simple, and in applications to the simplest (single shell) systems, rates of convergence relative to the work required in each iteration are quite good. Difficulties in obtaining convergence do arise, however, especially in calculations involving more complicated multi-shell systems.

An alternative to the use of the Hartree-Fock equations is to minimize the energy, E, directly with respect to a chosen set of variables. One problem in using the elements of the density matrices, $R^{(i)}$, or the lcao coefficients, $X^{(i)}$, for this, is that a relatively large number of constraints must be imposed if the simple functional form, (8.1), of the energy is to be preserved. When using the lcao coefficients, the presence of redundant variables also causes difficulties for procedures, such as the Newton-Raphson method (Appendix 11), which require that the matrix of second derivatives of E (the Hessian matrix) be non-singular near stationary points. Redundancy among the expansion coefficients is associated with the invariance (to within a complex scalar factor) of the determinantal wavefunction to non-singular linear transformations of occupied orbitals in the same shell. Under orthonormality constraints, the redundancy associated with unitary transformations still remains. The density matrices contain no redundancy, but must satisfy more complicated constraints.
The presence of q redundant variables implies the existence of a q-dimensional constant energy surface through each point in the coordinate space of the unconstrained variables. A serious consequence for some gradient minimization techniques is that the Hessian matrix is then singular at stationary points of the energy (Sutcliffe, 1974, 1975; Coope, unpubl.).

An efficient technique for eliminating the orthogonality constraints on the lcao coefficients for closed shell systems has been developed by Fletcher (1970), and extended by Kari and Sutcliffe (1970, 1973) to more general multi-shell and multi-determinant cases. However, calculations in which Fletcher's method is used in conjunction with the conjugate gradient minimization technique are frequently poorly convergent near the energy minimum. As a result, such direct energy minimization procedures have been used more to provide improved starting estimates for the solution of the Hartree-Fock equations than as an alternative to the Hartree-Fock equations (see, for example, Claxton and Smith, 1971).

Sutcliffe (1974, 1975) has explicitly exhibited the singularity of the Hessian matrix at the energy minimum in formalisms based on Fletcher's method, and he has suggested that this singularity may contribute to the slowness of convergence near the energy minimum. This suggestion is questioned below, both on theoretical grounds, and by examination of rates of convergence for calculations involving minimization of the energy with respect to a set of unconstrained variables containing no redundancies. It is our contention that the observed poor convergence rates arise rather out of deficiencies in the straightforward implementation of the conjugate gradient minimization algorithm.
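The Hessian singularity caused by a redundant variable can be seen in a toy problem. The sketch below is an illustration under simplifying assumptions, not a Hartree-Fock calculation: it takes one unnormalized orbital (x, y) in a two-function orthonormal basis and uses the Rayleigh quotient of the hypothetical matrix [[0, -1], [-1, 0]] as the energy. The energy is unchanged when (x, y) is rescaled, so the scale factor is a redundant variable, and the finite-difference Hessian at the minimum (1, 1) has a vanishing determinant.

```python
def energy(x, y):
    """Rayleigh quotient of [[0, -1], [-1, 0]] for the unnormalized
    vector (x, y); invariant under rescaling of (x, y)."""
    return -2.0 * x * y / (x * x + y * y)


def hessian(func, x, y, h=1e-4):
    """Central finite-difference Hessian of a function of two variables."""
    fxx = (func(x + h, y) - 2.0 * func(x, y) + func(x - h, y)) / h ** 2
    fyy = (func(x, y + h) - 2.0 * func(x, y) + func(x, y - h)) / h ** 2
    fxy = (func(x + h, y + h) - func(x + h, y - h)
           - func(x - h, y + h) + func(x - h, y - h)) / (4.0 * h ** 2)
    return [[fxx, fxy], [fxy, fyy]]


H = hessian(energy, 1.0, 1.0)             # (1, 1) is the minimum, E = -1
determinant = H[0][0] * H[1][1] - H[0][1] * H[1][0]
# Product of the Hessian with the scaling direction (1, 1):
along_scale = [H[0][0] + H[0][1], H[1][0] + H[1][1]]
print(H, determinant, along_scale)
```

The direction (1, 1), along which the vector is merely rescaled, is a null vector of the Hessian, while the curvature in the perpendicular direction is positive.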
Sutcliffe (1974, 1975) has proposed several solutions to the redundancy problem, but clearly, the simplest would be to write the total electronic energy, from the beginning, in terms of a set of unconstrained variables not possessing such redundancies. The eigenvalue independent partitioning formalisms developed in chapters 2 and 4 provide such sets of variables, namely the matrix elements of the off-diagonal blocks of the matrix T. In the following sections, the application of the partitioning formulas to the minimization of the energy of a system represented by a single determinant wavefunction is described. One of the major advantages here is the fact that the derivatives of E with respect to these variables can be expressed very simply in terms of the columns of the projections, (8.2), onto the occupied orbitals, and their complements. A scaled descent method, based on partitioning with respect to current occupied and unoccupied molecular orbitals, is proposed (section 8.3.c), which appears to be very successful in practice.

8.2 Closed Shell Systems

8.2.a Orthonormal Basis

The square matrix, X, of the eigenvectors of the closed shell Fock matrix is partitioned into the $n_A$ occupied and the $n_B$ unoccupied molecular orbitals. The orthonormal basis functions, in terms of which these orbitals are expressed, are partitioned into two sets of the same dimensions, $n_A$ and $n_B$, defining spaces $S_A$ and $S_B$. In this way, the coefficient matrix X can be written in the blocked form (2.2). The projection, R, onto the space of occupied orbitals, is given by eq. (2.10) as

$$R = \begin{pmatrix} g_A^{-1} & g_A^{-1} f^\dagger \\ f g_A^{-1} & f g_A^{-1} f^\dagger \end{pmatrix}, \qquad \text{(8.4)}$$

where $g_A = 1_A + f^\dagger f$ is the metric for the eigenvectors $X^{(A)}$, truncated to the space $S_A$.

In the closed shell case, the energy functional is particularly simple,

$$E = 2\,\mathrm{tr}\,Rh + \mathrm{tr}\,RG(R) = 2 \sum_{r,s} R_{sr} \left[ h_{rs} + \sum_{t,u} R_{tu}\,\left( [rs\|ut] - \tfrac{1}{2}[rt\|us] \right) \right]. \qquad \text{(8.5)}$$

Substitution of (8.4) into (8.5) gives the energy in terms of the matrix elements of f only. Since the degrees of freedom available¹, $n_A n_B$, exactly equal the number of matrix elements of f, there can be no redundancy. Also, the matrix f is completely unconstrained because R, eq. (8.4), automatically satisfies the criteria necessary to be a projection (section 2.1.a). In short, the matrix elements of f represent a set of unconstrained variables possessing no redundancies, with respect to which the energy can be minimized.

In principle, the elements of the block $R_{BA}$ also provide a set of non-redundant and unconstrained variables, but they are not very suitable for specifying the energy, because of the complicated relationship² between $R_{BA}$ and $R_{AA}$ or $R_{BB}$.

¹ The argument involving numbers of variables is of central importance here, and is as follows for the closed shell case. If no constraints are imposed on the occupied molecular orbitals, $X^{(A)}$, specified by $n_A n$ complex parameters, then these orbitals are arbitrary up to an $n_A \times n_A$ linear transformation. Therefore, there must be $n_A^2$ complex redundant variables among the lcao coefficients in the single determinant wavefunction, leaving $n_A n_B$ complex variables which are independent. If the molecular orbitals are constrained to be orthonormal, then $n_A^2$ real parameters are eliminated by the constraints, and $n_A^2$ of the real parameters remaining are redundant (equal to the independent parameters in a unitary transformation; this includes the $n_A$ arbitrary phase factors), again leaving $2 n_A n_B$ real parameters or $n_A n_B$ complex parameters which are independent.
² For the density matrix, the requirement of idempotency leads to the $n_A^2 + n_B^2$ (complex) constraints, $R_{AA} - R_{AA}^2 - R_{AB} R_{BA} = 0$, and $R_{BB} = R_{BA} R_{AA}^{-1} R_{AB}$, which give the blocks $R_{AA}$ and $R_{BB}$ in terms of $R_{BA}$, specified by $n_A n_B$ complex parameters. Given $R_{AA}$ and $R_{BA}$, it is easy to calculate $R_{BB}$, but the first equation here is not easily solved for $R_{AA}$ (see section 2.1.c, and in particular, eq. (2.23)).

The first and second derivatives of E with respect to the elements of f are most easily obtained by the incremental approach used in section 2.1.e, but now retaining terms to second order in the variation. Writing $g_A^{-1}(f + \delta f) = g_A^{-1}(f) + \delta g_A^{-1}$, to second order one has

$$\delta g_A^{-1} = -g_A^{-1}\,\delta g_A\,g_A^{-1} + g_A^{-1}\,\delta g_A\,g_A^{-1}\,\delta g_A\,g_A^{-1} + O(\delta^3), \qquad \text{(8.6)}$$

where

$$\delta g_A = \delta f^\dagger f + f^\dagger \delta f + \delta f^\dagger \delta f. \qquad \text{(8.7)}$$

For the density matrix, the variation $R(f + \delta f) = R(f) + \delta R$ is given, exactly, by

$$\delta R_{AA} = \delta g_A^{-1},$$
$$\delta R_{AB} = g_A^{-1}\,\delta f^\dagger + \delta g_A^{-1} f^\dagger + \delta g_A^{-1}\,\delta f^\dagger, \qquad \text{(8.8)}$$
$$\delta R_{BA} = \delta f\,g_A^{-1} + f\,\delta g_A^{-1} + \delta f\,\delta g_A^{-1},$$
$$\delta R_{BB} = f\,\delta g_A^{-1} f^\dagger + \delta f\,g_A^{-1} f^\dagger + f g_A^{-1}\,\delta f^\dagger + \delta f\,\delta g_A^{-1} f^\dagger + f\,\delta g_A^{-1}\,\delta f^\dagger + \delta f\,g_A^{-1}\,\delta f^\dagger + \delta f\,\delta g_A^{-1}\,\delta f^\dagger.$$

The first order term in the expansion of $E(f + \delta f)$ can be simplified to give

$$\delta^{(1)} E = 2\,\mathrm{tr}\,\delta R\,F \qquad \text{(8.9a)}$$
$$\phantom{\delta^{(1)} E} = 2\,\mathrm{tr}\,\delta f^\dagger \bar{D} + 2\,\mathrm{tr}\,\delta f\,\bar{D}^\dagger, \qquad \text{(8.9b)}$$

where

$$\bar{D} = g_B^{-1}\,D(f)\,g_A^{-1}. \qquad \text{(8.9c)}$$

This is identical to the result obtained for a simple matrix (section 2.1.e), except that here, the quantity D(f), given by

$$D(f) = F_{BA} + F_{BB} f - f F_{AA} - f F_{AB} f, \qquad \text{(8.10)}$$

is defined in terms of the Fock operator,

$$F = h + G(R), \qquad \text{(8.11)}$$

which itself is a function of f. As before, $g_B = 1_B + f f^\dagger$ is the metric for the eigenvectors $X^{(B)}$ of F truncated to the space $S_B$.
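The relation between the energy gradient and D(f) in eqs. (8.9) and (8.10) can be checked numerically in the smallest case. The sketch below assumes $n_A = n_B = 1$, real quantities, and no electron repulsion (G = 0, so F = h and E = 2 tr Rh); the 2 x 2 matrix h is hypothetical. Under these assumptions eq. (8.9b) gives $dE/df = 4 D(f)/(g_B g_A)$, which is compared with a finite-difference derivative. A fixed-step steepest descent, used here only for illustration and not the scaled descent method of section 8.3.c, then locates the point where D(f) = 0.

```python
import math

h = [[-1.0, 0.3], [0.3, 0.5]]   # hypothetical one-electron matrix


def projection(f):
    """R of eq. (8.4) for scalar real f, with g_A = 1 + f*f."""
    g_a = 1.0 + f * f
    return [[1.0 / g_a, f / g_a], [f / g_a, f * f / g_a]]


def energy(f):
    """E = 2 tr(R h), i.e. eq. (8.5) with the two-electron part dropped."""
    R = projection(f)
    return 2.0 * sum(R[s][r] * h[r][s] for r in range(2) for s in range(2))


def D(f):
    """D(f) = F_BA + F_BB f - f F_AA - f F_AB f with F = h, eq. (8.10)."""
    return h[1][0] + h[1][1] * f - f * h[0][0] - f * h[0][1] * f


def analytic_gradient(f):
    g_a = 1.0 + f * f           # g_A = 1 + f^T f
    g_b = 1.0 + f * f           # g_B = 1 + f f^T
    return 4.0 * D(f) / (g_b * g_a)


def numeric_gradient(f, step=1e-6):
    return (energy(f + step) - energy(f - step)) / (2.0 * step)


f_test = 0.4
gradient_error = abs(analytic_gradient(f_test) - numeric_gradient(f_test))

f_min = f_test
for _ in range(200):            # fixed-step steepest descent in f
    f_min -= 0.1 * analytic_gradient(f_min)

# Lowest eigenvalue of the 2 x 2 matrix h, for comparison.
mean = 0.5 * (h[0][0] + h[1][1])
half_gap = math.sqrt((0.5 * (h[0][0] - h[1][1])) ** 2 + h[0][1] ** 2)
lowest_eigenvalue = mean - half_gap
print(gradient_error, f_min, D(f_min), energy(f_min), 2.0 * lowest_eigenvalue)
```

At convergence the energy equals twice the lowest eigenvalue of h, as expected for one doubly occupied orbital in this simplified model.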
From (8.9b), the first derivatives of the energy are seen to be

$$\frac{\partial E}{\partial f^*_{\sigma r}} = 2\hat{D}_{\sigma r} \tag{8.12a}$$

$$\phantom{\frac{\partial E}{\partial f^*_{\sigma r}}} = 2[(1 - R)FR]_{\sigma r} \tag{8.12b}$$

$$\phantom{\frac{\partial E}{\partial f^*_{\sigma r}}} = 2F_{e^B_\sigma e^A_r}, \tag{8.12c}$$

using the notation developed in section 2.1.d. Here and below, Greek letters denote basis elements in $S_B$, and Roman letters denote basis elements in $S_A$. The first derivatives of the energy with respect to the variables $f_{\sigma r}$ are therefore given by elements of the off-diagonal blocks of the current Fock matrix between contragredient non-orthonormal vectors given by the first $n_A$ columns of R and the last $n_B$ columns of (1 - R). Because the metric matrices $g_A$ and $g_B$ are positive definite (as, therefore, are their inverses), it is seen that the first derivatives of the energy with respect to the elements of f can vanish only if D(f) = 0. In fact, this condition, or the more general one, eq. (2.13), with F defined as in (2.15a), can be regarded as an expression for the Hartree-Fock equations in the present formalism. As indicated in section 2.2.a, the condition D(f) = 0 (and therefore, $\nabla_{f^\dagger}E = 0$) is also equivalent to $X^\dagger F X$ being block diagonal.

Isolation of the coefficients of the second order terms in $E(f + \delta f)$ yields the second derivatives of the energy with respect to the elements of f. After considerable algebraic manipulation, one obtains [eqs. (8.13a) and (8.13b); not legible in the source scan]. In the particular case that the partitioning chosen is defined by the current projection R, so that f = 0 and $R = 1_A$, and further, when particular bases adapted to R and (1 - R) are chosen which diagonalize $F_{AA}(R)$ and $F_{BB}(R)$, respectively, then the dominant terms in eqs.
(8.13) are the derivatives $\partial^2 E/\partial f^*_{\sigma r}\,\partial f_{\sigma r}$, with the value given by eq. (8.14) [not legible in the source scan] (the $\epsilon_i$ being the eigenvalues of F), equal to the singlet single excitation energies. At the energy minimum, the remaining second derivatives all reduce to combinations of two-electron integrals. To the extent that these combinations are small, the excitation energies, (8.14), approximate the eigenvalues of the Hessian matrix, which are thus positive, as they should be for an energy minimum.

In the case that all quantities are real, the above derivative formulas become (see Appendix 12),

$$\frac{\partial E}{\partial f_{\sigma r}} = 4F_{e^B_\sigma e^A_r}, \tag{8.15}$$

together with the corresponding second derivatives [eqs. (8.16a) and (8.16b); not legible in the source scan].

8.2.b Non-orthonormal Basis

In this case, the density matrix, R, and the electronic energy, E, are still given by eqs. (8.4) and (8.5), respectively. However, now both E and R depend, through the metric $g_A$, on the overlap matrix. According to eq. (2.103a), the metric $g_A$ is given here by

$$g_A = S_{AA} + S_{AB} f + f^\dagger S_{BA} + f^\dagger S_{BB} f, \tag{8.17}$$

so that now eq. (8.7) must be replaced by

$$\begin{aligned}
\delta g_A &= (S_{AB} + f^\dagger S_{BB})\,\delta f + \delta f^\dagger (S_{BA} + S_{BB} f) + \delta f^\dagger S_{BB}\,\delta f \\
&= Y_{AB}\,\delta f + \delta f^\dagger Y_{BA} + \delta f^\dagger S_{BB}\,\delta f,
\end{aligned} \tag{8.18}$$

in the energy variation. The quantity $Y_{BA} = S_{BA} + S_{BB} f$ has been defined previously in section 5.3.c, and reduces to f for an orthonormal basis. Isolation of the first order part of $E(f + \delta f)$ gives the first derivatives of the energy with respect to the elements of f as

$$\frac{\partial E}{\partial f^*_{\sigma r}} = 2F_{e^B_\sigma e^A_r}, \tag{8.19}$$

which is identical to eq. (8.12c).
The orbitals $e^A_r$, ($r = 1,\ldots,n_A$), are the same as before, but now the $e^B_\sigma$, ($\sigma = 1,\ldots,n_B$), are given as the columns of

$$\begin{pmatrix} -g_A^{-1} Y_{AB} \\ 1_B - f g_A^{-1} Y_{AB} \end{pmatrix} = [1 - RS]^{(B)}. \tag{8.20}$$

That is, the first derivatives of the energy with respect to the elements of f are given as matrix elements of the current Fock matrix between two sets of contragredient non-orthonormal vectors consisting, respectively, of the first $n_A$ columns of the density matrix, R, and the last $n_B$ columns of the complementary matrix (1 - RS) [3].

As before, the second derivatives are obtained by isolating the coefficients of second order terms in $E(f + \delta f)$. The calculation of these coefficients is considerably more lengthy and tedious than for an orthonormal basis, but the final results are given simply by [eqs. (8.21a) and (8.21b); not legible in the source scan]. These formulas are identical in form to those obtained in an orthonormal basis, eqs. (8.13), except for the factors R and (1 - R) being replaced by SR and (1 - RS), respectively, in certain places.

[3] This is not the complement of R in the usual sense of the word. Since $(RS)^2 = RS$, one has $(1 - RS)R = 0$; however, the reverse product $R(1 - RS) = R - R^2 S$ is not zero in general.

An analysis of the metric properties of the $e^A_r$ and $e^B_\sigma$ similar to that of section 2.1.d for an orthonormal basis can be carried out. The algebra is tedious and only the major results are listed here.
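The projection-like properties quoted for RS follow in a few lines from (8.4) and (8.17). The derivation below is a sketch; the symbol $T$ for the block column $(1_A, f)^{T}$ is a label introduced here for brevity and is not notation from the text.

```latex
% Sketch: T is a label introduced here, not the thesis's notation.
\begin{align*}
T &= \begin{pmatrix} 1_A \\ f \end{pmatrix}, \qquad
g_A = T^\dagger S\, T = S_{AA} + S_{AB} f + f^\dagger S_{BA} + f^\dagger S_{BB} f, \qquad
R = T\, g_A^{-1}\, T^\dagger, \\
R S R &= T\, g_A^{-1} \left( T^\dagger S\, T \right) g_A^{-1}\, T^\dagger
       = T\, g_A^{-1}\, g_A\, g_A^{-1}\, T^\dagger = R, \\
(RS)^2 &= (RSR)\,S = RS, \qquad (1 - RS)\,R = R - RSR = 0, \\
\left[\, 1 - RS \,\right]^{(B)} &= \begin{pmatrix} 0 \\ 1_B \end{pmatrix}
   - T\, g_A^{-1} \left( S_{AB} + f^\dagger S_{BB} \right)
   = \begin{pmatrix} -g_A^{-1} Y_{AB} \\ 1_B - f\, g_A^{-1} Y_{AB} \end{pmatrix}.
\end{align*}
```

The last line reproduces (8.20), since the last $n_B$ columns of $T^\dagger S$ are $S_{AB} + f^\dagger S_{BB} = Y_{AB}$.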
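Writing these vectors as columns of a matrix

$$e = \begin{pmatrix} g_A^{-1} & -g_A^{-1} Y_{AB} \\ f g_A^{-1} & 1_B - f g_A^{-1} Y_{AB} \end{pmatrix}, \tag{8.22}$$

one can show that

$$e^\dagger e = \begin{pmatrix} g_A^{-1}(1_A + f^\dagger f)g_A^{-1} & g_A^{-1} f^\dagger - g_A^{-1}(1_A + f^\dagger f)g_A^{-1} Y_{AB} \\ f g_A^{-1} - Y_{BA}\, g_A^{-1}(1_A + f^\dagger f)g_A^{-1} & 1_B + Y_{BA}\, g_A^{-1}(1_A + f^\dagger f)g_A^{-1} Y_{AB} - f g_A^{-1} Y_{AB} - Y_{BA}\, g_A^{-1} f^\dagger \end{pmatrix}, \tag{8.23}$$

verifying the non-orthonormality of the columns of e. A set of vectors dual to the e (that is, such that $\tilde{e}^\dagger e = 1$) are given by

$$\tilde{e} = \begin{pmatrix} \hat{S}_A & -f^\dagger \\ Y_{BA} & 1_B \end{pmatrix}, \tag{8.24}$$

where $\hat{S}_A = S_{AA} + S_{AB} f$. These vectors are also non-orthonormal, as is seen from

$$\tilde{e}^\dagger \tilde{e} = \begin{pmatrix} \hat{S}_A^\dagger \hat{S}_A + Y_{AB} Y_{BA} & Y_{AB} - \hat{S}_A^\dagger f^\dagger \\ Y_{BA} - f \hat{S}_A & 1_B + f f^\dagger \end{pmatrix}. \tag{8.25}$$

It is seen from eq. (2.33) that the last $n_B$ of the $\tilde{e}$ are formally the same here as in an orthonormal basis.

Metric matrices, with respect to which the e and $\tilde{e}$ are orthonormal, can be constructed explicitly. One obtains,

$$\Delta = \tilde{e}\tilde{e}^\dagger = \begin{pmatrix} \hat{S}_A \hat{S}_A^\dagger + f^\dagger f & \hat{S}_A Y_{AB} - f^\dagger \\ Y_{BA} \hat{S}_A^\dagger - f & 1_B + Y_{BA} Y_{AB} \end{pmatrix} \tag{8.26}$$

and

$$\tilde{\Delta} = e e^\dagger = \begin{pmatrix} g_A^{-1}(1_A + Y_{AB} Y_{BA})g_A^{-1} & g_A^{-1}(1_A + Y_{AB} Y_{BA})g_A^{-1} f^\dagger - g_A^{-1} Y_{AB} \\ f g_A^{-1}(1_A + Y_{AB} Y_{BA})g_A^{-1} - Y_{BA}\, g_A^{-1} & 1_B + f g_A^{-1}(1_A + Y_{AB} Y_{BA})g_A^{-1} f^\dagger - f g_A^{-1} Y_{AB} - Y_{BA}\, g_A^{-1} f^\dagger \end{pmatrix}, \tag{8.27}$$

for which it is easy to verify that $e^\dagger \Delta e = 1$, $\tilde{e}^\dagger \tilde{\Delta} \tilde{e} = 1$. Not only are these results more complicated than for an orthonormal basis, but now $\tilde{e}^\dagger \tilde{e} \neq \Delta$ and $e^\dagger e \neq \tilde{\Delta}$, in contrast to the previous case. The matrices e and $\tilde{e}$ are no longer normal.

8.2.c Results of Test Calculations: Closed Shell Case

A set of CNDO/2 calculations was carried out to obtain information on the convergence properties of direct energy minimization procedures based on the formalism presented in sections 8.2.a and 8.2.b. The calculations were carried out on an IBM 370/168 computer using double precision arithmetic [4]. In all calculations, the convergence criteria imposed were $|\delta E| < 10^{-12}$ a.u. and $|\delta R_{ii}| < 10^{-6}$ per iteration. The number of iterations required to satisfy both these criteria is given in Table 8.1 for selected calculations. In practice, a single iteration in addition to those indicated in the table is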
required in each case to verify that the convergence criteria have been satisfied.

The seven molecules chosen are ones for which the Roothaan iteration method can be used with varying degrees of success. Four of them, CH4, HF, LiF, and H2O, present no problems at all. For two of them, BeO and BN, Roothaan's method is only slowly convergent, and the last one, PN, leads to oscillations between definite charge distributions after about thirty Roothaan iterations. For each of these last three difficult cases, convergence of Roothaan's method will occur or can be accelerated if a suitable inter-iteration density matrix averaging procedure is employed.

[4] The parts of the programs involved in calculating the CNDO/2 integrals and core hamiltonian were adapted from the CNDO/2 program of Pople and Beveridge (1970).

The variables $f_{\sigma r}$ were defined by a partitioning between 'occupied' bond and lone pair orbitals, and 'unoccupied' antibond and atomic orbitals. The bond orbitals were non-polar combinations of hybrid atomic orbitals, the hybrid AOs used being far from optimal in some cases (for example, sp3 hybrids on the F atom). In this bond/antibond/lone pair orbital basis, the starting approximation was f = 0. For calculations done directly in the atomic orbital basis, the starting value of f was calculated using eq. (2.3a), where X defines the starting orbitals in the AO basis.

It is seen that in all but a small number of calculations, substantially fewer iterations were required to satisfy the convergence criteria when using the direct minimization methods than when using Roothaan's method.
Even in the cases causing difficulty for the Roothaan method, convergence appears straightforward for the direct methods. When variables $f_{\sigma r}$ defined by an arbitrary partitioning of the AO basis are used, the number of iterations required increases somewhat. Rates of convergence for Fletcher's method and the partitioning method are generally comparable, indicating that the presence of redundant variables in the former has no observable effect on convergence rates. Generally, it was found that the overall rate of convergence depends very little on the accuracy of the step length as long as some minimal accuracy is maintained.

Assuming that the construction of the Fock matrix is by far the most costly single step in an SCF calculation, direct energy minimization procedures based on the conjugate gradient algorithms are at least twice as costly per iteration as the Roothaan method. Therefore, even a rather substantial decrease in the number of iterations required for a direct method may not represent a more efficient overall calculation. However, the direct methods do have an advantage of reliability: they can never diverge, if set up appropriately.

With the partitioning defined in the bond/antibond/lone pair basis, the full Newton-Raphson equations converge very rapidly; in none of the seven examples studied are more than five iterations required to satisfy the stringent convergence criteria. In the case of CH4, this rate of convergence can be duplicated using the conjugate gradient technique if the step lengths during the linear search are calculated sufficiently accurately (correct to four figures), but not for H2O.
If an arbitrary partitioning is defined in the atomic orbital basis, initial convergence of the Newton-Raphson method is generally very much poorer. For two of the molecules, the calculation actually diverges, while for a third, it converges to a stationary point above the minimum value of the energy.

Because of the expense involved in using the full Newton-Raphson equations, both a diagonal block and diagonal approximation were tested, these being analogous, respectively, to algorithm FGN, and to algorithms DGN and SDNR, as described in chapter 5. While these approximations represent a very significant reduction in computation required, the methods are seen to be generally unreliable. Convergence is not only much poorer, but some calculations actually diverge in cases where Roothaan's method converges. The Newton-Raphson equations can, nevertheless, be usefully exploited in other ways, one of which is described and illustrated in section 8.3.c.

TABLE 8.1 Closed Shell Case: Test Calculations (a)

Method                              Scheme (c)   CH4   HF    LiF   H2O   BeO   BN       PN
Roothaan                                         10    16    17    20    49    >80 (b)  osc.
Fletcher                            d1           7     5     14    7     18    19       16
Fletcher                            d3           2           14    4     18    19       16
Fletcher                            c            7     5     14    9     18    18       16
Partitioning, bond orbital basis    d1           7     5     15    7     19    19       18
Partitioning, bond orbital basis    d3           2           14    5     19    20       15
Partitioning, bond orbital basis    c            7     5     14    9     19    19       16
Partitioning, atomic orbital basis  d1           8     7     20    10    18    19       44
Partitioning, atomic orbital basis  d3           4     5     18    10    18    20       43
Partitioning, atomic orbital basis  c            5     5     18    10    18    20       43
Newton-Raphson, full (B/A basis)                 2     3     4     2     5     5        5
Newton-Raphson, diagonal (B/A basis)             17    11    div.  24    div.  div.     div.

Rows with fewer than seven legible entries, given in source order (column positions not recoverable): Partitioning (steepest descents): 5, >40. Newton-Raphson, full (AO basis): 3, 13 (d), 4, div., 5, div. Newton-Raphson, block diagonal (B/A basis): 3, 27, 7, div., div., div.

(a) Number of iterations required to satisfy $|\delta E| < 10^{-12}$, $|\delta R_{ii}| < 10^{-6}$ per iteration.
(b) convergent.
(c) interpolation schemes: $d_i$, secant formula applied i times; c, cubic formula.
(d) converged to an excited state.

8.3 Unrestricted Hartree-Fock Theory

8.3.a Energy Derivatives

The formalism developed in the previous section for the closed shell case can be carried over with minor modifications to unrestricted Hartree-Fock calculations. In fact, it is possible, in some sense, to view the resulting formalism as that of two coupled closed shell systems, one for the $\alpha$-spin electrons, and one for the $\beta$-spin electrons. The energy functional is now given by eq. (8.28) [not legible in the source scan]. The matrices $R^\alpha$ and $R^\beta$ are the one-particle density matrices referring to the $\alpha$-spin and $\beta$-spin occupied orbitals, respectively.

A set of unconstrained, non-redundant variables completely specifying E can be introduced as follows. In the chosen basis, the $n_A^\alpha$ occupied $\alpha$-spin orbitals are written as columns of a matrix $X^{\alpha(A)}$, and similarly, the $n_A^\beta$ occupied $\beta$-spin orbitals as $X^{\beta(A)}$. These orbitals will be eigenvectors of the appropriate Fock operators. Now, two different partitionings of the basis set are carried out. In the first case, the basis functions are partitioned into two sets of dimensions $n_A^\alpha$ and $n_B^\alpha$ spanning spaces $S_A^\alpha$ and $S_B^\alpha$. A second partitioning is defined in which the basis functions are partitioned into two sets of dimensions $n_A^\beta$ and $n_B^\beta$ spanning spaces $S_A^\beta$ and $S_B^\beta$, respectively. As a result, the occupied $\alpha$-spin and $\beta$-spin orbitals can be written in the block form

$$X^{\alpha(A)} = \begin{pmatrix} X^\alpha_{AA} \\ X^\alpha_{BA} \end{pmatrix}, \qquad X^{\beta(A)} = \begin{pmatrix} X^\beta_{AA} \\ X^\beta_{BA} \end{pmatrix}. \tag{8.29}$$

It is now possible to define two f-operators, namely,

$$f^i = X^i_{BA}\,(X^i_{AA})^{-1}, \qquad i = \alpha, \beta, \tag{8.30}$$

in terms of the two sets of occupied orbitals. Then one has

$$R^i = X^{i(A)} X^{i(A)\dagger} = \begin{pmatrix} 1_A \\ f^i \end{pmatrix} (g_A^i)^{-1} \begin{pmatrix} 1_A & f^{i\dagger} \end{pmatrix}, \tag{8.31}$$

with

$$g_A^i = 1_A + f^{i\dagger} f^i, \tag{8.32}$$

giving the two density matrices, and thus the electronic energy, solely in terms of the $n_A^\alpha n_B^\alpha + n_A^\beta n_B^\beta$ elements of $f^\alpha$ and $f^\beta$.

That the elements of $f^\alpha$ and $f^\beta$ are the minimum number of variables necessary to specify the energy, but not subject to any constraints nor possessing any redundancies, can be established in the same way as for the closed shell case. The requirement that the $\alpha$-spin occupied orbitals be orthonormal, and the redundancy associated with the invariance of the energy, (8.28), to an $n_A^\alpha \times n_A^\alpha$ unitary transformation of these orbitals, together eliminate $(n_A^\alpha)^2$ of the $n_A^\alpha(n_A^\alpha + n_B^\alpha)$ LCAO coefficients, leaving $n_A^\alpha n_B^\alpha$ independent variables, which is equal to the number of elements in $f^\alpha$. Similarly, orthonormality constraints and redundancy leave only $n_A^\beta n_B^\beta$ independent variables in the $\beta$-spin occupied orbitals, which is equal to the number of elements in $f^\beta$. Orthogonality between $\alpha$-spin orbitals and the $\beta$-spin orbitals is automatic, due to the orthogonality of the spin parts. The so-called "pairing conditions" sometimes used in the derivation of this different-orbitals-different-spin (DODS) formalism (Rosenberg and Martino, 1975) merely represent a particular choice of some of the redundant variables in the orbital coefficients of the two sets of spin-orbitals, and thus need not be considered in the above arguments concerning the number of degrees of freedom in the problem.

For a variation $\delta R^i$ in the $R^i$ ($i = \alpha, \beta$), the corresponding change $\delta E$ in E, eq. (8.28), is given exactly as

$$\delta E = \mathrm{tr}\,[\delta R^\alpha F^\alpha + \delta R^\beta F^\beta] + \tfrac{1}{2}\,\mathrm{tr}\,[\delta R^\alpha\,\delta G^\alpha + \delta R^\beta\,\delta G^\beta]. \tag{8.33}$$

Here $F^\alpha$ and $F^\beta$ are the $\alpha$-spin and $\beta$-spin orbital Fock matrices respectively,

$$F^\alpha = h + G^\alpha = h + J(R^\alpha) - K(R^\alpha) + J(R^\beta), \tag{8.34a}$$

and

$$F^\beta = h + G^\beta = h + J(R^\beta) - K(R^\beta) + J(R^\alpha). \tag{8.34b}$$

The first order part of (8.33) is the sum of two terms of the same form as (8.9a) for the closed shell case. Therefore, one has immediately that

$$\frac{\partial E}{\partial f^{i*}_{\sigma r}} = [(1 - R^i S)\,F^i R^i]_{\sigma r} \tag{8.35a}$$

$$\phantom{\frac{\partial E}{\partial f^{i*}_{\sigma r}}} = F^i_{(e^i_B)_\sigma (e^i_A)_r}. \tag{8.35b}$$

Thus, the first derivatives of the energy with respect to the elements of the $f^i$ are again just matrix elements of the corresponding current Fock operator between two sets of contragredient (non-orthonormal) molecular orbitals, which are, respectively, the first $n_A^i$ columns of the density matrix $R^i$ and the last $n_B^i$ columns of the matrix $(1 - R^i S)$, ($i = \alpha, \beta$). It is seen that the first derivatives of the energy with respect to elements of $f^\alpha$ depend on $f^\beta$ only implicitly through the dependence of $F^\alpha$ on $R^\beta$, and vice versa.

The second derivatives of the energy are given by [eqs. (8.36) and (8.37); not legible in the source scan]. These formulas are different from those in the closed shell case because the coupling between $\alpha$-spin and $\beta$-spin orbitals is explicit in the second order variation of E with $f^\alpha$ and $f^\beta$.

8.3.b Test Calculations and Computational Refinements

A series of minimal basis set (STO) ab initio calculations was carried out on the molecule CN in order to obtain information on the practical implementation of the UHF-SCF formalism just described. Claxton and Smith (1971) have reported convergence problems in similar calculations. A Roothaan iteration procedure converges very slowly when the interatomic distance is 2.0 a.u., and exhibits oscillatory behaviour, failing to converge, when this distance is increased to 2.2 a.u. (see Figures 8.1 and 8.4).
When a direct minimization procedure based on Fletcher's method was used, it was found that convergence was rapid at first, but became very slow as the minimum was approached. They concluded that the most efficient procedure was to use the direct method initially, until a good estimate of the energy minimum was obtained, and then complete the calculation using a Roothaan iteration procedure, which converges well when provided with a good starting approximation.

The calculations here were carried out on an IBM 370/168 computer using double precision arithmetic. The integrals in the Slater orbital basis were obtained from a version of the POLYCAL program. Orbital exponents were taken from Clementi (1963). The linear search step in the conjugate gradient algorithm was required to reduce $dE/d\lambda$ by a factor $\epsilon$ compared to its value at $\lambda = 0$, and $\epsilon$ was usually chosen as 0.1. The starting approximation in all but one case was equivalent to the eigenvectors of the core hamiltonian.

It was found that the convergence of the direct minimization calculations based on the partitioning formalism was very poor if the $f^i$ were defined by an arbitrary partitioning of the atomic orbital basis. Convergence improves greatly if they are defined by the partitioning of a set of molecular orbitals, $X_0$, which more nearly block diagonalize the Fock operator. In practice, this involves evaluating the energy gradient and f-operator in the new basis, that is,

$$\nabla_{f^i} E^{MO} = (e^{i\dagger}_B X_0^\dagger)\,F^{i,AO}\,(X_0\,e^i_A), \tag{8.38}$$

the calculation requiring less computation if the quantities in the brackets are evaluated first, and then the back-transformation of the density matrix as calculated from the MO basis f-operator using (8.31),

$$(R^i)^{AO} = X_0\,(R^i)^{MO}\,X_0^\dagger. \tag{8.39}$$

No transformation
of the two-electron integrals is necessary [5]. The Fletcher and Roothaan calculations were done in the original atomic orbital basis.

Tables 8.2 and 8.3 summarize the results of sixteen different calculations done here. The relative rates of convergence of some of the methods and refinements are also illustrated in Figures 8.1 - 8.3 for the CN molecule with a bond length of 2.0 a.u., and in Figures 8.4 - 8.6 for a bond length of 2.2 a.u. The energy range in Figures 8.1 and 8.4 is larger by a ratio of 400:15 than that in the other four figures.

On comparing the results of the calculations involving Fletcher's method (2, 4, 5) to those based on the use of the $f^i$ (6, 7), it is seen that not only do both methods converge poorly near the energy minimum, but that Fletcher's method actually slightly outperforms the method based on the partitioning formalism. This is also seen in Figures 8.1 and 8.4.

A number of modifications of the basic method based on the use of the $f^i$ were examined. Slow rates of convergence near the minimum imply significant linear dependence between successive search directions in the conjugate gradient calculation. Simply restarting the calculation with a steepest descent direction more frequently resulted in no improvement (Figures 8.2 and 8.5). However, a major increase in the rate of convergence was obtained when the basis, $X_0$, in which the partitioning was defined was replaced by the eigenbasis of the current Fock operators at the time of the steepest descent restart, as was done for the calculations numbered 9 in the tables. The partitioning operators are set to zero when the new basis is incorporated into the calculation, and therefore, this basis modification is equivalent to a single Roothaan iteration.

[5] The transformation to the MO basis has an additional advantage when working in a non-orthonormal AO basis, because if the new basis vectors satisfy $X_0^\dagger S X_0 = 1$, then the energy gradient formulas applying in an orthonormal basis can be used, since $S^{MO} = 1$. This partially, if not completely, offsets the additional cost of the transformations in (8.38) and (8.39).

TABLE 8.2 Details of Direct Minimization Calculations, CN Molecule (r = 2.0 a.u.)

No.  Type (a)  Min. alg. (b)  ε      (c)  (d)    (e)    Final energy (f), a.u.  Rank (m)
1    R                                                  -112.691216 (g)         16
2    Fl        c.g.                                     -112.835540 (h)         10
3    R                                                  -112.845722 (i)         3
4    Fl        c.g.           0.1                       -112.841230             8
5    Fl        c.g.           0.01                      -112.840916             9
6    P         c.g.           0.1                       -112.818824             14
7    P         c.g.           0.01                      -112.822130             13
8    P         c.g.           0.1    3                  -112.805275             15
9    P         c.g.           0.1    3    x             -112.845708             4
10   P         c.g.           0.1                x (k)  -112.845365             6
11   P         c.g.           0.1    3           x      -112.845418             5
12   P         c.g.           0.1    3    x      x      -112.845722 (j)         1
13   P         s.d.           0.1                       -112.824592             12
14   P         s.d.           0.1         3 (l)         -112.845121             7
15   P         s.d.           0.1                x (k)  -112.832094             11
16   P         s.d.           0.1         3 (l)  x      -112.845722             2

(a) R = Roothaan, Fl = Fletcher, P = Partitioning.
(b) c.g. = conjugate gradient, s.d. = steepest descent.
(c) steepest descent restart frequency.
(d) basis update at steepest descent restart.
(e) gradient scaling in effect.
(f) after 30 iterations unless otherwise noted; exact energy is -112.845722 a.u.
(g) 28 iterations.
(h) 29 iterations.
(i) uses final result from calculation #2 as starting approximation.
(j) convergence criteria $|\delta E| < 10^{-12}$, $|\delta R_{ii}| < 10^{-6}$ satisfied in 25 iterations.
(k) using eigenvalues of core hamiltonian.
(l) indicates the frequency of basis modification.
(m) indicates the order of the final energies, from lowest to highest.

FIGURE 8.1 Total electronic energy as a function of iteration number for the CN molecule (bond length = 2.0 a.u.): (1) Roothaan; (2) partitioning, steepest descent search directions only; (3) partitioning, conjugate gradients; (4) Fletcher, conjugate gradients; (5) partitioning, conjugate gradients with gradient scaling and basis update with steepest descent restart every 3 iterations. In all direct minimization calculations, ε = 0.1 (see Table 8.2).

FIGURE 8.2 Total electronic energy as a function of iteration number for the CN molecule (bond length = 2.0 a.u.). Comparison of the effect of various modifications on the conjugate gradient algorithm, partitioning approach only: (1) steepest descent restart every 3 iterations only; (2) basic conjugate gradient algorithm; (3) gradient scaling only; (4) gradient scaling and steepest descent restart every 3 iterations; (5) steepest descent restart every 3 iterations with basis update at restart; (6) gradient scaling, steepest descent restart every 3 iterations with basis update at restart (see Table 8.2).

FIGURE 8.3 Total electronic energy as a function of iteration number for the CN molecule (bond length = 2.0 a.u.). Comparison of the effect of various modifications on the steepest descent algorithm, partitioning approach only: (1) steepest descent algorithm only; (2) basic conjugate gradient algorithm; (3) steepest descents with gradient scaling only; (4) steepest descents with basis update every 3 iterations; (5) steepest descents with gradient scaling and basis update every 3 iterations; (6) conjugate gradients with gradient scaling, steepest descent restart every 3 iterations with basis update at restart (see Table 8.2).
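As a worked count of the independent variables in the CN calculations of this section, the arithmetic below assumes a minimal basis of ten Slater orbitals (1s, 2s and three 2p functions on each atom) and a doublet state with seven α-spin and six β-spin occupied orbitals. These dimensions are assumptions made for illustration; they are not stated explicitly in the text.

```latex
% Assumed dimensions: n = 10, n_A^alpha = 7, n_A^beta = 6 (not stated in the text).
\begin{align*}
n &= 10, \qquad n_A^\alpha = 7,\; n_B^\alpha = 3, \qquad n_A^\beta = 6,\; n_B^\beta = 4, \\
\text{elements of } f^\alpha \text{ and } f^\beta &:\quad
  n_A^\alpha n_B^\alpha + n_A^\beta n_B^\beta = 7 \cdot 3 + 6 \cdot 4 = 45, \\
\text{LCAO coefficients} &:\quad
  n\,(n_A^\alpha + n_A^\beta) = 10 \cdot 13 = 130, \\
\text{removed by constraints and redundancy} &:\quad
  (n_A^\alpha)^2 + (n_A^\beta)^2 = 49 + 36 = 85 = 130 - 45 .
\end{align*}
```

Under these assumptions the partitioning formalism works with 45 variables where the orbital coefficients number 130, which would be consistent with the restart interval of 45 quoted from Fletcher and Reeves (1964) in section 8.3.c if that interval is tied to the number of variables.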
TABLE 8.3 Details of Direct Minimization Calculations, CN Molecule (r = 2.2 a.u.)

No.  Type (a)  Min. alg. (b)  ε      (c)  (d)    (e)    Final energy (f), a.u.  Rank (l)
1    R                                                  osc. (g)                16
2    Fl        c.g.                                     -110.991333 (h)         8
3    R                                                  -111.012914 (i)         2
4    Fl        c.g.           0.1                       -110.995595             5
5    Fl        c.g.           0.01                      -110.994936             6
6    P         c.g.           0.1                       -110.976114             13
7    P         c.g.           0.01                      -110.977213             12
8    P         c.g.           0.1    3                  -110.975463             14
9    P         c.g.           0.1    3    x             -111.011872             3
10   P         c.g.           0.1                x (j)  -110.981282             11
11   P         c.g.           0.1    3           x      -110.984485             9
12   P         c.g.           0.1    3    x      x      -111.012980             1
13   P         s.d.           0.1                       -110.951620             15
14   P         s.d.           0.1         3 (k)         -110.992565             7
15   P         s.d.           0.1                x (j)  -110.981113             10
16   P         s.d.           0.1         3 (k)  x      -111.010863             4

(a) R = Roothaan, Fl = Fletcher, P = Partitioning.
(b) c.g. = conjugate gradient, s.d. = steepest descent.
(c) steepest descent restart frequency.
(d) basis update at steepest descent restart.
(e) gradient scaling in effect.
(f) after 30 iterations unless otherwise noted; exact energy is -111.012980 a.u.
(g) 28 iterations.
(h) 29 iterations.
(i) uses final result from calculation #2 as starting approximation.
(j) using eigenvalues of core hamiltonian.
(k) indicates frequency of basis modification.
(l) indicates the order of the final energies, from lowest to highest.

FIGURE 8.4 Total electronic energy as a function of iteration number for the CN molecule (bond length = 2.2 a.u.): (1) Roothaan; (2) partitioning, steepest descent search directions only; (3) partitioning, conjugate gradients; (4) Fletcher, conjugate gradients; (5) partitioning, conjugate gradients with gradient scaling and basis update with steepest descent restart every 3 iterations. In all direct minimization calculations, ε = 0.1 (see Table 8.3).

FIGURE 8.5 Total electronic energy as a function of iteration number for the CN molecule (bond length = 2.2 a.u.). Comparison of the effect of various modifications on the conjugate gradient algorithm, partitioning approach only: (1) steepest descent restart every 3 iterations only; (2) basic conjugate gradient algorithm; (3) gradient scaling only; (4) gradient scaling and steepest descent restart every 3 iterations; (5) steepest descent restart every 3 iterations with basis update at restart; (6) gradient scaling, steepest descent restart every 3 iterations with basis update at restart (see Table 8.3).

FIGURE 8.6 Total electronic energy as a function of iteration number for the CN molecule (bond length = 2.2 a.u.). Comparison of the effect of various modifications on the steepest descent algorithm, partitioning approach only: (1) basic steepest descent algorithm; (2) basic conjugate gradient algorithm; (3) steepest descents with gradient scaling only; (4) steepest descents with basis update every 3 iterations; (5) steepest descents with gradient scaling and basis update every 3 iterations; (6) conjugate gradients with gradient scaling, steepest descent restart every 3 iterations with basis update at restart (see Table 8.3).

8.3.c Use of Scaled Variables

A second modification of the basic algorithm which results in a major improvement in convergence is suggested by the Newton-Raphson equations for determining the zeros of the energy gradient (see Appendix 11). Upon neglecting the two-electron integrals in eqs. (8.36) and (8.37), and in a basis diagonalizing the current Fock operators, it is seen that

$$\frac{\partial^2 E}{\partial f^{i*}_{\sigma r}\,\partial f^i_{\sigma r}} \approx \epsilon^i_\sigma - \epsilon^i_r. \tag{8.40}$$

Thus, the diagonal approximation of the Newton-Raphson equations can be written,

$$\delta f^i_{\sigma r} = -\lambda\,(\epsilon^i_\sigma - \epsilon^i_r)^{-1}\,\frac{\partial E}{\partial f^{i*}_{\sigma r}}, \tag{8.41}$$

where $\lambda$ is some constant independent of $\sigma$ and r. If the minimization problem is rewritten in terms of a new set of variables,

$$f'^{\,i}_{\sigma r} = (\epsilon^i_\sigma - \epsilon^i_r)^{1/2}\,f^i_{\sigma r}, \tag{8.42}$$

then the diagonal Newton-Raphson equations, which are approximations of a second order convergent method, give the correction $\delta f'^{\,i}_{\sigma r}$ as a simple step (the same for all $\sigma$ and r) along the steepest direction. This modification can easily be incorporated into the ordinary conjugate gradient formalism by scaling the energy gradient,

$$\frac{\partial E}{\partial f'^{\,i*}_{\sigma r}} = (\epsilon^i_\sigma - \epsilon^i_r)^{-1/2}\,\frac{\partial E}{\partial f^{i*}_{\sigma r}}. \tag{8.43}$$

Then the appropriate correction for the unscaled variables is

$$\delta f^i_{\sigma r} = (\epsilon^i_\sigma - \epsilon^i_r)^{-1/2}\,\delta f'^{\,i}_{\sigma r} = \lambda\,(\epsilon^i_\sigma - \epsilon^i_r)^{-1/2}\,v^i_{\sigma r}, \tag{8.44}$$

where v is the conjugate search direction, computed from the scaled gradients, and $\lambda$ is the step length computed by interpolation in the usual manner. This gradient scaling (or the implicit use of the scaled variables $f'^{\,i}_{\sigma r}$) is aimed at correcting the problems in descent methods caused by anisotropy in the curvature of the energy surface. In practice, the numbers $\epsilon^i$ used are the best available estimates of the eigenvalues of the Fock operator at any stage of the calculation. Initially, any suitable estimate may be used (for example, orbital energies from a semi-empirical calculation of some sort, or even the eigenvalues of the core hamiltonian, as was done for the calculations described in Tables 8.2 and 8.3). This scaling procedure has no simple counterpart for Fletcher's method.

The calculations numbered 10 were done by incorporating only this scaling procedure into the basic conjugate gradient algorithm, resulting in a substantial improvement in the rate of convergence.
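The effect of the scaling on an anisotropic surface can be seen on a toy problem. The sketch below is not the SCF calculation: it assumes a purely quadratic model energy $E = \tfrac{1}{2}\sum_i k_i x_i^2$ in which the curvatures $k_i$ play the role of the orbital energy differences, and it uses an exact line search in place of interpolation. The function name `descend` and the numerical values are hypothetical.

```python
import math

def descend(curvatures, x, scaled, tol=1e-10, max_iter=10000):
    """Line-search descent on E(x) = 0.5 * sum(k_i * x_i**2); returns iterations used."""
    for iteration in range(max_iter):
        grad = [k * xi for k, xi in zip(curvatures, x)]
        if max(abs(g) for g in grad) < tol:
            return iteration
        if scaled:
            # gradient with respect to the scaled variables x'_i = sqrt(k_i) * x_i
            v = [g / math.sqrt(k) for g, k in zip(grad, curvatures)]
            # corresponding direction for the unscaled variables, cf. eq. (8.44)
            direction = [vi / math.sqrt(k) for vi, k in zip(v, curvatures)]
        else:
            direction = grad
        # exact minimizing step length along -direction for a quadratic form
        lam = (sum(g * d for g, d in zip(grad, direction))
               / sum(k * d * d for k, d in zip(curvatures, direction)))
        x = [xi - lam * d for xi, d in zip(x, direction)]
    return max_iter

gaps = [0.5, 2.0, 10.0]   # hypothetical orbital energy differences (curvatures)
start = [1.0, 1.0, 1.0]
print("plain steepest descent :", descend(gaps, start, scaled=False), "iterations")
print("scaled steepest descent:", descend(gaps, start, scaled=True), "iterations")
```

With exact curvatures the scaled step removes the anisotropy entirely; in the actual calculations the orbital energy differences are only estimates of the curvature, so the improvement is partial.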
Increasing the steepest descent restart frequency to every three iterations (compared with 45 as recommended by Fletcher and Reeves (1964)) resulted in a further small improvement. However, when the molecular orbital basis defining the partitioning is replaced by the eigenbasis of the current Fock matrix at the steepest descent restart, the most rapidly convergent algorithm resulted. For the 2.0 a.u. interatomic distance, the energy became correct to fifteen figures (effectively the limit of the machine precision), and the diagonal elements of the density matrices to seven figures, in only 25 iterations. Nearly the same results were obtained for the 2.2 a.u. bond length.

The test calculations described above support the assertion that the singularity of the Hessian matrix at the energy minimum has no observable effect on the rate of convergence of the conjugate gradient algorithm.⁶ Rather, they indicate that much of the poor convergence is due to the fact that the energy curvature is highly anisotropic in general, and the usual conjugate gradient algorithm does not take proper account of this.

⁶ For a quadratic form, it is easily demonstrated that a singular Hessian matrix has no effect on the convergence properties of the conjugate gradient algorithm, except that the minimum will be located in fewer iterations, since no linear search is required in directions corresponding to those along which the form has zero curvature. In a converging energy minimization calculation, the part of the coordinate space corresponding to the redundant variables should effectively act as a null space as far as the choice of search directions is concerned. This is especially true near a minimum, where the energy is most like a quadratic form.

A single average
step length along the descent directions generated tends to overestimate the necessary correction for some variables and underestimate it for the others. This is a well known shortcoming of steepest descent procedures also. In fact, the steepest descent algorithm, employing a cubic interpolation linear search, does not converge much more slowly than the conjugate gradient algorithm. It appears that in application to direct minimization self-consistent field theory, the finite termination property of the conjugate gradient method is of little advantage, since, even for the smallest systems, this finite termination in principle requires considerably more iterations than are acceptable if efficient calculations are to result.

No attempt was made in this series of calculations to determine an optimal restart frequency. The rate of convergence is usually greatest either in an iteration involving a restart, or in the one immediately following, and therefore, it is unlikely that an interval between restarts of much more than three iterations will result in faster overall convergence. Despite such frequent steepest descent restarts, there still appears to be some advantage to using the conjugate gradient search directions, as can be seen on comparison of calculations 13 - 16, respectively, with calculations 6, 9, 10, and 12. The calculation of these search directions from the steepest directions is a small part of the whole calculation. Even so, the steepest directions by themselves give remarkably good results here (see Figures 8.3 and 8.6).
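The effect of gradient scaling and of frequent steepest descent restarts can be illustrated on a small model problem. The sketch below is not the program used for the calculations reported here; it is a minimal illustration under stated assumptions: the energy is replaced by a hypothetical quadratic form with strongly anisotropic curvature, the line search is exact rather than interpolated, and the scaling factors are the exact diagonal curvatures (in the text they are estimated from orbital energy differences). All names (`make_problem`, `minimize`, `scale`, `restart`) are illustrative. Dividing the gradient by the curvature estimates, as done here, is equivalent to working in the scaled variables and transforming the step back to the unscaled ones.

```python
import math


def make_problem(n=40):
    """Model energy E(x) = 1/2 x.H.x - b.x with strongly anisotropic curvature.

    H is tridiagonal: the diagonal curvatures run from 1 to about 1000 and
    neighbouring variables are weakly coupled (hypothetical test data).
    """
    diag = [1.0 + 25.0 * i for i in range(n)]
    off = 0.1
    b = [1.0] * n

    def hess_times(v):
        out = [diag[i] * v[i] for i in range(n)]
        for i in range(n - 1):
            out[i] += off * v[i + 1]
            out[i + 1] += off * v[i]
        return out

    return diag, b, hess_times


def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


def minimize(diag, b, hess_times, scale=False, restart=None,
             tol=1e-8, max_iter=5000):
    """Fletcher-Reeves conjugate gradients; returns (x, iterations used)."""
    n = len(b)
    x = [0.0] * n
    g = [-bi for bi in b]                          # gradient H x - b at x = 0
    w = [1.0 / d for d in diag] if scale else [1.0] * n
    z = [wi * gi for wi, gi in zip(w, g)]          # gradient divided by curvature estimates
    v = [-zi for zi in z]                          # first step is a steepest descent step
    for k in range(1, max_iter + 1):
        hv = hess_times(v)
        lam = -dot(g, v) / dot(v, hv)              # exact minimum along v (quadratic model)
        x = [xi + lam * vi for xi, vi in zip(x, v)]
        g_new = [gi + lam * hi for gi, hi in zip(g, hv)]
        if math.sqrt(dot(g_new, g_new)) < tol:
            return x, k
        z_new = [wi * gi for wi, gi in zip(w, g_new)]
        if restart is not None and k % restart == 0:
            beta = 0.0                             # steepest descent restart
        else:
            beta = dot(g_new, z_new) / dot(g, z)   # Fletcher-Reeves coefficient
        v = [-zi + beta * vi for zi, vi in zip(z_new, v)]
        g, z = g_new, z_new
    return x, max_iter


if __name__ == "__main__":
    diag, b, hess_times = make_problem()
    for label, kwargs in [("plain", {}),
                          ("scaled", {"scale": True}),
                          ("scaled, restart every 3", {"scale": True, "restart": 3})]:
        _, iterations = minimize(diag, b, hess_times, **kwargs)
        print(label, iterations)
```

On this model the scaled variants need far fewer iterations than the unscaled algorithm, and restarting every three iterations costs little once the gradient is scaled. The model is quadratic, so it says nothing about the basis update at restart, which has no analogue here.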
8.4 Theory for the General Single Determinant Case

8.4.a The Basic Variables of the Calculation

We now consider the case of an N-electron system, represented by a single determinant wavefunction constructed from occupied orbitals for which there is a natural grouping into $m$ sets, called shells, which are relatively weakly coupled by the hamiltonian operator. The $n_I$ occupied orbitals, $X^{(i)}_I$, associated with the $i$-th shell, are chosen from a set of $n$ orbitals $X^{(i)}$, which are eigenfunctions of the Fock operator, $F^{(i)}$, referring to the $i$-th shell. The total energy of the system, eq. (8.1), is then completely determined by the projections (one-particle density matrices in molecular orbital theory),

\[ R^{(i)} = X^{(i)}_I X^{(i)\dagger}_I, \qquad (i = 1, \ldots, m), \tag{8.45} \]

onto the individual $n_I$-dimensional subspaces of the full $n$-dimensional basis space, each subspace spanned by one of these sets of occupied orbitals. It will be shown that the columns of these projections and their complements again provide non-orthonormal basis vectors, in terms of which the first and second derivatives of the energy, (8.1), with respect to a set of variables provided by the multi-partitioning formalism of chapter 4, can be written effectively as compactly as in the closed shell case.

If the simple form of the energy, (8.1), is to be preserved, these projections must satisfy the constraints

\[ R^{(i)\dagger} = R^{(i)}, \tag{8.46a} \]

and

\[ R^{(i)} S R^{(j)} = R^{(i)} \delta_{ij}, \tag{8.46b} \]

or, equivalently, the occupied orbitals must satisfy the orthonormality conditions,

\[ X^{(i)\dagger}_I S X^{(j)}_J = \delta_{ij} 1_I. \tag{8.47} \]

Here $S$ is the matrix of overlap integrals of the fixed basis functions in terms of which the $X^{(i)}$ and $R^{(i)}$ are defined.

The number of independent parameters necessary to specify the energy exactly is determined as follows.
The total number of parameters in the lcao coefficients, $X^{(i)}_I$, $(i = 1, \ldots, m)$, is $n \sum_{I=1}^{m} n_I$, where $n = \sum_{I=1}^{m+1} n_I$ is the dimension of the full basis space. Within each set $X^{(i)}_I$, the orthonormality constraint, and the redundancy due to the invariance of the energy, (8.1), to an arbitrary unitary transformation of orbitals in the same shell, together account for $n_I^2$ parameters. The orthogonality of orbitals in different shells is expressed by $\sum_{I=2}^{m} n_I \sum_{J=1}^{I-1} n_J$ unique conditions, making it possible, in principle, to eliminate an equal number of parameters. Thus, the total number of unconstrained and non-redundant parameters required to specify the energy in the form (8.1) is $\sum_{I=1}^{m} n_I \big( n - \sum_{J=1}^{I} n_J \big)$.

The intra-shell constraints and redundancy can be eliminated by rewriting the energy in terms of a new set of parameters chosen as follows. For each $i$, $(i = 1, \ldots, m)$, the $n$ eigenvectors $X^{(i)}$ of the Fock operator $F^{(i)}$ are divided into $m+1$ subsets, $X^{(i)}_J$, of dimension $n_J$, respectively, such that the subset $X^{(i)}_I$ is the set of occupied $i$-th shell orbitals. Similarly, the $n$-dimensional basis space is partitioned into $m+1$ subspaces $S_J$, $(J = 1, \ldots, m+1)$, of the same dimensions, $n_J$, respectively. The matrices $X^{(i)}$, $(i = 1, \ldots, m)$, can now be written in an $(m+1) \times (m+1)$ block form, similar to that in eq. (4.2). A set of uncoupling operators, $\hat{T}^{(i)}$, is defined, such that (eq. (4.4)),

\[ X^{(i)} = \hat{T}^{(i)} \hat{X}^{(i)}, \qquad (i = 1, \ldots, m), \tag{8.48} \]

where $\hat{X}^{(i)}$ is the diagonal block part of $X^{(i)}$, making it possible to write the block columns of interest as

\[ X^{(i)}_I = \begin{pmatrix} X^{(i)}_{1I} \\ X^{(i)}_{2I} \\ \vdots \\ X^{(i)}_{m+1,I} \end{pmatrix} = \hat{T}^{(i)}_I X^{(i)}_{II} = \begin{pmatrix} f^{(i)}_{1I} \\ f^{(i)}_{2I} \\ \vdots \\ f^{(i)}_{m+1,I} \end{pmatrix} X^{(i)}_{II}, \qquad (i = 1, \ldots, m), \tag{8.49a} \]

where

\[ f^{(i)}_{JI} = X^{(i)}_{JI} \big( X^{(i)}_{II} \big)^{-1}, \qquad (J = 1, \ldots, m+1), \tag{8.49b} \]

so that $f^{(i)}_{II} = 1_I$. That is, we have implicitly set up $m$ $(m+1)$-fold partitionings, one for each of the $X^{(i)}$. The parts of each shown in (8.49a) are the only ones which enter the energy expression, (8.1). The $f^{(i)}_{JI}$, $(J = 1, \ldots, m+1,\ J \ne I)$, are specified by $n_I(n - n_I)$ complex parameters, which is exactly the number of parameters in the $X^{(i)}_I$ after intra-shell orthonormality and redundancy have been accounted for.

The intershell orthogonality constraints imply $\sum_{I=2}^{m} \sum_{J=1}^{I-1} n_I n_J$ relations between the elements of the $f^{(i)}_{JI}$, $(i = 1, \ldots, m)$. The explicit incorporation of these relations into the theory will be considered in section 8.4.d.

8.4.b The Energy Variation and First Derivatives

The energy functional will be written here in terms of the $R^{(i)}$ as

\[ E = \sum_{i=1}^{m} \nu_i \,\mathrm{tr}\, R^{(i)} h + \tfrac{1}{2} \sum_{i=1}^{m} \nu_i \,\mathrm{tr}\, R^{(i)} G_i, \tag{8.50} \]

where $h$ is the core hamiltonian matrix, representing the electronic kinetic energy, and the interaction between the electrons and the nuclei, and the $G_i$ represent the inter-electronic repulsion terms of the hamiltonian operator. In detail, one has

\[ G_i = \sum_{j=1}^{m} G_{ij}(\nu_j R^{(j)}) = \sum_{j=1}^{m} \nu_j G_{ij}(R^{(j)}), \tag{8.51} \]

where

\[ G_{ij}(R) = J(R) - a_{ij} K(R), \tag{8.52a} \]

and

\[ a_{ij} = 1 \ \text{if}\ \nu_i = \nu_j = 1, \qquad a_{ij} = \tfrac{1}{2} \ \text{otherwise}. \tag{8.52b} \]

The matrices $J(R)$ and $K(R)$ are the usual Coulomb and exchange matrices with elements given by

\[ J(R)_{rs} = \sum_{t,u} R_{tu} [rs\|ut], \qquad K(R)_{rs} = \sum_{t,u} R_{tu} [rt\|us], \tag{8.52c} \]

where the symbol $[rs\|ut]$ is defined in eq. (7.25b). The occupation numbers, $\nu_i$, may have values of 1 or 2 only.
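The structure of the energy functional, eqs. (8.50) - (8.52), can be made concrete with a few lines of code. The following is a minimal sketch, assuming real symmetric matrices stored as nested lists and a user-supplied function `eri(r, s, u, t)` returning the integral $[rs\|ut]$; these names, and the separable model integrals used at the end, are hypothetical and serve only to exercise the formulas. It is not an efficient way to build Fock-type matrices.

```python
def trace_product(a, b):
    """tr(a b) for two square matrices stored as lists of rows."""
    n = len(a)
    return sum(a[r][s] * b[s][r] for r in range(n) for s in range(n))


def coulomb(R, eri):
    """J(R)_rs = sum_tu R_tu [rs||ut], eq. (8.52c)."""
    n = len(R)
    return [[sum(R[t][u] * eri(r, s, u, t) for t in range(n) for u in range(n))
             for s in range(n)] for r in range(n)]


def exchange(R, eri):
    """K(R)_rs = sum_tu R_tu [rt||us], eq. (8.52c)."""
    n = len(R)
    return [[sum(R[t][u] * eri(r, t, u, s) for t in range(n) for u in range(n))
             for s in range(n)] for r in range(n)]


def a_coeff(nu_i, nu_j):
    """Exchange weight a_ij of eq. (8.52b)."""
    return 1.0 if nu_i == 1 and nu_j == 1 else 0.5


def shell_g(i, Rs, nus, eri):
    """G_i = sum_j nu_j [J(R_j) - a_ij K(R_j)], eqs. (8.51) and (8.52a)."""
    n = len(Rs[0])
    G = [[0.0] * n for _ in range(n)]
    for R_j, nu_j in zip(Rs, nus):
        J, K = coulomb(R_j, eri), exchange(R_j, eri)
        a = a_coeff(nus[i], nu_j)
        for r in range(n):
            for s in range(n):
                G[r][s] += nu_j * (J[r][s] - a * K[r][s])
    return G


def energy(Rs, nus, h, eri):
    """E = sum_i nu_i tr R_i h + 1/2 sum_i nu_i tr R_i G_i, eq. (8.50)."""
    total = 0.0
    for i, (R_i, nu_i) in enumerate(zip(Rs, nus)):
        G_i = shell_g(i, Rs, nus, eri)
        total += nu_i * (trace_product(R_i, h) + 0.5 * trace_product(R_i, G_i))
    return total


# Hypothetical two-function model with separable "integrals" [rs||ut] = a_rs * a_ut.
a = [[1.0, 0.5], [0.5, 2.0]]
toy_eri = lambda r, s, u, t: a[r][s] * a[u][t]
h = [[-1.0, 0.2], [0.2, -0.5]]
R_first = [[1.0, 0.0], [0.0, 0.0]]    # projection onto the first basis function
R_second = [[0.0, 0.0], [0.0, 1.0]]   # projection onto the second basis function

closed_shell = energy([R_first], [2], h, toy_eri)               # one doubly occupied shell
two_open_shells = energy([R_first, R_second], [1, 1], h, toy_eri)
```

For a single shell with $\nu = 2$ the code reduces to the familiar closed shell combination $G = 2J - K$, while two singly occupied shells interact through $J - K$, as eq. (8.52b) prescribes.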
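An incremental approach is employed to obtain the derivatives of the energy. A change $\delta R^{(i)}$ in $R^{(i)}$, $(i = 1, \ldots, m)$, produces a change in the energy given exactly by

\[ \delta E = \sum_{i=1}^{m} \nu_i \,\mathrm{tr}\, \delta R^{(i)} F^{(i)} + \tfrac{1}{2} \sum_{i=1}^{m} \nu_i \,\mathrm{tr}\, \delta R^{(i)} \delta G_i, \tag{8.53} \]

where $F^{(i)}$ is the Fock matrix associated with the $i$-th shell,

\[ F^{(i)} = h + G_i. \tag{8.54} \]

In the notation established in the previous subsection, the blocks of the projection $R^{(i)}$ are given by

\[ R^{(i)}_{JK} = f^{(i)}_{JI} \big( g^{(i)}_I \big)^{-1} f^{(i)\dagger}_{KI}, \tag{8.55} \]

where

\[ g^{(i)}_I = \big( X^{(i)}_{II} X^{(i)\dagger}_{II} \big)^{-1} = S_{II} + \sum_{L \ne I} f^{(i)\dagger}_{LI} S_{LI} + \sum_{K \ne I} S_{IK} f^{(i)}_{KI} + \sum_{K \ne I} \sum_{L \ne I} f^{(i)\dagger}_{LI} S_{LK} f^{(i)}_{KI}. \tag{8.56} \]

Then, suppressing the labels $(i)$ and $I$ on $f$ and $g$ for brevity, one has

\[ \delta R^{(i)}_{JK} = \delta f_{J}\, g^{-1} f_{K}^{\dagger} + f_{J}\, \delta g^{-1} f_{K}^{\dagger} + f_{J}\, g^{-1} \delta f_{K}^{\dagger} + \delta f_{J}\, \delta g^{-1} f_{K}^{\dagger} + \delta f_{J}\, g^{-1} \delta f_{K}^{\dagger} + f_{J}\, \delta g^{-1} \delta f_{K}^{\dagger} + \delta f_{J}\, \delta g^{-1} \delta f_{K}^{\dagger}. \tag{8.57} \]

To second order, one has

\[ \delta g^{-1} = - g^{-1} \delta g\, g^{-1} + g^{-1} \delta g\, g^{-1} \delta g\, g^{-1} + O(\delta^3), \tag{8.58} \]

where eq. (8.56) yields

\[ \delta g^{(i)}_I = \sum_{L \ne I} \Big[ \delta f^{(i)\dagger}_{LI} \big( S \hat{T}^{(i)} \big)_{LI} + \big( \hat{T}^{(i)\dagger} S \big)_{IL} \delta f^{(i)}_{LI} \Big] + \sum_{K \ne I} \sum_{L \ne I} \delta f^{(i)\dagger}_{LI} S_{LK} \delta f^{(i)}_{KI}. \tag{8.59} \]

The calculation of the first derivatives is simplified by noting that terms in $\delta E$ linear in $\delta f^{(i)}_{PI}$ or $\delta f^{(i)\dagger}_{PI}$, for a specific value of $i$, can only enter via a single term of the first summation (over shells) in eq. (8.53). Substituting (8.57) - (8.59) into (8.53), and retaining terms only to first order in the $\delta f^{(i)}_{PI}$ and their adjoints, one obtains

\[
\begin{aligned}
\delta E &= \sum_{i=1}^{m} \nu_i \,\mathrm{tr} \sum_{P \ne I} \Big\{ \delta f^{(i)\dagger}_{PI} \Big[ \big( F^{(i)} \hat{T}^{(i)} \big)_{PI} - \big( S \hat{T}^{(i)} \big)_{PI} \big( g^{(i)}_I \big)^{-1} \big( \hat{T}^{(i)\dagger} F^{(i)} \hat{T}^{(i)} \big)_{II} \Big] \big( g^{(i)}_I \big)^{-1} \\
&\qquad\qquad + \big( g^{(i)}_I \big)^{-1} \Big[ \big( \hat{T}^{(i)\dagger} F^{(i)} \big)_{IP} - \big( \hat{T}^{(i)\dagger} F^{(i)} \hat{T}^{(i)} \big)_{II} \big( g^{(i)}_I \big)^{-1} \big( \hat{T}^{(i)\dagger} S \big)_{IP} \Big] \delta f^{(i)}_{PI} \Big\} \\
&= \sum_{i=1}^{m} \nu_i \,\mathrm{tr} \sum_{P \ne I} \sum_{J,K=1}^{m+1} \Big\{ \delta f^{(i)\dagger}_{PI} \big( 1 - S R^{(i)} \big)_{PJ} F^{(i)}_{JK} R^{(i)}_{KI} + R^{(i)}_{IK} F^{(i)}_{KJ} \big( 1 - R^{(i)} S \big)_{JP} \delta f^{(i)}_{PI} \Big\}.
\end{aligned}
\tag{8.60a}
\]

Thus, on defining a new set of non-orthonormal basis vectors, with blocks

\[ \big( e^{(i)}_K \big)_J = \big( 1 - R^{(i)} S \big)_{JK}, \quad K \ne I; \qquad \big( e^{(i)}_I \big)_J = R^{(i)}_{JI}; \qquad (J, K = 1, \ldots, m+1), \ (i = 1, \ldots, m), \tag{8.61} \]

and denoting by $(e^{(i)}_K)^{\mu}$ the column of $e^{(i)}_K$ labelled by $\mu \in K$, one can write

\[ \delta E = \sum_{i=1}^{m} \nu_i \sum_{P \ne I} \sum_{\mu \in P} \sum_{\nu \in I} \Big\{ \big( \delta f^{(i)\dagger}_{PI} \big)_{\nu\mu} \, (e^{(i)}_P)^{\mu\dagger} F^{(i)} (e^{(i)}_I)^{\nu} + \big( \delta f^{(i)}_{PI} \big)_{\mu\nu} \, (e^{(i)}_I)^{\nu\dagger} F^{(i)} (e^{(i)}_P)^{\mu} \Big\}. \tag{8.60b} \]

From this, one obtains

\[ \frac{1}{\nu_i} \frac{\partial E}{\partial \big( f^{(i)}_{PI} \big)_{\mu\nu}} = (e^{(i)}_I)^{\nu\dagger} F^{(i)} (e^{(i)}_P)^{\mu}, \qquad \frac{1}{\nu_i} \frac{\partial E}{\partial \big( f^{(i)\dagger}_{PI} \big)_{\nu\mu}} = (e^{(i)}_P)^{\mu\dagger} F^{(i)} (e^{(i)}_I)^{\nu}, \tag{8.62} \]

as the formal first derivatives of the energy with respect to the elements of the $f^{(i)}_{PI}$ and their hermitian conjugates. As in the simpler closed shell case, the first derivatives of the energy are matrix elements of the appropriate current Fock operator in a basis of non-orthonormal molecular orbitals which are columns of the corresponding projection $R^{(i)}$ and the complementary matrix $(1 - R^{(i)} S)$. The metric properties of the basis vectors, (8.61), are examined in Appendix 13.

8.4.c The Second Derivatives

The second derivatives of the energy are obtained in a straightforward, but somewhat tedious, manner, by isolating the second order terms in (8.53). These terms consist of two types. The first arises from the trace over the product of second order variations in the projections $R^{(i)}$ and the corresponding Fock operator $F^{(i)}$, while the second arises from the terms of the form $\mathrm{tr}\, \delta R^{(i)} \delta G_i$, which contain products of first order variations of the density matrices.

Consider first the simple term

\[ \mathrm{tr}\, \delta R^{(i)} \delta G_i = \sum_{j=1}^{m} \nu_j \sum_{r,s} \sum_{t,u} \big( \delta R^{(i)} \big)_{sr} \big( \delta R^{(j)} \big)_{tu} \Big\{ [rs\|ut] - a_{ij} [rt\|us] \Big\}. \tag{8.64} \]

This equation can be viewed as representing linear transformations on the basis functions in terms of which the two-electron integrals are evaluated. The summations over $r,s$ can be treated independently of those over $t$ and $u$, above.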
To first order, one has

\[
\sum_{r,s} \big( \delta R^{(i)} \big)_{sr} \langle r \| s \rangle = \sum_{J,K=1}^{m+1} \sum_{r \in K} \sum_{s \in J} \big( \delta R^{(i)}_{JK} \big)_{sr} \langle r \| s \rangle = \sum_{P \ne I} \sum_{\mu \in P} \sum_{\nu \in I} \Big\{ \big( \delta f^{(i)}_{PI} \big)_{\mu\nu} \langle (e^{(i)}_I)^{\nu} \| (e^{(i)}_P)^{\mu} \rangle + \big( \delta f^{(i)\dagger}_{PI} \big)_{\nu\mu} \langle (e^{(i)}_P)^{\mu} \| (e^{(i)}_I)^{\nu} \rangle \Big\}. \tag{8.65}
\]

Here, the notation $\langle r \| s \rangle$ is to indicate symbolically only that the basis function $\phi_r$ enters the expression antilinearly, while $\phi_s$ enters it linearly. It is not meant to imply that matrix elements of the type $[rs\|ut]$, given by (7.25b), can be written as the product of two simpler matrix elements. Combining (8.65) with the corresponding result for the sum over indices $t,u$, in the original expression (8.64), then leads to the result

\[
\begin{aligned}
\mathrm{tr}\, \delta R^{(i)} \delta G_i = \sum_{j=1}^{m} \nu_j \sum_{P \ne I} \sum_{Q \ne J} \sum_{\substack{\mu \in P \\ \nu \in I}} \sum_{\substack{\alpha \in Q \\ \beta \in J}} \Big\{
& \big( \delta f^{(i)\dagger}_{PI} \big)_{\nu\mu} \big( \delta f^{(j)\dagger}_{QJ} \big)_{\beta\alpha} \Big( [ (e^{(i)}_P)^{\mu} (e^{(i)}_I)^{\nu} \| (e^{(j)}_Q)^{\alpha} (e^{(j)}_J)^{\beta} ] - a_{ij} [ (e^{(i)}_P)^{\mu} (e^{(j)}_J)^{\beta} \| (e^{(j)}_Q)^{\alpha} (e^{(i)}_I)^{\nu} ] \Big) \\
{}+{} & \big( \delta f^{(i)\dagger}_{PI} \big)_{\nu\mu} \big( \delta f^{(j)}_{QJ} \big)_{\alpha\beta} \Big( [ (e^{(i)}_P)^{\mu} (e^{(i)}_I)^{\nu} \| (e^{(j)}_J)^{\beta} (e^{(j)}_Q)^{\alpha} ] - a_{ij} [ (e^{(i)}_P)^{\mu} (e^{(j)}_Q)^{\alpha} \| (e^{(j)}_J)^{\beta} (e^{(i)}_I)^{\nu} ] \Big) \\
{}+{} & \big( \delta f^{(i)}_{PI} \big)_{\mu\nu} \big( \delta f^{(j)\dagger}_{QJ} \big)_{\beta\alpha} \Big( [ (e^{(i)}_I)^{\nu} (e^{(i)}_P)^{\mu} \| (e^{(j)}_Q)^{\alpha} (e^{(j)}_J)^{\beta} ] - a_{ij} [ (e^{(i)}_I)^{\nu} (e^{(j)}_J)^{\beta} \| (e^{(j)}_Q)^{\alpha} (e^{(i)}_P)^{\mu} ] \Big) \\
{}+{} & \big( \delta f^{(i)}_{PI} \big)_{\mu\nu} \big( \delta f^{(j)}_{QJ} \big)_{\alpha\beta} \Big( [ (e^{(i)}_I)^{\nu} (e^{(i)}_P)^{\mu} \| (e^{(j)}_J)^{\beta} (e^{(j)}_Q)^{\alpha} ] - a_{ij} [ (e^{(i)}_I)^{\nu} (e^{(j)}_Q)^{\alpha} \| (e^{(j)}_J)^{\beta} (e^{(i)}_P)^{\mu} ] \Big) \Big\}.
\end{aligned}
\tag{8.66}
\]

Each of the four terms here consists of a Coulomb and exchange integral combination, evaluated in the particular non-orthonormal molecular orbital basis given by (8.61). The contributions of the second term in (8.53) to the second derivatives of $E$ are easily obtained from (8.66).

Consider now the first term of eq. (8.53). A considerable amount of algebraic manipulation is required to obtain the second order terms in compact form.
The final result is

\[
\begin{aligned}
\mathrm{tr}\, \delta^{(2)} R^{(i)} F^{(i)} ={} & \mathrm{tr} \sum_{P \ne I} \sum_{Q \ne I} \Big\{ \delta f^{(i)\dagger}_{PI} \big[ (1 - S R^{(i)}) F^{(i)} (1 - R^{(i)} S) \big]_{PQ} \delta f^{(i)}_{QI} R^{(i)}_{II} - \delta f^{(i)\dagger}_{PI} \big[ S (1 - R^{(i)} S) \big]_{PQ} \delta f^{(i)}_{QI} \big( R^{(i)} F^{(i)} R^{(i)} \big)_{II} \Big\} \\
& - \mathrm{tr} \sum_{P \ne I} \sum_{Q \ne I} \Big\{ \delta f^{(i)}_{PI} \big( R^{(i)} S \big)_{IQ} \delta f^{(i)}_{QI} \big[ R^{(i)} F^{(i)} (1 - R^{(i)} S) \big]_{IP} + \delta f^{(i)\dagger}_{PI} \big( S R^{(i)} \big)_{PI} \delta f^{(i)\dagger}_{QI} \big[ (1 - S R^{(i)}) F^{(i)} R^{(i)} \big]_{QI} \Big\}.
\end{aligned}
\tag{8.67}
\]

This equation contains no terms involving products of matrices $\delta f^{(i)}_{PI}$ with different values of $i$. Thus, the first term of (8.53) gives contributions only to second derivatives of the energy with respect to variables referring to the same shell.

The complete second derivatives of the energy can now be written down by combining eqs. (8.66) and (8.67), and incorporating constant factors and occupation numbers where indicated by (8.53). In all, there are only six different formulas (of which two pairs are complex conjugates of each other).

[Eqs. (8.68): the six second derivative formulas are not legible in the scanned source.]

In all these formulas, the convention $\mu \in P$, $\nu, \beta \in I$, $\alpha \in Q$, and $\gamma \in J$, is implied.

8.4.d Incorporation of the Intershell Orthogonality Constraints

The intershell orthogonality constraints on the lcao coefficients are given by eq. (8.47). The equivalent expressions in terms of the $f^{(i)}_{PI}$ are analogous to eqs. (4.58),

\[ g^{(ij)}_{IJ} = \big( \hat{T}^{(i)\dagger} S \hat{T}^{(j)} \big)_{IJ} = \sum_{K,L=1}^{m+1} f^{(i)\dagger}_{KI} S_{KL} f^{(j)}_{LJ} = 0, \qquad (i \ne j). \tag{8.69} \]

Only half of these equations are unique, the other half being their adjoints.

There are two ways to incorporate these constraints into the theory.
The constraint equations, (8.69), can be used to explicitly eliminate an appropriate number of elements of the $f^{(i)}_{PI}$ occurring in the energy functional. The derivatives of the energy with respect to the remaining unconstrained variables are then obtained from eqs. (8.62) by a simple application of the chain rule. The advantage of using this method to handle the intershell constraints is that the energy and its derivatives are then expressed in terms of a minimum number of unconstrained and non-redundant variables. The resulting formalism is suitable to use with true minimization techniques, such as the conjugate gradient method with variations discussed previously (section 8.3.b). Such procedures would be reliable, since divergence could not occur, and efficient, as long as the number of shells is small. As the number of shells increases, the intershell constraint equations become considerably more complicated (see Appendix 2) and the additional cost of calculating the energy derivatives with respect to the independent variables may soon offset the other advantages. Another problem here is that the elimination procedure is not easily automated for use with an arbitrary number of shells. This approach is illustrated in detail in section 8.4.e for a two shell system.

A second approach to incorporating the intershell constraints into the theory is to consider the $\sum_{I=2}^{m} n_I \sum_{J=1}^{I-1} n_J$ unique constraint equations, (8.69), and the $\sum_{I=1}^{m} n_I \sum_{J=I+1}^{m+1} n_J$ independent equations expressing the vanishing of an appropriate set of the same number of first derivatives of the energy, as a system of $\sum_{I=1}^{m} n_I (n - n_I)$ simultaneous nonlinear equations for the elements of the $f^{(i)}_{PI}$, $(P = 1, \ldots, m+1,\ P \ne I;\ i = 1, \ldots, m)$. The derivatives of the $g^{(jk)}_{JK}$, eq. (8.69), are

\[ \frac{\partial \big( g^{(jk)}_{JK} \big)_{rs}}{\partial \big( f^{(l)\dagger}_{PL} \big)_{qp}} = \delta_{jl}\, \delta_{JL}\, \delta_{qr} \Big( \sum_{M} S_{PM} f^{(k)}_{MK} \Big)_{ps}, \tag{8.70a} \]

and

\[ \frac{\partial \big( g^{(jk)}_{JK} \big)_{rs}}{\partial \big( f^{(l)}_{PL} \big)_{pq}} = \delta_{kl}\, \delta_{KL}\, \delta_{sq} \Big( \sum_{M} f^{(j)\dagger}_{MJ} S_{MP} \Big)_{rp}. \tag{8.70b} \]

With these formulas, and eqs. (8.62), the Jacobian matrix for the complete system can be constructed, and the $f^{(i)}_{PI}$ can be determined iteratively, using one of several methods (for example, the Newton-Raphson equations).

The advantage of using this approach is that the energy derivatives, (8.62), and the derivatives, (8.70), of the constraints, can be used without further modification, and can be calculated automatically for systems involving an arbitrary number of shells. The calculation now involves twice as many variables as in the first approach. The large number of variables may preclude the use of the full Newton-Raphson equations, making it necessary to develop linearly convergent approximations to them which are more efficient overall, much as was done in chapter 5, in a different context.⁷ These methods are not descent methods, and therefore, will not necessarily yield an energy minimum at all times. Nevertheless, for systems involving a large number of shells, this would appear to be the approach of choice.

⁷ The situation is admittedly greatly complicated here by the presence of the large number of two-electron integrals (which must be transformed to a new molecular orbital basis in each iteration) entering the second derivatives of the energy.
They would have to be partially or totally neglected, or else approximated in some manner if a computationally efficient algorithm based on the Newton-Raphson equations is to result. An alternative procedure would be to use a method not requiring these second derivatives (for example, a generalization of the secant method for the solution of a single nonlinear equation). No information on the performance of such methods has yet been obtained, however.

8.4.e Example: The Two Shell System

As an illustration of the general formalism just described, formulas applicable to a two shell system are given here explicitly. A 3 x 3 partitioning must be used in this case. The variables entering the calculation are

\[ \hat{T}^{(1)}_1 = \begin{pmatrix} 1_1 \\ f^{(1)}_{21} \\ f^{(1)}_{31} \end{pmatrix}, \qquad \hat{T}^{(2)}_2 = \begin{pmatrix} f^{(2)}_{12} \\ 1_2 \\ f^{(2)}_{32} \end{pmatrix}, \tag{8.71} \]

where the occupied orbitals for the two shells are written,

\[ X^{(1)}_1 = \hat{T}^{(1)}_1 X^{(1)}_{11}, \qquad X^{(2)}_2 = \hat{T}^{(2)}_2 X^{(2)}_{22}. \tag{8.72} \]

The projection operators onto these two occupied spaces are given explicitly by (writing $g_1$ for $g^{(1)}_1$, $g_2$ for $g^{(2)}_2$, and omitting the shell superscripts on the $f$)

\[ R^{(1)} = \begin{pmatrix} g_1^{-1} & g_1^{-1} f_{21}^{\dagger} & g_1^{-1} f_{31}^{\dagger} \\ f_{21} g_1^{-1} & f_{21} g_1^{-1} f_{21}^{\dagger} & f_{21} g_1^{-1} f_{31}^{\dagger} \\ f_{31} g_1^{-1} & f_{31} g_1^{-1} f_{21}^{\dagger} & f_{31} g_1^{-1} f_{31}^{\dagger} \end{pmatrix}, \tag{8.73} \]

and

\[ R^{(2)} = \begin{pmatrix} f_{12} g_2^{-1} f_{12}^{\dagger} & f_{12} g_2^{-1} & f_{12} g_2^{-1} f_{32}^{\dagger} \\ g_2^{-1} f_{12}^{\dagger} & g_2^{-1} & g_2^{-1} f_{32}^{\dagger} \\ f_{32} g_2^{-1} f_{12}^{\dagger} & f_{32} g_2^{-1} & f_{32} g_2^{-1} f_{32}^{\dagger} \end{pmatrix}. \tag{8.74} \]

In an orthonormal basis, one has,

\[ g^{(1)}_1 = 1_1 + f^{(1)\dagger}_{21} f^{(1)}_{21} + f^{(1)\dagger}_{31} f^{(1)}_{31}, \tag{8.75a} \]

and

\[ g^{(2)}_2 = 1_2 + f^{(2)\dagger}_{12} f^{(2)}_{12} + f^{(2)\dagger}_{32} f^{(2)}_{32}. \tag{8.75b} \]

If the basis is non-orthonormal, explicit formulas for the $g^{(i)}_I$ are considerably lengthier, for example,

\[ g^{(1)}_1 = S_{11} + S_{12} f^{(1)}_{21} + S_{13} f^{(1)}_{31} + f^{(1)\dagger}_{21} \big( S_{21} + S_{22} f^{(1)}_{21} + S_{23} f^{(1)}_{31} \big) + f^{(1)\dagger}_{31} \big( S_{31} + S_{32} f^{(1)}_{21} + S_{33} f^{(1)}_{31} \big), \tag{8.76} \]

and similarly for $g^{(2)}_2$.

For an orthonormal fixed basis, the non-orthonormal contragredient molecular orbital basis, (8.61), in terms of which the energy derivatives can be written very compactly, are given by

\[ e^{(1)}_1 = R^{(1)}_{\cdot 1} = \begin{pmatrix} g_1^{-1} \\ f_{21} g_1^{-1} \\ f_{31} g_1^{-1} \end{pmatrix}, \qquad e^{(1)}_2 = \big( 1 - R^{(1)} \big)_{\cdot 2} = \begin{pmatrix} - g_1^{-1} f_{21}^{\dagger} \\ 1_2 - f_{21} g_1^{-1} f_{21}^{\dagger} \\ - f_{31} g_1^{-1} f_{21}^{\dagger} \end{pmatrix}, \tag{8.77a} \]

and

\[ e^{(1)}_3 = \big( 1 - R^{(1)} \big)_{\cdot 3} = \begin{pmatrix} - g_1^{-1} f_{31}^{\dagger} \\ - f_{21} g_1^{-1} f_{31}^{\dagger} \\ 1_3 - f_{31} g_1^{-1} f_{31}^{\dagger} \end{pmatrix}. \tag{8.77b} \]

The expressions for the $e^{(2)}_K$ are analogous,

\[ e^{(2)}_1 = \big( 1 - R^{(2)} \big)_{\cdot 1}, \qquad e^{(2)}_2 = R^{(2)}_{\cdot 2}, \qquad e^{(2)}_3 = \big( 1 - R^{(2)} \big)_{\cdot 3}. \tag{8.77c} \]

For a non-orthonormal fixed basis, explicit expressions for the $e^{(i)}_K$ in terms of $S$ and the $f^{(i)}_{PI}$ and $g^{(i)}_I$, $(i = 1, 2)$, are considerably lengthier. In the course of a calculation, the current projections $R^{(1)}$ and $R^{(2)}$ would always be known, so that the $e^{(i)}_K$ would be obtained directly from formulas like (8.77b), rather than being evaluated using formulas like (8.77a).

The vectors dual to the $e^{(1)}_K$ are given by

\[ \hat{e}^{(1)}_1 = \begin{pmatrix} 1_1 \\ f_{21} \\ f_{31} \end{pmatrix}, \qquad \hat{e}^{(1)}_2 = \begin{pmatrix} - f_{21}^{\dagger} \\ 1_2 \\ 0 \end{pmatrix}, \qquad \hat{e}^{(1)}_3 = \begin{pmatrix} - f_{31}^{\dagger} \\ 0 \\ 1_3 \end{pmatrix}. \tag{8.78} \]

The scalar products of these vectors and the metric matrices with respect to which they are orthonormal are given by

\[ \Delta^{(1)} = e^{(1)\dagger} e^{(1)} = e^{(1)} e^{(1)\dagger} = \begin{pmatrix} g_1^{-1} & 0 & 0 \\ 0 & 1_2 - f_{21} g_1^{-1} f_{21}^{\dagger} & - f_{21} g_1^{-1} f_{31}^{\dagger} \\ 0 & - f_{31} g_1^{-1} f_{21}^{\dagger} & 1_3 - f_{31} g_1^{-1} f_{31}^{\dagger} \end{pmatrix}, \tag{8.79} \]

and

\[ \hat{\Delta}^{(1)} = \hat{e}^{(1)\dagger} \hat{e}^{(1)} = \hat{e}^{(1)} \hat{e}^{(1)\dagger} = \begin{pmatrix} g_1 & 0 & 0 \\ 0 & 1_2 + f_{21} f_{21}^{\dagger} & f_{21} f_{31}^{\dagger} \\ 0 & f_{31} f_{21}^{\dagger} & 1_3 + f_{31} f_{31}^{\dagger} \end{pmatrix}, \tag{8.80} \]

with similar results for $e^{(2)}$ and $\hat{e}^{(2)}$. Similar, but lengthier, results are obtained in a non-orthonormal fixed basis, but, in that case, the scalar product matrices and the metric matrices no longer coincide, $(i = 1, 2)$, and thus, the number of formulas doubles. The similarities between eqs.
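(8.77) - (8.80), and the results given in section 2.1.d, applicable to a single shell system, are easily seen.

The formal first derivatives of the energy are (with $r \in S_1$, $\sigma \in S_2$, $\alpha \in S_3$)

\[ \frac{\partial E}{\partial \big( f^{(1)}_{21} \big)_{\sigma r}} = \nu_1 \, (e^{(1)}_1)^{r\dagger} F^{(1)} (e^{(1)}_2)^{\sigma}, \qquad \frac{\partial E}{\partial \big( f^{(1)}_{31} \big)_{\alpha r}} = \nu_1 \, (e^{(1)}_1)^{r\dagger} F^{(1)} (e^{(1)}_3)^{\alpha}, \tag{8.81} \]

and

\[ \frac{\partial E}{\partial \big( f^{(2)}_{12} \big)_{r \sigma}} = \nu_2 \, (e^{(2)}_2)^{\sigma\dagger} F^{(2)} (e^{(2)}_1)^{r}, \qquad \frac{\partial E}{\partial \big( f^{(2)}_{32} \big)_{\alpha \sigma}} = \nu_2 \, (e^{(2)}_2)^{\sigma\dagger} F^{(2)} (e^{(2)}_3)^{\alpha}. \]

Formal second derivatives can be written down explicitly from eqs. (8.68). Even in this simple case, there are thirty-two different explicit second derivative formulas ($E$ depends on $f^{(1)}_{21}$, $f^{(1)}_{31}$, $f^{(2)}_{12}$ and $f^{(2)}_{32}$, and their adjoints) neglecting those which are complex conjugates.

There is only one intershell constraint equation in this case. In an orthonormal basis, it is

\[ g^{(12)}_{12} = f^{(2)}_{12} + f^{(1)\dagger}_{21} + f^{(1)\dagger}_{31} f^{(2)}_{32} = 0. \tag{8.82} \]

This equation is easily used to obtain

\[ f^{(2)}_{12} = - \big( f^{(1)\dagger}_{21} + f^{(1)\dagger}_{31} f^{(2)}_{32} \big), \tag{8.83} \]

giving $f^{(2)}_{12}$ in terms of $f^{(1)}_{21}$, $f^{(1)}_{31}$, and $f^{(2)}_{32}$, whose elements can be used as a set of unconstrained and non-redundant variables, in terms of which the energy may be minimized.

Equation (8.83) is unusually simple. For the next simplest case, a three shell system, there are three intershell constraint equations, which, while similar to (8.82), cannot be used to obtain three "dependent" blocks, $f^{(i)}_{PI}$, in terms of the remaining six "independent" blocks without introduction of an inverse matrix (see Appendix 2). In fact, for a two shell system, when the fixed basis is non-orthonormal, the intershell constraint becomes

\[
\begin{aligned}
g^{(12)}_{12} = \big( \hat{T}^{(1)\dagger} S \hat{T}^{(2)} \big)_{12} ={} & S_{11} f^{(2)}_{12} + S_{12} + S_{13} f^{(2)}_{32} + f^{(1)\dagger}_{21} \big( S_{21} f^{(2)}_{12} + S_{22} + S_{23} f^{(2)}_{32} \big) \\
& + f^{(1)\dagger}_{31} \big( S_{31} f^{(2)}_{12} + S_{32} + S_{33} f^{(2)}_{32} \big) = 0,
\end{aligned}
\tag{8.84}
\]

from which one obtains

\[ f^{(2)}_{12} = - \big[ S_{11} + f^{(1)\dagger}_{21} S_{21} + f^{(1)\dagger}_{31} S_{31} \big]^{-1} \big[ S_{12} + f^{(1)\dagger}_{21} S_{22} + f^{(1)\dagger}_{31} S_{32} + \big( S_{13} + f^{(1)\dagger}_{21} S_{23} + f^{(1)\dagger}_{31} S_{33} \big) f^{(2)}_{32} \big] \tag{8.85a} \]

\[ \phantom{f^{(2)}_{12}} = - A^{-1} B. \tag{8.85b} \]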
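Not only is this expression considerably lengthier than (8.83), but the presence of the inverse matrix complicates the application of the chain rule, and leads to more complicated formulas for the energy derivatives with respect to the remaining independent variables.

From (8.83), one obtains (with $r, \rho \in S_1$, $\sigma, \tau \in S_2$, $\alpha \in S_3$)

\[ \frac{\partial \big( f^{(2)}_{12} \big)_{\rho\tau}}{\partial \big( f^{(1)\dagger}_{21} \big)_{r\sigma}} = - \delta_{\rho r} \delta_{\tau\sigma}, \qquad \frac{\partial \big( f^{(2)}_{12} \big)_{\rho\tau}}{\partial \big( f^{(1)\dagger}_{31} \big)_{r\alpha}} = - \delta_{\rho r} \big( f^{(2)}_{32} \big)_{\alpha\tau}, \qquad \frac{\partial \big( f^{(2)}_{12} \big)_{\rho\tau}}{\partial \big( f^{(2)}_{32} \big)_{\alpha\sigma}} = - \delta_{\tau\sigma} \big( f^{(1)\dagger}_{31} \big)_{\rho\alpha}. \tag{8.86} \]

Combining these with eqs. (8.81) then yields

\[
\begin{aligned}
\frac{dE}{d \big( f^{(1)\dagger}_{21} \big)_{r\sigma}} &= \nu_1 \, (e^{(1)}_2)^{\sigma\dagger} F^{(1)} (e^{(1)}_1)^{r} - \nu_2 \, (e^{(2)}_2)^{\sigma\dagger} F^{(2)} (e^{(2)}_1)^{r}, \\
\frac{dE}{d \big( f^{(1)\dagger}_{31} \big)_{r\alpha}} &= \nu_1 \, (e^{(1)}_3)^{\alpha\dagger} F^{(1)} (e^{(1)}_1)^{r} - \nu_2 \sum_{\tau} \big( f^{(2)}_{32} \big)_{\alpha\tau} (e^{(2)}_2)^{\tau\dagger} F^{(2)} (e^{(2)}_1)^{r},
\end{aligned}
\tag{8.87}
\]

and

\[ \frac{dE}{d \big( f^{(2)}_{32} \big)_{\alpha\sigma}} = \nu_2 \, (e^{(2)}_2)^{\sigma\dagger} F^{(2)} (e^{(2)}_3)^{\alpha} - \nu_2 \sum_{\rho} \big( f^{(1)\dagger}_{31} \big)_{\rho\alpha} (e^{(2)}_2)^{\sigma\dagger} F^{(2)} (e^{(2)}_1)^{\rho}, \]

which require little additional work once the derivatives in (8.81) are known.

For a non-orthonormal basis, eqs. (8.85) can be used to obtain,

\[
\begin{aligned}
\frac{\partial \big( f^{(2)}_{12} \big)_{\rho\tau}}{\partial \big( f^{(1)\dagger}_{21} \big)_{r\sigma}} &= \big( A^{-1} \big)_{\rho r} \big[ S_{21} A^{-1} B - S_{22} - S_{23} f^{(2)}_{32} \big]_{\sigma\tau}, \\
\frac{\partial \big( f^{(2)}_{12} \big)_{\rho\tau}}{\partial \big( f^{(1)\dagger}_{31} \big)_{r\alpha}} &= \big( A^{-1} \big)_{\rho r} \big[ S_{31} A^{-1} B - S_{32} - S_{33} f^{(2)}_{32} \big]_{\alpha\tau},
\end{aligned}
\tag{8.88}
\]

and

\[ \frac{\partial \big( f^{(2)}_{12} \big)_{\rho\tau}}{\partial \big( f^{(2)}_{32} \big)_{\alpha\sigma}} = - \delta_{\tau\sigma} \big[ A^{-1} \big( S_{13} + f^{(1)\dagger}_{21} S_{23} + f^{(1)\dagger}_{31} S_{33} \big) \big]_{\rho\alpha}, \]

which lead to

\[
\begin{aligned}
\frac{dE}{d \big( f^{(1)\dagger}_{21} \big)_{r\sigma}} &= \nu_1 \, (e^{(1)}_2)^{\sigma\dagger} F^{(1)} (e^{(1)}_1)^{r} + \nu_2 \sum_{\rho,\tau} \big[ S_{21} A^{-1} B - S_{22} - S_{23} f^{(2)}_{32} \big]_{\sigma\tau} (e^{(2)}_2)^{\tau\dagger} F^{(2)} (e^{(2)}_1)^{\rho} \big( A^{-1} \big)_{\rho r}, \\
\frac{dE}{d \big( f^{(1)\dagger}_{31} \big)_{r\alpha}} &= \nu_1 \, (e^{(1)}_3)^{\alpha\dagger} F^{(1)} (e^{(1)}_1)^{r} + \nu_2 \sum_{\rho,\tau} \big[ S_{31} A^{-1} B - S_{32} - S_{33} f^{(2)}_{32} \big]_{\alpha\tau} (e^{(2)}_2)^{\tau\dagger} F^{(2)} (e^{(2)}_1)^{\rho} \big( A^{-1} \big)_{\rho r},
\end{aligned}
\]

and

\[ \frac{dE}{d \big( f^{(2)}_{32} \big)_{\alpha\sigma}} = \nu_2 \, (e^{(2)}_2)^{\sigma\dagger} F^{(2)} (e^{(2)}_3)^{\alpha} - \nu_2 \sum_{\rho} \big[ A^{-1} \big( S_{13} + f^{(1)\dagger}_{21} S_{23} + f^{(1)\dagger}_{31} S_{33} \big) \big]_{\rho\alpha} (e^{(2)}_2)^{\sigma\dagger} F^{(2)} (e^{(2)}_1)^{\rho}. \tag{8.89} \]

Both sets of equations, (8.87) and (8.89), are suitable for use with gradient minimization algorithms. Second derivatives of $E$ with respect to elements of $f^{(1)}_{21}$, $f^{(1)}_{31}$, and $f^{(2)}_{32}$ and their adjoints are obtained in a similar way. The greater complexity of the formulas (8.89) compared to those in (8.87) is not of much concern here, since in an actual calculation, one would expect to carry out the energy minimization in a molecular orbital basis in which $S = 1$ (see section 8.3.b).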
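The elimination of the dependent block in the two shell case, eq. (8.83), is easy to verify numerically. The sketch below is an illustration only, under simplifying assumptions that are not part of the text: the fixed basis is orthonormal, all quantities are real, and the dimensions are $n_1 = n_2 = 1$ with an arbitrary $n_3$, so that $f_{21}$ and $f_{12}$ are numbers and $g_1$, $g_2$ are scalars. The function name `two_shell_check` and the sample values are hypothetical.

```python
def two_shell_check(f21, f31, f32):
    """Two shell model with n1 = n2 = 1 and n3 = len(f31); real, orthonormal basis.

    f21 is a number; f31 and f32 are lists of length n3.
    """
    # Eq. (8.83): the dependent block follows from the independent ones.
    f12 = -(f21 + sum(a * b for a, b in zip(f31, f32)))
    # Block columns (1, f21, f31) and (f12, 1, f32) of eq. (8.71).
    t1 = [1.0, f21] + list(f31)
    t2 = [f12, 1.0] + list(f32)
    g1 = sum(c * c for c in t1)                     # eq. (8.75a), a scalar here
    g2 = sum(c * c for c in t2)                     # eq. (8.75b), a scalar here
    overlap = sum(p * q for p, q in zip(t1, t2))    # left side of eq. (8.82)
    n = len(t1)
    # Projections of eqs. (8.73) and (8.74).
    R1 = [[t1[r] * t1[s] / g1 for s in range(n)] for r in range(n)]
    R2 = [[t2[r] * t2[s] / g2 for s in range(n)] for r in range(n)]
    R1R2 = [[sum(R1[r][k] * R2[k][s] for k in range(n)) for s in range(n)]
            for r in range(n)]
    return f12, overlap, R1, R2, R1R2


f12, overlap, R1, R2, R1R2 = two_shell_check(0.3, [0.1, -0.4], [0.7, 0.2])
```

With $f_{12}$ fixed by (8.83), the constraint (8.82) is satisfied and the product $R^{(1)} R^{(2)}$ vanishes to rounding error, as required by (8.46b) for $S = 1$, whatever values are chosen for the three independent blocks.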
In this case, the stationary points of the energy can also be determined by solving the system of $2 n_1 n_2 + n_1 n_3 + n_2 n_3$ (in general complex) simultaneous nonlinear equations obtained by setting the constraint expression $g^{(12)}_{12}$, eq. (8.82), and an appropriate set of the first derivatives of the energy, eqs. (8.81), equal to zero.

BIBLIOGRAPHY

Claxton, T. A., and Smith, N. A., Theoret. Chim. Acta, 22, (1971), 399.
Clementi, E., J. Chem. Phys., 38, (1963), 2686.
Coope, J. A. R., Molecular Physics, 18, (1970), 571.
Coope, J. A. R., Ph. D. Thesis, Oxford University, (1956).
Coulson, C. A., and Longuet-Higgins, H. C., Proc. Roy. Soc., A191, (1947), 39.
Daniel, J. W., Numerische Mathematik, 10, (1965), 125.
Davidson, E. R., J. Comp. Phys., 17, (1975), 87.
DeVries, E., Fortschritte der Physik, 18, (1970), 149.
Eriksen, E., Phys. Rev., 111, (1958), 1011.
Feler, M. G., J. Comp. Phys., 14, (1974), 341.
Fletcher, R., Molecular Physics, 19, (1970), 55.
Fletcher, R., and Reeves, C. M., Computer Journal, 7, (1964), 149.
Foldy, L. L., and Wouthuysen, S. A., Phys. Rev., 78, (1950), 29.
Friedrichs, K. O., Perturbation of Spectra in Hilbert Space, American Mathematical Society, Providence, R. I., (1965).
Fujimoto, H., et al., J. Phys. Chem., 78, (1974), 1167.
Garton, D., and Sutcliffe, B. T., Theoretical Chemistry, Vol. 1, Quantum Chemistry, Specialist Periodical Reports, Chemical Society, London, (1974).
Imamura, A., Molecular Physics, 15, (1968), 225.
Kari, R., and Sutcliffe, B. T., Chem. Phys. Lett., 7, (1970), 149.
Kari, R., and Sutcliffe, B. T., International Journal of Quantum Chemistry, 7, (1973), 459.
Kato, T., Perturbation Theory for Linear Operators, Springer-Verlag, New York, (1966).
Klein, D., J. Chem. Phys., 61, (1974), 786.
Langhoff, P. W., Karplus, M., and Hurst, R. P., J. Chem. Phys., 44, (1966), 505.
Libit, L., and Hoffmann, R., Journal of the American Chemical Society, 96, (1974), 1370.
Löwdin, P. O., J. Math. Phys., 3, (1962), 969.
Löwdin, P. O., International Journal of Quantum Chemistry, 2, (1968), 867.
Löwdin, P. O., Advances in Quantum Chemistry, 5, (1970), 185.
Löwdin, P. O., and Goscinski, O., International Journal of Quantum Chemistry, 5, (1971), 685.
McWeeny, R., Phys. Rev., 126, (1962), 1028.
Morpurgo, G., Nuovo Cimento, 25, (1960), 624.
Musher, J., J. Chem. Phys., 46, (1967), 369.
Nesbet, R. K., J. Chem. Phys., 43, (1965), 311.
Okubo, S., Prog. Theor. Phys., 12, (1954), 102.
Pople, J. A., and Beveridge, D. L., Approximate Molecular Orbital Theory, McGraw-Hill, Toronto, (1970).
Primas, H., Helv. Phys. Acta, 34, (1961), 331.
Primas, H., Rev. Mod. Phys., 35, (1963), 710.
Rall, L. B., Computational Solution of Nonlinear Operator Equations, Wiley, New York, (1969).
Ralston, A., A First Course in Numerical Analysis, McGraw-Hill, Toronto, (1965).
Riesz, F., and Sz.-Nagy, B., Functional Analysis, Frederick Ungar, (1955).
Roos, B., "The Configuration Interaction Method" in Computational Techniques in Quantum Chemistry and Molecular Physics, (G. H. F. Diercksen, B. T. Sutcliffe, and A. Veillard, eds.), Reidel, Boston, (1975).
Roothaan, C. C. J., Rev. Mod. Phys., 23, (1951), 69.
Rosenberg, M., and Martino, F., J. Chem. Phys., 63, (1975), 5354.
Rutherford, D. E., Roy. Soc. of Edinburgh, Proc. A, 63, (1949/52), 232.
Shavitt, I., J. Comp. Phys., 6, (1970), 124.
Shavitt, I., et al., J. Comp. Phys., 11, (1973), 90.
Sutcliffe, B. T., Theoret. Chim. Acta, 33, (1974), 201.
Sutcliffe, B. T., Theoret. Chim. Acta, 39, (1975), 93.
Sz.-Nagy, B., Comment. Math. Helvet., 19, (1946/47), 347.
Tani, S., Prog. Theor. Phys., 12, (1954), 104.
Traub, J. F., Iterative Methods for the Solution of Equations, Prentice Hall, Englewood Cliffs, N. J., (1964).
Van Vleck, J. H., Phys. Rev., 33, (1929), 467.
APPENDICES "Humpty Durapty looked doubtful* 'I'd N rather see that done on paper,* he sa i d , (Through the Looking Glass, Lewis C a r r o l l ) 3 1 7 APPENDIX 1 Proofs of Alternative Formulas — 2 x 2 P a r t i t i o n i n g This appendix outlines some of the manipulations necessary to e s t a b l i s h a number of i n t e r - r e l a t i o n s which have been quoted i n section 3 » 1 * Consider f i r s t the orthonormal case.. The relationship: between the two sets of e f f e c t i v e operators ( 3 « 1 ) and ( 3 * 2 ) i s e a s i l y established. From the d e f i n i t i o n • . ( 2 ) _ - U HA " gA Gk-,f one obtains, "'I' - SI'C % A •. H A B f • tHnBti • H B B ;f)] = ft*1' • g j 1 ^ 1 ^ ) . A s i m i l a r procedure establishes the r e l a t i o n ( 3 « 3 b ; ) between Hg 2* and Hg 1*. To: e s t a b l i s h eq.. ( 3 . 6 ) , , the r e s u l t ( 3 . 3 a ) ; i s substituted into ( 3 * 5 )»- and the " pull-through'* r e l a t i o n s , ( 2 . 3 2 ) , used.. This yi e l d s D<2>(f): = H B A • Hggf - fH< 2 ) = % A + H B B f - ^ A 1 ^ - f g I 1 f t D ( 1 ) ( f ) = ( i B - f g - i f W ^ C f ) = 4 y i ) : ( f ) . The c o n d i t i o n that T HT be block diagonal i s e a s i l y determined! i n a d i r e c t manner. The inverse of T i s 318. * - l T: 1 1„ - f g ^ f ' Matrix multiplcation, followed by use of the "pull-through" r e l a t i o n s , (2.32), then establishes that the off-diagonal blocks of T.rlHT are given by D* 2^(f) = g ^ D ^ ^ f ) . Before deriving-eqs.. (3.11); - ('3*15)» applying i n the case of a nonorthonormal basis,, i t i s necessary to examine the orthonormality condition,, (2,101b)), im more d e t a i l . The blocks of the matrix g are gA * SAA + S A B f + f + S B A + ^ B f i f ' % " SBB: + SBA h + ^ S A B + h t s A A h » and = h f s A A + h t s A B f + s M * s B B f . g{B .. Thus,, one has:, g A = ( i A - f V ) s A • f t g B A , and (A1.2) (A1.3) §A = SAA + S A « f ' % ' ( 1 B " h t f t ) § B . 
Here, and throughout this appendix, the notation $\tilde S_A = S_{AA} + S_{AB}f$, $\tilde S_B = S_{BB} + S_{BA}h$, established in eq. (2.112), is used to simplify the equations. From (A1.2) and (A1.3) one obtains,
$$g_A \tilde S_A^{-1} = (1_A - f^\dagger h^\dagger) + f^\dagger g_{BA}\tilde S_A^{-1} , \tag{A1.4}$$
and
$$g_B \tilde S_B^{-1} = (1_B - h^\dagger f^\dagger) + h^\dagger g_{AB}\tilde S_B^{-1} . \tag{A1.5}$$
From these last relations, two generalizations of the "pull-through" relations in the orthonormal case can be derived. They are
$$h^\dagger g_A \tilde S_A^{-1} = g_B \tilde S_B^{-1} h^\dagger + h^\dagger f^\dagger g_{BA}\tilde S_A^{-1} - h^\dagger g_{AB}\tilde S_B^{-1} h^\dagger , \tag{A1.6}$$
and
$$g_A \tilde S_A^{-1} f^\dagger = f^\dagger g_B \tilde S_B^{-1} + f^\dagger g_{BA}\tilde S_A^{-1} f^\dagger - f^\dagger h^\dagger g_{AB}\tilde S_B^{-1} . \tag{A1.7}$$
The last two terms vanish in each if $g_{BA} = 0$, leaving the simpler expressions
$$h^\dagger g_A \tilde S_A^{-1} = g_B \tilde S_B^{-1} h^\dagger , \tag{A1.8}$$
and
$$g_A \tilde S_A^{-1} f^\dagger = f^\dagger g_B \tilde S_B^{-1} . \tag{A1.9}$$

Two other relations will be useful below in deriving (3.15). They are,
$$h^\dagger g_A = -(1_B - h^\dagger f^\dagger)(S_{BA} + S_{BB}f) + g_{BA} , \tag{A1.10}$$
and
$$f^\dagger g_B = -(1_A - f^\dagger h^\dagger)(S_{AA}h + S_{AB}) + g_{AB} . \tag{A1.11}$$
The first, (A1.10), is obtained as follows:
$$h^\dagger g_A = (1_B - h^\dagger f^\dagger)\,h^\dagger \tilde S_A + h^\dagger f^\dagger g_{BA}$$
$$= -(1_B - h^\dagger f^\dagger)(S_{BA} + S_{BB}f - g_{BA}) + h^\dagger f^\dagger g_{BA}$$
$$= -(1_B - h^\dagger f^\dagger)(S_{BA} + S_{BB}f) + g_{BA} .$$
The first line here is obtained by premultiplying (A1.2) by $h^\dagger$. The second line then follows directly from the definition, (A1.1), of $g_{BA}$. The relation (A1.11) is derived analogously, by first premultiplying (A1.3) by $f^\dagger$, and then using (A1.1) for $g_{AB} = g_{BA}^\dagger$. A number of other relations similar to these could be derived here also, but these are sufficient for what follows.

To establish the relationship between the operators $\hat H_A^{(1)}$ and $\hat H_A^{(2)}$, we proceed as follows. First, from (A1.1),
$$h^\dagger \tilde S_A = g_{BA} - (S_{BA} + S_{BB}f) ,$$
and thus, eq. (3.13) yields
$$H_{BA} + H_{BB}f = -(h^\dagger - g_{BA}\tilde S_A^{-1})(H_{AA} + H_{AB}f) + D^{(1)}(f) .$$
(A1.12)

Then, using (3.8), one obtains
$$\hat H_A^{(2)} = g_A^{-1} G_A = g_A^{-1}\big[H_{AA} + H_{AB}f + f^\dagger(H_{BA} + H_{BB}f)\big]$$
$$= g_A^{-1}\big[H_{AA} + H_{AB}f - f^\dagger(h^\dagger - g_{BA}\tilde S_A^{-1})(H_{AA} + H_{AB}f) + f^\dagger D^{(1)}(f)\big]$$
$$= g_A^{-1}\big[(1_A - f^\dagger h^\dagger)(H_{AA} + H_{AB}f) + f^\dagger g_{BA}\hat H_A^{(1)} + f^\dagger D^{(1)}(f)\big] .$$
But, from eq. (A1.2),
$$(1_A - f^\dagger h^\dagger) = (g_A - f^\dagger g_{BA})\,\tilde S_A^{-1} ,$$
and thus,
$$\hat H_A^{(2)} = \hat H_A^{(1)} + g_A^{-1} f^\dagger D^{(1)}(f) ,$$
establishing (3.11). Equation (3.12) is obtained in an analogous manner.

A number of approaches can be used to obtain eqs. (3.15), one of which is as follows:
$$D^{(2)}(f) = H_{BA} + H_{BB}f - (S_{BA} + S_{BB}f)\,g_A^{-1} G_A$$
$$= H_{BA} + H_{BB}f + (1_B - h^\dagger f^\dagger)^{-1}(h^\dagger - g_{BA}g_A^{-1})\,G_A$$
$$= (1_B - h^\dagger f^\dagger)^{-1}\big[(1_B - h^\dagger f^\dagger)(H_{BA} + H_{BB}f) + h^\dagger G_A - g_{BA}g_A^{-1}G_A\big]$$
$$= (1_B - h^\dagger f^\dagger)^{-1}\big[D^{(1)}(f) - g_{BA}(\hat H_A^{(2)} - \hat H_A^{(1)})\big]$$
$$= (1_B - h^\dagger f^\dagger)^{-1}(1_B - g_{BA}g_A^{-1}f^\dagger)\,D^{(1)}(f) . \tag{A1.13}$$
The transition from the first line to the second is effected using the relation (A1.10), and the remainder by use of previous definitions, including (A1.12) above. From (A1.3), one has
$$(1_B - h^\dagger f^\dagger)^{-1} = (S_{BB} + S_{BA}h)(g_B - h^\dagger g_{AB})^{-1} ,$$
which, upon substitution into (A1.13), gives eq. (3.15a). Equation (3.15b) follows directly from (3.15a) by simply dropping the terms in $g_{AB}$ and $g_{BA}$.

The easiest way to obtain (3.15c) is to begin again from the definition, (3.14), of $D^{(2)}(f)$:
$$D^{(2)}(f) = H_{BA} + H_{BB}f - (S_{BA} + S_{BB}f)\,\hat H_A^{(2)}$$
$$= H_{BA} + H_{BB}f - (S_{BA} + S_{BB}f)\big[\hat H_A^{(1)} + g_A^{-1}f^\dagger D^{(1)}(f)\big]$$
$$= \big[1_B - (S_{BA} + S_{BB}f)\,g_A^{-1}f^\dagger\big]\,D^{(1)}(f) .$$
This derivation is analogous to that establishing (3.6) in the case of an orthonormal basis.

For an orthonormal basis, it was found that the condition $D^{(2)}(f) = 0$ could be obtained by requiring that the product $T^{-1}HT$ be block diagonal. For a nonorthonormal basis, the corresponding condition is to require the product $(ST)^{-1}HT$ to be block diagonal. But writing $T^\dagger S T = g$, one has
$$(ST)^{-1} = g^{-1}T^\dagger ,$$
and
$$(ST)^{-1}HT = g^{-1}T^\dagger HT = g^{-1}G .$$
Thus, when $g$ is block diagonal,
$(ST)^{-1}HT$ has diagonal blocks $\hat H_A^{(2)}$ and $\hat H_B^{(2)}$, and an off-diagonal $BA$ block
$$D_{BA}^{(2)\prime} = \big[(ST)^{-1}HT\big]_{BA} = g_B^{-1} D^{(1)}(f) .$$
While this is of the form of eq. (3.6), $D_{BA}^{(2)\prime}$ cannot be written in terms of $\hat H_A^{(2)}$ as in eq. (3.14), unlike the analogous result in an orthonormal basis.

APPENDIX 2

The 3 x 3 and 4 x 4 Case - Orthonormal Basis

To illustrate some of the complications which arise in a multiple partitioning formalism, a number of explicit formulas are given here for quantities arising out of a 3 x 3 and a 4 x 4 partitioning formalism.

In the 3 x 3 partitioning formalism, the three non-self-adjoint effective operators given by (4.17) are
$$\hat H_1 = H_{11} + H_{12}f_{21} + H_{13}f_{31} ,$$
$$\hat H_2 = H_{22} + H_{21}f_{12} + H_{23}f_{32} , \tag{A2.1}$$
and
$$\hat H_3 = H_{33} + H_{31}f_{13} + H_{32}f_{23} ,$$
each containing only one extra term compared to the 2 x 2 case. A considerably greater increase in complexity occurs when $m = 3$ in the defining conditions on the $f_{IJ}$, given by (4.18). There are now six matrix block equations with six terms each, in place of the two block equations with four terms each, as in the 2 x 2 case. They are,
$$D_{21} = H_{21} + H_{22}f_{21} + H_{23}f_{31} - f_{21}(H_{11} + H_{12}f_{21} + H_{13}f_{31}) = 0 ,$$
$$D_{31} = H_{31} + H_{32}f_{21} + H_{33}f_{31} - f_{31}(H_{11} + H_{12}f_{21} + H_{13}f_{31}) = 0 ,$$
$$D_{12} = H_{12} + H_{11}f_{12} + H_{13}f_{32} - f_{12}(H_{22} + H_{21}f_{12} + H_{23}f_{32}) = 0 ,$$
$$D_{32} = H_{32} + H_{31}f_{12} + H_{33}f_{32} - f_{32}(H_{22} + H_{21}f_{12} + H_{23}f_{32}) = 0 ,$$
$$D_{13} = H_{13} + H_{11}f_{13} + H_{12}f_{23} - f_{13}(H_{33} + H_{31}f_{13} + H_{32}f_{23}) = 0 ,$$
and,
$$D_{23} = H_{23} + H_{21}f_{13} + H_{22}f_{23} - f_{23}(H_{33} + H_{31}f_{13} + H_{32}f_{23}) = 0 . \tag{A2.2}$$
The orthogonality condition in (4.1) gives rise to three matrix block equations here,
$$g_{12} = f_{12} + f_{21}^\dagger + f_{31}^\dagger f_{32} = 0 ,$$
$$g_{13} = f_{13} + f_{21}^\dagger f_{23} + f_{31}^\dagger = 0 ,$$
$$g_{23} = f_{12}^\dagger f_{13} + f_{23} + f_{32}^\dagger = 0 . \tag{A2.3}$$
These equations can be used to eliminate $f_{12}$, $f_{13}$ and $f_{23}$ from the remainder of the formalism, in favour of $f_{21}$, $f_{31}$ and $f_{32}$. In fact, it is not difficult to show that
$$f_{12} = -f_{21}^\dagger - f_{31}^\dagger f_{32} ,$$
$$f_{23} = (1_2 - f_{12}^\dagger f_{21}^\dagger)^{-1}(-f_{32}^\dagger + f_{12}^\dagger f_{31}^\dagger) = \big[1_2 + (f_{21} + f_{32}^\dagger f_{31})f_{21}^\dagger\big]^{-1}\big[-f_{32}^\dagger - (f_{21} + f_{32}^\dagger f_{31})f_{31}^\dagger\big] ,$$
and
$$f_{13} = -f_{31}^\dagger - f_{21}^\dagger f_{23} = -f_{31}^\dagger + f_{21}^\dagger\big[1_2 + (f_{21} + f_{32}^\dagger f_{31})f_{21}^\dagger\big]^{-1}\big[f_{32}^\dagger + (f_{21} + f_{32}^\dagger f_{31})f_{31}^\dagger\big] . \tag{A2.4}$$
The problems involved in eliminating $f_{12}$, $f_{23}$ and $f_{13}$ from eqs. (A2.2), and thereby reducing the number of block equations which must be considered from six to three, are clear from these equations. It would be quite difficult to derive efficient procedures to solve such a system, because of the generally complex dependence of the remaining three block equations on the elements of $f_{21}$, $f_{31}$ and $f_{32}$.

For a 4 x 4 partitioning, the orthogonality conditions, $g_{IJ} = 0$, give the following six unique matrix block equations,
$$g_{12} = f_{12} + f_{21}^\dagger + f_{31}^\dagger f_{32} + f_{41}^\dagger f_{42} = 0 , \tag{A2.5}$$
$$g_{13} = f_{13} + f_{21}^\dagger f_{23} + f_{31}^\dagger + f_{41}^\dagger f_{43} = 0 ,$$
$$g_{23} = f_{12}^\dagger f_{13} + f_{23} + f_{32}^\dagger + f_{42}^\dagger f_{43} = 0 , \tag{A2.6}$$
$$g_{14} = f_{14} + f_{21}^\dagger f_{24} + f_{31}^\dagger f_{34} + f_{41}^\dagger = 0 ,$$
$$g_{24} = f_{12}^\dagger f_{14} + f_{24} + f_{32}^\dagger f_{34} + f_{42}^\dagger = 0 ,$$
$$g_{34} = f_{13}^\dagger f_{14} + f_{23}^\dagger f_{24} + f_{34} + f_{43}^\dagger = 0 . \tag{A2.7}$$
These six equations can be used to write the $f_{IJ}$, $(J > I)$, solely in terms of the $f_{IJ}$, $(J < I)$, as follows. Equation (A2.5) gives $f_{12}$ directly as
$$f_{12} = -(f_{21}^\dagger + f_{31}^\dagger f_{32} + f_{41}^\dagger f_{42}) . \tag{A2.8}$$
Then the two equations, (A2.6), are solved simultaneously for $f_{13}$ and $f_{23}$, yielding,
$$f_{23} = (f_{12}^\dagger f_{21}^\dagger - 1_2)^{-1}\big[f_{32}^\dagger + f_{42}^\dagger f_{43} - f_{12}^\dagger(f_{31}^\dagger + f_{41}^\dagger f_{43})\big] , \tag{A2.9a}$$
and
$$f_{13} = -(f_{31}^\dagger + f_{41}^\dagger f_{43}) - f_{21}^\dagger f_{23} .$$
(A2.9b)

Substitution of the adjoint of (A2.8) into (A2.9a) then gives $f_{23}$ in terms only of the $f_{IJ}$, $(J < I)$, and substitution of that result into (A2.9b) does the same for $f_{13}$. The three equations (A2.7) can be solved simultaneously for $f_{14}$, $f_{24}$ and $f_{34}$, yielding,
$$f_{34} = -\big[1_3 - f_{13}^\dagger f_{31}^\dagger - (f_{23}^\dagger - f_{13}^\dagger f_{21}^\dagger)(1_2 - f_{12}^\dagger f_{21}^\dagger)^{-1}(f_{32}^\dagger - f_{12}^\dagger f_{31}^\dagger)\big]^{-1}\big[f_{43}^\dagger - f_{13}^\dagger f_{41}^\dagger - (f_{23}^\dagger - f_{13}^\dagger f_{21}^\dagger)(1_2 - f_{12}^\dagger f_{21}^\dagger)^{-1}(f_{42}^\dagger - f_{12}^\dagger f_{41}^\dagger)\big] , \tag{A2.10a}$$
$$f_{24} = -(1_2 - f_{12}^\dagger f_{21}^\dagger)^{-1}\big[f_{42}^\dagger - f_{12}^\dagger f_{41}^\dagger + (f_{32}^\dagger - f_{12}^\dagger f_{31}^\dagger)f_{34}\big] , \tag{A2.10b}$$
and
$$f_{14} = -(f_{41}^\dagger + f_{31}^\dagger f_{34} + f_{21}^\dagger f_{24}) . \tag{A2.10c}$$
Substitution of eqs. (A2.8) and (A2.9) into (A2.10a) gives $f_{34}$ in terms of the $f_{IJ}$, $(J < I)$, only. Similarly, (A2.8), (A2.9), and (A2.10a) can then be used to write $f_{24}$ in terms of the same set of variables. Equations (A2.10a,b) then can be used to eliminate $f_{24}$ and $f_{34}$ from (A2.10c). The resulting expressions will clearly be very lengthy.

It should be noted that the elements of the $f_{IJ}$, $(J > I)$, can be calculated numerically much more easily from those of the $f_{IJ}$, $(J < I)$, than eqs. (A2.3) or (A2.8) - (A2.10) indicate. Such a calculation involves the solution of $\sum_{I,J;\,J<I} n_I n_J$ simultaneous linear equations in the same number of scalar variables. The complicated formulas above arise only when analytic formulas are desired relating these different matrix blocks $f_{IJ}$.

APPENDIX 3

Proofs of Alternative Formulas - Multiple Partitioning

This appendix outlines some of the manipulations necessary to establish a number of inter-relations which have been quoted in section 4.4.

Equations (4.73) for an orthonormal basis are obtained as follows. From eq. (4.37) and eq.
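(4.72), one has,
$$\hat H_I^{(2)} = g_I^{-1} G_I = g_I^{-1}\Big[H_{II} + \sum_{J\ne I} H_{IJ}f_{JI} + \sum_{J\ne I} f_{JI}^\dagger\Big(H_{JI} + \sum_{K\ne I} H_{JK}f_{KI}\Big)\Big] . \tag{A3.1}$$
Elimination of the quantity in the inner brackets in this equation using
$$D_{JI}^{(1)} = H_{JI} + \sum_{K\ne I} H_{JK}f_{KI} - f_{JI}\hat H_I^{(1)}$$
leads to the desired expression,
$$\hat H_I^{(2)} = \hat H_I^{(1)} + g_I^{-1}\sum_{J\ne I} f_{JI}^\dagger D_{JI}^{(1)} .$$
Equation (4.10) has also been used in the last step.

The relation (4.76) between $D^{(2)}$ and $D^{(1)}$, in an orthonormal basis, is established immediately by substituting eq. (4.73) into the definition, (4.75), of $D^{(2)}$.

The non-orthonormal case presents many more complications here. From eq. (4.58), one has,
$$S_{KI} + \sum_{M\ne I} S_{KM}f_{MI} = g_{KI} - \sum_{L\ne K} f_{LK}^\dagger\Big(S_{LI} + \sum_{M\ne I} S_{LM}f_{MI}\Big) , \quad (K \ne I) . \tag{A3.4}$$
Upon substitution of this equation into eq. (4.59), a series of alternative formulas for the metric $g_I$ can be obtained, among them,
$$g_I = \sum_{K\ne I} f_{KI}^\dagger\Big[g_{KI} - \sum_{L\ne K} f_{LK}^\dagger\Big(S_{LI} + \sum_{M\ne I} S_{LM}f_{MI}\Big)\Big] + S_{II} + \sum_{K\ne I} S_{IK}f_{KI}$$
$$= -\sum_{K\ne I} f_{KI}^\dagger \sum_{L\ne K} f_{LK}^\dagger\Big(S_{LI} + \sum_{M\ne I} S_{LM}f_{MI}\Big) + \sum_{K\ne I} f_{KI}^\dagger g_{KI} + S_{II} + \sum_{K\ne I} S_{IK}f_{KI}$$
$$= \Big(1_I - \sum_{K\ne I} f_{KI}^\dagger f_{IK}^\dagger\Big)\Big(S_{II} + \sum_{M\ne I} S_{IM}f_{MI}\Big) - \sum_{K\ne I} f_{KI}^\dagger \sum_{\substack{L\ne K\\ L\ne I}} f_{LK}^\dagger\Big(S_{LI} + \sum_{M\ne I} S_{LM}f_{MI}\Big) + \sum_{K\ne I} f_{KI}^\dagger g_{KI} , \quad (I = 1, \ldots, m) . \tag{A3.5}$$
This last form is the generalization to the m x m partitioning of eqs. (A1.1) and (A1.2) of Appendix 1. In the case of a 2 x 2 partitioning, the second term of (A3.5) does not occur at all because of the restrictions on the range of the inner summation. Also, the summation symbols in the first and third terms of (A3.5) can be deleted in that case, since the summation is over only one term.

Using the notation $\tilde S_I$ of eq. (4.65), the generalizations of eqs. (A1.3) and (A1.4) of Appendix 1 are obtained from (A3.5) as
$$g_I \tilde S_I^{-1} = \Big(1_I - \sum_{K\ne I} f_{KI}^\dagger f_{IK}^\dagger\Big) - \sum_{K\ne I} f_{KI}^\dagger \sum_{\substack{L\ne K\\ L\ne I}} f_{LK}^\dagger\Big(S_{LI} + \sum_{M\ne I} S_{LM}f_{MI}\Big)\tilde S_I^{-1} + \sum_{K\ne I} f_{KI}^\dagger g_{KI}\tilde S_I^{-1} . \tag{A3.6}$$
The analogues of eqs.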
(A1.5) and (A1.6) of Appendix 1 are now obtained by right multiplying (A3.6) by $f_{PI}^\dagger$, $(P \ne I)$, and left multiplying the equation for $g_P \tilde S_P^{-1}$ by the same factor, and combining the two equations to get,
$$g_I \tilde S_I^{-1} f_{PI}^\dagger = f_{PI}^\dagger g_P \tilde S_P^{-1} - \mathcal{A}\,f_{PI}^\dagger + f_{PI}^\dagger\,\mathcal{B} ,$$
and
$$g_P \tilde S_P^{-1} f_{IP}^\dagger = f_{IP}^\dagger g_I \tilde S_I^{-1} - \mathcal{B}\,f_{IP}^\dagger + f_{IP}^\dagger\,\mathcal{A} , \tag{A3.7}$$
where
$$\mathcal{A} = \sum_{K\ne I} f_{KI}^\dagger\Big\{f_{IK}^\dagger + \sum_{\substack{L\ne K\\ L\ne I}} f_{LK}^\dagger\Big(S_{LI} + \sum_{M\ne I} S_{LM}f_{MI}\Big)\tilde S_I^{-1} - g_{KI}\tilde S_I^{-1}\Big\} ,$$
and
$$\mathcal{B} = \sum_{K\ne P} f_{KP}^\dagger\Big\{f_{PK}^\dagger + \sum_{\substack{L\ne K\\ L\ne P}} f_{LK}^\dagger\Big(S_{LP} + \sum_{M\ne P} S_{LM}f_{MP}\Big)\tilde S_P^{-1} - g_{KP}\tilde S_P^{-1}\Big\} . \tag{A3.8}$$
The two equations, (A3.7), as well as the two quantities, $\mathcal{A}$ and $\mathcal{B}$, can be obtained from each other by interchanging the indices $P$ and $I$. The generalization of eqs. (A1.7) and (A1.8) of Appendix 1 is then obtained by dropping those terms in (A3.7) above involving $g_{KI}$, $K \ne I$. Here, however, this amounts to dropping only the last term inside the curly brackets of $\mathcal{A}$ and $\mathcal{B}$. Thus the usefulness of the resulting equations as generalized multiple partitioning "pull-through" relations is severely hampered, because of the complexity and size of the last two terms of (A3.7), even when the orthogonality condition is satisfied.

Finally, the generalization of eqs. (A1.9) and (A1.10) of Appendix 1, which were used to obtain one of the relations between $D^{(2)}$ and $D^{(1)}$ for a 2 x 2 partitioning, can be obtained as follows. From (4.58), one has
$$f_{IJ}^\dagger \tilde S_I = g_{JI} - \sum_{\substack{K\ne J\\ K\ne I}} f_{KJ}^\dagger\Big(S_{KI} + \sum_{L\ne I} S_{KL}f_{LI}\Big) - \Big(S_{JI} + \sum_{K\ne I} S_{JK}f_{KI}\Big) , \tag{A3.9}$$
so that, from eq. (4.59) and (A3.4),
$$f_{IJ}^\dagger g_I = -(1_J - f_{IJ}^\dagger f_{JI}^\dagger)\Big(S_{JI} + \sum_{K\ne I} S_{JK}f_{KI}\Big) + g_{JI} + \cdots , \tag{A3.10}$$
where the dots stand for four further summation terms over the remaining blocks. The last four complicated summation terms in (A3.10) make the result effectively useless, and therefore, no generalization of eqs. (3.15a,b) is given here.

The proof of eqs. (4.82) is as follows. From eq. (A3.
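9), one has,
$$S_{JI} + \sum_{K\ne I} S_{JK}f_{KI} = g_{JI} - f_{IJ}^\dagger \tilde S_I - \sum_{\substack{K\ne J\\ K\ne I}} f_{KJ}^\dagger\Big(S_{KI} + \sum_{L\ne I} S_{KL}f_{LI}\Big) .$$
Then, eq. (4.83) becomes,
$$D_{JI}^{(1)} = H_{JI} + \sum_{K\ne I} H_{JK}f_{KI} - \Big(S_{JI} + \sum_{K\ne I} S_{JK}f_{KI}\Big)\tilde S_I^{-1}\Big(H_{II} + \sum_{J\ne I} H_{IJ}f_{JI}\Big)$$
$$= H_{JI} + \sum_{K\ne I} H_{JK}f_{KI} - \Big[g_{JI} - f_{IJ}^\dagger \tilde S_I - \sum_{\substack{K\ne J\\ K\ne I}} f_{KJ}^\dagger\Big(S_{KI} + \sum_{L\ne I} S_{KL}f_{LI}\Big)\Big]\hat H_I^{(1)} , \tag{A3.11}$$
from which an expression for the combination $H_{JI} + \sum_{K\ne I} H_{JK}f_{KI}$ can be obtained. By definition, one has,
$$\hat H_I^{(2)} = g_I^{-1}\Big[H_{II} + \sum_{J\ne I} H_{IJ}f_{JI} + \sum_{J\ne I} f_{JI}^\dagger\Big(H_{JI} + \sum_{K\ne I} H_{JK}f_{KI}\Big)\Big] ,$$
which becomes, after using (A3.11),
$$\hat H_I^{(2)} = g_I^{-1}\Big\{\Big[\Big(1_I - \sum_{J\ne I} f_{JI}^\dagger f_{IJ}^\dagger\Big)\tilde S_I - \sum_{J\ne I} f_{JI}^\dagger \sum_{\substack{K\ne J\\ K\ne I}} f_{KJ}^\dagger\Big(S_{KI} + \sum_{L\ne I} S_{KL}f_{LI}\Big) + \sum_{J\ne I} f_{JI}^\dagger g_{JI}\Big]\hat H_I^{(1)} + \sum_{J\ne I} f_{JI}^\dagger D_{JI}^{(1)}\Big\} .$$
By eq. (A3.5), the coefficient of $\hat H_I^{(1)} = \tilde S_I^{-1}\big(H_{II} + \sum_{J\ne I} H_{IJ}f_{JI}\big)$ inside the braces is just $g_I$, with the final result being eq. (4.82),
$$\hat H_I^{(2)} = \hat H_I^{(1)} + g_I^{-1}\sum_{J\ne I} f_{JI}^\dagger D_{JI}^{(1)} .$$

APPENDIX 4

Description of Algorithms - 2 x 2 Case

This appendix gives detailed descriptions of the implementation of the algorithms discussed in chapter 5. In various instances below, especially in the updating cycles, the order in which the computations are done is important. Greek indices refer to basis elements in $S_B$, Roman indices to basis elements in $S_A$.

1. Simple Diagonal Newton-Raphson (SDNR).

initialization: $f = 0$, $\hat H_A^{(1)} = H_{AA}$, $\hat H_B^{(1)\dagger\,\mathrm{diag}} = H_{BB}^{\mathrm{diag}}$.

then, for $\sigma = 1, \ldots, n_B$ and $r = 1, \ldots, n_A$:
$$D_{\sigma r}^{(1)} = H_{\sigma r} + \sum_{\rho=1}^{n_B} H_{\sigma\rho}f_{\rho r} - \sum_{t=1}^{n_A} f_{\sigma t}(\hat H_A^{(1)})_{tr} ,$$
$$\delta f_{\sigma r} = D_{\sigma r}^{(1)} \big/ \big[(\hat H_A^{(1)})_{rr} - (\hat H_B^{(1)\dagger})_{\sigma\sigma}\big] .$$

update:
$(\hat H_A^{(1)})_{sr} \to (\hat H_A^{(1)})_{sr} + H_{s\sigma}\,\delta f_{\sigma r}$, $(s = 1, \ldots, n_A)$,
$(\hat H_B^{(1)\dagger})_{\sigma\sigma} \to (\hat H_B^{(1)\dagger})_{\sigma\sigma} - \delta f_{\sigma r} H_{r\sigma}$,
$f_{\sigma r} \to f_{\sigma r} + \delta f_{\sigma r}$.

2. Quadratic Diagonal Newton-Raphson (QDNR).

initialization: $f = 0$, $\hat H_A^{(1)} = H_{AA}$, $\hat H_B^{(1)\dagger\,\mathrm{diag}} = H_{BB}^{\mathrm{diag}}$.

then, for $\sigma = 1, \ldots, n_B$ and $r = 1, \ldots, n_A$, with $D_{\sigma r}^{(1)}$ as in SDNR and $\Delta_{\sigma r} = (\hat H_A^{(1)})_{rr} - (\hat H_B^{(1)\dagger})_{\sigma\sigma}$:
if $H_{r\sigma} = 0$, then $\delta f_{\sigma r} = D_{\sigma r}^{(1)}/\Delta_{\sigma r}$;
if $\Delta_{\sigma r} = 0$, then $\delta f_{\sigma r} = \pm\big(D_{\sigma r}^{(1)}/H_{r\sigma}\big)^{1/2}$ if real, $\delta f_{\sigma r} = 0$ otherwise;
if both $H_{r\sigma} = 0$ and $\Delta_{\sigma r} = 0$, then $\delta f_{\sigma r} = 0$;
otherwise,
$$\delta f_{\sigma r} = \left[\frac{2 D_{\sigma r}^{(1)}}{|\Delta_{\sigma r}| + \big(\Delta_{\sigma r}^2 + 4 H_{r\sigma} D_{\sigma r}^{(1)}\big)^{1/2}}\right] \mathrm{sgn}(\Delta_{\sigma r}) .$$

update: $(\hat H_A^{(1)})_{sr} \to (\hat H_A^{(1)})_{sr}$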
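$+\,H_{s\sigma}\,\delta f_{\sigma r}$, $(s = 1, \ldots, n_A)$,
$(\hat H_B^{(1)\dagger})_{\sigma\sigma} \to (\hat H_B^{(1)\dagger})_{\sigma\sigma} - \delta f_{\sigma r} H_{r\sigma}$,
$f_{\sigma r} \to f_{\sigma r} + \delta f_{\sigma r}$.

3. Diagonal Generalized Nesbet (DGN).

initialization: $f = 0$, $\hat H_A^{(2)} = H_{AA}$, $g_A = 1_A$.

then, for $\sigma = 1, \ldots, n_B$:
$$W_{\sigma r} = H_{\sigma r} + \sum_{\rho=1}^{n_B} H_{\sigma\rho}f_{\rho r} , \qquad D_{\sigma r}^{(2)} = W_{\sigma r} - \sum_{t=1}^{n_A} f_{\sigma t}(\hat H_A^{(2)})_{tr} ,$$
$$\delta f_{\sigma r} = D_{\sigma r}^{(2)} \big/ \big[(\hat H_A^{(2)})_{rr} - H_{\sigma\sigma}\big] , \quad (r = 1, \ldots, n_A) .$$

update:
$(\delta g_A)_{ts} = \delta f_{\sigma t}^* f_{\sigma s} + f_{\sigma t}^*\,\delta f_{\sigma s} + \delta f_{\sigma t}^*\,\delta f_{\sigma s}$, $(s, t = 1, \ldots, n_A)$,
$g_A \to g_A + \delta g_A$,
$f_{\sigma r} \to f_{\sigma r} + \delta f_{\sigma r}$, $(r = 1, \ldots, n_A)$,
$\Delta_{rs} = -f_{\sigma r}^* \sum_{t=1}^{n_A} \delta f_{\sigma t}(\hat H_A^{(2)})_{ts} + \delta f_{\sigma r}^*\,\delta f_{\sigma s}(\hat H_A^{(2)})_{ss} + W_{\sigma r}^*\,\delta f_{\sigma s}$, $(r, s = 1, \ldots, n_A)$,
$\hat H_A^{(2)} \to \hat H_A^{(2)} + g_A^{-1}\Delta$.

4. Full Generalized Nesbet (FGN).

initialization: $f = 0$, $\hat H_A^{(2)} = H_{AA}$, $g_A = 1_A$.

then, for $\sigma = 1, \ldots, n_B$:
$$W_{\sigma r} = H_{\sigma r} + \sum_{\rho=1}^{n_B} H_{\sigma\rho}f_{\rho r} , \qquad D_{\sigma r}^{(2)} = W_{\sigma r} - \sum_{t=1}^{n_A} f_{\sigma t}(\hat H_A^{(2)})_{tr} , \quad (r = 1, \ldots, n_A) ,$$
solve $\delta f_\sigma\big[\hat H_A^{(2)} - H_{\sigma\sigma} 1_A\big] = D_\sigma^{(2)}$.

update:
$(\delta g_A)_{ts} = \delta f_{\sigma t}^* f_{\sigma s} + f_{\sigma t}^*\,\delta f_{\sigma s} + \delta f_{\sigma t}^*\,\delta f_{\sigma s}$, $(s, t = 1, \ldots, n_A)$,
$g_A \to g_A + \delta g_A$,
$\Delta_{ts} = W_{\sigma t}^*\,\delta f_{\sigma s} - f_{\sigma t}^* \sum_{r=1}^{n_A} \delta f_{\sigma r}(\hat H_A^{(2)})_{rs}$, $(s, t = 1, \ldots, n_A)$,
$\hat H_A^{(2)} \to \hat H_A^{(2)} + g_A^{-1}\Delta$,
$f_{\sigma r} \to f_{\sigma r} + \delta f_{\sigma r}$, $(r = 1, \ldots, n_A)$.

5. Simple Diagonal Newton-Raphson With Overlap (SDNRS).

initialization: $f = 0$, $h = 0$, $\tilde H_A = H_{AA}$, $\tilde S_A = S_{AA}$, $\tilde H_B^{\dagger\,\mathrm{diag}} = H_{BB}^{\mathrm{diag}}$, $\tilde S_B^{\dagger\,\mathrm{diag}} = S_{BB}^{\mathrm{diag}}$.

then, for $\sigma = 1, \ldots, n_B$ and $r = 1, \ldots, n_A$:
$$G_{\sigma r} = H_{\sigma r} + \sum_{\rho=1}^{n_B} H_{\sigma\rho}f_{\rho r} + \sum_{s=1}^{n_A} h_{s\sigma}^*(\tilde H_A)_{sr} ,$$
$$g_{\sigma r} = S_{\sigma r} + \sum_{\rho=1}^{n_B} S_{\sigma\rho}f_{\rho r} + \sum_{s=1}^{n_A} h_{s\sigma}^*(\tilde S_A)_{sr} ,$$
$$\Delta_{\sigma r} = (\tilde H_B^\dagger)_{\sigma\sigma}(\tilde S_A)_{rr} - (\tilde H_A)_{rr}(\tilde S_B^\dagger)_{\sigma\sigma} ,$$
$$\delta f_{\sigma r} = \big[g_{\sigma r}(\tilde H_A)_{rr} - G_{\sigma r}(\tilde S_A)_{rr}\big] / \Delta_{\sigma r} ,$$
$$\delta h_{r\sigma}^* = \big[(\tilde S_B^\dagger)_{\sigma\sigma}G_{\sigma r} - (\tilde H_B^\dagger)_{\sigma\sigma}g_{\sigma r}\big] / \Delta_{\sigma r} .$$

update:
$(\tilde H_A)_{sr} \to (\tilde H_A)_{sr} + H_{s\sigma}\,\delta f_{\sigma r}$, $(\tilde S_A)_{sr} \to (\tilde S_A)_{sr} + S_{s\sigma}\,\delta f_{\sigma r}$, $(s = 1, \ldots, n_A)$,
$(\tilde H_B^\dagger)_{\sigma\sigma} \to (\tilde H_B^\dagger)_{\sigma\sigma} + \delta h_{r\sigma}^* H_{r\sigma}$, $(\tilde S_B^\dagger)_{\sigma\sigma} \to (\tilde S_B^\dagger)_{\sigma\sigma} + \delta h_{r\sigma}^* S_{r\sigma}$,
$f_{\sigma r} \to f_{\sigma r} + \delta f_{\sigma r}$, $h_{r\sigma} \to h_{r\sigma} + \delta h_{r\sigma}$.

6. Quadratic Diagonal Newton-Raphson with Overlap (QDNRS).

initialization: $f = 0$, $h = 0$, $\tilde H_A = H_{AA}$, $\tilde S_A = S_{AA}$, $\tilde H_B^{\dagger\,\mathrm{diag}} = H_{BB}^{\mathrm{diag}}$, $\tilde S_B^{\dagger\,\mathrm{diag}} = S_{BB}^{\mathrm{diag}}$.

then, for $\sigma = 1, \ldots, n_B$ and $r = 1, \ldots, n_A$, with $G_{\sigma r}$ and $g_{\sigma r}$ as in SDNRS:
$$A = (\tilde S_B^\dagger)_{\sigma\sigma}H_{r\sigma} - S_{r\sigma}(\tilde H_B^\dagger)_{\sigma\sigma} ,$$
$$B = (\tilde S_B^\dagger)_{\sigma\sigma}(\tilde H_A)_{rr} - (\tilde S_A)_{rr}(\tilde H_B^\dagger)_{\sigma\sigma} - S_{r\sigma}G_{\sigma r} + H_{r\sigma}g_{\sigma r} ,$$
$$C = -(\tilde S_A)_{rr}G_{\sigma r} + g_{\sigma r}(\tilde H_A)_{rr} ,$$
$\delta f_{\sigma r} = 0$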
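if $A = 0$, $B = 0$;
$\delta f_{\sigma r} = (-C/A)^{1/2}$ if $A \ne 0$, $B = 0$ and $C/A < 0$;
$\delta f_{\sigma r} = 0$ if $A \ne 0$, $B = 0$ and $C/A > 0$;
$\delta f_{\sigma r} = -C/B$ if $A = 0$, $B \ne 0$;
$$\delta f_{\sigma r} = -\frac{B}{|B|}\left[\frac{2C}{|B| + (B^2 - 4AC)^{1/2}}\right] \quad \text{if } A \ne 0,\ B \ne 0 ;$$
$$\delta h_{r\sigma}^* = -\frac{G_{\sigma r} + (\tilde H_B^\dagger)_{\sigma\sigma}\,\delta f_{\sigma r}}{(\tilde H_A)_{rr} + H_{r\sigma}\,\delta f_{\sigma r}} .$$

update:
$(\tilde H_A)_{sr} \to (\tilde H_A)_{sr} + H_{s\sigma}\,\delta f_{\sigma r}$, $(\tilde S_A)_{sr} \to (\tilde S_A)_{sr} + S_{s\sigma}\,\delta f_{\sigma r}$, $(s = 1, \ldots, n_A)$,
$(\tilde H_B^\dagger)_{\sigma\sigma} \to (\tilde H_B^\dagger)_{\sigma\sigma} + \delta h_{r\sigma}^* H_{r\sigma}$, $(\tilde S_B^\dagger)_{\sigma\sigma} \to (\tilde S_B^\dagger)_{\sigma\sigma} + \delta h_{r\sigma}^* S_{r\sigma}$,
$f_{\sigma r} \to f_{\sigma r} + \delta f_{\sigma r}$, $h_{r\sigma} \to h_{r\sigma} + \delta h_{r\sigma}$, $(r = 1, \ldots, n_A;\ \sigma = 1, \ldots, n_B)$.

7. Non-Orthogonal Diagonal Generalized Nesbet (DGNS).

initialization: $f = 0$, $g_A = S_{AA}$, $\hat H_A^{(2)} = S_{AA}^{-1}H_{AA}$.

then, for $\sigma = 1, \ldots, n_B$:
$$W_{\sigma r} = H_{\sigma r} + \sum_{\rho=1}^{n_B} H_{\sigma\rho}f_{\rho r} , \qquad Y_{\sigma r} = S_{\sigma r} + \sum_{\rho=1}^{n_B} S_{\sigma\rho}f_{\rho r} , \quad (r = 1, \ldots, n_A) ,$$
$$D_{\sigma r}^{(2)} = W_{\sigma r} - \sum_{t=1}^{n_A} Y_{\sigma t}(\hat H_A^{(2)})_{tr} , \qquad \delta f_{\sigma r} = D_{\sigma r}^{(2)} \big/ \big[(\hat H_A^{(2)})_{rr}S_{\sigma\sigma} - H_{\sigma\sigma}\big] , \quad (r = 1, \ldots, n_A) .$$

update:
$(\delta g_A)_{ts} = \delta f_{\sigma t}^* Y_{\sigma s} + Y_{\sigma t}^*\,\delta f_{\sigma s} + \delta f_{\sigma t}^* S_{\sigma\sigma}\,\delta f_{\sigma s}$, $(s, t = 1, \ldots, n_A)$,
$g_A \to g_A + \delta g_A$,
$Y_{\sigma r} \to Y_{\sigma r} + S_{\sigma\sigma}\,\delta f_{\sigma r}$, $(r = 1, \ldots, n_A)$,
$\Delta_{rs} = W_{\sigma r}^*\,\delta f_{\sigma s} - Y_{\sigma r}^* \sum_{t=1}^{n_A} \delta f_{\sigma t}(\hat H_A^{(2)})_{ts} + \delta f_{\sigma r}^* S_{\sigma\sigma}\,\delta f_{\sigma s}(\hat H_A^{(2)})_{ss}$, $(r, s = 1, \ldots, n_A)$,
$\hat H_A^{(2)} \to \hat H_A^{(2)} + g_A^{-1}\Delta$,
$f_{\sigma r} \to f_{\sigma r} + \delta f_{\sigma r}$, $(r = 1, \ldots, n_A)$.

8. Non-Orthogonal Full Generalized Nesbet (FGNS).

initialization: $f = 0$, $\hat H_A^{(2)} = S_{AA}^{-1}H_{AA}$, $g_A = S_{AA}$.

then, for $\sigma = 1, \ldots, n_B$:
$$W_{\sigma r} = H_{\sigma r} + \sum_{\rho=1}^{n_B} H_{\sigma\rho}f_{\rho r} , \qquad Y_{\sigma r} = S_{\sigma r} + \sum_{\rho=1}^{n_B} S_{\sigma\rho}f_{\rho r} , \quad (r = 1, \ldots, n_A) ,$$
$$D_{\sigma r}^{(2)} = W_{\sigma r} - \sum_{t=1}^{n_A} Y_{\sigma t}(\hat H_A^{(2)})_{tr} ,$$
solve $\delta f_\sigma\big[S_{\sigma\sigma}\hat H_A^{(2)} - H_{\sigma\sigma}1_A\big] = D_\sigma^{(2)}$.

update:
$(\delta g_A)_{ts} = \delta f_{\sigma t}^* Y_{\sigma s} + Y_{\sigma t}^*\,\delta f_{\sigma s} + \delta f_{\sigma t}^* S_{\sigma\sigma}\,\delta f_{\sigma s}$, $(s, t = 1, \ldots, n_A)$,
$g_A \to g_A + \delta g_A$,
$\Delta_{ts} = W_{\sigma t}^*\,\delta f_{\sigma s} - Y_{\sigma t}^* \sum_{r=1}^{n_A} \delta f_{\sigma r}(\hat H_A^{(2)})_{rs}$, $(s, t = 1, \ldots, n_A)$,
$\hat H_A^{(2)} \to \hat H_A^{(2)} + g_A^{-1}\Delta$,
$f_{\sigma r} \to f_{\sigma r} + \delta f_{\sigma r}$, $(r = 1, \ldots, n_A)$.

APPENDIX 5

Rates of Convergence and Asymptotic Error Constants

This appendix analyses the rates of convergence of some of the algorithms, for the determination of $f$, described in chapter 5. These considerations are based to some degree on the work of Traub (1964).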
To avoid confusion between subscripts denoting iteration number, and those denoting matrix elements, the fixed point iteration formula, eq. (5.4), will be rewritten here as,
$$\Phi(f) = f - \mathcal{K}\,D(f) , \tag{A5.1}$$
where $\mathcal{K}$ denotes the operator multiplying $D(f)$ in eq. (5.4). Thus, $\Phi(f_{\mathrm{exact}}) = f_{\mathrm{exact}}$, because $D(f_{\mathrm{exact}}) = 0$. The basic iteration formula, eqs. (5.5), can be written in this notation as
$$f_{m+1} = \Phi(f_m) . \tag{A5.2}$$
If the necessary derivatives of $\Phi(f)$ exist at $f_{\mathrm{exact}}$, then $\Phi(f)$ can be expanded in a Taylor series centered on $f_{\mathrm{exact}}$, which allows one to write an expression for the error in the current estimate of $f$, given by $\Phi(f)$, in terms of the error in the result of the previous iteration. One has
$$\epsilon_{\rho t}^{(m+1)} = \sum_{\sigma,r}\left.\frac{\partial \Phi_{\rho t}}{\partial f_{\sigma r}}\right|_{f_{\mathrm{exact}}}\epsilon_{\sigma r}^{(m)} + \frac{1}{2!}\sum_{\sigma,r}\sum_{\sigma',r'}\left.\frac{\partial^2 \Phi_{\rho t}}{\partial f_{\sigma r}\,\partial f_{\sigma' r'}}\right|_{f_{\mathrm{exact}}}\epsilon_{\sigma r}^{(m)}\epsilon_{\sigma' r'}^{(m)} + \cdots , \tag{A5.3}$$
where
$$\epsilon_{\sigma r}^{(m)} = (f_m)_{\sigma r} - (f_{\mathrm{exact}})_{\sigma r} . \tag{A5.4}$$
The iteration function $\Phi(f)$ is then said to be of order $p$ if all derivatives of the elements of $\Phi$, with respect to elements of $f$, of order less than $p$ vanish at $f = f_{\mathrm{exact}}$, while at least one derivative of order $p$ does not vanish. Near the solution, the dominant term in the error will then be a sum over terms containing the product of $p$ errors from the previous iteration. The asymptotic error constants for this iteration function are taken here as the $p$-th order coefficients in (A5.3).

For the exact Newton-Raphson equations, (5.6), the iteration function is
$$\Phi^{NR}(f) = f - J^{-1}D . \tag{A5.5}$$
Thus, one has
$$\frac{\partial \Phi_{\rho t}^{NR}}{\partial f_{\sigma r}} = -\sum_{\sigma',r'}\frac{\partial (J^{-1})_{\rho t,\sigma' r'}}{\partial f_{\sigma r}}\,D_{\sigma' r'} , \tag{A5.6}$$
which vanishes at $f = f_{\mathrm{exact}}$, because $D(f_{\mathrm{exact}}) = 0$. The second derivatives are
$$\frac{\partial^2 \Phi_{\rho t}^{NR}}{\partial f_{\sigma r}\,\partial f_{\sigma' r'}} = -\sum_{\sigma'',r''}\frac{\partial^2 (J^{-1})_{\rho t,\sigma'' r''}}{\partial f_{\sigma r}\,\partial f_{\sigma' r'}}\,D_{\sigma'' r''} + \sum_{\sigma'',r''}(J^{-1})_{\rho t,\sigma'' r''}\,\frac{\partial J_{\sigma'' r'',\sigma' r'}}{\partial f_{\sigma r}} , \tag{A5.7}$$
where the identity, $\partial (J^{-1}J)/\partial f_{\sigma r} = 0$, has been used to obtain the last term. At $f = f_{\mathrm{exact}}$ the first summation in (A5.7) vanishes, but not the second one, in general.
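The second order behaviour described above can be checked numerically in the simplest case, $n_A = n_B = 1$, where $f$ is a single number, $D(f) = H_{BA} + (H_{BB} - H_{AA})f - H_{AB}f^2$ and $J = dD/df$. The following is a minimal sketch only: the function name `newton_errors`, the matrix entries and the starting guess are illustrative assumptions, not taken from the thesis.

```python
import math

def newton_errors(a, b, c, f0=0.0, steps=5):
    """Scalar (n_A = n_B = 1) Newton-Raphson iteration for D(f) = 0.

    H = [[a, b], [b, c]] is real symmetric, with
        D(f) = b + (c - a) f - b f^2,   J(f) = dD/df = (c - a) - 2 b f.
    Returns the errors |f_m - f_exact| for m = 0, ..., steps.
    """
    gap = c - a
    disc = math.sqrt(gap * gap + 4.0 * b * b)
    # Root of D that tends to zero with b, written in a cancellation-free form.
    f_exact = -2.0 * b / (gap + math.copysign(disc, gap))
    f = f0
    errors = [abs(f - f_exact)]
    for _ in range(steps):
        d = b + gap * f - b * f * f      # D(f_m)
        jac = gap - 2.0 * b * f          # J(f_m)
        f -= d / jac                     # f_{m+1} = f_m - J^{-1} D
        errors.append(abs(f - f_exact))
    return errors

# Illustrative values (assumed): each error is roughly the square of the last.
for m, err in enumerate(newton_errors(1.0, 0.5, 3.0)):
    print(m, err)
```

For these assumed values the ratio of each error to the square of the previous one settles near $|H_{AB}/J|$, which is the scalar form of the second order coefficient in (A5.3).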
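The Newton-Raphson equations are thus second order convergent, as is well known, and one can write
$$\epsilon_{\rho t}^{(m+1)} = \frac{1}{2}\sum_{\sigma,r}\sum_{\sigma',r'}\left.\frac{\partial^2 \Phi_{\rho t}^{NR}}{\partial f_{\sigma r}\,\partial f_{\sigma' r'}}\right|_{f_{\mathrm{exact}}}\epsilon_{\sigma r}^{(m)}\epsilon_{\sigma' r'}^{(m)} + O(\epsilon^3) , \tag{A5.8}$$
indicating explicitly the second order nature of $\Phi^{NR}(f)$.

The algorithm SDNR is based on the equation $D^{(1)}(f) = 0$. As seen from eq. (5.15), the operator $\mathcal{K}$ in the iteration formula $\Phi^{SDNR}(f)$ is just the inverse of the diagonal part of the Jacobian matrix,
$$\mathcal{K}_{\rho t,\sigma r}^{SDNR} = \frac{\delta_{\rho\sigma}\delta_{tr}}{J_{\rho t,\rho t}^{(1)}} , \tag{A5.9}$$
so that,
$$\Phi_{\rho t}^{SDNR}(f) = f_{\rho t} - \frac{D_{\rho t}^{(1)}(f)}{J_{\rho t,\rho t}^{(1)}} , \tag{A5.10}$$
and
$$\left.\frac{\partial \Phi_{\rho t}^{SDNR}}{\partial f_{\sigma r}}\right|_{f_{\mathrm{exact}}} = \delta_{\rho\sigma}\delta_{rt} - \left.\frac{J_{\rho t,\sigma r}^{(1)}}{J_{\rho t,\rho t}^{(1)}}\right|_{f_{\mathrm{exact}}} , \tag{A5.11}$$
which vanishes only for $\rho = \sigma$, and $t = r$, in general. Thus, for $\Phi^{SDNR}$, one can write
$$\epsilon_{\rho t}^{(m+1)} = \frac{\sum_{\sigma\ne\rho}(\hat H_B^{(1)\dagger})_{\rho\sigma}\,\epsilon_{\sigma t}^{(m)} - \sum_{r\ne t}\epsilon_{\rho r}^{(m)}(\hat H_A^{(1)})_{rt}}{(\hat H_A^{(1)})_{tt} - (\hat H_B^{(1)\dagger})_{\rho\rho}} + O(\epsilon^2) , \tag{A5.12}$$
which verifies that $\Phi^{SDNR}$ is indeed linearly convergent, and gives an expression for the dominant error term near the solution. A sufficient condition for convergence to occur is
$$|\epsilon_{\rho t}^{(m+1)}| < |\epsilon_{\rho t}^{(m)}| , \quad (\rho = 1, \ldots, n_B;\ t = 1, \ldots, n_A) . \tag{A5.13}$$
Assuming that
$$\left|\frac{\epsilon_{\sigma r}^{(m)}}{\epsilon_{\rho t}^{(m)}}\right| \le 1 , \quad (\sigma = 1, \ldots, n_B;\ r = 1, \ldots, n_A) , \tag{A5.14}$$
is true when $\Phi_{\rho t}$ is to be evaluated, it can be seen from (A5.12) that convergence will definitely occur if
$$\sum_{\sigma\ne\rho}\big|(\hat H_B^{(1)\dagger})_{\rho\sigma}\big| + \sum_{r\ne t}\big|(\hat H_A^{(1)})_{rt}\big| < \big|(\hat H_A^{(1)})_{tt} - (\hat H_B^{(1)\dagger})_{\rho\rho}\big| , \tag{A5.15}$$
which is obtained by replacing all the ratios of the type (A5.14) occurring in (A5.13) by unity. When cycling systematically through the elements of $f$, all $f_{\sigma r}$, $(\sigma \ne \rho,\ r \ne t)$, will have been updated more recently than $f_{\rho t}$ at this point, and thus, in an appropriate basis, the condition (A5.14) is not unreasonable, as long as the calculation is converging and the errors thus decreasing.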
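The element-by-element SDNR sweep analysed above can be sketched as follows for a real symmetric matrix in an orthonormal basis. This is a minimal illustration under stated assumptions, not the thesis program: the names `sdnr` and `residual`, the fixed number of sweeps and the 4 x 4 test matrix are all hypothetical, and each correction is $\delta f_{\sigma r} = D^{(1)}_{\sigma r}/[(\hat H_A^{(1)})_{rr} - (\hat H_B^{(1)\dagger})_{\sigma\sigma}]$ with $\hat H_A^{(1)}$ and the diagonal of $\hat H_B^{(1)\dagger} = H_{BB} - fH_{AB}$ updated after every single element.

```python
def sdnr(H, n_a, sweeps=50):
    """Diagonal Newton-Raphson sweeps for D(f) = H_BA + H_BB f - f (H_AA + H_AB f) = 0.

    H is a real symmetric matrix (list of lists); the first n_a basis
    functions span S_A.  Returns f as an n_b x n_a list of lists.
    """
    n_b = len(H) - n_a
    f = [[0.0] * n_a for _ in range(n_b)]
    # h_a = H_AA + H_AB f and hb_diag = diag(H_BB - f H_AB), kept current.
    h_a = [[H[s][r] for r in range(n_a)] for s in range(n_a)]
    hb_diag = [H[n_a + p][n_a + p] for p in range(n_b)]
    for _ in range(sweeps):
        for p in range(n_b):              # Greek index sigma
            for r in range(n_a):          # Roman index r
                d = H[n_a + p][r]
                d += sum(H[n_a + p][n_a + q] * f[q][r] for q in range(n_b))
                d -= sum(f[p][t] * h_a[t][r] for t in range(n_a))
                df = d / (h_a[r][r] - hb_diag[p])
                # Continual updating, one element of f at a time.
                for s in range(n_a):
                    h_a[s][r] += H[s][n_a + p] * df
                hb_diag[p] -= df * H[r][n_a + p]
                f[p][r] += df
    return f

def residual(H, n_a, f):
    """Largest |D_{sigma r}| recomputed from scratch for the given f."""
    n_b = len(H) - n_a
    h_a = [[H[s][r] + sum(H[s][n_a + q] * f[q][r] for q in range(n_b))
            for r in range(n_a)] for s in range(n_a)]
    worst = 0.0
    for p in range(n_b):
        for r in range(n_a):
            d = H[n_a + p][r]
            d += sum(H[n_a + p][n_a + q] * f[q][r] for q in range(n_b))
            d -= sum(f[p][t] * h_a[t][r] for t in range(n_a))
            worst = max(worst, abs(d))
    return worst

# Assumed test matrix: well separated diagonal blocks, small couplings.
H = [[1.0, 0.1, 0.1, 0.1],
     [0.1, 2.0, 0.1, 0.1],
     [0.1, 0.1, 5.0, 0.1],
     [0.1, 0.1, 0.1, 6.0]]
f = sdnr(H, 2)
print(residual(H, 2, f))
```

The test matrix is chosen so that the off-diagonal elements are small compared with the separations between the diagonal elements of the two blocks, which is the regime in which a condition of the type (A5.15) is comfortably met.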
While the condition (A5.15) is too crude to be of any practical use, it does indicate that the rate of convergence is related to the relative magnitudes of the differences between diagonal elements in $\hat H_A^{(1)}$ and $\hat H_B^{(1)\dagger}$, and of their off-diagonal elements. Convergence requires only that the errors in the elements of $f$ decrease over a number of iterations, rather than that the errors in each element of $f$ decrease in every iteration. The results of test calculations in Table 5.1 show that good rates of convergence occur when (A5.15) is violated substantially for some elements of $f$ (as for the example with $n = 250$). A more detailed error analysis indicates that a crucial factor for convergence is the calculation of $\delta f$ one element at a time, with continual updating of $\hat H_A^{(1)}$ and $\hat H_B^{(1)\dagger}$ (and implicitly, of $D^{(1)}$). If these quantities are updated only after a complete sweep through $\delta f$, convergence occurs only for small $n_A$ and $n_B$, and is very slow, at best.

For the generalized Nesbet algorithms, based on the equation $D^{(2)}(f) = 0$, the same sort of result is obtained. From eq. (5.25), one has
$$\left.\frac{\partial \Phi_{\rho t}^{DGN}}{\partial f_{\sigma r}}\right|_{f_{\mathrm{exact}}} = \delta_{\rho\sigma}\delta_{tr} - \left.\frac{J_{\rho t,\sigma r}^{(2)}}{H_{\rho\rho} - (\hat H_A^{(2)})_{tt}}\right|_{f_{\mathrm{exact}}} , \tag{A5.16}$$
which does not vanish in general, and thus $\Phi^{DGN}$ is linearly convergent in general, with
$$\epsilon_{\rho t}^{(m+1)} = \frac{\sum_{\sigma\ne\rho}H_{\rho\sigma}\,\epsilon_{\sigma t}^{(m)} - \sum_{r\ne t}(\hat H_A^{(2)})_{rt}\,\epsilon_{\rho r}^{(m)}}{(\hat H_A^{(2)})_{tt} - H_{\rho\rho}} + O(\epsilon^2) . \tag{A5.17}$$
For $n_A = 1$, one has
$$\epsilon_{\rho}^{(m+1)} = \frac{\sum_{\sigma\ne\rho}H_{\rho\sigma}\,\epsilon_{\sigma}^{(m)}}{(\hat H_A^{(2)})_{11} - H_{\rho\rho}} + O(\epsilon^2) . \tag{A5.18}$$
Therefore, algorithm DGN is second order convergent when $n_A = 1$ only if $H_{BB}$ is diagonal.

For the algorithm FGN, the operator $\mathcal{K}$ is the inverse of the diagonal block part of $J^{(2)}$ defined in eq. (5.21), each such diagonal block corresponding to a row of $f$. The algorithm is linearly convergent, since,
$$\frac{\partial \Phi_{\rho t}^{FGN}}{\partial f_{\sigma r}} = \delta_{\rho\sigma}\delta_{rt} - \sum_{s=1}^{n_A}\mathcal{K}_{\rho t,\rho s}^{FGN}\,J_{\rho s,\sigma r}^{(2)} , \tag{A5.19a}$$
with
$$\mathcal{K}_{\rho t,\sigma r}^{FGN} = \delta_{\rho\sigma}\Big\{\big[H_{\sigma\sigma}1_A - \hat H_A^{(2)}\big]^{-1}\Big\}_{rt} , \tag{A5.19b}$$
does not vanish in
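general at $f_{\mathrm{exact}}$.

The generalized Nesbet algorithms for use with a non-orthonormal basis give similar results. From section 5.3.c, it is seen that
$$\mathcal{K}_{\rho t,\sigma r}^{DGNS} = \delta_{\rho\sigma}\delta_{tr}\big[H_{\sigma\sigma} - S_{\sigma\sigma}(\hat H_A^{(2)})_{rr}\big]^{-1} , \tag{A5.20}$$
and
$$\mathcal{K}_{\rho t,\sigma r}^{FGNS} = \delta_{\rho\sigma}\Big\{\big[H_{\sigma\sigma}1_A - S_{\sigma\sigma}\hat H_A^{(2)}\big]^{-1}\Big\}_{rt} . \tag{A5.21}$$
Therefore, one has,
$$\left.\frac{\partial \Phi_{\rho t}^{DGNS}}{\partial f_{\sigma r}}\right|_{f_{\mathrm{exact}}} = \delta_{\rho\sigma}\delta_{rt} + \left.\frac{J_{\rho t,\sigma r}^{(2)}}{S_{\rho\rho}(\hat H_A^{(2)})_{tt} - H_{\rho\rho}}\right|_{f_{\mathrm{exact}}} , \tag{A5.22}$$
which does not vanish in general for any values of $\rho t$ and $\sigma r$. Thus, for algorithm DGNS, one has,
$$\epsilon_{\rho t}^{(m+1)} = \sum_{\sigma,r}\left[\delta_{\rho\sigma}\delta_{rt} - \frac{J_{\rho t,\sigma r}^{(2)}}{H_{\rho\rho} - S_{\rho\rho}(\hat H_A^{(2)})_{tt}}\right]_{f_{\mathrm{exact}}}\epsilon_{\sigma r}^{(m)} + O(\epsilon^2) . \tag{A5.23}$$
It is seen from eq. (5.52), defining $J^{(2)}$, that unless $n_A = 1$, the coefficient of $\epsilon_{\rho t}^{(m)}$ on the right hand side here is not zero, although it is likely very small. The expression for the error in $\Phi^{FGNS}$ is of the same form as (A5.19a) with (A5.21) substituted for (A5.19b).

The error analysis for algorithms SDNRS and QDNRS requires an extension of the procedures used above. The iteration formula must now be written as the pair of equations
$$\begin{pmatrix}\Phi_f(f,h)\\ \Phi_h(f,h)\end{pmatrix} = \begin{pmatrix}f\\ h\end{pmatrix} - \begin{pmatrix}\mathcal{K}^{(11)}(f,h) & \mathcal{K}^{(12)}(f,h)\\ \mathcal{K}^{(21)}(f,h) & \mathcal{K}^{(22)}(f,h)\end{pmatrix}\begin{pmatrix}G_{BA}(f,h)\\ g_{BA}(f,h)\end{pmatrix} . \tag{A5.24}$$
Comparison to equation (5.44) yields the result,
$$\mathcal{K}_{\rho t,\sigma r}^{(11)} = \delta_{\rho\sigma}\delta_{tr}\frac{(\tilde S_A)_{tt}}{\Delta_{\rho t}} , \qquad \mathcal{K}_{\rho t,\sigma r}^{(12)} = -\delta_{\rho\sigma}\delta_{tr}\frac{(\tilde H_A)_{tt}}{\Delta_{\rho t}} ,$$
$$\mathcal{K}_{\rho t,\sigma r}^{(21)} = -\delta_{\rho\sigma}\delta_{tr}\frac{(\tilde S_B^\dagger)_{\rho\rho}}{\Delta_{\rho t}} , \qquad \mathcal{K}_{\rho t,\sigma r}^{(22)} = \delta_{\rho\sigma}\delta_{tr}\frac{(\tilde H_B^\dagger)_{\rho\rho}}{\Delta_{\rho t}} , \tag{A5.25}$$
where $\Delta_{\rho t}$ is defined in eq. (5.45). An expression for the errors in the iteration formula (A5.24) must now be obtained from a Taylor series in the elements of both $f$ and $h$, which yields the result,
$$(\epsilon_f^{(m+1)})_{\rho t} = \sum_{\sigma,r}\left[\frac{\partial (\Phi_f)_{\rho t}}{\partial f_{\sigma r}}(\epsilon_f^{(m)})_{\sigma r} + \frac{\partial (\Phi_f)_{\rho t}}{\partial h_{r\sigma}}(\epsilon_h^{(m)})_{r\sigma}\right]$$
$$+ \frac{1}{2}\sum_{\sigma,r}\sum_{\sigma',r'}\left[\frac{\partial^2 (\Phi_f)_{\rho t}}{\partial f_{\sigma r}\,\partial f_{\sigma' r'}}(\epsilon_f^{(m)})_{\sigma r}(\epsilon_f^{(m)})_{\sigma' r'} + 2\frac{\partial^2 (\Phi_f)_{\rho t}}{\partial f_{\sigma r}\,\partial h_{r'\sigma'}}(\epsilon_f^{(m)})_{\sigma r}(\epsilon_h^{(m)})_{r'\sigma'} + \frac{\partial^2 (\Phi_f)_{\rho t}}{\partial h_{r\sigma}\,\partial h_{r'\sigma'}}(\epsilon_h^{(m)})_{r\sigma}(\epsilon_h^{(m)})_{r'\sigma'}\right] + O(\epsilon^3) , \tag{A5.26}$$
for $\Phi_f$, with all derivatives evaluated at $f_{\mathrm{exact}}$ and $h_{\mathrm{exact}}$. A similar expansion can be written for $(\epsilon_h^{(m+1)})$.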
Substitution of (A5.25) and (A5.24), then gives
$$(\epsilon_f^{(m+1)})_{\rho t} = \frac{1}{\Delta_{\rho t}}\Big\{\sum_{\sigma\ne\rho}\big[(\tilde H_A)_{tt}(\tilde S_B^\dagger)_{\rho\sigma} - (\tilde S_A)_{tt}(\tilde H_B^\dagger)_{\rho\sigma}\big](\epsilon_f^{(m)})_{\sigma t} + \sum_{r\ne t}\big[(\tilde H_A)_{tt}(\tilde S_A)_{rt} - (\tilde S_A)_{tt}(\tilde H_A)_{rt}\big](\epsilon_h^{(m)})_{r\rho}\Big\} + O(\epsilon^2) , \tag{A5.27}$$
and
$$(\epsilon_h^{(m+1)})_{t\rho} = \frac{1}{\Delta_{\rho t}}\Big\{\sum_{\sigma\ne\rho}\big[(\tilde S_B^\dagger)_{\rho\rho}(\tilde H_B^\dagger)_{\rho\sigma} - (\tilde H_B^\dagger)_{\rho\rho}(\tilde S_B^\dagger)_{\rho\sigma}\big](\epsilon_f^{(m)})_{\sigma t} + \sum_{r\ne t}\big[(\tilde S_B^\dagger)_{\rho\rho}(\tilde H_A)_{rt} - (\tilde H_B^\dagger)_{\rho\rho}(\tilde S_A)_{rt}\big](\epsilon_h^{(m)})_{r\rho}\Big\} + O(\epsilon^2) . \tag{A5.28}$$
Neither $(\epsilon_f^{(m)})_{\rho t}$ nor $(\epsilon_h^{(m)})_{t\rho}$ occur in the first order term of either of these equations.

In all the first order error estimates derived in this Appendix, the denominator of the error estimate is seen to be identical to the denominator in the iteration formula. Thus, if this denominator becomes small, not only does $\delta f$ (or $\delta f$ and $\delta h$) become large, but so do the errors in $f$ (or $f$ and $h$). Also, it is seen that these error estimates all involve off-diagonal elements of $H$ (or $H$ and $S$), and therefore, improved convergence is expected in all algorithms if these matrices are made more diagonal.

APPENDIX 6

Algorithms for the Determination of T - Multiple Partitioning Case

The purpose of this appendix is to outline, in some detail, algorithms for solving eqs. (5.74) - (5.77) for the matrix elements of the off-diagonal blocks of the uncoupling operator $T$ in an m x m partitioning.

A6.1 Methods Based on $D_{JI}^{(1)}(T) = 0$.

If the $f_{JI}^0$, $(I, J = 1, \ldots, m,\ I \ne J)$, are approximate solutions to any of the defining conditions (5.74) - (5.77), and the exact solutions are given by $f_{JI} = f_{JI}^0 + \delta f_{JI}$, then, from the equations $D_{JI}^{(1)}(T) = 0$, it is seen that the exact corrections $\delta f_{JI}$ to the $f_{JI}^0$ are given by
$$\sum_{K\ne I}\tilde H_{JK}(T^0)\,\delta f_{KI} - \delta f_{JI}\,\hat H_I(T) = -D_{JI}^{(1)}(T^0) , \quad (I, J = 1, \ldots, m,\ I \ne J) , \tag{A6.1}$$
where
$$\tilde H_{JK}(T^0) = H_{JK} - f_{JI}^0 H_{IK} . \tag{A6.2}$$
If the exact effective operators $\hat H_I(T)$, $(I = 1, \ldots, m)$, were known, the linear system (A6.1) could be solved directly for the $\delta f_{JI}$. The Newton-Raphson equations corresponding to the nonlinear system $D_{JI}^{(1)} = 0$, eqs. (5.74), are
$$\sum_{K\ne I}\tilde H_{JK}(T^0)\,\delta f_{KI} - \delta f_{JI}\,\hat H_I^{(1)}(T^0) = -D_{JI}^{(1)}(T^0) , \quad (I, J = 1, \ldots, m,\ I \ne J) , \tag{A6.3}$$
which differ from (A6.1) only in that the approximate effective operators $\hat H_I^{(1)}(T^0)$ appear in (A6.3) in place of the exact operators $\hat H_I(T)$ in (A6.1). Equations (A6.3) are obtained by substituting the Jacobian matrix with elements
$$J_{\sigma_J r_I,\,\rho_K t_L}^{(1)} = \frac{\partial (D_{JI}^{(1)})_{\sigma r}}{\partial (f_{KL})_{\rho t}} = -(\hat H_I^{(1)})_{tr}\,\delta_{\sigma\rho}\,\delta_{KJ}\,\delta_{LI} + (\tilde H_{JK})_{\sigma\rho}\,\delta_{rt}\,\delta_{LI} , \tag{A6.4}$$
into eq. (5.6), and isolating the $JI$ block. The similarity of eqs. (A6.1) - (A6.4) to the corresponding equations for a 2 x 2 partitioning is seen if it is noted that $\tilde H_{BB} = \hat H_B^{(1)\dagger}$ in that case. If solved exactly, eqs. (A6.3) would lead to a second order convergent algorithm. In fact, if the $\hat H_I^{(1)}(T^0)$ are replaced by the $\hat H_I^{(2)}(T^0)$, the resulting iteration formula is nearly third order convergent, just as in the 2 x 2 case. However, the linear system (A6.3) is of dimension
$$\sum_{I}\sum_{J\ne I} n_I n_J ,$$
which can be unacceptably large even when $n = \sum_I n_I$ is itself not unusually large.

There are at least two levels of diagonal approximations possible here. In the diagonal block approximation, only terms involving $\delta f_{JI}$ itself in eq. (A6.3) are retained, leaving
$$\tilde H_{JJ}\,\delta f_{JI} - \delta f_{JI}\,\hat H_I^{(1)} = -D_{JI}^{(1)} , \quad (I, J = 1, \ldots, m,\ I \ne J) . \tag{A6.5}$$
This involves the solution of $m(m-1)$ smaller systems of linear equations in each iteration, of dimensions, respectively, $n_I n_J$, $(I, J = 1, \ldots, m,\ I \ne J)$, a considerable reduction in computation per iterative sweep. These equations become especially useful if each of the $n_I$ is small, that is, if the partitioning divides up the full space into a large number of subspaces of small dimension. It might also be necessary to use (A6.5) if the off-diagonal elements of the $\tilde H_{JJ}$ and the $\hat H_I^{(1)}$ are large. In the 2 x 2 case, eqs.
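(A6.5) are still the full Newton-Raphson equations, however.

The lowest level of diagonal approximation of (A6.3) gives an iteration formula which reduces to that of algorithm SDNR in the 2 x 2 case. It consists of retention of only the individual diagonal elements of the Jacobian matrix (A6.4), which leads to the iteration formula,
$$\delta f_{\sigma_J r_I} = \frac{(D_{JI}^{(1)})_{\sigma r}}{(\hat H_I^{(1)})_{rr} - (\tilde H_{JJ})_{\sigma\sigma}} . \tag{A6.6}$$
Like SDNR, an efficient iterative scheme based on eq. (A6.6) would consist of cycling through the $\delta f_{JI}$ one element at a time, calculating the $(D_{JI}^{(1)})_{\sigma r}$ as required, and storing the $\hat H_I^{(1)}$ and diagonal elements of the $\tilde H_{JJ}$ continuously. Because the $\hat H_I^{(1)}$ and $\tilde H_{JJ}$ are linear in the $f_{JI}$, they are easily updated, according to
$$(\delta \hat H_I^{(1)})_{sr} = (H_{IJ})_{s\sigma}\,\delta f_{\sigma_J r_I} , \quad (s = 1, \ldots, n_I) , \tag{A6.7a}$$
and
$$(\delta \tilde H_{JJ})_{\sigma\sigma} = -(H_{IJ})_{r\sigma}\,\delta f_{\sigma_J r_I} . \tag{A6.7b}$$
A change in $f_{JI}$ affects only $\hat H_I^{(1)}$ and $\tilde H_{JJ}$. If the diagonal elements in different diagonal blocks of $H$ are well separated, and the off-diagonal elements are small compared to these separations, then a reasonable starting approximation is $T = 1_n$, or $f_{JI} = 0$, $(I, J = 1, \ldots, m,\ I \ne J)$. The block columns of $T$ can be determined individually here, in any order, because the effective operators $\hat H_I^{(1)}$, and $\tilde H_{JJ}$, as well as the quantities $D_{JI}^{(1)}$, $(J = 1, \ldots, m,\ J \ne I)$, depend only on the $f_{LI}$, $(L = 1, \ldots, m,\ L \ne I)$.

Substitution of $f_{JI} = f_{JI}^0 + \delta f_{JI}$ into eq. (A6.1) yields the exact equation for the $\delta f_{JI}$,
$$D_{JI}^{(1)}(T^0) + \sum_{K\ne I}\tilde H_{JK}\,\delta f_{KI} - \delta f_{JI}\,\hat H_I^{(1)} - \delta f_{JI}\sum_{K\ne I}H_{IK}\,\delta f_{KI} = 0 , \quad (I, J = 1, \ldots, m,\ I \ne J) . \tag{A6.8}$$
The diagonal block approximation,
$$D_{JI}^{(1)}(T^0) + \tilde H_{JJ}\,\delta f_{JI} - \delta f_{JI}\,\hat H_I^{(1)} - \delta f_{JI}\,H_{IJ}\,\delta f_{JI} = 0 , \tag{A6.9}$$
has a form in $\delta f_{JI}$ like the form of the basic defining condition, eq. (2.16), for a 2 x 2 partitioning. The diagonal elements of this equation give a quadratic iteration formula,
$$(H_{IJ})_{r\sigma}\,\delta f_{\sigma_J r_I}^2 + \big[(\hat H_I^{(1)})_{rr} - (\tilde H_{JJ})_{\sigma\sigma}\big]\,\delta f_{\sigma_J r_I} - (D_{JI}^{(1)})_{\sigma r} = 0 . \tag{A6.10}$$
When $\delta f_{\sigma_J r_I}$ is large, this formula may give improved convergence. The relative increase in cost accompanying the use of (A6.10) in place of (A6.6) depends on the dimension of the problem, but becomes negligible as $n$ becomes large.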
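A quadratic of the type (A6.10) is best solved for its smaller-magnitude root in a form that avoids cancellation when the quadratic coefficient is small. The sketch below is an assumption-laden illustration: the name `quadratic_step` and its argument names are hypothetical, the choice of the positive square root when the linear coefficient vanishes is an arbitrary convention, and the fall-back to the linear step for a negative discriminant is this sketch's own choice rather than anything stated in the text.

```python
import math

def quadratic_step(d, delta, h):
    """Smaller-magnitude root x of  h*x**2 + delta*x - d = 0.

    d     : current element of D^(1)
    delta : difference of effective-operator diagonal elements
    h     : coupling element multiplying x**2
    """
    if h == 0.0 and delta == 0.0:
        return 0.0
    if h == 0.0:
        return d / delta                  # reduces to the linear formula (A6.6)
    if delta == 0.0:
        ratio = d / h
        # Assumed convention: positive root when real, no step otherwise.
        return math.sqrt(ratio) if ratio > 0.0 else 0.0
    disc = delta * delta + 4.0 * h * d
    if disc < 0.0:
        return d / delta                  # assumed fall-back: linear step
    # Cancellation-free form of (-delta + sgn(delta) sqrt(disc)) / (2 h).
    return math.copysign(1.0, delta) * 2.0 * d / (abs(delta) + math.sqrt(disc))

x = quadratic_step(0.3, 2.0, 0.5)
print(x, 0.5 * x * x + 2.0 * x - 0.3)
```

As the coupling element tends to zero the returned value tends smoothly to the linear correction, so the same routine can serve for both (A6.6) and (A6.10).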
The diagonal elements of t h i s equation give a quadratic i t e r a t i o n formula, («I J)r<,^ Jr I +i : ( » i 1 ,)rr - ^ J )a 0 ] 6 V I - < D j i ))or= •• (A6.10) When 6 f i s large, t h i s formula may give improved convergence. °J rI The r e l a t i v e increase i n cost accompanying the use of (A6.10) i n place of (A6.6) depends on; the dimension of the problem, but becomes ne g l i g i b l e as H becomes large. 353. (2) * A6.2 Methods Based omDjj (T) » 0, The elements of the Jacobian matrix i n t h i s case are given by, j(2) »< pg :>or °J ri*PL^K a ( f L K ) p t ^ ^ J ^ ^ r t - ^ ^ ^ t r ^ ^ K I + ( H J L V 6 r t 6 K I n I d(H^ 2^) 8 1 a ( f L l V t where, from (4,89). one obtains, *r<2) ¥ H I i s r . v. r/_-lStx.t s . , -Ut d ( f L l V t J C ( gj H j f ) g ^ 6 r t - ( gJ 1 f ^ ) s < 0 ( Hj ) t r ] 6 L M , (A6.12) For any n^ which are unity, the f i r s t derivatives of the corres- * (2) ponding ' are zero. The r e l a t i v e i h s e n s i t i v i t y of the ef f e c t i v e operators Hj '(T. ),as approximations to the exact A - 4* j ( T ) , to errors i n the off-diagonal blocks of T, *(2) can again be exploited by neglecting the derivatives of Hj ' i n (A6 .11) , This truncation leaves the s i m p l i f i e d Jacobian matrix, J- , with the only nonzero elements given by, 7o 2 i T.. Tt T ' t(H J J ) 0 ( ! > » r t - ( 4 2 ' ) t r 6 < , p ] 6 I | J +(H J L ) 0 ( ) 6 r t , 7(2) ^r u \ * _ru(2)rJ rI ,/»L tI (A6.13) This yi e l d s the Newton-Raphson equations H J J 6 f J I - 4 f J I f i i 2 , t 1 ^ J H J K 6 f K l " - D J I ) > I/J). (A6 . l t ) 354 *(2) In these equations, only the coefficient matrices Hj depend * th * on T, and these only om the I block columnjof T, respectively. Therefore, in principle, eqs. (A6.14) can be used to solve for the f L I , (L=l,...,m, L/l) for each value of I individually, that i s , for a single block column of T at a time. 
If the separations between diagonal elements in different diagonal blocks are not small compared to the off-diagonal elements of H, then diagonal approximations of (A6.14) are useful. The diagonal block approximation, i s H J J 6 f J I - 6 f J I * i 2 ) = " D J 2 ) » ( ^ = 1 . J / l ) , (A6 . 1 5 ) where tne nj(n-nj) dimensional system of linear equations (A6.14) for a given value of I is replaced by (m-1) linear systems of the smaller respective dimensions n^-nj, (.1=1,..,,m, J/I). Equations (A6.15) can s t i l l lead to relatively costly overall iterations unless the products njrij are a l l small. However, they can be approximated further to give generalizations of the algorithms DGN and FGN. "(2) The exact change in Hj ' due to a change i n a set of f K I , (K=l,...,m. K/I), is 6ft(2),gil(new,[ t i ( D ( 2 ) ^ ^ . ^ ^ ( 2 ) , (A6 . 1 6 ) • S WfTT6f T T - E ft T6f T TH< 2 )], where W J I " H J I + L ^ H J L f L I * ( A 6 ' 1 7 ) Iff only a single f j j is changed, eq. (A6.16) reduces to the 355 same form as eq. (5«23)t .««2>.^<—>[.fJ1(»«)+HJJ.fJ1.«JIfii«) (A6.18) + W J I 6 f j r f J I 6 f J l " i 2 ) 3 ' For any change i n f j j , the entire operator H£ ' i s changed, and thus, as f o r a 2 x 2 partitioning,, i t i s most e f f i c i e n t to change groups of elements withim the matrix fJJ before updating H^'. 
The generalization of the algorithm DGN i s ( D ( 2 ) ) 6 V i s (fit*>>" !H ) • (ral»-»ni)' (A6a9) J 1 ( H I ' r r " ( HJJ>oo and when a l l Oj- elements of the c^*1 row of f j j are changed im *(2) t h i s way, the change i n H£ ' i s 6 S ( 2 ) = g - l ( n e w ) [ ( f ( „ e w ) t ) i o ( 6 f j i ) o i f i ( 2 ) + ( w t i ) i o ( 6 f j i ) o l + ( 6 f J I ) I c ( 6 f J I ) o I H < 2 > d ] , (A6.20) The generalization of algorithm FGN i s ( * f J I ) c I - ( D J I ) ) a l t < H J J ) a a 1 I - 5 i 2 ) ^ 1 ' < A 6 - 2 1 ) * (2) and the corresponding change i n Hj ' i s •fii2>-^ l ( nW, ,C(wJI)I.»fa I-(*J1)1(,MelH<2)D» (A6.22) In eqs, (A6.20) - (A6.22), the symbols Io (ol) indicate the o columns (rows) of the I block row (column). 356. A6.3 Methods Based on the Simultaneous Solution of Gjj.(T) a0 and fijj(T)s0. The Newton-Raphson equations a r i s i n g from eqs. (5*76) are the pairs E ( 6 f T T ) f W T T + 2 ( W T T ) f 6 f T T * -G T T, (A6.23a) I and ( 6 f J I ) % 6 f I J + L E i [ ( 6 f L J ) t f L I + ( f L J ) t 6 f L I ] = - g J I , (A6.23b:) L ^ J (I < J = 1, .... m), where the quantities W"LI are defined i n eq. (A6.17)« A l l of these equations must be solved simultaneously. That i s , they cannot be separated e a s i l y into a number of subsets without common va r i a b l e s . The diagonal block approximation to (A6.23) i s the pair f 6 f I J ) t w l I + W J J 6 f J I - ° J I ' (A6.24a) and e f j j + U f j j J ^ - g j j . (A6.24b) Solving the diagonal parts of these equations simultaneously f o r corresponding elements of and ( 6 f J J ) * , gives the i t e r a t i o n formulas < WII>rr " < WjAa and 357. which are s i m i l a r to the i t e r a t i o n formulas f o r algorithm SDNRS, The s i m p l i c i t y of eq,. (A6,24b) makes i t possible to eliminate either or ( G f j j ) * from eq,. (A6.24a), leaving an equation i n terms of only one of these. 
Substitution of ( 6 f I J ) t = - ( g j I • 6f J 3 ;) (A6.26) into (A6,24a), gives the equation - o f J I W I I • wjjfefjj = -G J J • gjjWjj , (A6.27) for 6 f J I # The diagonal part of t h i s equation y i e l d s W j l ) o r - ( Q j I > < n P " ( g j f I l ) g r , (A6.28) ( W I I } r r " ( w j A o and, using t h i s i n eq, (A6.26) gives , * , _ - < G J I * o r ^ g j I W I I ) o r - ( g j I > o r ^ W I I * r r - ( W L ) o o 3 r ° ( W I I * r r " ^ J J ^ o o (A6.29) These formulas amount to addition; of the quantity E ^ g J l * o t ^ W I I * t r t / r to the numerator of (A6.25b) and i t s subtraction from the numera- tor of (A6.25a)„ I f bfjj i s f i r s t eliminated from (A6.24a)„ then the same sort of r e s u l t i s obtained,, but now, the quantity 2 ^\i)aa (S^TT)oy. i s added to the numerator of (A6,25b) and p/o ™ ox pr subtracted from the numerator of (A6,25a), This ambiguity i n i t e r a t i o n formula i s undesirable. Only actual numerical studies can determine whether the i n c l u s i o n of these p o t e n t i a l l y c o s t l y a d d i t i o n a l terms i n the i t e r a t i o n formulas (A6.28) and (A6,29) i s j u s t i f i e d , i n comparison to (A6.25). The implementation of these i t e r a t i o n formulas i s s i m i l a r 358. to the SDNRS algorithm. The quantities WJJ, (I,J=l,...,m) are stored, and updated continuously as the elements of f j ^ are changed. The elements of G J J are calculated from G J I * W J I + L ^ < f L J ) t w L I » ( A 6-3°> and those of g j ^ fronr (5*76b:), as required. A6.4 Methods Based on D ^ ( T ) » 0 . The Newton-Raphson equations corresponding to eqs. (.5••77) are ^ < » W t * 6 W ^ < 6 W t f w + < W t * f i B i > 3 ^ a ) (A6.31) + ^ 2 K ( 6 f L K ) T W L I + ^ ( W L K ) t 6 f L I = - G K I , , (K,I=l,...,m, K / l ) . A l l of the off-diagonal blocks of T occur i n each of these equa- tions i n a complicated manner. 
I f only those terms Involving 6 f K I on the l e f t hand side of (A6.31) are retained, a much simpler approximation! r e s u l t s , WKK 6 fKI ~ 6 fKI D i l ) = - G K I f (K,I=l,...,m, K/I)., (A6.32) Note that D^j = (g G ) I X / Hj. ' unless eqs. (5*77) are s a t i s f i e d . I t has not been determined whether or not an e f f i c i e n t i t e r a t i v e procedure can be based on eqs. (A6.32). The quantities are comparatively c o s t l y to calc u l a t e , and the i t e r a t i v e scheme would apparently require the maintainence of some estimate of gT 1 throughout the c a l c u l a t i o n . 359. APPENDIX 7 Additional Perturbation Series — Orthonormal Basis In section 6.2, several perturbation formulas for the mapping f» and the e f f e c t i v e operators H A, G A, and H A, were given. The purpose there was to give the low order terms of the perturbations series s o l e l y i n terms of the perturbed operator H. Such perturbation formulas for the e f f e c t i v e operators would then have a general s i g n i f i c a n c e , i n that they are not necessarily obtainable only from the p a r t i t i o n i n g form- alism; presented i n chapters 2 and 3*> For example, the formulas f o r H A can also be obtained using a canonical transformation; formalism. The purpose of t h i s Appendix i s to supplement the material i n section; 6*2,. Additional information on the e f f i c i e n t c alcu- l a t i o n , e s p e c i a l l y of high order terms, of the perturbation; ±1 +4 -* series f o r g^ , g A , and H A i s presented. The formulas tabulated i n s e c t i o n 6,2 become too lengthy with increasing order to be of p r a c t i c a l use much beyond t h i r d order. Perturbation series f o r those powers of the metric g A , namely g A t and g^ , can be obtained in; a number of ways. 
In terms of the perturbation series for f, the series for these quantities is obtained by generalization of the familiar power series expansions,

$$g_A^{1/2} = (1_A + f^\dagger f)^{1/2} = \sum_{n=0}^{\infty} g_A^{1/2\,(n)} = 1_A + \tfrac{1}{2} f^\dagger f - \tfrac{1}{8}(f^\dagger f)^2 + \tfrac{1}{16}(f^\dagger f)^3 - \tfrac{5}{128}(f^\dagger f)^4 + \tfrac{7}{256}(f^\dagger f)^5 - \ldots, \tag{A7.1}$$

$$g_A^{-1/2} = (1_A + f^\dagger f)^{-1/2} = \sum_{n=0}^{\infty} g_A^{-1/2\,(n)} = 1_A - \tfrac{1}{2} f^\dagger f + \tfrac{3}{8}(f^\dagger f)^2 - \tfrac{5}{16}(f^\dagger f)^3 + \tfrac{35}{128}(f^\dagger f)^4 - \tfrac{63}{256}(f^\dagger f)^5 + \ldots, \tag{A7.2}$$

and

$$g_A^{-1} = (1_A + f^\dagger f)^{-1} = \sum_{n=0}^{\infty} g_A^{-1\,(n)} = 1_A - f^\dagger f + (f^\dagger f)^2 - (f^\dagger f)^3 + (f^\dagger f)^4 - \ldots. \tag{A7.3}$$

In each of (A7.1), (A7.2), and (A7.3), actual expressions for the $g_A^{1/2\,(n)}$, $g_A^{-1/2\,(n)}$, and $g_A^{-1\,(n)}$ are obtained by substituting eq. (6.8) into each product, and isolating all terms of order n. Tables A7.1, A7.2, and A7.3 contain some low order formulas of this type.

Perturbation formulas based on eqs. (A7.1) - (A7.3) can be generated to high order automatically without great difficulty, but with increasing order, they rapidly become very lengthy, and thus costly to use. For automatic computation of high order terms in each of these series, a more efficient procedure is available. It is possible to obtain a series for any of $g_A^{-1}$, $g_A^{1/2}$, and $g_A^{-1/2}$ in terms of that of $g_A$ and other powers of $g_A$ by expanding identities of the form

$$g_A\, g_A^{-1} = 1_A, \tag{A7.4a}$$
$$(g_A^{-1/2})^2 = g_A^{-1}, \tag{A7.4b}$$
$$(g_A^{1/2})^2 = g_A, \tag{A7.4c}$$
$$g_A^{1/2}\, g_A^{-1/2} = 1_A, \tag{A7.4d}$$
$$g_A^{1/2}\, g_A^{-1}\, g_A^{1/2} = 1_A, \tag{A7.4e}$$
$$g_A^{-1/2}\, g_A\, g_A^{-1/2} = 1_A, \tag{A7.4f}$$

and so on. From (A7.4a), one obtains

$$g_A^{-1\,(n)} = \delta_{n0}\, 1_A - \sum_{j=0}^{n-1} g_A^{-1\,(j)}\, g_A^{(n-j)}, \tag{A7.5}$$

giving $g_A^{-1\,(n)}$ in terms of $g_A^{(j)}$, (j = 1, ..., n), and lower order $g_A^{-1\,(j)}$, (j = 0, ..., n-1). Here, the second term is assumed to make no contribution when (n-1) < 0. Similarly, eq. (A7.4c) yields the expansion

$$g_A^{1/2\,(n)} = \tfrac{1}{2}\, g_A^{(n)} - \tfrac{1}{2} \sum_{j=1}^{n-1} g_A^{1/2\,(j)}\, g_A^{1/2\,(n-j)}, \tag{A7.6}$$

giving $g_A^{1/2\,(n)}$ in terms of $g_A^{(n)}$ and lower order terms in the expansion of $g_A^{1/2}$. Equation (A7.4b) yields a similar result for $g_A^{-1/2\,(n)}$ in terms of $g_A^{-1\,(n)}$ and lower order terms in the series for $g_A^{-1/2}$. From eq.
(A7.4d), one obtains

$$g_A^{-1/2\,(n)} = -\sum_{j=1}^{n} g_A^{-1/2\,(n-j)}\, g_A^{1/2\,(j)}, \qquad n \geq 2, \tag{A7.7a}$$

and

$$g_A^{1/2\,(n)} = -\sum_{j=1}^{n} g_A^{1/2\,(n-j)}\, g_A^{-1/2\,(j)}, \qquad n \geq 2, \tag{A7.7b}$$

which inter-relate the perturbation series for $g_A^{1/2}$ and $g_A^{-1/2}$. Somewhat more complex expressions are obtained from eqs. (A7.4e) and (A7.4f).

The important feature of eqs. (A7.5) - (A7.7), and other similar ones, is that the evaluation of a k-th order quantity generally requires the evaluation of no more than k products of two $n_A \times n_A$ matrices. As k increases, this represents a rapidly decreasing fraction of the computation required if formulas like (A7.1) - (A7.3) are used explicitly. This substantial computational advantage is a result of not having to repeatedly re-evaluate certain often-occurring combinations in (A7.1) - (A7.3). The need to store all lower order terms in these series for use in the calculation of higher order terms may be regarded as a disadvantage, but it is of no consequence if $n_A \ll n_B$. An appropriate combination of eqs. (A7.5) - (A7.7) with eqs. (6.15), together with eqs. (6.12) or (6.13), is certainly the most practical procedure for the calculation of high order terms in the series for $\tilde H_A$.

Equations (A7.5) - (A7.7) can also be used to obtain the terms in the series for $g_A^{\pm 1/2}$ and $g_A^{-1}$ solely in terms of the $g_A^{(j)}$. Such expressions are particularly useful when moderately high order calculations are being done by hand and algebraically, rather than numerically by machine. Tables A7.4, A7.5 and A7.6 contain several low order formulas of this type. Note the simplicity of these formulas relative to those in Tables A7.1 - A7.3. Tables A7.7 and A7.8 contain expressions for low order terms in the series for $\tilde H_A$ in terms of only $g_A$ and $\hat H_A$ or $G_A$. Finally, Table A7.9 contains low order formulas for $G_A$ in which the equations $D^{(n)}(f) = 0$ have been used to eliminate all terms involving $H_{BB}$. These formulas are the same as those derived using eqs.
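(6.14).

TABLE A7.1 $g_A^{1/2\,(n)}$

$$\begin{aligned}
g_A^{1/2\,(0)} &= 1_A \\
g_A^{1/2\,(1)} &= 0 \\
g_A^{1/2\,(2)} &= \tfrac{1}{2} f^{(1)\dagger} f^{(1)} \\
g_A^{1/2\,(3)} &= \tfrac{1}{2}\big(f^{(1)\dagger} f^{(2)} + f^{(2)\dagger} f^{(1)}\big) \\
g_A^{1/2\,(4)} &= \tfrac{1}{2}\big(f^{(1)\dagger} f^{(3)} + f^{(2)\dagger} f^{(2)} + f^{(3)\dagger} f^{(1)}\big) - \tfrac{1}{8}\big(f^{(1)\dagger} f^{(1)}\big)^2 \\
g_A^{1/2\,(5)} &= \tfrac{1}{2}\big(f^{(1)\dagger} f^{(4)} + f^{(2)\dagger} f^{(3)} + f^{(3)\dagger} f^{(2)} + f^{(4)\dagger} f^{(1)}\big) - \tfrac{1}{8}\big[f^{(1)\dagger} f^{(2)} + f^{(2)\dagger} f^{(1)},\ f^{(1)\dagger} f^{(1)}\big]_+ \\
g_A^{1/2\,(6)} &= \tfrac{1}{2}\big(f^{(1)\dagger} f^{(5)} + f^{(2)\dagger} f^{(4)} + f^{(3)\dagger} f^{(3)} + f^{(4)\dagger} f^{(2)} + f^{(5)\dagger} f^{(1)}\big) - \tfrac{1}{8}\big[f^{(1)\dagger} f^{(3)} + f^{(2)\dagger} f^{(2)} + f^{(3)\dagger} f^{(1)},\ f^{(1)\dagger} f^{(1)}\big]_+ \\
&\quad - \tfrac{1}{8}\big(f^{(1)\dagger} f^{(2)} + f^{(2)\dagger} f^{(1)}\big)^2 + \tfrac{1}{16}\big(f^{(1)\dagger} f^{(1)}\big)^3
\end{aligned}$$

TABLE A7.2 $g_A^{-1/2\,(n)}$

$$\begin{aligned}
g_A^{-1/2\,(0)} &= 1_A \\
g_A^{-1/2\,(1)} &= 0 \\
g_A^{-1/2\,(2)} &= -\tfrac{1}{2} f^{(1)\dagger} f^{(1)} \\
g_A^{-1/2\,(3)} &= -\tfrac{1}{2}\big(f^{(1)\dagger} f^{(2)} + f^{(2)\dagger} f^{(1)}\big) \\
g_A^{-1/2\,(4)} &= -\tfrac{1}{2}\big(f^{(1)\dagger} f^{(3)} + f^{(2)\dagger} f^{(2)} + f^{(3)\dagger} f^{(1)}\big) + \tfrac{3}{8}\big(f^{(1)\dagger} f^{(1)}\big)^2 \\
g_A^{-1/2\,(5)} &= -\tfrac{1}{2}\big(f^{(1)\dagger} f^{(4)} + f^{(2)\dagger} f^{(3)} + f^{(3)\dagger} f^{(2)} + f^{(4)\dagger} f^{(1)}\big) + \tfrac{3}{8}\big[f^{(1)\dagger} f^{(2)} + f^{(2)\dagger} f^{(1)},\ f^{(1)\dagger} f^{(1)}\big]_+ \\
g_A^{-1/2\,(6)} &= -\tfrac{1}{2}\big(f^{(1)\dagger} f^{(5)} + f^{(2)\dagger} f^{(4)} + f^{(3)\dagger} f^{(3)} + f^{(4)\dagger} f^{(2)} + f^{(5)\dagger} f^{(1)}\big) + \tfrac{3}{8}\big[f^{(1)\dagger} f^{(3)} + f^{(2)\dagger} f^{(2)} + f^{(3)\dagger} f^{(1)},\ f^{(1)\dagger} f^{(1)}\big]_+ \\
&\quad + \tfrac{3}{8}\big(f^{(2)\dagger} f^{(1)} + f^{(1)\dagger} f^{(2)}\big)^2 - \tfrac{5}{16}\big(f^{(1)\dagger} f^{(1)}\big)^3
\end{aligned}$$

TABLE A7.3 $g_A^{-1\,(n)}$

$$\begin{aligned}
g_A^{-1\,(0)} &= 1_A \\
g_A^{-1\,(1)} &= 0 \\
g_A^{-1\,(2)} &= -f^{(1)\dagger} f^{(1)} \\
g_A^{-1\,(3)} &= -\big(f^{(1)\dagger} f^{(2)} + f^{(2)\dagger} f^{(1)}\big) \\
g_A^{-1\,(4)} &= -\big(f^{(1)\dagger} f^{(3)} + f^{(2)\dagger} f^{(2)} + f^{(3)\dagger} f^{(1)}\big) + \big(f^{(1)\dagger} f^{(1)}\big)^2 \\
g_A^{-1\,(5)} &= -\big(f^{(1)\dagger} f^{(4)} + f^{(2)\dagger} f^{(3)} + f^{(3)\dagger} f^{(2)} + f^{(4)\dagger} f^{(1)}\big) + \big[f^{(2)\dagger} f^{(1)} + f^{(1)\dagger} f^{(2)},\ f^{(1)\dagger} f^{(1)}\big]_+ \\
g_A^{-1\,(6)} &= -\big(f^{(1)\dagger} f^{(5)} + f^{(2)\dagger} f^{(4)} + f^{(3)\dagger} f^{(3)} + f^{(4)\dagger} f^{(2)} + f^{(5)\dagger} f^{(1)}\big) + \big[f^{(1)\dagger} f^{(3)} + f^{(2)\dagger} f^{(2)} + f^{(3)\dagger} f^{(1)},\ f^{(1)\dagger} f^{(1)}\big]_+ \\
&\quad + \big(f^{(1)\dagger} f^{(2)} + f^{(2)\dagger} f^{(1)}\big)^2 - \big(f^{(1)\dagger} f^{(1)}\big)^3
\end{aligned}$$

TABLE A7.4 $g_A^{1/2\,(n)}$

$$\begin{aligned}
g_A^{1/2\,(0)} &= 1_A \\
g_A^{1/2\,(1)} &= 0 \\
g_A^{1/2\,(2)} &= \tfrac{1}{2}\, g_A^{(2)}
\end{aligned}$$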
4 ( 3 ) gA s *43) ff 4(4) gA s - M2)2 m 4(5) s *45) - M42). gp*i+ g * ( 6 ) s *46) - toi*'. ff<2)> . 1_(3) gA 5+ H gA Tô A .-id) gA TABLE A 7 . 5 g ^ ( n ) • o TABLE A7.6 g j 1 * 1 * * - - K 0 ) gA • *A .-id) gA • 0 % 1 ( 2 ) • 4 2 ) 2 € 1 ( 5 ) = - g ( 5 ) gA •J42). % 1 ( 6 ) gA + [42).. gP̂ 5+ v3 367 TABLE A7.7 H*Ani) im Terms of the g A n ) and n| r e ) H A ~ H A sp } • *P> Hp> = fi|3) • i [ g p > . . ftA0)]_ • *G42). i i i 1 ' ] . sp' * ap * *[gp. fi|0)]- • «4 3 ). si 1 ' ] . + *c42). "PL 1J2) 2&(0) + 3ft(0) J 2 ) 2 . 1J2)S(0) (2) "S^ A H A * S H A G A 5 % H A G A H p ' . H p ) + i [ g p > .Ap> ]_+4[gP) . H p > ]_H[gp > . H p > ] .H[ g p > . H p > ] . sp - sP'ncgP',ai°>].+*[gp).ap>].H[gP».sp>]_H[gp>.ap']. - k P ) 2 H P ) ^ 0 ) g P , 2 ^ 4 2 ) 3 H p ) - # P ) g p ) 3 4 f 4 2 ^ 4 3 1 X 1 )^P , f 4 2 ) 4 3 ) K-| sP ) 26P )^P ) gP ) 2 I G A " A G A 5 G A N A G A To^A N A G A 15 G A " A G A l^(3 ) i(0)_(3)^(3 ) i ( l),(2) 1^(2)J(1).(3) l.(2)S(2) (2) T ^ A H A G A T ^ A H A G A ^ A H A G A T f G A H A G A 368 TABLE A7..8 K["̂ im Terms of the g j " ^ and ~4oy - * i 0 ) Si2> - o«> - *fc«>. oi<»} + « i 3 ) • ° i 3 ) - »fci 3 ) . ° i 0 ) } + - *(42>. 'Ph - * l s i 2 ) . 42)U • k i 2 M 0 ) 4 2 ) R<5) . op) +/ Oi0).. 445) t i f4 2)43)j J + + f G(l),. i g(4) + i g( 2 ) 2 } + -i [ a i 2 > . , g p ) K . i { ( J p ) „ g ( 2 ) j + « i 6 ) • ^ ) - * l 4 6 > . 4 0 ) } + - ^ g p>,Gii ) } + - 4 b f > . o f ) u -*f r i^ .o i 3 ) } + -*{ g i 2 , . a f)K4{ai<», . [ g f),^)} + U 4H 0 ^4 3 , 2U^i 0 \4 2 , 3U4fci 1 , . t4 2 ) .4 3 )! +U 4 K ( 3) G ( 1) J 2 ) 4 S ( 2 ) E ( 1).( 3) 3 Jt2 ) 2 ( 0) (2) ^ A "A *A T? SA "A S A I6 GA ?A S A 3,(2)0(0) ( 2 ) 2 1 (2) (2) (2) T ^ A U A ^ Tiek U A G A 369. 
TABLE A7»9 G A n ) r(0) _ H ( o ) GA ~ AA UA ~ AA Q(2) = H ( 2 ) + H ( l ) f ( l ) + f ( l ) t f ( l ) H ] ( 0 ) A AA AA °A 3 ) • K i 2 > + K « * ( 1 ) + , t A B ) ' t 2 > + * < 1 ) t ' ( 1 ) H A i ) + ( f ( l ) t f ( 2 ) + f ; ( 2 ) t f ( l ) ) H ( 0 ) /v/v °A° " ^ ) * H A | ) f ( i ) + ^ ) f ( 2 > + f C i ) f f ( l ) K U ) + ( f P ( D t f ( 3 ) + f ( 2 ) t f ( 2 ) + f ( 3 ) t f ( l ) ) H ( 0 ) AA 370. APPENDIX 8 Non- r e l a t i v l s t i c : Approximations of the D i r a c Hamiltonian The purpose of t h i s appendix i s to l i s t expressions f o r the various e f f e c t i v e operators dealt with i n section 6.3a, ; to order (v/c)^, f o r comparison;with expressions reported by DeVries (1970). In order to f a c i l i t a t e t h i s , the terms im the Dirac hamiltonian w i l l be written i n the symbolic form H Co) . m 0 0 a* 0 -m a 0 H (2) _ 0 0 0 .0 (A8.1) Here a l l the natural constants have been dropped except f o r the mass m, which i s useful i n comparing the formulas below to those i n sectiom 6.3A» In t h i s notation, the reduced resolvent i s (A8.2) Formulas of the type given i n Table 6.6 or 6.7 give the following f i r s t s i x terms i n the series f o r f,, 2m (2) t: = o, f (3) = - l . [ 0 a - a0] ?̂ aafa,, 4mi 8mf' = 0, f<5) = JL 8m- (A8.3) ^[02a+a02-20a0] + "-~r[-20aata+2aata0+a0ata-aat0a] J l6m> * a ( a f a ) 2 „ 6m? 371* and f<6> = 0, Only odd order terms are nonvanishing i n the expansions for f• Formulas of the type given i n Table 6.8 can now be used to obtain, H<3> = 0, (A8.4) A HA -Va f0a - afa0) - - M a V 2 , kmf 8m H A = 0„ - (at02a+ata02-2at0a0) A 8mJ + ~^(-2at0aa+a-ataat0a+2(ata)20+ata0ata)+—^-?(afa) I6nr* 16m3 As expected from the structure of the perturbation, only even order terms are non-vanishing i n t h i s expansion. I t i s seen that the fourth and s i x t h order terms here are e x p l i c i t l y non- herraitianv Comaprison of eqs. (A8.4) with the formulas i n Table A8.1, obtained using the Paul! 
elimination method, indicates that both sets are identical.

For calculations of $G_A$ and $\tilde H_A$ up to sixth order, formulas of the type given in Tables 6.10 and 6.11 are cumbersome. It is preferable to calculate the series for $g_A$, $g_A^{1/2}$ and $g_A^{-1/2}$, and to use eqs. (6.14) and (6.15b) or (6.15c), respectively. The perturbation series for these metric quantities are given in Tables A8.2 through A8.4. Again, it is seen that they contain only even order terms since they are defined in only a single subspace, $S_A$. Equation (6.14) yields,

$$\begin{aligned}
G_A^{(0)} &= m, \qquad G_A^{(1)} = 0, \\
G_A^{(2)} &= \phi + \frac{3}{4m}\, a^\dagger a, \qquad G_A^{(3)} = 0, \\
G_A^{(4)} &= \frac{1}{8m^2}\big(4 a^\dagger \phi a - \phi a^\dagger a - a^\dagger a \phi\big) - \frac{1}{8m^3}(a^\dagger a)^2, \qquad G_A^{(5)} = 0, \\
G_A^{(6)} &= \frac{1}{16m^3}\big(5 a^\dagger \phi^2 a - 3 a^\dagger \phi a \phi - 3 \phi a^\dagger \phi a + a^\dagger a \phi^2 + \phi^2 a^\dagger a - \phi a^\dagger a \phi\big) \\
&\quad + \frac{1}{32m^4}\big(-4 a^\dagger \phi a a^\dagger a - 4 a^\dagger a a^\dagger \phi a + (a^\dagger a)^2 \phi + \phi (a^\dagger a)^2 + 2 a^\dagger a \phi a^\dagger a\big) + \frac{3}{64m^5}(a^\dagger a)^3 .
\end{aligned} \tag{A8.5}$$

Equation (6.15) yields,

$$\begin{aligned}
\tilde H_A^{(0)} &= m, \qquad \tilde H_A^{(1)} = 0, \\
\tilde H_A^{(2)} &= \phi + \frac{1}{2m}\, a^\dagger a, \qquad \tilde H_A^{(3)} = 0, \\
\tilde H_A^{(4)} &= \frac{1}{8m^2}\big(2 a^\dagger \phi a - a^\dagger a \phi - \phi a^\dagger a\big) - \frac{1}{8m^3}(a^\dagger a)^2, \qquad \tilde H_A^{(5)} = 0, \\
\tilde H_A^{(6)} &= \frac{1}{16m^3}\big(2 a^\dagger \phi^2 a - 2 a^\dagger \phi a \phi - 2 \phi a^\dagger \phi a + a^\dagger a \phi^2 + \phi^2 a^\dagger a\big) \\
&\quad + \frac{1}{128m^4}\big(-12 a^\dagger \phi a a^\dagger a - 12 a^\dagger a a^\dagger \phi a + 7 (a^\dagger a)^2 \phi + 7 \phi (a^\dagger a)^2 + 10 a^\dagger a \phi a^\dagger a\big) + \frac{1}{16m^5}(a^\dagger a)^3 .
\end{aligned} \tag{A8.6}$$

Both of these operators are manifestly self-adjoint. Comparison of eqs. (A8.6) with the formulas in Table A8.5 obtained using Eriksen's method (Eriksen, 1958) indicates that the Eriksen hamiltonian is identical to $\tilde H_A$, at least to sixth order. The transformation, V, used by DeVries (1970) to transform the Pauli hamiltonian into the Eriksen hamiltonian,

$$H_{Er} = V\, H_{Pauli}\, V^{-1}, \tag{A8.7}$$

is given in Table A8.6. On comparison of Tables A8.4 and A8.6, the similarity transformation, $V^{-1}$, implied by eq. (A8.7) is seen to be identical, to fourth order, to $g_A^{-1/2}$.
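The odd order terms of f quoted in eq. (A8.3) can be regenerated mechanically, which gives a useful check on the operator ordering in the formulas of this appendix. The following is a sketch under stated assumptions, not the procedure used in the original work: the blocks of the symbolic Dirac hamiltonian are taken as H_AA = m + phi, H_BB = -m + phi, H_AB = a-dagger, H_BA = a, and the defining condition is taken as D(f) = H_BA + H_BB f - f H_AA - f H_AB f = 0. Collecting terms of order n (a counted as first order, phi as second order) then gives 2m f(n) = delta(n,1) a + [phi, f(n-2)] - sum over i + j = n - 1 of f(i) a-dagger f(j). Operators are stored as dictionaries of non-commuting words, and m is set to 1 because every word of length k carries the factor 1/m^k. All names are hypothetical.

```python
from fractions import Fraction

# An operator polynomial is a dict {word: coefficient}; a word is a tuple of
# letters: 'a' for a, 'A' for a-dagger, 'p' for phi (assumed encoding).

def add(p, q, scale=1):
    """Return p + scale*q, dropping terms that cancel exactly."""
    out = dict(p)
    for word, coeff in q.items():
        out[word] = out.get(word, 0) + scale * coeff
        if out[word] == 0:
            del out[word]
    return out

def mul(p, q):
    """Non-commutative product: words are concatenated."""
    out = {}
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            word = w1 + w2
            out[word] = out.get(word, 0) + c1 * c2
            if out[word] == 0:
                del out[word]
    return out

A_OP = {('a',): Fraction(1)}
ADAG = {('A',): Fraction(1)}
PHI = {('p',): Fraction(1)}

def f_series(max_order):
    """Order-by-order solution of the assumed condition D(f) = 0 with m = 1."""
    f = {0: {}}
    for n in range(1, max_order + 1):
        rhs = dict(A_OP) if n == 1 else {}
        if n >= 3:
            # commutator [phi, f(n-2)]
            rhs = add(rhs, mul(PHI, f[n - 2]))
            rhs = add(rhs, mul(f[n - 2], PHI), scale=-1)
        for i in range(1, n - 1):
            j = n - 1 - i
            rhs = add(rhs, mul(mul(f[i], ADAG), f[j]), scale=-1)
        f[n] = {word: coeff / 2 for word, coeff in rhs.items()}
    return f

f = f_series(5)
for order in (1, 3, 5):
    print(order, f[order])
```

Multiplying f(3) and f(5) from the left by a-dagger gives the fourth and sixth order terms of the non-hermitian effective operator of eq. (A8.4) under the same assumptions.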
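TABLE A8.1 Pauli Hamiltonian (adapted from De Vries (1970))

$$\begin{aligned}
H_{Pauli}^{(0)} &= m \\
H_{Pauli}^{(1)} &= 0 \\
H_{Pauli}^{(2)} &= \phi + \frac{1}{2m}\, a^\dagger a \\
H_{Pauli}^{(3)} &= 0 \\
H_{Pauli}^{(4)} &= \frac{1}{4m^2}\big(-a^\dagger a \phi + a^\dagger \phi a\big) - \frac{1}{8m^3}(a^\dagger a)^2 \\
H_{Pauli}^{(5)} &= 0 \\
H_{Pauli}^{(6)} &= \frac{1}{8m^3}\big(a^\dagger a \phi^2 - 2 a^\dagger \phi a \phi + a^\dagger \phi^2 a\big) + \frac{1}{16m^4}\big(2 (a^\dagger a)^2 \phi - a^\dagger a a^\dagger \phi a + a^\dagger a \phi a^\dagger a - 2 a^\dagger \phi a a^\dagger a\big) + \frac{1}{16m^5}(a^\dagger a)^3
\end{aligned}$$

TABLE A8.2 $g_A$ -- Non-relativistic Approximation

$$\begin{aligned}
g_A^{(0)} &= 1_A \\
g_A^{(1)} &= 0 \\
g_A^{(2)} &= \frac{1}{4m^2}\, a^\dagger a \\
g_A^{(3)} &= 0 \\
g_A^{(4)} &= \frac{1}{8m^3}\big(2 a^\dagger \phi a - \phi a^\dagger a - a^\dagger a \phi\big) - \frac{1}{8m^4}(a^\dagger a)^2 \\
g_A^{(5)} &= 0 \\
g_A^{(6)} &= \frac{1}{16m^4}\big(3 a^\dagger \phi^2 a - 3 \phi a^\dagger \phi a - 3 a^\dagger \phi a \phi + \phi^2 a^\dagger a + a^\dagger a \phi^2 + \phi a^\dagger a \phi\big) \\
&\quad + \frac{1}{32m^5}\big(-4 a^\dagger a a^\dagger \phi a - 4 a^\dagger \phi a a^\dagger a + 3 \phi (a^\dagger a)^2 + 3 (a^\dagger a)^2 \phi + 2 a^\dagger a \phi a^\dagger a\big) + \frac{5}{64m^6}(a^\dagger a)^3
\end{aligned}$$

TABLE A8.3 $g_A^{1/2}$ -- Non-relativistic Approximation

$$\begin{aligned}
g_A^{1/2\,(0)} &= 1_A \\
g_A^{1/2\,(1)} &= 0 \\
g_A^{1/2\,(2)} &= \frac{1}{8m^2}\, a^\dagger a \\
g_A^{1/2\,(3)} &= 0 \\
g_A^{1/2\,(4)} &= \frac{1}{16m^3}\big(2 a^\dagger \phi a - \phi a^\dagger a - a^\dagger a \phi\big) - \frac{9}{128m^4}(a^\dagger a)^2 \\
g_A^{1/2\,(5)} &= 0 \\
g_A^{1/2\,(6)} &= \frac{1}{32m^4}\big(3 a^\dagger \phi^2 a - 3 \phi a^\dagger \phi a - 3 a^\dagger \phi a \phi + \phi^2 a^\dagger a + a^\dagger a \phi^2 + \phi a^\dagger a \phi\big) \\
&\quad + \frac{1}{256m^5}\big(-18 a^\dagger a a^\dagger \phi a - 18 a^\dagger \phi a a^\dagger a + 13 \phi (a^\dagger a)^2 + 13 (a^\dagger a)^2 \phi + 10 a^\dagger a \phi a^\dagger a\big) + \frac{49}{1024m^6}(a^\dagger a)^3
\end{aligned}$$

TABLE A8.4 $g_A^{-1/2}$ -- Non-relativistic Approximation

$$\begin{aligned}
g_A^{-1/2\,(0)} &= 1_A \\
g_A^{-1/2\,(1)} &= 0 \\
g_A^{-1/2\,(2)} &= -\frac{1}{8m^2}\, a^\dagger a \\
g_A^{-1/2\,(3)} &= 0 \\
g_A^{-1/2\,(4)} &= \frac{1}{16m^3}\big(-2 a^\dagger \phi a + \phi a^\dagger a + a^\dagger a \phi\big) + \frac{11}{128m^4}(a^\dagger a)^2 \\
g_A^{-1/2\,(5)} &= 0 \\
g_A^{-1/2\,(6)} &= \frac{1}{32m^4}\big(-3 a^\dagger \phi^2 a + 3 \phi a^\dagger \phi a + 3 a^\dagger \phi a \phi - \phi^2 a^\dagger a - a^\dagger a \phi^2 - \phi a^\dagger a \phi\big) \\
&\quad + \frac{1}{256m^5}\big(22 a^\dagger a a^\dagger \phi a + 22 a^\dagger \phi a a^\dagger a - 15 \phi (a^\dagger a)^2 - 15 (a^\dagger a)^2 \phi - 14 a^\dagger a \phi a^\dagger a\big) - \frac{69}{1024m^6}(a^\dagger a)^3
\end{aligned}$$

TABLE A8.5 Eriksen Hamiltonian (adapted from DeVries (1970))

$$\begin{aligned}
H_{Er}^{(0)} &= m \\
H_{Er}^{(1)} &= 0 \\
H_{Er}^{(2)} &= \phi + \frac{1}{2m}\, a^\dagger a \\
H_{Er}^{(3)} &= 0 \\
H_{Er}^{(4)} &= -\frac{1}{8m^2}\big(a^\dagger a \phi - 2 a^\dagger \phi a + \phi a^\dagger a\big) - \frac{1}{8m^3}(a^\dagger a)^2 \\
H_{Er}^{(5)} &= 0 \\
H_{Er}^{(6)} &= \frac{1}{16m^3}\big(a^\dagger a \phi^2 + \phi^2 a^\dagger a - 2 a^\dagger \phi a \phi - 2 \phi a^\dagger \phi a + 2 a^\dagger \phi^2 a\big) \\
&\quad + \frac{1}{128m^4}\big(7 (a^\dagger a)^2 \phi + 7 \phi (a^\dagger a)^2 - 12 a^\dagger a a^\dagger \phi a - 12 a^\dagger \phi a a^\dagger a + 10 a^\dagger a \phi a^\dagger a\big) + \frac{1}{16m^5}(a^\dagger a)^3
\end{aligned}$$

TABLE A8.6 Transformation Connecting $H_{Pauli}$ and $H_{Er}$ (adapted from DeVries (1970))

$$\begin{aligned}
V^{(0)} &= 1_A \\
V^{(1)} &= 0 \\
V^{(2)} &= \frac{1}{8m^2}\, a^\dagger a \\
V^{(3)} &= 0 \\
V^{(4)} &= \frac{1}{16m^3}\big(2 a^\dagger \phi a - a^\dagger a \phi - \phi a^\dagger a\big) - \frac{9}{128m^4}(a^\dagger a)^2
\end{aligned}$$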
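The pure $(a^\dagger a)^k$ coefficients in the metric tables of this appendix can be cross-checked in a commuting (scalar) limit. The sketch below is illustrative only and rests on stated assumptions: phi is set to zero, a and m are treated as the number 1, so that the condition D(f) = 0 is assumed to reduce to 2 f(n) = delta(n,1) - sum over i + j = n - 1 of f(i) f(j), with g = 1 + f^2. The series for the square root and inverse square root of g are then built order by order from recursions of the kind obtained by expanding (g^{1/2})^2 = g and g^{1/2} g^{-1/2} = 1. The list names `g`, `s` and `r` are hypothetical.

```python
from fractions import Fraction

N = 6  # highest perturbation order kept

# Scalar f-series (a = m = 1, phi = 0): only odd orders survive.
f = [Fraction(0)] * (N + 1)
for n in range(1, N + 1):
    rhs = Fraction(1 if n == 1 else 0)
    rhs -= sum((f[i] * f[n - 1 - i] for i in range(1, n - 1)), Fraction(0))
    f[n] = rhs / 2

# Metric g = 1 + f^2, collected order by order.
g = [Fraction(1)] + [
    sum((f[i] * f[n - i] for i in range(1, n)), Fraction(0))
    for n in range(1, N + 1)
]

s = [Fraction(1)] + [Fraction(0)] * N   # series for g^(1/2)
r = [Fraction(1)] + [Fraction(0)] * N   # series for g^(-1/2)
for n in range(1, N + 1):
    # from (g^(1/2))^2 = g
    s[n] = (g[n] - sum((s[j] * s[n - j] for j in range(1, n)), Fraction(0))) / 2
    # from g^(1/2) g^(-1/2) = 1
    r[n] = -sum((s[j] * r[n - j] for j in range(1, n + 1)), Fraction(0))

for name, series in (("g", g), ("g^(1/2)", s), ("g^(-1/2)", r)):
    print(name, [str(c) for c in series])
```

In this limit the even order coefficients are 1/4, -1/8, 5/64 for g, 1/8, -9/128, 49/1024 for its square root, and -1/8, 11/128, -69/1024 for its inverse square root. Each order n requires only of the order of n products of lower order terms, which is the economy claimed for this type of recursion.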
APPENDIX 9 Additional Perturbation Series -- Non-orthonormal Basis

In this appendix, some alternative perturbation formulas, applicable in the case of a non-orthonormal basis, are derived and listed. In particular, the series for the metric $g_A$, and its powers, and two sets of alternative formulas for the operators $\tilde H_A$, are given here.

Formulas for the perturbation series for $g_A$ in terms of f and S are obtained straightforwardly by expanding eq. (2.103a) to obtain

$$g_A = \sum_{n=0}^{\infty} g_A^{(n)},$$

where

$$g_A^{(n)} = S_{AA}^{(n)} + \sum_{j=1}^{n-1} \Big[ S_{AB}^{(n-j)} f^{(j)} + f^{(j)\dagger} S_{BA}^{(n-j)} + \sum_{i=1}^{n-j} f^{(i)\dagger} S_{BB}^{(n-j-i)} f^{(j)} \Big]. \tag{A9.1}$$

Explicit expressions for several low order terms of (A9.1) are given in Table A9.1. It is seen that $g_A$ now contains a nonzero first order term.

Similar explicit expressions could be obtained for the matrices $g_A^{-1}$, $g_A^{1/2}$, and $g_A^{-1/2}$. They rapidly become even more lengthy than those in Table A9.1, and lose their usefulness. However, eqs. (A7.4) - (A7.7) still hold, and can be used here to express the perturbation series for these powers of $g_A$ in terms of the series for $g_A$ itself. Such formulas, given in Tables A9.2 - A9.4, are seen to be very similar to the corresponding formulas in an orthonormal basis, given in Tables A7.4 - A7.6. They are more lengthy generally, because of the presence of the first order term in $g_A$.

Finally, formulas such as those given in Tables A9.2 - A9.4 can again be used to obtain useful alternative formulas for the $\tilde H_A^{(n)}$ in terms of the $g_A^{(n)}$ and either $\hat H_A^{(n)}$ or $G_A^{(n)}$. Low order expressions of this type are given in Tables A9.5 and A9.6.

TABLE A9.1 $g_A^{(n)}$ -- Non-orthonormal Basis

$$\begin{aligned}
g_A^{(0)} &= 1_A \\
g_A^{(1)} &= S_{AA}^{(1)} \\
g_A^{(2)} &= f^{(1)\dagger} f^{(1)} + \big(S_{AB}^{(1)} f^{(1)} + f^{(1)\dagger} S_{BA}^{(1)}\big) + S_{AA}^{(2)}
\end{aligned}$$

$g_A^{(3)}$ =
f ( l > t f ( 2 ) + f ( 2 ) t f ( l ^ (4 ) B f ( l ) t f ( 3 ) + f ( 2 ) t f ( 2 ) + f ( 3 ) t f ( l ) + ( f ( 3 ) t s ( l ) + s ( l ) f ( 3 ) ) A BA + ( f ( 2 ) * S ^ > + S U ) f ( 2 ) ) + f ( 2 ) t s a ) f ( l ) t f ( l ) t s ( l ) f ( 2 ) e ( 5 ) = f(l)tf(t)+f(2)tj(3)tC(3)tf(2)+£('»)tf.(l)+(f('*)ts(l)+s(l)f(t)) A SA AS t ( f ( 3 ) t s U ) + s ( | ) f ( 3 ) ) + f ( 3 ) t s ( l ) f ( l ) + . f C l ) t s ( 3 ) f ( l ) •f<1>*S«>*«?>*(f<2>*si5>*<3>f<2>)*f<2>*S«)f«1> + f ( 2 ) t s ( l ) f ( 2 ) + f ( l ) t s U ) £ ( 2 ) + ( r ( l ) t s ( W + s W f < l ) ) , s ( 5 ) 3 7 9 . TABLE A9 .2 g A ^ ( n ^ — Non-orthonormal Basis - i ( 3 ) = A J 3 ) . i f J D -(2)1 + 1 ( i ) 3 gA t g A " ̂ l g A • gA J+ + T5gA - = J ^ i L d ) . (3)1 J L f C z ) 2 * ! f . ( i ) 2 J 2 ) J gA t g A 8C gA » gA 5+ S gA T o l gA »,gA J + g A i ( 5 ) • * « i 5 4 « i } - n + 4 f r i 2 ) ^ 3 ^ « i ) 2 . « p ^ T5l gA " gA *+^5gA gA gA +T5 gA gA gA . 4 f f f ( D 3 -(2)1 . ^ ( . ( l ) - ( 2 ) - ( l ) 2 ( l ) 2 (2) <lK 1 ( " l 2 ^ l g A » gA i+"I75 ( gA gA gA + g A gA gA J " 2 l 5 g A TABLE A9>3 g A ^ n ) Non-orthonormal Basis gI*(0) • h sii(3) • -43^{41,.42,!--£41)3 ^ w - -^4feP.43){+^P2-^fei1,2.42l ^ ( 5 ) • -45 , A^41^^£« ).43 ,} t^4 l ) 2.43,i+ -^{1)43,41,^^2,2-41,^-^42)41,42) ^41)42,^42)41,.41,2L-^41)5 380, TABLE A9.4 g ^ 1 ^ "• Non-orthonormal Basis -1(0) . . ^ - -42,-41)2 gA 1(3) • -43>-&2).«il,K - 4 *ilW - -^-M 3 ' .^- -42,2-f41,2.43)i+ -41,42)41,*41) g i 1 ( 5 ) • -45,+f^).41,K443,.42)L-f41) .43,L-41,43>41) -(42,2.41,L-42)41)42)441)42)*42)41).41,2L-41) TABLE A9.5 K^cx) — Non-orthonormal Basis H ( 0 ) = H ( 0 ) A AA »i2) ^i 2 , +*c4 2 ,.H^)i +i[g| 1 ).fi| 1 )]. i ( D 2„(o) i . ( i > n < o U i ) + 3 H ( o ) ( i ) 2 ^ g A AA Tf sA AA «A 8 AA SA «i3) • "i3)^c43i41)L+*c42,.5i1,].̂ c41). 42,X441,2«A1) 441)ai1,41)^i1)41,24f4l,.42)}^4"iA,{41,.42)it ^41)2H{r41)^41)<)41)2 381 • TABLE A9>6 Non-orthonormal Basis HA " AA ^ • 41,-*(41). 40,h sp>. 
0p)-*{43).40,L-H42,.41,}+-*{41,.42)K ^{1,41)41,^42)40)41,^{1)40)42) -^i1,240,41,-^1)40)41,2 3 8 2 . APPENDIX 10 Self-Consistent Perturbation Theory When F^°^ Is not Block Diagonal The requirement that the zero order part of the Fock matrix; be at l e a s t block diagonal was imposed i n section 7.4 f o r reasons; of convenience rather than necessity.. The basic changes i n the formalism r e s u l t i n g from a r e l a x a t i o n of that requirement w i l l be summarized here* I f p(0) h a s n o n z e r o off-diagonal blocks, eq. ( 7 « 2 3 ) implies the existence of a zero order term i n the series f o r f, given by the equation D ^ f X - P ^ P * ^ 0.. (A10.1) This equation has a non-zero solut i o n f(°^ i n general,, i f FBA^ ^ °* ^ e c a u s e i * i s J u s - t "*ne defining equation f o r the mapping f(°) corresponding to the non-block diagonal F^ G / >. In the coupled Hartree-Fock perturbation formalism, the n order equation (defining f. ') now becomes D ( i r ) ( f ) = F ^ > + ^ ( P ^ - ^ f ^ L f ^ i p j ^ - J ? ) • I " i 1 f(i)pC«-i-3-) f(J') i=o j=o A B = G B A ( f i n ) ) + G B B ( f i n > ) f ( 0 ) - f ( ° ) G A A ( f { n ) ) - f ( 0 ) G A B ( f i n ) ) f ( 0 ) * 383. • i n , ( ? ; ( n , ) ^ « f i ( B , ' f ( 0 , • f ( 0 , p S , ( f i ( B , , ^ ( 0 ) F ( n ) / « ' ( n ) x f ( 0 ) . n - \ p ( n - j ) f ( j ) - f ( o ) p ( n - j ) x " f AB5 {FA } I j«i 3 2 5 1 1 AA ; i=l j=l A i S * 0, (A10.2) These equations can be written i n the s i m p l i f i e d form D T s n ) ( f ) L > W f o ? - C r s ) = ° ' < A 1 0 ' 3 ) (r=l, ••••»• nfi» s«l, n A ) , but now,. B - A + E A f ( 0 ) - E A f ( 0 ) - E f ( 0 ) A f ( 0 ) H r s o r " A r s r o £ A r ^ r o>s ^ A t s r o V t * t V t ^ w V s + C F i C f ( 0 ) ) V a » . r - ' V ^ ^ r s W <A10^> and c ( n ) . p C j ^ U ) ) ^ ^ -f(°hl«h?kin))fi0) ^ ( F ^ - ^ f ^ L f t ^ P ^ - ^ ) «j~ 1 - NEX "E1 f ( i ) F f n - i - ^ / f ( ^ . (A10.5) i=l j«l A * The operators Pfi , and F A are defined formally i h eqs. (2.66a) and (2.65a),. 
respectively.

The additional complexity of eqs. (A10.3) - (A10.5) over the corresponding equations given in section 7.4 for $F_{BA}^{(0)} = 0$ is easily seen. Nevertheless, there are situations in which it may be desirable to use this formalism. For example, if the calculation is to be carried out in a particular basis (for instance, localized orbitals of some sort), it is probable that the zero order Fock operator is not block diagonal. It may, however, be more efficient in such a case to carry out the calculation in a second basis in which $F^{(0)}$ is at least block diagonal, and then transform the results back to the desired basis. It must be remembered that the presence of a nonzero $f^{(0)}$ invalidates all the perturbation formulas derived in chapter 7, including those for $P_A$ and E.

APPENDIX 11 Minimization Algorithms

Details of two minimization algorithms referred to in Chapter 8 are given here, with particular reference to direct energy minimization calculations for closed shell systems.

A11.1 Method of Conjugate Gradients

The conjugate gradients method is a descent optimization procedure. It can be regarded as a steepest descent algorithm with memory. As is true of any descent method, the value of the object function cannot diverge here if it is bounded from below. However, convergence is not guaranteed in general. As applied to the closed shell case, when the energy is to be minimized with respect to the elements of the operator f, the algorithm is as follows:

1. Initialization -- an initial estimate of the f-operator, leading to an initial estimate of the density matrix, R, is required. An initial estimate of the Fock matrix, F(R), is calculated from this initial density matrix.

2. The energy gradient is calculated, $\nabla_f E = 4\, F^{\tilde e^\dagger \tilde e}_{BA}$, (all quantities real).

3. Given $\nabla_f E$, and the search direction used in the previous iteration, $v_{old}$, the current search direction is calculated as $v = -\nabla_f E + \beta\, v_{old}$, where $\beta = |\nabla_f E|^2 / |\nabla_f E_{old}|^2$. If this is the first iteration (or an iteration numbered a multiple of $n_A n_B$), take $\beta = 0$, that is $v = -\nabla_f E$, which is the steepest descent direction.

4. Minimize $E(f + \lambda v)$ as a function of the single parameter $\lambda$, representing a step length along the current search direction. This is usually done using a cubic interpolation procedure of Davidon (see Garton and Sutcliffe, 1974).

5. Update, $f \rightarrow f + \lambda_{min} v$, and re-evaluate R and F(R). If predetermined convergence criteria have not been satisfied, return to step 2. Otherwise, exit the procedure.

The linear search is the most costly step in the calculation. It is therefore important to use interpolation schemes which do not require a large number of energy evaluations, and which make maximal use of the information available. The cubic interpolation formula will give the exact minimum of a quadratic function, and is therefore quite suitable in direct energy minimization calculations, especially near the energy minimum. In the calculations reported in section 8.2.c, a second interpolation procedure, based on the secant method for solving nonlinear equations, was used. Given values of $dE/d\lambda$ at two points along the search direction, an approximation to the minimizing step length is given by

$$\lambda_{min} = \frac{E'(\lambda_2)\,\lambda_1 - E'(\lambda_1)\,\lambda_2}{E'(\lambda_2) - E'(\lambda_1)}. \tag{A11.1}$$

While this interpolation formula does not make use of all the information available (it uses the energy derivatives, but not the energy itself), it does have the advantage of not requiring that the energy minimum be bracketed by $\lambda_1$ and $\lambda_2$. If $E(\lambda)$ is a quadratic function, $\lambda_{min}$ given by (A11.1) is exact. Since both the cubic interpolation formula and eq. (A11.1) locate the minimum along the search direction only approximately, it is necessary to ensure that $E(f + \lambda_{min} v)$ is indeed less than $E(f)$. If this is not so, then a second interpolation on one of the two subintervals of the original interval must be carried out.

Finally, it should be noted that components of the search direction v on surfaces where E is constant can only enter via the memory term. Therefore, if the calculation is converging (that is, if $|\nabla_f E|$ is decreasing), then $\beta < 1$, and these components are attenuated in succeeding iterations.

A11.2 The Newton-Raphson Method

The application of the Newton-Raphson method to the closed shell self-consistent field calculation involves a different strategy for determining stationary values of the energy, namely, solving for the roots of the system of simultaneous nonlinear equations $F^{\tilde e^\dagger \tilde e}_{BA} = 0$. This method is not a descent method, and does not necessarily converge to an energy minimum. The overall algorithm as applied to the closed shell case can be summarized as follows:

1. Initialization -- same as for the conjugate gradient method.

2. The energy gradient is calculated, $\nabla_f E = 4\, F^{\tilde e^\dagger \tilde e}_{BA}$, (all quantities real).

3. The Jacobian matrix is calculated (the Hessian matrix of the energy), $J_{\sigma r, \tau s} = \partial^2 E / \partial f_{\sigma r}\, \partial f_{\tau s}$.

4. The Newton-Raphson equations, $J\, \delta f = -\nabla_f E$,
are solved for the elements of the correction $\delta f$ to f.

5. The f-operator is updated, $f \rightarrow f + \delta f$, and new estimates of R and F(R) calculated. If the prescribed convergence criteria are satisfied at this point, the calculation is terminated. Otherwise, return to step 2.

The Newton-Raphson algorithm is conceptually simple to implement in the sense that there is no ambiguity present like that associated with the linear search step in the conjugate gradient method. It is second order convergent, one Newton-Raphson iteration being roughly equivalent, in principle, to m conjugate gradient iterations, where m is the number of independent variables in the problem (Daniel, 1965). However, the large amount of computation required per iteration as m becomes large tends to offset the rapid rate of convergence, and the method is generally considered inapplicable to self-consistent field calculations, as outlined above.

APPENDIX 12 Derivatives With Respect to Real and Imaginary Parts of f

Most of the formulas derived in this chapter have been in terms of the elements of f and their complex conjugates. Under some circumstances, it is more useful to rewrite these formulas in terms of the real and imaginary parts of f, denoted here as $f^R$ and $f^I$. If a real basis set is used, it is necessary to have derivatives of the energy only with respect to the real part of f. The formulas for obtaining these derivatives from the previously obtained ones are summarized here. Writing

$$f_{\sigma r} = f^R_{\sigma r} + i f^I_{\sigma r}, \qquad f^*_{\sigma r} = f^R_{\sigma r} - i f^I_{\sigma r},$$

one has

$$\frac{\partial}{\partial f^R_{\sigma r}} = \frac{\partial}{\partial f_{\sigma r}} + \frac{\partial}{\partial f^*_{\sigma r}}, \qquad \frac{\partial}{\partial f^I_{\sigma r}} = i \left( \frac{\partial}{\partial f_{\sigma r}} - \frac{\partial}{\partial f^*_{\sigma r}} \right),$$

and

$$\frac{\partial^2}{\partial f^R_{\sigma r}\, \partial f^R_{\tau s}} = \frac{\partial^2}{\partial f_{\sigma r}\, \partial f_{\tau s}} + \frac{\partial^2}{\partial f_{\sigma r}\, \partial f^*_{\tau s}} + \frac{\partial^2}{\partial f^*_{\sigma r}\, \partial f_{\tau s}} + \frac{\partial^2}{\partial f^*_{\sigma r}\, \partial f^*_{\tau s}},$$
a41 a* — _ i * ,+ 5~ • —5 5 y - af:Laf:L af^af-,, af af , a af JLC j f 8 L 0 or T S or Ts or "s or "s , or T S a2 a2 a2 ^ a2 a2 af!LafJa afrtv.af-.e af* af* af-,o af* af* or T S or Ts or T S or Ts or Ts I t i s worth noting that i f both E and f are r e a l , then aE/df1 vanishes. 391 APPENDIX 13 Covariant and Contravariant Representations -- The General Case An analysis of the metric properties of the non-orthonormal molecular o r b i t a l s defined ini eq. (8.6l) f o r a general multi- p a r t i t i o n i n g , can be carried out i n a manner analogous to that of section 2.1.d. The major formulas only are summarized here. We have «&> - (1 - R ( 1 , S ) j ] t and ~<i) _ R(i) e J I " K J I - (A13.1b) Writing g ( i ) . g ( D t - ( i ) ( ( A 1 3 . 2 a ) one obtains S ( D . mlx g<irVi)tf(i) ci)-1 g J I 1 1 JI J-^J-P1?! g I r - i LI r L I J g I P a l L s l - (R ( 1 ) - SR* 1) 2)^ - g £ > \ U / I ) . and 392, gJK * J K p t i 1 S I PI bPK b J P r P I g I fKI J p p i ^ t l b J P r P I g I X L I £ L I g I rP*I bP*K * (1 - R ^ S - SR ( i' - S R ^ ^ S ) ^ , (J,K / I), ; (A13.2b) demonstrating the non«-orthonormality of the e ^ with respect to the i d e n t i t y i n general. A set of vectors, e/*^, dual to the are given by ZlV " ~fm*' ( M ^ (A13.3) and »II S I P f P I * They are also non-orthonormal with respect to the i d e n t i t y , ; e where ( D t e ( i ) tt g U ) t (A13.4a) and « "E1 S f ( i ) - f ( i ) E S f ( i ) §L1 p Z l LP PI LI p * t S I P f P I = g**'*,, (L / I) , s ( A 1 3 . W , J i ) s ^ f ( i ) t s « i f ( i ) . I n s J P ; , 8 l f p i s p j s j p , f p , i • 3 9 3 . However, these two sets of contragredient vectors can be used to construct metric matrices, with respect to which they are orthonormal. In d e t a i l , s ( i ) t A ( i ) ? ( i ) . 1 ( ( A 1 3 . 5 a ) where the blocks of * e ^ e , ^ * are **** &IM - 6 L M •,5l 1 s i » f n ) f p*i t s p ' i i ' <*••» * and (A13.5b) A ( i ) . 
" J 1 * ( i ) t f ( i ) + 5' o f ( i ) f ( i ) t s O i l * jl± f J I f J I p # p ? = 1 S I P f P I fP*I S P ' I * S i m i l a r l y , ^ D t g j i y i ) . l t ( A 1 3 . 6 a ) where the blocks of B e* 1^ 1** are,. 2 a ) • H r * L 1 M 1 ^ S ^ t h r < S v p i ) ) 4 l ) ^ 1 > , P.P ' a l (L,. M / I ) , ( A 1 3 . 6 b ) 394. fid) « -(D-d)" 1 " J 1 r f ! ( i ) t s ws . f C p j J i ) " 1 ^ L I F L I G I J J ^ J ( F P I S P J M S J P , F P ' l ' G I P , P ' « 1 -"ESS f ^ U 1 * " 1 • f ( i ) K d ) - 2 B ^ T i i ^ n g i I L I g i and A I I " G I L 1 * U P I s p j M S j p , f p , i ' - l g i P , P*=l I t i s seen that g ( i ) / S ( i /, and g}^ f ^XK so that the matrices e ^ and e}x^ are not normal i n th i s general case* where the fixe d basis i s non-orthonormal. The above formulas sim p l i f y greatly i f the fixed basis i s orthonormal. One obtains, ® J K ' = (1 - R ( i ) ) J K , ( K / I ) , *\ ( J f K s l r ...f m + i ) V \ (A13.7) ~ ( i ) . R ( i ) J d « l . .... m ) . E J I " K J I • Then, one has, ~ ( i ) . g(i)tg(i) . ,(i),<i>t . • 2 f i ) t , ( A 1 3 . 8 a ) where = d - R ( i ) ) L M . ( L . M / I ) . * 0 * g i i ) f r (MD. (A13.8b) andi *d> . R d ) g I I K I I S i m i l a r l y , f o r the dual vectors, one obtains, & L L ^ = * L ' ( L = 1, m+1), e ( i> = f ( i ) . » e ( i ) t - § L I XLI &IL » and *1M * ° R ( M ^ L T L,M/I) 9, Then, one has, where and g ( i ) = e ( i ) t e ( i > * e ( i ) e t 1 > t * d lK • S i " • 0 • S I L ' * . ( L / I ) » « I G I