Population Analyses Based on Ionic Partition of Overlap Distributions

by

Yiming Wang

B.Sc. in Physics, Dalian University of Technology, 2011

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

in

The Faculty of Graduate and Postdoctoral Studies

(Chemistry)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

August 2018

© Yiming Wang 2018

Committee Page

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled: Population Analyses Based on Ionic Partition of Overlap Distributions submitted by Yiming Wang in partial fulfillment of the requirements for the degree of Master of Science in Chemistry.

Examining Committee

Research Supervisor: Dr. Yan Alexander Wang, Chemistry

Supervisory Committee Members: Dr. Mark Thachuk, Chemistry; Dr. Gren Patey, Chemistry

Abstract

In this thesis, we propose several new schemes for partitioning atomic partial charges, with the aim of reducing the basis-set dependence and the inaccuracy of the methods previously developed in our group. We analyze all the methods, including Mulliken population analysis, and evaluate them against Natural Population Analysis (NPA) on several groups of systems, which we divide according to their polarity. We find that for more polarized systems, such as fluorine-containing compounds, the Population Analyses Based on Ionic Partition of Overlap Distributions (IPOD) series performs better and produces charges closer to those of NPA. Within a given system, the IPOD series works better for atoms in more polarized bonds than for atoms in non-polarized ones. In addition to the analyses of the separate groups, we plot the correlation between the charges produced by the different methods and the charges generated by NPA. From the plots and the slopes, we conclude that IPOD2d gives the results closest to NPA among all the methods.
Also, to identify the basis set that best represents the results of IPOD2d, we plot the correlation between the charges produced by IPOD2d and by NPA for several basis sets. We find that the 6-31G basis set is the most representative one. Using 6-31G to calculate charges for a given system offers considerable advantages in computational efficiency while still providing reasonable results.

Lay Summary

A frequent topic in quantum chemistry is the determination of the electronic configuration and the electronic charge distribution of a molecule, especially the net charge associated with each atom within a molecule. Because the electronic charge is hard to observe directly by experiment, we need to obtain the charge distribution among the constituent atoms of a molecule from a given wavefunction. The process of carrying out this analysis is known as population analysis. To address the problems of the methods previously developed in our group, such as their strong dependence on the basis set and the inaccuracy of their results, we propose several new schemes for partitioning the electronic charge. We analyze the results calculated with different basis sets, compare them with the results generated by Natural Population Analysis (NPA), and evaluate how well they perform for polarized and less polarized systems.

Preface

In the theory part, the IPOD1 and IPOD2 methods were developed in Dr. Alex Wang's group prior to my arrival; they were originally started by Dr. Wang and Mr. Yakun Chen. The follow-up methods IPOD2b, IPOD2c, and IPOD2d are from Dr. Wang and Mr. Miguel Garcia, a PhD student in our group. I collaborated with Mr. Garcia to implement these ideas in NWChem. The choice of systems and the data analysis were done by myself.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
1 Theory
  1.1 Electronic Density Analysis
  1.2 Gaussian
  1.3 Basis Set
  1.4 Basics for Quantum Chemistry
    1.4.1 Hartree-Fock
    1.4.2 Density Functional Theory
  1.5 Population Analysis
    1.5.1 Mulliken and Löwdin Population Analyses
    1.5.2 Natural Population Analysis (NPA)
    1.5.3 Ionic Partition of Overlap Distribution (IPOD) methods
2 Result and Discussion
  2.1 Fluorine Systems
  2.2 Alcohol Systems
  2.3 Alkene Systems
  2.4 Aromatic Systems
  2.5 Small Inorganic Molecules
  2.6 Correlation between Different Methods with NPA
  2.7 The Best Basis Set within IPOD2d Method
3 Summary
Bibliography

List of Tables

2.1 Average charges for all basis sets and standard deviations for HF and LiF.
2.2 Average charges with all basis sets and standard deviations for NaF and KF.
2.3 Average charges and standard deviations for atom C in CH3OH and C2H5OH for basis sets from 3-21G to 6-31G*.
2.4 Average charges and standard deviations for atom C in C3H7OH and C4H9OH for basis sets from 3-21G to 6-31G*.
2.5 Average charges and standard deviations for atom Carbon (side) in C2H4 and C3H4 for basis sets from 3-21G to 6-31G(2pd, 2p).
2.6 Average charges and standard deviations for atom Carbon (side) in C3H6 and C4H6 for basis sets from 3-21G to 6-31G(2pd, 2p).
2.7 Average charges and standard deviations for atom Carbon (middle) in C2H4 and C3H4 for basis sets from 3-21G to 6-31G(2pd, 2p).
2.8 Average charges and standard deviations for atom C (middle) in C3H6 and C4H6 for basis sets from 3-21G to 6-31G(2pd, 2p).
2.9 Average charges and standard deviations for atom Carbon (side) in Anthracene and Biphenyl for basis sets from 3-21G to 6-31G**.
2.10 Average charges and standard deviations for atom Carbon (side) in Naphthalene and Phenanthrene for basis sets from 3-21G to 6-31G**.
2.11 Average charges and standard deviations for atom Carbon (middle) in Anthracene and Biphenyl for basis sets from 3-21G to 6-31G**.
2.12 Average charges and standard deviations for atom Carbon (middle) in Naphthalene and Phenanthrene for basis sets from 3-21G to 6-31G**.
2.13 Average charges and standard deviations for atom Carbon in CO2 and CO for basis sets from 3-21G to 6-31G**.
2.14 Average charges and standard deviations for atom Nitrogen in HCN and NH3 for basis sets from 3-21G to 6-31G**.
2.15 Average charges and standard deviations for atom Nitrogen in N2O and HNO for basis sets from 3-21G to 6-31G**.
2.16 Average charges and standard deviations for atom Oxygen in H2O and HOOH for basis sets from 3-21G to 6-31G**.

List of Figures

2.1 Partial charges for Fluorine diatomic compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.2 Partial charges for C atom in Alcohol compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.3 Partial charges for O atom in Alcohol compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.4 Partial charges for Carbon-side atom in Alkene compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.5 Partial charges for C-middle atom in Alkene compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.6 Partial charges for C-side atom in Aromatic compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.7 Partial charges for C-middle atom in Aromatic compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.8 Partial charges for atom in small inorganic molecules. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.9 Partial charges for Oxygen atom in small inorganic molecules. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.10 Correlation between different methods with NPA.
2.11 Correlation between different methods with NPA.
2.12 Correlation for different basis sets in IPOD2d with the same basis sets in NPA.

List of Abbreviations

IPOD    Ionic Partition of Overlap Distribution
MPA     Mulliken Population Analysis
LPA     Löwdin Population Analysis
NPA     Natural Population Analysis
AIM     Atoms In Molecules
LCAO    Linear Combination of Atomic Orbitals
STO     Slater-Type Orbital
GTO     Gaussian-Type Orbital
DFT     Density Functional Theory
SCF     Self-Consistent Field
CGF     Contracted Gaussian Functions
PGTO    Primitive Gaussian-Type Orbital
DZ      Double Zeta
DZP     Double Zeta plus Polarization
PT      Perturbation Theory
KE      Kinetic Energy
HF      Hartree-Fock
HK      Hohenberg-Kohn
KS      Kohn-Sham
SD      Slater Determinant
MO      Molecular Orbital
AOs     Atomic Orbitals
NOs     Natural Orbitals
NAOs    Natural Atomic Orbitals
CBPA    Christoffersen-Baker Population Analysis

Acknowledgements

I would like to thank Professor Wang for his patient mentoring and expert advice throughout this project. I would also like to thank Miguel for his extraordinary support during the thesis process. Finally, and by no means least, my thanks go to everyone in my lab.
It was great sharing the laboratory with all of you during the last three years. Thank you for all your encouragement!

Chapter 1

Theory

A frequent topic in quantum chemistry is the determination of the electronic configuration and electronic charge distribution of a molecule, especially the net charge associated with each atom in a polyatomic molecule [1]. While this charge distribution is hard to observe directly by experiment, it is important for giving a chemical interpretation of the wavefunction, which leads to a useful understanding of chemical phenomena [2].

To quantify the notions of atomic charge and orbital population in a satisfactory way, there are a number of ways of analysing a calculation when accurate wavefunctions are available [3].

These analyses mainly fall into two categories:

1. Partition of charge between atoms based on the orbital occupancy.
2. Partition of a physical observable derived from the wavefunction, such as electron density.

1.1 Electronic Density Analysis

Once we have an optimized structure and the molecular wavefunction Ψ, we can calculate the electron density, which is obtained from the squared modulus of the wavefunction. Hence we can directly examine how the electron density is distributed over a molecule.

For partitioning a physical observable derived from the wavefunction, such as the electron density, we have Bader's Atoms In Molecules (AIM) analysis [4]. It relies on properties of the electron density alone, and the information can be obtained from the Laplacian of the electron density [5].

For partitioning the molecular wavefunction by an orbital-based scheme, Ψ is written as a Slater determinant of individual molecular orbitals $\psi_i(\mathbf{r})$ [6], and the electron density ρ of a closed-shell molecule is defined as

$$\rho(\mathbf{r}) = 2\left[|\psi_1(\mathbf{r})|^2 + |\psi_2(\mathbf{r})|^2 + \cdots + |\psi_{N/2}(\mathbf{r})|^2\right], \tag{1.1}$$

where N is the number of electrons and $\mathbf{r}$ is the position vector.

All of the molecular orbitals together contribute to the electron density. Each molecular orbital is expanded in terms of atomic orbital (AO) basis functions (Slater-Type Orbitals or (contracted) Gaussian-Type Orbitals) [1]. In the LCAO (Linear Combination of Atomic Orbitals) approximation, inserting the orbital expansion into ρ turns ρ into a sum of contributions from products of basis functions. If we go further and ask how ρ is distributed among particular atoms, assigning the charge to the atomic centers (nuclei), we carry out a population analysis [7].

1.2 Gaussian

Gaussian is a popular and widely used computational chemistry software package, especially for electronic structure calculations [8]. It has been updated continuously since its first release in 1970 by John Pople. We use Gaussian 09 for our project, although the latest version is Gaussian 16. Gaussian provides a wide range of modelling capabilities, from the prediction of energies, molecular structures and vibrational frequencies to the prediction of reactions in a wide variety of chemical environments [9].

Gaussian also offers various methods for modelling compounds and chemical processes, including Hartree-Fock methods (restricted, unrestricted open-shell), Density Functional Theory (DFT) and Molecular Mechanics. In our project, we mainly use DFT methods to carry out the calculations for charge analysis with different basis sets. DFT and Hartree-Fock are both self-consistent field (SCF) methods.

1.3 Basis Set

Many basis sets are stored internally in Gaussian; the following are the ones we applied in our project:

STO-3G, 3-21G, 4-31G, 6-31G, 6-31G*, 6-31G**, 6-31G(2pd, 2p), 6-31G(3df, 3pd), 6-31++G, 6-31++G*, 6-31++G**.

In the following section, we give the definitions of the different basis sets and illustrate how specific attributes of a basis set influence calculated quantities [10].

The 1s Minimal STO-3G Basis Set

In this section, instead of introducing basis sets for general polyatomic molecule calculations, we describe 1s-type basis functions as a start.
The extension of these concepts to the general case, which includes s-, p- and d-type basis functions, follows the same rules [11].

For 1s functions, two types of basis functions are widely used:

1. The normalized 1s Slater-type function of the form

$$\phi^{\mathrm{SF}}_{1s}(\xi, \mathbf{r}-\mathbf{R}_A) = \left(\frac{\xi^3}{\pi}\right)^{1/2} e^{-\xi|\mathbf{r}-\mathbf{R}_A|}, \tag{1.2}$$

where $\mathbf{R}_A$ is the center and ξ is the Slater orbital exponent.

2. The normalized 1s Gaussian-type function of the form

$$\phi^{\mathrm{GF}}_{1s}(\alpha, \mathbf{r}-\mathbf{R}_A) = \left(\frac{2\alpha}{\pi}\right)^{3/4} e^{-\alpha|\mathbf{r}-\mathbf{R}_A|^2}, \tag{1.3}$$

where α is the Gaussian orbital exponent [12].

The orbital exponents, which are positive numbers, control the width of the orbital. Large values of ξ or α give a tight function, while small values give a diffuse function. Two main factors should be considered when choosing a basis: the efficiency and the accuracy with which it describes the electronic wavefunction. For best efficiency, one wants to use the fewest possible terms in the expansion of the molecular orbital $\psi_i$,

$$\psi_i = \sum_{\mu=1}^{k} C_{\mu i}\,\phi_\mu. \tag{1.4}$$

In this sense, Slater functions have an advantage over Gaussian functions: fewer Slater basis functions than Gaussian basis functions are needed to reach the same level of accuracy. Another consideration is the time needed to evaluate the two-electron integrals. In an SCF calculation, one of the most expensive steps is the calculation of the two-electron integrals, which have the form

$$(\mu_A \nu_B|\lambda_C \sigma_D) = \int d\mathbf{r}_1\, d\mathbf{r}_2\; \phi^{A*}_{\mu}(\mathbf{r}_1)\,\phi^{B}_{\nu}(\mathbf{r}_1)\; r_{12}^{-1}\; \phi^{C*}_{\lambda}(\mathbf{r}_2)\,\phi^{D}_{\sigma}(\mathbf{r}_2). \tag{1.5}$$

The evaluation of these four-center integrals is much more costly with Slater basis functions, which makes Gaussian functions the better choice in this respect: explicit formulas are available for the integrals over Gaussian functions, whereas no comparable formulas exist for Slater functions. The name of the software Gaussian also derives from its use of GTOs.

The reason why these integrals are much easier to calculate with Gaussian basis functions is that the product of two Gaussians on different centers is again a Gaussian, centered between the two atom centers.
Explicitly, the product of two 1s Gaussian functions on centers A and B is a third Gaussian function centered at a point P between the two atoms, that is,

$$\phi^{\mathrm{GF}}_{1s}(\alpha, \mathbf{r}-\mathbf{R}_A)\,\phi^{\mathrm{GF}}_{1s}(\beta, \mathbf{r}-\mathbf{R}_B) = K_{AB}\,\phi^{\mathrm{GF}}_{1s}(p, \mathbf{r}-\mathbf{R}_P), \tag{1.6}$$

where the constant $K_{AB}$ is

$$K_{AB} = \left(\frac{2\alpha\beta}{(\alpha+\beta)\pi}\right)^{3/4} \exp\left[-\frac{\alpha\beta}{\alpha+\beta}\,|\mathbf{R}_A-\mathbf{R}_B|^2\right]. \tag{1.7}$$

The exponent of the new Gaussian centered at $\mathbf{R}_P$ is $p = \alpha + \beta$, and the third center P lies on the line joining the centers A and B,

$$\mathbf{R}_P = \frac{\alpha\mathbf{R}_A + \beta\mathbf{R}_B}{\alpha+\beta}. \tag{1.8}$$

The same rule applies to the four-center integral. For 1s Gaussians, it reduces to a two-center integral,

$$(\mu_A \nu_B|\lambda_C \sigma_D) = K_{AB}K_{CD}\int d\mathbf{r}_1\, d\mathbf{r}_2\; \phi^{\mathrm{GF}}_{1s}(p, \mathbf{r}_1-\mathbf{R}_P)\; r_{12}^{-1}\; \phi^{\mathrm{GF}}_{1s}(q, \mathbf{r}_2-\mathbf{R}_Q), \tag{1.9}$$

where q and $\mathbf{R}_Q$ are defined for the pair C, D in the same way as p and $\mathbf{R}_P$ are for the pair A, B.

Although the two-electron integrals can be evaluated more efficiently with Gaussian functions, Gaussian functions do not have the correct functional behaviour of molecular orbitals, which prevents them from being the optimum basis functions. As a trade-off, we can use basis functions that are fixed linear combinations of primitive Gaussian functions $\phi^{\mathrm{GF}}_p$. These linear combinations, called contracted Gaussian functions (CGF), are given by

$$\phi^{\mathrm{CGF}}_{\mu}(\mathbf{r}-\mathbf{R}_A) = \sum_{p=1}^{L} d_{p\mu}\,\phi^{\mathrm{GF}}_{p}(\alpha_{p\mu}, \mathbf{r}-\mathbf{R}_A), \tag{1.10}$$

where L is the length of the contraction and $d_{p\mu}$ is a contraction coefficient. The pth normalized primitive Gaussian $\phi^{\mathrm{GF}}_p$ in the basis function $\phi^{\mathrm{CGF}}_\mu$ depends on the Gaussian orbital exponent $\alpha_{p\mu}$, which is also called the contraction exponent.

The method consists in choosing the contraction length, contraction coefficients and contraction exponents so as to build a desirable set of basis functions $\phi^{\mathrm{CGF}}_\mu$. When these fixed functions are then used in a molecular wavefunction calculation, in particular an SCF calculation, the contraction coefficients remain unchanged.

With the contracted basis functions $\{\phi^{\mathrm{CGF}}_\mu\}$, the two-electron integrals $(\mu\nu|\lambda\sigma)$ can be evaluated as a sum of easily calculated two-electron integrals over the primitive Gaussian functions.
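The Gaussian product rule of Eqs. (1.6) to (1.8), on which the reduction of the four-center integrals rests, can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the exponents, the two centers and the function names are illustrative choices that do not come from the thesis.

```python
import numpy as np

def gaussian_1s(alpha, r, center):
    """Normalized 1s Gaussian of Eq. (1.3) at the points r (array of shape (n, 3))."""
    dist2 = np.sum((r - center) ** 2, axis=-1)
    return (2.0 * alpha / np.pi) ** 0.75 * np.exp(-alpha * dist2)

def gaussian_product(alpha, center_a, beta, center_b):
    """Return (K_AB, p, R_P) of Eqs. (1.6)-(1.8) for two 1s Gaussians."""
    p = alpha + beta
    center_p = (alpha * center_a + beta * center_b) / p
    sep2 = np.sum((center_a - center_b) ** 2)
    k_ab = (2.0 * alpha * beta / (p * np.pi)) ** 0.75 * np.exp(-alpha * beta / p * sep2)
    return k_ab, p, center_p

# Illustrative (hypothetical) exponents and centers, in atomic units.
alpha, beta = 0.5, 1.3
center_a = np.array([0.0, 0.0, 0.0])
center_b = np.array([0.0, 0.0, 1.4])

# Random sample points at which both sides of Eq. (1.6) are compared.
rng = np.random.default_rng(0)
points = rng.uniform(-3.0, 3.0, size=(1000, 3))

lhs = gaussian_1s(alpha, points, center_a) * gaussian_1s(beta, points, center_b)
k_ab, p, center_p = gaussian_product(alpha, center_a, beta, center_b)
rhs = k_ab * gaussian_1s(p, points, center_p)

max_deviation = float(np.max(np.abs(lhs - rhs)))
print("p =", p, " R_P =", center_p, " max |lhs - rhs| =", max_deviation)
```

The deviation between the two sides should be at the level of floating-point round-off, and the new center lies on the line between A and B, closer to the center with the larger exponent.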
If the contraction parameters are appropriately chosen, the contracted functions can approximate atomic Hartree-Fock functions, Slater-type functions, etc., while the integrals are still evaluated efficiently over primitive Gaussian functions only. In the STO-NG procedure, a linear combination of N primitive Gaussian functions is fitted to a Slater-type orbital (STO). STO-3G basis sets are the most widely used of these in polyatomic calculations, because the integrals can be evaluated rapidly and it has been found that using more than three primitive Gaussian-type orbitals (PGTO) to represent the STO gives little improvement [13]. Let us first consider, as an example, using $\phi^{\mathrm{CGF}}_{1s}$ to approximate a 1s Slater-type function with ξ = 1.0:

$$\phi^{\mathrm{CGF}}_{1s}(\xi = 1.0, \text{STO-1G}) = \phi^{\mathrm{GF}}_{1s}(\alpha_{11}), \tag{1.11}$$

$$\phi^{\mathrm{CGF}}_{1s}(\xi = 1.0, \text{STO-2G}) = d_{12}\,\phi^{\mathrm{GF}}_{1s}(\alpha_{12}) + d_{22}\,\phi^{\mathrm{GF}}_{1s}(\alpha_{22}), \tag{1.12}$$

$$\phi^{\mathrm{CGF}}_{1s}(\xi = 1.0, \text{STO-3G}) = d_{13}\,\phi^{\mathrm{GF}}_{1s}(\alpha_{13}) + d_{23}\,\phi^{\mathrm{GF}}_{1s}(\alpha_{23}) + d_{33}\,\phi^{\mathrm{GF}}_{1s}(\alpha_{33}). \tag{1.13}$$

We only consider contractions of up to three primitives, for which we need to find the coefficients $d_{p\mu}$ and exponents $\alpha_{p\mu}$ that, for fixed ξ = 1.0, give the basis function closest to the Slater-type function. To do this, we can use a least-squares technique to minimize the integral [13]

$$I = \int d\mathbf{r}\,\left[\phi^{\mathrm{SF}}_{1s}(\xi = 1.0, \mathbf{r}) - \phi^{\mathrm{CGF}}_{1s}(\xi = 1.0, \text{STO-}L\text{G}, \mathbf{r})\right]^2. \tag{1.14}$$

Equivalently, we can maximize the overlap between the two functions,

$$S = \int d\mathbf{r}\;\phi^{\mathrm{SF}}_{1s}(\xi = 1.0, \mathbf{r})\,\phi^{\mathrm{CGF}}_{1s}(\xi = 1.0, \text{STO-}L\text{G}, \mathbf{r}), \tag{1.15}$$

provided that both functions are normalized. For STO-1G we only need to find the primitive Gaussian exponent α that maximizes the overlap [11],

$$S = \pi^{-1/2}\left(\frac{2\alpha}{\pi}\right)^{3/4}\int d\mathbf{r}\; e^{-r}\,e^{-\alpha r^2}. \tag{1.16}$$

The overlap is maximized for α = 0.270950. Comparing the two functions, we can observe deficiencies in the behaviour of the Gaussian function near the origin and at large distances. The same procedure can be applied to STO-2G and STO-3G to obtain the optimum fits.
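The maximization of the overlap in Eq. (1.16) can be reproduced with a simple radial quadrature. The sketch below is only illustrative and assumes NumPy: it scans the STO-1G exponent on a grid rather than using a proper optimizer, and it evaluates the overlap of the three-term contraction using the standard STO-3G parameters for ξ = 1.0. The grid limits and helper names are assumptions made for this example.

```python
import numpy as np

# Radial grid in atomic units; the integrands are negligible beyond r = 30.
R = np.linspace(0.0, 30.0, 30001)

def trapezoid(y, x):
    """Composite trapezoidal rule."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def slater_1s(r, xi=1.0):
    """Normalized 1s Slater function, Eq. (1.2), centered at the origin."""
    return np.sqrt(xi**3 / np.pi) * np.exp(-xi * r)

def gaussian_1s(alpha, r):
    """Normalized 1s Gaussian function, Eq. (1.3), centered at the origin."""
    return (2.0 * alpha / np.pi) ** 0.75 * np.exp(-alpha * r**2)

def overlap_with_slater(coefficients, exponents):
    """Overlap S of Eq. (1.15) between the xi = 1.0 Slater function and a contraction."""
    contraction = sum(d * gaussian_1s(a, R) for d, a in zip(coefficients, exponents))
    # Both functions are spherical, so the volume element is 4*pi*r^2 dr.
    return trapezoid(4.0 * np.pi * R**2 * slater_1s(R) * contraction, R)

# STO-1G: scan the single exponent and keep the one with the largest overlap.
alphas = np.linspace(0.20, 0.35, 1501)
overlaps = np.array([overlap_with_slater([1.0], [a]) for a in alphas])
best_alpha = float(alphas[np.argmax(overlaps)])
s_sto1g = float(overlaps.max())

# STO-3G: overlap for the standard xi = 1.0 coefficients and exponents.
s_sto3g = overlap_with_slater(
    [0.444635, 0.535328, 0.154329],
    [0.109818, 0.405771, 2.22766],
)

print(f"best STO-1G exponent on the scan grid: {best_alpha:.4f}")
print(f"overlap with the Slater function: STO-1G {s_sto1g:.4f}, STO-3G {s_sto3g:.4f}")
```

The scan should recover an exponent close to the quoted α = 0.270950, and the three-term contraction overlaps with the Slater function noticeably better than the single Gaussian does.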
The results are as follows:

$$\phi^{\mathrm{CGF}}_{1s}(\xi = 1.0, \text{STO-1G}) = \phi^{\mathrm{GF}}_{1s}(0.270950), \tag{1.17}$$

$$\phi^{\mathrm{CGF}}_{1s}(\xi = 1.0, \text{STO-2G}) = 0.678914\,\phi^{\mathrm{GF}}_{1s}(0.151623) + 0.430129\,\phi^{\mathrm{GF}}_{1s}(0.851819), \tag{1.18}$$

$$\phi^{\mathrm{CGF}}_{1s}(\xi = 1.0, \text{STO-3G}) = 0.444635\,\phi^{\mathrm{GF}}_{1s}(0.109818) + 0.535328\,\phi^{\mathrm{GF}}_{1s}(0.405771) + 0.154329\,\phi^{\mathrm{GF}}_{1s}(2.22766). \tag{1.19}$$

The general notation for the STO-3G contraction is (6s3p/3s)/[2s1p/1s] [14]. In the parentheses, the entry before the slash refers to heavy atoms (first-row elements) and the entry after the slash refers to hydrogen [15]. The square brackets give the corresponding numbers of contracted functions. Note that this notation only indicates the size of the final basis, without revealing how the contraction is done.

Minimal basis sets usually provide rough results that are insufficient for research-level publications, although they are much cheaper than their larger counterparts. Minimal basis sets also have limited variational flexibility and cannot give an accurate representation of the orbitals. To address these issues, we use multiple functions to represent each orbital. For instance, the double-zeta basis set allows us to treat each orbital as

$$\Phi_{2s}(\mathbf{r}) = \Phi^{\mathrm{STO}}_{2s}(\mathbf{r}, \xi_1) + d\,\Phi^{\mathrm{STO}}_{2s}(\mathbf{r}, \xi_2), \tag{1.20}$$

so that a 2s atomic orbital is expressed as the sum of two STOs. The two STOs differ in ξ, which determines how large the orbital is. The constant d determines how much each STO contributes to the final orbital. The same idea applies to the triple- and quadruple-zeta basis sets, in which each orbital is expanded as a sum of three or four STOs, respectively. The aim is a trade-off that gives better accuracy at an acceptable computational cost. There are different ways of extending basis sets, such as splitting the valence functions, adding polarization functions and adding diffuse functions [16].

To improve the minimal STO-3G basis set, the first step is to double all basis functions, which gives a Double Zeta (DZ) type basis.
Doubling the number of basis functions provides a more accurate description of the electron distribution in molecules, which differs significantly from that in the isolated atoms. Alternatively, we can add polarization functions, which improve the description of chemical bonds by introducing directionality that the minimal basis fails to provide. Chemical bonding takes place between valence orbitals. Doubling the 1s functions on carbon, for instance, gives a better description of the 1s electrons; however, the 1s orbital is essentially independent of the chemical environment and remains very close to the atomic case. A common variation of the DZ type basis therefore doubles only the number of valence orbitals, which is known as a split valence basis [17].

One of the most popular families of split-valence basis sets is Pople's n-ijG or n-ijkG [18]. Here n is the number of primitives for the inner shells, and ij or ijk gives the numbers of primitives for the contractions in the valence shell. The ij and ijk notations describe sets of valence double and triple zeta quality, respectively. We use Pople's 3-21G basis set as an example. The "3" indicates the number of Gaussian functions summed to describe each inner-shell orbital. The "2" is the number of Gaussian functions that make up the first function of the double zeta pair, and the "1" is the number of Gaussian functions in the second.

As a more concrete example, consider the hydrogen 1s orbital expanded in the double zeta 3-21G basis set. For hydrogen, the contractions are

$$\phi'_{1s}(\mathbf{r}) = \sum_{i=1}^{2} d'_{i,1s}\, g_{1s}(\alpha'_{i,1s}, \mathbf{r}) \tag{1.21}$$

and

$$\phi''_{1s}(\mathbf{r}) = g_{1s}(\alpha''_{1s}, \mathbf{r}). \tag{1.22}$$

In this case, three s-type Gaussian primitives are contracted to two basis functions. The inner hydrogen function $\phi'_{1s}$ is a contraction of two primitive Gaussians, while the outer one, $\phi''_{1s}$, is uncontracted. This is frequently denoted as a (3s)→[2s] contraction.
The coefficients in these functions are then fixed in subsequent molecular calculations [15].

For the atoms Li to F, the contractions are

$$\phi_{1s}(\mathbf{r}) = \sum_{i=1}^{3} d_{i,1s}\, g_{1s}(\alpha_{i,1s}, \mathbf{r}), \tag{1.23}$$

$$\phi'_{2s}(\mathbf{r}) = \sum_{i=1}^{2} d'_{i,2s}\, g_{1s}(\alpha'_{i,2sp}, \mathbf{r}), \tag{1.24}$$

$$\phi''_{2s}(\mathbf{r}) = g_{1s}(\alpha''_{2sp}, \mathbf{r}), \tag{1.25}$$

$$\phi'_{2p}(\mathbf{r}) = \sum_{i=1}^{2} d'_{i,2p}\, g_{2p}(\alpha'_{i,2sp}, \mathbf{r}), \tag{1.26}$$

$$\phi''_{2p}(\mathbf{r}) = g_{2p}(\alpha''_{2sp}, \mathbf{r}). \tag{1.27}$$

To determine the 3-21G basis set, we first choose the forms (1.23) to (1.27) for the contractions and then optimize the corresponding parameters. The designation of the 3-21G basis is (6s3p/3s) → [3s2p/2s]. Although the 3-21G and STO-3G basis sets contain the same number of primitive GTOs, 3-21G is much more flexible, since it includes twice as many valence functions, which can combine freely to form MOs.

The other split-valence basis sets commonly used in our project are 4-31G, 6-31G and 6-311G, which are similar to 3-21G. Instead of improving a basis set by going to triple zeta, quadruple zeta, etc., one would rather add functions of higher angular quantum number to make the basis set better balanced [12]. These higher angular momentum functions are called polarization functions. Usually we add p-type functions to H and d-type functions to the first-row atoms Li-F. For example, a C-H bond is originally described by s-orbital(s) on the hydrogen and s- and pz-orbitals on the carbon. If only s-functions are used for the hydrogen, the difference between the electron distribution along the bond and perpendicular to the bond cannot be described. To compensate for this, a polarization of the s-orbital is introduced by adding a p-orbital to the hydrogen. In this way, the p component improves the description of the H-C bond. The same idea applies to using d-orbitals to polarize p-orbitals, f-orbitals to polarize d-orbitals, etc. [19].

On the question of whether a d-orbital should also be added to a hydrogen s-orbital once a p-orbital has been added, there is a general guideline.
The most essential part is the first set of polarization functions (i.e., p-functions for hydrogen, d-functions for heavy atoms). Adding a single set of polarization functions (e.g., p-functions on hydrogens and d-functions on heavy atoms) to a double zeta basis forms a Double Zeta plus Polarization (DZP) type basis. Polarization functions are denoted in Pople's sets by an asterisk. The 6-31G* and 6-31G** basis sets are formed by adding polarization functions to a 6-31G basis. 6-31G* denotes a basis set in which d-type functions are added to the heavy atoms, whose valence shell contains p orbitals. 6-31G** denotes a basis set in which d-type functions are added to the heavy atoms and p-type functions are added to hydrogen [17]. 6-31G** is synonymous with 6-31G(d,p). In terms of the degree of contraction, the notations for 6-31G* and 6-31G** are (11s4p1d/4s)/[4s2p1d/2s] and (11s4p1d/4s1p)/[4s2p1d/2s1p], respectively. Experience has shown that adding polarization functions to the heavy atoms is more important than adding polarization functions to hydrogen.

In addition to polarization functions, basis sets are also frequently augmented with diffuse functions. These functions have very small exponents and decay slowly with distance from the nucleus. Diffuse functions are normally s- and p-functions, and their symbol is placed before the G. Diffuse Gaussians provide an accurate description of anions and of weak bonds such as hydrogen bonds. They are also necessary for the calculation of properties such as dipole moments, Rydberg states and polarizabilities. For Pople's basis sets, diffuse functions are denoted by + or ++. A single +, or the first of two +, indicates that one set of diffuse s- and p-functions is added to the heavy atoms. The second + indicates that a diffuse s-function is also added to the hydrogens [19]. As with polarization functions, diffuse functions can thus be added to both hydrogen and non-hydrogen atoms.
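The naming conventions described above translate directly into a count of contracted basis functions per atom. The sketch below is a hypothetical helper written for illustration only, not part of Gaussian or NWChem. It assumes a double zeta Pople name of the form n-ijG with optional +, ++, * and **, handles only H and the first-row atoms Li to F, and assumes that each d shell consists of six Cartesian functions.

```python
import re

FIRST_ROW = {"Li", "Be", "B", "C", "N", "O", "F"}

def count_basis_functions(basis, element):
    """Contracted functions on one atom for a Pople n-ijG basis with optional +, ++, *, **."""
    match = re.fullmatch(r"(\d)-(\d)(\d)(\+{0,2})G(\*{0,2})", basis)
    if match is None:
        raise ValueError(f"unsupported basis set name: {basis}")
    diffuse, polarization = match.group(4), match.group(5)

    if element == "H":
        count = 2                # split 1s: inner and outer function
        if diffuse == "++":
            count += 1           # one diffuse s function
        if polarization == "**":
            count += 3           # one set of p functions
    elif element in FIRST_ROW:
        count = 1 + 2 * 4        # 1s core plus split valence 2s, 2px, 2py, 2pz
        if diffuse:
            count += 4           # one set of diffuse s and p functions
        if polarization:
            count += 6           # one set of d functions (six Cartesian components assumed)
    else:
        raise ValueError(f"unsupported element: {element}")
    return count

def count_for_molecule(basis, atoms):
    """Total number of contracted basis functions for a list of element symbols."""
    return sum(count_basis_functions(basis, atom) for atom in atoms)

water = ["O", "H", "H"]
basis_names = ["3-21G", "6-31G", "6-31G*", "6-31G**", "6-31++G**"]
water_counts = {name: count_for_molecule(name, water) for name in basis_names}
print(water_counts)
```

For water, the count stays the same from 3-21G to 6-31G, because only the number of primitives changes, and it grows each time polarization or diffuse functions are added.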
Here are some examples involving diffuse functions. 6-31+G(d) denotes a double zeta split valence basis with one set of diffuse sp-functions on heavy atoms only and a single set of d-type polarization functions on heavy atoms. A more complicated case is 6-311++G(2df, 2pd), which denotes a triple zeta split valence basis with additional diffuse sp-functions and two d- and one f-function on heavy atoms, as well as a diffuse s-function and two p- and one d-function on hydrogens [20].

1.4 Basics for Quantum Chemistry

Determining the electronic configuration and charge distribution is a fundamental topic in quantum chemistry. Since the charge distribution is important for understanding chemical phenomena but cannot be observed directly by experiment, we need the molecular wavefunction Ψ to help interpret certain features of an optimised structure [21].

The wavefunction Ψ(x, t) is the solution to Schrödinger's equation and contains the information about the system. The wavefunction itself does not have physical significance; rather, the physical significance lies in the product of the wavefunction and its complex conjugate, which gives the total electronic density ρ(x, t). We interpret ρ(x, t) as the probability density of finding the particle at time t at position x.

Although the equation can be solved for certain simple cases, it is impossible to obtain an exact solution for most real molecules, and we must resort to approximation methods. Effective approximate methods include Perturbation Theory (PT) and its modifications and the Hartree-Fock approximation, which is a non-perturbative method for multi-electron systems. The Hartree-Fock approximation is equivalent to the molecular orbital approximation, which is fundamental to chemistry.
The theory rests on the idea that a single-particle function (orbital) is enough to describe the motion of each electron, because each electron is taken to move in the average field of the others rather than responding to their instantaneous positions. Hartree-Fock theory plays an important role in quantum chemistry not only for its own sake, but also because it provides a good starting point for more accurate approximations that include the effects of electron correlation [22].

1.4.1 Hartree-Fock

Born-Oppenheimer Approximation

We give a brief, introductory explanation of Hartree-Fock theory and illustrate how molecular orbitals are calculated with it. The aim of Hartree-Fock theory is to solve approximately the electronic Schrödinger equation

$$\hat{H}_{el}(\mathbf{r};\mathbf{R})\,\Psi(\mathbf{r};\mathbf{R}) = E_{el}(\mathbf{R})\,\Psi(\mathbf{r};\mathbf{R}). \tag{1.28}$$

To achieve that, we first need to invoke the Born-Oppenheimer approximation, which separates the motion of the nuclei from the motion of the electrons. As a consequence, the Hamiltonian operator $\hat{H}_{el}(\mathbf{r};\mathbf{R})$ consists of four terms,

$$\hat{H}_{el}(\mathbf{r};\mathbf{R}) = \hat{T}_e(\mathbf{r}) + \hat{V}_{eN} + \hat{V}_{NN} + \hat{V}_{ee}(\mathbf{r}). \tag{1.29}$$

In expanded notation (in atomic units), this reads

$$\left[-\frac{1}{2}\sum_i \nabla_i^2 - \sum_{I,i}\frac{Z_I}{|\mathbf{R}_I - \mathbf{r}_i|} + \sum_{I>J}\frac{Z_I Z_J}{|\mathbf{R}_I - \mathbf{R}_J|} + \sum_{i>j}\frac{1}{|\mathbf{r}_i - \mathbf{r}_j|}\right]\Psi(\mathbf{r};\mathbf{R}) = E_{el}\,\Psi(\mathbf{r};\mathbf{R}), \tag{1.30}$$

where $\mathbf{r}$ denotes the electronic and $\mathbf{R}$ the nuclear degrees of freedom [23]. The Born-Oppenheimer approximation neglects the motion of the atomic nuclei, since a nucleus is much heavier than an electron. On this physical basis, the electronic wavefunction is described under the approximation that the nuclei are stationary. By neglecting the nuclear kinetic energy term $\hat{K}_{NN}(\mathbf{R})$, the problem reduces to solving the electronic Schrödinger equation.
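With the nuclei held fixed, the nuclear repulsion term in Eq. (1.30) is simply a constant for a given geometry. As a small worked instance, the sketch below evaluates that sum in atomic units, assuming NumPy; the function name and the H2 bond length of 1.4 bohr are illustrative choices, not data from this thesis.

```python
import numpy as np

def nuclear_repulsion(charges, positions):
    """Sum over nuclear pairs I > J of Z_I Z_J / |R_I - R_J|, in atomic units."""
    charges = np.asarray(charges, dtype=float)
    positions = np.asarray(positions, dtype=float)
    energy = 0.0
    for i in range(len(charges)):
        for j in range(i):
            distance = np.linalg.norm(positions[i] - positions[j])
            energy += charges[i] * charges[j] / distance
    return energy

# H2 with an illustrative internuclear distance of 1.4 bohr.
h2_vnn = nuclear_repulsion([1, 1], [[0.0, 0.0, 0.0], [0.0, 0.0, 1.4]])
print("V_NN for H2 at 1.4 bohr:", h2_vnn, "hartree")
```

Because this term does not depend on the electronic coordinates, it only shifts the electronic energy and can be added after the electronic problem has been solved.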
By solving the Schrödinger equation, we can extract useful information such as the electronic dipole moment and polarizability [22].

The Many-electron Wavefunction: the Slater Determinant

To introduce the basic idea of Hartree-Fock theory, we must mention the Slater determinant, since Hartree-Fock theory assumes that Ψ is a Slater determinant. An exact wavefunction for a multi-electron system has the form $|\Psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)\rangle$, a single function that depends on the coordinates of all the electrons simultaneously. A possible approximation consists in writing the multi-electron wavefunction as a product of single-electron functions. By writing $|\Psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)\rangle \approx |\varphi_1(\mathbf{r}_1)\rangle|\varphi_2(\mathbf{r}_2)\rangle \cdots |\varphi_N(\mathbf{r}_N)\rangle$, we transform the multi-electron system into a set of independent electrons, each located in its own orbital. These single-electron wavefunctions are called orbitals. However, a correct many-electron wavefunction must satisfy both the principle of indistinguishability and the principle of antisymmetry, and a simple product satisfies neither. By introducing the Slater determinant, we can build an antisymmetric solution [24]:

$$\Psi(1, 2, \ldots, N) = \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\phi_1(1) & \phi_1(2) & \cdots & \phi_1(N) \\
\phi_2(1) & \phi_2(2) & \cdots & \phi_2(N) \\
\vdots & \vdots & \ddots & \vdots \\
\phi_N(1) & \phi_N(2) & \cdots & \phi_N(N)
\end{vmatrix}. \tag{1.31}$$

Exchanging two columns corresponds to exchanging two particles and changes the sign of the determinant. Two equal rows give a zero determinant, which corresponds to Pauli's exclusion principle that two (or more) identical fermions cannot occupy the same quantum state. If the occupied orbitals are $\{\chi_i(\mathbf{x}), \chi_j(\mathbf{x}), \ldots, \chi_k(\mathbf{x})\}$, the determinant can also be written in shorthand as $|\chi_i \chi_j \cdots \chi_k\rangle$, or even more simply as $|ij \ldots k\rangle$ [25].

Simplified Notation for the Hamiltonian

We now introduce a simplified notation for the Hamiltonian.
We define the one-electron operator as

$$h(i) = -\frac{1}{2}\nabla_i^2 - \sum_A \frac{Z_A}{r_{iA}}, \tag{1.32}$$

which represents the kinetic energy (KE) of electron i and its attraction to all nuclei. The two-electron operator $\upsilon(i, j) = 1/r_{ij}$ stands for the Coulomb repulsion between electrons i and j. With these definitions, the electronic Hamiltonian can be written as:
Hˆel =∑ih(i) +∑i