Population Analyses Based on Ionic Partition of Overlap Distributions

by

Yiming Wang

B.Sc. in Physics, Dalian University of Technology, 2011

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

in

The Faculty of Graduate and Postdoctoral Studies

(Chemistry)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

August 2018

© Yiming Wang 2018

Committee Page

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled: Population Analyses Based on Ionic Partition of Overlap Distributions, submitted by Yiming Wang in partial fulfillment of the requirements for the degree of Master of Science in Chemistry.

Examining Committee

Research Supervisor: Dr. Yan Alexander Wang, Chemistry

Supervisory Committee Members: Dr. Mark Thachuk, Chemistry; Dr. Gren Patey, Chemistry

Abstract

In this thesis, we propose several new schemes for partitioning atomic partial charges, with the aim of reducing the basis-set dependence and the inaccuracy of the methods previously developed in our group. We analyze all the methods, including Mulliken population analysis, and evaluate them by comparison with Natural Population Analysis (NPA) for several groups of systems, which we divide according to their polarity. We find that for more polarized systems, such as compounds containing fluorine, our Population Analyses Based on Ionic Partition of Overlap Distributions (IPOD) series performs better and produces charges closer to those of the NPA method. Within the same system, the IPOD series works better for atoms with more polarized bonds than for atoms with non-polarized ones. In addition to the analyses of the separate groups, we plot the correlation between the charges produced by the different methods and the charges generated by the NPA method. From the graph and the slope values we conclude that, among all the methods, IPOD2d gives the most reliable results relative to NPA.
In addition, to identify the basis set that best represents the IPOD2d results, we plot the correlation between the charges produced by the IPOD2d and NPA methods for several basis sets. We find that 6-31G is the most representative basis set. Using 6-31G to calculate charges offers a clear advantage in computational efficiency while still providing reasonable results.

Lay Summary

A frequent topic in quantum chemistry is the determination of the electronic configuration and the electronic charge distribution of a molecule, especially the net charge associated with each atom within a molecule. Because the electronic charge is hard to observe directly by experiment, the charge distribution among the constituent atoms of a molecule has to be obtained from a given wavefunction. The process of carrying out this analysis is called population analysis. To address the problems of the methods previously developed in our group, such as the strong basis-set dependence and the inaccuracy of the results, we propose several new schemes for partitioning the electronic charge. We analyze the results calculated with different basis sets, compare them with the results generated by Natural Population Analysis (NPA), and evaluate how well they perform for polarized and less polarized systems.

Preface

In the theory part, the IPOD1 and IPOD2 methods were developed in Dr. Alex Wang's group prior to my arrival; they were originally started by Dr. Wang and Mr. Yakun Chen. The follow-up methods IPOD2b, IPOD2c and IPOD2d are due to Dr. Wang and Mr. Miguel Garcia, a PhD student in our group. I collaborated with Mr. Miguel Garcia to implement these ideas in NWChem. The choice of the systems and the data analysis were done by myself.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
1 Theory
  1.1 Electronic Density Analysis
  1.2 Gaussian
  1.3 Basis Set
  1.4 Basics for Quantum Chemistry
    1.4.1 Hartree-Fock
    1.4.2 Density Functional Theory
  1.5 Population Analysis
    1.5.1 Mulliken and Löwdin Population Analyses
    1.5.2 Natural Population Analysis (NPA)
    1.5.3 Ionic Partition of Overlap Distribution (IPOD) methods
2 Result and Discussion
  2.1 Fluorine Systems
  2.2 Alcohol Systems
  2.3 Alkene Systems
  2.4 Aromatic Systems
  2.5 Small Inorganic Molecules
  2.6 Correlation between Different Methods with NPA
  2.7 The Best Basis Set within IPOD2d Method
3 Summary
Bibliography

List of Tables

2.1 Average charges for all basis sets and standard deviations for HF and LiF.
2.2 Average charges with all basis sets and standard deviations for NaF and KF.
2.3 Average charges and standard deviations for atom C in CH3OH and C2H5OH for basis sets from 3-21G to 6-31G*.
2.4 Average charges and standard deviations for atom C in C3H7OH and C4H9OH for basis sets from 3-21G to 6-31G*.
2.5 Average charges and standard deviations for atom Carbon (side) in C2H4 and C3H4 for basis sets from 3-21G to 6-31G(2pd, 2p).
2.6 Average charges and standard deviations for atom Carbon (side) in C3H6 and C4H6 for basis sets from 3-21G to 6-31G(2pd, 2p).
2.7 Average charges and standard deviations for atom Carbon (middle) in C2H4 and C3H4 for basis sets from 3-21G to 6-31G(2pd, 2p).
2.8 Average charges and standard deviations for atom C (middle) in C3H6 and C4H6 for basis sets from 3-21G to 6-31G(2pd, 2p).
2.9 Average charges and standard deviations for atom Carbon (side) in Anthracene and Biphenyl for basis sets from 3-21G to 6-31G**.
2.10 Average charges and standard deviations for atom Carbon (side) in Naphthalene and Phenanthrene for basis sets from 3-21G to 6-31G**.
2.11 Average charges and standard deviations for atom Carbon (middle) in Anthracene and Biphenyl for basis sets from 3-21G to 6-31G**.
2.12 Average charges and standard deviations for atom Carbon (middle) in Naphthalene and Phenanthrene for basis sets from 3-21G to 6-31G**.
2.13 Average charges and standard deviations for atom Carbon in CO2 and CO for basis sets from 3-21G to 6-31G**.
2.14 Average charges and standard deviations for atom Nitrogen in HCN and NH3 for basis sets from 3-21G to 6-31G**.
2.15 Average charges and standard deviations for atom Nitrogen in N2O and HNO for basis sets from 3-21G to 6-31G**.
2.16 Average charges and standard deviations for atom Oxygen in H2O and HOOH for basis sets from 3-21G to 6-31G**.

List of Figures

2.1 Partial charges for Fluorine diatomic compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.2 Partial charges for C atom in Alcohol compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.3 Partial charges for O atom in Alcohol compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.4 Partial charges for Carbon-side atom in Alkene compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.5 Partial charges for C-middle atom in Alkene compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.6 Partial charges for C-side atom in Aromatic compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.7 Partial charges for C-middle atom in Aromatic compounds. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.8 Partial charges for atom in small inorganic molecules. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.9 Partial charges for Oxygen atom in small inorganic molecules. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).
2.10 Correlation between different methods with NPA.
2.11 Correlation between different methods with NPA.
2.12 Correlation for different basis sets in IPOD2d with the same basis sets in NPA.

List of Abbreviations

IPOD    Ionic Partition of Overlap Distribution
MPA     Mulliken Population Analysis
LPA     Löwdin Population Analysis
NPA     Natural Population Analysis
AIM     Atoms In Molecules
LCAO    Linear Combination of Atomic Orbitals
STO     Slater-Type Orbital
GTO     Gaussian-Type Orbital
DFT     Density Functional Theory
SCF     Self-Consistent Field
CGF     Contracted Gaussian Function
PGTO    Primitive Gaussian-Type Orbital
DZ      Double Zeta
DZP     Double Zeta plus Polarization
PT      Perturbation Theory
KE      Kinetic Energy
HF      Hartree-Fock
HK      Hohenberg-Kohn
KS      Kohn-Sham
SD      Slater Determinant
MO      Molecular Orbital
AOs     Atomic Orbitals
NOs     Natural Orbitals
NAOs    Natural Atomic Orbitals
CBPA    Christoffersen-Baker Population Analysis

Acknowledgements

I would like to thank Professor Wang for his patient mentoring and expert advice throughout this project. I would also like to thank Miguel for his extraordinary support during the thesis process. Finally, last but by no means least, my thanks go to everyone in my lab.
It was great sharing the laboratory with all of you during the last three years. Thank you for all your encouragement!

Chapter 1

Theory

A frequent topic in quantum chemistry is the determination of the electronic configuration and electronic charge distribution of a molecule, especially the net charge associated with each atom in a polyatomic molecule [1]. While this charge distribution is hard to observe directly by experiment, it is important for the chemical interpretation of the wave function, which leads to a useful understanding of chemical phenomena [2].

To quantify the notions of atomic charge and orbital population in a satisfactory way, a number of schemes exist for analysing a calculation once an accurate wavefunction is available [3]. These analyses mainly fall into two categories:

1. Partition of charge between atoms based on the orbital occupancy.
2. Partition of a physical observable derived from the wavefunction, such as electron density.

1.1 Electronic Density Analysis

When we have an optimized structure and the molecular wavefunction Ψ, we can calculate the electron density, which is given by the squared modulus of the wavefunction. Hence we can examine directly how the electron density is distributed over a molecule.

For partitioning a physical observable derived from the wave function, such as the electron density, we have Bader's Atoms In Molecules (AIM) analysis [4]. It relies on properties of the electron density alone, and the information can be obtained from the Laplacian of the electron density [5].

For partitioning the molecular wave function by an orbital-based scheme, Ψ is written as a Slater determinant of individual molecular orbitals $\psi_i(\mathbf{r})$ [6], and the electron density ρ of a closed-shell system is defined as

$$\rho(\mathbf{r}) = 2\left[|\psi_1(\mathbf{r})|^2 + |\psi_2(\mathbf{r})|^2 + \dots + |\psi_{N/2}(\mathbf{r})|^2\right], \tag{1.1}$$

where N is the number of electrons and r is the position vector.

All occupied molecular orbitals contribute to the electron density. Each molecular orbital is expanded in terms of atomic orbital (AO) basis functions (Slater-type orbitals or (contracted) Gaussian-type orbitals) [1]. In the LCAO (Linear Combination of Atomic Orbitals) approximation, inserting the orbital expansion into ρ turns ρ into a sum of contributions from the basis functions. If we go further, ask how ρ is distributed among particular atoms, and assign the charge to the atomic centers (nuclei), we carry out a population analysis [7].

1.2 Gaussian

Gaussian is a popular and widely used computational chemistry software package, especially for electronic structure calculations [8]. It has been updated continuously since its first release in 1970 by John Pople. We use Gaussian 09 for our project, although the latest version is Gaussian 16. Gaussian provides a wide range of modelling capabilities, from the prediction of energies, molecular structures and vibrational frequencies to the prediction of reactions in a wide variety of chemical environments [9].

Gaussian also offers various methods for modelling compounds and chemical processes, including Hartree-Fock methods (restricted and unrestricted open-shell), Density Functional Theory (DFT) and Molecular Mechanics. In our project, we mainly use DFT methods to carry out the calculations for charge analysis with different basis sets. DFT and Hartree-Fock are both self-consistent field (SCF) methods.

1.3 Basis Set

Many basis sets are stored internally in Gaussian; the following are the ones we applied in our project:

STO-3G, 3-21G, 4-31G, 6-31G, 6-31G*, 6-31G**, 6-31G(2pd, 2p), 6-31G(3df, 3pd), 6-31++G, 6-31++G*, 6-31++G**.

In the following section, we give the definitions of the different basis sets and illustrate how specific attributes of a basis set influence calculated quantities [10].

The 1s Minimal STO-3G Basis Set

In this section, instead of introducing basis sets for general polyatomic molecule calculations, we start by describing 1s-type basis functions.
The extension of these concepts to the general case, which includes s-, p- and d-type basis functions, follows the same rules [11].

For 1s functions, two types of basis functions are in wide use:

1. The normalized 1s Slater-type function of the form

$$\phi^{SF}_{1s}(\xi, \mathbf{r}-\mathbf{R}_A) = \left(\xi^3/\pi\right)^{1/2} e^{-\xi|\mathbf{r}-\mathbf{R}_A|}, \tag{1.2}$$

where $\mathbf{R}_A$ is the center and ξ is the Slater orbital exponent.

2. The normalized 1s Gaussian-type function of the form

$$\phi^{GF}_{1s}(\alpha, \mathbf{r}-\mathbf{R}_A) = \left(2\alpha/\pi\right)^{3/4} e^{-\alpha|\mathbf{r}-\mathbf{R}_A|^2}, \tag{1.3}$$

where α is the Gaussian orbital exponent [12].

The orbital exponents, which are positive numbers, control the width of the orbital. Large values of ξ or α give a tight function, while small values give a diffuse function. Two main factors should be considered when choosing a basis: the efficiency and the accuracy with which it describes the electronic wave function. For best efficiency, one aims to use the fewest possible terms when expanding the molecular orbital $\psi_i$,

$$\psi_i = \sum_{\mu=1}^{k} C_{\mu i}\,\phi_\mu. \tag{1.4}$$

In this sense, Slater functions have an advantage over Gaussian functions: fewer Slater basis functions than Gaussian basis functions are needed to reach the same level of accuracy. The other consideration is the time needed to evaluate the two-electron integrals. In an SCF calculation, one of the most expensive steps is the calculation of the two-electron integrals, which have the form

$$(\mu_A \nu_B|\lambda_C \sigma_D) = \int d\mathbf{r}_1\, d\mathbf{r}_2\, \phi^{A*}_{\mu}(\mathbf{r}_1)\,\phi^{B}_{\nu}(\mathbf{r}_1)\, r_{12}^{-1}\, \phi^{C*}_{\lambda}(\mathbf{r}_2)\,\phi^{D}_{\sigma}(\mathbf{r}_2). \tag{1.5}$$

Four-center integrals of this kind are considerably more costly to evaluate with Slater basis functions, which makes Gaussian functions the better choice here: explicit formulas are available for integrals over Gaussian functions, whereas Slater functions offer no such formulas. The name of the Gaussian software also derives from its use of GTOs.

The reason why these integrals are much easier to calculate with Gaussian basis functions is that the product of two Gaussians on different centers is itself a Gaussian, centered at a point between the two original centers.
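The claim that a product of two 1s Gaussians is again a 1s Gaussian can be checked numerically. The following is a minimal sketch, not part of the original work: it uses the closed-form prefactor, exponent and center given in Eqs. (1.6) to (1.8), and the function names, exponents, centers and test point are hypothetical values chosen only for illustration.

```python
import math

def gaussian_1s(alpha, r, center):
    """Normalized 1s Gaussian of Eq. (1.3) evaluated at the point r."""
    dist2 = sum((ri - ci) ** 2 for ri, ci in zip(r, center))
    return (2.0 * alpha / math.pi) ** 0.75 * math.exp(-alpha * dist2)

def gaussian_product(alpha, center_a, beta, center_b):
    """Return (K_AB, p, R_P) of Eqs. (1.6)-(1.8) for two normalized 1s Gaussians."""
    p = alpha + beta
    r_p = tuple((alpha * a + beta * b) / p for a, b in zip(center_a, center_b))
    ab2 = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
    k_ab = (2.0 * alpha * beta / (p * math.pi)) ** 0.75 * math.exp(-alpha * beta / p * ab2)
    return k_ab, p, r_p

# Hypothetical exponents and centers, for illustration only.
alpha, beta = 0.5, 1.3
center_a, center_b = (0.0, 0.0, 0.0), (0.0, 0.0, 1.4)
k_ab, p, r_p = gaussian_product(alpha, center_a, beta, center_b)

# Compare both sides of Eq. (1.6) at an arbitrary point in space.
point = (0.3, -0.2, 0.9)
lhs = gaussian_1s(alpha, point, center_a) * gaussian_1s(beta, point, center_b)
rhs = k_ab * gaussian_1s(p, point, r_p)
print(f"product of two Gaussians: {lhs:.12f}")
print(f"single Gaussian at R_P:   {rhs:.12f}")
```

The two printed numbers should agree to rounding error for any choice of exponents, centers and evaluation point, which is what allows a four-center integral over 1s Gaussians to be reduced to a two-center one.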
Explicitly, the product of two 1s Gaussian functions on centers A and B is a third Gaussian function centered between the two atoms, that is,

$$\phi^{GF}_{1s}(\alpha, \mathbf{r}-\mathbf{R}_A)\,\phi^{GF}_{1s}(\beta, \mathbf{r}-\mathbf{R}_B) = K_{AB}\,\phi^{GF}_{1s}(p, \mathbf{r}-\mathbf{R}_P), \tag{1.6}$$

where $K_{AB}$ is

$$K_{AB} = \left(\frac{2\alpha\beta}{(\alpha+\beta)\pi}\right)^{3/4} \exp\left[-\frac{\alpha\beta}{\alpha+\beta}\,|\mathbf{R}_A-\mathbf{R}_B|^2\right]. \tag{1.7}$$

The exponent of the new Gaussian centered at $\mathbf{R}_P$ is $p = \alpha+\beta$, and the third center P lies on the line joining the centers A and B,

$$\mathbf{R}_P = \frac{\alpha\mathbf{R}_A + \beta\mathbf{R}_B}{\alpha+\beta}. \tag{1.8}$$

The same rule applies to the four-center integral, which for 1s Gaussians reduces to a two-center integral,

$$(\mu_A \nu_B|\lambda_C \sigma_D) = K_{AB} K_{CD} \int d\mathbf{r}_1\, d\mathbf{r}_2\, \phi^{GF}_{1s}(p, \mathbf{r}_1-\mathbf{R}_P)\, r_{12}^{-1}\, \phi^{GF}_{1s}(q, \mathbf{r}_2-\mathbf{R}_Q), \tag{1.9}$$

where $K_{CD}$, q and $\mathbf{R}_Q$ are defined for the pair of functions on centers C and D in the same way as $K_{AB}$, p and $\mathbf{R}_P$.

Although the two-electron integrals can be evaluated more efficiently with Gaussian functions, Gaussian functions do not reproduce the functional behaviour of molecular orbitals accurately, which prevents them from being the optimum basis functions. As a trade-off, we can use basis functions that are fixed linear combinations of the primitive Gaussian functions $\phi^{GF}_p$. These linear combinations are called contracted Gaussian functions (CGF) and are given by

$$\phi^{CGF}_{\mu}(\mathbf{r}-\mathbf{R}_A) = \sum_{p=1}^{L} d_{p\mu}\,\phi^{GF}_{p}(\alpha_{p\mu}, \mathbf{r}-\mathbf{R}_A), \tag{1.10}$$

where L is the length of the contraction and $d_{p\mu}$ is a contraction coefficient. The pth normalized primitive Gaussian $\phi^{GF}_p$ depends on the basis function $\phi^{CGF}_\mu$ through the Gaussian orbital exponent $\alpha_{p\mu}$, which is also called the contraction exponent.

The method consists in choosing a contraction length, contraction coefficients and contraction exponents so as to build a desirable set of basis functions $\phi^{CGF}_\mu$ on the left-hand side of Eq. (1.10). When these fixed functions are used in a molecular wave function calculation, in particular an SCF calculation, the contraction coefficients remain unchanged.

With the contracted basis functions $\{\phi^{CGF}_\mu\}$, the two-electron integrals $(\mu\nu|\lambda\sigma)$ can be evaluated as a sum of easily calculated two-electron integrals over the original Gaussian functions. If the contraction parameters are chosen appropriately, the contracted functions can approximate atomic Hartree-Fock functions, Slater-type functions, etc., while the integrals are still evaluated efficiently over primitive Gaussian functions only. In the STO-NG procedure, a linear combination of N primitive Gaussian functions is obtained by fitting to a Slater-type orbital (STO). STO-3G basis sets are the most widely used in polyatomic calculations for rapid integral evaluation, since it has been found that using more than three primitive Gaussian-type orbitals (PGTO) to represent the STO gives little improvement [13]. Let us first consider using $\phi^{CGF}_{1s}$ to approximate a 1s Slater-type function with ξ = 1.0 as an example:

$$\phi^{CGF}_{1s}(\xi = 1.0, \text{STO-1G}) = \phi^{GF}_{1s}(\alpha_{11}), \tag{1.11}$$

$$\phi^{CGF}_{1s}(\xi = 1.0, \text{STO-2G}) = d_{12}\,\phi^{GF}_{1s}(\alpha_{12}) + d_{22}\,\phi^{GF}_{1s}(\alpha_{22}), \tag{1.12}$$

$$\phi^{CGF}_{1s}(\xi = 1.0, \text{STO-3G}) = d_{13}\,\phi^{GF}_{1s}(\alpha_{13}) + d_{23}\,\phi^{GF}_{1s}(\alpha_{23}) + d_{33}\,\phi^{GF}_{1s}(\alpha_{33}). \tag{1.13}$$

We consider contractions of up to three primitives, for which we need to find the best-fit coefficients $d_{p\mu}$ and exponents $\alpha_{p\mu}$ at fixed ξ = 1.0 so that the basis function is as close as possible to a Slater-type function. To do this, we can use a least-squares technique to minimize the integral [13]

$$I = \int d\mathbf{r}\,\left[\phi^{SF}_{1s}(\xi = 1.0, \mathbf{r}) - \phi^{CGF}_{1s}(\xi = 1.0, \text{STO-LG}, \mathbf{r})\right]^2. \tag{1.14}$$

Equivalently, provided that both functions are normalized, we can maximize the overlap between the two functions,

$$S = \int d\mathbf{r}\,\phi^{SF}_{1s}(\xi = 1.0, \mathbf{r})\,\phi^{CGF}_{1s}(\xi = 1.0, \text{STO-LG}, \mathbf{r}). \tag{1.15}$$

For STO-1G we only need to find the primitive Gaussian exponent α that maximizes the overlap [11],

$$S = \pi^{-1/2}\left(2\alpha/\pi\right)^{3/4} \int d\mathbf{r}\, e^{-r}\, e^{-\alpha r^2}. \tag{1.16}$$

The maximum overlap is obtained for α = 0.270950. Even at this optimum, the Gaussian function shows deficiencies near the origin and at large distances. The same procedure can be applied to STO-2G and STO-3G to obtain the optimum fits.
The resulting fits are as follows:

$$\phi^{CGF}_{1s}(\xi = 1.0, \text{STO-1G}) = \phi^{GF}_{1s}(0.270950), \tag{1.17}$$

$$\phi^{CGF}_{1s}(\xi = 1.0, \text{STO-2G}) = 0.678914\,\phi^{GF}_{1s}(0.151623) + 0.430129\,\phi^{GF}_{1s}(0.851819), \tag{1.18}$$

$$\phi^{CGF}_{1s}(\xi = 1.0, \text{STO-3G}) = 0.444635\,\phi^{GF}_{1s}(0.109818) + 0.535328\,\phi^{GF}_{1s}(0.405771) + 0.154329\,\phi^{GF}_{1s}(2.22766). \tag{1.19}$$

The general notation for the STO-3G contraction is (6s3p/3s) → [2s1p/1s] [14]. In the parentheses, the entry before the slash refers to heavy atoms (first-row elements) and the entry after the slash refers to hydrogen [15]. The entries in the square brackets give the corresponding numbers of contracted functions. Note that this notation only indicates the size of the final basis and does not reveal how the contraction is done.

Minimal basis sets usually provide rough results that are insufficient for research-level publications, although they are much cheaper than their larger counterparts. Minimal basis sets also have limited variational flexibility and cannot give an accurate representation of the orbitals. To address these issues, we use multiple functions to represent each orbital. For instance, the double-zeta basis set allows us to treat each orbital as

$$\Phi_{2s}(\mathbf{r}) = \Phi^{STO}_{2s}(\mathbf{r}, \xi_1) + d\,\Phi^{STO}_{2s}(\mathbf{r}, \xi_2), \tag{1.20}$$

so that a 2s atomic orbital is expressed as the sum of two STOs. The two STOs differ in ξ, which determines how large the orbital is. The constant d determines how much each STO contributes to the final orbital. The same idea applies to the triple- and quadruple-zeta basis sets, where each orbital is expanded as a sum of three or four STOs, respectively. The choice among them is a trade-off between better accuracy and shorter computation time. There are different ways of extending basis sets, such as splitting the valence functions, adding polarization functions and adding diffuse functions [16].

To improve the minimal STO-3G basis set, the first step is to double all basis functions, which gives a Double Zeta (DZ) type basis. Doubling the number of basis functions provides a more accurate description of the electron distribution in molecules, where the distribution differs significantly from the one in atoms. Alternatively, we can add polarization functions, which improve the description of chemical bonds by introducing the directionality that the minimal basis fails to provide.

Chemical bonding takes place between valence orbitals. Doubling the 1s functions of carbon, for instance, gives a better description of the 1s electrons, but the 1s orbital is essentially independent of the chemical environment and stays very close to the atomic case. A common adjustment to the DZ type basis is therefore to double only the number of valence orbitals, which gives what is usually called a split-valence basis [17].

Among the most popular split-valence basis sets are Pople's basis sets n-ijG or n-ijkG [18]. Here n is the number of primitives for the inner shells, and ij or ijk gives the numbers of primitives for the contractions in the valence shell. The ij and ijk notations describe sets of valence double- and triple-zeta quality, respectively. We use Pople's 3-21G basis set as an example. The "3" is the number of Gaussian functions summed to describe the inner-shell orbital. The "2" is the number of Gaussian functions that make up the first STO of the double zeta. The "1" is the number of Gaussian functions summed in the second STO.

As a more concrete example, consider the hydrogen 1s orbital expanded in the double-zeta basis set 3-21G. For hydrogen, the contractions are

$$\phi'_{1s}(\mathbf{r}) = \sum_{i=1}^{2} d'_{i,1s}\, g_{1s}(\alpha'_{i,1s}, \mathbf{r}) \tag{1.21}$$

and

$$\phi''_{1s}(\mathbf{r}) = g_{1s}(\alpha''_{1s}, \mathbf{r}). \tag{1.22}$$

In this case, three s-type Gaussian primitives are contracted to two basis functions. The inner hydrogen function $\phi'_{1s}$ is a contraction of two primitive Gaussians. The other one, $\phi''_{1s}$, is uncontracted. This is frequently denoted as a (3s) → [2s] contraction.
The coefficients in these functions are then kept fixed in subsequent molecular calculations [15].

For the atoms Li to F, the contractions are

$$\phi_{1s}(\mathbf{r}) = \sum_{i=1}^{3} d_{i,1s}\, g_{1s}(\alpha_{i,1s}, \mathbf{r}), \tag{1.23}$$

$$\phi'_{2s}(\mathbf{r}) = \sum_{i=1}^{2} d'_{i,2s}\, g_{1s}(\alpha'_{i,2sp}, \mathbf{r}), \tag{1.24}$$

$$\phi''_{2s}(\mathbf{r}) = g_{1s}(\alpha''_{2sp}, \mathbf{r}), \tag{1.25}$$

$$\phi'_{2p}(\mathbf{r}) = \sum_{i=1}^{2} d'_{i,2p}\, g_{2p}(\alpha'_{i,2sp}, \mathbf{r}), \tag{1.26}$$

$$\phi''_{2p}(\mathbf{r}) = g_{2p}(\alpha''_{2sp}, \mathbf{r}). \tag{1.27}$$

To determine the 3-21G basis set, we first choose the form (1.23) to (1.27) for the contractions and then optimize the corresponding parameters. The designation of the 3-21G basis is (6s3p/3s) → [3s2p/2s]. Although the 3-21G and STO-3G basis sets contain the same number of primitive GTOs, 3-21G is much more flexible, since it includes twice as many valence functions, which can combine freely to form MOs.

The other split-valence basis sets commonly used in our project are 4-31G, 6-31G and 6-311G, which are similar to 3-21G. Instead of improving a basis set by going to triple zeta, quadruple zeta, etc., one would rather add functions of higher angular quantum number to make the basis set better balanced [12]. These higher angular momentum functions are called polarization functions. Usually we add p-type functions to H and d-type functions to the first-row atoms Li-F. For example, a C-H bond is originally described by s-orbital(s) on the hydrogen and s- and pz-orbitals on the carbon. If only s-functions are used for the hydrogen, the difference between the electron distribution along the bond and perpendicular to the bond cannot be described. To compensate for that, a polarization of the s-orbital is introduced by adding a p-orbital to the hydrogen. In this way, the p component improves the description of the H-C bond. The same idea applies to using d-orbitals to polarize p-orbitals, f-orbitals to polarize d-orbitals, etc. [19].

As to whether a d-orbital should also be added to a hydrogen s-orbital once a p-orbital has been added, there is a general guideline: the most essential part is the first set of polarization functions (i.e., p-functions for hydrogen, d-functions for heavy atoms). Accordingly, a Double Zeta plus Polarization (DZP) type basis is formed by adding a single set of polarization functions (e.g., p-functions on hydrogens and d-functions on heavy atoms) to a DZ basis. Polarization functions are denoted in Pople's sets by an asterisk. The 6-31G* and 6-31G** basis sets are formed by adding polarization functions to the 6-31G basis. 6-31G* denotes a basis set in which d-type functions are added to the atoms with valence p orbitals, that is, the heavy atoms. 6-31G** denotes a basis set in which d-type functions are added to the heavy atoms and p-type functions are added to hydrogen [17]. 6-31G** is synonymous with 6-31G(d,p). In terms of the degree of contraction, the notations for 6-31G* and 6-31G** are (10s4p1d/4s) → [3s2p1d/2s] and (10s4p1d/4s1p) → [3s2p1d/2s1p], respectively. Experience has shown that adding polarization functions to the heavy atoms is more important than adding polarization functions to hydrogen.

In addition to polarization functions, basis sets are frequently augmented with diffuse functions. These functions have very small exponents and decay slowly with distance from the nucleus. Diffuse functions are normally s- and p-functions. Diffuse Gaussians are needed for an accurate description of anions and weak bonds such as hydrogen bonds. They are also necessary for the calculation of properties such as dipole moments, Rydberg states and polarizabilities. In Pople's basis sets, diffuse functions are denoted by + or ++ placed before the G. A single +, or the first of two +, indicates that one set of diffuse s- and p-functions is added to the heavy atoms. The second + indicates that a diffuse s-function is also added to the hydrogens [19]. As with the polarization functions discussed above, the diffuse functions we add are placed on both hydrogen and non-hydrogen atoms.
Here are some examples involving diffuse functions. 6-31+G(d) denotes a double-zeta split-valence basis with one set of diffuse sp-functions on heavy atoms only and a single set of d-type polarization functions on heavy atoms. The same reading applies to more complicated cases: 6-311++G(2df, 2pd) denotes a triple-zeta split-valence basis with additional diffuse sp-functions, two d-functions and one f-function on heavy atoms, as well as a diffuse s-function, two p-functions and one d-function on hydrogens [20].

1.4 Basics for Quantum Chemistry

Determining the electronic configuration and charge distribution is a fundamental topic in quantum chemistry. Since the charge distribution is important for understanding certain chemical phenomena but cannot be observed directly by experiment, we need the molecular wavefunction Ψ to help interpret certain features of an optimised structure [21].

The wave function Ψ(x, t) is the solution to Schrödinger's equation and contains the information about the system. The wavefunction itself does not have physical significance; rather, the physical significance lies in the interpretation of the product of the wavefunction and its complex conjugate. The total electronic density ρ(x, t) is the squared modulus of the wavefunction, and we interpret ρ(x, t) as the probability density of finding the particle at position x at time t.

Although the equation can be solved exactly for certain simple cases, it is impossible to obtain an exact solution for most real molecules, and we must resort to approximation methods. Effective approximate methods include Perturbation Theory (PT) and its modifications, and the Hartree-Fock approximation, which is a non-perturbative method for multi-electron atoms. The Hartree-Fock approximation is equivalent to the molecular orbital approximation, which is fundamental to chemistry.
The theory rests on the idea that each electron's motion can be described by a single-particle function (orbital) that does not depend explicitly on the instantaneous motions of the other electrons. Moreover, Hartree-Fock theory plays an important role in quantum chemistry not only for its own sake, but also because it provides a good starting point for more accurate approximations, which include the effects of electron correlation [22].

1.4.1 Hartree-Fock

Born-Oppenheimer Approximation

We give a brief, introductory explanation of Hartree-Fock theory and illustrate how to calculate molecular orbitals with it. The aim of Hartree-Fock theory is to solve approximately the electronic Schrödinger equation

$$\hat{H}_{el}(\mathbf{r};\mathbf{R})\,\Psi(\mathbf{r};\mathbf{R}) = E_{el}(\mathbf{R})\,\Psi(\mathbf{r};\mathbf{R}). \tag{1.28}$$

To achieve that, we first need to invoke the Born-Oppenheimer approximation, which separates the motion of the nuclei from the motion of the electrons. As a consequence, the Hamiltonian operator $\hat{H}_{el}(\mathbf{r};\mathbf{R})$ is grouped into four terms,

$$\hat{H}_{el}(\mathbf{r};\mathbf{R}) = \hat{T}_e(\mathbf{r}) + \hat{V}_{eN}(\mathbf{r};\mathbf{R}) + \hat{V}_{NN}(\mathbf{R}) + \hat{V}_{ee}(\mathbf{r}). \tag{1.29}$$

In a more expanded notation this reads

$$\left[-\frac{1}{2}\sum_i \nabla_i^2 - \sum_{I,i} \frac{Z_I}{|\mathbf{R}_I - \mathbf{r}_i|} + \sum_{I>J} \frac{Z_I Z_J}{|\mathbf{R}_I - \mathbf{R}_J|} + \sum_{i>j} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|}\right]\Psi(\mathbf{r};\mathbf{R}) = E_{el}\,\Psi(\mathbf{r};\mathbf{R}), \tag{1.30}$$

where r denotes the electronic and R the nuclear degrees of freedom [23]. The Born-Oppenheimer approximation ignores the motion of the atomic nuclei, since a nucleus is much heavier than an electron. Based on this physical fact, the Born-Oppenheimer approximation describes the electronic wavefunction under the assumption that the nuclei are stationary. By neglecting the nuclear kinetic energy term $\hat{K}_{NN}(\mathbf{R})$, the problem becomes that of solving the electronic Schrödinger equation.
By solving the Schrödinger equation, we can extract useful information such as the electric dipole moment and the polarizability [22].

The Many-electron Wavefunction: the Slater Determinant

To introduce the basic idea of Hartree-Fock theory, we must mention the Slater determinant, since Hartree-Fock theory assumes that $\Psi$ is a Slater determinant. An exact wavefunction for a multi-electron system has the form $|\Psi(r_1, r_2, \ldots, r_N)\rangle$, a single function that depends on the coordinates of all the electrons simultaneously. A possible approximation consists of writing the multi-electron wavefunction as a product of single-electron functions. By writing $|\Psi(r_1, r_2, \ldots, r_N)\rangle \approx |\phi_1(r_1)\rangle|\phi_2(r_2)\rangle\cdots|\phi_N(r_N)\rangle$, we transform a multi-electron system into a set of independent electrons, each located in its own orbital. These single-electron wavefunctions are called orbitals. However, a correct many-electron wavefunction must satisfy both the principle of indistinguishability and the principle of antisymmetry, and a simple product satisfies neither. By introducing the Slater determinant, we can build an antisymmetric solution [24]:

$$\Psi(1, 2, \ldots, N) = \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\phi_1(1) & \phi_1(2) & \cdots & \phi_1(N) \\
\phi_2(1) & \phi_2(2) & \cdots & \phi_2(N) \\
\vdots & \vdots & \ddots & \vdots \\
\phi_N(1) & \phi_N(2) & \cdots & \phi_N(N)
\end{vmatrix} \tag{1.31}$$

Exchanging two columns corresponds to exchanging two particles, which changes the sign of the determinant. Two equal rows give a zero determinant, which corresponds to Pauli's exclusion principle: two (or more) identical fermions cannot occupy the same quantum state. We can also write the determinant in shorthand as $|\chi_i \chi_j \cdots \chi_k\rangle$ if the occupied orbitals are $\{\chi_i(x), \chi_j(x), \ldots, \chi_k(x)\}$, or even more simply as $|ij \cdots k\rangle$ [25].

Simplified Notation for the Hamiltonian

We now introduce a simplified notation for the Hamiltonian.
We define the one-electron operator as

$$h(i) = -\frac{1}{2}\nabla_i^2 - \sum_A \frac{Z_A}{r_{iA}}, \tag{1.32}$$

which represents the kinetic energy of electron i and its attraction to all nuclei. The two-electron operator $\upsilon(i,j) = 1/r_{ij}$ stands for the Coulomb repulsion between electrons i and j. With these definitions, the electronic Hamiltonian can be written as

$$\hat{H}_{el} = \sum_i h(i) + \sum_{i<j} \upsilon(i,j) + V_{NN}. \tag{1.33}$$

In the Hartree-Fock approximation, the two-electron operator $\upsilon(i,j)$ is replaced by the Hartree-Fock potential $\upsilon^{HF}(r_i)$, which is the average potential acting on electron i from the other N-1 electrons in their spin orbitals. $V_{NN}$ is a constant for a fixed set of nuclear coordinates $\{R\}$ and can be ignored for now: it only shifts the eigenvalues and does not change the eigenfunctions.

The Hartree-Fock Energy Expression

We now have the Hartree-Fock wavefunction in the form of a Slater determinant and a simplified notation for the Hamiltonian. To obtain the molecular orbitals, we still need to express the Hartree-Fock energy in an appropriate way. The energy takes the usual quantum mechanical form

$$E_{el} = \langle\Psi|\hat{H}_{el}|\Psi\rangle. \tag{1.34}$$

The variational theorem states that the energy $E_{el}$ of any trial wavefunction is an upper bound to the true ground-state energy. Hence, we adjust the so-called "variational parameters" until the energy of the trial function is minimized. In this way, the variational method yields a good trial wavefunction and its corresponding energy. By expanding the molecular orbitals as linear combinations of a given set of basis functions, we obtain them by finding the linear expansion coefficients that minimize the energy. The Hartree-Fock energy can be written in terms of integrals of the one- and two-electron operators [26]:

$$E_{HF} = \sum_i \langle i|h|i\rangle + \frac{1}{2}\sum_{ij}\left([ii|jj] - [ij|ji]\right), \tag{1.35}$$

where the one-electron integral is

$$\langle i|h|j\rangle = \int dx_1\, \chi_i^*(x_1)\,\hat{h}(r_1)\,\chi_j(x_1), \tag{1.36}$$

evaluated here with j = i. Each pair of electrons (in orbitals i and j) has a Coulomb integral, given by

$$[ii|jj] = \int dx_1 \int dx_2\, \chi_i^*(x_1)\chi_i(x_1)\,\frac{1}{r_{12}}\,\chi_j^*(x_2)\chi_j(x_2). \tag{1.37}$$

Here $\chi_i^*(x_1)\chi_i(x_1)$ is the probability density of finding electron 1 in orbital i at $x_1$, $\chi_j^*(x_2)\chi_j(x_2)$ is the probability density of finding electron 2 in orbital j at $x_2$, and $1/r_{12}$ is the Coulomb repulsion between an electron at $x_1$ and an electron at $x_2$. Overall, this integral represents the Coulomb repulsion between electron 1 in orbital i and electron 2 in orbital j. The other term,

$$[ij|ji] = \int dx_1 \int dx_2\, \chi_i^*(x_1)\chi_j(x_1)\,\frac{1}{r_{12}}\,\chi_j^*(x_2)\chi_i(x_2),$$

is called the "exchange integral". It looks like the Coulomb integral with two of the orbital indices exchanged [27].

The Hartree-Fock Equations

So far we have written the wavefunction as a single Slater determinant consisting of one spin orbital per electron. By the variational theorem, the best spin orbitals are those that minimize the electronic energy

$$E_0 = \langle\Psi_0|H|\Psi_0\rangle = \sum_i \langle i|h|i\rangle + \frac{1}{2}\sum_{ij}\langle ij||ij\rangle = \sum_i \langle i|h|i\rangle + \frac{1}{2}\sum_{ij}\left([ii|jj] - [ij|ji]\right). \tag{1.38}$$

Minimizing $E_0$ while constraining the spin orbitals to remain orthonormal, we arrive at the Hartree-Fock integro-differential equation:

$$h(x_1)\chi_i(x_1) + \sum_{j\neq i}\left[\int dx_2\, |\chi_j(x_2)|^2\, r_{12}^{-1}\right]\chi_i(x_1) - \sum_{j\neq i}\left[\int dx_2\, \chi_j^*(x_2)\chi_i(x_2)\, r_{12}^{-1}\right]\chi_j(x_1) = \varepsilon_i \chi_i(x_1), \tag{1.39}$$

where $\varepsilon_i$ is the orbital energy associated with orbital $\chi_i$. The Hartree-Fock equations can be solved numerically; usually this is done by expanding $\chi_i$ as a linear combination of basis functions (the Hartree-Fock-Roothaan equations) [28]. By introducing a basis set, we convert the equation into a much simpler linear algebra problem. Regardless of whether we solve the Hartree-Fock equation exactly or with a basis set expansion, the equation itself depends on the orbitals.
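As a small illustration of the energy expression in Eq. (1.38), the following sketch evaluates the sum for a hypothetical two-electron system. The function name and all numerical integrals are assumptions made up for illustration, not values from this work; the Coulomb and exchange integrals are assumed to be precomputed over the occupied spin orbitals.

```python
def hartree_fock_energy(h_diag, coulomb, exchange):
    """Evaluate E0 = sum_i <i|h|i> + 1/2 sum_ij ([ii|jj] - [ij|ji]).

    h_diag[i]      : one-electron integral <i|h|i> for occupied spin orbital i
    coulomb[i][j]  : Coulomb integral [ii|jj]
    exchange[i][j] : exchange integral [ij|ji]
    """
    n = len(h_diag)
    one_electron = sum(h_diag)
    two_electron = 0.5 * sum(
        coulomb[i][j] - exchange[i][j] for i in range(n) for j in range(n)
    )
    return one_electron + two_electron


# Hypothetical numbers (atomic units): two electrons of opposite spin
# in the same spatial orbital. These values are invented for illustration.
h_diag = [-2.0, -2.0]
coulomb = [[1.0, 1.0], [1.0, 1.0]]
# Exchange vanishes between opposite spins; the i == j terms cancel the
# self-interaction contained in the unrestricted double sum.
exchange = [[1.0, 0.0], [0.0, 1.0]]

energy = hartree_fock_energy(h_diag, coulomb, exchange)
print(energy)  # -4.0 + 0.5 * (1.0 + 1.0) = -3.0
```

Because the double sum runs over all i and j, the i = j Coulomb terms are spurious self-repulsion; they are removed exactly by the matching exchange terms, which is why the unrestricted sum in Eq. (1.38) is valid.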
Therefore, we need to start from a rough initial guess for the orbitals, use the resulting solution as the next guess to rebuild the Hartree-Fock equation, and repeat this iteratively until the solution is refined to a chosen standard. In other words, Hartree-Fock is a self-consistent-field (SCF) approach. Before we formally introduce the SCF procedure, we first explain each term in the Hartree-Fock equation [26].

The Coulomb and Exchange Operators

As the Hartree-Fock equation (1.39) shows, the first two-electron term,

$$\sum_{j\neq i}\left[\int dx_2\, |\chi_j(x_2)|^2\, r_{12}^{-1}\right]\chi_i(x_1), \tag{1.40}$$

is the Coulomb term, which gives the Coulomb interaction of an electron in spin orbital $\chi_i$ with the average potential of the other N-1 electrons in the other spin orbitals. For future convenience, we define a Coulomb operator

$$\mathcal{J}_j(x_1) = \int dx_2\, |\chi_j(x_2)|^2\, r_{12}^{-1}, \tag{1.41}$$

which represents the average local potential at $x_1$ due to the charge distribution of the electron in orbital $\chi_j$.

The other term in Eq. (1.39) arises from the need to satisfy the antisymmetry of the Slater determinant. We call it the exchange term because it looks like the Coulomb term with the spin orbitals $\chi_i$ and $\chi_j$ swapped. Correspondingly, we introduce an exchange operator $\mathcal{K}_j(x_1)$, defined by its effect on an arbitrary spin orbital $\chi_i$:

$$\mathcal{K}_j(x_1)\chi_i(x_1) = \left[\int dx_2\, \chi_j^*(x_2)\, r_{12}^{-1}\, \chi_i(x_2)\right]\chi_j(x_1). \tag{1.42}$$

Using this notation, the Hartree-Fock equations can be written more concisely as

$$\left[h(x_1) + \sum_{j\neq i}\mathcal{J}_j(x_1) - \sum_{j\neq i}\mathcal{K}_j(x_1)\right]\chi_i(x_1) = \varepsilon_i\chi_i(x_1). \tag{1.43}$$

This has the form of an eigenvalue problem, and since we can show that

$$\left[\mathcal{J}_i(x_1) - \mathcal{K}_i(x_1)\right]\chi_i(x_1) = 0, \tag{1.44}$$

we can remove the restriction $j \neq i$ from the summations and define the Fock operator f as

$$f(x_1) = h(x_1) + \sum_j \left[\mathcal{J}_j(x_1) - \mathcal{K}_j(x_1)\right]. \tag{1.45}$$

The Hartree-Fock equations then become

$$f(x_1)\chi_i(x_1) = \varepsilon_i\chi_i(x_1). \tag{1.46}$$

Roothaan solved this equation by introducing a set of K known basis functions $\tilde{\chi}_\mu$ and expanding each $\chi_i$ linearly:

$$\chi_i = \sum_{\mu=1}^{K} C_{\mu i}\,\tilde{\chi}_\mu. \tag{1.47}$$

For every spin orbital i, this gives

$$f(x_1)\sum_\nu C_{\nu i}\,\tilde{\chi}_\nu(x_1) = \varepsilon_i \sum_\nu C_{\nu i}\,\tilde{\chi}_\nu(x_1). \tag{1.48}$$

Multiplying by $\tilde{\chi}_\mu^*(x_1)$ on the left and integrating turns the integro-differential equation into a matrix equation:

$$\sum_\nu C_{\nu i}\int dx_1\, \tilde{\chi}_\mu^*(x_1) f(x_1) \tilde{\chi}_\nu(x_1) = \varepsilon_i \sum_\nu C_{\nu i}\int dx_1\, \tilde{\chi}_\mu^*(x_1)\tilde{\chi}_\nu(x_1). \tag{1.49}$$

To simplify it, we introduce the matrix element notation

$$S_{\mu\nu} = \int dx_1\, \tilde{\chi}_\mu^*(x_1)\tilde{\chi}_\nu(x_1), \tag{1.50}$$

$$F_{\mu\nu} = \int dx_1\, \tilde{\chi}_\mu^*(x_1) f(x_1) \tilde{\chi}_\nu(x_1). \tag{1.51}$$

With these definitions, the integrated Hartree-Fock equation can be written as

$$\sum_\nu F_{\mu\nu} C_{\nu i} = \varepsilon_i \sum_\nu S_{\mu\nu} C_{\nu i}, \qquad i = 1, 2, \ldots, K. \tag{1.52}$$

These are the Roothaan equations, which can be written even more compactly as FC = SCε. To solve them as an ordinary eigenvalue equation, we must first turn the overlap matrix S into the identity matrix by finding a transformation matrix that orthogonalizes the basis functions [11].

Self-Consistent-Field Procedure

After this reformulation, the equation becomes a pseudo-eigenvalue equation FC = Cε in the orthogonalized basis. It is not a regular eigenvalue equation because, although C can be obtained by diagonalizing F, F itself depends on C, its own solution. The equation is therefore solved iteratively, which is why the Hartree-Fock equation is solved with an SCF procedure. An SCF procedure consists of the following steps [30].

1. Specify the molecule (including the nuclear coordinates $\{R_A\}$, the atomic numbers $\{Z_A\}$, and the basis functions).
2. Form the overlap matrix $S_{\mu\nu}$.
3. Guess the initial molecular orbital coefficients C.
4. Build the Fock matrix F using C from the previous step.
5. Solve FC = SCε.
6. Using the new C from the solution of step 5, build a new Fock matrix F.
7. Return to step 5 and repeat steps 5 and 6 until C is stable and no longer changes from one iteration to the next.
8. When the procedure has converged, use the final solution represented by C, F, etc.
to calculate other quantities of interest.

1.4.2 Density Functional Theory

Although the Hartree-Fock approximation plays an important role in many respects, it has drawbacks and weaknesses. In particular, its neglect of electron correlation can lead to large deviations from experimental results. There are a number of approaches to this problem. A better alternative to Hartree-Fock calculations is Density Functional Theory (DFT), which is among the most popular and versatile methods in computational chemistry. In this section, we give a brief explanation of DFT, whose name comes from its use of functionals of the electron density [13].

The Kohn-Sham Equations

The framework of Density Functional Theory (DFT) originates from the Hohenberg-Kohn (H-K) theorems, which demonstrate that the ground-state electronic energy is determined completely by the electron density ρ. The advantage of an electron density approach over a wavefunction approach is that the electron density depends only on three spatial coordinates, while the complexity of a wavefunction increases quickly with the number of electrons [31]. The wavefunction is a function of 3N variables, where N is the number of electrons, whereas the density is a function of just three variables. Since the electron density is the square of the wavefunction integrated over N-1 electron coordinates, the many-body problem of N electrons is reduced to three spatial coordinates. Although the H-K theorem proves that there is a one-to-one connection between the electron density and the ground-state energy, the functional connecting these two quantities remains unknown.
The goal of DFT is to formulate functionals bridging the electron density and the energy [32].

Many attempts have been made to design DFT models in which all the energy components are expressed as functionals of the electron density. Because these methods fail to perform well, Kohn and Sham proposed the modern DFT method, in which we work with a fictitious system of non-interacting electrons, built from an auxiliary set of orbitals, such that the kinetic energy of the real electrons can be calculated to good accuracy. The only unknown functional is then the exchange-correlation energy, which is the difference in energy between the real system and the non-interacting system. Since most of the energy is calculated exactly in this method, only a small remainder is left to be computed by an approximate functional. We introduce the Kohn-Sham scheme and discuss some of its major features, starting with the orbitals and the non-interacting reference system [33].

The foundation of the Kohn-Sham (KS) DFT method is to construct a fictitious non-interacting system by introducing orbitals in such a way that the density of the non-interacting system is the same as that of the interacting electrons. Our goal in the KS formalism is to split the kinetic energy into two parts: one that can be calculated exactly, and a remainder that is just a small correction. With the KS method, one pays the price of increased variable complexity: since the orbitals are re-introduced, the number of variables increases from three to 3N. Also, a separate electron correlation term is needed [21].

The KS model is quite similar to the HF method; they share identical formulas for the kinetic, Coulomb electron-electron, and nuclear-electron energies. In Hartree-Fock, the Slater determinant $\Phi_{SD}$ represents an approximation to the true N-electron wavefunction, while in DFT, $\Phi_{SD}$ represents the exact wavefunction of a fictitious system of N non-interacting electrons.
The next crucial step is to introduce into the Hamiltonian an effective, local potential $V_s(r)$, which we need to describe the non-interacting reference system:

$$\hat{H}_s = -\frac{1}{2}\sum_i^N \nabla_i^2 + \sum_i^N V_s(r_i). \tag{1.53}$$

The Kohn-Sham orbitals are determined as the solutions of

$$\hat{f}^{KS}\varphi_i = \varepsilon_i\varphi_i, \tag{1.54}$$

where the one-electron Kohn-Sham operator $\hat{f}^{KS}$ is

$$\hat{f}^{KS} = -\frac{1}{2}\nabla^2 + V_s(r). \tag{1.55}$$

For these Kohn-Sham orbitals, the choice of the effective potential $V_s$ plays an important role in bridging the density of the newly constructed fictitious non-interacting system and that of the real interacting electron system. We need to find a $V_s$ such that $\rho_s$, the sum of the squared moduli of the orbitals $\{\varphi_i\}$, equals the real ground-state density of the interacting electrons:

$$\rho_s(r) = \sum_i^N \sum_s |\varphi_i(r, s)|^2 = \rho_0(r). \tag{1.56}$$

The next step is to find an expression for the potential $V_s$. We start with the kinetic energy of the non-interacting system [34]:

$$T_s = -\frac{1}{2}\sum_i^N \langle\varphi_i|\nabla^2|\varphi_i\rangle. \tag{1.57}$$

The difference in kinetic energy between the interacting and non-interacting systems is included in the exchange-correlation term $E_{xc}[\rho(r)]$. This term also includes the difference in Coulomb energy between the interacting and non-interacting systems. We can write the energy functional as

$$E[\rho(r)] = T_s[\rho(r)] + J[\rho(r)] + E_{xc}[\rho(r)] + E_{ne}[\rho], \tag{1.58}$$

where $E_{xc}$ is defined as

$$E_{xc}[\rho] = (T[\rho] - T_s[\rho]) + (E_{ee}[\rho] - J[\rho]) = T_c[\rho] + E_{ncl}[\rho]. \tag{1.59}$$

Here $T_c[\rho]$ can be regarded as the kinetic correlation energy, and $E_{ncl}[\rho]$ contains both the potential correlation and the exchange energy. In other words, the exchange-correlation energy $E_{xc}$ is the functional that includes everything that remains unknown. As for $T_s$, we expect it to be a functional of ρ [35].

We now discuss how to find a $V_s$ that satisfies the requirement that the density of the non-interacting reference system is the same as that of the real system.
We can also regard the energy of the non-interacting system as having two components: the kinetic energy and the energy of interaction with the external potential. With the separation introduced in Eq. (1.58), we write the energy of the real, interacting system as [36]:

$$\begin{aligned}
E[\rho(r)] &= T_s[\rho(r)] + J[\rho(r)] + E_{xc}[\rho(r)] + E_{ne}[\rho] \\
&= T_s[\rho(r)] + \frac{1}{2}\iint \frac{\rho(r_1)\rho(r_2)}{r_{12}}\, dr_1\, dr_2 + E_{xc}[\rho] + \int V_{Ne}\,\rho(r)\, dr \\
&= -\frac{1}{2}\sum_i^N \langle\varphi_i|\nabla^2|\varphi_i\rangle + \frac{1}{2}\sum_i^N\sum_j^N \iint |\varphi_i(r_1)|^2\, \frac{1}{r_{12}}\, |\varphi_j(r_2)|^2\, dr_1\, dr_2 \\
&\quad + E_{xc}[\rho(r)] - \sum_i^N \int \sum_A^M \frac{Z_A}{r_{1A}}\, |\varphi_i(r_1)|^2\, dr_1.
\end{aligned} \tag{1.60}$$

Except for the unknown part $E_{xc}$, this is very similar to the Hartree-Fock energy, so we can proceed in the same way. We apply the variational principle and minimize the energy expression under the constraint of orthonormal orbitals. As a result, we obtain the equations

$$\left(-\frac{1}{2}\nabla^2 + \left[\int \frac{\rho(r_2)}{r_{12}}\, dr_2 + V_{xc}(r_1) - \sum_A^M \frac{Z_A}{r_{1A}}\right]\right)\varphi_i = \left(-\frac{1}{2}\nabla^2 + V_{eff}(r_1)\right)\varphi_i = \varepsilon_i\varphi_i. \tag{1.61}$$

Comparing these with the one-particle equations of the non-interacting system, we find that $V_{eff}$ is the same as $V_s$:

$$V_s(r_1) \equiv V_{eff}(r_1) = \int \frac{\rho(r_2)}{r_{12}}\, dr_2 + V_{xc}(r_1) - \sum_A^M \frac{Z_A}{r_{1A}}, \tag{1.62}$$

where $V_{eff}$ depends on the density through the Coulomb term. Using the notation of the Hartree-Fock integro-differential equation, with $\mathcal{J}(x_1)$ denoting the Coulomb operator of the total density, we write the Kohn-Sham equation as

$$\left[h(x_1) + \mathcal{J}(x_1) + V_{xc}(x_1)\right]\chi_i(x_1) = \varepsilon_i\chi_i(x_1). \tag{1.63}$$

As with the Hartree-Fock approximation, we resort to a basis set expansion and obtain the same formalism, FC = SCε, where F is now the Kohn-Sham matrix. The Kohn-Sham one-electron equations can therefore be solved iteratively by the SCF procedure introduced above. The potential $V_{xc}$ is defined as the functional derivative of $E_{xc}$ with respect to ρ:

$$V_{xc} \equiv \frac{\delta E_{xc}}{\delta\rho}. \tag{1.64}$$

Another important point about the Kohn-Sham approach is that, unlike the Hartree-Fock model, where the approximation enters from the start, Kohn-Sham theory is in fact exact.
It is only when we have to choose an explicit form of the unknown functional for the exchange-correlation energy $E_{xc}$ and the corresponding potential $V_{xc}$ that we need to resort to an approximation. The quality of a DFT calculation depends on the quality of these two terms.

1.5 Population Analysis

We now proceed to describe the main topic: population analysis.

One of the many molecular properties that can be extracted from the wavefunction is the electronic charge distribution. Unfortunately, it is hard to observe this distribution directly from experiment. This raises a question: given a wavefunction, how can we get an idea of the way charge is distributed among the constituent atoms of a molecule? Answering this question is the goal of population analysis. The most commonly used population analysis methods fall into two categories [3]:

(1) Partitioning the molecular wavefunction using an orbital-based scheme.
(2) Partitioning a physical observable derived from the wavefunction, such as the electron density.

Population Analysis Based on Basis Functions

Two of the most frequently used orbital-based partitioning schemes are Mulliken and Löwdin [22]. They are also closely related to our project. To illustrate them clearly, we start by introducing some basic notation. We already know that the electron density ρ represents the probability of finding an electron at a certain position r. We can also define an electron density $\rho_i$ for a single molecular orbital, written as the squared modulus of that orbital weighted by its occupation: $\rho_i^{MO}(r) = f_i\,|\psi_i^{MO}(r)|^2$, where $\psi_i^{MO}(r) = \sum_\mu^{AO}\sum_A^{atom} C_{i\mu}^{A}\,\phi_\mu^{A}(r)$ is a molecular orbital obtained after an SCF calculation has converged and $f_i$ is the occupation number of orbital $\psi_i^{MO}$. Summing the electron densities over all orbitals gives the total electron density [37]:

$$\rho_{tot}(r) = \sum_i^{MO} \rho_i^{MO}(r). \tag{1.65}$$

The total density is normalized to the total number of electrons in the system, $n_{tot}$:

$$\int \rho_{tot}(r)\, dr = n_{tot}. \tag{1.66}$$

If each orbital is expanded in a set of normalized but non-orthogonal basis functions φ, the electron density can be written as an expansion in products of the basis functions, and the expansion coefficients $d_{\mu\nu}^{AB}$ define the elements of the density matrix:

$$\rho_{tot}(r) = \sum_{\mu,\nu}^{AO}\sum_{A,B}^{atom} d_{\mu\nu}^{AB}\,\phi_\mu^{A*}\phi_\nu^{B}. \tag{1.67}$$

We can separate $\rho_{tot}(r)$ into two terms, one involving a single atomic center and the other involving two different centers:

$$\rho_{tot}(r) = \sum_{\mu,\nu}^{AO}\sum_{A}^{atom} d_{\mu\nu}^{AA}\,\phi_\mu^{A*}\phi_\nu^{A} + \sum_{\mu,\nu}^{AO}\sum_{A\neq B}^{atom} d_{\mu\nu}^{AB}\,\phi_\mu^{A*}\phi_\nu^{B}. \tag{1.68}$$

The first part is the ionic component of the total electron density, since it includes only the electron density from basis functions centered on a single atom:

$$\rho_{tot}^{ionic}(r) = \sum_{\mu,\nu}^{AO}\sum_{A}^{atom} d_{\mu\nu}^{AA}\,\phi_\mu^{A*}\phi_\nu^{A}. \tag{1.69}$$

Similarly, the ionic electron density of atom A, $\rho_A^{ionic}$, is defined as

$$\rho_A^{ionic}(r) = \sum_{\mu,\nu}^{AO} d_{\mu\nu}^{AA}\,\phi_\mu^{A*}\phi_\nu^{A}. \tag{1.70}$$

The second term in Eq. (1.68) represents the covalent part of the electron density, since it describes the overlap between two different atoms. If we sum only over unique pairs of atom centers A ≠ B, we get exactly half of the covalent part, because the density matrix is symmetric.

$$\rho_{tot}^{covalent}(r) = \sum_{\mu,\nu}^{AO}\sum_{A\neq B}^{atom} d_{\mu\nu}^{AB}\,\phi_\mu^{A*}\phi_\nu^{B}. \tag{1.71}$$

Similarly, the covalent contribution assigned to atom A from its overlap with atom B is defined as

$$\rho_A^{covalent}(r) = \frac{1}{2}\sum_{\mu,\nu}^{AO}\sum_{A\neq B}^{atom} d_{\mu\nu}^{AB}\,\phi_\mu^{A*}\phi_\nu^{B}. \tag{1.72}$$

Furthermore, by integrating the respective electron densities, we obtain the ionic and covalent charges:

$$n_A^{ionic} = \int \rho_A^{ionic}(r)\, dr = \sum_{\mu,\nu}^{AO} d_{\mu\nu}^{AA} S_{\mu\nu}^{AA}, \tag{1.73}$$

$$n_{AB}^{covalent} = \int \rho_{AB}^{covalent}(r)\, dr = 2\sum_{\mu,\nu}^{AO} d_{\mu\nu}^{AB} S_{\mu\nu}^{AB}, \tag{1.74}$$

$$n_A^{covalent} = \frac{1}{2} n_{AB}^{covalent} = \frac{1}{2}\int \rho_{AB}^{covalent}(r)\, dr = \sum_{\mu,\nu}^{AO} d_{\mu\nu}^{AB} S_{\mu\nu}^{AB}, \tag{1.75}$$

$$n_A = n_A^{ionic} + n_A^{covalent}, \tag{1.76}$$

where S is the overlap matrix:

$$S_{\mu\nu} = \int \phi_\mu^*(r)\,\phi_\nu(r)\, dr. \tag{1.77}$$

1.5.1 Mulliken and Löwdin Population Analyses

Among these methods, the simplest is the Mulliken scheme, which partitions the covalent contribution equally between the two atoms.
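The ionic and covalent populations of Eqs. (1.73) to (1.76), together with the equal split used by the Mulliken scheme, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names and the `atom_of` list (which maps each basis function to its atom) are hypothetical, and the example density matrix describes a made-up homonuclear diatomic with one basis function per atom and an overlap of 0.5.

```python
def populations(d, s, atom_of):
    """Return (n_ionic, n_cov_pair) from density matrix d and overlap matrix s.

    atom_of[mu] is the index of the atom on which basis function mu sits.
    n_ionic[a]       follows Eq. (1.73): d*S summed over functions both on atom a.
    n_cov_pair[a][b] follows Eq. (1.74): d*S summed over both orderings (a,b), (b,a).
    """
    n_atoms = max(atom_of) + 1
    n_ionic = [0.0] * n_atoms
    n_cov_pair = [[0.0] * n_atoms for _ in range(n_atoms)]
    for mu, a in enumerate(atom_of):
        for nu, b in enumerate(atom_of):
            term = d[mu][nu] * s[mu][nu]
            if a == b:
                n_ionic[a] += term
            else:
                # Each ordered (mu, nu) product belongs to the pair {a, b}.
                n_cov_pair[a][b] += term
                n_cov_pair[b][a] += term
    return n_ionic, n_cov_pair


def mulliken_populations(n_ionic, n_cov_pair):
    """Gross populations: ionic part plus half of every pair population."""
    return [n_ionic[a] + 0.5 * sum(n_cov_pair[a]) for a in range(len(n_ionic))]


# Hypothetical example: doubly occupied bonding orbital of a homonuclear
# diatomic, one basis function per atom, overlap 0.5 (illustrative only).
overlap = 0.5
x = 1.0 / (1.0 + overlap)          # density matrix element 2 * c**2
d = [[x, x], [x, x]]
s = [[1.0, overlap], [overlap, 1.0]]
atom_of = [0, 1]

n_ionic, n_cov_pair = populations(d, s, atom_of)
gross = mulliken_populations(n_ionic, n_cov_pair)
print(n_ionic, n_cov_pair[0][1], gross)
```

In this symmetric example each atom ends up with one electron, and the ionic and covalent parts together recover the total of two electrons, as Eq. (1.66) requires.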
In Mulliken population analysis, the population on atom A is the ionic population of A, obtained from Eq. (1.70), plus half of all the covalent contributions involving atom A, obtained from Eq. (1.72). That is, the total number of electrons associated with atom A is the sum of the contributions from all AOs located on that atom. Different population analysis methods differ in how the contributions involving basis functions on different atoms are divided among those atoms. The gross atomic population is the sum over all the atomic orbitals on a given atom. We then subtract this gross atomic population from the nuclear charge to get the partial charge on each atom:

$$Q_A = Z_A - n_A. \tag{1.78}$$

While the Mulliken method partitions the matrix $d_{\mu\nu}S_{\mu\nu}$, another frequently used method, Löwdin, uses the symmetric orthogonalization of the matrix d,

$$d' = S^{1/2} \cdot d \cdot S^{1/2}. \tag{1.79}$$

The rest of the Löwdin analysis proceeds as in Mulliken but with d replaced by d'. Mulliken and Löwdin are just two typical examples of population analyses that use $S^{n} \cdot d \cdot S^{1-n}$ matrices [38]. Neither is superior to the other in terms of giving the best result. Mulliken does not require an orthogonal basis, so it may yield negative gross populations or orbital populations greater than two. In contrast, Löwdin analysis transforms all the atomic orbitals into an orthogonal basis. With the orthogonalized basis, the orbital populations are constrained to the correct range [0, 2] [39].

1.5.2 Natural Population Analysis (NPA)

Besides Mulliken and Löwdin, there is another popular method called "natural population analysis", which was developed to perform population analysis in general atomic orbital basis sets [40]. This method shows a significant improvement over conventional Mulliken population analysis for ionic compounds.
In this section, we give a brief explanation of natural population analysis, first comparing its natural atomic orbitals (NAOs) with conventional natural orbitals (NOs) and then outlining the construction of the method.

Relationship to Natural Orbitals

The NAOs resemble the conventional NOs introduced by Löwdin. For isolated atoms, both are orthogonal atomic orbitals of maximal occupancy. For polyatomic molecules, however, they differ. Conventional NOs are defined as the orthonormal molecular orbitals of maximal occupancy and are thus completely delocalized. In contrast, the NAOs are defined as orthonormal atomic orbitals of maximal occupancy and, as a consequence, are localized on the individual atoms of the molecule.

Outline of the Construction of the Method

There are essentially two steps in the construction of NAOs: first, diagonalizing the one-center blocks of the density matrix to obtain a set of "pre-NAOs"; second, removing the interatomic overlap. The diagonalization in the first step separates the pre-NAOs, according to whether their occupation numbers are significantly greater than zero, into two sets: (1) the "minimal" set, which corresponds to the occupied atomic orbitals of the isolated atom, and (2) the "Rydberg" set, which consists of the remaining orbitals. Since the pre-NAOs on one center overlap those on another, they are not suitable for assessing the atomic charges. Therefore, we need to remove the interatomic overlap as the second step. The main idea is to orthogonalize the whole set of orbitals while preserving the atom-centered character of the pre-NAOs as much as possible. To achieve that goal, we need to perform an orthogonalization transformation on the pre-NAOs.
Here, we summarize it briefly as four essential steps:

(1) Divide the pre-NAOs within each block into the "minimal" set and the Rydberg set.
(2) Perform the occupancy-weighted symmetric orthogonalization on all the minimal functions.
(3) By applying a standard Gram-Schmidt orthogonalization, make the "minimal" set and the Rydberg set on the same center orthogonal to each other.
(4) Perform an occupancy-weighted procedure to make the Rydberg sets on one center orthogonal to the Rydberg sets on another center.

After all four steps, we obtain a set of orthogonal orbitals, which are denoted NAOs. The diagonal elements of the density matrix in this basis represent the orbital populations. The atomic population is the sum of the contributions of the orbitals centered on that atom. Empirically, the "minimal" set plays an essential role, as it accounts for up to 99% of the electron density. Furthermore, as in the case of Löwdin, the electron occupation numbers are constrained between 0 and 2. As the basis set grows, the charges converge to well-defined values. However, NAOs also have some disadvantages. For example, they can extend too far from the atom on which they are centered. This can lead to the scenario where electron density near one nucleus is attributed to another nucleus [40].

In our calculations, NPA plays an important role: it serves as the standard against which we evaluate the other methods, since for most systems NPA generates stable and reasonable results with all basis sets. However, NPA requires orthogonalization, which is computationally expensive. For big molecules and large basis sets, we should try to avoid NPA and find substitute methods. Another motivation for developing other methods is that NPA is not widely available in computational software.
We need to come up with methods that are easily implemented in most software packages.

1.5.3 Ionic Partition of Overlap Distribution (IPOD) methods

Mulliken and Löwdin population analyses do not usually generate stable results when calculated with different basis sets. NPA performs well in terms of stability and accuracy. However, since it is computationally expensive, we are looking for methods that do not require any orthogonalization while still generating results of the same quality as NPA. That is the motivation for developing our IPOD series.

In this section, we introduce the previously published population analysis by Ionic Partition of Overlap Distribution (IPOD) methods [37]. For our own project, we made some corrections to the partitioning of the covalent part of the electron charge. We start by introducing the simplest method, IPOD1. The approximation in IPOD1 consists of distributing the gross atomic population, $n_A^{IPOD1}$, in proportion to the ionic charge:

$$n_A^{IPOD1} \propto n_A^{ionic}, \tag{1.80}$$

where $n_A^{ionic}$ is the ionic charge of atom A from Eq. (1.73). While satisfying this proportionality, we need to make sure that the gross atomic populations add up to the total number of electrons $n_{tot}$:

$$n_A^{IPOD1} = \frac{n_A^{ionic}}{n_{tot}^{ionic}}\, n_{tot}. \tag{1.81}$$

Obviously, this method simply ignores the contribution from the covalent part. IPOD2 differs from IPOD1 in that it takes into account the covalent part, which is partitioned in the same ratio as the ionic charges. In mathematical form, the covalent population of an atom A bonded to atoms B and C becomes

$$n_A^{covalent} = \frac{n_A^{ionic}}{n_A^{ionic} + n_B^{ionic}} \cdot n_{AB}^{covalent} + \frac{n_A^{ionic}}{n_A^{ionic} + n_C^{ionic}} \cdot n_{AC}^{covalent}. \tag{1.82}$$

The idea of IPOD2 resembles that of the Christoffersen-Baker population analysis (CBPA). However, CBPA requires the orthogonalization of the basis functions on one atomic center to those on another. Besides, instead of applying the global density matrix, it performs the partition for every molecular orbital.
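The IPOD1 and IPOD2 partitions of Eqs. (1.81) and (1.82) can be sketched as follows. This is an illustrative sketch rather than the NWChem implementation: the function names are hypothetical, the gross IPOD2 population is assumed to be the ionic part plus the partitioned covalent part as in Eq. (1.76), and the input populations for a three-atom molecule are invented numbers chosen only so that they add up to ten electrons.

```python
def ipod1(n_ionic, n_total):
    """Eq. (1.81): scale the ionic populations so that they sum to n_total."""
    total_ionic = sum(n_ionic)
    return [n_total * n / total_ionic for n in n_ionic]


def ipod2(n_ionic, n_cov_pair):
    """Eq. (1.82): split each pair population in the ratio of ionic charges.

    n_cov_pair[a][b] is the covalent population shared by atoms a and b
    (symmetric, zero on the diagonal). The ionic part is kept on its atom.
    """
    n_atoms = len(n_ionic)
    gross = []
    for a in range(n_atoms):
        covalent = sum(
            n_ionic[a] / (n_ionic[a] + n_ionic[b]) * n_cov_pair[a][b]
            for b in range(n_atoms)
            if b != a
        )
        gross.append(n_ionic[a] + covalent)
    return gross


# Invented populations for a hypothetical A-B, A-C bonded triatomic.
n_ionic = [6.4, 0.6, 0.6]
n_cov_pair = [
    [0.0, 1.2, 1.2],
    [1.2, 0.0, 0.0],
    [1.2, 0.0, 0.0],
]
n_total = sum(n_ionic) + 1.2 + 1.2   # ionic + unique pair populations = 10

print(ipod1(n_ionic, n_total))
print(ipod2(n_ionic, n_cov_pair))
```

Both schemes conserve the total number of electrons: IPOD1 by construction, and IPOD2 because the two shares of each pair population add up to one.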
IPOD2 still has the problem that partitioning by the ionic density ratio gives too much weight to heavier atoms, which contradicts our requirement that the partition should reflect the electronegativity. To fix that, we propose several other schemes that reduce the dependence on atomic number. Heavier atoms have larger atomic numbers, so in IPOD2b we balance the ratio by dividing by the atomic number $Z_A$. Heavier atoms also tend to have more core electrons, so IPOD2c reduces their influence by removing the core electrons. IPOD2d can be regarded as a combination of IPOD2b and IPOD2c; we expect it to move the partition in the right direction and to act as a trade-off if either 2b or 2c over-corrects.

a) IPOD2b: First divide $n_A^{ionic}$ by the atomic number $Z_A$, then use the result as the ratio to partition the covalent part.

$$R_A = \frac{n_A^{ionic}}{Z_A}, \tag{1.83}$$

$$n_{A,IPOD2b}^{covalent} = \frac{R_A}{R_A + R_B} \cdot n_{AB}^{covalent} + \frac{R_A}{R_A + R_C} \cdot n_{AC}^{covalent}. \tag{1.84}$$

b) IPOD2c: First subtract the number of core electrons $n_{core}$ from $n_A^{ionic}$, then renormalize over the whole group.

$$G_A = n_A^{ionic} - n_{A,core}, \tag{1.85}$$

$$n_{A,IPOD2c}^{covalent} = \frac{G_A}{G_A + G_B} \cdot n_{AB}^{covalent} + \frac{G_A}{G_A + G_C} \cdot n_{AC}^{covalent}. \tag{1.86}$$

c) IPOD2d: First subtract the number of core electrons, then divide the result by the number of valence electrons $n_{valence}$.

$$F_A = \frac{n_A^{ionic} - n_{A,core}}{n_{A,valence}}, \tag{1.87}$$

$$n_{A,IPOD2d}^{covalent} = \frac{F_A}{F_A + F_B} \cdot n_{AB}^{covalent} + \frac{F_A}{F_A + F_C} \cdot n_{AC}^{covalent}. \tag{1.88}$$

Oxygen and fluorine require special treatment here: to avoid over-correction, we divide by five instead of the actual number of valence electrons.
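Since IPOD2, IPOD2b, IPOD2c, and IPOD2d differ only in the weight assigned to each atom, the four schemes can be sketched with one shared partition routine. This is a hedged sketch, not the code used in this work: the function names are hypothetical, the gross population is assumed to be the ionic part plus the partitioned covalent part, and the populations for an HF-like diatomic are invented for illustration. The valence count of 5 for fluorine follows the empirical choice described in the text.

```python
def partition_by_weights(weights, n_ionic, n_cov_pair):
    """Split each pair population n_cov_pair[a][b] in the ratio of the weights.

    Assumes every pair of weights has a positive sum (no division by zero).
    Returns gross populations: ionic part plus the partitioned covalent part.
    """
    n_atoms = len(n_ionic)
    gross = []
    for a in range(n_atoms):
        covalent = sum(
            weights[a] / (weights[a] + weights[b]) * n_cov_pair[a][b]
            for b in range(n_atoms)
            if b != a
        )
        gross.append(n_ionic[a] + covalent)
    return gross


def weights_ipod2b(n_ionic, z):
    """Eq. (1.83): R_A = n_ionic_A / Z_A."""
    return [n / za for n, za in zip(n_ionic, z)]


def weights_ipod2c(n_ionic, n_core):
    """Eq. (1.85): G_A = n_ionic_A - n_core_A."""
    return [n - c for n, c in zip(n_ionic, n_core)]


def weights_ipod2d(n_ionic, n_core, n_valence):
    """Eq. (1.87): F_A = (n_ionic_A - n_core_A) / n_valence_A."""
    return [(n - c) / v for n, c, v in zip(n_ionic, n_core, n_valence)]


# Invented populations for an HF-like diatomic (atom 0 = F, atom 1 = H).
n_ionic = [8.2, 0.4]
n_cov_pair = [[0.0, 1.4], [1.4, 0.0]]
z = [9, 1]
n_core = [2, 0]
n_valence = [5, 1]          # 5 instead of 7 for F, the empirical choice in the text

results = {
    "IPOD2": partition_by_weights(n_ionic, n_ionic, n_cov_pair),
    "IPOD2b": partition_by_weights(weights_ipod2b(n_ionic, z), n_ionic, n_cov_pair),
    "IPOD2c": partition_by_weights(weights_ipod2c(n_ionic, n_core), n_ionic, n_cov_pair),
    "IPOD2d": partition_by_weights(
        weights_ipod2d(n_ionic, n_core, n_valence), n_ionic, n_cov_pair
    ),
}
for name, gross in results.items():
    charges = [za - n for za, n in zip(z, gross)]   # Eq. (1.78): Q_A = Z_A - n_A
    print(name, charges)
```

Keeping the weights separate from the partition step makes it easy to compare the schemes on the same ionic and covalent populations, or to try another weighting without touching the partition itself.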
The value of 5 was determined empirically from our calculations, and it removes the over-correction to a good extent.

Chapter 2

Result and Discussion

Our Ionic Partition of Overlap Distribution (IPOD) series of algorithms, together with the Mulliken Population Analysis (MPA) and Löwdin Population Analysis (LPA) methods, are implemented in Fortran 77 in the NWChem program. The NPA method is run on the Gaussian09 platform. We studied the systems with a series of basis sets: STO-3G, 3-21G, 4-31G, 6-31G, 6-31G*, 6-31G**, 6-31G(2pd, 2p), 6-31G(3df, 3pd), 6-31++G, 6-31++G*, 6-31++G**, using the hybrid density functional B3LYP method in Gaussian09. To visualize the results, we used Python to plot the atomic charges from the different methods against basis sets of increasing size. We can see from the results (e.g., in Figure 2.1) that the NPA charges generally do not fluctuate much across basis sets, while the charges from the other methods generally depend on the basis set. Therefore, the basis set independence of NPA makes it a good standard for measuring the quality of the other methods. The systems for our analysis were chosen based on their functional groups and their polarity. We then analysed the results by comparing them across basis sets using Python graphs.

2.1 Fluorine Systems

For the fluorine atom in hydrogen fluoride (HF) in Table 2.1, our methods IPOD2b, IPOD2c, and IPOD2d all work well in terms of accuracy and stability when compared with the most stable method, NPA. Compared to the average NPA charge of -0.529, IPOD2b and IPOD2d have similar averages of -0.509 and -0.541, respectively. IPOD2c, with an average of -0.620, is slightly more polarized than NPA but is the most stable of these three methods, with a standard deviation of 0.034, just slightly larger than the 0.024 of NPA. In diatomic systems, IPOD1 and IPOD2 generate the same results, since IPOD2 partitions the covalent part in the same ratio as the ionic part.
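To see why the two schemes coincide for a diatomic molecule AB, one can combine Eqs. (1.76), (1.81), and (1.82). The short derivation below assumes, as the normalization in Eq. (1.66) implies, that the ionic populations and the single pair population add up to the total number of electrons, $n_{tot} = n_A^{ionic} + n_B^{ionic} + n_{AB}^{covalent}$, with $n_{tot}^{ionic} = n_A^{ionic} + n_B^{ionic}$:

$$\begin{aligned}
n_A^{IPOD2} &= n_A^{ionic} + \frac{n_A^{ionic}}{n_A^{ionic} + n_B^{ionic}}\, n_{AB}^{covalent} \\
&= \frac{n_A^{ionic}}{n_{tot}^{ionic}}\left(n_{tot}^{ionic} + n_{AB}^{covalent}\right) \\
&= \frac{n_A^{ionic}}{n_{tot}^{ionic}}\, n_{tot} = n_A^{IPOD1}.
\end{aligned}$$

For molecules with more than two atoms, the pairwise ratios in Eq. (1.82) no longer reduce to a single global ratio, so the two schemes generally give different populations.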
Both IPOD1 and IPOD2 polarize the molecule too much; IPOD2b therefore balances the partition by dividing by the atomic number $Z_A$, and IPOD2c penalizes the heavier atom by removing the core electrons.

Table 2.1: Average charges over all basis sets and standard deviations for HF and LiF.

              HF                      LiF
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.385    0.113        -0.545    0.099
LPA       -0.248    0.224        -0.429    0.188
NPA       -0.529    0.024        -0.821    0.083
IPOD1     -0.623    0.033        -0.697    0.064
IPOD2     -0.623    0.033        -0.697    0.064
IPOD2b    -0.509    0.066        -0.585    0.095
IPOD2c    -0.620    0.034        -0.778    0.048
IPOD2d    -0.541    0.059        -0.730    0.066

Table 2.2: Average charges over all basis sets and standard deviations for NaF and KF.

              NaF                     KF
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.631    0.164        -0.765    0.125
LPA       -0.530    0.193        -0.760    0.140
NPA       -0.833    0.102        -0.881    0.091
IPOD1     -0.622    0.170        -0.735    0.135
IPOD2     -0.622    0.170        -0.735    0.135
IPOD2b    -0.643    0.162        -0.770    0.120
IPOD2c    -0.809    0.102        -0.855    0.085
IPOD2d    -0.776    0.125        -0.840    0.100

[Figure 2.1: Partial charges for fluorine diatomic compounds. Panels: (a) F atom in HF, (b) F atom in LiF, (c) F atom in NaF, (d) F atom in KF. Horizontal axis: basis set; vertical axis: atomic charge. Methods plotted: MPA, LPA, NPA, IPOD1, IPOD2, IPOD2b, IPOD2c, IPOD2d. The basis sets are labeled as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).]

For the HF system in Figure 2.1(a), we can conclude that our methods IPOD2b, IPOD2c, and IPOD2d work much better than MPA and LPA, even though our methods do not require any orthogonalization. LPA in particular fluctuates too much with different basis sets.
For thebasis set 6-31(3df, 3pd), the charge already becomes positive while in realitythe charge of Fluorine should always remain negative.For Fluorine atom in Lithium Fluoride (LiF) from Figure 2.1(b), IPOD2bhas the same trend as LPA and MPA and all of the IPOD methods fluc-tuate severely compared to the rest of the methods. IPOD2b penalize theheavier atom F and makes it less negative. In terms of deviation in Table2.1, IPOD1, IPOD2, IPOD2c and IPOD2d all work better than NPA sincethey have smaller standard deviation than NPA. Especially for IPOD2c,the standard deviation is only 0.048 which indicates it is quite stable. Andcomparing the average of IPOD2c with that of NPA, they are rather close.Since the core electron for Li is only 3, by removing the core electron 2, Lihas much less weight in partitioning the covalent charges. As a result, Fgets more negative charges. In Figure 2.1(b) the purple line is a relativelyclose to the green line. Therefore, we can conclude that for LiF, IPOD2c isa better method than NPA considering the advantage of relatively cheapercalculation. Another trend worth mentioning is that with the adding of po-larization basis functions, all the methods gradually become less polarizeduntil it comes to the biggest two basis sets: 6-31(2pd,2p) and 6-31(3df,3pd).Afterwards, if we add diffuse functions to our basis set, it becomes polarizedagain (as the Figure 2.1(b) shows, the lines drop right from g to h), andbecome stable with diffuse functions.For Fluorine atom in Sodium Flouride (NaF) in Figure 2.1(c), similaras the previous two systems HF and LiF, LPA is still the most unstableone. IPOD2, IPOD2b and MPA follow the identical trend of decreasingthe polarization gradually with the increasing size of basis sets. Similarly,the decrease bounces back and becomes stable again when we add diffusefunctions to the basis sets. When it comes to IPOD2c method, the numberof core electrons for Na is 10 which is much bigger than the core of F which is2. 
By removing the core electrons, F receives more weight in partitioning the covalent contribution, which brings the results closer to those of the NPA method. Overall, IPOD2c and IPOD2d stand out for their stability and for a reasonable average charge close to that of NPA.

For the Fluorine atom in Potassium Fluoride (KF) in Figure 2.1(d), the basis sets 6-31G* and 6-31G** produce the same charge, as do 6-31++G* and 6-31++G**, since the double asterisk only adds polarization functions on hydrogen. Therefore, for this system, we do not have enough basis sets to determine which method works better than the others.

All in all, for the fluorine systems, our IPOD2c and IPOD2d methods in general work quite well in terms of accuracy and stability, in some cases even better than the NPA method. IPOD2b works better than LPA and MPA but is still not polarized enough to yield a good result.

2.2 Alcohol Systems

For the Alcohol systems, we mainly concentrate on two atoms: the oxygen and the carbon attached to that oxygen. In CH3OH, C2H5OH, and C4H9OH, the -OH group is attached to a carbon at the side. Only in C3H7OH is the -OH attached to the carbon in the middle.

Table 2.3: Average charges and standard deviations for atom C in CH3OH and C2H5OH for basis sets from 3-21G to 6-31G*.

          CH3OH                  C2H5OH
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.160    0.248        -0.023    0.166
LPA       -0.154    0.154        -0.060    0.095
NPA       -0.321    0.033        -0.113    0.026
IPOD1      0.294    0.212         0.146    0.240
IPOD2     -0.933    0.339        -0.396    0.273
IPOD2b    -0.277    0.351        -0.043    0.211
IPOD2c    -0.766    0.368        -0.290    0.272
IPOD2d    -0.177    0.379         0.026    0.225

Table 2.4: Average charges and standard deviations for atom C in C3H7OH and C4H9OH for basis sets from 3-21G to 6-31G*.

          C3H7OH                 C4H9OH
Method    Average   Std. Dev.    Average   Std. Dev.
MPA        0.034    0.177        -0.045    0.224
LPA        0.002    0.074        -0.071    0.108
NPA        0.072    0.020        -0.111    0.029
IPOD1      0.024    0.373         0.088    0.412
IPOD2      0.068    0.288        -0.438    0.385
IPOD2b     0.066    0.176        -0.071    0.291
IPOD2c     0.110    0.254        -0.329    0.392
IPOD2d     0.115    0.187        -0.001    0.317

Figure 2.2: Partial charges for the C atom in Alcohol compounds, plotted as atomic charge against basis set for MPA, LPA, NPA, IPOD1, IPOD2, IPOD2b, IPOD2c, and IPOD2d. Panels: (a) C atom in CH3OH; (b) C atom in CH3CH2OH; (c) C atom in CH3CH2CH2OH; (d) C atom in CH3CH2CH2CH2OH (basis sets a, b, c, d, f, g, h, j). The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).

Figure 2.3: Partial charges for the O atom in Alcohol compounds, plotted as atomic charge against basis set for the same methods. Panels: (a) O atom in CH3OH; (b) O atom in CH3CH2OH; (c) O atom in CH3CH2CH2OH; (d) O atom in CH3CH2CH2CH2OH (basis sets a, b, c, d, f, g, h, j). The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).

For the carbon atom in CH3OH, we can see from Figure 2.2 that the only relatively stable result is produced by the NPA method; the other methods all show obvious oscillations. They also share the same trend: as more polarization functions are added, the atomic charge becomes more positive, until diffuse functions are added, at which point the atomic charge drops off again and becomes even more negative than with the initial basis set.
Although in general no method except NPA is stable (as the standard deviations in Tables 2.3 and 2.4 show), our IPOD series work quite well within a certain range (with basis sets from 3-21G to 6-31G* in Figure 2.2). While IPOD2c is much more negative than NPA, which we assume is the most accurate method, IPOD2b and IPOD2d bring the charge back to the correct level.

For the other two similar systems, C2H5OH and C4H9OH, each method shows basically the same trend as for CH3OH, since in both the -OH is attached to a carbon at the side. The only system with a different trend from the other three is C3H7OH, where the -OH is attached to the carbon in the middle instead of at the side. For the carbon atom in C3H7OH, all the methods are more stable, except for certain basis sets such as 6-31G(3df, 3pd) and 6-31++G.

Since the charges oscillate severely with the basis set, the average over all basis sets for each method is not the best way to represent its accuracy. We would rather choose the one basis set that is on average the closest to the NPA method and take the charge it gives as the final charge produced by the method. In this way, we not only obtain a more accurate result but also save computational time and effort. From the systems discussed above, we can tentatively pick 6-31G as the basis set to use as the standard. Later we will discuss more systems to further support this choice.

Another atom we care about in the alcohols is the oxygen. Since the oxygens in all four systems above are attached to hydrogen on one side and carbon on the other, their environments are basically the same. As a result, all four systems produce similar trends in terms of oscillation. Our IPOD series are quite stable even compared with NPA.

2.3 Alkene Systems

For the Alkene systems, we choose two kinds of carbon in different environments. The first is the carbon at the end (the side carbon), which is attached to hydrogen on one side and to a carbon through a double bond on the other side. The other is the carbon in the middle, which is attached to two carbons on different sides. We concentrate on these two carbons because we want to evaluate how our methods work for more polarized and less polarized atoms even within the same molecule. Hence we plot both the carbon at the end and the carbon in the middle.

Figure 2.4: Partial charges for the side Carbon atom in Alkene compounds, plotted as atomic charge against basis set for MPA, LPA, NPA, IPOD1, IPOD2, IPOD2b, IPOD2c, and IPOD2d. Panels: (a) C atom in C2H4; (b) C atom in C3H4; (c) C atom in C3H6; (d) C atom in C4H6. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).

Table 2.5: Average charges and standard deviations for atom Carbon (side) in C2H4 and C3H4 for basis sets from 3-21G to 6-31G(2pd, 2p).

          C2H4                   C3H4
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.248    0.087        -0.407    0.096
LPA       -0.242    0.050        -0.283    0.046
NPA       -0.431    0.005        -0.521    0.006
IPOD1     -0.498    0.106         0.642    0.148
IPOD2     -0.802    0.079        -1.028    0.099
IPOD2b    -0.372    0.112        -0.603    0.133
IPOD2c    -0.720    0.091        -0.978    0.118
IPOD2d    -0.333    0.120        -0.598    0.150

Table 2.6: Average charges and standard deviations for atom Carbon (side) in C3H6 and C4H6 for basis sets from 3-21G to 6-31G(2pd, 2p).

          C3H6                   C4H6
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.273    0.076        -0.287    0.072
LPA       -0.275    0.059        -0.253    0.053
NPA       -0.448    0.006        -0.416    0.006
IPOD1     -0.448    0.071         0.505    0.077
IPOD2     -0.843    0.081        -0.858    0.075
IPOD2b    -0.407    0.106        -0.433    0.099
IPOD2c    -0.767    0.088        -0.788    0.083
IPOD2d    -0.378    0.110        -0.407    0.106

Table 2.7: Average charges and standard deviations for atom Carbon (middle) in C2H4 and C3H4 for basis sets from 3-21G to 6-31G(2pd, 2p).

          C2H4                   C3H4
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.248    0.087         0.223    0.086
LPA       -0.242    0.050         0.015    0.005
NPA       -0.431    0.005         0.068    0.007
IPOD1     -0.498    0.106         0.087    0.211
IPOD2     -0.802    0.079         0.355    0.080
IPOD2b    -0.372    0.112         0.306    0.104
IPOD2c    -0.720    0.091         0.393    0.104
IPOD2d    -0.333    0.120         0.350    0.130

Figure 2.5: Partial charges for the middle Carbon atom in Alkene compounds, plotted as atomic charge against basis set for the same methods. Panels: (a) C atom in C2H4; (b) C atom in C3H4; (c) C atom in C3H6; (d) C atom in C4H6. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).

Table 2.8: Average charges and standard deviations for atom C (middle) in C3H6 and C4H6 for basis sets from 3-21G to 6-31G(2pd, 2p).

          C3H6                   C4H6
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.048    0.063        -0.067    0.066
LPA       -0.117    0.033        -0.117    0.033
NPA       -0.215    0.003        -0.253    0.002
IPOD1     -0.297    0.123         0.248    0.122
IPOD2     -0.197    0.047        -0.255    0.063
IPOD2b    -0.057    0.071        -0.090    0.083
IPOD2c    -0.155    0.057        -0.215    0.073
IPOD2d    -0.032    0.078        -0.067    0.085

Figure 2.4 shows that, across the different systems, the side carbon always shares a similar trend. The charges are quite stable for basis sets 3-21G to 6-31G(2pd, 2p). At 6-31G(3df, 3pd), the atomic charge suddenly becomes much more positive and breaks the flat curve. The charges drop back when we start adding diffuse functions.
Since the total average including the basis set 6-31G(3df, 3pd) is not very representative, we calculate the average only over the basis sets from 3-21G to 6-31G(2pd, 2p) and evaluate the standard deviation accordingly. In this range, our IPOD2b and IPOD2d methods perform the best. Taking C2H4 as an example, Table 2.5 shows that their averages are -0.372 and -0.333, respectively, while the average of NPA is -0.431. Besides, we can observe that for IPOD2b and IPOD2d, the charge produced with 6-31G represents the result well when compared with NPA.

In contrast, for the carbon in the middle, the charges produced by our methods are not as good as expected. As Figure 2.5 shows, for C3H6 and C4H6, within the range of basis sets from 3-21G to 6-31G(2pd, 2p), IPOD2c is the best among IPOD2b, IPOD2c, and IPOD2d. IPOD2b and IPOD2d become too non-polarized, and their charges are almost zero, but at least they are quite stable for these two systems. However, for C3H4, the charges fluctuate severely even within the range from 3-21G to 6-31G**. Since the carbon of interest in C3H4 is attached only to two carbons through double bonds, with no hydrogens attached, it is the most non-polarized atom among the four. The middle carbons in C3H6 and C4H6 are each attached to two carbons plus a hydrogen, so they are less polarized than the carbon in C2H4 but still more polarized than that in C3H4. Accordingly, the quality of the result falls in the middle as well. Hence, we can tentatively conclude that our methods work better for more polarized atoms even within the same system.

2.4 Aromatic Systems

Table 2.9: Average charges and standard deviations for atom Carbon (side) in Anthracene and Biphenyl for basis sets from 3-21G to 6-31G**.

          Anthracene             Biphenyl
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.138    0.029        -0.128    0.035
LPA       -0.116    0.019        -0.118    0.022
NPA       -0.240    0.003        -0.240    0.003
IPOD1     -0.144    0.074        -0.184    0.069
IPOD2     -0.364    0.043        -0.362    0.047
IPOD2b    -0.186    0.046        -0.178    0.050
IPOD2c    -0.328    0.046        -0.326    0.050
IPOD2d    -0.164    0.048        -0.162    0.053

Table 2.10: Average charges and standard deviations for atom Carbon (side) in Naphthalene and Phenanthrene for basis sets from 3-21G to 6-31G**.

          Naphthalene            Phenanthrene
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.134    0.029        -0.126    0.032
LPA       -0.116    0.019        -0.118    0.022
NPA       -0.239    0.003        -0.237    0.003
IPOD1     -0.190    0.062        -0.134    0.079
IPOD2     -0.358    0.046        -0.354    0.047
IPOD2b    -0.182    0.049        -0.172    0.049
IPOD2c    -0.322    0.050        -0.316    0.051
IPOD2d    -0.160    0.049        -0.154    0.052

Figure 2.6: Partial charges for the side Carbon atom in Aromatic compounds, plotted as atomic charge against basis set for MPA, LPA, NPA, IPOD1, IPOD2, IPOD2b, IPOD2c, and IPOD2d. Panels: (a) C atom in Anthracene; (b) C atom in Biphenyl (basis sets a, b, c, d, e, h, j); (c) C atom in Naphthalene (basis sets a, b, c, d, e, f, h, i, j); (d) C atom in Phenanthrene (basis sets a, b, c, d, e, f, h). The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).

Figure 2.7: Partial charges for the middle Carbon atom in Aromatic compounds, plotted as atomic charge against basis set for the same methods. Panels: (a) C atom in Anthracene; (b) C atom in Biphenyl (basis sets a, b, c, d, e, h, j); (c) C atom in Naphthalene (basis sets a, b, c, d, e, f, h, i, j); (d) C atom in Phenanthrene (basis sets a, b, c, d, e, f, h). The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).

Table 2.11: Average charges and standard deviations for atom Carbon (middle) in Anthracene and Biphenyl for basis sets from 3-21G to 6-31G**.

          Anthracene             Biphenyl
Method    Average   Std. Dev.    Average   Std. Dev.
MPA        0.076    0.052         0.038    0.034
LPA       -0.008    0.007        -0.008    0.007
NPA       -0.058    0.002        -0.050    0.002
IPOD1     -0.036    0.125        -0.154    0.082
IPOD2      0.152    0.059         0.120    0.040
IPOD2b     0.112    0.059         0.064    0.036
IPOD2c     0.156    0.069         0.116    0.045
IPOD2d     0.118    0.069         0.066    0.039

Table 2.12: Average charges and standard deviations for atom Carbon (middle) in Naphthalene and Phenanthrene for basis sets from 3-21G to 6-31G**.

          Naphthalene            Phenanthrene
Method    Average   Std. Dev.    Average   Std. Dev.
MPA        0.066    0.045         0.066    0.054
LPA       -0.008    0.007        -0.008    0.007
NPA       -0.058    0.002        -0.057    0.002
IPOD1     -0.060    0.137        -0.056    0.130
IPOD2      0.136    0.055         0.144    0.064
IPOD2b     0.094    0.052         0.096    0.064
IPOD2c     0.138    0.062         0.148    0.070
IPOD2d     0.102    0.059         0.108    0.071

So far we already know that our methods have some flaws when applied to non-polarized atoms in a molecule. Here, we will see how our methods work when applied to a typical non-polarized group of systems, the Aromatic group. For this group, we focus on two carbon atoms: one at the side of a benzene ring, and another at the joint between two benzene rings.
Both of them are quite non-polarized, and the joint carbon even more so. The averages and standard deviations are shown in Tables 2.9 to 2.12.

For the Carbon atom at the side of the benzene rings, as Figure 2.6 shows, in Anthracene and Biphenyl our IPOD series produce less polarized charges than the NPA method. In terms of stability, the results are mediocre compared with NPA, whose standard deviation is only 0.003 for both systems (Table 2.9). For the other two systems, Naphthalene and Phenanthrene, the results are much the same. The deviations are around 0.049 and 0.052, respectively, which is much bigger than the 0.003 of NPA.

Similarly, the results generated by our methods are even worse for the joint carbon between benzene rings. The average charges become positive (Tables 2.11 and 2.12), and the results for the joint carbon are even worse than those for the side carbon, as we can see from Figure 2.7. Since the joint carbon between benzene rings is more non-polarized, our methods cannot handle these cases as well as the more polarized situations. Again, NPA has a very small standard deviation of 0.002 for all four systems. It seems that the less polarized the atom, the more stable NPA is.

Based on the above analysis, we find that our methods are not that reliable for aromatic systems. It is still useful to know in advance the scenarios in which our methods can be applied. All in all, we can conclude that our IPOD series do not perform well for non-polarized atoms or for non-polarized systems such as aromatic groups. For these systems we would prefer the NPA method instead. Moreover, non-polarized atoms are usually not very chemically reactive, and we do not do much research on these atoms in particular.

2.5 Small Inorganic Molecules

The last series of systems we analyze with our methods is small inorganic molecules. We choose the Carbon atom in CO2 and CO and the Nitrogen atom in HCN and NH3 to compare how our methods perform.
Figure 2.8: Partial charges for atoms in small inorganic molecules, plotted as atomic charge against basis set for MPA, LPA, NPA, IPOD1, IPOD2, IPOD2b, IPOD2c, and IPOD2d. Panels: (a) C atom in CO2; (b) C atom in CO; (c) N atom in HCN; (d) N atom in NH3. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).

Table 2.13: Average charges and standard deviations for atom Carbon in CO2 and CO for basis sets from 3-21G to 6-31G**.

          CO2                    CO
Method    Average   Std. Dev.    Average   Std. Dev.
MPA        0.730    0.045         0.288    0.034
LPA        0.326    0.045         0.108    0.031
NPA        0.942    0.062         0.506    0.032
IPOD1      1.244    0.078         0.362    0.024
IPOD2      0.990    0.063         0.362    0.024
IPOD2b     0.866    0.058         0.310    0.030
IPOD2c     1.112    0.077         0.396    0.020
IPOD2d     1.028    0.074         0.352    0.024

Table 2.14: Average charges and standard deviations for atom Nitrogen in HCN and NH3 for basis sets from 3-21G to 6-31G**.

          HCN                    NH3
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.328    0.059        -0.808    0.060
LPA       -0.038    0.028        -0.580    0.107
NPA       -0.291    0.020        -1.105    0.031
IPOD1     -0.784    0.075        -1.294    0.072
IPOD2     -0.430    0.064        -1.686    0.052
IPOD2b    -0.382    0.067        -1.164    0.075
IPOD2c    -0.498    0.067        -1.640    0.055
IPOD2d    -0.426    0.073        -1.158    0.079

For CO2, we can see from Table 2.13 that within the range from 3-21G to 6-31G**, the average for NPA is 0.942. IPOD2b and IPOD2d give rather stable results and rather accurate averages of 0.866 and 1.028, respectively. Moreover, the standard deviation of IPOD2b is only 0.058, smaller than that of NPA (0.062), which indicates that IPOD2b is even more stable.
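The averages and standard deviations quoted here are taken over a restricted range of basis sets. A minimal Python sketch of that bookkeeping is given below. It rests on two assumptions that should be stated plainly: the per-basis-set charges are placeholder numbers invented for illustration (the text reports only the aggregated values), and the sample standard deviation is used because the text does not say which estimator was applied. The helper `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Basis sets in the order used on the x-axis of the figures (a to j).
BASIS_ORDER = ["3-21G", "4-31G", "6-31G", "6-31G*", "6-31G**",
               "6-31G(2pd,2p)", "6-31G(3df,3pd)",
               "6-31++G", "6-31++G*", "6-31++G**"]


def summarize(charges, first, last):
    """Mean and sample standard deviation over basis sets first..last (inclusive).

    Basis sets missing from `charges` are skipped, as some systems in the
    figures were not computed with every basis set.
    """
    lo, hi = BASIS_ORDER.index(first), BASIS_ORDER.index(last)
    selected = [charges[b] for b in BASIS_ORDER[lo:hi + 1] if b in charges]
    return mean(selected), stdev(selected)


# HYPOTHETICAL charges for one atom and one method; not data from this work.
example_charges = {"3-21G": 0.90, "4-31G": 1.00, "6-31G": 1.10,
                   "6-31G*": 0.95, "6-31G**": 1.05, "6-31G(3df,3pd)": 1.60}

# The outlying 6-31G(3df,3pd) value falls outside the chosen range.
avg, sd = summarize(example_charges, "3-21G", "6-31G**")
print(f"average = {avg:.3f}, std. dev. = {sd:.3f}")
```

Restricting the range in this way keeps a single outlying basis set from dominating both the average and the standard deviation.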
Also, if we look at Figure 2.8(a), we can see that for the basis set 6-31G, IPOD2c and IPOD2d produce results rather close to NPA, which further confirms that we can use 6-31G to represent the average for CO2. All in all, for CO2, the IPOD2 series shows better performance than the other methods in terms of a more reliable average and fewer oscillations.

A special case that we are particularly interested in is CO, since in reality the C atom should carry a negative charge and the O atom a positive charge. However, in Figure 2.8(b), we can see that none of the methods produces the correct result. All the charges produced by our methods are between 0.2 and 0.4, while NPA produces a positive charge of 0.5, which is even worse. Therefore, we cannot evaluate our methods for CO, and NPA cannot generate a reasonable result for this system either.

Comparing HCN with NH3, HCN is a more polarized system than NH3. In HCN, however, we focus on the charge of the N atom in the C-N bond, which is less polarized than the N-H bond in NH3. For the C-N bond in HCN, IPOD2b gives the result closest to NPA among our methods, -0.382 vs. -0.291. Besides, from Figure 2.8(c), we can see that for the basis set 6-31G, IPOD2b and IPOD2d give charges quite similar to those of NPA. Therefore, we can conclude that for a less polar bond in a polarized system, using the specific basis set 6-31G with our methods to approximate the charges still works. On the other hand, for the N-H bond in NH3, we find from Table 2.14 that IPOD2b and IPOD2d both give rather accurate averages compared with NPA. Figure 2.8(d) also indicates that using the 6-31G basis set to approximate the charge for a polarized bond in a less polar system works.

For the Oxygen atom in N2O, our IPOD series work relatively well in the range from 3-21G to 6-31G**, as shown in Figure 2.9(a), although the charge fluctuates a bit at the basis sets 6-31G* and 6-31G**. All the IPOD methods share the same trend for N2O in this range.
Figure 2.9(a) also shows that 6-31G can be used to approximate NPA for this system. For the Oxygen atom in HNO, the IPOD methods almost overlap one another within the range from 3-21G to 6-31G**, as Figure 2.9(b) shows. Their results are also very stable, with a standard deviation of around 0.013, which is even smaller than that of NPA (0.023). Some differences between the IPOD methods and NPA remain: our charges lie slightly below those of NPA, which indicates that IPOD gives a more polarized result than expected, as does the 6-31G basis set.

Table 2.15: Average charges and standard deviations for atom Oxygen in N2O and HNO for basis sets from 3-21G to 6-31G**.

          N2O                    HNO
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.302    0.058        -0.262    0.010
LPA       -0.204    0.026        -0.088    0.015
NPA       -0.273    0.038        -0.195    0.023
IPOD1     -0.404    0.128        -0.596    0.017
IPOD2     -0.324    0.063        -0.258    0.013
IPOD2b    -0.314    0.063        -0.256    0.014
IPOD2c    -0.332    0.065        -0.260    0.013
IPOD2d    -0.332    0.065        -0.268    0.013

Table 2.16: Average charges and standard deviations for atom Oxygen in H2O and HOOH for basis sets from 3-21G to 6-31G**.

          H2O                    HOOH
Method    Average   Std. Dev.    Average   Std. Dev.
MPA       -0.702    0.056        -0.808    0.060
LPA       -0.528    0.093        -0.580    0.170
NPA       -0.918    0.040        -1.105    0.031
IPOD1     -1.062    0.053        -1.294    0.072
IPOD2     -1.166    0.041        -1.686    0.052
IPOD2b    -0.914    0.061        -1.164    0.075
IPOD2c    -1.148    0.041        -1.640    0.055
IPOD2d    -0.952    0.057        -1.158    0.079

For the Oxygen atoms in H2O and HOOH, Figure 2.9(c) and (d) show the same trend, because both Oxygen atoms come from an O-H bond. IPOD2b and IPOD2d both give good average charges compared with NPA. The fluctuation is also reasonable within the range from 3-21G to 6-31G**. As with the previous systems, the basis set 6-31G produces a reasonable result to represent our methods in these two systems, as we initially expected.

Figure 2.9: Partial charges for the Oxygen atom in small inorganic molecules, plotted as atomic charge against basis set for MPA, LPA, NPA, IPOD1, IPOD2, IPOD2b, IPOD2c, and IPOD2d. Panels: (a) O atom in N2O; (b) O atom in HNO; (c) O atom in H2O; (d) O atom in HOOH. The basis sets are as follows: (a, 3-21G), (b, 4-31G), (c, 6-31G), (d, 6-31G*), (e, 6-31G**), (f, 6-31G(2pd, 2p)), (g, 6-31G(3df, 3pd)), (h, 6-31++G), (i, 6-31++G*), (j, 6-31++G**).

2.6 Correlation between Different Methods with NPA

We now want to compare how each method performs based on its correlation plot with NPA, using all the data produced from all the systems and basis sets. In Figure 2.10 and Figure 2.11, we plot the charges produced by the NPA method along the x-axis, and the y-axis represents the charges from the other methods. We can judge how closely each method is related to NPA by comparing the slope with 1. The r2 value indicates how reliable the fit shown in the plot is. The closer the slope and r2 are to 1, the better the method performs.

From Figure 2.10, we can read that the slopes in the MPA and LPA plots are 1.36 and 1.89, respectively, both much larger than 1. This means that both the MPA and LPA methods polarize the charges more than they should. On the contrary, the slope for IPOD1 is only 0.67, which indicates that the IPOD1 method is too non-polarized. For our IPOD2 series in Figure 2.11, the slopes clearly show that IPOD2b and IPOD2d stand out from the other two. IPOD2b has a slope of 1.056, while IPOD2d is even closer to one at 1.007. In terms of reliability, the plots are quite reasonable, since r2 is around 0.95 for both.
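The slope and r2 used in this comparison can be obtained from an ordinary least-squares fit. The sketch below is an assumption about the procedure, not the script used for the figures: it fits a line with an intercept (the text does not state whether one was included) and, as input, uses only the four average Fluorine charges from Tables 2.1 and 2.2 rather than the full data set, so its output is not expected to reproduce the slopes in the correlation figures. The function `slope_and_r2` is a hypothetical helper.

```python
def slope_and_r2(x, y):
    """Least-squares fit y = a*x + b; returns (slope a, coefficient r^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # For a straight-line fit with intercept, r^2 is the squared
    # Pearson correlation coefficient.
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# Average F charges from Tables 2.1 and 2.2, in the order HF, LiF, NaF, KF.
npa = [-0.529, -0.821, -0.833, -0.881]      # x-axis: reference (NPA)
ipod2d = [-0.541, -0.730, -0.776, -0.840]   # y-axis: method under test

slope, r2 = slope_and_r2(npa, ipod2d)
print(f"slope = {slope:.3f}, r^2 = {r2:.3f}")
```

With NPA on the x-axis, a slope above 1 means the tested method spreads the charges more widely than NPA, and a slope below 1 means it compresses them.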
It is then safe to conclude that IPOD2d is the best method in terms of correlation with NPA.

2.7 The Best Basis Set within IPOD2d Method

Since we have already established that IPOD2d is the best method apart from NPA, we now want to find the basis set that best represents the IPOD2d method. Figure 2.12 shows, for four different basis sets, how closely IPOD2d is related to the NPA method. We chose these four basis sets based on our earlier discussion of the different groups of systems. Comparing the four slopes, we can tell that 4-31G and 6-31G are better than the other two, with slopes of 0.987 and 0.990, respectively. 6-31G is therefore the basis set that we select to represent IPOD2d. This agrees with our earlier analysis based on inspecting each graph for the different systems, which further supports our conclusion. Choosing the best basis set to represent a method matters a great deal for practical calculations, since our methods are computationally much cheaper than the NPA method while still providing results similar to those of NPA.

Figure 2.10: Correlation between different methods with NPA. Panels: (a) correlation between MPA and NPA, slope = 1.36058, r2 = 0.94891; (b) correlation between LPA and NPA, slope = 1.89392, r2 = 0.92561; (c) correlation between IPOD1 and NPA, slope = 0.67243, r2 = 0.84376.

Figure 2.11: Correlation between different methods with NPA. Panels: (a) correlation between IPOD2 and NPA, slope = 0.72993, r2 = 0.89094; (b) correlation between IPOD2b and NPA, slope = 1.05676, r2 = 0.94784; (c) correlation between IPOD2c and NPA, slope = 0.75806, r2 = 0.92839; (d) correlation between IPOD2d and NPA, slope = 1.00728, r2 = 0.94975.

Figure 2.12: Correlation for different basis sets in IPOD2d with the same basis sets in NPA. Panels: (a) 3-21G, slope = 0.89720, r2 = 0.97996; (b) 4-31G, slope = 0.98736, r2 = 0.96633; (c) 6-31G, slope = 0.99016, r2 = 0.95599; (d) 6-31G*, slope = 0.87758, r2 = 0.93707.

Chapter 3
Summary

In this thesis, we evaluated the IPOD series of methods for determining partial atomic charges. To evaluate our methods more systematically, we grouped the molecules into categories according to their functional groups. As a result, we have more polarized groups such as the Fluorine and Alcohol systems, less polarized ones such as the Aromatic systems, and systems that fall in the middle, such as Alkenes and small inorganic molecules.

For the more polarized systems, we focus on the atoms from the most polarized bonds within each molecule.
We can conclude that IPOD2d performs the best in terms of stability and accuracy when NPA is used as the reference. Since the O-H bond is more polarized than the C-H bond, the charges of Oxygen atoms are reproduced even better than those of Carbon atoms.

For non-polarized systems, we mainly concentrate on two kinds of carbon atoms, one at the side of a benzene ring and another at the joint of two benzene rings. The major difference between the two is that the middle one is even less polarized. Our results indicate that in general our method does not work well for non-polarized systems, and the less polarized the bond is, the worse our method performs. Therefore, when applying our method to a new system in the future, we should use the NPA method instead of the IPOD series if the system is non-polarized.

Besides the above two general cases, there are still many systems that belong neither to the extremely polarized category nor to the non-polarized one. We use Alkenes and small inorganic molecules as examples to analyse how our method performs on different types of bonds. The plots show that IPOD2d generally gives a rather reasonable result for most cases in these two categories of systems. In addition, when the atom is from a more polarized bond, the result is more stable and more accurate. Hence, we can conclude that even within the same system, the IPOD series produce better results for atoms from more polarized bonds.

In the last part of our discussion, we analysed the correlation between the different methods and NPA in order to identify the best IPOD method. By comparing the slope from each plot, we can see that IPOD2d has a slope of 1.007, which is the closest to 1. This result further supports our earlier finding that IPOD2d is the best of the IPOD series of methods. Beyond finding the best method, we also want to find which basis set provides the charge closest to NPA. By plotting the correlation between IPOD2d and NPA for the best four basis sets, we can finally conclude that 6-31G is the best basis set. In conclusion, by using the IPOD2d method with the 6-31G basis set, we can obtain charges very close to those of NPA but at a much lower computational cost.

Bibliography

[1] RS Mulliken. Electronic population analysis on LCAO–MO molecular wave functions. I. The Journal of Chemical Physics, 23:1833, 1955.
[2] D Cremer and E Kraka. Chemical bonds without bonding electron density: does the difference electron-density analysis suffice for a description of the chemical bond? Angewandte Chemie International Edition, 23:627, 1984.
[3] RS Mulliken. Electronic population analysis on LCAO–MO molecular wave functions. II. Overlap populations, bond orders, and covalent bond energies. The Journal of Chemical Physics, 36:3428, 1962.
[4] RFW Bader, TT Nguyen-Dang, and Y Tal. A topological theory of molecular structure. Reports on Progress in Physics, 44:893, 1981.
[5] P Coppens and MB Hall. Electron distributions and the chemical bond. Springer Science & Business Media, 2012.
[6] MA Blanco, A Martín Pendás, and E Francisco. Interacting quantum atoms: a correlated energy decomposition scheme based on the quantum theory of atoms in molecules. Journal of Chemical Theory and Computation, 1:1096, 2005.
[7] RS Mulliken. Electronic population analysis on LCAO–MO molecular wave functions. II. Overlap populations, bond orders, and covalent bond energies. The Journal of Chemical Physics, 23:1841, 1955.
[8] JA Pople, M Head-Gordon, DJ Fox, K Raghavachari, and LA Curtiss. Gaussian-1 theory: A general procedure for prediction of molecular energies. The Journal of Chemical Physics, 90:5622, 1989.
[9] JA Pople, PMW Gill, and BG Johnson. Kohn-Sham density-functional theory within a finite basis set. Chemical Physics Letters, 199:557, 1992.
[10] SN Gariseb. A computational study of the structure, bonding, and thermochemical properties of primary Ozonides derived from substituted Phenol and Thiophenol. PhD thesis, 2013.
[11] A Szabo and NS Ostlund. Modern Quantum Chemistry. Dover Publications, 1982.
[12] RBJS Krishnan, JS Binkley, R Seeger, and JA Pople. Self-consistent molecular orbital methods. XX. A basis set for correlated wave functions. The Journal of Chemical Physics, 72:650, 1980.
[13] WJ Hehre, RF Stewart, and JA Pople. Self-consistent molecular-orbital methods. I. Use of Gaussian expansions of Slater-type atomic orbitals. The Journal of Chemical Physics, 51:2657, 1969.
[14] TH Dunning Jr. Gaussian basis functions for use in molecular calculations. I. Contraction of (9s5p) atomic basis sets for the first-row atoms. The Journal of Chemical Physics, 53:2823, 1970.
[15] A Schäfer, H Horn, and R Ahlrichs. Fully optimized contracted Gaussian basis sets for atoms Li to Kr. The Journal of Chemical Physics, 97:2571, 1992.
[16] JS Binkley, JA Pople, and WJ Hehre. Self-consistent molecular orbital methods. 21. Small split-valence basis sets for first-row elements. Journal of the American Chemical Society, 102:939, 1980.
[17] MM Francl, WJ Pietro, WJ Hehre, JS Binkley, MS Gordon, DJ DeFrees, and JA Pople. Self-consistent molecular orbital methods. XXIII. A polarization-type basis set for second-row elements. The Journal of Chemical Physics, 77:3654, 1982.
[18] WJ Pietro, MM Francl, WJ Hehre, DJ DeFrees, JA Pople, and JS Binkley. Self-consistent molecular orbital methods. 24. Supplemented small split-valence basis sets for second-row elements. Journal of the American Chemical Society, 104:5039, 1982.
[19] T Clark, J Chandrasekhar, GW Spitznagel, and PVR Schleyer. Efficient diffuse function-augmented basis sets for anion calculations. III. The 3-21+G basis set for first-row elements, Li–F. Journal of Computational Chemistry, 4:294, 1983.
[20] GK Chan and M Head-Gordon. Exact solution (within a triple-zeta, double polarization basis set) of the electronic Schrödinger equation for water. The Journal of Chemical Physics, 118:8551, 2003.
[21] F Jensen. Introduction to Computational Chemistry. John Wiley & Sons, 2017.
[22] P Löwdin. Quantum theory of many-particle systems. I. Physical interpretations by means of density matrices, natural spin-orbitals, and convergence problems in the method of configurational interaction. Physical Review, 97:1474, 1955.
[23] PR Taylor. Quantum theory of molecular electronic structure. Chemistry, 185:285, 1963.
[24] W Meyer. Theory of self-consistent electron pairs. An iterative method for correlated many-electron wavefunctions. The Journal of Chemical Physics, 64:2901, 1976.
[25] HJ Silverstone and O Sinanoğlu. Many-electron theory of nonclosed-shell atoms and molecules. I. Orbital wavefunction and perturbation theory. The Journal of Chemical Physics, 44:1899, 1966.
[26] JC Slater. A simplification of the Hartree-Fock method. Physical Review, 81:385, 1951.
[27] AJC Varandas and J Brandão. A double many-body expansion of molecular potential energy functions. Molecular Physics, 57:387, 1986.
[28] CCJ Roothaan. New developments in molecular orbital theory. Reviews of Modern Physics, 23:69, 1951.
[29] H Ehrenreich and MH Cohen. Self-consistent field approach to the many-electron problem. Physical Review, 115:786, 1959.
[30] TL Gilbert. Hohenberg-Kohn theorem for nonlocal external potentials. Physical Review B, 12:2111, 1975.
[31] E Runge and EKU Gross. Density-functional theory for time-dependent systems. Physical Review Letters, 52:997, 1984.
[32] M Levy. On the simple constrained-search reformulation of the Hohenberg–Kohn theorem to include degeneracies and more (1964–1979). International Journal of Quantum Chemistry, 110:3140, 2010.
[33] W Koch and MC Holthausen. A chemist's guide to density functional theory. John Wiley & Sons, 2015.
[34] TA Wesolowski and J Weber. Kohn-Sham equations with constrained electron density: an iterative evaluation of the ground-state electron density of interacting molecules. Chemical Physics Letters, 248:71, 1996.
[35] ME Casida and TA Wesołowski.
Generalization of the Kohn-Shamequations with constrained electron density formalism and its timede-pendent response theory formulation. International Journal of Quan-tum Chemistry, 96:577, 2004.[36] YK Chen and YA Wang. Population analyses based on ionic partitionof overlap distributions. Progress in Theoretical Chemistry and Physics,30:65, 2017.[37] R Carbo´-Dorca and P Bultinck. Quantum mechanical basis for Mullikenpopulation analysis. Journal of Mathematical Chemistry, 36:231, 2004.[38] AE Clark, JL Sonnenberg, PJ Hay, and RL Martin. Density and wavefunction analysis of actinide complexes: What can fuzzy atom, atoms-in-molecules, Mulliken, Lwdin, and natural population analysis tell us?The Journal of chemical physics, 121:2563, 2004.[39] AE Reed, RB Weinstock, and F Weinhold. Natural population analysis.The Journal of Chemical Physics, 83:735, 1985.58
Population analyses based on ionic partition of overlap distribution Wang, Yiming 2018
Item Metadata
Title | Population analyses based on ionic partition of overlap distribution |
Creator | Wang, Yiming |
Publisher | University of British Columbia |
Date Issued | 2018 |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2019-08-31 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0371144 |
URI | http://hdl.handle.net/2429/66869 |
Degree | Master of Science - MSc |
Program | Chemistry |
Affiliation | Science, Faculty of; Chemistry, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2018-09 |
Campus | UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |