Bottom-up coarse graining: Theory, implementation, application

by Hoang Tran
B.A., Earlham College, 2014

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Physics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

April 2018

© Hoang Tran, 2018

Abstract

Molecular dynamics simulations at the atomic scale are a powerful tool to study the structure and dynamics of biological systems. However, important processes, such as protein folding, are still too computationally expensive to investigate with atomistic simulations. In these situations, various coarse-grained force fields have been developed to appropriately reproduce the static structure of the systems. These strategies range from specially designed, ad hoc approaches to transferable, bottom-up schemes. Nevertheless, because of the reduction in the number of degrees of freedom, the resulting dynamics typically suffer from inaccuracies, which may not be correctable by trivial scaling. To correctly reintroduce these properties, one can apply the Mori-Zwanzig formalism, essentially injecting memory effects into the coarse-graining scheme. In practice, because of the difficulty of algorithm design and the limits of computing power, the memory is often assumed to be short-lived, and is therefore approximated as a delta function. Recently, some investigations have succeeded on simple models with significantly long memory. This work aims to extend these successes to systems with more complex topology. In particular, we applied the Mori-Zwanzig formalism to a system of polyethylene chains. Using the GROMACS united-atom model, molecular dynamics simulations were performed to obtain the coarse-grained force field and the memory kernel of the nonbonded interaction.
These elements allowed for a successful coarse-grained simulation, which captured both the static and dynamical structures of the reference system.

Lay summary

Molecular dynamics is a discipline of interest to biologists because it bridges theoretical formulation and experimental data. Full atomistic models have been developed and shown to give accurate predictions, but with currently available computers they take too much time to simulate many important processes. In these situations, coarse-grained simulations, which consider the overall motions of clusters of atoms and ignore less relevant details, are more efficient and useful. Nevertheless, the removal of less significant atoms has observable consequences for the physics of these models; namely, the pseudo-atoms move incorrectly due to the lack of "obstacles". In theory, it is possible to rectify these problems with the Mori-Zwanzig formalism. However, the method is difficult to put into practice, so it has only been investigated recently, on a very simple system. In this work, we discuss the application of the Mori-Zwanzig formalism to a more complex model.

Preface

The author did most of the tasks discussed in this work. The simulation of the coarse-grained model and the data analysis were done with his own implementation in C++. The atomistic simulation used for reference was done by Professor Mark Thachuk using GROMACS.

Table of Contents

Abstract
Lay summary
Preface
Table of Contents
List of Figures
List of Abbreviations
Acknowledgement
Chapter 1  Introduction
  1.1. Integrator problem
  1.2. Memory problem
  1.3. Mean force problem
  1.4. Outline
Chapter 2  Theory
  2.1. Consistent coarse-graining
  2.2. Mori-Zwanzig formalism
  2.3. Memory as damped oscillators
  2.4. Integration of dissipative particle dynamics
Chapter 3  Implementation
  3.1. Integrators
    3.1.1. Dissipative velocity-Verlet
    3.1.2. Lowe-Andersen
    3.1.3. Impulsive leap-frog
    3.1.4. Test result
      3.1.4.1. Model systems
      3.1.4.2. Physical quantities
      3.1.4.3. Simulation conditions
      3.1.4.4. Result for Langevin dynamics
      3.1.4.5. Result for dissipative particle dynamics
  3.2. Memory
    3.2.1. Standard memory
    3.2.2. Test result for standard memory
      3.2.2.1. Model system
      3.2.2.2. Result
    3.2.3. Auxiliary memory
    3.2.4. Test result for auxiliary memory
      3.2.4.1. Model system
      3.2.4.2. Result
  3.3. Mean force potential
    3.3.1. Inverse Boltzmann method
    3.3.2. Test result
      3.3.2.1. Model system
      3.3.2.2. Result
Chapter 4  Application
  4.1. Coarse-graining of bonded interactions
    4.1.1. Problem with coarse-graining of bonded interactions
    4.1.2. Solution
  4.2. Memory extraction
  4.3. Simulation model
  4.4. Result
Chapter 5  Conclusion
Bibliography
Appendix A
Appendix B

List of Figures

Fig. 1-1: The multitude of CG schemes.
Fig. 3-1: Time evolution of the average kinetic temperature kBT for different integrators in the LD model.
Fig. 3-2: Time evolution of the mean square displacement for different integrators in the LD model.
Fig. 3-3: The decay of the velocity autocorrelation function over time for different integrators in the LD model.
Fig. 3-4: Time evolution of the average kinetic temperature kBT for different integrators in the DPD model.
Fig. 3-5: Evolution of the mean square displacement for different integrators in the DPD model.
Fig. 3-6: The decay of the velocity autocorrelation function over time for different integrators in the DPD model.
Fig. 3-7: The radial distribution function G(R).
Fig. 3-8: Coefficients for the dissipative force (left) and random force (right), with some choices of c and n.
Fig. 3-9: Average kinetic temperature kBT as a function of cutoff length m for the standard memory scheme.
Fig. 3-10: Ratio between the kinetic temperature and the reference temperature as a function of kBT in the standard memory scheme.
Fig. 3-11: The relationship between T/⟨T⟩ and kBT for different system sizes.
Fig. 3-12: Effect of the timestep value on ⟨kBT⟩.
Fig. 3-13: Time evolution of the average kinetic temperature kBT for two different interpretations of e_IJ in Eq. (56).
Fig. 3-14: Time evolution of the average kinetic temperature kBT in the auxiliary memory scheme.
Fig. 3-15: Time evolution of the mean square displacement for the auxiliary memory scheme.
Fig. 3-16: Log-log plot of the above mean square displacement versus time for the auxiliary memory method.
Fig. 3-17: The decay of the velocity autocorrelation function over time.
Fig. 3-18: CG of a polyethylene chain of 40 united-atom monomers with a ratio of 20 to 1.
Fig. 3-19: Inverted nonbonded potential and fitting function.
Fig. 3-20: Comparison between the targeted distribution g_nb(R) and the CG distribution G_nb(R) obtained from simulation using the nonbonded Boltzmann-inverted potential with a CG ratio of 20 to 1 and fixed bond length.
Fig. 4-1: Inverted bond potential and fitting function.
Fig. 4-2: Comparison between the targeted radial distribution g(R) and the CG distribution G(R) for interactions of nonbonded pairs (square) and of pairs sharing a CG bond (circle), with a CG ratio of 20 to 1.
Fig. 4-3: Bond radial distribution from the thermalized simulation using the Andersen thermostat with T = 5.0 K.
Fig. 4-4: Bond radial distribution from the thermalized simulation using the Andersen thermostat with T = 50.0 K (yellow triangle) and T = 200.0 K (green circle).
Fig. 4-5: Inverted potential and fitting function for the bonded interactions, with a CG ratio of 8 to 1.
Fig. 4-6: Inverted nonbonded potential and fitting function for a CG ratio of 8 to 1.
Fig. 4-7: Atomistic and CG distributions for the bonded interactions, with a CG ratio of 8 to 1.
Fig. 4-8: Atomistic and CG distributions for the bonded interactions for a CG ratio of 8 to 1.
Fig. 4-9: Two bonded CG particles.
Fig. 4-10: Decay rate of the radial dependence in the parallel (left) and orthogonal (right) memory kernel.
Fig. 4-11: Temporal dependence at R > 0.8 nm of the parallel (left) and orthogonal (right) memory kernel.
Fig. 4-12: Temporal correlation K_IJ^x(R, t)/K_IJ^x(R, 0) and its fitted function for the parallel (left) and orthogonal (right) memory kernel.
Fig. 4-13: Valence angle.
Fig. 4-14: CG of a polyethylene chain of 40 united-atom monomers with a ratio of 8 to 1.
Fig. 4-15: Nonbonded distribution for the CG simulation using the auxiliary memory method.
Fig. 4-16: Nonbonded distribution for the CG simulation using the auxiliary memory method.
Fig. 4-17: Normalized VACF for the CG simulation using the auxiliary memory method.

List of Abbreviations

MD: Molecular dynamics
CG: Coarse-graining
MZ: Mori-Zwanzig
EOM: Equation of motion
LD: Langevin dynamics
DPD: Dissipative particle dynamics
COM: Center of mass
FDT: Fluctuation-dissipation theorem
VV: Velocity-Verlet
DPD-VV: Dissipative velocity-Verlet
Lowe: Lowe-Andersen
Imp-LF: Impulsive leap-frog
MSD: Mean square displacement
VACF: Velocity autocorrelation function
RDF: Radial distribution function
NPT: Isothermal-isobaric ensemble
NVT: Canonical ensemble
NVE: Microcanonical ensemble

Acknowledgement

I would like to thank Professor Mark Thachuk for his unreserved support and his never-ending patience, and Professor Joerg Rottler for his keen advice.

Chapter 1  Introduction

Computational techniques for the simulation of physical, chemical, or biological problems are based on a large variety of theoretical models, each of which can reach a characteristic time and length scale. Among them, classical molecular dynamics (MD) simulation has become the standard for studying molecular systems, since it allows for a precise representation of molecular structure while still avoiding the complications of quantum physics [1].
The trajectory of the particles is computed by numerically solving the equations of classical mechanics (most commonly Newton's equations of motion) for an N-body interacting system, where the potential energy is defined by known molecular-mechanics force fields. With current CPU speeds, atomistic MD simulations that finish in a tolerable amount of real time can support tens of thousands of individual objects and simulation times of up to a few nanoseconds [2]. However, many important chemical and biological mechanisms involve much larger numbers of atoms and much longer durations. For example, a typical protein-folding process occurs over microseconds, and some polymer chains contain millions of atoms [3]. In addition, the interactions between the particles are long-range, so the force computation scales as O(N²) and is usually the most CPU-demanding part of the whole simulation. This inefficiency can be mitigated partly by reducing the algorithmic complexity (a Verlet neighbor list for Lennard-Jones forces, particle-mesh methods for electrostatic interactions) or by improving the computational resources (using machines specifically adapted to MD simulations, or combining many general-purpose CPUs with concurrent computing). Notwithstanding, one can still only simulate a small, limited fraction of these important processes. This is a major reason why the development of coarse-graining (CG) methods is needed [4]. Coarse-graining is the process of merging multiple particles and describing their overall motion using fewer variables. The first CG models were proposed practically at the birth of MD, but they only became popular in the last decade [5]. There are good reasons for this increasing role. First, as mentioned above, despite rapid advances in computing power, CG approaches are still the only appropriate choice when the system is relatively large or the timescale is relatively long.
Second, atomistic studies produce vast amounts of data, some of which are necessary for understanding the mechanism in question, but the majority of which are irrelevant [6]. By focusing on key details while averaging over more trivial elements, CG provides crucial conceptual advantages with respect to atomistic models. Notwithstanding, only recently have computers become fast enough for MD simulations to support a sufficiently large number of particles over a reasonably long duration that the results are interesting and verifiable [2]. Thus, for many processes, CG schemes are now in the "sweet spot" where there is adequate power to enable acceptable resolutions, but not enough to do so with all-atomistic techniques. Consequently, CG methods have quickly grown in popularity in recent years and are now recognized as an equally valuable tool for the study of molecules. In the last twenty years, many CG models have been introduced, targeting different, specific types of systems. They can assume various levels of reduction, ranging from united-atom, with the hydrogen atoms being absorbed into their connected, larger partners, to very crude representations, where tens of these groups are treated as a single interaction site [7]. Preferably, the elements in a CG cluster are related through some predetermined, fixed links, such as bonds or angles, but some schemes allow more extreme combinations. As an example, Fig. 1-1 exhibits some of the possible designs: a typical CG group represents a whole molecule (A) or parts of one (B, C, D). However, there are models that combine multiple independent molecules (E) [8], or even modify the grouping on the fly [9, 10]. While CG strategies are many, in essence we can classify them into two major philosophies: bottom-up (structure-based) and top-down (thermodynamics-based).
In the first approach, the CG force fields are extracted from reference atomistic simulations using approximation schemes such as Boltzmann inversion, inverse Monte Carlo, or force matching. On the other hand, top-down CG starts with simple, ad hoc potential forms, then optimizes the parameters in an iterative procedure to reproduce key experimental data, especially static properties [8]. Although the bottom-up approaches can capture more of the fine details of the interactions, the top-down method is usually easier to employ and transfer [6]. On the other hand, bottom-up schemes enable a multiscale-modelling pipeline, in which one uses more-detailed systems as the basis for the force fields of the coarser models [2]. Although the distinction is quite intuitive, it should be emphasized that this is just a simple guideline, not a hard boundary, as an increasing number of models integrate both paradigms [6, 11, 12].

Fig. 1-1: The multitude of CG schemes.

Either way, the dynamic properties of the CG simulation, such as diffusion coefficients or time correlations, all suffer inaccuracies compared to those of the reference system, which can be an actual, physical system or a higher-resolution model [13]. This problem arises because of the elimination of degrees of freedom that should appear in the form of thermal noise: intuitively, with fewer collisions, the particles move more freely and display different dynamic behaviors. These disparities can be as simple as a scaling of the system, or much more complicated and irreducible. To emulate the correct dynamical properties, one can employ the Mori-Zwanzig (MZ) formalism [14] on the CG model, essentially injecting a memory effect, in which the system's past trajectory influences its current motion. Mathematically, the CG procedure is basically a projection operator from a larger vector space (the reference system) into a smaller space (the CG system).
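To make the projection picture concrete, here is a minimal sketch (with illustrative masses, coordinates, and a hypothetical `com_map` helper, none of which come from the thesis) of a center-of-mass mapping acting as the projection operator. Note how two distinct atomistic configurations collapse onto the same CG state; the discarded internal coordinates are exactly the lost information discussed below.

```python
import numpy as np

def com_map(r, m, groups):
    """Project atomistic positions r (n x 3) onto CG sites:
    each CG site is the center of mass of one group of atoms."""
    R = np.empty((len(groups), 3))
    for I, idx in enumerate(groups):
        M = m[idx].sum()
        R[I] = (m[idx, None] * r[idx]).sum(axis=0) / M
    return R

# Hypothetical 4-atom chain mapped onto 2 CG particles (2-to-1 ratio).
m = np.array([1.0, 1.0, 1.0, 1.0])
groups = [np.array([0, 1]), np.array([2, 3])]

r1 = np.array([[0.0, 0, 0], [1.0, 0, 0], [2.0, 0, 0], [3.0, 0, 0]])
r2 = np.array([[-0.5, 0, 0], [1.5, 0, 0], [1.5, 0, 0], [3.5, 0, 0]])

# Two different atomistic states project to the same CG state:
print(com_map(r1, m, groups))
print(com_map(r2, m, groups))
```

The mapping is many-to-one, which is why the CG equation of motion must account statistically for the unresolved degrees of freedom.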
In the process, one must sacrifice some information, and this reduction is inevitably displayed in the dynamics [15]. The MZ formalism deals with this loss of information through the orthogonal operator and shows that it amounts to a new equation of motion (EOM):

m \vec{a}_i = \sum_{j \ne i} \left( \langle \vec{F}_{ij} \rangle + \frac{1}{k_B T} \int_0^t \mathbf{K}_{ij}(t-s)\, \vec{v}_{ij}(s)\, ds + \delta\vec{F}_{ij} \right) ,  (1)

with \langle \vec{F}_{ij} \rangle being the average of the instantaneous force \vec{F}_{ij}, \mathbf{K}_{ij}(t-s) the pairwise memory kernel, which relates the past velocities to the dissipation at the current time, and \delta\vec{F}_{ij} the random component, representing the degrees of freedom left unresolved by the CG procedure. The first term results from the application of the CG procedure, while the last two are consequences of the MZ formalism. In principle, their explicit forms can all be determined from the reference system [16], but one encounters multiple practical obstacles when trying to design a numerical implementation. These hindrances contributed to why this scheme was not investigated until recently, even though the MZ formalism has been known for more than forty years.

1.1. Integrator problem

To the discerning eye, the three terms on the right-hand side of Eq. (1) bring to mind the familiar Langevin dynamics (LD) [17], with the first addend corresponding to the conservative force, the second to the friction force, and the third to the random force. In fact, as long as the timescales of the memory and of the change in momentum are well separated, we can apply the Markovian approximation [18]. This condition reduces Eq. (1) to the EOM of dissipative particle dynamics (DPD), a pairwise generalization of the LD equation. Originally proposed during the 1990s in the soft-matter community [19], DPD is a solution to correct the aforementioned discrepancy in the CG dynamics by introducing fluid-viscosity effects and replicating the missing internal degrees of freedom. In its most generic form, the EOM in DPD can be written as

m \vec{a}_i = \sum_{j \ne i} \left( \vec{F}_{ij}^{C} + \vec{F}_{ij}^{D} + \vec{F}_{ij}^{R} \right) .  (2)

The conservative force \vec{F}_{ij}^{C} acts as the source of the thermodynamic structure of the system, while the dissipative term \vec{F}_{ij}^{D} and the random term \vec{F}_{ij}^{R} together form a thermostat that reproduces the dynamic properties [20]. Although the viscosity effects can also be reproduced using LD, through the employment of pairwise dissipative and random forces DPD describes how the hydrodynamic interactions between the particles affect the system's approach to equilibrium, and thus maintains the correct canonical distribution. However, the finite integration of Eq. (2) proves to be troublesome, as the friction force between the particles at a particular time t is an explicit function of their current relative velocity \vec{v}_{ij}(t). To calculate that velocity, the standard MD integration strategy requires the forces \vec{F}_{ij}(t) to be known beforehand, including the dissipative force \vec{F}_{ij}^{D}(t). Therefore, \vec{v}_{ij}(t) and \vec{F}_{ij}^{D}(t) are circularly dependent on each other [1], and a self-consistent approach is required to solve the DPD EOM exactly. Thus, we cannot simply commandeer the usual MD packages and utilize their algorithms in DPD simulations. It should be noted here that this conclusion may or may not apply to Eq. (1), since most of the velocities involved are determined from the trajectory's history. On the other hand, the historical velocities are all propagated with the same integrator, so it is arguable that any inaccuracy simply compounds instead of being mitigated. Recent works using the MZ formalism have not addressed the integrator issue [18, 21], so it is unclear whether it will be a problem.
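The circular dependence between the relative velocity and the dissipative force can be illustrated with a toy example: a one-dimensional, two-particle system with only a dissipative pair force, where an implicit velocity update is resolved by fixed-point iteration until self-consistency. The parameters and the trapezoidal-style update rule here are illustrative assumptions, not one of the integrators assessed in Chapter 3.

```python
import numpy as np

# Illustrative parameters (not from the thesis): unit masses,
# friction gamma, zero conservative force, no random force.
gamma, dt, m = 4.0, 0.01, 1.0

def f_diss(v_rel):
    # Pairwise dissipative force on particle 0; particle 1 gets -f.
    return -gamma * v_rel

def implicit_velocity_step(v, tol=1e-12, max_iter=100):
    """One implicit velocity update: the end-of-step dissipative force
    depends on the end-of-step velocity, so iterate to a fixed point."""
    v_new = v.copy()
    f_old = f_diss(v[0] - v[1])
    for _ in range(max_iter):
        f_new = f_diss(v_new[0] - v_new[1])
        cand = v + 0.5 * dt / m * (f_old + f_new) * np.array([1.0, -1.0])
        if np.max(np.abs(cand - v_new)) < tol:
            return cand
        v_new = cand
    return v_new

v = implicit_velocity_step(np.array([1.0, -1.0]))
# Friction damps the relative velocity while conserving total momentum.
print(v)
```

An explicit integrator would instead evaluate the dissipative force at stale velocities, which is precisely the source of the temperature artifacts discussed for DPD integrators in Chapter 3.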
Nevertheless, by the time this study was conducted, most of the popular simulation programs had not incorporated a correct DPD integrator, or had incorporated one poorly [22, 23]. Thus, we took it upon ourselves to implement the different DPD algorithms available in the literature and assess their accuracy and performance.

1.2. Memory problem

One of the main reasons for the lack of interest in the MZ formalism is its requirement for a memory kernel. Studies using the MZ formalism have been sporadic over the years, and the calculation is often simplified by applying the Markovian assumption, under which the memory at different times is uncorrelated. In short, the features of interest evolve on a time scale much larger than the correlation time of the memory, so the memory integral can be approximated with a time-independent friction tensor, and the EOM turns into a classic stochastic differential equation [14]. This solution works well in some applications, but it is obviously not universally true and fails in many others. For instance, it leads to the "plateau-value problem", where the memory tensor does not go to zero at infinite time, so some ad hoc estimates must be introduced [15]. In these situations, the employment of a memory function is imperative. However, it is computationally very costly if one tries the direct, naïve implementation, in which the simulation remembers its past trajectory: since a history must exist for every pair of particles, the cost grows as O(N²). While this implementation is algorithmically similar to that of the Verlet neighbor list, it demands a recalculation at every timestep, unlike the latter. This, combined with the fact that the memory can be significant over tens or hundreds of timesteps, increases the CPU time by a few orders of magnitude and partially neutralizes the advantages of CG schemes. Recently, Li et al. have shown that if the memory vanishes at infinity, it can be approximated as a series of damped harmonic oscillators [24].
These oscillators absorb the memory effect into their time evolution, so the program no longer needs to keep the history, and the time complexity reduces to linear. Li et al. have successfully employed this technique in a system of star-polymer melts, where each polymer is one pseudo-atom. In our study, we investigate whether the methodology is applicable to a more complex system of polyethylene chains.

1.3. Mean force problem

The problem of deriving the mean force potential from a given atomistic system and the corresponding CG scheme is well known and can be solved with many different methods [2]. In fact, as the aim of this study is not to identify a new CG force field, we could simply borrow from the literature a CG model that has been investigated thoroughly. Nevertheless, to reconstruct the memory kernel, we still need to perform the atomistic simulation ourselves, so we decided that having our own CG machinery is preferable. Since the model of interest contains polyethylene molecules, which cannot be converted into single, non-bonded CG particles, we implement a solution that can derive both non-bonded and bonded mean force potentials, including that of the bond angles.

1.4. Outline

In this study, we try to replicate and extend the results obtained by Li et al., who successfully applied the MZ formalism to a system of star-polymer melts [13, 21]. Chapter 2 discusses in more detail the theoretical foundations of the various problems we have mentioned earlier. In Chapter 3, we describe the implementation of different approaches to these questions, and the degree of success of each. Finally, in Chapter 4, we choose the best-working solution and apply it to a polyethylene system. Our study provides some improvements over the works of Li's group, as they did not examine the validity of the MZ formalism on bonded CG particles.

Chapter 2
Theory

2.1. Consistent coarse-graining

While this study mainly deals with the CG procedure for an all-atom model, the following framework can be applied to any two models of a given physical system. After all, the idea of bonded connections already contains a notion of CG, since the true atoms have no such structure and only interact with each other through their quantum fields. Henceforth, we shall refer to the higher-resolution model as atomistic, with its interaction sites called atoms, whereas the lower-resolution model is called CG, with its sites called particles. Finally, we reserve lower-case symbols for atomistic variables, while CG variables are capitalized. This section largely follows the derivation by Noid et al. [2].

Now, let us consider an atomistic system consisting of $n$ structureless atoms with masses $m_i$. We define $\hat{\varphi} = (\vec{r}^{\,n}, \vec{p}^{\,n})$ to be a point in the atomistic phase space, where $\vec{r}^{\,n} = (\vec{r}_1, \dots, \vec{r}_n)$ and $\vec{p}^{\,n} = (\vec{p}_1, \dots, \vec{p}_n)$ are the canonical positions and momenta of its atoms, respectively. The Hamiltonian is

\[ h(\hat{\varphi}) = h(\vec{r}^{\,n}, \vec{p}^{\,n}) = \sum_i \frac{p_i^2}{2m_i} + u(\vec{r}^{\,n}) , \tag{3} \]

with $u(\vec{r}^{\,n})$ being the atomistic multi-body potential, and the corresponding equilibrium probability density in the canonical ensemble is

\[ p_\varphi(\hat{\varphi}) = p_r(\vec{r}^{\,n})\, p_p(\vec{p}^{\,n}) , \tag{4} \]

with

\[ p_r(\vec{r}^{\,n}) \propto \exp\!\left(-\beta u(\vec{r}^{\,n})\right) , \tag{5} \]

\[ p_p(\vec{p}^{\,n}) \propto \exp\!\left(-\beta \sum_i \frac{p_i^2}{2m_i}\right) . \tag{6} \]

Next, we divide the atomistic system into $N$ particles, with $N < n$, using some mapping operator $\boldsymbol{P}$. Obviously, in exchange for computational performance, we need to reduce the number of variables, necessarily discarding some information. It is possible to recover this loss partially by embedding the particles with extra data, such as rotation, but usually we want to employ the same formalism. Therefore, the CG particles should also be structureless, with $\vec{R}^N = (\vec{R}_1, \dots, \vec{R}_N)$ and $\vec{P}^N = (\vec{P}_1, \dots, \vec{P}_N)$ being the CG coordinates and momenta.
Although $\boldsymbol{P}$ can still be quite general, in practice we typically choose each CG particle to be the center of mass (COM) of its constituent atoms, so that the mapping

\[ \boldsymbol{P}: (\vec{r}^{\,n}, \vec{p}^{\,n}) \mapsto (\vec{R}^N, \vec{P}^N) \tag{7} \]

onto the CG phase space point $\hat{\Phi} = (\vec{R}^N, \vec{P}^N)$ is determined by [13, 25]

\[ M_I \vec{R}_I = \sum_{i \in I} m_i \vec{r}_i , \quad \vec{P}_I = \sum_{i \in I} \vec{p}_i , \quad M_I = \sum_{i \in I} m_i , \tag{8} \]

where $I$ denotes the particle. Because Eq. (7) becomes right-unique in this case, i.e. each atomistic state maps to only one CG state, the equilibrium probability density $P_\varphi$ of a given COM state $(\vec{R}^N, \vec{P}^N)$ is simply the probabilistic sum over all microscopic states $(\vec{r}^{\,n}, \vec{p}^{\,n})$ that are projected onto that CG state, i.e. [14]

\[ P_\varphi(\vec{R}^N, \vec{P}^N) = \int d\vec{r}^{\,n} \int d\vec{p}^{\,n}\; p_\varphi(\vec{r}^{\,n}, \vec{p}^{\,n})\, \delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^N\right) \delta\!\left(\boldsymbol{P}(\vec{p}^{\,n}) - \vec{P}^N\right) . \tag{9} \]

However, we acquire this CG equilibrium distribution from the reference atomistic simulation, not from the CG simulation itself. In line with traditional CG schemes, we want the equilibrium averages obtained from the time evolution of the new CG system to be consistent with the atomistic ones, that is

\[ P_\varphi(\hat{\Phi}) = P_\Phi(\hat{\Phi}) \propto \exp\!\left(-\beta \sum_I \frac{P_I^2}{2M_I} - \beta U(\vec{R}^N)\right) , \tag{10} \]

with

\[ H(\hat{\Phi}) = \sum_I \frac{P_I^2}{2M_I} + U(\vec{R}^N) \tag{11} \]

being the Hamiltonian of the CG system. Using Eqs. (4), (9) and (10), we separate the variables and find the consistency conditions for the CG Hamiltonian [2]:

\[ P_R(\vec{R}^N) \propto \exp\!\left(-\beta U(\vec{R}^N)\right) \propto \int \exp\!\left(-\beta u(\vec{r}^{\,n})\right) \delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^N\right) d\vec{r}^{\,n} , \tag{12} \]

\[ P_P(\vec{P}^N) \propto \exp\!\left(-\beta \sum_I \frac{P_I^2}{2M_I}\right) \propto \int \exp\!\left(-\beta \sum_i \frac{p_i^2}{2m_i}\right) \delta\!\left(\boldsymbol{P}(\vec{p}^{\,n}) - \vec{P}^N\right) d\vec{p}^{\,n} . \tag{13} \]

Equation (12) shows how we can resolve the CG potential of mean force $U(\vec{R}^N)$ from the atomistic potential $u(\vec{r}^{\,n})$ and the mapping operator $\boldsymbol{P}$. We shall discuss the actual calculation in Section 3.3. For now, it should only be noted that the computed function is unique up to an additive constant. On the other hand, our specific choice of $\boldsymbol{P}$ satisfies Eq. (13) automatically.
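As a concrete illustration of the COM mapping in Eq. (8), the following minimal sketch (in Python for brevity; the thesis's actual machinery is C++, and all names here are illustrative) maps atomistic positions and momenta onto CG particles:

```python
def map_to_com(groups, masses, positions, momenta):
    """Apply the COM mapping of Eq. (8): for each CG particle I,
    M_I R_I = sum_{i in I} m_i r_i  and  P_I = sum_{i in I} p_i."""
    R, P, M = [], [], []
    for group in groups:          # group = indices of atoms belonging to particle I
        M_I = sum(masses[i] for i in group)
        R_I = tuple(sum(masses[i] * positions[i][k] for i in group) / M_I
                    for k in range(3))
        P_I = tuple(sum(momenta[i][k] for i in group) for k in range(3))
        R.append(R_I); P.append(P_I); M.append(M_I)
    return R, P, M

# Toy usage: one CG particle made of two atoms of mass 1 and 3.
com_R, com_P, com_M = map_to_com([[0, 1]], [1.0, 3.0],
                                 [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
                                 [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
```

Note that the total CG momentum equals the sum of the atomistic momenta, which is why this mapping preserves momentum conservation exactly.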
The reader is invited to explore the justification for the above claims in the work on multiscale CG by Noid et al. [2].

2.2. Mori-Zwanzig formalism

Previously, we mentioned that the CG procedure results in unavoidable information loss. Our choice of CG variables, determined by Eq. (8), preserves the static structure of the atomistic system, hence the disparity can only manifest in the dynamics. In this section, we shall clarify this assertion and its implications. However, the actual derivation is long and complicated, so the bulk of the calculation is presented in Appendix A.

First, a distinction must be made between the CG variables obtained from the CG simulation and their corresponding values computed directly from the atomistic trajectory. As the potential of mean force $U(\vec{R}^N)$ is an ensemble average, the force derived from this potential differs from the instantaneous total force acting on the actual COM, so the two time evolutions diverge from each other. Mathematically, we can express the time evolution of $(\vec{R}^N, \vec{P}^N)$ along the atomistic path as [14]

\[ \frac{d}{dt}\Omega = i\mathcal{L}\Omega = \mathcal{P}i\mathcal{L}\Omega + \mathcal{Q}i\mathcal{L}\Omega , \tag{14} \]

where $\Omega$ is the CG phase space density, $\mathcal{P}$ is the ensemble-averaged projection, $\mathcal{Q} \equiv 1 - \mathcal{P}$, and

\[ i\mathcal{L} = -\sum_i \left[ \frac{\partial h}{\partial \vec{p}_i} \cdot \frac{\partial}{\partial \vec{r}_i} - \frac{\partial h}{\partial \vec{r}_i} \cdot \frac{\partial}{\partial \vec{p}_i} \right] \tag{15} \]

is the atomistic Liouville operator. The first term on the right-hand side of Eq. (14) can be resolved completely from the projection operator $\mathcal{P}$ and the atomistic Hamiltonian. Intuitively, we can see that it is the time evolution of $\Omega$ acquired from the CG simulation. On the other hand, $\mathcal{Q}$ is the complementary operator to $\mathcal{P}$, so the second term represents the information lost in the CG projection. From Eq. (14), we can derive the EOM of the particles as [14]

\[ \dot{\vec{R}}_I = \frac{\vec{P}_I}{M_I} , \tag{16} \]

\[ \dot{\vec{P}}_I = -\frac{\partial}{\partial \vec{R}_I} U(\vec{R}^N) - \sum_J \int_0^t \boldsymbol{K}_{IJ}(t-s)\, \frac{\vec{P}_J(s)}{M_J}\, ds + \delta\vec{F}_I^{\mathcal{Q}}(t) , \tag{17} \]

with $\boldsymbol{K}_{IJ}$ being the pairwise memory kernel and $\delta\vec{F}_I^{\mathcal{Q}}$ the deviation of the total atomistic force from the mean force acting on $I$, namely

\[ \delta\vec{F}_I^{\mathcal{Q}} = \left(-\sum_{i \in I} \frac{\partial}{\partial \vec{r}_i} u(\vec{r}^{\,n})\right) - \left(-\frac{\partial}{\partial \vec{R}_I} U(\vec{R}^N)\right) \equiv \vec{F}_I^P - \vec{F}_I . \tag{18} \]

Equation (17) is reminiscent of the EOM for DPD, where the three terms on the right-hand side correspond to the mean, friction, and fluctuating forces. Theoretically, all three terms can be determined completely from the CG variables. The mean force is calculated from the $N$-body potential of mean force. The friction force depends on the history of the system because its integrand involves past momenta. Finally, the fluctuating force is not necessarily random, since it is the atomistic fluctuation, which has a known equilibrium distribution in $p_\varphi(\vec{r}^{\,n}, \vec{p}^{\,n})$. Essentially, while we cannot accurately predict the microscopic configuration $(\vec{r}^{\,n}, \vec{p}^{\,n})$ from a given CG state, the causal relation in the atomistic space restricts the possible CG events, manifesting through the temporal correlation in the friction force, which requires

\[ \left\langle \delta\vec{F}_I^{\mathcal{Q}}(t) \otimes \delta\vec{F}_J^{\mathcal{Q}}(0) \right\rangle = k_B T\, \boldsymbol{K}_{IJ}(t) , \tag{19} \]

where $\langle \dots \rangle$ is defined as the thermal average over the atomistic canonical ensemble:

\[ \langle a(\hat{\varphi}) \rangle = \int a(\hat{\varphi})\, p_\varphi(\hat{\varphi})\, d\hat{\varphi} . \tag{20} \]

Evidently, this is the generalized formulation of the second fluctuation-dissipation theorem (FDT) for non-Markovian fluctuations [13]. In principle, evaluating Eq. (17) directly allows us to calculate the particles' motion. However, such an evaluation is nontrivial because the force field depends intrinsically on the $N$-body configuration of the CG system [14]. To make Eq. (17) practical, we shall employ three assumptions. First, we approximate the non-bonded interactions as sums of pairwise-additive forces, i.e.

\[ \vec{F}_I \approx \sum_{J \neq I} \vec{F}_{IJ} , \quad \delta\vec{F}_I^{\mathcal{Q}} \approx \sum_{J \neq I} \delta\vec{F}_{IJ}^{\mathcal{Q}} . \tag{21} \]

Second, we take

\[ \left\langle \delta\vec{F}_{IJ}^{\mathcal{Q}}(t) \otimes \delta\vec{F}_{KL}^{\mathcal{Q}}(0) \right\rangle = 0 \quad \text{for } IJ \neq KL , \tag{22} \]

which implies that the fluctuating forces are uncorrelated for different pairs, to get

\[ \left\langle \delta\vec{F}_I^{\mathcal{Q}}(t) \otimes \delta\vec{F}_J^{\mathcal{Q}}(0) \right\rangle = \left\langle \sum_{K \neq I} \delta\vec{F}_{IK}^{\mathcal{Q}}(t) \otimes \sum_{L \neq J} \delta\vec{F}_{JL}^{\mathcal{Q}}(0) \right\rangle = \left\langle \delta\vec{F}_{IJ}^{\mathcal{Q}}(t) \otimes \delta\vec{F}_{IJ}^{\mathcal{Q}}(0) \right\rangle . \tag{23} \]

Third, we assume that the forces between neighboring particles depend only on their COM positions, essentially neglecting the $N$-body correlation (Eq. (24)). Thus, we arrive at the pairwise form of Eq. (17), given in Eq. (25), where $\vec{R}_{IJ} = \vec{R}_I - \vec{R}_J$, $\vec{F}_{IJ}^P = \sum_{i \in I, j \in J} \vec{F}_{ij}$ is the pairwise instantaneous force between particles $I$ and $J$, $\vec{F}_{IJ}$ and $\delta\vec{F}_{IJ}^{\mathcal{Q}}$ are its mean and fluctuating parts, and the matrix $\boldsymbol{K}_{IJ}(t)$ of Eq. (26) is the pairwise memory kernel of dissipation.

Our next step is to compute each term on the right-hand side of Eq. (25). The pairwise approximation of the CG force field is a well-known problem in MD, so we only need to find a workable solution in the available literature. We shall explain our method of choice in Section 3.3. On the other hand, the non-conservative forces depend intrinsically on the system's setup, and their evaluation greatly affects the simulation's success. Thus, in Section 3.2.1, we shall run some tests using a simple, a priori form of $\boldsymbol{K}_{IJ}$ to verify the accuracy of our implementation, while leaving the evaluation of the actual memory kernel to Chapter 4.

2.3. Memory as damped oscillators

At a glance, it is quite straightforward to discretize Eq. (25) as in Eq. (27), with $M$ being the cutoff length for the memory kernel defined by Eq. (28). There, the term $\delta\vec{F}_{IJ}^{\mathcal{Q}}$ has been replaced with $\vec{F}_{IJ}^R$ to denote that the random force is now truly random, drawn from a known distribution.
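To make the cost of the direct discretization concrete, here is a minimal one-dimensional sketch of the history convolution in Eq. (27) for a single pair, using a ring buffer of length $M+1$ (illustrative only: the actual kernels are distance-dependent tensors, and the values below are arbitrary):

```python
from collections import deque

def memory_force(kernel, history, dt):
    """Discretized friction term of Eq. (27): -dt * sum_m K(m*dt) * V(t - m*dt).
    `history` holds past relative velocities, newest first; kernel[m] = K(m*dt)."""
    return -dt * sum(kernel[m] * v for m, v in enumerate(history))

dt = 0.1
kernel = [1.0, 0.5, 0.25, 0.125]        # K(0), K(dt), K(2*dt), K(3*dt): M = 3
history = deque(maxlen=len(kernel))     # ring buffer: O(M) storage per pair
for v in [1.0, 1.0, 1.0, 1.0]:          # feed a constant relative velocity
    history.appendleft(v)
f = memory_force(kernel, history, dt)
```

One such buffer is needed per interacting pair, which is precisely the $O(MN^2)$ storage cost discussed in the text.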
Explicitly, the pairwise forms and their discretization read

\[ \vec{F}_{IJ} = \vec{F}_{IJ}(\vec{R}_{IJ}) , \quad \delta\vec{F}_{IJ}^{\mathcal{Q}} = \delta\vec{F}_{IJ}^{\mathcal{Q}}(\vec{R}_{IJ}) , \tag{24} \]

\[ \frac{d}{dt}\vec{P}_I(t) = \sum_{J \neq I} \left[ \vec{F}_{IJ}(\vec{R}_{IJ}) - \int_0^t \boldsymbol{K}_{IJ}(t-s)\,\vec{V}_{IJ}(s)\, ds + \delta\vec{F}_{IJ}^{\mathcal{Q}}(\vec{R}_{IJ}) \right] , \tag{25} \]

\[ \boldsymbol{K}_{IJ}(t) = \beta \left\langle \delta\vec{F}_{IJ}^{\mathcal{Q}}(t) \otimes \delta\vec{F}_{IJ}^{\mathcal{Q}}(0) \right\rangle , \tag{26} \]

\[ \frac{\Delta\vec{P}_I}{\Delta t} = \sum_{J \neq I} \left[ -\frac{\partial}{\partial \vec{R}_{IJ}} U(\vec{R}^N) - \beta \sum_{m=0}^{M} \boldsymbol{K}_{IJ}(m\Delta t)\,\vec{V}_{IJ}(t - m\Delta t)\,\Delta t + \frac{\vec{F}_{IJ}^R(t)}{\sqrt{\Delta t}} \right] , \tag{27} \]

\[ \boldsymbol{K}_{IJ}(m\Delta t) = 0 \quad \text{for } m > M . \tag{28} \]

We can determine this distribution and the explicit form of the memory kernel from the reference atomistic simulation. Thus, it seems that once all the coefficients are found, the CG simulation will run smoothly. However, as we said in the introduction, $M$ must be rather large for any system of interest; otherwise, it would imply a significant separation between the time scales of the random force and the momentum, in which case we could simply extract $\vec{V}_{IJ}$ from the inner sum in Eq. (27) and apply the Markovian approximation, reducing the MZ EOM to the standard DPD form of Eq. (29), in which $\gamma_{IJ}$ is the strength of the friction force. Moreover, the $N$-body function is very difficult to work with [2, 13], so Eq. (27) still needs some further decomposition, such as pairwise decomposition, to be practical. Since a pairwise memory kernel scales as $O(MN^2)$, a direct application of Eq. (17) would consume a lot of the computer's memory and run inefficiently due to the large number of cache misses [26]. There exists an alternative approach that mitigates the complexity of the MZ formalism: instead of keeping track of all the relevant historical terms, we supplement the system with multiple series of $k$ auxiliary variables $\vec{s}_{IJ} = (s_1, s_2, \dots, s_k)$, whose coupling with the relative momentum $\vec{P}_{IJ}$ mimics the memory effect. The coupled equation is given in Eq. (30) [21], where $\boldsymbol{A}_{ps} = -\boldsymbol{A}_{sp}^T$ and $\vec{\xi}_{IJ}$ is a vector of $k$ Gaussian random numbers with $\langle \vec{\xi}_{IJ}(t) \otimes \vec{\xi}_{KL}(0) \rangle = \delta(t)\,\delta_{IJ,KL}$. This is the so-called Ornstein-Uhlenbeck process [24].
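The equivalence between the auxiliary-variable dynamics and the explicit memory integral can be checked in the simplest scalar case ($k = 1$). Under the illustrative choices $\boldsymbol{A}_{ss} = a$, $\boldsymbol{A}_{sp} = b$, $\boldsymbol{A}_{ps} = -b$ (so that $\boldsymbol{A}_{ps} = -\boldsymbol{A}_{sp}^T$ holds), Eq. (33) gives $K(t) = b^2 e^{-at}$ and Eq. (34) gives $B_{ss} = \sqrt{2 k_B T a}$; the sketch below integrates the deterministic part of Eq. (30) and compares the $-\boldsymbol{A}_{ps}\vec{s}$ term against the explicit convolution (all parameter values are arbitrary):

```python
import math

a, b, kBT = 2.0, 1.5, 1.0
B_ss = math.sqrt(kBT * 2 * a)      # scalar Cholesky of Eq. (34): B B^T = kBT (A_ss + A_ss^T)

# Deterministic auxiliary EOM from Eq. (30): ds/dt = -A_sp P - A_ss s,
# driven by a constant relative momentum P(t) = 1 switched on at t = 0.
dt, steps, s = 1e-4, 10000, 0.0
for _ in range(steps):
    s += dt * (-b * 1.0 - a * s)   # simple Euler step
t = steps * dt

mem_force = b * s                  # the -A_ps s term of Eq. (30), with A_ps = -b

# Explicit memory integral of Eq. (32) with K(t) = b^2 exp(-a t), Eq. (33)
conv = -b * b * (1.0 - math.exp(-a * t)) / a
```

The two quantities agree up to the Euler discretization error, which is the content of the equivalence proof deferred to Appendix B.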
For reference, the Markovian reduction invoked above collapses the friction sum into a simple drag,

\[ \sum_{m=0}^{M} \left\langle \delta\vec{F}_{IJ}^{R}(t - m\Delta t) \otimes \delta\vec{F}_{IJ}^{R}(m\Delta t) \right\rangle \vec{V}_{IJ}(t - m\Delta t)\,\Delta t \;\approx\; \gamma_{IJ}\,\vec{V}_{IJ}(t) \quad (M \ll 1) , \tag{29} \]

and the coupled momentum-auxiliary EOM takes the form

\[ \begin{pmatrix} \dot{\vec{P}}_{IJ} \\ \dot{\vec{s}}_{IJ} \end{pmatrix} = \begin{pmatrix} \vec{F}_{IJ} \\ \vec{0} \end{pmatrix} - \begin{pmatrix} 0 & \boldsymbol{A}_{ps} \\ \boldsymbol{A}_{sp} & \boldsymbol{A}_{ss} \end{pmatrix} \begin{pmatrix} \vec{P}_{IJ} \\ \vec{s}_{IJ} \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & \boldsymbol{B}_{ss} \end{pmatrix} \begin{pmatrix} \vec{0} \\ \vec{\xi}_{IJ} \end{pmatrix} . \tag{30} \]

By assuming again that the memory length is finite, we can take $\vec{s}_I(-\infty) = \vec{0}$ and write an ansatz for $\vec{s}_{IJ}$:

\[ \vec{s}_{IJ}(t) = \int_{-\infty}^{t} e^{-(t-t')\boldsymbol{A}_{ss}} \left( -\boldsymbol{A}_{sp}\vec{P}_{IJ}(t') + \boldsymbol{B}_{ss}\vec{\xi}_{IJ}(t') \right) dt' . \tag{31} \]

Using Eq. (31), we eliminate the auxiliary terms from Eq. (30) and arrive at

\[ \dot{\vec{P}}_{IJ}(t) = \vec{F}_{IJ}(t) - \int_{-\infty}^{t} \boldsymbol{K}(t-t')\,\vec{P}_{IJ}(t')\, dt' + \vec{\zeta}_{IJ}(t) , \tag{32} \]

in which

\[ \boldsymbol{K}(t) = -\boldsymbol{A}_{ps}\, e^{-t\boldsymbol{A}_{ss}}\, \boldsymbol{A}_{sp} , \quad \vec{\zeta}_{IJ}(t) = \int_{-\infty}^{t} \boldsymbol{A}_{ps}\, e^{-(t-t')\boldsymbol{A}_{ss}}\, \boldsymbol{B}_{ss}\, \vec{\xi}_{IJ}(t')\, dt' . \tag{33} \]

We can see that the first equation functions similarly to the MZ EOM of Eq. (17). In Appendix B, we shall present the proof of equivalence. For now, it should be noted that Eq. (32) reduces to the pairwise-decomposed Eq. (25) with some specific choices of $\boldsymbol{A}$ and $\boldsymbol{B}$. In particular, the relation

\[ \boldsymbol{B}_{ss}\boldsymbol{B}_{ss}^T = k_B T \left( \boldsymbol{A}_{ss} + \boldsymbol{A}_{ss}^T \right) \tag{34} \]

is sufficient to satisfy the second FDT. We can always solve Eq. (34) for $\boldsymbol{B}_{ss}$ using Cholesky factorization if $(\boldsymbol{A}_{ss} + \boldsymbol{A}_{ss}^T)$ is positive-semidefinite [24]. In fact, when $\boldsymbol{A}_{ss}$ is real and has complex eigenvalues with positive real parts, $\boldsymbol{K}(t)$ becomes a linear combination of damped harmonic oscillators [21]. Thus, any decaying memory kernel in Eq. (25) can be approximated with a series of damped oscillators, converting the EOM to Eq. (30). Section 3.2.3 will clarify the practical implementation of this method.

2.4. Integration of dissipative particle dynamics

So far, we have seen that the EOM resulting from applying the MZ formalism is closely related to the DPD equation, whose general form was presented in Eq. (2). In practice, the forces are typically assumed to be central, with the functional forms given in Eq. (35) [20], where $r_{IJ} = |\vec{r}_I - \vec{r}_J|$, $\vec{e}_{IJ} = \vec{r}_{IJ}/r_{IJ}$, $\vec{v}_{IJ} = \vec{v}_I - \vec{v}_J$, and $\gamma$ and $\sigma$ are the strengths of the dissipative and random forces, respectively.
Explicitly,

\[ \vec{F}_{IJ}^C = F_{IJ}^C(r_{IJ})\,\vec{e}_{IJ} , \quad \vec{F}_{IJ}^D = -\gamma(r_{IJ})\,(\vec{v}_{IJ} \cdot \vec{e}_{IJ})\,\vec{e}_{IJ} , \quad \vec{F}_{IJ}^R = \sigma(r_{IJ})\,\xi_{IJ}\,\vec{e}_{IJ} . \tag{35} \]

Here, $\xi_{IJ}$ are standard normal random variables, uncorrelated for different $IJ$ pairs and at different times, with $\xi_{IJ} = \xi_{JI}$. The justification for the last property is the conservation of momentum, while the consequence of the former shall be explored later in this section. By convention, $\vec{F}_{IJ}^C$ can include any force term appropriate to the model, as long as it is conservative and independent of the other two. By contrast, the dissipative and random forces are coupled through a fluctuation-dissipation relation, which in our particular case gives [20]

\[ \gamma(r_{IJ}) = \frac{\beta}{2}\left[\sigma(r_{IJ})\right]^2 . \tag{36} \]

This formula ensures that the simulation samples the canonical distribution at the correct temperature. Once we have specified the explicit form of each force term, we can determine their values at the current point in phase space and numerically propagate the positions to the next timestep. At the lowest order of approximation, the velocity-Verlet (VV) integration is the most popular scheme, owing to its simplicity, stability, and efficiency [27]. The shortened implementation of the VV algorithm is [28]

\[ \vec{x}(t + \Delta t) = \vec{x}(t) + \vec{v}(t)\Delta t + \frac{1}{2m}\vec{F}(t)\Delta t^2 , \]
\[ \vec{v}(t + \Delta t) = \vec{v}(t) + \frac{1}{2m}\left( \vec{F}(t) + \vec{F}(t + \Delta t) \right)\Delta t . \tag{37} \]

However, the calculation of the second equation is nontrivial: $\vec{v}(t + \Delta t)$ appears on both sides, explicitly on the left-hand side and implicitly inside $\vec{F}(t + \Delta t)$. Since the friction force in DPD always contains some dependency on velocity, it is thus impossible to accurately integrate Eq. (2) with the VV algorithm [1]. In order to obtain the correct result, a self-consistent solution is required; i.e. the value of $\vec{v}(t + \Delta t)$ calculated on the left-hand side must be the same as the one used to find $\vec{F}(t + \Delta t)$. This self-consistency guarantees time-reversibility, which in turn is necessary for detailed balance and correct canonical sampling.
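The self-consistency requirement can be illustrated with a toy fixed-point iteration for a single particle with a linear drag $F(v) = -g v$ (a sketch of the idea only, not the thesis's integrator; `g`, `dt`, `m` are arbitrary toy values):

```python
# Solve the implicit velocity update of Eq. (37) by fixed-point iteration,
# for one particle with drag F(v) = -g*v and no other forces.
g, dt, m = 0.5, 0.01, 1.0
v_old = 1.0
F_old = -g * v_old

v_new = v_old                       # initial guess
for _ in range(50):                 # v <- v_old + (F_old + F(v_new)) * dt / (2m)
    v_new = v_old + 0.5 * dt / m * (F_old - g * v_new)

# For linear drag the fixed point is available in closed form:
v_exact = (v_old + 0.5 * dt / m * F_old) / (1.0 + 0.5 * g * dt / m)
```

The iteration is a contraction with factor $g\Delta t/2m$, so it converges rapidly for small timesteps; this is exactly the idea behind the SC-VV variant discussed in Section 3.1.1.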
Moreover, although the discussion above applies to the VV algorithm, most known MD integration schemes experience the same problem [20], while the self-consistent ones, such as the predictor-corrector method [1], are too clunky for our purpose. Therefore, alternative integration strategies are in order. We shall explore some of them in Section 3.1.

There is another subtlety in the DPD force calculation. As a force describing the rapidly fluctuating interactions between the atoms, $\vec{F}_{IJ}^R$ is assumed to have zero mean and to be uncorrelated with itself on any time scale relevant to the study [29], that is

\[ \left\langle \vec{F}_{IJ}^R(t) \right\rangle = \vec{0} , \quad \left\langle \vec{F}_{IJ}^R(0) \cdot \vec{F}_{IJ}^R(t) \right\rangle = C\,\delta(t) , \tag{38} \]

with $C$ being a scaling constant. Thus, we can divide the integral of $\vec{F}_{IJ}^R$ into smaller terms, each spanning some time $\delta t$ and drawing from the same distribution:

\[ \int_0^t \vec{F}_{IJ}^R(s)\, ds = \sum_{m=0}^{t/\delta t - 1} \int_{m\delta t}^{(m+1)\delta t} \vec{F}_{IJ}^R(m\delta t)\, ds . \tag{39} \]

Applying the central limit theorem to the right-hand side, we find that the integral on the left-hand side also has zero mean and follows a distribution with a variance proportional to $t\,\delta t$. This means that, when propagating the velocity, the impulse due to the random force must be scaled by $\sqrt{\delta t}$:

\[ \Delta\vec{v}_I = \frac{1}{M_I}\left( \sum_{J \neq I} \vec{F}_{IJ}^C \Delta t + \vec{F}_{IJ}^D \Delta t + \vec{F}_{IJ}^R \sqrt{\Delta t} \right) . \tag{40} \]

Chapter 3
Implementation

3.1. Integrators

As we have seen previously, there are two complications in the practical application of a numerical DPD integrator. The second is a superficial problem that can be solved by simply scaling the random force with the appropriate factor. The first, however, is much more fundamental; we cannot solve it by simply inserting the new forces into the force calculation step. In this section, we shall investigate the behavior of different DPD algorithms. The test we perform is very similar to that of Nikunen et al. [20]. The reinvestigation is necessary because we had to implement our own code, owing to the lack of correct DPD integrators in popular MD programs.

3.1.1. Dissipative velocity-Verlet

The dissipative velocity-Verlet (DPD-VV) integrator uses a very simple modification of the standard VV scheme, namely [30]

\[
\begin{aligned}
\vec{V}_I(t + \tfrac{\Delta t}{2}) &= \vec{V}_I(t) + \frac{1}{2M_I}\left[ \vec{F}_I^C(t)\Delta t + \vec{F}_I^D\!\left(\vec{R}_I(t), \vec{V}_I(t)\right)\Delta t + \vec{F}_I^R(t)\sqrt{\Delta t} \right] ,\\
\vec{R}_I(t + \Delta t) &= \vec{R}_I(t) + \vec{V}_I(t + \tfrac{\Delta t}{2})\Delta t ,\\
\vec{V}_I'(t + \Delta t) &= \vec{V}_I(t + \tfrac{\Delta t}{2}) + \frac{1}{2M_I}\left[ \vec{F}_I^C(t + \Delta t)\Delta t + \vec{F}_I^D\!\left(\vec{R}_I(t + \Delta t), \vec{V}_I(t + \tfrac{\Delta t}{2})\right)\Delta t + \vec{F}_I^R(t + \Delta t)\sqrt{\Delta t} \right] ,\\
\vec{V}_I(t + \Delta t) &= \vec{V}_I'(t + \Delta t) + \frac{1}{2M_I}\vec{F}_I^D\!\left(\vec{R}_I(t + \Delta t), \vec{V}_I'(t + \Delta t)\right)\Delta t ,
\end{aligned} \tag{41}
\]

in which we use $\vec{F}^C(t) = \vec{F}^C(\vec{R}(t))$ and $\vec{F}^R(t) = \vec{F}^R(\vec{R}(t))$. Clearly, without the last calculation, Eq. (41) just describes the traditional VV algorithm applied to DPD. The difference lies in the last line, where the dissipative force is recomputed using the velocity at the new time, and $\vec{V}(t + \Delta t)$ is adjusted accordingly. In short, this is an iterative approximation to the true solution of the DPD EOM. Obviously, we should not expect the updated velocity in DPD-VV always to be a good approximation, since it goes through only a single iteration. To achieve self-consistency, we can repeat the last line of Eq. (41), with $\vec{V}'(t + \Delta t)$ replaced by the velocity obtained from the previous iteration, until we reach the desired level of convergence. We denote this variant as self-consistent velocity-Verlet (SC-VV). Nevertheless, in practice, DPD-VV has been shown to give acceptable results while still being computationally efficient [20].

3.1.2. Lowe-Andersen

Like DPD-VV, this method (Lowe) is also a modified version of the standard VV integrator, but it utilizes a very different idea [31]:

\[
\begin{aligned}
\vec{V}_I(t + \tfrac{\Delta t}{2}) &= \vec{V}_I(t - \tfrac{\Delta t}{2}) + \frac{1}{M_I}\vec{F}_I^C(t)\Delta t ,\\
\vec{R}_I(t + \Delta t) &= \vec{R}_I(t) + \vec{V}_I(t + \tfrac{\Delta t}{2})\Delta t ,\\
\vec{V}_I(t + \tfrac{\Delta t}{2}) &= \vec{V}_I(t + \tfrac{\Delta t}{2}) + \frac{1}{M_I}\vec{F}_I^C(t + \Delta t)\Delta t .
\end{aligned}
\]

For every pair $I, J$, do the following sequentially with probability $P$:

\[
\begin{aligned}
\Delta\vec{V}_{IJ}(t + \tfrac{\Delta t}{2}) &= \vec{\xi}_{IJ}(t)\sqrt{\frac{1}{\beta\mu_{IJ}}} - \vec{V}_{IJ}(t + \tfrac{\Delta t}{2}) ,\\
\vec{V}_I(t + \tfrac{\Delta t}{2}) &= \vec{V}_I(t + \tfrac{\Delta t}{2}) + \frac{\mu_{IJ}}{M_I}\Delta\vec{V}_{IJ}(t + \tfrac{\Delta t}{2}) ,\\
\vec{V}_J(t + \tfrac{\Delta t}{2}) &= \vec{V}_J(t + \tfrac{\Delta t}{2}) - \frac{\mu_{IJ}}{M_J}\Delta\vec{V}_{IJ}(t + \tfrac{\Delta t}{2}) ,
\end{aligned} \tag{42}
\]

where $\vec{e}_{IJ}$ is the radial unit vector between $I$ and $J$, $\mu_{IJ} = \frac{M_I M_J}{M_I + M_J}$ the reduced mass of $I$ and $J$, $\vec{\xi}_{IJ}$ a vector of normalized Gaussian random numbers, independent for each $IJ$ pair, and $P$ the selection probability. We need to highlight the sequential, i.e. non-parallelizable, character of the indented block in Eq. (42): the velocity of the current pair of particles must be updated before moving on to the next pair, which implies that the values of $\Delta\vec{V}_{IJ}$ are not independent of each other. However, for each pair of particles, we execute the chain of actions with independent probability $P$, meaning it is possible that one pair always goes through the treatment while another never does. Furthermore, $P \ll 1$ couples the system only weakly to the heat bath, while thermalization happens at every timestep in the limiting case $P = 1$. Therefore, the dynamical properties of the system can be adjusted by an appropriate choice of $P$. Finally, we note that Lowe has only a DPD version; for an LD system, it simplifies to the Andersen thermostat [32], hence the name.

3.1.3. Impulsive leap-frog

This section presents the impulsive leap-frog algorithm (Imp-LF) as described by Goga et al. [33]. Conceptually, it is the leap-frogged version of the Andersen thermostat, given in Eq. (43). For the uncorrelated system, the velocity reduction factor $f$ can be chosen to be anything between 0 and 1, and may differ for different particles. In this case, the limiting condition $f = 1$ reduces the algorithm to the Andersen thermostat. In addition, the Imp-LF algorithm for the LD system is very similar to Lowe. However, to provide the correct dynamics, $f$ must be equal to $\exp(-\frac{1}{2}\beta\Delta t)$, and the value of $P$ must be chosen according to the actual form of the dissipative force.
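A one-dimensional sketch of the Lowe-Andersen bath step of Eq. (42) (with illustrative masses and temperature; the 3D version thermalizes only along $\vec{e}_{IJ}$) makes the momentum-conserving structure explicit:

```python
import math
import random

def lowe_pair_update(vI, vJ, mI, mJ, beta, rng):
    """Lowe-Andersen bath step for one selected pair, 1D sketch of Eq. (42):
    redraw the relative velocity from the Maxwell distribution at 1/beta,
    then split the impulse by the reduced mass so total momentum is conserved."""
    mu = mI * mJ / (mI + mJ)
    v_rel_new = rng.gauss(0.0, 1.0) * math.sqrt(1.0 / (beta * mu))
    dv = v_rel_new - (vI - vJ)
    return vI + mu / mI * dv, vJ - mu / mJ * dv

vI2, vJ2 = lowe_pair_update(1.0, -1.0, 1.0, 3.0, 1.0, random.Random(0))
```

Because $\mu_{IJ}(1/M_I + 1/M_J) = 1$, the relative velocity after the update equals the freshly drawn Maxwellian value, while $M_I \vec{V}_I + M_J \vec{V}_J$ is untouched.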
Compared to the two integrators we have seen, Imp-LF employs a much more convoluted modification, so it is not immediately clear how this algorithm yields dynamically equivalent characteristics. Nevertheless, by applying a Taylor expansion to $f = \exp(-\frac{1}{2}\beta\Delta t)$, we can at least confirm that in the small-$\Delta t$ limit the resulting impulse, $\Delta\vec{V}_{IJ} \approx -\frac{1}{2}\beta\vec{V}_{IJ}\Delta t + \frac{1}{\sqrt{\mu_{IJ}}}\vec{\xi}_{IJ}\sqrt{\Delta t}$, satisfies the scaling condition of Eq. (40), as required at the end of Section 2.4. The full scheme reads

\[ \vec{V}_I(t + \tfrac{\Delta t}{2}) = \vec{V}_I(t - \tfrac{\Delta t}{2}) + \frac{1}{M_I}\vec{F}_I^C(t)\Delta t . \]

For the LD system:

\[ \vec{V}_I(t + \tfrac{\Delta t}{2}) \leftarrow (1 - f)\,\vec{V}_I(t + \tfrac{\Delta t}{2}) + \sqrt{f(2 - f)\frac{k_B T}{M_I}}\,\vec{\xi} . \]

For the DPD system, for every pair $I, J$, do the following sequentially with probability $P$:

\[
\begin{aligned}
\Delta\vec{V}_{IJ}(t) &= (f - 1)\,\vec{V}_{IJ}(t + \tfrac{\Delta t}{2}) + \sqrt{\frac{1 - f^2}{\beta\mu_{IJ}}}\,\vec{\xi}_{IJ}(t) ,\\
\vec{V}_I(t + \tfrac{\Delta t}{2}) &= \vec{V}_I(t + \tfrac{\Delta t}{2}) + \frac{\mu_{IJ}}{M_I}\Delta\vec{V}_{IJ}(t) ,\\
\vec{V}_J(t + \tfrac{\Delta t}{2}) &= \vec{V}_J(t + \tfrac{\Delta t}{2}) - \frac{\mu_{IJ}}{M_J}\Delta\vec{V}_{IJ}(t) .
\end{aligned}
\]

Finally:

\[ \vec{R}_I(t + \Delta t) = \vec{R}_I(t) + \vec{V}_I(t + \tfrac{\Delta t}{2})\Delta t . \tag{43} \]

3.1.4. Test result

3.1.4.1. Model systems

We tested the integrators on two models that capture some basic properties of a fluid of identical particles. The first is an LD system without the conservative force, $\vec{F}_I^C = 0$, so that the EOM becomes

\[ \frac{d}{dt}\vec{P}_I(t) = -\gamma\vec{V}_I(t) + \sigma\vec{\xi}_I(t) , \tag{44} \]

with $\gamma$ and $\sigma$ being the strengths of the damping and random forces, both having constant values as the particles do not interact with each other. Thus, this simulates $N$ independent particles, essentially describing the motion of large, non-interacting particles in a solvent. This simple system provides a quick preliminary test of the correctness of our implementation. Once the basic expected results had been verified, we tried the integrators on the DPD model, with the conservative force again being zero and the dissipative force applied to each pairwise velocity $\vec{V}_{IJ}$,

\[ \vec{F}_I^D = \sum_{J \neq I} \vec{F}_{IJ}^D = \sum_{J \neq I} -\gamma(R_{IJ})\,\vec{V}_{IJ}(t) , \tag{45} \]

for

\[ \gamma(R_{IJ}) = \begin{cases} \gamma\left(1 - \dfrac{R_{IJ}}{R_C}\right)^2 & \text{for } R_{IJ} \leq R_C ,\\[2pt] 0 & \text{otherwise} , \end{cases} \tag{46} \]

in which $R_C$ is the cutoff distance for the correlation function and $\gamma$ the scale factor. Thus, the EOM for this model is

\[ \frac{d}{dt}\vec{P}_I = \sum_{J \neq I} -\gamma(R_{IJ})\,\vec{V}_{IJ}(t) + \sigma(R_{IJ})\,\vec{\xi}_{IJ}(t) . \tag{47} \]

The soft-repulsive dissipative force is a very common choice in DPD simulations and has been employed in various investigations of integration schemes [1, 34, 20], thus allowing comparable results.

3.1.4.2. Physical quantities

Our survey was chiefly concerned with the average kinetic temperature,

\[ \langle k_B T \rangle = \frac{1}{N - 1}\frac{M}{d}\sum_I V_I^2 , \tag{48} \]

whose convergence is one of the main conditions for reliable simulations in the canonical ensemble. Here, $d = 3$ is the dimension of the system. However, having the correct temperature only validates the macroscopic properties of the system. In order to verify the accuracy of its microscopic behavior, we need to consider the mean square displacement (MSD) and the velocity autocorrelation function (VACF) $C(t)$:

\[ \mathrm{MSD}(t) = \frac{1}{N}\sum_I \left( \vec{X}_I(t) - \vec{X}_I(0) \right)^2 , \quad C(t) = \frac{1}{N}\sum_I \left\langle \vec{V}_I(t) \cdot \vec{V}_I(0) \right\rangle . \tag{49} \]

For the LD system, theoretically [29], we expect the MSD to grow linearly at large $t$, with

\[ \mathrm{MSD} \to \frac{2 d^2 k_B T}{\gamma}\, t , \tag{50} \]

while the VACF decays exponentially over time, as given in Eq. (51). On the other hand, we do not have algebraic expressions for these two quantities in the DPD model, but we do know that they exhibit the same behavior mentioned above, namely linearity for the MSD and exponential decay for the VACF, under the condition of Eq. (52) [35]. Finally, for the DPD system, we also compute the radial distribution function (RDF) of Eq. (53), whose result has been investigated by Nikunen et al. [20]. Naturally, the RDF is unity for a system with no conservative force, because it is structureless.

3.1.4.3. Simulation conditions

This experiment employed DPD reduced units, i.e. $R = R^*/R_C$, $T = T^*/\gamma$ and $t = t^* R_C (M/\gamma)^{1/2}$, with the asterisk denoting values in real units.
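The three observables of Eqs. (48) and (49) are straightforward to estimate from trajectory snapshots; a minimal sketch (function names are illustrative, not the thesis's C++ analysis code):

```python
def kinetic_temperature(velocities, mass, d=3):
    """Average kinetic temperature, Eq. (48): <kB T> = (1/(N-1)) (M/d) sum_I |V_I|^2."""
    n = len(velocities)
    return mass / (d * (n - 1)) * sum(sum(c * c for c in v) for v in velocities)

def msd(pos_t, pos_0):
    """Mean square displacement, Eq. (49), between two snapshots."""
    n = len(pos_t)
    return sum(sum((a - b) ** 2 for a, b in zip(p, q))
               for p, q in zip(pos_t, pos_0)) / n

def vacf(vel_t, vel_0):
    """Velocity autocorrelation C(t), Eq. (49), from two snapshots."""
    n = len(vel_t)
    return sum(sum(a * b for a, b in zip(v, w)) for v, w in zip(vel_t, vel_0)) / n
```

In practice one would also average the MSD and VACF over time origins to reduce statistical noise.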
For reference, the theoretical VACF, the validity condition, and the RDF definition quoted above read

\[ C(t) = \frac{k_B T}{M}\, e^{-\gamma t / M} , \tag{51} \]

\[ \frac{\gamma\,\rho^{-1/d}}{d\sqrt{k_B T / M}} \ll 1 , \tag{52} \]

\[ G(R) = \frac{1}{N(N - 1)} \sum_{I, J \neq I} \frac{\left\langle \delta(R_{IJ} - R) \right\rangle}{4\pi R^2} . \tag{53} \]

The simulations were all performed with $N = 4096$ identical particles of $M = 1.0$, in periodic cubic boxes of size $10 \times 10 \times 10$, giving a number density of $\rho \approx 4.1$. The energy scale is defined by setting the temperature to unity, $k_B T = 1.0$. The damping factor is chosen to be 0.18 in both cases, so the strength of the random force can be evaluated using Eq. (36): specifically, $\sigma = 0.6$ for the LD system and $\sigma(R_{IJ}) = 0.6(1 - R_{IJ}/R_C)H(R_C - R_{IJ})$ for the DPD system, with $H(x)$ being the Heaviside step function.

3.1.4.4. Result for Langevin dynamics

In Fig. 3-1, we plot the average kinetic temperature for the DPD-VV, Imp-LF and standard VV integrators in the LD system. Obviously, as the Lowe integrator reduces to the Andersen thermostat in this case, we do not consider its result. We can see that all three integrators give temperatures fluctuating around the correct value of $k_B T = 1.0$. However, the VV simulation produced thermal fluctuations visibly larger than those of the DPD integrators, hinting at a divergence of the VV result from the other two.

Fig. 3-1: Time evolution of the average kinetic temperature ⟨k_B T⟩ for different integrators in the LD model. As mentioned in Section 3.1.2, the Lowe integrator only has the pairwise form, so its result is not presented here.

In fact, the discrepancy of the standard VV integrator becomes obvious in Fig. 3-2, which shows the computed MSD. For each result, we fit the values from $t = 20.0$ with linear regression. We find that both DPD integrators yield nearly identical MSD lines, whose asymptotic behavior conforms to the theoretical formula, namely MSD → 100.0t for our specified choice of variables, while the corresponding result of the standard VV algorithm clearly deviates from the other two.

Fig. 3-2: Time evolution of the mean square displacement for different integrators in the LD model. Using linear regression, we fit the data from t = 20.0 of the Imp-LF and VV integrators to two lines, whose equations are displayed on the graph.

Fig. 3-3: The decay of the velocity autocorrelation function over time for different integrators in the LD model. Unlike the previous plot, the trendline here is simply a visual aid.

Finally, Fig. 3-3 plots $\ln(C(t)/C(0))$ against time. Here, the two trendlines are not fitted to the data and are simply there for visual support, unlike in Fig. 3-2. Akin to the previous result, the two DPD integrators generate data compatible with the theory, which dictates that the logarithm of the VACF be linear in time with a slope of −0.18. Likewise, the behavior of the VV algorithm is displayed again to highlight the discrepancy of the standard version, justifying the necessity of DPD integrators. Moreover, the fluctuations at the tails of the two plots are also expected, as the VACF approaches 0 and starts to be dominated by thermal noise.

3.1.4.5. Result for dissipative particle dynamics

Next, we consider the DPD model, where we took $\vec{F}_{IJ}^C = 0$. As in the above discussion, our interest focused on the three quantities ⟨k_B T⟩, MSD and VACF as functions of time. Here, we must remember that the Lowe and Imp-LF integrators are fundamentally different from DPD-VV, because the dynamics of the former can be controlled by choosing the appropriate collision probability $P$, while the latter's effective friction rate depends solely on $\gamma(R_{IJ})$. Thus, the value of $P$ was chosen such that the simulations for Lowe and Imp-LF gave an effective friction rate similar to that of DPD-VV, namely $P = 0.037$ for Lowe and $P = 2\gamma(R_{IJ})/\gamma(0)$ for Imp-LF [33]. Fig. 3-4 shows the average kinetic temperature, which should fluctuate around $k_B T = 1.0$, for all three integrators. We see that, by having similar friction rates, the three simulations yielded thermal fluctuations of very similar size.

Fig. 3-4: Time evolution of the average kinetic temperature ⟨k_B T⟩ for different integrators in the DPD model.

This fact is further substantiated by comparing the MSD and the logarithm of the VACF, plotted in Fig. 3-5 and Fig. 3-6, with the shown trendlines fitted to the DPD-VV data. We observe that the data sets are very similar to each other and display the anticipated behavior, namely the MSD is linear in time for large $t$, while the VACF decays exponentially at the expected rate of 0.1 [35]. We also note that this rate differs significantly from the LD result, even though the strength of the friction force in the DPD model is about the same, and smaller on average. This occurs because the force is now pairwise, meaning there are more collisions per timestep, which alters the mean free path and the effective friction rate.

Fig. 3-5: Evolution of the mean square displacement for different integrators in the DPD model. Using linear regression, we find the fit for the data from t = 20.0 of the DPD-VV integrator.

Fig. 3-6: The decay of the velocity autocorrelation function over time for different integrators in the DPD model. The trendline is fitted to the DPD-VV result, with the y-intercept fixed at 0.0.

Finally, we compared the radial distributions of DPD-VV, Lowe and Imp-LF, for which the results of the first two algorithms have been obtained by Nikunen's group [20]. We can confirm by eye the expected unity of $G(R)$ at large distance. Furthermore, we notice anomalies at small $R_{IJ}$: $G(R)$ for DPD-VV and Lowe both become larger than one near $R = 0$, and Lowe's distribution has a small dip around $R = 0.2$, while DPD-VV's does not. These deviations from the ideal-gas limit have both been observed by Nikunen et al., but the group did not offer any explanation other than that they are simulation artifacts.
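An RDF estimator in the spirit of Eq. (53) can be written as a pair-distance histogram; this sketch (illustrative only) normalizes each shell by its volume $4\pi R^2\,dR$ and the ideal-gas density so that $G(R) \to 1$ for a structureless system:

```python
import math

def rdf(positions, box, nbins, r_max):
    """Radial distribution estimator in the spirit of Eq. (53): histogram
    minimum-image pair distances, normalized so G -> 1 for an ideal gas."""
    n = len(positions)
    dr = r_max / nbins
    hist = [0] * nbins
    for i in range(n):
        for j in range(i + 1, n):
            d2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                diff = a - b
                diff -= box * round(diff / box)     # minimum image convention
                d2 += diff * diff
            d = math.sqrt(d2)
            if d < r_max:
                hist[int(d / dr)] += 2              # count the pair for both I and J
    rho = n / box ** 3
    return [hist[k] / (n * rho * 4.0 * math.pi * ((k + 0.5) * dr) ** 2 * dr)
            for k in range(nbins)]

g = rdf([(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)], 10.0, 10, 2.0)
```

For production use, the double loop over pairs would of course be replaced by a cell or neighbor list.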
We can only partially explain the first behavior by the fact that the particles in our models do not collide and can pass through each other, which disturbs the distribution at small distances. Nevertheless, we verified that our implementation produces results similar to those of Nikunen et al., so we conclude that our propagators work correctly.

Fig. 3-7: The radial distribution function G(R).

3.2. Memory

With the integrators tested and shown to work correctly, we can now turn to the implementation of the memory kernel, the core of our work. In this section, we follow the simulation model used by Li's group for the standard and auxiliary memory [13, 21].

3.2.1. Standard memory

As we have discussed earlier, the objective of this section is not the retrieval of the memory kernel from an atomistic simulation, but the verification of our implementation of the memory kernel. For now, we assume that the relevant values have been found and focus on the computation of the EOM in Eq. (25). To facilitate the computation of the pairwise force between two particles, we want to express it in terms of the radial and orthogonal directions. For a system without any preferred direction, we expect the fluctuating force $\delta\vec{F}_{IJ}^{\mathcal{Q}}$ to have no bias in the orthogonal plane. Therefore, we can decompose $\delta\vec{F}_{IJ}^{\mathcal{Q}}$ as in Eq. (54), in which $\delta\vec{F}_{IJ}^{\mathcal{Q}\parallel}$ and $\delta\vec{F}_{IJ}^{\mathcal{Q}\perp}$ are the radial and orthogonal components of $\delta\vec{F}_{IJ}^{\mathcal{Q}}$. The memory kernel can then be approximated as in Eq. (55), where $K_{IJ}^{\parallel}$ and $K_{IJ}^{\perp}$ are presumed to be known. Discretizing and decomposing Eq. (25), we finally find Eq. (56), with $m$ being the cutoff length for the memory kernel, $\vec{V}_{IJ}^{\parallel} = (\vec{V}_{IJ} \cdot \vec{e}_{IJ})\vec{e}_{IJ}$ and $\vec{V}_{IJ}^{\perp} = \vec{V}_{IJ} - \vec{V}_{IJ}^{\parallel}$ the parallel and orthogonal relative velocities between $I$ and $J$, and $\xi$ and $\boldsymbol{E}$ a Gaussian random number and a random antisymmetric matrix, respectively, both of which are independent for different particle pairs and at different times [13].
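The radial/orthogonal split used in Eq. (54) is a plain vector projection; a minimal sketch (assuming `e` is already a unit vector):

```python
def decompose(force, e):
    """Split a pairwise 3-vector into radial and orthogonal parts, as in Eq. (54):
    F_par = (e (x) e) . F = (F.e) e  and  F_perp = F - F_par."""
    proj = sum(f * c for f, c in zip(force, e))
    f_par = tuple(proj * c for c in e)
    f_perp = tuple(f - p for f, p in zip(force, f_par))
    return f_par, f_perp

f_par, f_perp = decompose((1.0, 2.0, 3.0), (1.0, 0.0, 0.0))
```

By construction, the two parts sum back to the original vector and the orthogonal part has no component along $\vec{e}_{IJ}$.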
The coefficients 𝛼𝐼𝐽∥ and 𝛼𝐼𝐽⊥ are used to produce random forces that satisfy the generalized FDT: 𝛿𝐹𝐼𝐽𝒬⃗⃗⃗⃗ ⃗ = (𝑒𝐼𝐽⃗⃗ ⃗⃗ ⊗ 𝑒𝐼𝐽⃗⃗ ⃗⃗ ) ⋅ 𝛿𝐹𝐼𝐽𝒬⃗⃗⃗⃗ ⃗ + (1 − 𝑒𝐼𝐽⃗⃗ ⃗⃗ ⊗ 𝑒𝐼𝐽⃗⃗ ⃗⃗ ) ⋅ 𝛿𝐹𝐼𝐽𝑄⃗⃗⃗⃗ ⃗ = 𝛿𝐹𝐼𝐽𝒬∥⃗⃗ ⃗⃗ ⃗⃗ + 𝛿𝐹𝐼𝐽𝒬⊥⃗⃗ ⃗⃗ ⃗⃗ ⃗ , (54) 𝑲𝐼𝐽 = (𝑒𝐼𝐽⃗⃗ ⃗⃗ ⊗ 𝑒𝐼𝐽⃗⃗ ⃗⃗ )𝐾𝐼𝐽∥ + (1 − 𝑒𝐼𝐽⃗⃗ ⃗⃗ ⊗ 𝑒𝐼𝐽⃗⃗ ⃗⃗ )𝐾𝐼𝐽⊥ , (55) ΔΔ𝑡𝑃𝐼⃗⃗ ⃗(𝑡) = ∑[𝐹𝐼𝐽𝐶(𝑅𝐼𝐽)𝑒𝐼𝐽⃗⃗ ⃗⃗ 𝐽≠𝐼− 𝛥𝑡 ∑ (𝐾𝐼𝐽∥ (𝑅𝐼𝐽(𝑡 − 𝑛𝛥𝑡), 𝑛𝛥𝑡)𝑉𝐼𝐽∥⃗⃗⃗⃗ (𝑡 − 𝑛𝛥𝑡) + 𝐾𝐼𝐽⊥(𝑅𝐼𝐽(𝑡 − 𝑛𝛥𝑡), 𝑛𝛥𝑡)𝑉𝐼𝐽⊥⃗⃗⃗⃗ ⃗(𝑡 − 𝑛𝛥𝑡))𝑚𝑛=0+ √𝛥𝑡 ∑(𝛼𝐼𝐽∥ (𝑅𝐼𝐽(𝑡 − 𝑛𝛥𝑡), 𝑛𝛥𝑡)𝜉𝐼𝐽(𝑛𝛥𝑡)𝑒𝐼𝐽⃗⃗ ⃗⃗ + 𝛼𝐼𝐽⊥(𝑅𝐼𝐽(𝑡 − 𝑛𝛥𝑡), 𝑛𝛥𝑡)𝐸(𝑛𝛥𝑡) ⋅ 𝑒𝐼𝐽⃗⃗ ⃗⃗ )𝑚𝑛=0] , (56) 𝛽 ∑ 𝛼𝐼𝐽𝑥 (𝑅𝐼𝐽, 𝑛𝛥𝑡)𝛼𝐼𝐽𝑥 (𝑅𝐼𝐽, (𝑛 + 𝑠)𝛥𝑡)𝑚−𝑠𝑛=0= 𝐾𝐼𝐽𝑥(𝑅𝐼𝐽, 𝑠𝛥𝑡) , (57) 25 for 𝑥 ∈ {∥, ⊥}. We can now run the simulation by applying on Eq. (56) any of the integration schemes discussed previously. However, there is one last crucial problem we must address. Up until now, we have ignored the fact that 𝑒𝐼𝐽⃗⃗ ⃗⃗ is a function of time and may vary greatly during our period of interest, as the particles’ trajectory evolves. This conveniently allows us to write Eq. (55) and (56) in terms of pure radial and orthogonal components, avoiding the complication due to the cross-terms. However, we should expect that these terms are not negligible in general. Unfortunately, all of the works we have consulted do not share this concern; in fact, they seem to imply that the radial unit vector appearing in Eq. (54), (55) and (56) is the current unit vector, evaluated at time 𝑡 [13, 21, 15]. This simplification is somewhat puzzling, since conceptually, the colored noise generated in the last two terms of Eq. (56) is the residual effects of the fluctuating microscopic forces in the past, which may not be parallel or orthogonal to the unit vector at the present. In Section 3.2.2, we shall also investigate whether this distinction creates any significant discrepancy. 3.2.2. Test result for standard memory 3.2.2.1. 
Model system As mentioned before, the determination of the memory kernel is rather elaborate, as it requires a reference MD simulation, but the actual values are of little importance right now. Consequentially, for the propagators’ tests, we employed memory kernels with simple functional forms. Specifically, for the standard memory, we assumed the temporal and spatial dependence of the kernel are separable, and the random coefficients were in which 𝑥 ∈ {∥, ⊥}, 0 < 𝑐 ≤ 1 and 𝜎(𝑅𝐼𝐽) has been defined previously in Eq. (46). Equation (57) then required This choice of the memory kernel is beneficial because it vanishes at large time 𝑡 with the appropriate value for 𝑐 and the cutoff length 𝑚, while still being easily calculable. Clearly, the larger 𝑚 is, the smaller 𝛼𝐼𝐽𝑥 (𝑚Δ𝑡) and 𝐾𝐼𝐽𝑥(𝑚Δ𝑡) become. Fig. 3-8 shows 𝛼𝐼𝐽𝑥 /𝜎(𝑅𝐼𝐽) and 𝐾𝐼𝐽𝑥/𝛽𝜎(𝑅𝐼𝐽) for two combinations of 𝑐 and 𝑚, i.e. 𝑐 = 0.7, 𝑚 = 20 and 𝑐 =0.836, 𝑚 = 40. We can see that, for each 𝑐, the chosen 𝑚 is about the lower limit where the vanishing condition still holds, meaning 𝐾𝐼𝐽𝑥(𝑛Δ𝑡) ≪ 𝐾𝐼𝐽𝑥(0) and 𝛼𝐼𝐽𝑥 (𝑛Δ𝑡) ≪ 𝛼𝐼𝐽𝑥 (0). 𝛼𝐼𝐽𝑥 (𝑅𝐼𝐽, 𝑛𝛥𝑡) = 𝑐𝑛𝜎(𝑅𝐼𝐽) , (58) 𝐾𝐼𝐽𝑥(𝑅𝐼𝐽, 𝑛𝛥𝑡) =𝛽𝑐𝑛𝑐2 − 1(𝑐2(𝑚+1−𝑛) − 1)𝜎2(𝑅𝐼𝐽) . (59) 26 Fig. 3-8: Coefficients for dissipative force (left) and random force (right), with some choices of 𝒄 and 𝒏. Lastly, for the simulation conditions, all tests used DPD reduced units and were performed on a system of 𝑁 identical particles with 𝑀 = 1.0 and 𝐹𝐼𝐽𝐶 = 0, in periodic cubic boxes of size 10 × 10 × 10. 3.2.2.2. Result Fig. 3-9: Average kinetic temperature 〈𝒌𝑩𝑻〉 as a function of cutoff length 𝒎 for the standard memory scheme. Here, we used 𝑁 = 1000, 𝑇 = 3.0, 𝑐 = 0.7 and employed DPD-VV. Unfortunately, even the most basic benchmark, the average temperature, could not be achieved. Our first test was run with 𝑐 = 0.7 and 𝑚 = 20, whose data point is second from the left on Fig. 3-9. We discovered that the kinetic temperature converged nicely, but to a value slightly less than the expected temperature. 
Naturally, we thought this happened because the memory is insufficient, so we extended the memory cutoff. This means that the memory kernel varies with each case, but the random force stays roughly identical, since 𝛼𝐼𝐽𝑥(𝑛 > 20) are relatively very small. Bizarrely, we found that the larger 𝑚 is, the further the average kinetic temperature 〈𝑇〉 drifts from the expected temperature, as we can see from Fig. 3-9. Apparently, our thermostat produced a converging temperature but, for some reason, violated the FDT. Next, we realized the FDT was also violated by simply changing the absolute value of the reference temperature 𝑇. Fig. 3-10 plots 〈𝑇〉/𝑇 for different 𝑇, with 𝑐 = 0.836, 𝑚 = 40 and two different system sizes (green dotted line for 𝑁 = 125 and purple dashed line for 𝑁 = 256). As in Fig. 3-9, we find that 〈𝑇〉 does not converge to the expected temperature 𝑇, indicating that the thermostat somehow enforces an incorrect version of the FDT. In the same figure, we also show the corresponding results from simulations with 𝑐 = 0.7, which reinforce our observation that there is a very strange, but in some sense well-behaved, relationship between the average temperature 〈𝑘𝐵𝑇〉 and the reference 𝑘𝐵𝑇. Once more, the relation hints at an elementary mistake, as 〈𝑘𝐵𝑇〉 both overestimates and underestimates 𝑘𝐵𝑇, ruling out a simple systematic offset. Fig. 3-10: Ratio between the kinetic temperature 〈𝒌𝑩𝑻〉 and the reference temperature 𝒌𝑩𝑻 as a function of 𝒌𝑩𝑻 in the standard memory scheme. For both 𝑁, the box size was chosen so that the density was constant at 𝜌 = 0.4. The simulations were run with Δ𝑡 = 0.0005 and 𝑚 = 40. Finally, we remark that the ratio 〈𝑇〉/𝑇 is independent of system size. This is confirmed by Fig. 3-11, which shows the ratio 〈𝑇〉/𝑇 for another series of tests with changing 𝑁, but constant 𝜌 = 0.4, 𝑐 = 0.7 and 𝑚 = 20.
The result supports that the identified behavior is not a finite boundary effect: the detected trend actually becomes smoother with larger system size. 28 Fig. 3-11: The relationship between 〈𝑻〉/𝑻 and 𝒌𝑩𝑻 for different system sizes. Here, the simulations were all executed with Δ𝑡 = 0.01, 𝑐 = 0.7 and 𝑚 = 20. We also notice two other trends. First, we want to inspect the effect of the timestep Δ𝑡 on the earlier observations. In fact, the previous two figures actually used two different values for Δ𝑡: the simulations in Fig. 3-11 were executed with Δ𝑡 = 0.01, while Fig. 3-10 had Δ𝑡 = 0.0005. We isolate the data for 𝑐 = 0.7 in Fig. 3-12 and conclude that the timestep Δ𝑡 affects the results quantitatively, but does not alter the qualitative relationship between 〈𝑇〉/𝑇 and 𝑘𝐵𝑇. Second, Fig. 3-13 examines the difference of reading 𝑒𝐼𝐽⃗⃗ ⃗⃗ as the past vector or the present vector. Namely, the EOM for the present interpretation is with 𝑉𝐼𝐽∥⃗⃗⃗⃗ (𝑡 − 𝑛𝛥𝑡) = (𝑉𝐼𝐽⃗⃗⃗⃗ ⃗(𝑡 − 𝑛𝛥𝑡) ⋅ 𝑒𝐼𝐽⃗⃗ ⃗⃗ (𝑡)) 𝑒𝐼𝐽⃗⃗ ⃗⃗ (𝑡), while the EOM for the past case is obtained by replacing all instances of 𝑒𝐼𝐽⃗⃗ ⃗⃗ (𝑡) with 𝑒𝐼𝐽⃗⃗ ⃗⃗ (𝑡 − 𝑛Δ𝑡) in Eq. (60). Evidently, for the simulation conditions of 𝑐 = 0.7, 𝑚 = 20, 𝑁 =1000 and 𝑘𝐵𝑇 = 1.0, there is no appreciable discrepancy between the two choices: both average kinetic temperatures converge to the same, incorrect value of 0.91. We also see that the thermal fluctuations in the present interpretation are larger than, but do not significantly differ from the historical ones. ΔΔ𝑡𝑃𝐼⃗⃗ ⃗(𝑡) = ∑[𝐹𝐼𝐽𝐶(𝑅𝐼𝐽)𝑒𝐼𝐽⃗⃗⃗⃗ ⃗(𝑡)𝐽≠𝐼− 𝛥𝑡 ∑ (𝐾𝐼𝐽∥ (𝑅𝐼𝐽(𝑡 − 𝑛𝛥𝑡), 𝑛𝛥𝑡)𝑉𝐼𝐽∥⃗⃗ ⃗⃗ (𝑡 − 𝑛𝛥𝑡) + 𝐾𝐼𝐽⊥(𝑅𝐼𝐽(𝑡 − 𝑛𝛥𝑡), 𝑛𝛥𝑡)𝑉𝐼𝐽⊥⃗⃗⃗⃗ ⃗(𝑡 − 𝑛𝛥𝑡))𝑚𝑛=0+ √𝛥𝑡 ∑ (𝛼𝐼𝐽∥ (𝑅𝐼𝐽(𝑡 − 𝑛𝛥𝑡), 𝑛𝛥𝑡)𝜉𝐼𝐽(𝑛𝛥𝑡)𝑒𝐼𝐽⃗⃗⃗⃗ ⃗(𝑡) + 𝛼𝐼𝐽⊥(𝑅𝐼𝐽(𝑡 − 𝑛𝛥𝑡), 𝑛𝛥𝑡)𝐸(𝑛𝛥𝑡) ⋅ 𝑒𝐼𝐽⃗⃗⃗⃗ ⃗(𝑡))𝑚𝑛=0] , (60) 29 Fig. 3-12: Effect of the timestep value on ⟨𝒌𝑩𝑻⟩. The simulations all used 𝑐 = 0.7. Fig. 3-13: Time evolution of the average kinetic temperature 〈𝒌𝑩𝑻〉 for two different interpretations of 𝒆𝑰𝑱⃗⃗⃗⃗ ⃗ in Eq. (56). 
In short, our attempt at implementing the standard memory was a confounding failure. We spent considerable resources investigating the problem, testing many different hypotheses. For example, we discussed earlier our test of the two interpretations of Eq. (56). We also inspected whether the various integration methods mentioned in Section 3.1 produced any variation. Furthermore, as the memory was uninitialized (i.e. equal to zero) at the beginning of the simulation, we suspected this initial condition might somehow compound and affect the steady-state behavior, so we attempted a run using the DPD EOM initially, recording the memory along the way, then switched to the MZ EOM. Nevertheless, no matter what test we tried, the resulting characteristics stayed the same. Finally, we tested whether we could reduce the MZ EOM to a standard DPD EOM, namely by taking 𝑚 = 0. However, this was also not possible, as the functional form of our memory kernel cannot satisfy the FDT in this case¹. The obvious pathological dependence of 〈𝑘𝐵𝑇〉 on the reference temperature 𝑘𝐵𝑇 implies the thermostat is a complete failure, which means the simulation code may have some algorithmic problem. Curiously, the available literature seems to treat the actual implementation of the memory kernel as a trivial task, since we could not find any detailed discussion. On the other hand, the stability of the thermostat and the independence of the observed behavior from the periodic boundary conditions (and arguably the timestep) hint at a more fundamental issue, for example a wrong interpretation of the mathematical foundations. We strongly suspect that our interpretation of Eq. (56) or (57), which we copied directly from the paper by Li et al. [13], is incorrect. 3.2.3. Auxiliary memory This lack of success motivated us to try the auxiliary memory method, with which we obtained much better results. Similar to the previous section, we want to decompose Eq.
(30) into pairwise terms, with components in the radial and orthogonal direction [21]: with the parallel and perpendicular components of 𝑨 and 𝑩 satisfying the second FDT independently. Again, we assume that their functional form has been found. We shall demonstrate in Section 4.2 how their value can be obtained from the memory kernel. In the meantime, we want to discuss the various issues in applying Eq. (61) in the simulation. First, Eq. (61) is an EOM for the pairwise momenta, not for the momentum of each particle. Intuitively, the decomposition is reasonable, since each 𝑃𝐼𝐽⃗⃗⃗⃗ ⃗ can be considered as a degree of freedom. However, it is difficult to design an algorithm that can determine the momenta of 𝑁 particles from the 𝑁(𝑁 − 1) constraints found by solving Eq. (61). Thus, we want to use a form where each individual momentum appears explicitly: where 𝑒𝐼𝐽⊥⃗⃗ ⃗⃗ =𝑃𝐼𝐽⊥⃗⃗ ⃗⃗ ⃗⃗ 𝑃𝐼𝐽⊥ is the unit vector parallel to the orthogonal momentum. 1 It is a bit nuanced than simply taking 𝑚 = 0 in Eq. (58) and (59) then applying the standard FDT relation in Eq. (36). Instead, we must use the generalized formula (Eq. (19)) under the limit 𝑚 → 0. (𝑃𝐼𝐽∥̇𝑠𝐼𝐽∥⃗⃗ ⃗⃗ ̇) = (𝐹𝐼𝐽𝐶⃗⃗⃗⃗ ⃗0⃗ ) − (0 (𝑨𝐼𝐽∥ )𝑝𝑠(𝑨𝐼𝐽∥ )𝑠𝑝(𝑨𝐼𝐽∥ )𝑠𝑠)(𝑃𝐼𝐽∥𝑠𝐼𝐽∥⃗⃗ ⃗⃗ ) + (0 00 (𝑩𝐼𝐽∥ )𝑠𝑠) (0⃗ 𝜉𝐼𝐽∥⃗⃗ ⃗⃗ ) , (𝑃𝐼𝐽⊥̇𝑠𝐼𝐽⊥⃗⃗ ⃗⃗ ̇) = − (0 (𝑨𝐼𝐽⊥ )𝑝𝑠(𝑨𝐼𝐽⊥ )𝑠𝑝(𝑨𝐼𝐽⊥ )𝑠𝑠)(𝑃𝐼𝐽⊥𝑠𝐼𝐽⊥⃗⃗ ⃗⃗ ) + (0 00 (𝑩𝐼𝐽⊥ )𝑠𝑠) (0⃗ 𝜉𝐼𝐽⊥⃗⃗ ⃗⃗ ) , (61) 𝑑𝑑𝑡𝑃𝐼⃗⃗ ⃗ = ∑(𝑃𝐼𝐽∥̇𝑒𝐼𝐽⃗⃗ ⃗⃗ + 𝑃𝐼𝐽⊥̇𝑒𝐼𝐽⊥⃗⃗ ⃗⃗ )𝐽≠𝐼 , (62) 31 Subsequently, we need to find the finite-time propagator for Eq. (62). We must apply the Trotter splitting here, as the direct, naïve discretization of Eq. (61) produces incorrect behavior [24]. Here, we shall present the VV form of the finite integrator [24]: where 𝑥 ∈ {∥, ⊥} and Again, we note that the generalized FDT is satisfied by the second formula in Eq. (64). 3.2.4. Test result for auxiliary memory 3.2.4.1. 
Model system Except for the functional form of the memory, we employed the same conditions as the tests for the standard memory, namely, same system size and no conservative force. On the other hand, the auxiliary memory matrices were appropriated directly from the work by Li et al. [21]2, so we could easily compare the results. Thus, for each parallel degree of freedom, i.e. pair momentum 𝑃𝐼𝐽∥ , we have eight auxiliary variables, corresponding to four damped oscillators, while the orthogonal ones only have four auxiliary variables. The entries for the matrices 𝑨 are 2 Their standard memory is much more complicated, so we did not use the available function and opted for a simpler one. 𝑃𝐼⃗⃗ ⃗ (𝑡 +Δ𝑡2) = 𝑃𝐼⃗⃗ ⃗(𝑡) +Δ𝑡2∑𝐹𝐼𝐽𝐶⃗⃗⃗⃗ ⃗𝐽≠𝐼 , 𝑃𝐼𝐽∥ = (𝑃𝐼⃗⃗ ⃗ (𝑡 +Δ𝑡2) − 𝑃𝐽⃗⃗ ⃗ (𝑡 +Δ𝑡2)) ⋅ 𝑒𝐼𝐽⃗⃗ ⃗⃗ and 𝑃𝐼𝐽⊥ = 𝑃𝐼𝐽⃗⃗⃗⃗ ⃗ − 𝑃𝐼𝐽∥𝑒𝐼𝐽⃗⃗ ⃗⃗ , (𝑃𝐼𝐽𝑥𝑠𝐼𝐽𝑥⃗⃗ ⃗⃗ ) = [𝑻𝑥 − (1 𝟎𝟎 𝟎)] (𝑃𝐼𝐽𝑥𝑠𝐼𝐽𝑥⃗⃗ ⃗⃗ ) + 𝑺𝑥 (0⃗ 𝜉𝐼𝐽𝑥⃗⃗ ⃗⃗ ) , 𝑃𝐼⃗⃗ ⃗(𝑡 + Δ𝑡) = 𝑃𝐼⃗⃗ ⃗ (𝑡 +Δ𝑡2) +Δ𝑡2∑𝐹𝐼𝐽𝐶⃗⃗⃗⃗ ⃗𝐽≠𝐼+ ∑(𝑃𝐼𝐽∥𝑒𝐼𝐽⃗⃗ ⃗⃗ + 𝑃𝐼𝐽⊥𝑒𝐼𝐽⊥⃗⃗ ⃗⃗ )𝐽≠𝐼 , 𝑅𝐼⃗⃗⃗⃗ (𝑡 + Δ𝑡) = 𝑅𝐼⃗⃗⃗⃗ (𝑡) +Δ𝑡𝑀𝐼𝑃𝐼⃗⃗ ⃗ (𝑡 +Δ𝑡2) , (63) 𝑻𝑥 = 𝑒−Δ𝑡𝑨𝑥 , 𝛽𝑺𝑥(𝑺𝑥)𝑇 = 𝟏 − 𝑻𝑥(𝑻𝑥)𝑇 . (64) 32 and (𝑨𝐼𝐽𝑥 )𝑠𝑝= −(𝑨𝐼𝐽𝑥 )𝑝𝑠, with being the spatial dependence of the memory kernel, in which 𝑅𝐶∥ and 𝑅𝐶⊥ are the cutoff distances. The corresponding matrices 𝑩 can then be computed from the FDT relation of Eq. (34). This particular form of the matrices was chosen because, applying Eq. (33), it yields a memory kernel 𝑲(𝑡) equal to a sum of damped harmonic oscillators, which is very convenient for the fitting process [21]. Lastly, for the simulation conditions, we ignored the conservative force again. The simulation was run in a box of 10 × 10 × 10, with density 𝜌 = 0.4 and 𝑘𝐵𝑇 = 3.0. 3.2.4.2. Result Fig. 3-14 displays the average kinetic temperature of the simulation using the auxiliary memory scheme. We confirm the correct temperature is finally obtained, but we cannot comment on the thermal fluctuations, since Li et al. did not provide their corresponding result. 
(𝑨𝐼𝐽⊥)𝑝𝑠 = [0 0.907𝜇 0 0.421𝜇], with (𝑨𝐼𝐽⊥)𝑠𝑠 block diagonal with the 2 × 2 blocks [51.69 38.78; −38.78 0] and [15.00 12.72; −12.72 0]; similarly, (𝑨𝐼𝐽∥)𝑝𝑠 = [0 0.280𝜆 0 0.611𝜆 0 0.658𝜆 0 0.339𝜆], with (𝑨𝐼𝐽∥)𝑠𝑠 block diagonal with the 2 × 2 blocks [19.66 60.55; −60.55 0], [26.30 23.87; −23.87 0], [23.55 39.44; −39.44 0] and [14.29 7.15; −7.15 0], (65) where 𝜇 = 30.57 (1 + 2.62𝑅𝐼𝐽/𝑅𝐶⊥)^0.5 (1 − 𝑅𝐼𝐽/𝑅𝐶⊥)^1.31 and 𝜆 = 38.06 (1 + 1.78𝑅𝐼𝐽/𝑅𝐶∥)^0.5 (1 − 𝑅𝐼𝐽/𝑅𝐶∥)^1.31. (66) Fig. 3-14: Time evolution of the average kinetic temperature ⟨𝒌𝑩𝑻⟩ in the auxiliary memory scheme. Fig. 3-15: Time evolution of the mean square displacement for the auxiliary memory scheme. Here, we used 𝑁 = 1000 and 𝜌 = 0.4. Fig. 3-16: Log-log plot of the above mean square displacement versus time for the auxiliary memory method. The obtained MSD and VACF are shown in Fig. 3-15 and Fig. 3-17, respectively, and exhibit features similar to the results of Li et al. [21]. Specifically, the log-log plot of the MSD has a slope of 2.0 in the ballistic regime, then 1.0 at large 𝑡 (highlighted in red in Fig. 3-16). Furthermore, the VACF shows a negative "trough" at small 𝑡 and then fluctuates around zero at larger times. We also note that, compared to the figures of Li's group [21], the analogous features of our MSD and VACF occur at slightly earlier times, which can be attributed to a smaller system size, since we did not want to spend too many resources on basic testing. Fig. 3-17: The decay of the velocity autocorrelation function over time. These developments confirm that our implementation of the auxiliary memory and its respective integrator works as intended. 3.3. Mean force potential In this section, we present the results of our implementation of Boltzmann inversion, a popular method to derive coarse-grained (CG) potentials due to its straightforward nature and general applicability. 3.3.1.
Inverse Boltzmann method As its name implies, the technique inverts the identity 𝑔(𝑅) = 𝑒^(−𝛽𝑈(𝑅)), (67) where 𝑔(𝑅) is the radial distribution function, to calculate the potential of mean force 𝑈(𝑅). The practical procedure is very direct: using the atomistic trajectory, for each particle 𝐼 we count the neighboring particles that fall between the distances 𝑅 − Δ/2 and 𝑅 + Δ/2. The discrete 𝑔(𝑅) is then computed as 𝑔(𝑅) = (1/(𝑁(𝑁 − 1))) Σ_(𝐼,𝐽) ⟨𝐻(Δ/2 − |𝑅𝐼𝐽 − 𝑅|)⟩ / (4𝜋𝑅²), (68) with 𝐻(𝑥) the Heaviside step function and the average taken over the ensemble, so that 𝑈(𝑅) = −𝑘𝐵𝑇 ln 𝑔(𝑅). (69) The result can be further refined by running a CG simulation with the newly found 𝑈(𝑅), computing the CG radial distribution 𝐺(𝑅), and updating the potential with 𝑈(𝑅) ← 𝑈(𝑅) − 𝛼(𝑅)𝑘𝐵𝑇 ln[𝐺(𝑅)/𝑔(𝑅)], (70) where 𝛼(𝑅) is a scaling factor to prevent large fluctuations. This update can be applied iteratively until 𝐺(𝑅) matches the distribution from the atomistic simulation [36]. In our preliminary experiment, however, we employed only a single Boltzmann inversion, since we deemed the result already acceptable. 3.3.2. Test result 3.3.2.1. Model system For the test of the inverse Boltzmann method, we wanted to compare with the results of Padding and Briels [25]. This reference was chosen because our final objective was to apply the MZ formalism to a system of polyethylene chains and find the functional forms for the CG EOM, whose conservative part had already been determined in that paper. We remark that we could, of course, simply have borrowed the force field from that work; however, since we had to run the atomistic simulation to calculate the memory kernel anyway, we preferred to have our own complete CG program. Thus, our initial conditions were similar to those used by Padding and Briels, and we used the GROMACS MD units, namely nanometer, picosecond, Kelvin, and atomic mass unit.
To obtain the atomistic data, we simulated polyethylene chains of 40 monomers each at 𝑇 = 450.0 K using the GROMACS united atom force field. As mentioned in Section 1.3, we simply wish to obtain some reference results, not perfectly realistic data, so we employed the united atom model, where each monomer is one indivisible pseudo-atom [23]. However, the chains are long and entangle easily, so the system can become stuck in a particular configuration. To avoid this problem, we prepared ten identical boxes of size much larger than the targeted value of 𝐿 = 5.383 nm and slowly squeezed them in NPT simulations until they reached the expected size and pressure. Because of the randomness in the heat bath, the boxes arrive at very different final configurations. We then ran NVT simulations on them and averaged the results. For the CG process, we combined the monomers at a ratio of 20 to 1 (see Fig. 3-18, where the monomers are displayed as gray spheres and the red circles represent the groupings) and computed the various 𝑔(𝑅) corresponding to the different components of the conservative force. Here, we must distinguish between bonded and nonbonded pairs. Two particles are considered nonbonded if they do not share a bonded structure, even if they are on the same molecule. For example, in standard MD [23], the first and fifth particles in a chain are nonbonded because they share no bond or angle. In this pilot test, we shall consider the nonbonded interaction only. Fig. 3-18: CG of a polyethylene chain of 40 united atom monomers at a ratio of 𝟐𝟎 to 𝟏. 3.3.2.2. Result Fig. 3-19: Inverted nonbonded potential and fitting function. Once the RDF had been obtained, we calculated the potential using Eq. (69) and fitted the discrete data with an analytical function. Fig. 3-19 displays both the potential inverted from the atomistic RDF and the fitting function, which we chose to be a Gaussian (Eq. (71)), with 𝑅 measured in nm.
The fitting parameters were found to be 𝛼 = 7.062 kJ mol⁻¹, 𝛽 = 4.015 nm⁻² and 𝛾 = −0.0926. Finally, we ran an NVE, i.e. microcanonical ensemble, simulation of the CG system, using fixed bond lengths and initial conditions identical to those of the reference simulation; the nonbonded CG distribution 𝐺𝑛𝑏(𝑅) was then extracted and compared with the 𝑔𝑛𝑏(𝑅) of Eq. (68). As we can see, the two curves agree reasonably well. The remaining discrepancy can be attributed to the failure of our analytical fit to capture the finer shape of 𝑈𝑛𝑏(𝑅) in the range 0.4 < 𝑅 < 0.7. In fact, 𝐺𝑛𝑏(𝑅) closely follows the approximated function, constructed by applying the definition of the RDF to Eq. (71) and drawn as a solid cyan line in the figure. It should be emphasized that our test employed only a single Boltzmann inversion, so the achieved agreement is reasonable. Lastly, we note that the atomistic distribution is very similar to that of Padding and Briels [25], so our machinery for the Boltzmann inversion appears to be working as intended. 𝑈𝑛𝑏(𝑅) = 𝛼𝑒^(−𝛽𝑅²) + 𝛾 kJ mol⁻¹, (71) Fig. 3-20: Comparison between the targeted distribution 𝒈𝒏𝒃(𝑹) and the CG distribution 𝑮𝒏𝒃(𝑹) obtained from a simulation using the nonbonded Boltzmann-inverted potential, with a CG ratio of 𝟐𝟎 to 𝟏 and fixed bond length. The solid line is the approximated function of the atomistic distribution, constructed from the fitting potential in Eq. (71). Chapter 4 Application 4.1. Coarse-graining of bonded interactions To stay consistent with the terminology of the previous section, we shall use the adjective "bonded" to describe anything related to the system's topological structures, which can be bonds, valence angles or dihedral angles in standard MD [23]. For example, the bonded interactions include those between two particles sharing a bond, three particles sharing a bond angle, and four particles sharing a torsion angle. 4.1.1. Problem with coarse-graining of bonded interactions Fig.
4-1: Inverted bond potential and fitting function. In the previous chapter, we ran a trial NVE simulation on the polyethylene system, in which the dimers had fixed bond length, verified the consistency of the nonbonded distributions, and claimed that our implementation of Boltzmann inversion was working properly. However, the test failed spectacularly with regard to the bonded interactions, which we had intentionally avoided earlier. Following the same procedure described in Section 3.3.2, we inverted the mean force potential for the bond between two immediate particles and found the fitting polynomial 𝑈𝑏(𝑅) = 11.87𝑅6 − 88.04𝑅5 + 270.8𝑅4 − 438.1𝑅3 + 391.2𝑅2 − 184.0𝑅 + 36.87 𝑘𝐽𝑚𝑜𝑙−1 . (72) 40 Again, the distance 𝑅 is measured in 𝑛𝑚. Both the discrete potential of mean force and its analytical fit are displayed in Fig. 4-1. We then performed an NVE simulation of the CG system, where the particles could interact through either 𝑈𝑛𝑏(𝑅) (Eq. (71)) or 𝑈𝑏(𝑅), and their initial conditions were found by applying Eq. (8). The bond RDF 𝐺𝑏(𝑅) was then extracted and compared with the targeted one 𝑔𝑏(𝑅) in Fig. 4-2. We see that the result is simply incorrect: the CG distribution has a very different shape, extending very far into forbidden regions, where the potential is large. This is clearly disconcerting, as the same figure shows the nonbonded distribution 𝐺𝑛𝑏(𝑅) still exhibits the same behavior that we observed in Section 3.3.2.2, namely the small deviation from the reference 𝑔𝑛𝑏(𝑅) and the high similarity to the distribution resulting from the fitting nonbonded potential 𝑈𝑛𝑏(𝑅). Additionally, the corresponding approximated distribution for the CG bond correlates strongly with 𝑔𝑏(𝑅), but not at all with 𝐺𝑏(𝑅). The two contrasting observations hint that this complication does not arise from our Boltzmann inversion process, but from something else more fundamental about the bonded interactions. Fig. 
4-2: Comparison between the targeted radial distribution 𝒈(𝑹) and the CG distribution 𝑮(𝑹) for interactions of nonbonded pairs (left) and of pairs sharing a CG bond (right), with a CG ratio of 𝟐𝟎 to 𝟏. The solid lines are the approximated functions of the atomistic distributions. 4.1.2. Solution After many more failed tests, including ones with all possible interactions in standard MD (nonbonded, bond, angle, torsion), we finally realized that perhaps something other than our implementation of the bonded interactions had gone wrong. First, we considered a system consisting of 500 dimers, which interact through the CG potentials of Eq. (71) and (72), in a simulation box of size 10 × 10 × 10 nm. This gives an average spacing between the monomers of (𝑉/𝑁)^(1/3) = 1.5 nm, about the location of the minimum of 𝑈𝑏(𝑅). We then ran a simulation using the Andersen thermostat with 𝑇 = 5.0 K and examined the bond radial distribution in Fig. 4-3. Obviously, at this low density, the nonbonded distribution is not correct, so we do not analyze it. The resulting data for 𝐺𝑏(𝑅) agree quite well with the targeted values, except for the region 𝑅 < 0.2 nm, where our sixth-order polynomial deviates significantly from the inverted potential. This shows that, at least in isolation, our code for the bond force works as intended. Fig. 4-3: Bond radial distribution from the thermalized simulation using the Andersen thermostat with 𝑻 = 𝟓.𝟎 K. The data for the targeted distribution 𝑔𝑏(𝑅) are represented by the dashed blue line, while the solid line is the approximated function of the atomistic distribution. However, when we ran similar simulations at higher temperatures, i.e. 𝑇 = 50.0 K and 𝑇 = 200.0 K, the generated data are incorrect: Fig. 4-4 shows that they still differ from those of the simulation in Section 4.1.1, but are clearly wrong nonetheless. Moreover, the deviations become progressively more apparent as the temperature increases. This explains why 𝐺𝑏(𝑅) diverges so remarkably in Fig.
4-2: at 𝑇 = 450.0𝐾, the bond structure is completely obscured by the thermal noises. In other words, the bond potential of Eq. (72) is too soft, so we can only obtain the correct pair correlation by exploring the phase-space at low 𝑇. In fact, the paper by Padding and Briels [25] specifically addressed this problem and proposed a solution in terms of an extra potential preventing the crossing of the bonds, essentially creating a multi-body potential. In short, with a CG rate of 20 to 1, it is not possible to completely capture the static structure of the polyethylene system by using the standard MD force fields. 42 Fig. 4-4: Bond radial distribution from the thermalized simulation using Andersen thermostat with 𝑻 = 𝟓𝟎. 𝟎𝑲 (yellow triangle) and 𝑻 = 𝟐𝟎𝟎. 𝟎𝑲 (green circle). The targeted distribution 𝑔𝑏(𝑅) is represented in dashed blue line. Subsequently, we reduced the CG ratio to 8 to 1, meaning the CG chains now have 5 monomers. So far, as a dimer only contains two particles, we had only considered the bond force, but the new chain is too long to ignore the angle force. The formalism to evaluate the functional form of the new interaction is similar to the one discussed in Section 3.3.1, except the angular distribution must be normalized with a factor of 1/ sin 𝜃, i.e. with 𝑁𝑎 being the total number of angles in the system. Fig. 4-5 shows the inverted data for both bonded potentials and their fitting function, which we chose to be polynomial: where 𝑈𝑎(𝜃) is the angle potential, with 𝜃 in radian. 𝑔𝑎(𝜃) =1𝑁𝑎∑〈𝐻(Δ/2 − |𝜃𝑖 − 𝜃|)〉sin 𝜃𝑖=1 , (73) 𝑈𝑏(𝑅) = 1918𝑅4 − 4971𝑅3 + 4763𝑅2 − 2022𝑅 + 328.6 𝑘𝐽𝑚𝑜𝑙−1 , 𝑈𝑎(𝜃) = 8.835𝜃6 − 103.0𝜃5 + 482.8𝜃4 − 1158𝜃3 + 1489𝜃2 − 974.5𝜃 + 265.4 𝑘𝐽𝑚𝑜𝑙−1, (74) 43 Fig. 4-5: Inverted potential and fitting function for the bonded interactions, with the CG ratio of 𝟖 to 𝟏. On the other hand, the nonbonded potential, shown in Fig. 
4-6, possesses a behavior characteristic of the Lennard-Jones potential [37], which is where 𝜖 and 𝑅𝑚 are the depth of the potential well and the distance where 𝑈𝑛𝑏(𝑅) reaches the minimum. In our case, they have the value of 𝜖 = 0.690𝑘𝐽𝑚𝑜𝑙−1 and 𝑅𝑚 = 0.570𝑛𝑚. In fact, we also expect to find the wavelike structure at the tail of the inverted function: as our setup is far from the dilute limit, the pair correlation function for the nonbonded interaction is influenced by the indirect effects from the neighboring particles [1]. Fig. 4-6: Inverted nonbonded potential and fitting function for the CG ratio of 𝟖 to 𝟏. 𝑈𝑛𝑏(𝑅) = 𝜖 [(𝑅𝑚𝑅)12− 2(𝑅𝑚𝑅)6] , (75) 44 Next, we prepared a CG system following the same procedure described in Section 3.3.2, except each particle is now mapped from eight atoms and can act on the others through three different potentials, then performed an NVE simulation. The reader can find more detailed explanation on the implementation of the new interactions in Section 0. For now, we want to discuss the resulting distribution for the bonded (Fig. 4-7) and nonbonded (Fig. 4-8) interactions. We find the bonded distributions match quite well to the targeted values, save for the regions where the fitting polynomials fail to capture the inverted potentials’ subtler features, such as the curve around 0.5 < 𝑅 < 0.75 for 𝑈𝑏(𝑅) or the potential well near 𝜃 = 0.7 for 𝑈𝑎(𝜃). On the other hand, 𝐺𝑛𝑏(𝑅) only agrees qualitatively, but not quantitatively with 𝑔𝑛𝑏(𝑅), since the repulsive portion of the Lennard-Jones fit is much harder than the raw 𝑈𝑛𝑏(𝑅). Moreover, even if the wanted potential is Lennard-Jones, the minimum is not really at 𝑅 = 0.57𝑛𝑚: it has been shifted from its true value due to, again, the influence from the system’s density. However, for the purpose of this chapter, these finer details can be disregarded, so we conclude that the ratio of 8 to 1 provides an acceptable CG model for the polyethylene system. Fig. 
4-7: Atomistic and CG distributions for the bonded interactions, with the CG ratio of 𝟖 to 𝟏. 45 Fig. 4-8: Atomistic and CG distributions for the CG ratio of 𝟖 to 𝟏. We do not provide the approximated function here since the correct form must be modified using the density, in an expansion akin to the virial equation. 46 4.2. Memory extraction This section describes the memory extraction process, mostly following the work done by Li’s group [13, 21]. At a glance, the definition of the memory kernel 𝑲𝐼𝐽(𝑡) = 𝛽 〈𝛿𝐹𝐼𝐽𝓠⃗⃗⃗⃗ ⃗(𝑡) ⊗ 𝛿𝐹𝐼𝐽𝓠⃗⃗⃗⃗ ⃗(0)〉 seems to offer a straightforward method to compute the kernel: once the mean force field has been determined, one only has to calculate, directly from the atomistic trajectory, 𝛿𝐹𝐼𝐽⃗⃗⃗⃗ ⃗(𝑡) = 𝐹𝐼𝐽𝑃⃗⃗⃗⃗ ⃗(𝑡) − 𝐹𝐼𝐽⃗⃗⃗⃗ ⃗(𝑡) and find its temporal autocorrelation function. Regrettably, the true fluctuating microscopic force 𝛿𝐹𝐼𝐽𝑄⃗⃗⃗⃗ ⃗(𝑡) is not the same as 𝛿𝐹𝐼𝐽⃗⃗⃗⃗ ⃗(𝑡) for long memory. The explanation for the difference is quite nuanced, so we shall elaborate it at the end of Appendix A, but here we need to know that the correlation 𝑪𝐼𝐽(𝑡) = 𝛽〈𝛿𝐹𝐼𝐽⃗⃗⃗⃗ ⃗(𝑡) ⊗ 𝛿𝐹𝐼𝐽⃗⃗⃗⃗ ⃗(0)〉 is just the zeroth-order approximation of the memory kernel. Instead, 𝑲𝐼𝐽 can be calculated exactly through This equation is found by multiplying 𝑉𝐼𝐽⃗⃗⃗⃗ ⃗(0) on both sides of Eq. (25) and taking the ensemble average. Either way, we must assume 𝑲𝐼𝐽 is a function of 𝑅𝐼𝐽⃗⃗ ⃗⃗ ⃗ only, as we cannot know the atomistic configuration of 𝐼 and 𝐽 from the CG simulation. Unfortunately, this means that not all autocorrelated values can be used: the calculation can only be done with the ones where the two particles have the same relative position. Thus, in practice, we divide the intermolecular distance 𝑅𝐼𝐽 into many bins of size Δ, preferably with the same value used previously for the Boltzmann inversion, then compute 𝑲𝐼𝐽(𝑅, 𝑡) by averaging over all 𝐼𝐽 pairs staying within 𝑅 − Δ/2 < 𝑅𝐼𝐽 < 𝑅 + Δ/2 during the period 𝑡, i.e. where we have discretized Eq. 
(76) and applied the decomposition similar to Eq. (55). The parallel and orthogonal factors of the velocity are 𝑉𝐼𝐽∥ (𝑠) = 𝑉𝐼𝐽⃗⃗⃗⃗ ⃗(𝑠) ⋅ 𝑒𝐼𝐽⃗⃗ ⃗⃗ (𝑠) and 𝑉𝐼𝐽⊥(𝑠) = 𝑉𝐼𝐽⃗⃗⃗⃗ ⃗(𝑠) − 𝑉𝐼𝐽∥ (𝑠)𝑒𝐼𝐽⃗⃗ ⃗⃗ (𝑠), for ∀𝑠 < 𝑡, and the components of 𝛿𝐹𝐼𝐽⃗⃗⃗⃗ ⃗ are defined identically. The two formulae in Eq. (77) allow the memory kernel to be computed recursively, with the base cases being 𝐾𝐼𝐽𝑥(𝑅, 0): for 𝑥 ∈ {∥, ⊥} and the average taken over all pairs 𝐼𝐽 with |𝑅𝐼𝐽(0) − 𝑅| < Δ/2. Next, we must make the distinction between different interaction types: it is reasonable to expect that the memory kernel should behave differently for a 〈𝛿𝐹𝐼𝐽⃗⃗⃗⃗ ⃗(𝑡) ⊗ 𝑉𝐼𝐽⃗⃗⃗⃗ ⃗(0)〉 = ∫ 𝑲𝐼𝐽(𝑡 − 𝑠)〈𝑉𝐼𝐽⃗⃗⃗⃗ ⃗(𝑠) ⊗ 𝑉𝐼𝐽⃗⃗⃗⃗ ⃗(0)〉𝑑𝑠𝑡0 . (76) [〈𝛿𝐹𝐼𝐽∥ (𝑡)𝑉𝐼𝐽∥ (0)〉 = ∑ 𝐾𝐼𝐽∥ (𝑅, 𝑡 − 𝑚Δ𝑡)〈𝑉𝐼𝐽∥ (𝑚Δ𝑡)𝑉𝐼𝐽∥ (0)〉Δ𝑡𝑡/Δ𝑡𝑚]|𝑅𝐼𝐽(𝑚Δ𝑡<𝑡)−𝑅|<Δ/2 , [〈𝛿𝐹𝐼𝐽⊥(𝑡)𝑉𝐼𝐽⊥(0)〉 = ∑ 𝐾𝐼𝐽⊥(𝑅, 𝑡 − 𝑚Δ𝑡)〈𝑉𝐼𝐽⊥(𝑚Δ𝑡)𝑉𝐼𝐽⊥(0)〉Δ𝑡𝑡/Δ𝑡𝑚]|𝑅𝐼𝐽(𝑚Δ𝑡<𝑡)−𝑅|<Δ/2 , (77) 𝐾𝐼𝐽𝑥(𝑅, 0) =1Δ𝑡〈𝛿𝐹𝐼𝐽𝑥(0)𝑉𝐼𝐽𝑥(0)〉〈𝑉𝐼𝐽𝑥(0)𝑉𝐼𝐽𝑥(0)〉 , (78) 47 different potential. However, at the point of writing of this work, we have not been able to clarify a meaningful method to determine the force between bonded particles, even when they are only connected through some CG bonds. As an example, let us examine the atomistic angle 𝜃 lying in between two particles 𝐼 and 𝐽 in Fig. 4-9: it is unclear whether 𝐹𝑟𝑎⃗⃗ ⃗⃗ and 𝐹𝑠𝑎⃗⃗ ⃗⃗ should be included in the internal force of 𝐼 or considered to be the external forces acting on 𝐽 by 𝐼. Therefore, in this chapter, we shall only evaluate the nonbonded memory kernel. Fig. 4-9: Two bonded CG particles. To evaluate the functional form of the memory kernel, we made one last approximation that the temporal and spatial dependence of the kernel are separable, i.e. Indeed, we found that 𝐾𝐼𝐽𝑥(𝑅, 𝑡) at different times all decay exponentially. Moreover, from Fig. 4-10, which plots the logarithm of 𝐾𝐼𝐽𝑥(𝑅, 𝑡) for various 𝑡, we note that the decay rate does not change much over time, supporting our stated assumption. 
Disregarding the data for $R < 0.4\,\mathrm{nm}$, where the nonbonded force is large and the sampling correspondingly poor, and averaging over $0 < t < 0.3$, we obtained a slope of $7.38\,\mathrm{nm}^{-1}$ for the radial and $7.36\,\mathrm{nm}^{-1}$ for the perpendicular component of $K_{IJ}$.

Fig. 4-10: Decay rate of the radial dependence in the parallel (left) and orthogonal (right) memory kernel.

Fig. 4-11: Temporal dependence at $R > 0.8\,\mathrm{nm}$ of the parallel (left) and orthogonal (right) memory kernel.

On the other hand, the temporal function $\mathrm{T}(t)$ has more fluctuations, and its form depends highly on the value of $R$. In particular, Fig. 4-11 shows that, for $R > 0.8\,\mathrm{nm}$, the ratio $K^{x}_{IJ}(R,t)/K^{x}_{IJ}(R,0)$ may not only stop decaying but can also increase again. However, we must recognize that, at this large intermolecular distance, the absolute value of the memory kernel is relatively small, meaning the fluctuating force $\delta\vec{F}^{\mathcal{Q}}_{IJ}$ is minuscule and its effect on the particles' motion negligible. Consequently, we decided to average over $0.4 < R < 0.7$, where $K^{x}_{IJ}(R,t)/K^{x}_{IJ}(R,0)$ behaves quite homogeneously, and approximated the ratio as a series of damped harmonic oscillators. Using three oscillators for the fitting process, we found the parallel component to be

$$\frac{K^{\parallel}_{IJ}(R,t)}{K^{\parallel}_{IJ}(R,0)} = 14.4\, e^{-6.74 t} \cos(0.0866 t - 1.56) + 0.919\, e^{-42.6 t} \cos(21.2 t - 1.11) + 0.474\, e^{-11.2 t} \cos(18.4 t - 0.547) \,, \qquad (80)$$

while the orthogonal part is

$$\frac{K^{\perp}_{IJ}(R,t)}{K^{\perp}_{IJ}(R,0)} = 5.94\, e^{-4.00 t} \cos(0.0791 t - 1.55) + 1.30\, e^{-33.0 t} \cos(21.8 t - 0.195) + 0.182\, e^{-7.73 t} \cos(17.4 t - 0.418) \,. \qquad (81)$$

The entries for the auxiliary matrices corresponding to each oscillator $A(t) = p \exp(-qt) \cos(rt - s)$ could then be computed with

$$(\boldsymbol{A}^{x}_{IJ})_{ps} = \left[ \sqrt{\tfrac{p}{2}\left(\cos s - \tfrac{r}{q}\sin s\right)} \;\; \sqrt{\tfrac{p}{2}\left(\cos s + \tfrac{r}{q}\sin s\right)} \right] , \qquad (\boldsymbol{A}^{x}_{IJ})_{ss} = \begin{bmatrix} 2q & \sqrt{r^{2} + q^{2}} \\ -\sqrt{r^{2} + q^{2}} & 0 \end{bmatrix} , \qquad (\boldsymbol{A}^{x}_{IJ})_{sp} = (\boldsymbol{A}^{x}_{IJ})_{ps}^{T} \,. \qquad (82)$$

Fig. 4-12: Temporal correlation $K^{x}_{IJ}(R,t)/K^{x}_{IJ}(R,0)$ and its fitted function for the parallel (left) and orthogonal (right) memory kernel.

Fig. 4-12 illustrates the radial and perpendicular temporal correlations and their fitting functions. We note that the raw data do not extend very far in $t$, an unfortunate consequence of the compactness and the high diffusion rate of our setup: from Fig. 4-6, we concluded that the spatial resolution could be at most $0.02\,\mathrm{nm}$, giving reliable sampling only until around $t = 0.4\,\mathrm{ps}$. Thus, we demanded that the oscillators decay quickly to obtain a reasonable kernel. Also, we observe that the derivative of the memory kernel in both directions at $t = 0$ is zero, which is expected according to Li et al. [21]. To force the same condition on the damped oscillators, we employed the constraint $\tan s = q/r$. In our test, the coupling matrices were found to be

$$(\boldsymbol{A}^{\parallel}_{IJ})_{ps} = \begin{bmatrix} 0 & 1.838\lambda & 0 & 4.616\lambda & 0 & 2.368\lambda \end{bmatrix} , \qquad (\boldsymbol{A}^{\parallel}_{IJ})_{ss} = \begin{bmatrix} 13.48 & 6.745 & 0 & 0 & 0 & 0 \\ -6.745 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 85.24 & 47.59 & 0 & 0 \\ 0 & 0 & -47.59 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 22.43 & 21.59 \\ 0 & 0 & 0 & 0 & -21.59 & 0 \end{bmatrix} ,$$

$$(\boldsymbol{A}^{\perp}_{IJ})_{ps} = \begin{bmatrix} 0 & 1.413\mu & 0 & 4.060\mu & 0 & 1.966\mu \end{bmatrix} , \qquad (\boldsymbol{A}^{\perp}_{IJ})_{ss} = \begin{bmatrix} 7.998 & 4.000 & 0 & 0 & 0 & 0 \\ -4.000 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 65.93 & 39.52 & 0 & 0 \\ 0 & 0 & -39.52 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 15.46 & 19.05 \\ 0 & 0 & 0 & 0 & -19.05 & 0 \end{bmatrix} , \qquad (83)$$

with

$$\lambda^{2} = 8.224 \times 10^{5}\, e^{-7.38 R} \,, \qquad \mu^{2} = 2.591 \times 10^{5}\, e^{-7.36 R} \,. \qquad (84)$$

4.3. Simulation model

For the simulation of the CG polyethylene system, we shall ignore the effect of the torsion angle; thus, particles that have three neighbors in between on a chain are considered nonbonded. Consequently, our conservative force field possesses three terms [25]:

$$\vec{F}^{C}_{I} = \sum_{J \neq I} F^{nb}_{IJ}(R_{IJ})\, \vec{e}_{IJ} + \sum_{J} F^{b}_{IJ}(R_{IJ})\, \vec{e}_{IJ} + \sum_{\theta \ni I} \vec{F}^{a}_{I}(\theta) \,, \qquad (85)$$

in which $F^{nb}_{IJ}$ is the nonbonded force, $F^{b}_{IJ}$ the bond force, and $\vec{F}^{a}_{I}$ the angle force acting on the particle $I$. The first sum runs over all particles not bonded to $I$, the second sum goes over all immediate neighbors of $I$, while the third over angles having $I$ as one of their vertices. Assuming the form of the potentials is known, the nonbonded and bond forces are

$$\vec{F}_{IJ}(R_{IJ}) = -\frac{\partial U(R_{IJ})}{\partial R_{IJ}}\, \vec{e}_{IJ} \,, \qquad (86)$$

where $U(R_{IJ})$ is the corresponding potential. On the other hand, when $I$ is not the central particle of the angle (see Fig. 4-13), the angle force is calculated by [27]

$$\vec{F}^{a}_{I}(\theta) = -\frac{\partial U^{a}(\theta)}{\partial \theta}\, \frac{\vec{e}_{I}}{R_{JI}} \,, \qquad (87)$$

with $U^{a}(\theta)$ being the angle potential, $\theta = \widehat{IJK} \equiv \arccos(\vec{e}_{JI} \cdot \vec{e}_{JK})$ the angle, and $\vec{e}_{I} = \mathrm{norm}\!\left(\vec{e}_{JI} \times (\vec{e}_{JI} \times \vec{e}_{JK})\right)$ the normalized vector in the plane of $\widehat{IJK}$, orthogonal to $\vec{R}_{JI}$. The angle force acting on the center can be found by applying Newton's third law. The form of all three potentials can be found in Section 4.1.2.

Fig. 4-13: Valence angle. $\vec{e}_{JI}$ and $\vec{e}_{JK}$ are normalized vectors pointing from the center particle $J$.

Combining with the memory kernel $K^{x}_{IJ}(R, t)$ in Section 4.2, we obtain the EOM for the CG simulation:

$$\frac{d}{dt}\vec{P}_{I} = \sum_{IJ\ \mathrm{nonbonded}} \left( F^{nb}_{IJ}(R_{IJ})\, \vec{e}_{IJ} - \left[(\boldsymbol{A}^{\parallel}_{IJ})_{ps}\, \vec{s}^{\,\parallel}_{IJ}\right] \vec{e}^{\,\parallel}_{IJ} - \left[(\boldsymbol{A}^{\perp}_{IJ})_{ps}\, \vec{s}^{\,\perp}_{IJ}\right] \vec{e}^{\,\perp}_{IJ} \right) + \sum_{J} F^{b}_{IJ}(R_{IJ})\, \vec{e}_{IJ} + \sum_{\theta \ni I} \vec{F}^{a}_{I}(\theta) \,,$$

$$\frac{d}{dt}\vec{s}^{\,x}_{IJ} = -(\boldsymbol{A}^{x}_{IJ})_{sp}\, P^{x}_{IJ} - (\boldsymbol{A}^{x}_{IJ})_{ss}\, \vec{s}^{\,x}_{IJ} + (\boldsymbol{B}^{x}_{IJ})_{ss}\, \vec{\xi}^{\,x}_{IJ} \,, \qquad (88)$$

with $x \in \{\parallel, \perp\}$ and $\vec{s}^{\,x}_{IJ}$ being the corresponding component of the auxiliary variables for the $IJ$ pair. The integration scheme for the auxiliary memory is presented in Section 3.2.3. For the simulation conditions, we prepared an atomistic system of 125 polyethylene chains with a length of 40 monomers, followed the thermalization procedure in Section 3.3.2.1, then applied a CG ratio of 8 to 1 (see Fig. 4-14, with the empty circles being the CG particles). While the two end particles of the resulting molecule should behave somewhat differently from the rest of the body, we considered all particles to be of the same type, with a mass of $120.28\,u$. The simulation was performed in a periodic box of size $5.38 \times 5.38 \times 5.38\,\mathrm{nm}$, at $T = 450.0\,\mathrm{K}$ and $\Delta t = 0.001\,\mathrm{ps}$. The cutoff distance for both the nonbonded interaction and the memory kernel was chosen at $1.0\,\mathrm{nm}$.

Fig. 4-14: CG of a polyethylene chain of 40 united atom monomers with the ratio of 8 to 1.

4.4. Result

We obtained a very reasonable result for the MZ simulation.
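As an aside, the mapping of Eq. (82) from a fitted oscillator $(p, q, r, s)$ to its coupling blocks can be sketched in a few lines. The snippet below uses made-up parameter values rather than the fitted ones of Eq. (83), and checks that the block form reproduces $p\,e^{-qt}\cos(rt - s)$ through $\boldsymbol{A}_{ps}\, e^{-t\boldsymbol{A}_{ss}}\, \boldsymbol{A}_{ps}^{T}$:

```python
import numpy as np

def oscillator_blocks(p, q, r, s):
    """Eq. (82): auxiliary-matrix entries for one damped oscillator
    A(t) = p * exp(-q t) * cos(r t - s)."""
    a = np.sqrt(0.5 * p * (np.cos(s) - (r / q) * np.sin(s)))
    b = np.sqrt(0.5 * p * (np.cos(s) + (r / q) * np.sin(s)))
    w = np.sqrt(r**2 + q**2)
    A_ps = np.array([a, b])
    A_ss = np.array([[2 * q, w], [-w, 0.0]])
    return A_ps, A_ss

def expm(M, terms=30):
    """Small-matrix exponential (Taylor series with scaling and squaring)."""
    sc = max(0, int(np.ceil(np.log2(max(np.abs(M).sum(), 1.0)))))
    A = M / 2**sc
    E, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ A / k
        E = E + term
    for _ in range(sc):
        E = E @ E
    return E

# hypothetical oscillator parameters (s chosen so both ps-entries are real)
p, q, r, s = 1.0, 2.0, 3.0, 0.1
A_ps, A_ss = oscillator_blocks(p, q, r, s)
t = 0.7
kernel_from_blocks = A_ps @ expm(-t * A_ss) @ A_ps
kernel_direct = p * np.exp(-q * t) * np.cos(r * t - s)
```

Note that under the constraint $\tan s = q/r$, the first entry of $(\boldsymbol{A}_{ps})$ vanishes identically, which is why the fitted vectors in Eq. (83) have zeros in the odd positions.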
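A single time step of the auxiliary part of Eq. (88) can be written as an Euler–Maruyama update. This is only an illustration under assumed matrices (it is not the integrator of Section 3.2.3), with the pair momentum component treated as a scalar:

```python
import numpy as np

def auxiliary_step(s_aux, p_x, A_sp, A_ss, B_ss, dt, rng):
    """One Euler-Maruyama update of the auxiliary variables in Eq. (88):
    ds/dt = -A_sp * P - A_ss s + B_ss xi, with xi a white-noise vector."""
    xi = rng.standard_normal(len(s_aux)) / np.sqrt(dt)  # discretized white noise
    return s_aux + dt * (-A_sp * p_x - A_ss @ s_aux + B_ss @ xi)

# toy check: with the noise and the momentum drive switched off, |s| must
# decay, since the eigenvalues of A_ss have positive real parts
q, w = 1.0, 2.0
A_ss = np.array([[2 * q, w], [-w, 0.0]])
B_off = np.zeros((2, 2))
rng = np.random.default_rng(0)
s_aux = np.array([1.0, 1.0])
for _ in range(2000):
    s_aux = auxiliary_step(s_aux, 0.0, np.zeros(2), A_ss, B_off, 1e-2, rng)
```

The $1/\sqrt{dt}$ scaling of the noise is the standard discretization of a delta-correlated random force; with $\boldsymbol{B}_{ss}$ restored, the stationary statistics of $\vec{s}$ reproduce the FDT relation proven in Appendix B.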
First, let us inspect the distributions for the nonbonded (Fig. 4-15) and bonded (Fig. 4-16) interactions. In the same figures, we also show the corresponding data extracted from the atomistic (blue dot) and from the NVE (red dash) simulations in Section 4.1.2. Again, we observe that the MZ distributions exhibit the rough features of the atomistic distributions, except in the regions where the fitting potentials do not match the inverted ones. Moreover, the resulting $G^{nb}(R)$, $G^{b}(R)$, $G^{a}(\theta)$ closely follow those of the NVE simulation and, by proxy, the approximated functions constructed from the fitting potentials. This means that, as far as we are concerned, the MZ simulation replicated the correct static structure of the CG polyethylene system.

Fig. 4-15: Nonbonded distribution for the CG simulation using the auxiliary memory method. The blue dotted line is the atomistic distribution, while the red dashed one represents the NVE data.

Fig. 4-16: Bonded distributions for the CG simulation using the auxiliary memory method.

Next, we want to examine the dynamical properties. Fig. 4-17 displays $C(t)/C(0)$ from all three simulations: atomistic, NVE and MZ. The VACFs had to be scaled since, under the mapping in Eq. (8), many degrees of freedom were eliminated, reducing the effective temperature in the mapped system. Looking at the figure, we find that the MZ and atomistic normalized VACFs match each other quite well, including an inflection point near $t = 0.3\,\mathrm{ps}$. However, the former has a zero derivative at $t = 0$, a feature not shared with the latter. Apparently, this is a side-effect of the memory method: it persisted over various trial memory kernels $K^{x}_{IJ}(t)$, even when $dK^{x}_{IJ}(t)/dt|_{t=0} \neq 0$. Nevertheless, this is a successful implementation of the memory method, as we can see that the NVE VACF behaves very differently from the rest, decaying at a much slower rate.
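For reference, the normalized VACF compared in Fig. 4-17 can be computed from stored velocities as follows — a minimal sketch with hypothetical array shapes, not the analysis code actually used for the figure:

```python
import numpy as np

def normalized_vacf(v, max_lag):
    """C(t)/C(0) with C(t) = <v(t0 + t) . v(t0)>, averaged over particles
    and time origins. v has shape (n_steps, n_particles, 3)."""
    n_steps = v.shape[0]
    c = np.zeros(max_lag)
    for lag in range(max_lag):
        # dot products between time-shifted snapshots, averaged over everything
        c[lag] = np.mean(np.sum(v[lag:n_steps] * v[:n_steps - lag], axis=-1))
    return c / c[0]

# sanity check on a deterministic signal: v(t) = cos(w t) * constant vector,
# whose normalized VACF should follow cos(w t) up to finite-window error
w, dt = 2.0, 0.05
tgrid = np.arange(400) * dt
v = np.cos(w * tgrid)[:, None, None] * np.ones((1, 5, 3))
c = normalized_vacf(v, max_lag=100)
```

Averaging over every available time origin, as done here, is what keeps the statistical noise manageable at short lags even for a modest trajectory length.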
This is another expected consequence of the CG projection, which decreased the system's thermal fluctuations and the effective friction rate. Moreover, we had not included the kernel for the bonded interactions, so this level of agreement is very reasonable.

Fig. 4-17: Normalized VACF for the CG simulation using the auxiliary memory method. Similar to the graphs above, this also displays the data from the atomistic and the NVE simulations.

Chapter 5

Conclusion

Previously, Li et al. had successfully employed the MZ projection to recover the correct dynamics for a system of CG star-polymer melts, where each CG particle represented one polymer [13]. In this work, we investigated the applicability of the MZ formalism to the more complex polyethylene system, which includes bonded structures between the particles. While the formalism itself is quite straightforward, there were many subtle problems that needed to be resolved before a practical application could be realized. First, the MZ EOM is formally very similar to the DPD EOM. Thus, we wanted to confirm whether a correct DPD integration scheme was necessary. However, at the point of writing, many popular simulation packages did not contain a working DPD integrator, so we decided to build our own version. While the implementation of the integrator was successful, we were not so lucky with our MZ endeavor: our simulations using the standard memory could not generate the correct temperature, no matter what we tried. Nevertheless, our system's temperature still converged, albeit to a wrong value, and it clearly depended on the trial memory kernel we used. We believed this violation of the FDT was a consequence of our erroneous interpretation of the MZ EOM, but we were not able to resolve the issue. Fortunately, there is an alternative approach, which substitutes the direct memory with a series of damped oscillators.
The behavior of these extra variables imitates the effect of the memory, in a fashion akin to the Nosé-Hoover heat bath [38]. We managed to implement the auxiliary memory correctly. Finally, we discovered that the CG process for the polyethylene system was not as simple as it seemed: it turned out that a naïve decomposition of the multi-body CG potential into standard MD components, namely nonbonded, bond and angle potentials, could not work with an aggressive CG strategy, such as a ratio of 20 monomers to 1. Thus, in the end, we decided to use a ratio of 8 to 1, whose potential replicated the system's distributions reasonably well. Next, we constructed a CG model for the polyethylene system by applying the MZ formalism to the MD trajectory. The atomistic data were obtained by simulating a system of 125 polyethylene chains of 40 monomers, using a GROMACS united atom force field. Then, we coarsened the chains with a ratio of 8 to 1 and retrieved the CG potentials by employing the Boltzmann inversion. Once the potentials were obtained, we evaluated the fluctuating forces, defined as the difference between the sum of the atomistic forces and the mean force on the particle. These forces allowed us to compute the memory kernel and its corresponding coupling oscillators. Since we did not have a correct method to extract the kernel for the bonded interactions, we only calculated the nonbonded one, but our final simulation demonstrated that it was sufficient to correct the general dynamics of the system. Moreover, while we could not provide an exact number for the gain in speed, as our implementation is certainly not as optimized as GROMACS, we estimated the speedup to be at least one order of magnitude by employing our own code for both the atomistic and the CG runs. Thus, the MZ formalism achieved both targets outlined in Chapter 1, namely producing accurate dynamics and improving the simulation's performance.
In the future, we plan to derive a formalism to evaluate the bonded memory kernel. Previously, in Chapter 4, we discussed one aspect of the problem, namely the ownership of the atomistic angle forces in a CG bond (Fig. 4-9). The issue can be solved if we simply consider the shared angle $\theta$ as an intermediary for the interactions between the two particles $I$ and $J$: thus, $\vec{F}^{a}_{r}$ is the atomistic force on $J$ by $I$, while $(\vec{F}^{a}_{l} + \vec{F}^{a}_{c})$ is the reaction by $J$ on $I$. By Newton's third law, this assignment guarantees the conservation of momentum in the system. However, this question alludes to a more fundamental complication: the valence (or torsion) angle interaction involves more than two bodies, so it is unclear how we can obtain a formula to determine its respective memory kernel. One possible approach is to apply the most general MZ EOM to the standard MD decomposition of the force field in Eq. (17).

Bibliography

[1] D. Frenkel and B. Smit, Understanding molecular simulation: from algorithms to applications, vol. 1, Academic press, 2001. [2] W. G. Noid, J. W. Chu, G. S. Ayton, V. Krishna, S. Izvekov, G. A. Voth and H. C. Andersen, "The multiscale coarse-graining method. I. A rigorous bridge between atomistic and coarse-grained models," The Journal of chemical physics, vol. 128, no. 24, p. 244114, 2008. [3] M. Neri, C. Anselmi, M. Cascella, A. Maritan and P. Carloni, "Coarse-grained model of proteins incorporating atomistic detail of the active site," Physical review letters, vol. 95, no. 21, p. 218102, 2005. [4] S. Kmiecik, D. Gront, M. Kolinski, L. Wieteska, A. E. Dawid and A. Kolinski, "Coarse-grained protein models and their applications," Chemical Reviews, vol. 116, no. 14, pp. 7898-7936, 2016. [5] B. J. Alder and T. E. Wainwright, "Studies in molecular dynamics. I. General method," The Journal of Chemical Physics, vol. 31, no. 2, pp. 459-466, 1959. [6] W. G.
Noid, "Perspective: Coarse-grained models for biomolecular systems," The Journal of chemical physics, vol. 139, no. 9, p. 09B201_1, 2013. [7] S. J. Marrink, H. J. Risselada, S. Yefimov, D. P. Tieleman and A. H. De Vries, "The MARTINI force field: coarse grained model for biomolecular simulations," The journal of physical chemistry B, vol. 111, no. 27, pp. 7812-7824, 2007. [8] L. Monticelli, S. K. Kandasamy, X. Periole, R. G. Larson, D. P. Tieleman and S. J. Marrink, "The MARTINI coarse-grained force field: extension to proteins," Journal of chemical theory and computation, vol. 4, no. 5, pp. 819-834, 2008. [9] J. S. Andreu, C. Calero, J. Camacho and J. Faraudo, "On-the-fly coarse-graining methodology for the simulation of chain formation of superparamagnetic colloids in strong magnetic fields," Physical Review E, vol. 85, no. 3, p. 036709, 2012. [10] W. H. Wang, C. Schütte and L. Delle Site, "Adaptive resolution simulation (AdResS): a smooth thermodynamic and structural transition from atomistic to coarse grained resolution and vice versa in a grand canonical fashion," Journal of chemical theory and computation, vol. 8, no. 8, pp. 2878-2887, 2012. [11] H. I. Ingólfsson, C. A. Lopez, J. J. Uusitalo, D. H. de Jong, S. M. Gopal, X. Periole and S. J. Marrink, "The power of coarse graining in biomolecular simulations," Wiley Interdisciplinary Reviews: Computational Molecular Science, vol. 4, no. 3, pp. 225-248, 2014. [12] R. Chudoba, J. Heyda and J. Dzubiella, "Temperature-Dependent Implicit-Solvent Model of Polyethylene Glycol in Aqueous Solution," Journal of chemical theory and computation, vol. 13, no. 12, pp. 6317-6327, 2017. [13] Z. Li, X. Bian, X. Li and G. E. Karniadakis, "Incorporation of memory effects in coarse-grained modeling via the Mori-Zwanzig formalism," The Journal of chemical physics, vol. 143, no. 24, p. 243128, 2015. [14] C. Hijón, P. Español, E. Vanden-Eijnden and R.
Delgado-Buscalioni, "Mori–Zwanzig formalism as a practical computational tool," Faraday discussions, vol. 144, pp. 301-322, 2010. [15] Y. Yoshimoto, I. Kinefuchi, T. Mima, A. Fukushima, T. Tokumasu and S. Takagi, "Bottom-up construction of interaction models of non-Markovian dissipative particle dynamics," Physical Review E, vol. 88, no. 4, p. 043305, 2013. [16] T. Kinjo and S. A. Hyodo, "Equation of motion for coarse-grained simulation based on microscopic description," Physical review E, vol. 75, no. 5, p. 051109, 2007. [17] J. L. García-Palacios and F. J. Lázaro, "Langevin-dynamics study of the dynamical properties of small magnetic particles," Physical Review B, vol. 58, no. 22, p. 14937, 1998. [18] S. Izvekov, "Mori-Zwanzig theory for dissipative forces in coarse-grained dynamics in the Markov limit," Physical Review E, vol. 95, no. 1, p. 013303, 2017. [19] P. J. Hoogerbrugge and J. M. V. A. Koelman, "Simulating microscopic hydrodynamic phenomena with dissipative particle dynamics," EPL (Europhysics Letters), vol. 19, no. 3, p. 155, 1992. [20] P. Nikunen, M. Karttunen and I. Vattulainen, "How would you integrate the equations of motion in dissipative particle dynamics simulations?," Computer Physics Communications, vol. 153, no. 3, pp. 407-423, 2003. [21] Z. Li, H. S. Lee, E. Darve and G. E. Karniadakis, "Computing the non-Markovian coarse-grained interactions derived from the Mori–Zwanzig formalism in molecular systems: Application to polymer melts," The Journal of chemical physics, vol. 146, no. 1, p. 014104, 2017. [22] "Re: [lammps-users] The temperature keeps rising with NVE, pair_style dpd, to simulate an amphiphilic polymeric system," 07 Jul 2016. [Online]. Available: http://lammps.sandia.gov/threads/msg63653.html. [Accessed 02 Feb 2018]. [23] M. J. Abraham, D. van der Spoel, E. Lindahl, B. Hess and the GROMACS development team, "GROMACS User Manual version 5.0.4," 2014. [Online]. Available: ftp://ftp.gromacs.org/pub/manual/manual-5.0.4.pdf.
[Accessed 02 Feb 2018]. [24] M. Ceriotti, G. Bussi and M. Parrinello, "Colored-noise thermostats à la carte," Journal of Chemical Theory and Computation, pp. 1170-1180, 2010. [25] J. T. Padding and W. J. Briels, "Uncrossability constraints in mesoscopic polymer melt simulations: non-Rouse behavior of C120H242," The Journal of Chemical Physics, vol. 115, no. 6, pp. 2846-2859, 2001. [26] J. A. Anderson, C. D. Lorenz and A. Travesset, "General purpose molecular dynamics simulations fully implemented on graphics processing units," Journal of Computational Physics, vol. 227, no. 10, pp. 5342-5359, 2008. [27] M. P. Allen and D. J. Tildesley, Computer simulation of liquids, Oxford University Press, 2017. [28] W. C. Swope, H. C. Andersen, P. H. Berens and K. R. Wilson, "A computer simulation method for the calculation of equilibrium constants for the formation of physical clusters of molecules: Application to small water clusters," The Journal of Chemical Physics, vol. 76, no. 1, pp. 637-649, 1982. [29] R. Zwanzig, Nonequilibrium statistical mechanics, Oxford University Press, 2001. [30] G. Besold, I. Vattulainen, M. Karttunen and J. M. Polson, "Towards better integrators for dissipative particle dynamics simulations," Physical Review E, vol. 62, no. 6, p. R7611, 2000. [31] C. P. Lowe, "An alternative approach to dissipative particle dynamics," EPL (Europhysics Letters), vol. 47, no. 2, p. 145, 1999. [32] H. C. Andersen, "Molecular dynamics simulations at constant pressure and/or temperature," The Journal of chemical physics, vol. 72, no. 4, pp. 2384-2393, 1980. [33] N. Goga, A. J. Rzepiela, A. H. De Vries, S. J. Marrink and H. J. C. Berendsen, "Efficient algorithms for Langevin and DPD dynamics," Journal of chemical theory and computation, vol. 8, no. 10, pp. 3637-3649, 2012. [34] R. D. Groot and P. B. Warren, "Dissipative particle dynamics: Bridging the gap between atomistic and mesoscopic simulation," The Journal of chemical physics, vol. 107, no. 11, pp.
4423-4435, 1997. [35] P. Español and M. Serrano, "Dynamical regimes in the dissipative particle dynamics model," Physical Review E, vol. 59, no. 6, p. 6340, 1999. [36] T. C. Moore, C. R. Iacovella and C. McCabe, "Derivation of coarse-grained potentials via multistate iterative Boltzmann inversion," The Journal of chemical physics, vol. 140, no. 22, p. 06B606_1, 2014. [37] J. E. Jones, "On the determination of molecular fields.—II. From the equation of state of a gas," Proceedings of the Royal Society of London A, vol. 106, no. 738, pp. 463-477, 1924. [38] W. G. Hoover, "Canonical dynamics: equilibrium phase-space distributions," Physical review A, vol. 31, no. 3, p. 1695, 1985.

Appendix A

First, let us introduce some new functions of interest. Since we want to investigate the dynamical aspect of a CG mapping, we now care about the trajectory of the system. Let $\gamma = \hat{\varphi}(t)$ and $\Gamma^{P} \equiv \boldsymbol{P}(\gamma)$ denote the trajectory of the atoms and of their COM in the atomistic simulation, respectively, with $\boldsymbol{P}$ being the mapping operator in Eq. (8). Obviously, $\gamma$ is determined completely by the atomistic Hamiltonian $h(\vec{r}^{\,n}, \vec{p}^{\,n})$, that is

$$\frac{d}{dt}\vec{r}^{\,n}(t) = \frac{\partial h}{\partial \vec{p}^{\,n}} \,, \qquad \frac{d}{dt}\vec{p}^{\,n}(t) = -\frac{\partial h}{\partial \vec{r}^{\,n}} \,, \qquad (89)$$

while $\Gamma^{P}$ can be deduced from $\gamma$ using Eq. (8). However, we are not interested in the time-evolution of $\Gamma^{P}$ itself, but in a CG EOM that replicates similar statistical properties. Namely, for any trajectory $\Gamma$ created using the CG EOM, $A(\Gamma)$ has the same distribution as $A(\Gamma^{P})$, with $A$ being an observable in CG phase space. There are different ways of deriving the MZ EOM [14], but in this appendix, we shall follow the work of Kinjo and Hyodo [16], which produces the desired formula through the time evolution of the phase space density function. Let us define the COM phase space density

$$\Omega^{P}(\hat{\varphi}) = \delta\!\left(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi}\right) = \delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^{\,N}\right) \delta\!\left(\boldsymbol{P}(\vec{p}^{\,n}) - \vec{P}^{\,N}\right) \,, \qquad (90)$$

with $\hat{\varphi}$ being the instantaneous coordinates of the atomistic system and $\hat{\Phi}$ a point in the COM phase space $\Phi$. This definition allows us to express an observable of the COM as

$$A(\boldsymbol{P}(\hat{\varphi})) = \int A(\hat{\Phi})\,\delta\!\left(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi}\right) d\hat{\Phi} = \int A(\hat{\Phi})\,\Omega^{P}(\hat{\varphi})\, d\hat{\Phi} \,, \qquad (91)$$

meaning $A(\boldsymbol{P}(\hat{\varphi}))$ only depends on the atomistic coordinates $\hat{\varphi}$ through the mapping $\boldsymbol{P}$ in $\Omega^{P}(\hat{\varphi})$. Then, we assume we can divide an atomistic observable $a(\hat{\varphi})$ into

$$a(\hat{\varphi}) = a^{\mathcal{P}}(\boldsymbol{P}(\hat{\varphi})) + a^{\mathcal{Q}}(\hat{\varphi}) \equiv \mathcal{P}a(\hat{\varphi}) + \mathcal{Q}a(\hat{\varphi}) \,, \qquad (92)$$

where $a^{\mathcal{P}}(\boldsymbol{P}(\hat{\varphi}))$ can be expanded in the basis set $\{\Omega^{P}(\hat{\varphi})\}$ as

$$a^{\mathcal{P}}(\boldsymbol{P}(\hat{\varphi})) = \int C(\hat{\Phi})\,\Omega^{P}(\hat{\varphi})\, d\hat{\Phi} \,. \qquad (93)$$

We can think of the operator $\mathcal{P}$ as the projection from $\varphi$ into $\Phi$, while $\mathcal{Q} = 1 - \mathcal{P}$ is the operator orthogonal to $\mathcal{P}$. Next, we define the scalar product $\langle\,\cdot\,,\,\cdot\,\rangle$ as the correlation function of the canonical ensemble, i.e.

$$\langle a(\hat{\varphi}), b(\hat{\varphi}) \rangle = \int a(\hat{\varphi})\, b(\hat{\varphi})\, p_{\varphi}(\hat{\varphi})\, d\hat{\varphi} \,, \qquad (94)$$

with $p_{\varphi}(\hat{\varphi}) \propto e^{-\beta h}$ being the equilibrium probability density. Then, we apply another assumption,

$$\langle a^{\mathcal{Q}}(\hat{\varphi}), \Omega^{P}(\hat{\varphi}) \rangle = 0 \,, \qquad (95)$$

and take the scalar product of $a(\hat{\varphi})$ with $\Omega^{P}(\hat{\varphi})$ to get

$$\langle \Omega^{P}(\hat{\varphi}), a(\hat{\varphi}) \rangle = \langle \Omega^{P}(\hat{\varphi}), a^{\mathcal{P}}(\boldsymbol{P}(\hat{\varphi})) \rangle = \int \delta\!\left(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi}\right) a^{\mathcal{P}}(\boldsymbol{P}(\hat{\varphi}))\, p_{\varphi}(\hat{\varphi})\, d\hat{\varphi} = a^{\mathcal{P}}(\hat{\Phi}) \int \delta\!\left(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi}\right) p_{\varphi}(\hat{\varphi})\, d\hat{\varphi} \,, \qquad (96)$$

which gives an expression for the projected part of $a(\hat{\varphi})$:

$$a^{\mathcal{P}}(\hat{\Phi}) = \frac{\langle \Omega^{P}(\hat{\varphi}), a(\hat{\varphi}) \rangle}{\langle \Omega^{P}(\hat{\varphi}) \rangle} = \frac{\int \delta(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi})\, a(\hat{\varphi})\, p_{\varphi}(\hat{\varphi})\, d\hat{\varphi}}{\int \delta(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi})\, p_{\varphi}(\hat{\varphi})\, d\hat{\varphi}} \,, \qquad (97)$$

where $\langle \Omega^{P}(\hat{\varphi}) \rangle = \int \delta(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi})\, p_{\varphi}(\hat{\varphi})\, d\hat{\varphi} = P_{\varphi}(\hat{\Phi})$ is the equilibrium distribution function of the mapped COM. Equation (97) shows that the projection into the CG space is the thermal average of all atomistic states that correspond to the same CG state. We note that the two employed assumptions are reasonable: Eq. (93) simply means $a^{\mathcal{P}}$ is the part of $a$ that can be described completely using the COM variables, while Eq. (95) implies the time scales of motion of the atoms and their corresponding CG particles are separable, or equivalently,

$$\langle a^{\mathcal{Q}}(\hat{\varphi}), b^{\mathcal{P}}(\boldsymbol{P}(\hat{\varphi})) \rangle = 0 \,. \qquad (98)$$

Now, let us apply this formalism to the EOM of the phase space density of the COM. The time-evolution of $\Omega^{P}(\hat{\varphi})$ along the atomistic trajectory is

$$\left(\frac{d}{dt}\right)_{\gamma} \Omega^{P}(\hat{\varphi}) = il\,\Omega^{P}(\hat{\varphi}) \equiv -\sum_{i}\left[\frac{\partial h}{\partial \vec{r}_{i}}\cdot\frac{\partial}{\partial \vec{p}_{i}} - \frac{\partial h}{\partial \vec{p}_{i}}\cdot\frac{\partial}{\partial \vec{r}_{i}}\right]\Omega^{P}(\hat{\varphi}) = \sum_{I}\sum_{i\in I}\left[\delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}_{I}\right)\vec{F}_{i}\cdot\frac{\partial}{\partial \vec{p}_{i}}\,\delta\!\left(\boldsymbol{P}(\vec{p}^{\,n}) - \vec{P}_{I}\right) + \delta\!\left(\boldsymbol{P}(\vec{p}^{\,n}) - \vec{P}_{I}\right)\frac{\vec{p}_{i}}{m_{i}}\cdot\frac{\partial}{\partial \vec{r}_{i}}\,\delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}_{I}\right)\right] \,, \qquad (99)$$

where the last equality occurs by grouping the atoms by their CG particle, $il$ is the atomistic Liouville operator and $\vec{F}_{i}$ is the conservative force acting on the atom $i$. Using the linearity of $\boldsymbol{P}$ and the property of the Dirac delta function, the first term in the last line becomes

$$\sum_{I}\left[-\delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}_{I}\right)\left(\sum_{i\in I}\vec{F}_{i}\right)\cdot\frac{\partial}{\partial \vec{P}_{I}}\,\delta\!\left(\boldsymbol{P}(\vec{p}^{\,n}) - \vec{P}_{I}\right)\right] = -\sum_{I}\vec{F}^{P}_{I}\cdot\frac{\partial}{\partial \vec{P}_{I}}\,\Omega^{P}(\hat{\varphi}) \,, \qquad (100)$$

with $\vec{F}^{P}_{I} = \sum_{i\in I}\vec{F}_{i}$ being the total atomistic force acting on the particle $I$. We can also transform the second term similarly to obtain the definition of the CG Liouville operator:

$$\left(\frac{d}{dt}\right)_{\gamma}\Omega^{P}(\hat{\varphi}) = -\sum_{I}\left[\vec{F}^{P}_{I}\cdot\frac{\partial}{\partial \vec{P}_{I}} + \frac{\vec{P}^{P}_{I}}{M_{I}}\cdot\frac{\partial}{\partial \vec{R}_{I}}\right]\Omega^{P}(\hat{\varphi}) \equiv iL\,\Omega^{P}(\hat{\varphi}) \,, \qquad (101)$$

in which $\vec{P}^{P}_{I} = \sum_{i\in I}\vec{p}_{i}$ is the total momentum of the particle $I$. From Eq. (97), we find the projected part of the time-evolution to be

$$f^{\mathcal{P}}(\hat{\Phi}) \equiv \mathcal{P}\,iL\,\Omega^{P}(\hat{\Phi}) = \frac{\left\langle \Omega^{P}(\hat{\Phi}'), iL\,\Omega^{P}(\hat{\Phi})\right\rangle}{\left\langle \Omega^{P}(\hat{\Phi}')\right\rangle} = \frac{\int \delta(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi}')\, iL\,\delta(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi})\, p_{\varphi}(\hat{\varphi})\, d\hat{\varphi}}{\int \delta(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi}')\, p_{\varphi}(\hat{\varphi})\, d\hat{\varphi}} \,. \qquad (102)$$

We want to derive the explicit form of this expression. The scalar product in the numerator is

$$\left\langle \Omega^{P}(\hat{\Phi}'), iL\,\Omega^{P}(\hat{\Phi})\right\rangle = \left\langle \Omega^{P}(\hat{\Phi}'), -\sum_{I}\left[\vec{F}^{P}_{I}\cdot\frac{\partial}{\partial \vec{P}_{I}} + \frac{\vec{P}^{P}_{I}}{M_{I}}\cdot\frac{\partial}{\partial \vec{R}_{I}}\right]\Omega^{P}(\hat{\Phi})\right\rangle = -\sum_{I}\frac{\partial}{\partial \vec{P}_{I}}\cdot\left\langle \vec{F}^{P}_{I}\,\Omega^{P}(\hat{\Phi}'), \Omega^{P}(\hat{\Phi})\right\rangle - \sum_{I}\frac{\partial}{\partial \vec{R}_{I}}\cdot\left\langle \frac{\vec{P}^{P}_{I}}{M_{I}}\,\Omega^{P}(\hat{\Phi}'), \Omega^{P}(\hat{\Phi})\right\rangle \,. \qquad (103)$$

The first average on the right-hand side is

$$\left\langle \vec{F}^{P}_{I}\,\Omega^{P}(\hat{\Phi}'), \Omega^{P}(\hat{\Phi})\right\rangle = \int\left(-\sum_{i\in I}\frac{\partial h}{\partial \vec{r}_{i}}\right)\delta(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi}')\,\delta(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi})\, p_{\varphi}(\hat{\varphi})\, d\hat{\varphi} = \int \delta\!\left(\boldsymbol{P}(\vec{p}^{\,n}) - \vec{P}^{\,N\prime}\right)\delta\!\left(\boldsymbol{P}(\vec{p}^{\,n}) - \vec{P}^{\,N}\right) d\vec{p}^{\,n} \times \int \delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^{\,N\prime}\right)\delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^{\,N}\right)\left(-\sum_{i\in I}\frac{\partial h}{\partial \vec{r}_{i}}\right) p_{\varphi}(\hat{\varphi})\, d\vec{r}^{\,n} \,. \qquad (104)$$

Using $p_{\varphi}(\hat{\varphi}) = e^{-\beta h}/Z$, so that $-(\partial h/\partial \vec{r}_{i})\,p_{\varphi} = \beta^{-1}\,\partial p_{\varphi}/\partial \vec{r}_{i}$, and integration by parts, the inner integral can be rewritten as

$$\int \delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^{\,N\prime}\right)\delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^{\,N}\right)\frac{1}{\beta}\sum_{i\in I}\frac{\partial p_{\varphi}(\hat{\varphi})}{\partial \vec{r}_{i}}\, d\vec{r}^{\,n} = -\frac{1}{\beta}\int p_{\varphi}(\hat{\varphi})\left(\sum_{i\in I}\frac{\partial}{\partial \vec{r}_{i}}\right)\left[\delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^{\,N\prime}\right)\delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^{\,N}\right)\right] d\vec{r}^{\,n} = \frac{1}{\beta}\int p_{\varphi}(\hat{\varphi})\left(\frac{\partial}{\partial \vec{R}_{I}'} + \frac{\partial}{\partial \vec{R}_{I}}\right)\left[\delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^{\,N\prime}\right)\delta\!\left(\boldsymbol{P}(\vec{r}^{\,n}) - \vec{R}^{\,N}\right)\right] d\vec{r}^{\,n} \,, \qquad (105)$$

which gives

$$\left\langle \vec{F}^{P}_{I}\,\Omega^{P}(\hat{\Phi}'), \Omega^{P}(\hat{\Phi})\right\rangle = \frac{1}{\beta}\left(\frac{\partial}{\partial \vec{R}_{I}'} + \frac{\partial}{\partial \vec{R}_{I}}\right)\int \delta(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi}')\,\delta(\boldsymbol{P}(\hat{\varphi}) - \hat{\Phi})\, p_{\varphi}(\hat{\varphi})\, d\hat{\varphi} = \frac{1}{\beta}\left(\frac{\partial}{\partial \vec{R}_{I}'} + \frac{\partial}{\partial \vec{R}_{I}}\right)\delta(\hat{\Phi}' - \hat{\Phi})\,P_{\varphi}(\hat{\Phi}) = \frac{1}{\beta}\,\delta(\hat{\Phi}' - \hat{\Phi})\,\frac{\partial}{\partial \vec{R}_{I}}P_{\varphi}(\hat{\Phi}) \,, \qquad (106)$$

since $\frac{\partial}{\partial \vec{R}_{I}'}\delta(\hat{\Phi}' - \hat{\Phi}) = -\frac{\partial}{\partial \vec{R}_{I}}\delta(\hat{\Phi}' - \hat{\Phi})$. Similarly, we have

$$\left\langle \frac{\vec{P}^{P}_{I}}{M_{I}}\,\Omega^{P}(\hat{\Phi}'), \Omega^{P}(\hat{\Phi})\right\rangle = -\frac{1}{\beta}\,\delta(\hat{\Phi}' - \hat{\Phi})\,\frac{\partial}{\partial \vec{P}_{I}}P_{\varphi}(\hat{\Phi}) \,. \qquad (107)$$

Thus, the projected portion of the CG Liouville equation is

$$f^{\mathcal{P}}(\hat{\Phi}) = -\frac{1}{\beta}\frac{1}{P_{\varphi}(\hat{\Phi}')}\sum_{I}\left[\frac{\partial}{\partial \vec{P}_{I}}\cdot\left(\delta(\hat{\Phi}' - \hat{\Phi})\frac{\partial}{\partial \vec{R}_{I}}P_{\varphi}(\hat{\Phi})\right) - \frac{\partial}{\partial \vec{R}_{I}}\cdot\left(\delta(\hat{\Phi}' - \hat{\Phi})\frac{\partial}{\partial \vec{P}_{I}}P_{\varphi}(\hat{\Phi})\right)\right] \,. \qquad (108)$$

Next, we consider the time evolution of the rest of the CG Liouville equation, namely $\mathcal{Q}\,iL\,\Omega^{P}$, along $\gamma$:

$$\left(\frac{d}{dt}\right)_{\gamma}\mathcal{Q}\,iL\,\Omega^{P}(\hat{\varphi}) = -il\,\mathcal{Q}\,iL\,\Omega^{P}(\hat{\varphi}) \,. \qquad (109)$$

In contrast to the projected term, the orthogonal term can only be described completely using the full atomistic phase space coordinates, so its time evolution depends explicitly on $\gamma$ and is therefore a function of $t$. Hence, we shall relabel it as $\mathcal{Q}\,iL\,\Omega^{P}(\hat{\varphi}) \equiv f^{\mathcal{Q}}(\hat{\varphi}, t)$, where we define $t = 0$ as the point in time when the projection takes place.³ Then, the formal solution of Eq. (109) is

$$f^{\mathcal{Q}}(\hat{\varphi}, t) = e^{-ilt}f^{\mathcal{Q}}(\hat{\varphi}, 0) = e^{-(\mathcal{P}+\mathcal{Q})ilt}f^{\mathcal{Q}}(\hat{\varphi}, 0) = e^{-\mathcal{Q}ilt}f^{\mathcal{Q}}(\hat{\varphi}, 0) - \int_{0}^{t} e^{-il(t-s)}\,\mathcal{P}\,il\,e^{-\mathcal{Q}ils}f^{\mathcal{Q}}(\hat{\varphi}, 0)\, ds \,, \qquad (110)$$

where we have applied the following identity for linear operators:

$$e^{(\mathcal{A}+\mathcal{B})t} = e^{\mathcal{B}t} + \int_{0}^{t} e^{(\mathcal{A}+\mathcal{B})(t-s)}\,\mathcal{A}\,e^{\mathcal{B}s}\, ds \,. \qquad (111)$$

³ We did not make this clear earlier, but the projection $\mathcal{P}$ is necessarily a projection at a fixed time. This, however, does not affect the results about $\mathcal{P}$ in the previous paragraphs, since by the ergodic hypothesis, thermal averages in a simulation do not depend on time. Nevertheless, $\mathcal{Q}$ is not an average, so $t$ is relevant here.

Now, we consider

$$-\mathcal{P}\,il\,e^{-\mathcal{Q}ils}f^{\mathcal{Q}}(\hat{\varphi}, 0) = -\frac{\left\langle \Omega^{P}(\hat{\Phi}'), il\,e^{-\mathcal{Q}ils}f^{\mathcal{Q}}(\hat{\varphi}, 0)\right\rangle}{\left\langle \Omega^{P}(\hat{\Phi}')\right\rangle} = \frac{\left\langle il\,\Omega^{P}(\hat{\Phi}'), e^{-\mathcal{Q}ils}f^{\mathcal{Q}}(\hat{\varphi}, 0)\right\rangle}{\left\langle \Omega^{P}(\hat{\Phi}')\right\rangle} = \frac{\left\langle iL\,\Omega^{P}(\hat{\Phi}'), e^{-\mathcal{Q}ils}\,\mathcal{Q}\,iL\,\Omega^{P}(\hat{\Phi})\right\rangle}{\left\langle \Omega^{P}(\hat{\Phi}')\right\rangle} = \frac{\left\langle \mathcal{Q}\,iL\,\Omega^{P}(\hat{\Phi}'), e^{-\mathcal{Q}ils}\,\mathcal{Q}\,iL\,\Omega^{P}(\hat{\Phi})\right\rangle}{\left\langle \Omega^{P}(\hat{\Phi}')\right\rangle} = \frac{1}{P_{\varphi}(\hat{\Phi}')}\left\langle f^{\mathcal{Q}}(\hat{\Phi}', 0), e^{-\mathcal{Q}ils}f^{\mathcal{Q}}(\hat{\varphi}, 0)\right\rangle \,, \qquad (112)$$

in which the second equality holds because $il$ is anti-Hermitian with respect to this scalar product, and the fourth because of Eq. (98). The last line only involves thermal averages, so it does not depend on time and is thus constant under the effect of $e^{-il(t-s)}$. Then, Eq. (110) is equivalent to

$$f^{\mathcal{Q}}(\hat{\varphi}, t) = e^{-\mathcal{Q}ilt}f^{\mathcal{Q}}(\hat{\varphi}, 0) + \frac{1}{P_{\varphi}(\hat{\Phi}')}\int_{0}^{t}\left\langle f^{\mathcal{Q}}(\hat{\Phi}', 0), e^{-\mathcal{Q}ils}f^{\mathcal{Q}}(\hat{\varphi}, 0)\right\rangle ds \,. \qquad (113)$$

We have now derived the explicit form for the EOM of the CG phase space density function:

$$\left(\frac{d}{dt}\right)_{\gamma}\Omega^{P}(\hat{\varphi}) = f^{\mathcal{P}}(\hat{\Phi}) + f^{\mathcal{Q}}(\hat{\varphi}, t) = -\frac{1}{\beta}\frac{1}{P_{\varphi}(\hat{\Phi}')}\sum_{I}\left[\frac{\partial}{\partial \vec{P}_{I}}\cdot\left(\delta(\hat{\Phi}' - \hat{\Phi})\frac{\partial}{\partial \vec{R}_{I}}P_{\varphi}(\hat{\Phi})\right) - \frac{\partial}{\partial \vec{R}_{I}}\cdot\left(\delta(\hat{\Phi}' - \hat{\Phi})\frac{\partial}{\partial \vec{P}_{I}}P_{\varphi}(\hat{\Phi})\right)\right] + \frac{1}{P_{\varphi}(\hat{\Phi}')}\int_{0}^{t}\left\langle f^{\mathcal{Q}}(\hat{\Phi}', 0), e^{-\mathcal{Q}ils}f^{\mathcal{Q}}(\hat{\varphi}, 0)\right\rangle ds + e^{-\mathcal{Q}ilt}f^{\mathcal{Q}}(\hat{\varphi}, 0) \,. \qquad (114)$$

To obtain the time evolution of the particle $J$, we multiply the above EOM by $\vec{P}_{J}$ and integrate over $\Phi$. The left-hand side becomes

$$\int \vec{P}_{J}\left(\frac{d}{dt}\right)_{\gamma}\Omega^{P}(\hat{\varphi})\, d\hat{\Phi} = \int\left(\frac{d}{dt}\vec{P}_{J}\right)\Omega^{P}(\hat{\varphi})\, d\hat{\Phi} = \frac{d}{dt}\vec{P}_{J} \,, \qquad (115)$$

while, using integration by parts, the projected term becomes

$$\int \vec{P}_{J}\, f^{\mathcal{P}}(\hat{\Phi})\, d\hat{\Phi} = -\frac{1}{\beta}\int \vec{P}_{J}\,\frac{1}{P_{\varphi}(\hat{\Phi}')}\sum_{I}\left[\frac{\partial}{\partial \vec{P}_{I}}\cdot\left(\delta(\hat{\Phi}' - \hat{\Phi})\frac{\partial}{\partial \vec{R}_{I}}P_{\varphi}(\hat{\Phi})\right) - \frac{\partial}{\partial \vec{R}_{I}}\cdot\left(\delta(\hat{\Phi}' - \hat{\Phi})\frac{\partial}{\partial \vec{P}_{I}}P_{\varphi}(\hat{\Phi})\right)\right] d\hat{\Phi} = \frac{1}{\beta}\frac{1}{P_{\varphi}(\hat{\Phi}')}\frac{\partial}{\partial \vec{R}_{J}'}P_{\varphi}(\hat{\Phi}') = \frac{1}{\beta}\frac{\partial}{\partial \vec{R}_{J}}\ln P_{\varphi}(\hat{\Phi}) \,. \qquad (116)$$

Here, we remark that this expression yields the CG conservative force by using the consistency conditions in Eqs. (12) and (13), i.e.

$$\frac{1}{\beta}\frac{\partial}{\partial \vec{R}_{J}}\ln P_{\varphi}(\hat{\Phi}) = \frac{1}{\beta}\frac{\partial}{\partial \vec{R}_{J}}\ln P_{\Phi}(\hat{\Phi}) = \frac{1}{\beta}\frac{\partial}{\partial \vec{R}_{J}}\ln\left[P_{R}(\vec{R}^{\,N})\,P_{P}(\vec{P}^{\,N})\right] = \frac{1}{\beta}\frac{\partial}{\partial \vec{R}_{J}}\ln P_{R}(\vec{R}^{\,N}) = -\frac{\partial}{\partial \vec{R}_{J}}H(\hat{\Phi}) \equiv \vec{F}_{J} \,. \qquad (117)$$

Next, for the orthogonal term, we note that

$$\int \vec{P}_{J}\, iL\,\Omega^{P}(\hat{\varphi})\, d\hat{\Phi} = -\int \sum_{I}\vec{P}_{J}\left[\vec{F}^{P}_{I}\cdot\frac{\partial}{\partial \vec{P}_{I}} + \frac{\vec{P}^{P}_{I}}{M_{I}}\cdot\frac{\partial}{\partial \vec{R}_{I}}\right]\delta(\hat{\Phi}' - \hat{\Phi})\, d\hat{\Phi} = \int \vec{F}^{P}_{J}\cdot\left(\frac{\partial}{\partial \vec{P}_{J}}\vec{P}_{J}\right)\delta(\hat{\Phi}' - \hat{\Phi})\, d\hat{\Phi} = \vec{F}^{P}_{J} \,, \qquad (118)$$

so performing the same computation as in Eqs. (115) and (116) for $f^{\mathcal{Q}}(\hat{\varphi}, 0)$ gives

$$\int \vec{P}_{J}\, f^{\mathcal{Q}}(\hat{\varphi}, 0)\, d\hat{\Phi} = \left.\int \vec{P}_{J}\,\mathcal{Q}\,iL\,\Omega^{P}(\hat{\varphi})\, d\hat{\Phi}\,\right|_{t=0} = \left[\int \vec{P}_{J}\, iL\,\Omega^{P}(\hat{\varphi})\, d\hat{\Phi} - \int \vec{P}_{J}\,\mathcal{P}\,iL\,\Omega^{P}(\hat{\varphi})\, d\hat{\Phi}\right]_{t=0} = \left[\vec{F}^{P}_{J} - \vec{F}_{J}\right]_{t=0} \equiv \delta\vec{F}^{\mathcal{Q}}_{J}(0) \,, \qquad (119)$$

which is what we have defined earlier as the atomistic fluctuating force. Moreover, by applying the consistency conditions to Eq. (108), we have

$$\mathcal{P}\,iL\,\Omega^{P}(\hat{\Phi}) = -\frac{1}{\beta}\frac{1}{P_{\Phi}(\hat{\Phi}')}\sum_{I}\left[\frac{\partial P_{\Phi}(\hat{\Phi})}{\partial \vec{R}_{I}}\cdot\frac{\partial}{\partial \vec{P}_{I}}\delta(\hat{\Phi}' - \hat{\Phi}) - \frac{\partial P_{\Phi}(\hat{\Phi})}{\partial \vec{P}_{I}}\cdot\frac{\partial}{\partial \vec{R}_{I}}\delta(\hat{\Phi}' - \hat{\Phi})\right] = \frac{P_{\Phi}(\hat{\Phi})}{P_{\Phi}(\hat{\Phi}')}\sum_{I}\left[\vec{F}_{I}\cdot\frac{\partial}{\partial \vec{P}_{I}}\delta(\hat{\Phi}' - \hat{\Phi}) + \frac{\vec{P}_{I}}{M_{I}}\cdot\frac{\partial}{\partial \vec{R}_{I}}\delta(\hat{\Phi}' - \hat{\Phi})\right] \,, \qquad (120)$$

therefore

$$\int \vec{P}_{J}\,\frac{1}{P_{\varphi}(\hat{\Phi}')}\left\langle f^{\mathcal{Q}}(\hat{\Phi}', 0), e^{-\mathcal{Q}ils}f^{\mathcal{Q}}(\hat{\varphi}, 0)\right\rangle d\hat{\Phi} = \frac{1}{P_{\Phi}(\hat{\Phi}')}\left\langle \mathcal{Q}\,iL\,\Omega^{P}(\hat{\Phi}'), e^{-\mathcal{Q}ils}\int \vec{P}_{J}\, f^{\mathcal{Q}}(\hat{\varphi}, 0)\, d\hat{\Phi}\right\rangle = \frac{1}{P_{\Phi}(\hat{\Phi}')}\left\langle (1 - \mathcal{P})\, iL\,\Omega^{P}(\hat{\Phi}'), e^{-\mathcal{Q}ils}\,\delta\vec{F}^{\mathcal{Q}}_{J}(0)\right\rangle = -\beta\sum_{I}\left\langle \delta\vec{F}^{\mathcal{Q}}_{I}(s)\otimes\delta\vec{F}^{\mathcal{Q}}_{J}(0)\right\rangle\frac{\vec{P}_{I}'}{M_{I}} \,, \qquad (121)$$

in which we define $\delta\vec{F}^{\mathcal{Q}}_{J}(t) \equiv e^{-\mathcal{Q}ilt}\,\delta\vec{F}^{\mathcal{Q}}_{J}(0)$, and the last equality follows from Eqs. (119) and (120). Thus, we obtain the MZ EOM:

$$\frac{d}{dt}\vec{P}_{J} = \vec{F}_{J} - \beta\sum_{I}\int_{0}^{t}\left\langle \delta\vec{F}^{\mathcal{Q}}_{I}(t - s)\otimes\delta\vec{F}^{\mathcal{Q}}_{J}(0)\right\rangle\frac{\vec{P}_{I}(s)}{M_{I}}\, ds + \delta\vec{F}^{\mathcal{Q}}_{J}(t) \,. \qquad (122)$$

Finally, we want to clarify the difference between $\delta\vec{F}^{\mathcal{Q}}_{I}(s)$ and $\delta\vec{F}_{I}(s)$ and fulfill our promise from the beginning of Section 4.2. Since $\delta\vec{F}_{I}(s) \equiv [\vec{F}^{P}_{I} - \vec{F}_{I}]_{t=s}$ is the deviation of the total atomistic force from the mean force, evaluated from the atomistic trajectory at time $s$, it evolves under the atomistic Liouville propagator, that is

$$\delta\vec{F}_{I}(s) = e^{-ils}\,\delta\vec{F}_{I}(0) = e^{-ils}\left[\vec{F}^{P}_{I} - \vec{F}_{I}\right]_{t=0} = e^{-ils}\,\delta\vec{F}^{\mathcal{Q}}_{I}(0) \,. \qquad (123)$$

On the other hand, we have

$$\delta\vec{F}^{\mathcal{Q}}_{I}(s) = e^{-\mathcal{Q}ils}\,\delta\vec{F}^{\mathcal{Q}}_{I}(0) = \delta\vec{F}_{I}(s) + e^{\mathcal{P}ils}\,\delta\vec{F}^{\mathcal{Q}}_{I}(0) \,. \qquad (124)$$

Basically, while $\delta\vec{F}_{I}$ and $\delta\vec{F}^{\mathcal{Q}}_{I}$ are identical at $t = 0$, their future values propagate according to different operators, so $\boldsymbol{K}_{IJ}(t)$ and $\boldsymbol{C}_{IJ}(t)$ are not equal to each other. For short memory, however, $\mathcal{Q} \approx 1$, so the memory kernel can be approximated as $\beta\langle \delta\vec{F}_{I}(t - s)\otimes\delta\vec{F}_{J}(0)\rangle$.

Appendix B

In this appendix, we want to prove the equivalence between the explicit and the auxiliary memory formulations.
Specifically, under the condition $\boldsymbol{B}_{ss}\boldsymbol{B}_{ss}^{T} = k_{B}T\left(\boldsymbol{A}_{ss} + \boldsymbol{A}_{ss}^{T}\right)$ of Eq. (34), the time-correlation $\langle\vec{\zeta}_{IJ}(t)\otimes\vec{\zeta}_{KL}(0)\rangle$ also satisfies an FDT relation similar to Eq. (26), meaning

$$\left\langle \vec{\zeta}_{IJ}(t)\otimes\vec{\zeta}_{KL}(0)\right\rangle = k_{B}T\,\delta_{IJ,KL}\,\boldsymbol{K}(t) \,, \qquad (125)$$

with $\vec{\zeta}_{IJ}(t)$ and $\boldsymbol{K}(t)$ defined in Eq. (33). By definition, we have

$$\left\langle \vec{\zeta}_{IJ}(t)\otimes\vec{\zeta}_{KL}(0)\right\rangle = \left\langle \int_{-\infty}^{t}\boldsymbol{A}_{ps}\,e^{-(t-t')\boldsymbol{A}_{ss}}\boldsymbol{B}_{ss}\,\vec{\xi}_{IJ}(t')\, dt' \otimes \int_{-\infty}^{0}\boldsymbol{A}_{ps}\,e^{t''\boldsymbol{A}_{ss}}\boldsymbol{B}_{ss}\,\vec{\xi}_{KL}(t'')\, dt''\right\rangle \,. \qquad (126)$$

By remembering that the average is just another integral and that the matrices $\boldsymbol{A}$, $\boldsymbol{B}$ are all constant, we can reorder the above equation into

$$\int_{-\infty}^{0} dt''\int_{-\infty}^{t} dt'\,\boldsymbol{A}_{ps}\,e^{-(t-t')\boldsymbol{A}_{ss}}\boldsymbol{B}_{ss}\left\langle \vec{\xi}_{IJ}(t')\otimes\vec{\xi}_{KL}(t'')\right\rangle\boldsymbol{B}_{ss}^{T}\,e^{t''\boldsymbol{A}_{ss}^{T}}\boldsymbol{A}_{ps}^{T} \,, \qquad (127)$$

which, using the property of the random vector $\langle\vec{\xi}_{IJ}(t')\otimes\vec{\xi}_{KL}(t'')\rangle = \delta_{IJ,KL}\,\delta(t'-t'')\,\boldsymbol{I}$, is

$$\int_{-\infty}^{0} dt''\,\boldsymbol{A}_{ps}\,e^{-t\boldsymbol{A}_{ss}}\left[\int_{-\infty}^{t} dt'\,e^{t'\boldsymbol{A}_{ss}}\,\delta_{IJ,KL}\,\delta(t'-t'')\right]\boldsymbol{B}_{ss}\boldsymbol{B}_{ss}^{T}\,e^{t''\boldsymbol{A}_{ss}^{T}}\boldsymbol{A}_{ps}^{T} = \delta_{IJ,KL}\,\boldsymbol{A}_{ps}\,e^{-t\boldsymbol{A}_{ss}}\left[\int_{-\infty}^{0} dt''\,e^{t''\boldsymbol{A}_{ss}}\boldsymbol{B}_{ss}\boldsymbol{B}_{ss}^{T}\,e^{t''\boldsymbol{A}_{ss}^{T}}\right]\boldsymbol{A}_{ps}^{T} \,. \qquad (128)$$

Now let us consider the matrix $\boldsymbol{Z}(t) = \int_{-\infty}^{t} dt'\,e^{t'\boldsymbol{A}_{ss}}e^{t'\boldsymbol{A}_{ss}^{T}}$: the derivative of $\boldsymbol{Z}$ is

$$\frac{d}{dt}\boldsymbol{Z}(t) = e^{t\boldsymbol{A}_{ss}}e^{t\boldsymbol{A}_{ss}^{T}} = \boldsymbol{A}_{ss}\boldsymbol{Z}(t) + \boldsymbol{Z}(t)\boldsymbol{A}_{ss}^{T} \,, \qquad (129)$$

where the second equality follows by writing $e^{t\boldsymbol{A}_{ss}}e^{t\boldsymbol{A}_{ss}^{T}}$ as the integral of its own derivative. Evaluating at $t = 0$, we have

$$\boldsymbol{I} = \boldsymbol{A}_{ss}\boldsymbol{Z}(0) + \boldsymbol{Z}(0)\boldsymbol{A}_{ss}^{T} = \int_{-\infty}^{0} dt'\,e^{t'\boldsymbol{A}_{ss}}\boldsymbol{A}_{ss}\,e^{t'\boldsymbol{A}_{ss}^{T}} + \int_{-\infty}^{0} dt'\,e^{t'\boldsymbol{A}_{ss}}\boldsymbol{A}_{ss}^{T}\,e^{t'\boldsymbol{A}_{ss}^{T}} = \int_{-\infty}^{0} dt'\,e^{t'\boldsymbol{A}_{ss}}\left(\boldsymbol{A}_{ss} + \boldsymbol{A}_{ss}^{T}\right)e^{t'\boldsymbol{A}_{ss}^{T}} \,, \qquad (130)$$

in which $\boldsymbol{I}$ is the identity matrix; the boundary term at $t' \to -\infty$ vanishes because the eigenvalues of $\boldsymbol{A}_{ss}$ have positive real parts. Thus, under the condition of Eq. (34), the bracket in Eq. (128) equals $k_{B}T\,\boldsymbol{I}$, and we find

$$\left\langle \vec{\zeta}_{IJ}(t)\otimes\vec{\zeta}_{KL}(0)\right\rangle = k_{B}T\,\delta_{IJ,KL}\,\boldsymbol{A}_{ps}\,e^{-t\boldsymbol{A}_{ss}}\boldsymbol{A}_{ps}^{T} = k_{B}T\,\delta_{IJ,KL}\,\boldsymbol{K}(t) \,. \qquad (131)$$
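The identity in Eq. (130) can also be checked numerically for a concrete oscillator block. The sketch below (an illustration with hypothetical parameter values, not the thesis implementation) builds $\boldsymbol{A}_{ss}$ for one damped oscillator, estimates $\boldsymbol{Z}(0)$ by quadrature, and verifies that $\boldsymbol{A}_{ss}\boldsymbol{Z}(0) + \boldsymbol{Z}(0)\boldsymbol{A}_{ss}^{T} \approx \boldsymbol{I}$:

```python
import numpy as np

def expm(M, terms=30):
    """Matrix exponential via scaling-and-squaring with a Taylor series."""
    s = max(0, int(np.ceil(np.log2(max(np.abs(M).sum(), 1.0)))))
    A = M / 2**s
    E, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ A / k
        E = E + term
    for _ in range(s):
        E = E @ E
    return E

def lyapunov_residual(A_ss, t_min=-20.0, dt=1e-3):
    """Quadrature estimate of Z(0) = int_{-inf}^0 e^{tA} e^{tA^T} dt,
    returning the residual A Z + Z A^T - I of Eq. (130)."""
    n = len(A_ss)
    Z = np.zeros((n, n))
    step = expm(dt * A_ss)          # advances e^{tA} by one time step
    P = expm(t_min * A_ss)          # start from the (negligible) lower bound
    t = t_min
    while t < 0.0:
        f0 = P @ P.T
        P = step @ P
        f1 = P @ P.T
        Z += 0.5 * dt * (f0 + f1)   # trapezoid rule
        t += dt
    return A_ss @ Z + Z @ A_ss.T - np.eye(n)

# one oscillator block with q = 1, r = 2 (hypothetical values)
q, r = 1.0, 2.0
w = np.sqrt(q**2 + r**2)
A_ss = np.array([[2 * q, w], [-w, 0.0]])
R = lyapunov_residual(A_ss)
```

Convergence of the integral relies on the positive real parts of the eigenvalues of $\boldsymbol{A}_{ss}$, which is exactly the stability property that makes the auxiliary variables well defined.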
Item Metadata
Title | Bottom-up coarse-graining : theory, implementation, application
Creator | Tran, Hoang
Publisher | University of British Columbia
Date Issued | 2018
Genre | Thesis/Dissertation
Type | Text
Language | eng
Date Available | 2018-04-23
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/)
DOI | 10.14288/1.0365974
URI | http://hdl.handle.net/2429/65579
Degree | Master of Science - MSc
Program | Physics
Affiliation | Faculty of Science; Department of Physics and Astronomy
Degree Grantor | University of British Columbia
Campus | UBCV
Scholarly Level | Graduate