IMPLEMENTATION METHODS FOR SINGULARLY PERTURBED TWO-POINT BOUNDARY VALUE PROBLEMS By SIMON JACOBS B.Math., University of Waterloo, 1984 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (DEPARTMENT OF COMPUTER SCIENCE) We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA September 1986 © Simon Jacobs, 1986 I n p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f t h e r e q u i r e m e n t s f o r an advanced degree a t the U n i v e r s i t y o f B r i t i s h C o l u m b i a , I agree t h a t t h e L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and s t u d y . I f u r t h e r agree t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y purposes may be g r a n t e d by t h e head o f my department o r by h i s o r h e r r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l n o t be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . Department o f Co^puVtr^ *S = L(x,7)u,(x,i) + q(x,y) l < i < N l < j < k (2.6) B1(a)u, r(o) + B2(6)uff(6) = 0 (2.7) where xt;- := Xi + hfpj 1 < i < N 1 < j < k i.e. the solution is continuous on [a, 6], satisfies the boundary conditions, and is on each interval of 7r a polynomial of degree < k + 1 that satisfies the differential equation at the k collocation points, X{j. It can be shown that the requirements (2.5)-(2.7) uniquely determine the approximate solution uT(x). 2.1.2 I m p l e m e n t a t i o n a n d Equiva lence to R u n g e - K u t t a We now describe a specific implementation for the piecewise polynomial collocation method of §2.1.1, as e.g. in Ascher [2]. The approximate solution representation for the ith mesh interval is chosen to be u,(x) := u, + hi xfij (^-^) < Xi< x < x,+1 l 'i{pi)=6ii l;(!) uj y = U.-+1 l and with b := (6y), 6y := V*i(l) w e have u, + D,v, = u, + 1 l < i < N CHAPTER 2. BACKGROUND 12 where T r- mix* D, := hib1 € 3? (2.17) The extension to the general system (2.2) is straightforward, and corresponding to (2.14, 2.17) we get u t := (utl,...,u,n)TG9Jnxl V « : = (U!ll> • • • > U!ln> U'i21» • * • > U ! fc- ln ' U!'Jfcl> • • • > U!'Jtn) r ^ dtnkxl L(xn) V,- := -e -1 . L(xtJfc) e nJfcxJfc Wj := InJt - h(e - l a uL(x,i) «i2L(^a) • • • aufcL(z.i) a2iL(x,-2) «22L(xt-2) • • • a2fcL(x,-2) (2.18) afciL(xt-jfc) a^I^x,*) = Infc - /t.e-Miag {L(xtl),..., L(xtt)} A ® I„ G ® n k * n k Q. := e-1(q,-ii, • • • > I „ 6 5Kfcxnfc When Wt- is non-singular the local unknowns Vj can be eliminated from (2.13, 2.16) and the solution at adjacent mesh points can be expressed as A,u,+1 = r,u, + r, 1< i < N (2.19) CHAPTER 2. BACKGROUND 13 where A, = I r.^I-D.W.^V, r, = D.W^ q,. In combination with the boundary conditions, this gives an algebraic system for the so-lution u,- at mesh points. If the boundary conditions are separated (i.e. each condition is involved at only one endpoint) then the resulting system is almost block diagonal and can be inexpensively solved (SOLVEBLOK by deBoor & Weiss [14]). Note also that if W,~1Vt- and W^ q,- are saved, then the solution derivatives at collocation points can be recovered cheaply from (2.13) as v,- = Wr1(qt--V,-ut-) l < » < i V (2.20) hence we obtain the complete solution u T(x), which will be required for the non-linear ease. The non-singularity of W,- is, however, a point of contention. We may write (cf. (2.18)) W,- = I + Ofae-1) so if h{ « C £:, W T " 1 will exist. This is usually the case if hi ;» e also. But if hi ~ e, as may be expected near a transition layer, then W^ "1 is not always guaranteed to exist. We discuss this difficulty and its resolution in Chapter 3. It is now an easy task to verify the equivalence of collocation to certain implicit CHAPTER 2. BACKGROUND 14 Runge-Kutta schemes (Butcher [19], and [2]). From definitions earlier in this section and from the collocation equations (2.12) we have for the collocation point z f m ! t = e-Uimim + hi 52 ^ApmKj} + e-iqim l < i < N l < m < k j=i which, on letting K m := u{m, can be written as h K T O = c-1/I-m{ut- + /i,-X)a*yKy} + e"1q.-m 1 < » < N 1 < m < k (2.22) 3=1 From the continuity equation (2.15) we see directly that k u t + 1 = u,- + hi J2 fiyKy l < i < N (2.23) and we are done: (2.22) and (2.23) are the usual Runge-Kutta equations for (2.11) with coefficients A , b based on the points p. Note that the choice of ipj{x) was motivated in part by this equivalence property, and also because the unknowns uJ. are kept local to each mesh interval t. This duality is beneficial for many reasons. It allows us to implement a collocation method, while applying the conceptual understanding of Runge-Kutta. Furthermore, with collocation, a solution is obtained throughout the domain rather than at a fixed (2.21) l 0 1 < i < N (2.28) which is A-stability in the reverse direction, hence the stability properties of the scheme (ii) are invariant to the direction of integration, whereas those of scheme (i) are not. CHAPTER 2. BACKGROUND 18 This stems from the fact that (ii) is a symmetric (centred) scheme, i.e. the points on which the scheme is based are placed symmetrically about x i + i . The scheme (i) and certain others that are not symmetric are called one-sided. The usual approach for stiff IVPs is to use a one-sided scheme with L-stability. Errors accumulated when approximating the layers with such schemes decay exponen-tially, so relatively crude meshes can be used in the layer regions if an accurate solution is only required away from the layers. The application of these schemes to BVPs is significantly hampered, however, because unlike IVPs where all solution components for a stable problem decrease throughout the layer, here we may have both increasing and decreasing solution components. Hence the L-stable scheme may become expo-nentially unstable if it is not properly upwinded, that is, if the increasing components are not integrated with a scheme that is stable in the reverse direction. The process of upwinding is expensive and tricky, and limits the use of these schemes in the general purpose setting. Moreover, a fine mesh in the layer region may still yield an inaccurate solution there, so these difference schemes are of better use in special purpose methods, especially where the solution is required accurately only away from the layers [ll]. Our focus for BVPs instead is on symmetric schemes which possess A-stability or better (as described below). Errors accumulated in the layer regions are not damped, but rather are prevented from growing, so the error propagating from the layers will pollute the solution everywhere. The main advantage of symmetric schemes for BVPs CHAPTER 2. BACKGROUND 19 is their invariance to the direction of integration, hence their stability in both directions without implicit or explicit upwinding. In addition, a good layer mesh with a high order method will yield very accurate results in the layer itself. Their drawback is that the layer mesh must be chosen dense enough, and wide enough, for an accurate solution to be obtained over the entire mesh. This, however, is not an unreasonable restriction for schemes that will be incorporated into general purpose algorithms, where uniformly accurate solutions are usually required anyway. While symmetric, A-stable schemes are a marked improvement over one-sided schemes with initial value stability properties (and without upwinding), one should note that the actual stability property for very stiff problems is much weaker than L-stability. A-stability guarantees only that errors accumulated in a transition layer will not grow; L-stability instead guarantees that these errors will be damped away completely (at least for the test equation (2.24)). Indeed, the A-stability property is not always sufficient, and we shall see that the stronger AN-stability is a more desirable property to have. Consider a third scheme: (iii) trapezoidal scheme r-fa+i - = ^OfoK + '(z.+iH+i} + J{q(*.) + qte+i)} i < *' < JV Al, 2 I which is also symmetric, and A-stable in both directions. In fact, for the test equation CHAPTER 2. BACKGROUND 20 (2.24) the trapezoidal and midpoint schemes coincide. But their stability properties differ if we consider instead the variable coefficient test problem e u'(x) = X(x)u(x) a < x < b (2.29) If (2.26) holds for a scheme when it is applied to (2.29), then the scheme is said to be AN-stable. Denoting A,- := A(x,), we have for the trapezoidal scheme 1+ •^+1 = dfer i < t < JV -1; Al+1 A 2« so, if l^hjXjl » 1, \e~1hiXj+i\ » I, 1 < j, < i, u,-+i « — u , - w • • • « ( — U I 1 < i < N A«+i A, +i If | A j + i | -C |Ai|, then the approximate solution will experience a large increase while the analytic solution actually decreases. Note, however, that the instability here is less serious, namely not exponential, than when a one-sided scheme is applied in the "wrong" direction. The situation above can certainly occur in singular perturbation problems, where eigenvalues often experience large changes in magnitude through the domain. It is this change, culminating in a change of sign in the real part of one of the eigenvalues, that is often responsible for transition layers away from the boundaries. As the real part of an eigenvalue passes through zero, the character of the differential equation CHAPTER 2. BACKGROUND 21 changes, and because of the small perturbation parameter, it changes quickly. This results in the possibility for rapid change in the solution profile, the location of which is called a turning point. Kreiss, Nichols & Brown [24] and Ascher & Bader [5] show how disasterous the stability problem with the trapezoidal scheme can be for a classic second order differential equation with a turning point. To contrast the trapezoidal scheme, one can easily see that the midpoint scheme satisfies (2.26) and (2.28) for (2.29), and is AN-stable in both directions, its stability equation unchanged because of the evaluation of the eigenvalue X(x) only at the interval midpoint. The midpoint and trapezoidal schemes are the simplest instances of the general classes of symmetric Runge-Kutta schemes which are equivalent to collocation based on Gauss and Lobatto points. Burrage & Butcher [18], and Burrage [17] have shown that the members of the class of collocation schemes based on Gauss points are all algebraically stable, hence AN-stable, while those based on Lobatto points are not. As-cher & Bader [5] have shown further that the Gauss schemes are the only algebraically stable symmetric schemes based on collocation, and that other algebraically stable symmetric Runge-Kutta schemes not based on collocation are of no greater advantage. Note that while Gauss schemes are more stable in general, Lobatto schemes are slightly more accurate in certain cases (Ascher [4]). We feel that Runge-Kutta schemes based on Gauss points are the most suitable for CHAPTER 2. BACKGROUND a general purpose singular perturbation code, and proceed on that assertion. CHAPTER 2. BACKGROUND 23 2.3 Convergence Resul ts A great deal of attention has been devoted towards establishing convergence re-sults for the non-stiff case (h -C e) and from this work a rather complete theory has evolved. Unfortunately, the theory relies on the assumption that h can be chosen "small enough", and for singular perturbation problems this is not usually possible in an efficient implementation. While some work has been done to develop a theory for singular perturbation problems (Weiss [29], Ascher & Weiss [9]-[ll] and Ascher [3]), it is not in as complete a state as the non-stiff theory. We will only motivate the results here and refer the reader directly to those works for further detail. The results outlined here will focus only on schemes based on Gauss points. How-ever, the theory presented in the literature includes extensive convergence results for the symmetric Lobatto schemes also, and arguments are given therein regarding the appropriate use of both schemes. Almost all current work, and in particular that referenced above, concentrates on problems without turning points. Some also examine the turning point problem but they do so only for very specific cases. Recall from §2.2 that a turning point is a location where an eigenvalue of the matrix L of (2.2) varies by some orders of magnitude. The lack of turning points rules out transition layers away from the boundaries (if the inhomogeneity q is smooth) and leaves us with a three segment problem: a smooth CHAPTER 2. BACKGROUND 24 solution in the interior (away from the boundaries), and the possibility of a rapidly changing solution in a boundary layer at each end of the domain, connected by the interior region. The restriction to problems without turning points is indeed a very severe one. However, experimentation with the code C O L S Y S , the work of this author, and others suggests some extension of these results to complicated turning point problems in prac-tice (see e.g. Example 6.1), despite lack of supporting theory. In fact, the techniques to be discussed are often applied to problems with both boundary layers and turning points, with complete success, if sufficiently dense meshes are placed in the layer and turning point regions (see e.g. Example 6.3). 2.3.1 N o n - S t i f f T h e o r y Some of the findings for "regular" (non-stiff) problems are used in the development of the theory for stiff problems, so we include them here for completeness (cf. [2]). T h e o r e m 2.1 Assume that the B V P u'(x) = L(x)u(x) + q(x) a < x < b B 1(a)u(a)+B 2(6)u(6) = 0 is well posed in the sense that the conditioning constant K (i.e. a bound on the Green's function) is of moderate size, and has a unique solution u(x) € C2k+1[a, b]. Assume CHAPTER 2. BACKGROUND 25 further that L, q G C2k[a, 6], and the k collocation points are chosen to be Gaussian. Then, for h sufficiently small (a) the collocation method of §2.1.1 has a unique solution u„-(x) (b) there exists an implementation that is stable, with condition number (for the resulting algebraic system) ~ KN (c) at mesh points there is superconvergence, i.e. WuixJ-UrixJW^Oih2") l < i < N (d) while at other points |uW(x) - u « ( x ) | | = |p ( i )(x)ut*+i)(Zi)| + 0 ( / i * + 2 - ' ) + 0{h2k) x, < x < x , + 1 l < i < N 0 < j < k p(x) a known smooth function • 2.3.2 A n a l y s i s a n d P a r t i a l M e s h Se lect ion for the T h r e e Seg-m e n t P r o b l e m The results to be presented here assume a special mesh structure in each of the three problem segments. We first describe that structure. From the A-stability property of Gauss schemes, we know (cf. §2.2) that any er-ror introduced in the boundary layers will propagate into the smooth region because the growth of the numerical approximation is merely bounded. Therefore, the mesh CHAPTER 2. BACKGROUND 26 H — I — * II : in—N W — i — i — i — i — w m ^ x T T T T x=a x ^ a +XoC xpb-X,c x=b Figure 2.1: Mesh Structure for the Three Segment Problem required for the three segment problem must be dense enough in the layer regions to achieve a desired accuracy and ensure that only sufficiently small errors propagate into the smooth region, and sparse in the smooth region itself (for efficiency), with just enough points to resolve the solution there. The interior region is usually called the long segment and the others are called short segments. The breakup is as follows I. [a, a + Xoe] Xi_ — a + XoE II. [o + Xoe .A-X^] III. \b - Xxe, 6] z, = b - XiE h := max hi h := min/i,-«'<«, «>T £ 0 (2.30) with solution u(x) - exp r — I x > 0 For a given absolute tolerance 6, we would like |u(xt) -u,(x,)| 0 n_ + 1 < j < n |Im(A ;(x))| < const |Re(A y(x))| 1 < j < n and that (2.2) has a unique solution, bounded uniformly in terms of ||q|| and ||/?|| as e -> 0. W i t h w(x;er) := E _ 1 ( x ) u ( x ; e ) (2.34) look instead at the transformed problem e w ' (x ; e) = (A(x) + 0(e ) )w(x ; e) + p(x; er) + O(e) (2.35) p(x;er) := E _ 1 ( x ) q ( x ; e ) w i th appropriate boundary conditions on [a,b]. Th i s decouples the increasing and decreasing solut ion components, and allows us to look at a set of in i t ia l value problems and a set of termina l value problems independently. The exponential mesh (2.31) may then be appl ied on each short segment w i th slight modif icat ion, i.e. \M := Imax |A,(0)| (2.36) Re(A) := ^ i n RefA^O)) CHAPTER 2. BACKGROUND 30 for the left boundary, and similarly for the right, so that the layer width is computed on the basis of the slowest (smoothest) fast component, and the stepsize is computed on the basis of the fastest fast component. It is important to note here that the best that can now be achieved is u(x,J - VLr(xL) = O(S) since, even if the layer width and stepsize were estimated correctly for Ay(x) := Aj (0) , which cannot be guaranteed, the variability of the eigenvalues produces an 0(6) perturbation in the approximate solution at X0e. The layer width and density may therefore be chosen incorrectly. A coarse mesh is put in between the short segments, but because of the O(S) factor, should only be fine enough so that further refinement will not decrease the error. On the short segments, where h ~ e, there is no stability problem. On the long segment, where h e, a small theoretical restriction is necessary for stability, i.e. that the matrix P _ E - x ( 0 ) P + E - ^ l ) (2.37) be well-conditioned, with P_ € 9 J n - x n , P + e 3 J n + x n (n+ + n_ = n) obvious restriction matrices. This condition is primarily of a theoretical nature and is rarely of any consequence in practice. Problems arising from violating (2.37) are usually removed as an adaptive algorithm adjusts its mesh in response to the instability (§4 in [4]). CHAPTER 2. BACKGROUND 31 W i t h the requirement (2.37) and the well-posedness of the differential system (2.2), the three segments can be stably patched together, and the choice of exponential mesh gives ||u(x.) - u x ( x , ) | | = 0(hk+«) + 0(6) l < i < N (2.38) for e sufficiently small, with q = 1 if k is odd and the mesh is sufficiently locally uniform, and 0 otherwise, and the 0(8) term is the contribution from the layer regions. T h e superconvergence order of the non-stiff case is lost (cf. Theorem 2.1(c)). A more common situation is the problem of different time scales, where both fast and slow components are present. T h e simplest form occurs when the coupling of the fast and slow components is weak, i.e. when (2.2) can be written as ^ y ' = Lii(a: ; e)y + Li2(z; e)z + qi(x ; e) z'= L 2 i ( x ; e ) y + L22(x;e)z + q2(x;e:) (2.39) i.e. L(x; e) Lu(x;er) Li2(x;e:) eL2i(x;t:) eL22(x;e) where L n can be expressed in the form (2.33) and (2.37) applies, and similar uniqueness and boundedness assumptions are made. O n the short segments the collocation equations are applied first. T h e fast and slow components are then decoupled, up to 0 (/&,•), through a transformation that puts CHAPTER 2. BACKGROUND 32 L n in the form (2.33), and an analysis for each is performed essentially independently. The analysis for the fast component only case is applied to the fast components, y , while the "regular" analysis of §2.3.1 holds for the slow components, z. On the long interval, the collocation equations are treated directly, and simultaneously for y and z. If, up to 0(e), there are enough boundary conditions in z alone, we get the estimates for er sufficiently small, with q as in (2.38). If there are not sufficient boundary condi-tions in z, then (2.41) holds for the slow components also. The class of problems can be extended further, to the so-icalled singular singularly perturbed problem, where L has slow components strongly coupled with fast ones. Similarly to (2.33), assume there exists a smooth matrix transformation E(x) such that and \\y{xi)-yx{xi)\\ = 0{h^)-rO{B) 1 < i < N (2.41) L(x;0) = E(x)A(x)E-1(x) a < x 0 n_ + 1 < j < nr |Im(Aj(x))| < const |Re(Ay(x))| 1 < j < nr (nr = n_ + n+, ns = n — nr) with appropriate uniqueness and boundedness. Define also the restriction matrices P. := [0 I„.] G 5Rn'xn P r := 0] G 5ft^ x n P_ := [In_ 0] G SR"-Xn' P + := [0 I n +] G 5R'l+x"' The analysis is, once again, similar to that for (2.33), except that the matrix in the stability requirement (2.37) must be slightly modified to ' P.E-^O) ' . P+E-^l) . where E(x) is the non-singular solution matrix to z' = -P.E-^xjE'^PiTz P_z(0) = 0 P+z(l) = 0 We also require the consistency condition P.E-1(x)q(x;0) = 0 (2.43) On the short segments, the collocation conditions are applied to the transformed variable w of (2.34), to give the problem (2.35). The analysis then yields similar results CHAPTER 2. BACKGROUND 34 as in the previous case, where there is only weak coupling of fast and slow components. On the long segment, the fast and slow components are only partially decoupled, and then analyzed. The improved estimate (2.40) on the slow components drops and we are left with (2.38) for all components. The assumption in (2.33, 2.42) that L be diagonalizable is more restrictive than is actually necessary. The condition need only apply to the null space of (2.42), but the result is a somewhat messier analysis. An important practical consideration of this work is that, given the mesh stategies described earlier, the condition number of the resulting algebraic system, after appro-priate row scaling, is still O(N), and is independent of e. If Gaussian elimination with scaled partial pivoting is used (as is the case in COLSYS), the system can be solved without numerical difficulty. CHAPTER 2. BACKGROUND 35 2.4 C O L S Y S - A Collocation Code In this section we examine the adaptive, general purpose code COLSYS [6]. This code, developed in the late 1970s for mildly stiff problems, is based on collocation at Gauss points, and is still one of the best solvers for singular perturbation problems. Its most recent version, COLNEW (Bader & Ascher [12]), uses a monomial basis (which is an extension of the one described here). A sophisticated non-linear controller, along the lines of Deuflhard [20], is used for the quasi-linearization process applied to non-linear BVPs. COLSYS handles mixed order systems of non-linear ordinary differential equations subject to non-linear side conditions, i.e. it integrates systems of equations of the form u l " * 1 = F „ ( x , z(u)) 1 < n < d where the nth differential equation is of order mn and z(u) involves lower order deriva-tives in the various components of u = (ul5..., ud). The integration is direct, in the sense that the higher order equations are not converted to first order systems before integration. With mn = 1 for all equations, and simple two-point boundary conditions, this conforms to the standard non-linear problem (2.1). The point we wish to emphasize about this code is its ability to solve fairly tough (singularly perturbed) problems adaptively, through a series of local error estima-tions and appropriate mesh redistributions, even when starting from very crude initial CHAPTER 2. BACKGROUND 36 meshes. While the basis of the mesh refinement algorithm is not theoretically justified in the stiff case, it still performs well, and suggests that a mesh strategy aimed more towards stiff problems, but along the lines of COLSYS, can be incorporated with the basic collocation method to produce a robust general purpose code. This avenue is ex-plored in Chapter 5, so we discuss only the current implementation here. Examples of COLSYS performance can be seen in Chapter 6, where comparisons with an improved version for singular perturbation problems are made, or in [6]-[7]. We will restrict our attention to first order systems. For definiteness, consider the mesh 7r of (2.3) for the non-stiff linear system u> <1> 0 £ 9i n, L, B i , B 2 6 9J n x n and all are sufficiently smooth. The result of Theorem 2.1(d) can be extended to give an error estimate u'(x) = L(x)u(x) + q(x) a < x < b B1(o)u(a) + B2(6)u(6) = 0 eW(x) := u M ( i ) - u « ( i ) eW(«) m ( l ( x - Xi+:)) + 0{h?*-') + 0( fc") (2.44) Xi < x < x,+1 l(x) = u(*+1)(x,-+1) + 0(h) = K + ^ 1-lJllL + 0(h). (2.46) uW(x) is a computable piecewise constant because u x(x) 6 Pk+i,*, so the leading term of e(x) in (2.44) can be computed directly using (2.46). With only slight enhancements, this is precisely what COLSYS does. The use of (2.44) to estimate the global error for termination is somewhat secondary because it cannot be considered robust enough in practice for this purpose [6], [7], CHAPTER 2. BACKGROUND 38 Instead, the usual extrapolation process is applied to two consecutive meshes, where one is formed by halving each interval of the other, and (2.44) is used to calculate a more accurate error constant than is usually available. The beauty of the result (2.44) is in its localization of the error estimate. This estimate is only valid, however, if h is sufficiently small, so even when hi is chosen small in a troublesome region, the stepsize in the smooth regions (hence h) must still be taken small enough in order for the local term to dominate. In singular perturbation problems we expect h » e for efficiency reasons, so the result (2.44) is no longer theoretically justified. Hence, we expect the COLSYS error estimates to be inaccurate for very stiff problems, and indeed, this turns out to be the case theoretically also (compare (2.38) and (2.44)). In Chapter 4 we show in detail that the error in is essentially K*.-)ll « XX-1)*'' ' .0(h k + 1 ) + 0(6) i < i < i i=i for e sufficiently small, where, for the remainder of this work, ll'llt : = ll"lli.- I„ T h e only difference is that the usual Runge-Kutta coefficients amj := V'j(Pm) in W< have changed to the new coefficients amj := ipk+i-j{Pk+i-m) in W,-. T h e continuity equation (2.16) is the same for both representations. T h e representations are averaged to get „ Ui + Ui+i hi\rf , (x-Xi\ fxi+i-x\\ u,(x) := — j — + - g ( — J - [ — g — j } D i ) . 1 < •' < JV CHAPTER 3. SYMMETRIC RUNGE-KUTTA 43 Averaging (3.3) with (2.13) gives or rather iv,-(u,- + u i + 1) + i(W f- + W,)v,- = q, V,(u,- + u,+1) + W,v, = q, where V, := iv f-2 * Wf- := I(W,- + W,-> = Ink ~ / i , e - 1 diag {L(z , i ) , • • •, L ( x t j t ) } A ® In A := i ( A + A ) Since (2.16) is unchanged, we have for (2.19) A,- = I + D , W r 1 V, I\ = I - D . W r 1 V, r, = D,W 1~ 1q, The symmetry in the resulting difference equations is evident. This new representation may be thought of as a symmetric Runge-Kutta method. To be specific, the usual Runge-Kutta for (2.11) is written k Ui+1 = \ii + hi bjKj l < i < N (3.5) 3=1 k Km = e-1lim{m + hiYlamjKj}-re-1qim l k+i-j{Pk+i-m)-There is a more pleasant form for A. Comparing (3.6) and (3.7) we see that u, + K J2 «myKy = , + 1 + hi 5^ a m 7Ky J = l ^ J = l i.e. U , + l = u, + 2hi ^ ( a m y - 2my)Ky 3=1 and so from (3.5) amj = am; - ^bj 1 < j , m < k (3.8) This last development is also a simple way to look at the entire symmetrization process. From Theorem 4.10 in [12], we have that A has a simple form which is similar to a skew symmetric matrix, i.e. all its eigenvalues are purely imaginary, and when k is odd, one of the eigenvalues is zero. Hence, as long as the eigenvalues of L are bounded away from the imaginary axis, or er. This is what we would naturally expect when an adaptive code, starting from a crude initial mesh, attempts to redistribute its mesh with respect to the error on each interval. Consider the problem (2.2) when it can be expressed in the form (2.39) of §2.3.2 with Ln non-singular and L21 = 0, i.e. ey* =L11(x;er)y-r-L12(x;e:)z + q1(x;e:) ^ ^ z ' = L22(x;e:)z + q2(x;e) * " ' 46 CHAPTER 4. ERROR ESTIMATES 47 y 6 3ftnr, z £ 5Kn', n r + n.„ = n. Assume a unique solution, bounded uniformly in terms °f I I Q I I I ? || w e w m recover difference equations corresponding to a problem of the form (4.1) in terms of some transformed fast and slow components (Lemma 3.1 in [10]). ^ CHAPTER 4. ERROR ESTIMATES 48 4 . 1 Error at Mesh Points While the order of the error at mesh points is already known from (2.40, 2.41), it is important to understand more precisely the form of the error, in order that it may be distributed in a reasonable way by the process of adaptive mesh selection. For a fixed collocation point xtJ-, i < i < i — 1, 1 < j < k, the difference equations we must consider are k ehi 1(yn - y«)= Z)ai'{Lii(x»')y««+Li2(a;.0z«j + qiO^ i)} (4.2) k h{ 1 (zij - zt) = Ylaii{ L22 [xii)z« + q2 (xu)} 1=1 (cf. (3.5, 3.6) in [10]), and for the exact solution, similarly, (y('*y) y(xi))= &ji{i'ii(^i)y W+i'ia^aJaW + q iM} + (4.3) k z(x«"))= J 2 a A L22{xa)z(xii) + q2(a:«i)} + s,;-i=i Denote y{xi) - y, e? := z(xi) - Zi CHAPTER 4. ERROR ESTIMATES 49 then from (4.2) we get the obvious difference equations for the error eht 1{e?ij - ef)= Ea;'{Ln(x«0ei'+Li2(*.0^} + riS i=i (4.4) k K1^ ~ ei)= Y,aA L 2 2(x t,)ef J} + s,y 1=1 Similarly, we write for the fast components ~ *<) = £&i{Lii(*«)e£ + L12(x«)eS} + r,, f c + 1 (4.5) i=i These difference equations in the error can now be adapted to the theory of §5.1 in [10]. From (5.12) in [10], with e?, ef replacing y,-, zt- we have ej+i = (~l) f ce? + ehT1^ + B^/i^C,- - I)G,-Xf, + r* i < i < 1 - 1 (4.6) CHAPTER 4. ERROR ESTIMATES 50 where, by our notation, H,- is a matrix bounded independently of e diag{Ln(x,i),... ,Ln(x, f c)}(A <8) I) h^Dt = b r ® I n e-'hiWT1 = ( £/ lr 1 I - d i a g { L 1 1 ( x , 1 ) , . . . , L 1 1 ( x , f c ) } ( A ® I ) ) (fa,.. .,fik)T e-1(hiriik+1--Dir*i) Gi B C f, - l rj := r; := (T" ] ( T i l ) = (A ® I)-1 V rik J I r-* J ( 4 . 7 ) Now, G,-, C = O(l) since Ln = O(l), thus we may write ( 4 . 6 ) more clearly as eJVi = 7 x]Ift z _ x «) | d x =y(xi) + hiY^any'(xii) + / y'[xn, xik, X]H(X - xit)dx (4.10) for Li as in (2.10) and the remainder term in div ided difference form. Hence, y'[z, i,..., xik, x]J[(x - xu)dx i=i for some z,- < £ < z,-+i, i.e. ,(fc+i) (4.11) and, of course, s imi lar bounds hold also for rjy and f j , whi le r; = o{hk+1) (4.12) f rom (4.7). CHAPTER 4. ERROR ESTIMATES 52 We now bound f,-, for i < i < T — 1. Since we can safely assume that h is still chosen "sufficiently small" with respect to the slow components, and since the fast and slow components are decoupled in (4.1), then the usual non-stiff theory of §2.3.1 applies for z. In particular, if there are sufficient boundary conditions for z that do not involve e in any significant way, then at collocation points, z(*+1)(x,-) 0{hk+1) + 0{h2k) 1 < 3 < k (4.13) Since (4.11) holds also for rjy, we have from (4.7) f,.|| = ||z(fc+1)(x,)| 0{hk+1) + 0{h2k) + ||y ( f c + 1 ) ||. 0{ehk) (4.14) Finally, from (4.9, 4.12, 4.14), with e sufficiently small we have et+i (_ 1 )*(.-i - i ) e «' + ^ ( - l ) f c ( « - J ) | | | y ( f c + 1 ) | | y 0{h)+x) + 0(fej + 1 ) + 0{h2k)j i < i < i — 1 Now, when this initial value system is patched back into the original boundary value problem by transformation, there is a contribution of error, 0(S), from both segment boundaries, and we find that as e —• 0, the error is "II < £ ( - ! ) * ' (||y(fc+1)|.0(fc}+1) + ||z(*+1)|| (4.15) +0(h2k)) + 0{6) i < i < t CHAPTER 4. ERROR ESTIMATES 53 The error at mesh points for the fast components is no longer local. Its leading term is E(-i) t y (|y(*+1)|yo(^+1) + ||z(*+1)|| The error for the slow components satisfies, for the same reason as in e?- above, the superconvergence result e?|| = 0(fc") i 0, with the error, 6, from outside the segment fx,-,!*] suitably small. This is an excellent estimate of local error in the collocation points, whose leading term is independent of the error in the fast components. We will show its use in Chapter 5. Chapter 5 Mesh Selection In this chapter we discuss some considerations for a posteriori mesh selection al-gorithms. A good starting point is the collocation code COLSYS, which has a rather sophisticated and successful a posteriori mesh selection procedure. This procedure re-lies explicitly on the non-stiff theory of Theorem 2.1, and as indicated in §2.4, is not valid for singular perturbation problems. However, many of the principles involved are sound for collocation methods in general and the code itself has had some prac-tical success with such problems. Hence, we will apply some of the heuristics of this algorithm, with modification, for singular perturbation problems. It is important at this time to recall the context within which the mesh selection is to take place. Roughly stated, the task can be summarized as Given the BVP (2.2), and a desired error tolerance 8, efficiently determine a mesh 7T, as in (2.3), such that the number of mesh elements N is "small" and the error in the approximate solution VLr(x) satisfies the mixed absolute 57 CHAPTER 5. MESH SELECTION 58 and relative tolerance ||u(x) - uf(x)||,. < 6 (1 + |Mx)||,.) x, < x < x,+i i < i < i - 1 As in Chapter 4, we consider a segment [XJ, xT] of TT, where h » e, and for convenience define N := i — i The requirement that a tolerance be met for each component of u may not be realistic, for instance, when a higher order B V P is converted to a first order system for integration and some components are just higher derivatives whose accuracy is not important. In this case we may choose to be more selective about which components are monitored. The most straightforward (and common) a posteriori approach is to successively determine an approximate solution on a mesh ir and then refine to a new (hopefully better) mesh n* based on inferences about the error in the old solution, u,. The process is repeated until either the tolerances are satisfied or some control parameters cause termination. We will adopt this approach. CHAPTER 5. MESH SELECTION 59 5.1 Error Equidistribution Ideally, for a given mesh size N, we would like to "minimize" the error on [x^, x^] by a suitable distribution of mesh points. From (4.15) we know that a bound on the leading term of the error in mesh ele-ment t, i < i< T — 1, for our method, is i - i £ ( - l ) f c i C ; u (fc+i) h k + l (5.1) To approximately satisfy the given error tolerance S, we therefore require u < S(l + ||u||t.) (5.2) If k is odd and the mesh is sufficiently locally uniform, then the terms in the summation will not accumulate, but if k is even they will. Hence, for (5.2) to be true, we must have,, roughly, u where q = 0 if k is odd and the mesh is sufficiently locally uniform, and 1 otherwise. Restated, we require Ci{x) ||u(*+1) I *** hi < 1 for Ci(x) CiNq L*(i + JMI,) i ITT (5.3) CHAPTER 5. MESH SELECTION 60 A natural way to try to satisfy (5.3), regardless of whether N is large enough to allow it, is to choose ir* SO that the error is equidistributed, i.e. to find a partition of [x<, x-f] such that C,(x) u 1+1 hi = C i_ < i < i - 1 (5.4) for some constant C [27]. Unfortunately, this task is not practical, even when is used to estimate u. If instead we require i r*» ulk+1)(x)\\VTTdx = C = ±- C(x) u<*+1)(x) 1 dx (5.5) C(x) := C,-(x) for xt- < x < x,+1 * < t < i — 1 then from Theorem 2.1 in Russell & Christiansen [28], the mesh will be asymptotically equidistributing (see e.g. Pereyra & Sewell [26] and also [28]), i.e. Ci{x) u /i t- = C(l + O(h)) as N-too i < i < i - 1 (5.6) and (5.4) will be approximately satisfied. Assuming that a reasonable approximation to u( f c + 1) is available, (5.5) can be used efficiently to equidistribute a mesh of given size N fairly well. The major issue remaining then, is how to find a good approximation to u(* + 1). This is considered in detail in §5.2. With an approximation to u( f c + 1) at hand, it is a straightforward process to predict what new mesh ir*, iv i a — i X j (x) J x" dx = 1 i* (a:) J x: 1 dx (5.8) In §5.2 we discuss how and when the requirements (5.7, 5.8) are achieved i n practice. CHAPTER 5. MESH SELECTION 62 5.2 Implementation A general mesh selection algorithm must deal not only with the implementation of (5.7, 5.8), but also with the numerous safeguards for their appropriate use and alternatives when they are infeasible. Global error estimation and termination must also be considered. These issues are discussed in detail in §5.2.1 and §5.2.2. 5.2.1 G e n e r a t i n g a n E q u i d i s t r i b u t e d M e s h From (5.7) it is clear that to do well at estimating a new partition w*, we must estimate C and, more importantly u( f c + 1) reasonably well. With Cj of (5.1) constant, the equidistribution of a mesh for a given N is not affected by the particular choice of the constant, whose primary use is to help intelligently estimate the required N*. Since little is known about how to choose Cj for very stiff problems, we use the constant for the non-stiff case (i.e. that of COLSYS) which comes from Theorem 2.1(d) simply as C, = ||p||t. « < * < I - 1 and which works very well in practice for singular perturbation problems. C (x) is then formed from (5.3). The estimation of u(* + 1) poses much more difficulty, and care must be taken to find even a "good" O(l) approximation. We take advantage of the localized estimate of CHAPTER 5. MESH SELECTION 63 error at collocation points (4.13), which gives ||u(x0) - Ms^H = l ^ f r ) ! o(^+1) + o(h^) ( 5 g ) * < » < > - 1 l 1, ' ' iZIZiU[xiu---,xik,x\-^l[(x-xij) i=l 1=0 dx*" d /c-l fc =u[x f l, . . . , Xik, X] ^ - " T Z T I I { x - Xij) + ° ( h l ) (5.11) Now — 1]{x - xy) = A;!x — (k — l)!(x t l + • • • + x « ) dx* and for symmetric schemes, p3- = 1 — therefore £ii H (- xik = kx. •i+i Hence dx* = k\xi+k-(k-l)\(kxi+k) = 0 (5.12) and from (5.11, 5.12) we have for any symmetric scheme of order k, and in particular for Gauss schemes, Ri*'1,(*«+j) = o(fc?) Differentiation of (5.10) then gives = VLt1](xi+k) + 0{h\) + 0{h\-kh?k) (5.13) t < t < I — 1 CHAPTER 5. MESH SELECTION 65 Now, for each interval t, i +1 < i < I — 2, interpolating quadratically from _ 1^(x1 +i) on adjacent intervals and then differentiating gives the piecewise constant approxima-tion u ( f c + 1 , w = u — * — h — r - r u ^ 1 } ^ - ^ -Xi+i)(Xt_i - x j + | ) + ( 5 - 1 4 ) 2 where (X,+ |-X,_l ) ( x . + | -X,- +l) X,_! < X < X,+2 t + l < * < I — 2 |R,-J < 0(1) + (V-T1 + Kk~l + K"x~l)0{h2k) (5.15) The 0(1) term in R,- has two major constituents: z(*+1) | from (5.9), and u ( * + 1 ) from the 0(h}) term of R,- in (5.11);. However, recall that from the point of view of error equidistribution, we are really interested in some indication of the relative magnitudes of the error, hence u ( * + 1 ) , throughout [x;, xT]. These particular 0(1) terms are therefore tolerable. The approximation to u ( f e + 1 ) and the error coefficient C(x) are both piecewise con-sant, hence the results (5.8, 5.7) are easily used to generate new equidistributed meshes. In practice, (5.14) usually works well. However, difficulty is encountered when the quadratic interpolation is applied for a mesh interval where hi, / i t + i differ by CHAPTER 5. MESH SELECTION 66 many orders of magnitude. The error estimate obtained becomes large in magnitude relative to that from other intervals and often dominates over the error estimates for all intervals. Usually, when this problem occurs, we simply apply the interpolation in a one-sided way to compensate (as at the endpoints, normally). Near the boundaries of [x,-,xT] (and sometimes in the interior), however, there may be difficulty because three intervals of relatively equal magnitude are required to make even the one-sided estimation. If there are only two intervals of comparable magnitude we can, similarly to (5.10, 5.14), estimate on two intervals and then linearly interpolate. The necessary (k + l)8t collocation point is taken from the adjacent interval, and an error term like hk 0( j^1) results from the differentiation to u^fc+1^ (ignoring the contribution from global terms). Since we assume ~ hi in this case, this causes no difficulty. When only one interval is available (i.e. when the ratios of to ht and hf to hi+i vary considerably or a boundary interferes for some reason), no reasonable approximation is at hand, and we cannot resolve the problem. To prevent this situation, an additional mesh point is placed at the midpoint of [x,-, x,+1] when the mesh is created in order that a sufficient number of intervals will always exist for further redistribution. While the procedure for estimating u(*+1) is certainly less elegant than might be desired, it is the best that we are currently aware of. Other more general approaches have been examined. One idea is to equidistribute a lower derivative, say vSk\ instead of u(fc+1) (§3 in [28]), for which an O(h) approximation exists. However, experience CHAPTER 5. MESH SELECTION 67 shows that this is inferior in practice because the code does not detect and respond to trouble areas as well as the above method. Another idea is to estimate u( f c + 1) directly, by interpolating from k points on one interval and 2 points from adjacent intervals, and then differentiating. The problem with this idea is that the 0(1) term of the error f h k + l \ in u( f c + 1) behaves instead like O y^k+i J (ignoring the contribution from global terms), and creates a much more severe problem than intervals of differing orders of magnitude in (5.15). It is worthwhile to note here that the theory presented has been only for e 2, we can expect that something is to be gained by redistributing the current mesh. In this case the new mesh size N* is chosen so that N* := min j iV, imax{iV, predicted/V*} j with "predicted N*" coming from (5.8). Hence the new N* will be chosen between iV N and — . This restriction is added because little value is placed on a "predicted N*" that is out of this range. If a redistribution is not considered worthwhile, the mesh is simply doubled in size, i.e. each interval is halved. • In addition, several controls are placed on the number of times a mesh can be redistributed for a given N, or successively halved and doubled, because in the early stages the code will often try to redistribute a mesh several times based on incorrect inferences from inadequate meshes. When this happens, the mesh is doubled. Global error estimates are obtained by comparing the approximate solution on two consecutive meshes, where one has been formed by a doubling of the other. Recall that such meshes arise when there is no benefit from further redistribution of the current CHAPTER 5. MESH SELECTION 69 mesh. More precisely, leu,- - 2 (5.16) ** < i < t* - 1 where 7r* is the mesh formed by doubling TT, is the approximate solution on n*, and C is an error constant. The purpose of averaging is to give robustness to the estimation and to give an indication of the error for the entire interval not just at Xi. The COLSYS estimates for C are used, however, one should note that they do not differ by a factor of more than 2 or 3 from the simple extrapolation constants that can be calculated from (5.16) on the basis of the error for stiff problems. The difficulty with determining C more precisely is that the constant associated with the leading term of the error is not known, or easily calculated for stiff problems; the extrapolation constant may not prove to be any better than'that of COLSYS. CHAPTER 5. MESH SELECTION 70 5 . 3 A Priori Initial Meshes In general, there is little that can be done, a priori, to choose a "good" initial mesh, without investing a considerable amount of effort. However, the choice of the first mesh can have a significant effect on the entire mesh selection procedure. The usual approach, when no advanced information is known about the problem, is to use a uniform mesh of some small size, i.e. N not too large, and then let the adaptive algorithm increase N and redistribute the mesh as it sees fit, until the tolerances are met. If information is known, for instance about the location of steep gradients, then a more intelligent mesh may be specified by concentrating points in the potential problem areas. Sometimes, a reduced problem can be solved first to provide the necessary information. For more complicated problems still, a continuation process may be required, where a decreasing sequence of er-values is defined and the final mesh created for the solution with a given e is used as the initial mesh for the next smallest e. However, when simple boundary layers are present, we can sometimes choose the initial mesh in layer regions well by taking advantage of the exponential nature of the solution. The construction of §2.3.2, specifically (2.31), (2.36) and relevant comments, can be applied to each boundary layer. The exponential meshes are inexpensive to construct, requiring only one call per mesh to an eigenvalue routine, and if a layer is present, they can, and often do provide CHAPTER 5. MESH SELECTION 71 the code with a good start, at least in indicating a problem area [9]-[ll], [3]. Experience also shows that a layer mesh will be quickly removed if it is not warranted, and well supported if it is. Unfortunately there are significant difficulties. Although these meshes are, theoret-ically at least, guaranteed to give 0(6) accuracy, they cannot be treated as fixed parts of the global mesh for many reasons. First of all, the 0(6) is not adequate; we require a more strict agreement with 6. Secondly, the evaluation of the eigenvalues for e — 0 may result in a mesh density or width that is incorrectly chosen, considerably beyond the realm of 0(6). The latter is particularly alarming because it can cause an error to propagate into the long segment which cannot be resolved there by any coarse mesh placement. There are also difficulties associated with the actual implemention of the meshes, e.g. numerically separating the fast and slow eigenvalues. Hence, while these meshes are cheap and potentially very beneficial, they cannot be considered robust enough for a general purpose code. They are perhaps best thought of as pre-processors to the a posteriori mesh eontruction process, and should be applied cautiously only in that context. Chapter 6 Numerical Examples The a posteriori mesh selection considerations of Chapter 5 and the symmetric Runge-Kutta solution representation of Chapter 3 have been incorporated into an experimental adaptive collocation code, RKNEW, that integrates first order linear systems with separated two-point boundary conditions. RKNEW has been tested on numerous perturbation problems, with both boundary layers and turning points, and we now examine three representative problems, all of the singularly perturbed type. In some instances, comparisons are made with COLNEW [12], a code for mildly stiff problems which uses the usual ("one-sided") Runge-Kutta solution representation. COLNEW is the most recent version of COLSYS. Differential systems are converted to first order before integration (even for COL-NEW, which can integrate higher order equations directly), a limit of 500 mesh intervals is placed on any mesh generated by either code, and the error tolerance is imposed on all components of the first order system. Both codes were run in double precision (15 72 CHAPTER 6. NUMERICAL EXAMPLES 73 decimal digits) on a V A X 750, using the Fortran 77 compiler of BSD 4.2 UNIX. Run time comparisons are not meaningful given the preliminary state of R K N E W , how-ever, the mesh selection procedure of R K N E W is slightly more costly because, unlike C O L N E W , the solution must be evaluated at the k collocation points on each interval of the mesh every time a mesh redistribution is possible. Since the cost to find an approximation on a mesh of size N is O(N), the best indication of the relative work associated with finding a solution for a given e is to compare the total number of mesh intervals required from all meshes generated by each code. The following notation will be used: xii - Ith component of the exact solution E(uj) - maximum of mixed absolute and relative error in uj calculated at X{, xi+i, x t-+| for each mesh interval i , 1 < i < N 6 - mixed absolute and relative error tolerance w k - number of collocation points per mesh interval N - number of mesh intervals Mesh Sequence - sizes of successive meshes generated to meet tolerances TVtot - total number of mesh intervals in the Mesh Sequence a ± b - a • 10 ± 4 * - an extra point was added to make the mesh sufficiently locally uniform. CHAPTER 6. NUMERICAL EXAMPLES 74 Example 6.1 A Turning Point, ex. 3.7.4 in Hemker [23], ex. 1 in [6]. ery" + xy' = — EK2 cos(7rx) — 7rxsin(7rx) — 1 < x < 1 y ( - l ) = -2 y(l) = 0 converted to first order by Ui := y, u 2 := y', with solution erf u^x) = cos(7rx) H erf (—=-r As e —* 0 the solution develops a shock layer in the turning point region near x = 0 and is smooth elsewhere (Figure 6.1). To find a cheap, accurate approximation to this problem, each code must recognize the shock and place a sufficiently dense mesh around it, using as few mesh intervals as possible (i.e. JV" small). Results for COLNEW and R K NEW are given in Table 6.1 for an initially uniform mesh. As £ gets smaller, the problem gets more difficult to solve and more mesh points are required. It is clear that for large e there is little difference in the performance, but as the problem gets harder, RKNEW is far superior, finding adequate meshes of relatively small size for extremely small values of e. The size of the final mesh is important because it determines the amount of storage used, and it also reflects how well the code has detected and responded to difficulties early in the mesh selection process; once a particular size, JV say, has been reached, further meshes are always CHAPTER 6. NUMERICAL EXAMPLES 75 Figure 6.1: The Solution for Example 6.1 C O L N E W R K N E W e k 6 E (ui) E(u 2 ) JVtot Mesh Sequence E(u,) E(uiJ N t o t Mesh Sequence .1-1 4 .1-1 .52-4 .79-3 24 8,16 .52-4 .79-3 24 8,16 .1-1 4 .1-5 .89-7 .59-6 119 8,16,32,21,42 .15-6 .37-6 132 8,16,16,32,20,40 .1-3 4 .1-5 .12-7 .49-6 426 8,16,32,64,51,102,5 102 , .58-7 .37-6 312 8,16,16,16,32,32,64 128 .1-5 4 .1-5 .30-9 .27-7 1352 8,16,16,32,32,32,64 64,64,128,128,128, 256,128,256 .83-7 .37-6 474 8,16,16,16,32,32,32 64,43,86,43,86 .1-6 4 .1-5 8,16,16,32,32,32,64 64,64,128,128,128, 256,256,256.>500 .15-6 .24-5 406 8,16,16,16,32,32,32 64,64,42,84 .1-8 4 .1-5 .60-7 .77-7 942 8,16,16,16,32,32,32 64,60,120,111,222, 111,222 .1-11 4 .1-5 .59-7 .24-6 1263 8,16,16,16,32,32,32 64,39,78,53,106,85, 170,86,172,86,172 Table 6.1: Results for Example 6.1 With an Uniform Initial Mesh CHAPTER 6. NUMERICAL EXAMPLES 76 of size > — . Note how both codes redistribute the mesh several times early in each _ 2 mesh sequence, indicating how quickly the problem has been detected. For the final mesh of RKNEW at e = .1 - 11, 144 of 172 mesh intervals are in (-.1 - 4,.l - 4), h max hi = .69 - 1, min h{ = .15 - 6, and = .85 + 5. RKNEW does not have i500 5,10,20,40,80,160,320, >500 Table 6.2: Results for Example 6.2 With a Uniform Initial Mesh of the disturbance; the layer is simply "skipped over". An alternative is to use the knowledge that a boundary layer at x — 0 actually exists (from a solution for large e, or by some analytic means), and to choose the initial mesh so that the codes can detect the layer. A surprisingly simple way is to choose a new initial mesh, 7r a :— {0.0,ae,2a£,3ae,4ae,0.25}, where a is some parameter that concentrates mesh points around the layer. Results for smaller e with this non-uniform initial mesh are given in Table 6.3. Various values of a were tried for the initial COLNEW mesh at e = .1 — 5, but none was found to generate a solution in under 500 mesh intervals. In all instances the initial mesh was adjusted a little for small N and then simply doubled past 500 intervals. With RKNEW, on the final mesh at e = .1 - 11, 174 of 176 intervals were in (0.0, .1 - 9), and = .62 + 11. It is interesting to note that after the first mesh of 176 intervals, the maximum error is still ~ .1 + 9, and that after only two mesh redistributions, it is reduced to ~ .1 — 6. This CHAPTER 6. NUMERICAL EXAMPLES 78 COLNEW RKNEW e a g E( U l ) E(u2) ATtot Mesh Sequence B(ui) E(u2) Ntot Mesh Sequence .1-3 .1-1 5 .1-1 .29-8 .93-6 258 5,5,10,20,40,40, 23,46,23,46 .11-6 .37-6 156 12,24,20,40,20,40 .1-4 .1-2 5 .1-3 .17-9 .39-9 1220 5,5,10,20,40,80, 160,150,300,150, 800 .1-5 .1-1 .1-2 .1-3 5 .1-5 5,10,20,40,80, 160,320,>500 .26-10 .40-7 654 6',12,24,48,96,52, 104,52,104,52,104 .1-7 .1-4 5 .1-5 .26-10 .41-7 762 6*,12,24,48,96,64, 128,64,128,64,128 .1-9 .1-6 5 .1-5 .27-10 .42-7 870 6',12,24,48,96,76, 152,76,152,76,152 .1-11 .1-8 5 .1-5 .27-10 .43-7 978 6*,12,24,48,96,88, 176,88,176,88,176 Table 6.3: Results for Example 6.2 With a Simple Non-Uniform Initial Mesh RKNEW s k S E(ui) E(u2) jVtot Mesh Sequence .1-3 5 .1-5 .51-8 .39-7 234 9,9,18,36,18,36,18,36,18,36 .1-5 5 .1-5 .40-12 .20-8 1476 9,9,18,36,21,42,84,168,121,242,121,242,121,242 .1-7 5 .1-5 .18-10 .30-8 1134 9,18,36,72,36,72,36,72,40,80,51,102,102,68,136,68,136 .1-9 5 .1-5 9,10*,20,10,20,40,80,160,320,>500 Table 6.4: Results for Example 6.2 With an Exponential Initial Mesh is a common feature of both codes, but is more pronounced for RKNEW. Another option is to use the exponential meshes of §2.3.2 as initial meshes, as indicated in §5.3. With a uniform mesh of size 5 away from the layer and an exponential mesh in the layer region, we get the results in Table 6.4. Here we see the interesting aspect of exponential meshes: while their performance is far superior to the initially uniform meshes (as we would expect from almost any reasonable non-uniform mesh), it is also quite inferior to the performance of the very crude non-uniform meshes, ira. The initial mesh generated for e = .1 — 7 has an error ~ .1 + 3, which suggests that the CHAPTER 6. NUMERICAL EXAMPLES 79 problem of a boundary layer is recognized, but that the mesh width or density needed to resolve it is chosen incorrectly (see §2.3.2). For e — .1 — 7, the exponential part of the initial mesh lies entirely within (0.0, .2 — 6), compared with (0.0, .1 — 4) for 7 r a . The exponential meshes for e < .1 — 7 cannot be used with COLNEW because the local non-uniformity at the join of the layer and smooth regions is too great; COLNEW has overflow errors. The actual overflow may simply be a scaling problem, but experience shows that COLNEW is often unable to yield solutions on such meshes anyway, even when overflow does not occur. • Example 6.3 A Turning Point and Boundary Layer, ex. 2 in Brown & Lorenz [15]. e7r2cos(7rx) H—-x sin(7rx) =: g(x)! 1 < x < 1 E z" -f Z = 0 y(-i) = - l z(-l) = 1 converted to first order by introducing the variables w and v such that E Z* — V w' '= 2 - y + 2 x v + z _ g CHAPTER 6. NUMERICAL EXAMPLES 80 and ti! y, u 2 :— v, u 3 := w, u 4 := z. We get, on integrating, x 1 x e r u i = - - u i + -(tr - l ) u 2 + u 3 + (1 - £-)-u4 u'2=u4 1 x = - U i + -2 2 u 3 = - U ! + - u 2 + u 4 - g £ U 4=U 2 under appropriate boundary conditions. The exact solution is u^x) = j - . + u 4 + cos(7rxJ u 4(xj = exp I — Y e r f ( ^ ) v £ 2 Under the given boundary conditions, the solution for z has a boundary layer at x = — 1. The differential equation for y is formed essentially by adding terms involving z and its derivative to the problem of Example 6.1. Hence, the solution for y contains a boundary layer similar to z, plus a shock layer at the turning point x = 0 similar to the solution of Example 6.1 (Figure 6.2). The turning point is the more difficult feature to resolve. Results for an initially uniform mesh are given in Table 6.5. Again, R K NEW performs better. The turning point is noticed first, and for instance at e = .1 — 7, it is not until the second mesh of size 126 that any attempt is made to resolve the boundary layer. For the final mesh at £ = .1 — 7, 73 points are in (—1, —.99) and 79 points are in (—.1 — 2, .1 — 2), the rest of the mesh is a smooth transition between CHAPTER 6. NUMERICAL EXAMPLES 81 Figure 6.2: The Solution for Example 6.3 COLNEW RKNEW c k S E(ux) E(u<) T V tot Mesh Sequence E(U l) E ( U 4 ) AT tot Mesh Sequence . 1 - 1 4 . 1 - 1 . 1 1 - 3 . 2 4 - 3 1 5 5 , 1 0 . 1 1 - 3 . 2 4 - 3 1 5 5 , 1 0 . 1 - 1 4 . 1 - 5 . 2 7 : 6 . 3 5 - 6 7 5 5 , 1 0 , 2 0 , 4 0 . 2 7 - 6 . 3 5 - 6 7 5 5 , 1 0 , 2 0 , 4 0 . 1 - 2 4 . 1 - 5 . 7 9 - 7 . 9 6 - 7 1 7 5 5 , 1 0 , 2 0 , 2 0 , 4 0 , 8 0 . 5 9 - 7 . 1 3 - 7 2 3 3 5 , 1 0 , 2 0 , 1 7 , 3 4 , 2 1 , 4 2 , 8 4 . 1 - 3 4 . 1 - 5 . 8 1 - 6 . 2 9 - 7 3 9 0 5 , 1 0 , 2 0 , 4 0 , 3 5 , 7 0 , 3 5 , 7 0 , 3 5 , 7 0 . 1 4 - 6 . 3 1 - 7 2 9 0 5 , 1 0 , 2 0 , 2 0 , 2 0 , 4 0 , 2 5 , 5 0 , 1 0 0 . 1 - 5 4 . 1 - 5 . 4 4 - 6 . 1 1 - 7 8 1 7 5 , 1 0 , 1 0 , 1 0 , 2 0 , 2 0 , 4 0 , 4 0 , 4 0 , 8 0 , 8 0 , 7 7 , 1 5 4 , 7 7 , 1 5 4 . 1 4 - 6 . 7 8 - 8 6 3 5 5 , 1 0 , 2 0 , 4 0 , 4 0 , 4 0 , 8 0 , 4 0 , 8 0 , 4 0 , 8 0 , 1 6 0 . 1 - 6 4 . 1 - 5 5 , 1 0 , 1 0 , 1 0 , 2 0 , 2 0 , 2 0 , 4 0 , 4 0 , 4 0 , 8 0 , 8 0 , 8 0 , 1 6 0 , 1 6 0 , 1 6 0 , 3 2 0 , > 5 0 0 . 1 - 7 4 . 1 - 5 . 2 7 - 5 . 2 4 - 6 1 3 4 3 5 , 1 0 , 2 0 , 2 0 , 4 0 , 4 0 , 4 0 , 8 0 , 8 0 , 6 3 , 1 2 6 , 6 3 , 1 2 6 , 6 3 , 1 2 6 , 6 3 , 1 2 6 , 2 5 2 . 1 - 8 4 . 1 - 5 5 , 1 0 , 2 0 , 2 0 , 4 0 , 4 0 , 4 0 , 8 0 , 8 0 , 8 0 , 1 6 0 , 1 6 0 , 3 2 0 , > 5 0 0 Table 6.5: Results for Example 6.3 With a Uniform Initial Mesh CHAPTER 6. NUMERICAL EXAMPLES 82 COLNEW RKNEW e k S E( U l ) E(u4) JVtot Mesh Sequence E ( U 4 ) Mot Mesh Sequence .1-7 4 .1-5 .28-5 .27-8 740 77,51,102,51,102,51, 102,204 .1-9 4 .1-5 102,61,122,61,122,65, 130,65,130,122,244, 244,244,488,S09,>500 .24-6 .14-6 667 126,63,126,252 .1-11 4 .1-5 .56-4 36-7 1701 126,63,126,63,126,63, 126,252,126,252,126, 252 Table 6.6: Results for Example 6.3 Using Continuation of Meshes these regions and the boundaries (unlike the previous two examples). The results in Table 6.5 can be extended using a continuation of meshes: the final mesh at one value of e is used as the initial mesh for a smaller value of e. In practice we use the second last mesh (of which the last is a doubling), in order to convey the character of the final mesh with as few points as possible. Results are given in Table 6.6, where the first initial mesh is taken from the smallest value of e in Table 6.5, and subsequent initial meshes are taken from the previous value of e in the table. It is interesting to note that if we decrease £ by .1 instead of .1 — 1, then the step from .1 — 8 to .1 — 9 for R K N E W requires > 500 mesh points. This kind of behaviour is typical for continuation, and we therefore suggest that continuation should only be used to obtain a solution that would not otherwise be found, because of storage or time restrictions. Chapter 7 Conclusions and Further Considerations In this thesis we have presented implementation methods for general purpose solvers of singularly perturbed BVPs. With the new symmetric Runge-Kutta (mono-mial) basis, and an a posteriori mesh selection procedure based on localized error estimates at collocation points, collocation at Gauss points can be used to give efficient high order approximations to extremely stiff problems. These features have been in-corporated within the framework of COLSYS to give an experimental, general purpose code, RKNEW, for linear problems. R K N EW was compared to COLNEW (the most recent version of COLSYS) for numerous linear examples and the results are promising. As the perturbation param-eter, €, gets small, R K NEW almost always outperforms COLNEW, with respect to both the total work required (based on the total number of mesh points in the mesh sequence, i.e. JVtot of Chapter 6), and on the size of largest mesh required. Further-83 CHAPTER 7. CONCLUSIONS AND FURTHER CONSIDERATIONS 84 more, RKNEW recognizes and responds to stiff problems better and is more methodical in the way it chooses meshes. The results are quite dramatic for turning points, and to a lesser degree, for boundary layers. RKNEW does especially well when meshes with very large local non-uniformities of interval sizes are present, a situation that causes COLNEW great difficulty. COLNEW does work better for some non-stiff and mildly stiff problems; however, RKNEW usually still performs adequately in such cases. We concede that the creation of a new mesh is slightly more expensive in RKNEW. The performance of RKNEW is critically affected by the choice of initial mesh. If e is extremely small, a coarse initial mesh may actually "miss" the disturbance, and R K NEW will require very large meshes to recover. Simple non-uniform meshes, con-tinuation of meshes, and the exponential meshes of Ascher & Weiss are all alternatives, if used intelligently. Like most works, this thesis suggests many more problems-than it solves. The most obvious are the extension to non-linearity and replacing the constants, taken from COLSYS for use in global error estimation and mesh equidistribution, with constants more theoretically suitable for singular perturbation problems. Since COLNEW is better for non-stiff or "regularized" problems, a good code should eventually be able to switch between the theory of COLNEW and RKNEW depending on how locally "regularized" the problem is perceived to be; this is an ambitious goal. Finally, more understanding of the theory and practice for singular singularly perturbed and turning CHAPTER 7. CONCLUSIONS AND FURTHER CONSIDERATIONS 85 point problems is definitely required. The success of both codes (without supporting theory) and the superiority of RKNEW for such problems suggests there is much to be gained from research in this area. Bibliography [1] Z. Aktas & H.J. Stetter, "A classification and survey of numerical methods for boundary value problems in ordinary differential equation," Int. J. Numer. Meth. Eng., v. 11, 1977, pp. 771-796. [2] U. Ascher, "Collocation for two-point boundary value problems revisited," SIAM J. Numer. Anal., v. 23, 1986, pp. 596-609. [3] U. Ascher, "On some difference schemes for singular singulary-perturbed boundary value problems," Numer. Math., v. 46, 1985, pp. 1-30. [4] U. Ascher, "Two families of symmetric difference schemes for singular perturbation problems," in Proc. Workshop on Numerical Boundary Value ODEs, U. Ascher and R.D. Russell (Eds.) Birkauser, 1985. [5] U. Ascher &: G. Bader, "Stability of collocation at gaussian points," SIAM J. Numer. Anal., v. 23, 1986, pp. 412-422. [6] U. Ascher, J. Christiansen & R.D. Russell, "A collocation solver for mixed order systems of boundary value problems," Math. Comp., v. 33, 1979, pp. 659-679. [7] U. Ascher, J. Christiansen & R.D. Russell, "A collocation code for boundary value problems," in Springer Lecture Notes in Computer Science 76, B. Childs et al (Eds.), New York, 1979. [8] U. Ascher, J. Christiansen & R.D. Russell, "Collocation software for boundary value ODEs," ACM Trans. Math. Software, v. 7, 1981, pp. 209-222. [9] U. Ascher & R. Weiss, "Collocation for singular perturbation problems I: First order systems with constant coefficients," SIAM J. Numer. Anal., v. 15, 1983, pp. 537-557. [10] U. Ascher & R. Weiss, "Collocation for singular perturbation problems II: Linear first order systems without turning points," Math. Comp., v. 43,1984, pp. 157-187. 86 BIBLIOGRAPHY 87 [11] U. Ascher & R. Weiss, "Collocation for singular perturbation problems III: Non-linear problems without turning points," SIAM J. Sci. Stat. Comput., v. 5, 1984, pp. 811-829. [12] G. Bader & U. Ascher, "A new basis implementation for a mixed order boundary value ODE solver," Tech. Rep. 85-11, University of British Columbia, Vancouver, Canada, 1986. [13] C. de Boor & B. Swartz, "Collocation at Gaussian points," SIAM J. Numer. Anal, v. 10, 1973, pp. 582-606. [14] C. de Boor &; R. Weiss, "SOLVEBLOK: A package for solving almost block diag-onal linear systems," ACM Trans. Math. Software, v. 6, 1980, pp. 80-87. [15] D.L. Brown & J. Lorenz, "A high-order method for stiff boundary-value problems with turning points," Los Alamos Report LA-UR-85-4406, 1985. [16] R. Bulirsch, J. Stoer & P. Deuflhard, "Numerical solution of nonlinear two-point boundary value problems I," Num. Math. Handbook Series Approximation, 1977. [17] K. Burrage, "Stability and efficiency properties of implicit Runge-Kutta methods," Ph.D. Thesis, University of Auckland, New Zealand, 1978. [18] K. Burrage & J.C. Butcher, "Stability criteria for implicit Runge-Kutta methods," SIAM J. Numer. Anal., v. 18, 1979, pp. 46-57. [19] J.C. Butcher, "Implicit Runge-Kutta processes," Math. Comp., v. 18,1954, pp. 50-64. [20] P. Deuflhard, "Nonlinear equation solvers in boundary value problem codes," in Springer Lecture Notes in Computer Science 76, B. Childs et al (Eds.), New York, 1979. [21] L. Dieci & R.D. Russell, "Riccati and other methods for singularly perturbed BVPS," in Proc. BAILI Conference, J. Miller (Ed.), Boole Press, 1986. [22] I. Gladwell, "The development of boundary-value codes in the ordinary differential equations chapter of the NAG library," in Springer Lecture Notes in Computer Science 76, B. Childs et al (Eds.), New York, 1979. [23] P.W. Hemker, "A numerical study of stiff two-point boundary value problems," Tech. Rep. 80, Math. Centrum, Amsterdam, 1977. BIBLIOGRAPHY 88 [24] H. Kreiss, N. Nichols & D. Brown, "Numerical methods for stiff two-point bound-ary value problems," SI AM J. Numer. Anal., v. 23, 1986, pp. 325-368. [25] M. Lentini & V. Pereyra, "An adaptive finite difference solver for nonlinear two-point boundary value problems with mild boundary layers," SIAM J. Numer. Anal., v. 14, 1977, pp. 91-111. [26] V. Pereyra & E.G. Sewell, "Mesh selection for discrete solution of boundary value problems in ordinary differential equations," Numer. Math., v. 23, 1975, pp. 261-268. [27] R.D. Russell, "Mesh selection methods," in Springer Lecture Notes in Computer Science 76, B. Childs et al (Eds.), New York, 1979. [28] R.D. Russell & J. Christiansen, "Adaptive mesh selection strategies for solving boundary value problems," SIAM J. Numer. Anal., v. 15, 1978, pp. 59-80. [29] R. Weiss, "An analysis of the box and trapezoidal schemes for linear singularly perturbed boundary value problems," Math. Comp., v. 42, 1984, pp. 41-67.