UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Adjoint data assimilation in an equatorial coupled atmosphere-ocean model Lu, Jingxi 1997

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-ubc_1997-251004.pdf [ 6.19MB ]
Metadata
JSON: 831-1.0088066.json
JSON-LD: 831-1.0088066-ld.json
RDF/XML (Pretty): 831-1.0088066-rdf.xml
RDF/JSON: 831-1.0088066-rdf.json
Turtle: 831-1.0088066-turtle.txt
N-Triples: 831-1.0088066-rdf-ntriples.txt
Original Record: 831-1.0088066-source.json
Full Text
831-1.0088066-fulltext.txt
Citation
831-1.0088066.ris

Full Text

A D J O I N T D A T A ASSIMILATION IN A N E Q U A T O R I A L C O U P L E D A T M O S P H E R E - O C E A N M O D E L by Jingxi Lu B. Sc. (Meteorology) Nanjing Institute of Meteorology, 1982, M . Sc. (Atmospheric Dynamics) Institute of Atmospheric Physics, Chinese Academy of Science, 1987 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES EARTH AND OCEAN SCIENCES We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA August 1997 © Jingxi Lu, 1997 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Earth and Ocean Sciences The University of British Columbia 2075 Wesbrook Place Vancouver, Canada V6T 1Z1 Date: 11 Abstract A general procedure of adjoint data assimilation for atmosphere-ocean coupled systems was developed. A simple equatorial coupled atmosphere-ocean model, with the atmosphere and the ocean each represented by a single-layer linear shallow water model, was then used to explore the effects of data type (wind or sea level height (SLH)), data sparsity, noise, and initial guess on the retrieval of model parameters and initial conditions. Six parameters (two Rayleigh damping coefficients, two Newtonian cooling coefficients, and two coupling parameters) of the simple model were estimated with its initial conditions specified. By assimilating the wind and SLH data, the temporal sparsity of the data was found to be more detrimental for parameter estimation than their spatial sparsity, and sparse wind data were more detrimental than sparse SLH data. Longer assimilation windows improved the parameter estimation, but the best window length for the wind data was not the best for the SLH data. A priori information for individual parameters as implemented in the cost function was useful in providing information for the size of the parameters and in enhancing the convexity of the cost function, but not as a substitute for inadequate data. Three initial oceanic fields (SLH and two horizontal current components) in the simple coupled model were estimated with the six parameters specified. By assimilating either the wind or SLH data, it was found that the SLH data were more efficient in retrieving the oceanic initial conditions than the wind data, and the initial SLH field was more accurately retrieved than the initial currents. The current fields were sensitive to the temporal density of data, especially with wind data, where once a day would be the minimum density needed for a satisfactory retrieval. The error in the magnitude of the initial guess was readily corrected, while the large phase error was not even with both the wind and SLH available. Assimilation of noisy data showed that the retrieval of the initial conditions was far more sensitive to noise in the SLH data than in the wind data. Ill Table of Contents Abstract ii List of Tables v List of Figures vi Acknowledgements x Chapter 1 Introduction 1 1.1 ENSO Forecasting 1 1.2 Data Assimilation 3 1.3 Motivation and Objectives , 10 Chapter 2 Adjoint Data Assimilation Theory and Application 15 2.1 Optimal control theory 15 2.2 Optimal Control Applied to Data Assimilation 15 2.2.1 Mathematical Formulation 16 2.2.2 Weak and Strong Constraint Formalisms and the Augmented Lagrangian 16 2.3 Adjoint Method 17 2.4 Minimization Algorithms 19 2.4.1 First Derivative Algorithms 20 2.4.2 Second Derivative Algorithms 22 2.4.3 The Role of the Hessian in Minimization Algorithms 23 2.4.4 Important Issues in Variational Adjoint Procedure Associated with Mriimization 25 Chapter 3 Adjoint Data Assimilation in Coupled Model Systems 3 0 3.1 A Generic Procedure for Parameter Estimation and Initialization in Coupled Systems 30 iv 3.2 Determining Parameters and Initial Conditions in a Local Thermal Equilibrium Model 33 3.2.1 Model 33 3.2.2 Adjoint Model 35 Chapter 4 Determining Model Parameters 3 8 4.1 Experiments 39 4.2 Results 43 4.2.1 Impact from the amount of data on parameter estimation 43 4.2.2 Impact of noisy data on parameter estimation 53 4.2.3 Improving parameter estimations 55 4.3 Conclusion 67 Chapeter 5 Determining Model Initial Conditions 71 5.1 Experiments 73 5.2 Results 74 5.2.1 Information transfer between the atmosphere and the ocean 74 5.2.2 Impact of initial guess on the initialization 92 5.2.3 Impact of noisy data on initialization 99 5.3 Conclusion ....103 Chapter 6 Summary and Discussions 108 Bibliography 113 Appendix A Derivation of the Continuous Adjoint Coupled Model 125 Appendix B Numerical Discretization 129 Appendix C Verifying Gradient Calculations 131 Appendix D Summary of Experiments and Results 133 List of Tables v 1 Parameters used in the coupled atmosphere-ocean model for parameter estimation 42 2 Parameters used in the coupled atmosphere-ocean model for retrieving initial conditions 73 1 A summary of the experiments performed to retrieve the three oceanic initial conditions from assimilating wind (w) and/or SLH (h) data. Data assimilation period was 90 days. The retrieval quality for the three oceanic initial conditions is classified as 133 2 A summary of the experiments performed to retrieve the six parameters from assimilating wind (w) and/or SLH (h) data. Unless specifically mentioned, the data assimilation period was 40 days. The retrieval quality is classified as 134 vi List of Figures 2.1 Illustration of the steepest descent method to obtain the optimum for a two dimensional problem 21 4.1 Spatial structure of (a) the oceanic zonal velocity component (cms-1) and (b) the depth of the thermocline (cm) after integrating the coupled model for 40 days using parameters in Table 1. Shaded regions indicate westward motion in (a) and elevated thermocline in (b) 40 4.1b 41 4.2 Change of the total cost function J and the gradient norm g = Igl with iterations during the minimization for a 5-day direct inversion. The subscripts UV, U and V refer respectively to the three experiments where wind components ua and va, or only ua, or only va , are available everywhere for assimilation. Both the cost function and the gradient norm have been normalized with respect to their initial values (JO and gO) before plotting 45 4.3 Relative estimation error of 6 parameters for a 40-day assimilation with (a) the wind available every 90 or 180 time steps at every grid point, and SLH available everywhere for all time, (b) SLH available every 8 or. 16 time steps at every grid, and the wind available everywhere for all time. The atmospheric model has 90 time steps per day and the oceanic model has 8 time steps per day. The two estimation errors for each parameter are from two data sparsity experiments (90 or 180 time steps for (a) and 8 or 16 time steps for (b)) respectively 46 4.3b 47 4.4 Relative estimation error of 6 parameters for a 40-day assimilation with data distributed every 10th grid zonally. Meridionally between 100 N and 300N and between 100S and 300S, data are available every 15th grid, and between 100S and 100N, every 2nd grid. The temporal data distribution is (a) every 90 or 180 time steps for the wind and every time step for the SLH, or (b) every 8 or 16 time steps for the SLH and every time step for the wind 50 4.4b 51 Vll 4.5 Relative estimation error of 6 parameters for a 40-day assimilation with the wind the same as in Fig.4.4b, the SLH available once every 8 time steps and every 10 grids zonally at 250N and 250S latitudes (experiment SI), and SLH available at three zonal grid points (30, 100 and 170) at 0.50N (experiment S2) 52 4.6 Relative estimation error for 6 parameters for a 40-day assimilation with the wind and SLH data everywhere. Experiment N l has 10 % noise added to SLH, N2, 10% noise added to wind, and N3, 10% noise added to both 54 4.7 Relative estimation error for 6 parameters for the 5, 20 and 40-day assimilation runs, (a) with the wind data available for all grid points once a day and SLH available everywhere, (b) with SLH available for all grid points once a day and wind data available everywhere 56 4.7b 57 4.8 Iteration history of the cost function and the gradient norm 59 4.8b 60 4.8c 61 4.8d 62 4.9 Iteration history for an experiment identical to experiment SI shown in Fig. 5, except with an a priori estimate term for a in the cost function 63 4.9b , 64 4.9c... 65 4.9d 66 4.10 Relative estimation error for the 6 parameters for various windows, with data availability identical to the case of assimilation every 90 time steps in Fig. 4.4a. A priori information was included for all parameters for the 20 and 40 day experiments 68 4.11 Spatial structure of the depth of the thermocline (cm) after integrating the coupled model for 40 days using damping and coupling parameters that are a factor of 5 larger than the parameters in Table 1. Shaded regions indicate elevated thermocline 70 Vlll 5.1 Spatial structure of the thermocline depth perturbation hO after integrating the coupled model for 10 days using the parameters in Table 2. This will be taken as the ocean's initial state for the 90-day control run and other experiments 74 5.1b Temporal and spatial structure of thermocline perturbation along the equator. Shaded regions indicate elevated thermocline. The depth perturbation is against the equivalent depth dO (see Table 2). Since in this model, the SLH is related to the thermocline depth perturbation by a constant factor, we will also refer to the hO field as the SLH field 75 5.2 The (a) RMSE and (b) correlation skill in retrieving the initial conditions hO, the (c) RMSE and (d) correlation skill in retrieving uO, and the (e) RMSE and (f) correlation skill in retrieving vO, for a 90-day assimilation run with wind or SLH available every 1, 10, 30, or 90 days at every grid point. For one day of integration, the ocean model took 8 time steps, while the atmospheric model took 90 steps. The labels w and h denote, respectively, experiments where wind or SLH data were available at every grid point in the model. The label al&2 means that the RMSE or correlation was calculated over the entire model domain, while al or a2 means that the values were only over area 1 (the TAO array area) or area 2 (the rest of the model domain). 77 5.2b 78 5.2c 80 5.2d 81 5.2e 83 5.2f 84 5.3 Same as Fig. 5.2 except for assimilating data at every 2nd grid in the meridional direction and every 15th grid point in the zonal direction in area 1 86 5.3b 87 5.3c 88 5.3d 89 5.3e 90 IX 5.3f .. 91 5.4 Difference in the initial (a) hO, (b) uO and (c) vO perturbations between case 3 and the control run 93 5.4b 94 5.4c 95 5.5 (a) RMSE and (b) correlation from retrieving three initial conditions from case 1 and case 2 by assimilating wind or SLH data at day 90 in area 1, and from case 3 and case 4 by assimilating once every day both wind and SLH data in area 1 97 5.5b 98 5.6 (a) RMSE and (b) correlation from retrieving three initial conditions (hO, uO and vO) for a 90-day assimilation with noisy wind or both noisy wind and SLH data available once every 1 or 10 days at every 2nd grid in the zonal direction and every 15th grid in the meridional direction in area 1 100 5.6b 101 5.6c 102 5.7 (a) RMSE and (b) correlation from retrieving three initial conditions (hO, uO and vO ) for a 90-day assimilation with noisy wind or both noisy wind and SLH data available once every 1 or 10 days at every 2nd grid in the zonal direction and every 15th grid in the meridional direction in area 1 104 5.7b 105 C Verification of the gradient computation. Note that F(e)=l+0(e) indeed holds until round-off errors become important for very small e 132 X Acknowledgments I deeply thank William Hsieh, my supervisor, for his guidance, encouragement, and support during the entire course of my five-year graduate studies. He has provided such a creative academic atmosphere that I have freely developed my interest and built up my knowledge in my studies. His patient guidance has extended to the full spectrum of my academic and social life. I am sincerely thankful for the insightful recommendations and comments from the members of my committee — Susan Allen, Phil Austin, and Mike Foreman. Special thanks go to Ding Yihui at the National Climate Center of China, who drew my attention to tropical atmosphere-ocean dynamical problems in 1984, to Denis Laplante, who has patiently helped me with computer problems starting from the first day I came to our lab, and to the Environment Canada and the Natural Sciences and Engineering Research Council of Canada for having financially supported me through research grants to W. Hsieh for five years. I am grateful to Ron C. Pacanowski for kindly providing us the GFDL C-grid shallow-water model code, to Ralf Giering for his AMC compiler in helping us to develop the adjoint code, and to M . K. Davey for his insightful comments on my paper. Chapter 1. Introduction 1 Chapter 1 Introduction 1.1 ENSO Forecasting The El Nino-Southern Oscillation (ENSO), a coupled atmosphere-ocean oscillation centered around the tropical Pacific, plays a major role in the short-term climate variability ( Horel and Wallace, 1981; Barnett, 1981; Shapiro, 1982; Gray et al., 1993; Eisner and Schmertmann, 1993; Ropelewski and Halpert, 1986), with important influences on the economy of many countries in the world. Forecasting ENSO at some lead time began in the mid 1980s (Inoue and O'Brien, 1984, Barnett et al., 1988). Three approaches, i.e., statistical, dynamical, or combined statistical-dynamical approaches have led to a variety of ENSO prediction models. Statistical approaches, including the canonical correlation analysis (CCA) ( Graham et al., 1987; Barnston and Repelewski, 1992), the principal oscillation patterns (POPs) ( Xu and von Storch, 1990), the inverse method (Penland and Magorian, 1993), the singular spectrum analysis - maximum entropy method (Keppenne and Ghil, 1992), or the neural network method (Tangang et al., 1997), were generally much less costly to run than dynamical models. While empirical prediction schemes may succeed to some extent, a proper understanding of the relationships on which they are based, together with the development of truly quantitative forecasts with error or probability information, requires coupled models. Hybrid models use a combined statistical-dynamical approach by coupling a statistical atmosphere, derived from an empirical relationship between sea surface temperature (SST) anomalies and wind anomalies, to a dynamical ocean model (Inoue and O'Brien, 1984; Graham et al., 1992; Neelin, 1990; Latif and Flugel, 1991; Barnett et al., 1993). Hybrid models are effective because the atmospheric response to the tropical oceanic boundary conditions can be Chapter 1. Introduction 2 modeled empirically with moderate success. Physical coupled models can be classified into two types: (1) a simple atmosphere coupled to either a simple ocean (henceforth referred to as a simple coupled model) (Cane et al., 1987), or to an Ocean General Circulation Model (OGCM) (Latif et al., 1993 a,b), and (2) a fully coupled GCM, i.e. an Atmosphere GCM (AGCM) coupled to an OGCM (Ji et al, 1994). Barnston et al. (1994) surveyed five operational or potentially operational models representing all three approaches — (1) the Lamont-Doherty Earth Observatory simple coupled model (hereafter referred to as Cane-Zebiak model, Zebiak and Cane, 1987) with both a simple atmosphere and a simple ocean, (2) the Scripps Institute for Oceanography-Max Planck Institute for Meteorology hybrid coupled model (Barnett et al., 1993) using an OGCM coupled to a statistical atmospheric model, (3) the National Center for Environmental Prediction (NCEP) comprehensive coupled model (Ji et al, 1994) using full GCMs for both the ocean and atmosphere, (4) the Climate Prediction Center (CPC) CCA model (Barnston and Repelewski, 1992), and (5) the CPC empirically constructed analog model (Van den Dool, 1994). Though specific differences (e.g. seasonality, geographical distribution as well as its decay with forecast lead time, etc.) can be found, the predictive skill of these models are roughly comparable at six-month lead time. The Cane-Zebiak model, widely used for seasonal forecasts of the tropical Pacific SST, on average succeeded in making predictions out to 1 year if a correlation of 0.5 is considered useful, or to 9 months if the criterion is a correlation of 0.6 (Palmer and Anderson, 1994). Simple coupled models use carefully chosen abbreviated versions of the full equations of atmospheric and oceanic motions and interactions. The simplifications bring many advantages, e.g. the models are efficient to run and have physics which is easy to understand (Matsuno, 1966; Gill, 1982). Their recent ability to produce forecasts useful at prediction times longer than previously thought possible is largely attributed to rapid advancement in data observing systems, increased understanding of the ENSO atmosphere-ocean interaction mechanisms, and their representation in coupled models. However, many puzzling aspects of ENSO forecasts remained Chapter I. Introduction 3 to be understood: (1) the predictability of the.ENSO system depends on seasons, i.e., forecasts initiated in winter are less skillful than those initiated after spring (see e.g. Fig. 12 on the skill as a function of the seasons in Palmer and Anderson, 1994), (2) The existing simple models do not in general perform well for all regions of the tropical Pacific ~ the coastal region off Peru and the western Pacific are less accurately predicted than the central and eastern equatorial Pacific. (3) There are certain decades for which the ENSO prediction skills are low. The prediction errors are affected by the correctness of the physical simplifications of the atmospheric and oceanic dynamics, the process of the atmosphere-ocean interactions, and initializations, etc. It is presently not known how much of the prediction errors can be ascribed to model errors, and how much to forcing errors. Apparently, to improve simple model ENSO forecasting, efforts should be made towards validating the models with real atmospheric and oceanic data and initializing the models by data assimilation. 1.2 Data Assimilation Meteorological data assimilation has been driven largely by the crucial need to specify the initial conditions of a numerical weather forecast. Initialization is a process that can provide diagnostic constraints used to generate approximate but model-consistent data that are not available from the observational network. The resulting balanced initial conditions will damp out the spurious high-frequency oscillations in the integration of the forecast model. The procedure which takes observed spatially and temporally heterogeneous data and creates spatially and temporally homogeneous fields, is usually called assimilation or analysis. The fields produced by an analysis or an assimilation must satisfy two basic requirements. First, they must be close to the observations, ideally to the accuracy of the observations themselves, at the required spatial and temporal scales. Second, they must verify the dynamical and/or statistical relationships which are known to be satisfied by the atmosphere and/or ocean. Meteorologists have worked on data analysis and assimilation for over 30 years (Daley, Chapter 1. Introduction 4 1991; Lorenc, 1986; Navon, 1986; Bengtsson et al, 1981). Most schemes belong to one of three basically different classes of algorithms: (1) Local spatial interpolation, (2) Statistical (optimal) interpolation, (3) Variational analysis. The classification of an algorithm to be one of the three depends on how information on the dynamical and/or statistical properties of the fields is introduced into the assimilation algorithm. The local spatial interpolation, also called objective analysis, was first introduced by Panofsky (1949). Information on the meteorological fields is introduced under a very indirect and simplified form of an "influence function" which characterizes the spatial scale of variation of the fields. The local spatial interpolation is both very simple and cheap to implement. The statistical interpolation algorithms are in effect linear regression algorithms, where past experience about the behavior of the atmosphere is used as the main source of information for determining the interpolation weights. The fields produced are the linear combinations of the observations which minimize the mean quadratic error over a large number of realizations of the analysis. Eliassen (1954) first proposed the method based on spatial autocorrelation functions of the pressure field, while Gandin (1965) developed the method for operational use. This class is now widely used, especially as part of the assimilation procedure for large-scale prediction models (Lorenc, 1981), and is at present the technique which produces the best results for operational weather forecasts. One main advantage of statistical interpolation is that it produces a practical and internally consistent approach for treating a large set of heterogeneous observations. The disadvantage is that it is computationally expensive to implement because of having to solve large systems of linear equations for the interpolation weights. The statistical interpolation can be extended to the time dimension by using an explicit dynamical model for temporal evolution of the atmospheric and oceanic flows. For linear models, the extension is the Kalman-Bucy filtering (K-B) (Kalman, 1960; Kalman and Bucy, 1961), where the estimation error variance is minimized to compute the gain matrix. The K-B filter has been applied to Chapter 1. Introduction 5 various simple models in meteorology and oceanography (e.g. Ghil et al, 1981; Miller, 1986; Budgell, 1986, 1987, Daley, 1993), as the computational cost would be prohibitive for complex models. ' The variational analysis algorithms require that the analyzed field satisfy an explicit dynamical constraint when a given measure of the "distance" between the analysis and the observations is minimized. The constraint will normally be expressed by one or more differential equations. These methods have the great theoretical advantage of providing consistency between the analysis and the dynamics. But their extensive mathematical technicality and their high computational cost have greatly limited their use. Although the potential usefulness of variational methods for meteorological problems was studied very early by Sasaki (1955, 1958, 1970), these methods have not been fully utilized until recently. This is especially true of assimilation studies addressing problems with explicit time derivatives, see e.g. Lewis and Bloom (1978). Variational techniques have thus been applied primarily to the analysis of mesoscale fields and initialization (Daley, 1978; Soliz and Fein, 1980; Tribbia, 1982; Temperton, 1982). Oceanographic data assimilation, driven by the need to deepen and broaden our understanding of ocean circulation/dynamics, emphasized model parameter estimation, formal testing of the models against the data, verifying solution errors arising from the errors inherent in the models and in the data. It is largely due to the smaller number and more uneven distribution of oceanographic data (104 times fewer than atmospheric data) and the less advanced state of oceanographic models. Ghil and Malanotte-Rizzoli (1991) had reviewed the status of oceanographic data assimilation for the beginning of the 1990's. A recent book presents a comprehensive picture of the state-of-the-art of data assimilation in oceanography in the mid 1990's (Malanotte-Rizzoli, 1996). There have been basically two approaches to oceanographic data assimilation: The first focuses on using more complex and realistic models to assimilate data with methodologically simple and efficient techniques, e.g. the blending, direct insertion, and nudging methods. The second is to develop and use sophisticated assimilation techniques with simple models, e.g. the Chapter 1. Introduction 6 sequential estimation methods with Kaman filter as the prototype, and the variational analysis mostly based on the adjoint method. In a broader sense, the first attempts to assimilate data into ocean models were the dynamical methods used to produce maps of currents from hydrographic observations in Sandstrom and Helland-Hansen (1903), as geostrophic shear can be directly established from density data. These early attempts at data assimilation were followed by more sophisticated models e.g. Stommel and Schott's (1977) beta spiral or Wunsch's (1977, 1978) inverse method. The blending technique is a highly simplified and localized version of optimal interpolation (01), with purely empirical weights. At assigned times (windows), the observed or forecast field variable/at a given grid point is replaced by a new variable y"ew which is a blending of the two,/m o d e l and/*5 f"ew = afobs + (l-a)fmo<ie\ where a is the weight for the observed value. Blending the model-evaluated variables and the observations was first applied to oceanographic problems in limited-ocean domains (Robinson and Leslie, 1985; Rienecker et al., 1987). Clancy et al. (1988) applied the method to larger domains but used a simple mixed-layer model. Leetmaa and Ji (1989) have used the method for operational hindcasting of the tropical Pacific. Derber and Rosati (1989) have applied the method to the global ocean. When a = 1, the method becomes the direct insertion of the observed value in place of the model-predicted value (Kindle 1986; Thompson 1986; Hurlburt 1986; Malanotte-Rizzoli and Holland 1986, 1988). The nudging (or relaxation) scheme was first introduced in meteorology by Anthes (Anthes, 1974). It was introduced for oceanographic data assimilation by Verron and Holland (1989) and by Holland and Malanotte-Rizzoli (1989). The technical problems using the nudging method have been the difficulties in determining optimally the nudging coefficients. The implementation of the nudging scheme is otherwise easy. Optimal interpolation was often used in oceanography with an objective analysis purpose. Chapter 1. Introduction 7 Marshal (1985a,b) applied 01 to the problem of determining the ocean circulation by assimilating satellite altimetry data. White et al. (1990a,b) also used 01 to assimilate first simulated and then GEOSAT altimetric sea-level observations continuously into a QG eddy-resolving ocean model. Multivariate statistical objective analysis of the 01 type was also applied by Carton and Hackert (1989) to the circulation of the tropical Atlantic Ocean. Many oceanographic data assimilation studies have used the K-B filter (Budgell, 1986, 1987; Miller, 1986, 1989; Bennett and Budgell, 1987, 1989; Gasper and Wunsch 1989, Miller and Cane, 1989, Miller and Ghil, 1990; Fu et al, 1991, 1993). The adjoint data assimilation technique is a novel methodology with great merits to meeting the needs of oceanographic data assimilation. This method overcomes the intrinsic problem of solving the resulting Euler-Lagrange equations of the problem required by the traditional variational analysis proposed by Sasaki (1958). Instead of solving the Euler-Lagrange equations, the variational assimilation problem (in particular for the 4-D problem) is now solved by iteratively minimizing the cost function along a descent direction of the gradient calculated from the adjoint equations of the local tangent linear equations of the original model equations. The idea of applying adjoint equations to meteorological problems was first suggested by Marchuk (1974). Some early studies can be found in Penenko and Obraztsov (1976), Sadokov and Shteinbok (1977), Cacuci (1981), Marchuk (1981), and Hall et al. (1983). Lucid introductions on how to use adjoint equations to meteorological problems can be found in Lewis and Derber (1985), Le Dimet and Talagrand (1986), and Talagrand and Courtier (1987). The adjoint method has been applied to meteorological models with varying degrees of complexity (Errico and Vukicevic, 1992; Navon et al., 1992a; Thepaut et al., 1991, 1993; L i et al., 1994). Adjoint application to data assimilation in oceanography lagged its atmospheric counterpart, though the Lagrange multiplier method which is essentially equivalent to the adjoint method, was pointed out to be useful by Thacker as early as 1987. To date, its applications can be sorted into three groups — initialization, ocean circulation investigation, and parameter estimation. Chapter 1. Introduction 8 As the experimental forecasts of tropical oceanic variability on seasonal and interannual time scales are evolving to operational ones, initialization becomes crucial in improving the performance of the numerical forecast models. The feasibility studies by many researchers, e.g., Thacker and Long (1988), Thacker (1988), Bennett and Miller (1991), Long and Thacker (1989a, 1989b), and Sheibaum and Anderson (1990 a,b), have shown the potential of oceanic model initialization using the adjoint method. Ocean dynamics are characterized by a broad range of temporal scales. The high frequencies, associated with the gravity waves, set the upper limit for the size of the time step for most numerical models; the low frequencies, associated with the slow process in the establishment of the oceanic equilibrium circulation, determine the number of time steps needed to spin up the model. The need to find an optimal method to efficiently compute the steady state arose, as the conventional method is very expensive. Efforts have been made by the groups at AOML (Atmospheric and Oceanic Marine Laboratory, Miami) and MIT to determine whether a state of the North Atlantic ocean can be estimated which is consistent with both the observations and the North Atlantic models in a dynamic steady state ( Tziperman and Thacker, 1989; Tziperman et al., 1992; Marotzke, et al., 1992; Bergamasco et al., 1993). The Harvard Oceanography Group has applied the adjoint method to the Harvard quasi-geostrophic open-ocean model (Robinson and Walstad, 1987) to study the characteristics of the Gulf Stream. Moore (1991) studied the ability of the adjoint method in correcting large errors in the speed and the position of the Gulf Stream jet by assimilating GEOSAT sea surface height observations. The adjoint method has also been used to investigate the fastest growing unstable modes of a Gulf Stream-like jet (Farrell and Moore, 1992). The application of the adjoint method by J. J., O' Brien's group at FSU addresses the issue of parameter estimation (Panchang and O' Brien, 1989, Smedstad and O' Brien, 1990, Yu and O' Brien, 1991). Although oceanic models have become quite sophisticated in recent years, they still cannot accurately represent the state of the ocean. This is largely because of the many uncertainties of the model inputs, such as eddy-mixing coefficients, surface wind forcing, Chapter 1. Introduction 9 surface heat and fresh water fluxes, etc. Normally, there is no direct information on many of these input parameters in the oceanic measurements. The purpose of parameter estimation is to deduce the unknown model inputs from the existing data (wind, temperature, salinity, currents, etc.) with the aid of the numerical models and, at the same time, to obtain an optimal estimation of the observed fields. Besides the work done by O' Brien's group, there have been other studies on parameter estimation, e.g., Schroter (1989), Das and Lardner (1991), and Lardner (1992). It should be noted that, although various algorithms introduced above were conceived and developed independently, they are mutually related. For example, the interpolation methods of Gilchrist and Cressman (1954) and the successive correction methods introduced by Bergthorsson and Doos (1955) and Cressman (1959) may be looked upon as an empirical approximation to the statistical interpolation method. In these methods the interpolation weights are computed explicitly without solving a system of linear equations, so the number of computations is relatively low compared with the statistical interpolation method. Kimeldorf and Wahba (1970) have shown that the statistical interpolation method produces fields which are the solution of a variational problem in which the function to be minimized is the sum of two terms, with one term representing the distance to the observations and the other, some measure of the smoothness of the fields. The so called forward-backward data assimilation by Morel et al. (1971) is a special case of the adjoint algorithm. In their approach, the model is integrated forward and backward repeatedly over time to obtain an adjustment of the model to the observations. In the variational approach, the model itself is integrated forward, but the adjoint of the model is used in the backward integration. Thacker (1986) discussed the connection between Kalman filtering and the variational approach for data assimilation using a linear model. The Kalman filter can be thought of as an algorithm to solve the variational equations. In summary, in dealing with the problems of validating and initializing the simple models, the variational analysis method using the adjoint technique appears superior to other methods because model dynamics and data are tightly constrained and can be determined (or estimated) together. Chapter 1. Introduction 10 1.3 Motivation and Objectives The linear shallow-water model describes the motion of a shallow, rotating layer of homogeneous incompressible fluid. The fluid is hydrostatic, i.e. the pressure in any depth of the layer equals the weight of fluid above (Pedlosky, 1979). ^--Jkxu = -gVh + F at ^ + dVu = G, dt where u is the horizontal velocities, h is the small deviation of the top surface elevation, the mean value of which is denoted by d. G and F represent damping and external forcing. In the absence of dissipative processes and forcing, atmospheric motion can be decomposed into a summation over normal modes, each of which is the product of a height-dependent (vertical) mode and a function depending only on the horizontal coordinates and time. For each of these vertical modes, variations in the horizontal position and in time are governed by the shallow-water model above but with a different d, called 'equivalent depth' of the fluid for each vertical mode. Without vertical mixing, a similar decomposition can be applied to the oceanic motion (Lighthill, 1969). An important contribution to the atmospheric motion will be expected from modes with inverse vertical wavenumber m-' of the same order as the scale of the forcing function. In the tropical atmosphere, Diabatic heating tends on average to have a maximum at 5 km. This can be taken to be as a representative value for m 1 . The wave speed for such a mode in the absence of rotation is about 60 ms-' (Matsuno, 1966, Gill 1980). A prominent feature of the equatorial ocean is that the sharp pycnocline, at an average depth of 100 m, can move vertically a considerable distance in a short period of time. This is taken to be the depth for the first mode of the oceanic motion. The wave speed in the absence of rotation is about 1.4 ms1. Most studies of ENSO address two different questions: What is the response of the ocean to the changes in surface winds observed during ENSO ? What is the response of the atmosphere Chapter 1. Introduction 11 to the high sea surface temperatures observed during ENSO. It is believed that unstable atmosphere-ocean interactions play a key role in ENSO events. The best way to study the unstable atmosphere-ocean interactions is to use a coupled atmosphere-ocean model. It is natural and quite realistic to start with a simple coupled model with the atmosphere and ocean each represented by a linear shallow-water model. Philander et al (1984) developed the GFDL (Geophysical Fluid Dynamics Laboratory) simple coupled atmosphere-ocean model to show how an initial perturbation in the sea surface height develops in an unstable coupled situation: dt a rcos</> dX a dv dt J a r d(j) dt rcos</> dA d(j) dun , gn dhn "if ~ f v" + ~ ~ ^ l f = ~ a u ° + 7 U ° dt rcos0 dA l t + f U ° + ^ = -aV° + YV° | d0 {du0 | <?(v0cos4>)) = b h ^ dt rcos(j) dX d(f) where the atmospheric and oceanic variables are denoted by the subscripts "a" and "o" respectively. This simple equatorial coupled model parameterized the tropical atmosphere-ocean coupling where changes in the thermocline depth in the wind-driven ocean affect the sea-surface temperature which, in turn, heats (or cools) the atmosphere. For the atmosphere, ua and va are the zonal and meridional velocity components, ha is the depth perturbation against the equivalent depth da, f is the latitude-dependent Coriolis parameter, and the coordinates (r,A,0) are the radial distance, the longitude and the latitude, respectively. The parameter for Rayleigh friction is A , and that for Newtonian cooling is B. The reduced gravity is ga and a is the thermodynamic coupling parameter. For the ocean, ua and v0 are the zonal and meridional currents and ho is the depth perturbation against the equivalent depth d0. Motion is damped by Rayleigh friction and Chapter 1. Introduction 12 Newtonian cooling with respective parameters a and b. The reduced gravity is g0 and y is the dynamic coupling coefficient. It has been shown that the most important parameters in this model are the coupling parameters a and y where increasing a and y increases the unstable growth rate of this coupled system (Philander, et a l , 1984). The derivation of a ( lO 2 s-') is based on the following assumptions: (1) The air is always saturated by the water vapor supplied from the ocean surface. (2) The sea surface temperature is 300° K (the mean sea surface temperature in the tropical Pacific). (3) The atmosphere is warmed by diabatic heating rate of 1° K per day. y (4.5 x 1(T7 s_1) is estimated by assuming that the mixed-layer in the ocean is 20 m thick, the drag coefficient 1.5 x 10~3, and the typical wind speed 5 ms1. The damping nature of the wave modes are determined by the damping parameters A,B,a, b, of which A and B are the most important since the unstable growth rate of this coupled system decreases with increasing A and B (Yamagata, 1985). The values taken for A, B are based on the fact that the oceanic and the atmospheric disturbances may grow with the growth rate of about 1/20 day. Though this model has many flaws, the principal result from this model is that unstable atmosphere-ocean interactions in the tropics can permit modest initial anomalies to grow temporally and spatially. Therefore this model can describes the developing stage of an ENSO event. The main flaw of this model is that the heat lost by the ocean leaves the ocean unaffected and an increase in thermocline depth necessarily causes heating of the atmosphere. Since there is no feed-back mechanism in the model to modulate the ENSO cycle, the winds continue to strengthen and in turn drive the ocean initial anomalies to grow indefinitely. This model can only be run on an intraseasonal scale. In reality, unusually warm surface water releases an exceptional amount of water vapor to the atmosphere but the heating of the atmosphere is not necessarily local. It occurs where the water vapor rises and condenses. There is a local heating of the atmosphere over the warm anomaly only if there is rising air over the anomaly. Therefore atmospheric convergence related to both the mean and anomaly winds must be considered in the atmospheric model. Including mean wind in the model means that the seasonal cycle which is believed to be closely related to ENSO, is also considered. Assuming that the sea-surface temperature is a simple function of the Chapter 1. Introduction 13 depth of the thermocline is not realistic. In reality the sea-surface temperature is strongly influenced by advection so that sea-surface temperature gradients are important. These gradients are largest in the eastern equatorial Pacific and small in the western equatorial Pacific. It suggests that the ocean model can be improved by specifying the observed sea-surface temperature gradients and by letting the current in the model advect these gradients. Al l these physical processes have been considered in the Cane-Zebiak simple coupled atmosphere-ocean model with a more realistic coupling scheme between the atmosphere and the ocean, i.e. the atmosphere gets heated where convergence of wind occurs, while the wind driven by the heating in turn drives the ocean. The currents can advect the sea surface temperature gradients. This model can predict operationally ENSO events (Zebiak and Cane, 1987). Two aspects of simple models — the highly simplified processes of tropical atmosphere-ocean interactions and associated parameterizations, and the simple way of initialization — are detrimental to model performance. Validating the parameterization of the simple models and initiating the simple models by assimilating data in both the tropical ocean and atmosphere are important for improving prediction skills of simple models (Zebiak, 1989, 1990; Goswami and Shukla, 1991; Allen and Davey, 1994; Dewitte and Perigaud, 1996; Perigaud and Dewitte, 1996). However, data assimilation for tropical coupled models has until recently received relatively little attention, largely due to the sparse and poor quality tropical data. An optimal and efficient data assimilation scheme should be found for assimilating the asynchronous and asynoptic data of mixed types in the tropics so as to improve the simple models. The adjoint/optimal control technique, capable of optimally extracting information from data to estimate unknowns of numerical models — e.g. model parameters and initial conditions — has recently received increasing attention (Courtier, et al. 1993). However, most of these studies used single atmospheric or oceanic models. Many practical problems of this method are still being investigated, e.g., ill-conditioning, discontinuity, and nonlinearity. In this thesis, an attempt is made to explore the potential of the adjoint method in validating and initializing the simple models by using the GFDL (Geophysical Fluid Dynamics Chapter 1. Introduction 1 4 Laboratory) simple coupled atmosphere-ocean model. By assimilating tropical wind and/or sea level height (SLH), this study first focuses on determining the 6 model parameters (A, B, a, b, a and y) from sparse and noisy wind and SLH. Then the initialization problem is investigated by looking at the effects of the wind and/or SLH, their sparsity and noise, and initial guesses (SLH and currents) on retrieving the oceanic initial conditions. The adjoint method used in study can be extended to a more realistic model like the Cane-Zebiak model, and the results obtained in this study will provides useful information for further applying the method to operational models. This thesis is organized as follows. In chapter 2, we present the optimal control theory, parameter estimation and initialization using the adjoint method. The commonly used minimization algorithms that are an important part of the adjoint data assimilation, are also briefly described. In chapter 3, a generic procedure extending the adjoint method to coupled atmosphere-ocean models, estimating the coupled model parameters and retrieving the model initial conditions, is formulated. This procedure is then applied to the simple coupled atmosphere-ocean model. In chapter 4 , results from estimating 6 parameters of the simple GFDL model are presented. The results for retrieving three oceanic initial conditions are shown in chapter 5, followed by a summary and discussion in chapter 6. The material on parameters estimation is being published as Lu and Hsieh (1997a), and the material on the initial condition estimation has been submitted as Lu and Hsieh (1997b). Chapter 2. Adjoint Data Assimilation : Theory and Application 15 Chapter 2 Adjoint Data Assimilation: Theory and Application 2.1 Optimal control theory The optimal control theory addresses the dependence of the output parameters in a numerical model, described by a set of coupled partial differential equations, on any or all of the input parameters in the model, or more specially, how the outputs can be controlled by the inputs (Lions, 1971; Bryson and Ho, 1975). A general class of optimal control problems involves optimally finding an input vector p to minimize a output scalar function J(x,p). The input parameters in p are termed control variables (or unknowns) of the model and the output J is termed cost function of the problem. The state vector x(p) is the solution of the model equations E(x,p) = 0. The objectives of the optimal control theory are to obtain necessary and sufficient conditions for J(x,p) to be a unique minimum. The ultimate goal is to construct algorithms amenable to numerical computations for finding the control p which minimizes J(x,p) (such a control is termed "optimal control"). For a given optimal control problem, the choice of which parameters to be designated as the control variables is not unique, as long as p determines x through the system E(x,p) = 0. This suggests a method to solve the data assimilation problem in meteorology and oceanography, where model empirical parameters, initial conditions, and boundary conditions need to be adjusted. The application of optimal control theory with the variational method for data assimilation has developed the adjoint technique for solving these problems. 2.2 Optimal Control Applied to Data Assimilation Chapter 2. Adjoint Data Assimilation : Theory and Application 16 2.2.1 Mathematical Formulation For determining model unknowns (parameters, initial conditions, and boundary conditions), data assimilation algorithms based on the optimal control theory can be described as: minimize the cost function J{x,p) subject to E{x,p) = 0. (2.1) 2.2.2 Weak and Strong Constraint Formalisms and the Augmented Lagrangian Sasaki (1970) in his pioneer paper has introduced two formalisms, i.e., the weak and strong constraint methods, to enforce a constraint into (2.1). In the weak constraint formalism (or the penalty method in Le Dimet and Navon (1988) and Daley (1991)), the constraint model is penalized by a prespecified weight p:: Jx{x,p) = J(x,p) + p:\\E(x,pf, (2.2) where ||E(x,p)\\ is a suitable norm of E(x,p). In this approach, the optimal analysis and the optimally estimated unknowns are dependent on pi. Using a larger (1 will result in an analysis satisfying the constraint more precisely. Taking the first variation of (2.2) with respect to the variables p and x to be zero yields the so-called Euler-Lagrange equations: dJ\ , * *. A —(x ,p ) = 0, dx (2.3a) dJ\ , # _ —(x ,p ) = 0, dp (2.3b) where p* and J C * are the optimal values of p and x. The strong constraint formalism requires the optimal solution to satisfy the constraint Chapter 2. Adjoint Data Assimilation : Theory and Application 17 E(x,p) = 0 exactly. A Lagrange function L(x,p, I) is defined to impose this condition. L{l,x,p) = J[x,p) + {l,E{x,p)} (2.4) where { , } is an inner product of two vectors and / is a vector of as yet unknown Lagrange multipliers with the same number of components as the numbers of equations in E . The Euler-Lagrange optimality conditions, which require the first variation of L(x,p, I) with respect to /, x, andp to vanish, are given by ^(l\x\p*)=Q, (2.5a) ul ^(T,x\p)=0, (2.5b) dx ^(r ,*V) = 0. (2.5c) dp The optimal estimatep*, x*, and /* are obtained by solving (2.5a) through (2.5c). The augmented Lagrangian is a formalism combining the weak and the strong constraint formalisms (Navon and De Villiers, 1983): E(l,x,p) = J(x,p) + n\\E(x,p)( +{l,E(x,p)}. (2.6) Since E{x,p) is imposed as both a weak and a strong constraint, this approach can take advantages of these two algorithms. A major advantage of this method is its ability to prevent the numerical instability associated with the ill-conditioning of the weak constraint problem (2.2) (Bertsekas, 1982; Fletcher, 1987; Bryson and Ho, 1975; Gill, 1981). Numerical instability is induced when a unknown variable is approaching its optimum. In this case, the weak constraint procedure involves the product of a large value of the penalty parameter fi by a small vector E(x, p) and is subject to considerable round-off errors. 2.3 Adjoint Method The constraint minimization problem (2.1) can in principle be solved through its Euler-Chapter 2. Adjoint Data Assimilation : Theory and Application 18 Lagrange equations either (2.3) or (2.5). But except in particular cases, no standard method exists for doing so directly. An alternative is to solve (2.1) by an iterative minimization method. To do so, an efficient method to calculate the first-order gradient of the cost function with respect to the control variables is needed since most minimization routines require at least the first-order gradient information. The adjoint method provides such a way of computing numerically the gradient for a complicated function, especially for time-dependent problems. The gradient computation involves running the model E (called forward model in the context of the adjoint method) and an adjoint model which is essentially the adjoint model of the linearized E (tangent linear model). The adjoint model can be derived in two ways: the control theory approach (Le Dimet and Talagrand, 1986, Lions, 1971), and the Lagrange multiplier approach (Thacker and Long, 1988; Lanczos, 1968). The control theory approach is mathematically elegant and an adjoint model can be easily coded directly from a given forward model code by using this method. The Lagrange multiplier method is mathematically simple but gives no direct way to derive an adjoint model from a forward model code. The adjoint method has been mainly applied to the strong constraint problem. It can therefore be illustrated by analyzing (2.5). Condition (2.5a) recovers the forward model itself: E{x,p) = 0. If the state of the atmosphere/ocean, namely x, evolves according to the model dynamics, (2.5a) is satisfied by definition. Condition (2.5b) yields the so-called adjoint model, which governs the evolution of the Lagrange multiplier(s). In general, the adjoint model is always linear in /, independent of the nonlinearity of the model E. However, for a nonlinear time-dependent model E, the adjoint model depends on the state of the forward model £ as a function of time. In this case the time history of the forward model E must be stored in memory to be used when integrating the adjoint model. Condition (2.5c) provides the gradient of the cost function J with respect to the control p, as shown below. Since the contribution of the second term in (2.4) to the Lagrange function is always zero because of condition (2.1), the value of L(x,p) is always equal to the cost function J(x,p), i.e., Chapter 2. Adjoint Data Assimilation : Theory and Application 19 J(x,p) = L(x,p) for E(x,p) = 0. Then the gradient of the cost function with respect to the control p can be calculated as y j _dJ _dL _dL + dL dx P dp dp dp dx bp (2.7) Since condition (2.5b) requires dL/dx = 0 at the optimal point, therefore the second term in (2.7) vanishes and (2.7) becomes V p 7 = dLI dp which is just (2.5c) and should be zero at the optimal point. Substituting (2.4) into (2.7) then yields: 3L = dJ_ dE dp dp bp W " J = ^  = ^  + { 1 ^ ] - (2-8) Equation (2.8) is the formula to calculate the gradient of the cost function J with respect to the control p. It is clear that VPJ is easy to obtain once the trajectory / of the adjoint model has been determined. With this gradient, the optimal control can be found by applying a descent method applied iteratively, until certain convergence criterion is satisfied. For an imperfect model, the so-called continuous variational assimilation should be used to take into account the forward model errors (Derber, 1989). Besides the technical problems related to developing the adjoint model, the adjoint method faces theoretical and practical problems in the minimization part. Many different minimization algorithms are available (Gill et al., 1981; Tarantola, 1987; and Luenberger, 1984). In the next section, we discuss briefly some algorithms and their properties. 2.4 Minimization Algorithms Many minimization algorithms implementing the descent method can be illustrated by Taylor expanding the function F(x) to be minimized about an n-dimensional point xk, where xk contains n unknowns to be adjusted: F(xk + adk) = F{xk) + agk(xk)dk+\a2d'krG(xk + a6dk)dk (2.9) where the superscript tr denotes matrix transpose and k denotes the kth iteration of the Chapter 2. Adjoint Data Assimilation : Theory and Application 20 minimization; 0<0< l . a i s a scalar, dk an n-dimensional vector, gk the gradient of the function J with respect to the unknowns, and G is the Hessian matrix (d2F I {dxfiXj), i = 1, n, j = 1, n). Note that Xk + ccdk is the estimated x at iteration k+l. If there exists a non-zero gk and a vector dk for which glrdk<0, ( 2 1 0 ) dkis termed a descent direction since F(xk+adk) w i n be smaller than F(xk) if ocgk(xk)dk + \ a2d'krG(xk + addk)dk<0 . The scalar a is termed a stepsize which determines the descent depth of F along the descent direction. Therefore, implementing the descent method can be described as follows: Step 1. Test for the convergence (usually test the cost and the gradient). Given an xk, if the conditions for convergence are satisfied, the algorithm terminates with xk as the solution. Step 2. Compute a search direction. Compute a non-zero n-dimensional dk, the direction of search. Step 3. Compute a step-size. Compute a positive scalar a k , the step-size, for which F(xk + akdk)<F(xk) Step 4. Update the estimate. Set xk+l = xk+akdk, k = k+l, and go back to step 1. Descent algorithms are different mainly in the ways for computing the search direction and stepsize. The oldest and the simplest way to find the descent direction is the method of steepest descent. Though simple, this method can guarantee a satisfactory analysis. 2.4.1 First Derivative Algorithms First derivative algorithms need only first order gradient information. The steepest-descent algorithm is one of them with the descent search direction taken as dk = —gk, which can be derived from the first two terms in (2.9). The steepest descent process is displayed in Fig. 2.1 in which the cost function has twovariables s, and s2. For each iteration from k to k+l, F(xk+l)< F(xk) along -gk, until the Chapter 2. Adjoint Data Assimilation : Theory and Application 21 irunimum of F(x) is reached. Unfortunately, the steepest descent algorithm does not ensure that it is an efficient method. This will be shown in 2.4.3 by examining the convergence rate of the steepest-descent method. S2 I > Figure 2.1 Illustration of the steepest descent method to obtain the optimum for a two dimensional problem. 2.4.2 Second Derivative Algorithms The models used in oceanographic studies often have a large number of unknowns N, which can easily exceed 104. For such cases the potential choices of descent methods are the Chapter 2. Adjoint Data Assimilation : Theory and Application 22 Newton-type algorithms and the conjugate-gradient algorithms. These second derivative algorithms which need second order gradient information can be derived from (2.9). If assuming that the stepsize a is equal to 1, (2.9) can be simplified as: F(xk +dk) = F(xk) + g'krdk + \d{rGkdk. (2.11) Since our aim is to find dk, it is helpful to formulate (2.11) in terms of ^(the step to the minimum) rather than the predicted minimum itself. Note that the minimum of the right-hand side of (2.11) will be achieved if dk is a minimum of the quadratic function ®(dk) = g'krdk+$d'krGkdk. (2.12) Further expanding (2.12) around dt we have 0(d*k + a'd') = 0(dt) + a'd"r{Gkdt + gk) + ^  a'2d"rGkd' (2.13) where a' and d' are new stepsize and search direction, respectively. The function & has a stationary point only if there exists a point dt where the first order gradient vector vanishes, i.e. Gkdt +gk = 0. (2.14) A minimization algorithm in which dk is defined by (2.14) is termed Newton's method. The Newton-type algorithms have better convergence rate than the conjugate-gradient algorithms but require bigger storage for the second derivative Hessian matrices of size NxN than the conjugate-gradient algorithms, which store a few vectors of length N (Hestenes, 1980). The conjugate-gradient algorithms are therefore more practical for atmospheric and oceanic data assimilation problems. The limited-memory quasi-Newton conjugate-gradient method is a Newton-type method using a conjugate method to calculate the Hessian. In the limited-memory quasi-Newton conjugate-gradient algorithm of Shanno and Phua (1980), each search direction dk is computed by using the present and previous gradients gk and gkA and the previous conjugate direction, dkA i.e., dk=-gk+!3kdk-u Chapter 2. Adjoint Data Assimilation : Theory and Application 23 where Bk = — . The search directions dp j = 0 k, are mutually conjugate, i.e. Pk-\(gk - gk-i) for / = 0 k and7 = 0 k d!rGdj = 0, i * j. Conjugate-gradient algorithms have been successfully applied in meteorology to minimize the cost function used in variational analysis, e.g. Hoffman (1982, 1984), Navon and De Villiers (1983). Navon and Legler (1987) compared different conjugate-gradient algorithms by applying them to two different meteorological problems, and concluded that the most consistent method was the above Shanno and Phua limited-memory quasi-Newton conjugate-gradient algorithm. Navon et al (1992b) have also reviewed some other algorithms that are suitable for meteorological data assimilation. In this thesis, all experiments were performed using a limited-memory quasi-Newton method in the NAG FORTRAN Library routine, E04DGF. The search direction in this routine is calculated by: p M = -gk+x + -^-(s?gk+lyk + y'kgk+lsk)-^^-(l + ^-)sk, ykSk yksk yksk where yk =gk+i -gk and sk = xk+i -xk = akdk. The limited-memory quasi-Newton conjugate-gradient method implemented in the Shanno and Phua's (1980) CONMIN algorithm was also used for some experiments but as no substantial improvements were found, all results reported in this thesis were obtained from using the E04DGF routine. 2.4.3 The Role of the Hessian in Minimization Algorithms There are many factors that might influence the convergence rate of the adjoint data assimilation, for example, the eigenstructure of the Hessian, data which cannot reflect the model dynamics well, and the adverse effects of rounding errors which cause the computed directions to dose conjugacy. Among them, the structure of the Hessian matrix plays a very important role in Chapter 2. Adjoint Data Assimilation : Theory and Application 24 determining the convergence rate. From (2.12), the efficiency of the steepest-descent method applied to minimizing 0 with an exact line search to determine the step length can be determined. Suppose that X^ and X^n are the largest and smallest eigenvalues of Gk, then it can be shown that (Gill, 1981) <P(dk+])-<P(d*) = i P i ™ ~Xminl *(<*')) = 7 ^ ( * ( A ) - * ( « 0 ) (K+l)2 where d* is the minimum of (2.12) and K denotes cond(Gt) (/tmax / A m i n ) , the spectral condition number of Gk. The striking feature of this result is that the asymptotic error constant (K- 1)/(K+ 1), which gives the factor of reduction in the error at each step, can be arbitrarily close to unity. For example, if K is given to be 50 (i.e. Gk is not particularly ill-conditioned), the error constant is (49/51)2= 0.92. As a result there is only a very small gain in accuracy at each iteration. In practice, the steepest-descent method typically needs hundreds of iterations to make very little progress towards the solution. It has been shown that a descent method with a linear rate of convergence is slow in convergence (Gill, 1981). For those second-derivative algorithms, the convergence rate is also determined by the structure of the Hessian because it follows from (2.13) and (2.14) that 0(dt + a'd') = 0{dl) + \af2d"rGkd'. y£.1J) Hence, the behavior of 0 in a neighborhood of dt is determined by the matrix Gk. Let A,y and vy denote they'-th eigenvalue and eigenvector of Gk, respectively. By definition, GkVj = Xjvj. The symmetry of Gk implies that the set of vectors {vj},j = 1 n, are orthonormal. When d' is equal to (i.e. descent direction is along the eigenvector), (2.15) becomes 0{dt + a'vj) = 0(dt) + \ a,2Xj. (2.16) Thus, the change in 0 when moving away from dt along the direction Vj depends on the sign of Chapter 2. Adjoint Data Assimilation : Theory and Application 25 Xj. If Xj is positive, 0 will strictly increase as a increases. If Xj is negative, 0 will monotonically decrease as I a ' I increases. If Xj is zero, 0 will remain constant when moving along any direction parallel to v y, since GkVj = 0; furthermore, 0 will reduce to a linear function along any such direction, since the quadratic term in (2.13) vanishes. When all eigenvalues are positive, d*k is the unique global minimum of 0. In this case, the contours of 0 are ellipsoids whose principal axes are in the directions of the eigenvectors of Gk, with lengths proportional to the.reciprocals of the square roots of the corresponding eigenvalues. If Gk is positive semi-definite, a stationary point (if it exists) is a weak local minimum. If Gk is indefinite and non-singular, dk is a saddle point, and 0 is unbounded above and below. 2.4.4 Important Issues in Variational Adjoint Procedure Associated with Minimization Before we proceed to formulate the variational adjoint procedure for coupled models, two issues, i.e. scaling and a priori information have to be discussed. The former is critical to determining the condition number of the Hessian, while the latter is critical to increasing the convexity of the Hessian. Consequently, according to the discussions in section (2.4.3), they are critical to accelerating the convergence and obtaining a unique minimum for the minimization. In theory, the condition number of the Hessian, which is the ratio of its largest to its smallest eigenvalues, represents the degree of singularity of the Hessian. If the condition number is large, the Hessian is ill conditioned; and if it is close to unity, the Hessian is well conditioned. An ill conditioned Hessian indicates that some model variables are poorly determined by the data, which can occur when model variables are inappropriately scaled or observations are inadequate (Thacker, 1988, 1989). The inefficiency of the variational adjoint method for data assimilation is largely due to slow convergence rates associated with no or poor scaling of numerical models (Navon and De Villiers, 1983; Zou et al., 1995). The conditioning of the Hessian can be Chapter 2. Adjoint Data Assimilation : Theory and Application 26 improved by the technique of scaling. Studies have shown that an initial well-scaled functional can lead to a significant improvement in the performance of the limited memory quasi-Newton conjugate-gradient algorithms (Gill et al., 1981; Yu and O'Brien, 1991). A l l the methods of scaling can be divided into three groups: scaling the variables (Gill, 1981; Navon and De Villiers, 1983; Schroter, 1989; Yu and O'Brien, 1991), scaling the cost function (Navon et al., 1992a), and scaling the gradient (Gill et al., 1981; L i et al., 1994; Zou and Holloway 1995). In this thesis, we used the technique of variable scaling. Variable scaling transforms the variables from units that typically reflect the physical nature of the problem to units that display certain desirable properties for the minimization process ( Gi l l et al., 1981; Navon and De Villers, 1983). The basic rule of variable scaling is to make all the variables in the scaled problem to be of order unity, so that each variable has a similar weight during the minimization. If typical values of all the variables are known, a problem can be transformed so that the variables are all of the same order of magnitude. Normally, only linear transformations of the variables should be used for scaling. The most commonly used transformation has the form Z = Sy (2 .17) where {z7-} are the original variables, {yy} are the transformed variables, and S is a fixed non-sigular matrix. By doing so, the derivatives of the cost function with respect to unknowns are also scaled. If gy and Hy represent the gradient vector and Hessian matrix of the transformed problem respectively, the relationships of the derivatives of the original to the transformed are given by g,=Sg; Hy=SlHS. It is obvious that variable scaling has a substantial effect on the Hessian. This will in turn significantly change the condition number of the Hessian and hence improve the convergence rate of the minimization algorithms. A unique minimum of (2.4) exists when two conditions are satisfied. First, the problem must be well-determined (Menke, 1984). Theoretically this can happen if the number of Chapter 2. Adjoint Data Assimilation : Theory and Application 27 unknowns is exactly equal to the number of observations. However this is not always true in the real application of data assimilation in meteorology/oceanography. The second condition is that the cost function to be minimized should be convex within the domain of the definition of the unknowns (Carrera and Neuman , 1986a,b). In variational analysis, the cost function is usually an L2 Euclidean norm measuring the misfit of the model output and the observations D(x) = ±j(x-xobsyrKx(x-x0bs)dcJ (2.18) where the subscript obs denotes observed data. The integral is over the observational space-time domain X. Kx is a weighting matrix and theoretically should be taken to be the inverse of the observational error covariance matrix. By assuming the errors in the data are uncorrelated and equally weighted, Kx is reduced to a unit matrix / multiplied by a constant Kx. The Hessian of the cost function (2.18) is d2D dp2 i fdx\tr(dx\,(l. v \ d2x W (-dp)+{x~Xobs)-cy (2.19) where differentiation by a vector means differentiation by an element of the vector. The first term on the right-hand side is positive semi-definite, while the second term could be negative. Therefore, the positive definiteness of the Hessian is not guaranteed in (2.19), and it is possible to find more than one solution of the minimization problem. In this case, although the data provide information about the unknown parameters, they do not provide enough to determine them uniquely. A strategy has been proposed to solve or ease the difficulty associated with (2.18) — i.e. by adding a priori information (Menke, 1984; Thacker, 1988, Panchang and O'Brien, 1988). The a priori knowledge is not the information based on present observations but on previous experience. It plays the role of bogus data, which are often used to supplement the insufficient real data. Its usefulness is in the great reduction of the range of possible solutions — or even causing the solution to be unique (Carrera and Neuman, 1986b). Chapter 2. Adjoint Data Assimilation : Theory and Application 28 By adding a new term h = i K p \(p-p)'r(p-p)d(j to (2.18), where the caret («) s denotes the a priori field and Kp the computational weight, we then have a new cost function J J(x,p) = D+Jp ( 2 2 0 ) From the mathematical point of view, the term Jp functions as a penalty term in which p corresponds to shifts of the origin and controls the size of the penalty (Fletcher, 1987). A priori information can take many forms and in each case it quantifies expectations about the character of the solution that are not based on the actual data (Menke, 1984). We first guess p in (2.20) and then update it using its value at the previous iteration. By doing so, the term Jp measures the closeness of the estimated unknown between two consecutive iterations of the minimization process. It then penalizes the departures from the previous estimate when searching for the optimal solution, which is the a priori prejudice towards smoothness. With this a priori information, the Hessian of (2.20) becomes d2J_d2D where / is a unit matrix. Clearly, the second term on the right-hand side is positive definite. Although one cannot guarantee the positive definiteness of (2.21), the inclusion of the information increases the probability that this will be the case and therefore enhances the convexity of the cost function J in (2.21). Carrera and Neuman (1986b) have studied several examples which clearly show the influence of a priori information on the formation of the unique solution. This problem is also addressed by Bennett and Miller (1991) in which an explicit contribution from the initial condition was included in the formation of the cost function in order to ensure a unique, low-noise forecast. Actually, the term Jp not only provides the a priori information for the estimated unknown but also accelerates the convergence of the minimization algorithm (Bertsekas, 1982; Fletcher, 1987; Thacker, 1988). When an ill-conditioned Hessian is due to inadequate observations, the a Chapter 2. Adjoint Data Assimilation : Theory and Application 29 priori information could serve as bogus data and hence increases the number of observations. Thus adding the term Jp can improve the conditioning of the Hessian with the practical benefit of speeding up the convergence of the minimization algorithm. Chapter 3. Adjoint Data Assimilation in Coupled Model Systems 30 Chapter 3 Adjoint Data Assimilation in Coupled Model Systems 3.1 A Generic Procedure for Parameter Estimation and Initialization in Coupled Systems Let JC be the vector of state variables of the atmospheric model, with X the atmospheric model operator. Let y and Y denote the state vector and the operator of the ocean model, respectively. A coupled model can be described by: where the unknown model parameters (e.g. friction parameters, coupling parameters, etc.) and unknown initial conditions for the atmospheric and oceanic models are given by vectors px and x0, and vectors py and j 0 , respectively. The 4-D variational problem (2.1) involves minimizing a quadratic cost function J = J(x(px,t),y(py,t),px,py,Xo,yo) subject to model constraint (3.1), which is defined as: m = x(x,px,y,t) dt x(0) = xo 4L = Y{y,p„xtt) dt y(0) = yo, (3.1) J(x,y,px,py,xo,y0) = \l D(x,y,t)dt + (px - pxf Kp,(px - px) + (Py ~ Py T KP, (p, -py) + (x0- Xo )'r Kx 0 (x0 - Xo ) + (yo-yo)'rKyo(yo-yo) (3.2) where constant vectors px, py, x0 and y0 contain a priori knowledge of the magnitudes of the Chapter 3. Adjoint Data Assimilation in Coupled Model Systems 31 parameters and initial conditions. Kp„, KPr, KXo, and Kyo are weighting matrices, and T is the assimilation time window. D measures the distance between the model state and the observations at time t: D(x,y,t) = [x-Xabs]'rWx(t)[x-Xobs] + [y-yobsTWy(t)[y-yobs], (3.3) where the subscript obs denotes the observations and Wx(t) and W (t) the weighting matrices. The gradients of / with respect to px, py, x0 and y0 needed to find the optimal parameters and initial conditions can be found by introducing two Lagrange multiplier vectors x*(t) and y*(t) to form a Lagrange function L : L(x,y,x\y\px,py,xo,yo) = J + lT0{x\tr[^--X] + f(tn^-Y]}dt. (3.4) dt dt The differentiation of L with respect to x* andy* will simply recover the coupled model. After integration by parts, L becomes L = [x*(tTx + f(tyyU + £{D-[^-x + x*(t)'rX] dt - [^f-y+y*(tTY]}dt+(Px-px r Kp, (Px - px) at + (Py-Py)'rKp,(Py-Py) + (Xo-X0fKXo(Xo ~X0) (3-5) + (yo-yo)'rKyo(yo-yo). The first order variation of L is: Chapter 3. Adjoint Data Assimilation in Coupled Model Systems 32 6L = [x*(tTSx + /(tydylll + £ { VxD"&t + VyD'rdy dX^" - [ -[ v dt j car + 8y + \dx J dY x*(t)8x + jc*(f)8y + fdx^'r dp x'(t)Spx] y*(t)dy + OYY \oxJ rdYWr dp y\t)dPy]}dt + 2(px-pxfKphpx+2{py-pyfKphpy +2(x0-x0)'rKxte0 + 2(y0-y0TKjy0. (3.6) The coupled adjoint model can then be obtained by letting 8LI 8x = 8LI 8y = 0. _dx^ = dt x*(T) = 0 dt y*(T) = 0 ( — T \dx J (dYX x'(t)+ — y\t)-VxD \oxj f(t)+ ^ x\t)-VyD (3.7) According to (2.7), the formulae for computing the gradients of J with respect to px, p , x0 mdy0 can be obtained by differentiating (3.6) with respect to these unknowns: = 2KPx (px-p*)-)0 8px 8J_ 8py 8J_ 8XQ 8J_ 8y0 ydPx j = 2K„r(Py-py)-\l \OpyJ = 2KXo'r(x0-x0)-x*(0) = 2Ky:r(y0-y0)-y*(0). x*(i)dt y*(t)dt (3.8) In summary, the general computational procedure for estimating coupled model unknown parameters and initial conditions using the adjoint method is: Chapter 3. Adjoint Data Assimilation in Coupled Model Systems 33 (i) specify an initial guess of the unknown px , py, x0 and y0 and estimate the size of control vectors px, py, x0 and y0; (ii) integrate the coupled model (3.1) from t = 0 to t = T; (iii) compute J within [0, T] using forward model trajectory and data; (iv) compute the gradient from (3.8) by a backward integration of the adjoint model (3.7) from t = T to t = 0; (v) use a descent algorithm to find a new estimate of px, py, x0 and y0 and replace px, py, x0 and y0 with the last px, py, x0 and y0, thus ensuring the new estimate to be not far away from the original estimate; (vi) if the convergence criteria are satisfied, terminate the procedure, otherwise redo the procedure from (ii) using the new guess of px, py, x0 and y 0 -3.2 Determining Parameters and Initial Conditions in a Local Thermal Equilibrium Model 3.2.1 Model The simple equatorial coupled model of Philander et al. (1984 ) (hereafter called the Philander model) parameterized the tropical atmosphere-ocean coupling in the simplest fashion, i.e., changes in the thermocline depth in the wind-driven ocean affect the sea-surface temperature which, in turn, heats (or cools) the atmosphere. The one-layer atmospheric motion, driven by a heat source proportional to the perturbations in the thermocline depth h0, is described by linear shallow-water equations: Chapter 3. Adjoint Data Assimilation in Coupled Model Systems 34 . y V + - ^ ^ = -Au dt " rcos0 dX * r d * (3.9) dt rcos0 dX d(j) where ua and va are the zonal and meridional velocity components, ha is the depth perturbation against the equivalent depth da,f is the latitude-dependent Coriolis parameter varying with latitude, and the coordinates (r,A,0) are the radial distance, the longitude and the latitude, respectively. The parameter for Rayleigh friction is A , and that for Newtonian cooling is B. The reduced gravity is ga and a is the thermodynamic coupling parameter. The tropical upper ocean response to the atmospheric wind stress is also described by linear shallow-water equations :, duo r Sn dhn —-f- -fv0+ r-2- = -auQ + yua dt rcos0 dX ^t + f U ° + i 7 1 k i = - a V ° + 7 V ° dK | d„ {du0 | ^(v o cos0) ) = b h dt rcos0 dX d(j) (3.10) where u0 and v0 are the zonal and meridional currents and ho is the depth perturbation against the equivalent depth d0. Motion is damped by Rayleigh friction and Newtonian cooling with respective parameters a and b. The reduced gravity is g0 and y is the dynamic coupling coefficient The most important parameters in this model are the coupling parameters a and y, which determine the unstable growth of this coupled system (Philander, et al., 1984). The damping nature of the wave modes are determined by the damping parameters A, B, a, b, of which A and B are the most important (Yamagata, 1985). In this thesis, we apply the general procedure in the Chapter 3. Adjoint Data Assimilation in Coupled Model Systems 35 last section to estimate these six parameters and leave the long gravity wave speeds (ga and g0) as known parameters. The atmospheric long gravity wave speed ca = 66 m s-1 is chosen such that the equatorial radius of deformation is about 2000 km and the corresponding equivalent depth is around 400 m. For the ocean, a long gravity wave speed of c„ =1.4 m s-1 gives an equatorial radius of deformation of 250 km and an equivalent depth of about 20 cm. This value of the equivalent depth is appropriate if the sharp, shallow tropical thermocline is considered to be an interface between the warm surface water and the cold deep water. The values chosen for a and 7 are as in Yamagata (1985). This coupled model, though not entirely realistic, can simulate to some extent (up to 40 days) the unstable local-growth of anomalous conditions in the equatorial region, starting from an initial state of rest with a small initial perturbed thermocline depth of Gaussian shape centered around the equator in the middle of the ocean, i.e. h, = 0.2exp[-(x2 + y2)/(500 km)2] cm. (3.11) Equations (3.9) and (3.10) were solved numerically following Philander et al (1984). The ocean model was run in a 200° longitudinal extent between 30° S and 30° N. The atmospheric domain was larger in the zonal direction by 20° and was zonally cyclic. The solid wall boundary condition was set for the ocean and the atmosphere in the northern and southern boundaries. The grid spacing was 1° in the latitudinal and the longitudinal directions in both the ocean and the atmosphere. The coupling between the atmosphere and the ocean was only updated once a day. For the rest of the day, the driving forces for the atmosphere and ocean were held fixed (see Appendix B for details). 3.2.2 Adjoint Model Following the general procedure, the adjoint model is derived from (3.9) and (3.10): Chapter 3. Adjoint Data Assimilation in Coupled Model Systems 36 dua . * da dh*, . * * dD — - p „ + i ^ r = A M « - ^ + T ~ dt rcoscp dA dua dva . da dhl . » <9D — + /wa + — = Ava-yv0+—- (3.12) or r dtp dva dhl | & | <9v« cos 0 _ | (9D cfr rcos</> <9A <90 ^/i, or rcostp dA du0 dv*o r * d0 dhl * dD iF+fu'+T^=av'+^: 0.13) dhl dt T C O S 0 oW, (70 (7rt0 Variables with a superscript asterisk denote the adjoint variables (i.e. Lagrange multipliers) of their counterparts in (3.9) and (3.10). In the adjoint model, a forcing term on the RHS of each equation measures the misfit between the solution and the data, and the damping processes take opposite signs because of backward integration. Note that there exists an inverse coupling in the coupled adjoint model in which the atmospheric adjoint model is driven "dynamically" by the "adjoint current" and the oceanic adjoint model is driven "thermodynamically" by the atmospheric "adjoint depth perturbation". These two terms ensure information exchanges in the coupled adjoint model so that the model trajectories of the two single models can be tuned together. From (3.8) and with the a priori information terms omitted for simplicity, the gradients of the cost function with respect to the six parameters and the six model initial states are found to be: Chapter 3. Adjoint Data Assimilation in Coupled Model Systems 37 8JI8A = \(uaua + v*va )dadt z 8J18B = \(h*ha )dadt 8318a = \iKah0)dadt i 8JI 8a = j(u*0u0 + v*0v0)dodt x 5J/8b = j(h*h0)dadt i 8J18y = ] (u0ua + v*0va)dodt 8J / 8ua=-ua{0) 8J/8va=-v:(0) 8J/8ha=-h*a(0) SJ / Su0 = -u0(Q) (3.14) 8J/8vo=-v:(0) SJ/8ho=-K(0) where the integrals are computed in the model spatial and temporal domain X . The derivation of (3.12-3.14) is given in Appendix A. The discretized forms of the adjoint model (3.12-3.13) and gradients (3.14) are given in Appendix B. The validations of the adjoint code and the gradient calculations (3.13-3.14) are given in Appendix C. Chapter 3. Adjoint Data Assimilation in Coupled Model Systems 38 Chapter 4 Determining Model Parameters In the existing simple models, the parameterization schemes and the values of parameters used are different (Neelin, 1989, Lindzen and Nigam, 1987). Verifying parameters of simple models is very important, as the values of these parameters are essential to the model dynamics (Hirst, 1986, 1988). Allen and Davey (1993) made the first effort to validate the atmospheric Rayleigh damping coefficient by inverting the steady state atmospheric model of the simplified Cane-Zebiak coupled model using ECMWF 1000-mb wind and then to estimate the coupling coefficient by regression. The atmospheric Rayleigh damping coefficient was not found exactly despite high quality grid point data used. Since the damping coefficient and the coupling parameter were estimated separately, the optimal combination of the empirical parameters was not found in their study. Identifiability, uniqueness and stability are the key issues for parameter estimation, which depend on whether the problem is well-posed ( Kitamura and Nikagiri, 1977, Yeh, 1986, Tarantola, 1987). Estimating unknown parameters of atmospheric and oceanic models using the adjoint method is still at an experimental stage. Navon recently gave a rigorous mathematical review of parameter estimation and a lucid review on the present status of parameter estimation using the adjoint method. To invert the tropical atmospheric and oceanic observations to estimate the parameters of simple models, the main obstacles would be: (1) data sparsity and noisiness which could generally cause ill-conditioning (Thacker and Long, 1988, Tziperman and Thacker 1989, Thepaut and Courtier 1991, Li et al, 1994 ), (2) nonlinearity which could result in nonuniqueness due to the existence of secondary minima. Though the TAO array of the 10-year TOGA program, spanning the tropical Pacific from 95°W to 137°E between 8°S and 8°N, has greatly improved the tropical observation Chapter 4. Determining Model Parameters 39 network by providing real-time oceanographic and surface meteorological data at about 2° latitude by 15° longitude daily averaging resolution, the present available data are still short of providing a complete, uniform, and accurate coverage of the tropical Pacific domain. We used identical twin experiments to examine the numerical feasibility and the potential of adjoint parameter estimation, focusing on how data sparsity and noise affect the quality of the parameter estimation as a first step towards estimating empirical parameters from real tropical observations. Remedies for the ill-conditioning problem were also tested. 4.1 Experiments Data used in this study were sampled from a 40-day control run — with the parameters given in Table 1 and the initial condition (3.11), the ocean model was first relaxed for 10 days, then the models were coupled for 40 days. Fig. 4.1 shows the slow eastward propagating unstable mode generated by the coupled system at day 40 (for a detailed discussion about the physics of this unstable growth, see Philander et al. 1984 ). We assumed that only observations of the wind and the sea level height (SLH) are available. As the atmospheric pressure field for simple models of this kind has different interpretations (Neelin, 1989), we would like to pose our problem on a tough condition, i.e., without pressure observations, whether we can still estimate the parameters. Current observations are sparse in general. We therefore minimize the cost function: D = ls{Wu(ua-uTf + Wv(va-vT)2+Wh(h0-h^)2}ds. ( 4 > 1 ) For our simple coupled model, the magnitudes of the six parameters and model state variables varied from 107 to 104. We scaled the model state variables in (3.9) and (3.10) by the values from the control run. The surface perturbations ha and ha were scaled by the atmospheric and the oceanic equivalent depths (da = 400 m, d0 = 20 cm ), respectively, the wind by the typical wind (100 cm s"1) and the current by the maximum current generated by this model (100 cm s"1)- After scaling, the weighting matrices in (4.1) were simply chosen to be 1/2 times the Chapter 4. Determining Model Parameters 40 100W 50 0 50 1 00E (a) Figure 4.1 Spatial structure of (a) the oceanic zonal velocity component (cms4) and (b) the depth of the thermocline (cm) after integrating the coupled model for 40 days using parameters in Table 1. Shaded regions indicate westward motion in (a) and elevated thermocline in (b). Chapter 4. Determining Model Parameters 41 Figure 4.1b (b) Chapter 4. Determining Model Parameters 42 identity matrix. The same weighting matrices were used for the experiments with added noise Table 1 Parameters used in the coupled atmosphere-ocean model for parameter estimation. parameter Values A 2.3xl0- 6 s-1 [ (5 days)-1] B 7.7xl0~7 s-' [ (15 days)-1] Ca =(gadayV2 66 m s-1 sa 40 cm S"2 a 1.2X10- 7 s-1 [ (100 days)-1] b 1 . 2 x l 0 - 7 s 1 [ (100 days)1] Co ^(godo)^2 1.4 m s-' So 2.0 cm S"2 a l.OxlO-2 s-1 7 5.0X10- 7 s-1 in this thesis. Next, we ran numerical experiments to explore three types of data: (1) Data available were sparse either in time or in space for either the wind or the SLH. (2) Data available were noisy, i.e., we tested the impact of the noise level on parameter estimation by adding noise to either the wind or the SLH or both. (3) In cases where the data are sparse and noisy for the wind or SLH, we studied the window effect and the a priori estimate effect on the parameter estimation by assimilating data with different temporal windows, and by introducing a priori estimates into the cost function, i.e.: J = \lD dt + WA(A - A)2 + WB(B - B)2 + Wa(a - a)2 + Wa(a - a)2 + Wb(b-b) + Wr(Y-y)2, (4.2) where WA,WB,Wa,Wa,Wb,W are weighting constants. When this cost function is used, the formulae for calculating the gradients of the six parameters in (3.14) are replaced by: Chapter 4. Determining Model Parameters 43 SJI 8A = 2WA(A - A) + \{uaua + vava)dodt X SJ/8B = 2WB(B-B) + j (h*ha )dcrdt 8JI Sa = 2Wa(a - a) + \{h*ah0)dodt X 8JI Sa = 2Wa(a -a) + \(u*0u0 + v*0v0)dadt X 8J/Sb = 2Wb(b -b) + \Qi0h0)dadt (4-3) X 8JI Sy = 2WY(y-y) + \(u0ua + v*0va)dadt X Quality appraisal for the parameter estimation is difficult because the parameters are not directly observable quantities. The correlations between the estimated state variables and their counterpart observations were used to appraise the parameter estimation by Smedstad and O'Brien (1991), and Yu and O'Brien (1991). For more appraisal methods, see Menke (1984). For all experiments, the relative errors of the estimations to the true values were taken to be the accuracy criteria. Five times the true values were used as the initial guesses for the 6 parameters. 4.2 Results 4.2.1 Impact from the amount of data on parameter estimation With data everywhere from wind components ua and va, or from only one component (ua or va), the six parameters can be easily determined in a 5-day inversion (Fig.4.2). Though the historical tropical atmospheric and oceanic data were sparsely scattered in space and time, with a tendency to cluster around the major sea lanes, the simplest way to study the effect of data sparsity is to assume that the sparse data are evenly distributed in the time domain and/or in the spatial domain. As the temporal and spatial sparsity of data might have different effects Chapter 4. Determining Model Parameters 44 on the parameter estimation, they were investigated separately. Fig.4.3a shows the effect of temporally sparse wind data on the parameter estimation for a 40-day assimilation in which wind data were provided everywhere spatially but only once every 90 or 180 time steps (with the atmospheric model taking 90 time steps per day); and the SLH data were provided everywhere and for every ocean time step. The relative estimation errors exhibited strong growth from decreasing wind availability from every 90 to 180 time steps. This error growth is consistent with the suggestion by Miller and Ghil (1990) that the number of secondary minima increases with the length of the time over which the minimization is carried out. Though each experiment in Fig.4.3a converges on different minima in the cost function, some commonly deterministic features could be observed, e.g., the oceanic parameters a and b and the atmospheric Newtonian cooling parameter B are more sensitive to the temporal data density of the wind than the other three parameters a, y and A. With wind data once a day (which is the temporal resolution of the TAO array network), the relative estimation errors of the latter three were good (smaller than 10"2) while the former three were acceptable with errors smaller than 10'. For comparison, Fig.4.3b displays the oceanic effect of temporal sparsity, where the SLH were available everywhere once every 8 or 16 time steps (with the ocean model taking 8 time steps per day), and the wind data available everywhere for every time step. The estimations also showed a degrading trend as SLH availability decreased. Comparing the cases with data once per day, we found that the overall error magnitudes for the sparse SLH case (Fig.4.3b) were smaller than those for the sparse wind case (Fig.4.3a), especially for the oceanic coefficients a and b and the atmospheric Newtonian cooling parameter B. We can conclude that (a) the overall quality of the estimations for the 6 parameters is more sensitive to the temporal density of the wind data than to that of the SLH, and (b) estimation of the oceanic coefficients a, b and atmospheric coefficient B are most likely to deteriorate when both wind and SLH data become sparse, while parameters A, a and y seem relatively easy to estimate. The overall estimation being more dependent on the wind density in time is not unexpected as Chapter 4. Determining Model Parameters 45 Iteration Figure 4.2 Change of the total cost function J and the gradient norm g = \g\ with iterations during the minimization for a 5-day direct inversion. The subscripts UV, U and V refer respectively to the three experiments where wind components ua and va, or only ua, or only va , are available everywhere for assimilation. Both the cost function and the gradient norm have been normalized with respect to their initial values (J0 and g0) before plotting. Chapter 4. Determining Model Parameters 46 10' o 1 CV CD C o CO .iio -1 CO CD CD > -2 -2 ©10 1 0" • 90 EO 180 l_ B a a pa ramete r r (a) Figure 4.3 Relative estimation error of 6 parameters for a 40-day assimilation with (a) the wind available every 90 or 180 time steps at every grid point, and SLH available everywhere for all time, (b) SLH available every 8 or 16 time steps at every grid, and the wind available everywhere for all time. The atmospheric model has 90 time steps per day and the oceanic model has 8 time steps per day. The two estimation errors for each parameter are from two data sparsity experiments (90 or 180 time steps for (a) and 8 or 16 time steps for (b)) respectively. Chapter 4. Determining Model Parameters 47 Chapter 4. Determining Model Parameters 48 the atmosphere varies faster than the ocean, thereby requiring more information in the atmospheric part to determine the whole system. The major factor determining the accuracy of the parameter estimation is likely to be the sufficiency of information. For example, the fact that parameters A, a and y were relatively well estimated for the wind sparsity case could be attributed to the relatively rich information available for their determination — parameter A was bounded by the atmospheric momentum equations where winds data were available directly, and the parameters a and y were constrained by the whole system and extracted information from all parts of the system. Theoretically, the information available in the whole system could be used for estimating all the parameters, but this is not entirely true for our system during a short 5-day period of assimilation, where large estimation errors for the oceanic coefficients a and b were found (not shown) when sparse SLH and perfect wind data were assimilated, indicating information insufficiency. This suggests that as the oceanic adjustment to the external forcing is slow, temporally dense wind observations in a short time period may not be helpful for parameter retrieval when oceanic data are temporally sparse (see Section 4.2.3 for further discussion). Contrary to the temporal sparsity effect, spatial sparsity especially for the ocean, did not show the above remarkable degrading effects on parameter estimation as the data were provided on increasingly coarser resolution for both the atmosphere and the ocean, up to 10° longitude by 5° latitude grids (not shown). That the spatial sparsity is less influential suggests that denser information in the time dimension is required than in the spatial dimension. We next conducted experiments on the combined effects of data sparsity in both time and space. For simplicity, the spatial sparsity was fixed by sampling both wind and SLH data with a very coarse resolution — 10° resolution (zonally and meridionally) between 10°N and 30°N and between 10°S and 30°S, and the TAO array resolution between 10°S and 10°N, while the temporal sparsity was allowed to vary such that when studying wind sparsity, the SLH was provided at every time step, and vice versa when studying SLH sparsity. Fig.4.4 depicts the combined effect for a 40-day assimilation. As the small scale structures of both the Chapter 4. Determining Model Parameters 49 atmospheric and oceanic motion were ignored in the assimilation, the overall estimations for the SLH sparsity case were strikingly improved (cf. Fig.4.4b and Fig.4.3b ), indicating a redundancy of SLH data for the retrieval. In Fig.4.4a, however, the parameter estimation degraded slightly (cf. Fig.4.3a and Fig.4.4a). Since the SLH was available every time step, this error growth resulted mainly from the combined sparsity of the wind data. These two experiments revealed that though the data amount was much more than the 6 unknown parameters, the atmospheric information would be generally insufficient if the wind data were reduced either in time or in space, with the information in the time dimension being more important than that in space. In contrast, the oceanic information could be reduced substantially in space, as supplying SLH data with small scale features yielded poorer estimates. To support this argument, two experiments were performed in which the wind data were the same as in Fig.4.4b, while the SLH data were provided once every 8 time steps and reduced substantially in space to lie only along two lines of latitude for the first experiment (with zonal resolution unchanged) and to lie along one line of latitude at three grid points for the second experiment. The two latitudes with data were far away from equator at 25°N and 25°S ( experiment SI ), and the three data points (experiment S2) were along 0.5°N at grids 30, 100, 170 (where 0 and 200 were the zonal boundaries). It is interesting to see that the relative estimation errors were strikingly small for experiment S2 (cf. Fig.4.5, Fig.4.3b and Fig.4.4b). It states that these 3 observations and the wind data are sufficient to retrieve the 6 parameters, and most importantly, it confirms the potential pointed out by Smedstad and O'Brien (1991), namely, that it is possible to retrieve and estimate empirical parameters from real oceanic observations that are sparse. The six parameters are spatially invariant constants - i.e. almost like domain averages - they are no smaller scale structure. Since model equations are linear, one could, in principle, write the equations in a spectral form for each horizontal spatial component. Thus, if you only had observations available on the largest horizontal scales (at high temporal resolution), this would be sufficient to determine the 6 parameters. In contrast, for experiment SI, the overall estimation errors are larger than those in Fig.4.4b and comparable to those in Chapter 4. Determining Model Parameters 50 paramete r (a) Figure 4.4 Relative estimation error of 6 parameters for a 40-day assimilation with data distributed every 10th grid zonally. Meridionally between 10° N and 30°N and between 10°S and 30°S, data are available every 15th grid, and between 10°S and 10°N, every 2nd grid. The temporal data distribution is (a) every 90 or 180 time steps for the wind and every time step for the SLH, or (b) every 8 or 16 time steps for the SLH and every time step for the wind. Chapter 4. Determining Model Parameters 51 Figure 4.4b Chapter 4. Determining Model Parameters 52 Figure 4.5 Relative estimation error of 6 parameters for a 40-day assimilation with the wind the same as in Fig.4.4b, the SLH available once every 8 time steps and every 10 grids zonally at 25°Nand25°S latitudes (experiment SI), and SLH available at three zonal grid points (30, 100 and 170) at 0.5°N (experiment S2). Chapter 4. Determining Model Parameters 53 Fig.4.3b. Possible explanations: As the air-sea interaction happens mainly within the equatorial waveguide area, even a little information within the waveguide (S2) is far more important for parameter retrieval than much information outside the waveguide (SI). We can further conclude that having good wind, sparse SLH data within the equatorial waveguide area is enough to retrieve the empirical parameters and that SLH data outside of the equatorial waveguide area are redundant. 4.2.2 Impact of noisy data on parameter estimation Random noise or error in the data was assumed to have a normal distribution. In the following experiments, designed to evaluate the impact of noisy data on parameter estimation, noisy data were generated by adding at every time step different sets of random noise with zero mean and standard deviation which was a given percentage of the maximum value of each state. variable at the final time step of the control run. Fig.4.6 shows the impact of noisy data on parameter estimation where data were provided everywhere at all time steps and 10% noise were added to SLH (experiment Nl) , wind (N2), or both (N3) in 40-day assimilations. The errors caused by the oceanic noise were smaller than those caused by atmospheric and combined oceanic-atmospheric noise. The coefficients A and a were again easily determined. Unlike the case on data sparsity, the dynamic coupling coefficient y was sensitive to the atmospheric and combined noise, while the oceanic Newtonian cooling coefficient b was less sensitive to noise. The coefficients a and B exhibited large errors when noise was introduced into the wind data. When temporally sparse wind with 10% noise was assimilated with SLH data provided everywhere and at every time step (but also with 10% noise), the overall estimation degraded further in comparison to the data sparsity case in Fig. 4.3a. The characteristics of error growth for each parameter did not change very much, e.g., the coupling parameters and the atmospheric Rayleigh damping A were quite well estimated, while the ocean damping Chapter 4. Determining Model Parameters 54 1 0"5I 1 1 1 I L-A B a a b y paramete r Figure 4.6 Relative estimation error for 6 parameters for a 40-day assimilation with the wind and SLH data everywhere. Experiment N l has 10 % noise added to SLH, N2, 10% noise added to wind, and N3, 10% noise added to both. Chapter 4. Determining Model Parameters 55 coefficients a and b were poorly estimated (not shown). When noisy and spatially sparse SLH was assimilated (not shown), the overall estimation also showed further degradation. The overall estimations were still acceptable for both experiments. 4.2.3 Improving parameter estimations Changes in the data amount or data quality could result in incorrect estimates of the originally well-estimated parameters, as seen in the above experiments. This problem was attributed to a loss of conditioning in meteorological variational assimilation studies, where increasing the length of the assimilation window has the tendency to worsen the conditioning for the retrieval of initial conditions (Thepaut and Courtier, 1991, L i et al 1994), and to cause information insufficiency in oceanographic adjoint applications (Thacker, 1989). Many ways for improving parameter estimation have been suggested and tested. Using more information from observations collected later on — the original idea of 4-D assimilation — is obviously the most important one. Fig.4.7 shows the window effect on improving the parameter estimation of the coupled system with temporally sparse data. When temporally sparse wind data were provided for every grid point but only once per day and SLH for all grid points at all times (Fig.4.7a), the estimations generally improved as the assimilation window was enlarged. In particular, the poorly estimated parameters a, b, and B have improved to an accuracy of 90% or more; while A, a, and y, have improved to 99 % from the 5-day to the 20-day assimilation. Note that the estimations for a, B, y improved when the window was enlarged from 5 to 20 days, but degraded from 20 to 40 days. In contrast, Fig.4.7b showed that when temporally sparse SLH data were assimilated into the coupled model once a day for all grid points and the wind data were provided everywhere, a 20-day assimilation did not improve the estimations for the 6 parameters over the 5-day assimilation, while improvements were found when the assimilation window was enlarged to 40 days. Hence, a longer assimilation window would in general be Chapter 4. Determining Model Parameters 56 ' U 0 1 0 2 0 3 0 4 0 5 0 assimilat ion window ( days ) (a) Figure 4.7 Relative estimation error for 6 parameters for the 5, 20 and 40-day assimilation runs, (a) with the wind data available for all grid points once a day and SLH available everywhere, (b) with SLH available for all grid points once a day and wind data available everywhere. Chapter 4. Determining Model Parameters 57 Figure 4.7b (b) Chapter 4. Determining Model Parameters 58 helpful for solving the information insufficiency problem for parameter estimation in the present system, though the optimal window size could be difficult to determine, since a longer oceanic window would help parameter estimation, but not necessarily so for the atmosphere. The minimization histories of the cost function J and its gradient norm g for these experiments (Fig.4.8) showed that the iteration histories were quite different between the atmospheric sparsity case and the oceanic sparsity case. For the atmospheric sparsity case, the poor estimations of the short window experiment (Fig.4.8a) resulted from the minimization picking up a poor direction at the first iteration, which reduced the gradient norm by more than 80% (cf. 75% for the 20-day assimilation in Fig.4.8b), but with only a 10% reduction in the cost function (cf. 40% for the 20-day assimilation). The poor direction led the minimization to a local minimum at iteration 15. As the window was enlarged to 20 days, a more correct direction was found in the first iteration, and the minimization continued until both the cost and gradient norm stopping criteria were satisfied. For the oceanic sparsity case, a poor direction was also taken at the first iteration for the short window experiment (cf. Fig.4.8c with 4.8d). The minimization reduced the gradient norm to a local minimum, where only the gradient norm criterion was satisfied, while the cost function almost did not decrease after iteration 6. As the window increased to 40 days, it seemed to improve the descent direction and 20% more cost was deducted. These two window experiments suggested that the degrading effect of data sparsity and poor data quality for our study was likely due to information insufficiency. A useful way to provide a priori information in parameter estimation is to include an extra a priori estimate term in the cost function. The two advantages in doing so — to prevent the estimated values from wandering too far from the a priori knowledge, and to enhance the convexity of the cost function have been discussed in section 2.4.4. Aiming to improve the estimation of a for the experiment SI (shown in Fig.4.5), we ran an experiment identical to SI, but added a single a priori estimate term for a in the cost function (4.2), with a = 10"7. This time the parameter a was well estimated. Figs.4.9a and 4.9b showed the minimization behavior of the gradients and the cost function for this experiment, while Fig.4.9c and 4.9d Chapter 4. Determining Model Parameters 59 I te ra t ion (a) Figure 4.8 Iteration history of the cost function and the gradient norm: SLH available everywhere at all times but wind data available once a day at every grid point for (a) 5-day assimilation and (b) 20-day assimilation; wind data available everywhere at all times but SLH data available for all grid points once a day for (c) 5-day assimilation and (d) 40-day assimilation. Both the cost function and the gradient norm have been normalized with respect to their initial values in this and in next figure. Chapter 4. Determining Model Parameters 60 Figure 4.8b Chapter 4. Determining Model Parameters 61 I te ra t ion (c) Figure 4.8c Chapter 4. Determining Model Parameters 62 I te ra t ion (d) Figure 4.8d Chapter 4. Determining Model Parameters 63 0 5 10 15 20 25 30 Iteration (a) Figure 4.9 Iteration history for an experiment identical to experiment SI shown in Fig. 5, except with an a priori estimate term for a in the cost function: (a) gradients of the cost function with respect to the parameters a, b, and y, and (b) the cost function and the gradient norm, (c) and (d) are the same as (a) and (b) respectively, but for experiment SI, i.e. without the a priori term. Chapter 4. Determining Model Parameters 64 Figure 4.9b Chapter 4. Determining Model Parameters 65 0 5 10 15 20 Iteration Figure 4.9c Chapter 4. Determining Model Parameters 66 Chapter 4. Determining Model Parameters 67 displayed the corresponding behavior for experiment SI. With a priori information, the gradient of parameter a was reduced by 98% percent at the first iteration (Fig.4.9a), while without the a priori term, the gradient was reduced by only 88% (Fig.4.9c). The cost function for the case with a priori information decreased 4 orders of magnitude in the first 10 iterations and continued to drop sharply from iterations 10 to 25 (Fig.4.9b), while the cost function for the case without a priori information decreased by only about 2 orders of magnitude in the first 10 iterations (Fig.4.9d). With a priori information, the cost function kept dropping to around 109 (Fig.4.9b), whereas with a estimated wrongly for the case without a priori information, the cost did not decrease very much after iteration 10 (Fig.4.9d). To test the usefulness of the a priori information, we tried to improve the performance of the 5-day assimilation experiment with temporally and spatially sparse wind data, which had given very poor estimation for the six parameters by adding a priori terms for all parameters (A = 10~6,fi = 10"7,a = 10"3,a = 10"7,^ = 10"7,7 = 10-8). However, the result remained poor (not shown). By increasing the window from 5 to 20 then 40 days, the errors were eventually reduced (Fig.4.10). Thus, it seemed that the a priori information for the parameters played a more important role in limiting the range of the parameters during the optimization search than in providing information for determining the system. A certain amount of data has to be provided for determining the coupled system even when one has good a priori estimation of the parameters. Alternatives such as adding a temporal smoother or spatial smoother as suggested by Thacker (1989) might be helpful for parameter estimation. 4.3 Conclusion With both wind and the SLH data available everywhere in space and time, the six model parameters were easily retrieved by the adjoint method, though the six guessed parameters were a factor of five larger than their true values, with the corresponding model output describing a much stronger warm event (Fig.4.11) than the control run (Fig.4.1). As sparsity and noise effects were introduced into the data assimilation, information needed to invert the coupled Chapter 4. Determining Model Parameters 68 5 2 0 4 0 window (days) Figure 4.10 Relative estimation error for the 6 parameters for various windows, with data availability identical to the case of assimilation every 90 time steps in Fig. 4.4a. A priori information was included for all parameters for the 20 and 40 day experiments. Chapter 4. Determining Model Parameters 69 system became insufficient. The emergence of local minima of the cost function led to degraded estimations for some parameters. The coupling parameters a and y and the atmospheric Rayleigh damping coefficient A were found to be well determined in most situations, while the oceanic damping coefficients a and b and the atmospheric Newtonian cooling coefficient B were more sensitive to data sparsity and noise. Winds were found to be very important in parameter retrieval as the relative estimation errors of all parameters tended to grow when the time interval between available wind data was enlarged. The temporal density of SLH observations had less effect on the estimates than that of wind data. The spatial density of the SLH observations was also found to be less important when dealing with large-scale features of the coupled system and very sparse SLH data near the equator could be adequate. Since the atmosphere varies much faster than the ocean and so does its adjustment to the external forcing, than the information required for determining the coupled system must therefore be more demanding for the atmospheric part than for the oceanic part. The information transfer within the coupled system seemed to work well only when long assimilation windows were applied. The relatively good estimation for a and y (compared to the damping parameters) might be attributed to the combined contributions of information from both parts of the coupled system, or to a more direct constraint by the two parts of the coupled system. Noisy data greatly influenced parameter estimation. The degrading effect on the estimates was smaller with noise in the SLH data than with noise in the wind or in both the wind and SLH. The coefficients a and A were relatively unaffected by both atmospheric and oceanic noise, while coefficients B and a were most sensitive to combined noise. A longer assimilation period (window) was found to improve the estimation performance. However, it was found that the optimal window for the wind data was not the optimum for the SLH data, which suggested the need for a careful selection of the most appropriate window for a coupled system. A priori information for individual parameters as implemented in the cost function was useful in providing information for the size of the parameters and enhancing the Chapter 4. Determining Model Parameters convexity of the cost function, but not as a substitute for inadequate data. 70 M A X / M I N 3 1 / - 4 6 CM Figure 4.11 Spatial structure of the depth of the thermocline (cm) after integrating the coupled model for 40 days using damping and coupling parameters that are a factor of 5 larger than the parameters in Table 1. Shaded regions indicate elevated thermocline. Chapter 5. Determining Model Initial Conditions 71 Chapter 5 Determining Model Initial Conditions Most simple models are based on the theories of equatorially trapped linear modes (Matsuno, 1966, Gill, 1980). The atmosphere is in steady state, while the ocean is time-dependent and needs to be initialized. The initial state of the ocean was usually established by forcing the ocean component of the coupled model with the observed wind stress anomalies for several years prior to the forecast date, and then the coupled model was run forward in time for forecasting. Coupling the atmosphere with the ocean without an initialization process assimilating the atmospheric and the oceanic data could cause inconsistency between the atmosphere and the ocean, which can result in poor prediction skill of simple models. It is presently not known how much of the simple model prediction errors can be ascribed to errors in the specified initial conditions. The general lack of data has greatly retarded the progress in initializing the simple models by assimilating data from the tropics. Recently, the observing system in the tropical Pacific has been greatly enhanced by the TOGA experiment, enriching the existing data sets and allowing exploration of data assimilation in simple coupled models. Our two objectives are: (1) to examine the usefulness and sufficiency of these data sets in determining the oceanic states, and (2) to develop efficient data assimilation techniques to satisfactorily initialize the simple models using these data sets. Carton et al (1996) have recently examined the relative importance of the main elements of the present observing system — sea level height data (SLH) provided by TOPEX/POSEIDON, the subsurface temperature data mainly by the Volunteer Observing Ship XBTs, and the subsurface temperature and current data by the TOGA TAO array moorings — in establishing the ocean state of the GFDL OGCM. Some recent efforts have been made to apply new data assimilation schemes to initialize simple coupled systems: Chen et al (1995) nudged Chapter 5. Determining Model Initial Conditions 72 surface wind data into the Cane-Zebiak model to overcome the well-known "spring barrier" in El Nino prediction, while Kleeman et al (1995) assimilated subsurface oceanic data into their simple model by the adjoint data assimilation method to improve its prediction skill. One of the advantages of the adjoint technique is that by using model dynamics as a constraint, information can be transferred from regions with data to other regions without data and from later times to earlier times, thereby potentially solving the data insufficiency of spatial and temporal coverage. However, studies have shown that data sparsity, initial guesses and noisy data could all affect the conditioning of the problem and the efficiency of this algorithm. In this idealized study, we tackle the following three problems in determining the initial conditions of simple models by the adjoint method: (1) The nature of information transfer — with the coupled model assuming the role of a dynamical interpolator/extrapolator, advection and propagation of information in 4-D space are expected to produce information exchange between the atmosphere and the ocean and from later times to earlier times. Previous studies have investigated 3-D information transfer in the ocean, i.e., whether sea surface information could be transferred down to reconstruct the deep circulation, and whether local information could be transferred to distant areas (Ghil and Malanotte-Rizzoli, 1991). (2) Sufficiency of information from the equatorial wave guide area: The TAO array, covering the equatorial wave guide area, has been developed for monitoring large-scale ocean-atmosphere interactions. Carton et al. (1996) have shown that thermistor data from the TAO array are helpful, but not as crucial, in resolving the seasonal cycle within the equatorial wave guide area as the XBTs are for the GFDL OGCM. Unstable waves associated with ENSO (El Nino-Southern Oscillation) are equatorially trapped, so that seasonal and interannual variations in this area are more pronounced than those outside it. Whether the present observing system is adequate for establishing the initial state of the simple models is therefore an interesting question. (3) The influence of initial guess and noise in the data: The initial states for simple operational models established by spinning up the ocean by the wind stress can contain some bias relative to the true ocean, and the observed data certainly contain noise. How these factors affect the Chapter 5. Determining Model Initial Conditions 73 efficiency of the adjoint data assimilation method need to be studied before the method can be used operationally. 5.1 Experiments A weaker coupling parameter set (Table 2) than that used by Philander et al (1984) was Table 2 Parameters used in the coupled atmosphere-ocean model for retrieving initial conditions. parameter Values A 2.3xl0-6s-' [ (5 days)-1] B 7.7xl0-7s-' [(15 days)-1] c =(gd Z2 66 ms-1 8a 40 cm s-2 a 1.2xl0-7 s 1 [(100 days)1] b 1.2xl0-7 s-1 [ (100 days)-1] c =(gd /2 1.4 ms-1 O 1 O o O' g0 2.0 cm s-2 a 2.5X10-3 s 1 Y 6.5xl0-8 s-1 chosen to run this model for 90 days to address the adjoint initialization problems for simple models at a seasonal time scale. The three initial oceanic conditions (h0, u0, and v 0) in the control run were obtained after allowing the ocean to first relax for 10 days from the initial sea level height anomaly (3.11). The atmosphere and the ocean were then coupled once per day for another 10 days. The atmospheric and the oceanic states at the end of this 10-day coupled run were saved as initial conditions for a 90-day control run. Since most real operational simple models have an equilibrium atmosphere, the initial conditions for the atmospheric model were specified using the saved atmospheric state for all the experiments in this study and only the oceanic initial conditions will be adjusted by the adjoint data assimilation. The initial SLH (after the models were coupled for 10 days) is shown in Fig. 5.1a, and the evolution of the SLH of the Chapter 5. Determining Model Initial Conditions 74 I N I T A I L H M A X / M I N = 0 . 4 7 / - 6 . 8 E - 2 C M 100W 50 0 50 100E (a) Figure 5.1 Spatial structure of the thermocline depth perturbation h0 after integrating the coupled model for 10 days using the parameters in Table 2. This will be taken as the ocean's initial state for the 90-day control run and other experiments. Chapter 5. Determining Model Initial Conditions 75 H ALONG EQUATOR 2 0 1 00W 50 50 1 0 0 E (b) Figure 5.1b Temporal and spatial structure of thermocline perturbation along the equator. Shaded regions indicate elevated thermocline. The depth perturbation is against the equivalent depth d0 (see Table 2). Since in this model, the SLH is related to the thermocline depth perturbation by a constant factor, we will also refer to the h0 field as the SLH field. Chapter 5. Determining Model Initial Conditions 76 90-day control run along the equator from the initial state is shown in Fig. 5.1b. As in the parameter estimation study, the wind and SLH data were assimilated using the cost function (4.1). Their usefulness was investigated in terms of efficiency of information transfer between the atmosphere and the ocean in retrieving the three initial oceanic conditions. The effects of data sparsity on retrieving the initial conditions were investigated by varying the data window, i.e., data were made available once every 1, 10, 30 or 90 days, where 1 day is the TAO array temporal resolution of observations, 10 days, the time step of the Cane-Zebiak model, 30 days, the monthly scale, and 90 days the sparsest window of this study. The root mean square error (RMSE) and the correlation coefficient between retrieved initial conditions and the true initial conditions were used to score the quality of the estimations for the initial conditions. The relative merit of using RMSE or correlation coefficient for model verification has been discussed by Murphy et al (1989). 5.2 Results 5.2.1 Information transfer between the atmosphere and the ocean With first-guesses taken to be zero, the role of different data types and the effects of their temporal sparsity in determining the initial conditions of our system were investigated by conducting experiments assimilating separately the wind and the SLH data. Though the adjoint model clearly allows information transfer between the atmosphere and the ocean (see 3.12-3.13), it has been shown that the efficiency of information transfer is associated with the usefulness of the particular type of data assimilated and with the model physics (Moore et al. 1987). These experiments will therefore give a basic account on the usefulness of the wind and the SLH for our system. The quality of the initial condition retrieval was scored for the whole model domain as well as for two sub-areas, where area 1 was the model domain between 8°S and 8°N, representative of the TAO array area, and area 2, the rest of the model domain. Fig.5.2a and Fig.5.2b depict the RMSE and the correlation of the retrieved h0 field with respect to the true h0 field by assimilating the wind or the SLH data, available everywhere in Chapter 5. Determining Model Initial Conditions 77 1 0 -1 -2 1<Tr-O LU CO DC 1 0 -3 1 0 -4 I II ll 8 1 1 I II 11 l l 1 S w a1&2 E w a1 Q w a 2 • h a1&2 • h a1 0 h a2 9 0 10 3 0 Data window (days) (a) Figure 5.2 The (a) RMSE and (b) correlation skill in retrieving the initial conditions h0, the (c) RMSE and (d) correlation skill in retrieving u0, and the (e) RMSE and (f) correlation skill in retrieving v0, for a 90-day assimilation run with wind or SLH available every 1, 10, 30, or 90 days at every grid point. For one day of integration, the ocean model took 8 time steps, while the atmospheric model took 90 steps. The labels w and h denote, respectively, experiments where wind or SLH data were available at every grid point in the model. The label al&2 means that the RMSE or correlation was calculated over the entire model domain, while al or a2 means that the values were only over area 1 (the TAO array area) or area 2 (the rest of the model domain). Chapter 5. Determining Model Initial Conditions 78 Figure 5.2b (b) Chapter 5. Determining Model Initial Conditions 79 space but once every 1, 10, 30 or 90 days. The retrieval of /i0was insensitive to the data sparsity, i.e., the RMSE and the correlation did not grow remarkably with the increasing temporal sparsity of the wind or SLH data except for the experiment where the SLH data were available only at day 90. The quality of the retrieved h0 field over area 1 was about the same as that over area 2 with the RMSE slightly smaller in area 2. When assimilating the SLH data, the RMSE is typically two orders of magnitude smaller than that when assimilating the wind data. However, the correlation was about the same for assimilating either data sets; it was very high ( 0.92-0.99 ) even for the case where the wind data ( ua and va) or the SLH data ( h0) were fewer in number than the unknown initial conditions (h0, u0, and v0 ), i.e. when the wind or SLH data were available only at day 90. Thacker et al (1989) have also shown a case of retrieving unknowns well where the amount of data equals the number of unknowns. One tends to think that good agreement would mean low RMSE and high correlation. However, it is quite common to have high RMSE occurring in conjunction with high correlation, and conversely. Since RMSE is not normalized with respect to the magnitude, a region of strong signals would tend to have a high RMSE even when the agreement is quite good. In contrast, if two anomaly fields are well aligned but have very different magnitudes, correlation values will be high despite the disagreement in the magnitude. The characteristics of the h0 retrieval for different experiments might be interpreted as follows: Since the strength of the anomalies was much greater in area 1 than in area 2, (see e.g., the SLH in Fig.5.1), the RMSE and the correlation were both likely to be larger in area 1 than in area 2, as found in Fig.5.2. Hence, the smaller RMSE obtained by assimilating the SLH data rather than by assimilating the wind data suggested that the SLH data were more useful than the wind data in retrieving h0 In contrast to the retrieval of h0, the retrieval of initial current fields was indeed sensitive to data sparsity. For u0, the retrieval was more sensitive to the sparsity of the SLH data than the wind data (Fig.5.2c and d). Degradation of the RMSE and the correlation occurred simultaneously when the wind data window was enlarged to 30 days, while the RMSE started to grow when the SLH data window was longer than one day. Interestingly, the Chapter 5. Determining Model Initial Conditions 80 1 0" 1 0" 3 O C -3 -4 1 C T b 1 0 -5 I i a w a1&2 0 w a1 Q w a 2 • h a1&2 • h a1 S h a2 10 3 0 Data window (days) 9 0 (c) Figure 5.2c Chapter 5. Determining Model Initial Conditions 81 Figure 5.2d (d) Chapter 5. Determining Model Initial Conditions 82 correlation did not immediately follow this RMSE degradation; instead its degradation occurred when the SLH data window was enlarged to 30 days. Although the u0 retrieval was more sensitive to the SLH sparsity, the retrieval quality by assimilating the SLH data was better than that by assimilating the wind data, e.g. when the wind data was available every 30 days, the correlation in area 2 was less than 0.5 (Fig.5.2d), much lower than the corresponding value with SLH data. Note that the RMSE in area 1 of the u0 retrieval was larger than that in area 2, while the correlation was smaller in area 2. The larger RMSE in area 1, as aforementioned, was associated with the stronger variations of the zonal current in the area, while the smaller correlation in area 2 might be caused by the more complicated structures of the current field induced by the boundaries. The v0 retrieval was even more sensitive to data sparsity than the u0 retrieval, e.g. the degradation of the retrieval quality occurred when both the wind and the SLH data window were enlarged to longer than one day (Fig.5.2 e and f). Like the w0 retrieval, the RMSE grew with the increasing temporal sparsity of the SLH data, while degradation of the correlation was not so obvious. But the correlation degradation was more obvious than the RMSE degradation when the wind data were assimilated. Note that when the SLH data or the wind data were assimilated once every 30 days, the SLH data were only half as numerous as the wind data (h0 versus ua and va), but the RMSE values for assimilating either data set were comparable, while the correlation by assimilating the SLH data was about 60% higher than that by assimilating the wind data. It seemed that sparse SLH data contained sufficient information to obtain the correct pattern of the retrieved fields, and the increasing sparsity mainly produced larger RMSE, whereas the wind sparsity mainly caused the pattern of retrieved fields to deteriorate. This again indicated that the SLH data were more useful in retrieving the current fields. The RMSE in area 1 was slightly larger than that in area 2, while the correlation in area 1 was slightly higher than in area 2. We can therefore conclude that by assimilating wind only, the information from the wind can be transferred to the ocean part to determine the oceanic initial conditions. However, Chapter 5. Determining Model Initial Conditions 83 > o LU CD C C 10 3 0 Data window (days) (e) Figure 5.2e Chapter 5. Determining Model Initial Conditions 84 1 _ 0 . 8 1 0.6 0.4 0.2 0 •0 .2 11* 3 w a1&2 0 w a1 0 w a 2 • h a1&2 • h a1 S h a2 1 1 J I 1 0 21 3 0 Figure 5.2f Data window (days) (f) I 9 0 Chapter 5. Determining Model Initial Conditions 85 temporally denser wind data are required to achieve an overall retrieval quality for the three oceanic initial conditions (h0, u0, and v 0) similar to that obtained by assimilating SLH data. Of the three initial condition, /i0was less sensitive to the temporal sparsity of the wind data and the SLH data than u0 and v0. Without prior information about the first-guesses, temporal resolution of 10 days for the wind or 30 days for the SLH are required for a good retrieval of the oceanic initial conditions in our system. Next, experiments were conducted to examine the effects of combined temporal and spatial sparsity of the wind and the SLH data on retrieving the initial ocean conditions. For these experiments, wind and SLH data were sampled from area 1 with the TAO array spatial resolution of 2° latitude by 15° longitude and once every 1, 10, 30 or 90 days. Zero first guesses were again used. The wind data in area 1 were found to be relatively more important than the SLH data in area 1 in determining h0 (Fig.5.3a and b), i.e., without the wind data in area 2, the quality of the h0 retrieval was about the same as that in the corresponding experiments (Fig.5.2a and b) with the wind data in area 2; the growth of the RMSE and the decrease of the correlation were minor, and no substantial degradation of the retrieval occurred in area 2. However, without the SLH data in area 2, the RMSE degradation in the h0 retrieval was remarkable (by 1-2 orders of magnitude larger, with degradation in both area 1 and area 2). Though the RMSEs were comparable between the experiments assimilating the wind data in area 1 and the SLH data in area 1, the correlation for the experiments assimilating the SLH data showed more apparent degradation, caused mainly by the degradation of the h0 retrieval in area 2. Without the wind or the SLH data in area 2 the information provided was generally insufficient for retrieving the current fields. Compared with the experiments assimilating the wind everywhere in space, the RMSE of the w0 retrieval increased by about one order of magnitude (cf. Fig.5.2c and Fig.5.3c), and the correlation was lower than 0.5 when the window was longer than one day (cf. Fig.5.2d and Fig.5.3d). However, the low overall correlation was caused by the low correlation in area 2, as the correlation for area 1 was Chapter 5. Determining Model Initial Conditions 86 1 0 3 0 data window (days) (a) Figure 5.3 Same as Fig. 5.2 except for assimilating data at every 2nd grid in the meridional direction and every 15th grid point in the zonal direction in area 1. Chapter 5. Determining Model Initial Conditions 8 7 Figure 5.3b (b) Chapter 5. Determining Model Initial Conditions 88 1 0 u O .9 LU 1 0 2 ' CO io"° u 1 0" i i l l S w a1&2 0 w a1 Q w a 2 • h a1&2 • h a1 S h a2 10 3 0 Data window (days) 9 0 (c) Figure 5.3c Chapter 5. Determining Model Initial Conditions 89 o 0-8 O = 0.6 CO .2 0.4 _co CD 5 0.2 0 -0 .2 • w a1&2 • h a1&2 • w a1 h a1 • w a2 m h a2 10 30 Data window (days) (d) 90 Figure 5.3d Chapter 5. Determining Model Initial Conditions 90 ^1 • S w a1&2 O w a1 Q w a 2 • h a1&2 • h a1 S h a2 10 3 0 Data window (days) (e) 9 0 Figure 5.3e Chapter 5. Determining Model Initial Conditions 91 0.8 o = 0.6 CO .1 0.4 05 CD 8 0.2 0 - 0 . 2 I 1 B w a1&2 • h a1&2 w a1 h a1 • w a2 m h a2 2 1 I 1 0 3 0 9 0 Figure 5.3f Data window (days) (f) Chapter 5. Determining Model Initial Conditions 92 insensitive to the increasing wind sparsity and was acceptable (about 0.6). Compared with the experiments assimilating the SLH data everywhere in space, the RMSEs for assimilating the SLH once 1 or 10 days were about two orders of magnitude larger, while the correlations were very low for all the experiments except for having the SLH data once a day. This indicated that without some information from area 2, the available information was insufficient for retrieving the zonal current field. Without wind data or SLH data in area 2, the information insufficiency was even more remarkable for v0 retrieval (Fig.5.3e and f). The retrieval failed because the correlations were generally lower than 0.4 except for the experiment where the SLH data were available once a day. We concluded that the initial sea surface height h0 of our model was more easily estimated and less sensitive to the assimilated data type (wind and SLH), and the temporal and spatial sparsity of data, than the initial currents (u0 and v0). Wind data in area 1 seemed more helpful than the SLH data in determining the oceanic initial conditions. Observations (wind or SLH) in area 1 was enough for estimating h0, but not for the current fields, especially in area 2. 5.2.2 Impact of initial guess on the initialization ENSO is generally believed to be a quasi-periodic phenomenon with the time scale set by the transit times of equatorial Kelvin or Rossby waves. Its predictability depends on our being able to identify where the coupled system is on the ENSO cycle. An ocean spun-up by wind stress may contain errors in the intensity and phase of the cycle as represented by the oceanic anomaly. Using such a spun-up ocean as the first-guess for the adjoint data assimilation could affect the convergence of the minimization procedure. To study this, we conducted four experiments where the first-guesses were from a spun-up ocean with defects: (a) The ocean has the correct phase but incorrect intensity. For generating the first guesses, the magnitude of the initial h in (3.11) was first systematically scaled up by 10% from the control. The first-guess for case 1 was then generated in the same way as the true initial conditions but by using the perturbed h. For case 2, 10% random noise was added to the "true" initial conditions from the control run. Chapter 5. Determining Model Initial Conditions 93 M A X / M I N 0 . 4 5 / - 0 . 7 2 C M 100W 50 0 50 1 0 0 E (a) Figure 5.4 Difference in the initial (a) h0, (b) u0 and (c) v0 perturbations between case 3 and the control run. Chapter 5. Determining Model Initial Conditions 94 M A X / M I N 2.9/-3.7 C M / S (b) Figure 5.4b Chapter 5. Determining Model Initial Conditions 95 M A X / M I N 3 . 2 / - 3 . 2 C M / S 100W 50 0 50 1 00E (c) Figure 5.4c Chapter 5. Determining Model Initial Conditions 96 (b) The ocean has both the phase and the intensity incorrect: The first-guesses for case 3 and case 4 were generated by relaxing the perturbed h located initially to the left (centered at grid 50 along the equator), and to the right (grid 150 along the equator), respectively, of the location in the control run (grid 100 along the equator). Fig.5.4 shows the difference of case 3 from the true initial conditions. Instead of describing a warm event developing in the central equatorial Pacific, case 3 described a warm event developing in the western equatorial Pacific, together with a strong westward current, while case 4 described a warm event which had propagated to the eastern equatorial Pacific accompanied by a strong eastward current (not shown). To compare with the results from previous studies using zero initial guess, experiments in this section were also conducted with data available once every 1, 10, 30, and 90 days only in area 1 with TAO array spatial resolution. With either the wind data or the SLH data, the retrievals of the three initial states from case 1 were very good, showing little sensitivity to the increasing temporal sparsity. Fig.5.5 showed the RMSE and the correlation for the retrievals with the wind data or the SLH data available only at day 90. Though the data were far fewer than the unknowns for these two experiments, the RMSEs were very small and the correlations were very high, especially for the current fields, e.g., with only the wind data, the RMSE was of the order of 10"4for u0 and 10-5 for v0, much smaller than those found earlier by zero initial guess (cf. Fig.5.5a, Fig.5.3c and Fig.5.3e). The correlation for the current fields were also improved in this case ( Fig.5.5b, Fig.5.3d and Fig.5.3f). The quality of the retrievals for area 2 was also very good (not shown). Case 2 was found to be more difficult to retrieve than case 1 as the RMSE and the correlation skill in retrieving the initial conditions from noisy initial guess deteriorated with increasing temporal sparsity of the wind or the SLH. However, even with data only at day 90, the retrievals were still better than those obtained by zero initial guess, especially for the current field retrieval (cf. Fig.5.5 and Fig.5.3). With data only in area 1, retrievals for area 2 were as good as in area 1 (not shown). Since our model is linear, the first guesses of the in-phase case should have the same spatial distributions as the true initial conditions. This might explain why Chapter. 5. Determining Model Initial Conditions 97 1 0 u 1 0" io" y LU 1 0" 1 0" 1 0" 1 0" w i n d w i n d I SLH I S L H & wind SLH • h LTJ  u V e a s e l c a s e 2 e a s e l c a s e 2 c a s e 3 c a s e 4 (a) Figure 5.5 (a) RMSE and (b) correlation from retrieving three initial conditions from case 1 and case 2 by assimilating wind or SLH data at day 90 in area 1, and from case 3 and case 4 by assimilating once every day both wind and SLH data in area 1. Chapter 5. Determining Model Initial Conditions 98 Figure 5.5b (b) Chapter 5. Determining Model Initial Conditions 99 the retrievals for case 1 needed few data and the information propagated readily from the later time to the starting time to determine the initial conditions. More information might be required to establish the phase of the initial ocean because phase restructuring implies energy redistribution. As the noise in the initial guess is harmful to retrieving the initial conditions, the noise level could be an important factor in the retrievals. The results from the two out-of-phase ( i.e. anomaly in wrong location) cases supported the above arguments. Having only the wind data or the SLH data in area 1 was generally insufficient for retrieving the initial conditions in both case 3 and case 4. With both wind and SLH data, the retrievals were still very poor. From Fig.5.5 where both wind and SLH data were assimilated once every day, we observed the same characteristics of information insufficiency found in the experiments with a zero first guess or with very sparse data, i.e., in retrieving the h0, the RMSEs were large and the correlations were high, while the retrievals for current fields were poor. Compared with Fig.5.3, the retrievals for case 3 and case 4 were poorer than those assimilating wind or SLH data with zero initial guess. Fig.5.6 depicts the difference between the retrieved initial current from case 3 and the true initial currents after assimilating both wind and SLH data in area 1 once every day. The initial strong eastward current in Fig.5.4 has been substantially reduced after data assimilation (cf. Fig.5.6b and 5.4b), while the meridional current structure has not (Fig.5.6c and 5.4c). Hence the efficiency of the adjoint method and the information sufficiency depend on the original guess, especially, the phase (i.e. location) of the warm water anomaly. With only wind and SLH data in the equatorial wave guide area, observations once per day will probably be insufficient for estimating the true ocean if the initial first guess for the ocean contains a large phase discrepancy to the true anomaly. 5.2.3 Impact of noisy data on initialization Noise in the data is detrimental to a successful retrieval of initial conditions, as it tends to Chapter 5. Determining Model Initial Conditions 100 M A X / M I N 0 . 3 8 / - 0 . 2 4 C M 100W 50 0 50 1 0 0 E (a) Figure 5.6 (a) RMSE and (b) correlation from retrieving three initial conditions (h0, u0 and v0) for a 90-day assimilation with noisy wind or both noisy wind and SLH data available once every 1 or 10 days at every 2nd grid in the zonal direction and every 15th grid in the meridional direction in area 1. Chapter 5. Determining Model Initial Conditions 101 MAX/MIN 1.4/-2.6 CM/S 100W 50 0 50 1 0 0 E (b) Figure 5.6b Chapter 5. Determining Model Initial Conditions 102 M A X / M I N 3 . 1 / - 3 . 1 100W 50 0 50 1 0 0 E (c) Figure 5.6c Chapter 5. Determining Model Initial Conditions 103 cause ill-conditioning of the problem (Thacker and Long, 1988). We conducted experiments to test the effect of assimilating noisy wind and SLH data in area 1. Normally distributed random noise with zero mean and standard deviation r (which was 10% of the maximum value of each state variable at day 90 of the control run) was added at every time step. The noisy initial conditions in the last section (case 2) were used as first guess. The retrievals of the initial conditions were less sensitive to the noise in wind data than noise in the SLH data. For example, with noisy wind data, the RMSE had the same order of magnitude as that estimated by assimilating perfect data, and showed little degradation to the increasing wind sparsity (Fig.5.7). However, the noise in the SLH was very detrimental to the retrievals of the initial conditions (not shown). Increasing the amount of SLH data (e.g. by decreasing the data window from 10 days to 1 day) actually brought more harm to the retrievals. By assimilating both wind and SLH data (Fig.5.7), the wind did not improve the retrievals, and the results were much poorer than those from assimilating wind data only. The RMSEs from using data once every 10 days was ironically smaller those from using data once every day. The correlation was also very low for the SLH and the zonal current. Hence noisy SLH data were most harmful to the retrievals of the ocean initial conditions. 5.3 Conclusion In this chapter, the sensitivity in retrieving the three initial conditions (SLH, two current components) to data type (wind and SLH), data sparsity, first guess of the initial conditions, and data noise were investigated. Usefulness of data type was tested by retrieving the initial conditions with first guess taken to be zero. Wind data assimilation was found to be useful in retrieving the initial oceanic conditions, with the overall retrieval quality sensitive to the temporal sparsity of the wind data. Information about the SLH seemed to be more effective than the wind in retrieving the three initial fields. Among the three, the retrieval of the initial SLH was least sensitive to data type and Chapter 5. Determining Model Initial Conditions 104 LU Data window (days) (a) Figure 5.7 (a) RMSE and (b) correlation from retrieving three initial conditions (h0, w0 and v 0) for a 90-day assimilation with noisy wind or both noisy wind and SLH data available once every 1 or 10 days at every 2nd grid in the zonal direction and every 15th grid in the meridional direction in area 1. Chapter 5. Determining Model Initial Conditions 105 Figure 5.7b Chapter 5. Determining Model Initial Conditions 106 to the temporal sparsity of data assimilated, while the retrieval of the current fields was sensitive to the temporal sparsity of the data assimilated. A temporal resolution of 10 days for the wind data was found necessary for achieving an acceptable retrieval quality. Since the current fields were determined by both wind and the SLH (see 3.13), the better results with the SLH data in retrieving the current fields might be attributed to the fact that (1) the wind was not a state variable of the ocean model, and (2) the oceanic motion had a slower time scale and therefore a longer memory to retain the information obtained from the SLH data, while the faster atmospheric motion has a shorter memory to carry the information obtained from the wind data. More information to retrieve the current fields was needed as the structure of the currents was more complicated, with smaller spatial scales than the SLH, especially for the regions outside the equatorial wave guide area and the regions along the boundary. Wind and SLH in the equatorial wave guide area were used to test the sensitivity of retrieving the initial oceanic fields to the combined (temporal and spatial) sparsity of data, and the sufficiency of the TAO array observing system in retrieving the initial conditions of simple models. Assuming no prior information was available for the ocean (i.e. a zero initial guess for the oceanic fields), wind and SLH data were found to be sufficient for retrieving the initial SLH, but insufficient for retrieving the current fields outside the equatorial wave guide area. The current fields in the equatorial wave guide area could not be retrieved well if the temporal resolution of the observations was longer than one day. The first guess in adjoint data assimilation plays the important role of providing initial background information on the ocean — a good first guess can provide useful information for retrieving the initial conditions, while a poor guess can be detrimental. The data sufficiency for retrieving the initial fields is therefore associated with the quality of the first guess. Two points can be made concerning the first guess effect: (1) Systematic bias in the magnitude of the initial conditions used for the first guess was easily overcome, without the need of a large amount of data as temporally very sparse wind or SLH data in the equatorial wave guide area were enough to retrieve the initial conditions well for the whole model domain. (2) Noisy perturbations in the Chapter 5. Determining Model Initial Conditions 107 magnitude of the initial conditions used for the first guess, however, were more difficult to handle. The retrieval of the initial conditions was found to be sensitive to the data sparsity for this case, which implied data insufficiency. For these two cases, the overall quality in retrieving the initial conditions was still better than those achieved from first guesses of zero, suggesting that initial ocean fields with error in the magnitude were still useful for initializing simple models. A possible explanation is that for a linear system, perturbing the magnitude of the initial guesses does not change the spatial structure of the true initial conditions, so that the model output and the control run output have the same spatial structure as well. The similarity in the spatial structure of the data (taken from the control run output) and of the model output does not lead to the tremendous demand for information, which arose in case 3 and case 4, where the first guesses had large spatial phase difference from the true initial conditions. The uncertainty of the phase differences between the first guess and the true initial condition could not be resolved easily by assimilating data in the equatorial wave guide area, as both wind and SLH data in the equatorial wave guide area with a temporal interval of one day were found to be insufficient. With current observations available from the TAO system, one can expect improvements in the poor current field retrieval. While SLH data was superior to wind data in retrieving the oceanic initial conditions, it was ironic that noise in the SLH data was far more detrimental to the retrieval than noise in the wind data. The reason is that when there is noise in the wind data, the ocean does not adjust quickly enough to the erroneous wind data, and with the atmosphere mainly a slave to the ocean at low frequencies, the wind soon adjusts to the ocean state — hence little damage is done by the noise in the original wind data. In contrast, when there is noise in the ocean data, the slave atmosphere adjusts rapidly to the erroneous ocean data, hence oceanic noise can cause much more damage to the coupled system. Chapter 6. Summary and Discussions 108 Chapter 6 Summary and Discussions This thesis was motivated by the need to develop efficient data assimilation techniques capable of validating and initializing simple equatorial coupled models with tropical atmospheric and oceanic data. A general procedure, assimilating tropical atmospheric and oceanic data into atmosphere-ocean coupled systems to determine their parameters and initial conditions, was developped. The feasibility and potential of applying this procedure to simple equatorial coupled models were studied with the Philander et al. (1984) simple equatorial coupled atmosphere-ocean model, which has the atmosphere and the ocean each represented by a single-layer linear shallow water model. A series of identical twin experiments was conducted to retrieve the six model parameters and the three initial oceanic conditions from data designed to resemble the available tropical data, which are typically asynchronous, asynoptic, sparse, noisy, and of mixed types. The effects of different data distributions and noise on the retrieval quality of the six parameters and the effects of different data distributions and noise, and the initial guesses of control variables on the retrieval quality of the three initial oceanic conditions were investigated. The retrieval quality was quantified by the relative estimation error between the estimated parameters and the true parameters, the root mean square error (RMSE), and the correlation between the retrieved initial conditions and the true initial conditions. My main purpose was to identify the usefulness of the wind and the SLH data in retrieving the six parameters and the three initial oceanic conditions, and to test the efficiency of information transfer between the atmosphere and the ocean. With both wind and the SLH data available everywhere in space and time, the six model parameters were easily retrieved by the adjoint method, though the six guessed parameters were a factor of five larger than their true values, with the corresponding model output describing a Chapter 6. Summary and Discussions 109 much stronger warm event than the control run. Winds were found to be very important in parameter retrieval as the relative estimation errors of all parameters tended to grow when the time interval between available wind data was increased. The temporal density of SLH observations had less effect on the estimates than that of the wind data. Winds were able to retrieve the initial oceanic conditions but the overall retrieval quality for the initial conditions was sensitive to the temporal sparsity of the wind data. The temporal density of SLH observations again had less effect on the retrieval of the initial conditions than that of the wind data. Since the atmosphere and its adjustment to the external forcing vary much faster than the ocean, the information required for determining the coupled system must therefore be more demanding for the atmospheric part than for the oceanic part. The information transfer within the coupled system seemed to work well only when long assimilation periods were used. Spatial sparsity was found to be less influential than temporal sparsity to both parameter estimation and initialization. Spatially sparse winds (TAO array resolution in area 1 and very sparse resolution in area 2) can cause degradation of parameter estimations, while the same spatially sparse SLH had little effect on parameter estimations. SLH observations at a few points in area 1 were found to be enough to retrieve the six parameters if temporally dense wind data were provided. Wind and SLH in the equatorial wave guide area were found to be sufficient for retrieving the initial SLH, but not enough for retrieving the current fields outside the equatorial wave guide area. The current fields in the equatorial wave guide area could not be retrieved well if the temporal resolution of the observations was longer than one day. More data were required to retrieve the six parameters than to determine the three initial oceanic fields, despite the fact that the number of parameters was far fewer than the number of unknown initial conditions. Though the data amount required for both problems was comparable, the estimated error growth for the six parameters was as sensitive as those for initializing the three oceanic fields, to the increasing data sparsity in time and space. The number of minimization iterations for retrieving the six parameters was far fewer than that for retrieving the three initial conditions. Chapter 6. Summary and Discussions 110 Fitting model to spatially sparse data allows the small scale structures of the atmospheric and oceanic motions to be ignored. This is why the SLH data were found to be almost redundant in the parameter estimations. However, if data were too sparse, two problems could arise: (1) the cost and the gradient of the problem are so small that the minimization cannot be carried out, and (2) the cost and the gradient can be reduced to very small values, however the retrieval quality of the problem can be low. Our studies showed that the coupling parameters a and y and the atmospheric Rayleigh damping coefficient A were well determined in most situations, while the oceanic damping coefficients a and b and the atmospheric Newtonian cooling coefficient B were more sensitive to the wind and the SLH sparsity. The relatively good estimation for a and y (compared to the damping parameters) might be attributed to the combined contributions of information from both parts of the coupled system, or to a more direct constraint by the two parts of the coupled system. Noisy data greatly influenced parameter estimation. The degrading effect on the estimates was smaller for having noise in the SLH data than having noise in the wind or in both the wind and SLH. The coefficients a and A were relatively unaffected by both atmospheric and oceanic noise, while coefficients B and a were most sensitive to combined noise. A longer assimilation period (window) was found to improve the estimation performance. However, it was found that the most preferable window for the wind data was not the most preferable for the SLH data, which suggested the need for a careful selection of the most appropriate window for a coupled system. A priori information for individual parameters as implemented in the cost function was useful in providing information for the size of the parameters and enhancing the convexity of the cost function, but not as a substitute for inadequate data. Among the three oceanic initial conditions, the retrieval of the initial SLH was least sensitive to data type and to the temporal sparsity of the assimilated data, while the retrieval of the current fields was sensitive to the temporal sparsity of the data assimilated. A temporal resolution of 10 days for the wind data was found necessary for achieving an acceptable retrieval quality. Chapter 6. Summary and Discussions 111 Since the current fields were determined by both the wind and the SLH (see 3.13), the better results with the SLH data in retrieving the current fields might be attributed to the fact that (1) the wind was not a state variable of the ocean model, and (2) the oceanic motion had a slower time scale and therefore a longer memory to retain the information obtained from the SLH data, while the faster atmospheric motion had a shorter memory to carry the information obtained from the wind data. The current fields needed more information to retrieve as the structure of the currents was more complicated, with smaller spatial scales than the SLH, especially for the regions outside the equatorial wave guide area and the regions along the boundary. The first guess in adjoint data assimilation plays the important role of providing initial background information on the ocean — a good first guess can provide useful information for retrieving the initial conditions, while a poor guess can be detrimental. The data sufficiency for retrieving the initial fields is therefore associated with the quality of the first guess. Two points can be made concerning the first guess effect: (1) Systematic bias in the magnitude of the initial conditions used for the first guess was easily overcome, without the need of a large amount of data as temporally very sparse wind or SLH data in the equatorial wave guide area were enough to retrieve the initial conditions well for the whole model domain. (2) Noisy perturbations in the magnitude of the initial conditions used for the first guess, however, were more difficult to handle. The retrieval of the initial conditions was found to be sensitive to the data sparsity for this case, which implied data insufficiency. For these two cases, the overall quality in retrieving the initial conditions was still better than those achieved from first zero guesses, suggesting that initial ocean fields with error in the magnitude were still useful for initializing simple models. A possible explanation is that for a linear system, perturbing the magnitude of the initial guesses does not change the spatial structure of the true initial conditions, so that the model output and the control run output have the same spatial structure as well. The similarity in the spatial structure of the data (taken from the control run output) and of the model output does not lead to the tremendous demand for information, which arose in case 3 and case 4, where the first guesses had large spatial phase difference from the true initial conditions. The uncertainty of the phase Chapter 6. Summary and Discussions 112 difference between the first guess and the true initial condition could not be resolved easily by assimilating data in the equatorial wave guide area, as both wind and SLH data in the equatorial wave guide area with a temporal interval of one day were found to be insufficient. With current observations available from the TAO system, one can expect improvements in the poor current field retrieval. While SLH data was superior to wind data in retrieving the oceanic initial conditions, it was ironic that noise in the SLH data was far more detrimental to the retrieval than noise in the wind data. The reason is that when there is noise in the wind data, the ocean does not adjust quickly enough to the erroneous wind data, and with the atmosphere mainly a slave to the ocean at low frequencies, the wind soon adjusts to the ocean state ~ hence little damage is done by the noise in the original wind data. In contrast, when there is noise in the ocean data, the slave atmosphere adjusts rapidly to the erroneous ocean data, hence oceanic noise can cause much more damage to the coupled system. For an easy access to the results obtained in this thesis, Appendix D outlines all the experiments conducted, the data used for the experiments and the corresponding results. In conclusion, adjoint data assimilation holds great promise for improving simple equatorial coupled models, in the optimal estimation of model parameter and model initial conditions. It is hoped that in the not-too distant future, adjoint data assimilation will be implemented in operational El Nino forecast models, such as the Cane-Zebiak model, leading to superior forecast skills. Bibliography 113 Bibliography Allen, M . R. and M . K. Davey, 1993 : Empirical parameterization of tropical ocean atmosphere coupling -- ' The inverse Gill Problem'. J. Climate, 3, 509-530. Anthes, R. A. 1974: Data assimilation and initialization of hurricane prediction models. J. Atmos. Sci., 31, 701-719. Barnett, T. P., 1981: Statistical prediction of North American air temperature from Pacific predictors. Mon. Wea. Rev., 109, 1021-1041. Barnett, T. P., N. Graham, M . Cane, S. Zebiak, S. Dolan, J. O'Brien, and Legler, 1988: On the prediction of the El Nino of 1986-87. Science, 241, 192-196. Barnett, T. P., M . Latif, N. Gramham, M . Flugel, S. Pazan, and W. White, 1993: ENSO and ENSO-related predictability. Part 1: Prediction of equatorial Pacific sea surface temperature with a hybrid coupled ocean-atmosphere model. J. Climate, 6, 1545-1566. Barnston, A. G., and C. F. Ropelewski, 1992: Prediction of ENSO episodes using canonical correlation analysis. J Climate, 5, 1316-1345. Barnston, A. G., H. M . Van den Dool, S. E., Zebiak, T. P. Barnett, M . Ji, D. R. Rodenhuis, M . A. Cane, A. Leetmaa, N. E. Graham, C. R. Ropelewski, V. E. Kousky, E. A. O'lenic, and R. E. Livezey, 1994: Long-lead seasonal forecasts - where do we stand? Bull. Amer. Meteor. Soc. 75:2097-2114. Bennett, A. F., and Budgell, W. P. 1987: Ocean data assimilation and the Kalman filter, spatial regularity. J. Phys. Oceanogr., 17, 1583-1601. Bennett, A. F., and Budgell, W. P. 1989: The Kalman smoother for a linear quasi-geostrophic model of ocean circulation. Dyn. Atmos. Oceans 13 ( 3-4), 219-268. Bennett, A. F., and R. N. Miller, 1991: Weighting initial conditions in variation assimilation schemes. Mon. Wea. Rev., 119, 1098-1102. Bengtsson, L., M . Ghil and E. Kallen 1981: Dynamic meteorology: Data assimilation methods, Springer Verlag, New York, 330pp. Bergamasco, A., P. Malanotte-Rizzoli, W. C. Thacker, and R. B. Long, 1993: The seasonal steady circulation of the eastern Mediterranean determined with the adjoint method. Deep Sea Res., 40, 1269-1298. Bergthorsson, P. and B. R. Doos 1955: Numerical weather map analysis. Tellus, 7, 329-340. Bertsekas, D. P. 1982: Constrained optimization and Lagrange multiplier methods. Academic Press, London, 395pp. Bryson, A. E., and Y. C. Ho, 1975: Applied Optimal Control. Revised Printing. Hemisphere Publishing Corporation, New York, 481 pp. Budgell, W. P., 1986: Nonlinear data assimilation for shallow water equations in branched channels. J. Geophys. Res. 91, 10633-10644. Bibliography 114 Budgell, W. P., 1987: Stochastic filtering of linear shallow water wave processes. SIAM Journal on Scientific and Statistical Computing, 8, 152-170. Cacuci, D. G., 1981: Sensitivity theory for nonlinear systems. I. Nonlinear functional analysis approach , J, Math, Phys., 22, 2794-2812. Cane, M . A., and S. E. Zebiak, 1987: Prediction of El Nino events using a physical model. Atmospheric and Oceanic Variability, H. Cattle, Ed., Royal Meteorological Society, 153-181. Carrera, J. and S. P. Neuman, 1986a: Estimation of aquifier parameters under transient and steady state conditions, 1: Maximum likelihood method incorporating prior information. Water Resour. Res., 22(2) 199-210. Carrera, J. and S. P. Neuman, 1986b: Estimation of aquifier parameters under transient and steady state conditions, 2: Uniqueness, stability and solution algorithms. Water Resour. Res., 22(2) 211-227. Carton, J. A., and Hackert, E. C. 1989: Application of multi-variate statistical objective analysis to the circulation in the tropical Atlantic Ocean. Dyn. Atmos. Oceans 13(3-4), 491-515. Carton, J. A., Giese, B., Cao, X. , and Miller, L., 1996: Impact of altimeter, thermistor, and expendable bathythermograph data on retrospective analyses of the tropical Pacific Ocean. J. Geophys. Res. 101, 14147-14159. Chen, D., S., Zebiak, A. J., Busalacchi, M. , Cane, 1995: An improved procedure for El Nino forecasting: implications for predictability. Science, 269, 1699-1702. Clancy, R. M . , Pollock, K. D., Cummings, J. A., and Phoebus, P. A. 1988: "Technical Description of the Optimal Interpolation System (OTIS) Version 1: A Model for Oceanographic Data Assimilation," FNOC Tech. Note 422-86-02, 422 Branch. Fleet Numerical Oceanography Center, Monterey, California. Courtier, P., and O. Talagrand, 1987: Variational assimilation of meteorological observation with the adjoint vorticity equations. Part II. Numerical Results. Quart. J. Roy. Meteor. Soc, 113, 1329-1347. Courtier, P., and O. Talagrand, 1990: Variational assimilation of meteorological observations with the direct and adjoint shallow-water equations. Tellus, 42A, 531-549. Courtier, P., J., Derber, R., Errico, J. F., Louis, and T. Vukecevik, 1993: Important literature on the use of Adjoint, Variational Methods and the Kalman filter in meteorology. Tellus, A-Dyn Meteorol Oceanogr, 45, 342-357. Cressman , G. P. 1959: An operational objective analysis system. Mon. Wea. Rev. 87, 367-374. Daley, R., 1978: Variational nonlinear normal mode initialization. Tellus 30, 201-218. Daley, R., 1991: Atmospheric Data Analysis. Cambridge University Press, Cambridge, 457 pp. Daley, R., 1993: Atmospheric data assimilation on the equatorial beta plane. Atmos-Ocean, 31, 7, 421-450. Bibliography 115 Das, S. K., and R. W. Lardner, 1991: On the estimation of parameters of hydraulic models by assimilation of periodic tidal data. J. Geophys. Res., 96,15,187-15,196. Davies, H., and R. Turner, 1977: Updating prediction models by dynamic relaxation: A examination of the technique. Quart. J. Roy. Meteor. Soc, 103, 225-245. Derber, J. C , and Rosati, A. 1989: A global ocean data assimilation system. J. Phys. Oceanogr., 19, 1333-1347. Derber, J. C , 1989: A variational continuous assimilation technique. Mon. Wea. Rev. 117, 237-2445. Dewitte B., and C. Perigaud 1996: El Nino - La Nino events simulated with Cane and Zebiak's model and observed with satellite or in situ data. Part I: Model forced with observations. J. Climate, 9, 1188-1207. Eliassen, A. 1954: Provisional report on calculation of spatial covariance and autocovariance of the pressure field. Videnskapsakademiet Institutt for Va? og Klimaforskning, report no. 5. (Reprinted in Dynamic meteorology:Data assimilation methods. Bengtsson, L., M . Ghil and E. Kallen editors, Springer Verlag, New York, 1981, 319-330.) Eisner, J. B., and C. P. Schmertmann, 1993: Improving extended-range seasonal predictions of intense Atlantic hurricane activity. Wea. Forecasting, 8, 345-351. Errico, R. M. , and T. Vukicevic, 1992: Sensitivity analysis using an adjoint of the PSU-NCAR mesoscale model. Mon. Wea. Rev. 120, 1050-1076. Farrell, B. F., and A. M . Moore, 1992: An adjoint method for obtaining the most rapidly growing perturbation to oceanic flows. J. Phys. Oceanogr., 22, 338-349. Fletcher, R., 1987: Practical Methods of Optimization. John Wiley & Sons, New York, 436 pp. Fu, L. L., J. Vazquez, and C. Perigaud, 1991 : Fitting dynamic models to Geosat sea level observation in the tropical Pacific ocean. Part I: A free wave model. J. Phys. Oceanogr., 21. 798-809. Fu, L. L., I. Fukumori and R. Miller, 1993: Fitting dynamic models to the Geosat sea level observations in the tropical Pacific ocean. Part II: A linear, wind driven model. J. Phys. Oceanogr., 23, 2162- 2181. Gandin, L. S., 1965: Objective analysis of meteorological fields (translated from Russian). Israel program for scientific translations, Jerusalem, 242 pp. Gasper, P. and C. Wunsch, 1989: Estimates from altimetric data of barotropic Rossby waves in the northwestern Atlantic ocean. J. Phys. Oceanogr., 19, 1821-1844. Gauthier, P., 1992: Chaos and quadri-dimensional data assimilation: A study based on the Lorenz model. Tellus, 44A, 2-17. Ghil, M . S., E. Cohn, J. Tavantis, K. Buse and E. Isaacson 1981: Application of estimation theory to numerical weather prediction. In: Dynamic meteorology: Data assimilation methods. Bengtsson, L., M . Ghil and E. Kallen editors, Springer Verlag, New York, 139-224. Bibliography 116 Ghil, M. , and P. Malanotte-Rizzoli, 1991: Data assimilation in meteorology andoceanography. Adv. Geophys., 33, 141-266. Gill, A. E., 1980: Some simple solutions for heat-induced tropical circulation. Quart. J. Roy. Meteor. Soc, 100, 447-462. Gill, P. E., W. Murry, and M . H. Wright, 1981: Practical Optimization. Springer-Verlag, 377 pp. Gilchrist, B. and G. P. Cressman 1954: An experiment in objective analysis. Tellus, 6, 97-101. Goswami, N . E. and J. Shukla, 1991: Predictability of a coupled atmosphere ocean model. J. Climate, 4, 3-22. Graham, N. E., J. Michaelsen, and T. P. Barnett, 1987: An investigation of the El Nino-Southern Oscillation cycle with statistical models. 1. Predictor field characteristics. J. Geophys. Res., 92, 14 251-14 270. Graham, N. E., T. P. Bamett, and M . Latif, 1992: Considerations of the predictability of ENSO with a low-order coupled model. Proc. of the 16th Annual Climate Diagnostics Workshop, Climate Analysis Center, NNAA, Los angeles, CA, 323-329. Gray, W. M. , C. W. Landsea, P. Mielke, and K. Berry, 1993: Predicting Atlantic Basin seasonal tropical cyclone activity by 1 August. Wea. Forecasting, 8, 73-86. Hall, M . C. G. and D. G. Cacuci, 1983: Physical interpretation of the adjoint functions for sensitivity analysis of atmospheric models. J. Atmos. Sci., 40, 2537-2546. Hestenes, M . R., 1980: Conjugate directions methods in optimization. Applications of Mathematics, 12, Springer-Verlag, New York, 325. Hirst, A. C , 1986: Unstable and damped equatorial modes in simple coupled ocean-atmosphere models. J. Atmos. Sci., 43, 606-630. Hirst, A. C , 1988: Slow instabilities in tropical ocean basin - global atmosphere models. J. Atmos. Sci., 45, 830-852. Hoffman, R. N. 1982: SASS wind ambiguity removal by direct minimization. Mon. Wea. Rev., 110,434-445. Hoffman, R. N. 1984: SASS wind ambiguity removal by direct minimization. Part II. Use of smoothness and dynamical constraints. Mon. Wea. Rev., 112, 1829-1852. Holland, W. R., and P. Mallanotte-Rizzolli, 1989: Along-track assimilation of altimeter data into an ocean circulation model; space versus time resolution studies. J. Phys. Oceangr. 19, 1507-1534. Horel, J. D., and J. M . Wallace, 1981: Planetary-scale atmospheric phenomena associated with the Southern Oscillation. Mon. Wea. Rev., 109, 813-829. Hurlburt, H.'E. 1986: Dynamic transfer of simulated altimeter data into subsurface information by a numerical ocean model. J. Geophys. Res. 91, 2372-2400. Bibliography 117 Inoue, M. , and J. J. O'Brien, 1984: A forecasting model for the onset of a major El Nino. Mon. Wea. Rev., 112, 2326-2337. Ji, M. , A. Kumar, and A. Leetmaa, 1994: An experimental coupled forecast system at the National Meteorological Center: Some early results. Tellus, 46A, 398-418. Kalman, R. E. 1960: A new approach to linear filtering and prediction theory. Journal of Basic Engineering (Transactions of the ASME), 82D, 35-45. Kalman, R. E. and R. S. Busy 1961: New results in linear filtering and prediction theory. Journal of Basic Engineering (Transactions of the ASME), 83D, 95-108. Keppenne, C. L., and M . Ghil, 1992: Adaptive filtering and prediction of the Southern Oscillation Index. J. Geophys. Res., 97, 20449-20454. Kimeldorf, G. and Wahba, G. 1970: A correspondence between Bayesian estimation on stochastic processes and smoothing by splines. Ann. Math. Statist. 41, 495-502. Kindle, J. G. 1986: Sampling strategies and model assimilation of altimetric data for ocean monitoring and prediction. J. Geophys., Res. 91, 2418-2432. Kitamura, S. and S. Nakagiri, 1977: Identifiability of specially varying and constant parameters in distributed systems of parabolic type. SIAM, Journal on control and optimization, 15, 785-802. Kleeman, R., Moore, A. M . and Smith N. R., 1995: Assimilation of subsurface thermal data into a simple ocean model for the initialization of an intermediate tropical coupled ocean-atmosphere forecast model. Mon. WEa. Rev. 123, 3103-3113. Lacarra, J-F., and O. Talagrand, 1988: Short-range evolution of small perturbations in a barotropic model. Tellus, 40A, 81-95. Lardner, R. W., 1992: Optimal control of open boundary conditions for a numerical tidal model. Comput. Methods Appl. Methods Eng., 102, 367. Latif, M. , A. Sterl, E. Maier-Reimer, and M. M . Junge, 1993a: Climate Variability in a coupled GCM. Part I: The tropical Pacific. J. Climate, 6, 5-21. Latif, M. , A. Sterl, E. Maier-Reimer, and M . M . Junge, 1993b: Structure and predictability of the El Nino/Southern Oscillation phenomenon in a coupled Ocean-atmosphere general circulation model. J. Climate, 6, 700-708. Latif, M. , and M . Flugel, 1991: an investigation of short range climate predictability in the tropical Pacific. J. Phys. Oceanogr., 96, 2661-2673. Lanczos, C , 1968: The Variational Principles of Mechanics. 3rd ed. University of Toronto Press, Toronto, 375 pp. Le Dimet, F., and O. Talagrand, 1986: Variational algorithms for analysis and assimilation of meteorology observations: theoretical aspects. Tellus, 38A, 97-110. Le Dimet, F., and I. M . Navon, 1988: Variational and optimization methods in meteorology: A review. FSU-SCRI-88-144, Florida state University, Tallahassee, Florida, 54 pp. Leetmaa, A., and Ji, M . 1989: Operational hindcasting of the tropical Pacific. Dyn. Atmos. Bibliography 118 Oceans 13 ( 3-4 ), 465-490. Lewis, J. M . and Bloom, S. C. 1978: Incorporation of time continuity into subsynoptic analysis by using dynamical constraints. Tellus 30, 496-516. Lewis, J. M . and J. C. Derber 1985: The use of adjoint equations to solve a variational adjustment problem with advective constraints. Tellus, 37A, 309-322. Lindzen, R. S., and S. Nigam, 1987: On the role of sea surface temperature gradients in forcing low-level winds and convergence in the tropics. J. Atmos. Sci., 44, 2418-2436. L i , Y., Navon, I. M. , Yang, W. Y., Zhou, X.L. , Bates, J. R., Moorthi, S., Higgins, R.W. 1994: 4-Dimensional variational data assimilation experiments with a multilevel semi-Lagrangian semi-implicit general circulation model. Mon. Wea. Rev. 122, 966-983. Lions, J.-L., 1971: Optimal Control of Systems Governed by Partial Differential Equations. Springer-Verlag, Berlin. Lighthill, M . J., 1969: Dynamic response of the Indian Ocean to onset of the Sourthwest Monsoon, Phil. Trans.,R. Soc, 265, 45-92. Long R. B. and Thacker, W. C. 1989a; Data assimilation into a numerical equatorial ocean model, Part I. The model and the assimilation algorithm. Dyn. Atmos. Oceans 13 ( 3-4 ), 379-412. Long R. B. and Thacker, W. C. 1989b: Data assimilation into a numerical equatorial ocean model, Part II. Assimilation experiments . Dyn. Atmos. Oceans 13 ( 3-4 ), 413-440. b Lorenc, A. C , 1981: A global three-dimensional multivariate statistical intepolation scheme. Mon. Wea. Rev., 109, 701-721. Lorenc, A. C , 1986: Analysis methods for numerical weather prediction. Quart. J. Roy. Meteorol. Soc, 112, 1177-1194. Lu J. and W. W. Hsieh, 1997a: Adjoint data assimilation in coupled atmosphere-ocean models -Determining model parameters in a simple equatorial model. Quart. J. Roy. Meteor. Soc, (in press). Lu J. and W. W. Hsieh, 1997b: Adjoint data assimilation in coupled atmosphere-ocean models — Determining initial conditions in a simple equatorial model, (submitted to climate dynamics). Luenberger, D. C , 1984: Linear and nonlinear programming. Addison-Wesley, 491 pp. Malanotte-Rizzoli, P. (Editor), 1996: Modern approaches to data assimilation in ocean modeling. Elsevier Oceanography Series, 61, pp 455. Malanotte-Rizzoli, P., and Holland, W. R. 1986: Data constraints applied to models of the ocean general circulation. Part I. The steady case. J. Phys. Oceanogr., 16, 1665-1687. Malanotte-Rizzoli, P., and Holland, W. R.1988: Data constraints applied to models of the ocean general circulation. Part II. The transient, eddy resolving case. J. Phys. Oceanogr., 18, 1093-1107. Bibliography 119 Marchuk G. I., 1974: Numerical solution of the problems of the dynamics of the atmosphere and ocean (In Russian). Gidrometeoizdat, Lenningrad, U.S.S.R. Marchuk G. I., 1981: Mathematical issues of industrial effluent optimization. J. Met. Soc. Japan, 60, 481-485. Marotzke, J. and C. Wunsch, 1993: Finding the state of a general circulation model through data assimilation: application to the North Atlantic ocean. J. Geophys. Res. 98, 20149-20167. Marshal, J. L., 1985a: Determining the ocean circulation and improving the geoid from satellite altimetry. J. Phys. Oceanogr., 15, 330-349. Marshal, J. L., 1985b: Altimetric observing simulation studies with an eddy resolving circulation model. In " The Use of Satellite Data in Climate models" (compiled by J. J. Hunt), pp 73-76. ESA Scientific and Technical Publication Branch, Noordwijk, The Nethlands. Matsuno, T., 1966: Quasi-geostrophic motions in the equatorial area. J. Met. Soc. Japan, 44, 25-43. Menke, W., 1984: Geophysical Data Analysis: Discrete Inverse Theory. Academic Press, London, 260. Miller, R. N. 1986: Toward the application of the Kalman filter to regional open ocean modeling. J. Phys. Oceanogr., 16, 72-86. Miller, R. N. , and Cane, M. A. 1989a: A Kalman filter analysis of sea level height in the tropical Pacific. J. Phys. Oceanogr., 19, 773-790. Miller, R. N. , 1989b: Direct assimilation of altimetric differences using the Kalman filter. Dyn. Atmos. Oceans 13 ( 3-4 ), 317-334. Miller, R. N. , and Ghil, M . 1990: Data assimilation in strongly nonlinear current systems. Proc. Int. symp. Assimil.Obs. Meteorol. Oceanogr. 93-98. Moore, A. M. , 1991: data assimilation in a quasi-geostrophic open-ocean model of the Gulf Stream region using the adjoint method. J. Phys. Oceanogr., 21, 398-427. Moore, A. M . , Cooper, N. S., and Anderson, D. L. T. (1987): Initialization and data assimilation in model of the Indian ocean. J. Phys. Oceanogr. 17, 1965-1977. Morel, P., G. Lefevre and G. Rabreau 1971: On initialization and non-synoptic data assimilation. Tellus, 23, 197-206. Morse, P. and H. Feshbach, 1953: Method of Theoretical Physics. Part I. McGraw-Hill, New York. Murphy, A., and Epstein E., 1988: Skill scores and correlation coefficients in model verification. Mon. Wea. Rev. 117, 572-580. Navon, I. M. , and R. De Villiers, 1983: Combined penalty multiplier optimization methods to enforce integration in variant conservation. Mon. Wea. Rev., I l l , 1228-1243. Navon, I. M. , 1986: A review of variational and optimization methods in meteorology. In: Variational methods in geosciences. Y. Sasaki, editor, Elsevier, New York, 29-34. Bibliography 120 Navon, I. M. , and D. Legler, 1987: Conjugate-gradient methods for large-scale minimization in meteorology. Mon. Wea. Rev., 115, 1479-1502. Navon, I. M . , X. Zhou, J. Derber and J. Sela, 1992a: Variational data assimilation with an adiabatic version of the NMC spectral model. Mon. Wea. Rev. 120, 1433-1446. Navon, I. M . , X . Zou, M . Berger, K. H. Phua, T. Schlick and F. X. Le Dimet, 1992b: Numerical experience with limited memory quasi-Newton and truncated Newton methods. Optimization Techniques and Applications, Vol. 1, K. H. Phua et al., eds., World Scientific Publishing Co., New York, 445-480. Navon, I. M . : Practical and theorectical aspects of adjoint parameter estimattion and identifiability in meteorology and oceanography. Mon. Wea. Rev. (in press). Neelin, D. J., 1989: On the interpretation of the Gill model. J. Atmos. Sci., 46, 2466-2486. Neelin, D. J., 1990: A hybrid coupled general circulation model for El Nino studies. J. Atmos. Sci., 47, 674-693. Palmer, T. N. and Anderson, D. L. T. 1994: The prospectives for the seasonal forecasting - A review paper. Quart. J. Roy. Meteorol. Soc, 120, 755-793. Panchang, V. G., and J. J. O'Brien, 1988: On the determination of hydraulic model parameters using the adjoint stae formulation. Modeling Marine System, I, CRC Press, 5-18. Panofsky, H. A. 1949: Objective weather map analysis. J. Meteorology, 6, 386-392. Pedlosky, J., 1979 : Geophysical fluid dynamics. Springer-Verlag, Berlin and New York. Penland, C , and T. Magorian, 1993: Prediction of Nino 3 sea surface temperature using linear inverse-modeling. J. Climate, 6, 1067-1076. Penenko, V. V. and Obraztsov, N. N. 1976: A variational initialization method for the fields of the meterological elements (English translation). Soviet Meteorol. Hydrol., 11, 1-11. Perigaud, C , and Dewitte, B. 1996: El Nino - La Nino events simulated with Cane and Zebiak's model and observed with satellite or in situ data. Part II: Model data comparison. J. Climate, 9, 66-84. Philander, S. G.H., T. Yamagata and R. C. Pacanowski 1984: Unstable air-sea interactions in the tropics. J. Atmos. Sci. 41, 604-613. Rienecker, M. , Mooers, C. N. R., and Robinson, A. R. 1987: Dynamical interpolation and forecast of the evolution of mesoscale features off Northern California. J. Phys. Oceanogr., 17, 1189-1213. Robinson, A. R., and L. J. Walstad, 1987: The Harvard open ocean model: calibration and application to dynamical process, forecasting and data assimilation studies. Applied Numerical Mathematics, 3, 89-131. Robinson, A. R. and Leslie, W. B. 1985: Estimation and prediction of oceanic eddy fields. Prog. Oceanogr. 14, 485-510. Bibliography 121 Ropelewski, C. F., and M . S. Halpert, 1986: North American precipitation and temperature patterns associated with the El Nino /Southern Oscillation (ENSO). Mon. Wea. Rev., 114, 2352-2362. Sadokov, V. P. and Shteinbok, D. B. 1977: Application of conjugate function in analysis and forcasting of the temperature anomaly (English translation). Soviet Meterol. Hydrol., 10, 16-21. Sadoumy, R., 1975: The dynamics of finite difference models of the shallow water equations. J. Atmos. Sci., 32, 680-689. Sasaki, Y. 1955: A variational study of the numerical predition based on the variational principle. Meteorol. Soc. Jpn. 33, 262-275. Sasaki, Y. 1958: An objective analysis based on the variational method. J. Meteorol. Soc. Jpn. 36, 77-88. Sasaki, Y. 1970: Some basic formalisms in numerical variational analysis. Mon. Wea. Rev. 98, 875-883. Sandstrom, J. M . and B. Helland-Hanssen 1903: fiber die Berechnung von Meeresstroemungen. Report on Norwegian fishery and marine investigations, 2 (1902): 4, 43 pp. Schroter, J. 1989: Driving of the non-linear time dependent ocean models by observation of the transient tracers - a problem of constrained optimization. In " Ocean circulation models: combining data and dynamics" ( D. L. T. Anderson and J. Willebrand, eds.), 257-286. Shanno, D. F., and K. H. Phua, 1980: Remarks on algorithm 500 — a variable method subroutine for unconstrained nonlinear minimization. A C M Tran. on Math. Software, 6, 618-622. Shapiro, L., 1982: Hurricane Climatic Fluctuations. Part II: Relation to large-scale circulation. Mon. Wea. Rev., 110, 1014-1023. Soliz, P. and Fein, J. 1980: Kinematic analysises for a major winter storm during amtex 75. . Meteorol. Soc. Jpn. 58, 16-32. Stauffer, D. R., and J. W., Bao 1993: Optimal determination of nudging parameters using the adjoint equations. Tellus, 45A, 358-369. Sheinbaum, J.; Anderson, D.L.T. 1990a: Variation assimilation of XBT data. I. J. Phys. Oceanogr., 20, 689-704. Sheinbaum, J.; Anderson, D.L.T. 1990b: Variation assimilation of XBT data. II. Sensitivity studies and use of smoothing constraints. J. Phys. Oceanogr., 20, 689-704. Smedstad, O. M. , and J. J. O'Brien, 1991: Variational data assimilation and parameter estimation in an equatorial Pacific Ocean model. Progress in Oceanography, Vol. 26, Pergamon, 179-241. Stommel, H. ad F. Schott 1977: The beta spiral and the determination of the absolute velocity field from hydrographic data. Deep-Sea Research, 24, 325-329. Tangang, F. T., W. W. Hsieh and B. Tang, 1997: Forecasting the equatorial Pacific sea surface Bibliography 122 temperatures by neural network. Clim, Dynam, 13, 135-147. Talagrand, O., and P. Courtier, 1987 : Variational assimilation of meteorological observations with the adjoint vorticity equation. Part I: Theory. Quart. J. Roy. Meteorol. Soc, 113, 1311-1328. Talagrand, O., 1991: The use of adjoint equations in numerical modeling of the atmospheric circulation. In Andreas Griewank and George F. Corliess, editors, Automatic Differentiation of Algorithms: Theory, Implementation and Application. 169-180, SIAM. Thacker, W. C. 1986: Relationships between statistical and deterministic methods of data assimilation. In: Variational methods in geosciences, Y. Sasaki, editor, Elsevier, New York, 173-179. Thacker, W. C , 1987: Three lectures on fitting numerical models to observations. External Rep. GKSS 87/E/65, 64pp., GKSS Forschungszntrum Geesthacht, Geesthacht, Federal Republic of Germany. Thacker, W. C. and Long R. B. 1988: Fitting dynamics to data. J. Geophys. Res. 93, 1227-1240. Thacker, W. C , 1988, Fitting models to inadequate data by enforcing spatial and temporal smoothness. J. Geophys. Res. 93, 10655-10665. Thacker, W. C. and Long R. B. 1988: Fitting Dynamic to data. J. Geophys. Res. 93, 1227-1240. Thacker, W. C.,1989: The role of the Hessian Matrix in fitting models to measurements. J. Geophys. Res. 94, 6177-6196. Tarantola, A. 1987: Inverse problem theory. Methods for data fitting and model parameter estimation. Elsevier, Amsterdam, 613 pp. Temperton, C. 1982: Variational normal mode initialization for the ECMWF gridpoint model, in The interaction between objective analysis and initialization, ed. Williamson, D. Publication in Meteorology 127, McGill University Montreal, 160-164. Thepaut, J.-N., and P. Courtier 1991: Four-dimensional variational data assimilation using the adjoint of a multilevel primitive equation model. Q., J., R. Meteorol. Soc. 117, 1225-1254. Thepaut, J.-N., D. Vasiljevic, P. Courtier and J. Pailleux, 1993: Variational assimilation of conventional meteorological observations with a multilevel primitive equation model. Q., J., R. Meteorol. Soc. 119, 153-186. Thompson, J. D. 1986: Altimeter data and geoid error in mesoscale ocean prediction: some results from a primitive equation model. J. Geophys. Res. 91, 2401-2417. Tribbia, J. J. 1982: On variational normal mode initialization. Mon. Wea. Rev. 109, 455-470. Tziperman, E., and Thacker, W. C. 1989: An optimal control/adjoint equations approach to studying the oceanic gernal circulation. J. Phys. Oceanogr. 19, 1471-1485. Tziperman, E., and Thacker, W. C , and K. Bryan, 1992: Computing the steady ocean Bibliography 123 circulation using an optimization approach. Dyn. Atmos. Oceans, 7, 31-46. Van den Dool, H. M. , 1994: Searching for analogous, how long must we wait? Tellus, 46A, 314-432. Verron, J., and Holland, W. R. 1989: Impacts des donnees d'altimetrie satellaire sur les simulations numeriques des circulations geneales oceaniques aux latitutes moyennes. Ann. Geophys. 7, 31-46. White , W. B., Tai, C. K., and Holland, W. R. 1990a: Continuous assimilation of simulated GEOSAT altimetric sea level into an eddy resolving numerical ocean model. Part I. Sea level differences. J. Geophys. Res. 95, 3219-3236. White , W. B., Tai, C. K., and Holland, W. R. 1990b: Continuous assimilation of simulated GEOSAT altimetric sea level into an eddy resolving numerical ocean model. Part I. Referenced sea differences. J. Geophys. Res. 95, 3236-3251. Wunsch, C. 1977: Determining the general circulations of the oceans: a preliminary discussion. Science, 196, 871-975. Wunsch, C. 1978: The North Atlantic general circulation west of 50 W determined by inverse methods. Reviews of Geophysics and Space physics, 16, 583-620. Xu, Q. 1996: Generalized adjoint for the physical processes with parameterized discontinuities. 1. Basic issues and heuristic examples. J. Atmos. Sci., 53, 1123-1142. Xu, J.-S., and H. von Storch, 1990: Predicting the state of the Southern Oscillation using principal oscillation pattern analysis. J. Climate, 3, 1316-1329. Yamagata, T., 1985: Stability of a simple air-sea coupled model in the tropics, Coupled Ocean-Atmosphere Models, J. C. J. Nihoul, Ed., Elsevier Oceanogr. Ser., 40, 637-657. Yeh, W. W. 1986: Review of parameter identification procedures in the groundwater hydrology: the inverse problem. Water Resources Research, 22, 95-108. Yu, L. and J. J. O'Brien 1991: Variational estimation of the wind stress drag coefficient and the oceanic eddy viscosity profile. J. Phys. Oceanogr. 21, 709-719. Zebiak, S. E., and M . A. Cane, 1987: A model El Nino - Southern Oscillation. Mon. Wea. Rev., 115, 2262-2278. Zebiak, S. E., 1989: Oceanic heat content variability and El Nino cycles. J. Phys. Oceanogr., 19, 475-86. Zebiak, S. E., 1990: Diagnostic studies of Pacific surface winds. Journal of Climate, 3, 1016-31. Zou, X. , Navon, I. M. , and Le Dimet, F. X. 1993: An optimal nudging data assimilation scheme using parameter estimation. Quart. J. Roy. Meteor. Soc, 118, 1163-1186. Zou, J., and Holloway, G. 1995: Improving steady-state fit of dynamics to data using adjoint equation for gradient preconditioning. Mon. Wea. Rev., 123, 199-211. Zou, J., Hsieh, W. W., and Navon, I. M . 1995: Sequential open-boundary control by data Bibliography 124 assimilation in a limited-area model. Mon. Wea. Rev., 123, 2900-2909. Appendix A . Derivation of the Continuous Adjoint Coupled Model 125 Appendix A Derivation of the Continuous Adjoint Coupled Model Let the triple integration \Qdadt = \ j j()r cos<pd(pdAdt denote an integration over a £ to Ao 0o spherical domain [(Ao,Am),(0o,(/>„)] and time period (t0, T), where the spherical coordinates (r, A, (p) are the radial distance, the longitude and the latitude, respectively. The Lagrangian (3.4) for model (3.9-3.10) is given by: , » dua ga dha L = J + ]{ua(— fva + ——Aua) + E dt rcostp dA t b\a , , , ga dha + Va(— + fua + ~ — + Ava) + dt r dtp , ,*rdha , da .dua , C?(vaCOS0) + /?A(-^- + 7(-TT + - A —TT— L ) + Bha + aK) dt rcos</» ovl o'cp „ <7W0 go dh , „, , (7? rcostp dA *,dv„ , g0 dh0 , + v0(—-rfu0 + ^ — + aVo-yva) (A-l) otf r o></> ,*.dh0 do ,du0 (9(v o cos0l , , , , dt rcostp dA dtp where ua,va,ha,u0,v0,andh0 are the Lagrange multipliers (or adjoint variables) for the corresponding atmospheric and the oceanic model state variables, respectively. J is defined by (3.2). By integrating (A-l) by parts, L becomes Appendix A . Derivation of the Continuous Adjoint Coupled Model -26 £ = J + j {[(—T —Ua)-fvaUa+ ( — — /?«) + AuaUa] + i dt dt rcos</> dX dX r,dv*aVa dvl , , * , ga ,dvlcos(j)ha dvl cos f . * [(—^ —Va)-fuaVa+ ( T - ^ 7"— k ) + AVaVa ] + or dt rcosc? d(p df r.dKha dK, , da ,duaK dK . , 4 dK cos fva da dhl , ^ T~) + I ^ T i TTM«) + ( 1 T T V « ) dt dt rcosc? oW, <7/i rcosc? (70 r tfc? +Bhafh + ahak,] + T / V ^ u ^ u ^r*u * „ * g0 .dh0u0 du0 , s. , * * n . [( r — U0 ) - fVoUo + (—77 — h0 ) + aUoUo - JUaUo ] + rcosc? dA dA T,„r0,o go ,dVo COS fho (^ Vo COS (f) . * , [(—r r - V p ) - / M 0 V 0 + ° ( — — / l 0 ) + f l V 0 V 0 - y V f l V 0 ] + rcos</> df df (A-2) r . J 0 .duoK dhl d0 dhlcosfVo d0 dK rcosc/) C M , o'/l rcosc? df r df +bh0ho]}dadt. Reordering (A-2) by bringing the common terms together gives ^ = J + J uc—;— + —^— + —^—) +c—-— + — ^ — + — ^ — ) J + z dt dt dt dt dt dt ga dhqUq + dvl cos fha ^ + da duaK + dK cos 0V f l ^ rcos</> t9A rcosc? dX df go ,dh0Uo dvl cos f'ho. d0 ,du0K dhl cos c?v<, - ( - 7 - 7 - + — ) + r ( — 7 7 — + — )J du0u0 dul dt dt dvlv0 dvl dt dt dhlho dK dt dt rcos</>v dX df rcosc? v dX df [ — + jVa - - 7 7 - + AU„ - yU0\Ua + dt rcosc? dX [ dVa j* * da dha A * *, 7 fUa 7 — + Ava ~ JV0 JVa dt r df , r e?/k g« dul dvl cos f * + [ 7 e — ( — + r — J 1 ) + BK ]K + dt rcoscp dX df dul * J 0 <97?o . [—z— + fVo --ZTT-+ au0]Uo + dt rcosc? dX r _ ^ _ / M ; _ ^ + A V : ] V O (A-3) r df i r c?A0* g0 ,C9M^ dvlcosf . , +[ 2 — ( 1 —Z1) + bh0+ aha ]h0 }dodt. dt rcosc/) <9A <70 By specifying w*,v*,/i*,M*,v*,and/i* to be zero along the boundaries, those spatial Appendix A . Derivation of the Continuous Adjoint Coupled Model 127 integration terms in row 2 and row 3 of (A-3) vanish. And if the Lagrange multipliers in the first row of (A-2) are defined to be equal to zero at time T, (A-3) can be rewritten as: L = J - [uaua + v*ava + h*aha + U0U0 + V*0V0 + fhho]t=0 ! rr dua - * da dK . * *, j{[—^- + pa 7 ^ T + A u « ~ + £ dt rcosfp dA [ dva r * da dha . A * .. * i JUa — + AVa ~ JVo \Va dt r dtp ,r 9K ga ,du*a dv*aCOS(p , +[ ( — + r——) + Bha ]ha + dt rcostp dA d(p r du0 r * do dh0 *., [—— + fv0 --zrr + au0]Uo + dt rcos</> dA dvl , * d0 dh*0 * [—z--fu0 — + av0 ]v0 (A-4) dt r d<p . dK go ,du0 , dv*0co$,(p. * . . . . +[— -— (— + —-—-) + bh0 + aha ]h0 }dadt. dt rcoscp dA d(p The first order variation of L is SL = 8J- [u*adua + v*aSva + KSK + u08u0 + v*05v0 + KSho],=0 ]{[-— + M ^ — ^ + Aua - yu:]5ua+(6Aua-dyuDua + z dt rcos</> dA l~~ K - — ^ + AVa - YvWVa + {8AVa ~ c5yV> f l + dt r dtp r dK ga ,dua , dVaCOSfy D , * n o , , X j n , * U _ I_ [ ( ^ T — + ~) + BK ]Oha + OBhaha + dt rcostp dA dip cfu0 j. * (to [ h JVo — + au0 \oUo + oau0Uo + dt rcostp dA r dv0 r * do dh0 it, ~ . o * [—— - fu0 — + av0 ]ov0 + bav0Vo + (A-5) dt r d<p +[_dK g^_{dK_ + dv0 cos (p) + b K + a K ] § K + [ 5 h K + 5 a K i ) h o ) d G d u dt rcos0 dA dtp According to (2.5b) and (3.6-3.7), letting the first order variation of L with respect to the six atmospheric and the oceanic state variables of our model be zero gives the adjoint model for Appendix A . Derivation of the Continuous Adjoint Coupled Model 128 their corresponding adjoint variables: dua „ da dhl t * dD — r - + /v a - — - + Aua-yu0+ — = 0 dt rcosc? dA dua dvl , * da dhl , . * * dD —z--fua — + Ava - yv0 + — = 0 dt r dip dva dhl ga .dul dvlcoscp. » dD - \~^7~ ~ ) + £>na + ~Z— = v dt rcosc/) dA dtp dha dul , d 0 <9/z0* , dD —-~ + /v 0 7^T" + aMo+":r- = 0 dt rcosc? dA du0 dvl » d0 dhl * dD f A-M ——-fuo — + av0 + — = 0 l A °J OT r (7C? aMo g 0 dul , dvlcoscp , D l . , , rt rcosc? dA d<p du0 According to (2.8), the formulae for calculating the gradients of J with respect to the six parameters and the six model initial conditions, with the a prior terms omitted, can be obtained from (A-3): 8J/8A = \{U*aUa + Vlva )d(jdt £ SJ/SB = \(hlha )dodt i 8JI8a = j(hlho)dcrdt i dJ/da = \{ulu0 + vlv0)dadt £ 8Jl8b = \(h*0ho)dodt £ 8J/8y = j(u*0ua + vlva)dadt £ SJ 1 8ua = -ul(0) 8J18va = -Va(0) 8JI8ha = -K(0) SJ 1 Sua = -ul(0) SJ/Svo = -vo*(0) SJ18h0 = -K(0). (A-7) Appendix B. Numerical Discretization 129 Appendix B Numerical Discretization Equations (3.9-3.10) were solved numerically following Philander et al (1984). Ignoring the coupling terms which do not influence the formulation of the finite difference scheme, we can discretize the spherical shallow-water equations for the models (3.9) and (3.10) by using the finite difference scheme of Sadourny (1975), which conserves the potential entropy on the Arakawa C grid: where the superscript n indicates the time step, a is the radius of the earth, Z = (/cosc/>z) / Hcos(/)h^ is the potential vorticity, Uh = HXu and Vh = H*v are the volume transports, H is the equivalent depth, <j>h and <j>z are the latitudes at which the volume transports and the potential vorticity are computed, and (BI) &() = [( W - ( Un]/AX <?*( ) = [( );+i/2-( )j-m]/A<P (T =K W + ( )w/2]/2 (")'=[( W + ( );-i/2]/2 (B2) sr( )=[( r'-( rxvAt. Appendix B. Numerical Discretization 130 The ocean model was run in a 200° longitudinal extent between 30° S and 30° N . Zonally, the atmospheric domain was larger by 20° and was cyclic. The solid-wall boundary condition was set for the ocean and for the atmosphere in the northern and southern boundaries. The grid spacing was 1° latitudinally and longitudinally in both the ocean and the atmosphere. The coupling between the atmosphere and the ocean took place once a day. For the rest of the day, the driving forces for the atmosphere and ocean were held fixed. The formulae (3.14) for the gradient computation in discretized form become: &7/5A = rT ra K K X + V X ) /=0 s T=T2 *T„ 57/55= ± UKha) 1=0 s 67/5a = 2 I T±[K(ty,t2,s)xh0(t2,s)] ,2=0 s r]=0 T=T2*T0 57/5a= £ I O X + v>0) r=0 s T=T2 *T„ S7/SZ> = I S(fc>0) 1=0 s 57/5y= I I tH(hd2,s)xua(t2,s) + v*0(tvt2,s)xva(t2,s)l r2=0 s r\=0 where s denotes the spatial variables, T2 is the total days of coupling, and Ta = 90 and T0 = 8 are respectively the number of time steps for the atmospheric model and the oceanic model in one day. Appendix C Verifying Gradient Calculations 131 Appendix C Verifying Gradient Calculations The correctness of the adjoint code and the gradient computations must be verified before they can be used for data assimilation (Talagrand, 1991, Navon et al, 1992). The adjointness is defined as: {a,Mb} = {M*a,b} (CI) where {,} is inner product, a and b any vectors, M an operator (usually a subroutine), and M*is the adjoint operator of M. To check the adjoint model of our forward model, the tangent-linear code of our model must be developed from the forward model by differentiating each subroutine of the model (3.9-3.10). The adjoint code was developed directly from the tangent-linear model by transposing each subroutine of the tangent-linear model. (CI) was then used to check the adjointness of the adjoint model. Before checking (CI), the linearization of the forward model (3.9-3.10) must first be checked, i.e., for every subroutine of the model, AMa-dMa must be of order 5a where AMa = M{a + 8a) - M(a) is the output perturbation in finite difference form resulting from an input perturbation 8a, and 8Ma the perturbation computed from the tangent-linear code 8M. This check must hold to machine accuracy. The accuracy of gradient computation can be tested by the Taylor expansion. Let the Taylor expansion of cost function J be J(x,p + eh) = J(x,p) + ehTVpJ + 0(e2) (C2) where termp is the control vector containing unknowns, e a scalar, h a unit vector given by h = can be rewritten as a function of £ J(x,p + eh)-J(x,p) m e(v,7iv,7iryv,/ l + 0 i e ) If the values of J and the gradient are correctly calculated, the value of (pfe) will linearly Appendix C Verifying Gradient Calculations 132 approach 1 with e decreasing through a wide range of magnitudes. If 0(e) does not approach 1, errors must exist in the gradient calculation. Fig.C shows that our gradient for the six parameters is correct. - 9 - 7 - 5 - 3 1 0 log(e) Figure C Verification of the gradient computation. Note that O(e)=l+O(e) indeed holds until round-off errors become important for very small e. Appendix C Verifying Gradient Calculations Appendix D 133 Summary of Experiments and Results Table 1 A summary of the experiments performed to retrieve the three oceanic initial conditions from assimilating wind (w) and/or SLH (h) data. Data assimilation period was 90 days. The retrieval quality for the three oceanic initial conditions is classified as: excellent (excl), good (gd), acceptable (acpt) or poor (pr). The RMSE (r) ranges for the classification are: pr (r>10'), acpt (lO-'-lO-2), gd (10-2-104) , excl (r<104), and the correlation (c) ranges are: pr (c<0.5), acpt (0.5<c<0.6), gd (0.6<c<0.8), or excl (c>0.8). Data in the TAO array were sampled every 15° in the zonal direction and every 2° in the meridional direction. experiments 1 w sparse in time Fig.5.2 2 same as 1 Fig.5.2 3 same as 1 Fig.5.2 4 same as 1 Fig.5.2 5 h sparse in time Fig.5.2 6 same as 5 Fig.5.2 7 same as 5 Fig.5.2 8 same as 5 Fig.5.2 6 -9 w in TAO Fig.5.3 10 h in TAO Fig.5.3 11-14 same as 10 Fig.5.3 15 noisy w Fig.5.7 16 noisy w Fig.5.7 17 noisy w and h Fig.5.7 18 noisy w and h Fig.5.7 Effect of initial guesses 19-20 in-phase guesses 21-22 noisy perturbat. 23-24 out-phase guesses Data w everywhere and once per day w everywhere and once every 10 days w everywhere and once every 30 days w everywhere and once every 90 days h everywhere and once per day h everywhere and once every 10 days h everywhere and once every 30 days h everywhere and once every 90 days w in TAO once every 1, 10, 30, 90 days h in TAO, once per day h in TAO, once every 10, 30, 90 days w in TAO, once everyday w in TAO, once every 10 days w and h in TAO, once everyday w and h in TAO, once every 10 days Fig.5.5 w or h in TAO, once every 90 days w or h in TAO, once every 90 days w and h in TAO, once everyday result appraisal Correlation RMSE h0u0vo excl u0vo h0 acpt h0 u0 exclvo gd u0 gd, h0 vo acpt h0 u0 exclvo pr u0 gd 7. . . 4. vo pr h0, u0, vo excl. h0 vo gd; u0 excl h0 u0 vo excl. h0u0vo gd h0, u0, vo excl h0 u0 gd vo acpt h0 excl; u0 h0 u0 vo acpt; vo pr acpt h0 excl; h0 u0 vo acpt u0 gd;vo pr h0 u0 excl; vo h0u0vo acpt pr h0 excl; h0u0vo acpt u0 vo pr h0 u0 vo excl h0 u0 vo acpt h0 vo excl; h0 u0 vo acpt u0 acpt h u0 pr; h0 u0 vo pr h0 vo u0 pr vo acpt h0 u0 pr; vo acpt h0 u0 vo excl h0 u0 vo excl h0 u0 vo excl h0u0vo acpt h0 u0 vo pr h0 u0 vo acpt Appendix D. Summary of experiments and Results 134 Table 2 A summary of the experiments performed to retrieve the six parameters from assimilating wind (w) and/or SLH (h) data. Unless specifically mentioned, the data assimilation period was 40 days. The retrieval quality is classified as: excellent (excl), good (gd), acceptable (acpt) or poor (pr). The relative estimation error ranges (r) of the classes are: pr (r>10_1), acpt ( lCU-lO 2 ), gd (10-2- lO 4), or excl (r<104). Experiments 0 continuous wind Fig 4.2 1 sparse w in time Fig.4.3a 2 same as 1 Fig.4.3a 3 sparse h in time Fig.4.3b 4 same as 3 Fig.4.3b 5 sparse w in time and space, sparse h in space, Fig.4.4a 6 same as 5 Fig.4.4a 7-8 sparse h in time and space, sparse w in space Fig.4.4b 9 h at two latitudes Fig. 4.5 10 h at three points Fig. 4.5 11 noisy h Fig. 4.6 12 noisy w Fig. 4.6 13 noisy w and h Fig. 4.6 14 5 day assimilation Fig. 4.7a 15 20 day assimilation, Fig. 4.7a 16 40 day assimilation, Fig. 4.7a 17 5 day assimilation Fig. 4.7b 18 20 day assimilation Fig. 4.7b 19 40 day assimilation Fig. 4.7b 20 5 day assimilation, Fig. 4.10 21 20 day assimilation Fig. 4.10 22 40 day assimilation Fig. 4.10 Data distribution in time and space w everywhere and every time step w everywhere and once per day, h everywhere in time and space w everywhere and once every two days, h everywhere in time and space h everywhere and once per day, w everywhere in time and space h everywhere and once every two days, w everywhere in time and space h and w at every 10*10 grids in 10°S -30°S and 10°N-30°, at every 2*15 grids in 10°S-10°N; w once per day, h once every timestep same as 5 but w available once every two days same as 5 but h available once every day, w every time step same as 7 but h available once every two days same as 7 but h once everyday and every 10 grids along 25°N and 25°S same as 9 except h at grid points (30, 100 and 170)at 0.5°N w and h available everywhere, 10 % noise added to h w and h available everywhere, 10 % noise added to w w and h available everywhere, 10 % noise added to both w everywhere and once per day, h everywhere in time and space w everywhere and once per day, h everywhere in time and space w everywhere and once per day, h everywhere in time and space h everywhere and once per day, w everywhere in time and space same as 16 same as 16 same as 9 same as 9 but with a priori information same as 9 but with a priori information result appraisal a, y,A,a,b and B excl a, y and A gd; a, b and B acpt a, y, A, a, b acpt; B pr a, y,A, and b excl a and B gd a excl; y, A, a, b and B gd a, y acpt; B, A, a and b pr or, y acpt; B, A, a and b pr a, y, A, b and B excl a gd a, y, A and b excl a and B gd a, y,A,b and B acpt a pr a, y, A, a, b and B excl a, A and /excl; a, b and B gd a, A and b gd; a, /and B acpt a, A and b gd; a, /and B acpt a, /and A acpt; a, b, B pr a, /and A gd; a and B acpt; b pr oc, y, A and B gd; a and b acpt a and A gd; B, b and y acpt; a pr same as 17 a excl; A and ygd; B and b acpt; a pr a, A, B, b, a and y pr a, /and a acpt; A, B and b pr same as 21 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0088066/manifest

Comment

Related Items