UBC Theses and Dissertations

Quasi-objective Nonlinear Principal Component Analysis and applications to the atmosphere Lu, Beiwei 2007

Quasi-Objective Nonlinear Principal Component Analysis and Applications to the Atmosphere

by Beiwei Lu

M.Sc., The University of Victoria, 1999
B.Sc., The Ocean University of China, 1983

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Atmospheric Science)

THE UNIVERSITY OF BRITISH COLUMBIA
December, 2007
© Beiwei Lu 2007

Abstract

NonLinear Principal Component Analysis (NLPCA) using three-hidden-layer feed-forward neural networks can produce solutions that over-fit the data and are non-unique. These problems have been dealt with by subjective methods during the network training. This study shows that these problems are intrinsic to the three-hidden-layer architecture. A simplified two-hidden-layer feed-forward neural network that has no encoding layer and no bottleneck or output biases is proposed. This new, compact NLPCA model alleviates these problems without employing the subjective methods and is called quasi-objective.

The compact NLPCA is applied to the zonal winds observed at seven pressure levels between 10 and 70 hPa in the equatorial stratosphere to represent the Quasi-Biennial Oscillation (QBO) and investigate its variability and structure. The two nonlinear principal components of the dataset offer a clear picture of the QBO. In particular, their structure shows that the QBO phase consists of a predominant 28.4-month cycle that is modulated by an 11-year cycle and a longer-period cycle. The significant difference in variability of the winds between cold and warm seasons and the tendency for a seasonal synchronization of the QBO phases are well captured. The one-dimensional NLPCA approximation of the dataset provides a better representation of the QBO than the classical principal component analysis and a better description of the asymmetry of the QBO between westerly and easterly shear zones and between their transitions.
The compact NLPCA is then applied to the Arctic Oscillation (AO) index and the aforementioned zonal winds to investigate the relationship of the AO with the QBO. The NLPCA of the combined AO index and zonal winds dataset shows clearly that the phase of the covariation of the two oscillations, defined by the two nonlinear principal components, progresses with a predominant 28.4-month periodicity, plus the 11-year and longer-period modulations. Large positive values of the AO index occur when westerlies prevail near the middle and upper levels of the equatorial stratosphere. Large negative values of the AO index arise when easterlies occupy over half the layer of the equatorial stratosphere.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Preface
Acknowledgements
1 A General Introduction
  1.1 Nonlinear principal component analysis by neural networks
  1.2 The Quasi-Biennial Oscillation
  1.3 Association of the Arctic Oscillation with the Quasi-Biennial Oscillation
    1.3.1 The Arctic Oscillation
    1.3.2 Effect of the Quasi-Biennial Oscillation on the Arctic Oscillation
2 A Compact Neural Network Model for NLPCA
  2.1 Introduction
  2.2 The standard three-hidden-layer feed-forward neural network for NLPCA
    2.2.1 Structure and variable glossary of the standard neural network
    2.2.2 The problem of non-uniqueness: theory
    2.2.3 The problem of over-fitting: examples
  2.3 A new two-hidden-layer feed-forward neural network for NLPCA
    2.3.1 Simplification: removal of bottleneck bias
    2.3.2 Simplification: removal of output biases
  2.4 The compact NLPCA model
    2.4.1 One bottleneck neuron
    2.4.2 Two bottleneck neurons
  2.5 Number of parameters in the compact NLPCA model
  2.6 Conclusions
3 Structure and Variability of the Quasi-Biennial Oscillation
  3.1 Introduction
  3.2 Data
  3.3 Optimal simulation of the QBO winds
  3.4 Variability of the QBO phase
    3.4.1 Essential phase speed of the QBO winds
    3.4.2 Frequency modulation of the QBO phase
    3.4.3 Statistical significance of the QBO phase features
  3.5 Phase variation in variability of the QBO winds
  3.6 Seasonal synchronization of the QBO phases
  3.7 Composite of the QBO winds
  3.8 Conclusions
4 Relationship of the Arctic Oscillation with the Quasi-Biennial Oscillation
  4.1 Introduction
  4.2 Data of the AO index and the QBO winds
  4.3 Optimal simulation of the AO-QBO dataset
  4.4 Relation between the AO and the QBO
    4.4.1 Covariation of the AO and the QBO
    4.4.2 Statistical significance of the phase features
    4.4.3 Relationship of the AO with the QBO
  4.5 Conclusions
5 Conclusions and Future Work
  5.1 Conclusions
    5.1.1 Nonlinear principal component analysis by feed-forward neural networks
    5.1.2 Structure and variability of the QBO winds
    5.1.3 Relationship of the AO with the QBO winds
  5.2 Future work
Bibliography
Appendices
  A Strategy of Nonlinear Optimization

List of Tables

2.1 A variable glossary for the I-H-1-M-I model.
2.2 A variable glossary for the I-Kb-M-Ib model.

List of Figures

2.1 A standard three-hidden-layer feed-forward neural network for NLPCA
2.2 Mean square errors of 3-M-1-M-3 model applied to the Lorenz attractor
2.3 Bottleneck series of 3-2-1-2-3, 3-4-1-4-3, and 3-7-1-7-3 solutions of the Lorenz attractor
2.4 Output series of 3-2-1-2-3, 3-4-1-4-3, and 3-7-1-7-3 solutions of the Lorenz attractor
2.5 A two-hidden-layer feed-forward neural network for NLPCA
2.6 Mean square errors of 3-1-M-3, 3-1b-M-3b, and 3-2b-M-3b models applied to the Lorenz attractor
2.7 Bottleneck series of 3-1-3-3, 3-1-4-3, and 3-1-5-3 solutions of the Lorenz attractor
2.8 Bottleneck series of 3-1b-4-3b, 3-1b-5-3b, and 3-1b-6-3b solutions of the Lorenz attractor
2.9 Output series of 3-1b-2-3b, 3-1b-3-3b, and 3-1b-4-3b solutions of the Lorenz attractor
2.10 Bottleneck series and output series of the 3-2b-5-3b solution of the Lorenz attractor
2.11 Numbers of parameters of 3-M-2-M-3, 3-M-1-M-3, 3-2b-M-3b, and 3-1b-M-3b models
3.1 Height-time section of the QBO winds
3.2 Root mean square errors of 7-2b-M-7b simulations of the QBO winds
3.3 Time series of the QBO winds and the 7-2b-16-7b simulation
3.4 Frequency distribution of the 7-2b-16-7b simulation error
3.5 Frequency distribution of the PCA two-mode reconstruction error
3.6 The two nonlinear principal components and the periodic phase of the QBO winds
3.7 The accumulating phase of the QBO winds, and the phase residue and its Fourier amplitude coefficients
3.8 Fourier amplitude coefficients of the QBO winds and the 7-2b-16-7b simulation
3.9 Variation and generalization of the QBO winds
3.10 Seasonal variation in variability of the QBO winds
3.11 Distribution of the QBO phases in month and phase
3.12 Distribution of the QBO phase speeds in month and phase
3.13 The 7-2b-7ψ-7b simulation of the QBO winds
4.1 Root mean square errors of 8-2b-M-8b simulations of the AO-QBO dataset
4.2 QBO winds from the 8-2b-17-8b simulation
4.3 AO index from the 8-2b-17-8b simulation
4.4 Frequency distribution of the 8-2b-17-8b simulation error
4.5 The two nonlinear principal components and periodic phase of the AO-QBO covariation
4.6 The accumulating phase of the AO-QBO covariation, and the phase residue and its Fourier amplitude coefficients
4.7 Covariation of the AO index and the QBO winds
4.8 The two nonlinear principal components in different months
4.9 AO index against the periodic phase in different months

Preface

I started the doctorate program by applying the standard three-hidden-layer feed-forward neural network model for NonLinear Principal Component Analysis (NLPCA) to oceanographic variables and encountered the problems of over-fitting and non-uniqueness of solutions. So I turned to finding the causes of these problems and to devising an NLPCA neural network that overcomes them. The work involves both theoretical mathematical analysis and experimental demonstrative tests of various NLPCA neural networks, and leads to the compact NLPCA model, as shown in Chapter 2.

Dr. Lionel Pandolfo, as the supervisor of a major part of the doctorate program, helped with English grammar during the writing of the dissertation. He suggested the analysis of the error histograms, as seen in Sections 3.3 and 4.3.

Dr. William Hsieh, as the supervisor of an initial part of the doctorate program, advocated the application of the compact NLPCA model to simulate the equatorial stratospheric zonal winds that are well known for their dominant and complicated quasi-biennial oscillation. This led to the investigation of the structure and variability of the Quasi-Biennial Oscillation, as shown in Chapter 3.

Dr. Kevin Hamilton at the University of Hawaii prompted the application of the compact NLPCA model to analyze the relationship between the Arctic Oscillation over the northern extra-tropics and the Quasi-Biennial Oscillation in the equatorial stratosphere.
The analysis is described in Chapter 4.

Acknowledgements

My grateful thanks are due to my supervisor Dr. Lionel Pandolfo for his encouragement and support throughout the course of this research. His careful comments, modifications and corrections on the dissertation draft improved the dissertation in both scientific content and English readability. The discussions with Drs. Yuval Zudman, William Hsieh and Susan Allen at the University of British Columbia, during the development of nonlinear principal component analysis by a simplified two-hidden-layer feed-forward neural network, motivated the analytical study of the relevant neural networks. Comments on the dissertation draft from Dr. Phil Austin at the University of British Columbia prompted the statistical significance analysis of the phase features found in the Quasi-Biennial Oscillation and in the covariation of the Arctic Oscillation and the Quasi-Biennial Oscillation.

Chapter 1  A General Introduction

1.1  Nonlinear principal component analysis by neural networks

Principal Component Analysis (PCA) (Pearson, 1901) is a multivariate statistical analysis technique widely used for reducing the dimension of large datasets and extracting their structural features (Jolliffe, 2002; von Storch and Zwiers, 2002). It decomposes a dataset into a series of ranked modes that are linear combinations of these data, so that the first mode captures the largest fraction of variance present in the data. The higher the rank number of a mode, the lower the fraction of variance it captures. Hence, the leading modes are often taken to represent the original data. On the one hand, if the dataset can be described by a few modes that are related linearly to each other, then PCA will extract these modes from the data.
On the other hand, if the dataset is characterized by structures that are related nonlinearly to each other, PCA, as a linear technique, will leave important information of the data in higher order, non-leading modes (Palus and Dvorak, 1992). This can cause a misrepresentation of the variability of such a dataset if only the leading modes from PCA are used to describe the data.

Techniques extending principal component analysis into the nonlinear domain have been proposed (see Hsieh (2004) for a recent review). This study follows the neural network approach pioneered by Kramer (1991) and later applied to the atmospheric and oceanographic sciences by Hsieh and Tang (1998) and Monahan (2000, 2001). A typical structure for a NonLinear Principal Component Analysis (NLPCA) model consists of an auto-associative, feed-forward neural network having five layers of neurons. Those layers are arranged in sequence as input, encoding, bottleneck, decoding, and output layers. The three layers of neurons between the input and output layers are called hidden layers. This NLPCA model has been applied to a variety of meteorological and oceanographic fields to extract their one-dimensional approximations. Those fields include the heat content anomaly in the Pacific basin (Tang and Hsieh, 2003), the thermocline, sea surface temperature, sea level pressure, and wind stress anomalies in the tropical Pacific (An et al., 2005; Li et al., 2005; Monahan, 2001; Ye and Hsieh, 2006), the surface air temperature over Canada (Wu et al., 2002), and the winter 500-mb geopotential height and surface air temperature anomalies over North America (Wu et al., 2003). In addition, the aforementioned NLPCA model has been used to show the existence of two regimes of variability in the sea level pressure field over the northern hemisphere produced by a general circulation model (Monahan et al., 2000).
At other levels in the extra-tropical atmosphere over the northern hemisphere, NLPCA of the geopotential height fields has shown that three circulation regimes exist in wintertime (Monahan et al., 2001, 2003).

However, Hsieh (2004) notes that an NLPCA model built from such a neural network could over-fit the data presented to it, i.e., fit to the noise in the data. This reduces the usefulness of NLPCA when the goal is to determine the most compact structure describing the variability in a dataset. Another problem plaguing the three-hidden-layer feed-forward neural network is that of the non-uniqueness of solutions. The smaller the dataset being analyzed, the more apparent this problem becomes. It is called the reproducibility issue by Monahan and Fyfe (2007) because the estimation of the parameters of a neural network from a dataset of finite size is sensitive to the introduction of new data. This sensitivity could lead to unstable neural network solutions (Hsieh, 2004) since different solutions would be obtained for each set of different initial model parameters used to start training the neural network. These problems are not unique to NLPCA, but also arise in the nonlinear extensions to canonical correlation analysis (Hsieh, 2000), singular spectrum analysis (Hsieh, 2004), and complex PCA (Rattan and Hsieh, 2004, 2005), which employ this type of neural network.

To determine the causes leading to the problems of over-fitting and non-uniqueness of solutions, Chapter 2 of this dissertation examines in detail the standard three-hidden-layer feed-forward neural network commonly used as an NLPCA model. It is shown that the problems of over-fitting and non-uniqueness are intrinsic to this type of model due to the three-hidden-layer structure and that these problems can be alleviated by appropriately simplifying the structure of the neural network.
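As a rough illustration of the five-layer structure and of why its parameters need not be unique, the following Python sketch builds an I-M-1-M-I auto-associative network and then rescales the bottleneck variable while compensating in the decoding layer. It is a minimal sketch under stated assumptions, not the analysis of Chapter 2: the tanh activations in the encoding and decoding layers, the linear bottleneck and output layers, and the helper names `init_params`, `forward`, and `rescale_bottleneck` are all hypothetical choices made here for illustration.

```python
import numpy as np

def init_params(n_in, n_hid, rng):
    """Random parameters for an I-M-1-M-I auto-associative network (assumed layout)."""
    return {
        "W1": rng.normal(size=(n_hid, n_in)), "b1": rng.normal(size=n_hid),  # input -> encoding
        "w2": rng.normal(size=n_hid), "b2": rng.normal(),                    # encoding -> bottleneck
        "w3": rng.normal(size=n_hid), "b3": rng.normal(size=n_hid),          # bottleneck -> decoding
        "W4": rng.normal(size=(n_in, n_hid)), "b4": rng.normal(size=n_in),   # decoding -> output
    }

def forward(p, x):
    """Return (bottleneck series u, network output) for x of shape (n_samples, n_in)."""
    h_enc = np.tanh(x @ p["W1"].T + p["b1"])          # encoding layer (tanh assumed)
    u = h_enc @ p["w2"] + p["b2"]                     # single linear bottleneck neuron
    h_dec = np.tanh(np.outer(u, p["w3"]) + p["b3"])   # decoding layer (tanh assumed)
    y = h_dec @ p["W4"].T + p["b4"]                   # linear output layer
    return u, y

def rescale_bottleneck(p, a, c):
    """Map u -> a*u + c and undo the change in the decoding layer."""
    q = {k: np.copy(v) for k, v in p.items()}
    q["w2"] = a * p["w2"]
    q["b2"] = a * p["b2"] + c
    q["w3"] = p["w3"] / a
    q["b3"] = p["b3"] - p["w3"] * c / a
    return q

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
p = init_params(3, 4, rng)
q = rescale_bottleneck(p, a=2.5, c=-1.0)
u_p, y_p = forward(p, x)
u_q, y_q = forward(q, x)
print("max output difference:", np.max(np.abs(y_p - y_q)))
print("max bottleneck difference:", np.max(np.abs(u_p - u_q)))
```

In this sketch the two parameter sets reproduce the same output, so they have the same mean square error, yet their bottleneck series differ. This is only one simple source of non-uniqueness under the assumed activations.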
Having refined the three-hidden-layer feed-forward neural network that has been used extensively in the applications of NLPCA to meteorological and oceanographic datasets (An et al., 2005; Hamilton and Hsieh, 2002; Hsieh, 2001; Hsieh and Tang, 1998; Monahan, 2000, 2001; Monahan et al., 2000, 2001, 2003; Rattan and Hsieh, 2004; Tang and Hsieh, 2003; Wu et al., 2002, 2003), the simplified two-hidden-layer feed-forward neural network, called the compact NLPCA model, is applied to the study of the Quasi-Biennial Oscillation (QBO) in Chapter 3. The QBO is a natural oscillation of the stratospheric zonal winds above the equator. Being a well-known quasi-periodic phenomenon, it is an optimum test-bed for investigating the behavior of the compact NLPCA model since results from previous studies are available for comparison.

1.2  The Quasi-Biennial Oscillation

The familiar QBO dominates variations of the zonal winds in the stratosphere (from about 16 to 50 km in altitude) around the equator. In general, the zonal winds alternate between easterly and westerly at a period varying between about 20 and 36 months with a relatively rapid transition over about 2 to 4 months. The wind regimes consistently propagate downward through the equatorial stratosphere. The wind amplitude associated with the QBO, about 20 m s⁻¹, is nearly constant between 10 and 40 hPa and then reduces to about 5 m s⁻¹ at 70 hPa. Noticeable asymmetries in the wind oscillation include: (i) the transition of easterly to westerly is more rapid than that of westerly to easterly, and (ii) the associated westerly shear zone, where westerly winds increase with height, descends more regularly and rapidly than the easterly shear zone, which sometimes ‘stalls’ for several months between about 30 and 50 hPa (Marquardt and Naujokat, 1997). The zonal winds exhibit a rather complicated height-time structure.
There has been considerable work to construct low-dimensional statistical fits that describe the key features of the wind oscillation. For instance, PCA has been applied to the monthly mean zonal winds in the 10-100 hPa range. It was found that the leading two principal components varied out of phase through the QBO period and the leading two modes could describe the basic height-time structure of the zonal winds reasonably well (Wallace et al., 1993). In addition, singular spectrum analysis, also called extended empirical orthogonal function analysis, was applied to the zonal winds (Fraedrich et al., 1993; Wang et al., 1995). Although the linear approaches provide a reasonably good representation of the QBO structure, some of the more interesting characteristic features of the zonal winds are lost. For instance, the wind series tend to look much more sinusoidal in time than the actual wind oscillations, the shear zones are considerably smoothed out, and the observed asymmetry between easterly and westerly regimes is almost completely missed.

Recently, an NLPCA by three-hidden-layer feed-forward neural network with a circular bottleneck node (Kirby and Miranda, 1996), the NLPCA.cir model, was used to represent the monthly mean zonal winds observed between 10 and 70 hPa in the stratosphere near the equator (Hamilton and Hsieh, 2002). This approach takes advantage of the periodic nature of the QBO, which dominates variations of the equatorial stratospheric zonal winds: the observations are mapped to a single phase variable that advances between -π and π over each QBO period. The NLPCA.cir single variable approach fits the raw data more accurately than even PCA two-mode reconstructions. More importantly, this NLPCA.cir approach captures the qualitative features of the typical QBO cycle much better than the linear PCA approaches.

One important benefit of a low-dimensional statistical fit to the zonal winds is that a QBO phase can be determined that accounts for the overall variation of the winds at all vertical levels. In the linear fit by the leading two PCA modes, a time series of QBO phase is defined as the arc tangent of the leading two principal components (Wallace et al., 1993). The single phase variable in the NLPCA.cir approach can be directly interpreted as the phase of the QBO (Hamilton and Hsieh, 2002).

An estimate of the QBO phase as a function of time is needed for at least two types of investigations: (i) characterizing the connections of the tropical stratospheric QBO with other aspects of the atmospheric circulations, and (ii) studying the relation between the QBO and the annual cycle. The first of these research areas has a long history extending back at least to the work of Holton and Tan (1980), who found that, in the northern hemisphere, the extra-tropical stratospheric circulation differed on average between easterly and westerly phases of the equatorial QBO. The phase of the equatorial QBO has also been correlated with ozone variations and many aspects of meteorological variability in the stratosphere and troposphere (Baldwin et al., 2001). The possible relation between the QBO and the annual cycle has been considered by various investigators (Dunkerton and Delisi, 1985; Dunkerton, 1990; Wallace et al., 1993). Interestingly, there have been a number of very recent papers reexamining both of these aspects of the QBO. The connection between the phase of the QBO and the weather of the northern hemisphere troposphere, an issue of possible importance for seasonal forecasting, has been examined based on observations (Thompson et al., 2002; Cai, 2003). The problem of the annual synchronization of the QBO, in observations and simple mechanistic model experiments, has been revisited (Hampson and Haynes, 2004).
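The linear phase definition mentioned in this section, the arc tangent of the leading two principal components, can be sketched in Python as follows. This is a minimal sketch on synthetic data, not the observed winds: the 28-month period, the 20 m/s amplitude, the linear phase lag across seven levels, the ordering `arctan2(PC2, PC1)`, and the function name `pca_phase` are assumptions chosen only to make the example self-contained.

```python
import numpy as np

def pca_phase(data):
    """Phase from the two leading principal components of data (n_times, n_levels)."""
    anomalies = data - data.mean(axis=0)               # remove the time mean at each level
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    pcs = u[:, :2] * s[:2]                             # leading two principal components
    variance_fraction = s**2 / np.sum(s**2)            # fraction of variance per mode
    phase = np.arctan2(pcs[:, 1], pcs[:, 0])           # phase in [-pi, pi]
    return phase, pcs, variance_fraction

# Synthetic, idealized "winds": one sinusoid per level with a phase lag that
# increases with level, loosely mimicking a downward-propagating oscillation.
months = np.arange(336)
period = 28.0
lags = np.linspace(0.0, np.pi, 7)
winds = 20.0 * np.cos(2.0 * np.pi * months[:, None] / period + lags[None, :])

phase, pcs, frac = pca_phase(winds)
print("variance captured by two leading modes:", frac[:2].sum())
print("accumulated phase change (rad):", np.unwrap(phase)[-1] - np.unwrap(phase)[0])
```

Because every level of this idealized dataset is a pure sinusoid, two modes describe it completely and the phase advances steadily. Real winds are not sinusoidal, which is the limitation of the linear fit discussed above.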
Almost all of these studies have used a definition of the phase of the QBO based on the time series of the zonal wind at a single level (often chosen to be 50 or 40 hPa). An exception is the examination of the annual synchronization of the QBO phase that is determined from the leading two principal components of linear PCA (Wallace et al., 1993). The NLPCA.cir approach incorporates the full vertical structure of the zonal winds into the time series of a single variable interpretable as the phase of the QBO cycle (Hamilton and Hsieh, 2002). As noted above, this approach works quite well in many respects and better than the linear PCA approaches, but the restriction to essentially a constant amplitude in representing the wind oscillation is a significant limitation. The constant amplitude results from the NLPCA.cir model, which produces a one-dimensional approximation to the zonal winds by projecting the two bottleneck series onto a unit circle before decoding them.

In Chapter 3, the compact NLPCA model with two bottleneck neurons is used to characterize the observed zonal winds in the equatorial stratosphere. This is a two-component fit, which effectively allows the cycle-to-cycle variability in both the phase progression and the amplitude of the QBO winds to be represented. The two-component representation is then analyzed to investigate the variability and structure of the QBO winds.

After testing the compact NLPCA model on the dataset of the QBO winds, the relationship between the tropical QBO and the northern hemisphere climate as represented by the Arctic Oscillation (AO) is analyzed using the compact NLPCA model in Chapter 4. The AO is an atmospheric oscillation occurring in the northern hemisphere that has been shown to account for a large fraction of atmospheric variability (Thompson and Wallace, 2000).
Since the AO exists in the extra-tropical stratosphere as well as in the troposphere, the establishment of a connection between the QBO and the AO can indicate whether tropical stratospheric variability can affect extra-tropical tropospheric climate.

1.3  Association of the Arctic Oscillation with the Quasi-Biennial Oscillation

1.3.1  The Arctic Oscillation

The Arctic Oscillation has been considered the dominant climate feature over the extra-tropics of the northern hemisphere (Thompson and Wallace, 2000, 2001). It is usually defined as the leading empirical orthogonal function of the 1000-hPa geopotential height (or sea level pressure) anomaly poleward of 20°N (Thompson and Wallace, 1998, 2000). Its loading pattern portrays a meridional dipole extending from the subtropics to the pole with a node between 50°N and 60°N depending on longitude. The AO has a zonal structure which, southward of the node, consists of two major centers of action over the Atlantic and Pacific sectors, an asymmetric structure induced by land-sea thermal contrast and topographic forcing. Its time series, called the AO index, describes variations of the AO. Large positive values of the AO index are characterized by anomalously low sea level pressures and cold temperatures over the Arctic basin, anomalously strong westerlies blowing across the North Atlantic along 55°N and warm temperatures in the mid-latitudes. In contrast, large negative values of the AO index are characterized by an anti-cyclonic circulation over the Arctic basin, cold anti-cyclones centered over central Canada and Russia, anomalously cold mean surface air temperatures and an increased frequency of extreme cold events in North America, Europe and Asia (Thompson and Wallace, 2001; Thompson et al., 2002).

The rise and fall of the AO index have been related to the strengthening and weakening of the polar westerly vortex aloft in the stratosphere. In the extra-tropical stratosphere, zonal mean circulations undergo a prominent seasonal cycle with an annual reversal from winter to summer. Westerly winds are induced in winter while the polar stratosphere cools down because of radiative cooling. The westerlies are then replaced by easterly winds in summer as the polar stratosphere warms up with increasing solar heating. However, the smooth seasonal cycle is perturbed in winter by episodic strengthening and weakening (or even reversal) of the polar westerly vortex on timescales of weeks to months (Holton and Mass, 1976; Yoden, 1990; Scott and Haynes, 1998). Large and sustained perturbations to the polar westerly vortex tend to propagate down to the earth’s surface in about three weeks, causing unusual weather anomalies and climate changes (Baldwin and Dunkerton, 1999, 2001; Coughlin and Tung, 2005).

1.3.2  Effect of the Quasi-Biennial Oscillation on the Arctic Oscillation

The perturbations to the polar westerly vortex in the stratosphere have been attributed to the effect of the QBO of equatorial stratospheric zonal winds on the structure of the waveguide of the planetary Rossby waves (Wallace and Thompson, 2002). The quasi-stationary, westward-traveling planetary Rossby waves are forced by land-sea thermal contrast and topography over the mid-latitudes and maintained through variation of the Coriolis parameter with latitude. They typically propagate upward and equator-ward in a westerly zone but are evanescent in an easterly region, and are therefore present in the winter stratosphere but absent in the summer stratosphere (Charney and Drazin, 1961; Dunkerton, 1983; Andrews et al., 1987). Because the equatorial stratospheric zonal winds alternate between easterly and westerly regimes with a variable period averaging approximately 28 months (Marquardt and Naujokat, 1997), the QBO modifies the location of the boundary between westerly and easterly winds in the stratosphere.
The longitudinal zero-wind line coincides with the critical line for waves with a zero phase speed. Waves are partly absorbed and reflected while approaching their critical lines. A QBO westerly regime extends the waveguide to the summer stratosphere and allows the waves to penetrate into the tropics, leading to decreased westward wave drags in the winter extra-tropics and a strengthened polar westerly vortex. In contrast, a QBO easterly regime confines the waveguide within the winter extra-tropical stratosphere, leading to increased westward wave drags there and a weakened polar westerly vortex. Sometimes amplitudes of the planetary Rossby waves in the narrowed waveguide are so high that the waves break down and exert a significant westward drag on the polar vortex, causing the replacement of the westerly vortex by an easterly vortex (McIntyre and Palmer, 1983, 1984). This effect of the QBO has typically been demonstrated by showing that the geopotential height at high latitudes is significantly lower during QBO westerly regimes than during QBO easterly regimes (Holton and Tan, 1980, 1982; Dunkerton and Baldwin, 1991; Baldwin et al., 2001).

Due to asymmetry between the QBO westerly and easterly regimes and differences in the zonal winds and their phase progressions at different heights, the height-time structure of the zonal winds in the equatorial stratosphere is rather complicated (Marquardt and Naujokat, 1997). Previous investigations into the effect of the QBO on the AO generally used a definition of the QBO regime based on a time series of the equatorial zonal winds at a single level, which was often chosen to be 50 or 40 hPa. However, the numbers of strengthening and weakening events of the polar westerly vortex during QBO easterly and westerly regimes are different if the regimes are defined according to the equatorial zonal winds at different vertical levels (Baldwin et al., 2001).
The collection of critical lines at different heights corresponds to a critical surface. Naturally, it is the zonal winds through different heights, instead of only those at a single level in the equatorial stratosphere, that will define the topology of the critical surface and therefore control the waveguide of the planetary Rossby waves. A solid analysis of the effect of the QBO on the AO should account for the overall variability of the zonal winds at all levels in the equatorial stratosphere. In Chapter 4, the compact NLPCA model with two bottleneck neurons is applied to a dataset consisting of an AO index and the zonal winds to investigate the relationship of the AO with the QBO. The AO index is based on the 1000-hPa geopotential height anomaly poleward of 20°N. The data of the zonal winds consist of observations at seven pressure levels from 10 to 70 hPa in the stratosphere near the equator. The NLPCA applied to the dataset results in two nonlinear principal components, which are then analyzed to investigate the relationship of the AO index with the overall QBO winds.

Chapter 2

A Compact Neural Network Model for Nonlinear Principal Component Analysis

2.1  Introduction

NonLinear Principal Component Analysis (NLPCA) of a dataset using a feed-forward neural network with three hidden layers of neurons between the input and output layers can produce solutions that over-fit the data and are non-unique (Hsieh, 2004). Up to now, the problems of over-fitting and non-uniqueness of solutions have been dealt with by resorting to subjective operations on the neural network and its functioning. For instance, to reduce the severity of over-fitting, methods such as “weight penalty” and “early-stop during the training phase” are commonly adopted (Hsieh, 2004). The weight penalty adds an extra term, a squared norm of neural network parameters tuned by a weight penalty parameter, to the error function used to train the neural network.
When the error function is minimized during neural network training, the use of weight penalty discourages generation of large weights in the neural network, as the presence of one large weight is more costly to the error function than many small ones (Hinton, 1989). Effectively, use of a weight penalty reduces the nonlinear flexibility of the neural network and, as a consequence, hampers the minimization of the error function. A value for the weight penalty parameter has to be chosen subjectively by the user, through trial and error (Hsieh, 2001). The early-stop technique stops the neural network training at a given iterative step of the algorithm used for the nonlinear optimization. The stop step can be either based on a validation criterion or preset. To determine the validation criterion, the dataset is split into training and validation subsets. The former is used to train the neural network and the latter to test the skill of the trained neural network on new data presented to it. If the mean square error (the usual error function) over the validation data is significantly larger than that over the training data found during the training, the stop step and/or the size of the training subset have to be changed until the error over the validation subset becomes smaller (Monahan, 2000, 2001; Monahan and Fyfe, 2007). This is also a subjective process driven by a trial and error procedure, since it depends on the subjective choices of what percentage of the data to set aside for the validation subset and of the acceptable size of the error difference between validation and training subsets.

To determine the causes leading to the problems of over-fitting and non-uniqueness, this chapter examines in detail, both in theory and by experiment, three-hidden-layer feed-forward neural networks and then proposes a compact NLPCA model. The chapter is organized as follows.
Section 2.2 makes use of the fact that the neural network architecture of NLPCA provides continuous mapping functions to point out what causes the over-fitting and non-uniqueness of solutions of the three-hidden-layer feed-forward neural network. Section 2.3 proposes a compact neural network architecture for NLPCA. Section 2.4 gives examples of NLPCA using the new, compact neural network model. Section 2.5 compares the number of parameters of the compact NLPCA model with that of the standard three-hidden-layer feed-forward neural network model. Section 2.6 gives conclusions.

2.2  The standard three-hidden-layer feed-forward neural network for NLPCA

2.2.1  Structure and variable glossary of the standard three-hidden-layer feed-forward neural network for NLPCA

In the original three-hidden-layer feed-forward neural network for NLPCA (Kramer, 1991) (Fig. 2.1), input signals are fed into each encoding neuron of the first hidden layer through a nonlinear transfer function. The nonlinear transfer function is used again when signals are fed from the bottleneck neurons of the second hidden layer into each decoding neuron of the third hidden layer. A popular nonlinear transfer function for feed-forward neural networks is the hyperbolic tangent and it is also used in this study. A linear transfer function is used to forward signals from the encoding neurons to each bottleneck neuron and from the decoding neurons to each output neuron. Since signals are always fed forward from one layer of neurons to the next, the neural network is said to be feed-forward. In NLPCA, target data are the same as input data, so output neurons are in pairs with input neurons and the neural network is sometimes described as auto-associative. The number of input (and output) neurons, $I$, equals the number of variables present in the dataset to be analyzed.
Dimension reduction is achieved when the signals pass through the bottleneck neurons and each bottleneck neuron produces a series of bottleneck signals which is considered a nonlinear principal component. For that reason, the number of bottleneck neurons, $K$, is always less than that of the input neurons, i.e., $K < I$. The number of encoding neurons, $H$, and the number of decoding neurons, $M$, usually set to be the same, are adjustable for an optimal fit of model output to target. This neural network will be referred to as an I-H-K-M-I model. For example, the neural network in Fig. 2.1, having three input and two encoding neurons, one bottleneck neuron followed by two decoding and three output neurons, is referred to as a 3-2-1-2-3 model. For reference, a variable glossary for a three-hidden-layer feed-forward neural network with one bottleneck neuron, i.e., an I-H-1-M-I model, is listed in Table 2.1.

  $I$         the number of input (and output) neurons
  $H$         the number of encoding neurons
  $M$         the number of decoding neurons, $M \ge 2$
  $N$         the number of samples
  $A_{hi}$    weight of encoding neuron $h$ to signals from input neuron $i$
  $a_h$       bias of encoding neuron $h$
  $B_h$       weight of the bottleneck neuron to signals from encoding neuron $h$
  $b$         bias of the bottleneck neuron
  $C_m$       weight of decoding neuron $m$ to signals from the bottleneck neuron
  $c_m$       bias of decoding neuron $m$
  $D_{im}$    weight of output neuron $i$ to signals from decoding neuron $m$
  $d_i$       bias of output neuron $i$
  $p_{in}$    signal from input neuron $i$ at sampling point $n$
  $x_{hn}$    signal from encoding neuron $h$ at sampling point $n$
  $u_n$       signal from the bottleneck neuron at sampling point $n$
  $y_{mn}$    signal from decoding neuron $m$ at sampling point $n$
  $q_{in}$    signal from output neuron $i$ at sampling point $n$
  $\phi$      mean square error of model output to input

Table 2.1: A variable glossary for the I-H-1-M-I model.
The encoding signal $x_{hn}$, bottleneck signal $u_n$, decoding signal $y_{mn}$, output signal $q_{in}$, and mean square error function $\phi$ are expressed mathematically by

$$x_{hn} = \tanh\Big(\sum_{i=1}^{I} A_{hi} p_{in} + a_h\Big), \tag{2.1}$$

$$u_n = \sum_{h=1}^{H} B_h x_{hn} + b, \tag{2.2}$$

$$y_{mn} = \tanh(C_m u_n + c_m), \tag{2.3}$$

$$q_{in} = \sum_{m=1}^{M} D_{im} y_{mn} + d_i, \tag{2.4}$$

$$\phi = \frac{1}{IN} \sum_{i=1}^{I} \sum_{n=1}^{N} (q_{in} - p_{in})^2, \tag{2.5}$$

where $i = 1, \ldots, I$; $h = 1, \ldots, H$; $m = 1, \ldots, M$; and $n = 1, \ldots, N$. Except for the given input (and target) signals $\{p_{in}\}$, all other signals and all weights and biases of the neural network are determined by a minimization of the error function $\phi$. According to equation (2.5), the error function $\phi$ measures the squared difference between output and target signals. If all the transfer functions are set to be linear, the neural network will degenerate to perform a linear PCA (Bourlard and Kamp, 1988; Baldi and Hornik, 1989). If the neural network is given only one decoding neuron ($M = 1$), according to equation (2.4), all the output series would be linearly related to each other because of the linear transfer from the decoding signal to the output signal. Therefore, to focus on nonlinear analysis, only nonlinear neural networks with at least two decoding neurons ($M \ge 2$) are discussed in this study. At this point, application of NLPCA to a dataset implies solving the neural network chosen for conducting the analysis. Solution (also called training) of the neural network means the determination of the appropriate weights and biases based on the minimization of the error function $\phi$. The optimization of the neural network through minimization of the error function $\phi$ is accomplished through a hybrid procedure described in Appendix A.
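The signal definitions in equations (2.1) to (2.5) can be illustrated with a short forward pass. The following is a minimal sketch only, assuming NumPy and an array layout chosen here for illustration (signals stored as variables by samples); the function name `forward_ihkmi` is hypothetical and the training procedure of Appendix A is not reproduced.

```python
import numpy as np

def forward_ihkmi(p, A, a, B, b, C, c, D, d):
    """Forward pass of an I-H-1-M-I network, following equations (2.1)-(2.5).

    Assumed shapes (an illustrative layout, not taken from the thesis):
    p: (I, N) inputs, A: (H, I), a: (H,), B: (H,), b: scalar,
    C: (M,), c: (M,), D: (I, M), d: (I,).
    """
    x = np.tanh(A @ p + a[:, None])            # (2.1) encoding signals, shape (H, N)
    u = B @ x + b                              # (2.2) bottleneck signal, shape (N,)
    y = np.tanh(np.outer(C, u) + c[:, None])   # (2.3) decoding signals, shape (M, N)
    q = D @ y + d[:, None]                     # (2.4) output signals, shape (I, N)
    phi = np.mean((q - p) ** 2)                # (2.5) mean square error over I*N values
    return x, u, y, q, phi
```

With randomly drawn weights the function simply returns the five quantities of Table 2.1 that depend on the sample index, plus the error; in NLPCA the target is the input `p` itself, which is why `phi` compares `q` with `p`.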
2.2.2  The problem of non-uniqueness: theory

In order to understand why the problems of over-fitting and non-uniqueness arise when using the three-hidden-layer feed-forward neural network to conduct NLPCA, the neural network is de-constructed to pinpoint the causes of the problems. It has been proved in theory that, given enough hidden neurons, a one-hidden-layer feed-forward neural network with monotonically increasing but bounded nonlinear transfer functions at the hidden layer and linear transfer functions at the output layer can approximate any continuous function to an arbitrary accuracy (Cybenko, 1989; Hornik et al., 1989). The three-hidden-layer feed-forward neural network is just constructed as a consecutive connection of two such one-hidden-layer feed-forward neural networks, with the input-encoding-bottleneck part coming from the first neural network and the bottleneck-decoding-output part coming from the second (Kramer, 1991). Then, supposedly, the first one-hidden-layer feed-forward neural network nonlinearly projects a high-dimensional input space to a low-dimensional bottleneck space. This is followed by an inverse projection through the second neural network from the bottleneck space back to the original space represented by the model output. From this it follows that, given enough encoding neurons, the bottleneck signal can reach any possible value, since there is no constraint to approximate a definite quantity at the bottleneck neuron. Now, the juxtaposition of hidden layers with linear and nonlinear transfer functions, of which the linear transfer functions are unbounded, leads to a non-unique determination of the weights and biases associated with individual neurons. Whether the nonlinear principal component (i.e., the bottleneck series) can remain essentially unchanged between realizations depends on whether the neural network parameters can be uniquely determined.
This can be shown by considering the partial derivatives of the error function $\phi$ with respect to bottleneck weight $B_h$, bottleneck bias $b$, decoding weight $C_m$, and decoding bias $c_m$. From equations (2.2) to (2.5), the partial derivatives are:

$$\frac{\partial\phi}{\partial B_h} = \frac{2}{IN} \sum_{i=1}^{I} \sum_{n=1}^{N} (q_{in} - p_{in}) \sum_{m=1}^{M} D_{im} (1 - y_{mn}^2) C_m x_{hn}, \tag{2.6}$$

$$\frac{\partial\phi}{\partial b} = \frac{2}{IN} \sum_{i=1}^{I} \sum_{n=1}^{N} (q_{in} - p_{in}) \sum_{m=1}^{M} D_{im} (1 - y_{mn}^2) C_m, \tag{2.7}$$

$$\frac{\partial\phi}{\partial C_m} = \frac{2}{IN} \sum_{i=1}^{I} \sum_{n=1}^{N} (q_{in} - p_{in}) D_{im} (1 - y_{mn}^2) u_n, \tag{2.8}$$

$$\frac{\partial\phi}{\partial c_m} = \frac{2}{IN} \sum_{i=1}^{I} \sum_{n=1}^{N} (q_{in} - p_{in}) D_{im} (1 - y_{mn}^2). \tag{2.9}$$

Combining equations (2.2), (2.6), (2.8), and (2.9) yields

$$\sum_{h=1}^{H} \frac{\partial\phi}{\partial B_h} B_h + \sum_{m=1}^{M} \frac{\partial\phi}{\partial c_m} C_m b = \sum_{m=1}^{M} \frac{\partial\phi}{\partial C_m} C_m. \tag{2.10}$$

From equations (2.7) and (2.9), it follows that

$$\frac{\partial\phi}{\partial b} = \sum_{m=1}^{M} \frac{\partial\phi}{\partial c_m} C_m. \tag{2.11}$$

The linear transfer function at the bottleneck layer, defined by equation (2.2), implies two linear relations, equations (2.10) and (2.11), among the optimization equations (2.6) to (2.9). These linear relations indicate that optimization equations (2.6) to (2.9) do not form a set of independent equations. Since the number of independent optimization equations is always less than that of model parameters, the three-hidden-layer feed-forward neural network is an under-determined model whose parameters and hence solutions cannot be uniquely determined, causing the non-uniqueness problem. The same is true when the target data of the neural network are different from its input data, since the difference will not affect the above analysis and conclusion.

2.2.3  The problem of over-fitting: examples

The introduction of NLPCA to atmospheric and oceanographic sciences (Monahan, 2000; Hsieh, 2001) was described by applying a three-hidden-layer feed-forward neural network model to the Lorenz chaotic attractor (Lorenz, 1963).
In those studies, the problems of over-fitting and non-unique solutions were mentioned and practical methods were adopted to reduce their severity, but the root of the problems was not investigated. In this study, for demonstration and comparison with previous studies, a dataset of 1500 samples is generated from the Lorenz chaotic attractor to show NLPCA performances of various neural network models. The Lorenz chaotic attractor is governed by three differential equations:

$$\dot{x}_1 = -a x_1 + a x_2, \qquad \dot{x}_2 = -x_1 x_3 + b x_1 - x_2, \qquad \dot{x}_3 = x_1 x_2 - c x_3,$$

where $x_1$, $x_2$ and $x_3$ are the three variables of the attractor and the overhead dot denotes a time derivative. With $a = 10$, $b = 28$ and $c = 8/3$, the system of equations displays chaotic behavior and its dynamics is characterized by an attractor. The dataset will be called the Lorenz attractor for convenience. Before applying NLPCA, the input (and target) data from the Lorenz attractor have been re-scaled to adjust the range of variability of $x_1$, $x_2$ and $x_3$. The data from each variable are adjusted by subtracting the mean of the variable series and dividing them by the maximum standard deviation of the three variable series. For comparison purposes, after NLPCA the output series are scaled back to the original magnitudes by multiplying each output series by the maximum standard deviation and adding the mean of the corresponding variable series. All the NLPCA runs in this chapter will follow this pre- and post-scaling procedure. A 3-M-1-M-3 model is applied to the Lorenz attractor, where the number of encoding (and decoding) neurons $M$ is varied from 2 to 7 with 7 runs for each $M$. The resultant mean square errors of the model runs are scattered between 4 and 12 and decrease monotonically with the increase of $M$ (Fig. 2.2). Therefore, on average, the 3-7-1-7-3 model runs show the lowest mean square error for all the values of $M$.
In terms of least square error, the 3-7-1-7-3 model seems to be the best choice to represent the Lorenz attractor. However, as will be shown, the 3-2-1-2-3 model runs already over-fit the data and going to a higher value of $M$ only compounds the problem. Consider the nonlinear principal component (i.e., the bottleneck series): extracting it is the goal of NLPCA, and its pattern affects the model output. Hence, it is expected that the nonlinear principal component is a particularly well-behaved function. However, the bottleneck series of the 3-2-1-2-3 solution frequently jumps between relatively small and large values (Fig. 2.3a), instead of being continuous as required by this kind of neural network (Cybenko, 1989). The discontinuities are aggravated in the bottleneck series of the 3-4-1-4-3 solution (Fig. 2.3b) and worsened in the bottleneck series of the 3-7-1-7-3 solution (Fig. 2.3c), as the number of encoding neurons is increased. Consequently, in each two-dimensional phase plot, the output series of the 3-2-1-2-3 solution form one or two segmented straight lines (Fig. 2.4a–c, densely overlapping circles), those of the 3-4-1-4-3 solution are separated into three segmented lines or curves (Fig. 2.4d–f, densely overlapping circles), and those of the 3-7-1-7-3 solution are split into five segmented lines or curves (Fig. 2.4g–i, densely overlapping circles), in a zigzag way. The discontinuities and zigzags in model output emerge from the solution of the simple 3-2-1-2-3 model and become increasingly serious as $M$ is increased. A one-dimensional (i.e., having one bottleneck neuron) NLPCA model should produce a model output that is a smooth curve passing through data points of the model target. The zigzags in Fig. 2.4 indicate that the output signals are over-fitting the target data.
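The Lorenz dataset and the pre- and post-scaling procedure used in this section can be sketched as follows. This is an illustration under stated assumptions only: the text does not give the integration scheme, time step, initial condition or sampling interval, so the fourth-order Runge-Kutta scheme, `dt`, `spinup` and `s0` below are choices made here, and all function names are hypothetical.

```python
import numpy as np

def lorenz_rhs(s, a=10.0, b=28.0, c=8.0 / 3.0):
    """Right-hand side of the Lorenz equations with a = 10, b = 28, c = 8/3."""
    x1, x2, x3 = s
    return np.array([-a * x1 + a * x2,
                     -x1 * x3 + b * x1 - x2,
                     x1 * x2 - c * x3])

def lorenz_series(n_samples=1500, dt=0.01, spinup=1000, s0=(1.0, 1.0, 1.0)):
    """Integrate with classical RK4 (assumed scheme); returns an array of shape (3, n_samples)."""
    s = np.array(s0, dtype=float)
    out = np.empty((3, n_samples))
    for step in range(spinup + n_samples):
        k1 = lorenz_rhs(s)
        k2 = lorenz_rhs(s + 0.5 * dt * k1)
        k3 = lorenz_rhs(s + 0.5 * dt * k2)
        k4 = lorenz_rhs(s + dt * k3)
        s = s + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        if step >= spinup:              # discard the assumed spin-up segment
            out[:, step - spinup] = s
    return out

def prescale(data):
    """Subtract each variable's mean, divide by the largest of the three standard deviations."""
    mean = data.mean(axis=1, keepdims=True)
    max_std = data.std(axis=1).max()
    return (data - mean) / max_std, mean, max_std

def postscale(scaled, mean, max_std):
    """Invert prescale: multiply by the maximum standard deviation and add the means back."""
    return scaled * max_std + mean
```

Dividing all three variables by the same maximum standard deviation, rather than each by its own, keeps their relative amplitudes intact, which is what allows the output series to be compared with the original attractor after post-scaling.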
2.3  A new two-hidden-layer feed-forward neural network for NLPCA

In this section, it is proposed to simplify the architecture of the neural network to solve the problems of over-fitting and non-uniqueness. A simplified neural network structure offers a way to obtain a more stable neural-network-based NLPCA. It has been proved in theory (Cybenko, 1988) that at most two hidden layers are needed for a feed-forward neural network to approximate any continuous function to a given accuracy. The following will show that the over-fitting and non-uniqueness problems can be alleviated by using a two-hidden-layer feed-forward neural network for NLPCA. The neural network, shown in Fig. 2.5, has no encoding layer. The bottleneck neurons make up the first hidden layer and the decoding neurons the second hidden layer, both carrying the nonlinear transfer function. The hyperbolic tangent is retained as the nonlinear transfer function and the linear transfer function for the output signal remains. The number of decoding neurons, $M$, is adjustable for an optimal fit of model output to target. The neural network with one bottleneck neuron will be referred to as an I-1-M-I model and the bottleneck signal becomes

$$u_n = \tanh\Big(\sum_{i=1}^{I} A_i p_{in} + a\Big), \tag{2.12}$$

where $A_i$ is the weight of the bottleneck neuron to signals from input neuron $i$ and $a$ is the bias of the bottleneck neuron. The symbols for the variables representing decoding and output neurons of the two-hidden-layer feed-forward neural network are the same as those of the three-hidden-layer feed-forward neural network that are described in Table 2.1 and by equations (2.3) and (2.4). A 3-1-M-3 model is applied to the Lorenz attractor, where $M$ is varied from 2 to 7 with 7 runs for each $M$. The resultant mean square errors of the model runs fall on 13.7 when $M = 2$ and on 13.0 when $3 \le M \le 7$ (Fig. 2.6a).
The 3-1-3-3 model has the fewest model parameters among those models that offer solutions with the lowest mean square error (Fig. 2.6a, circle). It is optimal in terms of least square error and fewest model parameters. As desired, both the bottleneck series (Fig. 2.7a) and the two-dimensional phase plots of output series of the 3-1-3-3 solution (not shown, but refer to Fig. 2.9g–i for similar phase plots) are smooth curves. Moreover, the output series of the $4 \le M \le 7$ model solutions closely resemble those of the 3-1-3-3 solution, bearing Pearson correlation coefficients $\ge 0.996$ between corresponding pairs of output series. More importantly for the present discussion, the output series from the 3-1-3-3 solution, along with those of the $4 \le M \le 7$ model solutions, are unique to the optimal representation of the Lorenz attractor using one nonlinear principal component. Having simplified the structure of the neural network from three to two hidden layers of neurons, it will be shown that further simplifications can be applied to the mathematical definitions of the bottleneck and output neurons. These simplifications reduce the number of parameters that need to be evaluated to construct the neural network model.

2.3.1  Simplification: removal of bottleneck bias

The major goal of NLPCA is to extract the nonlinear principal component (i.e., the bottleneck series) of a dataset which can effectively characterize and represent the data. Ideally, the nonlinear principal component should be the same among the model runs whose solutions of the output series are the same. However, although the shapes of the bottleneck series from the 3-1-3-3, 3-1-4-3, and 3-1-5-3 solutions are very similar (Fig. 2.7), the bottleneck series of the 3-1-3-3 solution fluctuates smoothly within [-1, 0.2] and has a mean at -0.6 (Fig.
2.7a), while the bottleneck series of the 3-1-4-3 solution and the 3-1-5-3 solution, which have the same mean square error as the 3-1-3-3 solution (Fig. 2.6a, circles), undulate within [-0.6, 0.6] with a mean at 0.1 (Fig. 2.7b) and within [-1, 1] with a mean at -0.2 (Fig. 2.7c), respectively. The variation among the bottleneck series is due at least partly to the different evaluations of the bottleneck bias. Current iterative algorithms for nonlinear optimization of a model rarely give the same value to a model parameter among model runs started with different initial values. The definition of the bottleneck signal $u_n = \tanh(\sum_i A_i p_{in} + a)$ (equation (2.12)) shows that large or small evaluations of bottleneck bias $a$ will raise or lower the bottleneck series, resulting in deviations between different model runs. Hence, eliminating the bottleneck bias will help to produce more stable nonlinear principal components. In general, a bottleneck signal $u_n = 0$ is expected when the input signals at a sampling point $n$ are all zero, i.e., $\{p_{in} = 0 \mid i = 1, \ldots, I\}$. This also suggests eliminating the bottleneck bias when input series are centered at zero. Again, the uniqueness of the nonlinear principal component is emphasized because it is critical for further data analysis and visualization.

2.3.2  Additional simplification: removal of output biases

Evaluations of the output biases also show significant differences between model runs. For instance, the output biases are $d_1 = -14000$, $d_2 = 7200$, and $d_3 = -38000$ from the 3-1-3-3 model solution (run 5), but $d_1 = -2100$, $d_2 = 1100$, and $d_3 = -5600$ from run 7 of the 3-1-3-3 model, although the Pearson correlation coefficients of both bottleneck series and output series of the two runs are 1.0000. The differences in output biases are quite large in view of the similarity between bottleneck series and output series of the two model runs.
Let us examine the role of output biases in determining the output series and the mean square error of the neural network. Because $\tanh$ is an odd function, the power series expansion of $\tanh(Cx + c)$ has the form

$$\tanh(Cx + c) = \sum_s a_s (Cx + c)^{2s-1},$$

where $s = 1, 2, \ldots$ and $a_s$ is a constant parameter depending on $s$, and can be further expanded as

$$\tanh(Cx + c) = \sum_s a_s f_s(Cx, c) + \sum_s a_s c^{2s-1},$$

where $f_s$ is a function with a form depending on $s$. For example, for $s = 2$, $(Cx + c)^3 = (Cx)^3 + 3Cxc^2 + 3(Cx)^2 c + c^3$, so $f_2 = (Cx)^3 + 3Cxc^2 + 3(Cx)^2 c$. Substituting the power series expansion relevant to the decoding signal $y_{mn} = \tanh(C_m u_n + c_m)$ (equation (2.3)) into the output signal $q_{in} = \sum_m D_{im} y_{mn} + d_i$ (equation (2.4)) gives

$$q_{in} = \sum_m D_{im} \sum_s a_s f_s(C_m u_n, c_m) + \sum_m D_{im} \sum_s a_s c_m^{2s-1} + d_i.$$

This expression shows that the output bias $d_i$ can be omitted from the expression of the output signal $q_{in} = \sum_m D_{im} y_{mn} + d_i$ (equation (2.4)) because the second term on the right side of the equation, which does not depend on the sampling point $n$, can express the flexibility afforded by $d_i$. In fact, in the demonstration of functioning of the one-hidden-layer feed-forward neural network, the output bias term is not used (Cybenko, 1989). The output bias $d_i$ is not necessary for a minimization of the error function $\phi$, either. This is shown as follows by examining the first and second derivatives of the error function $\phi$ with respect to output biases and output weights. The first derivatives of the error function $\phi$ with respect to output bias $d_i$ and output weight $D_{im}$ are

$$\frac{\partial\phi}{\partial d_i} = \frac{2}{IN} \sum_{n=1}^{N} (q_{in} - p_{in}), \tag{2.13}$$

$$\frac{\partial\phi}{\partial D_{im}} = \frac{2}{IN} \sum_{n=1}^{N} (q_{in} - p_{in}) y_{mn}. \tag{2.14}$$

The second derivative of $\phi$ with respect to output bias $d_i$,

$$\frac{\partial^2\phi}{\partial d_i^2} = \frac{2}{I},$$

is always positive and that with respect to output weight $D_{im}$,

$$\frac{\partial^2\phi}{\partial D_{im}^2} = \frac{2}{IN} \sum_{n=1}^{N} y_{mn}^2,$$

is always positive provided $\{y_{mn} \mid n = 1, \ldots, N\}$ are not all zero.
The positive second derivatives indicate that, when the first derivative of the error function $\phi$ with respect to an output bias or weight is zero at a certain value of the parameter, the error function $\phi$ bears a minimum at this point, instead of a maximum. Hence, the optimized output biases and weights always lead the error function $\phi$ to a minimum. The error function $\phi$ is a quadratic function of output bias $d_i$ and output weight $D_{im}$, since it represents the sum of squared errors of the model output to target and the transfer functions at the output layer are linear. In general, the error function $\phi$ will have a multidimensional parabolic form in $\{d_i\}$ and $\{D_{im}\}$ space. There is then a single minimum of the error function $\phi$, which can be located by a one-step exact minimization of the mean square error with respect to output biases and output weights using the least-square method. Using equation (2.14), it can be written that

$$\sum_{m=1}^{M} \Big(D_{jm} \frac{\partial\phi}{\partial D_{im}} - D_{im} \frac{\partial\phi}{\partial D_{jm}}\Big) = \frac{2}{IN} \sum_{m=1}^{M} \Big[D_{jm} \sum_{n=1}^{N} (q_{in} - p_{in}) y_{mn} - D_{im} \sum_{n=1}^{N} (q_{jn} - p_{jn}) y_{mn}\Big].$$

When $\partial\phi/\partial D_{im} = \partial\phi/\partial D_{jm} = 0$, the error function $\phi$ is minimized with respect to output weights, the left side of the equation is zero and the constant factor $2/(IN)$ at the right side can be canceled. Then the equation becomes

$$0 = \sum_{m=1}^{M} \sum_{n=1}^{N} [D_{jm} (q_{in} - p_{in}) y_{mn} - D_{im} (q_{jn} - p_{jn}) y_{mn}].$$

The following step-by-step derivations, using the mathematical definition of the output signal (equation (2.4)) repeatedly, will simplify the right side of the above equation. In the second step, the two triple sums are equal under an exchange of the indices $m$ and $k$ and therefore cancel.

$$0 = \sum_{m=1}^{M} \sum_{n=1}^{N} (D_{jm} q_{in} y_{mn} - D_{jm} p_{in} y_{mn} - D_{im} q_{jn} y_{mn} + D_{im} p_{jn} y_{mn}),$$

$$0 = \sum_{m=1}^{M} \sum_{n=1}^{N} \Big(\sum_{k=1}^{M} D_{jm} D_{ik} y_{kn} y_{mn} + D_{jm} d_i y_{mn} - D_{jm} p_{in} y_{mn}\Big) - \sum_{m=1}^{M} \sum_{n=1}^{N} \Big(\sum_{k=1}^{M} D_{im} D_{jk} y_{kn} y_{mn} + D_{im} d_j y_{mn} - D_{im} p_{jn} y_{mn}\Big),$$

$$0 = \sum_{n=1}^{N} \sum_{m=1}^{M} (D_{jm} d_i y_{mn} - D_{jm} p_{in} y_{mn} - D_{im} d_j y_{mn} + D_{im} p_{jn} y_{mn}),$$

$$0 = \sum_{n=1}^{N} [d_i (q_{jn} - d_j) - p_{in} (q_{jn} - d_j) - d_j (q_{in} - d_i) + p_{jn} (q_{in} - d_i)],$$
$$0 = \sum_{n=1}^{N} [d_i (q_{jn} - p_{jn}) - d_j (q_{in} - p_{in}) - p_{in} q_{jn} + p_{jn} q_{in}].$$

When $\partial\phi/\partial d_i = \partial\phi/\partial d_j = 0$, the error function $\phi$ is minimized with respect to output biases and $\sum_n (q_{in} - p_{in}) = \sum_n (q_{jn} - p_{jn}) = 0$ according to equation (2.13); the last equation then turns into

$$0 = \sum_{n=1}^{N} (p_{in} q_{jn} - p_{jn} q_{in}). \tag{2.15}$$

The same result can be obtained by eliminating the output biases, equivalent to letting $d_i = d_j = 0$. Moreover, equation (2.13) imposes that the mean of the output series be exactly the same as that of its target series. That is not necessarily a practical restriction on model output. It is concluded that from the point of view of both signal definition and parameter optimization, the output biases are dispensable.

2.4  The compact NLPCA model

Following the discussion presented in section 2.3, a simpler neural network for NLPCA, called a compact NLPCA model, is now introduced. It is a two-hidden-layer feed-forward neural network with no bottleneck and output biases. The simplified two-hidden-layer feed-forward neural network will be referred to as an I-K^b-M-I^b model, where superscript b denotes no biases for corresponding neurons. A variable glossary for the I-K^b-M-I^b model is listed in Table 2.2. The bottleneck signal $u_{kn}$, decoding signal $y_{mn}$, and output signal $q_{in}$ are mathematically expressed by

$$u_{kn} = \tanh\Big(\sum_{i=1}^{I} A_{ki} p_{in}\Big), \tag{2.16}$$

$$y_{mn} = \tanh\Big(\sum_{k=1}^{K} C_{mk} u_{kn} + c_m\Big), \tag{2.17}$$

$$q_{in} = \sum_{m=1}^{M} D_{im} y_{mn}, \tag{2.18}$$

where $i = 1, \ldots, I$; $k = 1, \ldots, K$; $m = 1, \ldots, M$; and $n = 1, \ldots, N$.
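Equations (2.16) to (2.18), together with the least-square evaluation of the output weights discussed in section 2.3.2, can be sketched in a few lines. This is a minimal illustration assuming NumPy and a variables-by-samples array layout; the function names are hypothetical, and the sketch solves only for the output weights with the other parameters held fixed, so it should not be read as the hybrid optimization procedure of Appendix A.

```python
import numpy as np

def forward_compact(p, A, C, c, D):
    """Forward pass of an I-K^b-M-I^b model, equations (2.16)-(2.18).

    Assumed shapes: p (I, N), A (K, I), C (M, K), c (M,), D (I, M).
    """
    u = np.tanh(A @ p)                  # (2.16) bottleneck signals, no bias
    y = np.tanh(C @ u + c[:, None])     # (2.17) decoding signals
    q = D @ y                           # (2.18) output signals, no bias
    return u, y, q

def fit_output_weights(y, p):
    """Output weights D minimizing the mean square error for fixed decoding signals y.

    Solves y.T @ D.T ~= p.T in the least-squares sense.
    """
    D_transposed, *_ = np.linalg.lstsq(y.T, p.T, rcond=None)
    return D_transposed.T

# Illustrative use with random (untrained) bottleneck and decoding parameters.
rng = np.random.default_rng(1)
I, K, M, N = 3, 1, 4, 200
p = rng.standard_normal((I, N))
A = rng.standard_normal((K, I))
C = rng.standard_normal((M, K))
c = rng.standard_normal(M)
_, y, _ = forward_compact(p, A, C, c, np.zeros((I, M)))
D = fit_output_weights(y, p)
u, y, q = forward_compact(p, A, C, c, D)
```

At the least-squares solution the residual `q - p` is orthogonal to every decoding series, which is equation (2.14) set to zero, and the matrix `p @ q.T` is symmetric, which is equation (2.15) obtained without any output bias. Because the bottleneck neuron has no bias, an all-zero input also gives an exactly zero bottleneck signal.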
  $I$         the number of input (and output) neurons
  $K$         the number of bottleneck neurons, $K < I$
  $M$         the number of decoding neurons, $M \ge 2$
  $N$         the number of samples
  $A_{ki}$    weight of bottleneck neuron $k$ to signals from input neuron $i$
  $C_{mk}$    weight of decoding neuron $m$ to signals from bottleneck neuron $k$
  $c_m$       bias of decoding neuron $m$
  $D_{im}$    weight of output neuron $i$ to signals from decoding neuron $m$
  $p_{in}$    signal from input neuron $i$ at sampling point $n$
  $u_{kn}$    signal from bottleneck neuron $k$ at sampling point $n$
  $y_{mn}$    signal from decoding neuron $m$ at sampling point $n$
  $q_{in}$    signal from output neuron $i$ at sampling point $n$

Table 2.2: A variable glossary for the I-K^b-M-I^b model.

2.4.1  One bottleneck neuron (K = 1)

A 3-1^b-M-3^b model is applied to the Lorenz attractor, where $M$ is still varied from 2 to 7 with 7 runs for each $M$. Now the resultant mean square errors fall on 16.6 for the 3-1^b-2-3^b model runs, on 13.7 for the 3-1^b-3-3^b model runs and on 13.0 for the 3-1^b-M-3^b model runs with $M = 4, 5, 6, 7$ (Fig. 2.6b, dots). The 3-1^b-4-3^b solution (Fig. 2.6b, circle) has the fewest model parameters among the models which offer the solutions with the lowest mean square error and is optimal in terms of least square error and fewest model parameters. Its bottleneck series fluctuates smoothly within [-1, 1] (Fig. 2.8a), in a pattern very similar to that of the 3-1-5-3 solution (Fig. 2.7c). The 3-1^b-5-3^b solution and the 3-1^b-6-3^b solution have the same mean square error as the 3-1^b-4-3^b solution (Fig. 2.6b, circles). As expected, the bottleneck series of the 3-1^b-5-3^b solution (Fig. 2.8b) and the 3-1^b-6-3^b solution (Fig. 2.8c) are similar to that of the 3-1^b-4-3^b solution, except for some alteration in the maximum scale of variation. The deviations between bottleneck series are significantly reduced after the bottleneck bias is removed. Now the structure of model outputs from the solutions with different mean square errors will be explored.
The 3-1^b-2-3^b solution with a mean square error at 16.6, the 3-1^b-3-3^b solution with a mean square error at 13.7, and the 3-1^b-4-3^b solution with a mean square error at 13.0 (Fig. 2.6b, asterisks), represent three evaluations of the one nonlinear principal component representation of the Lorenz system characterized by different mean square errors. The two-dimensional phase plot of the output series of the 3-1^b-2-3^b solution is nearly a straight line in the $x_1$-$x_2$ plane and an L-shaped curve with a bending tail in the $x_1$-$x_3$ and $x_2$-$x_3$ planes (Fig. 2.9a–c, densely overlapping circles). That of the 3-1^b-3-3^b solution is a straight line in the $x_1$-$x_2$ plane, but a U-shaped curve in the $x_1$-$x_3$ and $x_2$-$x_3$ planes (Fig. 2.9d–f, densely overlapping circles). That of the 3-1^b-4-3^b solution is an S-shaped curve in the $x_1$-$x_2$ plane, a V-shaped curve in the $x_1$-$x_3$ plane, and a bending U-shaped curve in the $x_2$-$x_3$ plane (Fig. 2.9g–i, densely overlapping circles). For comparison, in previous studies, patterns similar to those of the 3-1^b-3-3^b solution (Fig. 2.9d–f, densely overlapping circles) were obtained using the 3-2-1-2-3 model subject to the weight penalty tuned with a penalty parameter of 1.0 and for a set of 600 samples from the Lorenz chaotic attractor (Hsieh, 2001). Patterns similar to those of the 3-1^b-4-3^b solution (Fig. 2.9g–i, densely overlapping circles), but for another set of 600 samples from the Lorenz chaotic attractor, were obtained using the 3-3-1-3-3 model subject to a validation of the solution following the procedure set out by Monahan (2000). Here, the procedure involved setting aside from training a randomly selected 30% of the data to be used for the validation, with early-stopping triggered either by the validation or at a pre-set 500 iterations of the optimization algorithm, whichever comes first.
The weight penalty or the validation and early-stopping were needed; otherwise, the solutions of NLPCA using the three-hidden-layer feed-forward neural networks would over-fit the data (Monahan, 2000; Hsieh, 2001).

2.4.2  Two bottleneck neurons (K = 2)

The variables of the Lorenz attractor actually form, in a two-dimensional plane, a curve with two loops and multiple intersections. Amplitude and phase variations between variables exist. This type of curve is impossible to simulate through a single principal component, which usually carries only amplitude information. Two nonlinear principal components are needed to carry both amplitude and phase variances. So a 3-2b-M-3b model is applied to the Lorenz attractor, where M is varied from 2 to 7 with 7 runs for each M as before. Now the resultant mean square errors of the model runs reduce to below 2.4, with the lowest ones around 0.65 first attained by the 3-2b-5-3b solution (Fig. 2.6c, circle), the optimal solution in terms of least square error and fewest model parameters. The two bottleneck series of the 3-2b-5-3b solution clearly characterize the two loops of the Lorenz attractor (Fig. 2.10a) and the output series closely simulate the two loops of the Lorenz attractor, including the multiple intersections (Fig. 2.10b–d, densely overlapping circles). Given two bottleneck neurons, the compact NLPCA model is able to simulate a dataset with complicated nonlinear relations between the variables.

2.5  Number of parameters in the compact NLPCA model

An advantage of the compact NLPCA model is that, given the same number of bottleneck neurons, it uses significantly fewer parameters than the three-hidden-layer feed-forward neural network model. For example, the compact NLPCA model with one bottleneck neuron, the I-1b-M-Ib model, has I + (M + M) + M × I parameters.
But the corresponding I-M-1-M-I model (usually the number of encoding neurons is set to be the same as that of decoding neurons) has (I × M + M) + (M + 1) + (M + M) + (M × I + I) parameters, that is (I × M + M) + (M + 1) more than the number of parameters used by the I-1b-M-Ib model. Similarly, the I-2b-M-Ib model uses I × 2 + (2 × M + M) + M × I parameters. But the I-M-2-M-I model uses (I × M + M) + (M × 2 + 2) + (2 × M + M) + (M × I + I) parameters, that is (I × M + M) + (M × 2 + 2) − I more than those used by the I-2b-M-Ib model. The difference in the number of parameters between the two types of neural networks increases with increasing M and I. The numbers of parameters used at I = 3 and M = 2, . . . , 9 by the I-1b-M-Ib, I-2b-M-Ib, I-M-1-M-I, and I-M-2-M-I models are shown in Fig. 2.11.

The advantage is exhibited in the solutions discussed above. For instance, when producing similar patterns representing the Lorenz chaotic attractor, the 3-2-1-2-3 model subject to the weight penalty tuned with a weight penalty parameter of 1.0 (Hsieh, 2001) uses 24 neural network parameters, whereas the 3-1b-3-3b model involves only 18 parameters. The 3-3-1-3-3 model subject to the validation and early-stopping (Monahan, 2000) uses 34 parameters, whereas the 3-1b-4-3b model involves only 23 parameters. Among the 3-M-2-M-3 model runs subject to the validation and early-stopping, the 3-6-2-6-3 solution appears to be optimal in closely simulating the two loops of the Lorenz chaotic attractor (Monahan, 2000). The simulation is similar to that of the 3-2b-5-3b solution (Fig. 2.10b–d, densely overlapping circles). But the 3-6-2-6-3 model uses 77 parameters, whereas the 3-2b-5-3b model uses only 36, that is 41 fewer than the 3-6-2-6-3 model.

2.6  Conclusions

Principal component analysis has been used extensively in the atmospheric and oceanographic sciences to identify the structures characterizing the variability of physical fields.
Recently, a nonlinear generalization of this multivariate statistical analysis technique, called NLPCA, using an auto-associative three-hidden-layer feed-forward neural network (Kramer, 1991) has been applied in atmospheric and oceanographic sciences (Hsieh, 2004). In this chapter, I presented a new, compact neural network model to conduct NLPCA on large datasets. The compact model has a simpler architecture than the standard three-hidden-layer feed-forward neural network used originally in the development of NLPCA.

Due to the particular layer structure of the original three-hidden-layer feed-forward neural network used in NLPCA, the parameters of the model cannot be uniquely determined. As the number of encoding neurons is increased, a bottleneck signal can reach any possible value by adding more encoding signals. Consequently, the model output will over-fit the data and the model will have difficulty arriving at a unique solution.

These problems are resolved by using a simplified two-hidden-layer feed-forward neural network, a compact NLPCA model. This neural network has no encoding layer and no bottleneck and output biases. Eliminating the bottleneck bias helps to reduce the variance of bottleneck series among different model runs. The output biases can be neglected according to both signal definition and parameter optimization. According to Cybenko (1988), two hidden layers of neurons are sufficient for a feed-forward neural network to approximate any continuous function. The simplified two-hidden-layer feed-forward neural network for NLPCA, developed in this chapter, removes most of the subjectivity inherent in the three-hidden-layer feed-forward neural network. In addition, given the same number of bottleneck neurons, this quasi-objective compact NLPCA model uses significantly fewer parameters than the three-hidden-layer feed-forward neural network model.
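The parameter counts quoted in Section 2.5 can be checked with a short Python sketch. The function names are hypothetical, and the expressions for a general number K of bottleneck neurons are an extrapolation of the K = 1 and K = 2 formulas given in that section, not formulas stated in the text.

```python
def compact_params(I, K, M):
    """Parameter count of the I-Kb-M-Ib model (no bottleneck or output biases)."""
    bottleneck = I * K          # weights A_ki
    decoding = K * M + M        # weights C_mk plus biases c_m
    output = M * I              # weights D_im
    return bottleneck + decoding + output

def standard_params(I, K, M):
    """Parameter count of the I-M-K-M-I model with M encoding and M decoding neurons."""
    encoding = I * M + M        # weights plus biases
    bottleneck = M * K + K
    decoding = K * M + M
    output = M * I + I
    return encoding + bottleneck + decoding + output

# Model pairs compared in Section 2.5 for the Lorenz attractor (I = 3).
for name, count in [
    ("3-1b-3-3b", compact_params(3, 1, 3)),
    ("3-2-1-2-3", standard_params(3, 1, 2)),
    ("3-1b-4-3b", compact_params(3, 1, 4)),
    ("3-3-1-3-3", standard_params(3, 1, 3)),
    ("3-2b-5-3b", compact_params(3, 2, 5)),
    ("3-6-2-6-3", standard_params(3, 2, 6)),
]:
    print(name, count)
```

Evaluated this way, the counts agree with those quoted in Section 2.5 for each of the six models.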
[Figure 2.1 diagram. Layer equations beside the diagram: input: p_in; encoding: x_hn = tanh(Σ_i A_hi p_in + a_h); bottleneck: u_n = Σ_h B_h x_hn + b; decoding: y_mn = tanh(C_m u_n + c_m); output: q_in = Σ_m D_im y_mn + d_i.]

Figure 2.1: A diagram of a standard three-hidden-layer feed-forward neural network with one bottleneck neuron for NLPCA. From top to bottom, the sequence of rows represent input, encoding, bottleneck, decoding, and output neurons (circles). The meanings of the symbols to the right of the diagram are shown in Table 2.1. In NLPCA, input and target data are the same, so output neurons are in pairs with input neurons. The numbers of encoding and decoding neurons are adjustable for an optimal fit of model output to target.

[Figure 2.2 plot. Axes: Mean Square Error versus M.]

Figure 2.2: The resultant mean square errors of runs 1 to 7 (dots from left to right for each M) of the 3-M-1-M-3 model applied to the Lorenz attractor, where M is varied from 2 to 7. The solutions selected for display in Figures 2.3 and 2.4 are indicated by a circle around the dot.

[Figure 2.3 plots. Panels: (a) the 3-2-1-2-3 solution, (b) the 3-4-1-4-3 solution, (c) the 3-7-1-7-3 solution. Axes: u versus Sampling Point n.]

Figure 2.3: The bottleneck series of the (a) 3-2-1-2-3 solution, (b) 3-4-1-4-3 solution, and (c) 3-7-1-7-3 solution of the Lorenz attractor.

[Figure 2.4 plots. Panels (a)–(i); columns show the (x1, x2), (x1, x3) and (x2, x3) planes.]

Figure 2.4: The two-dimensional phase plots of the output series (densely overlapping circles) of the (a–c) 3-2-1-2-3 solution, (d–f) 3-4-1-4-3 solution, and (g–i) 3-7-1-7-3 solution of the Lorenz attractor. The two-dimensional phase plots of the corresponding series of the Lorenz attractor are shown in dots as background.

[Figure 2.5 diagram. Layer equations beside the diagram: input: p_in; bottleneck: u_n = tanh(Σ_i A_i p_in + a); decoding: y_mn = tanh(C_m u_n + c_m); output: q_in = Σ_m D_im y_mn + d_i.]

Figure 2.5: A diagram of a two-hidden-layer feed-forward neural network with one bottleneck neuron for NLPCA. From top to bottom, the sequence of rows represent input, bottleneck, decoding and output neurons (circles). Among the symbols to the right of the diagram, A_i denotes the weight of the bottleneck neuron to signals from input neuron i and a the bias of the bottleneck neuron. The other symbols have the same meanings as those of the three-hidden-layer feed-forward neural network shown in Table 2.1. In NLPCA, input and target data are the same, so output neurons are in pairs with input neurons. The number of decoding neurons is adjustable for an optimal fit of model output to target.

[Figure 2.6 plots. Panels: (a) the 3-1-M-3 model, (b) the 3-1b-M-3b model, (c) the 3-2b-M-3b model. Axes: Mean Square Error versus M.]

Figure 2.6: The resultant mean square errors of runs 1 to 7 (dots from left to right for each M) of the (a) 3-1-M-3 model, (b) 3-1b-M-3b model, and (c) 3-2b-M-3b model applied to the Lorenz attractor, where M is varied from 2 to 7. The solutions selected for display in Figures 2.7 to 2.10 are indicated by a circle around and/or an asterisk over the dot.

[Figure 2.7 plots. Panels: (a) the 3-1-3-3 solution, (b) the 3-1-4-3 solution, (c) the 3-1-5-3 solution. Axes: u (−u in panel c) versus Sampling Point n.]

Figure 2.7: The bottleneck series of the (a) 3-1-3-3 solution, (b) 3-1-4-3 solution, and (c) 3-1-5-3 solution of the Lorenz attractor. The sign of that of the 3-1-5-3 solution is changed for convenient comparison.

[Figure 2.8 plots. Panels: (a) the 3-1b-4-3b solution, (b) the 3-1b-5-3b solution, (c) the 3-1b-6-3b solution. Axes: u (−u in panel c) versus Sampling Point n.]

Figure 2.8: The bottleneck series of the (a) 3-1b-4-3b solution, (b) 3-1b-5-3b solution, and (c) 3-1b-6-3b solution of the Lorenz attractor. The sign of that of the 3-1b-6-3b solution is changed for convenient comparison.

[Figure 2.9 plots. Panels (a)–(i); columns show the (x1, x2), (x1, x3) and (x2, x3) planes.]

Figure 2.9: The two-dimensional phase plots of the output series (densely overlapping circles) of the (a–c) 3-1b-2-3b solution, (d–f) 3-1b-3-3b solution, and (g–i) 3-1b-4-3b solution of the Lorenz attractor. The two-dimensional phase plots of the corresponding series of the Lorenz attractor are shown in dots as background.

[Figure 2.10 plots. Panels: (a) (u1, u2), (b) (x1, x2), (c) (x1, x3), (d) (x2, x3).]

Figure 2.10: The two-dimensional phase plots of (a) the two bottleneck series and (b–d) the output series (densely overlapping circles) of the 3-2b-5-3b solution of the Lorenz attractor. The two-dimensional phase plots of the corresponding series of the Lorenz attractor are shown in dots as background.

[Figure 2.11 plot. Curves: the 3-M-2-M-3, 3-M-1-M-3, 3-2b-M-3b and 3-1b-M-3b models. Axes: Number of Parameters versus M.]

Figure 2.11: The numbers of parameters used by the 3-M-2-M-3, 3-M-1-M-3, 3-2b-M-3b, and 3-1b-M-3b models, where M is varied from 2 to 9.

Chapter 3

Structure and Variability of the Quasi-Biennial Oscillation

3.1  Introduction

The original model of the standard three-hidden-layer feed-forward neural network for NonLinear Principal Component Analysis (NLPCA) has been refined in Chapter 2. The mathematical analysis of the neural network, accompanied with examples based on the Lorenz attractor (Lorenz, 1963), resulted in a compact NLPCA model.
In this chapter, this compact NLPCA model, a simplified two-hidden-layer feed-forward neural network without bottleneck and output biases, will be applied to the study of a natural physical phenomenon, the Quasi-Biennial Oscillation (QBO). The QBO is a perfect test-bed for the model developed in the preceding chapter. The oscillation is present in a long time series (45 years) of wind data. It has been extracted from this and similar datasets by many researchers using linear and nonlinear statistical analysis techniques. Hence, results are available from many studies for comparison. Firstly, the compact NLPCA model is applied to the 45-year wind time series to extract the QBO. Secondly, various aspects of the modeled QBO are investigated and compared to previous studies to determine how well the compact model succeeds in revealing the physics of the QBO.

The QBO is a well-known, predominant oscillation of the zonal winds between easterlies and westerlies in the equatorial stratosphere (Marquardt and Naujokat, 1997; Baldwin et al., 2001). The amplitude of the oscillation increases with height from 70 to 40 hPa, but remains nearly constant between 40 and 10 hPa. The phase of the oscillation consistently propagates downward through the equatorial stratosphere, with the westerly shear zone, where westerly wind increases with height, descending more regularly and rapidly than the easterly shear zone. At each vertical level, transitions between the easterly and westerly regimes are about ten times shorter than the durations of the regimes. The height-time structure of the zonal winds is rather complicated.
In order to construct a low-dimensional statistical fit of the zonal winds at multiple vertical levels to describe the key features of the QBO, different techniques have been used (Wallace et al., 1993; Fraedrich et al., 1993; Wang et al., 1995; Hamilton and Hsieh, 2002; Hsieh and Hamilton, 2003). Having these previous studies available for comparison, the QBO is an optimum test-bed to investigate the behavior of the simplified two-hidden-layer feed-forward neural network and to show the advantages of using the compact NLPCA model to describe the behavior of a complex physical phenomenon.

This chapter is organized as follows. Section 3.2 briefly describes the dataset of monthly mean observations of the equatorial stratospheric zonal winds used in this study. Section 3.3 selects, among various model runs, an optimal NLPCA two-component representation of the dataset. Section 3.4 analyzes the QBO phase defined by the two nonlinear principal components of the optimal representation. Long-period variations in the phase progression of the QBO cycle are also investigated. Section 3.5 discusses the phase variation in variability of the zonal winds. Section 3.6 investigates the seasonal synchronization of the QBO phases. Section 3.7 produces an NLPCA one-component representation of the dataset and a corresponding composite of the zonal winds. Section 3.8 gives conclusions.

3.2  Data

The data are the well-known monthly mean series of the equatorial stratospheric zonal winds, calculated from the twice-per-day measurements by balloons above Canton Island (2.8°N, 171.7°W) from January 1956 to August 1967, Gan, Maldives (0.7°S, 73.1°E) from September 1967 to December 1975, and Singapore (1.4°N, 103.9°E) from January 1976 to December 2000, and distributed by the Free University of Berlin (Naujokat, 1986; Marquardt and Naujokat, 1997).
The operational balloon soundings are usually capped at 10 hPa, around 30 km in altitude, although the QBO appears at least up to 40 km (Hamilton, 1981). Values at 70, 50, 40, 30, 20, 15, and 10 hPa (from about 20 to 30 km in altitude), with their 45-year means removed but weak seasonal cycles retained, are used in this study and hereafter called the QBO winds for convenience. The height-time section of the QBO winds is shown in Fig. 3.1 for future reference.

3.3  Optimal simulation of the QBO winds

The standard deviations of the QBO winds at the seven pressure levels vary from 6.6 to 19.9 m s⁻¹. Proper scaling of the QBO winds helps to attain an optimum performance of the NLPCA. So the QBO winds at all the pressure levels are normalized by dividing them by their overall standard deviation (i.e., the standard deviation of the total data) and then a 7-2b-M-7b model is applied, where M is varied from 4 to 20 with 40 runs for each M. Two bottleneck neurons are assigned to the model to extract two nonlinear principal components from the seven time series of the QBO winds. The two-component approximation allows the variability in both phase progress and amplitude value of the QBO winds from cycle to cycle to be effectively represented. The model output, called the simulation, is multiplied by the overall standard deviation, and its root mean square error relative to the original data is calculated and shown in Fig. 3.2a. The lowest root mean square error of the 40 runs for each 7-2b-M-7b model is 3.37, 3.19, 3.07, 3.00, 2.94, 2.89, 2.87, 2.82, 2.82, 2.75, 2.74, 2.70, 2.67, 2.67, 2.65, 2.61, 2.63 m s⁻¹ for M = 4, . . . , 20, respectively. It reduces relatively quickly and steadily from 3.37 to 2.67 m s⁻¹ as M is increased from 4 to 16 and then remains between 2.67 and 2.61 m s⁻¹ when 16≤M≤20 (Fig. 3.2b).
Hence, the 7-2b-16-7b model is optimal in terms of least square error and fewest model parameters, since it has the fewest parameters among the model simulations with the lowest root mean square error. The time series of the 7-2b-16-7b simulation (Fig. 3.3, line) are highly correlated with those of the QBO winds (Fig. 3.3, dots), with a Pearson correlation coefficient 0.974≤ρ≤0.994 for pressure levels between 10 and 50 hPa and ρ = 0.874 for 70 hPa, where the original data are rather noisy. The standard deviation of the error series of the 7-2b-16-7b simulation relative to the QBO winds varies between 1.93 and 3.41 m s⁻¹ across the seven pressure levels (Fig. 3.4). Compared with the normal distribution with expectation zero and variance equal to that of the error series (Fig. 3.4, curve), the frequency distribution of the errors has a higher kurtosis and little skewness (Fig. 3.4, vertical lines). The 7-2b-16-7b simulation is a very satisfactory fit to the QBO winds and the two 7-2b-16-7b nonlinear principal components are an effective two-dimensional representation of those winds.

In contrast, the reconstruction from the leading two modes of the linear principal component analysis of the QBO winds yields a Pearson correlation coefficient 0.934≤ρ≤0.984 for pressure levels between 10 and 50 hPa and ρ = 0.822 for 70 hPa (Hamilton and Hsieh, 2002). The standard deviation of the error series of the reconstruction relative to the QBO winds varies between 3.54 and 5.76 m s⁻¹ across the seven pressure levels (Fig. 3.5). Compared with the normal distribution with expectation zero and variance equal to that of the error series of the reconstruction (Fig. 3.5, curve), the frequency distribution of the reconstruction errors has a lower kurtosis and/or a noticeable skewness (Fig. 3.5, vertical lines). The nonlinear two-component approximation of the QBO winds is significantly better than the linear
two-mode reconstruction.

3.4  Variability of the QBO phase

The two 7-2b-16-7b nonlinear principal components, u and v (Fig. 3.6a,b), closely resemble the oscillations of the QBO winds. In particular, the variance of the amplitude of the oscillations is relatively weak, but the period of the cycles exhibited in the periodic phase Ψ = arctan(v/u) (−π≤Ψ≤π) (Fig. 3.6c) changes significantly from cycle to cycle. The change in length of the QBO period is believed to be due mostly to the change in length of the easterly regime, as the length of the westerly regime appears rather constant (Quiroz, 1981; Dunkerton and Delisi, 1985; Naujokat, 1986; Maruyama and Tsuneoka, 1988).

3.4.1  Essential phase speed of the QBO winds

Theoretically, the oscillation period of the zonal winds is inversely proportional to the momentum flux contributed by the upward-propagating tropical waves that drive the oscillation (Plumb, 1977). However, it is difficult to measure the momentum fluxes of these waves. Instead, a QBO period has usually been obtained by averaging the lengths of individual oscillation cycles. The early investigations, based on the data then available, often delivered a QBO period of 26 or 27 months. Recent results based on time series spanning about four decades come close to 28 months (Baldwin et al., 2001).

In contrast to previous calculations of the QBO period from the wind observations at one vertical level, the present analysis is based on a set of monthly mean series spanning 45 years and from seven pressure levels extending about 10 km in height. The periodic phase Ψ (Fig. 3.6c) shows a strong tendency for the phase to increase linearly with time through each cycle. So an accumulating phase Φ, defined as the progression of the phase Ψ with each new cycle starting at (2n_cycle − 3)π with n_cycle varying from 1 to 20, is calculated.
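The construction of the periodic phase Ψ and the accumulating phase Φ can be illustrated with a minimal Python sketch. It assumes that arctan(v/u) with −π≤Ψ≤π means the four-quadrant arctangent, and it uses a synthetic, perfectly regular rotation with an assumed 28.4-month period in place of the actual nonlinear principal components; the function names are hypothetical. A least-squares line through Φ then gives a mean phase speed, from which a period follows.

```python
import math

def accumulating_phase(u, v):
    """Return (psi, phi): the periodic phase in [-pi, pi] and its unwrapped version."""
    psi = [math.atan2(vi, ui) for ui, vi in zip(u, v)]
    phi = [psi[0]]
    for prev, cur in zip(psi, psi[1:]):
        step = cur - prev
        # Remove the 2*pi jump that occurs when psi crosses the +/-pi branch cut.
        step -= 2 * math.pi * round(step / (2 * math.pi))
        phi.append(phi[-1] + step)
    return psi, phi

def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

# Synthetic stand-in for (u, v): 45 years of monthly samples, assumed period 28.4 months.
months = list(range(1, 541))
u = [math.cos(2 * math.pi * n / 28.4) for n in months]
v = [math.sin(2 * math.pi * n / 28.4) for n in months]

psi, phi = accumulating_phase(u, v)
slope, intercept = fit_line(months, phi)
period_months = 2 * math.pi / slope
```

For real data the unwrapping step only works if the phase advances by less than half a cycle between consecutive monthly samples, which is comfortably the case for an oscillation lasting more than two years.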
The first cycle of Φ starts from −π because the phase Ψ advances from −π to π in each cycle. The accumulating phase Φ progresses predominantly at a constant rate (Fig. 3.7a). The phase Φ is characterized by the linear regression Φ_L = 2π n_month/28.4 + 1.3π, where n_month is the number of months from the initial sampling point and starts from 1. The phase of the QBO winds advances predominantly at a speed of 2π/28.4-month or 0.42 cycle per year (cpy).

The importance of this period is confirmed by examining the spectra of Fourier amplitude coefficients of the QBO winds (Fig. 3.8, thin lines) and the 7-2b-16-7b simulation (Fig. 3.8, thick lines). Both spectra show an extreme peak at 0.422 cpy, or equivalently a 28.4-month period, under the current frequency resolution 1/45 = 0.02̇ cpy. Both the phase analysis and the Fourier transformation analysis indicate that 2π/28.4-month is an essential phase speed of the QBO winds. In other words, 28.4-month stands out as a fundamental oscillation period of the QBO winds.

It is notable that the phase analysis can produce an accurate phase speed that matches the periodicity of the QBO winds obtained by their Fourier analysis. Furthermore, one can get this phase speed even based on datasets that have different lengths (e.g., N = 480 or N = 510). Now, because the Fourier transformation of the QBO winds with 45 years of monthly samples has a frequency resolution of 0.02̇ cpy, the frequency of 19 × 1/45 = 0.42̇ cpy is equivalent to 28.4 months and the frequency of 20 × 1/45 = 0.4̇ cpy is equivalent to 27 months. The difference between the two equivalent periods is 1.4 months. The Fourier transformation cannot differentiate between 28-month and 28.4-month periods.

3.4.2  Frequency modulation of the QBO phase

Recently, whether the QBO is modulated by the 11-year sunspot cycle was discussed (Salby and Callaghan, 2000; Hamilton, 2002).
Solar radiation being the primary energy source of the Earth, variations in the solar cycle have been naturally considered to affect the Earth's climate, including the QBO (Labitzke, 2004, 2005; Baldwin and Dunkerton, 2005). Considering the 7-2b-16-7b solution of the QBO winds, after the linear component of the accumulating phase is subtracted, the phase residue φ = Φ − Φ_L fluctuates between −0.6π and 0.5π (Fig. 3.7b, line with dots). In its spectrum of Fourier amplitude coefficients, the phase residue has two adjoining, prominent peaks at 0.022 (= 45-year period) and 0.044 (= 22.5-year period) cpy, and a dominant peak at 0.089 (= 11.25-year period) cpy (Fig. 3.7c).

Since we have 45 years of monthly data, the spectral peaks at 0.022 and 0.044 cpy represent modulations with periods close to the length of the data. Therefore, the strength of these oscillations might not have been sampled properly. Furthermore, the signal of presumably quasi-three-decade period is not well documented, either. In an ordinary meteorological record spanning a few decades, this signal is hard to separate from strong short-term signals and is easily mixed into a nontrivial long-term trend. Therefore, a discussion of these possible modulations cannot be done in a meaningful way.

At the current frequency resolution 1/45 = 0.02̇ cpy, the frequencies 3/45 = 0.06̇, 4/45 = 0.08̇ and 5/45 = 0.1̇ cpy are equivalent to 15, 11.25 and 9 years, respectively. But the particularly high value of the amplitude coefficient peak at 0.089 cpy indicates that the significant signal in the period interval between 9 and 15 years has a period very close to 11 years. The following discussion will focus on the dominant peak at 0.089 cpy that corresponds to a significant modulation of the QBO phase at a period of 11 years.
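The correspondence between Fourier harmonics of a 45-year record and the frequencies and periods quoted in Section 3.4 can be tabulated with a few lines of Python. This is just an arithmetic sketch; the function name and the choice of harmonics to list are illustrative.

```python
def harmonic(k, record_years=45):
    """Frequency (cycles per year) and period (years) of the k-th Fourier harmonic."""
    return k / record_years, record_years / k

# Harmonics discussed in the text: decadal modulations (k = 1..5) and the QBO band (k = 19, 20).
for k in (1, 2, 3, 4, 5, 19, 20):
    freq_cpy, period_years = harmonic(k)
    print(f"k = {k:2d}: {freq_cpy:.3f} cpy, "
          f"{period_years:.2f} years = {12 * period_years:.1f} months")
```

Neighbouring harmonics are 1/45 cpy apart, so near the QBO frequency the resolvable periods are spaced by more than a month, while in the decadal band the fourth harmonic (11.25 years) is the only one between 9 and 15 years.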
Salby and Callaghan (2000) showed that, in a band-pass (centered at 0.41, 0.50 and 0.59 cpy) filtered monthly mean series of the equatorial zonal wind at 45 hPa, westerly regimes appear shortened near the maxima of solar flux at 10.7-cm wavelength while lengthened near the minima. However, a raw monthly mean series at 50 hPa indicates that this kind of relation is not strong during the solar maxima near 1981, although some short westerly regimes occurred near times of solar maxima, particularly near 1970 and 1990 (Hamilton, 2002). Similar to the result of Hamilton (2002), the time series of the harmonic component relevant to the 11-year cycle (Fig. 3.7c, asterisk) completely misses the data points of the phase residue around 1980, although it passes through those between 1960 and 1970 (Fig. 3.7b, line). The Pearson correlation coefficient between the phase residue and the time series of the harmonic is only 0.50. The QBO phase is also prominently modulated by a cycle of longer period, indicated by the adjoining prominent peaks at 45- and 22.5-year periods. The 11-year cycle indeed is involved in modulating the QBO, but by itself is not enough to describe the overall modulation. As far as the northern extra-tropics are concerned, whether the 11-year sunspot cycle is present as part of a QBO signal is arguable. An 11-year cycle was detected in polar temperature at 30 hPa (Salby et al., 1997) and isentropic potential vorticity between 20 and 35 hPa (Baldwin and Dunkerton, 1998), together with a QBO and a biennial oscillation. Baldwin and Dunkerton (1998) considered the 11-year cycle a possible result of the interaction between the QBO and the biennial oscillations. An interaction between two oscillations at frequencies f and g can produce another two oscillations at frequencies f − g and f + g. 
The interaction can be expressed mathematically as

\[
\begin{aligned}
[a + \cos(2\pi f t)]\,[b + \cos(2\pi g t)]
&= ab + a\cos(2\pi g t) + b\cos(2\pi f t) + \cos(2\pi f t)\cos(2\pi g t) \\
&= ab + a\cos(2\pi g t) + b\cos(2\pi f t) + \tfrac{1}{2}\cos[2\pi(f - g)t] + \tfrac{1}{2}\cos[2\pi(f + g)t],
\end{aligned}
\]

where a and b are constant off-set parameters, f and g represent frequencies and t is time. So the interaction between a biennial oscillation at f = 0.50 cpy and a QBO at approximately g = 0.41 cpy could produce an oscillation at f − g = 0.09 cpy or 11-year period. The signal of the biennial oscillation will disappear in the resultant series when b is negligible. The phase residue obtained from the 7-2b-16-7b solution of the QBO winds might be such a case. Moreover, the three moderate peaks in the spectrum of Fourier amplitude coefficients of the phase residue coincide with 1.000 − 0.422 = 0.578 cpy (Fig. 3.7c). The oscillation at 0.578 cpy is probably produced by the interaction between an annual oscillation and the QBO. In the northern extra-tropics, the interaction between an annual oscillation and a QBO was considered to be the source of the two coexisting oscillations at about 1.00 − 0.42 = 0.58 and 1.00 + 0.42 = 1.42 cpy. The examples include angular momentum and Eliassen-Palm flux (Baldwin and Tung, 1994), ozone (Tung and Yang, 1994a,b), north polar temperature at 30 hPa (Salby et al., 1997) and isentropic potential vorticity between 20 and 35 hPa (Baldwin and Dunkerton, 1998). All these studies favor the possibility that the 11-year cycle in the northern extra-tropics arises from an internal interaction in the atmosphere. However, in the tropical regions, radiative ozone heating can introduce an 11-year sunspot cycle to stratospheric dynamics.
Nonetheless, the two possible origins of the 11-year cycle, one from the variation of solar irradiance and the other through the interaction of relevant oscillations in the atmosphere, do not exclude each other and the possibility exists that both mechanisms contribute to the modulation of the QBO. There are amplitude coefficient peaks at frequencies 0.289 and 0.822 cpy, too (Fig. 3.7c). But the two peak amplitude coefficients are less than one-third of that at 0.089 cpy and their origins are not clear. A more definitive answer would require more analysis. As a hypothesis, the amplitude coefficient peak at 0.289 = 0.200 + 0.089 cpy could be regarded as a result of an interaction between the oscillations at 0.200 and 0.089 cpy. The other, corresponding frequency 0.200 − 0.089 = 0.111 cpy adjoins 0.089 cpy to the right. Although it does not appear to be a peak, the amplitude coefficient at 0.111 cpy is about two-thirds of that at 0.089 cpy. In contrast, the amplitude coefficient at the frequency adjoining 0.089 cpy to the left is less than half of that at 0.089 cpy. The difference in amplitude coefficient between these two frequencies next to 0.089 cpy implies that the signal at frequency 0.111 cpy is significant. Similarly, the amplitude coefficient peak close to the frequency 0.800 cpy could be ascribed to an interaction between the oscillations at 1.000 and 0.200 cpy. The other, corresponding frequency 1.200 cpy is not shown because the amplitude coefficient at 1.200 cpy is even lower than that at 0.822 cpy.  3.4.3  Statistical significance of the QBO phase features  The essential linear increase with time of the accumulating phase Φ and the dominant decadal fluctuations of the phase residue φ are robust features in the two nonlinear principal components. These features appear, and are almost the same, among the solutions of the 7-2b -M-7b model runs with close results of the root mean square error. 
For instance, there are 200 runs in total for the 7-2b-16,17,18,19,20-7b models and their resultant root mean square errors all fall between 2.6 and 2.8 m s⁻¹ (Fig. 3.2a). The accumulating phases defined by the two nonlinear principal components of each of the 200 solutions closely resemble each other (not shown). They reach a Pearson correlation coefficient ρ>0.999, with a mean of 2π/28.4-month and a standard deviation of 2π/0.008-month for the slope coefficients of their linear regressions. The Fourier amplitude coefficients of the corresponding phase residues at 0.022, 0.044 and 0.089 cpy have a mean of 0.13π, 0.17π and 0.18π, with a standard deviation of 0.002π, 0.003π and 0.002π, respectively. Compared to the values of the means, the corresponding standard deviations are negligible. From the point of view of reproducibility, the predominant linear increase at a speed of 2π/28.4-month and the dominant decadal modulations at 11-, 22.5-, and 45-year cycles of the phase residue are very robust features in the variation of the QBO winds. The Fourier amplitude coefficients of the phase residues at frequencies between 0.13 and 6 cpy have means <0.06π and standard deviations <0.014π. The Fourier amplitude coefficients at the higher frequencies are significantly smaller than those at 0.022, 0.044 and 0.089 cpy, as seen in the spectrum of Fourier amplitude coefficients of the phase residue from the 7-2b-16-7b solution (Fig. 3.7c).

3.5  Phase variation in variability of the QBO winds

To describe variations of the QBO winds, a reference state needs to be constructed. This reference state will be a general structure without perturbation, in which the amplitude of oscillations of the zonal wind at each pressure level is constant at each phase. This general structure should be extracted starting from the two 7-2b-16-7b nonlinear principal components representing the QBO winds.
This is realized by using a 2-2b-Mψ-2b model to generalize the two 7-2b-16-7b nonlinear principal components (i.e., the two 7-2b-16-7b nonlinear principal components are the input/target series of the model), where M is varied from 2 to 9 with 10 runs for each M. The superscript ψ denotes that the input series of the decoding neurons are constructed from the sine and cosine of the angles formed by the arc tangent of the two incoming bottleneck series, instead of the bottleneck series themselves. Hence, in this model, the input series of the decoding neurons have a constant amplitude. Consequently, the corresponding output signals are constant at each phase of the angles. This is shown as follows.

The lowest root mean square error of the 10 runs for each 2-2b-Mψ-2b model is 0.0694, 0.0622, 0.0603, 0.0586, 0.0576, 0.0570, 0.0566 and 0.0565 for M = 2, 3, ..., 9, respectively. The error falls relatively quickly as M is increased from 2 to 3 and only slowly with further increases of M. The change in the rate of decrease of the lowest error at M = 3 and the relatively small number of parameters in the model with M = 3 are sufficient grounds for choosing M = 3. Hence, the 2-2b-3ψ-2b simulation is taken as optimal, balancing least square error against fewest model parameters. Presented in a scatter plot, the generalized nonlinear principal components (Fig. 3.9a, densely overlapping crosses) pass through the 7-2b-16-7b nonlinear principal components (Fig. 3.9a, dots) as a smooth and closed curve. Taking the generalized nonlinear principal components as input series to the decoding neurons of the optimal 7-2b-16-7b model, a general height-phase structure of the QBO winds is obtained from the corresponding output series of the model (Fig. 3.9b).
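The choice of M described above can be retraced with a few lines of arithmetic. The following is a minimal sketch in Python, not the procedure used in this dissertation: it takes the lowest root mean square errors quoted above and lists the reduction gained by each additional decoding neuron. The names `rmse`, `reductions` and `largest_step` are illustrative assumptions.

```python
# Lowest root mean square error of the 10 runs for M = 2, ..., 9 (values quoted in the text).
rmse = {2: 0.0694, 3: 0.0622, 4: 0.0603, 5: 0.0586,
        6: 0.0576, 7: 0.0570, 8: 0.0566, 9: 0.0565}

# Reduction in error gained by going from M - 1 to M decoding neurons.
reductions = {m: round(rmse[m - 1] - rmse[m], 4)
              for m in sorted(rmse) if m - 1 in rmse}

for m, drop in reductions.items():
    print(f"M = {m - 1} -> {m}: error reduced by {drop:.4f}")

# The value of M reached by the largest single reduction (illustrative criterion only).
largest_step = max(reductions, key=reductions.get)
print("largest reduction is reached at M =", largest_step)
```

Under these numbers the step from M = 2 to M = 3 gives by far the largest reduction, which is the change in the rate of decrease that the text refers to. The sketch does not account for the number of model parameters, which the text also weighs.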
Relative to the generalized nonlinear principal components, the two 7-2b-16-7b nonlinear principal components are more variable in one half of their phase cycle, from 0.25π counterclockwise to -0.75π (equivalently, from 0.25π to 1.25π), than in the other half, from -0.75π to 0.25π. In the general height-phase structure, the phase range from 0.25π to 1.25π corresponds to a period when westerlies prevail through the pressure levels and easterlies are absent or weak except for their buildup at the upper levels around π. In contrast, the phase range from -0.75π to 0.25π corresponds to a period when easterlies dominate through the pressure levels and westerlies are absent or weak except for their buildup at the upper levels around 0. The variability in the amplitudes of the nonlinear principal components thus differs markedly between phases of the QBO winds: it is greater during the westerly than during the easterly phase.

The variance also changes with season. After stratification by season, the points in the scatter plot of the two 7-2b-16-7b nonlinear principal components diverge frequently from the closed curve of the generalized nonlinear principal components in the phase range from 0.25π counterclockwise to -0.75π in February, March and April (Fig. 3.10a) and in November, December and January (Fig. 3.10d). This scattering implies that the nonlinear principal components, and hence the QBO winds, are rather variable in this phase range during the boreal winter. In contrast, the points in May, June and July (Fig. 3.10b) and in August, September and October (Fig. 3.10c) tend to lie close to the closed curve of the generalized nonlinear principal components throughout the phase cycle. This indicates that the nonlinear principal components, and hence the QBO winds, are quite stable during the boreal summer, whether the zonal winds are blowing toward the west or the east.
These changes in wind variability with phase and season are consistent with the annual cycle and the asymmetry between the northern and southern hemispheres in the interaction between the equatorial stratospheric zonal winds and the linear planetary Rossby waves. Consider the general height-phase structure of the QBO winds (Fig. 3.9b). When an easterly or westerly starts to reach its full strength at 10 hPa, a wind of the opposite direction is strong at 40 hPa. Hence, composites of a field variable based on the equatorial zonal wind at 10 hPa will be nearly opposite to those based on the equatorial zonal wind at 40 hPa. The westerly regime from 70 to 40 hPa coincides with QBO phases between 0.25π and 1.25π, when westerlies prevail in the equatorial stratosphere and easterlies are absent or weak most of the time. From linear wave theory, the presence of equatorial stratospheric westerlies allows a considerable amount of planetary Rossby wave energy to penetrate into the tropics, leading to reduced wave activity in the extra-tropics, intensified perturbations of the equatorial stratospheric winds and, hence, greater variability there. In contrast, the easterly regime from 70 to 40 hPa corresponds to QBO phases between -0.75π and 0.25π, when easterlies dominate in the equatorial stratosphere and westerlies are absent or weak most of the time. This blocks a considerable amount of planetary Rossby wave energy from entering the tropics, leading to enhanced wave drag on the polar westerly vortex and leaving the equatorial stratospheric winds relatively undisturbed. This picture, describing the variability of the QBO winds according to seasons, changes and ultimately reverses as one considers vertical levels successively higher than 40 hPa.
Not surprisingly, if the phase of the QBO is determined according to the equatorial stratospheric zonal wind at a single level, it is crucial to select a proper level when investigating the wave guiding modulation of the planetary Rossby waves by the QBO. Often, a level between 30 and 50 hPa has been chosen (Baldwin et al., 2001). A case in point is the study of the sudden stratospheric warming from the point-of-view of the Holton and Tan hypothesis (Holton and Tan, 1980). As described in the previous paragraph, the QBO modulation of the waveguide affects the extra-tropics in a way opposite to the tropics. Composite geopotential heights and temperatures in the winter stratosphere over the north pole are significantly lower during the westerly regime of the equatorial zonal wind at 50 hPa than during the easterly regime when the planetary Rossby waves in the extra-tropics are strengthened (Holton and Tan, 1980, 1982). Also, the Eliassen-Palm fluxes (Eliassen and Palm, 1961; Edmon et al., 1980) of the planetary Rossby waves were more/less convergent in the mid-latitude upper stratosphere during the easterly/westerly regime of the equatorial zonal wind at 40 hPa (Dunkerton and Baldwin, 1991). It follows that during the QBO easterly regime, part of the planetary Rossby wave energy is reflected back toward the pole, leading to enhanced westward wave drags on the polar vortex (Haynes and McIntyre, 1987). In the northern hemisphere, the planetary Rossby waves may grow enough to break, deposit westward momentum flux, slow down the polar westerly jet and even convert it into an easterly jet, leading to dramatic warming over the pole (Matsuno, 1971; McIntyre and Palmer, 1983). A change in the direction of the polar night jet accompanies what is known as a sudden stratospheric warming. A polar westerly vortex can reappear later through radiative equilibration if the winter persists. These kinds of variations are among the strongest in the high-latitude stratosphere. 
They can propagate down to the surface in weeks, change the weather and even cause cold snaps. They offer an opportunity for seasonal forecasting (Baldwin and Dunkerton, 2001; Thompson et al., 2002; Cai, 2003). According to Holton and Tan (1980), sudden stratospheric warmings should be more frequent during QBO easterly regimes than during QBO westerly regimes. This has been observed by Dunkerton et al. (1988). As in the previous explanation of the seasonal change in variance of the QBO winds, this relation depends on which vertical level is chosen to determine the phase of the equatorial zonal wind. For instance, weak polar westerly vortices and warm polar temperatures were often observed during easterly regimes of the equatorial zonal wind at 40 hPa, but during westerly regimes at 10 hPa (Dunkerton and Baldwin, 1991). In nature, it is the vertical structure of the QBO winds that defines the topography of the waveguide of the planetary Rossby waves. The 7-2b-16-7b solution shows that the two nonlinear principal components are least variable around the QBO phase of -0.5π (Fig. 3.9a), when easterlies predominate through the equatorial stratosphere except for the appearance of westerlies above 15 hPa (Fig. 3.9b). This implies that around this phase, or vertical structure, of the QBO winds, the equatorial stratospheric zonal winds are least disturbed and the planetary Rossby waves are most confined to the extra-tropics.

3.6  Seasonal synchronization of the QBO phases

A recent analysis based on a record of the monthly mean zonal wind near the equator at 50 hPa alone showed that the easterly onset occurs more frequently in May-July and the westerly onset in May-June than in other months (Baldwin et al., 2001). This annual synchronization of the phases of the QBO has been linked to seasonally varying tropical upwelling (Hampson and Haynes, 2004).
It will be shown that this phase synchronization is seen in the two 7-2b-16-7b nonlinear principal components, and how it varies throughout the year. In the scatter plot, the data points of the two 7-2b-16-7b nonlinear principal components appear to cluster around certain angular phases, and these angular positions change with the season of the year (Fig. 3.10). In more detail, the distribution of data points according to the periodic QBO phase and calendar month (Fig. 3.11) shows a tendency for a seasonal synchronization of the phases. If the data points were distributed randomly, each QBO phase would contain five to six events in a month (45 samples per month over eight phase intervals of 0.25π). In fact, each QBO phase has some particular, consecutive months in which it is more recurrent than in the other months. For instance, a QBO phase of -π occurred 10 to 13 times a month during January-June, except for 8 times in March. The number reduced to 5 to 8 in the other months. A QBO phase of -0.25π appeared 9 to 11 times a month during September-December, but 2 to 6 times per month during the other months of the year. Moreover, the QBO phase tends to advance with the particular months. The QBO phases of -π, -0.75π and -0.25π are most recurrent during January-June, June-August and September-December, respectively. Then the QBO phases of 0, 0.25π, 0.5π and 0.75π are most recurrent during January-March, April-July, August-October and November-April, respectively. This is followed by the QBO phase of π, equivalent to -π, being more recurrent during January-June. The QBO phase spans 20 to 30 such months in a cycle. If the QBO winds had the same phase speed all the time, the distribution of data points would be no different between months and phases. In fact, the phase speed varies, because the phase residue fluctuates with time (Fig. 3.7b).
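The uniform baseline of five to six events quoted above follows from simple counting. As a rough sketch in Python (the helper `phase_bin` and its rounding convention are assumptions made here for illustration, not taken from this dissertation), a periodic phase can be assigned to one of the eight 0.25π-wide intervals and the expected count per interval computed:

```python
import math

SAMPLES_PER_MONTH = 45   # samples available for each calendar month (from the text)
N_BINS = 8               # phase intervals of width 0.25*pi

# Expected number of events per phase interval if phases were uniformly distributed.
expected = SAMPLES_PER_MONTH / N_BINS   # 5.625, i.e. "five to six"

def phase_bin(psi):
    """Return the centre (in units of pi) of the 0.25*pi interval containing psi.

    psi is in radians in [-pi, pi]; the interval centred at +pi is treated as
    the same interval as the one centred at -pi.
    """
    k = math.floor(psi / (0.25 * math.pi) + 0.5)   # nearest multiple of 0.25*pi
    centre = 0.25 * k
    return -1.0 if centre == 1.0 else centre

print(expected)
print(phase_bin(-0.3 * math.pi), phase_bin(0.9 * math.pi))
```

Counting the returned centres separately for each calendar month would give a month-by-phase table of the kind summarized in Fig. 3.11, against which the baseline `expected` can be compared.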
For comparison between phases as well as months, the individual phase speeds are calculated from the accumulating phase as $\Delta\Phi_n/\Delta n_{\mathrm{month}} = (\Phi_{n+1} - \Phi_{n-1})/2$ and stratified according to the periodic QBO phase and calendar month (Fig. 3.12). All the phase speeds are positive and smaller than 0.3π per month. The phase speeds around the QBO phase of 0, when westerlies blow at the upper levels and weaker easterlies at the lower levels, and around the QBO phase of π, when easterlies blow at the upper levels and weaker westerlies at the lower levels (Fig. 3.9b), are mostly below 0.1π per month throughout the year. In contrast, the phase speeds around the QBO phase of -0.5π, when easterlies dominate through the pressure levels, fall below 0.1π per month during December-March, but with relatively few events in these months, then rise to 0.1π to 0.3π per month during April-November. The phase speeds around the QBO phase of 0.5π, when westerlies prevail through the pressure levels, fall below 0.1π per month during July-August, then rise to 0.1π to 0.3π per month during September-June.

Naturally, a QBO phase with a nearly zero speed in a month would persist in this month or a following month. A QBO phase with a high speed would have low recurrence because it quickly advances to the next phase. These associations between the phase speed and phase recurrence are seen in Figs. 3.12 and 3.11. The phase speeds at the QBO phases of -0.5π and 0.5π are usually higher than 0.1π per month (Fig. 3.12). These QBO phases therefore rarely recurred in a month, except for those months near which the previous QBO phase had a relatively high recurrence (Fig. 3.11). In contrast, the QBO phases 0 and π were often nearly stationary (Fig. 3.12). They recurred more than other QBO phases and occurred preferentially during January-March and January-June, respectively (Fig. 3.11). In the general height-phase structure (Fig.
3.9b), a QBO phase of 0 corresponds to the easterly-to-westerly transition near 40 hPa and a QBO phase of π corresponds to the westerly-to-easterly transition near 40 hPa.

Hampson and Haynes (2004) showed that the seasonally varying tropical upwelling could tune the QBO phases toward an annual synchronization through some unknown processes. However, the present analysis indicates that each of the QBO phases had some specific, consecutive months during which it occurred preferentially compared to the other months of the year. The consecutive months change with the QBO phase, and all seasons have such months, whether the tropical upwelling is strong or weak. In addition, according to the phase speeds (Fig. 3.12) and the general height-phase structure (Fig. 3.9b), the processes that synchronize the QBO exert different effects on different QBO phases. For instance, in July-August, they significantly slow down the phase speeds if westerlies are prevailing through the pressure levels, but not if easterlies are dominating there. This suggests that waves propagating upward from the troposphere could be involved in regulating the phase speeds of the QBO winds. Some particular waves in July-August provide eastward momentum flux to maintain the westerlies in the equatorial stratosphere, but thereby decelerate the easterlies.

3.7  Composite of the QBO winds

In this section, the compact NLPCA model is used to obtain a one-dimensional representation of the QBO winds. The result is compared to another one-dimensional representation of the QBO winds calculated by Hamilton and Hsieh (2002). The QBO winds, to be represented through a single variable, are simulated using the 7-2b-Mψ-7b model, where M is varied from 2 to 10 with 40 runs for each 7-2b-Mψ-7b model.
The superscript ψ denotes that the input series of the decoding neurons are those of the sine and cosine of the periodic angular phase ψ formed by the arc tangent of the two incoming bottleneck series, instead of the bottleneck series themselves. This trigonometric transformation concentrates the variability of the two bottleneck series into the phase ψ and projects the amplitudes of the two bottleneck series to a constant. The lowest root mean square error of the 40 runs for each 7-2b-Mψ-7b model falls relatively quickly from 5.05 to 3.81 m s⁻¹ as M is increased from 2 to 7 and then slowly from 3.81 to 3.74 m s⁻¹ as M is increased from 7 to 10, suggesting that the 7-2b-7ψ-7b simulation is optimal in terms of least square error and fewest model parameters. The time series of the 7-2b-7ψ-7b simulation (not shown) fit the QBO winds with Pearson correlation coefficients ρ = 0.963, 0.977, 0.983, 0.985, 0.979, 0.958, 0.863 and root mean square errors 5.12, 4.21, 3.60, 3.06, 3.25, 3.76, 3.31 m s⁻¹ for pressure levels 10, 15, 20, 30, 40, 50, 70 hPa, respectively. In addition, at each pressure level, the frequency distribution of the error series of the 7-2b-7ψ-7b simulation has little skewness and a higher kurtosis than the normal distribution with expectation zero and variance equal to that of the 7-2b-7ψ-7b simulation error (not shown). This result is similar to that from the 7-2b-16-7b simulation discussed in Section 3.3. The time series constructed from accumulating the phase ψ of the 7-2b-7ψ-7b solution resembles that of the accumulating phase Φ from the 7-2b-16-7b solution (Fig. 3.7a), with a Pearson correlation coefficient of 1.000. Their corresponding phase residues are correlated at 0.955. The 7-2b-7ψ-7b solution has almost the same features in phase variability as the 7-2b-16-7b solution. The time series of the 7-2b-7ψ-7b simulation (Fig.
3.13a,b, densely overlapping circles) pass through the data points of the QBO winds in each two-dimensional scatter plot (Fig. 3.13a,b, dots) as a smooth and closed curve and form the height-phase composite of the QBO winds shown in Fig. 3.13c. Considering the predominant linear increase with time of the accumulating phase, the height-phase structure in Fig. 3.13c can also be viewed as a height-time composite of the QBO winds. In this QBO composite, characteristic features of the QBO are apparent, such as the demarcation between prevailing westerlies and easterlies, the more rapid transition from easterlies to westerlies than from westerlies to easterlies, and the stronger intensity of easterlies compared to westerlies in the middle and low levels.

Another one-dimensional representation of the QBO winds is calculated using the NLPCA.cir model by Hamilton and Hsieh (2002). The NLPCA.cir model is the standard three-hidden-layer feed-forward neural network (discussed in Section 2.2, Chapter 2 of this dissertation) with two bottleneck neurons whose signals are constrained by $u_n^2 + v_n^2 = 1$, where $u_n$ is the signal from one of the two bottleneck neurons and $v_n$ that from the other at sampling point n. That is, the NLPCA.cir model applied to the QBO winds is a 7-M-2-Mc-7 model, where the superscript c denotes the constraint. Since $u_n^2 + v_n^2 = 1$ is equivalent to $\cos^2\theta_n + \sin^2\theta_n = 1$, where $\theta_n$ is the arc tangent of $(u_n, v_n)$, the constraint implies that the signals entering the decoding neurons are the sine and cosine of the angular phase $\theta_n$ formed by the arc tangent of the two bottleneck signals. This constraint is equivalent to the trigonometric transformation in the 7-2b-Mψ-7b model described in the first paragraph of this section. Compared with the compact 7-2b-Mψ-7b model, the actual difference of the NLPCA.cir 7-M-2-Mc-7 model is that it includes the encoding layer and the bottleneck and output biases.
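The equivalence between the unit-circle constraint and the trigonometric transformation can be illustrated numerically. The sketch below, in Python, uses hypothetical bottleneck values and a helper name `to_unit_circle` chosen here for illustration; it is not code from this dissertation or from Hamilton and Hsieh (2002), and it shows only the transformation itself, not the neural network around it.

```python
import math

def to_unit_circle(u, v):
    """Map a pair of bottleneck signals (u, v) to (cos psi, sin psi, psi).

    psi is the four-quadrant arc tangent of (u, v), so the result keeps the
    angular phase of the pair and discards its amplitude.
    """
    psi = math.atan2(v, u)
    return math.cos(psi), math.sin(psi), psi

# Hypothetical bottleneck signals with different amplitudes.
samples = [(0.8, 0.3), (-0.2, 0.9), (-1.5, -0.4), (0.05, -0.02)]

for u, v in samples:
    c, s, psi = to_unit_circle(u, v)
    # The transformed pair satisfies c**2 + s**2 = 1 whatever the input amplitude.
    print(f"psi = {psi / math.pi:+.3f} pi, c^2 + s^2 = {c * c + s * s:.6f}")
```

Whatever the amplitude of the incoming pair, the decoding neurons receive a point on the unit circle, which is why the reconstructed winds depend on the phase alone.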
The NLPCA.cir M = 6 model solution was chosen to represent the QBO winds (Hamilton and Hsieh, 2002). The M = 6 model involves (7×6+6) + (6×2+2) + (2×6+6) + (6×7+7) = 129 parameters. In contrast, the 7-2b-7ψ-7b model employs (7×2) + (2×7+7) + (7×7) = 84 parameters, 45 fewer. Nonetheless, the Pearson correlation coefficients 0.863 ≤ ρ ≤ 0.985 and the root mean square errors between 3.06 and 5.12 m s⁻¹ from the 7-2b-7ψ-7b simulation are generally better than the 0.851 ≤ ρ ≤ 0.985 and the root mean square errors between 3.07 and 4.91 m s⁻¹ from the NLPCA.cir M = 6 simulation. Furthermore, and more importantly, the composite of the QBO winds from the 7-2b-7ψ-7b simulation directly characterizes the asymmetry between the westerly and easterly regimes of the QBO winds and that between the transitions of the wind regimes. The composite from the NLPCA.cir M = 6 simulation, however, shows the asymmetry only after the phases from the NLPCA.cir solution of the QBO winds are grouped into 20 equal phase intervals and a three-point running mean is applied to the histogram function. Using fewer parameters than the NLPCA.cir M = 6 model, the compact 7-2b-7ψ-7b model offers a more robust simulation of the QBO winds. These significant improvements, both in reducing the number of model parameters and in enhancing the characterization of observations, demonstrate the advantage of applying the compact NLPCA model to characterize periodic and quasi-periodic variations.

3.8  Conclusions

The compact NLPCA model developed in Chapter 2 is applied to the monthly mean zonal winds observed at seven pressure levels between 10 and 70 hPa in the stratosphere near the equator. Two nonlinear low-dimensional fits of the zonal winds are produced, one through two nonlinear principal components and the other through a single phase variable. The two fits have almost the same phase series featuring the QBO.
The quasi-periodic phase varies in length from cycle to cycle, but the accumulating phase increases with time essentially in a linear fashion at a rate of 2π/28.4-month, which is also the predominant oscillation frequency of the observed and simulated zonal winds. This rate is the essential phase speed of the QBO wind variability. In other words, 28.4 months is a fundamental period of the QBO. The phase residue, after the subtraction of the linear component from the accumulating phase, fluctuates with a predominant 11-year period as well as longer periods. The existence of modulations by cycles of longer periods could be the reason why the modulation of the QBO cannot be completely explained by the 11-year sunspot cycle alone. Also, whether the 11-year cycle mirrors sunspot activity is arguable, and the cycle could originate from the interaction of the QBO with a biennial signal.

Now, consider the model solution where two nonlinear principal components are produced. They show clearly that the zonal winds in the equatorial stratosphere are subject to perturbations, and therefore are more variable, when mean westerlies dominate through the middle or low levels of the equatorial stratosphere in the northern hemisphere winter. The zonal winds are relatively undisturbed whenever mean easterlies dominate through the middle or low levels of the equatorial stratosphere. In the southern hemisphere winter, the zonal winds are relatively undisturbed, no matter what direction the mean zonal winds are blowing.
These relationships between zonal wind variability due to the QBO and the direction of the zonal mean wind in the equatorial stratosphere are consistent with the annual appearance and disappearance of the planetary Rossby waves in the northern extra-tropical stratosphere, the modulation of the waveguide by the equatorial zonal winds and the asymmetry in the strength of the planetary Rossby waves between northern and southern hemispheres. It is also shown that the QBO phase defined by the two nonlinear principal components exhibits a tendency for seasonal synchronization. In general, a QBO phase recurs more frequently during some consecutive months in the year. All seasons have such intervals. The phase advances with these particular months and spans 20 to 30 such months in a cycle. The difference in the number of times a QBO phase occurs in a given month compared to another month is linked to the change of phase speed. The phase speed changes by month as well as by phase. When westerlies or easterlies blow at the upper levels and winds of the opposite direction are present at lower levels, the phase speeds primarily fall between 0 and 0.1π per month through a year. In contrast, when easterlies dominate through the pressure levels, the phase speeds rise to 0.1 to 0.3π per month during April-November, but fall between 0 and 0.1π per month during December-March. When westerlies prevail through the pressure levels, the phase speeds rise to 0.1 to 0.3π per month during September-June, but mostly fall between 0 and 0.1π per month during July-August. A QBO phase is more recurrent in a month when its associated phase speeds tend to be zero and rarely occurs if its associated phase speeds are relatively high. The change of phase speeds with phase and time suggests the involvement of upward-propagating tropospheric waves in modulating the QBO. 
In the one-dimensional NLPCA approximation of the QBO winds, the compact NLPCA model provides a better representation of the QBO winds than the classical principal component analysis and a better description of the asymmetry of the QBO between westerly and easterly shear zones and between westerly-to-easterly and easterly-to-westerly transitions.

Figure 3.1: The height-time section of the QBO winds. The contour interval is 5 m s⁻¹, with westerlies in thin lines within the shaded area, easterlies in thin lines and zero velocity in thick lines. [Panels (a)-(d): pressure (hPa) from 10 to 70 hPa versus year, covering 1956-1968, 1967-1979, 1978-1990 and 1989-2000.]

Figure 3.2: (a) The resultant root mean square errors of runs 1 to 40 (dots from left to right for each M) of the 7-2b-M-7b model applied to the normalized QBO winds, where M is varied from 4 to 20. (b) The root mean square error of the lowest run among the 40 runs for each M (dot) and that of the optimal solution (circle around the dot). [Axes: root mean square error (m s⁻¹) versus M.]

Figure 3.3: The time series of the QBO winds (dots) and the 7-2b-16-7b simulation (line). The Pearson correlation coefficient ρ between observation and simulation at each pressure level is shown at the top of the corresponding plot. The vertical scale varies between the plots. [Panel labels: 10 hPa, ρ=0.984; 15 hPa, ρ=0.993; 20 hPa, ρ=0.993; 30 hPa, ρ=0.994; 40 hPa, ρ=0.991; 50 hPa, ρ=0.974; 70 hPa, ρ=0.874. Axes: wind speed (m s⁻¹) versus year, 1956-2000.]

Figure 3.4: The frequency distribution (vertical lines) and the standard deviation σ of the 7-2b-16-7b simulation error for each pressure level. The normal distribution with expectation zero and variance σ² equal to that of the error is shown as a curve. The units of the error and of the standard deviation are m s⁻¹. [Panel labels: 10 hPa, σ=3.41; 15 hPa, σ=2.42; 20 hPa, σ=2.26; 30 hPa, σ=1.93; 40 hPa, σ=2.14; 50 hPa, σ=2.99; 70 hPa, σ=3.18.]

Figure 3.5: The frequency distribution (vertical lines) and the standard deviation σ of the error series of the reconstruction from the leading two modes of the linear principal component analysis of the QBO winds for each pressure level. The normal distribution with expectation zero and variance σ² equal to that of the error is shown as a curve. The units of the error and of the standard deviation are m s⁻¹. [Panel labels: 10 hPa, σ=5.76; 15 hPa, σ=3.54; 20 hPa, σ=4.21; 30 hPa, σ=4.18; 40 hPa, σ=3.65; 50 hPa, σ=4.71; 70 hPa, σ=3.74.]

Figure 3.6: (a,b) The monthly series of the two 7-2b-16-7b nonlinear principal components, u and v. (c) The monthly series of the periodic QBO phase, Ψ = arctan(v/u) (−π ≤ Ψ ≤ π), defined by the two nonlinear principal components. Sampling points are marked by dots.

Figure 3.7: (a) The accumulating phase Φ (densely overlapping dots) defined by the two 7-2b-16-7b nonlinear principal components and its linear regression ΦL, where nmonth is the number of months from the initial sampling point and starts from 1. (b) The phase residue φ = Φ − ΦL (line with dots marking sampling points) and its harmonic component at 0.089 cpy (line). (c) The Fourier amplitude coefficients of the phase residue at frequencies between 0 and 1 cpy. Possible origins or uncertainties of the amplitude coefficient peaks are discussed in Section 3.4.2 of this dissertation. [Panel (a) annotation: ΦL = 2nmonth/28.4 + 1.3, in units of π. Frequencies marked in panel (c): 0.089, 0.289, 0.422, 0.578 and 0.822 cpy.]

Figure 3.8: The Fourier amplitude coefficients of the QBO winds (thin lines) and of the 7-2b-16-7b simulation (thick lines) for each pressure level at frequencies between 0 and 1 cpy. [Axes: Fourier amplitude coefficient (m s⁻¹) versus frequency (cycle per year), with 0.422 cpy marked.]

Figure 3.9: (a) The scatter plots of the two 7-2b-16-7b nonlinear principal components (u, v) (dots) and their generalizations from the 2-2b-3ψ-2b solution (densely overlapping crosses). The range of both x-axis and y-axis is [-1, 1]. (b) The height-phase structure of the QBO winds from the generalized nonlinear principal components. The contour interval is 5 m s⁻¹, with westerlies in thin lines within the shaded area, easterlies in thin lines and zero velocity in thick lines.

Figure 3.10: The scatter plots of the two 7-2b-16-7b nonlinear principal components (dots) and their generalizations from the 2-2b-3ψ-2b solution (densely overlapping crosses), stratified according to seasons. [Panels: (a) FMA, (b) MJJ, (c) ASO, (d) NDJ.]

Figure 3.11: Distribution of the QBO phases defined by the two 7-2b-16-7b nonlinear principal components, stratified according to calendar month and phase. A phase interval of 0.25π is chosen to avoid too few samples within a phase interval. The phase intervals are centered at -π, -0.75π, -0.5π, -0.25π, 0, 0.25π, 0.5π, and 0.75π, respectively.

        Ψ (π)
Month   −1   −0.75  −0.5  −0.25   0    0.25   0.5   0.75   1
J       10     4      1     4    10     5      2     9    10
F       10     3      2     4     9     4      3    10    10
M        8     5      2     2     9     6      2    11     8
A       11     4      3     3     5     9      1     9    11
M       13     5      2     5     2    11      1     6    13
J       10     8      6     5     1     9      3     3    10
J        7     8      7     5     3    10      2     3     7
A        7     8      4     6     5     7      5     3     7
S        8     4      4     9     3     7      6     4     8
O        7     4      3    11     3     5      6     6     7
N        5     5      3    10     4     6      2    10     5
D        8     5      1     9     5     6      2     9     8

Figure 3.12: Distribution of the speeds of the QBO phases defined by the two 7-2b-16-7b nonlinear principal components, $\Delta\Phi_n/\Delta n = (\Phi_{n+1} - \Phi_{n-1})/2$, stratified according to calendar month and phase. [Panels: Jan to Dec. Axes: ∆Φ/∆n (π/month) versus Ψ (π).]

Figure 3.13: (a,b) The two-dimensional phase plots of the QBO winds (dots) and the 7-2b-7ψ-7b simulation (densely overlapping circles) at 10, 30, and 70 hPa. (c) The composite of the QBO winds according to the 7-2b-7ψ-7b simulation. The contour interval is 5 m s⁻¹, with westerlies in thin lines within the shaded area, easterlies in thin lines and zero velocity in thick lines.

Chapter 4

Relationship of the Arctic Oscillation with the Quasi-Biennial Oscillation

4.1  Introduction

Atmosphere variability is often described by variance patterns. These are modes that are constructed to extract the maximum possible amount of variance present in the atmosphere over certain regions.
In the troposphere, the Arctic Oscillation (AO) is considered the dominant climate feature over the extra-tropics of the northern hemisphere (Thompson and Wallace, 2000, 2001). In the stratosphere, the Quasi-Biennial Oscillation (QBO) of the zonal winds plays such a role over the tropics (Hamilton, 1981). The QBO could cause perturbations to the polar westerly vortex in the stratosphere as it affects the topology of the waveguide of the planetary Rossby waves (Wallace and Thompson, 2002). Recently, it has been shown that large and sustained perturbations to the polar westerly vortex tend to propagate downward to the earth's surface in about three weeks (Baldwin and Dunkerton, 1999, 2001). These perturbations can cause unusual weather anomalies and climate effects. They also offer the potential to improve long-range weather forecasting (Thompson et al., 2002). It has been suggested (see Baldwin et al., 2001, for a review) that this downward propagation proceeds from the QBO, affecting tropospheric variability associated with the AO. It is this mode interaction that is studied in this work.

Due to nonlinear relations between the equatorial stratospheric zonal winds at different vertical levels, previous investigations into the effect of the QBO on the AO generally used the zonal wind data at one level to define a QBO regime. But they came to different conclusions if different levels were chosen (Baldwin et al., 2001). In this chapter, the compact NonLinear Principal Component Analysis (NLPCA) developed in Chapter 2 is applied to a dataset consisting of a monthly mean AO index and zonal wind observations at multiple pressure levels. The AO index is based on the 1000-hPa geopotential height anomaly poleward of 20°N. The data of the zonal wind consist of the monthly mean observations at seven pressure levels from 10 to 70 hPa in the stratosphere near the equator.
The compact NLPCA applied to the dataset results in two nonlinear principal components, which are then analyzed to investigate the relationship of the AO index with the overall zonal winds.

This chapter is organized as follows. Section 4.2 describes the data used for analysis. Section 4.3 selects an optimal two-dimensional NLPCA simulation of the dataset among various model runs. Section 4.4 analyzes the relationship of the AO with the QBO. Section 4.5 gives conclusions.

4.2  Data of the AO index and the QBO winds

The monthly AO index of the 1000-hPa geopotential height anomaly is provided at www.cpc.ncep.noaa.gov/products/precip/cwlink/ by the Climate Prediction Center, National Centers for Environmental Prediction, National Oceanic and Atmospheric Administration of the United States. Its loading pattern is the leading empirical orthogonal function of the monthly mean 1000-hPa geopotential height anomaly during 1979-2000. A daily AO index is constructed by projecting the daily 1000-hPa geopotential height anomaly poleward of 20°N onto the loading pattern. A monthly mean AO index is obtained by averaging the daily values of the AO index for each month and normalizing the result by the standard deviation of the 1979-2000 base period. The normalized monthly mean AO index from January 1956 through December 2000, spanning the same period as the data of the equatorial stratospheric zonal winds, is used in this study.

The QBO appears at least up to 40 km in the equatorial stratosphere, but operational balloon soundings are usually capped at 10 hPa, around 30 km (Hamilton, 1981). Regular observations of winds in the equatorial stratosphere up to 10 hPa started in the early 1950s. The winds were measured twice per day using balloons above Canton Island (2.8°N, 171.7°W) from January 1956 through August 1967, Gan, Maldives (0.7°S, 73.1°E) from September 1967 through December 1975, and Singapore (1.4°N, 103.9°E) from January 1976 through December 2000.
A standard height-time record of monthly mean zonal winds was calculated from the observations at 70, 50, 40, 30, 20, 15, and 10 hPa, and distributed by the Free University of Berlin (Naujokat, 1986; Marquardt and Naujokat, 1997). The monthly mean series, with their 45-year means removed but weak seasonal cycles retained, are used in this study. The data are the same as those used in Chapter 3 and are called the QBO winds for convenience. The height-time section of the QBO winds can be seen in Fig. 3.1 of Chapter 3 for reference.

4.3  Optimal simulation of the AO-QBO dataset

The AO index and the QBO winds are physically different variables with different variances. The standard deviation of the AO index is 1.0, but that of the QBO winds taken over all the pressure levels is 16.5 m s−1. In order to keep the QBO winds from dominating the objective function of the mean square error, the QBO winds at all the pressure levels are normalized by dividing them by their overall standard deviation. Then an 8-2b-M-8b model, where M is varied from 4 to 20 with 40 runs for each value of M, is applied to the dataset consisting of eight time series: the AO index and the re-scaled QBO winds. This 8-2b-M-8b model contains two bottleneck neurons. The results did not change significantly when more bottleneck neurons were used in the neural network model. This is not surprising since Wallace et al. (1993) showed that the basic dynamics of the QBO winds could be described reasonably well using only two linear principal components. The root mean square errors between output and target of all the 8-2b-M-8b model runs are calculated (Fig. 4.1a) for comparison.
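The re-scaling step and the error measure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the function names `rescale_dataset` and `rmse` are hypothetical, the population form of the standard deviation is assumed, and the arrays are synthetic stand-ins for the observed AO index and QBO winds.

```python
import numpy as np

def rescale_dataset(ao_index, qbo_winds):
    """Stack the AO index with the QBO winds divided by one overall standard deviation.

    ao_index:  array of shape (n_months,), already normalized
    qbo_winds: array of shape (n_months, n_levels), anomalies in m/s
    """
    # One standard deviation over all levels and months (population form, an assumption).
    overall_std = float(np.std(qbo_winds))
    scaled_winds = qbo_winds / overall_std
    return np.column_stack([ao_index, scaled_winds]), overall_std

def rmse(output, target):
    """Root mean square error over all series and all time steps."""
    return float(np.sqrt(np.mean((np.asarray(output) - np.asarray(target)) ** 2)))

# Synthetic stand-in data, not the observed AO index or QBO winds.
rng = np.random.default_rng(0)
n_months = 45 * 12
ao = rng.standard_normal(n_months)
winds = 16.5 * rng.standard_normal((n_months, 7))
winds -= winds.mean(axis=0)               # remove the long-term mean of each level

dataset, overall_std = rescale_dataset(ao, winds)
print(dataset.shape, round(overall_std, 2))
```

Dividing all seven wind series by a single number, rather than level by level, keeps the relative amplitudes between pressure levels intact while bringing the winds to the same order of magnitude as the AO index.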
The lowest root mean square error among the 40 runs for each 8-2b-M-8b model, indicating the best solution of the model in terms of least square error, decreases relatively quickly from 0.367 to 0.286 as M is increased from 4 to 17 and then remains between 0.286 and 0.282 when 17 ≤ M ≤ 20 (Fig. 4.1b, dots). In terms of least square error and fewest model parameters, the 8-2b-17-8b simulation (Fig. 4.1b, circle) is optimal. The corresponding time series of the 8-2b-17-8b simulation (Fig. 4.2, line) fit the QBO winds (Fig. 4.2, dots) extremely well, with Pearson correlation coefficients of 0.960 ≤ ρ ≤ 0.982 for pressure levels between 10 and 50 hPa and ρ = 0.879 at 70 hPa, where the original data are rather noisy. The AO index (Fig. 4.3, dots) and the corresponding time series of the 8-2b-17-8b simulation (Fig. 4.3, line) are correlated at ρ = 0.856. This is quite a good match given that the variability of the AO index involves strong interannual variations. The standard deviation σ of the error series of the 8-2b-17-8b simulation for the QBO winds at each pressure level falls between 3.12 and 5.17 and that for the AO index is 0.52. With respect to a normal distribution with zero mean and variance σ² equal to that of the 8-2b-17-8b simulation error for each pressure level (Fig. 4.4, curve), the frequency distribution of the 8-2b-17-8b simulation error has a higher kurtosis and little skewness (Fig. 4.4, vertical lines). Clearly, the 8-2b-17-8b simulation captures the co-variability of the AO index and the QBO winds without over-fitting the original data.

4.4  Relation between the AO and the QBO

4.4.1  Covariation of the AO and the QBO

The two nonlinear principal components, u and v, produced by the 8-2b-17-8b solution oscillate with a quasi-biennial period (Fig. 4.5a, b).
There are slight variations in the amplitude of the oscillations of the nonlinear principal components and also in the length of the period from cycle to cycle, similar to the variability observed in the QBO winds themselves. The two 8-2b-17-8b nonlinear principal components are out of phase. Accordingly, the phase defined by the arc tangent of the two nonlinear principal components, Ψ = arctan(v/u) (−π ≤ Ψ ≤ π), varies between −π and π through each cycle (Fig. 4.5c). Although Ψ exhibits noticeable differences in the length of the period from cycle to cycle, it also shows a strong tendency for the phase to increase linearly with time through each cycle. An accumulating phase Φ, defined as the progression of the phase Ψ with each new cycle starting at (2n − 3)π with n varying from 1 to 19, progresses predominantly in a linear fashion (Fig. 4.6a). The accumulating phase is characterized by the linear regression ΦL = 2π n_month/28.4 − 0.1π, where n_month is the number of months from the initial sampling point and starts from 1. After the linear component is subtracted from the accumulating phase, the phase residue, φ = Φ − ΦL, fluctuates between −0.7π and 0.7π (Fig. 4.6b) with two dominant peaks, one at a frequency of 0.044 cpy (= 22.5-year period) and the other at 0.089 cpy (= 11.25-year period) under the current frequency resolution of 1/45 = 0.02̇ cpy (Fig. 4.6c). The 28.4-month cycle in phase progression and the 22.5- and 11-year cycles in frequency modulation of the phase of covariation of the AO and the QBO are almost the same as those found in the phase of the overall QBO winds at the seven pressure levels alone (Fig. 3.7 in Chapter 3). The phase progression of the overall QBO winds is discussed in detail in a study on the variability and structure of the QBO [Chapter 3]. This similarity reflects the significant impact of the QBO on the AO.
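The chain of steps in this subsection (periodic phase, accumulating phase, linear regression, phase residue, Fourier amplitude coefficients) can be sketched as below. This is an illustrative sketch under assumptions rather than the procedure as implemented for this study: the function name `phase_analysis` is hypothetical, the two-argument arc tangent is assumed so that Ψ covers the full range from −π to π, `numpy.unwrap` stands in for the cycle-by-cycle accumulation, and the input is a synthetic phase built to have a 28.4-month cycle with a modulation at 4/45 cpy.

```python
import numpy as np

def phase_analysis(u, v, samples_per_year=12):
    """Periodic phase, accumulating phase, linear fit, residue and its amplitude spectrum."""
    psi = np.arctan2(v, u)                       # periodic phase in [-pi, pi]
    phi = np.unwrap(psi)                         # accumulating phase (adds 2*pi at each wrap)
    n = np.arange(1, len(phi) + 1)               # month counter starting from 1
    slope, intercept = np.polyfit(n, phi, 1)     # linear regression of phase on month
    residue = phi - (slope * n + intercept)
    freq = np.fft.rfftfreq(len(residue), d=1.0 / samples_per_year)   # cycles per year
    # Amplitude of each harmonic (the factor 2 does not apply to bin 0 or the Nyquist bin).
    amp = 2.0 * np.abs(np.fft.rfft(residue)) / len(residue)
    return psi, phi, slope, intercept, residue, freq, amp

# Synthetic phase: 28.4-month cycle modulated at 4/45 cpy with amplitude 0.18*pi.
n = np.arange(1, 45 * 12 + 1)
true_phase = 2 * np.pi * n / 28.4 - 0.1 * np.pi + 0.18 * np.pi * np.cos(2 * np.pi * 4 * n / n.size)
u, v = np.cos(true_phase), np.sin(true_phase)

psi, phi, slope, intercept, residue, freq, amp = phase_analysis(u, v)
print("period (months):", 2 * np.pi / slope)
print("resolution (cpy):", freq[1], "periods at bins 3, 4, 5 (years):", 1 / freq[3:6])
```

With 45 years of monthly samples the harmonics fall at multiples of 1/45 cpy, so the bin at 4/45 ≈ 0.089 cpy corresponds to 11.25 years and its neighbours to 15 and 9 years, while 2/45 ≈ 0.044 cpy corresponds to 22.5 years.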
In fact, the QBO signature has been shown to exist in the winter fields of the 1000-hPa geopotential height, surface air temperature and sea level pressure anomalies associated with the AO index (Coughlin and Tung, 2001). However, a classical Fourier transform of the AO index produces multiple dominant harmonics and does not show a clear QBO signal in the time series of the AO index.

4.4.2  Statistical significance of the phase features

Since a 45-year dataset is analyzed, the 22.5- and 11-year periods present in the modulations of the phase residue φ might appear to be spurious results. However, the predominant linear increase with time of the accumulating phase Φ and the dominant decadal fluctuations of the phase residue φ are robust features characterized by the two nonlinear principal components. They appear and are almost the same in the corresponding two nonlinear principal components of the 8-2b-M-8b model runs with close root mean square errors. For example, among the total of 160 runs of the 8-2b-M-8b models with M = 17, 18, 19, and 20, 159 runs yield root mean square errors between 0.28 and 0.31 (Fig. 4.1a). For these 159 NLPCA solutions, the accumulating phases defined by the two nonlinear principal components of individual solutions closely resemble each other (not shown) at Pearson correlation ρ > 0.999. The slope coefficients of their linear regressions have a mean of 2π/28.4-month and a standard deviation of 2π/0.008-month. The standard deviation is negligible compared to the value of the mean. The Fourier amplitude coefficients of the corresponding phase residues have a mean of 0.18π at both 0.044 and 0.089 cpy, with a standard deviation of 0.004π at 0.044 cpy and 0.002π at 0.089 cpy. Again, the standard deviations are negligible compared to the values of the mean.
From the point of view of reproducibility, the predominant linear increase at a speed of 2π/28.4-month and dominant decadal modulations at 22.5- and 11-year cycles of the phase are very robust features in the covariation of the AO and the QBO.

4.4.3  Relationship of the AO with the QBO

In this section, the connection between the AO and the QBO is explored by analyzing the NLPCA representations of their co-variability. NLPCA does not allow a simple representation of the nonlinear principal components (i.e., the bottleneck series) as the product of a constant spatial pattern by a temporal series (as in PCA). Hence, a picture of the variability represented by the two nonlinear principal components from the optimal solution of the AO-QBO dataset is best obtained, in this case, by reducing the dimensionality of the space of nonlinear principal components from two to one. This additional reduction can be done since the two nonlinear principal components, u and v, determined from the 8-2b-17-8b solution show a clear phase relationship. The 2-2b-Mψ-2b model will be used here. Superscript ψ indicates that the series entering the decoding neurons are the sine and cosine of the angular phase formed by the arc tangent of the two bottleneck series, instead of the bottleneck series themselves. In this 2-2b-Mψ-2b model, u and v from the 8-2b-17-8b solution are used as input and target data of the neural network. M is varied from 2 to 5 with 10 runs for each 2-2b-Mψ-2b model. The lowest root mean square error of the 10 runs for each 2-2b-Mψ-2b model is 0.097, 0.090, 0.089, and 0.087 for M = 2, 3, 4, 5. The 2-2b-3ψ-2b solution appears optimal by the measures of least square error and fewest model parameters. In a phase plot, the generalized nonlinear principal components from the 2-2b-3ψ-2b model output (Fig. 4.7a, crosses) pass through the data points of (u, v) (Fig.
4.7a, dots), the two 8-2b-17-8b nonlinear principal components, as a smooth and closed curve. When the generalized nonlinear principal components are sent to the decoding neurons of the optimal 8-2b-17-8b model, the seven output series corresponding to the QBO winds form a general height-phase structure (Fig. 4.7b). Also, the eighth output series corresponding to the AO index can be plotted against the periodic phase Ψ from the 8-2b-17-8b solution (Fig. 4.7c). From Figures 4.7b and 4.7c, it is noted that large positive values of the AO index occur when QBO westerlies occupy over half the 10-70 hPa range of the equatorial stratosphere, whereas large negative values of the AO index arise when QBO easterlies occupy most of the depth of the 10-70 hPa layer.

Thompson and Wallace (2000) indicated that strong perturbations of the AO index happen only in boreal winter. This feature appears clearly in the 8-2b-17-8b solution. When stratified bimonthly, it is easily seen that, in comparison with their generalization, the two 8-2b-17-8b nonlinear principal components are quite variable during November-April, whereas they stay close to the generalization, and are thus quite stable, during May-October (Fig. 4.8). Concomitantly, after the simulated AO index is stratified bimonthly, the winter amplification of the AO index is clearly observed (Fig. 4.9). The AO index varies from -4 to 3 during November-April, whereas it falls mostly between -1 and 1 during May-October. These features are consistent with the proposed general influence of the tropical QBO of the zonal winds on the extra-tropical AO through the modulation of the waveguide of stratospheric planetary Rossby waves.
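The bimonthly stratification used for Figures 4.8 and 4.9 amounts to grouping a monthly series by pairs of calendar months. A minimal sketch is given below; the helper names `stratify_bimonthly` and `pooled_range` are hypothetical, and the series is synthetic, constructed only to have a larger amplitude in November-April.

```python
import numpy as np

BIMONTHLY = ("JF", "MA", "MJ", "JA", "SO", "ND")

def stratify_bimonthly(months, values):
    """Group a monthly series into six two-month bins; months are calendar months 1..12."""
    months = np.asarray(months)
    values = np.asarray(values)
    bins = (months - 1) // 2                     # 0 for Jan-Feb, ..., 5 for Nov-Dec
    return {name: values[bins == i] for i, name in enumerate(BIMONTHLY)}

def pooled_range(groups, names):
    """Minimum and maximum over the union of the named bins."""
    pooled = np.concatenate([groups[name] for name in names])
    return float(pooled.min()), float(pooled.max())

# Synthetic series with a larger amplitude in November-April (illustrative only).
months = np.tile(np.arange(1, 13), 45)
scale = np.where((months >= 11) | (months <= 4), 3.0, 1.0)
values = scale * np.sin(0.7 * np.arange(months.size))

groups = stratify_bimonthly(months, values)
print("Nov-Apr range:", pooled_range(groups, ["ND", "JF", "MA"]))
print("May-Oct range:", pooled_range(groups, ["MJ", "JA", "SO"]))
```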
During QBO westerly regimes, the waveguide is extended to the summer stratosphere and planetary Rossby waves can penetrate into the tropics (Ortland, 1997; Hamilton, 1998), leading to decreased westward wave drag on the extra-tropical westerly winds, an enhanced polar westerly vortex and large positive values of the AO index in the winter stratosphere. In contrast, during QBO easterly regimes, the waveguide is confined to the winter stratosphere, leading to increased westward wave drag on the extra-tropical westerly winds, a weakened polar westerly vortex or even an easterly vortex, and large negative values of the AO index (Haynes and McIntyre, 1987; O'Sullivan and Salby, 1990; Holton and Austin, 1991; O'Sullivan, 1997; Hamilton, 1998).

Several mechanisms have been proposed to explain the dynamical coupling between the stratosphere and the troposphere in the extra-tropics. For instance, through an interaction between upward propagating planetary Rossby waves and the mean flow, perturbations in the stratosphere can propagate downward to the troposphere (Kodera et al., 1996; Shindell et al., 1999). Meanwhile, the planetary Rossby waves associated with the perturbations may be reflected from the stratosphere to the troposphere (Perlwitz and Harnik, 2003). The amount of equatorward penetration and poleward reflection of the planetary Rossby waves depends on the phase of the equatorial stratospheric zonal winds. The phase not only changes with time, but also propagates downward with time. The effective vertical range of the equatorial stratosphere that allows or blocks the wave penetration varies somewhat sinusoidally. The larger the vertical range the QBO westerlies occupy, the greater the penetration of planetary Rossby waves into the tropics could be, potentially leading to a stronger polar westerly vortex and a larger positive AO index.
The lower equatorial stratosphere (20-23 km in altitude) is relatively well shielded from the penetration of the planetary Rossby waves (O'Sullivan, 1997), so large positive values of the AO index occur when QBO westerlies dominate near the middle and upper levels of the equatorial stratosphere (Fig. 4.7b,c). In contrast, large negative values of the AO index arise when QBO easterlies occupy over half the layer of the equatorial stratosphere.

4.5  Conclusions

Recently, there has been considerable work analyzing the AO, a dominant mode of variability over the extra-tropics of the northern hemisphere, and linking it to the QBO of the zonal winds in the equatorial stratosphere. The observed linkage is rather complicated due to the complex relations between the two oscillations and between the QBO winds at different vertical levels. Perturbations of the AO were usually linked synoptically to the effect of the QBO (see Baldwin et al., 2001, for a review). This study applies the new, compact NLPCA model to a dataset consisting of a monthly mean AO index of the 1000-hPa geopotential height anomaly and zonal winds in the equatorial stratosphere between 10 and 70 hPa to investigate the relationship of the AO with the overall QBO winds. The eight time series of the dataset are effectively represented by two nonlinear principal components. The phase defined by the arc tangent of the two nonlinear principal components reveals that the phase of covariation of the AO and the QBO is actually governed by that of the QBO winds [Chapter 3]. The phase also progresses predominantly at a constant rate and its residual (with the linear increase removed) is modulated at periods of 22.5 and 11 years. The nonlinear simulation of the dataset describes the relationship of the AO with the vertical structure of the QBO winds in detail. Large positive values of the AO index occur when QBO westerlies prevail near the middle or upper levels of the equatorial stratosphere.
In contrast, large negative values of the AO index arise when QBO easterlies occupy over half the layer of the equatorial stratosphere. Both extremes of the AO index occur only in boreal winter from November to April.

[Figure 4.1 panels (a) and (b): Root Mean Square Error (0.28 to 0.38) against M (4 to 20).]

Figure 4.1: (a) The resultant root mean square errors of runs 1 to 40 (dots from left to right for each M) of the 8-2b-M-8b model applied to the dataset of the AO index and QBO winds, where M is varied from 4 to 20. (b) The root mean square error of the lowest run among the 40 runs for each 8-2b-M-8b model (dot) and that of the optimal model solution (circle around the dot).

[Figure 4.2 panels: Wind Speed (m s−1) against Year (1956 to 2000). 10 hPa, ρ=0.962; 15 hPa, ρ=0.976; 20 hPa, ρ=0.980; 30 hPa, ρ=0.982; 40 hPa, ρ=0.977; 50 hPa, ρ=0.960; 70 hPa, ρ=0.879.]

Figure 4.2: The time series of the QBO winds (dots) and the corresponding ones of the 8-2b-17-8b simulation (line). The Pearson correlation coefficient ρ between the observation and simulation for each pressure level is shown at the top of the corresponding plot.

[Figure 4.3 panels: AO Index (−4 to 4) against Year (1956 to 2001), ρ=0.856.]

Figure 4.3: The time series of the AO index (dots) and the corresponding one of the 8-2b-17-8b simulation (line). The Pearson correlation coefficient ρ between the observation and simulation is shown at the top of the top plot.

[Figure 4.4 panels: Frequency Distribution against simulation error. 10 hPa, σ=5.17; 15 hPa, σ=4.30; 20 hPa, σ=3.88; 30 hPa, σ=3.37; 40 hPa, σ=3.40; 50 hPa, σ=3.68; 70 hPa, σ=3.12; 1000 hPa, σ=0.52.]

Figure 4.4: The frequency distribution (vertical lines) and the standard deviation σ of the 8-2b-17-8b simulation error for each pressure level. The normal distribution with expectation zero and variance σ² equal to that of the error is shown as a curve. 10 to 70 hPa are the pressure levels of the QBO winds and 1000 hPa is that of the AO index. The units of the error and of the standard deviation for the QBO winds are m s−1.

[Figure 4.5 panels: (a) u, (b) v, (c) Ψ (π), each against Year (1956 to 2000).]

Figure 4.5: (a,b) The monthly series of the two 8-2b-17-8b nonlinear principal components, u and v. (c) The monthly series of the periodic phase, Ψ = arctan(v/u) (−π ≤ Ψ ≤ π), defined by the two nonlinear principal components. Dots mark the sampling points.

[Figure 4.6 panels: (a) Φ (π) against Year, annotated ΦL = 2 n_month/28.4 − 0.1 (π); (b) φ = Φ − ΦL (π) against Year; (c) amplitude of φ (π) against Frequency (cycle per year), with 0.089 cpy marked.]

Figure 4.6: (a) The accumulating phase Φ (densely overlapping dots) defined by the two 8-2b-17-8b nonlinear principal components and its linear regression ΦL, where n_month is the number of months from the initial sampling point and starts from 1. (b) The phase residue φ = Φ − ΦL. Dots mark the sampling points. (c) The Fourier amplitude coefficients of the phase residue at frequencies between 0 and 1 cpy.

[Figure 4.7 panels: (a) phase plot of (u, v); (b) Pressure (hPa) against Ψ (π); (c) AO Index against Ψ (π).]

Figure 4.7: (a) The phase plots of the two 8-2b-17-8b nonlinear principal components (u, v) (dots) and their generalization from the 2-2b-3ψ-2b solution (densely overlapping crosses). The range of both x-axis and y-axis is [-1, 1]. (b) The general height-phase structure of the QBO winds according to the generalized nonlinear principal components. The contour interval is 5 m s−1, with westerlies in thin lines within the shaded area, easterlies in thin lines and zero velocity in thick lines. (c) The AO index from the 8-2b-17-8b simulation against the periodic phase Ψ defined by (u, v).

[Figure 4.8 panels: JF, MA, MJ, JA, SO, ND; phase plots of (u, v).]

Figure 4.8: The phase plots of the two 8-2b-17-8b nonlinear principal components (u, v) (dots) and their generalization from the 2-2b-3ψ-2b solution (densely overlapping crosses), in January-February, March-April, May-June, July-August, September-October, and November-December, respectively.

[Figure 4.9 panels: JF, MA, MJ, JA, SO, ND; AO Index (−4 to 4) against Ψ (π).]

Figure 4.9: The AO index from the 8-2b-17-8b simulation against the periodic phase Ψ defined by the two 8-2b-17-8b nonlinear principal components, in January-February, March-April, May-June, July-August, September-October, and November-December, respectively.

Chapter 5

Conclusions and Future Work

5.1  Conclusions

Over the last ten years, artificial neural networks have been used to develop a nonlinear version of principal component analysis.
Initial applications of NonLinear Principal Component Analysis (NLPCA) by neural networks to various atmospheric and oceanic fields have shown the potential of this technique to elucidate the nonlinear structures present in those fields. Previously, the classic linear principal component analysis was used to search for patterns in geophysical fields with the result that a nonlinear structure, if present, was diluted into a number of linear principal components. The development of NLPCA enables the direct search of nonlinear structures present in a large dataset. The first aim of the research presented in this dissertation was to develop an NLPCA model that is simpler than those used recently and that is less susceptible to data over-fitting and non-uniqueness of solutions [Chapter 2]. The second aim was to apply NLPCA to describing the Quasi-Biennial Oscillation to show the power of the technique to represent succinctly a quasi-periodic atmospheric oscillation [Chapters 3 and 4].

5.1.1  Nonlinear principal component analysis by feed-forward neural networks

The NLPCA model of the three-hidden-layer feed-forward neural network (Kramer, 1991) is plagued by the problems of over-fitting and non-uniqueness of solutions (Hsieh, 2004). Through both mathematical analysis and experimental examples, it is shown in Chapter 2 of this dissertation that these problems are intrinsic due to the three-hidden-layer architecture. In particular, the parameters of the neural network cannot be uniquely determined. As the number of encoding neurons is increased, a bottleneck signal (from the middle hidden layer of the three-hidden-layer structure) can reach any possible value by receiving and combining more encoding signals. Subsequently, the model output will over-fit the target data and have difficulty arriving at unique solutions.
To resolve these problems, a new, compact neural network model is presented to conduct NLPCA on large datasets. The compact NLPCA model is a simplified two-hidden-layer feed-forward neural network that has no encoding layer and no bottleneck and output biases. The mathematical analysis and experimental examples demonstrate that eliminating the bottleneck bias helps to reduce the variance of bottleneck series, also called nonlinear principal components, among different model runs. The output biases can be neglected according to both signal definition and parameter optimization. To represent a dataset with the same number of nonlinear principal components, the compact NLPCA model uses significantly fewer parameters than the three-hidden-layer feed-forward neural network.

5.1.2  Structure and variability of the QBO winds

To show the advantage of the compact NLPCA as well as to investigate the structure and variability of the QBO, in Chapter 3 of this dissertation, the new NLPCA model is applied to the monthly mean zonal winds between 10 and 70 hPa in the stratosphere near the equator. The dataset, called the QBO winds, consists of seven monthly series from 45 years of observations. Two low-dimensional representations of the dataset are obtained, one through two nonlinear principal components and the other through a single phase variable. The two representations both produce a time series that characterizes the phase variability of the overall zonal winds through the equatorial stratosphere. The two phase series closely resemble each other. They reveal two distinctive phase features of the QBO winds.

(i) The QBO phase increases with time essentially in a linear fashion at a rate of 2π/28.4-month. The period of 28.4 months is determined to be a fundamental period of the QBO.

(ii) The QBO phase is dominantly modulated by an 11-year cycle as well as a cycle of longer period. This is seen in the spectrum of Fourier amplitude coefficients of the phase residue.
At the current frequency resolution 1/45 = 0.02̇ cpy, the period equivalent to 0.089 cpy is 11.25 years, and the neighbouring harmonics correspond to periods of 15 years on the low-frequency side and 9 years on the high-frequency side. But the markedly high value of the amplitude coefficient peak at 0.089 cpy indicates that the significant signal in the period interval between 9 and 15 years has a period very close to 11 years.

The two nonlinear principal components of the first representation, being able to describe variations in both phase progression and amplitude value of oscillations, clearly characterize several dynamic features of the QBO winds.

(i) The QBO winds are more variable if westerlies dominate through the middle or low levels of the equatorial stratosphere during the northern hemisphere winter. In contrast, the QBO winds are relatively undisturbed if easterlies dominate through the middle or low levels of the equatorial stratosphere, whether in the winter or the summer season, and during southern hemisphere winter, whether the zonal winds blow eastward or westward. The variance of the QBO winds changes according to the phase of the QBO and the seasons of the year. These changes can be explained by the interaction of the planetary Rossby waves in the northern extra-tropical stratosphere and the QBO winds, the seasonal activity of the planetary Rossby waves and the asymmetry in the strength of the planetary Rossby waves between northern and southern hemispheres.

(ii) The QBO phases tend to synchronize with seasons. A QBO phase occurs more often during some consecutive months than during the other months of the year. The phase progresses with the particular months, appears in all seasons and spans 20 to 30 such months in a cycle. The seasonal synchronization is linked to the phase and seasonal changes of phase speeds.

(iii) The phase speeds often fall between 0 and 0.1π per month.
However, when easterlies dominate through the pressure levels during April-November, the phase speeds can rise to 0.3π per month. Similarly, when westerlies prevail through the pressure levels during September-June, the phase speeds can rise to 0.3π per month. A QBO phase with a nearly zero progression speed would stay longer or appear more recurrent in a month than those phases with a relatively high progression speed. The one-dimensional NLPCA representation provides a better approximation of the QBO winds than the two-mode reconstruction from the classical principal component analysis. The description of the asymmetry of the zonal winds between westerly and easterly shear zones and the description of the westerly-to-easterly and easterly-to-westerly transitions are closer to observations when using the onedimensional NLPCA representation.  5.1.3  Relationship of the AO with the QBO winds  The compact NLPCA model is then applied to the dataset consisting of an AO index and the QBO winds to investigate the relationship between the two oscillations in Chapter 4 of this dissertation. The AO index is a monthly mean series of the first principal component from the classical principal component analysis of the 1000-hPa geopotential height anomaly. Since the AO index represents the large-scale variability of the tropospheric winds, it was interesting to investigate how this mode relates to the QBO. The data of QBO winds are the same as those used in Chapter 3 and contain seven monthly mean series of the zonal winds observations between 10 and 70 hPa in the equatorial stratosphere. The eight time series are effectively represented by two nonlinear principal components. The phase of covariation of the AO index and the QBO winds, defined by the arc tangent of the two nonlinear principal  Chapter 5. Conclusions and Future Work  84  components, appears to be governed by that of the QBO winds [Chapter 3]. It progresses predominantly at a constant rate of 2π/28.4-month. 
After the subtraction of the linear increase, the phase residual, at the current frequency resolution of 1/45 ≈ 0.022 cpy, appears modulated at periods of 22.5 and 11 years. The NLPCA simulation of the dataset depicts the relationship of the AO with the vertical structure of the QBO winds in detail. Large positive values of the AO index are associated with the prevailing westerlies near the middle or upper levels of the equatorial stratosphere. In contrast, large negative values of the AO index are connected with the dominating easterlies over half the layer of the equatorial stratosphere. Extreme values of either sign of the AO index emerge only in boreal winter, from November to April.

5.2  Future work

Following the in-depth investigation of the structure of neural networks used in NLPCA, more questions have arisen. For instance, can the input data be prepared in some way that would expedite the NLPCA? For a dataset consisting of many variables, applying NLPCA by neural network involves a large number of network parameters. To reduce the number of network parameters, applying NLPCA to the linear principal components derived from the principal component analysis of the dataset was proposed (Hsieh and Tang, 1998). This approach has been applied to various data in oceanography and meteorology, such as the tropical Pacific sea surface temperature and sea level pressure (Monahan, 2001; Hsieh, 2001; Hsieh and Wu, 2002), and the Pacific subsurface heat content (Tang and Hsieh, 2003). The results from this approach are often compared with those from the principal component analysis. The number of linear principal components versus that of nonlinear principal components and the fraction of variance accounted for by these components are compared. Usually, several linear principal components are approximated through one nonlinear principal component. So the total variance of the one-dimensional NLPCA approximation of multiple linear principal components is compared with the variance of the first linear principal component. However, the NLPCA approximations of the linear principal components are not scrutinized, although they appear quite different from the original linear principal components and the total variance of the NLPCA approximations of the linear principal components is significantly lower than the total variance of the original linear principal components concerned. For instance, the first three linear principal components of the sea surface temperature anomaly (SSTA) over the tropical Pacific explain 51.4%, 10.1% and 7.2% of the SSTA variance respectively and 68.7% in total. A one-dimensional NLPCA approximation of the three linear principal components explains 56.6% of the SSTA variance, which is 12.1 percentage points lower than that originally explained by the three linear principal components in total, although 5.2 percentage points higher than that originally explained by the first linear principal component (Hsieh, 2001). Peculiarly, the original linear Principal Component 2 varies between -15 and 25, whereas its NLPCA approximation is confined within [-5, 15], as shown in Fig. 8 of Hsieh (2001). When the first three linear principal components of sea surface temperature over the tropical Pacific are approximated, although the original linear Principal Component 2 fluctuates between -60 and 60 and the original linear Principal Component 3 changes between -30 and 30, the corresponding NLPCA approximations are confined within [-30, 30], or half of the original variation range, and within [-10, 10], or only one third of the original variation range, respectively, as shown in Fig. 11 of Hsieh (2001). If nonlinear relations among variables of a dataset are significant, the principal component analysis will leave important information in higher order modes (Palus and Dvorak, 1992).
To account for the nonlinear relations, it is important to effectively approximate all nontrivial linear principal components. However, the NLPCA approximations of the linear principal components could be confined within variation ranges much smaller than those of the original linear principal components. The confinement and the concomitant distortion of the nontrivial linear principal components could lead to specious conclusions that would be different if the data were well simulated. Finding the cause of the confinement and a solution to the problem would facilitate the application of the NLPCA.

Another research avenue is to apply the compact NLPCA to more atmospheric and oceanic datasets. Can the compact NLPCA model applied to the 500 hPa geopotential heights over the Northern Hemisphere extra-tropics reproduce the result obtained by Monahan et al. (2001) that the low-frequency variability in the troposphere is characterized by three distinct quasi-stationary states? Since the compact NLPCA has effectively simulated the complicated cyclical QBO winds without assuming beforehand the existence of a quasi-periodic oscillation in the data, presumably it could detect the possible existence of planetary-scale quasi-periodic variability in the troposphere. The result may show whether planetary-scale cyclical wind structures are present only in the stratosphere or exist in the troposphere as well.

Since the flexibility of NLPCA can allow a more exhaustive search of dynamical patterns of variability in geophysical fields, one can also consider the application of NLPCA to Global Climate Models. Since these models are used extensively to make projections about climate change, it is imperative to know how well they are able to simulate climate variability. Because of its nonlinear capabilities, NLPCA can be a more discriminating tool to assess the modelling abilities of various Global Climate Models.

Bibliography

An, S.-I., W. W. Hsieh, and F.-F. Jin, 2005: A nonlinear analysis of the ENSO cycle and its interdecadal changes. Journal of Climate, 18, 3229–3239.
Andrews, D. G., J. R. Holton, and C. B. Leovy, 1987: Middle Atmosphere Dynamics. Academic Press, San Diego, CA, 489 pp.
Baldi, P. and K. Hornik, 1989: Neural network and principal component analysis: Learning from examples without local minima. Neural Networks, 2, 53–58.
Baldwin, M. P. and T. J. Dunkerton, 1998: Biennial, quasi-biennial, and decadal oscillations of potential vorticity in the northern stratosphere. Journal of Geophysical Research, 103, 3919–3928.
Baldwin, M. P. and T. J. Dunkerton, 1999: Propagation of the Arctic Oscillation from the stratosphere to the troposphere. Journal of Geophysical Research, 104, 30,937–30,946.
Baldwin, M. P. and T. J. Dunkerton, 2001: Stratospheric harbingers of anomalous weather regimes. Science, 294, 581–584.
Baldwin, M. P. and T. J. Dunkerton, 2005: The solar cycle and stratosphere-troposphere dynamical coupling. Journal of Atmospheric and Solar-Terrestrial Physics, 67, 71–82.
Baldwin, M. P., L. J. Gray, T. J. Dunkerton, K. Hamilton, P. H. Haynes, W. J. Randel, J. R. Holton, M. J. Alexander, I. Hirota, T. Horinouchi, D. B. A. Jones, J. S. Kinnersley, C. Marquardt, K. Sato, and M. Takahashi, 2001: The Quasi-Biennial Oscillation. Reviews of Geophysics, 39, 179–229.
Baldwin, M. P. and K. K. Tung, 1994: Extra-tropical QBO signals in angular-momentum and wave forcing. Geophysical Research Letters, 21, 2717–2720.
Bishop, C. M., 1995: Neural Networks for Pattern Recognition. Oxford: Oxford University Press, 504 pp.
Bourlard, H. and Y. Kamp, 1988: Auto-association by multilayer perceptrons and singular value decomposition. Biological Cybernetics, 59, 291–294.
Broyden, C. G., 1970: The convergence of a class of double-rank minimization algorithms: 1. General considerations. IMA Journal of Applied Mathematics, 6, 76–90.
Cai, M., 2003: Potential vorticity intrusion index and climate variability of surface temperature. Geophysical Research Letters, 30, 19–1–19–4.
Charney, J. G. and P. G. Drazin, 1961: Propagation of planetary-scale disturbances from the lower into the upper atmosphere. Journal of Geophysical Research, 66, 83–109.
Coughlin, K. and K. K. Tung, 2001: QBO signal found at the extratropical surface through northern annular modes. Geophysical Research Letters, 28, 4563–4566.
Coughlin, K. and K. K. Tung, 2005: Tropospheric wave response to decelerated stratosphere seen as downward propagation in northern annular mode. Journal of Geophysical Research, 110, D01103, 1–9.
Cybenko, G., 1988: Continuous valued neural networks with two hidden layers are sufficient. Technical report, Department of Computer Science, Tufts University, Medford, MA.
Cybenko, G., 1989: Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 2, 303–314.
Dunkerton, T. J., 1983: Laterally-propagating Rossby waves in the easterly acceleration phase of the Quasi-Biennial Oscillation. Atmosphere-Ocean, 21, 55–68.
Dunkerton, T. J., 1990: Annual variation of deseasonalized mean flow acceleration in the equatorial lower stratosphere. Journal of the Meteorological Society of Japan, 68, 499–508.
Dunkerton, T. J. and M. P. Baldwin, 1991: Quasi-biennial modulation of planetary-wave fluxes in the northern hemisphere winter. Journal of the Atmospheric Sciences, 48, 1043–1061.
Dunkerton, T. J. and D. P. Delisi, 1985: Climatology of the equatorial lower stratosphere. Journal of the Atmospheric Sciences, 42, 376–396.
Dunkerton, T. J., D. P. Delisi, and M. P. Baldwin, 1988: Distribution of major stratospheric warmings in relation to the Quasi-Biennial Oscillation. Geophysical Research Letters, 15, 136–139.
Edmon, H. J., B. J. Hoskins, and M. E. McIntyre, 1980: Eliassen-Palm cross-sections for the troposphere. Journal of the Atmospheric Sciences, 37, 2600–2616.
Eliassen, A. and E. E. Palm, 1961: On the transport of energy in stationary mountain waves. Geophysical Publications, 22, 1–23.
Fletcher, R., 1970: A new approach to variable metric algorithms. Computer Journal, 13, 317–322.
Fraedrich, K., S. Pawson, and R. Wang, 1993: An EOF analysis of the vertical-time delay structure of the Quasi-Biennial Oscillation. Journal of the Atmospheric Sciences, 50, 3357–3365.
Glover, F. and M. Laguna, 1998: Tabu Search. Kluwer Academic Publishers, 408 pp.
Goldfarb, D., 1970: A family of variable-metric methods derived by variational means. Mathematics of Computation, 24, 23–26.
Hamilton, K., 1981: The vertical structure of the Quasi-Biennial Oscillation: Observations and theory. Atmosphere-Ocean, 19, 236–250.
Hamilton, K., 1998: Effects of an imposed quasi-biennial oscillation in a comprehensive troposphere-stratosphere-mesosphere general circulation model. Journal of the Atmospheric Sciences, 55, 2393–2418.
Hamilton, K., 2002: On the quasi-decadal modulation of the stratospheric QBO period. Journal of Climate, 15, 2562–2565.
Hamilton, K. and W. W. Hsieh, 2002: Representation of the quasi-biennial oscillation in the tropical stratospheric wind by nonlinear principal component analysis. Journal of Geophysical Research, 107, 3–1–3–10.
Hampson, J. and P. Haynes, 2004: Phase alignment of the tropical stratospheric QBO in the annual cycle. Journal of the Atmospheric Sciences, 61, 2627–2637.
Haynes, P. H. and M. E. McIntyre, 1987: On the representation of Rossby wave critical layers and wave breaking in zonally truncated models. Journal of the Atmospheric Sciences, 44, 2359–2382.
Hinton, G. E., 1989: Connectionist learning procedures. Artificial Intelligence, 40, 185–234.
Holton, J. R. and J. Austin, 1991: The influence of the equatorial QBO on sudden stratospheric warmings. Journal of the Atmospheric Sciences, 48, 607–618.
Holton, J. R. and C. Mass, 1976: Stratospheric vacillation cycles. Journal of the Atmospheric Sciences, 33, 2218–2225.
Holton, J. R. and H.-C. Tan, 1980: The influence of the equatorial Quasi-Biennial Oscillation on the global circulation at 50 mb. Journal of the Atmospheric Sciences, 37, 2200–2208.
Holton, J. R. and H.-C. Tan, 1982: The Quasi-Biennial Oscillation in the northern hemisphere lower stratosphere. Journal of the Meteorological Society of Japan, 60, 140–148.
Hornik, K., M. Stinchcombe, and H. White, 1989: Multilayer feed-forward networks are universal approximators. Neural Networks, 2, 359–366.
Hsieh, W. W., 2000: Nonlinear canonical correlation analysis by neural networks. Neural Networks, 13, 1095–1105.
Hsieh, W. W., 2001: Nonlinear principal component analysis by neural network. Tellus, 53A, 599–615.
Hsieh, W. W., 2004: Nonlinear multivariate and time series analysis by neural network methods. Reviews of Geophysics, 42, RG1003, 1–25.
Hsieh, W. W. and K. Hamilton, 2003: Nonlinear singular spectrum analysis of the tropical stratospheric wind. Quarterly Journal of the Royal Meteorological Society, 129, 1–17.
Hsieh, W. W. and B. Tang, 1998: Applying neural network models to prediction and data analysis in meteorology and oceanography. Bulletin of the American Meteorological Society, 79, 1855–1870.
Hsieh, W. W. and A. Wu, 2002: Nonlinear multichannel singular spectrum analysis of the tropical Pacific climate variability using a neural network approach. Journal of Geophysical Research, 107(C7), 3076, DOI:10.1029/2001JC000957.
Jolliffe, I. T., 2002: Principal Component Analysis. Springer–Verlag, New York, 502 pp.
Kirby, M. J. and R. Miranda, 1996: Circular nodes in neural networks. Neural Computation, 8, 390–402.
Kodera, K., M. Chiba, H. Koide, A. Kitoh, and Y. Nikaidou, 1996: Interannual variability of the winter stratosphere and troposphere in the northern hemisphere. Journal of the Meteorological Society of Japan, 74, 365–382.
Kramer, M. A., 1991: Nonlinear principal component analysis using autoassociative neural networks. American Institute of Chemical Engineers Journal, 37, 233–243.
Labitzke, K., 2004: On the signal of the 11-year sunspot cycle in the stratosphere and its modulation by the quasi-biennial oscillation. Journal of Atmospheric and Solar-Terrestrial Physics, 66, 1151–1157.
Labitzke, K., 2005: On the solar cycle-QBO relationship: a summary. Journal of Atmospheric and Solar-Terrestrial Physics, 67, 45–54.
Li, S., W. W. Hsieh, and A. Wu, 2005: Hybrid coupled modeling of the tropical Pacific using neural networks. Journal of Geophysical Research, 110, C09024:1–12.
Lorenz, E. N., 1963: Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20, 130–141.
Marquardt, C. and B. Naujokat, 1997: An update of the equatorial QBO and its variability. World Meteorological Organization Technical Document No. 814, 1, 87–90.
Maruyama, T. and Y. Tsuneoka, 1988: Anomalously short duration of easterly wind phase of the QBO at 50 hPa in 1987 and its relationship to an El Nino event. Journal of the Meteorological Society of Japan, 66, 629–633.
Matsuno, T., 1971: A dynamical model of the stratospheric sudden warming. Journal of the Atmospheric Sciences, 28, 1479–1494.
McIntyre, M. E. and T. N. Palmer, 1983: Breaking planetary waves in the stratosphere. Nature, 305, 593–600.
McIntyre, M. E. and T. N. Palmer, 1984: The ‘surf zone’ in the stratosphere. Journal of Atmospheric and Terrestrial Physics, 46, 825–849.
Monahan, A. H., 2000: Nonlinear principal component analysis by neural networks: Theory and application to the Lorenz system. Journal of Climate, 13, 821–835.
Monahan, A. H., 2001: Nonlinear principal component analysis: Tropical Indo-Pacific sea surface temperature and sea level pressure. Journal of Climate, 14, 219–233.
Monahan, A. H. and J. C. Fyfe, 2007: Comment on “The shortcomings of nonlinear principal component analysis in identifying circulation regimes”. Journal of Climate, 20, 375–377.
Monahan, A. H., J. C. Fyfe, and G. M. Flato, 2000: A regime view of northern hemisphere atmospheric variability and change under global warming. Geophysical Research Letters, 27, 1139–1142.
Monahan, A. H., J. C. Fyfe, and L. Pandolfo, 2003: The vertical structure of wintertime climate regimes of the northern hemisphere extratropical atmosphere. Journal of Climate, 16, 2005–2021.
Monahan, A. H., L. Pandolfo, and J. C. Fyfe, 2001: The preferred structure of variability of the northern hemisphere atmospheric circulation. Geophysical Research Letters, 28, 1019–1022.
Naujokat, B., 1986: An update of the observed quasi-biennial oscillation of the stratospheric winds over the tropics. Journal of the Atmospheric Sciences, 43, 1873–1877.
Ortland, D. A., 1997: Rossby wave propagation into the tropical stratosphere observed by the High Resolution Doppler Imager. Geophysical Research Letters, 24, 1999–2002.
O’Sullivan, D., 1997: Interaction of extratropical Rossby waves with westerly quasi-biennial oscillation winds. Journal of Geophysical Research, 102, 19,461–19,469.
O’Sullivan, D. and M. L. Salby, 1990: Coupling of the quasi-biennial oscillation and the extratropical circulation in the stratosphere through planetary wave transport. Journal of the Atmospheric Sciences, 47, 650–673.
Palus, M. and I. Dvorak, 1992: Singular-value decomposition in attractor reconstruction: Pitfalls and precautions. Physica D: Nonlinear Phenomena, 55, 221–234.
Pearson, K., 1901: On lines and planes of closest fit to systems of points in space. Philosophical Magazine, Series 6, 2, 559–572.
Perlwitz, J. and N. Harnik, 2003: Observational evidence of a stratospheric influence on the troposphere by planetary wave reflection. Journal of Climate, 16, 3011–3026.
Plumb, R. A., 1977: Interaction of 2 internal waves with mean flow - implications for theory of Quasi-Biennial Oscillation. Journal of the Atmospheric Sciences, 34, 1847–1858.
Quiroz, R. S., 1981: Period modulation of the stratospheric Quasi-Biennial Oscillation. Monthly Weather Review, 109, 665–674.
Rattan, S. S. P. and W. W. Hsieh, 2004: Nonlinear complex principal component analysis of the tropical Pacific interannual wind variability. Geophysical Research Letters, 31, L21201, 1–4.
Rattan, S. S. P. and W. W. Hsieh, 2005: Complex-valued neural networks for nonlinear complex principal component analysis. Neural Networks, 18, 61–69.
Salby, M. and P. Callaghan, 2000: Connection between the solar cycle and the QBO: The missing link. Journal of Climate, 13, 328–338.
Salby, M., P. Callaghan, and D. Shea, 1997: Interdependence of the tropical and extratropical QBO: Relationship to the solar cycle versus a biennial oscillation in the stratosphere. Journal of Geophysical Research, 102, 29,789–29,798.
Scott, R. K. and P. H. Haynes, 1998: Internal interannual variability of the extratropical stratospheric circulation: The low-latitude flywheel. Quarterly Journal of the Royal Meteorological Society, 124, 2149–2173.
Shanno, D. F., 1970: Conditioning of quasi-Newton methods for function minimization. Mathematics of Computation, 24, 647–656.
Shindell, D., D. Rind, N. Balachandran, J. Lean, and P. Lonergan, 1999: Solar cycle variability, ozone, and climate. Science, 284, 305–308.
Tang, Y. and W. W. Hsieh, 2003: Nonlinear modes of decadal and interannual variability of the subsurface thermal structure in the Pacific Ocean. Journal of Geophysical Research, 108, 29–1–29–12.
Thompson, D. W. J., M. P. Baldwin, and J. M. Wallace, 2002: Stratospheric connection to northern hemisphere wintertime weather: Implications for prediction. Journal of Climate, 15, 1421–1428.
Thompson, D. W. J. and J. M. Wallace, 1998: The Arctic Oscillation signature in the wintertime geopotential height and temperature fields. Geophysical Research Letters, 25, 1297–1300.
Thompson, D. W. J. and J. M. Wallace, 2000: Annular modes in the extratropical circulation. Part I: Month-to-month variability. Journal of Climate, 13, 1000–1016.
Thompson, D. W. J. and J. M. Wallace, 2001: Regional climate impacts of the northern hemisphere annular mode. Science, 293, 85–89.
Tung, K. K. and H. Yang, 1994a: Global QBO in circulation and ozone. Part I: Reexamination of observational evidence. Journal of the Atmospheric Sciences, 51, 2699–2707.
Tung, K. K. and H. Yang, 1994b: Global QBO in circulation and ozone. Part II: A simple mechanistic model. Journal of the Atmospheric Sciences, 51, 2708–2721.
von Storch, H. and F. W. Zwiers, 2002: Statistical Analysis in Climate Research. Cambridge: Cambridge University Press, 494 pp.
Wallace, J. M., R. L. Panetta, and J. Estberg, 1993: Representation of the equatorial stratospheric Quasi-Biennial Oscillation in EOF phase space. Journal of the Atmospheric Sciences, 50, 1751–1762.
Wallace, J. M. and D. W. J. Thompson, 2002: Annular modes and climate prediction. Physics Today, 55, 28–33.
Wang, R., K. Fraedrich, and S. Pawson, 1995: Phase-space characteristics of the tropical stratospheric Quasi-Biennial Oscillation. Journal of the Atmospheric Sciences, 52, 4482–4500.
Webb, A. R. and D. Lowe, 1988: A hybrid optimization strategy for adaptive feed-forward layered networks. RSRE Memorandum 4193, Royal Signals and Radar Establishment, St. Andrews Road, Malvern, UK, 22 pp.
Wu, A., W. W. Hsieh, and A. Shabbar, 2002: Nonlinear characteristics of the surface air temperature over Canada. Journal of Geophysical Research, 107, 8–1–8–15.
Wu, A., W. W. Hsieh, and F. W. Zwiers, 2003: Nonlinear modes of North American winter climate variability derived from a general circulation model simulation. Journal of Climate, 16, 2325–2339.
Ye, Z. and W. W. Hsieh, 2006: The influence of climate regime shift on ENSO. Climate Dynamics, 26, 823–833.
Yoden, S., 1990: An illustrative model of seasonal and interannual variations of the stratospheric circulation. Journal of the Atmospheric Sciences, 47, 1845–1853.

Appendix A  Strategy of Nonlinear Optimization

The solution of the model, that is, the determination of the weights and biases of the neural network, relies on an optimization procedure. This procedure involves minimizing a scalar objective function of the model. The mean square error of the output relative to the target (which is also the input) signals is adopted as the objective function of the NLPCA model to be minimized for an optimized solution (Kramer, 1991; Monahan, 2000; Hsieh, 2001, 2004).
In this analysis, the minimization is performed using a hybrid procedure (Webb and Lowe, 1988) that combines the quasi-Newton method with the least squares method, which is possible because the transfer from the decoding signals to the output signals of the neural network is linear. The quasi-Newton method, with a mixed quadratic and cubic line search and the BFGS formula for updating the approximation of the Hessian matrix (Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970), is used to minimize the mean square error with respect to the weights and biases of the hidden layers’ neurons. Each time these weights and biases are updated, and the signals from the decoding neurons change accordingly, a one-step exact minimization of the mean square error with respect to the weights and biases of the output neurons is obtained using the least squares method.

The hybrid procedure has two obvious advantages over the full nonlinear optimization of the entire neural network. Firstly, the computational time of the least squares method is usually much less than that required for general nonlinear optimizations. The hybrid procedure therefore reaches an optimized solution much faster than the full nonlinear optimization of the entire neural network. Secondly, the exact minimization of the mean square error with respect to the weights and biases of the output neurons improves the convergence of the resultant mean square errors toward the unique solutions.

The quasi-Newton method is an advanced and fast algorithm for training feed-forward neural networks (Bishop, 1995), but it stops at the first minimum of the objective function it finds. The mean square error function of a nonlinear neural network usually has more than one minimum. In an effort to find the lowest or global minimum, multiple runs with random initial values of the weights and biases of the hidden layers’ neurons are carried out. Each set of the random initial values for a model run has a normal distribution with mean zero and variance one.

A model run is defined as follows. After a minimum of the mean square error is found, the weights and biases of the hidden layers’ neurons are adjusted by adding to them another set of random values, which has a normal distribution with mean zero but variance 0.01, and a new minimum is searched for (Glover and Laguna, 1998). If the new minimum is lower, the newly optimized values of the weights and biases of the neural network become the current ones. Otherwise, the current values remain unchanged. The procedure is repeated until no lower mean square error has been found for several consecutive adjustments of the parameters. After the multiple model runs have been carried out, the weights and biases of the model run with the lowest mean square error of model output to target are taken to construct the model solution.
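
The search strategy described in this appendix (random initial values, a local minimization, small random adjustments that are kept only if they lead to a lower minimum, and several independent runs) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the code used in this study: the function `objective` is a hypothetical toy function with several minima standing in for the network's mean square error, plain gradient descent stands in for the quasi-Newton (BFGS) search, the least squares step for the output layer is omitted, and the names `local_minimize`, `model_run`, `solve`, the `patience` of 5 adjustments and the number of runs are all illustrative choices.

```python
import math
import random


def objective(w):
    """Hypothetical toy error surface with several local minima (not the NLPCA error)."""
    return sum(x * x + 2.0 * math.sin(3.0 * x) for x in w)


def gradient(w):
    """Analytic gradient of the toy objective."""
    return [2.0 * x + 6.0 * math.cos(3.0 * x) for x in w]


def local_minimize(w, step=0.02, iters=2000):
    """Gradient descent to the nearest minimum; a stand-in for the quasi-Newton search."""
    w = list(w)
    for _ in range(iters):
        w = [x - step * g for x, g in zip(w, gradient(w))]
    return w, objective(w)


def model_run(rng, n_params, patience=5):
    """One model run: minimize, then perturb and re-minimize until no improvement."""
    # Initial values drawn from a normal distribution with mean 0 and variance 1.
    start = [rng.gauss(0.0, 1.0) for _ in range(n_params)]
    current, best = local_minimize(start)
    failures = 0
    while failures < patience:
        # Adjustment drawn from N(0, 0.01), i.e. a standard deviation of 0.1.
        trial_start = [x + rng.gauss(0.0, 0.1) for x in current]
        trial, err = local_minimize(trial_start)
        if err < best - 1e-12:
            current, best, failures = trial, err, 0  # keep the lower minimum
        else:
            failures += 1  # current values remain unchanged
    return current, best


def solve(n_runs=10, n_params=3, seed=0):
    """Carry out several model runs and return the one with the lowest error."""
    rng = random.Random(seed)
    runs = [model_run(rng, n_params) for _ in range(n_runs)]
    return min(runs, key=lambda run: run[1])


if __name__ == "__main__":
    weights, error = solve()
    print("lowest error found:", error)
    print("parameters:", weights)
```

In this sketch a perturbation is accepted only when the re-minimized error is strictly lower, so the error of a run never increases, and the final selection across runs mirrors the choice of the run with the lowest mean square error.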
