UBC Theses and Dissertations


Nonlinear principal component analysis of climate data Monahan, Adam Hugh 2000


Full Text

NONLINEAR PRINCIPAL COMPONENT ANALYSIS OF CLIMATE DATA

by

Adam Hugh Monahan

B.Sc. (Honours Physics) University of Calgary, 1993
M.Sc. (Physics) University of British Columbia, 1995

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES
EARTH AND OCEAN SCIENCES

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
February 2000
© Adam Hugh Monahan, 2000

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Earth and Ocean Sciences
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1Z1

Date:

Abstract

A nonlinear generalisation of Principal Component Analysis (PCA), denoted Nonlinear Principal Component Analysis (NLPCA), is introduced and applied to the analysis of climate data. This method is implemented using a 5-layer feed-forward neural network introduced originally in the chemical engineering literature. The method is described and details of its implementation are addressed. It is found empirically that NLPCA partitions variance in the same fashion as does PCA, that is, that the sum of the total variance of the NLPCA approximation with the total variance of the residual from the original data is equal to the total variance of the original data.
An important distinction is drawn between a modal P-dimensional NLPCA analysis, in which P successive 1D approximations are determined iteratively so that the approximation is the sum of P nonlinear functions of one variable, and a nonmodal analysis, in which the P-dimensional NLPCA approximation is determined as a nonlinear non-additive function of P variables.

Nonlinear Principal Component Analysis is first applied to a data set sampled from the Lorenz attractor. It is found that the NLPCA approximations are much more representative of the data than are the corresponding PCA approximations. In particular, the 1D and 2D NLPCA approximations explain 76% and 99.5% of the total variance, respectively, in contrast to 60% and 95% explained by the 1D and 2D PCA approximations.

When applied to a data set consisting of monthly-averaged tropical Pacific Ocean sea surface temperatures (SST), the modal 1D NLPCA approximation describes average variability associated with the El Niño/Southern Oscillation (ENSO) phenomenon, as does the 1D PCA approximation. The NLPCA approximation, however, characterises the asymmetry in spatial pattern of SST anomalies between average warm and cold events (manifested in the skewness of the distribution) in a manner that the PCA approximation cannot. The second NLPCA mode of SST is found to characterise differences in ENSO variability between individual events, and in particular is consistent with the celebrated 1977 "regime shift".
A 2D nonmodal NLPCA approximation is determined, the interpretation of which is complicated by the fact that a secondary feature extraction problem has to be carried out to interpret the results. It is found that this approximation contains much the same information as that provided by the modal analysis. A modal NLPC analysis of tropical Indo-Pacific sea level pressure (SLP) finds that the first mode describes average ENSO variability in this field, and also characterises an asymmetry in SLP fields between average warm and cold events. No robust nonlinear mode beyond the first could be found.

Nonlinear Principal Component Analysis is used to find the optimal nonlinear approximation to SLP data produced by a 1001 year integration of the Canadian Centre for Climate Modelling and Analysis (CCCma) coupled general circulation model (CGCM1). This approximation's associated time series is strongly bimodal and partitions the data into two distinct regimes. The first and more persistent regime describes a standing oscillation whose signature in the mid-troposphere is alternating amplification and attenuation of the climatological ridge over Northern Europe. The second and more episodic regime describes mid-tropospheric split-flow south of Greenland.
Essentially the same structure is found in the 1D NLPCA approximation of the 500mb height field itself. In a 500 year integration with atmospheric CO2 at four times pre-industrial concentrations, the occupation statistics of these preferred modes of variability change, such that the episodic split-flow regime occurs less frequently while the standing oscillation regime occurs more frequently.

Finally, a generalisation of Kramer's NLPCA using a 7-layer autoassociative neural network is introduced to address the inability of Kramer's original network to find P-dimensional structure topologically different from the unit cube in $\Re^P$. The example of an ellipse is considered, and it is shown that the approximation produced by the 7-layer network is a substantial improvement over that produced by the 5-layer network.
Table of Contents

Abstract
List of Tables
List of Figures
List of Acronyms
Acknowledgements
1  Introduction
2  Nonlinear Principal Component Analysis: Theory and Implementation
2.1  Introduction
2.2  Feature Extraction Problems
2.2.1  Principal Component Analysis
2.2.2  Nonlinear Principal Component Analysis
2.3  Implementation of NLPCA
2.4  Dynamical Significance of Low-Dimensional Approximations
3  Nonlinear Principal Component Analysis of the Lorenz Attractor
3.1  Introduction
3.2  Model Building
3.3  Results
3.4  Conclusion
4  Nonlinear Principal Component Analysis of Tropical Indo-Pacific Sea Surface Temperature and Sea Level Pressure
4.1  Introduction
4.2  Data and Model Building
4.3  Tropical Pacific Sea Surface Temperature
4.4  Tropical Indo-Pacific Sea Level Pressure
4.5  Conclusions
5  Nonlinear Principal Component Analysis of Northern Hemisphere Atmospheric Circulation Data
5.1  Introduction
5.2  Data and Model Building
5.3  Analysis of GCM Sea Level Pressure
5.4  Analysis of GCM 500mb Heights
5.5  Analysis of GCM SLP in a 4 x CO2 Integration
5.6  Conclusions
6  Seven-Layer Networks for Discontinuous Projection and Expansion Functions
6.1  Introduction
6.2  Neural Network Approximations to Discontinuous Functions
6.3  7-Layer NLPCA Network
6.4  Conclusions
7  Summary and Conclusions
7.1  Summary
7.2  Conclusions
Appendices
A  Neural Networks
B  Principal Curves and Surfaces
C  Symmetric and Anti-symmetric Components of Composites
Bibliography

List of Tables

Percentages of variance explained by the 1D and 2D NLPCA approximations to the Lorenz data for the three noise levels $\eta$ considered.

List of Figures

2.1  The 5-layer feed-forward autoassociative neural network used to perform NLPCA.
3.1  The Lorenz attractor, projected on the $(x_1, x_3)$, $(x_3, x_2)$, and $(x_2, x_1)$ planes.
3.2  As in Figure 3.1, for a subsample of 584 points.
3.3  Noise-free Lorenz data for a subsample of 584 points and their 1D PCA approximation, projected as in Figure 3.1 (note axes have been rescaled). The dots represent the original data points, the open circles represent points of the approximation.
3.4  As in Figure 3.3, but for the 1D NLPCA approximation.
3.5  Noise-free Lorenz data for a subsample of 584 points and their 2D PCA approximation. The dots represent the original data points, and the open circles the points of the approximation.
3.6  As in Figure 3.5, but for the nonmodal 2D NLPCA approximation.
3.7  Lorenz data with noise level $\eta = 2.0$ for a subsample of 584 points and its 1D NLPCA approximation. Dots represent the data points and the open circles represent points of the approximation.
3.8  As in Figure 3.7, but with the 2D nonmodal NLPCA approximation of the Lorenz data with noise level $\eta = 2.0$.
3.9  As in Figure 3.7, for Lorenz data with noise level $\eta = 5.0$.
4.1  Spatial patterns of the first three SSTA EOF patterns, normalised to unit magnitude.
The contour interval is 0.02, the zero contour is in bold, and negative contours are dashed.
4.2  Scatterplot of SSTA data projected onto the plane spanned by the first two EOF patterns.
4.3  Scatterplot of SSTA data (points) and SSTA NLPCA mode 1 approximation (open circles) projected onto the planes spanned by EOFs (a) $(\mathbf{e}_1, \mathbf{e}_2)$, (b) $(\mathbf{e}_2, \mathbf{e}_3)$, (c) $(\mathbf{e}_1, \mathbf{e}_3)$. (d) shows a scatterplot of the 1D NLPCA approximation projected into the subspace $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$.
4.4  (a) Plot of $\alpha_1(t_n) = s_f(\mathbf{X}(t_n))$, the standardised time series associated with SSTA NLPCA mode 1. (b) Plot of the Niño 3.4 index normalised to unit variance.
4.5  Sequence of spatial maps characterising SSTA NLPCA mode 1 for (a) $\alpha_1 = -3.5$ (b) $\alpha_1 = -1.5$ (c) $\alpha_1 = -0.75$ (d) $\alpha_1 = -0.25$ (e) $\alpha_1 = 0.25$ (f) $\alpha_1 = 0.75$ (g) $\alpha_1 = 1.5$ (h) $\alpha_1 = 3.5$. Contour interval: 0.5°C.
4.6  SSTA composite maps for "average" (a) El Niño and (b) La Niña events. Contour interval: 0.5°C. (c) Symmetric component of composites (a) and (b). Contour interval: 0.1°C.
See text for definition of composites and of the symmetric component.
4.7  Map of pointwise correlation coefficient between observed SSTA and (a) 1D NLPCA approximation, (b) 1D PCA approximation, and (c) difference between (a) and (b).
4.8  As for Figure 4.3, but for SSTA NLPCA mode 2.
4.9  As for Figure 4.4(a), but for $\alpha_2(t_n)$, the time series corresponding to SSTA NLPCA mode 2.
4.10  Maps corresponding to SSTA NLPCA mode 2 for (a) $\alpha_2 = -4$ (b) $\alpha_2 = -1$ (c) $\alpha_2 = -0.25$ (d) $\alpha_2 = 0$ (e) $\alpha_2 = 0.15$ (f) $\alpha_2 = 0.25$ (g) $\alpha_2 = 0.3$ (h) $\alpha_2 = 0.35$ (i) $\alpha_2 = 0.4$ (j) $\alpha_2 = 0.5$ (k) $\alpha_2 = 0.75$ (l) $\alpha_2 = 1.5$. Contour interval: 0.5°C.
4.11  Composites of SSTA for La Niña events (a) before 1977 and (b) after 1977. Contour interval 0.5°C.
4.12  As for Figure 4.3, but for SSTA 2D nonmodal NLPCA approximation.
4.13  Time series (a) $\beta_1(t_n)$ and (b) $\beta_2(t_n)$, where $(\beta_1, \beta_2)(t_n) = \mathbf{s}_f(\mathbf{X}(t_n))$ is the pair of time series associated with the SSTA 2D nonmodal NLPCA approximation. Both time series have been normalised to unit variance.
4.14  Maps of pointwise correlation between observed SSTA and (a) 2D PCA approximation (b) 2D nonmodal NLPCA approximation, and (c) 2D modal NLPCA approximation.
4.15  As for Figure 4.3, but for SSTA 2D modal NLPCA approximation.
4.16  Spatial patterns of SLPA (a) EOF mode 1, (b) EOF mode 2, (c) EOF mode 3. The black dots in (a) designate the positions of Tahiti and Darwin, Australia.
4.17  As for Figure 4.3, but for SLPA NLPCA mode 1.
4.18  (a) Plot of $\alpha_1(t_n) = s_f(\mathbf{X}(t_n))$, the standardised time series associated with SLPA NLPCA mode 1.
(b) Plot of 5-month running mean of SOI, standardised to unit variance.
4.19  Plot of a sequence of spatial maps characterising SLPA NLPCA mode 1 for (a) $\alpha_1 = -3$ (b) $\alpha_1 = -2$ (c) $\alpha_1 = -1$ (d) $\alpha_1 = -0.5$ (e) $\alpha_1 = 0$ (f) $\alpha_1 = 0.5$ (g) $\alpha_1 = 1$ (h) $\alpha_1 = 2$. Contour interval: 0.5 hPa.
4.20  Composites of SLPA during (a) El Niño and (b) La Niña. Contour interval: 0.5 hPa.
4.21  Spatial pattern of pointwise correlation coefficient between SLPA and (a) 1D NLPCA approximation and (b) 1D PCA approximation.
4.22  SLPA NLPCA mode 2 (a) Spatial pattern (not normalised, units are hPa) and (b) time series (normalised to unit variance).
5.1  Spatial structure of the leading EOF pattern from observed SLP. Contour intervals are 1 hPa (...,-1.5,-0.5,0.5,1.5,...).
5.2  Spatial structure of the leading four EOF patterns from CCCma SLPA: (a) EOF 1, (b) EOF 2, (c) EOF 3, (d) EOF 4. These patterns explain 23.7%, 10.6%, 8.5%, and 6.5% of the variance in SLP, respectively. Negative contours are dashed. Contour intervals are 1 hPa (..., -1.5, -0.5, 0.5, ...).
5.3  Spatial structure of the leading four EOF patterns from CCCma Z500A: (a) EOF 1, (b) EOF 2, (c) EOF 3, (d) EOF 4. These patterns explain 19.6%, 12.5%, 9.3%, and 8.2% of the variance in Z500, respectively. Negative contours are dashed. Contour intervals are 10 m (..., -15, -5, 5, ...).
5.4  Scatterplot of the leading two SLPA PC time series, overlaid with a histogram estimate of the corresponding marginal probability density function. Contour intervals are $5 \times 10^{-4}$, $1.5 \times 10^{-3}$, $3 \times 10^{-3}$, $6 \times 10^{-3}$, $1 \times 10^{-2}$, $2 \times 10^{-2}$, $3 \times 10^{-2}$. The histogram bin size is 25 hPa in both directions.
5.5  1D NLPCA approximation $\hat{\mathbf{X}}$ of SLPA, projected in the space of the first two SLPA EOFs (open circles), overlaying histogram estimate of SLPA PDF as in Figure 5.4.
5.6  Plot of the 1D SLPA NLPCA time series $\alpha(t_n)$ (left) and the associated histogram estimate of the PDF (right).
5.7  As in Figure 5.5, but with the PDFs of the populations corresponding to Branch 1 (solid contours) and Branch 2 (dashed contours) plotted separately, and with a bin size of 20 hPa.
5.8  Composites of SLPA over characteristic ranges of $\alpha$. These ranges are indicated in parentheses below the maps, along with the number N of maps used in the composite. Contour interval is 2 hPa (...,-3,-1,1,3,...).
5.9  As with Figure 5.8, but for composites of Z500 anomalies. The contour interval is 20 m (...,-30,-10,10,30,...).
5.10  As with Figure 5.8, but for composites of Z500. The 5300 and 5500 m contours are in bold. Contour interval is 50 m.
5.11  Maps of spatial distribution of variance of SLP from (a) CCCma GCM and (b) observations (contour interval 10 (hPa)$^2$), and skewness of SLP from (c) CCCma GCM and (d) observations (contour interval 0.2).
5.12  Histogram estimates of 2D marginal PDFs of Z500A PCs (a) (PC1, PC2), (b) (PC1, PC3), (c) (PC2, PC3), and (d) (PC1, PC4). The contour intervals are as in Figure 5.4.
Bin sizes are 2500 m.
5.13  Projection of the Z500A data (dots) and its 1D NLPCA approximation (open circles) onto the spaces spanned by EOFs (a) $(\mathbf{e}_1, \mathbf{e}_2)$, (b) $(\mathbf{e}_1, \mathbf{e}_3)$, (c) $(\mathbf{e}_2, \mathbf{e}_3)$, and (d) $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$.
5.14  Plot of the 1D Z500A time series $\alpha(t_n)$ (left) and the associated histogram estimate of the PDF (right).
5.15  As in Figure 5.9, but for Z500A composited over characteristic ranges of $\alpha$ associated with the 1D Z500A NLPCA approximation. Contour interval is 20 m (...,-30,-10,10,30,...).
5.16  Scatterplot of the time series $\alpha(t_n)$ corresponding to the 1D SLPA and Z500A NLPCA approximations.
5.17  Maps of variance (a) and skewness (b) of CCCma modelled Z500. Contour interval in (a) is 500 m$^2$ and in (b) is 0.2.
5.18  As with Figure 5.15, but for composites of Z500. The 5300 and 5500 m contours are in bold. Contour interval is 50 m.
5.19  As in Figure 5.15, but for SLPA. Contour interval is 2 hPa (...,-3,-1,1,3,...).
5.20  Histogram estimate of the marginal probability density function (contours) of SLPA from the GCM integration with CO2 concentrations at four times the pre-industrial value, in the space of the leading two control integration SLPA PCA modes, overlaid with the corresponding 1D NLPCA approximation (open circles). Contour intervals are $5 \times 10^{-4}$, $1.5 \times 10^{-3}$, $3 \times 10^{-3}$, $6 \times 10^{-3}$, $1 \times 10^{-2}$, $2 \times 10^{-2}$, $3 \times 10^{-2}$. The histogram bin size is 25 hPa in both directions.
5.21  Plot of the 1D NLPCA SLPA time series $\alpha(t_n)$ (left) and the associated histogram estimate of the PDF (right) for the GCM integration with CO2 concentration at four times the pre-industrial level.
5.22  As in Figure 5.8, but for SLPA in the GCM integration at four times the pre-industrial CO2 concentration. Contour interval is 2 hPa (...,-3,-1,1,3,...).
5.23  (a) Variance (contour interval 10 (hPa)$^2$) and (b) skewness (contour interval 0.2) of SLPA from GCM integration with quadrupled atmospheric CO2.
6.1  Plot of $Y(t_n)$ (diamonds) and $X(t_n)$ (open circles) as defined by equation (6.3) with $N = 50$.
6.2  Neural network approximations of the functional relationship between data sets $X(t_n)$ and $Y(t_n)$ defined in equation (6.3): (a) network with one hidden layer, (b) network with two hidden layers.
6.3  Results of 1D NLPC analysis of an ellipse using a 5-layer autoassociative neural network: (a) NLPCA approximation $\hat{\mathbf{X}}(t_n)$, (b) associated time series $\alpha(t_n) = s_f(\mathbf{X}(t_n))$ (note scale on y-axis is arbitrary).
6.4  As in Figure 6.3, but for NLPCA performed using a 7-layer autoassociative neural network.
A.1  Diagrammatic representation of neural network with input data X and output data Z.

List of Acronyms

AAO  Antarctic Oscillation
AO  Arctic Oscillation
CCCma  Canadian Centre for Climate Modelling and Analysis
EOF  Empirical Orthogonal Function
ENSO  El Niño/Southern Oscillation
FEV  Fraction of Explained Variance
FUV  Fraction of Unexplained Variance
GCM  General Circulation Model
NAO  North Atlantic Oscillation
NLPCA  Nonlinear Principal Component Analysis
NMSD  Normalised Mean Square Distance
PCA  Principal Component Analysis
PCS  Principal Curves and Surfaces
PDF  Probability Density Function
PNA  Pacific-North America
RPCA  Rotated Principal Component Analysis
SLP(A)  Sea Level Pressure (Anomaly)
SST(A)  Sea Surface Temperature (Anomaly)
Z500(A)  500mb geopotential height (Anomaly)

CALVIN: I used to hate writing assignments, but now I enjoy them. I realized that the purpose of writing is to inflate weak ideas, obscure poor reasoning and inhibit clarity. With a little practice writing can be an intimidating and impenetrable fog! Want to see my book report?
HOBBES (reading): "The Dynamics of Interbeing and Monological Imperatives in 'Dick and Jane': A Study in Psychic Transnational Modes."
CALVIN: Academia, here I come!
Bill Watterson

Acknowledgements

I would like to thank my supervisors Dr. William Hsieh and Dr. Lionel Pandolfo for their help and guidance during the course of this work. Thanks are also due to Dr. John Fyfe and Dr. Greg Flato, who collaborated on much of the work presented in Chapter 5. As well, I would like to acknowledge Dr. Benyang Tang, Dr. Susan Allen, Dr. Phil Austin, Dr. Cindy Greenwood, and Dr. Roland Stull for suggestions and advice. None of this work could have been accomplished without the computer support provided by Denis Laplante. Merci. My colleagues, past and present, improved this work with useful comments and suggestions. Thank you, Steven Bograd, Ana Carrasco, Pal Isachsen, Youmin Tang, Fredolin Tangang, and Yuval. I am grateful to the National Science and Engineering Research Council, the Crisis Points Group of the Peter Wall Institute for Advanced Study, and the University of British Columbia for financially supporting me over the course of this work.
Tom Waits, John Zorn, and Frank Zappa have helped me maintain what has passed for my sanity over the last four years. Gentlemen, I am in your debt. Most of all, I would like to express my love and gratitude to my friends and family. Thank you Drew, Michelle, James, Anna, Jenn, Jono, and Alison; thank you Mom, Dad, Erin, Sarah, and Darren. Thanks for putting up with me, and thanks for dragging me away from my computer every now and again.

Chapter 1

Introduction

Early work in the field of climate variability involved the consideration of climatic variables averaged over very large spatial and temporal scales. These variables could in consequence be represented in phase spaces of a small number of dimensions. However, the range of spatial and temporal scales under consideration has broadened substantially over the last few decades, so that data sets considered in contemporary research may involve phase spaces with hundreds to thousands of dimensions. While this refinement of scales allows consideration of a richer class of physical phenomena, it also confounds attempts at reaching an holistic understanding of the data under consideration. A typical modern climatic dataset is overwhelming in the amount of information it contains, so statistical techniques to distill massive multivariate datasets down to a phase space of smaller dimension, characterising the essential information contained in the data, are of great importance. In other words, it is necessary to develop methods to extract the signal from noise in climate data. These methods may be described as belonging to the general class of feature extraction problems, which attempt to characterise lower-dimensional structure in large multivariate datasets.

One of these feature extraction methods, Principal Component Analysis (PCA), also known as Empirical Orthogonal Function (EOF) analysis, has been widely used in oceanography and meteorology since its introduction to these fields by Lorenz (1956).
PCA is an objective technique used to detect and characterise optimal lower-dimensional linear structure in a multivariate data set, and it is one of the most important methods in the geostatistician's multivariate statistics tool-box. Consequently, it has been well-studied, and standard references exist describing the method and its implementation (Preisendorfer, 1988; Wilks, 1995; von Storch and Zwiers, 1999). Its applications include reduction of data dimensionality for data interpretation (e.g. Barnston and Livezey, 1987; Miller et al., 1997) and for forecasting (e.g. Barnston and Ropelewski, 1992; Tangang et al., 1998). Furthermore, the connection between the results of PCA, which are statistical in nature, and the underlying dynamics of the system under consideration are understood in some detail (North, 1984; Mo and Ghil, 1987).

By construction, PCA finds a lower-dimensional hyperplane which optimally characterises the data, such that the sum of squares of orthogonal deviations of the data points from the hyperplane is minimised. If the structure of the data is inherently linear (for example, if the underlying distribution is Gaussian), then PCA is an optimal feature extraction algorithm; however, if the data contain nonlinear lower-dimensional structure, it will not be detectable by PCA. An example of a data set with nonlinear low-dimensional structure is the noisy parabola

$$\mathbf{X}(t_n) = (t_n, t_n^2)^T + \boldsymbol{\epsilon}(t_n) \qquad (1.1)$$

where $\boldsymbol{\epsilon}(t_n)$ is a 2D iid $\mathcal{N}(\mathbf{0}, \Sigma^2)$ noise process. Underlying this 2D data set is a 1D parabolic curve. Because this curve cannot be described by a single straight line, a 1D PCA approximation would be unable to characterise this underlying 1D structure.
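The noisy parabola of equation (1.1) is easy to reproduce numerically. The following is a minimal sketch, not the thesis's own code: the sample size, the range of $t_n$, the isotropic noise level `noise_std`, and the helper name `fraction_unexplained` are all illustrative assumptions. It compares the residual left by the best-fitting straight line (the 1D PCA approximation) with the residual about the true parabola, which is known here only because the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy parabola of equation (1.1): X(t_n) = (t_n, t_n^2)^T + eps(t_n).
# N, the range of t, and the noise standard deviation are illustrative choices.
N = 500
t = np.linspace(-1.0, 1.0, N)
noise_std = 0.05
X = np.column_stack([t, t**2]) + noise_std * rng.standard_normal((N, 2))

# 1D PCA approximation: project the centred data onto the leading EOF.
mean = X.mean(axis=0)
Xc = X - mean
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
e1 = Vt[0]
X_pca = np.outer(Xc @ e1, e1) + mean

# Reference 1D curve (available only because the data are synthetic).
X_curve = np.column_stack([t, t**2])


def fraction_unexplained(X, X_hat):
    """Mean squared residual divided by the total variance of X."""
    total_var = ((X - X.mean(axis=0)) ** 2).sum(axis=1).mean()
    return ((X - X_hat) ** 2).sum(axis=1).mean() / total_var


fuv_pca = fraction_unexplained(X, X_pca)
fuv_curve = fraction_unexplained(X, X_curve)
print(f"fraction of variance unexplained, 1D PCA line: {fuv_pca:.3f}")
print(f"fraction of variance unexplained, true parabola: {fuv_curve:.3f}")
```

Under these assumptions the straight line leaves a residual dominated by the curvature of the parabola, while the residual about the curve itself is only the added noise.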
In the early 1990s, a neural-network based generalisation of PCA to the nonlinear feature extraction problem was introduced in the chemical engineering literature by Kramer (1991), who referred to the resulting technique as Nonlinear Principal Component Analysis (NLPCA). Another solution to this problem, coming from the statistics community, was put forward independently by Hastie and Stuetzle (1989), who named their method Principal Curves and Surfaces (PCS). Recently, Malthouse (1998) demonstrated that NLPCA and PCS are closely related, and are, for a broad class of situations, essentially the same. Kramer's NLPCA has been applied to problems in chemical engineering (Kramer, 1991; Dong and McAvoy, 1996), psychology (Fotheringhame and Baddeley, 1997; Takane, 1998), dynamical systems theory (Kirby and Miranda, 1994; Kirby and Miranda, 1999), biomedical signal processing (Stamkopoulos et al., 1998), satellite remote sensing (Del Frate and Schiavon, 1999), and image compression (De Mers and Cottrell, 1993), but apart from a single unpublished report by Sengupta and Boyle (1995), the results of which were equivocal, it has not been applied to the large multivariate datasets common in oceanic and atmospheric sciences.

The object of this thesis is to investigate the application of NLPCA to climatic data sets. A brief review of PCA and a description of NLPCA are given in Chapter 2. As well, Chapter 2 contains a discussion of subtleties in the implementation of NLPCA. In particular, it addresses the problem of overfitting and presents the approach adopted for its avoidance. In Chapter 3, NLPCA is first applied to a synthetic data set sampled from the Lorenz attractor (Lorenz, 1963). Synthetic data are used to develop intuition about the implementation of NLPCA and its results. As well, the addition of noise to this synthetic data allows an investigation of the ability of this statistical tool to extract structure from noisy data.
The application of NLPCA to actual climate data is investigated first in Chapter 4, in which the low-dimensional nonlinear structure of tropical Pacific Ocean sea surface temperatures (SST) and tropical Indo-Pacific sea level pressure (SLP) is investigated. Chapter 5 contains the results of an analysis of Northern Hemisphere extratropical SLP and 500mb geopotential height from the Canadian Centre for Climate Modelling and Analysis (CCCma) coupled general circulation model (GCM). Chapter 6 discusses a generalisation of Kramer's NLPCA to a 7-layer autoassociative neural network, which can solve a broader class of feature extraction problems than can the original 5-layer network. Conclusions are presented in Chapter 7. Appendix A describes feed-forward neural networks and Appendix B briefly introduces Principal Curves. Appendix C discusses the symmetric and antisymmetric components of composites, an idea introduced in Chapter 4.

Chapter 2

Nonlinear Principal Component Analysis: Theory and Implementation

2.1  Introduction

Both traditional Principal Component Analysis (PCA) and Nonlinear Principal Component Analysis (NLPCA) are examples of feature extraction problems, the goal of which is to extract from multivariate data sets representative structures of lower dimension. However, unlike PCA, NLPCA does not admit an analytic solution. Its implementation involves the iterative solution of a variational problem, which must be approached with care. This chapter presents PCA and NLPCA as feature extraction problems, and then addresses the methodology adopted for the implementation of NLPCA.
2.2  Feature Extraction Problems

Denote by $\mathbf{X}(t_n) \in \Re^M$ a typical meteorological or oceanographic data set, where $n \in (1, N)$ labels observation times and the individual components of the vector $\mathbf{X}(t_n)$ correspond to individual observing stations. It is usually the case that the field values at different stations do not evolve independently in time, that is, that the temporal variability of $\mathbf{X}(t_n)$ includes contributions from large-scale, spatially-coherent features. In such a circumstance, the data will not be scattered evenly through the M-dimensional phase space of stations, but will tend to cluster around lower-dimensional surfaces; it is then appropriate to describe $\mathbf{X}(t_n)$ by the model:

$$\mathbf{X}(t_n) = (\mathbf{f} \circ \mathbf{s}_f)(\mathbf{X}(t_n)) + \boldsymbol{\epsilon}_n = \hat{\mathbf{X}}(t_n) + \boldsymbol{\epsilon}_n \qquad (2.1)$$

The function $\mathbf{s}_f : \Re^M \to \Re^P$, $1 \le P < M$, parameterises a manifold of dimensionality lower than that of $\mathbf{X}(t_n)$, $\mathbf{f} : \Re^P \to \Re^M$ is a smooth map from this manifold to the original space, and the $\boldsymbol{\epsilon}_n$ are residuals. The estimation of $\mathbf{s}_f$ and $\mathbf{f}$ from the data $\mathbf{X}(t_n)$, subject to an optimality criterion such as minimising the sum of squares of the residuals, is an example of a feature extraction problem: given a noisy data set $\mathbf{X}(t_n)$, it is desired to retrieve the signal $\hat{\mathbf{X}}(t_n) = (\mathbf{f} \circ \mathbf{s}_f)(\mathbf{X}(t_n))$. Doing so, the M-dimensionality of $\mathbf{X}(t_n)$ is treated as being in a sense only superficial, as the signal of interest lives on a P-dimensional manifold embedded in $\Re^M$. Because, once $\mathbf{f}$ and $\mathbf{s}_f$ have been found, it is no longer necessary to work with the signal in $\Re^M$ and one can concentrate instead on the signal in $\Re^P$, feature extraction can also be thought of as reduction of data dimensionality. The method of feature extraction most common in the atmospheric and oceanic sciences is Principal Component Analysis (PCA), which optimally extracts linear features from the data.
However, if the underlying structure of the data is nonlinear, traditional PCA is suboptimal in characterising this lower-dimensional structure. This deficiency of PCA motivates the definition of Nonlinear Principal Component Analysis (NLPCA). In this section, I provide a brief review of PCA and demonstrate how it generalises naturally to a nonlinear method. I then discuss the NLPCA method in detail.

2.2.1  Principal Component Analysis

Traditional PCA can be formulated variationally as a special case of the feature-extraction problem, in which the data $\mathbf{X}(t_n)$ (assumed, without loss of generality, to have zero mean in time) is fit to the linear $P$-dimensional model:

$$\mathbf{X}(t_n) = \sum_{k=1}^{P} (\mathbf{X}(t_n) \cdot \mathbf{e}_k)\,\mathbf{e}_k + \boldsymbol{\epsilon}_n \tag{2.2}$$

for vectors $\mathbf{e}_k \in \mathbb{R}^M$, $\mathbf{e}_k \cdot \mathbf{e}_j = \delta_{kj}$, such that the sum of squares of the residuals:

$$J = \langle \|\mathbf{X} - \hat{\mathbf{X}}\|^2 \rangle \tag{2.3}$$

is a minimum, where angle brackets denote a sample time average. The vector $\mathbf{e}_k$ is the $k$th Empirical Orthogonal Function (EOF) and the projection of $\mathbf{X}(t_n)$ on $\mathbf{e}_k$ is the $k$th Principal Component (PC). The product of the $k$th EOF with the $k$th PC defines a vector time series usually referred to as the $k$th PC mode. The $P$-dimensional PCA approximation $\hat{\mathbf{X}}(t_n)$ to $\mathbf{X}(t_n)$ lives on the $P$-dimensional hyperplane that passes optimally through the middle of the data (von Storch and Zwiers, 1999). Principal Component Analysis has the variance partitioning property:

$$\sum_{i=1}^{M} \mathrm{var}(X_i) = \sum_{i=1}^{M} \mathrm{var}(\hat{X}_i) + \sum_{i=1}^{M} \mathrm{var}(X_i - \hat{X}_i), \tag{2.4}$$

so it is sensible to say that $\hat{\mathbf{X}}(t_n)$ "explains" a certain fraction of the variance of $\mathbf{X}(t_n)$. In particular, $\hat{\mathbf{X}}(t_n) = (\mathbf{X}(t_n) \cdot \mathbf{e}_1)\,\mathbf{e}_1$ is the one-dimensional linear approximation to $\mathbf{X}(t_n)$ which explains the highest percentage of the variance.
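The linear model (2.2) and the variance partitioning property (2.4) can be checked numerically. The following is a minimal sketch, not taken from the thesis: it assumes NumPy, uses a hypothetical synthetic data matrix, and obtains the EOFs from a singular value decomposition of the data matrix, which is one standard way of computing them. The names `pca_approximation`, `eofs` and `pcs` are illustrative only.

```python
import numpy as np

def pca_approximation(X, P):
    """Return EOFs, PCs and the P-dimensional PCA approximation of X.

    X has shape (N, M): N observation times, M stations, zero time mean.
    """
    # Rows of Vt are orthonormal spatial patterns, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    eofs = Vt[:P]                 # leading P EOFs, shape (P, M)
    pcs = X @ eofs.T              # projections X(t_n) . e_k, shape (N, P)
    X_hat = pcs @ eofs            # sum over k of (X . e_k) e_k, as in (2.2)
    return eofs, pcs, X_hat

rng = np.random.default_rng(0)
# Hypothetical synthetic data: 200 times, 5 correlated "stations".
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))
X -= X.mean(axis=0)               # remove the time mean, as assumed for (2.2)

eofs, pcs, X_hat = pca_approximation(X, P=2)

# The three sums appearing in the variance partitioning property (2.4).
total_var = X.var(axis=0).sum()
approx_var = X_hat.var(axis=0).sum()
resid_var = (X - X_hat).var(axis=0).sum()
```

In this sketch `total_var` should equal `approx_var + resid_var` to within rounding error, and `approx_var / total_var` is the fraction of variance explained by the two-dimensional approximation.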
The fraction of variance explained by $\hat{\mathbf{X}}(t_n)$ is a non-decreasing function of the approximation dimension $P$; increasing the dimensionality of the PCA approximation increases its fidelity to the original data.

Principal Component Analysis is usually thought of in terms of the eigenstructure of the data covariance matrix $\mathbf{C} = \langle \mathbf{X}\mathbf{X}^T \rangle$. In fact, the vectors $\mathbf{e}_k$ are eigenvectors of $\mathbf{C}$ corresponding to the $P$ largest eigenvalues:

$$\mathbf{C}\,\mathbf{e}_k = \lambda_k \mathbf{e}_k \tag{2.5}$$

where $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_P$. This fact follows from the minimisation of (2.3) subject to the constraint that the vectors $\mathbf{e}_k$ are normalised. While this eigenanalysis approach is the standard approach to calculating the $\mathbf{e}_k$, it has no analogue in the nonlinear generalisation to be considered. The variational formulation of PCA (from which follows its relation to the eigenstructure of the covariance matrix) is emphasised because it generalises naturally to the nonlinear feature extraction problem.

Note that the PCA approximation $\hat{\mathbf{X}}(t_n)$ to $\mathbf{X}(t_n)$ is the composition of two functions:

1. a projection function $\mathbf{s}_f : \mathbb{R}^M \to \mathbb{R}^P$:

$$\mathbf{s}_f(\mathbf{X}(t_n)) = (\mathbf{X}(t_n) \cdot \mathbf{e}_1, \ldots, \mathbf{X}(t_n) \cdot \mathbf{e}_P)^T = \boldsymbol{\Pi}\,\mathbf{X}(t_n) \tag{2.6}$$

where $\boldsymbol{\Pi}$ is the $P \times M$ matrix whose $k$th row is the vector $\mathbf{e}_k$, and

2.
an expansion function $\mathbf{f} : \mathbb{R}^P \to \mathbb{R}^M$:

$$\mathbf{f}(\mathbf{s}_f) = \boldsymbol{\Pi}^T \mathbf{s}_f \tag{2.7}$$

Thus, the PCA approximation $\hat{\mathbf{X}}(t_n)$ to $\mathbf{X}(t_n)$ is given by

$$\hat{\mathbf{X}}(t_n) = (\mathbf{f} \circ \mathbf{s}_f)(\mathbf{X}(t_n)) = \boldsymbol{\Pi}^T(\boldsymbol{\Pi}\,\mathbf{X}(t_n)) = (\boldsymbol{\Pi}^T \boldsymbol{\Pi})\,\mathbf{X}(t_n) \tag{2.8}$$

In the language of LeBlanc and Tibshirani (1994), the projection function characterises the dimension reduction aspect of PCA, and the expansion function characterises its function approximation aspect. In traditional PCA, both the projection and expansion functions are linear. This method is thus optimal if the feature to be extracted is well characterised by a set of orthogonal, straight axes: that is, if the data cloud is cigar-shaped (e.g. Gaussian). But what if the data cloud is ring-like, or bowed? In such cases, there is a clear lower-dimensional structure to the data, but not one which is linear, and thus it cannot be extracted by traditional PCA. This motivates the definition of a generalised, nonlinear PCA.

The generalisation of PCA to NLPCA can also be motivated by the following observation. The 1D PCA approximation $\hat{\mathbf{X}}(t_n)$ to $\mathbf{X}(t_n)$ is separable in terms of its spatial and temporal structure. That is, $\hat{\mathbf{X}}(t_n)$ is the product of a function of time, $\mathbf{X}(t_n) \cdot \mathbf{e}_1$, with a function of space, $\mathbf{e}_1$:

$$\hat{\mathbf{X}}(t_n) = (\mathbf{X}(t_n) \cdot \mathbf{e}_1)\,\mathbf{e}_1. \tag{2.9}$$

In consequence, this approximation can only describe standing variability in the data set, that is, a fixed spatial pattern with an amplitude that varies as a function of time. There is no a priori reason to believe that the optimal 1D approximation to a climatic data set is a standing oscillation, but that is all PCA can produce. As shall be seen in Chapters 4 and 5, by moving to NLPCA, 1D approximations to climatic data sets which are not standing oscillations may be obtained.

2.2.2  Nonlinear Principal Component Analysis

To circumvent the limitations of linearity inherent in the PCA model (2.2), Kramer (1991) introduced a nonlinear generalisation that solved the general feature extraction problem described by the model (2.1), where $\mathbf{f}$ and $\mathbf{s}_f$ are allowed to be nonlinear functions. Given data $\mathbf{X}(t_n) \in \mathbb{R}^M$, the problem is to estimate functions $\mathbf{s}_f : \mathbb{R}^M \to \mathbb{R}^P$ and $\mathbf{f} : \mathbb{R}^P \to \mathbb{R}^M$, where $P < M$, such that the approximation

$$\hat{\mathbf{X}}(t_n) = (\mathbf{f} \circ \mathbf{s}_f)(\mathbf{X}(t_n)) \tag{2.10}$$

to $\mathbf{X}(t_n)$ passes through the middle of the data, i.e., such that the sum of squared residuals,

$$J = \langle \|\mathbf{X}(t_n) - \hat{\mathbf{X}}(t_n)\|^2 \rangle \tag{2.11}$$

is a minimum.

Figure 2.1: The 5-layer feed-forward autoassociative neural network used to perform NLPCA. (Layers, in order: input layer, encoding layer, bottleneck layer, decoding layer, output layer.)
Kramer implemented his solution, called Nonlinear Principal Component Analysis (NLPCA), using a 5-layer feed-forward neural network. Neural networks are nonlinear, nonparametric statistical tools for function estimation. They are described in detail in Appendix A.

Figure 2.1 shows the architecture of the 5-layer network used to extract the 1D NLPCA approximation to the data set $\mathbf{X}(t_n) \in \mathbb{R}^M$; this network is unusual in that the third layer contains only a single neuron. This third layer is referred to as the bottleneck layer. The first (input) and fifth (output) layers each contain $M$ neurons. Layers 2 and 4 are called respectively the encoding and decoding layers; they contain $L$ neurons, the transfer functions of which are hyperbolic tangents. The transfer functions of the bottleneck and output layers are linear. As input, the network is presented the vector $\mathbf{X}(t_n)$ for each time $t_n$; the corresponding network output is denoted $\mathcal{N}(\mathbf{X}(t_n))$. The weights and biases are adjusted ("trained"), using a conjugate gradient algorithm (Press et al., 1992), until the sum of squared differences between input and output:

$$J = \langle \|\mathbf{X}(t_n) - \mathcal{N}(\mathbf{X}(t_n))\|^2 \rangle \tag{2.12}$$

is minimised (subject to certain robustness criteria discussed in the next section). Because the network is trained to approximate as closely as possible the input data itself, it is said to be autoassociative.
It was proved by Sanger (1989) that if the transfer functions of the neurons in the second and fourth layers are linear, the resulting network performs classical PCA, such that the output of the bottleneck layer is the time series $\mathbf{s}_f(t_n)$ of equation (2.6) (up to a normalisation factor).

Now consider the manner by which this network solves the feature extraction problem for $P = 1$. The first three layers, considered alone, form a map from $\mathbb{R}^M$ to $\mathbb{R}$, and the last three layers alone form a map from $\mathbb{R}$ to $\mathbb{R}^M$. All five layers together are simply the composition of these two maps. Because the bottleneck layer contains only a single neuron, the network must compress the input down to a single one-dimensional time series before it produces its final $M$-dimensional output. Once the network has been trained, the output $\mathcal{N}(\mathbf{X}(t_n))$ is the optimal one-dimensional approximation to $\mathbf{X}(t_n)$, embedded in $\mathbb{R}^M$. As is discussed in Appendix A, it is known from a result due to Cybenko (1989) that if $L$ is sufficiently large, then the first three layers can approximate any continuous $\mathbf{s}_f$, and the last three layers any continuous $\mathbf{f}$, to arbitrary accuracy. Thus, the network illustrated in Figure 2.1 should be able to recover optimally, in a least-squares sense, any one-dimensional nonlinear structure present in $\mathbf{X}(t_n)$ for which the projection and expansion functions are continuous. It is not required that the encoding and decoding layers each have the same number of neurons, but the numbers are fixed to be the same so as to have only one free parameter in the model architecture. That the network must be composed of (at least) 5 layers follows from the fact that each of the functions $\mathbf{s}_f$ and $\mathbf{f}$ requires a network with (at least) 3 layers for its approximation. The composition $\mathbf{f} \circ \mathbf{s}_f$ of the two must then have at least 5 layers, as one layer is shared.
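The architecture described above, with its decomposition into a projection map (layers 1 to 3) and an expansion map (layers 3 to 5), can be sketched as a forward pass. This is an illustrative sketch only, assuming NumPy: the function names (`init_params`, `encode`, `decode`, `cost`), the random initial weights and the synthetic input data are hypothetical, and no training is performed, so the resulting cost is that of an untrained network.

```python
import numpy as np

def init_params(M, L, P, rng):
    """Random initial weights and zero biases for the M-L-P-L-M network."""
    sizes = [M, L, P, L, M]
    return [(0.1 * rng.standard_normal((n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def encode(params, X):
    """Layers 1-3: the projection s_f, a map from R^M to R^P."""
    (W1, b1), (W2, b2) = params[0], params[1]
    # tanh transfer functions in the encoding layer, linear bottleneck layer
    return np.tanh(X @ W1 + b1) @ W2 + b2

def decode(params, S):
    """Layers 3-5: the expansion f, a map from R^P to R^M."""
    (W3, b3), (W4, b4) = params[2], params[3]
    # tanh transfer functions in the decoding layer, linear output layer
    return np.tanh(S @ W3 + b3) @ W4 + b4

def cost(params, X):
    """Time-mean squared difference between input and output, as in (2.12)."""
    X_out = decode(params, encode(params, X))
    return np.mean(np.sum((X - X_out) ** 2, axis=1))

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))              # hypothetical data: N = 50, M = 3
params = init_params(M=3, L=4, P=1, rng=rng)
S = encode(params, X)                         # bottleneck time series, shape (50, 1)
J = cost(params, X)
```

Training would consist of adjusting `params` to reduce `J`; setting `P` to a larger value in `init_params` gives a bottleneck layer with more neurons.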
The network illustrated in Figure 2.1 will extract the optimal one-dimensional curve characterising $\mathbf{X}(t_n)$. To uncover higher-dimensional structure, the number of neurons in the bottleneck layer can be increased. For example, if two neurons are used, the network will determine the optimal two-dimensional characterisation (by continuous functions) of $\mathbf{X}(t_n)$. In general, a $P$-dimensional NLPCA approximation to $\mathbf{X}(t_n)$ can be obtained by setting to $P$ the number of neurons in the bottleneck layer.

Another solution to the general feature extraction problem was introduced independently by Hastie and Stuetzle (1989). Their method, termed Principal Curves and Surfaces (PCS), is described in Appendix B. Principal Curves and Surfaces is based on a rather different set of ideas than NLPCA. In practice, however, because both minimise the sum of squared errors (2.11), the two methods both boil down to iterative solutions to the variational formulation of a feature extraction problem. In fact, Malthouse (1998) argued that NLPCA and PCS are quite similar for a broad class of feature extraction problems. A primary difference between NLPCA and PCS is that in the former the projection function $\mathbf{s}_f$ is constrained to be continuous, while in the latter it may have a finite number of discontinuities. Although here I will investigate the use of Kramer's NLPCA, because its implementation is straightforward, PCS has a stronger theoretical underpinning. In Chapter 6 a generalisation of Kramer's NLPCA that can model discontinuous projection and expansion functions, and which is thus closer to PCS, is introduced.

As noted by LeBlanc and Tibshirani (1994), Hastie and Stuetzle's PCS partitions variance in the same fashion as does traditional PCA: if $\hat{\mathbf{X}}(t_n)$ is the PCS approximation to $\mathbf{X}(t_n)$, then

$$\sum_{i=1}^{M} \mathrm{var}(X_i) = \sum_{i=1}^{M} \mathrm{var}(\hat{X}_i) + \sum_{i=1}^{M} \mathrm{var}(X_i - \hat{X}_i). \tag{2.13}$$

As with PCA, it is therefore sensible to describe a PCS approximation as explaining a certain fraction of variance in the original dataset. From the close relationship between NLPCA and PCS demonstrated by Malthouse (1998), it is tempting to hypothesise that NLPCA also partitions variance in such a fashion. While I am not aware of a rigorous proof of this result, this partitioning of variance in fact occurs in all of the examples I have considered, and in the following discussion it shall be assumed that equation (2.13) holds for $\hat{\mathbf{X}}(t_n)$ the NLPCA approximation to $\mathbf{X}(t_n)$.

Yet a third nonlinear generalisation of PCA was introduced by Oja and Karhunen (1993) and by Oja (1997), in which the map $\mathbf{s}_f$ is allowed to be nonlinear while $\mathbf{f}$ remains linear. Such a generalisation can be carried out using a two-layer recursive neural network. Because only the projection function is nonlinear, this approach is distinct from the class of feature extraction problems addressed by Kramer's NLPCA and Hastie and Stuetzle's PCS.

Because the traditional PCA model has the additive structure (2.2), the optimal linear $P$-dimensional substructure of $\mathbf{X}(t_n)$ can be found all at once, or mode by mode; both methods yield the same result. In the iterative approach, the first mode $\hat{\mathbf{X}}^{(1)}(t_n)$ of $\mathbf{X}(t_n)$ is calculated from the entire data set, and then the second mode is calculated from the residual $\mathbf{X}(t_n) - \hat{\mathbf{X}}^{(1)}(t_n)$, taking advantage of the fact that the second PC mode of $\mathbf{X}(t_n)$ is the first PC mode of this residual. The two approaches are equivalent for PCA because the most general linear function of $P$ variables has the additive structure:

$$g(u_1, u_2, \ldots, u_P) = \sum_{i=1}^{P} a_i u_i \tag{2.14}$$

They are generally distinct, however, for NLPCA, as an arbitrary smooth function $f$ of $P$ variables cannot be decomposed as a sum of smooth functions of one variable.
That is, in general

$$f(u_1, u_2, \ldots, u_P) \neq f_1(u_1) + f_2(u_2) + \ldots + f_P(u_P) \tag{2.15}$$

for any choice of functions $f_1, f_2, \ldots, f_P$: $f$ cannot usually be written as a Generalised Additive Model (Hastie and Tibshirani, 1990). The iterative approach will be referred to as a modal analysis, and the all-at-once approach as nonmodal. Naturally enough, in a modal analysis, each 1D approximation will be referred to as a mode, and the modes will be ordered in terms of decreasing fraction of variance explained. I will compare both the modal and nonmodal approaches in this thesis. Theoretically, the nonmodal $P$-dimensional approximation should be superior to the modal approximation, because it is drawn from a broader class of functions, although the modal analysis is more amenable to interpretation. Of course, a general $P$-dimensional analysis could involve both modal and nonmodal decompositions at various stages; such mixed modal/nonmodal analyses will not be considered here.

Malthouse (1998) pointed out two limitations of NLPCA as formulated by Kramer (1991). First, Kramer's NLPCA is unable to characterise low-dimensional structure which is self-intersecting. Because the projection $\mathbf{s}_f$ must be discontinuous for a self-intersecting surface, there will be open neighbourhoods in $\mathbb{R}^M$ that are mapped by $\mathbf{s}_f$ into non-open neighbourhoods in $\mathbb{R}^P$. Consider the example of a circle in $\mathbb{R}^2$. It is a 1D surface, a natural parameterisation of which is the interval $\theta \in [0, 2\pi]$, with the points $\theta = 0$ and $\theta = 2\pi$ identified ($S^1$ topology). Clearly, for any small $\epsilon$, there will exist an open neighbourhood on the circle which maps onto the non-open set $[0, \epsilon) \cup (2\pi - \epsilon, 2\pi]$. This limitation of Kramer's NLPCA is not of great importance to the analysis of climate data, as precisely cyclic variability is not characteristic of climatic systems. An exception is perhaps the annual cycle, but it is typically removed
from climate data before analysis. The limitation can be removed by moving to a 7-layer neural network, which can approximate discontinuous projection and expansion functions. This issue is addressed in Chapter 6.

The second limitation of Kramer's NLPCA highlighted by Malthouse is that the parameterisation $\mathbf{s}_f$ of the NLPCA approximation is only determined up to an arbitrary homeomorphism (i.e., a continuous, one-to-one, and onto function with a continuous inverse). That is, for an arbitrary homeomorphism $\mathbf{g} : \mathbb{R}^P \to \mathbb{R}^P$, the time series $\mathbf{g}(\mathbf{s}_f(\mathbf{X}(t_n)))$ is also an acceptable parameterisation of the surface, because $\mathbf{f} \circ \mathbf{s}_f = (\mathbf{f} \circ \mathbf{g}^{-1}) \circ (\mathbf{g} \circ \mathbf{s}_f)$. This degeneracy is a potentially serious complication in the interpretation of the time series produced by the bottleneck layer of Kramer's network. Based on the results presented in this thesis, it is apparent that this degeneracy is not problematic in a modal NLPC analysis. Homeomorphisms from $\mathbb{R}$ to $\mathbb{R}$ are functions which can only stretch and compress (locally or globally), or translate globally, and thus do not radically change the information present in the time series $\mathbf{s}_f(\mathbf{X}(t_n))$. The time series arising from the modal analyses in Chapters 4 and 5 are amenable to natural interpretation in terms of familiar phenomena in the systems under consideration.

This degeneracy is substantially more significant in the case of a nonmodal analysis, because homeomorphisms from $\mathbb{R}^P$ to $\mathbb{R}^P$ for $P > 1$ can include rotations as well as dilations and compressions. Generally, the time series produced by the $P$ different neurons in the bottleneck layer will thus not be independent, or even uncorrelated, because of mixing induced by this arbitrary rotation. Any $P$-dimensional surface can be parameterised by a set of independent variables $\gamma_i$, $i = 1, \ldots, P$.
Determining such a set from the set of $P$ time series $\beta_i(t_n)$ determined empirically by NLPCA is another problem of feature extraction in the space of the variables parameterising the surface. In principle, PCA or modal NLPCA can be used to calculate the $\gamma_i(t_n)$. An example of such an approach is considered in Chapter 5. The fact that nonmodal NLPCA leads to a second feature extraction problem reduces its utility, despite its aesthetic appeal.

Another common generalisation of PCA is Rotated Principal Component Analysis (RPCA; Richman, 1986). As was pointed out by Buell (1975, 1979), because PCA is designed to maximise the global variance explained by the leading mode, and because successive modes are constrained to have orthogonal spatial patterns, its results can be strongly affected by the shape of the data domain. Rotated PCA addresses this problem by modifying the cost function (2.3) to include a "simple structure criterion" so that the resulting approximation strikes a compromise between maximising its explained variance and minimising its spatial scale. In doing so, either the orthogonality of the spatial patterns, the uncorrelatedness of the time series, or both, must be sacrificed. Rotated PCA and NLPCA are both generalisations of PCA, but they address substantially different issues.
Rotated PCA allows the detection of localised variability in the data, but still variability that is linear. Consequently, 1D RPCA approximations share with 1D PCA approximations the problem that they can only describe standing variability. On the other hand, NLPCA is concerned with detecting and characterising nonlinear structure in data sets. The general 1D NLPCA approximation cannot be expressed as a separable function of space and time such as (2.9), so it is able to describe variability more general than standing oscillations. Because of the lack of a "simple structure criterion" in the cost function (2.11), NLPCA also maximises global variance, and its results will presumably suffer to some degree from the same sensitivity to domain boundaries that PCA does. However, this problem is presumably less serious with NLPCA because of the absence of orthogonality constraints between different modes. In fact, such constraints cannot be naturally formulated for the nonlinear approximations. A further generalisation of NLPCA to encourage regionalisation of the approximation could be introduced by modifying the cost function (2.11) to include a simple structure criterion. Such a
generalisation, while an interesting direction for future research, is beyond the scope of the present study.

2.3  Implementation of NLPCA

Neural networks are powerful tools for function approximation. Given input and target data sets $\mathbf{u}(t_n)$ and $\mathbf{v}(t_n)$, $n = 1, \ldots, N$, a neural network, denoted by $\mathcal{N}$, can be trained until $\mathcal{N}(\mathbf{u}(t_n))$ is an arbitrarily good approximation to $\mathbf{v}(t_n)$, if the number of neurons in the hidden layer is sufficiently large. That is, a network can always be built so that the total sum of squared errors

$$\langle \|\mathbf{v}(t_n) - \mathcal{N}(\mathbf{u}(t_n))\|^2 \rangle \tag{2.16}$$

is as small as desired. Another important property of the neural network is that it generalises, that is, that given new data $\mathbf{u}(t_{N+1})$, $\mathbf{v}(t_{N+1})$, the network error on these data is about the same size as the errors over the training set:

$$\|\mathbf{v}(t_{N+1}) - \mathcal{N}(\mathbf{u}(t_{N+1}))\|^2 \approx \langle \|\mathbf{v}(t_n) - \mathcal{N}(\mathbf{u}(t_n))\|^2 \rangle. \tag{2.17}$$

The two goals of minimising network error and maximising its ability to generalise are often incompatible, and a subtle balance must be struck between the two. This situation arises, for example, in the case when $\mathbf{u}(t_n)$ and $\mathbf{v}(t_n)$ are of the form

$$\mathbf{u}(t_n) = \mathbf{z}(t_n) + \boldsymbol{\epsilon}_n \tag{2.18}$$

$$\mathbf{v}(t_n) = \mathbf{f}(\mathbf{z}(t_n)) + \boldsymbol{\eta}_n \tag{2.19}$$

where $\boldsymbol{\epsilon}_n$ and $\boldsymbol{\eta}_n$ are noise terms, and it is desired that $\mathcal{N}$ learn the deterministic relationship $\mathbf{f}$ between $\mathbf{u}(t_n)$ and $\mathbf{v}(t_n)$. In such a case, care must be taken to avoid allowing the network to fit the noise as well. If $\mathcal{N}$ is trained until it maps particular details of a given realisation of $\mathbf{u}(t_n)$ into those of a given realisation of $\mathbf{v}(t_n)$, and thus will not generalise, the network has overfit. An overfit network is not truly representative of the structure underlying a data set.
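The tension between small training error (2.16) and good generalisation (2.17) can be illustrated with a toy example. The sketch below is not from the thesis and deliberately swaps the neural network for polynomial least-squares fits of two different degrees, purely to keep the example short; the function `f`, the noise levels, the sample sizes and the polynomial degrees are all hypothetical choices, and NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(z):
    """Hypothetical deterministic relationship to be learned."""
    return np.sin(np.pi * z)

def make_data(n):
    z = rng.uniform(-1.0, 1.0, n)
    u = z + 0.05 * rng.standard_normal(n)      # u = z + epsilon, as in (2.18)
    v = f(z) + 0.2 * rng.standard_normal(n)    # v = f(z) + eta, as in (2.19)
    return u, v

u_train, v_train = make_data(30)
u_new, v_new = make_data(1000)                 # new data, not used in the fit

def mean_sq_error(coeffs, u, v):
    return np.mean((v - np.polyval(coeffs, u)) ** 2)

# Fit a modest and a flexible model (polynomials standing in for networks),
# recording (training error, error on new data) for each.
errors = {}
for degree in (3, 10):
    coeffs = np.polyfit(u_train, v_train, degree)
    errors[degree] = (mean_sq_error(coeffs, u_train, v_train),
                      mean_sq_error(coeffs, u_new, v_new))
```

The more flexible model cannot have a larger training error than the simpler one, since the degree-3 polynomials are a subset of the degree-10 polynomials; comparing the second entries of `errors[3]` and `errors[10]` shows how each fit fares on data it has not seen.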
The avoidance of overfitting by neural networks is a primary issue in their implementation (Finnoff et al., 1993; Yuval, 1999).

To avoid overfitting, a simple technique called early stopping has been used. Because the neural network is nonlinear in the model parameters, they must be determined iteratively in a process referred to as training. In early stopping, the training is terminated before the error function is minimised, according to a well-defined stopping criterion. In essence, the idea behind early stopping is that the training is allowed to continue sufficiently long to fit the structure underlying the data, but not long enough to fit the noise. The strategy employed was to hold aside a fraction of the data points, selected randomly, in a validation set not used to train the network. While network training proceeded, the network performance on the validation set was monitored, and training was stopped when this performance began to degrade, or after a fixed large number of iterations, whichever came first.

The use of early stopping along with the deterministic conjugate gradient algorithm to minimise the error function confers on the training results a degree of sensitivity to the network parameters (the weights and biases) used to initialise the iterative training procedure. This sensitivity is exacerbated by the possible existence of multiple minima in the error function (2.11). To address this problem, an ensemble of training runs starting from different, randomly chosen, initial parameter values was carried out for each analysis performed. The training results from these runs were examined, and those members of the ensemble for which the final error over the validation set was greater than that over the training set were discarded. The remaining members of the ensemble are referred to as candidate models.

The number of neurons $L$ in the encoding and decoding layers determines the class of functions that $\mathbf{s}_f$ and $\mathbf{f}$ can take.
As $L$ increases, there is more flexibility in the forms of $\mathbf{s}_f$ and $\mathbf{f}$, but the model also has more parameters, implying both that the error surface becomes more complicated and that the parameters are less constrained by data. Consequently, for $L$ large, the scatter among the candidate models can be large, as measured by the normalised mean square distance (NMSD). The NMSD between approximations $\hat{\mathbf{X}}^{(1)}(t_n)$ and $\hat{\mathbf{X}}^{(2)}(t_n)$ is defined as

$$\mathrm{NMSD} = \frac{\langle \|\hat{\mathbf{X}}^{(1)} - \hat{\mathbf{X}}^{(2)}\|^2 \rangle}{\langle \|\mathbf{X}\|^2 \rangle} \tag{2.20}$$

This statistic was introduced in Monahan (1999), in which it was found that NLPCA approximations for which the NMSD was less than about 2% were essentially indistinguishable. In the end, the number $L$ of neurons used in the encoding and decoding layers was the maximum such that the NMSD between NLPCA approximations to $\mathbf{X}(t_n)$ in the candidate model set was less than 5%. This threshold value of NMSD was chosen on the basis of experience and intuition; there is no existing rigorous sampling theory for this test statistic. In other words, for any given analysis, the value of $L$ used in the NLPCA network is the largest that produces a robust set of candidate models. The early stopping technique ensures that the NLPCA approximation is robust to the introduction of new data, and the existence of a set of similar candidate models (as measured by NMSD) ensures that the approximation is robust with respect to the initial parameter values used in the training.

Finally, once a maximal $L$ was determined and a set of candidate models obtained, the model selected as "the" NLPCA approximation was the one with the highest Fraction of Explained Variance (FEV):

$$\mathrm{FEV} = \frac{\langle \|\hat{\mathbf{X}}\|^2 \rangle}{\langle \|\mathbf{X}\|^2 \rangle} \tag{2.21}$$

which is a meaningful statistic because NLPCA partitions variance as described in equation (2.13). Alternately, the candidate model selected was that which minimised the
Fraction of Unexplained Variance (FUV):

$$\mathrm{FUV} = 1 - \mathrm{FEV} \tag{2.22}$$

Typically, the FEV differed little between candidate models.

2.4  Dynamical Significance of Low-Dimensional Approximations

The lower-dimensional approximations obtained by feature extraction methods are statistical in nature. A natural question concerns the relation they bear to the dynamics of the system under investigation. North (1984) considered the dynamical system

$$\mathcal{L}\,\psi(\mathbf{x}, t) = \zeta(\mathbf{x}, t) \tag{2.23}$$

where $\mathcal{L}$ is a space-time linear differential operator and $\zeta(\mathbf{x}, t)$ represents stochastic forcing, and concluded that the EOFs of $\psi(\mathbf{x}, t)$ coincide with its dynamical modes if and only if the operator $\mathcal{L}$ is normal (i.e. it commutes with its adjoint) and the noise $\zeta$ is white in space and stationary in time:

$$\mathcal{E}\left(\zeta(\mathbf{x}_1, t_1)\,\zeta(\mathbf{x}_2, t_2)\right) = g(|t_1 - t_2|)\,\delta(\mathbf{x}_1 - \mathbf{x}_2) \tag{2.24}$$

where $g(\tau)$ is a lag autocovariance function. These requirements greatly restrict the class of dynamical systems for which the connection between the statistics and dynamics is clear-cut, as it does not even include the geophysically-relevant class of linear models for which non-modal variance growth is important (Penland and Sardeshmukh, 1995; Farrell and Ioannou, 1996; Whitaker and Sardeshmukh, 1998).

Mo and Ghil (1987) attempted to assess the connection between the results of EOF analysis and the dynamics of the system under consideration in the context of nonlinear dynamics. They concluded that, "the dynamical interpretation of EOFs is their pointing from the time mean to the most populated regions of the system's phase space". If the distribution of the system in phase space is Gaussian, then this direction will lie along the distribution's principal axis.
On the other hand, if the distribution is not Gaussian, but is characterised by an inhomogeneous density with a small number of local extrema corresponding to preferred regimes of behaviour (associated, e.g., with slow manifolds of the dynamics (Ghil and Childress, 1987)), then the leading EOFs will characterise the distribution of these extrema.

Unlike PCA, NLPCA approximations are not characterised by unique "directions" through phase space, but rather by curved surfaces. Interpretation of the results of an NLPC analysis must then be couched in rather different terms than that of PCA. A natural interpretation, using the language of dynamical systems theory, is that NLPCA approximations characterise the attractor of the system under consideration, as was noted by Kirby and Miranda (1994). Many naturally occurring systems possess a stable attractor, which is a manifold typically of smaller dimension than the Cartesian phase space in which it must be embedded to preclude spurious self-intersections (Ott, 1993). These attractors are generally complicated surfaces of non-integer dimension; only in very special cases are they planar. Because PCA produces an orthogonal coordinate system in the phase space, it can at best eliminate the degrees of freedom in the data associated with noise, thereby producing an embedding space for the attractor. Nonlinear PCA, however, can characterise the curved structures associated with these attractors (although it is restricted to approximations of integer dimension), and produce what Kirby and Miranda denote the "optimal coordinates" of the system. The results of NLPCA are thus best considered as characterising the underlying attractor of the system under consideration, as will be illustrated in the following chapter when an NLPCA approximation of the Lorenz attractor is constructed. The estimation of the attractor underlying a data set provides insight into the governing physics.
Knowledge of the dominant forms of variability in the data can act as a guide to reductionism, helping to develop mechanistic models of the system under investigation, which then provide insight into the underlying dynamics.

Chapter 3

Nonlinear Principal Component Analysis of the Lorenz Attractor

3.1  Introduction

As a preliminary investigation into the implementation of NLPCA, consider a synthetic data set consisting of a set of points sampled from the Lorenz attractor (Lorenz, 1963). This familiar object is the attractor on which (as $t \to \infty$) live solutions $\mathbf{x}(t)$ of the system of coupled nonlinear ODEs

$$\dot{x}_1 = -\sigma x_1 + \sigma x_2 \qquad (3.1)$$
$$\dot{x}_2 = -x_1 x_3 + r x_1 - x_2 \qquad (3.2)$$
$$\dot{x}_3 = x_1 x_2 - b x_3 \qquad (3.3)$$

with parameter values $r = 28$, $b = 8/3$, and $\sigma = 10$. Synthetic data is used to test the NLPCA method because

• by adding random noise to the signal $\mathbf{x}(t)$, the sensitivity of the method to noise level can be tested, and

• the structure of the Lorenz attractor is well-known, and of sufficiently low dimension that visualisation of results is straightforward.

Figure 3.1 displays the projections of the Lorenz attractor (as determined by numerical integration of equations (3.1)-(3.3)) on the $(x_1, x_2)$, $(x_2, x_3)$, and $(x_3, x_1)$ planes. It turns out that the Lorenz attractor is fractal, with a box-counting dimension of about 2.04
(Berliner, 1992). However, inspection of the butterfly-shaped attractor indicates that a one-dimensional U-shaped curve passing through the centres of the two lobes should explain a substantial fraction of the variance.

To produce a dataset of size similar to that typically encountered in climate applications (e.g. 600 points in length, corresponding to 50 years of monthly data), the data displayed in Figure 3.1 were subsampled at uniform intervals in time to produce a 3D time series 584 points in length, to be denoted $\mathbf{z}(t_n)$. The subsampled data set is displayed in Figure 3.2. Clearly, it retains the gross structure of the original attractor. To investigate the effects of noise on the NLPCA results, the datasets

$$\mathbf{x}(t_n) = \mathbf{z}(t_n) + \eta\,\boldsymbol{\epsilon}(t_n) \qquad (3.4)$$

were constructed, where $\boldsymbol{\epsilon}(t_n)$ is a 584-point 3D series of Gaussian iid random deviates with zero mean and unit standard deviation, and $\eta$ is a tunable parameter for the noise level. This noise is added in an effort to model measurement error; the stochasticity is not intrinsic to the dynamics.

3.2  Model Building

The early stopping algorithm described in the previous chapter was used to carry out the NLPC analysis of the Lorenz data. A validation set containing 30% of the data points was set aside, and network performance over this set was monitored as training progressed. Training was stopped when the validation set error started to increase, or after 500 iterations, whichever occurred first.

3.3  Results

The 1D PCA approximation to $\mathbf{x}(t_n)$ when $\eta = 0$ is displayed in Figure 3.3; it is a straight line passing through the centres of the two lobes of the attractor, and explains 60% of the variance of $\mathbf{x}(t_n)$. Figure 3.4 displays the 1D NLPCA approximation to $\mathbf{x}(t_n)$. As anticipated, it is a U-shaped curve passing through the middle of the data. This curve explains 76% of the variance, an improvement of 16 percentage points over the PCA result. Clearly, the 1D NLPCA approximation is substantially closer to $\mathbf{x}(t_n)$ than is the 1D PCA approximation. The network used to perform the NLPCA had 3 input and output neurons for $x_1$, $x_2$, and $x_3$, 1 bottleneck neuron, and $L = 3$ neurons in the encoding and decoding layers. Experimentation indicated that the NLPCA results improved using $L = 3$ over using $L = 2$ (i.e. the former had a smaller FUV than the latter), but that for $L > 3$, the results did not improve.

Figure 3.2: As in Figure 3.1, for a subsample of 584 points.

Turning now to the issue of robustness of results, the NMSD between 6 different 1D NLPCA curves (not shown) varies between 0.5% and 2%. These curves differ only in small details, and agree in their essential structure with the curve shown in Figure 3.4. Thus, the 1D NLPCA approximation to $\mathbf{x}(t_n)$ displayed in Figure 3.4 is a robust result that improves substantially over the 1D PCA approximation.
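The data construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analysis: the fourth-order Runge-Kutta integrator, the step size `dt`, the spin-up length, the subsampling stride, the initial condition, and the random seed are all hypothetical choices made for convenience. Only the Lorenz parameters, the 584-point length, and the noise model of equation (3.4) come from the text. The final step computes the fraction of variance explained by each PCA mode, the linear baseline against which NLPCA is compared.

```python
import numpy as np

def lorenz_rhs(x, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz system, equations (3.1)-(3.3)."""
    x1, x2, x3 = x
    return np.array([
        -sigma * x1 + sigma * x2,
        -x1 * x3 + r * x1 - x2,
        x1 * x2 - b * x3,
    ])

def integrate_lorenz(n_steps, dt=0.01, x0=(1.0, 1.0, 1.0)):
    """Classical RK4 integration (integrator, dt and x0 are assumptions)."""
    traj = np.empty((n_steps, 3))
    x = np.array(x0, dtype=float)
    for i in range(n_steps):
        k1 = lorenz_rhs(x)
        k2 = lorenz_rhs(x + 0.5 * dt * k1)
        k3 = lorenz_rhs(x + 0.5 * dt * k2)
        k4 = lorenz_rhs(x + dt * k3)
        x = x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        traj[i] = x
    return traj

def make_dataset(eta, n_points=584, stride=20, spinup=2000, seed=0):
    """Subsample the trajectory and add noise: x = z + eta * eps (eq. 3.4)."""
    traj = integrate_lorenz(spinup + n_points * stride)
    z = traj[spinup::stride][:n_points]      # discard spin-up, sample uniformly in time
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(z.shape)       # iid Gaussian, zero mean, unit std
    return z + eta * eps

def pca_fev(x):
    """Fraction of variance explained by each PCA mode, largest first."""
    cov = np.cov(x, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    return eigvals / eigvals.sum()

x_clean = make_dataset(eta=0.0)
x_noisy = make_dataset(eta=2.0)
fev_clean = pca_fev(x_clean)
fev_noisy = pca_fev(x_noisy)
print("FEV by mode, eta = 0:", np.round(fev_clean, 3))
print("FEV by mode, eta = 2:", np.round(fev_noisy, 3))
```

The leading entry of `fev_clean` can be compared with the 60% quoted for the 1D PCA approximation; the exact figures depend on the sampling choices assumed here.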
Figures 3.3 and 3.4 illustrate the strength of NLPCA relative to PCA. It can be proven analytically (Lücke, 1976) that $x_3$ is uncorrelated with $x_1$ and $x_2$. Consequently, the covariance matrix takes on the form

$$\Gamma = \begin{pmatrix} \Gamma_{11} & \Gamma_{12} & 0 \\ \Gamma_{21} & \Gamma_{22} & 0 \\ 0 & 0 & \Gamma_{33} \end{pmatrix}$$

where $\Gamma_{12} = \Gamma_{21}$ because the covariance matrix is symmetric. One eigenvector of $\Gamma$ thus lies along the $x_3$ axis while the other two span the $x_1 x_2$ plane. One of the latter appears as the leading PCA approximation displayed in Figure 3.3. In the PCA description of the Lorenz attractor, variability along $x_3$ appears in a separate mode from variability in the $x_1 x_2$ plane. However, while variability along $x_3$ is uncorrelated with variability in the $x_1 x_2$ plane, it is clear upon inspection of Figure 3.2 that these are not independent modes of variability. Indeed, large values of $|x_1|$ are associated with large values of $x_3$.

Figure 3.3: Noise-free Lorenz data for a subsample of 584 points and their 1D PCA approximation, projected as in Figure 3.1 (note axes have been rescaled). The dots represent the original data points, the open circles represent points of the approximation.

Figure 3.4: As in Figure 3.3, but for the 1D NLPCA approximation.
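The distinction between uncorrelated and independent variables drawn above can be checked on a toy example. The sketch below does not use the Lorenz data: it assumes a hypothetical parabola, $x_3 = x_1^2$ with $x_1$ sampled symmetrically about zero, chosen because the covariance then vanishes by symmetry while the dependence is exact. The variable and function names are illustrative only.

```python
import numpy as np

# Hypothetical toy data: x1 symmetric about zero, x3 a deterministic function of x1.
x1 = np.linspace(-1.0, 1.0, 201)
x3 = x1 ** 2

def corr(a, b):
    """Pearson correlation coefficient of two 1D samples."""
    return np.corrcoef(a, b)[0, 1]

r_linear = corr(x1, x3)              # vanishes by symmetry: uncorrelated
r_magnitude = corr(np.abs(x1), x3)   # large: |x1| and x3 rise together

# The off-diagonal covariance vanishes, so the PCA eigenvectors lie along
# the coordinate axes and no single linear mode links x1 to x3.
cov = np.cov(np.vstack([x1, x3]))

print(f"corr(x1, x3)   = {r_linear:+.3f}")
print(f"corr(|x1|, x3) = {r_magnitude:+.3f}")
print("covariance matrix:")
print(np.round(cov, 4))
```

A straight line through these points captures none of the relationship between the two variables, whereas a curve such as the parabola itself captures all of it.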
The 1D NLPCA approximation characterises this dependence between $x_1$ and $x_3$, while also describing the covariability of $x_1$ and $x_2$ captured by the 1D PCA approximation. The power of NLPCA is that it can characterise covariability between variables that are uncorrelated, but not independent, which PCA cannot.

Figure 3.5 displays the 2D PCA approximation of the data $\mathbf{x}(t_n)$ when $\eta = 0$; this surface explains 95% of the variance. The 2D PCA approximation is a flat sheet that characterises well the structure of the data as projected in the $(x_1, x_3)$ and $(x_2, x_3)$ planes, but fails to reproduce the structure seen in the projection on the $(x_1, x_2)$ plane. In Figure 3.6, the result of a 2D nonmodal NLPCA of $\mathbf{x}(t_n)$ is shown. This surface explains 99.5% of the variance, implying an order of magnitude reduction in FUV as compared to the PCA result. The network used to perform the NLPCA had 2 neurons in the bottleneck layer and $L = 6$ neurons in the encoding and decoding layers. It was found that decreasing $L$ below 6 also decreased the fraction of variance explained, and increasing it above $L = 6$ had little effect upon the results.
The 2D nonmodal NLPCA result is highly robust: a sample of 4 NLPCA models (not shown) has NMSD between models of at most 0.1%. As with the 1D example considered above, the NLPCA approximation is a substantially better approximation to the original data set than is the PCA approximation.

Figure 3.5: Noise-free Lorenz data for a subsample of 584 points and their 2D PCA approximation. The dots represent the original data points, and the open circles the points of the approximation.

Consider now a dataset $\mathbf{x}(t_n)$ obtained from equation (3.4) with $\eta = 2.0$. The 1D PCA approximation to $\mathbf{x}(t_n)$ (not shown) explains 59% of the variance. The 1D NLPCA approximation (Figure 3.7) explains 74% of the variance. The curve in Figure 3.7 is very similar to that shown in Figure 3.4 for the $\eta = 0$ case; the two-lobed structure of the data is still manifest at a noise level of $\eta = 2.0$, and the NLPCA is able to recover it. Addressing again the issue of robustness of results, 6 different NLPCA approximations to $\mathbf{x}(t_n)$ were found to have NMSD varying from 0.5% to 3%. These 6 curves agree in their essential details, although the set displays more variability between members than did the corresponding set for $\eta = 0$. Figure 3.8 shows the results of a 2D nonmodal
NLPCA performed on this data set. This explains 97.4% of the variance, in contrast to the 2D PCA approximation (not shown), which explains 94.2%. Thus, the FUV of the 2D nonmodal NLPCA approximation is about half that of the 2D PCA approximation. These results too are robust; the NMSD between different 2D nonmodal NLPCA models was about 0.2%. The 2D nonmodal NLPCA approximation is again an improvement over the 2D PCA approximation, but not by such a substantial margin as was the case when $\eta = 0$. The noise-free Lorenz attractor is very nearly two-dimensional, so the 2D nonmodal NLPCA was able to account for almost all of the variance. The addition of noise acted to smear out this fine fractal structure and made the data cloud more 3-dimensional. The 2D NLPCA applied to this quasi-3D structure could not produce as close an analogue as was the case when $\eta = 0$.

Figure 3.7: Lorenz data with noise level $\eta = 2.0$ for a subsample of 584 points and its 1D NLPCA approximation. Dots represent the data points and the open circles represent points of the approximation.
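The FUV comparisons quoted above follow directly from FUV = 1 - FEV. As a quick arithmetic check, the snippet below recomputes the ratio of NLPCA to PCA unexplained variance from the 2D percentages reported in the text; the dictionary layout and the function name are illustrative only.

```python
# Percentages of variance explained by the 2D approximations, as quoted in the text.
fev_2d = {
    0.0: {"pca": 95.0, "nlpca": 99.5},   # noise level eta = 0
    2.0: {"pca": 94.2, "nlpca": 97.4},   # noise level eta = 2.0
}

def fuv_ratio(entry):
    """Ratio of NLPCA to PCA fraction of unexplained variance (FUV = 1 - FEV)."""
    fuv_pca = 100.0 - entry["pca"]
    fuv_nlpca = 100.0 - entry["nlpca"]
    return fuv_nlpca / fuv_pca

ratios = {eta: fuv_ratio(entry) for eta, entry in fev_2d.items()}
for eta, ratio in ratios.items():
    print(f"eta = {eta}: FUV(NLPCA) / FUV(PCA) = {ratio:.2f}")
```

The ratios come out at about 0.10 for the noise-free data (the order of magnitude reduction) and about 0.45 for $\eta = 2.0$ (roughly half).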
At a noise strength of $\eta = 5.0$, the data set $\mathbf{x}(t_n)$ still has a discernible two-lobed structure, but it is substantially obscured. The 1D PCA approximation (not shown) explains 54% of the variance, whereas the 1D NLPCA approximation (shown in Figure 3.9) explains 65%. Again, the 1D NLPCA approximation to $\mathbf{x}(t_n)$ is qualitatively similar to that obtained in the noise-free case (Figure 3.4). The $\eta = 0$ and $\eta = 5.0$ 1D NLPCA approximations differ at the ends of the curves. Presumably, the structure represented in the former is somewhat washed out by noise in the latter. Four different NLPCA curves for the data obtained with $\eta = 5.0$ share their gross features, but differ fairly substantially in detail. In this case, the NMSD between curves varies between 5% and 10%. The 2D nonmodal NLPCA approximation to these data (not shown) explains 90% of the variance, only slightly more than the 2D PCA approximation, which explains 88% of the variance.

Finally, at a noise level of $\eta = 10.0$ the two-lobed structure of $\mathbf{x}(t_n)$ is no longer obvious, and the data cloud appears as a fairly homogeneous, vaguely ellipsoidal blob. The results of NLPCA by this noise level are no longer robust, tending to be asymmetric,
convoluted curves. At this noise level, then, NLPCA seems no longer to be a useful technique for characterising low-dimensional nonlinear structure of the data set, precisely because the addition of noise has destroyed this structure.

Figure 3.8: As in Figure 3.7, but with the 2D nonmodal NLPCA approximation of the Lorenz data with noise level $\eta = 2.0$.

3.4  Conclusion

The results contained in this chapter demonstrate that NLPCA is able to produce low-dimensional approximations to multivariate data sets that are more representative of the data than the corresponding PCA approximations. The 1D NLPCA approximation to a data set sampled from the Lorenz attractor explains 76% of the variance, in contrast to 60% explained by the 1D PCA approximation, and characterises the two-lobed structure of the data. A 2D nonmodal NLPCA approximation explains 99.5% of the variance, where the 2D PCA approximation explains 95%. As the box-counting dimension of the attractor from which the data is sampled is 2.04, the 2D nonlinear approximation is able to capture almost the entire structure of the data; because the attractor is embedded in $\mathbb{R}^3$, the 2D PCA approximation cannot do this. With the addition of Gaussian noise of small to moderate strength to the data, NLPCA remains superior to PCA in its ability to detect low-dimensional structure. As the noise level increases, the improvement of NLPCA over PCA decreases, until eventually the noise dominates the signal and NLPCA cannot improve upon PCA. Table 3.1 presents a summary of these results.

Consideration of synthetic data was useful because it is of low dimension and easily visualised, and because the sensitivity of the results of NLPCA to noise level can be assessed through manipulation of the strength of the noise. Real climate data sets, however, are of high dimensionality and of a fixed noise level, and it is the potential of NLPCA to produce robust, enlightening approximations to climate data that determines the utility of the method. The analysis of such data sets is the subject of the next two Chapters.

                    1D                  2D
               PCA     NLPCA      PCA     NLPCA
$\eta = 0$     60      76         95      99.5
$\eta = 2$     59      74         94.2    97.4
$\eta = 5$     54      65         88      90

Table 3.1: Percentages of variance explained by the 1D and 2D PCA and NLPCA approximations to the Lorenz data for the three noise levels $\eta$ considered.

Chapter 4

Nonlinear Principal Component Analysis of Tropical Indo-Pacific Sea Surface Temperature and Sea Level Pressure

4.1  Introduction

Interannual variability of the Earth's climate system is dominated by the tropical Pacific basin-wide phenomenon known as El Niño and the Southern Oscillation (ENSO) (Philander, 1990). This phenomenon is characterised by alternating periods of anomalously warm or cold water in the eastern equatorial Pacific, alternately weakening or strengthening the zonal sea surface temperature (SST) gradient across the Pacific ocean. These phases of the phenomenon are referred to respectively as El Niño and La Niña. Associated with these changes in SST are pronounced changes in the zonal gradient of thermocline depth and sea surface height. In the atmosphere, El Niño (La Niña) events are associated with a slackening (strengthening) of the zonally-oriented Walker circulation, implying a reduction (increase) in wind stress applied to the ocean surface associated with the easterly Trade Winds, and an eastward (westward) shift in the region of deep convection. Variability of the Walker circulation manifests itself as an east-west dipolar fluctuation in sea level pressure (SLP) known as the Southern Oscillation.
As was originally described by Bjerknes (1969), these changes in atmospheric circulation feed back on the ocean and reinforce the original SST anomalies. In the late 1980s, a mechanism was proposed for the transition between El Niño and La Niña events that invoked the dynamics of oceanic baroclinic waves in the so-called "equatorial waveguide" (Suarez and Schopf, 1988; Battisti and Hirst, 1989). Denoted the "delayed oscillator" mechanism, this has become the dominant paradigm for the negative feedback that terminates ENSO events (Battisti and Sarachik, 1995).

ENSO variability is aperiodic, with power primarily in the 4-7 year band (Tangang et al., 1998). Models invoked to explain the aperiodicity of the variability range from the stochastic forcing of a linear system (Penland, 1996) to low-dimensional dynamics of a chaotic system (Jin et al., 1996); the actual character of ENSO dynamics is still a subject of some debate, although recent evidence favours the former of the above models (Penland et al., 1999).

While the physical mechanisms producing ENSO are thought to be mainly confined to the equatorial Pacific, its effects are global in scale (Philander, 1990; Trenberth et al., 1998). In consequence, forecasts of ENSO variability have been attempted by a number of researchers, and throughout the 1990s it has been the paradigmatic problem of climate prediction (Barnston et al., 1994; Barnston et al., 1999). ENSO dynamics has quite certainly been the most intensively studied problem in climate physics for the last decade. ENSO variability is often diagnosed from observations using linear statistical tools, in particular PCA; the dominant ENSO signal in tropical Pacific SST and SLP is usually identified with the leading PCA approximation to these data sets.

In this chapter, NLPCA is applied to climatic data sets relevant to ENSO variability: tropical Pacific Ocean sea surface temperature and tropical Indo-Pacific sea level pressure. In the case of SST, NLPCA is able to produce one- and two-dimensional approximations that are of greater fidelity to the original data than the corresponding one- and two-dimensional PCA approximations. In particular, the 1D SST NLPCA describes ENSO variability in a manner that characterises the asymmetry in spatial distribution of temperature anomalies between El Niño and La Niña events, which are treated symmetrically in the 1D PCA approximation. The improvement of the NLPCA approximations
over PCA is more modest, but still notable, in the case of SLP.

In Chapter 3, I considered the application of NLPCA to synthetic data sets of sufficiently low dimension, and of sufficiently low noise level, that their underlying low-dimensional structure was manifest. Nonlinear PCA was able to recover this structure, even in the presence of moderate noise levels. Fundamentally, however, NLPCA is only of practical use in climate research if it is able to robustly characterise low-dimensional structure in real data sets arising from the climate system, and improve upon the results obtained by traditional linear methods. I show here that this is indeed the case, and thereby demonstrate the potential utility of NLPCA in the analysis of climatic data sets.

4.2  Data and Model Building

The SST data considered consist of monthly-averaged NOAA sea surface temperatures for the tropical Pacific Ocean. The data are on a 2° × 2° grid from 19S to 19N, and from 125E to 69W, and span the period from January 1950 to December 1998. This data set was produced using the PCA-based interpolation method developed by Smith et al. (1996).
A climatological annual cycle was calculated by averaging the data for each calendar month, and monthly SST anomalies (SSTA) were defined relative to this annual cycle.

The SLP data were COADS monthly-averaged sea level pressure (SLP) over the tropical Indo-Pacific area (Woodruff et al., 1987) on a 2° × 2° grid from 27S to 19N, and from 31E to 67W, covering the period from January 1950 to December 1998. The annual cycle was removed in the same fashion as for the SST data to produce sea level pressure anomalies (SLPA). The SLPA field was then smoothed in time using a 3-month running mean filter and in space using a 1-2-1 filter in each spatial direction.

In the analysis of both the SSTA and SLPA data, the early stopping algorithm described in Chapter 2 was used. In all analyses, 20% of the data was held aside in a validation set, for which network performance was monitored as training proceeded. Training was stopped when this performance began to degrade, or after 5000 iterations, whichever came first. It was found that increasing the maximum number of iterations beyond 5000 did not affect the results of the analysis.

4.3  Tropical Pacific Sea Surface Temperature

To render the NLPCA problem tractable, the SSTA data set was pre-processed by projecting it on the space of its first 10 EOF modes $\{\mathbf{e}_k\}_{k=1}^{10}$, in which 91.4% of the total variance is contained. Doing so takes advantage of the data compression aspect of PCA, which is a feature distinct from feature extraction, for which NLPCA shall be used.
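The pre-processing steps described above (removing a climatological annual cycle, then projecting the anomalies onto the leading EOFs) can be sketched as follows. This is an illustration on synthetic data, not the code used for the analysis: the array sizes, the random test field, and the function names are assumptions, and the temporal and spatial smoothing applied to the SLPA field is omitted.

```python
import numpy as np

def monthly_anomalies(field):
    """Remove the climatological annual cycle.

    field: array of shape (n_months, n_grid), starting in January and
    covering a whole number of years.
    """
    n_months, n_grid = field.shape
    by_year = field.reshape(n_months // 12, 12, n_grid)
    climatology = by_year.mean(axis=0)       # one mean map per calendar month
    return (by_year - climatology).reshape(n_months, n_grid)

def eof_truncate(anom, n_modes):
    """Project anomalies onto their leading EOFs, computed via the SVD."""
    centred = anom - anom.mean(axis=0)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    eofs = vt[:n_modes]                      # spatial patterns, unit norm
    pcs = centred @ eofs.T                   # principal component time series
    retained = (s[:n_modes] ** 2).sum() / (s ** 2).sum()
    return eofs, pcs, retained

# Synthetic stand-in for a gridded monthly field (49 years, 60 grid points).
rng = np.random.default_rng(0)
n_years, n_grid = 49, 60
months = np.arange(12 * n_years)
seasonal = np.sin(2 * np.pi * months / 12)[:, None] * rng.uniform(1, 3, n_grid)
field = seasonal + rng.standard_normal((12 * n_years, n_grid))

anom = monthly_anomalies(field)
eofs, pcs, retained = eof_truncate(anom, n_modes=10)
print(f"variance retained by 10 modes: {100 * retained:.1f}%")
```

The columns of `pcs` would then serve as the network input in place of the full gridded field; `retained` plays the role of the 91.4% figure quoted for the SSTA data, although the value obtained for this random test field is not meaningful in itself.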
Such pre-processing of data to reduce the problem to a manageable size is common in rotated PCA (Barnston and Livezey, 1987) and in statistical forecasting (Barnston, 1994; Tangang et al., 1998). The first 3 EOF spatial patterns of SSTA are displayed in Figure 4.1; these explain, respectively, 57.6%, 10.9%, and 6.8% of the total SSTA variance.

Figure 4.1: Spatial patterns of the first three SSTA EOFs, normalised to unit magnitude. The contour interval is 0.02, the zero contour is in bold, and negative contours are dashed.

Figure 4.2: Scatterplot of SSTA data projected onto the plane spanned by the first two EOF patterns.

A scatterplot of the two leading principal component time series is shown in Figure 4.2. It can also be considered to be a plot of the projection of the data into the plane spanned by the first two SSTA EOF modes. The time series corresponding to these two PCA modes are uncorrelated, but they are clearly not independent; the distribution of the data appears to be markedly non-Gaussian. Indeed, Figure 4.2 indicates that there is an inverted U-shape underlying the data, such that both strongly positive and strongly negative values of the first PCA time series are associated with negative values of the second. Physically, this coupling of the PC1 and PC2 time series describes the fact that the most positive SST anomalies during an average El Niño event lie closer to the eastern boundary of the
NLPCA  of Tropical Indo-Pacific  SST and SLP  45  P a c i f i c t h a n do t h e coldest a n o m a l i e s d u r i n g a n average L a N i n a e v e n t . I s h a l l r e t u r n t o this point later. C o n s i d e r first a m o d a l N L P C A d e c o m p o s i t i o n o f t h i s S S T A d a t a . M o d e 1, f o u n d u s i n g a n e t w o r k w i t h L — 4 nodes i n the e n c o d i n g a n d d e c o d i n g layers a n d a single n e u r o n i n t h e b o t t l e n e c k l a y e r , e x p l a i n s 6 9 . 1 % o f the v a r i a n c e i n the 1 0 - d i m e n s i o n a l E O F space, a n d t h u s 6 3 . 3 % o f t h e v a r i a n c e i n the t o t a l S S T A d a t a , as c o m p a r e d t o 5 7 . 6 % e x p l a i n e d b y t h e I D P C A a p p r o x i m a t i o n . F o u r c a n d i d a t e m o d e l s were o b t a i n e d f r o m a n e n s e m b l e o f 20; these m o d e l s differed w i t h N M S D of at m o s t 1%. P r o j e c t i o n s o f t h e first N L P C A  mode  o n t o t h e s u b s p a c e s s p a n n e d b y the S S T A E O F s (ei,e ), (e ,e ), (ei,e ), a n d (ei,e ,e ) 2  are s h o w n i n F i g u r e 4.3 (a)-(d), r e s p e c t i v e l y .  2  A l l four p r o j e c t i o n s  it is difficult t o u n d e r s t a n d the s t r u c t u r e o f the N L P C A projection.  3  3  2  are s h o w n b e c a u s e  a p p r o x i m a t i o n f r o m a single  T h i s d i f f i c u l t y is p a r t i c u l a r l y e v i d e n t i n F i g u r e 4 . 3 ( b ) :  the curve, viewed  edge-on, a p p e a r s t o be self-intersecting, w h e n i n fact the o t h e r p r o j e c t i o n s  demonstrate  t h i s is n o t t h e case. F i g u r e 4.3(a) i n d i c a t e s t h a t N L P C A m o d e 1 c h a r a c t e r i s e s t h e shaped  structure discussed i n the previous paragraph;  m i x t u r e o f P C A m o d e s 1 a n d 2.  
Figure 4.3: Scatterplot of SSTA data (points) and SSTA NLPCA mode 1 approximation (open circles) projected onto the planes spanned by EOFs (a) $(\mathbf{e}_1,\mathbf{e}_2)$, (b) $(\mathbf{e}_2,\mathbf{e}_3)$, (c) $(\mathbf{e}_1,\mathbf{e}_3)$. (d) shows a scatterplot of the 1D NLPCA approximation projected into the subspace $(\mathbf{e}_1,\mathbf{e}_2,\mathbf{e}_3)$.

Figure 4.4: (a) Plot of $\alpha_1(t_n) = \mathbf{s}_f(\mathbf{X}(t_n))$, the standardised time series associated with SSTA NLPCA mode 1. (b) Plot of the Nino 3.4 index normalised to unit variance.

Associated with this mode is the standardised time series

$$\alpha_1(t_n) = \frac{\mathbf{s}_f(\mathbf{X}(t_n)) - \langle \mathbf{s}_f \rangle}{\langle (\mathbf{s}_f - \langle \mathbf{s}_f \rangle)^2 \rangle^{1/2}}$$

corresponding to the output of the single neuron in the bottleneck layer. A plot of $\alpha_1(t_n)$ appears in Figure 4.4(a). This time series bears a strong resemblance to the Nino 3.4 time series (defined as the average SSTA over a box from 7S to 7N, and from 119W to 171W) displayed in Figure 4.4(b); the correlation coefficient between the two series is 0.88.

In contrast to PCA, no single spatial pattern is associated with any given NLPCA mode. The approximation $\hat{\mathbf{X}}$, however, provides a sequence of patterns that can be visualised cinematographically. This cinematographic interpretation is implicit in traditional PCA: the 1D PCA approximation $\hat{\mathbf{X}}(t_n) = (\mathbf{e}_1 \cdot \mathbf{X}(t_n))\,\mathbf{e}_1$ describes the evolution in time of a standing oscillation.
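The standing-oscillation form of the 1D PCA approximation, and the standardisation applied to a mode's time series, can be sketched numerically as follows. This is a minimal illustration on synthetic data, not the code used in this work; the function names `pca_1d_approximation` and `standardise`, the array layout (rows are times, columns are spatial points), and the synthetic field are all assumptions made for the example.

```python
import numpy as np

def pca_1d_approximation(X):
    """Return (e1, a1, X_hat, frac) for a data matrix X of shape (N, M).

    Rows are times t_n; columns are spatial points (or EOF coordinates).
    """
    Xc = X - X.mean(axis=0)                  # anomalies about the time mean
    # The right singular vectors of the anomaly matrix are the EOFs.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    e1 = vt[0]                               # leading EOF, unit length
    a1 = Xc @ e1                             # PC time series e1 . X(t_n)
    X_hat = np.outer(a1, e1)                 # fixed pattern e1, amplitude a1(t_n)
    frac = X_hat.var(axis=0).sum() / Xc.var(axis=0).sum()
    return e1, a1, X_hat, frac

def standardise(s):
    """Remove the time mean and scale to unit variance."""
    return (s - s.mean()) / s.std()

# Synthetic field: one oscillating pattern plus weak noise (illustrative only).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 20.0, 400)
pattern = np.array([1.0, 0.5, -0.5, -1.0])
X = np.outer(np.sin(t), pattern) + 0.1 * rng.standard_normal((400, 4))

e1, a1, X_hat, frac = pca_1d_approximation(X)
alpha = standardise(a1)
```

Every row of `X_hat` is a multiple of the same vector `e1`, which is the sense in which a 1D PCA approximation can only be a standing oscillation.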
This oscillation has a fixed spatial structure with an amplitude that varies in time. The more general approximation $\hat{\mathbf{X}}(t_n) = (\mathbf{f} \circ \mathbf{s}_f)(\mathbf{X}(t_n))$, with $\mathbf{s}_f$ and $\mathbf{f}$ nonlinear, is not so constrained, and can characterise more complex lower-dimensional variability. There is no a priori reason to expect the optimal 1D approximation to a spatial field to be a standing wave - but standing waves are the only such approximations that traditional PCA can produce. The power of NLPCA lies in its ability to characterise more general lower-dimensional structure.

Figure 4.5 displays maps characterising the first NLPCA mode corresponding to values of the time series $\alpha_1 = -3.5, -1.5, -0.75, -0.25, 0.25, 0.75, 1.5, 3.5$. These values were chosen to provide a representative sample of spatial structures associated with the NLPCA approximation. Clearly, NLPCA mode 1 describes the evolution of average ENSO events, in contrast with PCA mode 1, which describes only the standing oscillation associated with average ENSO variability. This difference between NLPCA and PCA modes 1 results from the spatial asymmetry between the average warm and cold phases of ENSO. In particular, warm events described by NLPCA mode 1 display the strongest anomalies near the Peruvian coast, whereas the cold events are strongest near 150W. This asymmetry in the evolution of NLPCA mode 1 arises because NLPCA mode 1 mixes PCA modes 1 and 2: for both El Nino and La Nina events, the PCA mode 2 spatial map (Figure 4.1(b)) enters into the NLPCA mode 1 approximation with the same (negative) sign.

This spatial asymmetry between average El Nino and La Nina events is manifest in a composite study. Figure 4.6(a) is a composite of NDJ (November, December, January)
averaged SSTA for those years in which the NDJ Nino 3.4 index is more than one standard deviation above the long-term average; Figure 4.6(b) is the same for those NDJ for which Nino 3.4 is more than one standard deviation below the long-term average. This averaging period was used for the composites because NDJ displays the largest variance of all 3-month seasons. These two maps correspond to the SSTA patterns of an average El Nino and an average La Nina, respectively. Note that, consistent with the maps corresponding to the 1D NLPCA approximation (Figure 4.5), the largest SST anomalies tend to be located in the central Pacific during average La Nina events and in the eastern Pacific during average El Nino events. This asymmetry in the composite fields was previously noted by Hoerling et al. (1997).

Figure 4.5: Sequence of spatial maps characterising SSTA NLPCA mode 1 for (a) $\alpha_1 = -3.5$ (b) $\alpha_1 = -1.5$ (c) $\alpha_1 = -0.75$ (d) $\alpha_1 = -0.25$ (e) $\alpha_1 = 0.25$ (f) $\alpha_1 = 0.75$ (g) $\alpha_1 = 1.5$ (h) $\alpha_1 = 3.5$. Contour interval: 0.5°C.

The symmetric component of the composite El Nino and La Nina maps, as defined in Appendix B, is displayed in Figure 4.6(c). This map, which in a rough sense characterises the pattern in the composites that is related nonlinearly to the Nino 3.4 time series, bears a strong resemblance to SST EOF mode 2 (Figure 4.1(b)). In fact, the spatial correlation between the two maps is -0.90. The antisymmetric component of the composite (not shown) bears a strong resemblance to EOF 1; the pattern correlation between these two maps is 0.975.
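The composite procedure described above can be sketched as follows. This is an illustration only: it assumes the field has already been averaged over the chosen season (one row per year), and it assumes that the symmetric and antisymmetric components are the half-sum and half-difference of the two composites. The definition actually used is the one in Appendix B, which may differ in normalisation (pattern correlations are unaffected by such a factor). All function and variable names are hypothetical.

```python
import numpy as np

def composites(field, index):
    """Average `field` (shape (N, M)) over the times at which `index`
    (shape (N,)) is more than one standard deviation above or below its mean."""
    z = (index - index.mean()) / index.std()
    warm = field[z > 1.0].mean(axis=0)       # "average El Nino" map
    cold = field[z < -1.0].mean(axis=0)      # "average La Nina" map
    return warm, cold

def symmetric_antisymmetric(warm, cold):
    """Assumed decomposition: half-sum and half-difference of the composites."""
    return 0.5 * (warm + cold), 0.5 * (warm - cold)

def pattern_correlation(a, b):
    """Spatial (pattern) correlation between two maps."""
    return np.corrcoef(a, b)[0, 1]

# Synthetic field with a part linear in the index (pattern p1) and a part
# quadratic in the index (pattern p2); values are illustrative only.
index = np.linspace(-2.0, 2.0, 201)
p1 = np.array([1.0, 0.5, -0.5, -1.0])
p2 = np.array([-1.0, 0.5, 0.5, -1.0])
field = np.outer(index, p1) + np.outer(index ** 2, p2)

warm, cold = composites(field, index)
sym, anti = symmetric_antisymmetric(warm, cold)
```

In this synthetic case the symmetric component recovers the pattern that depends quadratically on the index and the antisymmetric component recovers the linear one, which mirrors the correspondence with EOF 2 and EOF 1 noted in the text.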
Thus, PCA mode 1 may be interpreted as characterising the component of average ENSO behaviour that is antisymmetric between average El Nino and La Nina events. By mixing EOF modes 1 and 2, NLPCA mode 1 is able to characterise the spatial asymmetry between average El Nino and La Nina events. The bias of SST toward warm anomalies in the eastern part of the basin and toward cold anomalies in the western and central parts is also evident in the study of Burgers and Stephenson (1999), who calculate the skewness of the observed SSTA distribution. It is interesting to note the striking similarity between their map of the spatial distribution of skewness (their Figure 3(a)) and the symmetric component of the SSTA composite displayed in Figure 4.6(c).

Figure 4.6: SSTA composite maps for "average" (a) El Nino and (b) La Nina events. Contour interval: 0.5°C. (c) Symmetric component of composites (a) and (b). Contour interval: 0.1°C. See text for definition of composites and of the symmetric component.

A final comparison of the 1D NLPCA and 1D PCA approximations is given in Figures 4.7 (a)-(c), which show respectively maps of the pointwise correlation between the original SSTA data and the 1D NLPCA approximation, the pointwise correlation between SSTA and the 1D PCA approximation, and the difference between these two pointwise correlations. The two approximations are equally well correlated with the original data over the eastern-central half of the Pacific Ocean, except near the Ecuadorian coast, where the NLPCA correlations are somewhat higher than those of PCA.
In the western Pacific, and in particular in the neighbourhood of the EOF mode 1 zero line, the 1D NLPCA approximation displays greater fidelity to the original data, as measured by pointwise correlation, than does the 1D PCA approximation.

Now consider SSTA NLPCA mode 2, which was calculated using a network containing $L = 3$ neurons in the encoding and decoding layers. Figure 4.8 displays mode 2 projected onto the subspaces spanned by SSTA EOFs $(\mathbf{e}_1,\mathbf{e}_2)$, $(\mathbf{e}_2,\mathbf{e}_3)$, $(\mathbf{e}_1,\mathbf{e}_3)$, and $(\mathbf{e}_1,\mathbf{e}_2,\mathbf{e}_3)$. The 5 candidate models from an ensemble of 20 trials differed from each other with NMSD always less than 4%. NLPCA mode 2 explains 11.1% of the variance in the original SSTA data. The associated standardised time series, $\alpha_2(t_n)$, is shown in Figure 4.9. Interestingly, the correlation coefficient between $\alpha_1(t_n)$ and $\alpha_2(t_n)$ is -0.06; the two time series are uncorrelated. Figure 4.10 displays maps of SSTA NLPCA mode 2, $\hat{\mathbf{X}}^{(2)}$, corresponding to $\alpha_2 = -4, -1, -0.25, 0, 0.15, 0.25, 0.3, 0.35, 0.4, 0.5, 0.75, 1.5$. These values of $\alpha_2$ were selected to present a representative sample of maps describing NLPCA mode 2. When $\alpha_2$ is strongly negative, SSTA NLPCA mode 2 is characterised by negative anomalies in the central and western Pacific and positive anomalies in the eastern Pacific. As $\alpha_2$ increases through zero, the anomalies decrease in magnitude, while the positive anomalies in the eastern part of the basin become increasingly concentrated in the equatorial region. Eventually, the region of positive SSTA breaks off from the coast of South America and migrates into the central Pacific. As $\alpha_2$ increases further, the SSTA pattern becomes the opposite of that for $\alpha_2$ near zero, with positive anomalies in the central and western
Pacific and negative anomalies in the east. Finally, for $\alpha_2$ near the extreme positive part of its range, SSTA NLPCA mode 2 is characterised by negative anomalies along the equator, extending to the dateline, with positive anomalies throughout the rest of the basin.

Figure 4.7: Map of pointwise correlation coefficient between observed SSTA and (a) 1D NLPCA approximation, (b) 1D PCA approximation, and (c) difference between (a) and (b).

Figure 4.8: As for Figure 4.3, but for SSTA NLPCA mode 2.

Figure 4.9: As for Figure 4.4(a), but for $\alpha_2(t_n)$, the time series corresponding to SSTA NLPCA mode 2.

Figure 4.10: Maps corresponding to SSTA NLPCA mode 2 for (a) $\alpha_2 = -4$ (b) $\alpha_2 = -1$ (c) $\alpha_2 = -0.25$ (d) $\alpha_2 = 0$ (e) $\alpha_2 = 0.15$ (f) $\alpha_2 = 0.25$ (g) $\alpha_2 = 0.3$ (h) $\alpha_2 = 0.35$ (i) $\alpha_2 = 0.4$ (j) $\alpha_2 = 0.5$ (k) $\alpha_2 = 0.75$ (l) $\alpha_2 = 1.5$. Contour interval: 0.5°C.

Because the anomalies are often concentrated along the equator, it is reasonable to associate this mode with certain aspects of ENSO variability not captured by NLPCA mode 1. Indeed, it is interesting to note from Figure 4.9 that NLPCA mode 2 is more active in the later part of the record than in the earlier.
The two strong minima in $\alpha_2(t_n)$ coincide with the decay phases of the large El Nino events of 1982/83 and 1997/98, describing the lingering patches of warm water in the eastern tropical Pacific observed during these periods. Two of the three weaker minima in the more active period are associated with the peaks of the La Nina events of 1984/85 and 1988/89. Indeed, the cold anomalies during La Nina events in the later period are somewhat stronger, more concentrated in the central Pacific Ocean, and weaker in the eastern Pacific Ocean than those in the earlier part of the record, as indicated by a composite analysis (Figure 4.11). A number of studies have noted a shift in ENSO variability in 1977 (see, e.g., Wang, 1995). The apparent nonstationarity of $\alpha_2(t_n)$ is consistent with a shift at this time, although the precise timing of the shift in $\alpha_2(t_n)$ is not obvious. The 1977 shift is in fact manifest in the SSTA NLPCA mode 1 time series $\alpha_1(t_n)$ (Figure 4.4); the time series is biased toward negative extrema before 1977 and toward positive extrema after 1977. It should however be noted that this shift may simply be an artifact of the technique used to reconstruct the gridded SST data set.
Thus, the first mode of the modal NLPCA decomposition of tropical Pacific SSTA describes the average variability associated with the ENSO phenomenon, and nicely characterises the asymmetry in spatial structure between average El Nino and La Nina events. The second mode characterises the difference in evolution between different ENSO events, and in particular, displays a nonstationarity consistent with the observed "regime shift" in ENSO variability.

Plots of a 2D nonmodal NLPCA approximation of the SSTA data (i.e., using 2 neurons in the bottleneck layer), projected in the subspaces spanned by SSTA EOFs $(\mathbf{e}_1,\mathbf{e}_2)$, $(\mathbf{e}_2,\mathbf{e}_3)$, $(\mathbf{e}_1,\mathbf{e}_3)$, and $(\mathbf{e}_1,\mathbf{e}_2,\mathbf{e}_3)$, are shown in Figure 4.12 (a)-(d). The associated network used $L = 6$ neurons in the encoding and decoding layers, and the NMSD between candidate models (8 out of an ensemble of 20) varied between 1% and 3%. The 2D nonmodal NLPCA approximation explains 79.0% of the variance in the truncated data set, and thus 72.2% of the variance of the original data. The time series corresponding to the output of the bottleneck layer, denoted $(\beta_1,\beta_2)(t_n) = \mathbf{s}_f(\mathbf{X}(t_n))$, are shown in Figure 4.13.
These two time series are highly correlated with each other ($r = -0.835$) and with the Nino 3.4 index ($r_1 = -0.879$ and $r_2 = 0.889$, respectively). Because the 2D nonmodal NLPCA depends on 2 parameters, $\beta_1$ and $\beta_2$, it is difficult to visualise the results using a sequence of maps as was done with the modal NLPCA in Figures 4.5 and 4.10. Figures 4.14(a) and (b) display maps of the pointwise correlation coefficient between the SSTA data and the 2D PCA and 2D nonmodal NLPCA approximations, respectively. The 2D nonmodal NLPCA approximation produces higher correlations than the 2D PCA approximation in the central equatorial, western, and southeastern Pacific, and slightly lower correlations in the eastern equatorial Pacific.

It is worth considering the time series $\beta_1(t_n)$ and $\beta_2(t_n)$ in more detail. As was discussed in Chapter 2, the parameterisation $\mathbf{s}_f(\mathbf{X}(t_n))$ of the $P$-dimensional surface determined by NLPCA is only defined up to an arbitrary homeomorphism (i.e., a continuous, one-to-one, and onto function with continuous inverse). That is, for an arbitrary homeomorphism $\mathbf{g} : \mathbb{R}^P \to \mathbb{R}^P$, the time series $\mathbf{g}(\mathbf{s}_f(\mathbf{X}(t_n)))$ is also an acceptable parameterisation of the surface, because $\mathbf{f} \circ \mathbf{s}_f = (\mathbf{f} \circ \mathbf{g}^{-1}) \circ (\mathbf{g} \circ \mathbf{s}_f)$. In particular, for any homeomorphism $\mathbf{g} : \mathbb{R}^2 \to \mathbb{R}^2$, $(g_1(t_n), g_2(t_n)) = \mathbf{g}(\beta_1(t_n), \beta_2(t_n))$ parameterises the surface found by 2D nonmodal NLPCA. Which parameterisation is determined by the
W h i c h p a r a m e t e r i s a t i o n is d e t e r m i n e d b y t h e  Chapter  4. NLPCA  of Tropical Indo-Pacific  SST and SLP  Figure 4.12: As for Figure 4.3, but for SSTA 2D nonmodal NLPCA approximation.  Chapter  4. NLPCA  of Tropical Indo-Pacific  SST and SLP  61  (a)  .41 1950  1  1  1955  1960  1  1965  1 1970  1  1975  1  1  1980  1985  1  1990  1 1995  I 2000  Figure 4.13: Time series (a) and (b) B (t ) where (/3i,/3 )(i„) = s ( X ( i „ ) ) is the pair of time series associated with the SSTA 2D nonmodal N L P C A approximation. Both time series have been normalised to unit variance. 2  n  t  2  f  Chapter 4. NLPCA  of Tropical Indo-Pacific  SST and SLP  62  (a)  150E  180E  150W  120W  90W  120W  90W  120W  90W  (b)  150E  180E  150W (c)  150E  180E  150W  F i g u r e 4.14: M a p s o f p o i n t w i s e c o r r e l a t i o n b e t w e e n o b s e r v e d S S T A a n d (a) 2 D P C A approximation approximation.  (b) 2 D n o n m o d a l N L P C A  a p p r o x i m a t i o n , a n d (c) 2 D m o d a l  NLPCA  Chapter 4. NLPCA  of Tropical Indo-Pacific SST and SLP  63  N L P C A a l g o r i t h m is a m a t t e r o f c h a n c e . T h i s d e g e n e r a c y c o m p l i c a t e s t h e i n t e r p r e t a t i o n o f t h e t i m e series d e t e r m i n e d b y n o n m o d a l N L P C A . I n p a r t i c u l a r , t h e t i m e series  f3i(t ) n  need not be independent, or even uncorrelated. D e t e r m i n i n g a set o f P i n d e p e n d e n t v a r i a b l e s 7, p a r a m e t e r i s i n g t h e surface f r o m t h e set o f P t i m e series /%(£„) d e t e r m i n e d e m p i r i c a l l y b y N L P C A is a n o t h e r p r o b l e m o f f e a t u r e e x t r a c t i o n , i n t h e space o f t h e variables p a r a m e t e r i s i n g t h e surface.  Therefore, P C A or  m o d a l N L P C A c a n b e u s e d t o c a l c u l a t e t h e ~yi(t ). 
In the case at hand, inspection of the scatterplot of $\beta_1(t_n)$ with $\beta_2(t_n)$ indicated that PCA was appropriate for the separation of the correlated time series $\beta_1(t_n)$ and $\beta_2(t_n)$ into two uncorrelated time series $\gamma_1(t_n)$ and $\gamma_2(t_n)$. The homeomorphism $\mathbf{g}$ is thus simply a linear function. The first PCA mode explained 92.7% of the variance in $\beta$-space, and the associated time series $\gamma_1(t_n)$ (not shown) describes average ENSO variability. Its correlation coefficient with the Nino 3.4 time series is 0.92 and with $\alpha_1(t_n)$ is 0.87. The second PCA mode explains the remaining 7.3% of the variance in $\beta$-space, and the associated time series $\gamma_2(t_n)$ (not shown) is rather similar to $\alpha_2(t_n)$. The two time series have a correlation of 0.7, and in particular $\gamma_2(t_n)$ displays the same shift in activity from the pre-1977 to the post-1977 period as does $\alpha_2(t_n)$, with the same prominent peaks appearing in both time series. The parameterisation $(\beta_1(t_n), \beta_2(t_n))$ thus contains essentially the same information as the two time series $\alpha_1(t_n)$ and $\alpha_2(t_n)$. The fact that an extra step of processing is required to allow interpretation of the time series produced by nonmodal NLPCA indicates a distinct advantage of modal over nonmodal NLPCA.
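The linear separation step described above, in which PCA is applied in the space of the bottleneck time series, can be sketched as follows. The sketch uses synthetic stand-ins for $\beta_1(t_n)$ and $\beta_2(t_n)$, so the variance fractions it produces are not those quoted in the text; the function name `decorrelate` and the synthetic series are assumptions made for illustration.

```python
import numpy as np

def decorrelate(beta):
    """Rotate correlated series `beta` (shape (N, P)) onto their principal axes.

    Returns the uncorrelated series gamma (shape (N, P)) and the fraction of
    beta-space variance carried by each, in descending order.
    """
    bc = beta - beta.mean(axis=0)
    cov = np.cov(bc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]          # reorder to descending variance
    evals, evecs = evals[order], evecs[:, order]
    gamma = bc @ evecs                       # gamma = g(beta), with g linear
    return gamma, evals / evals.sum()

# Two strongly anticorrelated synthetic series standing in for beta_1, beta_2.
rng = np.random.default_rng(1)
u = rng.standard_normal(500)
v = rng.standard_normal(500)
beta = np.column_stack([u + 0.3 * v, -u + 0.3 * v])

gamma, frac = decorrelate(beta)
```

Because the map from `beta` to `gamma` is an invertible linear transformation, it is a homeomorphism, so `gamma` parameterises the same surface as `beta`.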
Figure 4.15: As for Figure 4.3, but for SSTA 2D modal NLPCA approximation.

Figure 4.15 displays the 2D modal approximation to the SSTA data, obtained by adding the two 1D modal approximations obtained before, projected in SSTA EOF subspaces $(\mathbf{e}_1,\mathbf{e}_2)$, $(\mathbf{e}_2,\mathbf{e}_3)$, $(\mathbf{e}_1,\mathbf{e}_3)$, and $(\mathbf{e}_1,\mathbf{e}_2,\mathbf{e}_3)$. Because of the variance-partitioning property of NLPCA, the fraction of variance explained by this approximation is the sum of the fractions explained by the two 1D modal approximations, i.e., 74.4%, which is slightly greater than that obtained with the 2D nonmodal approximation. This approximation differs in detail from the nonmodal approximation displayed in Figure 4.12, but the two agree broadly in their general features. Figure 4.14(c) displays a map of the pointwise correlation coefficient between the 2D modal NLPCA approximation and the original SSTA data; correlations are somewhat higher than those of the 2D nonmodal approximation in the western Pacific Ocean and somewhat lower in the eastern equatorial Pacific, but by and large the differences between the two pointwise correlation maps are small.
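As a worked check of the 74.4% figure, the variance-partitioning property can be applied twice, once to the data and once to the residual to which mode 2 is fitted. The fractions 63.3% (mode 1) and 11.1% (mode 2) are those quoted earlier in this chapter; the shorthand $\mathrm{var}(\cdot)$ for total variance summed over all spatial points is introduced here only for this sketch.

```latex
\begin{align*}
\mathrm{var}(\mathbf{X}) &= \mathrm{var}(\hat{\mathbf{X}}^{(1)})
    + \mathrm{var}(\mathbf{X} - \hat{\mathbf{X}}^{(1)}), \\
\mathrm{var}(\mathbf{X} - \hat{\mathbf{X}}^{(1)}) &= \mathrm{var}(\hat{\mathbf{X}}^{(2)})
    + \mathrm{var}(\mathbf{X} - \hat{\mathbf{X}}^{(1)} - \hat{\mathbf{X}}^{(2)}), \\
1 - \frac{\mathrm{var}(\mathbf{X} - \hat{\mathbf{X}}^{(1)} - \hat{\mathbf{X}}^{(2)})}{\mathrm{var}(\mathbf{X})}
    &= \frac{\mathrm{var}(\hat{\mathbf{X}}^{(1)}) + \mathrm{var}(\hat{\mathbf{X}}^{(2)})}{\mathrm{var}(\mathbf{X})}
    = 63.3\% + 11.1\% = 74.4\%.
\end{align*}
```

Since the partitioning property was found empirically rather than proved, the first two lines should be read as holding to within the accuracy of the fitted networks.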
Note that while in principle one would expect a 2D nonmodal NLPCA approximation to be superior to a 2D modal NLPCA approximation, because it should have access to a broader class of functions, in the case of SSTA the former explains 72.2% of the variance while the latter explains 74.4%. In fact, the model corresponding to the modal NLPCA had 13% more free parameters than that corresponding to the nonmodal NLPCA. It seems that this difference allowed the modal model more flexibility than the nonmodal, leading to the slight improvement in the fraction of variance explained.

Thus, in both 1D and 2D, and for both modal and nonmodal approaches, NLPCA produces approximations of greater fidelity to the tropical Pacific SSTA than does PCA. In particular, both the 1D PCA and the 1D NLPCA approximations describe "average" ENSO variability, but the 1D modal NLPCA approximation was able to characterise the spatial asymmetry between average El Nino and La Nina events in a fashion that 1D PCA cannot.

4.4 Tropical Indo-Pacific Sea Level Pressure

As was done with the SSTA data, the SLPA data were preprocessed by projecting them onto the space spanned by their 10 leading EOF modes, which together explain 60% of the total variance in the data.
Figure 4.16 displays maps of the three leading SLPA EOF modes, which explain 24.2%, 10.7%, and 6.0% of the total variance, respectively.

Figure 4.17 displays the 1D NLPCA approximation of the SLPA data projected on the subspaces spanned by SLPA EOFs $(\mathbf{e}_1,\mathbf{e}_2)$, $(\mathbf{e}_2,\mathbf{e}_3)$, $(\mathbf{e}_1,\mathbf{e}_3)$, and $(\mathbf{e}_1,\mathbf{e}_2,\mathbf{e}_3)$. This curve was obtained using a network with $L = 2$ neurons in the encoding and decoding layers; the NMSD between the 8 candidate models from an ensemble of 50 ranged between 0.1% and 0.3%. The 1D NLPCA approximation explains 27.0% of the total variance in the SLPA data, a slight improvement over the fraction of variance explained by the 1D PCA approximation. Figures 4.18(a) and (b) display respectively a plot of $\alpha_1(t_n)$, the time series associated with the 1D NLPCA approximation, and the Southern Oscillation Index (SOI), calculated by subtracting the SLPA at Darwin, Australia (131E, 12S) from that at Tahiti (149W, 17S) (see Figure 4.16(a)), and then applying a 5-month running average smoother. The two time series bear a strong resemblance to each other on interannual and longer timescales; their correlation coefficient is 0.72. The 1D modal NLPCA approximation thus seems to describe ENSO variability in the SLPA field.
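The SOI construction described above (Tahiti minus Darwin SLPA, then a 5-month running mean) can be sketched as follows. The treatment of the end points, where a full 5-month window is unavailable, is not specified in the text; here they are simply set to NaN, which is an assumption, as are the function names and the synthetic input series. The final scaling to unit variance follows the caption of Figure 4.18.

```python
import numpy as np

def running_mean(x, window=5):
    """Centred running mean for odd `window`; end points without a full
    window are returned as NaN."""
    kernel = np.ones(window) / window
    half = window // 2
    out = np.full(x.shape, np.nan)
    out[half:len(x) - half] = np.convolve(x, kernel, mode="valid")
    return out

def soi(slpa_tahiti, slpa_darwin, window=5):
    """Tahiti minus Darwin SLPA, smoothed, then standardised to unit variance."""
    raw = slpa_tahiti - slpa_darwin
    smooth = running_mean(raw, window)
    return (smooth - np.nanmean(smooth)) / np.nanstd(smooth)

# Synthetic monthly anomaly series at the two stations (illustrative only).
months = np.arange(240)
tahiti = np.sin(2.0 * np.pi * months / 48.0)
darwin = -0.8 * np.sin(2.0 * np.pi * months / 48.0)
index = soi(tahiti, darwin)
```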
Figure 4.16: Spatial patterns of SLPA (a) EOF mode 1, (b) EOF mode 2, (c) EOF mode 3. The black dots in (a) designate the positions of Tahiti and Darwin, Australia.

Figure 4.17: As for Figure 4.3, but for SLPA NLPCA mode 1.

This association is reinforced by inspection of the sequence of maps $\hat{\mathbf{X}}^{(1)}$ for $\alpha_1 = -3, -2, -1, -0.5, 0, 0.5, 1, 2$ (Figure 4.19). This sequence of spatial patterns describes the east-west SLPA dipole associated with average Southern Oscillation variability. Figure 4.20 displays composites of SLPA averaged over those DJF (December, January, February) periods in which the SOI was more than one standard deviation below the long-term average (Figure 4.20(a)) or more than one standard deviation above the long-term average (Figure 4.20(b)). This 3-month period was selected because, of all 3-month seasons in the record, it displayed the greatest variance. Figures 4.20(a) and 4.20(b) represent "average" El Nino and La Nina patterns, respectively. Clearly, the maps in Figure 4.19 for $\alpha_1 < 0$ correspond to the El Nino composite and those for $\alpha_1 > 0$ correspond to the La Nina composite. Comparison of Figures 4.19 and 4.20 indicates that the 1D NLPCA approximation characterises
the asymmetry in SLPA pattern between average El Nino and La Nina events, particularly in the eastern half of the domain.

Figure 4.18: (a) Plot of $\alpha_1(t_n) = \mathbf{s}_f(\mathbf{X}(t_n))$, the standardised time series associated with SLPA NLPCA mode 1. (b) Plot of 5-month running mean of SOI, standardised to unit variance.

Figure 4.19: Plot of a sequence of spatial maps characterising SLPA NLPCA mode 1 for (a) $\alpha_1 = -3$ (b) $\alpha_1 = -2$ (c) $\alpha_1 = -1$ (d) $\alpha_1 = -0.5$ (e) $\alpha_1 = 0$ (f) $\alpha_1 = 0.5$ (g) $\alpha_1 = 1$ (h) $\alpha_1 = 2$. Contour interval: 0.5 hPa.

Figure 4.20: Composites of SLPA during (a) El Nino and (b) La Nina. Contour interval: 0.5 hPa.

Figure 4.21 displays the spatial structure of the pointwise correlation coefficient between SLPA and the 1D NLPCA approximation (Figure 4.21(a)), and the 1D PCA approximation (Figure 4.21(b)). As was the case in the previous section, the 1D SLPA NLPCA approximation produces higher correlations than the 1D PCA approximation, particularly around nodal lines of the latter.
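The pointwise correlation maps used throughout this chapter can be sketched as follows: at each spatial point, the time series of the data is correlated with the time series of the approximation at the same point. The function name, the array layout (rows are times, columns are spatial points), and the synthetic inputs are assumptions made for illustration. At a point where the approximation has no variance in time, such as a point lying exactly on a nodal line of a 1D PCA approximation, the correlation is undefined and this sketch would return NaN.

```python
import numpy as np

def pointwise_correlation(X, X_hat):
    """Correlation in time between X[:, m] and X_hat[:, m] for every column m.

    X and X_hat have shape (N, M); the result has shape (M,).
    """
    Xa = X - X.mean(axis=0)
    Ya = X_hat - X_hat.mean(axis=0)
    num = (Xa * Ya).sum(axis=0)
    den = np.sqrt((Xa ** 2).sum(axis=0) * (Ya ** 2).sum(axis=0))
    return num / den

# Synthetic check: a perfect, a sign-reversed, and an unrelated approximation.
rng = np.random.default_rng(2)
X = rng.standard_normal((200, 3))
X_hat = X.copy()
X_hat[:, 0] = 2.0 * X[:, 0] + 1.0            # rescaled copy: correlation +1
X_hat[:, 1] = -X[:, 1]                       # sign-reversed: correlation -1
X_hat[:, 2] = rng.standard_normal(200)       # unrelated series

r = pointwise_correlation(X, X_hat)
```

A difference map such as Figure 4.7(c) would then be the elementwise difference of two such arrays, one computed for each approximation.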
In calculating the second mode of the modal NLPCA decomposition of the SLPA data, it was determined that only for $L = 1$ neuron in the encoding and decoding layers could robust results be obtained. Neural-network based NLPCA can only find nonlinear structure if there are two or more neurons in the encoding and decoding layers. Thus, the optimal 1D characterisation of the residual data, obtained by subtracting from the original SLPA data the 1D NLPCA approximation, is a straight line. The 1D NLPCA approximation of the residual data coincides with the 1D PCA approximation of these data, and hence no lower-dimensional nonlinear structures can be found in these residuals. Our calculation (not shown) shows that NLPCA mode 2 explains 15.9% of the variance of the original data. The spatial pattern and associated time series are shown in Figure 4.22. In fact, SLPA NLPCA mode 2 bears a strong resemblance to SLPA PCA mode 2; the correlation coefficient between the two time series is 0.96 and the pattern correlation between the associated spatial patterns is 0.93. The similarity between the two is not surprising, as SLPA NLPCA mode 1 does not differ substantially from SLPA PCA mode 1.
A 2D nonmodal NLPCA of the SLPA data (not shown) did not yield particularly interesting results.

Thus, apart from a weakly nonlinear 1D NLPCA approximation corresponding to average ENSO variability and characterising a slight asymmetry between average El Nino and La Nina events, the robust low-dimensional structure of the SLPA data is linear.

Figure 4.21: Spatial pattern of pointwise correlation coefficient between SLPA and (a) 1D NLPCA approximation and (b) 1D PCA approximation.

Figure 4.22: SLPA NLPCA mode 2 (a) spatial pattern (not normalised, units are hPa) and (b) time series (normalised to unit variance).

4.5 Conclusions

Application of NLPCA to two data sets of climatic significance, namely tropical Pacific sea surface temperatures and tropical Indo-Pacific sea level pressure, has demonstrated that NLPCA is able to robustly produce one- and two-dimensional approximations that are superior to the corresponding approximations produced by PCA. The improvement is particularly striking in the case of SST: variability in this field is dominated by ENSO, the average manifestation of which is asymmetric between average El Nino and La Nina phases. As PCA is constrained to produce 1D approximations which are standing oscillations with fixed spatial pattern, it is unable to characterise this asymmetry. On the other hand, the 1D NLPCA approximation, by mixing PCA modes 1 and 2, is able to capture this difference in SST structure between average El Nino and La Nina episodes.

Figures 4.12 and 4.15 display, respectively, 2D nonmodal and modal NLPCA approximations to the SST data.
These are seen to be rather similar in structure. Inspection of the time series β1(t_n) and β2(t_n) parameterising the 2D nonmodal NLPCA approximation highlights the importance of the fact, pointed out by Malthouse (1998), that NLPCA produces time series that are unique only up to an arbitrary homeomorphism. The time series β1(t_n) and β2(t_n) were strongly correlated, and therefore contained substantial overlap in the information they convey. A second feature extraction problem, solved using traditional PCA, was then used to untangle β1(t_n) and β2(t_n). The resulting time series bore strong similarities to the time series α1(t_n) and α2(t_n) corresponding to the first two modes of the modal NLPC analysis. This result further strengthens the interpretation that the modal and nonmodal analyses are producing essentially the same approximation. The fact that the use of nonmodal NLPCA required the solution of a subsidiary feature extraction problem to interpret the time series produced illustrates a deficiency of nonmodal NLPCA, as compared to modal.

The first modal NLPCA approximation to the SLP data describes average ENSO variability in this field, producing a somewhat better approximation to the original data than that produced by PCA and characterising the asymmetry in SLPA between average El Niño and La Niña episodes. The differences between the NLPCA and PCA approximations for SLP are less striking than was the case with SST, indicating that the low-dimensional structure of SLP is more linear than that of SST. Indeed, no nonlinear mode beyond the first could robustly be found in the data, indicating either that SLP variability in the tropical Indo-Pacific region is very nearly linear, or that any nonlinear structure is too subtle to detect within existing records.
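The subsidiary feature extraction step described in these conclusions, in which traditional PCA is used to untangle two strongly correlated time series, can be sketched as follows. This is a minimal illustration on synthetic data and not the code used in the thesis; the function name `decorrelate_pair` and the synthetic stand-ins for β1(t_n) and β2(t_n) are assumptions made for the example.

```python
import numpy as np


def decorrelate_pair(b1, b2):
    """Rotate two time series onto their principal axes.

    Returns an array of shape (n_times, 2) whose columns are mutually
    uncorrelated and ordered by decreasing variance.
    """
    B = np.column_stack([b1, b2]).astype(float)
    B -= B.mean(axis=0)                      # remove the time mean
    cov = np.cov(B, rowvar=False)            # 2 x 2 covariance matrix
    evals, evecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]          # reorder to descending variance
    return B @ evecs[:, order]


# Synthetic, strongly correlated stand-ins for beta_1(t_n) and beta_2(t_n).
rng = np.random.default_rng(0)
common = rng.standard_normal(500)
beta1 = common + 0.2 * rng.standard_normal(500)
beta2 = 0.8 * common + 0.2 * rng.standard_normal(500)

rotated = decorrelate_pair(beta1, beta2)
r_before = np.corrcoef(beta1, beta2)[0, 1]
r_after = np.corrcoef(rotated[:, 0], rotated[:, 1])[0, 1]
```

The rotation removes the linear correlation between the two series but, being linear, it cannot remove any nonlinear dependence between them.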
Chapter 5

Nonlinear Principal Component Analysis of Northern Hemisphere Atmospheric Circulation Data

5.1 Introduction

Low-frequency, large-scale coherent variability in the Northern Hemisphere midlatitude circulation has been a subject of considerable interest in climate research over the last few decades. It is typically characterised in terms of spatially-fixed, temporally fluctuating anomaly patterns modifying the climatological mean circulation. Some of these patterns are zonally localised, such as the Pacific-North America (PNA) pattern (Wallace and Gutzler, 1981), a chain of alternating positive and negative geopotential height anomalies in the mid-troposphere extending from the subtropical North Pacific Ocean over North America, following a great circle route; and the North Atlantic Oscillation (NAO; van Loon and Rogers, 1978; Hurrell, 1995), a dipolar pattern with geopotential height anomalies of opposite sign over Iceland and the Azores. Other patterns are more zonal in structure, such as the Arctic Oscillation (AO; Thompson and Wallace, 1998) and the Antarctic Oscillation (AAO; Gong and Wang, 1999). These are approximately zonally-symmetric patterns of variability with anomalies of opposite sign over the polar region and the midlatitudes, for the Northern and Southern Hemisphere, respectively. The connections between these different patterns of variability (e.g. Deser, 1999), and their dynamical origin - in particular their maintenance by lower boundary forcing or internal dynamics (see, e.g., Feldstein and Lee, 1996; Corti et al., 1997; Trenberth et al., 1998) - are areas of active research.

These characteristic patterns of atmospheric variability have historically been diagnosed using correlation analysis (Wallace and Gutzler, 1981; Hsu and Lin, 1992), PCA (Kushnir and Wallace, 1989; Thompson and Wallace, 1998), combined PCA (Baldwin and Dunkerton, 1999; CPCA, also known as extended EOF analysis, is defined by Bretherton et al., 1992), canonical correlation analysis (Perlwitz and Graf, 1995), and rotated PCA (Barnston and Livezey, 1987). All of these methods are linear and produce spatial and temporal patterns that describe standing oscillations.

Recently, the Arctic Oscillation (AO) has been of particular interest. The AO was defined by Thompson and Wallace (1998) as the leading PCA mode of monthly averaged November through April Northern Hemisphere SLPA north of 20°N. The AO spatial pattern, derived from the Trenberth and Paolino SLP data set (1980), is displayed in Figure 5.1. The canonical AO structure is roughly zonally-symmetric, with anomalies of opposite sign in the polar region and the midlatitudes. Deviations from zonal symmetry are characterised by a wavenumber two pattern reflecting land-ocean contrasts.
In particular, the dipole pattern in SLP over the North Atlantic strongly resembles the surface signature of the NAO. The AO diagnosed using PCA on the Trenberth and Paolino data set strongly resembles that obtained by Thompson and Wallace (1998), who found the AO and NAO time series to be highly correlated (r = 0.69). This point was examined further by Deser (1999), who considered the extent to which coherent variability in Northern Hemisphere circulation is really hemispheric in extent, and suggested that in fact the AO should be termed the "Arctic-Atlantic Oscillation".

Figure 5.1: Spatial structure of the leading EOF pattern from observed SLP. Contour intervals are 1 hPa (..., -1.5, -0.5, 0.5, 1.5, ...).

The AO is strongly equivalent barotropic in structure, with coherent variability throughout the troposphere and into the lower stratosphere (Perlwitz and Graf, 1995; Thompson and Wallace, 1998, 1999a; Baldwin and Dunkerton, 1999). Interest in the AO has been particularly strong in recent years because of its potential as a sensitive barometer of climate change. Thompson et al. (1999b) noted the similarity in structure between the spatial pattern of the AO and of recent trends in SLP in the Northern Hemisphere.
They suggested that the climate change signal is manifesting itself as a secular shift toward the positive phase of the AO (strong polar vortex) superimposed upon monthly timescale AO fluctuations. Indeed, in modelling studies, Shindell et al. (1999, using the Goddard Institute for Space Studies (GISS) atmospheric GCM) and Fyfe et al. (1999, using the Canadian Centre for Climate Modelling and Analysis (CCCma) coupled GCM) found a pronounced trend in the AO signal as greenhouse gases were increased. However, the physical mechanism for this behaviour remains controversial, as Shindell et al. found that only with a full stratosphere would their model produce an AO trend under increasing greenhouse forcing, while Fyfe et al. were able to produce this result using an atmospheric model with a poorly-resolved stratosphere.

In this chapter, the results of an NLPC analysis of fields characterising the Northern Hemisphere circulation will be considered, using data from the CCCma coupled GCM. The results will demonstrate that NLPCA is able to detect and characterise regime behaviour in multivariate data sets. Consideration of these regimes will throw some light on the controversies concerning the relationship between the AO and the NAO discussed above.
5.2 Data and Model Building

The data analysed in this chapter come from two integrations of the CCCma coupled climate model (CGCM1): a 1001-year control integration with atmospheric carbon dioxide (CO2) at pre-industrial levels and a 500-year stabilisation integration with atmospheric CO2 concentrations at four times pre-industrial level, corresponding to predicted CO2 levels in 2100. Use of model data rather than observations allows the investigation of potential changes in the atmospheric variability associated with increased atmospheric CO2 concentrations. The fields considered are monthly-averaged Northern Hemisphere sea level pressure and 500mb geopotential height (Z500) from 20°N to 90°N, over the extended winter period from November through April. Both fields are on a Gaussian grid with 3.75° resolution in the zonal and meridional directions.

The CCCma CGCM1 model and its climate are described in Flato et al. (1999). The atmospheric component is a T32 spectral primitive equation model with 9 unequally spaced levels (McFarlane et al., 1992). The ocean component is a global primitive equation grid-point model with 1.875° resolution and 29 vertical levels. It is based on the Geophysical Fluid Dynamics Laboratory (GFDL) Modular Ocean Model (MOM) 1.1 (Pacanowski et al., 1993). The leading EOF of this coupled model displays spatially and temporally realistic AO behaviour (Fyfe et al., 1999).

For both the SLP and Z500 fields, monthly anomalies (SLPA and Z500A, respectively) were computed by subtracting the climatological annual cycle.
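The anomaly calculation can be sketched as follows, taking the climatological annual cycle to be the long-term mean for each calendar month (an assumption; the text does not specify how the cycle was estimated). The function name `monthly_anomalies` and the synthetic field are hypothetical and serve only to illustrate the operation.

```python
import numpy as np


def monthly_anomalies(field, months):
    """Subtract the climatological mean of each calendar month.

    field  : array (n_times, n_points) of monthly-mean values
    months : array (n_times,), calendar month (1-12) of each sample
    """
    field = np.asarray(field, dtype=float)
    months = np.asarray(months)
    anomalies = np.empty_like(field)
    for m in np.unique(months):
        sel = months == m
        # Climatology for month m is the mean over all years of that month.
        anomalies[sel] = field[sel] - field[sel].mean(axis=0)
    return anomalies


# Synthetic example: 30 extended winters (November-April), 4 grid points.
rng = np.random.default_rng(1)
winter_months = np.array([11, 12, 1, 2, 3, 4])
months = np.tile(winter_months, 30)
cycle = 10.0 * np.cos(2 * np.pi * months / 12.0)[:, None]  # annual cycle
field = 1000.0 + cycle + rng.standard_normal((months.size, 4))
anom = monthly_anomalies(field, months)
```

By construction, the anomalies for each calendar month average to zero over the record.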
Fields were weighted by the square root of the cosine of the latitude before calculation of the EOFs to account for the poleward concentration of gridpoints on the Gaussian grid.

NLPC analysis of SLPA and Z500A data was carried out using the early stopping algorithm described in Chapter 2. In all analyses, 20% of the data was held aside in a validation set, for which network performance was monitored as training proceeded. Training was stopped when this performance began to degrade, or after 20000 iterations, whichever came first. It was found that increasing the maximum number of iterations beyond 20000 did not affect the results of the analysis.

5.3 Analysis of GCM Sea Level Pressure

As was done with the tropical SLPA and SSTA data in the previous chapter, the Northern Hemisphere SLPA and Z500A were projected onto the spaces of their first 10 EOF modes, in which 85% and 76% of the variance are respectively contained. The first four EOF patterns of SLPA and Z500A are displayed in Figures 5.2 and 5.3, respectively. The map displayed in Figure 5.2(a) is the spatial pattern of the canonical Arctic Oscillation in the CCCma model (Fyfe et al., 1999).

Figure 5.4 displays a scatterplot of the SLPA data projected in the plane spanned by the leading two EOFs, overlaid with a histogram-based estimate of the marginal probability density function (PDF) of these data in this space. The PDF displays a marked deviation from Gaussian structure in the form of a pronounced lobe in the lower-right quadrant.
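The latitude weighting and the truncation to the leading 10 EOF modes described in this section can be sketched as follows. This is an illustrative outline using a singular value decomposition on synthetic data, not the thesis code; the function name `weighted_eofs` and the toy grid are assumptions.

```python
import numpy as np


def weighted_eofs(anomalies, lat_deg, n_modes=10):
    """PCA of latitude-weighted anomalies via the SVD.

    anomalies : array (n_times, n_points), anomaly maps
    lat_deg   : array (n_points,), latitude of each grid point in degrees
    Returns (pcs, eofs, frac), where frac[k] is the fraction of the total
    weighted variance explained by mode k.
    """
    weights = np.sqrt(np.cos(np.deg2rad(lat_deg)))
    X = anomalies * weights                  # weight each grid point
    X = X - X.mean(axis=0)                   # ensure zero time mean
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    frac = s**2 / np.sum(s**2)
    pcs = U[:, :n_modes] * s[:n_modes]       # PC time series (columns)
    eofs = Vt[:n_modes]                      # spatial patterns (rows)
    return pcs, eofs, frac[:n_modes]


# Synthetic example: 600 months on a 6 x 8 latitude-longitude grid.
rng = np.random.default_rng(2)
lats = np.repeat(np.linspace(20.0, 87.5, 6), 8)
data = rng.standard_normal((600, lats.size))
pcs, eofs, frac = weighted_eofs(data, lats, n_modes=10)
retained = frac.sum()   # fraction of variance kept by the truncation
```

For the model fields considered here, the quantity corresponding to `retained` would be the 85% (SLPA) and 76% (Z500A) quoted above; the synthetic white-noise example retains far less.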
Figure 5.5 displays the estimate of the PDF along with the 1D NLPCA approximation X(t_n) to these data. The NLPCA approximation was found using a network with L = 2 neurons in the encoding and decoding layers. This approximation was obtained from an ensemble of 5 networks, of which 3 were candidate models differing from each other with an NMSD of at most 1%. The approximation, which explains 26.5% of the total variance (in contrast to 24% explained by the leading PCA approximation), is thus robust. Note that X(t_n) has a piecewise-linear structure, composed of three branches. The associated standardised time series

$$\alpha(t_n) = \frac{s_f(X(t_n)) - \langle s_f(X) \rangle}{\langle ( s_f(X) - \langle s_f(X) \rangle )^2 \rangle^{1/2}} \qquad (5.1)$$

along with a histogram estimate of the PDF of α(t_n), are displayed in Figure 5.6. The distribution of α(t_n) is strongly bimodal, demonstrating the existence of two well-separated regimes in which the NLPCA approximation resides. Each of the three branches of X(t_n) corresponds to a feature in the PDF of α(t_n): the upper branch (hereafter referred to as Branch 1) is associated with the larger of the two peaks, the lower branch (hereafter referred to as Branch 2) is associated with the smaller peak, and the intermediate branch is associated with the minimum of the PDF of α between the two peaks, and is rarely visited. The approximation X(t_n) is in Branch 1 for 84% of the months and in Branch 2 for 13%. An inspection of α(t_n) indicates that variability on X(t_n) consists of oscillatory motion along Branch 1 with occasional episodic excursions to Branch 2, on which the approximation rarely resides longer than a month or two.

Figure 5.2: Spatial structure of the leading four EOF patterns from CCCma SLPA: (a) EOF 1, (b) EOF 2, (c) EOF 3, (d) EOF 4. These patterns explain 23.7%, 10.6%, 8.5%, and 6.5% of the variance in SLP, respectively. Negative contours are dashed. Contour intervals are 1 hPa (..., -1.5, -0.5, 0.5, ...).

Figure 5.3: Spatial structure of the leading four EOF patterns from CCCma Z500A: (a) EOF 1, (b) EOF 2, (c) EOF 3, (d) EOF 4. These patterns explain 19.6%, 12.5%, 9.3%, and 8.2% of the variance in Z500, respectively. Negative contours are dashed. Contour intervals are 10 m (..., -15, -5, 5, ...).

Figure 5.4: Scatterplot of the leading two SLPA PC time series, overlaid with a histogram estimate of the corresponding marginal probability density function. Axes are PC1 (hPa) and PC2 (hPa). Contour intervals are 5 x 10^-4, 1.5 x 10^-3, 3 x 10^-3, 6 x 10^-3, 1 x 10^-2, 2 x 10^-2, 3 x 10^-2. The histogram bin size is 25 hPa in both directions.

Figure 5.5: 1D NLPCA approximation X of SLPA, projected in the space of the first two SLPA EOFs (open circles), overlaying histogram estimate of SLPA PDF as in Figure 5.4.

Figure 5.6: Plot of the 1D SLPA NLPCA time series α(t_n) (left) and the associated histogram estimate of the PDF (right).

Figure 5.7 is the same as Figure 5.5 but with the estimated PDFs of the populations corresponding to the upper and lower branches plotted separately. The PDF of the population projecting onto Branch 1 is seen to be nearly Gaussian, with a major axis nearly parallel to Branch 1. Branch 2 also runs through the middle of its associated PDF. The overlap of the two PDFs is an artifact of the coarse binning used in their estimation.

As was discussed in the previous chapter, NLPCA differs from PCA in that the 1D approximation produced by NLPCA does not correspond to a unique spatial pattern, but in fact to a sequence of maps. In Chapter 4, these maps were presented at representative points along the curve X(t_n). In this chapter, a somewhat different methodology for producing maps corresponding to the NLPCA approximation is adopted. Because the features of interest involve atmospheric circulation data at different altitudes, while the approximation is calculated using data at a single altitude, representative points along the approximation X(t_n) are instead selected and the original data are composited over all times at which the approximation resides in the neighbourhoods of these points. A comparison of the two methods for the field from which the NLPCA approximation was derived demonstrates that the resulting maps are essentially identical, as expected since the NLPCA approximation is constrained to run through the "middle" of the data.

Figure 5.7: As in Figure 5.5, but with the PDFs of the populations corresponding to Branch 1 (solid contours) and Branch 2 (dashed contours) plotted separately, and with a bin size of 20 hPa.

Figure 5.8: Composites of SLPA over characteristic ranges of α. These ranges are indicated in parentheses below the maps, along with the number N of maps used in the composite. Contour interval is 2 hPa (..., -3, -1, 1, 3, ...). Panels on the continuation page: (g) (-0.9,-0.5), N=76; (h) (-0.5,-0.1), N=381; (i) (-0.1,0.3), N=1631; (j) (0.3,0.7), N=2127; (k) (0.7,1.1), N=822; (l) (1.1,1.5), N=107.

Figure 5.8 displays composites of SLPA over neighbourhoods in α of width 0.4, centred at intervals of 0.4 from -3.1 to 1.3. Variability along Branch 1 is exemplified by composites (h) and (k) in Figure 5.8. These two maps display patterns of SLP anomalies that differ in sign and magnitude, but not in spatial pattern, consistent with the interpretation of Branch 1 variability as describing a standing oscillation. The anomalies associated with this oscillation are of opposite sign over the polar cap and the midlatitudes, with a polar local extremum over northern Eurasia and small local midlatitude extrema over the west Mediterranean and the north Pacific. This pattern resembles that of the canonical AO as diagnosed by EOF analysis, but with eastward shifted polar and Mediterranean centres of action. Indeed, the correlation between the AO time series (i.e., the leading SLPA PC) and α(t_n), over those times when the approximation is on Branch 1, is -0.96.

Characteristic SLPA anomalies associated with Branch 2 are illustrated in Figure 5.8(c). This map shares certain features with the anomaly patterns displayed in Figures 5.8(h) and 5.8(k), in particular anomalies of opposite sign over the polar cap and the midlatitudes. However, the polar extremum in Figure 5.8(c) is shifted to a location over Iceland, the Atlantic midlatitude extremum is centred over the Azores, and anomalies over the midlatitude Pacific ocean are weak. This map resembles strongly the SLPA signature of the negative phase of the North Atlantic Oscillation. Note also that the anomaly pattern of opposite sign to that in Figure 5.8(c) does not appear on Branch 2, in contrast to what was observed along Branch 1. Variability along Branch 2 does not describe an oscillation, but episodic excursions to a single-phased, strongly anomalous circulation.

Figures 5.9 and 5.10 display respectively composites of the CCCma modelled Z500A and Z500 fields over the same intervals in α(t_n) used to calculate the composites in Figure 5.8. The patterns displayed in Figure 5.9 demonstrate that the anomalous circulations are strongly equivalent barotropic, with a slight westward phase tilt with height. The midlatitude Z500 anomalies are slightly stronger relative to the polar anomalies than is the case with SLPA.

Figure 5.9: As with Figure 5.8, but for composites of Z500 anomalies. The contour interval is 20 m (..., -30, -10, 10, 30, ...). Panels on the first page: (a) (-3.3,-2.9), N=48; (b) (-2.9,-2.5), N=234; (c) (-2.5,-2.1), N=321; (d) (-2.1,-1.7), N=135; (e) (-1.7,-1.3), N=62; (f) (-1.3,-0.9), N=58.

Figure 5.10: As with Figure 5.8, but for composites of Z500. The 5300 and 5500 contours are in bold. Contour interval is 50 m.

Examination of the Z500 field is illuminating. Figures 5.10(h)-(k) indicate that the mid-tropospheric manifestation of variability along Branch 1 consists of an alternating amplification and attenuation of the climatological ridge over Europe. Figure 5.10(c) demonstrates that Branch 2 describes, on average, an amplified climatological ridge over western North America and flow split around a local anticyclone over southern Greenland. Again, the mid-tropospheric composites associated with X(t_n) describe two markedly different modes of atmospheric circulation: a standing oscillation in the strength of the climatological ridge over North Europe (Branch 1), and episodic split flow events over Greenland (Branch 2).

Figure 5.11 displays maps of the geographical distribution of variance and skewness of SLP, for both the observations and the CCCma model output. The skewness, s, of a random variable x is the ratio

$$s = \frac{\langle (x - \langle x \rangle)^3 \rangle}{\langle (x - \langle x \rangle)^2 \rangle^{3/2}} \qquad (5.2)$$

and is a measure of the asymmetry of the distribution of x about its mean.
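As a concrete illustration of the skewness defined in equation (5.2), the following sketch evaluates it for a symmetric sample and for a sample drawn from a mixture of a frequently occupied regime and a rare, strongly anomalous one. The mixture parameters (15% occupation, means of -3.0 and 0.5) are arbitrary assumptions chosen for the example and are not taken from the model data.

```python
import numpy as np


def skewness(x, axis=0):
    """Skewness <(x - <x>)^3> / <(x - <x>)^2>^(3/2) along `axis`."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean(axis=axis, keepdims=True)
    m2 = np.mean(dev**2, axis=axis)   # second central moment (variance)
    m3 = np.mean(dev**3, axis=axis)   # third central moment
    return m3 / m2**1.5


rng = np.random.default_rng(3)
n = 20000
symmetric = rng.standard_normal(n)

# Two-regime mixture: a common regime and a rare, strongly negative one.
rare = rng.random(n) < 0.15
two_regime = np.where(rare,
                      rng.normal(-3.0, 1.0, n),
                      rng.normal(0.5, 1.0, n))

s_sym = skewness(symmetric)      # close to zero for a Gaussian sample
s_mix = skewness(two_regime)     # negative: rare large negative anomalies
```

Applied to an array of shape (n_times, n_points), the same function returns one skewness value per grid point, which is the form needed for a map such as Figure 5.11.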
For both the modelled and observed SLP, the variance is greater in the polar latitudes than in the midlatitudes, and displays a broad extremum from southern Greenland to northeastern Eurasia, with a second extremum over the North Pacific. The Pacific centre is substantially stronger in observations than in the model results. The broad local extremum in the variance of GCM SLP corresponds precisely to the polar centre of action in SLP EOF 1 (Figure 5.2(a)), and its west and east flanks correspond to the polar centres of action of Branches 1 and 2 of the NLPCA approximation X(t_n). The leading EOF pattern of SLP is in a way a compromise between the circulation anomalies described by Branches 1 and 2. The broad extremum in variance results from the separate occurrence of two circulation regimes, and the PCA approximation, being linear, must attempt to characterise both modes of variability simultaneously.

Figure 5.11: Maps of spatial distribution of variance of SLP from (a) CCCma GCM and (b) observations (contour interval 10 (hPa)^2), and skewness of SLP from (c) CCCma GCM and (d) observations (contour interval 0.2).

There is broad agreement between observed and modelled skewness; it is seen that both are generally characterised by negative values in the midlatitudes and positive values in polar latitudes, as was found by Nakamura and Wallace (1991) and by Holzer (1996). As well, both observed and modelled SLP skewness have a local maximum over Greenland and a local minimum over the midlatitude North Atlantic Ocean. In observations, the Greenland maximum is weaker than that of the modelled SLP, and the North Atlantic minimum is shifted northwestward of the corresponding minimum in the modelled SLP.
As well, the modelled skewness over northern Eurasia is weaker than in observations, and the North Pacific extremum evident in observations is shifted to over Alaska. Note however that some of these apparent differences may in fact be an artifact of poor observational data coverage north of 70°N, as is discussed further in section 5.6.

In their observational study of the geographical distribution of skewness in Northern Hemisphere tropospheric circulation, Nakamura and Wallace (1991) suggested that, "Large skewness at a particular location might be indicative of a juxtaposition of two flow regimes: one which prevails most of the time and another which occurs relatively infrequently and is thus characterised by large anomalies." This is in fact the situation seen to prevail in the NLPCA approximation X(t_n) of SLPA. The dipole in skewness over the North Atlantic arises because of the regular occupation of Branch 1 and the episodic occupation of Branch 2. The negative extremum in skewness over the Atlantic Ocean west of Africa, which corresponds to the southern, negative centre of action of Branch 2 events, is displaced somewhat southward because skewness tends to be enhanced where the variance of the field is low, as is clear from equation (5.2).

5.4 Analysis of GCM 500mb Heights

In Figure 5.12 are displayed histogram estimates of the 2D marginal distributions of the Z500A field PCs (PC1,PC2), (PC1,PC3), (PC2,PC3), and (PC1,PC4). The marginal distribution of PC1 and PC2 is closer to Gaussian than is the corresponding SLPA distribution, although PC1 is clearly skewed. In the Z500A field, marked deviations from Gaussianity are most evident in the marginal distributions of PC1 with PC3 and of PC1 with PC4. Figures 5.12(b) and (d) demonstrate the existence of substantial non-Gaussian structure in the Z500A data.
None of the other 2D marginal distributions appears to differ strongly from Gaussian.

The 1D NLPCA approximation to the Z500A data is displayed in Figure 5.13, projected in the spaces spanned by Z500A EOFs (e1,e2), (e1,e3), (e2,e3), and (e1,e2,e3). This approximation explains 23.9% of the variance in the data, in contrast to 19.6% explained by the first PCA approximation. Note that unlike the corresponding SLPA approximation it projects strongly onto EOFs other than the leading two. The NLPCA approximation is constructed based on the joint distribution of the data in the 10-dimensional embedding space, and because it has non-negligible projections on EOFs beyond the first two, it is not particularly useful to compare the 2D marginal distributions presented in Figure 5.12 with the projections of the approximation in these planes. Thus, no equivalent of Figure 5.7 is presented. The 1D NLPCA Z500A approximation is similar to that of the SLPA in that it is composed of three branches. As well, the corresponding standardised time series α(t_n) and its histogram (displayed in Figure 5.14) display a strikingly bimodal character. As was the case with the 1D SLPA NLPCA approximation, one branch of X(t_n) corresponds to the larger peak of the distribution of α(t_n), a second to the smaller peak, and the third branch to the rarely-visited region in between. The system spends 78% of all months in the branch corresponding to the larger peak, 19% in the branch corresponding to the smaller peak, and 3% in the intermediate branch. An inspection of the time series α(t_n) indicates that variability in the approximation is characterised by oscillatory motion along the more populated branch, with episodic excursions to the less populated branch.

Figure 5.12: Histogram estimates of 2D marginal PDFs of Z500A PCs: (a) (PC1,PC2), (b) (PC1,PC3), (c) (PC2,PC3), and (d) (PC1,PC4). The contour intervals are as in Figure 5.4. Bin sizes are 2500 m.

Figure 5.13: Projection of the Z500A data (dots) and its 1D NLPCA approximation (open circles) onto the spaces spanned by EOFs (a) (e1,e2), (b) (e1,e3), (c) (e2,e3), and (d) (e1,e2,e3).

Figure 5.14: Plot of the 1D Z500A time series α(t_n) (left) and the associated histogram estimate of the PDF (right).

Figure 5.15 displays composites of Z500A over times associated with neighbourhoods in α of width 0.375, centred at intervals of 0.375 from -2.9 to 1.6. Inspection of these composites indicates that the branch of variability associated with the larger of the two peaks in the PDF of α(t_n) corresponds to essentially a standing oscillation (Figures 5.15(h)-(l), especially (h) and (k)). This oscillation is characterised by anomalies with a strong wavenumber four signal in midlatitudes and a wavenumber two signal in higher latitudes.
The circulation anomalies over the eastern North Pacific and North America display a PNA-like structure, and those over Eurasia resemble the Scandinavia pattern described in Bell and Halpert (1995). The structure of this oscillation in midlatitudes bears a striking similarity to that observed in the 500 mb field of the leading mode of the combined EOF analysis carried out by Baldwin and Dunkerton (1999), although the signal in the polar regions is rather different.

Variability along the branch associated with the smaller of the two peaks is illustrated in Figures 5.15 (a)-(d). Unlike the branch associated with the larger peak, anomalies along this branch are of a single phase, with highs over the polar region and lows over the midlatitudes. A local maximum in the anomaly field over the polar region is located over Southern Greenland, with a local negative extremum in midlatitudes over the Azores. This structure is strongly reminiscent of the negative phase of the NAO.

A comparison of the composites presented in Figure 5.15 with those given in Figure 5.9, and of the time series presented in Figures 5.14 and 5.6, demonstrates that the 1D NLPCA approximations of SLPA and Z500A describe essentially the same mode of variability. The circulation anomalies plotted in Figure 5.9(b)-(d) bear a striking resemblance to those plotted in Figure 5.15(b)-(d). Similarly, the polar and Eurasian sectors of the standing oscillation as diagnosed from the analysis of SLPA (Figure 5.9 (h) and (k)) are essentially the same as those diagnosed from the analysis of Z500A (Figure 5.15(h) and (k)). The more and less populated regimes of circulation in the Z500A field are referred to as the oscillatory and split-flow regimes, respectively. This characterisation will be borne out by the composites of Z500 to be presented later.

Figure 5.15: As in Figure 5.9, but for Z500A composited over characteristic ranges of $\alpha$ associated with the 1D Z500A NLPCA approximation. Contour interval is 20 m (...,-30,-10,10,30,...). Recovered panel titles: (h) (-0.275,0.1), N=565; (i) (0.1,0.475), N=1635; (j) (0.475,0.85), N=1842; (k) (0.85,1.225), N=600; (l) (1.225,1.6), N=34.

The two approximations do not correspond exactly. The Z500A approximation is in the less-populated regime 19% of the time, in contrast to 13% for the SLPA approximation. The corresponding numbers for the branch describing the standing oscillation are 78% and 84%. The split-flow regime is more frequently occupied in the Z500A approximation than in the SLPA. Figure 5.16 displays a scatterplot of the time series $\alpha(t_n)$ corresponding to the SLPA and Z500A approximations. The correlation coefficient between these two time series is 0.81. Typically the approximations are both simultaneously in either the oscillatory or split-flow regimes. For a small number of months, the SLPA approximation is in the split-flow regime while the Z500A approximation is in the oscillatory regime, but more commonly when the two approximations differ it is because the SLPA approximation is in the oscillatory regime when the Z500A approximation is in the split-flow regime. This is a reflection of the fact that the split-flow regime is more frequently occupied in the Z500A approximation than in the SLPA approximation.

This difference in occupation statistics has little effect on the composites characterising the split-flow branch. However, the standing oscillation diagnosed from the Z500A data is more hemispheric in extent than is that from the SLPA data.
Note in particular the presence in the Z500A approximation of a local extremum in the height anomalies over western Canada and Alaska of the same sign as the anomalies over the pole, and local extrema of the opposite sign over the central North Pacific and eastern North America which together comprise a wavetrain reminiscent of the PNA pattern. These extrema do not occur in the composites corresponding to the 1D SLPA NLPCA approximation.

Figure 5.16: Scatterplot of the time series $\alpha(t_n)$ corresponding to the 1D SLPA and Z500A NLPCA approximations.

Figure 5.17 presents the geographical distribution of variance and skewness of the CCCma modelled Z500A field. These maps show the same sort of structure as those for the modelled SLP: the variance displays a general increase with latitude, and the skewness is generally negative in midlatitudes and positive in the polar region. As well, the standard deviation and skewness of Z500A display local extrema in much the same locations as SLPA, but with a considerably stronger signal in the North Pacific. As was the case with SLPA, both the skewness and variance maps are consistent with the two-regime structure of the NLPCA approximation of Z500A.
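The link between a two-regime structure and the skewness map can be illustrated with a minimal sketch. It assumes that skewness is the usual normalised third central moment (equation (5.2) is not reproduced in this section, so this form is an assumption), and the function name `variance_and_skewness` and the toy data are hypothetical.

```python
import numpy as np

def variance_and_skewness(field):
    """Per-gridpoint variance and skewness of an array shaped (time, ...).

    Assumes skewness = third central moment / variance**1.5; grid points
    with zero variance would need separate handling.
    """
    anomalies = field - field.mean(axis=0)       # remove the time mean
    second = (anomalies ** 2).mean(axis=0)       # second central moment
    third = (anomalies ** 3).mean(axis=0)        # third central moment
    return second, third / second ** 1.5

# Toy series at a single grid point: frequent small negative anomalies
# and one rare, large positive excursion (a caricature of two regimes).
toy = np.array([[-1.0], [-1.0], [-1.0], [3.0]])
variance, skewness = variance_and_skewness(toy)
```

Because the third moment is divided by a power of the variance, a given asymmetry produces a larger skewness where the variance is small, which is the sense in which extrema of skewness can be displaced towards latitudes of lower variance.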
Composites of the total Z500 field are given in Figure 5.18. The composites on the split-flow branch of the Z500A approximation do not differ substantially from those based on the SLPA analysis. The split-flow branch of the 1D Z500A NLPCA approximation is seen indeed to describe split flow over Greenland and an enhanced ridge over Western Canada. Variability along the oscillatory branch of the Z500A approximation describes alternating amplification and attenuation of the climatological ridges over both Europe and North America, in contrast to variability along the SLPA approximation, which was concentrated in the European sector.

Finally, Figure 5.19 displays composites of SLPA based on the 1D Z500A NLPCA approximation. As was the case with the Z500A and Z500 fields, the composites of SLPA over characteristic ranges of the time series $\alpha(t_n)$ corresponding to the Z500A approximation are more hemispheric in extent than was the case with the corresponding composites based on the SLPA approximation. In particular, there is enhanced variability over western Canada, corresponding to an amplified wavenumber 2 contribution to the anomaly field.
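The compositing procedure used for Figures 5.15, 5.18 and 5.19 can be sketched as follows. This is an illustrative reading of the text only: the bins are taken to be the twelve contiguous ranges of width 0.375 between -2.9 and 1.6 quoted in the figure panel titles, and the names `composite_by_alpha` and `edges` are hypothetical.

```python
import numpy as np

def composite_by_alpha(field, alpha, edges):
    """Average `field` (time, ...) over the times at which the standardised
    time series `alpha` falls in each half-open range [edges[k], edges[k+1])."""
    composites, counts = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (alpha >= lo) & (alpha < hi)
        counts.append(int(mask.sum()))
        if mask.any():
            composites.append(field[mask].mean(axis=0))
        else:
            # No months fall in this range: mark the composite as missing.
            composites.append(np.full(field.shape[1:], np.nan))
    return np.array(composites), np.array(counts)

# Thirteen edges give the twelve ranges (-2.9,-2.525), ..., (1.225,1.6).
edges = -2.9 + 0.375 * np.arange(13)
```

The returned counts play the role of the N values printed above each panel, and show directly how unevenly the branches are populated.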
Thus, a 1D NLPCA analysis of a second field from the CCCMA coupled GCM, viz. Northern Hemisphere wintertime 500mb geopotential height anomalies, produces results that are very similar to those produced by the analysis of the SLPA field, providing strong evidence that this structure is a robust feature of the data. The structures diagnosed in the two fields are not exactly the same: the Z500A approximation anomaly fields are more hemispheric in extent than those of the SLPA approximation, and the split-flow branch is occupied more frequently.

Figure 5.17: Maps of variance (a) and skewness (b) of CCCma modelled Z500A. Contour interval in (a) is 500 m² and in (b) is 0.2.

Figure 5.18: As with Figure 5.15, but for composites of Z500. The 5300 and 5500 m contours are in bold. Contour interval is 50 m. Recovered panel titles: (a) (-2.9,-2.525), N=39; (b) (-2.525,-2.15), N=237; (c) (-2.15,-1.775), N=442; (d) (-1.775,-1.4), N=367; (e) (-1.4,-1.025), N=97; (f) (-1.025,-0.65), N=59.

Figure 5.19: As in Figure 5.15, but for SLPA. Contour interval is 2 hPa (...,-3,-1,1,3,...). Recovered panel titles: (g) (0.85,1.225), N=600; (h) (1.225,1.6), N=34.
5.5  Analysis of GCM SLP in a 4×CO2 Integration

A plot of the histogram estimate of the 2D marginal PDF of SLPA from the GCM integration with CO2 concentrations at four times the pre-industrial level, in the space of the leading two control integration EOFs, is presented in Figure 5.20, along with the corresponding 1D NLPCA approximation. The NLPCA approximation explains 31.4% of the variance of the data, in contrast to 29.8% explained by the first PCA mode. The most striking aspect of this PDF is the fact that it appears to be much more Gaussian than the corresponding PDF for the control integration (Figure 5.4). In particular, the PDF of the 4×CO2 integration lacks the bulge in the lower-right quadrant associated with the split-flow branch of the control integration 1D NLPCA approximation. That the structure of the data is now much more nearly Gaussian is reflected in the structure of the 1D NLPCA approximation displayed in Figure 5.20. This approximation is no longer branched, but is a slightly curved line lying nearly along the major axis of the marginal distribution. The corresponding time series (Figure 5.21) is unimodal. Note that the NLPCA approximation of the SLPA in the 4×CO2 integration lies along what was the oscillatory branch of the SLPA NLPCA approximation from the control integration.

Figure 5.20: Histogram estimate of the marginal probability density function (contours) of SLPA from the GCM integration with CO2 concentrations at four times the pre-industrial value, in the space of the leading two control integration SLPA PCA modes, overlaid with the corresponding 1D NLPCA approximation (open circles). Contour intervals are $5 \times 10^{-4}$, $1.5 \times 10^{-3}$, $3 \times 10^{-3}$, $6 \times 10^{-3}$, $1 \times 10^{-2}$, $2 \times 10^{-2}$, $3 \times 10^{-2}$. The histogram bin size is 25 hPa in both directions.

Figure 5.21: Plot of the 1D NLPCA SLPA time series $\alpha(t_n)$ (left; horizontal axis t in months) and the associated histogram estimate of the PDF (right; horizontal axis frequency) for the GCM integration with CO2 concentration at four times the pre-industrial level.

Figure 5.22: As in Figure 5.8, but for SLPA in the GCM integration at four times the pre-industrial CO2 concentration. Contour interval is 2 hPa (...,-3,-1,1,3,...). Recovered panel titles: (a) (-5.4,-4.5), N=3; (b) (-4.5,-3.6), N=5; (c) (-3.6,-2.7), N=20; (d) (-2.7,-1.8), N=71; (e) (-1.8,-0.9), N=344; (f) (-0.9,0), N=1075.

Figure 5.22 displays composites of the 1D 4×CO2 SLPA NLPCA approximation over characteristic ranges of the (standardised) time series $\alpha(t_n)$. The variability illustrated in Figure 5.22 is not precisely that of a standing oscillation; the composites over the tails of $\alpha$ accentuate strongly positive anomalies over the northeast Pacific and northern Siberia, respectively.
Note, however, that the approximation is in a state between Figures 5.22(e) and (h) 93% of the time, over which range the structure is essentially that of a standing oscillation. Note also that the structure of this oscillation is very similar to that of the oscillatory branch of the 1D NLPCA approximation of SLPA in the control integration, with a slightly enhanced North Pacific centre of action. Thus, the dominant mode of nonlinear variability in SLPA under quadrupled CO2 is only weakly nonlinear and strongly resembles the oscillatory regime of SLPA variability under pre-industrial CO2 concentrations.

Palmer (1999) has suggested, based on experiments with simple low-dimensional nonlinear systems, that the response of the climate system to external perturbations (e.g. increased atmospheric CO2) will not be a change in the structure of dominant circulation regimes, but rather in their occupation frequencies. Our results are broadly consistent with this hypothesis. Under quadrupled CO2, the oscillatory regime of SLPA variability becomes more frequently occupied at the expense of the split-flow regime. In fact, the latter almost disappears entirely.
Our results are consistent as well with those of Ulbrich and Christoph (1999), who found that the ECHAM4+OPYC3 coupled GCM predicted a systematic northeastward shift of the centres of action of the NAO with an increase in atmospheric CO2. The NAO as diagnosed using linear statistical tools is seen in the light of the 1D control SLPA NLPCA approximation as a compromise between Branch 1 and Branch 2 variability. As the split-flow branch becomes depopulated with increasing atmospheric CO2, the compromise will increasingly favour the anomaly patterns characteristic of the oscillatory branch. This change would manifest itself in a regional analysis as a secular northeastward shift of the centres of action of the NAO.

Figure 5.23 displays maps of the variance and skewness of the SLPA field in the 4×CO2 integration. The variance displays local extrema over Northern Russia and Southern Alaska; both of these locations correspond to centres of action in the range of $\alpha$ over which the 1D NLPCA approximation is effectively a standing oscillation. The skewness is strongly positive over the Northeast Pacific and over European Russia. Inspection of the composites displayed in Figure 5.22 indicates that the NLPCA approximation is consistent with this distribution of skewness.
For strongly negative values of $\alpha(t_n)$, there are strong positive SLP anomalies centred south of Alaska that are not counterbalanced by similar negative anomalies for large positive $\alpha(t_n)$. Similarly, for strongly positive $\alpha(t_n)$, strong positive SLP anomalies occur over Northern Russia that do not have a negative counterpart for large negative values of $\alpha$. Thus, the 1D NLPCA approximation characterises the gross features of both the skewness and the variance of the 4×CO2 SLPA data. Note that while the tails in the distribution of $\alpha(t_n)$ do not contribute substantially to the variance, the structure of which is dominated by variability near the mean of $\alpha(t_n)$, they are manifest in the spatial structure of the skewness. Note as well that because skewness involves a compromise between the second and third order moments, its extrema are shifted somewhat southward of the centres of action in the composite towards latitudes of lower variance.

Thus, in the CCCMA coupled GCM, the quadrupling of the atmospheric concentration of CO2 results not so much in a change in the structures of atmospheric variability, but in a change in their occupation frequencies, consistent with the hypothesis of Palmer (1999).
The split-flow branch of the 1D control SLPA approximation is depopulated in the 4×CO2 run, while the oscillatory regime remains largely unchanged.

Figure 5.23: (a) Variance (contour interval 10 (hPa)²) and (b) skewness (contour interval 0.2) of SLPA from GCM integration with quadrupled atmospheric CO2.

5.6  Conclusions

In this chapter the NLPC analysis of sea level pressure anomaly data and 500mb geopotential height anomaly data from the Canadian Centre for Climate Modelling and Analysis coupled GCM has been considered. It was found that for data from a control integration at pre-industrial CO2 concentrations, in both the SLPA and the Z500A fields, the 1D NLPCA approximation was composed of three branches, and the corresponding time series was bimodal. The most frequently occupied branch describes the oscillatory amplification or attenuation of the climatological ridges over Europe, for the SLPA approximation, and over both Europe and North America for the Z500A approximation. The less populated branch is occupied episodically, and strongly resembles the negative phase of the North Atlantic Oscillation. Mid-tropospheric circulation patterns in this regime are associated with split flow over southern Greenland. The fact that the independently-determined 1D NLPCA approximations found for SLPA and Z500A are very similar provides strong evidence that these approximations represent actual structure in the data.
An analysis of SLPA data from a GCM integration with CO2 levels at four times the pre-industrial concentrations indicated that the characteristic regimes of low-frequency variability did not change in structure, but in occupation frequency, as was suggested by Palmer (1999). The oscillatory regime became increasingly populated at the expense of the split-flow regime, which in fact became so depopulated that the branched structure of the NLPCA approximation disappeared.

In all cases, it was seen that the 1D NLPCA approximation describes coherent hemispheric-scale atmospheric variability that is consistent with both the geographical distribution of variance and of skewness. In the case of the control run integrations, the strong skewness in the region of the NAO is associated with the fact that variability is dominated by Gaussian oscillations with episodic excursions to a single strong anomaly pattern. This mechanism for the origin of skewness has been suggested in the literature by Nakamura and Wallace (1991).

NLPC analyses were also carried out on a Southern Hemisphere SLPA data set from the CCCma coupled GCM and on the Trenberth and Paolino Northern Hemisphere SLPA data set (1980). In neither case could robust 1D NLPCA approximations be found that differed from the 1D PCA approximation. In the case of the Southern Hemisphere GCM SLPA, this result is likely because of the strong zonal symmetry of the lower boundary, which is not as favourable to the formation of geographically-fixed circulation anomalies as is the Northern Hemisphere, with its strong topography and land-sea contrast. There are two possible reasons for the failure to detect strong nonlinear structure in the Trenberth and Paolino SLPA data. The first is simply that such structure is not there, and that the structure of the data is predominantly linear.
The second is that the structure is there, but cannot be found because the records are short and data are of very poor quality in the polar regions, precisely the latitudes in which the nonlinear structure found in the GCM data is strongest. In fact, no data are reported on the latitude circles 75°N and 85°N or at the pole. A future study will consider the analysis of NCEP reanalysis data, which does not suffer from this deficiency in geographical coverage.

Chapter 6

Seven-Layer Networks for Discontinuous Projection and Expansion Functions

6.1  Introduction

Kramer's NLPCA allows the estimation of continuous functions $s_f$ and $f$ such that

$$\hat{X}(t_n) = (f \circ s_f)(X(t_n)) \qquad (6.1)$$

is the optimal (in the least-squares sense) approximation to $X(t_n)$ by a continuous curve or surface. As was pointed out by Kirby and Miranda (1996) and by Malthouse (1998), and discussed in Chapter 2, the restriction to continuity of the projection and expansion maps precludes the detection and characterisation by NLPCA of P-dimensional structures not topologically equivalent to the unit cube in $\Re^P$. The simplest example illustrating this problem arises in the NLPC analysis of the unit circle embedded in $\Re^2$,

$$S = \{(\cos t, \sin t);\ t \in [0, 2\pi)\}. \qquad (6.2)$$

The topology of the manifold parameterising this structure is $S^1$; the corresponding map $s_f : \Re^2 \mapsto \Re$ is discontinuous, because for any $0 < \epsilon \ll 1$ there will be a small open neighbourhood $D$ on the circle about the point $(1,0)$ such that $s_f(D) = [0, \epsilon) \cup (2\pi - \epsilon, 2\pi)$.
There is no homeomorphism (a continuous map with a continuous inverse) between the circle and the unit interval in $\Re$, because they are topologically inequivalent. Thus, NLPCA as formulated by Kramer cannot characterise the low-dimensional structure underlying a self-intersecting curve or surface. On the other hand, if the functions in (6.1) are allowed to be discontinuous, then there exists a class of parameter manifolds topologically inequivalent to the unit cube in $\Re^P$ such that the projection and expansion functions of an NLPCA approximation $\hat{X}(t_n)$ can be approximated. This class includes topologies such as the circle ($S^1$) and torus ($S^1 \otimes S^1$) but excludes the sphere ($S^2$). Because the parameterisation of the sphere is degenerate at the poles, it cannot be expressed as a simple discontinuous function from $\Re^2 \mapsto \Re^3$.

Given enough neurons in its single hidden layer, a three-layer feed-forward neural network with M input neurons and P output neurons can approximate to arbitrary accuracy any continuous function from $\Re^M$ to $\Re^P$. Such a network generally does a poor job approximating a discontinuous function between these spaces. If, however, the number of hidden layers is increased to two, each with nonlinear transfer functions, the network is much better able to approximate a discontinuous function. This fact suggests the generalisation of Kramer's 5-layer network to a 7-layer network, with two encoding and two decoding layers. In such a network, the first four layers approximate the (potentially discontinuous) function $s_f : \Re^M \mapsto \Re^P$ and the last four layers the (potentially discontinuous) function $f : \Re^P \mapsto \Re^M$; the 7-layer network is just the composition of these two functions.
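The layout of such a 7-layer network can be sketched as a forward pass only. Everything specific here is an assumption made for illustration: the layer widths (M = 2, L = 5, P = 1), the random untrained weights, the use of hyperbolic tangents in the encoding and decoding layers with linear bottleneck and output layers, and the helper names `init_layers`, `forward`, `project` and `expand`.

```python
import numpy as np

def init_layers(sizes, rng):
    """Random (untrained) weights and zero biases for consecutive layers."""
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers, use_tanh):
    """Propagate x through the layers, applying tanh where flagged."""
    for (weights, bias), nonlinear in zip(layers, use_tanh):
        x = x @ weights + bias
        if nonlinear:
            x = np.tanh(x)
    return x

M, L, P = 2, 5, 1                          # assumed layer widths
rng = np.random.default_rng(0)
encoder = init_layers([M, L, L, P], rng)   # first four layers: s_f
decoder = init_layers([P, L, L, M], rng)   # last four layers: f
tanh_flags = [True, True, False]           # two nonlinear layers, linear output

def project(x):
    return forward(x, encoder, tanh_flags)

def expand(a):
    return forward(a, decoder, tanh_flags)

t = np.linspace(0.0, 1.0, 8, endpoint=False)
X = np.column_stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)])
alpha = project(X)        # P-dimensional time series
X_hat = expand(alpha)     # composition f(s_f(X))
```

The bottleneck layer is shared between the two halves, so the seven layers of neurons (M, L, L, P, L, L, M) are connected by six sets of weights. Training would adjust these weights to minimise the squared difference between `X` and `X_hat`; no training is attempted in this sketch.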
This chapter demonstrates that in fact a 4-layer neural network is better able to approximate a discontinuous function than is a 3-layer network, and investigates the application of the generalised 7-layer network to the NLPC analysis of an ellipse embedded in $\Re^2$.

6.2  Neural Network Approximations to Discontinuous Functions

To demonstrate the superior ability of a 4-layer neural network to approximate a discontinuous function relative to that of a 3-layer network, consider the N-point data sets:

$$X(t_n) = (\cos 2\pi t_n, \sin 2\pi t_n)$$

$$Y(t_n) = \begin{cases} t_n + 0.5 & 1/N \le t_n < 0.5 \\ t_n - 0.5 & 0.5 \le t_n \le 1 \end{cases} \qquad (6.3)$$

where $t_n = 1/N, 2/N, \ldots, 1$. The map $g$ relating $X(t_n)$ and $Y(t_n)$ is discontinuous at $t_n = 0.5$; it is displayed in Figure 6.1. The functional relationship between $X(t_n)$ and $Y(t_n)$ (with N = 1000) is modelled using two different feed-forward neural networks. The first, denoted NN1, has a single hidden layer with 13 neurons, and the second, NN2, has two hidden layers with 5 neurons in each. Hyperbolic tangents were used as the transfer functions in each of the hidden layers. The number of network parameters in NN1 and NN2 is 53 and 51 respectively. Each network was trained for 10000 iterations, starting from randomly-determined initial weights and biases. Because the data are noise-free, early-stopping was not employed in the training process.
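As a check on the numbers quoted above, the following sketch builds the data sets of equation (6.3) and counts the weights and biases of fully connected networks with layer widths 2-13-1 (NN1) and 2-5-5-1 (NN2). The helper name `n_parameters` is hypothetical, and assigning $t_n = 0.5$ to the second branch of $Y$ is an assumption, since the inequalities in the source are ambiguous at that point.

```python
import numpy as np

def n_parameters(sizes):
    """Weights plus biases of a fully connected feed-forward network."""
    return sum(m * n + n for m, n in zip(sizes[:-1], sizes[1:]))

nn1_params = n_parameters([2, 13, 1])     # one hidden layer of 13 neurons
nn2_params = n_parameters([2, 5, 5, 1])   # two hidden layers of 5 neurons

# Data sets of equation (6.3) with N = 1000.
N = 1000
t = np.arange(1, N + 1) / N
X = np.column_stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)])
Y = np.where(t < 0.5, t + 0.5, t - 0.5)   # t = 0.5 placed on the lower branch (assumed)

largest_jump = np.abs(np.diff(Y)).max()   # size of the discontinuity in Y
```

The counts reproduce the 53 and 51 parameters stated in the text, so the two networks differ in architecture rather than in size. `X` is continuous in `t` while `Y` jumps by almost one unit at `t = 0.5`, which is the feature the networks must reproduce.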
Figures 6.2(a) and (b) display respectively the approximations to $Y(t_n)$ produced by neural networks NN1 and NN2. Note that NN2 is much better able to represent the discontinuity in the relationship between $X(t_n)$ and $Y(t_n)$ than is NN1. The network NN1 spreads the discontinuity out over a number of points, and introduces Gibbs oscillations in its vicinity. The point of discontinuity is much better represented by NN2: the width over which the discontinuity is spread is substantially reduced, and the Gibbs oscillations are suppressed. This difference is not a result of differences in initial weights between the two cases, as an ensemble of independently trained networks (not shown) demonstrates that for both NN1 and NN2 the approximations are independent of initial parameter values. Nor is the improvement a reflection of the number of parameters in the model, as NN1 and NN2 have essentially the same number of parameters. Thus, NN2 is better able to approximate discontinuous functions than is NN1 because of differences in their architecture.

Figure 6.1: Plot of $Y(t_n)$ (diamonds) and $X(t_n)$ (open circles) as defined by equation (6.3) with N = 50.
Seven-Layer  Networks  for Discontinuous  Functions  129  F i g u r e 6.2: N e u r a l n e t w o r k a p p r o x i m a t i o n s o f t h e f u n c t i o n a l r e l a t i o n s h i p b e t w e e n d a t a sets X ( i ) a n d Y(t ) n  n  defined i n e q u a t i o n (6.3): (a) n e t w o r k w i t h one h i d d e n l a y e r , (b)  n e t w o r k w i t h t w o h i d d e n layers.  Chapter 6. Seven-Layer  6.3  Networks for Discontinuous  Functions  130  7-Layer N L P C A N e t w o r k  T h e i m p r o v e d a b i l i t y o f a 4-layer n e u r a l n e t w o r k over a 3-layer n e t w o r k t o a p p r o x i m a t e d i s c o n t i n u o u s f u n c t i o n s suggests t h a t N L P C A c a r r i e d out u s i n g a 7-layer n e t w o r k s h o u l d b e b e t t e r able t h a n t h a t u s i n g a 5-layer n e t w o r k t o a p p r o x i m a t e s t r u c t u r e s w h o s e p r o j e c t i o n a n d e x p a n s i o n f u n c t i o n s are d i s c o n t i n u o u s . I n this s e c t i o n , t h i s h y p o t h e s i s is t e s t e d t h r o u g h c o n s i d e r a t i o n o f the ID N L P C A a p p r o x i m a t i o n o f a n e l l i p s e i n t h e p l a n e :  = (0.5 cos 27rt , 0.7 s i n 27r* )  X(t )  n  n  for n = l , N  n  t  n  n — 1 = — —  (6.4)  — 1, w h e r e N = 550. A s i n the p r e v i o u s s e c t i o n , b e c a u s e these d a t a  are  noise-free, n o e a r l y s t o p p i n g was u s e d i n c o n s t r u c t i n g t h e m o d e l . Figure a(t ) n  6.3  displays  the  NLPCA  approximation  X(£„)  and  the  time  series  = 5 / ( X ( t ) ) o b t a i n e d u s i n g a 5-layer a u t o a s s o c i a t i v e n e u r a l n e t w o r k w i t h L = 13 n  nodes i n the m a p p i n g a n d d e m a p p i n g layer.  C o n v e r g e n c e t o t h i s a p p r o x i m a t i o n was  e x t r e m e l y slow; t h e r e s u l t s s h o w n are f r o m a r u n i n w h i c h 1 0 out i n the t r a i n i n g .  
6  i t e r a t i o n s were c a r r i e d  A s was discussed i n M a l t h o u s e (1998), t h e a p p r o x i m a t i o n X ( i ) n  d i s p l a y s h i g h f i d e l i t y to t h e o r i g i n a l d a t a t h r o u g h o u t t h e b u l k o f t h e d o m a i n (the F U V is 1.2 x 1 0 ) , b u t fails e n t i r e l y over a r a n g e o f p o i n t s c e n t r e d n e a r (0.5,0.2). T h i s set o f - 3  p o i n t s c o r r e s p o n d s t o t h e n e i g h b o u r h o o d o f the p o i n t at w h i c h t h e i d e a l p r o j e c t i o n a n d e x p a n s i o n f u n c t i o n s are d i s c o n t i n u o u s because t h e t o p o l o g y o f t h e m a n i f o l d p a r a m e t e r i s i n g t h e c u r v e is S . 1  N o t e t h a t the p o s i t i o n o f t h i s p o i n t is a r b i t r a r y , a n d is d e t e r m i n e d  r a n d o m l y by the i n i t i a l network parameters.  T h e t i m e series oc(t ) is u n a b l e t o r e p r e n  sent t h i s d i s c o n t i n u i t y , d i s p l a y i n g p r e c i s e l y the s a m e o v e r s h o o t as was o b s e r v e d i n F i g u r e 6.2(a).  B e c a u s e o f t h e r e s t r i c t i o n of sj a n d f to the space o f c o n t i n u o u s f u n c t i o n s ,  the  5-layer a u t o a s s o c i a t i v e n e t w o r k c a n n o t a c c u r a t e l y a p p r o x i m a t e t h e e l l i p s e . O n t h e o t h e r h a n d , F i g u r e 6.4 d i s p l a y s t h e N L P C A a p p r o x i m a t i o n X ( £ ) a n d t h e corn  r e s p o n d i n g t i m e series ot(t ) = Sf(X(t )) n  n  o b t a i n e d u s i n g a 7-layer a u t o a s s o c i a t i v e n e u r a l  Figure 6.3: Results of ID N L P C analysis of an ellipse using a 5-layer autoassociative neural network: (a) N L P C A approximation X ( i ) , (b) associated time series a ( t ) = Sf(X.(t )) (note scale on y-axis is arbitrary). n  n  n  Chapter 6. 
Seven-Layer Networks for Discontinuous Functions  132  n e t w o r k w i t h 5 h i d d e n n e u r o n s i n each o f t h e t w o m a p p i n g a n d d e m a p p i n g l a y e r s . A s was t h e case w i t h t h e 5-layer n e t w o r k , this n e t w o r k was t r a i n e d for 1 0 i t e r a t i o n s . 6  a p p r o x i m a t i o n d i s p l a y s great f i d e l i t y t o t h e o r i g i n a l d a t a t h r o u g h o u t  This  the d o m a i n (the  F U V is 2.4 x 1 0 ) , e x c e p t for a r a t h e r s m a l l i n t e r v a l c e n t r e d n e a r (-0.5,-0.1). A g a i n , t h i s - 5  region corresponds to the neighbourhood of the point i n w h i c h the ideal m a p p i n g a n d d e m a p p i n g f u n c t i o n s are d i s c o n t i n u o u s ; n o t e t h a t i t is m u c h s m a l l e r t h a n t h e c o r r e s p o n d i n g r e g i o n o b t a i n e d u s i n g a 5-layer n e t w o r k . A s w e l l , t h e F U V o f t h e a p p r o x i m a t i o n f r o m t h e 7-layer n e t w o r k is t w o orders o f m a g n i t u d e s m a l l e r t h a n t h a t o f t h e a p p r o x i m a t i o n f r o m t h e 5-layer n e t w o r k . T h e t i m e series a(t ) n  f r o m t h e 7-layer n e t w o r k is a m u c h b e t -  ter a p p r o x i m a t i o n o f t h e d i s c o n t i n u o u s p r o j e c t i o n m a p t h a n was t h a t o b t a i n e d f r o m t h e 5-layer n e t w o r k . T h e 5-layer a u t o a s s o c i a t i v e n e u r a l n e t w o r k c o n s i d e r e d a b o v e c o n t a i n e d 107 p a r a m eters, w h i l e t h e 7-layer n e t w o r k c o n t a i n e d 103. T h u s , t h e s u p e r i o r p e r f o r m a n c e o f t h e 7-layer n e t w o r k is n o t d u e t o i t s h a v i n g a larger n u m b e r o f p a r a m e t e r s .  B o t h networks  were t r a i n e d for t h e s a m e n u m b e r o f i t e r a t i o n s . 
Other training runs (not shown) display the same superiority of the 7-layer network to the 5-layer network; this result is insensitive to the choice of initial model parameters. Thus, the 7-layer network is superior to the 5-layer network because of differences in architecture; the presence of two mapping and demapping layers allows the network to approximate a broader class of projection and expansion functions than that open to Kramer's original network.

Figure 6.4: As in Figure 6.3, but for NLPCA performed using a 7-layer autoassociative neural network.

6.4  Conclusions

It has been demonstrated that a 7-layer generalisation of Kramer's 5-layer network, in which two mapping and demapping layers are used, displays a marked superiority in its ability to model low-dimensional structure whose corresponding projection and expansion functions are discontinuous. This superiority was shown through the NLPC analysis of an ellipse in the plane, whose projection and expansion functions are discontinuous because of the $S^1$ topology of the manifold parameterising the ellipse.
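The origin of the discontinuity can be illustrated numerically. The sketch below samples an ellipse with the semi-axes of equation (6.4) and recovers a single real parameter for each point from its angle. The sample spacing $t_n = (n-1)/(N-1)$ and the placement of the cut at the point (0.5, 0) are assumptions made purely for illustration; any other single-valued parameterisation of the closed curve has a jump somewhere as well.

```python
import numpy as np

N = 550
# Assumption: N - 1 equally spaced samples covering one circuit of the ellipse.
t = np.arange(N - 1) / (N - 1)
X = np.column_stack((0.5 * np.cos(2 * np.pi * t), 0.7 * np.sin(2 * np.pi * t)))

# One possible projection: the curve parameter recovered as an angle in [0, 1).
lam = np.mod(np.arctan2(X[:, 1] / 0.7, X[:, 0] / 0.5) / (2 * np.pi), 1.0)

# Distance between neighbouring points, going once around the closed curve.
closed = np.vstack((X, X[:1]))
spatial_steps = np.linalg.norm(np.diff(closed, axis=0), axis=1)

# Change in the parameter between the same neighbouring points.
param_steps = np.abs(np.diff(np.append(lam, lam[0])))

print(spatial_steps.max())  # small: neighbouring points are always close
print(param_steps.max())    # close to 1: the parameter jumps at the cut
```

Neighbouring points on the curve are never far apart, yet the parameter changes by nearly its full range between the last sample and the first. A network restricted to continuous projection functions can only smear this jump over a range of points.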
Another solution to this problem was presented by Kirby and Miranda (1996), who introduced the concept of "circular nodes" in neural networks. Because standard nodes in neural networks map to the real line, Kirby and Miranda noted that such nodes are unable to encode "angular" information. In other words, they cannot map continuously to a space with $S^1$ topology. Kirby and Miranda introduced the idea of coupling pairs of nodes such that their output is constrained to fall on the unit circle. These coupled nodes can be treated as a single abstract node that can encode angular information. This coupling requires modification of the backpropagation algorithm (Appendix A); such a modification is presented by Kirby and Miranda.

There are two problems with Kirby and Miranda's approach. First, the method is somewhat difficult to implement, as it requires a modification of the backpropagation algorithm and thus cannot be carried out using a regular, commercial neural network package. Second, the use of circular nodes presumes the existence of periodic structure in the data. The 7-layer network presented here has neither of these difficulties.
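The coupling of a pair of nodes described above can be sketched as follows. The normalisation used here, dividing the two pre-activations by their joint magnitude, is one simple way to force the pair onto the unit circle and is an assumption of this sketch; it is not necessarily the exact formulation of Kirby and Miranda, and the corresponding changes to backpropagation are not shown. The name `circular_node` is hypothetical.

```python
import numpy as np


def circular_node(p, q, eps=1e-12):
    """Couple two pre-activations so that the output pair lies on the unit circle.

    eps is a floor on the radius, guarding against division by zero.
    """
    r = np.maximum(np.hypot(p, q), eps)
    return p / r, q / r


# Hypothetical pre-activations for three samples.
p = np.array([3.0, -1.0, 0.0])
q = np.array([4.0, 1.0, -2.0])
u, v = circular_node(p, q)

# The pair (u, v) encodes a single angular variable.
theta = np.arctan2(v, u)
print(u ** 2 + v ** 2)  # every entry is very close to 1
```

Because the two outputs carry only one degree of freedom, the pair behaves as a single bottleneck variable taking values on $S^1$ rather than on the real line.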
Kirby and Miranda's method does, however, allow an exact characterisation of maps to $S^1$, whereas the 7-layer generalisation of Kramer's network produces only approximations. It is not clear that this lack of exactness is a problem in practice, as neural networks can only ever produce approximate models of data.

Few large-scale climatic phenomena display the strict periodicity of the data set considered in this section. Climate variability is much more irregular, and examples of strictly periodic limit cycles are rare. Examples of periodic variability include the seasonal cycle (including annual and higher harmonics), the tides, and the diurnal cycle; all of these are periodic because they arise from external, astronomical forcings which are strongly periodic. These signals are of well-known frequency, as are their harmonics arising through nonlinear rectification (Huang and Sardeshmukh, 1999). If the annual cycle and its harmonics are stationary in time, they can be removed from the data using harmonic analysis. If stationarity does not obtain, more complicated techniques such as wavelet analysis could be used instead. Thus, the inability of Kramer's 5-layer autoassociative neural network to model strictly periodic variability is not expected to be a significant liability in the analysis of climate data. The fact that this problem can be addressed by generalising Kramer's network is an interesting theoretical result.
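The removal of a stationary annual cycle by harmonic analysis, mentioned above, can be sketched as a least-squares fit of a mean plus sinusoids at the annual frequency and its harmonics. The function name `remove_annual_harmonics`, the monthly sampling, the choice of two harmonics, and the synthetic series are all assumptions made for illustration; they are not taken from the thesis.

```python
import numpy as np


def remove_annual_harmonics(x, n_harmonics=2, period=12.0):
    """Fit a mean plus annual harmonics by least squares.

    Returns the residual anomalies and the fitted cycle.
    period is the length of the annual cycle in samples (12 for monthly data).
    """
    t = np.arange(x.size)
    columns = [np.ones(x.size)]
    for k in range(1, n_harmonics + 1):
        omega = 2 * np.pi * k / period
        columns.append(np.cos(omega * t))
        columns.append(np.sin(omega * t))
    A = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(A, x, rcond=None)
    cycle = A @ coeffs
    return x - cycle, cycle


# Synthetic 20-year monthly series: a seasonal cycle plus weak noise.
rng = np.random.default_rng(0)
t = np.arange(240)
seasonal = 1.5 + 3.0 * np.cos(2 * np.pi * t / 12) + 0.5 * np.sin(4 * np.pi * t / 12)
x = seasonal + 0.1 * rng.standard_normal(t.size)

anomalies, cycle = remove_annual_harmonics(x)
print(anomalies.std(), x.std())  # the anomalies vary far less than the raw series
```

Because the fit includes a constant term, the anomalies have zero mean. The approach relies on the amplitude and phase of each harmonic being constant over the record, which is the stationarity condition noted in the text.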
In practice, the application of the 7-layer network will probably be limited to data thought to have been generated by a highly periodic system.

Chapter 7

Summary and Conclusions

7.1  Summary

Principal component analysis is a tool of great utility in the analysis of climate data. The phase space of a typical climatic data set has a dimensionality in the range from hundreds to thousands, which makes straightforward visualisation impossible. PCA finds the ordered set of axes in the phase space which provides the most efficient linear description of the data, in that the projection of the data into the space spanned by the first $P$ axes is the optimal $P$-dimensional linear approximation to the data (in a least-squares sense). It is thus a tremendously useful algorithm for the reduction of data dimensionality.

There is, however, no a priori reason to believe that any lower-dimensional structure underlying a multivariate dataset is linear, in that it is optimally described by a set of orthogonal axes. Indeed, it is a basic result of the theory of dynamical systems that if the dynamical system

$$\dot{\mathbf{x}}(t) = \mathbf{F}(\mathbf{x}(t)) \qquad (7.1)$$

$$\mathbf{x}(0) = \mathbf{x}_0 \qquad (7.2)$$

possesses a $K$-dimensional stable attractor $\Gamma$ (where $K$ is not necessarily an integer), then in general the description of the manifold $\Gamma$ by a set of Cartesian coordinates (i.e., orthogonal coordinate axes) may require an embedding space of dimension as large as $2K + 1$. PCA can determine an appropriate embedding space for a general low-dimensional structure, but it cannot provide the most efficient description of this surface.

In an effort to circumvent this limitation of PCA, Kramer (1991) introduced a nonlinear generalisation of PCA, which he denoted Nonlinear Principal Component Analysis. Implemented using a 5-layer feed-forward neural network, NLPCA attempts to find functions $s_f : \mathbb{R}^M \to \mathbb{R}^P$ and $\mathbf{f} : \mathbb{R}^P \to \mathbb{R}^M$ such that the approximation $\hat{\mathbf{X}}(t_n) = (\mathbf{f} \circ s_f)(\mathbf{X}(t_n))$ is the optimal $P$-dimensional nonlinear approximation to the $M$-dimensional data set $\mathbf{X}(t_n)$, in a least-squares sense. If the functions $s_f$ and $\mathbf{f}$ are constrained to be linear, this approach reduces to PCA. This thesis presents the first systematic application of NLPCA to the analysis of climate data. A summary of the results follows.

1. The similarities and differences between PCA and NLPCA were discussed. Both PCA and NLPCA can be characterised as variational problems for detecting lower-dimensional structure in multivariate data sets: PCA finds linear structure, while NLPCA can find more general nonlinear structure. The $P$-dimensional PCA approximation to a data set is the sum of its first $P$ one-dimensional approximations; in NLPCA, the situation is more complicated. A single $P$-dimensional surface determined using an autoassociative neural network with $P$ neurons in the bottleneck layer is said to be a nonmodal approximation, while the sum of the first $P$ one-dimensional approximations is said to be a modal approximation; these two approaches are generally distinct. A degeneracy in the parameterisation of the surface determined by a $P$-dimensional nonmodal NLPCA complicates interpretation of the parameterisation for $P \ge 2$, and generally requires the consideration of a secondary feature extraction problem; thus, modal NLPCA is preferred for the analysis of real climate data. Both 1D PCA and NLPCA modes correspond to a single time series; however, unlike PCA, a 1D NLPCA mode does not have a unique spatial pattern. In fact, the approximation is characterised by a sequence of spatial patterns, which may be visualised cinematographically.
Both PCA and NLPCA partition variance in that the sum of the total variance of a $P$-dimensional approximation with the total variance of the residual equals the variance of the original data. This result can be proved for PCA, but so far remains an empirical result for NLPCA.

2. Implementation of NLPCA was considered in some detail. The NLPCA algorithm is carried out using a 5-layer autoassociative feed-forward neural network model. Because NLPCA involves the solution of a nonlinear variational problem, no analytic formula for the solution exists, and iterative function minimisation procedures must be used. This minimisation process is referred to as "training". An issue of primary importance is the avoidance of overfitting; an NLPCA approximation should robustly characterise lower-dimensional structure in the data. Overfitting was avoided through the use of an early stopping technique in which a randomly selected portion of the original data is set aside and not used to fit the model parameters. The performance of the model over this withheld data set is monitored as training progresses; if the model performance over this validation set is inferior to the performance over the training data, the model is discarded.
An ensemble of models is constructed, each of which satisfies the robustness criteria described above. These are referred to as the "candidate models". The number of neurons in the encoding and decoding layers of the autoassociative network and the number of iterations used to train the model are adjusted until a sufficient number of similar candidate models is obtained, at which point a representative member is extracted and said to be the NLPCA approximation to the data.

3. A preliminary investigation of NLPCA was carried out using data sampled from the Lorenz attractor, to which random noise was added. It was found that at low to moderate noise levels NLPCA was able to produce robust approximations that were more characteristic of the data, and explained higher percentages of the variance, than were those produced by PCA. In the limit of no noise, the 1D NLPCA approximation explained 76% of the total variance while the corresponding PCA approximation explained 60%. The NLPCA mode was able to describe covariability between two uncorrelated but dependent variables, which PCA cannot do. A 2D nonmodal NLPCA approximation explained 99.5% of the variance, while the 2D PCA approximation explained 95%. As the noise level increased, the improvement of NLPCA over PCA decreased, in terms of the percentage of variance explained. At high noise levels, in which the structure of the Lorenz data was entirely obscured, NLPCA could not produce robust approximations that differed from those obtained by PCA.

4.
A tropical Pacific Ocean sea surface temperature data set was analysed by NLPCA. The 1D NLPCA approximation to these data describes average ENSO variability. Unlike the 1D PCA approximation, which also describes average ENSO variability, the 1D NLPCA approximation is able to characterise the asymmetry in SST spatial patterns between average El Niño and La Niña events. The distribution of SST is skewed toward positive anomalies in the eastern Pacific and toward negative anomalies in the western Pacific, and the 1D NLPCA approximation is able to characterise this distribution of skewness. The second NLPCA mode is also related to ENSO, and seems to characterise differences between individual events. It is particularly active in the period after 1977, a time which has been noted in a number of studies as corresponding to a shift in ENSO variability. A 2D nonmodal NLPCA approximation to the SST was also determined. A secondary PCA analysis in the 2D space of variables parameterising this surface indicated that the variability described by this approximation contains essentially the same information as the first two NLPCA modes. An NLPC analysis of tropical Indo-Pacific sea level pressure was also carried out. The first mode was found to correspond to the ENSO signal in SLP, and characterised asymmetries in the SLP between El Niño and La Niña events. No robust nonlinear structure could be detected in the residuals from the first NLPCA mode; within the constraints imposed by the quantity of data available, SLP variability is linear beyond the first NLPCA mode.

5.
The first NLPCA mode of monthly-averaged Northern Hemisphere SLPA from the Canadian Centre for Climate Modelling and Analysis coupled GCM was found to partition the data into two distinct populations. The 1D NLPCA approximation had a three-branched structure, and the distribution of the associated time series was strongly bimodal. One branch (Branch 1), associated with the larger peak of the distribution of the time series, corresponded to a standing oscillation with anomalies of opposite sign over the polar region and the midlatitudes, strongly resembling the Arctic Oscillation. Most of the data projected onto this branch. A second branch (Branch 2), which corresponded to the smaller peak of the time series PDF, was only occupied episodically, and strongly resembled the negative phase of the North Atlantic Oscillation. The Branch 1 signal in 500 mb geopotential height composites (based on the SLPA analysis) described alternating amplification and attenuation of the climatological ridge over Europe, while Branch 2 described strongly split flow over Greenland. An analysis of the SLP skewness indicates that there are strong positive and negative local extrema in skewness in the same locations as the positive and negative extrema of the Branch 2 anomaly patterns.
These extrema in skewness thus seem to arise because of the combination of a standing oscillation displaying Gaussian variability with episodic occurrences of a strongly anomalous circulation, a possibility that has been suggested by other authors in the past. An NLPC analysis of the 500 mb height anomaly field itself resulted in a branched approximation that was very similar to that obtained from the SLPA field; the primary difference between the two is that the former corresponds to anomaly fields that are somewhat more hemispheric in extent. Thus, the two-regime structure identified using NLPCA appears to be equivalent barotropic in nature. Finally, the results of an analysis of SLPA from a GCM run with CO$_2$ concentrations at four times the pre-industrial levels indicated that the oscillatory branch of the control NLPCA approximation was largely unchanged but that the split-flow branch was substantially depopulated. This behaviour is consistent with the suggestion by Palmer (1999) that the climate response to greenhouse forcing will not be changes in the structure of characteristic circulation regimes, but in their occupation frequencies.

6. Because Kramer's 5-layer autoassociative neural network can only find continuous projection and expansion functions, a 7-layer generalisation was suggested for the analysis of data sets where the expansion and projection functions are discontinuous. An ellipse is such a data set, because the manifold parameterising the low-dimensional approximation is topologically different from the unit interval. It was found that the NLPCA approximation produced by a 7-layer autoassociative neural network was substantially better than that produced by a 5-layer network, because the former was much better able to approximate discontinuous projection and expansion functions.
It was demonstrated that this improvement was due to the different architectures of the two networks, and not to a difference in the number of model parameters.

7.2  Conclusions

Nonlinear Principal Component Analysis has been demonstrated to be a useful tool for the analysis of climate data. Where PCA characterises the variance in a multivariate data set, NLPCA is able to also characterise higher order moments of variability. This thesis has introduced NLPCA to the study of climate data, but there remains work to be done. The model building procedures described in Chapter 2 retain an element of subjectivity. It would be useful to develop an automated, objective technique for building NLPCA approximations; such a methodology could perhaps use a sophisticated regularisation technique such as Generalised Cross Validation (Yuval, 1999). Useful generalisations of NLPCA might also be developed through modifications of the cost function to include constraints on the approximation. An example would be a simple structure constraint of the form used in rotated PCA analysis (Richman, 1986). A second modification of the cost function to ensure self-consistency of the NLPCA model is suggested in Rico-Martinez et al. (1996). Kirby and Miranda (1999) suggest a number of other possible constraints.
As well, NLPCA could be used in a nonlinear generalisation of Singular Systems Analysis (SSA), which is simply PCA applied to a time series expressed in delay coordinate space (Broomhead and King, 1986; von Storch and Zwiers, 1999). Finally, analytic demonstrations of features of NLPCA discovered empirically, such as the partitioning of variance, are lacking.

The analysis of large multivariate datasets, either observations or GCM output, is an important activity for the understanding of climate variability. Nonlinear Principal Component Analysis will not replace traditional PCA, because it is more difficult to implement. However, I believe that NLPCA may well become an important addition to the geophysical statistician's toolbox, and will provide important insight into the variability of the climate system.

Appendix A

Neural Networks

As is described in detail by Bishop (1995) and by Hsieh and Tang (1998), a feed-forward neural network is a non-parametric statistical model used to estimate (generally nonlinear) functional relations between two data sets, $\mathbf{X}(t_n) \in \mathbb{R}^S$ and $\mathbf{Z}(t_n) \in \mathbb{R}^T$. The neural network is composed of a series of parallel layers, each of which contains a number of processing elements, or neurons, such that the output of the $i$th layer is used as input to the $(i+1)$th.
If $y_j^{(i)}$ is the output of the $j$th neuron of the $i$th layer, then

$$y_k^{(i+1)} = \sigma^{(i+1)}\left( \sum_j w_{jk}^{(i+1)} y_j^{(i)} + b_k^{(i+1)} \right) \qquad (A.1)$$

is the output of the $k$th neuron of the $(i+1)$th layer. The elements of the arrays $w_{jk}^{(i+1)}$ are referred to as the weights, and those of the vectors $b_k^{(i+1)}$ as the biases. The transfer function characterising the $(i+1)$th layer is denoted $\sigma^{(i+1)}$; it may be linear or nonlinear. The first, or input layer, receives the values of the data presented to the network; its transfer function is simply the identity map $\sigma_I : x \mapsto x$. The famous flexibility of neural networks comes from the use of nonlinear transfer functions (typically hyperbolic tangents) in some or all of the remaining layers. An important result due to Cybenko (1989) is that a 3-layer neural network with $S$ input neurons, hyperbolic tangent transfer functions in the second layer and linear transfer functions in the third layer of $T$ neurons can approximate to arbitrary accuracy any continuous function from $\mathbb{R}^S$ to $\mathbb{R}^T$, if the number of neurons in the second layer is sufficiently large.

Figure A.1: Diagrammatic representation of neural network with input data X and output data Z.

Neural networks are often represented diagrammatically, with open circles representing the neurons and straight lines the weights, as is illustrated in Figure A.1.
These diagrams are meant to be suggestive of biological neuronal systems, reflecting the origin of neural network theory in the context of artificial intelligence research.

Feed-forward neural networks as described above are fit to data, or trained, as follows. Suppose it is desired to fit the data $\mathbf{X}(t_n) \in \mathbb{R}^S$ to the data $\mathbf{Z}(t_n) \in \mathbb{R}^T$. Denoting the network as $\mathcal{N}$, the weights and biases (referred to collectively as $\mu$) are adjusted until the cost function

$$J = \left\langle \| \mathbf{Z} - \mathcal{N}(\mathbf{X}; \mu) \|^2 \right\rangle \qquad (A.2)$$

is minimised. That is, parameters $\mu_{\min}$ are determined such that

$$\left. \frac{\partial J}{\partial \mu_i} \right|_{\mu = \mu_{\min}} = 0 \qquad (A.3)$$

for each parameter $\mu_i$. In the above, the angle brackets $\langle \cdot \rangle$ denote averaging over time and $\| \cdot \|$ denotes the $L_2$-norm. For a network in which all transfer functions $\sigma^{(i)}$ are linear, equation (A.3) can be expressed as a simple matrix equation which admits an analytic solution, and the approach simply reduces to multivariate regression. When the transfer functions are nonlinear, equation (A.3) does not reduce to a simple matrix equation, and the cost function $J$ must be minimised numerically.

The minimisation of the cost function was carried out using a conjugate-gradient algorithm (Press et al., 1992). At each step of this algorithm, the gradient of $J$ with respect to the parameters over which it is being minimised must be evaluated. It is easy to show that for an $l$-layer neural network,

$$\frac{\partial J}{\partial w_{jk}^{(l)}} = -2 \left\langle \epsilon_k \, \sigma^{(l)\prime}(\xi_k^{(l)}) \, y_j^{(l-1)} \right\rangle \qquad (A.4)$$

where

$$\xi_k^{(i)} = \sum_j w_{jk}^{(i)} y_j^{(i-1)} + b_k^{(i)} \qquad (A.5)$$

$$\boldsymbol{\epsilon}(t_n) = \mathbf{Z}(t_n) - \mathcal{N}(\mathbf{X}(t_n); \mu) \qquad (A.6)$$

and the prime represents differentiation. Thus, the gradient of the cost function with respect to the weights of the output layer can be evaluated exactly at every step of the minimisation algorithm. The same is true for the gradient of $J$ with respect to the biases of the output layer. Furthermore, it can be shown that

$$\frac{\partial J}{\partial w_{jk}^{(l-1)}} = -2 \left\langle \left[ \sum_m \epsilon_m \, \sigma^{(l)\prime}(\xi_m^{(l)}) \, w_{km}^{(l)} \right] \sigma^{(l-1)\prime}(\xi_k^{(l-1)}) \, y_j^{(l-2)} \right\rangle \qquad (A.7)$$

and

$$\frac{\partial J}{\partial w_{jk}^{(l-2)}} = -2 \left\langle \left[ \sum_p \left[ \sum_m \epsilon_m \, \sigma^{(l)\prime}(\xi_m^{(l)}) \, w_{pm}^{(l)} \right] \sigma^{(l-1)\prime}(\xi_p^{(l-1)}) \, w_{kp}^{(l-1)} \right] \sigma^{(l-2)\prime}(\xi_k^{(l-2)}) \, y_j^{(l-3)} \right\rangle \qquad (A.8)$$

Note that in the expression for the derivatives of $J$ with respect to the weights at the $(i-1)$th layer, quantities used in the calculation of the derivative with respect to the weights of the $i$th layer appear (in the square brackets). An efficient algorithm for the evaluation of the gradient of $J$ then is to calculate the quantities:

$$\delta_k^{(l)} = \epsilon_k \, \sigma^{(l)\prime}(\xi_k^{(l)}) \qquad (A.9)$$

$$\delta_k^{(l-1)} = \left[ \sum_m \delta_m^{(l)} w_{km}^{(l)} \right] \sigma^{(l-1)\prime}(\xi_k^{(l-1)}) \qquad (A.10)$$

$$\delta_k^{(l-2)} = \left[ \sum_m \delta_m^{(l-1)} w_{km}^{(l-1)} \right] \sigma^{(l-2)\prime}(\xi_k^{(l-2)}) \qquad (A.11)$$

and so forth, up to $\delta_k^{(2)}$. The gradient of $J$ with respect to the weights is simply then given by

$$\frac{\partial J}{\partial w_{jk}^{(i)}} = -2 \left\langle \delta_k^{(i)} \, y_j^{(i-1)} \right\rangle \qquad (A.12)$$

Similar equations hold for the gradient of $J$ with respect to the biases. This algorithm for evaluating the gradient of the cost function with respect to the neural network parameters is referred to as back-propagation. Note that back-propagation allows the exact evaluation of the gradient at each step of the conjugate gradient algorithm.

Appendix B

Principal Curves and Surfaces

First consider Principal Curves, which are the 1D version of PCS. Following Hastie and Stuetzle (1989), let $\mathbf{X}$ be a random vector in $\mathbb{R}^M$, the distribution of which, denoted by $h$, has finite second moments. As usual, assume $E(\mathbf{X}) = 0$, without loss of generality.
Let the map $f : \Re \to \Re^M$ be $C^\infty$, unit speed (that is, $\|f'\| = 1$), and non-self-intersecting (that is, $\lambda_1 \neq \lambda_2 \Rightarrow f(\lambda_1) \neq f(\lambda_2)$). The projection function $s_f : \Re^M \to \Re$ is defined such that

$$s_f(x) = \sup_\lambda \left\{ \lambda : \|x - f(\lambda)\| = \inf_\mu \|x - f(\mu)\| \right\}. \tag{B.1}$$

That is, of those $\lambda$ such that $f(\lambda)$ are the points closest to $x$, $s_f(x)$ is the largest. Hastie and Stuetzle defined $f$ to be self-consistent if the expectation value of all the points projecting onto a certain point on the curve $f$ is that point itself, i.e., if

$$E_h(X \mid s_f(X) = \lambda) = f(\lambda), \tag{B.2}$$

where $E_h(\cdot)$ denotes expectation over the distribution $h$. If $f$ is self-consistent, then it is a principal curve. Hastie and Stuetzle proved that if $f$ is constrained to be linear, and if it is self-consistent, then it is a principal component. Furthermore, in the space of curves through the data, a principal curve is a critical point of the distance function, in the following sense. Let $f$ be a principal curve and $g$ be an arbitrary smooth function from $\Re$ to $\Re^M$, and define $f_t = f + t g$. Defining the distance function

$$d(x, f_t) = \|x - f_t(s_{f_t}(x))\| \tag{B.3}$$

and $D^2(h, f_t) = E_h \, d^2(X, f_t)$, then

$$\left. \frac{d D^2(h, f_t)}{dt} \right|_{t=0} = 0. \tag{B.4}$$

Equation (B.4) is a formal expression of the idea that the principal curve passes through the "middle" of the data, and is the clearest point of connection between PCS and NLPCA.
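As an illustration of definitions (B.1) and (B.2), the following Python sketch evaluates a discrete analogue of the projection index for a curve sampled at a finite set of parameter values, and checks self-consistency on a toy data set. The function names, the sampled representation of the curve and the toy data are assumptions made for this example only; this is not the iterative procedure of Hastie and Stuetzle (1989), which estimates conditional means with smoothers rather than by exact grouping.

```python
import math


def projection_index(x, lambdas, curve_points):
    """Discrete analogue of (B.1) for a curve sampled at the values in `lambdas`.

    Returns the largest sampled lambda whose curve point f(lambda) is closest to x.
    """
    sq_dists = [sum((xi - fi) ** 2 for xi, fi in zip(x, f)) for f in curve_points]
    closest = min(sq_dists)
    # Several lambdas may attain the minimum distance; (B.1) selects the largest.
    return max(lam for lam, d in zip(lambdas, sq_dists)
               if math.isclose(d, closest, rel_tol=0.0, abs_tol=1e-12))


def conditional_means(data, lambdas, curve_points):
    """Sample analogue of E(X | s_f(X) = lambda): mean of the points projecting to each lambda."""
    groups = {}
    for x in data:
        groups.setdefault(projection_index(x, lambdas, curve_points), []).append(x)
    return {lam: tuple(sum(coord) / len(pts) for coord in zip(*pts))
            for lam, pts in groups.items()}


# Hypothetical toy example: the straight line f(lambda) = (lambda, 0), sampled at
# three parameter values, with data scattered symmetrically about it.
lambdas = [-1.0, 0.0, 1.0]
curve_points = [(lam, 0.0) for lam in lambdas]
data = [(-1.0, 0.5), (-1.0, -0.5), (0.0, 0.3), (0.0, -0.3), (1.0, 0.2), (1.0, -0.2)]

means = conditional_means(data, lambdas, curve_points)
# The sampled curve is self-consistent in the sense of (B.2) if each conditional
# mean coincides with the corresponding curve point.
self_consistent = all(
    math.dist(means[lam], f) < 1e-12 for lam, f in zip(lambdas, curve_points)
)
print(self_consistent)
```

For this symmetric toy data set the check reports that the sampled line is self-consistent, in keeping with the result that a linear self-consistent curve is a principal component. Taking the maximum over tied parameter values mirrors the supremum in (B.1).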
A very useful fact pointed out by LeBlanc and Tibshirani (1994) is that principal curves partition variance such that

$$\sum_{i=1}^{M} \mathrm{var}(X_i) = \sum_{i=1}^{M} \mathrm{var}\left(f_i(s_f(X))\right) + \sum_{i=1}^{M} \mathrm{var}\left(X_i - f_i(s_f(X))\right) \tag{B.5}$$

It is therefore sensible to describe the principal curve $f$ as explaining a certain fraction of the variance of the random vector $X$.

The construction of principal curves presented above presupposes knowledge of the distribution $h$ of the random vector $X$; this is not usually known for real data sets. Hastie and Stuetzle present an iterative algorithm, involving the use of locally-weighted running-lines smoothers, to determine the principal curve of a data set.

Hastie and Stuetzle denoted the generalisation of principal curves to two dimensions as principal surfaces. As with principal curves, given a two-dimensional surface $f$ in $\Re^M$, a projection index $s_f : \Re^M \to \Re^2$ is defined such that $s_f(x)$ is the parameter value of the point on $f$ closest to $x$; $f$ is a principal surface if

$$E(X \mid s_f(X) = \lambda) = f(\lambda) \tag{B.6}$$

Hastie and Stuetzle did not discuss principal surfaces in much detail; they did mention that preliminary numerical investigations indicated that principal surfaces share many properties with principal curves. Principal surfaces can be further generalised to surfaces of dimensionality higher than two; LeBlanc and Tibshirani (1994) constructed a piecewise linear generalisation they denoted adaptive principal surfaces.

Hastie and Stuetzle provided rigorous proofs of PCS results for only the 1D case, although LeBlanc and Tibshirani (1994) and Malthouse (1998) consider higher-dimensional generalisations.

A hybrid approach to NLPCA using both Kramer's autoassociative neural network and PCS has been proposed by Dong and McAvoy (1996). Their method involved a preliminary pre-processing of the data by PCS, followed by an NLPC analysis of the processed data.
The logic behind this approach was that PCS possesses a better theoretical grounding than does Kramer's NLPCA, but does not produce a simple model of the data in that when presented with a new data point, there is no simple algorithm to determine its PCS approximation. Kramer's NLPCA, on the other hand, does produce such a model of the data. Dong and McAvoy recognised that by combining the two methods, the benefits of both can be realised. However, this approach is more cumbersome than Kramer's NLPCA alone, and was thus not implemented in this work.

Appendix C

Symmetric and Anti-symmetric Components of Composites

Consider a spatial field $Y(t_n)$, $n = 1, \ldots, N$, which is composited using a time series $X(t_n)$ as follows. Two subsets of time, $\{t_n^{(+)}\}$ and $\{t_n^{(-)}\}$, are defined by

$$\{t_n^{(+)}\} = \{t_n : X(t_n) > c\} \tag{C.1}$$

$$\{t_n^{(-)}\} = \{t_n : X(t_n) < -c\} \tag{C.2}$$

where $c$ is some threshold: in our case, it is one standard deviation of $X(t_n)$. The positive and negative composites of $Y(t_n)$, $Y^{(+)}$ and $Y^{(-)}$, are simply defined as the respective averages over $\{t_n^{(+)}\}$ and $\{t_n^{(-)}\}$:

$$Y^{(+)} = \langle Y \rangle_+ \tag{C.3}$$

$$Y^{(-)} = \langle Y \rangle_- \tag{C.4}$$

Maps of $Y^{(+)}$ and $Y^{(-)}$, where $Y(t_n)$ is SSTA and $X(t_n)$ is the NDJ-averaged Niño 3.4 index, are shown in Figures 4.6(a) and (b), respectively. In general, the spatial patterns of $Y^{(+)}$ and $Y^{(-)}$ differ by more than a sign.

It is desired to determine the symmetric and anti-symmetric (under a change of sign in $X(t_n)$) components of $Y^{(+)}$ and $Y^{(-)}$. To address this question, assume the minimal nonlinear model for the dependence of $Y(t_n)$ on $X(t_n)$:

$$Y(t_n) = a^{(0)} + a^{(1)} X(t_n) + a^{(2)} X(t_n)^2 + \epsilon_n \tag{C.5}$$

where $\epsilon_n$ is a vector noise process, assumed to satisfy

$$\langle \epsilon \rangle = \langle \epsilon \rangle_+ = \langle \epsilon \rangle_- = 0 \tag{C.6}$$

The validity of this approximation depends both on the validity of the model (C.5) and on the length of the records, $\{t_n\}$, $\{t_n^{(+)}\}$, and $\{t_n^{(-)}\}$. It can be assumed without loss of generality that both $Y(t_n)$ and $X(t_n)$ are zero-centred in time. This implies that

$$0 = a^{(0)} + a^{(2)} \langle X^2 \rangle \tag{C.7}$$

and so the model can be rewritten as

$$Y(t_n) = a^{(1)} X(t_n) + a^{(2)} \left( X(t_n)^2 - \langle X^2 \rangle \right) + \epsilon_n \tag{C.8}$$

The vector $a^{(1)}$ is the field pattern anti-symmetric under a change of sign in $X$, while $a^{(2)}$ is the field pattern symmetric under such a change of sign. They will be referred to, respectively, as the anti-symmetric and symmetric components of the composite. Clearly, by the definition of the composite maps,

$$Y^{(+)} = a^{(1)} \langle X \rangle_+ + a^{(2)} \left( \langle X^2 \rangle_+ - \langle X^2 \rangle \right) \tag{C.9}$$

$$Y^{(-)} = a^{(1)} \langle X \rangle_- + a^{(2)} \left( \langle X^2 \rangle_- - \langle X^2 \rangle \right) \tag{C.10}$$

This is a linear equation which can easily be solved to yield:

$$a^{(1)} = D^{-1} \left[ \left( \langle X^2 \rangle_- - \langle X^2 \rangle \right) Y^{(+)} - \left( \langle X^2 \rangle_+ - \langle X^2 \rangle \right) Y^{(-)} \right] \tag{C.11}$$

$$a^{(2)} = D^{-1} \left[ - \langle X \rangle_- Y^{(+)} + \langle X \rangle_+ Y^{(-)} \right] \tag{C.12}$$

where

$$D = \langle X \rangle_+ \left( \langle X^2 \rangle_- - \langle X^2 \rangle \right) - \langle X \rangle_- \left( \langle X^2 \rangle_+ - \langle X^2 \rangle \right) \tag{C.13}$$

Figure 4.6(c) displays $a^{(2)}$ for tropical Pacific SSTA composited according to the Niño 3.4 index. A map of $a^{(1)}$ (not shown) looks very much like SSTA EOF mode 1 (Figure 4.6(a)); the spatial correlation between these is 0.975.

Hoerling et al. (1997) considered the linear combinations $Y^{(+)} - Y^{(-)}$ and $Y^{(+)} + Y^{(-)}$ and denoted them the linear and nonlinear responses of $Y$ to $X$, respectively.
The above analysis shows this identification is appropriate only in the special case that $\langle X \rangle_- = -\langle X \rangle_+$ and $\langle X^2 \rangle_- = \langle X^2 \rangle_+$. This is certainly not true in general, although for the case they considered, in which $X(t_n)$ was an SST index similar to Niño 3.4, it is a fairly good approximation.

In principle, one could use the technique described above to fit the more general model

$$Y(t_n) = \sum_{k=0}^{K} a^{(k)} X(t_n)^k + \epsilon_n \tag{C.14}$$

by stratifying the data into K + 1 subsets. Presumably, however, as K increases, so does the sampling variability, and the approximations (C.6) become correspondingly less valid.

Bibliography

Baldwin, M. P. and Dunkerton, T. J. (1999). Propagation of the Arctic Oscillation from the stratosphere to the troposphere. J. Geophys. Res., in review.

Barnston, A. G. (1994). Linear statistical short-term climate predictive skill in the northern hemisphere. J. Climate, 7:1513-1564.

Barnston, A. G., Glantz, M. H., and He, Y. X. (1999). Predictive skill of statistical and dynamical climate models in SST forecasts during the 1997-98 El Niño episode and the 1998 La Niña onset. Bull. Amer. Met. Soc., 80:217-243.

Barnston, A. G. and Livezey, R. E. (1987). Classification, seasonality, and persistence of low-frequency atmospheric circulation patterns. Mon. Wea. Rev., 115:1083-1126.

Barnston, A. G. and Ropelewski, C. F. (1992). Prediction of ENSO episodes using canonical correlation analysis. J. Climate, 5:1316-1345.

Barnston, A. G., van den Dool, H. M., Zebiak, S. E., Barnett, T. P., Ji, M., Rodenhuis, D. R., Cane, M. A., Leetmaa, A., Graham, N. E., Ropelewski, C. F., Kousky, V. E., O'Lenic, E. A., and Livezey, R. E. (1994). Long-lead seasonal forecasts - where do we stand? Bull. Amer. Met. Soc., 75:2097-2114.

Battisti, D. S. and Hirst, A. C. (1989). Interannual variability in a tropical atmosphere-ocean model: Influence of the basic state, ocean geometry and nonlinearity. J. Atmos. Sci., 46:1687-1712.

Battisti, D. S. and Sarachik, E. S. (1995). Understanding and predicting ENSO. Reviews of Geophysics, Supplement:1367-1376. U.S. National Report to International Union of Geodesy and Geophysics 1991-1994.

Bell, G. D. and Halpert, M. S. (1995). Interseasonal and Interannual Variability: 1986 to 1993. NOAA Atlas No. 12. U.S. Department of Commerce.

Berliner, L. M. (1992). Statistics, probability, and chaos. Statistical Science, 7:69-122.

Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Clarendon Press, Oxford.

Bjerknes, J. (1969). Atmospheric teleconnections from the equatorial Pacific. Mon. Wea. Rev., 97:163-172.

Bretherton, C. S., Smith, C., and Wallace, J. M. (1992). An intercomparison of methods for finding coupled patterns in climate data. J. Climate, 5:541-560.

Broomhead, D. and King, G. P. (1986). Extracting qualitative dynamics from experimental data. Physica, 20D:217-236.

Buell, C. E. (1975). The topography of the empirical orthogonal functions. In Fourth Conf. on Probability and Statistics in Atmospheric Sciences, Tallahassee, FL, pages 188-193. Amer. Meteor. Soc.

Buell, C. E. (1979). On the physical interpretation of empirical orthogonal functions. In Sixth Conf. on Probability and Statistics in Atmospheric Sciences, Banff, AB, Canada, pages 112-117. Amer. Meteor. Soc.

Burgers, G. and Stephenson, D. B. (1999). The "normality" of El Niño. Geophys. Res. Lett., 26:1027-1030.

Corti, S., Giannini, A., Tibaldi, S., and Molteni, F. (1997). Patterns of low-frequency variability in a three-level quasi-geostrophic model. Climate Dynamics, 13:883-904.

Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Math. Contr. Signals Syst., 2:303-314.

Del Frate, F. and Schiavon, G. (1999). Nonlinear principal component analysis for the radiometric inversion of atmospheric profiles by using neural networks. IEEE Trans. Geosci. Rem. Sensing, 37:2335-2342.

DeMers, D. and Cottrell, G. (1993). Nonlinear dimensionality reduction. Neural Inform. Processing Syst., 5:580-587.

Deser, C. (1999). A note on the annularity of the "Arctic Oscillation". Geophys. Res. Lett., in review.

Dong, D. and McAvoy, T. J. (1996). Nonlinear principal component analysis - based on principal curves and neural networks. Computers Chem. Engng., 20:65-78.

Farrell, B. F. and Ioannou, P. I. (1996). Generalised stability theory. Part I: autonomous operators. J. Atmos. Sci., 53:2025-2040.

Feldstein, S. and Lee, S. (1996). Mechanisms of zonal index variability in an aquaplanet GCM. J. Atmos. Sci., 53:3541-3555.

Finnoff, W., Hergert, F., and Zimmermann, H. G. (1993). Improving model selection by nonconvergent methods. Neural Networks, 6:771-783.

Flato, G. et al. (1999). The Canadian Centre for Climate Modelling and Analysis global coupled model and its climate. Climate Dynamics, in review.

Fotheringhame, D. and Baddeley, R. (1997). Nonlinear principal components analysis of neuronal spike train data. Biological Cybernetics, 77:282-288.

Fyfe, J. C., Boer, G. J., and Flato, G. M. (1999). The Arctic and Antarctic Oscillations and their projected changes under global warming. Geophys. Res. Lett., 26:1601-1604.

Ghil, M. and Childress, S. (1987). Topics in Geophysical Fluid Dynamics: Atmospheric Dynamics, Dynamo Theory, and Climate Dynamics. Springer-Verlag.

Gong, D. and Wang, S. (1999). Definition of Antarctic oscillation index. Geophys. Res. Lett., 26:459-462.

Hastie, T. and Stuetzle, W. (1989). Principal curves. J. Amer. Statist. Assoc., 84:502-516.

Hastie, T. and Tibshirani, R. J. (1990). Generalised Additive Models. Chapman and Hall, London.

Hoerling, M. P., Kumar, A., and Zhong, M. (1997). El Niño, La Niña, and the nonlinearity of their teleconnections. J. Climate, 10:1769-1786.

Holzer, M. (1996). Asymmetric geopotential height fluctuations from symmetric winds. J. Atmos. Sci., 53:1361-1379.

Hsieh, W. W. and Tang, B. (1998). Applying neural network models to prediction and data analysis in meteorology and oceanography. Bull. Amer. Met. Soc., 79:1855-1870.

Hsu, H.-H. and Lin, S.-H. (1992). Global teleconnections in the 250-mb streamfunction field during the northern hemisphere winter. Mon. Wea. Rev., 120:1169-1190.

Huang, H.-P. and Sardeshmukh, P. D. (1999). Another look at the annual and semiannual cycles of atmospheric angular momentum. J. Climate, in review.

Hurrell, J. W. (1995). Decadal trends in the North Atlantic Oscillation: Regional temperatures and precipitation. Science, 269:676-679.

Jin, F.-F., Neelin, J. D., and Ghil, M. (1996). El Niño/Southern Oscillation and the annual cycle: subharmonic frequency-locking and aperiodicity. Physica D, 98:442-465.

Kirby, M. J. and Miranda, R. (1994). Nonlinear reduction of high-dimensional dynamical systems via neural networks. Phys. Rev. Lett., 72:1822-1825.

Kirby, M. J. and Miranda, R. (1996). Circular nodes in neural networks. Neural Comp., 8:390-402.

Kirby, M. J. and Miranda, R. (1999). Empirical dynamical system reduction I: Global nonlinear transformations. In Semi-Analytic Methods for the Navier-Stokes Equations, volume 20 of CRM Proc. Lecture Notes, pages 41-63. Amer. Math. Soc.

Kramer, M. A. (1991). Nonlinear principal component analysis using autoassociative neural networks. AIChE J., 37:233-243.

Kushnir, Y. and Wallace, J. M. (1989). Low-frequency variability in the northern hemisphere winter: geographical distribution, structure and time-scale dependence. J. Atmos. Sci., 46:3122-3142.

LeBlanc, M. and Tibshirani, R. (1994). Adaptive principal surfaces. J. Amer. Statist. Assoc., 89:53-64.

Lorenz, E. N. (1956). Empirical orthogonal functions and statistical weather prediction. Technical report, Department of Meteorology, MIT. Science Report 1.

Lorenz, E. N. (1963). Deterministic nonperiodic flow. J. Atmos. Sci., 20:130-141.

Lucke, M. (1976). Statistical dynamics of the Lorenz model. J. Stat. Phys., 15:455-475.

Malthouse, E. C. (1998). Limitations of nonlinear PCA as performed with generic neural networks. IEEE Trans. Neural Nets., 9:165-173.

McFarlane, N. A., Boer, G. J., Blanchet, J.-P., and Lazare, M. (1992). The Canadian Climate Centre second-generation general circulation model and its equilibrium climate. J. Climate, 5:1013-1044.

Miller, A. J., White, W. B., and Cayan, D. R. (1997). North Pacific thermocline variations on ENSO timescales. J. Phys. Ocean., 27:2023-2039.

Mo, K. C. and Ghil, M. (1987). Statistics and dynamics of persistent anomalies. J. Atmos. Sci., pages 877-901.

Monahan, A. H. (1999). Nonlinear principal component analysis by neural networks: Theory and application to the Lorenz system. J. Climate, in press.

Nakamura, H. and Wallace, J. M. (1991). Skewness of low-frequency fluctuations in the tropospheric circulation during the northern hemisphere winter. J. Atmos. Sci., 48:1441-1448.

North, G. R. (1984). Empirical orthogonal functions and normal modes. J. Atmos. Sci., 41:879-887.

Oja, E. (1997). The nonlinear PCA learning rule in independent component analysis. Neurocomputing, 17:25-45.

Oja, E. and Karhunen, J. (1993). Nonlinear PCA: Algorithms and applications. Technical Report A18, Helsinki University of Technology.

Ott, E. (1993). Chaos in Dynamical Systems. Cambridge University Press.

Pacanowski, R. C., Dixon, K., and Rosati, A. (1993). The GFDL modular ocean model users guide. Technical Report 2, Geophysical Fluid Dynamics Laboratory, Princeton, USA.

Palmer, T. N. (1999). A nonlinear dynamical perspective on climate prediction. J. Climate, 12:575-591.

Penland, C. (1996). A stochastic model of IndoPacific sea surface temperature anomalies. Physica D, 98:534-558.

Penland, C., Flügel, M., and Chang, P. (1999). On the identification of dynamical regimes in an intermediate coupled ocean-atmospheric model. J. Climate, in press.

Penland, C. and Sardeshmukh, P. D. (1995). The optimal growth of sea surface temperature anomalies. J. Climate, 8:1999-2024.

Perlwitz, J. and Graf, H.-F. (1995). The statistical connection between tropospheric and stratospheric circulation of the northern hemisphere in winter. J. Climate, 8:2281-2295.

Philander, S. G. (1990). El Niño, La Niña, and the Southern Oscillation. Academic Press, San Diego.

Preisendorfer, R. W. (1988). Principal Component Analysis in Meteorology and Oceanography. Elsevier, Amsterdam.

Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. (1992). Numerical Recipes in C. Cambridge University Press, Cambridge.

Richman, M. B. (1986). Rotation of principal components. Int. J. Climatology, 6:293-335.

Rico-Martinez, R., Anderson, J., and Kevrekidis, I. (1996). Self-consistency in neural network-based NLPC analysis with applications to time-series processing. Computers Chem. Engng., 20, Suppl.:S1089-S1094.

Sanger, T. (1989). Optimal unsupervised learning in a single-layer linear feedforward neural network. Neural Networks, 2:459-473.

Sengupta, S. K. and Boyle, J. S. (1995). Nonlinear principal component analysis of climate data. Technical Report 29, PCMDI.

Shindell, D. T., Miller, R. L., Schmidt, G. A., and Pandolfo, L. (1999). Simulation of recent northern winter climate trends by greenhouse-gas forcing. Nature, 399:452-455.

Smith, T. M., Reynolds, R. W., Livezey, R. E., and Stokes, D. C. (1996). Reconstruction of historical sea surface temperatures using empirical orthogonal functions. J. Climate, 9:1403-1420.

Stamkopoulos, T., Diamantaras, K., Maglaveras, N., and Strintzis, M. (1998). ECG analysis using nonlinear PCA neural networks for ischemia detection. IEEE Trans. Sig. Proc., 46:3058-3067.

Suarez, M. J. and Schopf, P. S. (1988). A delayed action oscillator for ENSO. J. Atmos. Sci., 45:3283-3287.

Takane, Y. (1998). Nonlinear multivariate analysis by neural network models. In Studies in Classification, Data Analysis, and Knowledge Organisation: Data Science, Classification, and Related Methods. Springer.

Tangang, F. T., Tang, B., Monahan, A. H., and Hsieh, W. W. (1998). Forecasting ENSO events: A neural network-extended EOF approach. J. Climate, 11:29-41.

Thompson, D. W. and Wallace, J. M. (1998). The Arctic Oscillation signature in the wintertime geopotential height and temperature fields. Geophys. Res. Lett., 25:1297-1300.

Thompson, D. W. and Wallace, J. M. (1999). Annular modes in the extratropical circulation part I: Month-to-month variability. J. Climate, in review.

Thompson, D. W., Wallace, J. M., and Hegerl, G. C. (1999). Annular modes in the extratropical circulation part II: Trends. J. Climate, in review.

Trenberth, K. E., Branstator, G. W., Karoly, D., Kumar, A., Lau, N.-C., and Ropelewski, C. (1998). Progress during TOGA in understanding and modeling global teleconnections associated with tropical sea surface temperatures. J. Geophys. Res., 103:14291-14324.

Trenberth, K. E. and Paolino, D. A. (1980). The northern hemisphere sea level pressure data set: trends, errors, and discontinuities. Mon. Wea. Rev., 108:855-872.

Ulbrich, U. and Christoph, M. (1999). A shift of the NAO and increasing storm track activity over Europe due to anthropogenic greenhouse gas forcing. Climate Dynamics, 15:551-559.

van Loon, H. and Rogers, J. C. (1978). The seesaw in winter temperatures between Greenland and northern Europe, part I: General description. Mon. Wea. Rev., 106:296-310.

von Storch, H. and Zwiers, F. W. (1999). Statistical Analysis in Climate Research. Cambridge University Press, Cambridge.

Wallace, J. M. and Gutzler, D. S. (1981). Teleconnections in the geopotential height field during the northern hemisphere winter. Mon. Wea. Rev., 109:784-812.

Wang, B. (1995). Interdecadal changes in El Niño onset in the last four decades. J. Climate, 8:267-285.

Whitaker, J. S. and Sardeshmukh, P. D. (1998). A linear theory of extratropical synoptic eddy statistics. J. Atmos. Sci., 55:237-258.

Wilks, D. S. (1995). Statistical Methods in the Atmospheric Sciences. Academic Press, San Diego.

Woodruff, S., Slutz, R., Jenne, R., and Steurer, P. (1987). A comprehensive ocean-atmosphere data set. Bull. Amer. Met. Soc., 66:1239-1250.

Yuval (1999). Neural network training for prediction of climatological time series; regularized by minimization of the generalized cross validation function. Mon. Wea. Rev., in press.
