THE STATISTICAL ESTIMATION OF EXTREME WAVES

by

NEIL GRANT MACKENZIE
B.Sc.(Hons), University of Newcastle-upon-Tyne, 1973

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE

in THE FACULTY OF GRADUATE STUDIES, in the Department of CIVIL ENGINEERING

We accept this thesis as conforming to the required standard

The University of British Columbia
June, 1979
© Neil Grant MacKenzie, 1979

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Civil Engineering
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5

ABSTRACT

This thesis contains a review of existing statistical techniques for the prediction of extreme waves for coastal and offshore installation design. A description of the four most widely used probability distributions is given, together with a detailed discussion of the methods commonly used for the estimation of their parameters. Although several of these techniques have been in use for some years, it has never been satisfactorily shown which are capable of yielding the most reliable predictions.
The main purpose of this thesis is to suggest a practical method of solving this problem and achieving the best estimate.

The basic theory for the prediction of extreme values was described in detail by Gumbel (1958), who concentrated largely on the double exponential distribution which is named after him. In order to evaluate the quality of fit between this law and the data, Gumbel derived expressions which enabled one to plot confidence intervals to enclose the data. The method described in this thesis is partly an extension of Gumbel's work, and similar confidence interval methods are given for the remaining distributions, thus permitting direct comparisons to be drawn between their performances. The outcome of this is that the most reliable model of the data may be chosen, and hence the best prediction made.

The method also contains a curvature test which has been devised to facilitate computation and lead more directly to the end result. The particular form of the wave data, which is quite different from wind records, is also taken into consideration and a working definition of the sample tail is suggested.

TABLE OF CONTENTS

1. INTRODUCTION AND LITERATURE REVIEW
2. THE DISTRIBUTIONS AND THEIR PROPERTIES
   2.1 The Lognormal Distribution
   2.2 Asymptotic Distribution of the Extreme Value
   2.3 Gumbel Distribution - Type I
   2.4 Fréchet Distribution - Type II
   2.5 Weibull Distribution - Type III
3. PROCEDURE FOR COLLECTING THE DATA
   3.1 Forming the Sample
   3.2 Determination of Plotting Positions
   3.3 Definition of the Sample Tail
4. INITIAL SELECTION OF PARAMETRIC FAMILIES
   4.1 Curvature Properties
   4.2 The Curvature Test
5. METHODS OF PARAMETER ESTIMATION
   5.1 Method of Moments
       5.1.1 Gumbel Distribution - Type I
       5.1.2 Fréchet Distribution - Type II
       5.1.3 Weibull Distribution - Type III
       5.1.4 Lognormal Distribution
   5.2 Method of Least Squares
       5.2.1 Gumbel Distribution - Type I
       5.2.2 Fréchet Distribution - Type II
       5.2.3 Lognormal Distribution
       5.2.4 Weibull Distribution - Type III
   5.3 Method of Maximum Likelihood
       5.3.1 Gumbel Distribution - Type I
       5.3.2 Fréchet Distribution - Type II
       5.3.3 Weibull Distribution - Type III
       5.3.4 Lognormal Distribution
6. TESTS OF FIT BETWEEN THE DISTRIBUTION AND THE DATA
   6.1 Derivation of Confidence Intervals
   6.2 Asymptotic Distributions of the mth Statistic
   6.3 Approximate Distribution of the mth Statistic
   6.4 Limitations
   6.5 Method of Determining Critical Points
   6.6 Procedure for Plotting Confidence Intervals
   6.7 Examples of Confidence Intervals
7. METHODS OF PREDICTION
   7.1 Expected Significant Height
   7.2 Extreme Wave Period
   7.3 Maximum Wave Height
   7.4 Confidence Intervals for Prediction
   7.5 Encounter Probability and Waiting Time
8. CONCLUSIONS - A RECOMMENDED PROCEDURE
9. WORKED EXAMPLE
10. FUTURE WORK
BIBLIOGRAPHY
APPENDIX
   A1 Properties of the Type I Distribution
   A2 Properties of the Type II Distribution
   A3 Properties of the Type III_U Distribution
   A4 Properties of the Lognormal Distribution
   A5 Curvature of the Type II Distribution
   A6 Curvature of the Type III_U Distribution
   A7 Curvature of the Type III_L Distribution

LIST OF FIGURES

1. Typical Examples of Data on Lognormal Paper
2. Typical Density Curves for the Types I, II and Lognormal Distributions
3. Typical Density Curves for the Type III Distributions
4. Typical Wave Elevation Recording
5. A Bivariate Scatter Diagram
6. The Definition of Sample Tail
7. Comparison of Tail Curvatures on Gumbel Paper
8. Skewness of the Fréchet Distribution
9. Shape Factor for the Weibull Distribution
10. Method of Least Squares
11. Convergence of Least Squares Procedure for Type III Distributions
12. Comparison Between the Approximate and Exact Confidence Bands
13. Confidence Intervals on a Type I Plot
14. Confidence Intervals on a Type II Plot
15. Confidence Intervals on a Type III Plot
16. Confidence Intervals on a Lognormal Plot
17. Determination of Confidence Interval from Typical Distribution Density of mth Observation
18. The Curvature Test
19. Type III_L Plot with Confidence Intervals
20. Type III_U Plot with Confidence Intervals
21. Curvature of the Lognormal Distribution on Type I Paper
22. Typical Prediction Intervals
23. The Relationship Between Return Period and Encounter Probability
24. The Relationship Between P(h) and F_m(h) for m = 1
25. The Relationship Between P(h) and F_m(h) for m = 2
26. The Relationship Between P(h) and F_m(h) for m = 3
27. The Relationship Between P(h) and F_m(h) for m = 4
28. The Relationship Between P(h) and F_m(h) for m = 5

LIST OF TABLES

I Probability Distributions and Their Properties
II Curvature Properties of the Distributions
III Shape Factors for the Fréchet Distribution
IV Shape Factors for the Weibull Distribution
V Estimation of Confidence Intervals for Type III Plot

ACKNOWLEDGEMENT

The author profits by this occasion to thank his supervisor, Dr. M. de St. Q. Isaacson, for the constant help and encouragement given. The typing and ingenious presentation of the text was the work of Ms. S. McLintock.
CHAPTER 1

INTRODUCTION AND LITERATURE REVIEW

The technique of plotting collected wave data on probability paper in order to predict the probable magnitude of extreme values has a firm place in engineering design. Although its use is widespread, a general method of selecting the most suitable probability model has not previously been suggested. Considerable attention has been given in design to accurately predicting the effect of a selected design wave on a structure. However, the process for selecting a design wave still remains comparatively unreliable, and represents a weak link within the design process.

In engineering practice and current literature only four distributions are commonly used for this model. These are the LOGNORMAL, GUMBEL (Type I), FRÉCHET (Type II), and WEIBULL (Type III) distributions. Each of these four distributions is actually a family of probability functions whose properties vary subtly as their parameters change in value. The first three are each defined by two parameters, conveniently called "shape" and "scale" parameters. The Type III distribution requires a third parameter for its definition. This is referred to as a "location" parameter and enables the Type III distribution to be used in two alternative forms. These are later designated the Type III_L and Type III_U distributions.

The success of the method described here is dependent upon a close empirical fit being achieved between the model and the data. Depending on the data used, there is often a tendency for one distribution to be more suitable than the others, and hence able to give more reliable predictions. The systematic search for this distribution has received very little attention in the literature, and is the basic problem considered in this thesis.

Fisher and Tippett [1926] showed that there are only three asymptotic distributions, and that these describe the behaviour of the maximum value from any parent distribution as the sample size approaches infinity.

Gumbel [1958] developed the Type I distribution to a considerable degree as a tool for flood prediction. As a result of his work, this distribution has, until recently, been the most widely used throughout the various applications of extremal statistics. Gumbel also popularised the method of moments, and to some extent the method of least squares, for estimating the parameters of the Type I distribution. The former method was generally adopted since it could be carried out by hand, whereas the latter required a computer program.

Thom [1954] developed the Type II distribution for wind analysis and suggested using the method of maximum likelihood for estimating the parameters. Both the Type I and Type II distributions have been adopted by meteorologists for the prediction of extreme wind speeds in the United States and elsewhere.

Jasper [1956] suggested the use of the lognormal distribution for describing the occurrence of significant wave heights. This has been commonly used by many authors including Draper [1963], though in recent years its popularity may have diminished slightly.

Following the successes with wind speed prediction, Thom [1971] went on to advocate the application of the Type II distribution to wave heights. He argued that this distribution was superior to the Type I since it had a lower bound of zero height. He plotted data taken from several Ocean Station Vessels based in the Pacific and Atlantic Oceans, achieving a good fit in some cases. The data from these vessels were mainly based on visual estimates, and the vessels themselves were stationed in the deep waters of mid-ocean.
However, it is not unreasonable to expect the distributions of waves in shallower, more restricted sites to be rather different from those described.

There are two alternative forms of the Type III distribution, which are denoted by the Type III_L and Type III_U distributions respectively. The Type III_L (Weibull lower-bound) distribution of minima (see TABLE I) was used in combination with the method of moments by Gumbel [1954] for estimating the worst drought occurring in a river. This was a natural choice since it enabled the engineer to place a lower limit on the least flow ever possible in the river.

Bretschneider [1965] suggested the use of the Type III_L distribution for the short-term significant wave heights associated with a given storm. Hogben [1967] used data gathered in the mid-Atlantic to make a comparison between the Type III_L distribution and the lognormal distribution. He concluded that, although the latter gave a better fit for lesser wave heights, the Type III_L was superior for larger heights.

Battjes [1970], using instrument-recorded data from the mid-Atlantic and the Celtic Sea, found a strong departure in the data from the lognormal distribution for extreme wave heights. In this instance the lognormal distribution gave an over-estimation of wave height for a specified probability of exceedance. He found that the Type III_L distribution gave a superior fit when a small positive value was used for the location parameter $\varepsilon$. Generally $\varepsilon$ was less than one metre and represented the extent of background noise which was always found to be present.

St. Denis [1973] suggested that the Type III_U (Weibull upper-bound, see TABLE I) distribution should be used for the description of wave heights in situations where a physical upper limit in the height could be expected. A typical case might be that of shoaling, or of a distinctly limited fetch.
No reports of its use have been found in the published literature, although its use has also been advocated by Borgman [1975]. This may be the result of difficulties surrounding parameter estimation. These are discussed in detail in CHAPTER 5 on the estimation of parameters.

Petrauskas and Aagaard [1971] described a computer method which enabled them to select the most suitable distribution for a data sample from eight chosen possibilities. These consisted of the Type I distribution together with seven Type III_L distributions, each with a different prescribed shape parameter. The process of parameter estimation then simplified to one of determining two rather than three parameters for each of the eight cases. This was achieved by a direct least-squares approach. The resulting distribution was then plotted with "uncertainty intervals" to indicate the degree of error in prediction.

In a paper on rubble-mound breakwaters, Ouellet [1974] noted the wide variability in samples of wave data, and the need for a consistent approach to prediction from them. FIGURE 1, which is taken from his paper, shows five sets of data from different sources. It can be seen that in one case (Moffat Beach, Australia) the researcher did not fit a single straight line but used three straight sections. This implies that the sample was a mixture of data from three quite different lognormal populations. From the point of view of prediction this is quite undesirable, since only one-third of the sample could be used for long-term forecasts. Two other sets of data (Benghazi Harbour and Mangalore Harbour) develop pronounced curvature as the exceedance probability decreases. In CHAPTER 4, the role of this property is examined in detail.

In the next chapter the distributions are described in detail.
Each subsequent chapter discusses a step in the derivation of a design wave, as indicated by its title. The conclusion to this thesis describes a complete procedure, and a worked example is provided for demonstration.

CHAPTER 2

THE DISTRIBUTIONS AND THEIR PROPERTIES

The distributions described here are the most commonly used for extreme wave prediction, and an outline of their properties is given.

2.1 The Lognormal Distribution

The Lognormal Distribution is derived by transforming a variable to its logarithm before applying the normal distribution. This results in a density which only exists for a positive variable, as shown in FIGURE 2. If $Y$ is an $N(\mu, \sigma^2)$ variable, that is, it possesses a Normal distribution with mean $\mu$ and variance $\sigma^2$, then $X = \exp(Y)$ is a lognormal variable with parameters $\mu$ and $\sigma^2$. The density is given in TABLE I with $\alpha = \sigma$ and $\theta = \mu$.

The popularity of this distribution amongst coastal engineers is largely due to its connection with the normal distribution, and to the considerable flexibility rendered by the scale and shape parameters $\mu$ and $\sigma^2$ respectively. In other related fields, such as meteorology, the lognormal distribution is much less popular. It is possible that meteorologists feel justified in using only asymptotic distributions by the relative abundance of weather data.

Of the various distributions considered here, the lognormal is an exception in that it is not an asymptotic form. In other words, it does not limit its description to the tail of a parent distribution from which the body of data might be collected.

Lognormal Paper is constructed as follows:

a) The ordinate scale carries the Standard Normal Distribution critical points corresponding to the exceedance probability $Q(h)$.
A critical point is the value of the variate which defines the lower limit of the area representing the exceedance probability $Q(h)$ under the density curve. The procedure has been described by Draper [1963].

b) The abscissa scale is simply the natural logarithm of the wave height.

True lognormal data will lie on a straight line whose slope and intercept will be determined by the two parameters $\alpha$ and $\theta$.

2.2 Asymptotic Distributions of the Extreme Value

Generally the distribution of data occurring within two standard deviations of the mean value is well described by the parent distribution. However, in many cases (for example the NORMAL DISTRIBUTION), the areas within the tail, corresponding to comparatively rare events, are difficult to calculate with precision. This problem does not arise in practice since the distribution of the maximum value occurring within a sample from any parent family tends in distribution to one of the three asymptotic types as the sample size approaches infinity. The three types, namely the Gumbel, Fréchet, and Weibull distributions, all have cumulative distribution functions (TABLE I) which may be evaluated by pocket calculator instead of tables or a computer program.

In order to simplify the description of these distributions the following notation will be adopted:

$\alpha$ - shape parameter, which determines the basic shape of the density function.

$\theta$ - scale parameter, controlling the density scale or spread along the variate axis.

$\varepsilon$ - location parameter, locating the position of the density function on the variate axis. In the special case of the Type III_U and Type III_L distributions, $\varepsilon$ locates one end of the density function (see FIGURE 3).

2.3 Gumbel Distribution - Type I

The Type I distribution is the limiting form for maximum values taken from the exponential class of parent distributions, which include the Normal, Exponential and Gamma distributions.
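
As a numerical illustration of this limiting behaviour, the following sketch (in Python, and not part of the original analysis) compares the exact distribution of the largest of n values drawn from a unit exponential parent, $[1 - \exp(-x)]^n$, with the double exponential form $\exp\{-\exp[-(x - \ln n)]\}$. The standardisation by $\ln n$, the grid limits and the function names are assumptions made for illustration only, and do not follow the parameter notation of TABLE I.

```python
import math

def exact_max_cdf(x, n):
    """CDF of the largest of n independent unit-exponential values."""
    if x <= 0.0:
        return 0.0
    return (1.0 - math.exp(-x)) ** n

def gumbel_cdf(x, n):
    """Double exponential (Type I) form with location ln(n) and unit scale."""
    return math.exp(-math.exp(-(x - math.log(n))))

def max_abs_difference(n, steps=2000):
    """Largest gap between the two CDFs on a grid spanning ln(n)-5 to ln(n)+10."""
    centre = math.log(n)
    grid = [centre - 5.0 + 15.0 * i / steps for i in range(steps + 1)]
    return max(abs(exact_max_cdf(x, n) - gumbel_cdf(x, n)) for x in grid)

# The gap shrinks as the sample size grows.
for n in (10, 100, 1000):
    print(n, round(max_abs_difference(n), 5))
```

For this parent the gap falls roughly in proportion to 1/n, which is consistent with the statement that the maximum tends in distribution to the Type I form as the sample size increases.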
The probability function is defined in TABLE I, and the density function is sketched in FIGURE 2.

This asymptotic distribution has become an accepted form for the prediction of extreme winds in the U.S.A. according to Simiu [1976]. It sets no upper limit on the intensity of the wind speed which may occur.

In common with the other two types, GUMBEL PAPER has a linear ordinate scale given by

    $y = -\ln\{-\ln[1 - Q(h)]\}$    (1)

where $Q(h)$ is the exceedance probability of a given wave height. The abscissa scale is simply the wave height, which need not be standardised before plotting.

2.4 Fréchet Distribution - Type II

The probability function of the Fréchet Distribution is defined in TABLE I and typical shapes of the density function are shown in FIGURE 2. The Type II asymptote is the limiting distribution of maximum values taken from the Cauchy class of parent distributions, which are not commonly used in engineering because their means and variances do not always exist. The Cauchy class generally have densities which are functions of the reciprocal of the intensity (wave height).

A useful property of the Type II asymptote is that its density decays more slowly than those of the other two asymptotes. This property has made the Type II distribution invaluable for the prediction of hurricane intensities in the U.S.A. (Simiu 1976).

Thom [1971] suggested that the Type II distribution is particularly suited to the description of wave heights. He argued that wave heights are bounded quantities, since they cannot have negative values, and thus merit quite different treatment from temperatures and pressures, which are unbounded. The transformation from an unbounded variate to its extreme value is achieved by a translation, whereas a bounded variate is transformed by a change of scale. The Type II distribution may be considered as a Type I distribution in which the variate has been transformed to
its logarithm, thus giving it a change of scale.

Fréchet Paper has a linear ordinate scale given by

    $y = -\ln\{-\ln[1 - Q(h)]\}$

as before. The linear abscissa scale is now given by

    $x = \ln\{h\}$    (2)

where $h$ is the wave height.

2.5 Weibull Distribution - Type III

The Weibull probability functions are given in TABLE I, and typical density functions are shown in FIGURE 3. As already mentioned, two forms of the Weibull distribution are available. The upper bound distribution, Type III_U, is the distribution of maximum values taken from parent distributions with a finite tail length, such as the UNIFORM DISTRIBUTION. The lower bound distribution, Type III_L, is the distribution of minimum values from the same source.

The Type III_L distribution has been used to a certain extent as an empirical tool for wave height prediction. The second form, Type III_U, does not appear to have been widely applied to the problem considered here.

Type III_U paper has the same ordinate scale as Types I and II, which is given by

    $y = -\ln\{-\ln[1 - Q(h)]\}$

Its abscissa scale is dependent upon one of the three parameters, and so varies from one set of data to another:

    $x = -\ln\{\varepsilon - h\}$    (3)

where $\varepsilon$ is the maximum wave height ever possible and is finite.

Type III_L paper has a different ordinate scale from that used for the other asymptotic distributions, and this is generally

    $y = +\ln\{-\ln Q(h)\}$    (4)

and the abscissa scale is again parameter dependent:

    $x = +\ln\{h - \varepsilon\}$    (5)

where $\varepsilon$ is the smallest wave height possible and $\varepsilon > 0$.

CHAPTER 3

PROCEDURE FOR COLLECTING WAVE DATA

An ideal data source would consist of a continuous wave height recording over a period of several years. In practice, the collection of a continuous sample is not feasible. The accepted alternative is systematic intermittent sampling, which consists of chart-recording over a short period of several minutes in each successive interval of a few hours.
The recording period is often set at twenty minutes, for a recording interval of three hours. Generally, the sea state changes slowly enough for this to be representative, and a typical wave elevation recording is sketched in FIGURE 4.

Engineers concerned with the prediction of extreme winds generally use data which have been collected over a decade or more. These are usually available from nearby airports, which keep records of this length. It is rare for a coastal engineer to have an equally long record for wave heights at some locality. Wave heights are sensitive to physical influences such as fetch length and water depth. The engineer is often obliged to use recordings made at the location of interest in order to account for local effects, such as diffraction by coastline projections or refraction. A further difficulty arises because the engineer rarely knows the location of the project some years before the design is started, and hence is often forced to work with a short record. A final problem arises because there are usually practical difficulties in operating wave recorders accurately over long periods of time, since they usually have to be rigged with a buoy and anchor. Experience shows that the chance of a malfunction or loss of an instrument deployed in this fashion is quite high.

In cases where it becomes necessary to initiate a local wave-recording program, the record will rarely cover more than a year or two, except perhaps in long-term wave research projects. More often the time period studied will be one year. Although a shorter study period would be expected to introduce seasonal variations, there is reason to suggest that, provided the winter months are covered in detail, no serious information loss should occur (see SECTION 3.3). In this extreme case the parameter estimation should be carried out by a least-squares approach (SECTION 5.2 et seq.).
If summer months are omitted, it is often relatively simple to confirm that no storms more severe than those measured during the winter occurred.

The result of a wave recording program would be a series of representative wave heights, one for each recording interval, e.g. the significant heights, or the maximum height measured. The method of converting a continuous record into a series of statistics is not described here and has been well documented by Tucker [1963].

The basic form for the presentation of wave data is the bivariate histogram or "scatter diagram". This consists of a table of significant wave heights, divided by frequency of occurrence into intervals of wave period. The total number of occurrences is equal to the number of wave records (including calms) gathered during the study period (e.g. one year). FIGURE 5 provides an example of this, and from such a diagram one may assemble a table of height classes and their frequencies by summing over the wave periods. The resulting data are then used for plotting.

Although the methods of prediction used for extreme wind speeds are very similar to those used for wave heights, there is one basic difference in approach which results from the much shorter period of the wave record. Engineers concerned with the extreme wind speed usually have a sample containing one maximum speed for each year of the record. For the reasons just discussed, the wave prediction has often to be based on values occurring within a single year's recording. As one might expect, then, the wave prediction must lack the degree of precision of a wind prediction, and this is reflected in the width of the confidence intervals (see SECTION 7-8).

3.1 Forming the Sample

Sampling is a critical stage in any statistical analysis.
Not only does it enable the statistician to reduce the vast universe of data into something which is both meaningful and manageable, but it also largely determines the shape of the questions which may be answered.

As far as a literature review could show, very few authors have attempted any method of forming the sample other than the one used by Draper [1963] and summarised below:

a) Each short length of wave recording corresponding to a recording period is reduced to a significant height $h_s$ and a zero crossing period $T_z$. The method proposed by Tucker [1961] is used. Each pair of statistics then applies to one recording interval of several hours.

b) To facilitate handling, a scatter diagram is prepared. This is a table of $h_s$ against $T_z$, both divided into classes, and each element is appropriately marked with a number of recording intervals (see FIGURE 5).

c) Each wave height class (0 - 1.99 ft., 2 - 3.99 ft., etc.) is summed over all classes of $T_z$ to give marginal frequencies of height.

d) The probability that the significant wave height may exceed the lower limit of any class is then calculated as

    $Q(h) = \dfrac{\text{number of height values} \ge h}{1 + \text{total number of height values}}$    (6)

where $h$ is the lower limit of each height class (0 ft., 2 ft., etc.) and $Q(h)$ is $\mathrm{Prob}[H > h]$.

e) Paired values of $h$ and $Q(h)$ may then be used to plot the lower limit of each class onto a probability paper (e.g. Type III_L paper).

This approach is commonly used in statistics and was developed to fit a distribution to the body of the sample. It is particularly useful when estimating by the method of moments. However, it is rather non-specific in the way it achieves a fit and does not always give the engineer the type of fit he requires.
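
Steps (c) and (d) of this procedure can be sketched as follows. This is a minimal illustration in Python rather than the program used in the study; the scatter-diagram counts, the class limits and the function names are hypothetical, chosen only to show the arithmetic of Eqn. 6.

```python
def marginal_height_counts(scatter):
    """Step (c): sum each height class over all period classes."""
    return [sum(row) for row in scatter]

def exceedance_probabilities(lower_limits, counts):
    """Step (d): Q(h) = (number of heights >= h) / (1 + total number)."""
    total = sum(counts)
    pairs = []
    for i, h in enumerate(lower_limits):
        n_at_or_above = sum(counts[i:])  # all classes whose lower limit is >= h
        pairs.append((h, n_at_or_above / (1 + total)))
    return pairs

# Hypothetical scatter diagram: rows are height classes (lower limits in ft),
# columns are zero-crossing period classes, entries are numbers of records.
lower_limits = [0.0, 2.0, 4.0, 6.0]
scatter = [
    [5, 20, 10],
    [1, 12, 9],
    [0, 3, 4],
    [0, 0, 1],
]

counts = marginal_height_counts(scatter)
pairs = exceedance_probabilities(lower_limits, counts)
for h, q in pairs:
    print(f"h = {h:4.1f} ft   Q(h) = {q:.4f}")
```

The resulting pairs of $h$ and $Q(h)$ are the values that step (e) would plot on the chosen probability paper.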
Since the purpose of sampling wave heights is to arrive at reliable estimates of the rarely occurring wave, it seems unreasonable to concentrate on achieving a good fit at the median of the sample. In fact the quality of fit for wave heights which are exceeded almost daily is quite irrelevant to the problem considered here. Clearly, a high quality fit in the vicinity of the tail of the parent distribution is required in order to predict events whose exceedance probabilities occupy this region.

It would be most helpful if the "tail" of the sample could be defined so that a distribution could be fitted directly to this portion of the data. In practical applications of statistics it is quite common to define the finite end of a tail as being a fixed number of standard deviations from the mean. The result is an empirical rule of the form:

    Lowest height within tail $= \bar{h} + a\,S_h$

where $\bar{h}$ is the sample mean height, $S_h$ is the sample standard deviation, and $a$ is a constant.

An alternative approach, which yields the constant $a$, is based on the method of calculating plotting positions and will be given in SECTION 3.3.

3.2 Determination of Plotting Positions

In order to plot the data on probability paper one must assign a fixed probability to each value in the sample. To do this the data are ordered according to height, and the suffix $m$ is used to denote the position or RANK of each value. Thus $m = 1$ corresponds to the largest value and $m = n$ to the smallest of a sample containing $n$ wave heights. A formula which has gained wide acceptance for calculating the plotting position is:

    $Q(h_m) = 1 - P(h_m) = \dfrac{m}{n+1}$    (7)

It has been shown by Gumbel [1958] that the expected probability for the $m$-th ordered observation is given by $m/(n+1)$ and that this is independent of the distribution.
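
The ranking convention and the plotting-position rule of Eqn. 7 may be sketched as follows. This is a minimal Python illustration; the function name and the sample heights are hypothetical.

```python
def plotting_positions(heights):
    """Return (rank m, height, Q = m/(n+1)), with m = 1 for the largest height."""
    ordered = sorted(heights, reverse=True)
    n = len(ordered)
    return [(m, h, m / (n + 1)) for m, h in enumerate(ordered, start=1)]

# Hypothetical sample of nine wave heights (units are immaterial here).
sample = [3.1, 7.4, 5.0, 2.2, 6.3, 4.8, 3.9, 5.6, 2.9]
positions = plotting_positions(sample)
for m, h, q in positions:
    print(m, h, round(q, 3))
```

With nine heights the largest value is assigned an exceedance probability of 1/10 and the smallest 9/10, so that no observation is plotted at a probability of exactly 0 or 1.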
However, it has been demonstrated by Kimball [1960] and Gringorten [1963] that this formula tends to introduce a slight bias towards the distribution being estimated. Although it is possible to derive unbiased forms for the distributions considered here, such expressions would vary according to the parameters. Since the parameters still have to be estimated, this would lead either to approximate forms or to an iterative procedure. The example given by Gringorten suggests that the bias introduced by the simple formula, Eqn. 7, is small enough to be considered a second-order effect in comparison with those introduced by adopting different estimation methods or sampling procedures.

The simple rule is invariably used for plotting and, for example, has been strongly recommended by Borgman [1975]. It will be adopted throughout the present study, and the effects of alternative formulae are not examined.

It should be noted that the rank value $m$ is assigned to each individual wave height recorded but not directly to the height class limits. This may be seen in the worked example in CHAPTER 9. As a consequence of this the class limits in the example have the approximate ranks shown in TABLE III.

3.3 Definition of the Sample Tail

It has already been mentioned that for the prediction of extreme wave heights a good fit in the tail of the data is of considerable importance, i.e. the distribution should give a good fit to the worst of the extreme measurements made. It is therefore convenient to define the sample tail. In SECTION 3.1 one simple method of defining the sample tail was mentioned. An alternative approach may be based on the fact that for all three asymptotic distributions a function of probability provides the ordinate scale for plotting. The ordinate of the $m$-th wave height statistic is given by

    $y_m = -\ln\{-\ln[1 - Q(h_m)]\}$

where $Q(h_m)$ is calculated as in SECTION 3.2.
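
The ordinate defined above can be evaluated directly from the plotting positions. The sketch below (in Python; the function names and the sample size of nine are assumptions made for illustration) computes $y_m$ for each rank and shows that the largest heights, which have the smallest $Q(h_m)$, receive the largest ordinates.

```python
import math

def reduced_ordinate(q):
    """y = -ln{-ln[1 - Q]} for an exceedance probability 0 < Q < 1."""
    if not 0.0 < q < 1.0:
        raise ValueError("Q must lie strictly between 0 and 1")
    return -math.log(-math.log(1.0 - q))

def ordinates_for_sample(n):
    """Ordinates y_m for ranks m = 1..n, using Q(h_m) = m/(n+1)."""
    return [reduced_ordinate(m / (n + 1)) for m in range(1, n + 1)]

y = ordinates_for_sample(9)
print([round(v, 3) for v in y])
```

The ordinate passes through zero where $Q = 1 - 1/e$ and increases without limit as $Q$ approaches zero, which is why the rarest events are spread furthest apart on the paper.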
A plot of this function against $Q(h_m)$ describes the distortion applied to obtain the scale along the ordinate and is shown in FIGURE 6(a). The gradient of the curve is given by

$$\frac{dy}{dQ} = \{(1-Q)\ln(1-Q)\}^{-1} \tag{8}$$

and this is plotted against $Q(h_m)$ in FIGURE 6(b). As $Q(h_m)$ approaches the median value of 0.5, the gradient decreases and becomes almost constant for values less than, say, 0.1. The vertical distance between plotted points is controlled directly by this gradient, and hence a lower limit for $Q(h_m)$ of 0.1 is chosen to locate the start of the tail. The extent of the tail is then determined by the position of the sample wave height having a rank $w$, where

$$w = \frac{n+1}{10} \tag{9}$$

and $n$ is the number of wave heights in the record.

As a result of this procedure only 10% of the original sample is used for estimation. The remaining bulk of the data, which is excluded from the fitting procedure, should contain all measurements made during the summer months of lower storm activity. As a result, it is no longer necessary to rely on precise measurements during these periods of low storm activity. Thus gaps in the record for these months need not be serious, and often this may be confirmed by an inspection of local meteorological records. In the rare cases when a valuable piece of data is missed, it may be possible to estimate the approximate number of recording intervals involved and their order within the sample. This has been suggested in connection with similar applications by Borgman [1961].

CHAPTER 4

INITIAL SELECTION OF PARAMETRIC FAMILY

To carry out a detailed analysis using each of the distributions in turn would be tedious.
To base the final selection of a distribution solely on the width of the confidence intervals could be difficult and misleading, often resulting in solutions which appeared to fit well in the extreme tail but poorly elsewhere. In order to avoid these procedures it is convenient to make use of a simple property which is shared to a differing degree by all distributions. Such a property is the curvature of a distribution when plotted on Type I paper.

Plots in the literature often show noticeable curvature of the data as it approaches the tail of the distribution. In many cases this curvature is detectable to some degree, e.g. FIGURE 1. Typical examples may be found in papers by Khanna and Andru [1974], and Ovellet [1974].

4.1 Curvature Properties

When a Type II distribution is represented on any other pair of axes than those used to form a Fréchet plot, the result will be a curved line. This principle applies to all the distributions considered, and forms the basis of the curvature test suggested here.

The comparison is made by examining the curvature of each type of distribution when plotted on Gumbel paper (Type I). A typical result is shown in FIGURE 7, and the actual degree of curvature for each distribution is dependent upon the parameters used. The curvature is defined as the second derivative

$$\text{CURVATURE} = \frac{d^2y}{dx^2} \tag{10}$$

where
$y = -\ln\{-\ln[1 - Q(h)]\}$,
$x = h$.

The resulting slope and curvature relationships are summarized in TABLE II and are briefly described below.

TYPE I - remains a straight line.

TYPE II - develops strong negative curvature, and the tail decays more slowly than any of the other distributions.

TYPE III_U - develops the strongest positive curvature, which enables it to achieve a finite limiting wave height (i.e. one which has an exceedance probability of zero).

NOTE: As the parameter α approaches infinity both the Types II and III_U become straight lines, i.e. Type I.

TYPE III_L - may develop either negative or positive curvature depending on the size of α (see TABLE II). In the special case of α = 1 the line becomes straight. The relative flexibility of the tail of this distribution makes it useful for spanning the gap between Types II and III_U in the ranges where their tails become inflexible. The Type III_L distribution has a very wide range of curvature, and might often be an acceptable choice even without the curvature test.

LOGNORMAL - behaves in a similar fashion to Type II, though developing relatively mild curvatures.

The curvature relationships for Types I, II, III_U and III_L are relatively simple to derive and are given in Appendices A.5 to A.7. However, the lognormal distribution's behaviour is most easily demonstrated graphically, FIGURE 21. Although some overlap in curvature is expected, particularly between the lognormal and the Type III_L, both are retained as possibilities.

In order to make an initial choice of the distributions to be studied in detail, three groups may be used.

POSITIVE GROUP - Types III_U, III_L (α > 1).
STRAIGHT GROUP - III_L (α = 1), LOGNORMAL, I.
NEGATIVE GROUP - LOGNORMAL, III_L (α < 1), II.

The curvature test does not provide any method of selecting a distribution from within one of the groups just mentioned. However, this may be achieved by using the method of confidence intervals which is described in CHAPTER 6.

4.2 The Curvature Test

A simple procedure for selecting one of the three groups may now be used.

i) Plot the tail of the data onto Type I (Gumbel) paper with the axes shown in FIGURE 7. The plotting procedure is that described in SECTION 3.2.
ii) The presence, type and degree of curvature is then assessed by eye and leads to a choice of one of the three groups. Since the data is assembled for plotting by the method of SECTIONS 3.2 and 3.3, all points should occur within the parent distribution's tail, and hence one's decision would be based upon an 'overall' curvature for all of the plotted points.

CHAPTER 5

METHODS OF PARAMETER ESTIMATION

Each of the four distributions considered here is actually a family of different distributions which have widely different properties depending upon the parameter values. Having chosen one of the four distributions as a likely model, it remains to find the parameter values which fit that distribution most closely to the data.

In both the fields of wave and wind prediction, three methods of estimation have been adopted.

i) Method of Moments
ii) Method of Least Squares
iii) Method of Maximum Likelihood

Each of these three methods may give a different estimate of the parameters based on the same sample. It should be noted that all three methods provide a 'point estimate', i.e. a single parameter value for each sample. Strictly speaking then, the estimates are themselves random variables, though they are treated as if they are stationary. It is mentioned in passing that the method of fitting a line by eye can give comparable results to those obtained by the method of least squares.

A brief description of each method and its application to the distributions is now given. The estimated values of parameters are indicated by a hat '^', and parameter notation is as used in TABLE I.

5.1 Method of Moments

The method of moments operates by successively approximating the shape of the model distribution to that of the sample histogram. This is achieved by equating the first $k$ moments to give one equation for each of the $k$ parameters required.
The first three or four moments tend to exert the strongest influence on the shape of a distribution, and so the procedure often leads to a reasonable model. One disadvantage of this method is that it uses all the collected data and does not emphasise the role played by the distribution tail.

5.1.1 Gumbel Distribution - Type I

The Type I distribution has two parameters (see TABLE I). The derivation of the moments and their properties are given in Appendix A.1. The two equations resulting from this procedure are

$$\bar{H} = \hat{\varepsilon} + \gamma\,\hat{\theta} \tag{11}$$

and

$$\overline{H^2} - (\bar{H})^2 = \frac{\pi^2}{6}\,\hat{\theta}^2 \tag{12}$$

where

$$\bar{H} = \frac{1}{n}\sum_{m=1}^{n} h_m\,; \qquad \overline{H^2} = \frac{1}{n}\sum_{m=1}^{n} h_m^2 \tag{13}$$

and
$\gamma$ is Euler's constant (0.57722),
^ denotes an estimated parameter.

Hence estimated values of the two parameters are:

$$\hat{\varepsilon} = \bar{H} - 0.4501\sqrt{\overline{H^2} - (\bar{H})^2} \tag{14}$$

and

$$\hat{\theta} = 0.7797\sqrt{\overline{H^2} - (\bar{H})^2} \tag{15}$$

Both $\hat{\varepsilon}$ and $\hat{\theta}$ are random variables whose values depend upon the particular random sample used for their estimation. The method of moments estimator for ε has a variance which is only 5% larger than that obtainable by the more complicated method of maximum likelihood. However, this method gives an estimator for θ with a variance which is 80% greater than that obtainable by the method of maximum likelihood, and thus the resulting value of $\hat{\theta}$ tends to be unreliable.

It should be noted that the moment estimators for Type I are relatively straightforward to use since this distribution has the same shape for all parameter values.

5.1.2 Fréchet Distribution - Type II

The Type II distribution again has two parameters. However, one of these, α, controls the basic shape of the distribution and this parameter must be estimated first. A commonly used method for determining the shape is to equate the skewness of the sample to that of the model. The skewness of a distribution is defined by:

$$S = \frac{\mu_3}{\mu_2^{3/2}} \tag{16}$$

where $\mu_2$ and $\mu_3$ are the second and third central moments.
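The moment quantities introduced so far (Eqns. 13 to 16) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are illustrative only, and the skewness is computed directly from the sample central moments rather than read from a chart.

```python
import math

EULER_GAMMA = 0.57722  # Euler's constant as quoted with Eqn. 13

def gumbel_moment_estimates(heights):
    """Type I moment estimators of Eqns. 14 and 15.
    Returns (eps_hat, theta_hat): location and scale."""
    n = len(heights)
    h_bar = sum(heights) / n                   # first moment, Eqn. 13
    h2_bar = sum(h * h for h in heights) / n   # second moment about the origin, Eqn. 13
    spread = math.sqrt(h2_bar - h_bar ** 2)    # sample standard deviation
    theta_hat = math.sqrt(6.0) / math.pi * spread   # coefficient is approx. 0.7797 (Eqn. 15)
    eps_hat = h_bar - EULER_GAMMA * theta_hat       # equivalent to 0.4501 * spread (Eqn. 14)
    return eps_hat, theta_hat

def sample_skewness(heights):
    """Sample analogue of Eqn. 16: mu_3 / mu_2^(3/2) from central moments."""
    n = len(heights)
    h_bar = sum(heights) / n
    mu2 = sum((h - h_bar) ** 2 for h in heights) / n
    mu3 = sum((h - h_bar) ** 3 for h in heights) / n
    return mu3 / mu2 ** 1.5
```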
The skewness of the Type II distribution is shown in FIGURE 8 as a function of α. It can be seen that in the region α > 5 the skewness becomes increasingly insensitive to α and is liable to provide an inefficient estimation.

The estimation of the two parameters is achieved by the following method:

i) The sample skewness is calculated as

$$\hat{S} = \left[\overline{H^3} - 3\,\bar{H}\,\overline{H^2} + 2(\bar{H})^3\right]\left[\overline{H^2} - (\bar{H})^2\right]^{-3/2} \tag{17}$$

where

$$\overline{H^3} = \frac{1}{n}\sum_{m=1}^{n} h_m^3 \tag{18}$$

and $\bar{H}$, $\overline{H^2}$ are defined in Eqn. 13.

ii) The estimated value of the shape parameter α is obtained from FIGURE 8.

iii) The scale parameter is then calculated from

$$\hat{\theta} = \bar{H}\,/\,\Gamma\{1 - 1/\hat{\alpha}\} \tag{19}$$

where $\Gamma\{\,\}$ is the gamma function and is available in tabulated form, e.g. Abramowitz and Stegun [1970].

5.1.3 Weibull Distribution - Type III

The Weibull distribution requires the estimation of three parameters α, ε and θ. Since α controls the basic shape of the density function, this parameter takes precedence in the estimation process. The skewness, which is formed from the second and third central moments (Eqn. 16), is found to provide satisfactory estimations, as for the Type II distribution, for α < 20, but for higher values it becomes asymptotic and loses its sensitivity to α. This presents no real problem since the range of sensitivity has been found to be more than adequate for this study. The skewness is shown plotted against α in FIGURE 9.

Since there are three parameters to estimate, the method of moments requires equations involving the first three moments. These may be central moments or moments about the origin (or a combination of the two types). The derivation of the Weibull moments is given in Appendix A.3. The method then reduces to the following steps:

i) Calculate the sample skewness from Eqns. 13, 17 and 18.

ii) The estimated value of the shape parameter α is obtained from FIGURE 9.
It should be noted that for the Type III_U distribution the sign of the skewness should be changed before executing ii). This adjustment is not required for the Type III_L distribution.

iii) Solve for θ from

$$\hat{\theta} = \left\{\left[\overline{H^2} - (\bar{H})^2\right]\left[\Gamma(1 + 2/\hat{\alpha}) - \Gamma^2(1 + 1/\hat{\alpha})\right]^{-1}\right\}^{1/2} \tag{20}$$

iv) Solve for ε from

$$\hat{\varepsilon} = \bar{H} + \hat{\theta}\,\Gamma(1 + 1/\hat{\alpha}) \tag{21}$$

5.1.4 Lognormal Distribution

The lognormal distribution has two parameters, μ and σ. In the transformation from a normal to a lognormal distribution their properties change: μ changes from a location parameter to a scale parameter θ, while σ changes from a scale parameter to the shape parameter α. The normal density has only one standardised shape; the lognormal, however, can assume a variety of shapes depending upon the value of the shape parameter α.

The two moment estimators are given by

$$\hat{\alpha} = \left[\ln\{\overline{H^2}\} - 2\ln\{\bar{H}\}\right]^{1/2} \tag{22}$$

and

$$\hat{\theta} = 2\ln\{\bar{H}\} - \tfrac{1}{2}\ln\{\overline{H^2}\} \tag{23}$$

The use of these is straightforward, but they can be expected to perform poorly if the density is skewed too highly, i.e. best results will occur when the sample histogram of $\ln(H)$ is nearly symmetrical about the mean value, Bury [1975; p. 281].

5.2 Method of Least Squares

It was recommended in SECTION 3.3 that priority be given to fitting a line to the data occurring within the tail rather than using the entire sample. Of the procedures discussed in this chapter, the method of least squares is the only one which may be used effectively with a portion of the sample, and it forms an important part of the approach recommended in this thesis.

Since all the types of probability paper described here give a straight-line plot for their family of distributions, it is feasible to use the linear version of the method of least squares when fitting a line to the data. The method in its basic form is directly applicable to the lognormal, Type I and Type II distributions and is described below.
However, a modification is required for the Type III distribution since its abscissa scale is dependent upon the parameter ε, which is to be estimated.

Two Parameter Estimation

The least-squares method for determining the line of best fit to a group of data points is well known. In FIGURE 10 a series of data points are to be fitted by a line of slope $a$ and intercept $b$. The vertical distance from a data point to the line is given by

$$r_i = |y_i - a\,x_i - b| \tag{24}$$

and the sum of the squared distances is

$$q = \sum_{i=1}^{N} (y_i - a\,x_i - b)^2 \tag{25}$$

The method of least squares selects values of $a$ and $b$ which minimize $q$:

$$\frac{\partial q}{\partial a} = -2\sum_{i=1}^{N} (y_i - a\,x_i - b)\,x_i = 0 \tag{26}$$

$$\frac{\partial q}{\partial b} = -2\sum_{i=1}^{N} (y_i - a\,x_i - b) = 0 \tag{27}$$

where
$i$ is the point index,
$N$ is the number of points.

Whence

$$a = \left[N\sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i\right]\left[N\sum_{i=1}^{N} x_i^2 - \Big(\sum_{i=1}^{N} x_i\Big)^2\right]^{-1} \tag{28}$$

and

$$b = \left[\sum_{i=1}^{N} y_i \sum_{i=1}^{N} x_i^2 - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} x_i y_i\right]\left[N\sum_{i=1}^{N} x_i^2 - \Big(\sum_{i=1}^{N} x_i\Big)^2\right]^{-1} \tag{29}$$

Thus the slope and the intercept may be calculated directly from a table of wave heights and their frequencies of occurrence.

5.2.1 Gumbel Distribution - Type I

The Type I distribution uses the general method described earlier. Type I paper uses an ordinate scale of

$$y = -\ln\{-\ln[1 - Q(h)]\} \tag{30}$$

and an abscissa scale of

$$x = h$$

The data plotting positions are calculated by the method in SECTION 3.2 and the least-squares estimates are:

$$\hat{\theta} = 1/a \tag{31}$$

$$\hat{\varepsilon} = -b/a \tag{32}$$

where $a$ and $b$ are the slope and intercept.

5.2.2 Fréchet Distribution - Type II

The basic method is applied to the Type II distribution, which has the same ordinate scale as Type I (Eqn. 30), but with an abscissa scale:

$$x = \ln h$$

The treatment is the same as used for Type I, and yields the following estimations:

$$\hat{\alpha} = a \tag{33}$$

$$\hat{\theta} = \exp\{-b/a\} \tag{34}$$

where $a$ and $b$ are the slope and intercept respectively.

5.2.3 Lognormal Distribution

Lognormal paper has, as its ordinate scale, critical points of probability from the standard Normal distribution (SECTION 2.1).
If $Z$ is the critical point of

$$P(Z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{Z} \exp\left(-\tfrac{1}{2}t^2\right)dt \tag{35}$$

then the ordinate scale is given by

$$y = Z$$

The abscissa scale is given by

$$x = \ln H$$

$P(Z)$ is tabulated as NORMAL PROBABILITIES and is available in any statistical text.

Application of the basic method to data plotted according to SECTION 3.2 gives the estimators:

$$\hat{\alpha} = 1/a \tag{36}$$

$$\hat{\theta} = -b/a \tag{37}$$

5.2.4 Weibull Distribution - Type III

The three-parameter Weibull distribution has the same ordinate scale as the Types I and II distributions, but has an abscissa scale which is itself dependent upon one of the parameters, ε, to be estimated. See FIGURE 15 and SECTION 2.0. Hence, without prior knowledge of ε the data cannot even be plotted. An iterative least-squares procedure must be adopted to overcome this problem.

Using the least-squares criterion, three equations must be analyzed. This is achieved by the basic method of minimizing the sum of the squared errors as described previously. For the Type III_U distribution the three resulting equations now become

$$a = \left[N\sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i\right]\left[N\sum_{i=1}^{N} x_i^2 - \Big(\sum_{i=1}^{N} x_i\Big)^2\right]^{-1} \tag{38}$$

$$b = \left[\sum_{i=1}^{N} y_i \sum_{i=1}^{N} x_i^2 - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} x_i y_i\right]\left[N\sum_{i=1}^{N} x_i^2 - \Big(\sum_{i=1}^{N} x_i\Big)^2\right]^{-1} \tag{39}$$

$$r = a\sum_{i=1}^{N} y_i\,(\varepsilon - h_i)^{-1} - ab\sum_{i=1}^{N} (\varepsilon - h_i)^{-1} + a^2\sum_{i=1}^{N} \ln(\varepsilon - h_i)\,(\varepsilon - h_i)^{-1} \tag{40}$$

where $r$ is the derivative of the sum of squared errors with respect to ε (up to a constant factor), $x_i = -\ln(\varepsilon - h_i)$, and $y_i = -\ln\{-\ln[1 - Q(h_i)]\}$.

Similarly, three equations may be assembled for the Type III_L distribution. The procedure for the Type III_U now becomes:

i) Select an initial value $\varepsilon_0$. This may be the largest measured wave height $H_{(1)}$.

ii) Calculate $a$ and $b$ using Eqns. 38 and 39.

iii) Calculate $r$ from Eqn. 40 and check for a solution when $r = 0$.

iv) Increase ε by an increment Δε, $\varepsilon_1 = \varepsilon_0 + \Delta\varepsilon$, and repeat the procedure until the value of ε which gives $r = 0$ is found.

The least-squares estimation of the Type III parameters is achieved by computer program, and the convergence of $r$ as ε is increased is shown for typical wave data in FIGURE 11.
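The two-parameter least-squares fits of SECTIONS 5.2.1 and 5.2.2 can be sketched in Python as below. This is an illustrative sketch only: the function names are assumptions, the inputs are taken to be tail heights paired with their exceedance probabilities from Eqn. 7, and the iterative Type III procedure is not attempted here.

```python
import math

def least_squares_line(xs, ys):
    """Slope a and intercept b of the best-fit line (Eqns. 28 and 29)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / denom
    b = (sy * sxx - sx * sxy) / denom
    return a, b

def reduced_variate(q):
    """Ordinate of Type I paper, Eqn. 30, for exceedance probability q."""
    return -math.log(-math.log(1.0 - q))

def fit_type_i(heights, exceedance):
    """Type I fit: x = h. Returns (eps_hat, theta_hat) per Eqns. 31 and 32."""
    ys = [reduced_variate(q) for q in exceedance]
    a, b = least_squares_line(heights, ys)
    return -b / a, 1.0 / a

def fit_type_ii(heights, exceedance):
    """Type II fit: x = ln h. Returns (alpha_hat, theta_hat) per Eqns. 33 and 34."""
    xs = [math.log(h) for h in heights]
    ys = [reduced_variate(q) for q in exceedance]
    a, b = least_squares_line(xs, ys)
    return a, math.exp(-b / a)
```

Feeding the functions points that lie exactly on a known line should return the parameters used to generate them, which gives a simple check of the algebra.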
5.3 Method of Maximum Likelihood

The method of maximum likelihood attempts to provide estimated parameters (maximum likelihood estimators, MLEs) which would give the data sample the highest probability of being observed in its particular form. The relative sizes of each of the data values play a fundamental part in the method, though their order is not important, and for this reason the method is unsuited for fitting a distribution specifically to the sample tail.

The random sample is considered to consist of a series of independent observations from the same distribution. The probability of the intersection of these events is then the product of their individual probabilities. The likelihood function is defined as

$$L = \prod_{m=1}^{n} p(h_m) \tag{41}$$

where $p$ is the density of the parametric family, e.g. Type I, $h_m$ are the individual wave heights, and $n$ is the total number of wave heights. The method of maximum likelihood then selects values for each parameter which maximise the likelihood function. Since most of the common density functions have an exponential form, this procedure is simplified by maximising the logarithm of the likelihood function.

MLEs have an important advantage over all other estimators in that they can yield unbiased estimators with minimum variance. This results in a comparatively efficient use of the data and estimates which are more likely to be close to their true values. Against this quality lies the consideration that for two or more parameters they are troublesome to solve, requiring lengthy iterative manipulation. Furthermore, the method uses the entire sample of wave heights and thus is unsuitable for studies directed specifically at the distribution tail. As a result the MLEs are rarely used in offshore engineering, since it is usually felt that their drawbacks outweigh the principal advantage.
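For completeness, a short description of the maximum likelihood procedure follows for each distribution.

5.3.1 Gumbel Distribution - Type I

The likelihood function is given by

$$L(h;\varepsilon,\theta) = \theta^{-n}\exp\left\{-\sum_{m=1}^{n}\left[\frac{h_m - \varepsilon}{\theta} + \exp\left(-\frac{h_m - \varepsilon}{\theta}\right)\right]\right\} \tag{42}$$

and by setting

$$\frac{\partial \ln L}{\partial \varepsilon} = 0 \quad\text{and}\quad \frac{\partial \ln L}{\partial \theta} = 0$$

two equations in ε and θ can be obtained:

$$\frac{1}{n}\sum_{m=1}^{n}\exp\left(-\frac{h_m - \hat{\varepsilon}}{\hat{\theta}}\right) = 1 \tag{43}$$

and

$$\bar{H} - \frac{1}{n}\sum_{m=1}^{n} h_m\exp\left(-\frac{h_m - \hat{\varepsilon}}{\hat{\theta}}\right) = \hat{\theta} \tag{44}$$

It can be seen that these involve an iterative solution for $\hat{\varepsilon}$ and $\hat{\theta}$.

5.3.2 Fréchet Distribution - Type II

The likelihood function is given by

$$L(h;\alpha,\theta) = \left(\frac{\alpha}{\theta}\right)^{n}\prod_{m=1}^{n}\left(\frac{h_m}{\theta}\right)^{-(\alpha+1)}\exp\left\{-\sum_{m=1}^{n}\left(\frac{h_m}{\theta}\right)^{-\alpha}\right\} \tag{45}$$

and the two ML equations in α and θ are

$$\frac{1}{n}\sum_{m=1}^{n} h_m^{-\hat{\alpha}} = \hat{\theta}^{-\hat{\alpha}} \tag{46}$$

and

$$\frac{n}{\hat{\alpha}} - \sum_{m=1}^{n}\ln h_m + n\left[\sum_{m=1}^{n} h_m^{-\hat{\alpha}}\ln h_m\right]\left[\sum_{m=1}^{n} h_m^{-\hat{\alpha}}\right]^{-1} = 0 \tag{47}$$

where $n$ is the total number of wave heights. A more detailed account of the distribution is given by Thom [1954].

Equations 46 and 47 must be solved simultaneously by computer or by a graphical method. A closed form for the maximum likelihood estimator does not exist.

5.3.3 Weibull Distribution - Type III

The likelihood function is given by

$$L(h;\varepsilon,\alpha,\theta) = \left[\frac{\alpha}{\theta^{\alpha}}\right]^{n}\prod_{m=1}^{n}\Delta_m^{\alpha-1}\exp\left\{-\sum_{m=1}^{n}\left(\frac{\Delta_m}{\theta}\right)^{\alpha}\right\} \tag{48}$$

where $\Delta_m = \varepsilon - h_m$. The three equations which result from maximizing the likelihood function are

$$\hat{\alpha}\,\hat{\theta}^{-\hat{\alpha}}\sum_{m=1}^{n}\Delta_m^{\hat{\alpha}-1} = (\hat{\alpha} - 1)\sum_{m=1}^{n}\Delta_m^{-1} \tag{49}$$

$$\hat{\theta} = \left[\frac{1}{n}\sum_{m=1}^{n}\Delta_m^{\hat{\alpha}}\right]^{1/\hat{\alpha}} \tag{50}$$

and

$$\frac{n}{\hat{\alpha}} - n\ln\hat{\theta} + \sum_{m=1}^{n}\ln\Delta_m - \sum_{m=1}^{n}\left(\frac{\Delta_m}{\hat{\theta}}\right)^{\hat{\alpha}}\ln\left(\frac{\Delta_m}{\hat{\theta}}\right) = 0 \tag{51}$$

Equations 49, 50 and 51 must be solved simultaneously for θ, α and ε, and again a closed-form maximum likelihood estimator does not exist. The treatment of the Type III_L distribution is similar but is not given here.

5.3.4 Lognormal Distribution

As a result of its direct connection with the Normal distribution, the lognormal case is comparatively straightforward. The estimators are:

$$\hat{\theta} = \frac{1}{n}\sum_{m=1}^{n}\ln h_m \tag{52}$$

and

$$\hat{\alpha}^2 = \frac{1}{n}\sum_{m=1}^{n}\left(\ln h_m - \hat{\theta}\right)^2 \tag{53}$$

In view of the simplicity of the M.L. estimators for the lognormal distribution, and because of their desirable properties of unbiasedness and minimum variance, these should be used in place of the method of moments estimators (which have a lower efficiency), and may be considered a good alternative to least squares.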
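Since the lognormal estimators of Eqns. 52 and 53 are in closed form, they can be sketched directly. The function name below is an assumption made for illustration, and the sketch assumes all wave heights are strictly positive.

```python
import math

def lognormal_ml_estimates(heights):
    """Maximum likelihood estimators of Eqns. 52 and 53.
    Returns (theta_hat, alpha_hat): the mean and standard deviation of ln h,
    using the 1/n divisor of Eqn. 53."""
    logs = [math.log(h) for h in heights]      # requires h > 0
    n = len(logs)
    theta_hat = sum(logs) / n                  # Eqn. 52
    alpha_sq = sum((x - theta_hat) ** 2 for x in logs) / n   # Eqn. 53
    return theta_hat, math.sqrt(alpha_sq)
```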
CHAPTER 6

TESTS OF FIT BETWEEN THE DISTRIBUTION AND THE DATA

Basically there are two methods of testing fit between the distribution and the data:

i) by hypothesis testing
ii) by confidence intervals

These two methods operate in much the same way, since both use a pivotal quantity and have a level of significance or confidence. The confidence interval approach has been almost universally accepted for this type of work since it can be plotted in a form which may be readily appreciated by the engineer.

Until very recently, the coastal/ocean engineering literature indicated that confidence intervals could only be applied to the Type I distribution, as derived by Gumbel [1958] and in the form summarized by St. Denis [1969]. The discussion here will show that this is certainly not the case, that they may be generated for any of the four distributions, and that Gumbel's form was only an approximation to the exact derivation.

The intervals described here are not presented in the standard parametric interval form most commonly used by statisticians, which sets probabilistic limits on the estimated values of the parameters. The confidence intervals used for this work set probabilistic limits on the range of each of the data values given the estimated distribution. This permits an engineer to review the results and form a conclusion on the closeness with which a model fits the data.

A summary of the derivation of confidence intervals is presented in the next section. More detailed discussions may be found in accounts by Kendall [1947] and Borgman [1959].
6.1 Derivation of Confidence Intervals

Starting with a data sample $H_1, H_2, \ldots, H_n$ which has a continuous parent distribution $P(h)$, we arrange the data in order of magnitude:

$$H_{(1)}, H_{(2)}, \ldots, H_{(n)}$$

so that $H_{(1)}$ is the largest and $H_{(n)}$ is the smallest. Each data value $H_{(m)}$ is then assumed to behave as an independent random variable which has an identical parent distribution $P(h)$.

The general probability density of the $m$-th statistic of a sample containing $n$ values is the probability that

$$h - \tfrac{1}{2}dh < H_{(m)} < h + \tfrac{1}{2}dh$$

In order to achieve this, $m-1$ values must fall above $h + \tfrac{1}{2}dh$, one must fall within $h \pm \tfrac{1}{2}dh$, and the rest below $h - \tfrac{1}{2}dh$. Hence

$$f_m(h)\,dh = [P(h)]^{n-m}\;p(h)\,dh\;[1 - P(h)]^{m-1}$$

where
$p(h)$ is the parent density function,
$f_m(h)$ is the density function of the $m$-th statistic.

Since there is still some ambiguity over which values go above and below $h$, we choose $(n-m)$ values below $h$, then one value at $h$, leaving the remaining $(m-1)$ to fall above $h$:

$$f_m(h)\,dh = m\binom{n}{n-m}[P(h)]^{n-m}[1 - P(h)]^{m-1}\,p(h)\,dh \tag{54}$$

$$= \frac{m\,\Gamma(n+1)}{\Gamma(n-m+1)\,\Gamma(m+1)}\,[P(h)]^{n-m}[1 - P(h)]^{m-1}\,p(h)\,dh = \frac{1}{B(n-m+1,\,m)}\,[P(h)]^{n-m}[1 - P(h)]^{m-1}\,p(h)\,dh$$

where $B(\;)$ is the Beta function.

The cumulative probability of the $m$-th largest variable from $n$ values is

$$F_m(h) = \text{Prob}\,[H_{(m)} < h] = \int_{-\infty}^{h} f_m(h)\,dh = \frac{1}{B(n-m+1,\,m)}\int_{-\infty}^{h}[P(h)]^{n-m}[1 - P(h)]^{m-1}\,p(h)\,dh \tag{55}$$

Let
$w = P(h)$,
$dw = p(h)\,dh$.

Then

$$F_m(h) = \frac{1}{B(n-m+1,\,m)}\int_{0}^{P(h)} w^{\,n-m}[1 - w]^{m-1}\,dw \tag{56}$$

$$= \frac{B_P(n-m+1,\,m)}{B(n-m+1,\,m)} = I_P(n-m+1,\,m)$$

where $I_P(\;)$ is the incomplete beta function, which can be expressed in terms of the binomial expansion

$$I_p(m,\,n-m+1) = \sum_{j=m}^{n}\binom{n}{j}p^{\,j}[1 - p]^{n-j}$$

and

$$I_p(a,b) = 1 - I_{1-p}(b,a)$$

so that

$$F_m(h) = 1 - \sum_{j=m}^{n}\binom{n}{j}[1 - P(h)]^{\,j}[P(h)]^{n-j} \tag{57}$$

Equation 57, then, expresses the probability function of the $m$-th data point in terms of the parent distribution $P(h)$ which governs the occurrence of all wave heights.
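One way in which Eqn. 57 might be evaluated, and inverted numerically for the parent probability at a chosen probability level, is sketched below in Python. This is an illustrative sketch rather than the tabulation method of the references: the function names and the use of bisection are assumptions, and the direct binomial sum is only practical for moderate sample sizes.

```python
from math import comb

def order_stat_cdf(parent_p, m, n):
    """Eqn. 57: probability that the m-th largest of n values lies below h,
    given the parent non-exceedance probability parent_p = P(h)."""
    tail = sum(comb(n, j) * (1.0 - parent_p) ** j * parent_p ** (n - j)
               for j in range(m, n + 1))
    return 1.0 - tail

def critical_parent_probability(phi, m, n, tol=1e-12):
    """Solve order_stat_cdf(P, m, n) = phi for P by bisection.
    The left-hand side rises monotonically from 0 to 1 as P goes from 0 to 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if order_stat_cdf(mid, m, n) < phi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For the largest value, m = 1, Eqn. 57 reduces to the n-th power of P(h), which gives a convenient check on the sum.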
At this point it is clear that the form or type of the parent distribution has not been specified, and that the method is equally applicable to data from any of the four distributions considered.

6.2 Asymptotic Distribution of the m-th Statistic

If the number of wave heights $n$ is increased without limit while both $m$ and $P(h)$ are held stationary, the function $F_m(h)$ tends to zero. This is very inconvenient, and it would be desirable to have a stable non-zero limiting function which could be used for large samples. This is achieved by replacing $P(h)$ by the parameter $w_n(h)$, where

$$w_n(h) = n\,[1 - P(h)] \tag{58}$$

and tabulating $w_n(h)$ instead of $P(h)$. In this case the incomplete beta function is replaced by the incomplete gamma function. Tables of $w_n$ were prepared by Borgman [1959] and are summarized in FIGURES 24 to 28, which are plots of $\ln w_n = \ln\{n[1 - P(h)]\}$ against the sample size $n$ for several values of $F_m(h)$.

6.3 Approximate Distribution of the m-th Statistic

Gumbel [1958] described an approximate form for the standard error of the $m$-th statistic. It was assumed that $m/n$ was approximately ½, so that it was strictly valid only for statistics taken from the centre of the ordered sample. He showed that Eqn. 54 could be expressed as a power series of which all but the first few terms could be neglected. This simplified to the density of a normal distribution.

6.4 Limitations

It should be noted that the assumptions made in its derivation render the Gumbel version quite inaccurate in the vicinity of $m = 1$ (FIGURE 12). Since this region is of prime interest here, especially when using the least-squares method, the more direct method of solving Eqn. 57 is to be preferred.

In addition to a variation between the true interval and the standard deviation indicated by the normal distribution approximation, there is a very noticeable development of skewness (see FIGURE 12) in the distribution of the $m$-th value as $m$ approaches unity. This is quite important when using the confidence interval lines, since they develop a strong bias towards the right of the fitted line, indicating that if other sets of samples were used, the values would tend to fall more often to the right of the fitted line as $m$ decreased and have greater heights than indicated by it.

A restriction on the use of the $m$-th value distribution is that it is not defined in the region $m < 1$, and hence cannot be directly extended to predicted values. Gumbel effectively suggested that predicted maximum values would each retain the same interval size as the statistic at $m = 1$, but this practice does not seem to be generally accepted.

6.5 Method of Determining Critical Points

For a given sample of $n$ wave heights and a statistic of ordered position $m$, the distribution of the $m$-th value is described in terms of the original parent distribution. Since our sample consists of extreme values, the parent distribution may be chosen from one of the four distributions used here, e.g. Type III.

If the confidence intervals are defined in terms of a probability that a given height $h$ will not be exceeded by the $m$-th statistic, then Eqn. 57 can be solved for the critical value $h$. However, a more useful form can be reached by solving for the value of the parent distribution $P(h)$ which satisfies $F_m(h) = \phi$, the confidence probability. The value of $P(h)$ given by this may then be applied to any parent distribution to get a critical value of $h$. This approach results in tabulated values of $P(h)$ for given $n$, $m$ and γ. Comprehensive tables for $m = 1(1)5$ have been compiled by Borgman [1959].

6.6 Procedure for Plotting Confidence Intervals

A plot of the data is prepared on the paper of the selected distribution by following the sample preparation and plotting procedure given in CHAPTER 3.
A best-fit line is then located by one of the techniques in CHAPTER 5. The method for plotting the confidence intervals is as follows:

i) A confidence probability level γ is selected, e.g. 0.25. Each value of $m$ is then treated separately and is used to obtain a pair of values of $P(h)$ from FIGURES 24 to 28. That is, for a given sample size $n$ and rank $m$, and setting $F_m(h) = \phi$ for both $\phi = (1-\gamma)/2$ and $\phi = (1+\gamma)/2$ respectively, as shown in FIGURE 17, the appropriate FIGURE (24 to 28) is used to determine two values of $P(h)$.

ii) Each value of $P(h)$ is used as an ordinate position (see FIGURE 13) and, when projected onto the best-fit line, will give two limiting values of height for the $m$-th statistic.

iii) These may then be plotted on either side of the plotting position, and a faired line drawn through equivalent limits for the remaining points.

6.7 Examples of Confidence Intervals

The sample described by St. Denis [1969] is shown plotted in FIGURES 13 to 16 with 25% confidence intervals. It may be seen that this level of confidence is sufficient to contain all points in the Type I, Type II and lognormal plots. However, the Type III_U confidence intervals are too narrow to contain the second and third highest points. This indicates that the Type III_U distribution is the least suitable model for this data.

CHAPTER 7

METHODS OF PREDICTION

The processes of selecting the most suitable distribution, together with estimates of the parameters, have been described in detail. This chapter will discuss methods of using the best-fit line to predict the "extreme wave". In order to describe this wave one requires a representative height, and for this either the significant or the maximum wave height may be used. The calculation of these two values is outlined in SECTIONS 7.1 and 7.3.

The methods described in previous chapters have been directed entirely towards predicting wave heights.
Although extreme value analysis have not been applied to wave periods i n this thesis, the period of the extreme wave may s t i l l be calculated from the l i m i t i n g steepness as described i n SECTION 7.2. The predicted.values of wave height and period apply over the same recording interval that was used for data collection. Once an extreme wave height has been obtained i t may be desirable to set a pair of probabilistic l i m i t s on i t s value to r e f l e c t the size of the sample and the quality of the estimating process. A discussion of such l i m i t s i s given i n SECTION 7.4. In SECTION 7-5, the encounter probability and return period are discussed In d e t a i l . The encounter probability quantifies the r i s k of a wave with a given return period occurring within the lifetime. Additionally, another value might be used to estimate" : the number of smaller waves, occuring within the same period, and which might hinder operation or promote fatigue. By this process the designer would be able to consider a "limit-state" and a "service condition". 7.1 Expected Significant Height Once the extreme value plot has been drawn the next stage Is usually to estimate the so-called 50 or 100 year design wave height. That i s the wave height, defined i n the same way as the recorded heights (e.g. the significant height over a recording i n t e r v a l ) , which would only be exceeded on average once during a period of 50 or 100 years respectively. This time interval i s called the RETURN PERIOD or RECURRENCE INTERVAL, and i s usually selected on the basis of a given structure lifetime. A non-dimensional EXPECTED WAITING TIME, R i s defined as the average number of t r i a l s between exceedances of a given height h. Each t r i a l amounts to one recording interval and hence the waiting time i s given by EXPECTED WAITING TIME = Let W R E C O S INTERVAL = R ••• (59) be the waiting time, i . e . 
the random number of observations preceding and including the first exceedance of a given height h. W then has a GEOMETRIC DISTRIBUTION and if P = Pr[H < h] then the expected value of W is

$R = \dfrac{1}{1 - P}$ ....(60)

which yields

$\text{EXPECTED WAITING TIME} = \dfrac{1}{1 - P}$ ....(61)

Thus for a return period $T_r$ of 50 years, and using data which was recorded at 3 hourly intervals, the expected waiting time R would be 146,000 (50 years x 365 days x 8 records per day). For a given data record and a return period one may thus calculate R and hence Pr[H ≤ h] from Eqns. 59 and 61. This will correspond to a value of height h on the probability plot. Since the data consisted of a series of significant heights, this predicted height will represent the significant height of a corresponding recording interval and with the required return period.

7.2 Extreme Wave Period

There are three common approaches to estimating the extreme wave period. The first is to repeat the entire procedure using wave periods instead of wave heights. The marginal frequencies are obtained by summing the number of wave occurrences in each period class. By using the same return period as the heights, one may obtain a predicted value of the 50 year zero-crossing period $T_z$ for a future wave record with the same recording interval. Draper [1963] has suggested that this value of $T_z$ may be used with the predicted height. This suggestion is based on the fact that there is a noticeable correlation between the two variables in the scatter diagram.

The second method of obtaining a representative wave period involves the use of a one-parameter wave spectrum such as the Pierson-Moskowitz spectrum. The spectrum is calculated for the predicted value of significant height and the value of frequency locating the spectrum peak is used to obtain the wave period. Again, the value corresponds to the same recording interval as does the data.
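The second method can be illustrated numerically. The following is a minimal sketch only: the text does not state the spectral form or its constants, so the commonly quoted Pierson-Moskowitz form $S(\omega) = \alpha g^2 \omega^{-5} \exp\{-\beta(\omega_0/\omega)^4\}$ with $\alpha = 0.0081$ and $\beta = 0.74$, and the relation $H_s = 4\sqrt{m_0}$, are assumptions, as is the function name `pm_peak_period`.

```python
import math

# Pierson-Moskowitz constants (assumed here; they are not given in the text).
ALPHA_PM = 0.0081
BETA_PM = 0.74


def pm_peak_period(hs, g=32.2):
    """Period (s) at the peak of a one-parameter PM spectrum of significant height hs.

    hs and g must be in consistent units (feet and ft/s^2 by default).
    """
    # The zeroth moment of the assumed spectrum is m0 = ALPHA g^2 / (4 BETA w0^4).
    # With Hs = 4 sqrt(m0) this gives w0^2 = 2 sqrt(ALPHA / BETA) g / Hs.
    w0 = math.sqrt(2.0 * math.sqrt(ALPHA_PM / BETA_PM) * g / hs)
    # Setting dS/dw = 0 gives w_peak^4 = (4 BETA / 5) w0^4.
    w_peak = (4.0 * BETA_PM / 5.0) ** 0.25 * w0
    return 2.0 * math.pi / w_peak


if __name__ == "__main__":
    # Hypothetical input: a predicted significant height of 14.47 feet.
    print(round(pm_peak_period(14.47), 2))
```

Under these assumptions the peak frequency reduces to approximately $0.40\sqrt{g/H_s}$, so the period depends only on the predicted significant height, which is what makes a one-parameter spectrum convenient for this purpose.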
The third method, which is the simplest to use, involves using the predicted wave height to set a lower limit on the wave period. By assuming a Pierson-Moskowitz spectrum, Battjes [1970] has shown that, for deep water and intermediate depths, the wave steepness defined as $2\pi H_s / (g T_z^2)$ is limited by

$\dfrac{2\pi H_s}{g T_z^2} \le \dfrac{1}{16}$ ....(62)

where g is the gravitational constant. Thus a lower limit of period $T_L$ for a given significant height may be set as:

$T_L = \left(32\pi H_s / g\right)^{1/2}$ ....(63)

The method then involves trying different combinations of the period with the predicted height to find the worst effect on the structure.

7.3 Maximum Wave Height

Once the extreme values of significant wave height and mean zero-crossing period have been established, the maximum wave height may be calculated. Longuet-Higgins and Cartwright [1956] showed that for a wave spectrum of arbitrary shape the expected maximum individual wave height could be expressed as

$\dfrac{\text{Expected Maximum Height}}{\text{Significant Height}} = \left[\tfrac{1}{2}\ln\left(t / T_z\right)\right]^{1/2}$ ....(64)

where t is the recording interval and $T_z$ is the mean zero crossing period. This relationship is valid provided $t/T_z$ is large, i.e. the recording interval contains a large number of waves. The sea-state is assumed to be stationary throughout this period.

In the preceding SECTION 7.2, it was shown that a lower limit could be placed on the period attached to the predicted value of the significant height. The procedure for calculating the expected maximum wave height is:

i) Using the predicted value of the significant height $H_s$ and the limiting steepness, calculate the minimum period $T_L$ from Eqn. 62:

$T_L = \left(32\pi H_s / g\right)^{1/2}$ ....(63)

where g is the gravitational constant.

ii) Calculate the expected maximum wave height from Eqn. 64, setting $T_z$ equal to $T_L$ from Eqn. 63.

It is quite sufficient to use the minimum period here since Eqn. 64 is insensitive to variations in $t/T_z$.
For example, a 10% error in the central period of 7 seconds over a recording interval of 3 hours would result in a height error of less than 0.5%.

7.4 Confidence Intervals for Prediction

Gumbel [1958] suggested that the confidence intervals described in the last chapter could be extended, beyond the region containing data, for prediction. The method he suggested was to draw a pair of lines parallel to the fitted line and passing through the interval offset points of the highest data point. Although the concept of using intervals to indicate error in prediction was very attractive, this method has not generally been adopted according to Chow [1964]. It was, however, restated by St. Denis [1969] in a paper devoted to wave prediction.

As has already been discussed, there is always a degree of variability involved in parameter estimation. An estimate is a function of a random sample and hence is itself a random variable. The estimator's variability is not lessened by the fact that the various methods of estimation often yield slightly different results. Hence, the Weibull distribution has three possible sources of error once it has been estimated. In view of the difficulties described, it is not surprising that the problem was left untouched until quite recently.

Thoman, Bain and Antle [1969] prepared confidence interval tables for the parameters of the two-parameter Weibull distribution. This special case occurs when a Type III_L distribution is used with epsilon set equal to zero. The tables were made by using Monte Carlo simulation to generate a series of random samples from the Type III_L distribution, and thence deriving an empirical distribution for each parameter. Confidence intervals were then taken from these empirical distributions by a similar process to that used in the last section.
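The Monte Carlo idea described above can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the published tables: the estimator used here is a least-squares line on Weibull paper with plotting positions i/(n+1), which is an assumption made for brevity and is not the estimator used by Thoman, Bain and Antle. The function names and the parameter values in the usage lines are hypothetical.

```python
import math
import random


def fit_weibull_least_squares(sample):
    """Fit a two-parameter Weibull law by least squares; returns (shape, scale)."""
    h = sorted(sample)
    n = len(h)
    xs = [math.log(v) for v in h]
    # Assumed plotting position i/(n+1) for the i-th smallest value.
    ys = [math.log(-math.log(1.0 - i / (n + 1))) for i in range(1, n + 1)]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    shape = sxy / sxx                        # slope of the fitted line
    scale = math.exp(x_bar - y_bar / shape)  # from the intercept -shape*ln(scale)
    return shape, scale


def empirical_interval(values, level):
    """Central interval holding a fraction `level` of the simulated values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    k_lo = int(round((1.0 - level) / 2.0 * last))
    k_hi = int(round((1.0 + level) / 2.0 * last))
    return ordered[k_lo], ordered[k_hi]


def simulate_shape_interval(true_shape, true_scale, n, trials, level, seed=0):
    """Empirical interval for the fitted shape parameter over repeated samples."""
    rng = random.Random(seed)
    shapes = []
    for _ in range(trials):
        # random.weibullvariate takes (scale, shape) in that order.
        sample = [rng.weibullvariate(true_scale, true_shape) for _ in range(n)]
        shapes.append(fit_weibull_least_squares(sample)[0])
    return empirical_interval(shapes, level)


if __name__ == "__main__":
    # Hypothetical case: shape 2, scale 10 feet, samples of 30 values.
    print(simulate_shape_interval(2.0, 10.0, n=30, trials=500, level=0.90))
```

The limits are read from the simulated distribution at the probabilities (1 - level)/2 and (1 + level)/2, mirroring the way the confidence limits of SECTION 6.6 are read from the distribution of the m-th observation.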
The Monte Carlo approach was used by Petrauskas and Aagaard [1971] to produce "uncertainty intervals" for prediction. The two limits calculated for each of the parameters involved resulted in a pair of straight lines, each having a slope and intercept which were different from the best-fit line. An example of the uncertainty intervals is shown in FIGURE 23. The intervals generated by this method were found to diverge from the best-fit line as the variate increased instead of remaining parallel to it, as originally suggested by Gumbel.

7.5 Encounter Probability and Waiting Time

It is accepted practice to refer to a design wave of given height by its RETURN PERIOD at a specific location. Thus a 25-year wave means that waves as large as the design wave, or larger, occur on average once in each 25-year period. It is evident that in fact several such waves could possibly occur within the same 25-year period. Borgman [1963] has given a description of the distribution of the waiting time between events and of the probability of encounter.

The concept of return period enables one to represent the continuous time dimension as a series of discrete integer multiples of the original recording interval, or of a conveniently related quantity such as one year. Thus time can be used as a discrete random variable which has two possible states, which reflect whether or not the design wave has been exceeded within the associated time period. The probability of an exceedance is given by p = 1 - P(h), which represents the probability that a recording interval t will contain a wave higher than h, and q = 1 - p. If T is the waiting time until the first exceedance of h occurs, then it has a geometric distribution and

$\Pr\{T = \tau t\} = p\,q^{\tau - 1}$ ....(65)

where $\tau$ is a dimensionless integer multiple. From Eqns. 59 and 61, the expected waiting time is

$\dfrac{T_r}{t} = \dfrac{1}{1 - P(h)}$ ....(66)

thus from Eqns. 65 and 66

$\Pr\{T \le \tau t\} = 1 - \left(1 - \dfrac{t}{T_r}\right)^{\tau}$ ....(67)

This is the probability distribution of waiting time, and it is independent of the timing of the previous exceedance. If $(\tau t)$ represents the lifetime of the structure then Eqn. 67 may be used to calculate the probability of it experiencing a wave with a given return period. FIGURE 22 has been constructed from Eqn. 67 for the special case where t = 1 year. When $T_r/t \gg 1$, a suitable approximation to Eqn. 67 is given by

$\Pr\{T \le \tau t\} \approx 1 - \exp\left\{-\dfrac{\tau t}{T_r}\right\}$ ....(67a)

CHAPTER 8

CONCLUSIONS - A RECOMMENDED PROCEDURE

A recommended procedure for the prediction of extreme waves is now given below:

a) The data is taken from an intermittent record over a period of at least one year. Usually this will consist of a series of short continuous records (10 - 20 minute duration) which have been started at intervals of 3 - 10 hours. For reasons given in SECTION 3.3 it is possible to use data which has some of the results from 'low activity' months missing provided it can be shown that their approximate ranking positions fall outside the tail as defined in SECTION 3.3.

b) Each continuous record is reduced by the method described in SECTION 3.1 to a pair of single values, i.e. the significant height $H_s$ and the zero crossing period $T_z$.

c) A scatter diagram with both $H_s$ and $T_z$ divided into a number of equal classes is then prepared. The number of records falling into each joint interval should be marked as shown in FIGURE 5.

d) The marginal height frequencies are fixed by summing over the period $T_z$ for each class of $H_s$.

e) The plotting positions of each class lower limit are calculated according to the method of SECTION 3.2.

f) Each class lower limit is plotted on Type I paper and the curvature test is applied as in SECTION 4.2.
This will result in a choice of one of three groups of distribution which are described in SECTION 4.1.

g) The tail of the sample is isolated for further use by the method given in SECTION 3.3. If this yields less than five points, one may return to step (c) and further subdivide the height classes in the scatter diagram. If this is not feasible one may have to accept a more general fit and include some lower limit classes.

h) The class lower limits of the tail are then plotted onto each paper of the distribution group selected in step (f). A straight line is fitted by one of the methods described in CHAPTER 5.

i) For each distribution of the group, the confidence intervals corresponding to the plotted points are drawn. Initially a confidence level of 60% may be used and narrower bands drawn until points start to fall outside the limits. On this basis one may select one distribution which gives the best fit to the data.

j) Predicted values of wave height and period may now be made by the methods of CHAPTER 7, and encounter probabilities assigned where appropriate.

CHAPTER 9

A WORKED EXAMPLE

The data used for this worked example was collected by the Department of Public Works of Canada at TINER POINT, NEW BRUNSWICK. It covers a period of one year and originally appeared in a paper by Khanna and Andru [1974]. The example is analysed using the same steps as summarized in CHAPTER 8.

Steps a) to c) were originally carried out by the collecting agency and the starting point was the scatter diagram shown in FIGURE 5. The marginal frequencies of significant height $H_s$ and zero crossing period $T_z$ are also shown [step d)].

Steps e) and f) were carried out to give the results shown in FIGURE 18. A curved line, which was fitted to the points by eye for convenience, has a strong positive curvature.
On the basis of the method discussed in SECTION 4.1 this permitted the analysis to be confined to two possible models:

- Type III_U distribution
- Type III_L distribution

Step g) was carried out according to SECTION 3.3 as follows:

total number of wave heights on record: N = 2245
largest rank within tail: N/10 = 224

By adding the marginal frequencies of each height class the position of the class containing a value with a rank of 224 was located. This resulted in the nine points which were plotted in FIGURE 18. The extent of the tail is indicated and the remaining points have been included to show their behaviour.

The result of step h), using the method of least squares, is shown in FIGURES 19 and 20. The confidence intervals have also been fitted according to step i). The calculation of confidence band positions is given in TABLE V. Since FIGURES 24 to 28 only cover the first five ranked positions [m = 1 to 5] they cannot be directly applied to higher ranked values. This is overcome by introducing an approximate method which may be justified by the fact that the confidence intervals only serve as a test of comparison, and hence do not require the rigorous approach of parameter estimation. The method is given as follows:

i) The rank m of the smallest value occurring in each class is used and shown in column 2.

ii) The mean frequency is calculated as $m/(n+1)$ and shown in column 3.

iii) When the rank becomes greater than five, m is set equal to 5 and a new sample size n' is chosen to give an approximate frequency (column 4) which is close to the value in column 3:

$n' = \dfrac{5}{\text{mean frequency}} - 1$

The resulting values of n' are shown in column 5.

iv) An initial confidence level of $\gamma = 0.60$ was chosen. The procedure of SECTION 6.6 was then used to plot the confidence intervals with the following modifications:

- n' replaces the true sample size
- FIGURE 28 is used for m ≥ 5.
The final plots together with confidence intervals are shown in FIGURES 19 and 20.

Since the 60% confidence bands did not enclose all the points in FIGURE 19, intervals for 80% confidence were plotted. It was found that 60% confidence was sufficient for FIGURE 20, and that the highest pair of points could be enclosed by a narrower band of 40% confidence. Thus the Type III_U distribution with ε = 15 feet was the most suitable model for this data.

The prediction of the 100 year design wave is made according to the methods given in CHAPTER 7 for a RECORDING INTERVAL of 3 hours and a RETURN PERIOD of 100 years, which results in an EXPECTED WAITING TIME of 292,000. The probability of non-exceedance is calculated from Eqn. 61 as P(h) = 0.99999657, which corresponds to y = 12.584 on the ordinate scale of FIGURE 20. This yields a 100 year SIGNIFICANT HEIGHT of 14.47 feet.

The data used for this worked example was analysed by Khanna and Andru [1974]. Their estimate for the 100 year significant height varied between 20 and 30 feet. The lowest value was taken from a Type III_L plot and the largest from a lognormal plot. The value of significant height for the 100 year wave suggested by this worked example was less than 15 feet. The large difference is attributable to the limiting effect of the Type III_U distribution and to the fitted line now lying to the left of the m = 1 point.

The minimum period is determined from Eqn. 63 as $T_L$ = 6.7 seconds, and hence the EXPECTED MAXIMUM HEIGHT is given by Eqn. 64 as 27.9 feet.

CHAPTER 10

FUTURE WORK

The procedure which has been described is primarily concerned with predicting a wave height of given return period. In cases where the dynamic response of a structure to waves is of importance, one must consider the distribution of wave periods.
In order to calculate the combined effect of height and period variation, it becomes necessary to introduce a long-term bivariate distribution. This describes the joint probability of a given height and period occurring in combination. To a limited extent this problem has been examined by Battjes [SECTION 1.0] who used a discrete approach.

The bivariate distribution developed could be continuous and its marginal distributions of height and period may be quite different, e.g. a Type I for periods with a Type III_U for significant height. Thus, instead of fitting a straight line one would use a "surface of best fit".

One advantage of such an approach would be that a designer could take the fundamental frequencies of the structure into account when predicting design values. An approach would be to predict a wave of given return period, e.g. a fifty year wave, given that the period of concern lay between limits, e.g. 6 to 8 seconds.

From SECTION 7.4 it can be seen that there is still no direct approach for obtaining confidence bands for predicted values. It would be most useful to the engineer if a method based on tables could be prepared for office use.

BIBLIOGRAPHY

Abramowitz, M. and Stegun, I. Handbook of Mathematical Functions, Dover, New York, 1970.

Battjes, J.A. "Long-term wave height distribution at seven stations around the British Isles," National Institute of Oceanography, England, Internal Report No. A.44, July 1970.

Borgman, L.E. "The frequency distribution for the mth largest of n values," M.S. Thesis, University of Houston, Houston, Texas, 1959.

Borgman, L.E. "The frequency distribution of near extremes," Journal of Geophysical Research, Vol. 66, No. 10, pages 3295-3307, October 1961.

Borgman, L.E. "Risk Criteria," Journal of the Waterways and Harbours Division, American Society of Civil Engineers, Vol. 89, No. WW3, August 1963.

Borgman, L.E. "Extremal Statistics in Ocean Engineering," Proceedings of Civil Engineering in the Oceans, University of Delaware, 1975.

Bretschneider, C.L. "Generation of wave by winds, state-of-the-art," National Engineering Science Company, Report SN-134-6, January 1965.

Bury, K.V. Statistical Models in Applied Science, John Wiley and Sons, 1975.

Cartwright, D.E. and Longuet-Higgins, M.S. "The statistical distribution of the maxima of a random function," Philosophical Transactions, Royal Society of London, A247, pages 22-48, 1958.

Chow, V.T. Handbook of Applied Hydrology: a compendium of water resources technology, McGraw-Hill, New York, 1964.

Draper, L. "Derivation of a design wave from instrumental records of sea states," Proceedings, Institution of Civil Engineers, London, Vol. 26, pages 291-304, 1963.

Fisher, R.A. and Tippett, L.H.C. "Limiting forms of the frequency distribution of the largest or smallest member of a sample," Proceedings, Cambridge Philosophical Society, Vol. 24, 1926.

Gringorten, I.I. "Envelopes for ordered observations applied to meteorological extremes," Journal of Geophysical Research, Vol. 63, No. 3, pages 815-826, February 1963.

Gumbel, E.J. "Statistical theory of droughts," Journal of Hydraulics Division, American Society of Civil Engineers, Vol. 80, May 1954.

Gumbel, E.J. Statistics of Extremes, Columbia University Press, 1958.

Hogben, N. "A comparison of log-normal and Weibull functions for fitting long-term wave height distributions in the North Atlantic," National Physical Laboratory, England, Ship Division, T.M.190, October 1967.

Jasper, N.H. "Statistical distribution patterns of ocean waves and of wave-induced ship stresses and motions with engineering applications," S.N.A.M.E., New York Meeting, Preprint No. 6, 1956.

Kendall, M.G. The Advanced Theory of Statistics, Charles Griffin and Co., London, Vol. 1, 1947.

Khanna, J. and Andru, P. "Lifetime wave height curve for Saint John Deep, Canada," Proceedings, International Symposium on Ocean Wave Measurement and Analysis, Vol. 1, pages 301-319, ASCE, New Orleans, September 1974.

Kimball, B.F. "On the choice of plotting positions on probability paper," Journal of the American Statistical Association, pages 546-560, 1960.

Ouellet, Y. "On the need of wave data for the design of rubble mound breakwaters," Proceedings, International Symposium on Ocean Wave Measurement and Analysis, Vol. 1, pages 500-522, ASCE, New Orleans, September 1974.

Petrauskas, C. and Aagaard, P. "Extrapolation of historical storm data for estimating design-wave heights," Journal of the Society of Petroleum Engineers, Vol. 11, pages 23-37, March 1971.

Simiu, E. and Filliben, J.J. "Probability distributions of extreme wind speeds," Journal of the Structural Division, American Society of Civil Engineers, Vol. 102, No. ST9, pages 1861-1877, September 1976.

St. Denis, M. "Determination of Extreme Waves," Topics in Ocean Engineering, Vol. 1, C.L. Bretschneider (ed.), pages 37, Gulf Publishing Co., Texas, 1969.

St. Denis, M. "Some cautions on the employment of the spectral technique to describe the waves of the sea and the response thereto of oceanic systems," Offshore Technology Conference, Houston, Paper OTC 1819, May 1973.

Thom, H.C.S. "Frequency of maximum wind-speeds," Journal of the Structural Division, American Society of Civil Engineers, Vol. 80, November 1954.

Thom, H.C.S. "Asymptotic extreme-value distributions of wave heights in the open ocean," Journal of Marine Research, Vol. 29, pages 19-27, 1971.

Thoman, D.R., Bain, L.J. and Antle, C.E. "Inferences on the parameters of the Weibull distribution," Technometrics, Vol. 11(3), pages 445-460, 1969.

Tucker, M.J. "Analysis of records of sea waves," Proceedings, Institution of Civil Engineers, London, Vol. 26, pages 305-316, 1963.
TABLES

TABLE I: PROBABILITY DISTRIBUTIONS AND THEIR PROPERTIES

DISTRIBUTION | RANGE | PROBABILITY FUNCTION P(h) | EXPECTED VALUE | VARIANCE
LOGNORMAL | $0<H<\infty$, $-\infty<\theta<\infty$, $0<\alpha<\infty$ | $\dfrac{1}{\alpha\sqrt{2\pi}} \int_0^h \dfrac{1}{h} \exp\left\{-\dfrac{1}{2}\left(\dfrac{\ln h - \theta}{\alpha}\right)^2\right\} dh$ | $\exp\{\theta + \alpha^2/2\}$ | $\exp\{2\theta + \alpha^2\}\,[\exp(\alpha^2) - 1]$
TYPE I | $-\infty<H<\infty$, $-\infty<\varepsilon<\infty$, $0<\theta<\infty$ | $\exp\{-\exp[-(h-\varepsilon)/\theta]\}$ | $\varepsilon + 0.57722\,\theta$ | $1.64493\,\theta^2$
TYPE II | $0<H<\infty$, $0<\alpha<\infty$, $0<\theta<\infty$ | $\exp\{-(h/\theta)^{-\alpha}\}$ | $\theta\,\Gamma(1 - 1/\alpha)$ | $\theta^2\{\Gamma(1 - 2/\alpha) - \Gamma^2(1 - 1/\alpha)\}$
TYPE III_U (UPPER BOUND) | $-\infty<H<\varepsilon$, $0<\alpha<\infty$, $0<\theta<\infty$ | $\exp\{-[(\varepsilon - h)/\theta]^{\alpha}\}$ | $\varepsilon - \theta\,\Gamma(1 + 1/\alpha)$ | $\theta^2\{\Gamma(1 + 2/\alpha) - \Gamma^2(1 + 1/\alpha)\}$
TYPE III_L (LOWER BOUND) | $\varepsilon<H<\infty$, $0<\alpha<\infty$, $0<\theta<\infty$ | $1 - \exp\{-[(h - \varepsilon)/\theta]^{\alpha}\}$ | $\varepsilon + \theta\,\Gamma(1 + 1/\alpha)$ | $\theta^2\{\Gamma(1 + 2/\alpha) - \Gamma^2(1 + 1/\alpha)\}$

TABLE II: CURVATURE PROPERTIES OF THE DISTRIBUTIONS

DISTRIBUTION | SLOPE | TAIL CURVATURE
Lognormal | positive | negative
Type I | $1/\theta$ | straight line
Type II | $+\alpha/H$ | negative curve, $-\alpha/H^2$
Type III_L | $\dfrac{\alpha}{\theta}\left(\dfrac{H-\varepsilon}{\theta}\right)^{\alpha-1}$ | $\dfrac{\alpha(\alpha-1)}{\theta^2}\left(\dfrac{H-\varepsilon}{\theta}\right)^{\alpha-2}$: negative for $\alpha<1$, straight for $\alpha=1$, positive for $\alpha>1$
Type III_U | $+\alpha/(\varepsilon-H)$ | positive curve, $\alpha/(\varepsilon-H)^2 \ge 0$

TABLE III: Shape Factors for the Fréchet Distribution

1/ALPHA | SHAPE FACTOR | ALPHA
0.0125 | 1.2161 | 80.00
0.0250 | 1.2970 | 40.00
0.0375 | 1.3827 | 26.67
0.0500 | 1.4739 | 20.00
0.0625 | 1.5713 | 16.00
0.0750 | 1.6757 | 13.33
0.0875 | 1.7883 | 11.43
0.1000 | 1.9103 | 10.00
0.1125 | 2.0433 | 8.89
0.1250 | 2.1893 | 8.00
0.1375 | 2.3505 | 7.27
0.1500 | 2.5302 | 6.67
0.1625 | 2.7324 | 6.15
0.1750 | 2.9621 | 5.71
0.1875 | 3.2265 | 5.33
0.2000 | 3.5351 | 5.00
0.2125 | 3.9015 | 4.71
0.2250 | 4.3456 | 4.44
0.2375 | 4.8974 | 4.21
0.2500 | 5.6051 | 4.00
0.2625 | 6.5509 | 3.81
0.2750 | 7.8872 | 3.64
0.2875 | 9.9327 | 3.48
0.3000 | 13.4835 | 3.33
0.3125 | 21.2472 | 3.20
0.3250 | 52.1732 | 3.08
0.3375 | -102.1849 | 2.96
0.3500 | -24.9316 | 2.86
0.3625 | -13.8496 | 2.76
0.3750 | -9.3816 | 2.67
0.3875 | -6.9457 | 2.58
0.4000 | -5.3957 | 2.50
0.4125 | -4.3085 | 2.42
0.4250 | -3.4908 | 2.35
0.4375 | -2.8404 | 2.29
0.4500 | -2.2965 | 2.22
0.4625 | -1.8180 | 2.16
0.4750 | -1.3691 | 2.11
0.4875 | -0.8994 | 2.05
0.5000 | -0.0018 | 2.00

TABLE IV: Shape Factors for the WEIBULL Distribution

1/ALPHA | SHAPE FACTOR | ALPHA
0.05 | -0.8680 | 20.000
0.10 | -0.6376 | 10.000
0.15 | -0.4357 | 6.667
0.20 | -0.2541 | 5.000
0.25 | -0.0872 | 4.000
0.30 | 0.0687 | 3.333
0.35 | 0.2167 | 2.857
0.40 | 0.3586 | 2.500
0.45 | 0.4963 | 2.222
0.50 | 0.6311 | 2.000
0.55 | 0.7640 | 1.818
0.60 | 0.8960 | 1.667
0.65 | 1.0279 | 1.538
0.70 | 1.1604 | 1.429
0.75 | 1.2941 | 1.333
0.80 | 1.4295 | 1.250
0.85 | 1.5674 | 1.176
0.90 | 1.7080 | 1.111
0.95 | 1.8521 | 1.053
1.00 | 2.0000 | 1.000
1.05 | 2.1523 | 0.952
1.10 | 2.3093 | 0.909
1.15 | 2.4718 | 0.870
1.20 | 2.6400 | 0.833
1.25 | 2.8146 | 0.800
1.30 | 2.9961 | 0.769
1.35 | 3.1851 | 0.741
1.40 | 3.3820 | 0.714
1.45 | 3.5875 | 0.690
1.50 | 3.8023 | 0.667
1.55 | 4.0269 | 0.645
1.60 | 4.2621 | 0.625
1.65 | 4.5086 | 0.606
1.70 | 4.7671 | 0.588
1.75 | 5.0385 | 0.571
1.80 | 5.3235 | 0.556
1.85 | 5.6230 | 0.541
1.90 | 5.9381 | 0.526
1.95 | 6.2697 | 0.513
2.00 | 6.6188 | 0.500

TABLE V: ESTIMATION OF CONFIDENCE INTERVALS FOR TYPE III_U PLOT

(1) DATA POINT | (2) RANK OF HIGHEST m | (3) TRUE MEAN FREQ. | (4) APPROX. MEAN FREQ. | (5) EQUIV. SAMPLE SIZE n' | (6) 60% CONFIDENCE, LOWER y | (6) 60% CONFIDENCE, UPPER y
1 | 1 | 1/2245 | | 2245 | 7.24 | 9.22
2 | 2 | 2/2245 | | 2245 | 6.62 | 7.91
3 | 8 | 8/2245 | 5/1400 | 1400 | 5.34 | 6.11
4 | 15 | 15/2245 | 5/750 | 750 | 4.71 | 5.49
5 | 24 | 24/2245 | 5/450 | 450 | 4.20 | 4.98
6 | 38 | 38/2245 | 5/295 | 295 | 3.77 | 4.55
7 | 64 | 64/2245 | 5/175 | 175 | 3.25 | 4.02
8 | 93 | 93/2245 | 5/120 | 120 | 2.87 | 3.64
9 | 133 | 133/2245 | 5/85 | 85 | 2.51 | 3.29

FIGURES

FIGURE 1: Typical Examples of Data on Lognormal Paper. NOTE: $N^{-1}\{P(h)\}$ is the value of variate which corresponds to an area equal to P(h) under the Standard Normal Distribution density curve.

FIGURE 2: Typical Density Curves for the Types I, II and Lognormal Distributions.

FIGURE 3: Typical Density Curves for the Type III Distribution.
FIGURE 4: Typical Wave Elevation Recording (labels: recording period, still water level, recording interval). NOTE: The wave height is measured from trough to crest.

FIGURE 5: A Bivariate Scatter Diagram (horizontal axis: period).

FIGURE 6: The Definition of Sample Tail.

FIGURE 7: Comparison of Tail Curvatures on Gumbel Paper.

FIGURE 8: Skewness of the Fréchet Distribution.

FIGURE 9: Shape Factor of the Weibull Distribution.

FIGURE 10: Method of Least Squares (a = slope, b = intercept).

FIGURE 11: Convergence of Least Squares Procedure for Type III_U Distribution.

FIGURE 12: Comparison Between the Approximate and Exact Confidence Intervals (least squares line; 25% confidence bands based on the distribution of ordered observations; 25% confidence bands based on the Normal approximation; horizontal axis: wave height).

FIGURE 13: Confidence Intervals on a Type I Plot.

FIGURE 14: Confidence Intervals on a Type II Plot. NOTE: Data from St. Denis [1969].

FIGURE 15: Confidence Intervals on a Type III Plot.

FIGURE 16: Confidence Intervals on Lognormal Plot. NOTE: i) $N^{-1}\{P(h)\}$ is the value of variate which corresponds to an area equal to P(h) under the Standard Normal Distribution density curve. ii) Data from St. Denis [1969].

FIGURE 17: Determination of Confidence Interval from Typical Distribution Density of the mth Observation.

FIGURE 18: The Curvature Test.

FIGURE 19: Type III Plot with Confidence Intervals.

FIGURE 20: Type III_U Plot with Confidence Intervals.
FIGURE 21: Curvature of the Lognormal Distribution on Type I Paper. NOTE: $N^{-1}\{P(h)\}$ is the value of variate which corresponds to an area equal to P(h) under the Standard Normal Distribution density curve.

FIGURE 22: The Relationship Between Return Period and Encounter Probability.

FIGURE 23: Typical Prediction Intervals (horizontal axis: wave height).

FIGURE 24: The Relationship Between P(h) and $F_m(h)$ for m = 1.

FIGURE 26: The Relationship Between P(h) and $F_m(h)$ for m = 3.

FIGURE 27: The Relationship Between P(h) and $F_m(h)$ for m = 4.

FIGURE 28: The Relationship Between P(h) and $F_m(h)$ for m = 5 (horizontal axis: sample size n).

APPENDIX

A1 Properties of the Type I Distribution

The cumulative probability function is given by

$P(h) = \exp\left\{-\exp\left[-\dfrac{h-\varepsilon}{\theta}\right]\right\}$ ....(68)

Let

$z = \dfrac{h-\varepsilon}{\theta}$

where $\varepsilon$ is a location parameter and $\theta$ is a scale parameter. Then

$P(z) = \exp\{-\exp(-z)\}$ ....(70)

The density function is then given by

$p(z) = \dfrac{d}{dz}P(z) = \exp\{-z\}\exp\{-\exp(-z)\}$

Type I density:

$p(z) = \exp\{-z - \exp[-z]\}$

A moment generating function is defined as

$M\{t\} = \int_{-\infty}^{\infty} \exp\{tz\}\,p(z)\,dz$

where t is a dummy variable. Setting $y = \exp(-z)$,

$M\{t\} = \int_0^{\infty} y^{-t}\exp\{-y\}\,dy = \Gamma(1-t)$

which is the Gamma function with argument (1-t).

The basic properties of the moment generating function [Bury 1975; page 44] give the kth moment about the origin

$m_k(z) = \left.\dfrac{d^k}{dt^k}M\{t\}\right|_{t=0}$

thus

$m_1(z) = \left.\dfrac{d}{dt}\Gamma(1-t)\right|_{t=0} = \gamma$ ....(74)

where $\gamma$ is Euler's Number (0.57722). Similarly higher moments are found to be

$m_2(z) = \gamma^2 + \dfrac{\pi^2}{6} = 1.97811$
$m_3(z) = 5.44487$
$m_4(z) = 23.56147$

The first four central moments are given as

$\mu_1(z) = 0$
$\mathrm{Var}(z) = \mu_2(z) = \dfrac{\pi^2}{6} = 1.64493$
$\mu_3(z) = 2.40411$
$\mu_4(z) = 14.61136$

The first shape factor is constant and given by

$\dfrac{\mu_3}{\mu_2^{3/2}} = 1.13955$

Since the shape factor is constant, the Type I distribution is only capable of having one shape which is shown in FIGURE 2.

The estimation of parameters used in Eqn. 68 is achieved by using the properties of the parameterless version, Eqn. 70. From Eqn. 71,

$E(z) = E\left\{\dfrac{h-\varepsilon}{\theta}\right\} = \gamma$

$m_1(h) = \varepsilon + 0.57722\,\theta$ ....(75)

Similarly, from $\mathrm{Var}(z) = m_2(z) - m_1^2(z)$,

$\mu_2(h) = \dfrac{\pi^2}{6}\,\theta^2$ ....(76)

Equations 75 and 76 form the basis of the method of moments described in SECTION 5.1.1.

A2 Properties of the Type II Distribution

The cumulative probability function is given by

$P(h) = \exp\left\{-\left(\dfrac{h}{\theta}\right)^{-\alpha}\right\}$ for $h \ge 0$, $\theta > 0$, $\alpha > 0$

where $\alpha$ is a shape parameter and $\theta$ is a scale parameter. The density is

$p(h) = \dfrac{\alpha}{\theta}\left(\dfrac{h}{\theta}\right)^{-(\alpha+1)} \exp\left\{-\left(\dfrac{h}{\theta}\right)^{-\alpha}\right\}$

The kth moment is defined as

$m_k(h) = \int_0^{\infty} h^k\,p(h)\,dh$

Setting $t = (h/\theta)^{-\alpha}$,

$m_k(h) = \theta^k \int_0^{\infty} t^{-k/\alpha}\,e^{-t}\,dt$

$m_k(h) = \theta^k\,\Gamma\{1 - (k/\alpha)\}$

The function $\Gamma\{1 - k/\alpha\}$ is discontinuous for all integer values of $(k/\alpha)$, and this result is only valid when $(k/\alpha) < 1$ [Gumbel 1958, pages 262-264]. This implies that when $\alpha$ is an integer, the expectation or moment of order $\alpha$ will not exist. In addition, when $\alpha \le 1$ the Type II distribution does not have a mean, and its variance cannot exist for $\alpha \le 2$. Thus, this distribution must be used with considerable care.

A3 Properties of the Type III_U Distribution

The cumulative probability function is

$P(h) = \exp\left\{-\left(\dfrac{\varepsilon - h}{\theta}\right)^{\alpha}\right\}$ for $\varepsilon - h > 0$, $\theta > 0$, $\alpha > 0$ ....(80)

where $\varepsilon$ is the highest wave ever possible, i.e. $P(\varepsilon) = 1.0$, and $\theta$ is a scale parameter.

The distribution may be simplified by setting $\lambda = (\varepsilon - h)$; the density then becomes

$p(\lambda) = \dfrac{\alpha}{\theta}\left(\dfrac{\lambda}{\theta}\right)^{\alpha-1} \exp\left\{-\left(\dfrac{\lambda}{\theta}\right)^{\alpha}\right\}$ ....(81)

Again defining the kth moment by

$m_k(\lambda) = \int_0^{\infty} \lambda^k\,p(\lambda)\,d\lambda$

and setting $t = (\lambda/\theta)^{\alpha}$ gives the gamma integral

$m_k(\lambda) = \theta^k \int_0^{\infty} t^{k/\alpha}\,e^{-t}\,dt$

$m_k(\lambda) = \theta^k\,\Gamma\{1 + (k/\alpha)\}$ ....(82)

The size of the moment is independent of $\varepsilon$. Unlike the moments of the Type II distribution, the moments of the Type III_U distribution are continuous. Equation 82 forms the basis of the method of moments described in SECTION 5.1.3.

The treatment for the Type III_L distribution is basically the same and is easily derived by the same method.
A4 Properties of the Lognormal Distribution

The lognormal model is obtained by using the logarithm of the height as a reduced variate, and applying the Normal model. The reduced variate $y = \ln(h)$ and the Normal probability function yield the Lognormal probability function

$$P(h) = \frac{1}{\sigma\sqrt{2\pi}} \int_0^{h} \frac{1}{x} \exp\left\{-\frac{1}{2}\left(\frac{\ln x - \mu}{\sigma}\right)^2\right\} dx$$

where $\mu$ is the scale parameter and $\sigma$ is a shape parameter. The density is given by

$$p_\ell(h) = \frac{1}{h\,\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left(\frac{\ln h - \mu}{\sigma}\right)^2\right\} \qquad \text{for } h > 0,\ \sigma > 0,\ -\infty < \mu < \infty \qquad (83)$$

The kth moment is given as

$$M_k(h) = \int_0^{\infty} h^k\, p_\ell(h)\, dh$$

Setting $y = \ln(h)$,

$$M_k(h) = \int_{-\infty}^{\infty} \exp\{ky\}\, p_n(y)\, dy \qquad (85)$$

where $p_n(y)$ is the Normal density. Equation 85 is the standard form of the moment generating function with argument k, giving

$$m_k(h) = \exp\left\{\mu k + \frac{k^2\sigma^2}{2}\right\}$$

The first moment is then

$$E\{h\} = m_1(h) = \exp\left\{\mu + \frac{\sigma^2}{2}\right\} \qquad (86)$$

and the central moments are

$$\mu_1(h) = 0$$

$$\mu_2(h) = [\exp(\sigma^2) - 1]\, \exp(2\mu + \sigma^2) \qquad (87)$$

$$\mu_3(h) = m_1^3\,(\gamma^6 + 3\gamma^4) \qquad (88)$$

where $\gamma$, the coefficient of dispersion, is given by

$$\gamma^2 = \mu_2 / m_1^2 = [\exp(\sigma^2) - 1]$$

The first shape factor is given by

$$\phi = \gamma^3 + 3\gamma \qquad (89)$$

and $\phi > 0$, indicating that all lognormal densities are skewed to the right. The simultaneous solution of Eqns. 86 and 87 leads to the method of moment estimators given in Eqns. 22 and 23 of SECTION 5.1.4.

A5 Curvature of the Type II Distribution

The ordinate scale of Type I paper is given by

$$y = -\ln\{-\ln P(h)\}$$

The Type II probability function is

$$P(h) = \exp\left\{-\left(\frac{h}{\theta}\right)^{-\alpha}\right\}$$

Thus the equation of a Type II line on Type I paper is

$$y = \alpha\,(\ln h - \ln \theta) \qquad (90)$$

which has slope

$$\frac{dy}{dh} = \frac{\alpha}{h}$$

and curvature

$$\frac{d^2y}{dh^2} = -\frac{\alpha}{h^2} < 0$$

Hence the Type II distribution will have a negative curvature when plotted on Type I paper.

A6 Curvature of the Type III_U Distribution

The Type III_U probability function is given by

$$P(h) = \exp\left\{-\left(\frac{\varepsilon - h}{\theta}\right)^{\alpha}\right\}$$

Again using Eqn. 1 as the ordinate of the Type I paper, the equation of a Type III_U line on Type I paper is

$$y = -\alpha \ln(\varepsilon - h) + \alpha \ln \theta \qquad (91)$$

which has slope

$$\frac{dy}{dh} = \frac{+\alpha}{(\varepsilon - h)}$$

and curvature

$$\frac{d^2y}{dh^2} = \frac{\alpha}{(\varepsilon - h)^2} > 0$$

Thus, a Type III_U distribution will have positive curvature when plotted on Type I paper.

A7 Curvature of the Type III_L Distribution

The Type III_L probability function is

$$P(h) = 1 - \exp\left\{-\left(\frac{h - \varepsilon}{\theta}\right)^{\alpha}\right\}$$

As $P(h) \to 1$,

$$\ln P(h) = \ln\left[1 - \exp\left\{-\left(\frac{h - \varepsilon}{\theta}\right)^{\alpha}\right\}\right] \approx -\exp\left\{-\left(\frac{h - \varepsilon}{\theta}\right)^{\alpha}\right\}$$

Using Eqn. 1 as the ordinate of the Type I paper, the equation of the Type III_L distribution becomes

$$y = \left(\frac{h - \varepsilon}{\theta}\right)^{\alpha} \qquad (92)$$

which has slope

$$\frac{dy}{dh} = \frac{\alpha}{\theta}\left(\frac{h - \varepsilon}{\theta}\right)^{\alpha - 1}$$

and curvature

$$\frac{d^2y}{dh^2} = \frac{\alpha(\alpha - 1)}{\theta^2}\left(\frac{h - \varepsilon}{\theta}\right)^{\alpha - 2}$$

Since $\theta$, $\alpha$ and $(h - \varepsilon)$ are positive,

$$\text{sign of } \frac{d^2y}{dh^2} = \text{sign of } (\alpha - 1)$$

Thus the curvature of the Type III_L distribution is
i) positive when $\alpha > 1$
ii) zero when $\alpha = 1$
iii) negative when $\alpha < 1$
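The curvature results of A5 to A7 can be illustrated numerically. The sketch below is an assumption-laden check rather than the thesis procedure: the function names, parameter values and the central-difference step are hypothetical. It maps each probability function through the Type I ordinate $y = -\ln\{-\ln P(h)\}$ and estimates $d^2y/dh^2$ by a central difference. For the Type III_L law the exact probability function is used, so the sign rule is only expected to hold in the upper tail, where $P(h) \to 1$.

```python
import math


def type1_ordinate(p):
    """Ordinate of Type I probability paper: y = -ln(-ln P)."""
    return -math.log(-math.log(p))


def curvature(cdf, h, step=1e-3):
    """Central-difference estimate of d2y/dh2 for y = type1_ordinate(cdf(h))."""
    y_minus = type1_ordinate(cdf(h - step))
    y_mid = type1_ordinate(cdf(h))
    y_plus = type1_ordinate(cdf(h + step))
    return (y_plus - 2.0 * y_mid + y_minus) / step ** 2


# Illustrative (hypothetical) parameter values are used as defaults below.
def type2_cdf(h, alpha=2.0, theta=1.0):
    return math.exp(-((h / theta) ** (-alpha)))


def type3u_cdf(h, alpha=2.0, theta=2.0, eps=10.0):
    return math.exp(-(((eps - h) / theta) ** alpha))


def type3l_cdf(h, alpha=2.0, theta=1.0, eps=0.0):
    # 1 - exp(-u), written with expm1 to limit rounding error near P = 1.
    return -math.expm1(-(((h - eps) / theta) ** alpha))


# Type II: expected curvature -alpha / h**2 (negative).
print("Type II   :", curvature(type2_cdf, 3.0), -2.0 / 3.0 ** 2)

# Type III_U: expected curvature alpha / (eps - h)**2 (positive).
print("Type III_U:", curvature(type3u_cdf, 7.0), 2.0 / (10.0 - 7.0) ** 2)

# Type III_L in the upper tail: sign expected to follow (alpha - 1).
print("Type III_L, alpha = 2.0:", curvature(type3l_cdf, math.sqrt(5.0)))
print("Type III_L, alpha = 0.5:",
      curvature(lambda h: type3l_cdf(h, alpha=0.5), 25.0))
```

For the Type II and Type III_U laws the numerical estimate should agree closely with the closed forms, since Eqns. 90 and 91 are exact. For the Type III_L law only the sign is compared, because Eqn. 92 is an approximation valid as $P(h) \to 1$.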
Title | The statistical estimation of extreme waves |
Creator | MacKenzie, Neil Grant |
Date Issued | 1979 |
Description | This thesis contains a review of existing statistical techniques for the prediction of extreme waves for coastal and offshore installation design. A description of the four most widely used probability distributions is given, together with a detailed discussion of the methods commonly used for the estimation of their parameters. Although several of these techniques have been in use for several years, it has never been satisfactorily shown which are capable of yielding the most reliable predictions. The main purpose of this thesis is to suggest a practical method of solving this problem and achieving the best estimate. The basic theory for the prediction of extreme values was described in detail by Gumbel (1958), who concentrated largely on the double exponential distribution which is named after him. In order to evaluate the quality of fit between this law and the data, Gumbel derived expressions which enabled one to plot confidence intervals to enclose the data. The method described in this thesis is partly an extension of Gumbel's work, and similar confidence interval methods are given for the remaining distributions, thus permitting direct comparisons to be drawn between their performances. The outcome of this is that the most reliable model of the data may be chosen, and hence the best prediction made. The method also contains a curvature test which has been devised to facilitate computation and lead more directly to the end result. The particular form of the wave data, which is quite different from wind records, is also taken into consideration and a working definition of the sample tail is suggested. |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2010-03-05 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0062860 |
URI | http://hdl.handle.net/2429/21534 |
Degree | Master of Applied Science - MASc |
Program | Civil Engineering |
Affiliation | Applied Science, Faculty of; Civil Engineering, Department of |
Degree Grantor | University of British Columbia |
Campus | UBCV |
Scholarly Level | Unknown |
AggregatedSourceRepository | DSpace |