THE STATISTICAL ESTIMATION OF EXTREME WAVES

by

NEIL GRANT MACKENZIE
B.Sc.(Hons), University of Newcastle-upon-Tyne, 1973

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE

in

THE FACULTY OF GRADUATE STUDIES
in the Department of CIVIL ENGINEERING

We accept this thesis as conforming to the required standard

The University of British Columbia
June, 1979
© Neil Grant MacKenzie, 1979

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5

ABSTRACT

This thesis contains a review of existing statistical techniques for the prediction of extreme waves for coastal and offshore installation design. A description of the four most widely used probability distributions is given, together with a detailed discussion of the methods commonly used for the estimation of their parameters. Although several of these techniques have been in use for a number of years, it has never been satisfactorily shown which are capable of yielding the most reliable predictions.
The main purpose of this thesis is to suggest a practical method of solving this problem and achieving the best estimate.

The basic theory for the prediction of extreme values was described in detail by Gumbel (1958), who concentrated largely on the double exponential distribution which is named after him. In order to evaluate the quality of fit between this law and the data, Gumbel derived expressions which enabled one to plot confidence intervals to enclose the data. The method described in this thesis is partly an extension of Gumbel's work, and similar confidence interval methods are given for the remaining distributions, thus permitting direct comparisons to be drawn between their performances. The outcome of this is that the most reliable model of the data may be chosen, and hence the best prediction made.

The method also contains a curvature test which has been devised to facilitate computation and lead more directly to the end result. The particular form of the wave data, which is quite different from wind records, is also taken into consideration and a working definition of the sample tail is suggested.

TABLE OF CONTENTS

Chapter
1. INTRODUCTION AND LITERATURE REVIEW
2. THE DISTRIBUTIONS AND THEIR PROPERTIES
   2.1 The Lognormal Distribution
   2.2 Asymptotic Distribution of the Extreme Value
   2.3 Gumbel Distribution - Type I
   2.4 Fréchet Distribution - Type II
   2.5 Weibull Distribution - Type III
3. PROCEDURE FOR COLLECTING THE DATA
   3.1 Forming the Sample
   3.2 Determination of Plotting Positions
   3.3 Definition of the Sample Tail
4. INITIAL SELECTION OF PARAMETRIC FAMILIES
   4.1 Curvature Properties
   4.2 The Curvature Test
5. METHODS OF PARAMETER ESTIMATION
   5.1 Method of Moments
      5.1.1 Gumbel Distribution - Type I
      5.1.2 Fréchet Distribution - Type II
      5.1.3 Weibull Distribution - Type III
      5.1.4 Lognormal Distribution
   5.2 Method of Least Squares
      5.2.1 Gumbel Distribution - Type I
      5.2.2 Fréchet Distribution - Type II
      5.2.3 Lognormal Distribution
      5.2.4 Weibull Distribution - Type III
   5.3 Method of Maximum Likelihood
      5.3.1 Gumbel Distribution - Type I
      5.3.2 Fréchet Distribution - Type II
      5.3.3 Weibull Distribution - Type III
      5.3.4 Lognormal Distribution
6. TESTS OF FIT BETWEEN THE DISTRIBUTION AND THE DATA
   6.1 Derivation of Confidence Intervals
   6.2 Asymptotic Distributions of the mth Statistic
   6.3 Approximate Distribution of the mth Statistic
   6.4 Limitations
   6.5 Method of Determining Critical Points
   6.6 Procedure for Plotting Confidence Intervals
   6.7 Examples of Confidence Intervals
7. METHODS OF PREDICTION
   7.1 Expected Significant Height
   7.2 Extreme Wave Period
   7.3 Maximum Wave Height
   7.4 Confidence Intervals for Prediction
   7.5 Encounter Probability and Waiting Time
8. CONCLUSIONS - A RECOMMENDED PROCEDURE
9. WORKED EXAMPLE
10. FUTURE WORK
BIBLIOGRAPHY

APPENDIX
A1 Properties of the Type I Distribution
A2 Properties of the Type II Distribution
A3 Properties of the Type III_U Distribution
A4 Properties of the Lognormal Distribution
A5 Curvature of the Type II Distribution
A6 Curvature of the Type III_U Distribution
A7 Curvature of the Type III_L Distribution

LIST OF FIGURES

Figure
1. Typical Examples of Data on Lognormal Paper
2. Typical Density Curves for the Types I, II and Lognormal Distributions
3. Typical Density Curves for the Type III Distributions
4. Typical Wave Elevation Recording
5. A Bivariate Scatter Diagram
6. The Definition of Sample Tail
7. Comparison of Tail Curvatures and Gumbel Paper
8. Skewness of the Fréchet Distribution
9. Shape Factor for the Weibull Distribution
10. Method of Least Squares
11. Convergence of Least Squares Procedure for Type III Distributions
12. Comparison Between the Approximate and Exact Confidence Bands
13. Confidence Intervals on a Type I Plot
14. Confidence Intervals on a Type II Plot
15. Confidence Intervals on a Type III Plot
16. Confidence Intervals on a Lognormal Plot
17. Determination of Confidence Interval from Typical Distribution Density of mth Observation
18. The Curvature Test
19. Type III_L Plot with Confidence Intervals
20. Type III_U Plot with Confidence Intervals
21. Curvature of the Lognormal Distribution on Type I Paper
22. The Relationship Between Return Period and Encounter Probability
23. Typical Prediction Intervals
24. The Relationship Between P(h) and P_m(h) for m = 1
25. The Relationship Between P(h) and P_m(h) for m = 2
26. The Relationship Between P(h) and F_m(h) for m = 3
27. The Relationship Between P(h) and F_m(h) for m = 4
28. The Relationship Between P(h) and F_m(h) for m = 5

LIST OF TABLES

Table
I Probability Distributions and Their Properties
II Curvature Properties of the Distributions
III Shape Factors for the Fréchet Distribution
IV Shape Factors for the Weibull Distribution
V Estimation of Confidence Intervals for Type III Plot

ACKNOWLEDGEMENT

The author takes this occasion to thank his supervisor, Dr. M. de St. Q. Isaacson, for the constant help and encouragement given. The typing and ingenious presentation of the text was the work of Ms. S. McLintock.
CHAPTER 1

INTRODUCTION AND LITERATURE REVIEW

The technique of plotting collected wave data on probability paper in order to predict the probable magnitude of extreme values has a firm place in engineering design. Although its use is widespread, a general method of selecting the most suitable probability model has not previously been suggested. Considerable attention has been given in design to accurately predicting the effect of a selected design wave on a structure. However, the process for selecting a design wave still remains comparatively unreliable, and represents a weak link within the design process.

In engineering practice and current literature only four distributions are commonly used for this model. These are the LOGNORMAL, GUMBEL (Type I), FRÉCHET (Type II), and WEIBULL (Type III) distributions. Each of these four distributions is actually a family of probability functions whose properties vary subtly as their parameters change in value. The first three are each defined by two parameters, conveniently called "shape" and "scale" parameters. The Type III distribution requires a third parameter for its definition. This is referred to as a "location" parameter and enables the Type III distribution to be used in two alternative forms. These are later designated the Type III_L and Type III_U distributions.

The success of the method described here depends upon achieving a close empirical fit between the model and the data. Depending on the data used, there is often a tendency for one distribution to be more suitable than the others, and hence able to give more reliable predictions. The systematic search for this distribution has received very little attention in the literature, and is the basic problem considered in this thesis.
Fisher and Tippett [1926] showed that there are only three asymptotic distributions, and that these describe the behaviour of the maximum value from any parent distribution as the sample size approaches infinity.

Gumbel [1958] developed the Type I distribution to a considerable degree as a tool for flood prediction. As a result of his work, this distribution has, until recently, been the most widely used throughout the various applications of extremal statistics. Gumbel also popularised the method of moments, and to some extent the method of least squares, for estimating the parameters of the Type I distribution. The former method was generally adopted since it could be carried out by hand, whereas the latter required a computer program.

Thom [1954] developed the Type II distribution for wind analysis and suggested using the method of maximum likelihood for estimating the parameters. Both the Type I and Type II distributions have been adopted by meteorologists for the prediction of extreme wind speeds in the United States and elsewhere.

Jasper [1956] suggested the use of the lognormal distribution for describing the occurrence of significant wave heights. This has been commonly used by many authors including Draper [1963], though in recent years its popularity may have diminished slightly.

Following the successes with wind speed prediction, Thom [1971] went on to advocate the application of the Type II distribution to wave heights. He argued that this distribution was superior to the Type I since it had a lower bound of zero height. He plotted data taken from several Ocean Station Vessels based in the Pacific and Atlantic Oceans, achieving a good fit in some cases. The data from these vessels were mainly based on visual estimates, and the vessels themselves were stationed in the deep waters of mid-ocean.
However, it is not unreasonable to expect the distributions of waves in shallower, more restricted sites to be rather different from those described.

There are two alternative forms of the Type III distribution, which are denoted the Type III_L and Type III_U distributions respectively. The Type III_L (Weibull lower-bound) distribution of minima (see TABLE I) was used in combination with the method of moments by Gumbel [1954] for estimating the worst drought occurring in a river. This was a natural choice since it enabled the engineer to place a lower limit $\varepsilon$ on the least flow ever possible in the river.

Bretschneider [1965] suggested the use of the Type III_L distribution for the short-term significant wave heights associated with a given storm. Hogben [1967] used data gathered in the mid-Atlantic to make a comparison between the Type III_L distribution and the lognormal distribution. He concluded that, although the latter gave a better fit for lesser wave heights, the Type III_L was superior for larger heights.

Battjes [1970], using instrument-recorded data from the mid-Atlantic and the Celtic Sea, found a strong departure in the data from the lognormal distribution for extreme wave heights. In this instance the lognormal distribution over-estimated the wave height for a specified probability of exceedance. He found that the Type III_L distribution gave a superior fit when a small positive value was used for $\varepsilon$. Generally $\varepsilon$ was less than one metre and represented the extent of background noise, which was always found to be present.

St. Denis [1973] suggested that the Type III_U (Weibull upper-bound, see TABLE I) distribution should be used for the description of wave heights in situations where a physical upper limit in the height could be expected. A typical case might be that of shoaling, or of a distinctly limited fetch.
No reports of its use have been found in the published literature, although its use has also been advocated by Borgman [1975]. This may be the result of difficulties surrounding parameter estimation. These are discussed in detail in CHAPTER 5 on the estimation of parameters.

Petrauskas and Aagaard [1971] described a computer method which enabled them to select the most suitable distribution for a data sample from eight chosen possibilities. These consisted of the Type I distribution together with seven Type III_L distributions, each with a different prescribed shape parameter. The process of parameter estimation was thereby simplified to one of determining two rather than three parameters for each of the eight cases. This was achieved by a direct least-squares approach. The resulting distribution was then plotted with "uncertainty intervals" to indicate the degree of error in prediction.

In a paper on rubble-mound breakwaters, Ouellet [1974] noted the wide variability in samples of wave data, and the need for a consistent approach to prediction from them. FIGURE 1, which is taken from his paper, shows five sets of data from different sources. It can be seen that in one case (Moffat Beach, Australia) the researcher did not fit a single straight line but used three straight sections. This implies that the sample was a mixture of data from three quite different lognormal populations. From the point of view of prediction this is quite undesirable, since only one-third of the sample could be used for long-term forecasts. Two other sets of data (Benghazi Harbour and Mangalore Harbour) develop pronounced curvature as the exceedance probability decreases. In CHAPTER 4, the role of this property is examined in detail.

In the next chapter the distributions are described in detail. Each subsequent chapter discusses a step in the derivation of a design wave, as indicated by its title.
The conclusion to this thesis describes a complete procedure, and a worked example is provided for demonstration.

CHAPTER 2

THE DISTRIBUTIONS AND THEIR PROPERTIES

The distributions described here are those most commonly used for extreme wave prediction, and an outline of their properties is given.

2.1 The Lognormal Distribution

The Lognormal Distribution is derived by transforming a variable to its logarithm before applying the normal distribution. This results in a density which only exists for a positive variable, as shown in FIGURE 2. If $Y$ is an $N(\mu, \sigma^2)$ variable, that is, it possesses a Normal distribution with mean $\mu$ and variance $\sigma^2$, then $X = \exp(Y)$ is a lognormal variable with parameters $\mu$ and $\sigma^2$. The density is given in TABLE I with $\alpha = \sigma$ and $\theta = \mu$.

The popularity of this distribution amongst coastal engineers is largely due to its connection with the normal distribution, and to the considerable flexibility rendered by the scale and shape parameters $\mu$ and $\sigma^2$ respectively. In other related fields, such as meteorology, the lognormal distribution is much less popular. It is possible that meteorologists feel justified in using only asymptotic distributions by the relative abundance of weather data.

Of the various distributions considered here, the lognormal is an exception in that it is not an asymptotic form. In other words, it does not limit its description to the tail of a parent distribution from which the body of data might be collected.

Lognormal Paper is constructed as follows:

a) The ordinate scale carries the Standard Normal Distribution critical points corresponding to the exceedance probability $Q(h)$. A critical point is the value of the variate which defines the lower limit of the area representing the exceedance probability $Q(h)$ under the density curve. The procedure has been described by Draper [1963].
b) The abscissa scale is simply the natural logarithm of the wave height.

True lognormal data will lie on a straight line whose slope and intercept will be determined by the two parameters $\alpha$ and $\theta$.

2.2 Asymptotic Distributions of the Extreme Value

Generally the distribution of data occurring within two standard deviations of the mean value is well described by the parent distribution. However, in many cases (for example the NORMAL DISTRIBUTION), the areas within the tail, corresponding to comparatively rare events, are difficult to calculate with precision. This problem does not arise in practice, since the distribution of the maximum value occurring within a sample from any parent family tends in distribution to one of the three asymptotic types as the sample size approaches infinity. The three types, namely the Gumbel, Fréchet, and Weibull distributions, all have cumulative distribution functions (TABLE I) which may be evaluated by pocket calculator instead of tables or a computer program.

In order to simplify the description of these distributions the following notation will be adopted:

$\alpha$ - shape parameter, which determines the basic shape of the density function.

$\theta$ - scale parameter, controlling the density scale or spread along the variate axis.

$\varepsilon$ - location parameter, locating the position of the density function on the variate axis. In the special case of the Types III_U and III_L distributions, $\varepsilon$ locates one end of the density function (see FIGURE 3).

2.3 Gumbel Distribution - Type I

The Type I distribution is the limiting form for maximum values taken from the exponential class of parent distributions, which includes the Normal, Exponential and Gamma distributions. The probability function is defined in TABLE I, and the density function is sketched in FIGURE 2.

This asymptotic distribution has become an accepted form for the prediction of extreme winds in the U.S.A., according to Simiu [1976].
It sets no upper limit on the intensity of the windspeed which may occur.

In common with the other two types, the GUMBEL PAPER has a linear ordinate scale given by

$$y = -\ln\{-\ln[1 - Q(h)]\} \tag{1}$$

where $Q(h)$ is the exceedance probability of a given wave height. The abscissa scale is simply the wave height, which need not be standardised before plotting.

2.4 Fréchet Distribution - Type II

The probability function of the Fréchet Distribution is defined in TABLE I and typical shapes of the density function are shown in FIGURE 2. The Type II asymptote is the limiting distribution of maximum values taken from the Cauchy class of parent distributions, which are not commonly used in engineering because their means and variances do not always exist. The Cauchy class generally have densities which are functions of the reciprocal of the intensity (wave height).

A useful property of the Type II asymptote is that its density decays more slowly than those of the other two asymptotes. This property has made the Type II distribution invaluable for the prediction of hurricane intensities in the U.S.A. (Simiu 1976).

Thom [1971] suggested that the Type II distribution is particularly suited to the description of wave heights. He argued that wave heights are bounded quantities since they cannot have negative values, and thus merit quite different treatment from temperatures and pressures, which are unbounded. The transformation from an unbounded variate to its extreme value is achieved by a translation, whereas a bounded variate is transformed by a change of scale. The Type II distribution may be considered as a Type I distribution in which the variate has been transformed to its logarithm, thus giving it a change of scale.

Fréchet Paper has a linear ordinate scale given by

$$y = -\ln\{-\ln[1 - Q(h)]\}$$

as before. The linear abscissa scale is now given by

$$x = \ln\{h\} \tag{2}$$

where $h$ is the wave height.
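The Gumbel and Fréchet papers share the ordinate of Eq. (1) and differ only in the abscissa. The following is a minimal Python sketch of the two coordinate transformations, offered as an illustration only; the function names are hypothetical and are not taken from the thesis.

```python
import math


def reduced_variate(q):
    """Ordinate of Eq. (1): y = -ln{-ln[1 - Q(h)]} for exceedance probability q."""
    if not 0.0 < q < 1.0:
        raise ValueError("Q(h) must lie strictly between 0 and 1")
    return -math.log(-math.log(1.0 - q))


def gumbel_paper_point(h, q):
    """(x, y) on Gumbel (Type I) paper: the abscissa is the wave height itself."""
    return h, reduced_variate(q)


def frechet_paper_point(h, q):
    """(x, y) on Frechet (Type II) paper: the abscissa is ln(h), Eq. (2)."""
    if h <= 0.0:
        raise ValueError("wave height must be positive to take its logarithm")
    return math.log(h), reduced_variate(q)
```

Rarer events (smaller Q) map to larger ordinates, so the tail of the sample appears at the top of either plot.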
2.5 Weibull Distribution - Type III

The Weibull probability functions are given in TABLE I, and typical density functions are shown in FIGURE 3. As already mentioned, two forms of the Weibull distribution are available. The upper bound distribution, Type III_U, is the distribution of maximum values taken from parent distributions with a finite tail length, such as the UNIFORM DISTRIBUTION. The lower bound distribution, Type III_L, is the distribution of minimum values from the same source.

The Type III_L distribution has been used to a certain extent as an empirical tool for wave height prediction. The second form of Type III does not appear to have been widely applied to the problem considered here.

Type III_U paper has the same ordinate scale as Types I and II, which is given by

$$y = -\ln\{-\ln[1 - Q(h)]\}$$

Its abscissa scale is dependent upon one of the three parameters, and so varies from one set of data to another:

$$x = -\ln\{\varepsilon - h\} \tag{3}$$

where $\varepsilon$ is the maximum wave height ever possible and is finite.

Type III_L Paper has a different ordinate scale from that used for the other asymptotic distributions, namely

$$y = +\ln\{-\ln Q(h)\} \tag{4}$$

and the abscissa scale is again parameter dependent:

$$x = +\ln\{h - \varepsilon\} \tag{5}$$

where $\varepsilon$ is the smallest wave height possible and $\varepsilon > 0$.

CHAPTER 3

PROCEDURE FOR COLLECTING WAVE DATA

An ideal data source would consist of a continuous wave height recording over a period of several years. In practice, the collection of a continuous sample is not feasible. The accepted alternative is systematic intermittent sampling, which consists of chart-recording over a short period of several minutes in each successive interval of a few hours. The recording period is often set at twenty minutes, for a recording interval of three hours. Generally, the sea state changes slowly enough for this to be representative, and a typical wave elevation recording is sketched in FIGURE 4.
Engineers concerned with the prediction of extreme winds generally use data which have been collected over a decade or more. These are usually available from nearby airports, which keep records of this length. It is rare for a coastal engineer to have an equally long record for wave heights at some locality. Wave heights are sensitive to physical influences such as fetch length and water depth. The engineer is often obliged to use recordings made at the location of interest in order to account for local effects, such as diffraction by coastline projections or refraction. A further difficulty arises because the engineer rarely knows the location of the project some years before the design is started, and hence is often forced to work with a short record. A final problem arises because there are usually practical difficulties in operating wave recorders accurately over long periods of time, since they usually have to be rigged with a buoy and anchor. Experience shows that the chance of a malfunction or loss of an instrument deployed in this fashion is quite high.

In cases where it becomes necessary to initiate a local wave-recording program, the record will rarely cover more than a year or two, except perhaps in long term wave research projects. More often the time period studied will be one year. Although a shorter study period would be expected to introduce seasonal variations, there is reason to suggest that, provided the winter months are covered in detail, no serious information loss should occur (see SECTION 3.3). In this extreme case the parameter estimation should be carried out by a least-squares approach (SECTION 5.2 et seq.). If summer months are omitted, it is often relatively simple to confirm that no storms more severe than those measured during the winter occurred.

The result of a wave recording program would be a series of representative wave heights, one for each recording interval, e.g. the significant heights, or the maximum height measured. The method of converting a continuous record into a series of statistics is not described here and has been well documented by Tucker [1963].

The basic form for the presentation of wave data is the bivariate histogram or "scatter diagram". This consists of a table in which the occurrences of significant wave height are divided, by frequency of occurrence, into intervals of wave period. The total number of occurrences is equal to the number of wave records (including calms) gathered during the study period (e.g. one year). FIGURE 5 provides an example of this, and from such a diagram one may assemble a table of height classes and their frequencies by summing over the wave periods. The resulting data are then used for plotting.

Although the methods of prediction used for extreme wind speeds are very similar to those used for wave heights, there is one basic difference in approach which results from the much shorter length of the wave record. Engineers concerned with the extreme wind speed usually have a sample containing one maximum speed for each year of the record. For the reasons just discussed, the wave prediction has often to be based on values occurring within a single year's recording. As one might expect, then, the wave prediction must lack the degree of precision of a wind prediction, and this is reflected in the width of the confidence intervals (see SECTION 7.4).

3.1 Forming the Sample

Sampling is a critical stage in any statistical analysis. Not only does it enable the statistician to reduce the vast universe of data into something which is both meaningful and manageable, but it also largely determines the shape of the questions which may be answered.
As far as a literature review could show, very few authors have attempted any method of forming the sample other than the one used by Draper [1963] and summarised below:

a) Each short length of wave recording corresponding to a recording period is reduced to a significant height $h_s$ and a zero crossing period $T_z$. The method proposed by Tucker [1961] is used. Each pair of statistics then applies to one recording interval of several hours.

b) To facilitate handling, a scatter diagram is prepared. This is a table of $h_s$ against $T_z$, both divided into classes, and each element is marked with the appropriate number of recording intervals (see FIGURE 5).

c) Each wave height class (0 - 1.99 ft., 2 - 3.99 ft. etc.) is summed over all classes of $T_z$ to give marginal frequencies of height.

d) The probability that the significant wave height may exceed the lower limit of any class is then calculated as

$$Q(h) = \frac{\text{Number of height values} \ge h}{1 + \text{total number of height values}} \tag{6}$$

where $h$ is the lower limit of each height class (0 ft., 2 ft. etc.) and $Q(h)$ is $\text{Prob}[H > h]$.

e) Paired values of $h$ and $Q(h)$ may then be used to plot the lower limit of each class onto a probability paper (e.g. Type III_L paper).

This approach is commonly used in statistics and was developed to fit a distribution to the body of the sample. It is particularly useful when estimating by the method of moments. However, it is rather non-specific in the way it achieves a fit and does not always give the engineer the type of fit he requires. Since the purpose of sampling wave heights is to arrive at reliable estimates of the rarely occurring wave, it seems unreasonable to concentrate on achieving a good fit at the median of the sample. In fact the quality of fit for wave heights which are exceeded almost daily is quite irrelevant to the problem considered here.
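Steps (c) and (d) of the sampling procedure above can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, and the scatter diagram is invented data (rows are height classes, columns are period classes), not taken from FIGURE 5.

```python
def marginal_height_counts(scatter):
    """Step (c): sum each height-class row over all period classes."""
    return [sum(row) for row in scatter]


def class_exceedance(lower_limits, counts):
    """Step (d), Eq. (6): Q(h) at the lower limit of each height class.

    Classes must be listed in order of increasing height.
    """
    total = sum(counts)
    remaining = total  # number of height values >= current lower limit
    pairs = []
    for h, count in zip(lower_limits, counts):
        pairs.append((h, remaining / (total + 1)))
        remaining -= count
    return pairs


# Invented scatter diagram: 4 height classes (rows) by 3 period classes.
scatter = [
    [20, 25, 5],
    [5, 20, 5],
    [0, 10, 5],
    [0, 1, 3],
]
lower_limits = [0.0, 2.0, 4.0, 6.0]  # lower class limits in feet
counts = marginal_height_counts(scatter)
pairs = class_exceedance(lower_limits, counts)
```

Because of the 1 added to the denominator in Eq. (6), the lowest class limit receives a probability slightly below one, which keeps every point plottable on the logarithmic probability scales.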
Clearly, a high quality fit in the vicinity of the tail of the parent distribution is required in order to predict events whose exceedance probabilities occupy this region. It would be most helpful if the "tail" of the sample could be defined so that a distribution could be fitted directly to this portion of the data. In practical applications of statistics it is quite common to define the finite end of a tail as being a fixed number of standard deviations from the mean. The result is an empirical rule of the form:

$$\text{Lowest height within tail} = \bar{h} + a\,S_h$$

where
$\bar{h}$ is the sample mean height,
$S_h$ is the sample standard deviation,
$a$ is a constant.

An alternative approach, which yields the constant $a$, is based on the method of calculating plotting positions and will be given in SECTION 3.3.

3.2 Determination of Plotting Positions

In order to plot the data on probability paper one must assign a fixed probability to each value in the sample. To do this the data are ordered according to height and the suffix $m$ is used to denote the position or RANK. Thus $m = 1$ corresponds to the largest value and $m = n$ to the smallest of a sample containing $n$ wave heights. A formula which has gained wide acceptance for calculating the plotting position is:

$$Q(h_m) = 1 - P(h_m) = \frac{m}{n+1} \tag{7}$$

It has been shown by Gumbel [1958] that the expected probability for the $m$th ordered observation is given by $m/(n+1)$ and that this is independent of the distribution. However, it has been demonstrated by Kimball [1960] and Gringorten [1963] that this formula tends to introduce a slight bias towards the distribution being estimated. Although it is possible to construct unbiased forms for the distributions considered here, such expressions would vary according to the parameters. Since the parameters still have to be estimated, this would lead to either approximate forms or to an iterative procedure.
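The plotting-position rule of Eq. (7) is simple to apply by hand, but a short Python sketch makes the ranking convention explicit. The function name is hypothetical and the sketch assumes one probability per individual wave height, with rank 1 assigned to the largest value.

```python
def plotting_positions(heights):
    """Return (h_m, Q(h_m)) pairs with Q(h_m) = m / (n + 1), Eq. (7).

    Rank m = 1 is the largest height and m = n the smallest.
    """
    ordered = sorted(heights, reverse=True)
    n = len(ordered)
    return [(h, m / (n + 1)) for m, h in enumerate(ordered, start=1)]
```

For a sample of three heights the exceedance probabilities are 1/4, 1/2 and 3/4, so no point is ever assigned a probability of exactly zero or one.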
The example given by Gringorten suggests that the bias introduced by the simple formula, Eq. (7), is small enough to be considered a second-order effect in comparison with those introduced by adopting different estimation methods or sampling procedures. The simple rule is invariably used for plotting and, for example, has been strongly recommended by Borgman [1975]. It will be adopted throughout the present study and the effects of alternative formulae are not examined.

It should be noted that the rank value $m$ is assigned to each individual wave height recorded but not directly to the height class limits. This may be seen in the worked example in CHAPTER 9. As a consequence of this the class limits in the example have the approximate ranks shown in TABLE III.

3.3 Definition of the Sample Tail

It has already been mentioned that for the prediction of extreme wave heights a good fit in the tail of the data is of considerable importance, i.e. the distribution should give a good fit to the worst of the extreme measurements made. It is therefore convenient to define the sample tail.

In SECTION 3.1 one simple method of defining the sample tail was mentioned. An alternative approach may be based on the fact that for all three asymptotic distributions a function of probability provides the ordinate scale for plotting. The ordinate of the $m$th wave height statistic is given by

$$y_m = -\ln\{-\ln[1 - Q(h_m)]\}$$

where $Q(h_m)$ is calculated as in SECTION 3.2. A plot of this function against $Q(h_m)$ describes the distortion applied to obtain the scale along the ordinate and is shown in FIGURE 6(a). The gradient of the curve is given by

$$\frac{dy}{dQ} = \{(1-Q)\ln(1-Q)\}^{-1} \tag{8}$$

and this is plotted against $Q(h_m)$ in FIGURE 6(b). As $Q(h_m)$ approaches the median value of 0.5, the gradient decreases and becomes almost constant for values less than, say, 0.1.
The vertical distance between plotted points is controlled directly by this gradient and hence a limiting value for $Q(h_m)$ of 0.1 is chosen to locate the start of the tail. The extent of the tail is then determined by the position of the sample wave height having a rank w, where

\[ w = \frac{n+1}{10} \tag{9} \]

and n is the number of wave heights in the record.

As a result of this procedure only 10% of the original sample is used for estimation. The bulk of the data, which is not used in the fitting procedure, should contain all measurements made during the summer months of lower storm activity. As a result, it is no longer necessary to rely on precise measurements during these periods of low storm activity. Thus gaps in the record for these months need not be serious, and often this may be confirmed by an inspection of local meteorological records. In the rare cases when a valuable piece of data is missed it may be possible to estimate the approximate number of recording intervals involved and their order within the sample. This has been suggested in connection with similar applications by Borgman [1961].

CHAPTER 4

INITIAL SELECTION OF PARAMETRIC FAMILY

To carry out a detailed analysis using each of the distributions in turn would be tedious. To base the final selection of a distribution solely on the width of the confidence intervals could be difficult and misleading, often resulting in solutions which appear to fit well in the extreme tail but poorly elsewhere. In order to eliminate these procedures it is convenient to make use of a simple property which is shared to a differing degree by all distributions. Such a property is the curvature of a distribution when plotted on Type I paper. The literature often shows noticeable curvature of data as it approaches the tail of the distribution. In many cases this curvature is detectable to some degree, e.g. FIGURE 1.
Typical examples may be found in papers by Khanna and Andru [1974], and Ovellet [1974].

4.1 Curvature Properties

When a Type II distribution is represented on any pair of axes other than those used to form a Fretchet plot, the result will be a curved line. This principle applies to all the distributions considered, and forms the basis of the curvature test suggested here. The comparison is made by examining the curvature of each type of distribution when plotted on Gumbel paper (Type I). A typical result is shown in FIGURE 7; the actual degree of curvature for each distribution depends upon the parameters used.

The curvature is defined as the second derivative

\[ \text{CURVATURE} = \frac{d^2 y}{dx^2} \tag{10} \]

where

\[ y = -\ln\{-\ln[1 - Q(h)]\}, \qquad x = h. \]

The resulting slope and curvature relationships are summarized in TABLE II and are briefly described below.

TYPE I - remains a straight line.

TYPE II - develops strong negative curvature, and the tail decays more slowly than any of the other distributions.

TYPE III_U - develops the strongest positive curvature, which enables it to achieve a finite limiting wave height (i.e. one which has an exceedance probability of zero).

NOTE: As the parameter $\alpha$ approaches infinity both the Types II and III_U become straight lines, i.e. Type I.

TYPE III_L - may develop either negative or positive curvature depending on the size of $\alpha$ (see TABLE II). In the special case of $\alpha = 1$ the line becomes straight. The relative flexibility of the tail of this distribution makes it useful for spanning the gap between Types II and III_U in the ranges where their tails become inflexible. The Type III_L distribution has a very wide range of curvature, and might often be an acceptable choice even without the curvature test.

LOGNORMAL - behaves in a similar fashion to Type II, though developing relatively mild curvatures.
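The signs of curvature listed above can be checked numerically by evaluating the Type I ordinate for each distribution and taking a central second difference with respect to h. This is a sketch under stated assumptions: the functional forms of the four exceedance probabilities below are the commonly quoted ones and are assumed here rather than copied from TABLE I, and all parameter values are hypothetical.

```python
import math

def gumbel_ordinate(q):
    """y = -ln(-ln(1 - Q)), the Type I paper ordinate of Eqn. 10."""
    return -math.log(-math.log1p(-q))

# Exceedance probabilities Q(h) = 1 - P(h); the forms are assumptions.
def q_type1(h, eps, theta):
    return -math.expm1(-math.exp(-(h - eps) / theta))

def q_type2(h, theta, alpha):
    return -math.expm1(-((h / theta) ** (-alpha)))

def q_type3_upper(h, eps, theta, alpha):      # valid for h < eps
    return -math.expm1(-(((eps - h) / theta) ** alpha))

def q_type3_lower(h, eps, theta, alpha):      # valid for h > eps
    return math.exp(-(((h - eps) / theta) ** alpha))

def curvature(q_func, h, dh=0.05):
    """Central-difference estimate of d2y/dx2 with x = h (Eqn. 10)."""
    y = lambda x: gumbel_ordinate(q_func(x))
    return (y(h + dh) - 2.0 * y(h) + y(h - dh)) / dh ** 2

# Hypothetical parameters, each evaluated where Q is roughly 0.06.
cases = {
    "Type I": curvature(lambda h: q_type1(h, 3.0, 1.0), 6.0),
    "Type II": curvature(lambda h: q_type2(h, 3.0, 4.0), 6.0),
    "Type III_U": curvature(lambda h: q_type3_upper(h, 10.0, 5.0, 3.0), 8.0),
    "Type III_L, alpha = 2": curvature(lambda h: q_type3_lower(h, 0.0, 3.0, 2.0), 5.0),
    "Type III_L, alpha = 0.5": curvature(lambda h: q_type3_lower(h, 0.0, 3.0, 0.5), 23.5),
}
for name, value in cases.items():
    print(f"{name}: {value:+.4f}")
```

Under these assumed forms the Type I value is zero to within rounding, Type II is negative, Type III_U is positive, and Type III_L changes sign with the shape parameter, in line with the description above.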
The curvature relationships for Types I, II, III_U and III_L are relatively simple to derive and are given in Appendices A.5 - A.7. However, the lognormal distribution's behaviour is most easily demonstrated graphically, FIGURE 21. Although some overlap in curvature is expected, particularly between the lognormal and the Type III_L, both are retained as possibilities.

In order to make an initial choice of the distributions to be studied in detail, three groups may be used.

POSITIVE GROUP - Types III_U, III_L ($\alpha > 1$)
STRAIGHT GROUP - III_L ($\alpha = 1$), LOGNORMAL, I
NEGATIVE GROUP - LOGNORMAL, III_L ($\alpha < 1$), II

The curvature test does not provide any method of selecting a distribution from within one of the groups just mentioned. However, this may be achieved by using the method of confidence intervals which is described in CHAPTER 6.

4.2 The Curvature Test

A simple procedure for selecting one of the three groups may now be used.

i) Plot the tail of the data onto Type I (Gumbel) paper with the axes shown in FIGURE 7. The plotting procedure is that described in SECTION 3.2.

ii) The presence, type and degree of curvature is then assessed by eye and leads to a choice of one of the three groups.

Since the data is assembled for plotting by the method of SECTIONS 3.2 and 3.3, all points should occur within the parent distribution's tail, and hence one's decision would be based upon an 'overall' curvature for all of the plotted points.

CHAPTER 5

METHODS OF PARAMETER ESTIMATION

Each of the four distributions considered here is actually a family of different distributions which have widely different properties depending upon the parameter values. Having chosen one of the four distributions as a likely model, it remains to find the parameter values which fit that distribution most closely to the data. In both the fields of wave and wind prediction, three methods of estimation have been adopted.
i) Method of Moments
ii) Method of Least Squares
iii) Method of Maximum Likelihood

Each of these three methods may give a different estimate of the parameters based on the same sample. It should be noted that all three methods provide a 'point estimate', i.e. a single parameter value for each sample. Strictly speaking then, the estimates are themselves random variables, though they are treated as if they were stationary. It is mentioned in passing that the method of fitting a line by eye can give results comparable to those obtained by the method of least squares.

A brief description of each method and its application to the distributions is now given. The estimated values of parameters are indicated by a hat '^', and parameter notation is as used in TABLE I.

5.1 Method of Moments

The method of moments operates by successively approximating the shape of the model distribution to that of the sample histogram. This is achieved by equating the first k moments to give one equation for each of the k parameters required. The first three or four moments tend to exert the strongest influence on the shape of a distribution and so the procedure often leads to a reasonable model. One disadvantage of this method is that it uses all the collected data and does not emphasise the role played by the distribution tail.

5.1.1 Gumbel Distribution - Type I

The Type I distribution has two parameters (see TABLE I). The derivation of the moments and their properties are given in Appendix A.1. The two equations resulting from this procedure are

\[ \bar{H} = \hat{\varepsilon} + \gamma\,\hat{\theta} \tag{11} \]

and

\[ H_2 - (\bar{H})^2 = \frac{\pi^2}{6}\,\hat{\theta}^2 \tag{12} \]

where

\[ \bar{H} = \frac{1}{n}\sum_{m=1}^{n} h_m\,; \qquad H_2 = \frac{1}{n}\sum_{m=1}^{n} h_m^2 \tag{13} \]

$\gamma$ is Euler's constant (0.57722), and ^ denotes an estimated parameter. Hence the estimated values of the two parameters are:

\[ \hat{\varepsilon} = \bar{H} - 0.4501\,\sqrt{H_2 - (\bar{H})^2} \tag{14} \]

and

\[ \hat{\theta} = 0.7797\,\sqrt{H_2 - (\bar{H})^2} \tag{15} \]
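Eqns. 13 to 15 translate directly into a few lines of code. The sketch below is illustrative only: the function name and the sample values are assumptions, and the constants 0.7797 and 0.4501 are recovered as sqrt(6)/pi and 0.57722 x sqrt(6)/pi respectively.

```python
import math

EULER_GAMMA = 0.57722  # Euler's constant, as quoted in the text

def gumbel_moment_estimates(heights):
    """Method-of-moments estimates (eps_hat, theta_hat) for Type I."""
    n = len(heights)
    h_bar = sum(heights) / n                      # first moment, Eqn. 13
    h_2 = sum(h * h for h in heights) / n         # second moment, Eqn. 13
    spread = math.sqrt(h_2 - h_bar * h_bar)
    theta_hat = math.sqrt(6.0) / math.pi * spread   # Eqn. 15 (0.7797 * spread)
    eps_hat = h_bar - EULER_GAMMA * theta_hat       # Eqn. 14 (0.4501 * spread)
    return eps_hat, theta_hat

# Hypothetical significant wave heights (metres), for illustration only.
sample = [2.1, 3.4, 1.8, 5.2, 4.0, 2.9, 3.7, 6.1, 2.5]
eps_hat, theta_hat = gumbel_moment_estimates(sample)
print(eps_hat, theta_hat)
```

Because every observation enters the two sample moments with equal weight, the sketch also makes plain why the method cannot be restricted to the sample tail.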
Both $\hat{\varepsilon}$ and $\hat{\theta}$ are random variables whose values depend upon the particular random sample used for their estimation. The method of moments estimator for $\varepsilon$ has a variance which is only 5% larger than that obtainable by the more complicated method of maximum likelihood. However, this method gives an estimator for $\theta$ with a variance which is 80% greater than that obtainable by the method of maximum likelihood, and thus the resulting value of $\hat{\theta}$ tends to be unreliable.

It should be noted that the moment estimators for Type I are relatively straightforward to use since this distribution has the same shape for all parameter values.

5.1.2 Fretchet Distribution - Type II

The Type II distribution again has two parameters. However, one of these, $\alpha$, controls the basic shape of the distribution and this parameter must be estimated first. A commonly used method for determining the shape is to equate the skewness of the sample to that of the model. The skewness of a distribution is defined by:

\[ S = \frac{\mu_3}{\mu_2^{3/2}} \tag{16} \]

where $\mu_2$ and $\mu_3$ are the second and third central moments. The skewness of the Type II distribution is shown in FIGURE 8 as a function of $\alpha$. It can be seen that in the region $\alpha > 5$ the skewness becomes increasingly insensitive to $\alpha$ and is liable to provide an inefficient estimation.

The estimation of the two parameters is achieved by the following method:

i) The sample skewness is calculated as

\[ \hat{S} = [H_3 - 3\bar{H}H_2 + 2(\bar{H})^3]\,[H_2 - (\bar{H})^2]^{-3/2} \tag{17} \]

where

\[ H_3 = \frac{1}{n}\sum_{m=1}^{n} h_m^3 \tag{18} \]

and $\bar{H}$, $H_2$ are defined in Eqn. 13.

ii) The estimated value of the shape parameter $\hat{\alpha}$ is obtained from FIGURE 8.

iii) The scale parameter is then calculated from

\[ \hat{\theta} = \bar{H} \,/\, \Gamma\{1 - 1/\hat{\alpha}\} \tag{19} \]

where $\Gamma\{\,\}$ is the gamma function, which is available in tabulated form, e.g. Abramowitz and Stegun [1970].

5.1.3 Weibull Distribution - Type III

The Weibull distribution requires the estimation of three parameters $\alpha$, $\varepsilon$ and $\theta$.
Since $\alpha$ controls the basic shape of the density function this parameter takes precedence in the estimation process. The skewness, which is formed from the second and third central moments (Eqn. 16), is found to provide satisfactory estimations, as for the Type II distribution, for $\alpha < 20$, but for higher values it becomes asymptotic and loses its sensitivity to $\alpha$. This presents no real problem since the range of sensitivity has been found to be more than adequate for this study. The skewness is shown plotted against $\alpha$ in FIGURE 9.

Since there are three parameters to estimate, the method of moments requires equations involving the first three moments. These may be central moments or moments about the origin (or a combination of the two types). The derivation of the Weibull moments is given in Appendix A.3. The method then reduces to the following steps:

i) Calculate the sample skewness from Eqns. 13, 17 and 18.

ii) The estimated value of the shape parameter $\hat{\alpha}$ is obtained from FIGURE 9. It should be noted that for the Type III_U distribution the sign of the skewness should be changed before executing ii). This adjustment is not required for the Type III_L distribution.

iii) Solve for $\hat{\theta}$ from

\[ \hat{\theta} = \left\{ [H_2 - (\bar{H})^2]\,\left[\Gamma(1 + 2/\hat{\alpha}) - \Gamma^2(1 + 1/\hat{\alpha})\right]^{-1} \right\}^{1/2} \tag{20} \]

iv) Solve for $\hat{\varepsilon}$ from

\[ \hat{\varepsilon} = \bar{H} + \hat{\theta}\,\Gamma(1 + 1/\hat{\alpha}) \tag{21} \]

5.1.4 Lognormal Distribution

The lognormal distribution has two parameters $\mu$ and $\sigma$. In the transformation from a normal to a lognormal distribution their properties change: $\mu$ changes from a location parameter to a scale parameter $\theta$, while $\sigma$ changes from a scale parameter to the shape parameter $\alpha$. The normal density has only one standardised shape; the lognormal, however, can assume a variety of shapes depending upon the value of the shape parameter $\alpha$.

The two moment estimators are given by

\[ \hat{\alpha} = \left[\ln\{H_2\} - 2\ln\{\bar{H}\}\right]^{1/2} \tag{22} \]

and

\[ \hat{\theta} = 2\ln\{\bar{H}\} - \tfrac{1}{2}\ln\{H_2\} \tag{23} \]

The use of these is straightforward, but they can be expected to perform poorly if the density is skewed too highly, i.e. best results will occur when the sample histogram of ln(H) is nearly symmetrical about the mean value, Bury [1975; p. 281].

5.2 Method of Least-Squares

It was recommended in SECTION 3.3 that priority be given to fitting a line to the data occurring within the tail rather than using the entire sample. Of the procedures discussed in this chapter, the method of least-squares is the only one which may be used effectively with a portion of the sample, and it forms an important part of the approach recommended in this thesis.

Since all the types of probability paper described here give a straight-line plot for their family of distributions, it is feasible to use the linear version of the method of least-squares when fitting a line to the data. The method in its basic form is directly applicable to the lognormal, Type I and Type II distributions and is described below. However, a modification is required for the Type III distribution since its abscissa scale is dependent upon the parameter $\varepsilon$, which is to be estimated.

Two Parameter Estimation

The least-squares method for determining the line of best fit to a group of data points is well known. In FIGURE 10 a series of data points are to be fitted by a line of slope a and intercept b. The vertical distance from a data point to the line is given by

\[ r = |y_i - a x_i - b| \tag{24} \]

The sum of the squared distances is

\[ q = \sum_{i=1}^{N} (y_i - a x_i - b)^2 \tag{25} \]

The method of least squares selects values of a and b which minimize q:

\[ \frac{\partial q}{\partial a} = -2\sum_{i=1}^{N} (y_i - a x_i - b)\,x_i = 0 \tag{26} \]

\[ \frac{\partial q}{\partial b} = -2\sum_{i=1}^{N} (y_i - a x_i - b) = 0 \tag{27} \]

where i is the point index and N is the number of points.
Whence

\[ a = \left[ N\sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i \right] \left[ N\sum_{i=1}^{N} x_i^2 - \Big(\sum_{i=1}^{N} x_i\Big)^2 \right]^{-1} \tag{28} \]

and

\[ b = \left[ \sum_{i=1}^{N} y_i \sum_{i=1}^{N} x_i^2 - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} x_i y_i \right] \left[ N\sum_{i=1}^{N} x_i^2 - \Big(\sum_{i=1}^{N} x_i\Big)^2 \right]^{-1} \tag{29} \]

Thus the slope and the intercept may be calculated directly from a table of wave heights and their frequencies of occurrence.

5.2.1 Gumbel Distribution - Type I

The Type I distribution uses the general method described earlier. Type I paper uses an ordinate scale of

\[ y = -\ln\{-\ln[1 - Q(h)]\} \tag{30} \]

and an abscissa scale of

\[ x = h \]

The data plotting positions are calculated by the method in SECTION 3.2 and the least-squares estimates are:

\[ \hat{\theta} = 1/a \tag{31} \]

\[ \hat{\varepsilon} = -b/a \tag{32} \]

where a and b are the slope and intercept.

5.2.2 Fretchet Distribution - Type II

The basic method is applied to the Type II distribution, which has the same ordinate scale as Type I (Eqn. 30), but with an abscissa scale:

\[ x = \ln h \]

The treatment is the same as used for Type I, and yields the following estimations

\[ \hat{\alpha} = a \tag{33} \]

\[ \hat{\theta} = \exp\{-b/a\} \tag{34} \]

where a and b are the slope and intercept respectively.

5.2.3 Lognormal Distribution

Lognormal paper has, as its ordinate scale, critical points of probability from the standard Normal distribution (SECTION 2.1). If

\[ P(Z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{Z} \exp\left(-\tfrac{1}{2}t^2\right) dt \tag{35} \]

then the ordinate scale is given by

\[ y = Z \]

The abscissa scale is given by

\[ x = \ln H \]

P(Z) is tabulated as NORMAL PROBABILITIES and is available in any statistical text.

Application of the basic method to data plotted according to SECTION 3.2 gives the estimators:

\[ \hat{\alpha} = 1/a \tag{36} \]

\[ \hat{\theta} = -b/a \tag{37} \]

5.2.4 Weibull Distribution - Type III

The three parameter Weibull distribution has the same ordinate scale as the Types I and II distributions, but has an abscissa scale which is itself dependent upon one of the parameters, $\varepsilon$, to be estimated. See FIGURE 15 and SECTION 2.0. Hence, without prior knowledge of $\varepsilon$ the data cannot even be plotted. An iterative least squares procedure must be adopted to overcome this problem.

Using the least squares criterion, three equations must be analyzed.
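Before the three-parameter case is taken up, the two-parameter fit of Sections 5.2.1 to 5.2.3 can be illustrated for Type I, restricted to the sample tail of SECTION 3.3. This is a minimal sketch under stated assumptions: the Type I line is taken as y = (h - eps)/theta, so that theta = 1/a and eps = -b/a; the tail rank w = (n+1)/10 is rounded down; and the function names are hypothetical. At least two tail points are needed, so the sample must contain 19 or more heights.

```python
import math

def least_squares_line(xs, ys):
    """Slope a and intercept b of the best-fit line (Eqns. 28 and 29)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / denom
    b = (sy * sxx - sx * sxy) / denom
    return a, b

def fit_type1_tail(heights):
    """Least-squares Type I fit to the tail, ranks m <= (n + 1) / 10."""
    ordered = sorted(heights, reverse=True)
    n = len(ordered)
    w = int((n + 1) / 10)              # tail rank of Eqn. 9, rounded down (assumption)
    xs, ys = [], []
    for m in range(1, w + 1):
        q = m / (n + 1)                # plotting position, Eqn. 7
        xs.append(ordered[m - 1])      # abscissa x = h
        ys.append(-math.log(-math.log(1.0 - q)))   # ordinate, Eqn. 30
    a, b = least_squares_line(xs, ys)
    # Assumed Type I line y = (h - eps) / theta, hence theta = 1/a, eps = -b/a.
    return -b / a, 1.0 / a
```

For the Type II and lognormal cases only the abscissa (ln h) and, for the lognormal, the ordinate change; the line fit itself is unchanged.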
This is achieved by the basic method of minimizing the sum of the squared errors as described previously. For the Type III_U distribution the three resulting equations now become

\[ a = \left[ N\sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i \right] \left[ N\sum_{i=1}^{N} x_i^2 - \Big(\sum_{i=1}^{N} x_i\Big)^2 \right]^{-1} \tag{38} \]

\[ b = \left[ \sum_{i=1}^{N} y_i \sum_{i=1}^{N} x_i^2 - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} x_i y_i \right] \left[ N\sum_{i=1}^{N} x_i^2 - \Big(\sum_{i=1}^{N} x_i\Big)^2 \right]^{-1} \tag{39} \]

and r = 0, where

\[ r = a\sum_{i=1}^{N} \frac{y_i}{\varepsilon - H_i} - ab\sum_{i=1}^{N} \frac{1}{\varepsilon - H_i} + a^2\sum_{i=1}^{N} \frac{\ln(\varepsilon - H_i)}{\varepsilon - H_i} \tag{40} \]

\[ x_i = -\ln(\varepsilon - H_i), \qquad y_i = -\ln\{-\ln[1 - Q(h_i)]\} \]

Similarly, three equations may be assembled for the Type III_L distribution. The procedure for the Type III_U now becomes:

i) Select an initial value $\varepsilon_0$. This may be the largest measured wave height $H_{(1)}$.

ii) Calculate a and b using Eqns. 38 and 39.

iii) Calculate r from Eqn. 40 and check for a solution, which occurs when r = 0.

iv) Increase $\varepsilon$ by an increment $\Delta\varepsilon$, $\varepsilon_1 = \varepsilon_0 + \Delta\varepsilon$, and repeat the procedure until the value of $\varepsilon$ which gives r = 0 is found.

The least-squares estimation of the Type III parameters is achieved by computer program, and the convergence of r as $\varepsilon$ is increased is shown for typical wave data in FIGURE 11.

5.3 Method of Maximum Likelihood

The method of maximum likelihood attempts to provide estimated parameters (MLEs) which would give the data sample the highest probability of being observed in its particular form. The relative sizes of each of the data values play a fundamental part in the method, though their order is not important, and for this reason the method is unsuited for fitting a distribution specifically to the sample tail.

The random sample is considered to consist of a series of independent observations from the same distribution. The probability of the intersection of these events is then the product of their individual probabilities. The likelihood function is defined as

\[ L = \prod_{m=1}^{n} p(h_m) \tag{41} \]

where p is the density of the parametric family, e.g. Type I, $h_m$ are the individual wave heights, and n is the total number of wave heights.
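The method of maximum likelihood then selects values for each parameter which maximise the likelihood function. Since most of the common density functions have an exponential form, this procedure is simplified by maximising the logarithm of the likelihood function.

MLEs have an important advantage over all other estimators in that they can yield unbiased estimators with minimum variance. This results in a comparatively efficient use of the data and estimates which are more likely to be close to their true values. Against this quality lies the consideration that for two or more parameters they are troublesome to solve, requiring lengthy iterative manipulation. Furthermore the method uses the entire sample of wave heights and thus is unsuitable for studies directed specifically at the distribution tail. As a result the MLEs are rarely used in offshore engineering since it is usually felt that their drawbacks outweigh the principal advantage.

For completeness, a short description of the maximum likelihood procedure follows for each distribution.

5.3.1 Gumbel Distribution - Type I

The likelihood function is given by

\[ L(h;\varepsilon,\theta) = \theta^{-n} \exp\left\{ -\sum_{m=1}^{n} \left[ \frac{h_m - \varepsilon}{\theta} + \exp\left(-\frac{h_m - \varepsilon}{\theta}\right) \right] \right\} \tag{42} \]

and by setting

\[ \frac{\partial}{\partial\varepsilon} \ln L = 0 \quad \text{and} \quad \frac{\partial}{\partial\theta} \ln L = 0 \]

two equations in $\hat{\varepsilon}$ and $\hat{\theta}$ can be obtained:

\[ \frac{1}{n}\sum_{m=1}^{n} \exp\left\{-\frac{h_m - \hat{\varepsilon}}{\hat{\theta}}\right\} = 1 \tag{43} \]

and

\[ \hat{\theta} = \bar{H} - \frac{1}{n}\sum_{m=1}^{n} h_m \exp\left\{-\frac{h_m - \hat{\varepsilon}}{\hat{\theta}}\right\} \tag{44} \]

It can be seen that these involve an iterative solution for $\hat{\varepsilon}$ and $\hat{\theta}$.

5.3.2 Fretchet Distribution - Type II

The likelihood function is given by

\[ L(h;\alpha,\theta) = \left(\frac{\alpha}{\theta}\right)^{n} \prod_{m=1}^{n} \left(\frac{h_m}{\theta}\right)^{-(\alpha+1)} \exp\left\{ -\sum_{m=1}^{n} \left(\frac{h_m}{\theta}\right)^{-\alpha} \right\} \tag{45} \]

and the two ML equations in $\hat{\alpha}$ and $\hat{\theta}$ are

\[ n\,\hat{\theta}^{-\hat{\alpha}} - \sum_{m=1}^{n} h_m^{-\hat{\alpha}} = 0 \tag{46} \]

and

\[ \frac{n}{\hat{\alpha}} - \sum_{m=1}^{n} \ln h_m + n \left[ \sum_{m=1}^{n} h_m^{-\hat{\alpha}} \ln h_m \right] \left[ \sum_{m=1}^{n} h_m^{-\hat{\alpha}} \right]^{-1} = 0 \tag{47} \]

where n is the total number of wave heights. A more detailed account of the distribution is given by Thom [1954]. Equations 46 and 47 must be solved simultaneously by computer or by a graphical method. A closed form for the maximum likelihood estimator does not exist.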
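As an illustration of the kind of computer solution referred to, the Type II likelihood equations can be reduced to a single equation in the shape parameter and solved by bisection, after which the scale parameter follows directly. The sketch below assumes the Type II form P(h) = exp[-(h/theta)^(-alpha)]; the function name, the bracketing strategy and the tolerance are illustrative choices and are not the author's program.

```python
import math

def frechet_mle(heights, tol=1e-10):
    """Maximum likelihood (alpha_hat, theta_hat) for an assumed Type II form."""
    n = len(heights)
    logs = [math.log(h) for h in heights]
    mean_log = sum(logs) / n
    h_min = min(heights)

    def weights(alpha):
        # (h_min / h)^alpha is proportional to h^(-alpha) and stays within (0, 1]
        return [(h_min / h) ** alpha for h in heights]

    def score(alpha):
        # d(ln L)/d(alpha) divided by n, with theta eliminated
        w = weights(alpha)
        weighted_log = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return 1.0 / alpha - mean_log + weighted_log

    lo, hi = 1e-6, 1.0
    while score(hi) > 0.0:             # expand until the root is bracketed
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol * hi:          # the score decreases with alpha
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    # theta^(-alpha) equals the sample mean of h^(-alpha)
    theta = h_min * (n / sum(weights(alpha))) ** (1.0 / alpha)
    return alpha, theta
```

The sketch requires positive heights that are not all equal. As the text notes, every observation contributes to the sums, so the estimate cannot be confined to the sample tail.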
5.3.3 Weibull Distribution - Type III

The likelihood function is given by

\[ L(h;\varepsilon,\alpha,\theta) = \left[\alpha\,\theta^{-\alpha}\right]^{n} \prod_{m=1}^{n} \Delta_m^{\alpha-1} \exp\left\{ -\sum_{m=1}^{n} \left(\frac{\Delta_m}{\theta}\right)^{\alpha} \right\} \tag{48} \]

where $\Delta_m = \varepsilon - h_m$. The three equations which result from maximizing the likelihood function are

\[ \hat{\alpha}\,\hat{\theta}^{-\hat{\alpha}} \sum_{m=1}^{n} \Delta_m^{\hat{\alpha}-1} = (\hat{\alpha} - 1) \sum_{m=1}^{n} \Delta_m^{-1} \tag{49} \]

\[ \hat{\theta} = \left[ \frac{1}{n}\sum_{m=1}^{n} \Delta_m^{\hat{\alpha}} \right]^{1/\hat{\alpha}} \tag{50} \]

and

\[ \frac{n}{\hat{\alpha}} - n \ln\hat{\theta} + \sum_{m=1}^{n} \ln\Delta_m - \sum_{m=1}^{n} \left(\frac{\Delta_m}{\hat{\theta}}\right)^{\hat{\alpha}} \ln\left(\frac{\Delta_m}{\hat{\theta}}\right) = 0 \tag{51} \]

Equations 49, 50 and 51 must be solved simultaneously for $\hat{\alpha}$, $\hat{\theta}$ and $\hat{\varepsilon}$, and again a closed form maximum likelihood estimator does not exist. The treatment of the Type III_L distribution is similar but is not given here.

5.3.4 Lognormal Distribution

As a result of its direct connection with the Normal distribution, the lognormal case is comparatively straightforward. The estimators are:

\[ \hat{\theta} = \frac{1}{n}\sum_{m=1}^{n} \ln h_m \tag{52} \]

and

\[ \hat{\alpha}^2 = \frac{1}{n}\sum_{m=1}^{n} (\ln h_m - \hat{\theta})^2 \tag{53} \]

In view of the simplicity of the M.L. estimators for the lognormal distribution and because of their desirable properties of unbiasedness and minimum variance, these should be used in place of the method of moments estimators (which have a lower efficiency), and may be considered a good alternative to least-squares.

CHAPTER 6

TESTS OF FIT BETWEEN THE DISTRIBUTION AND THE DATA

Basically there are two methods of testing fit between the distribution and the data:

i) by hypothesis testing
ii) by confidence intervals

These two methods operate in much the same way since both use a pivotal quantity and have a level of significance or confidence. The confidence interval approach has been almost universally accepted for this type of work since it can be plotted in a form which may be readily appreciated by the engineer.

Until very recently, the coastal/ocean engineering literature indicated that confidence intervals could only be applied to the Type I distribution, as derived by Gumbel [1958] and in the form summarized by St. Denis [1969].
The discussion here will show that this is certainly not the case, that they may be generated for any of the four distributions, and that Gumbel's form was only an approximation to the exact derivation.

The intervals described here are not presented in the standard parametric interval form most commonly used by statisticians, which sets probabilistic limits on the estimated values of the parameters. The confidence intervals used for this work set probabilistic limits on the range of each of the data values given the estimated distribution. This permits an engineer to review the results and form a conclusion on the closeness with which a model fits the data.

A summary of the derivation of confidence intervals is presented in the next section. More detailed discussions may be found in accounts by Kendall [1947] and Borgman [1959].

6.1 Derivation of Confidence Intervals

Starting with a data sample $H_1, H_2, \ldots, H_n$ which has a continuous parent distribution P(h), we arrange the data in order of magnitude:

\[ H_{(1)} > H_{(2)} > \cdots > H_{(n)} \]

so that $H_{(1)}$ is the largest and $H_{(n)}$ is the smallest. Each data value $H_{(m)}$ is then assumed to behave as an independent random variable which has an identical parent distribution P(h).

The general probability density of the m-th statistic of a sample containing n values is the probability that

\[ h - \frac{dh}{2} < H_{(m)} < h + \frac{dh}{2} \]

In order to achieve this, m-1 values must fall above $h + \frac{dh}{2}$, the m-th must fall within $h \pm \frac{dh}{2}$, and the rest below $h - \frac{dh}{2}$. Hence, for one particular assignment of the values,

\[ f_m(h)\,dh \propto [P(h)]^{n-m} \; p(h)\,dh \; [1 - P(h)]^{m-1} \]

where
p(h) is the parent density function,
$f_m(h)$ is the density function of the m-th statistic.

Since there is still some ambiguity over which values go above and below h, we choose (n-m) values below h, then one value at h, leaving the remaining (m-1) to fall above h.
\[ f_m(h)\,dh = m \binom{n}{n-m} [P(h)]^{n-m}\,[1 - P(h)]^{m-1}\,p(h)\,dh \tag{54} \]

with

\[ m \binom{n}{n-m} = \frac{m\,\Gamma(n+1)}{\Gamma(n-m+1)\,\Gamma(m+1)} = \frac{1}{B(n-m+1,\,m)} \]

where B( ) is the Beta function. The cumulative probability of the m-th largest variable from n values is

\[ F_m(h) = \text{Prob}\,[H_{(m)} < h] = \int_{-\infty}^{h} f_m(h)\,dh \]

and

\[ F_m(h) = \frac{1}{B(n-m+1,\,m)} \int_{-\infty}^{h} [P(h)]^{n-m}\,[1 - P(h)]^{m-1}\,p(h)\,dh \tag{55} \]

Let w = P(h), so that dw = p(h) dh. Then

\[ F_m(h) = \frac{1}{B(n-m+1,\,m)} \int_{0}^{P(h)} w^{n-m}\,[1 - w]^{m-1}\,dw = \frac{B_P(n-m+1,\,m)}{B(n-m+1,\,m)} = I_P(n-m+1,\,m) \tag{56} \]

where $I_P(\,)$ is the incomplete beta function, which can be expressed in terms of the binomial expansion

\[ I_P(m,\,n-m+1) = \sum_{j=m}^{n} \binom{n}{j} P^{\,j}\,[1-P]^{\,n-j} \]

and

\[ I_P(a,b) = 1 - I_{1-P}(b,a) \]

so that

\[ F_m(h) = 1 - \sum_{j=m}^{n} \binom{n}{j} [1 - P(h)]^{\,j}\,[P(h)]^{\,n-j} \tag{57} \]

Equation 57, then, expresses the probability function of the m-th data point in terms of the parent distribution P(h) which governs the occurrence of all wave heights. At this point it is clear that the form or type of the parent distribution has not been specified, and that the method is equally applicable to data from any of the four distributions considered.

6.2 Asymptotic Distribution of the m-th Statistic

If the number of wave heights n is increased without limit while both m and P(h) are held stationary, the function $F_m(h)$ tends to zero. This is very inconvenient and it would be desirable to have a stable non-zero limiting function which could be used for large samples. This is achieved by replacing P(h) by the parameter

\[ w_n(h) = n\,[1 - P(h)] \tag{58} \]

and tabulating $w_n(h)$ instead of P(h). In this case the incomplete beta function is replaced by the incomplete gamma function. Tables of $w_n$ were prepared by Borgman [1959] and are summarized in FIGURES 24 to 28, which are plots of

\[ \ln w_n = \ln\{n\,[1 - P(h)]\} \]

against the sample size n for several values of $F_m(h)$.

6.3 Approximate Distribution of the m-th Statistic

Gumbel [1958] described an approximate form for the standard error of the m-th statistic.
It was assumed that m/n was approximately 1/2, so that it was strictly valid only for statistics taken from the centre of the ordered sample. He showed that Eqn. 54 could be expressed as a power series of which all but the first few terms could be neglected. This simplified to the density of a normal distribution.

6.4 Limitations

It should be noted that the assumptions made in its derivation render the Gumbel version quite inaccurate in the vicinity of m = 1 (FIGURE 12). Since this region is of prime interest here, especially when using the least-squares method, the more direct method of solving Eqn. 57 is to be preferred.

In addition to a variation between the true interval and the standard deviation indicated by the normal distribution approximation, there is a very noticeable development of skewness (see FIGURE 12) in the distribution of the m-th value as m approaches unity. This is quite important when using the confidence interval lines, since they develop a strong bias towards the right of the fitted line, indicating that if other sets of samples were used, the values would tend to fall more often to the right of the fitted line as m decreased and have greater heights than indicated by it.

A restriction on the use of the m-th value distribution is that it is not defined in the region m < 1, and hence cannot be directly extended to predicted values. Gumbel effectively suggested that predicted maximum values would each retain the same interval size as the statistic at m = 1, but this practice does not seem to be generally accepted.

6.5 Method of Determining Critical Points

For a given sample of n wave heights and a statistic of ordered position m, the distribution of the m-th value is described in terms of the original parent distribution. Since our sample consists of extreme values, the parent distribution may be chosen from one of the four distributions used here, e.g. Type III.
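Because the distribution of the m-th value depends on the parent only through P(h), it can be evaluated once for any n and m and then inverted numerically for the parent probability that gives a chosen value of $F_m$. The following sketch is illustrative: it evaluates the complementary binomial sum, which equals the expression of Eqn. 57 by the binomial theorem, and solves for P by bisection. The function names, the tolerance and the symmetric pair of probabilities (1 - gamma)/2 and (1 + gamma)/2 are assumptions made for the example.

```python
import math

def order_stat_cdf(p, n, m):
    """F_m as a function of the parent probability p = P(h).

    The m-th largest of n values lies below h when at most m - 1 values
    exceed h; this sum is the complement of the sum written in Eqn. 57.
    """
    return sum(math.comb(n, j) * (1.0 - p) ** j * p ** (n - j)
               for j in range(m))

def critical_parent_probability(phi, n, m, tol=1e-12):
    """Solve F_m(p) = phi for p by bisection (F_m increases with p)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if order_stat_cdf(mid, n, m) < phi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def interval_for_rank(gamma, n, m):
    """Pair of parent probabilities bounding the m-th statistic at level gamma."""
    lower = critical_parent_probability((1.0 - gamma) / 2.0, n, m)
    upper = critical_parent_probability((1.0 + gamma) / 2.0, n, m)
    return lower, upper

# Hypothetical case: second-largest of 50 heights at a 25% confidence level.
print(interval_for_rank(0.25, 50, 2))
```

Each returned probability can be converted to a height through whichever parent distribution has been fitted. For very large n with large m the binomial coefficients may exceed floating-point range, in which case an incomplete beta routine would be the more robust choice.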
If the confidence intervals are defined in terms of a probability that a given height will not be exceeded by the m-th statistic, then Eqn. 57 can be solved for the critical value h. However, a more useful form can be reached by solving for the value of the parent distribution P(h) which satisfies $F_m(h) = \phi$, the confidence probability. The value of P(h) given by this may then be applied to any parent distribution to get a critical value of h.

This approach results in tabulated values of P(h) for given n, m and $\gamma$. Comprehensive tables for m = 1(1)5 have been compiled by Borgman [1959].

6.6 Procedure for Plotting Confidence Intervals

A plot of the data is prepared on the paper of the selected distribution by following the sample preparation and plotting procedure given in CHAPTER 3. A best-fit line is then located by one of the techniques in CHAPTER 5. The method for plotting the confidence intervals is as follows:

i) A confidence probability level $\gamma$, e.g. 0.25, is selected. Each value of m is then treated separately and is used to obtain a pair of values of P(h) from FIGURES 24 to 28. That is, for a given sample size n and rank m, and setting $F_m(h) = \phi$ for both $\phi = (1-\gamma)/2$ and $\phi = (1+\gamma)/2$ respectively, as shown in FIGURE 17, the appropriate FIGURE (24 to 28) is used to determine two values of P(h).

ii) Each value of P(h) is used as an ordinate position (see FIGURE 13) and, when projected onto the best-fit line, will give two limiting values of height for the m-th statistic.

iii) These may then be plotted on either side of the plotting position and a faired line drawn through equivalent limits for the remaining points.

6.7 Examples of Confidence Intervals

The sample described by St. Denis [1969] is shown plotted in FIGURES 13 to 16 with 25% confidence intervals. It may be seen that this level of confidence is sufficient to contain all points in the Type I, Type II and lognormal plots.
However, the Type III_U confidence intervals are too narrow to contain the second and third highest points. This indicates that the Type III_U distribution is the least suitable model for this data.

CHAPTER 7

METHODS OF PREDICTION

The processes of selecting the most suitable distribution and of estimating its parameters have been described in detail. This chapter will discuss methods of using the best-fit line to predict the "extreme wave". In order to describe this wave one requires a representative height, and for this either the significant or the maximum wave height may be used. The calculation of these two values is outlined in SECTIONS 7.1 and 7.3.

The methods described in previous chapters have been directed entirely towards predicting wave heights. Although extreme value analysis has not been applied to wave periods in this thesis, the period of the extreme wave may still be calculated from the limiting steepness as described in SECTION 7.2. The predicted values of wave height and period apply over the same recording interval that was used for data collection.

Once an extreme wave height has been obtained it may be desirable to set a pair of probabilistic limits on its value to reflect the size of the sample and the quality of the estimating process. A discussion of such limits is given in SECTION 7.4.

In SECTION 7.5, the encounter probability and return period are discussed in detail. The encounter probability quantifies the risk of a wave with a given return period occurring within the lifetime of the structure. Additionally, another value might be used to estimate the number of smaller waves occurring within the same period which might hinder operation or promote fatigue. By this process the designer would be able to consider a "limit-state" and a "service condition".

7.1 Expected Significant Height

Once the extreme value plot has been drawn the next stage is usually to estimate the so-called 50 or 100 year design wave height.
That is the wave height, defined in the same way as the recorded heights (e.g. the significant height over a recording interval), which would only be exceeded on average once during a period of 50 or 100 years respectively. This time interval is called the RETURN PERIOD or RECURRENCE INTERVAL, and is usually selected on the basis of a given structure lifetime.

A non-dimensional EXPECTED WAITING TIME, R, is defined as the average number of trials between exceedances of a given height h. Each trial amounts to one recording interval and hence the waiting time is given by

\[ \text{EXPECTED WAITING TIME} = \frac{\text{RETURN PERIOD}}{\text{RECORDING INTERVAL}} = R \tag{59} \]

Let W be the waiting time, i.e. the random number of observations preceding and including the first exceedance of a given height h. W then has a GEOMETRIC DISTRIBUTION and if

\[ P = \Pr[H < h] \]

then the expected value of W is

\[ R = \sum_{w=1}^{\infty} w\,P^{\,w-1}\,(1 - P) \tag{60} \]

which yields

\[ \text{EXPECTED WAITING TIME} = \frac{1}{1 - P} \tag{61} \]

Thus for a return period $T_r$ of 50 years, and using data which was recorded at 3 hourly intervals, the expected waiting time R would be 146,000 (50 years x 365 days x 8 records per day).

For a given data record and a return period one may thus calculate R and hence $\Pr[H \le h]$ from Eqns. 59 and 61. This will correspond to a value of height h on the probability plot. Since the data consisted of a series of significant heights, this predicted height will represent the significant height of a corresponding recording interval with the required return period.

7.2 Extreme Wave Period

There are three common approaches to estimating the extreme wave period. The first is to repeat the entire procedure using wave periods instead of wave heights. The marginal frequencies are obtained by summing the number of wave occurrences in each period class. By using the same return period as for the heights, one may obtain a predicted value of the 50 year zero-crossing period $T_z$ for a future wave record with the same recording interval.
Draper [1963] has suggested that this value of $T_z$ may be used with the predicted height. This suggestion is based on the fact that there is a noticeable correlation between the two variables in the scatter diagram.

The second method of obtaining a representative wave period involves the use of a one-parameter wave spectrum such as the Pierson-Moskowitz spectrum. The spectrum is calculated for the predicted value of significant height, and the frequency locating the spectrum peak is used to obtain $T_z$. Again, the value corresponds to the same recording interval as does the data.

The third method, which is the simplest to use, involves using the predicted wave height to set a lower limit on the wave period. By assuming a Pierson-Moskowitz spectrum, Battjes [1970] has shown that, for deep water and intermediate depths, the wave steepness defined as $2\pi H_s / gT^2$ is limited by

$$\frac{2\pi H_s}{g T_L^2} \le \frac{1}{16} \qquad (62)$$

where g is the gravitational constant. Thus a lower limit of period $T_L$ for a given significant height may be set as:

$$T_L = \left(\frac{32\pi H_s}{g}\right)^{1/2} \qquad (63)$$

The method then involves trying different combinations of the period with the predicted height to find the worst effect on the structure.

7.3 Maximum Wave Height

Once the extreme values of significant wave height and mean zero-crossing period have been established, the maximum wave height may be calculated. Longuet-Higgins and Cartwright [1956] showed that for a wave spectrum of arbitrary shape the expected maximum individual wave height could be expressed as

$$\frac{\text{Expected Maximum Height}}{\text{Significant Height}} = \left[\tfrac{1}{2}\ln\left(\frac{t}{T_z}\right)\right]^{1/2} \qquad (64)$$

where t is the recording interval and $T_z$ is the mean zero crossing period. This relationship is valid provided $t/T_z$ is large, i.e. the recording interval contains a large number of waves. The sea-state is assumed to be stationary throughout this period.

In the preceding SECTION 7.2, it was shown that a lower limit could be placed on the period attached to the predicted value of the significant height. The procedure for calculating the expected maximum wave height is:

i) Using the predicted value of the significant height $H_s$ and the limiting steepness of Eqn. 62, calculate the minimum period $T_L$:

$$T_L = \left(\frac{32\pi H_s}{g}\right)^{1/2} \qquad (63)$$

where g is the gravitational constant.

ii) Calculate the expected maximum wave height from Eqn. 64, using $T_L$ from Eqn. 63 in place of $T_z$.

It is quite sufficient to use the minimum period here since Eqn. 64 is insensitive to variations in $t/T_z$. For example, a 10% error in the central period of 7 seconds over a recording interval of 3 hours would result in a height error of less than 1%.

7.4 Confidence Intervals for Prediction

Gumbel [1958] suggested that the confidence intervals described in the last chapter could be extended, beyond the region containing data, for prediction. The method he suggested was to draw a pair of lines parallel to the fitted line and passing through the interval limits at the highest data point. Although the concept of using intervals to indicate error in prediction was very attractive, this method has not generally been adopted according to Chow [1964]. It was, however, restated by St. Denis [1969] in a paper devoted to wave prediction.

As has already been discussed, there is always a degree of variability involved in parameter estimation. An estimate is a function of a random sample and hence is itself a random variable. The estimator's variability is not lessened by the fact that the various methods of estimation often yield slightly different results. Hence, the Weibull distribution has three possible sources of error once it has been estimated. In view of the difficulties described, it is not surprising that the problem was left untouched until quite recently.
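For reference, the two-step calculation of SECTION 7.3 (Eqns. 63 and 64) can be sketched as below. This is an illustrative sketch, not the author's program: the function names are hypothetical, and g = 32.174 ft/s² is an assumed value chosen because the worked example in this thesis quotes heights in feet.

```python
import math

# Assumption: heights in feet, so g is taken in ft/s^2.
G_FT = 32.174


def minimum_period(h_s, g=G_FT):
    """Eqn. 63: lower limit on period from the limiting steepness of 1/16."""
    return math.sqrt(32.0 * math.pi * h_s / g)


def expected_max_ratio(interval_s, t_z):
    """Eqn. 64: expected maximum height divided by significant height."""
    return math.sqrt(0.5 * math.log(interval_s / t_z))


def expected_max_height(h_s, interval_s, g=G_FT):
    """Steps i) and ii): use T_L from Eqn. 63 in place of T_z in Eqn. 64."""
    t_l = minimum_period(h_s, g)
    return h_s * expected_max_ratio(interval_s, t_l)


if __name__ == "__main__":
    h_s = 14.47            # significant height in feet (value from CHAPTER 9)
    interval = 3 * 3600.0  # 3 hour recording interval in seconds
    print(minimum_period(h_s), expected_max_height(h_s, interval))
    # Sensitivity of Eqn. 64 to a 10% change in a 7 second period:
    print(expected_max_ratio(interval, 7.7) / expected_max_ratio(interval, 7.0))
```

With the CHAPTER 9 significant height this gives a minimum period close to 6.7 seconds and an expected maximum height just under 28 feet; small differences from the quoted figures depend on the value of g and on rounding.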
Thoman, Bain and Antle [1969] prepared confidence interval tables for the parameters of the two-parameter Weibull distribution. This special case occurs when a Type III_L distribution is used with ε set equal to zero. The tables were made by using Monte Carlo simulation to generate a series of random samples from the Type III_L distribution, and thence deriving an empirical distribution for each parameter. Confidence intervals were then taken from these empirical distributions by a process similar to that used in the last section.

This approach was used by Petrauskas and Aagaard [1971] to produce "uncertainty intervals" for prediction. The two limits calculated for each of the parameters involved resulted in a pair of straight lines, each having a slope and intercept different from those of the best-fit line. An example of the uncertainty intervals is shown in FIGURE 23. The intervals generated by this method were found to diverge from the best-fit line as the variate increased, instead of remaining parallel to it as originally suggested by Gumbel.

7.5 Encounter Probability and Waiting Time

It is accepted practice to refer to a design wave of given height by its RETURN PERIOD at a specific location. Thus a 25-year wave means that waves as large as the design wave, or larger, occur on average once in each 25-year period. It is evident that in fact several such waves could occur within the same 25-year period. Borgman [1963] has given a description of the distribution of the waiting time between events and of the probability of encounter.

The concept of return period enables one to represent the continuous time dimension as a series of discrete integer multiples of the original recording interval, or of a conveniently related quantity such as one year. Thus time can be treated as a discrete random variable, each interval having two possible states which reflect whether or not the design wave has been exceeded within the associated time period. The probability of an exceedance is given by $p = 1 - P(h)$, and $q = 1 - p$ represents the probability that a recording interval t will not contain a wave higher than h. If $T_w$ is the waiting time until the first exceedance of h occurs, then it has a geometric distribution and

$$\Pr\{T_w \le \tau t\} = 1 - q^{\tau} \qquad (65)$$

where τ is a dimensionless integer multiple. From Eqns. 59 and 61, the expected waiting time is

$$\frac{T_r}{t} = \frac{1}{1 - P(h)} \qquad (66)$$

thus from Eqns. 65 and 66

$$\Pr\{T_w \le \tau t\} = 1 - \left(1 - \frac{t}{T_r}\right)^{\tau} \qquad (67)$$

This is the probability distribution of waiting time, and it is independent of the timing of the previous exceedance. If (τ t) represents the lifetime of the structure then Eqn. 67 may be used to calculate the probability of its experiencing a wave with a given return period. FIGURE 22 has been constructed from Eqn. 67 for the special case where t = 1 year.

When $T_r^2/t^2 \gg 1$ (i.e. generally when $T_r/t \gg 1$) a suitable approximation to Eqn. 67 is given by

$$\Pr\{T_w \le \tau t\} \approx 1 - \exp\left\{-\frac{\tau t}{T_r}\right\} \qquad (67a)$$

CHAPTER 8

CONCLUSIONS - A RECOMMENDED PROCEDURE

A recommended procedure for the prediction of extreme waves is given below:

a) The data is taken from an intermittent record over a period of at least one year. Usually this will consist of a series of short continuous records (10 - 20 minute duration) started at intervals of 3 - 10 hours. For reasons given in SECTION 3.3 it is possible to use data which has some of the results from 'low activity' months missing, provided it can be shown that their approximate ranking positions fall outside the tail as defined in SECTION 3.3.

b) Each continuous record is reduced by the method described in SECTION 3.1 to a pair of single values, i.e. the significant height $H_s$ and the zero crossing period $T_z$.

c) A scatter diagram with both $H_s$ and $T_z$ divided into a number of equal classes is then prepared.
The number of records falling into each joint interval should be marked as shown in FIGURE 5.

d) The marginal height frequencies are found by summing over the period $T_z$ for each class of $H_s$.

e) The plotting positions of each class lower limit are calculated according to the method of SECTION 3.2.

f) Each class lower limit is plotted on Type I paper and the curvature test is applied as in SECTION 4.2. This will result in a choice of one of three groups of distributions, which are described in SECTION 4.1.

g) The tail of the sample is isolated for further use by the method given in SECTION 3.3. If this yields fewer than five points, one may return to step (c) and further subdivide the height classes in the scatter diagram. If this is not feasible one may have to accept a more general fit and include some lower limit classes.

h) The class lower limits of the tail are then plotted onto each paper of the distribution group selected in step (f). A straight line is fitted by one of the methods described in CHAPTER 5.

i) For each distribution of the group, the confidence intervals corresponding to the plotted points are drawn. Initially a confidence level of 60% may be used, and narrower bands drawn until points start to fall outside the limits.

j) On this basis one may select the one distribution which gives the best fit to the data.

k) Predicted values of wave height and period may now be made by the methods of CHAPTER 7, and encounter probabilities assigned where appropriate.

CHAPTER 9

A WORKED EXAMPLE

The data used for this worked example was collected by the Department of Public Works of Canada at TINIER POINT, NEW BRUNSWICK. It covers a period of one year and originally appeared in a paper by Khanna and Andru [1974]. The example is analysed using the same steps as summarized in CHAPTER 8.
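The encounter probabilities called for in the final step of the CHAPTER 8 procedure follow from Eqns. 67 and 67a. A minimal sketch is given here for convenience; the function names are hypothetical, and the lifetime is assumed to be rounded to a whole number of intervals so that τ is an integer.

```python
import math


def encounter_probability(return_period, lifetime, interval=1.0):
    """Eqn. 67: probability of at least one exceedance within the lifetime.

    All three arguments must share the same time unit (e.g. years).
    """
    tau = round(lifetime / interval)  # whole number of intervals in the lifetime
    return 1.0 - (1.0 - interval / return_period) ** tau


def encounter_probability_approx(return_period, lifetime):
    """Eqn. 67a: exponential approximation, reasonable when T_r/t >> 1."""
    return 1.0 - math.exp(-lifetime / return_period)


if __name__ == "__main__":
    # A 25-year wave during a 25-year lifetime, with t = 1 year as in FIGURE 22.
    print(encounter_probability(25.0, 25.0))
    print(encounter_probability_approx(25.0, 25.0))
```

For a lifetime equal to the return period the encounter probability is roughly 0.63 to 0.64, which illustrates why the design return period is normally chosen to be considerably longer than the structure lifetime.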
Steps a) to c) were originally carried out by the collecting agency, and the starting point was the scatter diagram shown in FIGURE 5. The marginal frequencies of significant height $H_s$ and zero crossing period $T_z$ are also shown [step d)].

Steps e) and f) were carried out to give the results shown in FIGURE 18. A curved line, which was fitted to the points by eye for convenience, has a strong positive curvature. On the basis of the method discussed in SECTION 4.1 this permitted the analysis to be confined to two possible models:

- Type III_U distribution
- Type III_L distribution

Step g) was carried out according to SECTION 3.3 as follows:

total number of wave heights on record: N = 2245
largest rank within tail: w = N/10 ≈ 224

By adding the marginal frequencies of each height class, the position of the class containing a value with a rank of 224 was located. This resulted in the nine points which were plotted in FIGURE 18. The extent of the tail is indicated, and the remaining points have been included to show their behaviour.

The result of step h), using the method of least squares, is shown in FIGURES 19 and 20. The confidence intervals have also been fitted according to step i). The calculation of confidence band positions is given in TABLE V. Since FIGURES 24 to 28 only cover the first five ranked positions [m = 1 to 5] they cannot be directly applied to higher ranked values. This is overcome by introducing an approximate method, which may be justified by the fact that the confidence intervals only serve as a test of comparison and hence do not require the rigorous approach of parameter estimation. The method is as follows:

i) The rank m of the smallest value occurring in each class is used and shown in column 2.
ii) The mean frequency is calculated as $m/(n+1)$ and shown in column 3.

iii) When the rank becomes greater than five, m is set equal to 5 and a new sample size n' is chosen to give an approximate frequency (column 4) which is close to the value in column 3:

$$n' = \frac{5}{\text{mean frequency}} - 1$$

The resulting values of n' are shown in column 5.

iv) An initial confidence level of 0.60 was chosen. The procedure of SECTION 6.6 was then used to plot the confidence intervals with the following modifications:

- n' replaces the true sample size
- FIGURE 28 is used for m ≥ 5

The final plots together with confidence intervals are shown in FIGURES 19 and 20.

Since the 60% confidence bands did not enclose all the points in FIGURE 19, intervals for 80% confidence were plotted. It was found that 60% confidence was sufficient for FIGURE 20, and that the highest pair of points could be enclosed by a narrower band of 40% confidence. Thus the Type III_U distribution with ε = 15 feet was the most suitable model for this data.

The prediction of the 100 year design wave is made according to the methods given in CHAPTER 7 for a RECORDING INTERVAL of 3 hours and a RETURN PERIOD of 100 years, which results in an EXPECTED WAITING TIME of 292,000. The probability of non-exceedance is calculated from Eqn. 61 as P(h) = 0.99999657, which corresponds to y = 12.584 on the ordinate scale of FIGURE 20. This yields a 100 year SIGNIFICANT HEIGHT of 14.47 feet.

The data used for this worked example was analysed by Khanna and Andru [1974]. Their estimate for the 100 year significant height varied between 20 and 30 feet. The lowest value was taken from a Type III_L plot and the largest from a lognormal plot. The value of significant height for the 100 year wave suggested by this worked example was less than 15 feet.
The large difference is attributable to the limiting effect of the Type III_U distribution and to the fitted line now lying to the left of the m = 1 point.

The minimum period is determined from Eqn. 63 as $T_L$ = 6.7 seconds, and hence the EXPECTED MAXIMUM HEIGHT is given by Eqn. 64 as 27.9 feet.

CHAPTER 10

FUTURE WORK

The procedure which has been described is primarily concerned with predicting a wave height of given return period. In cases where the dynamic response of a structure to waves is of importance, one must consider the distribution of wave periods. In order to calculate the combined effect of height and period variation, it becomes necessary to introduce a long-term bivariate distribution. This describes the joint probability of a given height and period occurring in combination. To a limited extent this problem has been examined by Battjes [SECTION 1.0], who used a discrete approach.

The bivariate distribution developed could be continuous, and its marginal distributions of height and period may be quite different, e.g. a Type I for periods with a Type III_U for significant height. Thus, instead of fitting a straight line one would use a "surface of best fit".

One advantage of such an approach would be that a designer could take the fundamental frequencies of the structure into account when predicting design values. An approach would be to predict a wave of given return period, e.g. the fifty year wave, given that the period of concern lay between limits, e.g. 6 to 8 seconds.

From SECTION 7.4 it can be seen that there is still no direct approach for obtaining confidence bands for predicted values. It would be most useful to the engineer if a method based on tables could be prepared for office use.

BIBLIOGRAPHY

Abramowitz, M. and Stegun, I. Handbook of Mathematical Functions, Dover, New York, 1970.
Battjes, J.A. "Long-term wave height distribution at seven stations around the British Isles," National Institute of Oceanography, England, Internal Report No. A.44, July 1970.
Borgman, L.E. "The frequency distribution for the mth largest of n values," M.S. Thesis, University of Houston, Houston, Texas, 1959.
Borgman, L.E. "The frequency distribution of near extremes," Journal of Geophysical Research, Vol. 66, No. 10, pages 3295-3307, October 1961.
Borgman, L.E. "Risk Criteria," Journal of the Waterways and Harbours Division, American Society of Civil Engineers, Vol. 89, No. WW3, August 1963.
Borgman, L.E. "Extremal Statistics in Ocean Engineering," Proceedings of Civil Engineering in the Oceans, University of Delaware, 1975.
Bretschneider, C.L. "Generation of wave by winds, state-of-the-art," National Engineering Science Company, Report SN-134-6, January 1965.
Bury, K.V. Statistical Models in Applied Science, John Wiley and Sons, 1975.
Cartwright, D.E. and Longuet-Higgins, M.S. "The statistical distribution of the maxima of a random function," Philosophical Transactions, Royal Society of London, A247, pages 22-48, 1958.
Chow, V.T. Handbook of Applied Hydrology: a compendium of water resources technology, McGraw-Hill, New York, 1964.
Draper, L. "Derivation of a design wave from instrumental records of sea states," Proceedings, Institution of Civil Engineers, London, Vol. 26, pages 291-304.
Fisher, R.A. and Tippett, L.H.C. "Limiting forms of the frequency distribution of the largest or smallest member of a sample," Proceedings, Cambridge Philosophical Society, Vol. 24, 1926.
Gringorten, I.I. "Envelopes for ordered observations applied to meteorological extremes," Journal of Geophysical Research, Vol. 63, No. 3, pages 815-826, February 1963.
Gumbel, E.J. "Statistical theory of droughts," Journal of Hydraulics Division, American Society of Civil Engineers, Vol. 80, May 1954.
Gumbel, E.J. Statistics of Extremes, Columbia University Press, 1958.
Hogben, N. "A comparison of log-normal and Weibull functions for fitting long-term wave height distributions in the North Atlantic," National Physics Laboratory, England, Ship Division, T.M.190, October 1967.
Jasper, N.H. "Statistical distribution patterns of ocean waves and of wave-induced ship stresses and motions with engineering applications," S.N.A.M.E., New York Meeting, Preprint No. 6, 1956.
Kendall, M.G. The Advanced Theory of Statistics, Vol. 1, Charles Griffin and Co., London, 1947.
Khanna, J. and Andru, P. "Lifetime wave height curve for Saint John Deep, Canada," Proceedings, International Symposium on Ocean Wave Measurement and Analysis, Vol. 1, pages 301-319, ASCE, New Orleans, September 1974.
Kimball, B.F. "On the choice of plotting positions on probability paper," Journal of the American Statistical Association, Vol., pages 546-560, 1960.
Ouellet, Y. "On the need of wave data for the design of rubble mound breakwaters," Proceedings, International Symposium on Ocean Wave Measurement and Analysis, Vol. 1, pages 500-522, ASCE, New Orleans, September 1974.
Petrauskas, C. and Aagaard, P. "Extrapolation of historical storm data for estimating design-wave heights," Journal of the Society of Petroleum Engineers, Vol. 11, pages 23-37, March 1971.
Simiu, E. and Filliben, J.J. "Probability distributions of extreme wind speeds," Journal of the Structural Division, American Society of Civil Engineers, Vol. 102, No. ST9, pages 1861-1877, September 1976.
St. Denis, M. "Determination of Extreme Waves," Topics in Ocean Engineering, Vol. 1, C.L. Bretschneider (ed.), pages 37, Gulf Publishing Co., Texas, 1969.
St. Denis, M. "Some cautions on the employment of the spectral technique to describe the waves of the sea and the response thereto of oceanic systems," Offshore Technology Conference, Houston, Paper OTC 1819, May 1973.
Thom, H.C.S. "Frequency of maximum wind-speeds," Journal of the Structural Division, American Society of Civil Engineers, Vol. 80, November 1954.
Thom, H.C.S. "Asymptotic extreme-value distributions of wave heights in the open ocean," Journal of Marine Research, Vol. 29, pages 19-27, 1971.
Thoman, D.R., Bain, L.J. and Antle, C.E. "Inferences on the parameters of the Weibull distribution," Technometrics, Vol. 11(3), pages 445-460, 1969.
Tucker, M.J. "Analysis of records of sea waves," Proceedings, Institution of Civil Engineers, London, Vol. 26, pages 305-316, 1963.

TABLES

TABLE I: PROBABILITY DISTRIBUTIONS AND THEIR PROPERTIES

DISTRIBUTION | RANGE | PROBABILITY FUNCTION P(h) | EXPECTED VALUE | VARIANCE
LOGNORMAL | $0<H<\infty$; $-\infty<\theta<\infty$; $0<\alpha<\infty$ | $\int_0^h \frac{1}{\alpha h\sqrt{2\pi}} \exp\{-\tfrac{1}{2}(\frac{\ln h-\theta}{\alpha})^2\}\,dh$ | $\exp\{\theta+\frac{\alpha^2}{2}\}$ | $\exp\{2\theta+\alpha^2\}[\exp(\alpha^2)-1]$
TYPE I | $-\infty<H<\infty$; $-\infty<\varepsilon<\infty$; $0<\theta<\infty$ | $\exp\{-\exp[-(\frac{h-\varepsilon}{\theta})]\}$ | $\varepsilon+0.57722\,\theta$ | $1.64493\,\theta^2$
TYPE II | $0<H<\infty$; $0<\alpha<\infty$; $0<\theta<\infty$ | $\exp\{-(\frac{h}{\theta})^{-\alpha}\}$ | $\theta\,\Gamma(1-\frac{1}{\alpha})$ | $\theta^2\{\Gamma(1-\frac{2}{\alpha})-\Gamma^2(1-\frac{1}{\alpha})\}$
TYPE III_U (UPPER BOUND) | $-\infty<H<\varepsilon$; $0<\alpha<\infty$; $0<\theta<\infty$ | $\exp\{-(\frac{\varepsilon-h}{\theta})^{\alpha}\}$ | $\varepsilon-\theta\,\Gamma(1+\frac{1}{\alpha})$ | $\theta^2\{\Gamma(1+\frac{2}{\alpha})-\Gamma^2(1+\frac{1}{\alpha})\}$
TYPE III_L (LOWER BOUND) | $\varepsilon<H<\infty$; $0<\alpha<\infty$; $0<\theta<\infty$ | $1-\exp\{-(\frac{h-\varepsilon}{\theta})^{\alpha}\}$ | $\varepsilon+\theta\,\Gamma(1+\frac{1}{\alpha})$ | $\theta^2\{\Gamma(1+\frac{2}{\alpha})-\Gamma^2(1+\frac{1}{\alpha})\}$

TABLE II: CURVATURE PROPERTIES OF THE DISTRIBUTIONS

DISTRIBUTION | SLOPE | TAIL | CURVATURE
Lognormal | positive | | negative
Type I | $1/\theta$ | straight line | 
Type II | $+\alpha/H$ | negative curve | $-\alpha/H^2 < 0$
Type III_L | $\frac{\alpha}{\theta}(\frac{H-\varepsilon}{\theta})^{\alpha-1}$ | $\alpha<1$ negative; $\alpha=1$ straight; $\alpha>1$ positive | $\frac{\alpha(\alpha-1)}{\theta^2}(\frac{H-\varepsilon}{\theta})^{\alpha-2}$
Type III_U | $+\alpha/(\varepsilon-H)$ | positive curve | $\alpha/(\varepsilon-H)^2 \ge 0$

TABLE III: Shape Factors for the FRETCHET Distribution

1/ALPHA | SHAPE FACTOR | ALPHA
0.0125 | 1.2161 | 80.00
0.0250 | 1.2970 | 40.00
0.0375 | 1.3827 | 26.67
0.0500 | 1.4739 | 20.00
0.0625 | 1.5713 | 16.00
0.0750 | 1.6757 | 13.33
0.0875 | 1.7883 | 11.43
0.1000 | 1.9103 | 10.00
0.1125 | 2.0433 | 8.89
0.1250 | 2.1893 | 8.00
0.1375 | 2.3505 | 7.27
0.1500 | 2.5302 | 6.67
0.1625 | 2.7324 | 6.15
0.1750 | 2.9621 | 5.71
0.1875 | 3.2265 | 5.33
0.2000 | 3.5351 | 5.00
0.2125 | 3.9015 | 4.71
0.2250 | 4.3456 | 4.44
0.2375 | 4.8974 | 4.21
0.2500 | 5.6051 | 4.00
0.2625 | 6.5509 | 3.81
0.2750 | 7.8872 | 3.64
0.2875 | 9.9327 | 3.48
0.3000 | 13.4835 | 3.33
0.3125 | 21.2472 | 3.20
0.3250 | 52.1732 | 3.08
0.3375 | -102.1849 | 2.96
0.3500 | -24.9316 | 2.86
0.3625 | -13.8496 | 2.76
0.3750 | -9.3816 | 2.67
0.3875 | -6.9457 | 2.58
0.4000 | -5.3957 | 2.50
0.4125 | -4.3085 | 2.42
0.4250 | -3.4908 | 2.35
0.4375 | -2.8404 | 2.29
0.4500 | -2.2965 | 2.22
0.4625 | -1.8180 | 2.16
0.4750 | -1.3691 | 2.11
0.4875 | -0.8994 | 2.05
0.5000 | -0.0018 | 2.00

TABLE IV: Shape Factors for the WEIBULL Distribution

1/ALPHA | SHAPE FACTOR | ALPHA
0.05 | -0.8680 | 20.000
0.10 | -0.6376 | 10.000
0.15 | -0.4357 | 6.667
0.20 | -0.2541 | 5.000
0.25 | -0.0872 | 4.000
0.30 | 0.0687 | 3.333
0.35 | 0.2167 | 2.857
0.40 | 0.3586 | 2.500
0.45 | 0.4963 | 2.222
0.50 | 0.6311 | 2.000
0.55 | 0.7640 | 1.818
0.60 | 0.8960 | 1.667
0.65 | 1.0279 | 1.538
0.70 | 1.1604 | 1.429
0.75 | 1.2941 | 1.333
0.80 | 1.4295 | 1.250
0.85 | 1.5674 | 1.176
0.90 | 1.7080 | 1.111
0.95 | 1.8521 | 1.053
1.00 | 2.0000 | 1.000
1.05 | 2.1523 | 0.952
1.10 | 2.3093 | 0.909
1.15 | 2.4718 | 0.870
1.20 | 2.6400 | 0.833
1.25 | 2.8146 | 0.800
1.30 | 2.9961 | 0.769
1.35 | 3.1851 | 0.741
1.40 | 3.3820 | 0.714
1.45 | 3.5875 | 0.690
1.50 | 3.8023 | 0.667
1.55 | 4.0269 | 0.645
1.60 | 4.2621 | 0.625
1.65 | 4.5086 | 0.606
1.70 | 4.7671 | 0.588
1.75 | 5.0385 | 0.571
1.80 | 5.3235 | 0.556
1.85 | 5.6230 | 0.541
1.90 | 5.9381 | 0.526
1.95 | 6.2697 | 0.513
2.00 | 6.6188 | 0.500

TABLE V: ESTIMATION OF CONFIDENCE INTERVALS FOR TYPE III_U PLOT

(1) DATA POINT | (2) APPROX RANK m | (3) TRUE MEAN FREQ. | (4) APPROX MEAN FREQ. | (5) EQUIV. SAMPLE SIZE n' | (6) 60% CONFIDENCE: y LOWER | y UPPER
1 (HIGHEST) | 1 | 1/2245 | - | 2245 | 7.24 | 9.22
2 | 2 | 2/2245 | - | 2245 | 6.62 | 7.91
3 | 8 | 8/2245 | 5/1400 | 1400 | 5.34 | 6.11
4 | 15 | 15/2245 | 5/750 | 750 | 4.71 | 5.49
5 | 24 | 24/2245 | 5/450 | 450 | 4.20 | 4.98
6 | 38 | 38/2245 | 5/295 | 295 | 3.77 | 4.55
7 | 64 | 64/2245 | 5/175 | 175 | 3.25 | 4.02
8 | 93 | 93/2245 | 5/120 | 120 | 2.87 | 3.64
9 | 133 | 133/2245 | 5/85 | 85 | 2.51 | 3.29

FIGURES

FIGURE 1 Typical Examples of Data on Lognormal Paper. NOTE: $N^{-1}\{P(h)\}$ is the value of variate which corresponds to an area equal to P(h) under the Standard Normal Distribution density curve.
FIGURE 2 Typical Density Curves for the Types I, II and Lognormal Distributions
FIGURE 3 Typical Density Curves for the Type III Distribution
FIGURE 4 Typical Wave Elevation Recording (labels: recording period, still water level, recording interval). NOTE: The wave height is measured from trough to crest.
FIGURE 5 A Bivariate Scatter Diagram (abscissa: period)
FIGURE 6 The Definition of Sample Tail
FIGURE 7 Comparison of Tail Curvatures on Gumbel Paper
FIGURE 8 Skewness of the Fretchet Distribution
FIGURE 9 Shape Factor of the Weibull Distribution
FIGURE 10 Method of Least Squares (a = slope, b = intercept)
FIGURE 11 Convergence of Least Squares Procedure for Type III_U Distribution
FIGURE 12 Comparison Between the Approximate and Exact Confidence Intervals (labels: least squares line; 25% confidence bands based on the distribution of the mth observation of N ordered values; 25% confidence bands based on Normal approximation, converging as rank m increases; abscissa: wave height)
FIGURE 13 Confidence Intervals on a Type I Plot
FIGURE 14 Confidence Intervals on a Type II Plot (abscissa: wave height). Note: data from St. Denis [1969].
FIGURE 15 Confidence Intervals on a Type III Plot
FIGURE 16 Confidence Intervals on Lognormal Plot. NOTE: i) $N^{-1}\{P(h)\}$ is the value of variate which corresponds to an area equal to P(h) under the Standard Normal Distribution density curve. ii) Data from St. Denis [1969].
FIGURE 17 Determination of Confidence Interval from Typical Distribution Density of the mth Observation (abscissa: height h)
FIGURE 18 The Curvature Test
FIGURE 19 Type III Plot with Confidence Intervals
FIGURE 20 Type III_U Plot with Confidence Intervals
FIGURE 21 Curvature of the Lognormal Distribution on Type I Paper. Note: $N^{-1}\{P(h)\}$ is the value of variate which corresponds to an area equal to P(h) under the Standard Normal Distribution density curve.
FIGURE 22 The Relationship Between Return Period and Encounter Probability
FIGURE 23 Typical Prediction Intervals (abscissa: wave height)
FIGURE 24 The Relationship Between P(h) and $F_m(h)$ for m = 1
FIGURE 26 The Relationship Between P(h) and $F_m(h)$ for m = 3
FIGURE 27 The Relationship Between P(h) and $F_m(h)$ for m = 4
FIGURE 28 The Relationship Between P(h) and $F_m(h)$ for m = 5 (abscissa: sample size n)
APPENDIX

A1 Properties of the Type I Distribution

The cumulative probability function is given by

$$P(h) = \exp\left\{-\exp\left[-\left(\frac{h-\varepsilon}{\theta}\right)\right]\right\} \qquad (68)$$

Let

$$z = \frac{h-\varepsilon}{\theta}$$

where ε is a location parameter and θ is a scale parameter. Then

$$P(z) = \exp\{-\exp(-z)\} \qquad (70)$$

The density function is then given by

$$p(z) = \frac{d}{dz}P(z) = \exp\{-z\}\exp\{-\exp(-z)\}$$

i.e. the Type I density is

$$p(z) = \exp\{-z - \exp[-z]\}$$

A moment generating function is defined as

$$M\{t\} = \int_{-\infty}^{\infty} \exp\{tz\}\,p(z)\,dz$$

where t is a dummy variable. Setting $y = \exp(-z)$,

$$M\{t\} = \int_0^{\infty} y^{(1-t)-1}\exp\{-y\}\,dy = \Gamma(1-t)$$

which is the Gamma function with argument (1-t).

The basic properties of the moment generating function [Bury 1975; page 44] give the kth moment about the origin:

$$m_k(z) = \left.\frac{d^k}{dt^k}M\{t\}\right|_{t=0}$$

thus

$$m_1(z) = \left.\frac{d}{dt}\Gamma(1-t)\right|_{t=0} = \gamma \qquad (74)$$

where γ is Euler's Number (0.57722). Similarly, higher moments are found to be

$$m_2(z) = \gamma^2 + \frac{\pi^2}{6} = 1.97811$$
$$m_3(z) = 5.44487$$
$$m_4(z) = 23.56147$$

The first four central moments are given as

$$\mu_1(z) = 0$$
$$\mathrm{Var}(z) = \mu_2(z) = \frac{\pi^2}{6} = 1.64493$$
$$\mu_3(z) = 2.40411$$
$$\mu_4(z) = 14.61136$$

The first shape factor is constant and given by

$$\frac{\mu_3}{\mu_2^{3/2}} = 1.13955$$

Since the shape factor is constant, the Type I distribution is only capable of having one shape, which is shown in FIGURE 2.

The estimation of the parameters used in Eqn. 68 is achieved by using the properties of the parameterless version, Eqn. 70. From Eqn. 71,

$$E\{z\} = E\left\{\frac{h-\varepsilon}{\theta}\right\} = \frac{m_1(h)-\varepsilon}{\theta}$$

$$m_1(h) = \varepsilon + 0.57722\,\theta \qquad (75)$$

Similarly

$$\mathrm{Var}(z) = m_2(z) - m_1^2(z) = \frac{\mu_2(h)}{\theta^2}$$

$$\mu_2(h) = \frac{\pi^2}{6}\,\theta^2 \qquad (76)$$

Equations 75 and 76 form the basis of the method of moments described in SECTION 5.1.1.

A2 Properties of the Type II Distribution

The cumulative probability function is given by

$$P(h) = \exp\left\{-\left(\frac{h}{\theta}\right)^{-\alpha}\right\} \qquad h \ge 0;\ \theta > 0;\ \alpha > 0$$

where α is a shape parameter and θ is a scale parameter. The density is

$$p(h) = \frac{d}{dh}P(h) = \frac{\alpha}{\theta}\left(\frac{h}{\theta}\right)^{-(\alpha+1)}\exp\left\{-\left(\frac{h}{\theta}\right)^{-\alpha}\right\}$$

The kth moment is defined as

$$m_k(h) = \int_0^{\infty} h^k\,p(h)\,dh$$

Setting $t = (h/\theta)^{-\alpha}$,

$$m_k(t) = \int_0^{\infty} \theta^k\,t^{-k/\alpha}\,e^{-t}\,dt$$

$$m_k(t) = \theta^k\,\Gamma\{1-(k/\alpha)\}$$
The function $\Gamma\{1-k/\alpha\}$ is discontinuous for all integer values of $(k/\alpha)$, and this expression is only valid when $(k/\alpha) < 1$ [Gumbel 1958, pages 262-264]. This implies that when α is an integer, the expectation or moment of order α will not exist. In addition, when $\alpha \le 1$ the Type II distribution does not have a mean, and its variance cannot exist for $\alpha \le 2$. Thus, this distribution must be used with considerable care.

A3 Properties of the Type III_U Distribution

The cumulative probability function is

$$P(h) = \exp\left\{-\left(\frac{\varepsilon-h}{\theta}\right)^{\alpha}\right\} \qquad \text{for } \varepsilon-h > 0;\ \theta > 0;\ \alpha > 0$$

where ε is the highest wave ever possible, i.e. P(ε) = 1.0, and θ is a scale parameter.

The distribution may be simplified by setting $\lambda = (\varepsilon - h)$; the density then becomes

$$p(\lambda) = \frac{\alpha}{\theta}\left(\frac{\lambda}{\theta}\right)^{\alpha-1}\exp\left\{-\left(\frac{\lambda}{\theta}\right)^{\alpha}\right\} \qquad (80)$$

Again defining the kth moment by

$$M_k(\lambda) = \int_0^{\infty} \lambda^k\,p(\lambda)\,d\lambda = \int_0^{\infty} \frac{\alpha}{\theta}\left(\frac{\lambda}{\theta}\right)^{\alpha-1}\exp\left\{-\left(\frac{\lambda}{\theta}\right)^{\alpha}\right\}\lambda^k\,d\lambda$$

and setting $t = (\lambda/\theta)^{\alpha}$ gives the gamma integral

$$M_k(\lambda) = \int_0^{\infty} \theta^k\,t^{k/\alpha}\,e^{-t}\,dt \qquad (81)$$

$$M_k(\lambda) = \theta^k\,\Gamma\{1+(k/\alpha)\} \qquad (82)$$

The size of the moment is independent of ε. Unlike the moments of the Type II distribution, the moments of the Type III_U distribution are continuous. Equation 82 forms the basis of the method of moments described in SECTION 5.1.3. The treatment for the Type III_L distribution is basically the same and is easily derived by the same method.

A4 Properties of the Lognormal Distribution

The lognormal model is obtained by using the logarithm of the height as a reduced variate and applying the Normal model. The reduced variate $y = \ln(h)$ and the NORMAL probability function yield the Lognormal probability function

$$P(h) = \int_0^h \frac{1}{\alpha h\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(\frac{\ln h-\mu}{\alpha}\right)^2\right\}dh \qquad (83)$$

where μ is the scale parameter and α is a shape parameter. The density is given by

$$p_{\ell}(h) = \frac{1}{\alpha h\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(\frac{\ln h-\mu}{\alpha}\right)^2\right\} \qquad \text{for } h > 0;\ \alpha > 0;\ -\infty < \mu < \infty$$

The kth moment is given as

$$M_k(h) = \int_0^{\infty} h^k\,p_{\ell}(h)\,dh$$

Setting $y = \ln(h)$,

$$M_k(h) = \int_{-\infty}^{\infty} \exp\{ky\}\,p_n(y)\,dy \qquad (85)$$

where $p_n(y)$ is the Normal density.
Equation 85 is the standard form of the moment generating function with argument k, giving

$$M_k(h) = \exp\left\{\mu k + \frac{k^2\alpha^2}{2}\right\}$$

The first moment is then

$$E\{h\} = m_1(h) = \exp\left\{\mu + \frac{\alpha^2}{2}\right\} \qquad (86)$$

and the central moments are

$$\mu_1(h) = 0$$
$$\mu_2(h) = [\exp(\alpha^2)-1]\exp(2\mu+\alpha^2) \qquad (87)$$
$$\mu_3(h) = m_1^3\,(\gamma^6 + 3\gamma^4) \qquad (88)$$

where γ is the coefficient of dispersion

$$\gamma = \mu_2^{1/2}/m_1 = [\exp(\alpha^2)-1]^{1/2}$$

The first shape factor is given by

$$\phi = \gamma^3 + 3\gamma > 0 \qquad (89)$$

indicating that all lognormal densities are skewed to the right. The simultaneous solution of Eqns. 86 and 87 leads to the method of moments estimators given in Eqns. 22 and 23 of SECTION 5.1.4.

A5 Curvature of the Type II Distribution

The ordinate scale of Type I paper is given by

$$y = -\ln\{-\ln P(h)\}$$

The Type II probability function is

$$P(h) = \exp\left\{-\left(\frac{h}{\theta}\right)^{-\alpha}\right\}$$

Thus the equation of a Type II line on Type I paper is

$$y = \alpha(\ln h - \ln\theta) \qquad (90)$$

which has slope

$$\frac{dy}{dh} = \frac{\alpha}{h}$$

and curvature

$$\frac{d^2y}{dh^2} = -\frac{\alpha}{h^2} < 0$$

Hence the Type II distribution will have a negative curvature when plotted on Type I paper.

A6 Curvature of the Type III_U Distribution

The Type III_U probability function is given by

$$P(h) = \exp\left\{-\left(\frac{\varepsilon-h}{\theta}\right)^{\alpha}\right\}$$

Again using Eqn. 1 as the ordinate of the Type I paper, the equation of a Type III_U line on Type I paper is

$$y = -\alpha\ln(\varepsilon-h) + \alpha\ln\theta \qquad (91)$$

which has slope

$$\frac{dy}{dh} = +\frac{\alpha}{\varepsilon-h}$$

and curvature

$$\frac{d^2y}{dh^2} = \frac{\alpha}{(\varepsilon-h)^2} \ge 0$$

Thus, a Type III_U distribution will have positive curvature when plotted on Type I paper.

A7 Curvature of the Type III_L Distribution

The Type III_L probability function is

$$P(h) = 1 - \exp\left\{-\left(\frac{h-\varepsilon}{\theta}\right)^{\alpha}\right\}$$

As $P(h) \to 1$,

$$\ln P(h) = \ln\left[1 - \exp\left\{-\left(\frac{h-\varepsilon}{\theta}\right)^{\alpha}\right\}\right] \approx -\exp\left\{-\left(\frac{h-\varepsilon}{\theta}\right)^{\alpha}\right\}$$

Using Eqn. 1 as the ordinate of the Type I paper, the equation of the Type III_L distribution becomes:

$$y = \left(\frac{h-\varepsilon}{\theta}\right)^{\alpha} \qquad (92)$$

which has slope

$$\frac{dy}{dh} = \frac{\alpha}{\theta}\left(\frac{h-\varepsilon}{\theta}\right)^{\alpha-1}$$

and curvature

$$\frac{d^2y}{dh^2} = \frac{\alpha(\alpha-1)}{\theta^2}\left(\frac{h-\varepsilon}{\theta}\right)^{\alpha-2}$$

Since θ, α and (h-ε) are positive, the sign of $d^2y/dh^2$ is the sign of (α-1).
dh 2 Thus the curvature of the Type.Illy distribution becomes: i ) positive when a > 1 A i i ) zero when a = 1 H i ) negative when a < l
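The curvature results of A5 to A7 can be illustrated numerically. The sketch below is a hedged Python example and not part of the original thesis: the function names, the finite-difference step and the parameter values are assumptions chosen for illustration. It evaluates the Type I ordinate $y = -\ln\{-\ln P(h)\}$ for each distribution and estimates $d^{2}y/dh^{2}$ by central differences. For the Type III$_L$ case the exact probability function is used rather than the tail approximation, so only the cases $\alpha > 1$ and $\alpha < 1$ are examined; the boundary case $\alpha = 1$ is left out because the thesis result holds only in the limit $P(h) \to 1$.

```python
import math


def reduced_variate(log_p):
    """Ordinate of Type I paper (Eqn. 1), y = -ln(-ln P), taking ln P as input."""
    return -math.log(-log_p)


def log_p_type2(h, theta, alpha):
    """ln P(h) for the Type II law, P = exp{-(theta/h)^alpha}."""
    return -((theta / h) ** alpha)


def log_p_type3_upper(h, eps, theta, alpha):
    """ln P(h) for the Type III_U law, valid for h < eps."""
    return -(((eps - h) / theta) ** alpha)


def log_p_type3_lower(h, eps, theta, alpha):
    """ln P(h) for the Type III_L law, valid for h > eps.

    log1p keeps precision when P is close to 1.
    """
    return math.log1p(-math.exp(-(((h - eps) / theta) ** alpha)))


def curvature(log_p, h, dh=1e-3):
    """Central-difference estimate of d2y/dh2 at h on Type I paper."""
    def y(x):
        return reduced_variate(log_p(x))
    return (y(h + dh) - 2.0 * y(h) + y(h - dh)) / dh ** 2


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the thesis.
    c_type2 = curvature(lambda h: log_p_type2(h, 2.0, 3.0), 4.0)
    c_upper = curvature(lambda h: log_p_type3_upper(h, 10.0, 3.0, 2.0), 6.0)
    c_lower_a2 = curvature(lambda h: log_p_type3_lower(h, 0.0, 1.0, 2.0), 2.0)
    c_lower_a05 = curvature(lambda h: log_p_type3_lower(h, 0.0, 1.0, 0.5), 25.0)
    print("Type II:", c_type2, "analytic:", -3.0 / 4.0 ** 2)
    print("Type III_U:", c_upper, "analytic:", 2.0 / (10.0 - 6.0) ** 2)
    print("Type III_L, alpha = 2:", c_lower_a2)
    print("Type III_L, alpha = 0.5:", c_lower_a05)
```

Working with $\ln P(h)$ rather than $P(h)$ avoids loss of precision in the tail, where $P(h)$ is very close to 1.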
The statistical estimation of extreme waves MacKenzie, Neil Grant 1979
Item Metadata
Title | The statistical estimation of extreme waves |
Creator |
MacKenzie, Neil Grant |
Publisher | University of British Columbia |
Date Issued | 1979 |
Description | This thesis contains a review of existing statistical techniques for the prediction of extreme waves for coastal and offshore installation design. A description of the four most widely used probability distributions is given, together with a detailed discussion of the methods commonly used for the estimation of their parameters. Although several of these techniques have been in use for several years, it has never been satisfactorily shown which are capable of yielding the most reliable predictions. The main purpose of this thesis is to suggest a practical method of solving this problem and achieving the best estimate. The basic theory for the prediction of extreme values was described in detail by Gumbel (1958), who concentrated largely on the double exponential distribution which is named after him. In order to evaluate the quality of fit between this law and the data, Gumbel derived expressions which enabled one to plot confidence intervals to enclose the data. The method described in this thesis is partly an extension of Gumbel's work, and similar confidence interval methods are given for the remaining distributions, thus permitting direct comparisons to be drawn between their performances. The outcome of this is that the most reliable model of the data may be chosen, and hence the best prediction made. The method also contains a curvature test which has been devised to facilitate computation and lead more directly to the end result. The particular form of the wave data, which is quite different from wind records, is also taken into consideration and a working definition of the sample tail is suggested. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-03-05 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0062860 |
URI | http://hdl.handle.net/2429/21534 |
Degree |
Master of Applied Science - MASc |
Program |
Civil Engineering |
Affiliation |
Applied Science, Faculty of Civil Engineering, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |