A THEORETICAL STUDY OF TRANSFORM PICTURE CODING SYSTEMS AND AN INVESTIGATION OF A NEW MEASURE OF DISTORTION IN PROCESSED IMAGES by Alexander John Main Baillie B.A.Sc, University of British Columbia, 1971 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in the Department of Electrical Engineering We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA December 1973 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department o The University of British Columbia Vancouver 8, Canada Date Abstract The performance of systems which employ a linear transformation and block quantization to encode visual information sources has been investigated. The effect of several relevant design parameters upon the behavior of these systems was examined theoretically. The performance criterion used was the rate measured in bits-per-picture sample required to transmit an image through a noiseless digital channel such that the average distortion in the received picture is less than a specified value. As a measure of image distortion, the mean-squared-error, well-known for its analytical tractability, was used. A second measure of image distortion which incorporates knowledge of the human visual system was examined. The new distortion index proposed by Stockham is expected to correlate well with subjective assessments of picture distortion although no experiments have been performed, as yet, to show this. In order to determine what, if anything, could be learned about this second distortion measure from theoretical considerations alone, a procedure was developed to estimate its value for given bit rates. These values were compared to the mean~squared--error distortion values obtained for the same bit rates in an attempt to gain some insight into the usefulness of the second distortion measure. TABLE OF CONTENTS Page I. INTRODUCTION 1.1 Motivation 1 1.2 Background ' . .- 1 1.3 Outline of Thesis 2 II. MONOCHROMATIC IMAGE CODING BY LINEAR TRANSFORMATION AND BLOCK QUANTIZATION 2.1 Introduction- Why Transform Coding? 4 2.2 A Statistical Model for Motion Pictures 6 2.3 The Generalized Transform Coding System 7 2.4 Linear Transformations 11 2.5 Block Quantization 4 III. A THEORETICAL EVALUATION OF TRANSFORM CODING SYSTEM PERFORMANCE "3.1 Introduction 15 3.2 The Work of Habibi and Wintz 17 3.3 Factors Affecting the Performance of Transform Coding Systems 20 . 3.4 Basis-restricted Transformations and a Performance Measure for Transform Picture Coding 22 3.5 Numerical Results 26 3.6 Conclusions 3IV. A THEORETICAL INVESTIGATION OF AN ALTERNATIVE TO THE MEAN-SQUARED-ERROR DISTORTION MEASURE 4.1 Introduction 38 4.2 A Theoretical Assessment of Transform Coding Using Stockham's Distortion Measure 40 4.2.1 Preliminary Remarksiii 4.2.2 The Procedure Used to Evaluate System Perform ance With Stockham's Measure 41 4.2.3 The Computation of Se(u,v) 6 4.2.4 Numerical Results 53 4.3 Conclusions 61 V. CONCLUSION 5.1 Summary of Thesis 62 5.2 Future Research . 3 APPENDIX A. DEFINITIONS AND SOFTWARE IMPLEMENTATION OF THE DISCRETE KARHUNEN-LOEVE, FOURIER, AND HADAMARD TRANSFORMATIONS A.l General Definition of a Linear Transform 64 A. 2 The Discrete Karhunen-Loeve Transform 65 A. 3 The Discrete Fourier Transformation 7 A. 4 The Discrete Walsh or Hadamard Transformation 67 --A-i'5 '-Variances 'of Transform Coefficients 70 A. 6 Software Implementations . 72 A.6.1 The Discrete KL TransformA.6.2 The Discrete Fourier Transform 73 A.6.3 The Hadamard Transform . 73 A.6.4 Listing of the Subprogram KLVAR1 . . 73 A.6.5 Listing of the Subprogram FVAR1 . 74 A.6.6 Listing of the Subprograms HVAR1 and HAD2 .... 75 APPENDIX B. DETAILS CONCERNING PEARL'S PERFORMANCE MEASURE FOR TRANSFORM CODING SYSTEMS B. l Derivation of Equations (3.11) and (3.12) . 85 B.2 An Objection to Pearl's Measure 86 APPENDIX C. LISTINGS OF.THE PROGRAMS USED TO CALCULATE e^L,,T iv Cl Explanatory Remarks 88 C.2 Program Listings ' 88 REFERENCES . 96 a v LIST OF ILLUSTRATIONS Figure Page 2.1 Generalized transform coding system ' 8 3.1 Basis-restricted transform structure 23 3.2 . Comparison of rate versus distortion curves for one-, two-, and three-dimensional blocks of data when N = 64 and adjacent sample correlation equals 0-.85. (a) K-L. (b) Fourier. (c) Hadamard 28 3.3 Comparison.of rate versus distortion curves for K-L, Fourier, and Hadamard transforms when N = 64 and adjacent sample corre lation equals 0.85. (a) One-dimensional blocks, (b). Two-dimen sional blocks, (c) Three-dimensional blocks 29 3.4 Comparison of rate versus correlation curves for one-, two-, and three-dimensional blocks of data when N = 64 and the sig nal-to-noise ratio is 30 db. (a) K-L. (b) Fourier, (c) Hada mard 31 3.5 Comparison of rate versus correlation curves for K-L, Fourier, and Hadamard transforms when N = 64 and the signal-to-noise ratio is 30 db. (a) One-dimensional blocks, (b) Two-dimen-•'S'i©nal-»b4ook-r5, •• (o) • •Tbi?ee-d4mens*onal'>bio'cks »• . 3-2 3.6 Comparison of rate versus block length for one-, two-, and three-dimensional blocks of data when the adjacent sample correlation is 0.85 and the signal-to-noise ratio is 30 db. (a) K-L. (b) Fourier, (c) Hadamard 34' 3.7 Comparison of rate versus block length curves for K-L, Fourier, and Hadamard transforms when the adjacent sample correlation is 0.85 and the signal-to-noise ratio is 30 db. (a) One-dimen sional blocks, (b) Two-dimensional blocks, (c) Three-dimen sional blocks 35 4.1 Stockham's visual model . 39 4.2 Plot of equation (4.8) 45 2 4.3 The partitioning of an image into M subsections 47 vi LIST OF TABLES Table Page I Performance estimates for the one-dimensional Fourier transform coder (Block length = 64) 54 II Performance estimates for the one-dimensional Hadamard transform coder (Block length = 64) 55 III Performance estimates for the two-dimensional Fourier transform coder (Block length = 64) 57 IV Performance estimates for the two-dimensional Hadamard transform coder (Block length = 64) 57 V Performance estimates for the one-dimensional Fourier transform coder (Block length = 256) 59 VI Performance estimates for the one-dimensional Hadamard transform coder (Block length = 256) 60 vii ACKNOWLEDGEMENT I would like to thank my supervisor, Dr. Grant Anderson, for his constant encouragement, and inexhaustible patience during the course of this thesis. Thanks are also due to Miss Betty Cockburn and Miss Linda Christie who patiently and quickly typed the manuscript. The technical assistance of Messrs. A. Mackenzie and H. H. Black is greatly appreciated, as well. Grateful acknowledgement is given to the National Research Council for financial assistance under grant NRC 67-7994. Finally, I would like to thank my wife, Joanne, for making it all worthwhile. viii 1 I. INTRODUCTION 1.1 Motivation Experimental studies of transform picture coding systems are generally performed using large, high-speed, general-purpose digital computers to simulate the operation of the transform coder [l]-[2]. However, the time required to digitize the pictures, process them with the computer, and display the results has prevented researchers from performing a thorough examination of the effect of relevant parameters upon the performance of transform picture coders. In this thesis, a scheme for estimating the per formance of transform coders [3] is employed to predict system performance for a wide range of values of certain pertinent design parameters. The analytical technique used yields useful results in a fraction of the time taken to find corresponding results experimentally. Assessing the quality of processed images by subjective testing is another time-consuming and laborious task. Not surprisingly, objective measures of picture quality which correlate well with subjective judgements are constantly being sought after for their time- and labour-saving proper ties. One such measure proposed recently by Stockham [4] is studied in this thesis. 1.2 Background The concept of linear transformation and block quantization was first reported by Huang and Schultheiss [5]. Several researchers have since applied the technique to the coding of picture data. Wintz [6] has surveyed all known applications of linear transformation and block quantization to monochromatic image coding. Wilkins and Wintz [7] published a comprehensive bibliography of picture coding techniques, ranging from pulse-code-modulation 2 (PCM) through a host of ad hoc intuitive schemes. Many published reports of transform coding schemes are included in their listings. Also included in [7] are lists of published papers dealing with picture properties and models, and with various aspects of human visual information processing. Knowledge of how humans process visual data has been utilized by researchers [4][8] to define objective measures of picture quality. Wilder [8] studied fifteen objective error measures, each of which involved some function of the point-by-point difference between the original scene and the processed version. He concluded that point-by-point error measures were not subjectively relevant from the viewpoint of overall quality. Further, he suggested that future investigators should choose objective error measures which take area properties of vision into account. In particular, he recommended considering pictures from a spatial frequency viewpoint as Stockham has clone [4]. 1.3 Outline of Thesis In Chapter 2, the reasons for studying transform coding systems are discussed, a theoretical model for picture data sources is outlined, and a generalized transform coder is described. Chapter 3 is devoted to a theoretical evaluation of transform pic ture coding system performance. The results are presented in the form of curves which show the effects on system performance of varying the values of several important design parameters. The performance evaluations used in this study are based upon a mean-squared-error distortion measure and certain applicable rate-versus-distortion expressions due to Pearl et. al. [3]. Also in Chapter 3 is a summary of the work of Habibi and Wintz [9], included for the following reasons: (1) Theirs is the only study which compares the performance of the three linear transformations most frequently used in transform coding research, and (2) An understanding of the theoretical technique which they used to estimate the performance'of transform coding systems helps one to appreci ate the simplicity and usefulness of the method which was used in this thesis. In Chapter 4, an objective measure of image distortion proposed by Stockham [4] is investigated by comparing it to the mean-squared-error distortion criterion. A procedure is developed for estimating the value of the new measure for images which have been transform coded. The results are presented in tabular form for both one- and too-dimensional Fourier and Had amard transforms. Finally, Chapter 5 contains a summary of the thesis and an outline of possible directions for future research. II. MONOCHROMATIC IMAGE CODING BY LINEAR TRANSFORMATION AND BLOCK QUANTIZATION 2.1 Introduction- Why Transform Coding? Since analog sources have infinite entropy and practical channels have finite capacity, analog source outputs cannot be transmitted with arb itrarily small error. This fact, embodied in the converse to the noisy channel coding theorem [10], motivated studies of how to efficiently approx imate analog sources with finite entropy discrete sources in such a way that tolerable distortion occurs relative to an appropriate fidelity criter ion. The simplest and most widely employed method to accomplish the above stated task is to sample the analog source output and then to quantize each sample using a uniform quantizer. The resultant signal is a pulse-code-modulated £P.CM) .version,.of ...the ..analog signal,. In the case of a monochromatic picture source, for which the brightness signal varies with too spatial variables and time, it has been found that sampling at 15 to 60 frames per second and 500 x 500 samples per frame using 16 to 256 uniformly spaced quantizing levels per sample point yields PCM pictures with quality comparable to that of commercial television pictures. The channel capacity required to transmit PCM pictures of this 7 8 quality is seen to be about 10 to 10 bits per second which is undesirably large [11]. Thus, although digitizing the analog picture source output in the above manner does yield a finite entropy approximation, the approximation is inefficient. Statistical relationships between neighbouring samples remain to be exploited. In addition, it is expected that psychovisual properties can be utilized to further reduce the amount of information re quired to adequately represent the source output. That this should be poss ible follows from the fact that the digitization scheme seeks to approximate the analog brightness values as closely as possible at each sample point. This implies a sample-by-sample error criterion wThich, as Wilder [8] pointed out, is not subjectively relevant since the eye does not operate on a point-by-point basis. Methods have been studied [12] to reduce the information rate of digitized analog sources below that required for PCM without employing the inter-sample correlations. These methods require either sophisticated coding schemes such as the Huffman procedure or non-uniform quantization in order to obtain relatively small rate decreases. As- alternatives to PCM, several techniques have been proposed [7]. -'Atnes.-g-'-t-hem, -one technique -that- has-'been -receiving-much- -attention recently [6] is coding by linear transformation and block quantization. As stated above, simply digitizing an analog source output yields an inefficient finite entropy approximation to the analog source. Linear transform coding seeks to correct this inefficiency by eliminating some of the redundancy contained in the discrete version of the original source. In addition, the block quantizer can be designed in accordance with known properties of human vision to fur ther enhance the performance of the transform coder. The remainder of this chapter is devoted to explaining the opera tion of a generalized transform coding system for source encoding picture data. In Chapter 3 the performance of this system will be studied. 2.2 A Statistical Model for Motion Pictures Before describing the generalized transform coding system, a few preliminary observations are in order concerning the statistics of the picture data upon which the system will operate. Franks [13] derived an expression for the power spectral density of the electrical process gener ated by linear sequential scanning of a rectangular portion of an infinite, two-dimensional random picture. He found that the power spectral density of a sampled picture source could be expressed as a product of three factors, characterizing separately the influence of point-to-point, line-to-line, and frame-to-frame correlation. In addition to being separable in space and time, the autocorrelation function for his random picture model had an exponential form in each dimension. To express these ideas in mathematical form, let f(x,y,t) represent the picture brightness at the point (x,y,t). Then, the autocorrelation function determined by Franks takes the follox^ing form: R(Ax, Ay,At) = E{f (x,y, t) f (x' ,y', t') } = a2p ~IAX'p "l^lp "lAtl (2.1) x y r 2 2 where Ax = x-x', Ay = y-y', At = t-t', a - E{f (x,y,t)}, E is the expect ation operator, and p^, p^, and p^_ are the correlation coefficients of brightness between adjacent picture elements in the x-, y-, and t-dimension, respectively. Experimental measurements by Kretzmer [14] placed pfc in the range 0.80 to 0.86 for typical television material. Typical p and p x y values, as measured by Kretzmer, were slightly higher. Of course, the corre lation coefficients depend upon the content of the picture material. A random field having an autocorrelation function of the form given in equation (2.1) is known as a wide-sense stationary Markov field, in addition to the 7 Markov assumption, we will also assume the picture data to be Gaussian and zero-mean. Actually, the assumption that f(x,y,t) is zero-mean is implicit in (2.1). Given that f(x,y,t) is not zero-mean, a simple change of scale can make it so. The primary motivation behind using this model, apart from its agreement with certain experimental measurements [14] on picture data, is its analytical tractability. In particular, the separability of the auto correlation function is a powerful feature that permits a multi-dimensional analysis of the generalized transform coding system. The assumption of Gaussianity will allow other analytic simplifications in certain theoretical evaluations to follow. 2.3 The Generalized Transform Coding System The system depicted in Figure 2.1 is a generalized transform coding system. The system is generalized in the sense that it can operate on the digitized source data in blocks of any desired dimension up to the dimension of the continuous function. In the case of monochromatic motion pictures, the maximum dimension is three since f(x,y,t) prescribes brightness as a function of two spatial variables and time. Past studies [6] have only considered transform coding systems which blocked the picture data in one-and two-dimensional blocks for encoding. This was a natural consequence of the fact that the digitization of a continuous picture is performed by a flying-spot scanner or similar sequential device that generates picture samples from one complete frame, scanning a line at a time, before proceeding to scan the next frame. To obtain three-dimensional blocks of digitized picture data would require storing previous frames. Although such a proced ure is prohibitively expensive today, future technological innovations in GITIZER \ BLOCK ORGANIZATION u LINEAR TRANSFORM v = Au ENCODER v=0(v ) Sawj DIGITAL CHANNEL -A CONVERTER u= A v INVERSE I v = Q/( v') LINEAR TRANSFORM Figure 2.1 Generalized transform coding system co 9 the area of random access, high density storage devices could make three-dimensional transform coding feasible. For this reason, and because the analytical tools exist to handle the three-dimensional case, transform coders of one-, two-, and three-dimensions will be considered here. In order to isolate the source encoder and measure its effect upon the system's performance, it is convenient to conceptually partition the transform coding system as shown in Figure 2.1. The digitizer uses sampling and quantization to convert the con tinuous source function f(x,y,t) into a three-dimensional array of discrete sample values, f'(x.,y.,t, ). 1 j k. The block organization unit partitions f' into sub-arrays of spec ified dimension and size, N, and then rearranges the elements of the sub-arrays to form one-dimensional vectors, u, of length N. o The,.linear -.transform and .r.the. .encoder ..following it-comprise the source encoding section. The transform multiplies the vector, u, by the matrix, A, to yield a transform vector, v. The results of Huang and Schul-theiss [5] suggest that an appropriate choice of the matrix, A, could result in the elements of v being uncorrelated. Because of the assumed Gaussianity of the original samples and the linearity of the matrix operation, the ele ments of v would then also be statistically independent. If the transformed samples contained in v are statistically independent, the task of the encoder is significantly simplified. To understand why this is so, consider the operation of the encoder. The encoder quantizes the elements of v and ass igns coding bits to each of them in preparation for transmission over the digital channel. This process can be viewed as a mapping, (">(•)> of the N-dimensional space in which v is defined into the N-dimensional space in which v1 is defined. Since the elements of v' are quantized versions of the •elements in v, the operator Q(0> is seen to map the v-space into itself. In general, when the elements of v are correlated, such a mapping, if opti mum, requires knowledge of all N components of v for determination of each component of v'. But, if the components of v are independent, then the i1**1 element of v' can be determined from knowledge of the i*^1 element of v alone, ignoring the other N-1 elements of v. Thus, a linear transformation which decorrelates the samples contained in v serves not only to eliminate the inherent redundancy in the digitized image samples but also to simplify the computational task of an optimum encoder unit significantly. For our purpose, the digital channel is restricted to performing an operation representable as an identity matrix. At the receiving end of the digital channel, the decoder functions to recover v' as accurately as possible under a specified error criterion. ••Prom--the 'restriction -of - ~lrtie--,above--'<p&rag-T?aphj --v is •'identical- -to v'. -The de coder output v is transformed using A \ the inverse of the matrix used earlier to transform the vector u. The effect of this matrix operation is to yield an approximation u to the data vector, u. Finally, the D-A con verter operates upon u to produce a continuous approximation, f(x,y,t), to f(x,y,t), the original image source function. The objective of the system depicted in Figure 2.1, as of any digi tal communication system, is to minimize the number of bits used to transmit v' over the digital channel while maintaining a tolerable fixed distortion between f(x,y,t) and its reproduction, f(x,y,t). Our particular interest in this study is to consider that portion of the system which operates upon the vector u to obtain v', that is, the source encoder. The remaining sec tions of the system are assumed fixed. In the next chapter, we consider 11 varying the dimensionality and total size of the sub-array from which the vector u is formed, and the type of matrix used to transform u in order to determine their effects upon system performance. As a means to isolate the performance of this portion of the system, we choose to measure the distor tion between u and u rather than between f and f. As well, the channel is assumed to have a transition matrix equal to the identity matrix and the decoder is assumed to be errorless as implied earlier. 2.4 Linear Transformations As explained previously, transform coding schemes are motivated by the notion that if one succeeds in transforming the vector u (See Figure 2.1) to another vector v whose coordinates are statistically independent, then the task of assigning bits to the elements of v is greatly simplified. . Such a transformation has not been -found. However, a linear transformation that produces uncorrelated components does exist [5][15]. The matrix A which accomplishes this transformation has as its columns the normalized eigenvectors of the N x N covariance matrix 4> given by: $u = E{uu } (2.2) where "T" denotes a transpose. That is, the columns of A are the vector solutions, x, of the matrix equation: <J> x = Xx (2.3) u Since A decorrelates u, the covariance matrix of v is given by: $v = E{w } = X± 0 -"0 0 X •••0 o o N (2.4) 12 where the X/s are the eigenvalues of the matrix $u in equation (2.3). This transformation is known as the eigenvector, Hotelling [15], or discrete Karhunen-Loeve transformation. Transformations other than the eigenvector transformation have been applied to image coding. Although these transformations do not complete ly decorrelate the image samples, they offer significant computational ad vantages over the eigenvector transformation. Among these transformations are the discrete Fourier transformation and the Hadamard or discrete Walsh transformation. Consider the discrete Fourier transform (DFT) of a one-dimensional 2 array x of length M which has the form: M2-l * • 2 y(£) = 2 x(k)a (£,k) , I = 0,1 M -1 (2.5) k=0 7— 2 where a'(£,k) = 1/M exp[-2nY-l(k£.)/M ] and y is the transform vector. Also consider the two-dimensional DFT of the array w of size M x M which has the form: M~l M-l z(m,n) = E E w(i,j)a*(m,n,i,j) (2.6) i=0 j=0 m,n = 0,1,...,M-1 where a*(m,n,i,j) = 1/M exp[-2Trv^I(im + jn)/M] and z is the resultant trans form matrix. Now, (2.6) can be written in a form similar to (2.5) if: I. = mM + n, and ^ 7) k = iM + j so that y(£) = z(m,n) , and (2.8) x(k) = w(i,j). 13 The result is: 2 M -1 . o y(£) = Z x(k)d (£,k) , I = 0,1,...,M -1 (2.9) k=0 where d*(£,k) = a*(m,n,i,j) in (2.6) since k = k(i,j) and £ = £(m,n). Clear ly, the vector y in (2.9) is not the one-dimensional DFT of the vector x, as is the case in (2.5). This is because d*(£,k) in (2.9) is not identical to a*(£,k) in (2.5). In fact, the elements of y in (2.9) are the elements of the two-dimensional transform matrix z rearranged as a vector according to the rules in (2.7) and (2.8). The point to be made here is that although the DFT of a multi-dimensional array can be represented and implemented as the transformation of a one-dimensional array, that transformation is not the one-dimensional DFT of the vector. As a consequence of this fact, even though the operation of the system depicted in Figure 2.1 is conceptually correct,*'sub'sequeivt'discus'sions'will not treat .the "linear trans formation of multi-dimensional data blocks as the transformation of one-dimensional data vectors > Both the Fourier and Hadamard transforms of multi-dimensional arrays are implementable as a sequence of one-dimensional transformations [16], Furthermore, both the Fourier and Hadamard one-dimensional transforms are members of a class of Kroneckered matrix transformations [17] that can be implemented with much less computational effort than the eigenvector trans formation. This fact is reflected in the fast transform algorithms which perform the Fourier and Hadamard transforms in Nlog^N basic operations as 2 compared to N basic operations for the eigenvector transform [16][18]. The relative advantages of the Fourier and Hadamard transforms over the Hotell-ing transform in terms of computational simplicity can be seen to increase 14 with both the size and dimension of the array being transformed. The moti vation for using the Fourier and Hadamard transforms in a transform coding system is the belief that the computational saving gained will outweigh any loss suffered due to the fact that these transforms only partially decorre late the data arrays. Definitions of the Hadamard, Fourier, and Karhunen-Loeve transforms along with their implementation in software are presented in Appendix A. 2.5 Block Quantization 2 Samples of the image source have variances equal to a in accord ance with equation (2.1). The effect of linearly transforming an array of these samples with any of the previously mentioned transforms is to yield an array of transform samples with unequal variances. Huang and Schultheiss [5] found an algorithm that minimized the total mean-squared quantizing error incurred in quantizing such an array of transform samples independently. They found that the number of bits used to quantize a transform coefficient, 2 2 y, should be proportional to log where a^ is the variance of the coeffi cient. Their procedure, known as block quantization, is significantly more efficient than simply assigning an equal number of bits to each sample and is the basis for using linear transformation and block quantization coding systems. 15 III. A THEORETICAL EVALUATION OF TRANSFORM CODING SYSTEM PERFORMANCE ,3.1 Introduction In Chapter 2, the discrete Karhunen-Loeve, Fourier, and Hadamard transformations were viewed as a means to'decorrelate arrays of picture data. Conceptually, these transformations attempt to achieve that objective by performing a generalized "rotation" of the data arrays in a multi-dimension al space. Alternatively, the previously considered transformations can be interpreted as the representation of data arrays by linear combinations of basis arrays. For the purpose of this discussion, consider the following DFT of a multi-dimensional array of data: M-l M-l M-l . F(£,m,n) = E E E f (i, j ,k) a" (£,m,n,i, j ,k) (3.1) i=o j=0 k=0 •.%«t?n .= ,,0.SL5 ... .^M-l where a*(£,m,n,i, j ,k) = 1/M exp[-2TT/-1 (i£ + jm + kn)/M]. The data samples, f(i,j,k), can be recovered from the Fourier transform coefficients, F(£,m,n), by performing the inverse transformation shown in (3.2). M-l M-l M-l f(i,j,k) = E E E F(£,m,n)a(i,j,k,£,m,n) (3.2) £=0 m=0 n=0 i, j,k = 0,1,...,M-1. In (3.2), the components of the data array are represented as a weighted sum 3 of the M three-dimensional arrays {a(i,j,k,£,m,n), £,m,n = 0,1,...,M-1}. These arrays, which are mutually orthonormal, form a set of basis arrays that span a three-dimensional space. The Fourier, Hadamard, and Karhunen-Loeve discrete transformations of any dimensionality are also defined in terms of sets of orthonormal basis arrays that have the same dimensionality as the 16 transform to which, they correspond. The factors which, weight the basis arrays in (3.2) are simply the Fourier transform coefficients defined in (3.1). Since the variances of -these coefficients are unequal, as pointed out in Chapter 2, it is possible to order the basis arrays according to the variances of the coefficients which weight them. The smaller the variance of the coefficient weighting a basis array, the less that basis array contributes, on the average, to the total sum in (3.2). In fact, the average contribution of some basis arrays may be insignificant, statistically speaking, if the coefficients which weight them have variances much smaller than the typical variance in the set. For example, suppose the variances of the set of coefficients de-3 fined in (3.1) are distributed in such a way that only the first L of the 3 M basis arrays contribute significantly to the sum in (3.2). Then the sum "in X3.2) can be approximated "by: L-l L-l L-l f(i,j,k) - f(i,j,k) = E E E [F(£,m,n)a(i,j,k,£,m,n)] (3.3) L=0 m=0 n=0-i,j,k = 0,1,...,M-1. 2 The mean-squared difference, e , between the set {f(i,j,k), i,j,k = 0,1,..,M-l} and the set {f(i,j,k), i,j,k = 0,1,...,M-l} is often used as a measure of the error involved in approximating (3.2) by (3.3). The approximation error, ~~2 . . v E , is given by: 3 ? 3 . M-l M-l M-l - 2 e * = 1/M"3 E{E E E [f(i,j,k) - f(i,j,k)]^. (3.4) a 1=0 j=0 k=0 Substituting (3.2) and (3.3) into (3.4) and employing the orthonormal proper ty of the basis arrays yields the following expressions: 17 , 3 M-l M-l M-l ? = 1/MJ E E E a (£,m,n) (3.5) = R(0,0,0) - 1/M3 I;-1 If1 E'"1 o2U,m,n) £=0 m=0 n=0 where a (£,m,n) = E{F(£,m,n)F*(:l,m,n) } and R(0,0,0) = E{f (i,j,k)} as defined 2 in (2.1). Through equation (3.5) we see that e depends upon R(Ax,Ay,At), the autocorrelation function for the data, since the variances of the trans-'form coefficients depend upon the autocorrelation function of the original data. To this point, the present discussion has used the three-dimension al discrete Fourier transformation as a vehicle for explanation. The concepts, however, apply generally to each of the other transformations considered. - ^ -*---'»*» The transformation which minimizes e is the Hotelling transform a <%ieoause •:i:t"'-psGics-i,th'e,"Eost -variance-into ''the -fewest •transform coefficients [6]. Thus, from both points of view - decorrelating data samples or minimizing mean-squared approximation error - the discrete Karhunen-Loeve transform is optimal. 3.2 The Work of Habibi and Wintz Habibi and Wintz [9] studied, theoretically and experimentally, the performance of transform coding systems which used the two-dimensional Hadamard, Fourier, and Hotelling transformations to process monochromatic still pictures. Their picture data consisted of three digitized pictures of small, moderate, and large amounts of detail corresponding respectively to strong, moderate, and weak correlation between picture samples. The original continuous pictures were sampled at a resolution of 256 x 256 picture elements (pels) per picture and quantized uniformly to eight bits per pel to obtain 18 the digitized "originals". Each picture was processed in blocks of 16 x 16 samples. Each of the 256-point blocks was assumed to be a sample function of a homogeneous, zero-mean, Gaussian-Markov random field having an autocorre lation function of the form: R(Ax, Ay) = E{u(x,y)u(x',y')} = oV" I Ax' ~6' Ay' (3.6) where u(x,y) is the value of the random field at point (x,y), e = p , e ^ = Py [See (2.1)], and Ax = x-x', and Ay = y-y'. Values of a and 8 were estimated for each picture and used in (3.6) when designing the experimental coding system. Habibi and Wintz made theoretical estimates of the transform coding system's performance by calculating the total mean-squared difference between the digitized original image and its reproduction at the output when a speci fied number of bits per pel were used to code the picture. The total mean-~~2 . 2 squared-error, £ , was calculated as the sum [19J of the two terms: e , ci 2 the previously mentioned approximation error, and e , the error incurred by 2 quantizing each block of 16 x 16 transform samples, e can be computed if SL the number of transform coefficients used in the approximation (3.3) and their 2 variances are known. depends upon the variances and probability density functions of the transform coefficients being quantized, upon the type of quantizer used, and upon the number of bits used to quantize and code each transform coefficient. Using the fact that the transform coefficients are normally distributed and assuming the quantizer for each sample to be an op timum uniform quantizer [20] (in the mean-squared-sense), Habibi and Wintz 2 also obtained an expression for . The sum of the two error expressions, 2 given in (3.7), can be minimized by appropriate choice of the number, L , of transform coefficients retained in the approximation, and the number, 19 b(l,m), of bits allocated to each of the retained coefficients for the pur poses of quantization and coding. ~2 2 , 2 e = e + E a 1 (3.7) , 2 L-l L-l 2 = R(0,0) - 1/M E E a (H,m){l - exp[-l/2 b(£,m)ln 10]} £=0 m=0 where M = 16 in the Habibi and Wintz study. Imposing a constraint on the total number of bits, B, used to code each block of transformed samples, Habibi and Wintz used a simple bit assign ment algorithm published by Kurtenbach and Wintz [21] that assigns B bits to 2 2 L samples having variances {a (£,m), l,m = 0,1,...,L-1}, in a nearly optimum manner. This rule was used for the Fourier (F), Hadamard (H), and Karhunen-Loeve (KL) transformations for various values of B. For each B, a value for ~~2 '1-was chosen, >the .rule.*applded.s «and >.the>'-'r-eS'U-itin.g e compu-t-ed. This was re peated for various L values to determine the optimum values for L and {b(£,m), £,m = 0,1,...,L-1}. Each B value defined a particular rate measured in bits per pel given by: R = B/M2 = B/256 (3.8) which Habibi and Wintz plotted versus the normalized mean-squared-error, ~~2 E /R(0,0). The performances of the various transformations were then compared on the basis of the bit rate required to achieve a specified mean-squared-error. In order to compare the theoretically estimated performances of the various transforms with the theoretical lower limit, Habibi and Wintz derived an expression which they erroneously believed was Shannon's rate-distortion function for the Gaussian-Markov field. The correct expression [22] yields 20 a lower bound than their expression. 3.3 Factors Affecting the Performance of Transform Coding Systems Habibi and Wintz [9] considered one particular transform coding system which was restricted to processing still pictures by successively coding two-dimensional blocks of fixed size using either the Hadamard, Fourier or Karhunen-Loeve and block quantization. However, intelligent design of a practical transform coding system requires an understanding of the influence upon system performance of many parameters which Habibi and Wintz held con stant. In addition, since the data set they used was of limited size, the dependence of system performance upon the statistics of the source being coded was not adequately considered. In general, the factors which affect the performance of a system will depend upon the way in which the performance is defined, that is, upon the performance criterion used. When transform coding systems that process digitized picture data are assessed in terms of rate versus mean-squared-error distortion, then the effect of the following parameters must be exam ined: (1) The type of transform used. Although the discrete KL trans form is theoretically optimum, the F and H transforms are. of interest to designers since they offer significant savings in their implementation that may offset their suboptimum performance. (2) The dimensionality of the data blocks upon which the system operates. Although Habibi and Wintz only considered two-dimensional blocks, it is of interest to designers to know the performance degradation that re sults when one-dimensional blocks are processed. Again the question arises as to the trade-off between cost of implementation and the performance im-21 provement obtained in going to a higher dimensionality system. In addition to one- and two-dimensional processing, it is interesting, in view of its possibility in the future, to examine three-dimensional processing. Three-dimensional transform coding is an inter-frame procedure whereas one- and two-dimensional coding are intra-frame techniques that process motion picture data frame-by-frame without utilizing the statistical redundancy contained in successive frames. (3) The size of the data blocks upon which the system operates. For reasons related to the ease of implementation and accuracy .obtainable in computing the eigenvectors used in the Karhunen-Loeve transform, Habibi and Wintz settled upon a block size of 256 (16 x 16). Others have used smaller [1] and larger [23] block sizes. The choice of block size, as with the dimen sionality of the block, involves a trade-off between the higher cost of pro cessing larger blocks and the performance gains obtained by exploiting inter-sample correlations to a greater extent. (4) The nature of the picture being processed. As already observed, Habibi and Wintz used a limited set of picture data to evaluate their trans form coding system. Assuming the autocorrelation function of the picture data (scaled to zero-mean) to have the form specified in (3.6), they estimated a and 3 for each picture in the set and used the estimates in both their theoretical and experimental studies. However, the range of a and 3 for these pictures was small and thus not likely representative of all possible picture material. Moreover, it does not seem that Habibi and Wintz were con cerned, in this particular research [9], with the effect of the source stat istics upon system performance. It is, however, of interest to designers to know the extent to which the system performance is sensitive to the source 22 statistics. In the next section, an analytical technique will be introduced that enables the effect of the above factors upon the transform coding syst em's performance to be evaluated. 3.4 Basis-restricted Transformations and a Performance Measure for Transform Picture Coding Consider once more the generalized transform coding system depicted in Figure 2.1. Pearl, Andrews, and Pratt [3] realized that for any practical implementation of this system, irrespective of the particular transform used, the individual transform coefficients which comprise the vector v would be encoded independently. It was noted earlier that since the eigenvector trans form generated statistically independent transform coefficients, the subse quent encoding process would be implementable with a computational complexity of degree N rather than N"" as required for the original correlated data. However, as Pearl et. al. point out, the design requirement of computational simplicity constrains the encoder to operate upon each transform coefficient independently even if the transform used has only partially decorrelated the coefficients. This constraint, which is also imposed upon the decoder, gives the encoder-decoder chain the structure depicted in Figure 3.1. Pearl et. al. called this structure basis-restricted transformation to indicate that once the transform coefficients are found they are to be treated independently in further source encoding operations such as quantization and binary coding, until transformed back to the original basis of u. As a performance criterion for transform coding schemes, Pearl et. al. proposed measuring the minimun information rate, R, that can be achieved while still maintaining a fixed mean-squared-error distortion, D, within the frame-U> 24 work of the basis-restricted.structure. Pearl.et. al. summarized the issues involved in designing a transform coding system'very clearly in the following way: "Referring to Figure 3.1, the designer is seeking to optimize the design of each P^, knowing the transformation, A, and the covariance matrix, <j> , of the input vector, u. The designer should decide on the number of quantizer levels, assigned to the i*"*1 transform coefficient and the exact partition of the complex plane v^ into disjoint regions. Equivalently, we can say that the designer attempts to distribute a certain distortion to each encoder in a way that would minimize the total rate per sample, R„(A,D) = 1/N E R,(D.) (3.9) i=l 11 subject to: N N ? D = 1/N E D. = E{l/N E Iv. - v.I } i=l i 1=1 1 i i1 (3.10) = E{l/N| |u - u\ |2} ."• Assuming, as did Habibi and Wintz, that the picture samples in f'(x.,y.,t ), and hence u as well, are quantized sufficiently finely to be I j k. approximated as a set of continuous Gaussian random variables, Pearl et. al. went on to minimize the rate expression in (3.9) subject to the constraint (3.10). Actually, the minimization problem had been solved previously, as pointed out in Appendix B where the details leading up to the result of Pearl and his colleagues are given. Their result takes the form of an N-block rate-versus-distortion func tion expressed via the parameter 0 as shown below. 25 P^(A,D) = 1/N I 1/2 log (a.2/min [cr 2,8]) (3.11) 1=1 N 2 D = 1/N Z min[a. ,8] (3.12) i=l 1 2 tTi where a. is the variance of the i transform coefficient, i The transform coding system considered here is constrained to oper ate upon each input vector u independently and upon each transform coefficient independently, as emphasized earlier. However, Shannon's rate-distortion function for a single Gaussian random variable, which is used in deriving the above result, is only attained in the limit of infinitely long blocks, which is contrary to these agreed upon encoder constraints. This does not destroy the utility of (3.11) and (3.12) as a figure of merit, however, since -"<tbe. -performance. >of'-'the -'-system-under "'-'upon-arrival" coding constraints • takes the same functional form as (3.11) and (3.12). [See Appendix B]. In fact, the degradation from the curves stipulated by (3.11) and (3.12) can be kept to within 1/4 bit per picture sample for upon-arrival coding. The performance measure defined by (3.11) and (3.12), although simple in form, will be seen, in the next section, to be exceedingly useful in assessing the effect of the selection of the parameters discussed in section 3.3 upon the performance of the transform coder. To evaluate the transform coder for a specified set of parameters, all that is required is to compute the variances of the transform coefficients that result for those parameters and substitute them into the pair of equations (3.11) and (3.12), allowing 8 to vary over an appropriate range in order to obtain a complete R versus D curve. 26 Although the explanations of basis-restricted transformations and the performance criterion of Pearl et. al. have been developed in a one-dimensional framework, corresponding to the block diagram of Figure 2.1, analogous statements apply to higher dimensional systems. In particular, (3.11) and (3.12) apply exactly using the set of N variances corresponding to the N transform coefficients in the multi-dimensional code blocks. As shown in Appendix A, the variances of the transform coefficients obtained by coding multi-dimensional blocks can be simply computed as the products of transform variances obtained for certain associated one-dimensional coders. This simplification is shown to be possible because of the separability of the autocorrelation function defined in (2.1) into a product of one-dimensional autocorrelation functions. The variances of transform coefficients associated with one-dimensional data blocks are given by the diagonal elements "of-*the ..transform repxd.-s'erita'fion -of , irameTy $ , given by: $ = A $ A"1 (3.13) v u x ' where $ is defined in (2.2) and $vin (2.4). [See Appendix A]. 3.5 Numerical Results The "Pearl measure" of performance, (3.11) and (3.12), was used to evaluate the effect of the choice of several parameters, listed in section 3.3, upon the performance of the transform coder. The results of that study are presented in this section. All computations were performed on the IBM 360/67 general-purpose digital computer using the Fortran IV programming language. A few preliminary statements concerning notation are in order. For the purposes of this section, (2.1) has been modified to the following form: 2 -a1|AK|-a |Ay|-a |At| R(Ax,Ay,At) = a e (3, where cx^, o^* and are called the correlation parameters for the two spatial and time dimensions, respectively. The sub-blocks of f'Cx.jy.jt.) upon which the system operates each have N components, l j k where N = L^L^L^ and L^, L^, and specify the array size in the two spatial co-ordinates and time, respectively. When dealing with two-dimensional arrays, = 1 and for onerdimensional arrays, ~ 1 also. To determine the effect of dimensionality on transform coding efficiency, a block length of N = 64 was chosen and a^, c^* and ag were all 'set equal to' '0.8'5 "in agreement with values reported for experimental television data [14]. Figures 3.2(a), (b) , and (c) are plots of rate versus distortion for the KL, F, and H transforms, respectively, For the three-dimensional cases, = = L^ = 4 while = = 8 for the two-dimensional cases. It can be seen from Figure 3.2 that a reduction in rate is obtained by applying transform coding in two dimensions instead of one dimension. A somewhat lesser additional reduction is obtained by applying three-dimensional coding. The curves of Figure 3.2 are reassembled into groups of curves of the same dimensionality to form Figure 3.3. This is done to compare the performance of the different transform types. Although there is no great difference between the performance of the KL, F, and H transforms regardless of dimensionality, as witnessed by Figure 3.3 RATE IN LCGSIIS Pf<TE ill L0GB1IS • (b) Figure 3.2 Comparison of rate versus distortion curves for one-, two-, and three-dimensional blocks of data when N = 64 and adjacent sample correlation equals 0.85. (a) K-L. (b) Fourier, (c) Hadamard. Rfilt IN LOMiTS 3.7 I (c) to CO Pr.IE IN LK,S17S .0037 (b) Figure 3.3 Comparison of rate versus distortion curves for K-L, Fourier, and Hadamard transforms when N - 64 and ad jacent sample correlation equals 0.85. (a) One-dimen sional blocks, (b) Two-dimensional blocks, (c) Three-dimensional blocks. for the parameter values chosen, an interesting result can be seen. Whereas the Fourier transform outperforms the Hadamard transform in the one-dimensional case, the reverse is true for too- and three-dimensional coding. This result is interesting because it runs counter to the views expressed by many researchers who have implicitly ranked the Fourier transform as superior to the Hadamard transform for all dimensionalities on the basis of their results [3][9] for a single dimensionality. Also considered in this study was the effect on R of varying the degree of correlation of the picture, that is the picture statistics while holding N, D, L^, L^, and constant. A reasonably high quality picture requires a signal-to-noise ratio on the order of 30 decibels, where the signal-to-noise ratio is S/N (db) = 10 log10 a2/D . (3 Figure 3.4 illustrates the relation between rate and correlation for the different transform types at N = 64 and for a distortion corres ponding to output picture samples with a signal-to-noise ratio of 30 decibels. The picture data was modelled as having ct^ = = ot^ and being uniformly sampled in all three dimensions when calculating these curves. The dimensionless variable ALPHA on the abscissas of the curves is defined to be the product of the sampling interval and either al' a2' °r a3' w^:'-c^ are identical. It can be seen from Figure 3.4 that a substantial reduction in data rate is obtained by coding in two rather than one dimension, particularly for strongly correlated data, while a lesser saving is obtained by extension from two to three dimensions. ALPHA ." ALPHA ALPHA (a) .(b)- (c) Figure 3.4 Comparison of rate versus correlation curves three-dimensional blocks of data when N = 64 noise ratio is 30 db. (a) K-L. (b) Fourier. for one-, two-, and and the signal-to-(c) Hadamard. (a) (b) (c) Figure 3.5 Comparison of rate versus correlation curves for K-L, Fourier, and Hadamard transforms when N = 64 and the signal-to-noise ratio is 30 db. (a) One-dimensional blocks, (b) Two-dimensional blocks, (c) Three-dimensional blocks. 33 The curves of Figure 3.4 are regrouped in Figure 3.5 so as to relate the performance of the different transforms to one another as functions of correlation. Although differences in performance are slight, it can be seen that the Hadamard transform is superior to the Fourier transform in the three-dimensional case, in the two-dimensional case except for very weakly correlated data, and for strongly correlated data in the one-dimensional case. The dependence of rate upon block length, N, was computed for digitized picture data with a correlation coefficient of 0.85 between adjacent samples, under the restriction that an output signal-to-noise ratio of 30 decibels be maintained. The results for different dimensionalities and the KL, F, and H transforms are presented in Figure 3.6. As can be expected, in all instances a block length can 'be''reached -where" further gains in coding efficiency with increasing block length become negligible. The greater the dimensionality of the coding, the longer the block length at which the return in increased coding efficiency fails to justify.increasing block length. Figure 3.7 allows a comparison between the performances of the KL, F, and H transforms as a function of block length. Here the curves of Figure 3.6 have been rearranged in groups of like dimensionality. The important thing to note from Figure 3.7 is that, for the stated parameter values, the Hadamard transform outperforms the Fourier transform at short block lengths regardless of the dimensionality in coding. At longer block lengths, the reverse is true. 5.0 A Figure 3.6 Comparison of rate versus block length curves for one-, two-, and three-dimensional blocks of data when the adjacent sample correl ation is 0.85 and the signal-to-noise ratio is 30 db. (a) K-L. (b) Fourier, (c) Hadamard. Figure 3.7 Comparison of rate versus block length curves for K-1, Fourier and Hadamard transforms when the adjacent sample correlation is 0.85 and the signal-to-noise ratio is 30 db. (a) One-dimensional blocks, (b) Two-dimensional blocks, (c) Three-dimensional blocks. 3.6 Conclusions In the foregoing, we have established meaningful theoretical performance curves for picture data of varying correlation operated upon by transform coding systems of different coding dimensionality, transform type, and block length. It has been shown that reductions in required rate can be achieved by increasing coding dimensionality, particularly in the case of the extension from one to two dimensions. It has also been shown that for relatively short block lengths and strongly correlated data, the Hadamard transform outperforms the Fourier transform. Or, rather, at least this is true for the case of the mean-squared-error distortion criterion and a homogeneous, Gaussian-Markov random field. This is true irrespective of coding dimensionality and runs counter to results at long block lengths when the Fourier transform approximates the optimal KL transform in performance [22][24]. The gains in coding efficiency to be obtained by increasing block length were determined for picture data modelled as a homogeneous Gaussian-Markov random field. In particular, it was shown that the greater the coding dimensionality, the longer the block length at which appreciable reductions in rate can still be achieved by increasing block length. To obtain the above results, a performance measure due to Pearl et. al. [3] was used. Similar results could have been obtained by applying the formulas derived by Habibi and Wintz [9] which include the effects of an actual block quantization strategy. However, compare the computational effort involved in finding the optimal combination of L and {b(l.m), 1, m = 0, 1, L-l} for each value of M (3.7) in Habibi and Wintz' approach to that required in evaluating equations (3.11) and (3.12) for a particular value of 6. Pearl's measure can be shown to yield results which are within 1/4 bit per sample point of rates theoretically obtainable at the present time. The Pearl measure, it can be concluded, is an extremely useful tool which offers simplicity in computation and flexibility in the choice of which parameters to study in terms of their influence upon system performance. Finally, a statement concerning a major issue in the design of engineering systems, the matter of cost, should be made in connection with transform coding systems. It is clear that the implementation cost grows linearly with the block size, N, since the expense is mainly for memory. Just how cost grows with dimensionality,-however, is difficult, if not impossible to determine with any precision. Currently, it is reasonable to argue that three-dimensional systems would be significantly more expensive than too-dimensional systems which in turn are somewhat more expensive than one-dimensional systems. Future hardware advances could lessen or eliminate problems with hardware costs for three-dimensional systems. The linear transform is more cheaply implemented as a Hadamard transform than a Fourier transform the reason being that the Hadamard transformation really involves only addition and not multiplication due to the transform elements being either +1 or -1. The fast Fourier transform algorithm and the similarly structured fast Hadamard transform algorithm make both sub-optimal transforms interesting in practical situations [25] whereas the KL transform, due to a lack of similar implementation efficiencies, is of interest only from a theoretical standpoint. 38 IV A THEORETICAL INVESTIGATION OF AN.ALTERNATIVE' TO THE MEAN-SQUARED-ERROR DISTORTION MEASURE 4.1 Introduction Successful source coding depends upon an adequate understanding of both the source and the user. If the characteristics of the user are understood sufficiently well, then it may be possible to define analytical distortion measures which accurately reflect the relative importance to the user of different components of the source output. For the human user and a pictorial information source, the mean-squared-error distortion measure has seen almost universal service in source coding research. Despite an ever increasing understanding of the human visual system, it is extremely difficult to develop objective expressions- corresponding 'to -subjective -distortion which ar-e analytically tractable as is the mean-squared-error (MSE) distortion measure. Recently, Stockham [4][26] suggested a nonanatomic model for the processing characteristics of the human visual system. The first order psychovisual model for human vision shown in Figure 4.1 is based on his studies in non-linear filtering [27]. The model considers the human eye as a perfect camera followed by a logarithmically sensitive surface and a linear filter. The presence of a logarithmic response in vision has been accepted for some time [28]. As well, the linear filtering characteristics of the eye have been thoroughly documented in the physiological literature [28]. Stockham's visual model can account, theoretically for the existence of several optical illusions [4][26]. Examples are the Mach Figure 4.1 Stockham's visual model phenomenon and the illusion of simultaneous contrast. The model has been verified empirically to some extent by experiments discussed in [4] and [26]. Stockham defined an objective measure of image quality by measuring the difference between a distorted image and its reference original, only after each had been transformed by the model. A mean-squared-error measure based on this definition is given by: 2 2 e = //[v(x,y)*(log I(x,y) - log R(x,y))] dxdy (4 where v(x,y) is the two-dimensional point spread function of the linear filter in the visual model, I(x,y) is the reference original, R(x,y) is the distorted reproduction being evaluated, and "*" denotes convolution. 'The'-utility '*of 'this-'"measure -depends "-upon"the degree to which the visual model approximates the early portions of the human visual system. If the approximation is sufficiently good, meaning that the model correctly emphasizes certain aspects of an image and deemphasizes certain others in a manner similar to the early portions of the human visual system, then distortions which are significant to the user will be weighted heavily while less important distortions will be treated with less weight. 4.2 A Theoretical Assessment of Transform Coding Using Stockham's Distortion Measure 4.2.1 Preli.minary Remarks Treating the logarithm of the brightness, called the density, 41 as a wide-sense stationary Gaussian-Markov (SGM) random field may have more validity than the earlier assumption that the brightness is an SGM random field. Many images have intensity histograms skewed toward the dark range of values whereas their density histograms are more symmetric [4] and nearer to being Gaussian in shape. Regarding the image density as the input to the transform coding system and the processed density as the output, a theoretical comparison was made of system performance using MSE and Stockham's modified MSE distortion measures. For MSE, performance of the system was determined using the Pearl measure described in Chapter 3. For modified MSE, denoted subjective mean-squared-error (SMSE) to indicate that it is believed to be highly correlated with subjective assessments of distortion, a more involved procedure was required, which is discussed below. 4.2.2 The Procedure Used to- Evaluate System Performance With Stockham's Measure Consider rewriting (4.1) for an ensemble of images in the following way: ESUBJ =/o/o[v(x'y)*e(x'y^2dxdy <4.2) where the overbar indicates ensemble average, e(x,y) = d(x,y) - d(x,y), and d(x,y) is a degraded version of d(x,y), the image density. That is, d(x,y) and d(x,y) are identical to log I(x,y) and log R(x,y), — respectively, in equation (4.1). Simplifying the equation for e still further yields: 42 ESUBJ = /OJD [e'(x>y)]2dxdy (4.3) where e'(x,y) is the resultant signal when the image error, e(x,y), is processed by the spatial filter v(x,y) of the visual model, By virtue 2 of Parseval's theorem, £gygj can also be expressed in the following way: 2 2 OO CO 2 eCTi,T = A I E E'(m,n) Z (4.4) SUBJ m=-°° n=-co til where E'(m,n) is the (m,n) Fourier coefficient in a Fourier series expansion of e'(x,y). Further, considering e'(x,y) to have period A in each dimension, Se,(u,v), the power spectral density of e'(x,y), is given by: S£(u,v) = |v(u,v)|2Se(u,v) = Z E E'(m,n) 2S(u- m/A)6(v- n/A) (4.5) m=_oo m=-oo where S (u,v) = E E |E(m,n)|26(u- m/A)6(v- n/A) (4.6) e m=-oo n=-«= ' 1 is the power spectral density of e(x,y). It follows that: _oo-L Se,(u,v)dudv = /_„/_„ |V(u,v)| Se(u,v)dudv 1=_M lV(m/A. n/A)|2|E(m,n)|2 = A2^2^ (4.7) 43 Equation (4,7) was used to compute the SMSE resulting from linear transformation and block quantization encoding for a Gauss-Markov field image model. The steps involved are: (1) Determine the power spectral density, Se(u,v), of the difference error, e(x,y). In the case of transform coding systems, theoretical calculation of Se(u,v) involves finding the power spectrum of each basis function used in the transformation and weighting the power spectrum by the mean-squared-error of the transform coefficient which weights the corresponding basis function in the expansion of d(x,y) and e(x,y). For the purpose of computing its power spectrum, each basis function was defined to be identical to zero everywhere in an image except in the subsection which it was used to represent. That is, each basis function was defined over the complete picture rather than only over a single subsection. The reason for doing this is that when someone views an image he does not view the subsections individually but collectively as an entire picture. Further details concerning the procedure used to compute Se(u,v) are given in the next. section. (2) Weight Se(u,v) with |v(u,v)| according to (4.5) to obtain the modified error power spectral density, S ,(u,v). Since V(u,v) is the transfer characteristic of the spatial filter in 2 Stockham's visual model, JV(U,V)| serves to modify the error spectral density, Se(u,v), in accordance with subjective considerations of the importance of distortion in different frequency ranges. (3) Integrate Se,(u,v) to obtain a value proportional to 2 *^CTTT> T (4-7). Notice, as a point of interest, that a similar integration 44 of Se(u,v) yields a value proportional to e"", the simple MSE employed in Chapter 3. This fact permits a check to be made on any software procedure used to compute S^(usv)• The numerical values of z obtained using the techniques of Chapter 3 and obtained by integrating Se(u,v) and scaling the result must agree. The above procedure for calculating estimates of the SMSE of transform coded images was implemented on the IBM 360/67 general-purpose digital computer. Consequently, the continuous operations used above were replaced with their discrete equivalents. In addition, the source images were assumed to be digitized pictures having 256 x 256 picture elements. Only one- and two-dimensional Fourier and Hadamard transforms were considered due to computational constraints. The Karhunen-Loeve transform was ignored because it is of little practical interest. Two-dimensional blocks, upon which the two-dimensional transform coders operate were chosen to be 8 x 8 arrays of adjacent picture elements. One-dimensional data blocks of 64 adjacent picture elements were considered. The horizontal and vertical correlation coefficients, p and p , were set equal to 0.85 as in Chapter 3. x y The most difficult step in implementing the procedure for 2 finding e mT is the calculation of S (u,v). The subsequent weighting and integration operations are straight forward. The transfer function used for the spatial filter, V(u,v), was determined experimentally by Stockham [4] to be: V(R) = 742 / (661 + R2) - 2.463 / (2.459 + R2) (4.8) 45 Figure 4.2 Plot of equation (4.8) 46 where R is the spatial frequency in cycles per degree. We assumed that the eye has a circularly symmetric frequency response so that Stockham's one-dimensional frequency response function (4.8) can represent a radial cross-section of the two-dimensional frequency 2 2 2 response function. That is, we assumed that R = u + v . A plot of equation (4.8) is shown in Figure 4.2. 4.2.3 The Computation of Se(u,v) Consider a two-dimensional transform coding system which codes continuous images of size Ax A by operating upon subsections of size B x B units of length. Assume that A/B = M so that the difference error, e(x,y), is given by: e (x>y) r=o s=o rs (4.9) where e(x,y) ; (x,y) £ r.s ers^X'^ 1o ; elsewhere (4.10) and T = {(x,y) : the point (x,y) is in the (r,s) subsection) rs As indicated in Figure 4.3, the subsection index r varies from 0 in the leftmost subsections to M-l in the rightmost while s varies from 0 in the uppermost subsections to M-l in the lowest. We can express e(x,y) as: e(x,y) = S=_oo E__ot E(m,n) exp[2ir/=T(mx - ny)/ A] (4.11) Figure 4.3 The partitioning of an image into M subsections 48 where E(m,n) =^2 e(x,y) exp [-2u/T(mx + ny)/A]dxdy (4.12) Equation (4.11) implies that e(x,y) is periodic with period A in each direction. Hence Se(u,v) is given by (4.6). Each of the functions e (x.y) can also be considered rs J periodic with period A in each direction so that: CO CO , e (x,y) = E- E n (m,n) exp[2ir/-T(mx + ny)/A] (4.13) rs m=-°° n=-°° rs r,s = 0, 1, . .. , M-l where: 1 r A r A / £2 (m>n) = T2 Jn/n er-Ax>y) exp[-2iT/-l(mx + ny)/A]dxdy (4.14) rsv A J0JQ rs' r,s= 0, 1, ...» M-l Alternatively, since the transform coder operates upon each function e^s(x,y) in turn, expanding it in a suitable series, we could express e (x,y) as: rs CO CO e rs (x,y) = |=_„ Ars(k,£)<f>rs(k,£,x,y) (4.15) where r,s = 0, 1, ..., M-l Ars(k,£) = f2 ./ / ers(x,y)cf)*s(k,£,x,y)dxdy (4.16) r rs k,£ = .. ., 0, .. ., o= 49 The functions {<}> (k,£ ,x,y) ,k,£ =-eo, ..., 0, ...,»} are the basis rs functions of the transform used to process e (x,y). Each function . rs (J>rg(k,£ ,x,y) is defined to be identical to zero outside the subsection T . For example, the Fourier basis functions for the first subsection rs r > in the upper left-hand corner of the image are given by: ) - exp[2^,/^T(kx + ly)/B] ;(x,y)£ Tnn <j> (k,£,x,y) = < B 00 uu / 0 ; elsewhere (4.17) k, = -oo, ..., o, .. ., 00 Recalling that the transform coders considered in this thesis use the same basis functions to represent each subsection, it can be seen that: • 4rs(k,£,xyy) -= 4>0g(k,£,x-rE,~y-sB) (4.18) r,s =0, 1, ...» M-l That is, the basis functions for any subsection are obtained by appropriate translations of the basis functions in the first subsection. Having introduced necessary notation, we now proceed to 2 find an expression for |E(m,n)| which can be substituted into (4.6). The expression we seek must be computable in terms of quantities to which we have ready access. Now, from equation (4.12) it can be shown that: (4.19) E(m,n)|2= |^2 e(x,y) exp [-2Tr/=T(mx + ny)/A]dxdy |' 50 Substitution of (4.9) into (4.19) yields: • .(4.20) |E(m,n)[2= iflofloi2 IQ!Q ers^x,y^ ' exPi-ZvS-liux + ny)/A]dxdy| 2 Substitution of (4.14) into (4.20) results in: (4.21) Next, substitution of equation (4.15) into equation (4.14) defining ^rg(m,n) yields: CO oo-SI (m,n) =2 E A (k,£)$ (k,£,m,n) „ ... rs R=-» £=-co rs rs (4.22) m,n = 0, oo; r, s=0, 1, M-l, where -*rs(k,£,m,n) = ^2 /Q/Q ^(k, £,x,y)exp[-27r/-T • (mx + ny)/A]dxdy - (4.23) k,£,m,n = 0, oo; r>s = 0, 1, M-l But substitution of (4.18) into (4.23) shows that: <3> (k,£,m,n) = $ (k, £,m,n)exp[-2n/T(mr + ns)/M] (4.24) rs 00 th where the indices k and £ denote the (k,£) basis function. Substitution of (4.24) into (4.23) and of the result into (4.21) yields: 51 |E(m,n)|2 = $f 1 $f 1 f . ? {A (i,j)A* (k,£) p,q=0 r)s=Oi)]=-«s k,H=-«> vq 'J/ rsN * (i,j,m,n)$* (k-,&,m,n) • (4.25) exp[-27r/^T(pm + qn) /M] exp [ 2n/^T(rm + sn)/M] } We assume that the cross-terms in (4.25) are negligible since the transforms approximately decorrelate the data samples. That is, A (ijj)A* (k,£) is expected to be small for p^r and/or qrs. If pq rs the cross-terms are considered zero, (4.25) simplifies to: lE(m»n)|2'd;UoX*=-» JAr8<k,J»|2|*fl0<k,^m,n>|^ (4.26) m,n = -oo, .. ., 0, . .., 2 th Now, |A (k,Jl)| is simply the mean-squared-error of the (k,£) transform coefficient in the (r^)1"*1 subsection. However, it has been assumed throughout that each subsection is a sample function 2 from the same wide-sense stationary process. Hence, |A (k,£)| TL S is independent of r and s so that (4.26) can be rewritten as: |E(m,n)|2 = M2 |>£=_ro Dfc0Q(k,£,m,n)|2 (4.27) th where D^£ is the mean-squared-error of the (k,£) transform coefficient in each subsection. can be obtained using equations (3.11) and (3.12) if a one-to-one correspondence is made between f 52 D^ and D^ in equation (3.10) [See equation (B.3) also]. 2* Inspection of (4.27) reveals that |E(m,n)| and hence Se(u,v) can be found by computing the power density spectrum of each basis function and then weighting the power spectrum by the mean-squared value of the error in the corresponding transform coefficient. As stated in section 4.2.2, these operations were performed on the IBM 360/67 digital computer. This required replacing the continuous functions and infinite series used in the discussion above by sampled functions and finite series, respectively. The FORTRAN IV programs 2 written to compute S (u,v) and enrro, are listed in Appendix C. e burs j Determining Se(u,v) for transform coded images processed in one-dimensional subsections is slightly more involved than finding S£(u,v) for two-dimensionally coded images. This is because the basis functions are one-dimensional and hence their power density spectra are also one-dimensional. Se(u,v) was obtained in this case by observing that the one-dimensional transform coding of an image can be viewed as a one-dimensional filtering operation. Denote the . transfer function of the equivalent filter by G('u) . Since the auto correlation function for the image is assumed to be separable, the power spectral density, denoted S^(u,v), must also be separable. That is: S (u,v) = S„(u)S (v) (4.28) u tl V where H and V denote horizontal and vertical, respectively. Since G(u) is a function only of u, S (v) will be unaffected by the filtering operation. Only SH(u) will be affected. Thus S(=i(u,v) will have the form: 53 S (u,v) = Pe(u)Sv(v) (4.29) where P (u) is the one-dimensional error spectrum. S (v) was computed e v by Fourier transforming: R^Ay) = e~BlAyl (4.30) where g corresponds to p =0.85. P (u) was calculated using the y e same procedure as that used to determine S (u,v) for two-dimensionally encoded images. A sample of the programs written to perform these tasks is given in Appendix C. 4.2.4 Numerical Results 2 Here, results obtained employing enTTT) as a measure of trans-&UX5 J "•'form coding system*performance "are "given--and compared 'to results given in Chapter 3. Tables I and II list the MSE and SMSE performance estimates as a function of the bit rate computed for the one-dimensional Fourier and Hadamard transform coding systems, respectively. Inspection of these tables reveals the following points: 2 2 (1) For non-zero bit rates, e„T_ _ > e regardless of the 2 transform used. In fact, gSUBJ is> on t'ae average> H Per cent larger ~~2 ~2 than e in the Fourier case and 9 per cent larger than £ in the Hadamard case fcr the set of non-zero rates considered. Assuming that Stockham's measure is indicative of subjective judgements of image 2 quality and that the technique used to evaluate Eg^gj is valid implies that the MSE distortion measure is a slightly optimistic estimate of actual picture quality. Rate in bits 2 e 2 eSUBJ 0.0 1.00 1.00 0.020 0.7985 0.8195 0.0584 0.5995 0.6290 . 0.1394 0.4024 0.4330 0.3811 0.2013 0.2259 0.7945 0.1005 0.1130 0.9548 0.0803 0.0901 1.162 0.0602 0.0675 1.455 0.0400 0.0450 1.955 0.0201 0.0226 2.455 0.0100 0.0113 2.616 0.0080 0.00903 2.823 0.00603 0.00676 3.116 0.00401 0.00450 3.616 0.0020 0.00225 4.116 0.00101 0.00113 Table I Performance estimates for the one-dimensional Fourier transform coder (Block length = 64) 55 Rate in bits 2 e 2 ESUBJ 0.0 1.00 1.00 0.0213 0.799 0.8194 0.0658 0.5998 0.6303 0.1673 0.4018 0.4328 0.4772 -. 0.2012 0.2218 0.9511 0.1004 0.1104 1.112 0.0802 0.08824 1.320 0.0601 0.06612 1.612 0.0400 0,04403 2.112 0.0201 - 0.02215 2.612 0.0101 0.01106 2.773 0.00804 0.00884 2.980 0.00602 0.00662 3.273 0.00401 0.00441 3.773 0.0020 0.00220 4.273 0.00101 0.00111 Table II Performance estimates for the one-dimensional Hadamard transform coder (Block length = 64) (2) For the assumed parameter values - that is, a block length of 64 and correlation coefficients equal to 0.85 - the relative performance of the one-dimensional F and H transform coders 2 as predicted by the rate versus e„TTT> T criterion is roughly the same as the relative performance estimate obtained using rate versus MSE. Prior to obtaining numerical results it had been thought Stockham's distortion measure might stipulate greater differences between the quality of the Fourier and Hadamard processed images for a given bit rate or even invert the ordering in quality as compared to MSE performance estimates. Tables III and IV list performance estimates for the two-dimensional F and H transform coding systems, respectively. Observe the following points with respect to Tables III and IV: 2 2 2 (1) As in the one-dimensional case, e_TT„ T > e • enTTT> T oUJiJ DII.DJ 2 is, on the average, 9 per cent and 10 per cent larger than e for the Fourier and Hadamard systems, respectively, for non-zero rates. (2) The two-dimensional Hadamard transform coder is 2 estimated to require fewer bits to attain the same e„Trn T value as 2 the Fourier transform coder, for any value of e . This result bUrSJ is the same as that predicted by the mean-squared-error distortion measure. Finally, Tables V and VI list performance estimates for one-dimensional Fourier and Hadamard systems which operate upon image data blocks of length 256. Since the images were assumed to consist of 256 x 256 arrays of picture elements, a block of length 256 corresponds to a complete scan line of picture samples. 57 Rate in 2 2 bits e eSUBJ 0.0 1.00 1.00 0.0416 0.4986 0.5290 0.5062 0.0991 0.1092 0.8796 0.0497 0.0551 1.990 0.0098 0.0109 2.490 0.0049 0.00548 Table III Performance estimates for the two-dimensional Fourier transform coder (Block length = 64) — - 11 1 1 1 . — Rate in 2 2 bits e ESUBJ 0.0 1.00 1.00 0.0379 0.4991 0.5295 0.4840 0.0995 0.1098 0.8402 0.050 '0.0556 1.946 0.0099 0.0111 2.446 0.0050 0.00555 Table IV Performance estimates for the two-dimensional Hadamard transform coder (Block length = 64) 58 Complete scan lines were considered in order to determine 2 if e„TTT> T is dominated by "edge effects" at shorter block lengths of SUBJ 64 elements which might obscure differences in subjective performance for the F and H transforms. Large differences in picture elements at opposite edges of a subsection can cause problems for block lengths less than the image width. Considering both the Fourier and Hadamard discrete transforms as series expansions of the picture samples in a subsection it can be seen that edge-to-edge differences, if large, will greatly influence the spectrum of the subsection. This is because the subsection is treated as a period of a periodic function when expanded in a Fourier or Walsh series. Thus large differences along opposite edges of the subsection appear as sharp discontinuities in the hypothetical periodic function. The transform coefficients attempt to represent the discontinuities at the expense of features within the subsection. Moreover, errors in coefficients due to quantization appear as contours along subsection boundaries in the processed picture, which, if pronounced, are very degrading to picture quality [2]. 2 Examination of Tables V and VI discloses that E is greater SUBJ ~2 than e by approximately 11 per cent and 9 per cent, respectively, for the Fourier and Hadamard systems. The fact that these percentage differences are identical to those observed in Tables I and II suggests 2 that eCTTT!T is not dominated by edge effects at a block length of 64. The approximation made in obtaining (4.26) is validated by comparing the MSE values in Tables I to VI with corresponding MSE values for the same bit rates computed by the techniques of Chapter 3. 59 Rate in 2 2 bits e ^UBJ 0.0 1.00 1.00 0.0187 0.7996 0.8145 0.0544 0.5991 0.6240 0.1302 0.3987 0.4291 0.3577 0.1987 0.2243 0.7546 0.09946 0.1134 0.9134 0.07986 0.0906 1.121 0.0600 0.0679 1.413 0.039.96 0.0452 1.913 0.01993 0.0226 2.413 0.009941 0.0113 - 2.574 0.00795 0.00902 2.782 0.00597 0.00678 3.074 - 0.00400 0.00453 3.574 0.001997 '0.00226 4.074 0.000995 0.00113 Table V Performance Estimates for the one-dimensional Fourier transform coder (Block length = 256) 60 Rate in 2 2 bits G ESUBJ 0.0 1.00 - 1.00 0.0212 0.7996 0.8177 0.0656 0.5992 0.6270 0.1670 0.3988 0.4293 0.4768 0.1989 0.2195 0.9506 0.09945 0.1099 1.112 ' 0.07986 0.0882 1.319 0.0600 0.0661 1.611 0.03996 0.04402 2.111 0.01993 0.0220 2.611. 0.009941 • 0.011 . 2.772 0.00795 0.0088 2.980 0.00597 0.0066 3.272 0.0040 0.0044 3.772 0.001997 0.0022 4.272 0.000995 .0.0011 Table VI Performance estimates for the one-dimensional Hadamard transform coder (Block length = 256) 61 4.3 Conclusions In this chapter, we have theoretically assessed the performance of transform coding systems based on a distortion measure proposed by Stockham [4][26]. In particular, the utility of Stockham's measure, 2 e„TT_T, as an index of the quality of transform coded images has been bUD J 2 examined. Theoretically estimated values of £,.„_,,.. called the subjective J SUBJ5 J mean-squared-error, were compared to values of the mean-squared-error for the same bit rate. 2 Although T was approximately 10 per cent larger than the oUB J MSE in each of the cases considered, the performance rank ordering of 2 the different systems using egyBJ was identical to that obtained using MSE distortion criterion. '"Numerical results 'for a'"long block length suggested that the SMSE was not dominated at shorter block lengths by edge effects which obscure differences between F and H transform coders. The study reported in this chapter was conducted to ascertain differences in theoretical estimates of transform coding system performance 2 2 using eQrRT as opposed to e . Of course, the ultimate test of the utility of the SMSE must involve empirical investigations which include a program of subjective testing of processed images. 62 V. CONCLUSION 5.1 Summary of Thesis Two topics of interest to the designers of picture coding systems have been investigated. The first subject of inquiry was the performance of a particular class of picture coding systems known as transform coders. The effect of certain relevant design parameters upon the behavior of these systems, which employ linear transformations and block quantization, was determined. The performance of the transform coders was estimated using a mean-squared-error measure of image distortion and certain applicable rate-versus-distortion expressions due to Pearl et. al. [3]. The results of the study are reported in section 3.5 in the form of curves showing the effects upon system performance of varying linear transform type, coding dimension ality, picture data correlation, and code block length. The conclusions reached are reported in section 3.6. The second topic considered centred on a new measure of image distortion proposed by Stockham [4] as an alternative to the mean-squared-error distortion index. The new measure is of interest because it incorpor ates knowledge of the human visual system and is expected to agree more closely with subjective judgements of picture quality than the mean-squared-error does. In order to determine what, if anything, could be learned about this second distortion measure from theoretical considerations alone, a procedure was developed to estimate its value for transform coded images. The values obtained were compared to the mean-squared-error distortion values obtained for the same bit rates. The results are tabulated in section 4.2.4 and the conclusions drawn are given in section 4.3. 63 5.2 Future Research On the basis of the work reported here, the following additional studi'es seem necessary: (1) An experimental investigation to test the validity of the results given in Chapter 3. The experiments would be similar to those con ducted by Habibi and Wintz [9] but far more extensive. If the subjective mean-squared-error, as well as the mean-squared-error, were evaluated for each transform coded image, then a comparison could be made of the two dis tortion measures. (2) A program to subjectively assess the quality of the processed images obtained in the experiments suggested above. The results of this project should indicate the relationship between Stockham's measure and subjective judgements of distortion. 64 APPENDIX A Definitions and Software Implementation of'the Discrete-Karhunen-Loeve, Fourier, and Hadamard Transformations A.1 General Definition of a Linear Transform The general forms of the one-, two-, and three-dimensional linear transformations used in picture coding are given below in equa tions (A.l), (A.2), and (A.3), respectively. Equations (A.4) to (A.6) give the corresponding inverse transformations for each dimensionality. Throughout this appendix, an asterisk denotes complex conjugation. L-l * yU) = I x(i)a U,i), £=0,1,... ,L-1 (A.l) 1=0 L-l M-l a yU.m) = I I x(i,j)a (£,m,i,j) (A.2) £=0 m=0 £=0,1,...,L-l;m=0,l,...,M-1 L-l M-l N-1 ^ y(£,m,n) =11 I x(i,j,k)a (£,m,n,i,j,k) (A.3) i=0 j=0 k=0 £=0,1,...,L-l;m=0,l,...,M-l;n=0,l,...,N-1 L-l x(i) = I y(£)a(i,£), i=0,l,...,L-1 (A.4) £=0 L-l M-l x(i,j) » • I I y(£,m)a(i,j,£,m) (A.5) £=0 m=0 i=0,l,...,L-l;j=0,l,...,M-l 65 L-i M-l N-1 x(i,j,k) = -l I I y(.^,m5n)a(i,.j,k,£',m,n) 1=0 m=0 n=0 (A.6) 1=0,1, .. .,L-l;j~=0,l, .. .,M-l;k=0,l, .. . ,N-1 A.2 The -Discrete Karhunen-Loeve Transform As observed in Chapter 2, the transform matrix, A, which effects the one-dimensional KL transformation has as its columns the vector solu tions, co > of the matrix equation: (A. 7) where § to - = Ato x $ = E{xx } x (A.8) '(A'r7) can-also-"be "written -in -series ^form-as fol'lows : L-l A,to, (i) = I *(i,i')w, d') k i'=0 (A.9) where c))(i,i') is the (i,i')th- element of $ . The convariance matrix of the transform vector, y, is given by * . = 10 A„ ..01 (A.10) y where A1, A0, ...,A are the eigenvalues of (A.16). - 1 L L To obtain the operator arrays, and hence the basis arrays, which accomplish the too- and three-dimensonal KL transformations, equations similar to (A.7), but of higher dimensionality:must be solved. However, o . '. 0 l 0 A_ . • z . 0 t • 6 0 66 since the auto.covariance function for image data is assumed to be separable (2.1), the solutions of higher order equations.are separable into products of one-dimensional solutions [9]. For instance, the matrices i^>^m' £=0>l> . ..,L-l;m=0,l,...M-l}, which comprise the two-dimensional KL transform operator, are the solutions of the tensor equation given by: L-l M-l where R(|i-i'|,|j-j'|) = R(i,i',j,j') = E{x(i,j)x(i'>j1)} is a fourth-order point tensor. Since R(|i-i'|,|j-j'|) = RH(|i-i,|)Rv(|j-j'|) (A.12) in accordance with (2.1), therefore %n(i>^ " V1)zk<J> ' (A-13) " where the vectors v and z are solutions of the following two matrix equations, L-l £hvh(i) =. I RH(|i-i'|)vh(i') (A.14) i' =0 M-l = .J0 V|j-j,|)zk(j,) (A,15) In addition, A„ = U, h=0,l,...,L-l; k=0,l,...,M-1 (A.16) x.m h k where the values of h and k are chosen to ensure that 67 All *• A12 38 5 A1L 5 X21 * ••• 5 *2L 5 * • ALL (A.17) This choice of h and k permits the basis matrices to be ordered accord 'im ing to the variances X„_. of the transform coefficients y(£,m) which weight them. A.3 The Discrete Fourier Transformation The one-, two-, and three-dimensional discrete Fourier transform ations are defined by equations (A.l) to (A,3) when the following iden tifications are made in (A.l), (A.2), and (A.3), respectively. a*(£,i) = — exp[-2TT>£T (£i)/L] (A.18) JL a*(£,m,i,j) = — exp[-2Tr/^T {it + mj)/LM] (A.19) a*(£,m,n,i,j,k) = —— exp [-2TT/T(U + mj + nk)/LMN] (A.20) VLMN The discrete Fourier transformation has many properties which are analogous to those of the continuous Fourier transform [29]. The most important of these is that the discrete Fourier transform coefficients relate to the frequency content of the data being transformed, thus making the spectral analysis of digitized signals possible. A.4 The Discrete Walsh or Hadamard Transformation Much work has been directed recently toward the applications of a set of orthogonal functions called the Walsh functions in the areas of signal processing and information transmission [30] [31] . Discrete Walsh functions and the discrete Walsh transform, in particular, have received much attention because of the continually expanding use of digital devices 68 to process information. The discrete Walsh transform or Hadamard transform, as it is commonly known, has many similarities to the discrete Fourier transform [32]. However, there are differences as well, which make Hadamard pro- • cessing an appealing alternative to Fourier processing [1][16].. The one-, two-, and three-dimensional Hadamard transforms are defined by substituting the expressions given below into equations (A.l), (A.2), and (A.3). * i q-iU.i) a (£,i) = a(£,i) = — (-1) X (A.21) JL where L-l q^.i) «• I Sh(*Hh (A-22) h=0 and L = log2L and i^ is the binary representation of i. That is: (i) decimal = (i£-l \~2 ' '' ' *1 Vbinary (A'23) Furthermore, gl(£) = + ££_2 g2 (£) = ££_2 + ££_3 S£-1W " h + V where each addition is modulo 2. (A.24) 69 a a,m,i,j) - — (-1) (A.25) LM where L-l M-l . q(l,m,±,3) = • I ShWK+ I gu.(m)jh, • (A.26) 1 h=o h h h'=o n n = q1(£,i) + q1(m,j) and M = lc^M. where * q (Jl,m,n,i,j,k) a (£,m,n,i,j,k) = —— (-1) (A.27) VLMN L-l M-l q (£,m,n,i,j,k) = - \ g,(£)i, + \ g.,(m)j,, (A.28) 6 h=o k h h'=o h n N-1 h"=o n n q1(£,i) + q1(m,j) + q^n.k) and N = log^N. As the above expressions (A.21), (A.25), and (A.27) indicate, the discrete Walsh functions only assume the values +1 and -1. That is, the Hadamard basis arrays are a set of sampled rectangular waves as com pared to the sampled sine-cosine waves associated with the discrete Fourier transform. A generalized frequency interpretation is given to the rec tangular Walsh functions by analogy with the sinusoidal Fourier functions. The term "sequency" has been chosen to denote the average number of zero-crossings per second by a Walsh function. 70 A.5 Variances of Transform Coefficients In order to evaluate some of the expressions presented in the text of the thesis, it is required that values be available for the var iances of the transform coefficients defined in (A.l) to (A.3). Because all of the transforms used in this thesis,- regardless of dimensionality, are implementable as a sequence of one-dimensional transforms, it can be shown that the variances of the transform coefficients which result from a multi-dimensional transformation are equal to the product of variances obtained for appropriately chosen one-dimensional transform coefficients. This fact will now be proven for the three-dimensional case. Consider, first, the variances of the one-dimensional tranform coefficients defined in (A.l). o2 = E{yU)y*U)} (A .29) Substituting (A.l) into (A.29) yields o] = I I E{x(i)x(i')} a*(£,i)a(£,i') (A.30) * i=0 i'=0 L-l L-l = I i, Vli-i' I) a a.Daa.i') i=0 i'=0 where is the horizontal spatial component of the image autocorrelation ' function. Next, consider the variances of the three-dimensional transform coefficients defined in (A.3). a2 = E{yU,m,n)y*U,m,n)} (A.31) Substitution of (A.3) into (A.31) yields L-l M-l N-1 a 2 I I I [E{x(i,j,k)x(i,,j,,k,:)} • lmn i,i'=0 j,j'=0 k,k*=0 (A.32) a"(«,m,n,i,j,k)a(£,m,n,i',j',k')] 71 But inspection of the transform definitions given in the previous sections reveals that .* A A -k a (£,m,n,i,j,k) = c (£,i)d (m,j)e (n,k) (A.33) for each of the transforms considered. Each of the factors on the right-hand side of (A.33) is a one-dimensional transform operator [See (A.l)]. Recall that: E{x(i,j,k)x(i',j,,k')} = R(|i-i'|,|j-j'|,|k-k*|) = Rj^di-i'lO^dj-j'^.P^dk-k'l) (A.34) where R^, R^, and R^, are the horizontal and vertical spatial components and the time component, respectively, of the image autocorrelation function. Equations (A.33) and (A.34), when substituted into equation (A.32), yield: 9 L_1 * %n = [. ,Jn yM'l)<- a.Dca.i')] • 1, x —0 M-l A [ I R__ (|j-j'|)d (m,j)d(m,j')] • j,j'=0 N-1 [ -l IL,(|k-k« |)e"(n,k)e(n,k')] (A.35) k,k'=0 Each term in square brackets above.has the same form as equation (A.30) proving that the three-dimensional transform variances are equal to the product of three one-dimensional transform variances. Finally, it should be pointed out (A.30) can be written in matrix form as follows. o1 = $ •£ y = A$ A 1 ll x , £=0,1,...,L-1 (A.36) ll 72 Although we know that the diagonal elements, <S> | of $^ are, by definition, equal to the variances of the transform coefficients, (A.36) states the additional fact that the matrix, § , can be obtained by appropriate trans form operations upon the rows and columns of the data covariance matrix, $ . x Equation (A.36) was used to compute the Fourier and Hadamard transform coefficient variances. The Karhunen-Loeve transform variances, however, were not computed in this manner. The reason for this is that before the KL transform can be performed, (A.7) must be solved to obtain the eigenvectors. Siiyce solution of (A.7) yields the eigenvalues as well as the eigenvectors and since the eigenvalues are the KL transform coefficient variances, there was no need to use (A.36). -1 * Since, by definition, A = A for the Fourier transform while .A ^ = A for .the Hadamard transform, (A.36) becomes * | = A$ A*| , £=0,1,...,L-1 (A.37) y. ££ x ££ ' for the Fourier case, and *J = A* A| , £=0,1, . . .,L-1 (A.38)-for the Hadamard case. (A.38) is simply the two-dimensional Hadamard transform of the covariance matrix $ . (A.37) is implemented by Fourier transforming each column of §x and then Fourier transforming each row of the resultant matrix to obtain $ . y A.6 Software Implementations A.6.1 The Discrete. KL Transform The linear transformations discussed above were used in the 73 calculation of (A.36). However, the Karhunen-Loeve transform coefficient variances were found by determining the eigenvalues of $ , the covariance matrix of the data vector, x. In order to solve (A.7), advantage was taken of the fact that $ is, by definition, a symmetric matrix. A FORTRAN-callable subroutine, SYMAL[33], which calculates the eigenvalues of a symmetric matrix was used to find the variances of one-dimensional KL transform coefficients. The subprogram, KLVAR1, written to control the process of obtaining the KL variances, is listed in section A.6.4. A.6.2 The Discrete Fourier Transform The subprogram, FVAEl, was written to calculate the variances of one-dimensional Fourier transform coefficients according to equation (A.37). This subprogram, listed in section A.6.5, calls the library subroutine F0UR2 [34], which is a fast Fourier transform routine [18] [35] [36]. ~A/6 .3 "The H a d am a r'd" "T r an s"f o rm The subroutine, HVAR1, was written to compute the variances of one-dimensional Hadamard transform coefficients according to equation (A.38). This subprogram, listed in section A.6.6, calls the subroutine HAD2, which is a too-dimensional Hadamard transform routine. HAD2, which is also listed in section A.6.6, is a FORTRAN IV implementation of the fast Hadamard transform algorithm described in [16]. A.6.4 Listing of the Subprogram KLVAR1 SUBROUTINE KLVARWS ,EX t N, M, A ) C THIS S/R COMPUTES THF VARIANCES OF THE L-DIM. KL TRANSFORM C COEFFICIENTS US TNG THE LIBRARY S/R ' S YM AL 1 . THE ARGUMENTS C OF THE S/R KLVAR1 ARE: S,THE VECTOR IN WHICH THE VARIANCES C ARE RETURNED; EX, THF CORRELATION PARAMETER USED TO GFNER-C ATE THE ELEMENTS OF THE SYMMETRIC COVARIANCE MATRIX OF THE C DATA VECTOR; N,THE LENGTH OF THE DATA VECTOR AND HENCE TEE C ORDER OF THE COVARIANCE MATRIX; M=N* ( N+l) fl IS THE LENGTH C OF THE ARRAY A; A,A L-DIM. ARRAY CONTAINING THE ELEMENTS C OF THE SYMMETRIC COVARIANCE MATRIX ARRANGED AS REQUIRED PY 74 C THE S/R SYMAL. REAL S(N), A(M) C FORM THF ARRAY A USING THE EXPONENTIAL AUTOCORRELATION FUNCTION. DO 1 1 = 1, N 0 0 1 J = 1. 11 K={ I-L) *I/2 +J A(K) = EXP(-EX* IABS ( I-J) ) 1 CONTINUE C NOW FIND THE EIGENVALUES OF A. CALL SYMAL* A ,N, S , IE RROR»0 ) 1F{ TERROR .EO.O ) GO TO 3 WRITE(6,2)I ERROR 2 F 0 R M A T ( • «,' SOMETHING WRONG . I ERRO R •=• , I 2 ) STOP 3 RETURN END A.6.5 Listing of the Subprogram FVARl SUBROUTINE FV AR I ( S, R f A t GAMMA , I .) C THIS IS A S/R TO COMPUTE THE VARIANCES OF THE 1-DIM. FOURIER C TRANSFORM COEFFICIENTS OBTAINED FROM THE XSFORMAT TON OF A DATA C VECTOR OF LENGTH I. THE DATA IS ASSUMED TO HAVE AN EXPONENTIAL .C .AUTOCORRELATION EUN.CT ION WITH CORRELATION PARMET.ER .., GAMMA . R C IS THE COVARIANCE MATRIX OF THE DATA. A IS A WORKING ARRAY ANO C S IS THE ARRAY IN WHICH THE VARIANCES ARE RETURNED. THIS S/R C CALLS THE LIBRARY S/R 'FOUR2« , A FAST FOURIER TRANSFORM PACK-C AGE. • REAL Si T ) COMPLEX R(I,I),A{I) INTEGER DIM(1) - D IM ( L ) = I C GENERATE THE COVARIANCE MATRIX ,R. DO 1 J=l,I DO 1 K= 1, I 1 R{J,K)=EXP(-GAMMA*IABS{J-K)) • C TRANSFORM EACH COLUMN OF R. DG 4 K=l,I DO 2 J=l,l 2 A(J) = R(J,K) CALL FOUR2(A,DIM,1,-1,1 ) DO 3 J= 1 , I 3 R(J,K)=A{J) 4 CONTINUE C NOW INVERSE XSFORM EACH ROW OF THF MTX. OF XSFORMEC COLUMNS. IP= 1/2 + 1 t 2=IP—1 DO 7 J = 1, I P DO 'S K-1, I 5 A(K)=R{J,K) CALL FOUR 21 A,DIM,1, 1,1) 75 00 6 K=l, F 6 R ( J , K) = A { K ) 7 CONTINUE C THE VARIANCES DESIRED ARE THE DIAGONAL ELEMENTS OF THE C TRANSFORMED ARRAY ,R. DO 8 J=l, IP 8 S(J)=REAL(P(JtJ)) DO 9 J=2,I2 K=I-J+2 9 S(K>=S(J) RETURN END A. 6.6 Listing of the Subprograms HVARl and HAD2 SUBROUTINE HVAR1{S,R,GAMMA*H1,H2,K) C THIS IS A S/R TO CALCULATE THE VARIANCES OF THE 1-DIM. C HADAMARD TRANSFORM COEFFICIENTS OBTAINED BY TRANSFORMING C A DATA VECTOR. THE DATA IS ASSUMED TO HAVE AN EXPONENTIAL C AUTOCORRELATION FUNCTION WITH CORRELATION PARAMETER , GAMMA. C THE DATA VECTOR HAS LENGTH K. AS HAVE THE ARRAY ,S, IN WHICH C THE VARIANCES ARE RETURNED AND HI AND H2 ,2 WORKING ARRAYS. C R IS THE COVARIANCE MATRIX FOR THE DATA VECTOR. THIS S/P C CALLS THE S/R 'HA02' WHICH CONTAINS A FAST HADAMARD TRANSFORM -C PACKAGE, REAL SIK) i R ( K , K) » HI (K.) , H2 I K } C GENERATE THE COVARIANCE ARRAY ,R. DG 1 1 = 1» K DO 1 J=1,K 1 R ( 11 J )= EXP (-GAMM A*( TABS ( I-J ) ) C PERFORM A 2-DIM. HADAMARD TRANSFORM OF R. CALL HAD 2 { R , K ,H1 »H2 ) C THE VARIANCES APE THE DIAGONAL ELEMENT'S OF THE TRANSFORMED R. DO 2 1=1 , K 2 S(I)=RUtI) RETURN E ND SUBROUTINE HA02(D,N,H1,H2) C THIS IS A S/R TO COMPUTE THE 2-DIM. DISCRETE WALSH C TRANSFORM OF DATA STORED IN THE ARRAY 'D*. THE C TRANSFORMED DATA IS RETURNED IN THE ARRAY D. 0 MUST ' C BE SQUARE WITH DIMENSION ANY POWER OF 2 FROM 2 UP TO. C 256. REAL D(N,N),H1(N) ,H2(N) COMMON/ HA C.T/T7 ( L 28 ) , T 6 ( 64 ) , T 5 I 3 2 ) , T4( I. 6 ) , T3( 8 ) , T 2 ( 4 ) , T 1 ( 2 ) C PERFORM A 1-DIM. HAD» TRANSFORM CN EACH ROW OF D. DO 20 1=1,N 76 DC 5 J=l,N 5 H1(J)=D(I,JI CALL HADK H1,N,H2» £125) C THE TRANSFORM IS RETURNED IN THE H2 VECTOR. DO 10 J = 1 , N 10 D{ I, J )=H2{ J ) 2 0 . CCNTTNUF C NEXT, TRANSFORM EACH COLUMN OF THE RESULTANT MTX. DO 30 J=1,N CALL HA 01 (DU , J) , N,H2f tl.2 5) DO 25 1 = 1, N 25 0(1 , J) = H2( I )/N 30 CCNTINUE R ETURM 12 5 WRTTE(6,127) 127 F 0 R M A T( 5 'ERROR IN HADl'J STOP END SL6R0UTINE HAD1 (D,N,H,*) C THIS S/R COMPUTES THE 1-DIM. TRANSFORM OF THE DATA IN C THE ARRAY D OF LENGTH N. THE TRANSFORM IS RETURNED C IN THE ARRAY H,ALSO OF LENGTH N. THIS S/R IS AN C I M PL E ivE NT AT I ON OF THE FAST HADAMARD TRANSFORM AS C PUBLISHED BY PRATTTANDREWS AND KANE. REAL D ( M ) , B (N ) COMMON/HA DT/T7 (128 ) , T6 ( 64 ) , T 5 ( 32 ) f T4( 16 ) , T 3 (-8 ) » T2 ( 4) , T H 2 ) •IF <-t-M«.t?Q, •?:•) »G-P •'TO-'-^O Hll )= . SEQS(D,N ) H(2 > = SE01(Tl ) •IF (N.EQ.4)G0 TO 95 . H (3 )= SEQ2{ T2 ) H(4)= SEQKTl) IF (N .EC 8) GO TO 100 H(5) = SEQ3(T3) H(6) = SEQKTl) H ( 7 ) = S EQ 2 { T 2 ) H (3) = SEC11T1) IF (N .EQ. 161 GO TO 50 • H(9)= SEQ4CT4) 1 . H (10)= SEQl (Tl ) H ( 11)= SE021T2) H (12 )= SECT(T 1 ) H(13)= SEQ3(T3) H( 14)= SEQKTl) HU5) = SFQ2(T2) H(16)= SEG1(Tl) IFCN.EQ. 16.) RETURN IF (N.EQ.3 2) GO TO 60 H< 17)=SEG5( T5) 2 H(13) = SE01( Tl ) H(19)= SEC2(T2) H( 20)= SEQKTl) H(2l)= SEQ3(T3) H(22)= SEQ1(Tl) H(23) = SE02(12) H (2 4 ) = S E Q1 ( T 1 ) H(25)= SEC4(T4) H( 26 5= SEQl.(Tl) H (2 7 ) = S E C2 IT 2 ) H{28)= SE01(Tl) H(29)= SEQ3(T3) H (3 0) = SEG1 (Tl. ) H { 31 ) --- SE 0 2 ( T 2) H(32) = SEQKTl) IF (N.£0.32.) RETURN IF(N.EQ.64)G0 TO 70 H(33)=S E06 <T6 ) H ( 34) = SEQ1 ( Tl ) H( 35) = SEQ2(T2) H(36)=SEQ1 (T). ) H ( 37) = SEQ3 ( T3 ) H(38)=SEQ1(T1 ) H(39)=SE02 (T2 ) H( 40) =SE01( Tl ) H(4 1 )=SEQ4(T4) H(42)=SEQ1(Tl) H(43) = SEQ2( T2) H (44 )=SEQ1 <T 1 ) H145)=SEQ3(T3) Hi 46J-SEQ 1( Tl) H(47)=SEQ2 (T2") •H'(43-S='SE'8-I-{-Tl^ H (49 ) = SEQ5(T5) H(50)=SEQ1(T1) H( 5l)=SEQ2( T2) H(52)=SE01(T1) H ( 53)= SE 03(T3 ) H(54)=SE01(T1) H(55)=SEQ2(T2 ! H(56)=SEQ1(T1) H(57 )=SEQM T4 ) H(58) =SE01(Tl ) H(59)=SE02(T2) H (60 )=SEQ1(T1) H(61) =SEQ3(T3) H(62) = SE01(T1) H(63)= SE02(12> H(64)=SEQ1(Tl) IF(N .EQ .64)RETURN IF(N.E0.128)G0 TO 8 0 H(65)=SE07(T7) H(66 )=SEQ1( Tl ) H (67)=SE02{T2) H(68)=SE01( Tl) H(69 ) = SE03(T3) H(70)=SE01(Tl) H(71)=SEQ2(T2) H(7Z >=SEQ1(T1 ) H(7 3)~ S E Q4{T4) H(74) = SEQ1(T1) H(75)=SE02 (T2 ) H(76)=SEQ1(Tl) H (77 )=SEQ3(T3 ) H(78J=SEQ1(Tl) H(79) = SEQ2(T2) H (80 ) = SEQ1 (T 1 ) H (81 )=SE05(T5) H(82 ) = SEQ 11T 1) Hi 03)=SEQ2(T2 ) H{84)=SE01 (Tl ) H(85)=SEQ3(T3 5 H(86)= SEQ1(Tl ) H( 87) = SEQ2( T2) H (RO ) = S EQ 1 (Tl ) H{89)=5EQ4(T4 ) H( 90) = SEQ1(Tl) H(9i.) = SEQ2(T2) H(92)=SEQ1(Tl) H( 93)=SE03( T3) H(94)=SE01(Tl) H ( 95) = SEQ2 ( T2 ) H { 96 >= S EQ 1 ( T 1 ) H(97)= S EQ6 (T6 ) H{ 98/ = SEQ 1(Tl 5 H(99 )=SEQ2(T2 ) H (100 )'=SEC1 (Tl ) H( 101) = SEQ3( T3) H (102 ) = SEC1 (T 1 ) •H(103)=SEC2(T2) H{ 104) = SEQI (Tl) H ( 105)=SEC4(T4) H( 106) = SEQIITl) H(107! = SEQ2IT 2) H (108)--SEG1 (Tl) HI 109) = SEQ3(T3) H(110)=SECUT1) H{111) = SE 02(T2) H ( 112) = SE01(T1) H (.1.13 ) = S EC5 (T5 ! H ( 11 4 ) = SE 01. ( T1 ) H 1115 )=SEQ2(T2) H(116)=SEC1(T1) HI 117) = SEQ3( T3) H(118) = SEG1(T1) H(119)=SEG2(T2) H( 120) = SEQ1(T 1) H(121)=SEG4!T4) H(122)-SE01(T1) H(123)=SEQ2(T2) H(124)=SEGl(Tl) H ( 1 2 5 ) = SE 0 3 ( T 3 ) H ( 12 6 ) = S E 0 I { T I ) H <i27)=SE02 (T2 ) H( 128) = SFQ1( Tl) IFIN.E0.12S )R ETiJRM 1F(N.NE.256)RETURN1 H{129)=SEQ8(D) H<130)=SEQ1(T 1) H{131) = SEG2IT?) H (132)= SEQKTl! H < 133 ) =SEC3 <T3 ) H ( 134) = SEQKTl ) H(135)=SEG2(T2) H(136)=SFGI(T1) H( 137)=SEQ4( T4) H (138 )=SEQl (T 1 ) H< 1391 = SEG2 (T2) H( 140)-SEQKTl) H (141 ) = SEQ3 (T 3 ) H ( 142)=SEG1 (T 1 ) H( 143)=SEQ2(T2) H (144)=SEC1 (T 1 ) H{ 145) = SEQ5(T5) H( 146)= SEQKTl) H(14 7) = S E C2 (T2 ) H { 14 8) = S E 0 K TI ) H(149)=SEG3(T3) n"{ 1 50 > •= St 01. ( T I ) HI I51) = SEQ2( T2) H (152 ) = SEG1 (T 1 ) H(153)=SEG4(T4) H( 154) = SE0'K Tl ) H(155 ) = SFQ2 (T2 > H(156! = SF Q1 (T1) H(157)=SEG3(T3I H<158)=SFG1(T1) H( 159) = SE02(T2-) H( 160)= SEQKTl ) H { 161)= SEG6 (T6 ) H( 162) = SEQ] (71) H (163 ) = SEC2 (T2 ) H(164)=SE0l(Tl) H(L65)=SEQ3(T3) H(166)=SEG1(Tl ) H(167)=SEQ2(T2) H (168 )=SE0 KT 1 ) H(169)=SEC4(T4) H( 170)= SEQKTl) H(171)=SEG2(T2) H(172)=SEC1(Tl) H ( 173)= SEQ3 ( T3) H (1 74 ) = SEC1. (T 1 ) H( 175) = SE02tT2) H(176 H (177: H( 178 H (179 H(iso: H( 181 H (18 2 H(183: H ( 184 HU85. H ( 1.8 6: HU87 HUBS H( 189 H (190 H {1 9 1 H( L92 H (193 H ( 194 H (.19 5 H (196 H( 197 H( 198 H (199 H ( 200 H(201 . HJ2.02 H { 203 H (204 HJ205 H( 206 H (207 H (208 H ( 209 H ( 21 0 H { 211 H (212 H ( 213 H ( 214 H(215 H (216 H( 217 H I 2 18 H ( 2 19 H (220 H (221 H ( 222 H ( 223 H (224 H{ 22 5 H (226 H ( 2 2 7 H(228 H (2.2 9 SEQKTl) SEC5 ( T5 ) SEQKTl) SEC2 (T 2 ) S E G1 (T 1 ) SFQ3(T3) Sf-Gl (T 1 ) SEQ2 (T2) SEQKTl) SFC4 (T4 ) SEQK Tl ) SEQ2(T2) SEC1 (Tl ) SEQ3{13) SEQKTl) SEC2(T2) SEQKTl) SEC7 (T7 ) SEQK Tl ) SEQ2(T2 ) SEQKTl) SEC3(T3) SEQKTl) SEG2 (T2 ) SEQ1 (Tl ) SEQM TM .SECl.ITi ) SEQ21T2) SEQKTl) SEG3(T3) SEQK Tl) SEC2(T2) SEG1 (Tl) SEQ5(T5) SEC1(Tl) SF Q2(T2) SEQ1(T 1 ) SEC3 (T3 ) SEQKTl) S EQ2(T 2 ) SEQ1 (Tl) SEQM T4) S E C K T 1 ) SE C2(T2) SEQKTl) SEC.3 (T3 ) SEQK Tl) S EQ 2(T 2 ) SEQ1 (Tl ) SEQ6( T6) SEQKTl) SEQ2 { T2) SFQK Tl) :SFQ3 1T3 ) 81 H ( 23 0) '= SF Q 1 ( T1) H(231 )=SEQ2(T2) H ( 2 3 2 ) := S F G1 ( T I ) H ( 23 3 ) = SE Q4 ( T 4 ) H(234 ) = SFC1 (T 1) M(23f5)=SEC2(T2) H( 236) = SEQ1(T1) H(237)=SEG3IT3) H( 238) = SEGi (Tl ) H{239) = SEQ 2(T 2 ) H(24 0)=SEC1(T1) H{ 241) = SEQ5( T5) H(242)=SEQ1(T1) H(243)=SEQ2(T2) H( 244) = SEQ1(T1) H(24.5) = SEQ3(T3) H (246) = SEC1(Tl) H(24 7) = SFQ2(T2) H{?48 ) = SEC1 (Tl) H(249)=SEQ4(T4) H(250)=SEQ1(T1) H(251)=S E G2(T 2 ) H ( 2 5 2 ) -= SE Q1 ( T1) H(253)=SEQ3(T3) H (254)=SEQ1 (Tl) H(255) = SFG?(T2) H{256 ) = SE01(T1) .R-F.T4;|.-RN 50 H(9)=SE04(D) GC TO 1 60 H(17) = SE05(D) GO TO 2 70 H (33 1-SEG6(D) GC TO 3 80 H (65) = SEQ7(D, 128) GC TO 4 100 H(5)=SE03(D) H(6)= SF0UT1) H (7)= SEQ2(T2) H( 8)= SEOK Tl ) RETURN 90 • H(1)=D(1)+D(2 ) H(2)=D(l)-0(2) R ET URN 95 H(3)=SEQ2(D,4) H<4) = SEQ1(T1) RETURN END FUNCTION SFQS ( D r N) REAL D(N) CCHMQN/HADT/T7(123) »T 6 { 6 4 ) , T 5 (32) »T 4 (.16 )»T3(8)»T2 (4), T 1( 2) IF{N„NE. 2 565 GO TO 102 DC 101 1=1,12 8 101 T7(I)=D(2*I)+D(2*I-1) GO TO 201 82 102 IF(N.NE.128)GG TO 1 04 DO L03 1=1,64 103 T6( I) =0(2* I )+D{2* I-l) GC TO 301 104 IF(N.NE .64 ) GO TO 2 DC I 1=1,32 1 T5(I)=D(2*I-1)+D(2*l) GO TO 8 2 I FlN.NE.3 2) GO TO 4 DO 3 1=1,16 3 T4CI )=D(2*T-1 )+D( 2*1) GC TO 10 4 IF {N.NE.16) GO TO 6 DO 5 1= 1, 8 5 T3U) = D(2*I-1 )+D(2*I) GO TO 12 6 IF {N.NE.8) GO TO 15 DC 7 1=1,4 7 T2(I ) = D( 2*I-1) + D( 2*1) GC TO 14 201 DO 202 1=1,64 202 T6{ I )=T7< 2*1 »+T7( 2*1- 1) 301 DO 302 1 = 1 ,32 302 T5{I)=T6(2*1)+T6(2*1-1) 8 ' DO 9 1=1, 16 9 T4U)=T5(2*I-1)+T5(2*I) •10 -nn 11 -4 = 1-; 8 11 T 3 ( I ) =74 ( 2* I-1) +T 4 { 2* I ) 12 00131=1,4 13 T2(IJ = T3( 2*T-1)+T3( 2*11 14 Tl(1)=T2i1)+T2(2) TK 2) = T2( 3)H2(4) SE0S = T1(1 )+Tl(2) RETURN 15 IF (N.NE.4) GO TO 100 Till ) = D(1 )+D(2) T112)-D(3)+D(4) SEQS=T1( 1I + TK2) RETURN 100 WRITE (6,105) 105 FORMAT (» «, 'INVALID N IN CALL TO HAD8) STOP END FLNCTION SEGKD) REAL 0(2) SEQ1= 0(1) - D(2) RETURN END FLNCTION SEG2{D) R E AL 0(4) CCMM0N/HADT/T7(123),T6164),T5(32),T4{16),T3(8),T2 14),T1<2) 11(11-0(11-0(2) .T 1 (2 )= D( 4 ) - D( 3 ) SE02= Tl(l) + Tl(2) 83 RETURN END FUNCTION SEC3(C) REAL 0(8), CGMMON/HADT/T7(128) ,T6(64),T5(32) ,T4( 16) ,T3(8) ,72(4) ,11(2) T2(l) = D(l) -012! T2(2) = D( 4) - D( 3) T 2 ( 3 ) = C ! 5 ) - D ( 6 > T2<4)=0(8>-D(7) Till )=. T211) .+ T2(2) TK2)=T2(3)+T2(4) SEQ3 = TK 11 + TK 2) RETURN END FUNCTION SEQ4(D) REAL D( 16 ) CGMMQN/HADT/T 7(128) ,T6 (64 ) , T5 ( 32 ).,T4 < 16 ) ,T3( 8 ) ,T2 (4 ), T 1( 2 ) 0 0 11=1,4 1 T3(2*I )=0(4*I)-D(4* I-l ) DO 2 1=1,4 2 T 3 (2*1- 1 ) = D( 4*T-3 )-D( 4*1-2) DC 3 1=1,4 3 T2(I)=T3(2*I)+T3(2*1-1) DO 4 1=1,2 4 TK I ) =72 (2*1 )+T2 (2* I-l ) SEQ4= T K 1) + TK 25 RFT.IJ.RN END . . FUNCTION SEQ5(0) REAL 0(32 ) C0MM0N/HA0T/T7U23) f T6 (64 ) ,T5 ! 32 ) , T4 ( 3.6 ) ,T3 ( 8 ) , T2 (4 ) , T K 2 ) D 0 1 I = 1, 8 1 T4(2*I)=D(4* I)-D(4* 1-1) DO 2 1=1,8 2 T4(2*I- 1) = D( 4*I-3)-D( 4*1-2)' DO 3 1=1,8 3 T3( I )=T4( 2*1. )+T4(2*I-l) DO 4 T=l,4 4 T2(I)=T3(2*T)+T3(2*I-1) DO 5 1=1,2 5 TK I ) =T2 ( 2*1 )+T2 (2* I-l ) SEQ5= TK1) + Tl ( 2 ) RETURN END FLNCTION SEG6(D) REAL D(64) CCMM0N/HACT/T7(120),T6(64),T5(32),T4( 16)tT3(8),T2(4),Tl(2) 0 0 11=1,16 1 T5(2*I) = D(4*I )-D(4*I-l) DC 2 1=1,16 2 T5(2*I-l)=D(4*I-3)-Dl4*I-2) DO 3 1=1,16 3 T4( I )-=T5(2*I ) + T5 (2*1-1) DO 4 1= 1, 8 84 4 T3( I ) = T4( 2*I)+T4(2*I-1 ) DC 5 1=1,4 5 T 2 ( I ) = T3( 2*1 )+T3( 2*1- 1) DO 6 1=1,2 6 Tl(I) = T2(2*15 + T2!2*1-13 SEQ6=T1( 1)+Tl(2) RETURN ENn FUNCTION SEQ7{C) REAL HI 12 8) C DM M ON /HA 0 T / T 7 { 128) ,T6 { 64) , T5 (32 ) , T4 (16 ) ,T3 ( 8 ) , T2 (4 ) ,T1 (2 ) DC 1 1=1,32 1 T6(2*1.)=D(4*1)-D(4*1-1)' DO 2 1= 1, 32 2 T6(2*I-l)=D(4*I-3)-D(4*I-2) 00 3 1=1,32 3 T5( I )=T6(2*I)+T6(2*I-1) DO 4 1=1,16 4 T 41 I ) = T 5 ( 2 * I ) •+ T 5 ( 2*1-1) DC 5 1=1,8 5 T3(I)=T4(2*1)+T4(2*1-1) DO 6 1=1,4 6 T2(I)=T3(2*J)+T3(2*T-l) DO 7 1=1,2 7 T 1 ( I ) = T2(2* I )+T2(2*I-1) SEQ7 = T1 ( D + Tl (2) RE TURN-END FUNCTION SEQ8(D) REAL 0(256) C CMMGN/HACT/T7(128)fT6(64),T5(32),T4(16),T3(8),T2(4),TI(2) DO 1 1 = 1,64 1 T7(2*I ) = D(4*I )-D( 4*1- 1) DC 2 1 = 1 ,64 2 T7(2*I-l)=DI4*I-3)-0(4*1-2) DO 3 1=1,64 3 T6(I)=T7(2*T.)+T7(2*I-l) DO 4 1=1,32 4 T5(I )=T6<2* I )+T6(2*I-l) 0 0 5 1 = 1, 1 6 5- T4( I )=T5( 2*1 )+T5( 2*1- 1) DC 6 1=1,8 6 T3( 1 )=T4( 2*I) + T4( 2*1-1) DO 7 1=1,4 7 T 2 ( I ) = T 3 ( 2 * I ) + T 3 ( 2* I -1 ) DO 8 1=1,2 8 Ti( I )=T2( 2*1 )+T2(2* I-l ) SE08 = T1( D + TK2) R FT URN END 85 APPENDIX B. DETAILS CONCERNING PEARL'S PERFORMANCE MEASURE FOR TRANSFORM CODING SYSTEMS B.l Derivation of Equations (3.11) and (3.12) Since image transform coefficients are processed independently in accordance with the basis-restricted structure of Figure 3.1, the Shannon rate-distortion function for a single Gaussian source [37] can be used to bound the minimum rate, min R_^, required to obtain a maximum dis tortion of D_^ in the i*"*1 transform coefficient. That is, assuming no re strictions exist on the scheme used to quantize v^, the i*"^ transform. coefficient, then: 1/2 log o.2/D. , o.2> D. 1 XX X min Ri(Di) 0 , o±2<: D (B.l) = max{0,l/2 log o^/J) } i = 1,2,...,N 2 where o. is the variance of v.. Substitution of (B.l) into (3.9) yields x x the following minimization problem: N 2 R.7(A,D) = .. min [1/N -Z max{0,l/2 log a. /D.}] (B.2) ^ '{D1,D2,...,DN> € D* i=l 1 1 where each set {D^,D^,...,D^} in D satisfies equation (3.10). Solution of equation (B.2) requires allocating part of the total distortion D to each coefficient in such a way that the total rate RN(A,D) required to transmit the N coefficients is a minimum. This problem, as Pearl et. al. [3] point out, was solved by Kolmogorov [38] and results in equations (3.11) and (3.12). Examination of (3.11) and (3.12) reveals that the expression in (B.2) is minimized, subject to the constraint in (3.10), by setting 2 D. = min{6,a.} (B.3) xi Equations (B.3), (3.11), and (3.12) say that one should encode each transform coefficient with the same distortion, 0, as long as the distortion does not exceed the variance of the coefficient. If 9 is greater than the variance of some of the coefficients, then those coeffi cients should not be transmitted. As a result, the mean-squared-error distortion associated with those coefficients will be identical to their variances as specified by (B.3). B.2 An Objection to Pearl's Measure The derivation of (3.11) and (3.12) as a performance criterion • for .transform •ceding ••s-ys-tems "i-s-->subject' to"~some-*is>b jection . •-Stranrvor. '-s rate-distortion function, used for min IL(Ih) in (B.l), is only attained in the.limit of infinitely long blocks which is contrary to the basis-restricted requirement that each transform vector, v, be coded upon ar rival. This objection is not as serious as it may appear since the theore tical performance of practical coding schemes can be shown to obey equa tions having the same functional form as equation (B.l). For example, consider the use of an optimum Max quantizer [20] to code the transform coefficients. Max found that over a wide range of parameters the distortion for Gaussian coefficients quantized optimally in the MSE sense has the following form: D. .= go2 MT01 (B.4) XXX v ' where 1.74 £ a <: 2, 1.32 £ 3 $2.72, M. is the number of quantizer levels 87 9 used to quantize v., and a. } D.. Hence min R.(D.) is given approximately by: o 2 1 1 min R.(D.) = log M.. = - log(—) , (B.5) -2 a. D. , x 1 which has the same form as min R^D^ in(B.l). Substituting (B.5) into (3.9) yields: ,1 V 1 , . _ / IN R.CA.D) = min I -log(—)] (B.6) * {D1>D2,...,DN)6D i=l i which is the same minimization problem as specified by (B.2) and hence has a solution of the same form. /.Thus, ..aM.hou.gh. JL1») ~.an-d*(-3...12). .repr.eseivt^a fic.ti.tio.us rate that cannot be achieved using single block coding, they are nevertheless a useful figure of merit considering the degree of approximation attainable by practical schemes. APPENDIX C. LISTINGS OF THE PROGRAMS USED TO CALCULATE e Cl Explanatory Remarks The programs listed below pertain to the Hadamard transform coding systems studied in Chapter 4. Analogous programs to compute eOTrD SUB, for the Fourier systems are so similar that it was deemed unnecessary to list them. C.2 Program Listings SUBROUTINE SORTCSiN). C THIS S/R SORT S THE 1-DIM. ARRAY OF VARIANCES,'S8 INTO C ASCENDING ORDER. REAL S(M) NN=N/2 N1 = N-1 DO 2 1=1,NI 11=1+1 DO 1 J = 11 , N -.1 F {,$-{ .<M •••GE-. S>H<M G0 TO 4 T FMP = Si I ) S(I ) = S( J) S(J)=TE HP .1 .CONTINUE 2 CONTINUE RETURN END SUBROUTINE G THE TA[D,R,A,THE TA) C THIS IS A S/R WHICH, GIVEN A DISTORTION VALUE, 0, AND A C SET OF TRANSFORM COEFFICIENT VARIANCES, FINDS THE CORRE C SPONDING VALUE .OF THE. PARAMETER, THETA, IN THE RAT E—V ERS US C 'DISTORTION EQUATIONS (3.11) AND (3.12). WHEN THETA IS FOUND C IT IS USED TO COMPUTE THE RATE R ACCORDING TO (3.11). THETA C IS RETURNED TO THE CALLING PROGRAM SO THAT IT CAN BE USED TO C COMPUTE THE MEAN-SQUARED-ERROR IN EACH OF THE TRANSFORM C COEFFICIENTS USING EQUATION (B.3). REAL A (64) C A CONTAINS THE VARIANCES SORTED INTO ASCENDING ORDER. CHECK=64.0*D INOEX=T SUM I=0.0 DEN0M=ALOG 1 C( 2.0 5 C FIND THETA. 89 C FIRST CHECK THE SHALL DISTORTION POSSIBILITY. I.E. THETA=D. IF(D.LT.A( I) )GO TO 10 C ALSO CHECK FOR THE 0=1.0 CASE. IFtD.EQ.I.OiGO TO 11 DO 2 1=1, 63 S U M I = S U M I + A ( I ) THETA=(CHECK-SUMI)/(64.0-1) IP=I + 1 IF( THET A. GE . A ( I) . AND. THETA.LT .A { I P ) )G0 TO 1 GO TO 2. 1 I.NQEX= 1+1 GO TO 3 2 CONTINUE C NOW FIND R. 3 R = 0.0 DO 4 .1 = INDEX, 64 4 R=R + AL0GL0 ( A( I )/THETA ) /OENOM R=R/128.0 RETURN 10 THETA=D' GC TO 3 11 THETA=A ( 64) + 1 0. R=0 .0 RETURN END REAL S{ 8, 8) ,A164),TFMP( 8) ,0 I{ 8,8) ,P(8 ,129) REAL POWER(256,256),MSE C C THIS PROGRAM COMPUTES THE SUBJECTIVELY WEIGHTED ERROR POWER C SPECTRUM AND THE SUBJECTIVE MEAN-SQUARED-ERROR FOP IMAGES C WHICH HAVE BEEN PROCESSED BY A 2-DIM. HADAMARD TRANSFORM C CODER USING 8 X 8 BLOCKS. THE COMPUTATIONS TARE ADVANTAGE C OF: (1) THE SEPARABILITY OF THE AUTOCORRELATION FUNCTION C FOR THE IMAGE PROCESS C (2) THE SEPARABILITY OF THE 2-DIM. HADAMARD BASIS C FUNCTIONS INTO THE PRODUCT OF 2 1-DIM. BASIS FUNCTIONS C AND (3) THE SEPARABILITY OF THE 2-DIM. DFT INTO 2 l.-DI M. C DFT'S . C C INPUTS TO THE PROGRAM ARE: (1) THE 8 1-DIM. VARIANCE C VALUES USED TO GENERATE 64 2-DIM. VARIANCE VALUES;' C (25 THE 3 2 56-PT. POWER SPECTRA OF THE 3 1-DIM. HACAM ARC C BASIS VECTORS; (3) A VALUE OF THE.TOTAL DISTORTION, D. C C THE OUTPUTS ARE: (1) THE RATE, R, CORRESPONDING TO D; C 12) THE SUBJECTIVE MEAN-SQUARED-ERROR; (3) THE UNMODIFIED C M E A N-S CU ARED-E R R0 R. C C FIRST, OBTAIN THE 1-DIM. VARIANCES. D 0 2 1=1,8 1 FORMAT(F16.8) 2 RE ADO, 1 5TEMP( I ) 90 C GENERATE' THE 2-DIR. VARIANCES. 0 0 3 1=1,8 DC 3 J = l , 3 S{I ,J)-TEMPI 1 )*TEMP{J) C STORE THE VARIANCES IN A FOR LATER SORTING. K=U-1)*8+J 3 A ( K ) = S ( I , J ) C SORT THE VARIANCES FOP LATER CALCULATIONS OF RATE AND C INDIVIDUAL TRANSFORM SAMPLE DISTORTION VALUES. C ALL SORT (A,64) C NEXT, READ IN THE 256.-PT. POVi ER- SPECTRUM OF THE 8 HADAMARD C BASIS VECTORS STORED IN A SEQUENTIAL FILE ON UNIT 4. DO 5 1=1,8 4 FORMAT! ' * , 129F12 .6 ) 5 R E A D(4,4) (Pit ,J) » J = 1» 1 2 9) C ACTUALLY, ONLY 129 PTS. ARE REQUIRED BECAUSE OF THE SYMMETRY C. PROPERTIES OF THE POWER SPECTRUM OF A REAL SIGNAL. C C NEXT, READ IN A VALUE OF D.» THE TOTAL DISTORTION, FOR WHICH C THE RATE ANO MEAN—SQUARED-ERRORS ARE TO RE CALCULATED. 6 R EAD( 5,7,END=1OO)D 7 FCRM-AT ( F6 .4 ) C. CALL S/R GTHETA TO CALCULATE P(D) AND THETA, THE PARAMETER IN C THE RATE VERSUS DISTORTION EQUATIONS. CALL GTHETA( D, R, A,THETA ) C NOW USE THETA TO DETERMINE THE INDIVIDUAL TRANSFORM SAMPLE ...C DJSTO.RTJ..ON. VAl U.FS . DO 9 1=1 ,8 DO 9 J = 1, 8 IF(S( I, J) .-1. E . T HET A ) GO TO 8 • D I ( I , J ) = T H E T A GO TO 9 8 D I ( I , J ) = S ( I , J ) 9 CONTINUE C NOW CALCULATE THE ERROR SPECTRUM. C FIRST, INITIALIZE THE POWER ARRAY. DO 10 M=L,2 56 DC 10 N=1,2 56 10 PCWER(M,N)=0.0 DO 11 1=1,8 DO 11 J=l,8 DC 11 M=l , 129 DO 11 N= 1,129 11 POWER!M,N) = POWER(K,N)+P{ I,M ) *P(J,N ) *DI( I,J)/4096.G DO 12 M=l,129 DO 12 N=130,256 NN=258-N 12 PQWER(M,N)=POKER.(M,NN) DO 14 M=130,256 MiY=258-M DO 13 N = 1 , 129 13 POWER(M,N)= POWFR(MM,N) DC 14 N = 130 ,2 56 N M= 2 5 8- N 14 PCWER(M,N)=POWER(MM,NN) C NOW COMPUTE THE MSE AND THE SUBJECTIVE MSE. MSE=0.0 DC 50 M=l,256 DO 50 N=l ,2 56 50 MSE= MSE +P0WER(M,N ) -SBJMSE=0.0 DO 17 M=l , 1.29 DO 15 N=l, 129 R S Q = ( K * - 0 7 > * * 2 + < N * . 0 7 > * * 2 WE LGHT=742. /( 661.+R SO)—2. 4 63/(2.4 5 9+R SO) 15 POWER (M,N)=PO W E R (M, N ) * W E 1 G H T * * 2 DO 16 N=130,256 NN=129-N R S Q = ( M* . 0 7 ) ** 2 + ( N N* . 0 7 ) ** 2. WEIGH 1 = 742. / ( 661. +RSQ )-?. .463/ (2.459+RSQ) 16 POWER (MtN ) = POWER ( M, N ) *WE IGHT**2 17 CCNTINUE DO 20 M=13G,256 M M=129—M DO 18 N = i ,129 RSQ=(MM*.07)**2+<N*.07)**2 WEIGHT = 742./(661 .+RSO)-2.463/( 2.459+RSQ ) 18 P0WER(M,N)=POWER(M,NJ*WEIGHT**2 DO 19 N=130,256 NK=129-N RSQ=(f*M*.07)**2*<NN*.07)**2 W FIGH T= 742. / ( 661 . + R SQ )-2. 4 63/(2.4 59+R SO) 19 P0WER{MiN) = PCWER(M,N)*WEIGHT**2 20 CONTINUE DO 60 M=1,25 6 DC 60 N = l ,256 60 S BJM SE = SB J MSE + POWER(M,N) C SCALE MSE AND SUBJECTIVE MSE SO THAT THEY ARE < OR =• • 1. 0 MSE=MSE/654C8.68 S BJMS E= SBJM SE/53791 .04 W PIT E(7,21)D,R,MS E,S BJ MS E 21 FORM A T( • • ,4(F16.8,3X) ) GO TO 6 100 STOP END The following program computes the power spectra of the eight one-dimensional sampled Walsh functions which serve as basis functions for the Hadamard transform when the data blocks being transformed are of length eight. Because two-dimensional Walsh functions are simply the 92 products of two one-dimensional Walsh functions, the power spectra of the two-dimensional sampled Walsh functions are computable as products of two one-dimensional power spectra. The program below produces the one-dimensional power spectra which are combined in the previous pro gram to produce two-dimensional spectra. REAL K(256)/256*0.0/,P(8,129) C THIS PROGRAM GENERATES 8 256-PT. POWER SPECTRA OF THE 8 C 1-DIM. HADAMARD BASIS VECTORS OF LENGTH OF 8. THESE CAN C RE USED LATER TO FINC THE 256 X 256 PT. POWER SPECTRA OF C THE 64 8X8 PT. HADAMARD BASIS MATRICES. I NTEGER N(1 J/256/ COMPLEX TRANI129) EQUIVALENCE (H,TRAN) DO 4 NN =1,8 C OBTAIN THE. NEXT 8 PT. BASIS VECTOR. READ! 5, 1) (HI I ) ,1 = 1, 8) 1 FORMAT ( 8F6.3 ) C DFT H . CALL FOUR 2(H,N, 1,-1,0) C COMPUTE THE POWER SPECTRUM OF H. n 2 r=l-f-12-9 2 P ( NN, I}= CABS(TR AN ( I ) )** 2 -W RIT EI 6,10 ) (P(N N,I) ,1 = 1,129) 10 FORMAT!• ',129F12.6) C REINITIALIZE H. DO 3-1 = 1,256 3 H ( I ) = 0. 0 4 C CNT I NUE STOP END REAL S( 64) , A( 64) ,DI ( 64) , PU{ 64 ,12 9) , POWER! 256 , 256) REAL PV!12"5,MSE,POWERU!129 ) C'THIS PROGRAM COMPUTES THE SUBJECTIVELY WEIGHTED ERRCR POWER C SPECTRUM AND SUBJECTIVELY WEIGHTED MSE FOR IMAGES WHICH HAVE C BEEN PROCESSED BY TRANSFORM CODING 1-DIM. BLOCKS OF LENGTH 64 C USING THE HADAMARD TRANSFORM. USE IS MADE OF THE FACT THAT C .1. -DIM. PROCESSING IS EQUIVALENT IN SOME SENSE TO A 1-DIM. C. FILTERING OPERATION. HENCE THE ERROR POWER SPECTRUM HAS THE C SAME VARIATION WITH V, THE 'VERTICAL FREQUENCY' AS THE C ORIGINAL IMAGE POWER SPECTRUM HAD DUE TO THE SEPARABILITY C CF THE ORIGINAL'S AUTOCORRELATION FUNCTION. THE U OR C 'HORIZONTAL FREQUENCY' COMPONENT OF THE ERROR SPECTRUM IS C FOUND BY DETERMINING THE. POWER SPECTRA OF THE BASIS FN'S., C WEIGHTING THESE BY THE APPROPRIATE DISTORTION VALUES C BT AI NED C F POM RIO) CALCULATIONS, AND THEN SUMMING THEM TO FIND THE C U COMPONENT. C 93 C FIRST, OBTAIN THE VARIANCES USED TO DETERMINE INDIVIDUAL C TRANSFORM SAMPLE DISTORTION VALUES. ALSO, STORE THEM IN C ARRAY A FOR LATER SORTING. DO 2 1=1,64 1 FORMAT!F16.8) READ(3, 1 )S( I ) 2 A { I ) =S( I ) C SORT THE VAR IANCES . CALL SORT(A,64) C READ IN THE V COMPONENT CF THE ERROR POWER SPECTRUM. READI4, 33 )(PV(N ),N= 1, 129) 33 FORMAT( « ' ,129F12 .6 ) C READ IN THF 256-PT. POWER SPECTRA OF THE 64 1-DIM. HACAMARC C BASIS VECTORS .( ACTUALLY, ONLY 129 PTS. ARE REQUIRED BECAUSE OF C THE SYMMETRY OF THE DFT OF A REAL FUNCTION). DO 4 1= 1, 64 4 R E A D ( 5 » 3 } (PUT I,M ) ,M = 1,12 9) 3 FORMATf • « , 129F10.6) C C NEXT, READ IN A VALUE OF D, THE TOTAL DISTORTION. 5 READ?6,6 ,END=100)D 6 FORMAT(F6.4) C CALL S/R GTHETA TO CALCULATE R(D) AND THETA, THE PARAMETER IN C THE RATE VERSUS DISTORTION EQUATIONS. CALL GTHETA(D,R, A,THETA) C NOW USE THETA TG DETERMINE THE INDIVIDUAL TRANSFORM SAMPLE C .-M E. A N —-S-QU AR E.D—E R Ri' R. D.I.STORT,I.ON .VALUES. DO 0 1=1,64 ' . . • IF( S( I ) .LE. THETA.) GO TO 7 PI(I)=THETA GO TO 8 7 DI ( I ) = S ( I ) 8 CONTINUE C NOW CALCULATE THE *U• COMPONENT OF THE ERROR POWER SPEC C TRUM.FIRST, INITIALIZE THE POWERU ARRAY. DO 9 1=1,129 9 POWERUII)=0.0 DO 10 1=1,64 DC 10 M=1,129 10 POWFRLK M)=POWERUt M)+PU(I,M)*DI( I) C NEXT, COMPUTE THE 2-DIM. ERROR POWER SPECTRUM. DO 12 M = l ,1 29 DO 11 N=l,129 11 POWER(M,N)=POWERU(M)*PV(N)/4096. DO 12 N = 1.30 ,256 NN=2 5 8-N 12 POWER(M,N)=POWER(M,NN ) DC 14 M=130,256 MM=2 58-M DC 13 N=l,129 13 POWER(M,N)=POWFR(MM,N) DO 14 N=130,256 .NN=2 5 8-N . 14 ' P 0 WE R ( M , N ) = P 0 WE R ( MM ,NN) .94 C NOW DETAIN THE MEAN SCUARRE ERROR BY SUMMING THE POWER C SPECTRUM, THEN WEIGHT THE SPECTRUM WITH THE SUBJECTIVE C FILTER AND THEN SUM AGAIN TO OBTAIN THE SUBJECTIVE MSE. MSE -=0.0 DO 50 M=l,256 DO 50 N=l,256 50 MSE = MSE + PCWE R ( M, N ) SBJMSE=0.0 DC 17 M=LT129 DC 15 N = l,129 RSO=(M*.07) **2+{N *.07)**2 WEIGHT=742./(661,+RSO)-2.463/(2.459+RSQ) 1 5 PUWERf M,N)=POWER(M,N)-WEIGHT**2 DO 16 N= 130, 2 56 NN=129-N R S0= ( M*. 07 ) **2+ ( NN*. 07) **2 WE I GHT = 742-/1661.+ RSQ)- 2.463/( 2.459 + R SO) 16 PGWER{M,N)=P0WER(M,N)*WEIGHT**2 17 CONTINUE DC 20 M=130,2 56 MM=129-M DO 18 N =1,129 RSQ=(MM*.0 7)**2 + (N*.0 7)**2 WEIGHT=742./(661.+ RSQ)-2.463/(2.459+RS0) 18 POWER IM-,N )=POWER ( M, N )*WE IGHT**2 DC 19 N=130,256 N.N=,1,29-N RSQ={MM*.07)**? + (NN*.075**2 WEIGHT=742./(661 .+ R S Q)-2.4 6 3/(2.4 59+RSO) 19 POWER(M,N)=P0KE R(M,N) *WEI GHT**2 20 CONTINUE DC 60 'M=l ,2 56 00 60 N= 1, 2.56 60 S 0 J MS E= S B J MS E + POW ER ( M , N ) C SCALE MSE AND SUBJECTIVE MSE SO THAT THEY ARE < OR = 1.0 MSE=MSE/862.4539 S e J M S E= S R J M S F / 6 7 0 . 9 5 8 WRITE(7,21)D,R,MSE,SBJMSE 21 FORMAT (* • , 4( E16.8, 3X ) ) GC TO 5 100 STOP END REAL H(256)/256*0.0/,P{256) C THIS PROGRAM COMPUTES THE FIRST 129 POINTS OF EACH OF THE C 64 1-OIMENSIONAL 256-PQINT POWER SPECTRA FOR EACH 64rP0INT C HADAMARD BASIS VECTOR. C ONLY 129 POINTS APE REQUIRED DUE TO THE CONJUGATE SYMMETRY OF C THE DISCRETE FOURIER TRANSFORM OF REAL DATA. INTEGER N(l)/256/ CCMPLEX T R A N(12 9) EQUIVALENCE {H,T R A N) ' DO 6 NN =1,64 C OBTAIN THE NEXT 64 PT. HAD. BASIS VECTOR. REALM 5, 1 ) (H{ I. ) , T = 1 , 64) 1 FORMAT( ' 1 ,64F6.3) C PERFORM A 2 56 PT. DFT . CM IT. CALL FOUR 2 (HUN, 1,-1,0) C COMPUTE THE 256 PT. POWER SPECTRUM. DO 2 1 = 1 ,1.29 2 P( I ) = CABS(IRAN(I ) )**2 C STORE THE POWER SPECTRAL VALUES IN A SEQUENTIAL FILE. C ONLY NEED TO STORE THE FIRST 129;OTHERS MATCH. WRITE(7,4 ) (P(K),K=1,129) 4 FORMAT( 5 » , 129F10.6 ) C REINITIALIZE THE H ARRAY. DC 5 1=1,256 5 H(I)=0.0 6 CONTINUE STOP END. REAL -PV( 256) ,OUT( 12 9) C THIS IS A PROGRAM TO GENERATE THE 256-PT. DFT OF THE C Y-COM PON ENT OF THE AUTOCORRELATION FN. THIS IS USED AS C THE 'VERTICAL FREQUENCY' COMPONENT" IN ANOTHER PROGRAM C WHICH COMPUTES AN ERROR POWER SPECTRUM. I N'TEGER N (1 ) /2 56/ X OMR LEX . TRAN.( .1.2-9.) EQUIVALENCE ( PV , T R AN ) C GENERATE THE ELEMENTS OF RV. DO 1 1=1, 129 1 PV([)=EXP(-0.163*1) DO 2 1=130,256 ' K=258-I 2 P V ( I ) = P V ( K ) C NOW DFT RV. CALL F0UR2(PV,N, 1,-1,0) C STORE THE RESULT IN A SEQUENTIAL FILE. DO 22 1=1,129 22 OUT(I)= RE/H.(T RAN(15) W PI TE ( 7 , 4 ) { C U T ( 1 ) ,1=1 ,129) 4 FORMAT ( ' 12 9F1.2.6) STOP E NO 96 REFERENCES H.J. Landau and D. Slepian, "Some computer experiments in picture processing for bandwidth reduction", Bell Syst. Tech. J., vol. 50, pp. 1525-1540, May-June 1971. G.B. Anderson and T.S. Huang, "Piecewise Fourier transformation for picture bandwidth compression", IEEE Trans. Commun. Tech., vol. COM-19, pp. 133-140, April 1971. J. Pearl, H.C. Andrews, and W.K. Pratt, "Performance measures for transform data coding", IEEE Trans. Commun., vol. COM—20, pp. 411-414, June 1972. T.G. Stockham, Jr., "Image processing in the context of a visual model", Proc. IEEE, vol. 60, pp. .828-842, July 1972. J.J.Y. Huang and P.M. Schultheiss, "Block quantization of correlated Gaussian random variables", IEEE Trans. Commun. Syst., vol. CS-11, pp. 289-296, Sept. 1963. P.A. Wintz, "Transform picture coding", Proc. IEEE, vol. 60, pp. 809-820, July 1972. L.C. Wilkins and P.A. Wintz, "Bibliography on data compression, picture properties, and picture coding", IEEE Trans. Inform. Theory, vol. IT-17, pp. 180-197, March 1971. W.C. Wilder, "Subjectively relevant error criteria for pictorial data processing", TR-EE 72-34, Purdue University, Dec. 1972. A. Habibi and'P.A. Wintz, "Image coding by linear transformations and block quantization", IEEE Trans. Commun. Tech., vol. COM-19, pp. 50-60, Feb. 1971. R.G. Gallager, Information Theory and Reliable Communications, New York: Wiley, 1968. 97 11. W.F. Schreiber, "Picture coding", Proc. IEEE, vol. 55, pp 320-330, March 1967. 12. T.J. Goblick and J.L. Hoslinger, "Analog source digitization: A com parison of theory and practice", IEEE Trans. Inform. Theory (Corresp.), vol. IT-13, pp. 323-326, April 1967. 13. L.E. Franks, "A model for the random video process", Bell Syst. Tech. J. , pp. 609-630, April 1966. 14. E.R. Kretzmer, "Statistics of TV signals", Bell Syst. Tech. J., vol. 31, p. 763, July 1952. 15. H. Hotelling, "Analysis of a complex of statistical variables into principal components", J. Educ. Psychology, vol. 24, pp. 417-441, 498-520, 1933. 16. W.K. Pratt, J. Kane, and H.C. Andrews, "Hadamard transform image •coding", '"Proc.-''SEE'S ^-vol. 57, -pp. --5 8-68, 'Jan. -1969. 17. - H.C. Andrews and J. Kane, "Kronecker matrices, computer inplementation, and generalized spectra", J. Assoc. Comput. Mach., vol. 17, pp. 260-268, April 1970. 18. J.W. Cooley and J.W. Tookey, "An algorithm for the machine calculation of complex Fourier series", Math. Comput., vol. 19, pp. 297-301, April 1965. 19. R.E. Totty and G.C. Clark, "Reconstruction error in waveform trans mission", IEEE Trans. Inform. Theory, vol. IT-13, pp. 336-338, April 1967. 20. J. Max, "Quantizing for minimum distortion", IRE Trans. Inform. Theory, vol. IT-6, pp. 7-12, March 1960. 21. P.A. Wintz and A.J. Kurtenbach, "Waveform error control in PCM telemetry", IEEE Trans. Inform. Theory, vol. IT-14, pp 650-661, Sept. 1968. 98 22. L.D. Davisson, "Rate-distortion theory and application", Proc. IEEE, vol. 60, pp 800-808, July 1972. 23. W.K. Pratt and H.C. Andrews, "Fourier transform coding of images", Proc. Hawaii Int. Conference bri System Sciences, Jan. 1968. 24. J. Pearl, "On coding and filtering stationary signals by discrete Fourier transforms", IEEE Trans. Inform. Theory (Corresp.), vol. IT-19, pp. 229-232, March 1973. 25. T. Fukinuki and M. Miyata, "Intraframe image coding by cascaded Hada mard transforms", IEEE Trans Commun., vol. COM-21, pp. 175-180, March 1973. 26. T.G. Stockham, Jr., "Intra-frame encoding for monochrome images by means of a psychophysical model based on nonlinear filtering of multiplied signals", Proc. 1969 Symp. Picture Bandwidth Compression, -T s-S-.~*Hu«irg- «and.- "-0 .--J.' -Tr-et-kaki "Ed's. * -"-New '-York: -Gordon -and Breach, 1972. 27. A.V. Oppenheim, R.W. Schafer, and T.G. Stockham, Jr., "Nonlinear filtering of multiplied and convolved signals", Proc. IEEE, vol. 56, pp. 1264-1291, Aug. 1968. 28. T.N. Cornsweet, Visual Perception, New York: Academic Press, 1970. 29. J.W. Cooley, P.A. Lewis, and P.D. Welch, "Application of the fast Fourier transform to computation of Fourier integrals, Fourier series, and convolution integrals", IEEE Trans. Audio and Electroacoustics, vol. AU-15, pp. 79-84, June 1967. 30. H.F. Harmuth, Transmission of Information by Orthogonal Functions (Second Edition), New York: Springer-Verlag, 1972. 31. W.K. Pratt, "Linear and nonlinear filtering in the Walsh domain", Proc. 1971 Symp. on Applications of Walsh Functions published in IEEE Trans. Electromag. Compatibility, vol. EMC-13, pp. 38-42, Aug. 1971. 99 32. B.L.N. Kennett, "A note on the finite Walsh transform", IEEE Trans. Inform. Theory (Corresp.), vol. IT-16, pp. 489-491, July 1970. 33. C. Bird, "Calculating all the eigenvalues and eigenvectors of a symmetric matrix", UB'C SYMAL, Computing Centre, University of British Columbia, Vancouver 8, B.C., Canada, March 1972. 34. E. Froese, "Discrete Fourier transformations", UBC FOURT, Computing Centre, University of British Columbia, Vancouver 8, B.C., Canada, Jan. 1970. 35. G.D. Bergland, "A guided tour of the fast Fourier transform", IEEE Spectrum, vol. 6, pp. 41- 52, July 1969. 36. E.O. Brigham and R.E. Morrow, "The fast Fourier transform", IEEE Spectrum, vol. 4, pp. 63-70, Dec. 1967. 37. CE. Shannon, "Coding theorems for a discrete source with a fidelity criterion", IRE Nat. Conv. Rec, Pt. 4, pp. 142-163, March 1959. 38. A.N. Kolmogorov, "On the Shannon theorj7 of information transmission in the case of continuous signals", IRE Trans. Inform. Theory, vol. IT-2, pp. 102-108, Dec. 1956.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Theoretical study of transform picture coding systems...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Theoretical study of transform picture coding systems and an investigation of a new measure of distortion.. Baillie, Alexander John Main 1973-12-31
pdf
Page Metadata
Item Metadata
Title | Theoretical study of transform picture coding systems and an investigation of a new measure of distortion in processed images |
Creator |
Baillie, Alexander John Main |
Date | 1973 |
Date Issued | 2010-01-23T00:00:13Z |
Description | The performance of systems which employ a linear transformation and block quantization to encode visual information sources has been investigated. The effect of several relevant design parameters upon the behavior of these systems was examined theoretically. The performance criterion used was the rate measured in bits-per-picture sample required to transmit an image through a noiseless digital channel such that the average distortion in the received picture is less than a specified value. As a measure of image distortion, the mean-squared-error, well-known for its analytical tractability, was used. A second measure of image distortion which incorporates knowledge of the human visual system was examined. The new distortion index proposed by Stockham is expected to correlate well with subjective assessments of picture distortion although no experiments have been performed, as yet, to show this. In order to determine what, if anything, could be learned about this second distortion measure from theoretical considerations alone, a procedure was developed to estimate its value for given bit rates. These values were compared to the mean-squared-error distortion values obtained for the same bit rates in an attempt to gain some insight into the usefulness of the second distortion measure. |
Subject |
Computer graphics |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Collection |
Retrospective Theses and Dissertations, 1919-2007 |
Series | UBC Retrospective Theses Digitization Project |
Date Available | 2010-01-22 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0065542 |
URI | http://hdl.handle.net/2429/19078 |
Degree |
Master of Applied Science - MASc |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- [if-you-see-this-DO-NOT-CLICK]
- UBC_1974_A7 B33_9.pdf [ 4.86MB ]
- Metadata
- JSON: 1.0065542.json
- JSON-LD: 1.0065542+ld.json
- RDF/XML (Pretty): 1.0065542.xml
- RDF/JSON: 1.0065542+rdf.json
- Turtle: 1.0065542+rdf-turtle.txt
- N-Triples: 1.0065542+rdf-ntriples.txt
- Original Record: 1.0065542 +original-record.json
- Full Text
- 1.0065542.txt
- Citation
- 1.0065542.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Country | Views | Downloads |
---|---|---|
United States | 12 | 6 |
China | 8 | 16 |
United Kingdom | 5 | 0 |
France | 1 | 0 |
Russia | 1 | 0 |
India | 1 | 0 |
Czech Republic | 1 | 0 |
Japan | 1 | 0 |
City | Views | Downloads |
---|---|---|
Ashburn | 7 | 0 |
Unknown | 7 | 8 |
Shenzhen | 4 | 16 |
Beijing | 4 | 0 |
Lewes | 3 | 0 |
San Francisco | 2 | 0 |
Kharagpur | 1 | 0 |
Saint Petersburg | 1 | 0 |
Tokyo | 1 | 0 |
{[{ mDataHeader[type] }]} | {[{ month[type] }]} | {[{ tData[type] }]} |
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0065542/manifest