RD OPTIMIZED P R O G R E S S I V E I M A G E CODING USING J P E G By Jaehan In B. Sc. (Electrical Engineering) The Johns Hopkins University, U.S.A. M . Sc. (Electrical Engineering) The Johns Hopkins University, U.S.A. A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF T H E REQUIREMENTS FOR T H E D E G R E E OF M A S T E R OF APPLIED SCIENCE in T H E FACULTY OF GRADUATE STUDIES DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING We accept this thesis as conforming to the required standard T H E UNIVERSITY OF BRITISH COLUMBIA March 1998 .© Jaehan In, 1998 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of Electrical and Computer Engineering The University of British Columbia 2356 Main Mall Vancouver, Canada V6T 1Z4 Date: Abstract The J P E G standard allows four modes of operation. They are the hierarchical (HJPEG), progressive (PJPEG) , sequential (SJPEG), and lossless modes1: The H J P E G and P J P E G modes inherently support progressive image coding. In H J P E G , an image is decomposed into subimages of different resolution, each of which is then coded using one of the other three modes of J P E G . Progressiveness within a resolution in H J P E G can be achieved when each subimage is coded using P J P E G . An image coded using P J P E G consists of scans, each of which contributes to a portion of the reconstructed image quality. While SJPEG yields essentially the same level of compression performance for most encoder implementations, the performance of P J P E G depends highly upon the designed encoder structure. This is due to the flexibility the standard leaves open in designing P J P E G encoders. In this thesis, an efficient progressive image, coding algorithm is developed that is compliant with the J P E G still image compression standard. The JPEG-compliant pro-gressive image encoder is a H J P E G encoder that employs a rate-distortion optimized P J P E G encoding algorithm for each image resolution. Our encoder outperforms an op-' timized SJPEG encoder in terms of compression efficiency, substantially at low and high bit rates. Moreover, unlike existing J P E G compliant encoders, our encoder can achieve precise rate control for each fixed resolution. Such good compression performance at low bit rates and precise rate control are two highly desired features currently sought for the emerging JPEG-2000 standard. 1 Lossless JPEG algorithms are rarely used since their performance levels are significantly lower than those of other lossless image compression algorithms, and are therefore not widely used. 11 Table of Contents Abstract ii List of Tables v List of Figures vi Acknowledgment x Glossary xi 1 Introduction 1 2 Background 4 2.1 Basics of DCT-Based J P E G , 4 2.2 Sequential DCT-Based J P E G 6 2.3 Progressive DCT-Based J P E G 10 2.3.1 Spectral Selection and Successive Approximation First Scan . . . 11 2.3.2 Successive Approximation Subsequent Scans 12 2.4 Hierarchical J P E G 14 3 Rate Distortion Optimized P J P E G 16 3.1 Prioritization of Bits . . 17 3.2 Grouping of Bits . . 22 3.3 Complexity, Memory and Computation Requirements Vs. Robustness . . 31 3.4 Experimental Results 32 i n 4 Constrained Rate Control in P J P E G 46 4.1 First DC Scan 48 4.2 Subsequent DC Scan 50 4.3 First A C Scan . . . 50 4.4 Subsequent A C Scan 53 5 Hierarchically Embedded P J P E G 57 5.1 Selection of Filters " 57 5.2 Selection of Transition Points 61 5.3 Experimental Results 63 6 Conclusions 67 Bibliography 68 IV List of Tables 2.1 Encoding of DC coefficients using D P C M model 7 2.2 Encoding of A C coefficients 8 2.3 First DC scan (first 9 bits) using successive approximation 11 2.4 First A C scan (first 9 bits) using successive approximation 12 2.5 Subsequent DC scan (last bit) using successive approximation 12 . 2.6 Subsequent A C scan (last bit) using successive approximation 13 3.7 Priority values of the bits of the coefficients of the 512 x 512 image LENA. 20 3.8 The first step of the grouping procedure for the image LENA. The most significant bits of DC and A C coefficients are grouped separately 28 3.9 The second step of the grouping procedure for the image LENA. Grouped bits of neighboring A C coefficients with similar grouping structure are combined to form spectral bands 29 3.10 The final step of the grouping procedure for the image LENA. Bits of the A C coefficients, located at the same bit position level and with similar priority values are combined to yield large group of bits 30 4.11 Rate Control Results for the image LENA 56 " v List of Figures 2.1 Basic coding procedure for DCT-based J P E G 5 2.2 Encoding of A C coefficients in zigzag order in an 8 x 8 block 8 2.3 H J P E G encoding/decoding procedure. One of the progressive, sequen-tial, or lossless encoding modes can be used in the [Encode] steps, and corresponding decoding modes can be used in the [Decode] steps 15 3.4 A n 8 x 8.x 10 parallelepiped. 17 3.5 Original images: (a) 512 x 512 image LENA and (b) 512 x 512 image MANDRIL. 19 3.6 First 20 bits in descending order of priority level for the image LENA. . . 21 3.7 First 20 bits in descending order of priority levels for the image LENA where the last P J P E G constraint listed in Section 2)3.2 is satisfied. . . . 23 3.8 Reduction in distortion vs. Rate for the first 4 groupings of DC coefficient using successive approximation for the image LENA 25 3.9 Reduction in distortion vs. Rate for groupings of the first A C coefficient for the image LENA 26 3.10 Comparison of the proposed algorithm with the sequential J P E G and with E Z W at high bit rates for the image LENA. 34 3.11 Comparison of the proposed algorithm with the sequential J P E G and with E Z W at low bit rates for the image LENA 35 3.12 Comparison of the proposed algorithm with the sequential J P E G and with E Z W at high bit rates for the image MANDRIL 36 VI 3.13 Comparison of the proposed algorithm with the sequential J P E G and with E Z W at low bit rates for the image MANDRIL 37 3.14 Image LENA coded using (a) optimized sequential J P E G and (b) proposed P J P E G at 0.075 bpp 38 3.15 Image LENA coded using (a) optimized sequential J P E G and (b) proposed P J P E G at 0.1 bpp. 38 3.16 Image MANDRIL coded using (a) optimized sequential J P E G and (b) pro-posed P J P E G at 0.055 bpp.- 39 3.17 Image MANDRIL coded using (a) optimized sequential J P E G and (b) pro-posed P J P E G at 0.1 bpp 39 3.18 Four training images: BIRD, FIGHTER, PEPPERS, and TRUCK 41 3.19 Comparison between the performance of image specific and non-specific grouping algorithms at medium and high bit rates for the image LENA. . 42 3.20 Comparison between the performance of image specific and non-specific grouping algorithms at medium and high bit rates for the image MANDRIL. 43 3.21 Comparison between the performance of image specific and non-specific grouping algorithms at low bit rates for the image LENA. 44 3.22 Comparison between the performance of image specific and non-specific grouping algorithms at low bit rates for the image MANDRIL. 45 4.23 High level block diagram of the rate control algorithm. L is the maximum number of bits available anytime, initially set to the maximum allowed number of bits, H is the size of PJPEG's frame header in bits, Hs is the size of the current scan header in bits. . 47 vii 4.24 Block diagram of the rate control algorithm used for the first DC scan, n ' is the number of non-coded blocks, ./V is the number of 8 x 8 blocks and B is the number of code bits generated by encoding the current 8 x 8 block. 49 4.25 Block diagram of the rate control algorithm used for the subsequent DC scans 51 4.26 Block diagram of the rate control algorithm used for the first A C scan. B is the number of code bits generated by encoding the current 8 x 8 block or blocks, m the number of blocks processed, h(EOBn) the length in bits of the variable length code representing E O B n and s(n) the number of bits required to represent n 52 4.27 Block diagram of the rate control algorithm used for the subsequent A C scans, nz is the number of D C T coefficients with nonzero history and mz the number of D C T coefficients with nonzero history which were processed. 54 5.28 Encoding and decoding procedures for hierarchically embedded P J P E G with three layers 58 5.29 Downsampling and upsampling filters: (a) 1-D averaging downsampling filter and (b) 1-D bi-linear interpolation upsampling filter 59 5.30 (a) The 512 x 512 image LENA sampled and reconstructed using the filters shown in Figure 5.29, and (b) the corresponding difference image 60 5.31 (a) The 512 x 512 image LENA sampled and reconstructed using B-spline filters, and (b) the corresponding difference image 60 5.32 PSNR vs. Rate for 128 x 128, 256 x 256, and 512 x 512 LENA images coded using P J P E G at low and medium bit rates 62 vm 5.33 PSNRvs . Rate for 128x128, 256x256, and 512x512 LENA images encoded using P J P E G , and the 512x512 LENA image coded using HP J P E G , SPIHT and optimized J P E G at low and medium bit rates. 64 5.34 PSNRvs . Rate for 128x128, 256x256, and 512x512 LENA images encoded using P J P E G , and the 512x512 LENA image coded using HP J P E G , SPIHT and optimized J P E G at all bit rates 65 5.35 The 512 x 512 image LENA reconstructed (a) at 0.3 bpp using conventional H J P E G , and (b) at 0.25 bpp using. HP J P E G . . . . : . 66 IX Acknowledgment I would like to thank my supervisor, Dr. Faouzi Kossentini, for his supervision and advice throughout this research. His generous encouragement and support made this work possible. I would also like to thank my colleague, Shahram Shirani, for his contribution to a part of this research. I also thank my colleagues in the Signal Processing and Multimedia Laboratory for the kind friendship they provided me. Working with them was a pleasure. Personally, I would like to thank my parents for their encouragement and especially my wife, Gyunghwan, for her patience and support. x Glossary J P E G : Joint Photographic Experts Group. H J P E G : Hierarchical mode of J P E G . P J P E G : Progressive mode of J P E G . SJPEG: Sequential mode of J P E G . PIT: Progressive image transmission. RD: Rate-distortion. H P J P E G : Hierarchically embedded progressive J P E G . DCT: Discrete cosine transform. F D C T : Forward discrete cosine transform. IDCT: Inverse discrete cosine transform. D P C M : Differential pulse code modulation. V L : Variable length. E O B : End of block. Scan: A single pass through the data for one or more image components in an image. XI Chapter 1 Introduction Images usually contain so much data that many bits are still needed to represent them even after compression. Therefore transmitting whole compressed image data at once over a low bandwidth channel is often inefficient, and is not adequate for applications such as fast image browsing and video-teleconferencing [1, 2]. Under such circumstances, immediate, if not full, reconstruction of the compressed images is essential. Progressive image coding allows images to be progressively coded and transmitted, i.e., only a portion of the information about the original image is first coded and transmitted, and, upon the receiver's request, more bits representing the rest bf the information are transmitted. Progressive image coding can be achieved in two ways: 1) quality progressive image coding, and 2) hierarchical image coding. In quality progressive image coding, a rough approximation about the full image is obtained after decoding a few bits, and, as more bits are decoded, the image reproduction quality is gradually improved. Hierarchical image coding, also referred to as layered image coding, allows the transmitter and the receiver to control the resolution of the encoded/decoded images. In this coding algo-rithm, original images are often reduced from their original sizes by downsampling or averaging, coded, and transmitted. Later, higher and original resolutions of the input image can be reconstructed by decoding a relatively small number of bits [3, 4]. Both of the above coding algorithms can be combined to provide the receiver with maximum progressiveness control. J P E G (Joint Photographic Experts Group) is the current image compression standard 1 Chapter 1. Introduction 2 for still, continuous-tone, monochrome and color images based on discrete cosine trans-form (DCT). Among the four principal modes provided by J P E G , H J P E G and P J P E G support progressive image transmission (PIT). Various algorithms using D C T have been developed for PIT. [5, 6, 7, 8, 9]. In PIT, prioritization of the bits to be transmitted takes an important role and is done in several different ways. In [5], the D C T coefficients are sorted in a decreasing order of their ensemble variance. This order is then modified according to a human visual system model. Another method suggested in [8] employs the magnitude of the D C T coefficients as a parameter in prioritization. The D C T coefficients are rounded to the same number of bits and are then sorted so that the larger coefficients with their positions are encoded and transmitted first, followed by the smaller ones. The above described methods achieve PIT by exploiting various properties of the D C T coefficients. However, it is generally difficult to directly apply them to a J P E G framework, where many constraints must be satisfied during the encoding/decoding pro-cess. Although P IT using J P E G was discussed in some papers [6, 10], the performance levels of the corresponding implementations were quite limited. In [6] and [10], only the progressive mode of J P E G was discussed allowing only quality control. In [3, 4], hierar-chical mode of J P E G using SJPEG as a submode was implemented allowing only spatial progressiveness. In this thesis, an efficient RD optimized J P E G encoder is presented which supports spatial and quality progressive image encoding, simultaneously. _ The new encoder em-ploys the progressive mode of J P E G as a submode of its hierarchical mode. Our encoding algorithm produces downsampled versions of the original image, which are then encoded, producing an ordered sequence of scans that can be used to progressively encode/decode an image, achieving precise RD control and yielding relatively good compression perfor-mance at all bit rates, including the very low bit rates. Chapter 1. Introduction 3 Our encoding algorithm is compliant with the J P E G standard, in the sense that existing decoders can be used without modification although their reproduction quality levels are significantly higher using our algorithm. In Chapter 2, we provide a description of J P E G , in particular its three popular modes of operation. In Chapter 3, a rate-distortion (RD) optimized P J P E G compliant coding algorithm is devised. In Chapter 4, rate-control within P J P E G is enabled taking into account all the syntax constraints of a P J P E G encoder. Finally, in Chapter 5, the RD optimized P J P E G algorithm discussed in Chapter 3 is used as a part of a hierarchically embedded progressive J P E G (HPJPEG) encoder. The resulting encoder is compliant with the J P E G standard. Finally, conclusions are presented in Chapter 6. Chapter 2 Background In the following sections, sequential DCT-based mode of J P E G , progressive DCT-mode of J P E G and hierarchical mode of J P E G are discussed in detail, following the common basic operations between the two modes. 2.1 Basics of DCT-Based J P E G A l l DCT-based modes of operation in J P E G follow the same basic coding procedure shown in Figure 2.1. First, an input image is partitioned into 8 x 8 blocks. The first element of each 8 x 8 D C T block is called the DC coefficient and the rest are called the A C coefficients. Each 8 x 8 block is then transformed using the forward discrete cosine transform (FDCT) and optionally quantized. The D C T coefficients are then prepared for entropy encoding. The J P E G standard allows two algorithms for entropy coding, one Huffman-based and the other one based on arithmetic coding. Huffman-based coding is widely used because it is simple and publically available, whereas arithmetic coding (although more compression efficient) is more complex and is mostly patented. Therefore, we only consider Huffman-based J P E G modes in the rest of this thesis. Finally, necessary information for decoding such as headers (image dimension and sample precision) and tables (quantization and Huffman tables) as well as the encoded data are stored in the compressed bit stream. Likewise, at the decoder, the received bit streams are entropy decoded to provided x 8 blocks of D C T coefficients, which are then dequantized and transformed using inverse 4 Chapter 2. Background r -\ ) / 8x8 Block FDCT J \ Quantization Tables Entropy Coding Tables (a) Block diagram of basic encoding procedure in J P E G 1 1 • i , i • Compressed Image (Headers/ Tables/ Quantizer Mode-specific (Optional) Entropy Coder Image Data) Compressed Image (Headers/ Tables/ Image Data) t. Entropy Coding Tables Mode-specific Entropy Decoder Quantization Tables Dequantizer (Optional) 8x8 Block IDCT (b) Block diagram of basic decoding procedure in J P E G Figure 2.1: Basic coding procedure for DCT-based J P E G . Chapter 2. Background 6 D C T (IDCT) to yield the reconstructed image. The F D C T and IDCT equations are given by S(v,u) = ^1^MJ2J2 s(y,x)coS[(2x + l>7r/16]«w[(2y + 1)^/16] (2.1) ^ y=Ox=0 and *(V, *) = E ^ E ^r-S(v, u)cos[{2x + l)u7r/16]aw[(2y + 1)^/16] , (2.2) v=0 1 u=0 Z where C(u) = l/y/2 for u = 0, C{u) = 1 for u > 0, C(y) = l'/y/2 for v = 0, C(uj) = 1 for u > 0, s(y, a;) is the pixel value of an input 8 x 8 block and S(v, u) is the D C T coefficient of an 8 x 8 D C T block. The 2-D D C T can be computed by 1-D D C T of the rows followed by 1-D D C T of the columns. Although 2-D algorithms are available, they are much more complex. In this thesis, the 1-D D C T algorithm described in [11] is implemented in integer arithmetic. 2.2 Sequential DCT-Based J P E G In the sequential DCT-based mode the image is compressed in a single scan. A l l the 8 x 8 D C T blocks in an image are entropy encoded in a raster scan order. Two statistical models are used in the DCT-based sequential mode. One applies to the coding of DC differences generated by the D P C M model and the other applies to the coding of A C coefficients. The DC coefficient of the 8 x 8 block is always coded first. The DC is coded using a D P C M model in which the prediction is the DC coefficient of the most recently coded block. Table 2.1 shows the encoding process of the DC coefficients. The difference between the DC value and the predicted value is fed to the entropy coder. The Huffman Chapter 2. Background 7 DC values 8 9 8 -6 -8 -3 3 3 .. D P C M 0 1 -1 -14 -2 5 6 0 .. Size 0 1 1 4 2 3 3 0 .. Appended bits - 1 0 0001 01 101 110 -Table 2.1: Encoding of DC coefficients using D P C M model statistical model for coding of D P C M differences segments the difference values into a set of (approximately) logarithmically increasing magnitudes (shown as Size in Table 2.1) categories. Each of the difference categories is assigned a variable length (VL) code. For example, the DC values 9 and 8, and -3 and 3 in Table 2.1 have the same V L code since they are in the same magnitude categories. Except for zero differences, however, the difference category V L codes do not fully describe the difference. Therefore, immediately following the V L codes for non-zero difference categories, additional bits are appended to the code stream to identify the sign and fully specify the magnitude of the difference. Positive differences are represented in binary representation using the minimum number of bits, and negative differences are represented by the complement of the positive numbers of the same magnitude. Therefore, the most significant bits of the positive and the negative differences will be always 0 and 1, respectively. The A C coefficients are scanned in zigzag sequence order as shown in Figure 2.2. A l l of the D C T coefficients are then entropy encoded and output as part of the compressed image data. The encoding process for the A C coefficients is illustrated in Table 2.2. In order to obtain coding efficiencies for the A C coefficients approaching the entropy, the V L coder aggregates zero coefficients into runs of zeros. However, even this strategy is not quite good enough, so the V L coder statistical model uses symbols that combine the run of zeros with magnitude categories for nonzero coefficients that terminate the Chapter 2. Background AC1 DC i Frequency Frequency Figure 2.2: Encoding of A C coefficients in zigzag order in an 8 x 8 block . A C index 1 2 3 4 5 6 7 8 9 10 . . 62 63 A C values 0 0 0 0 -14 0 0 1 0 0 . . 0 0 Zero run-length counter 1 2 3 4 1 2 1 2 . . 54 EOB Size 4 1 Appended bits 0001 1 Table 2.2: Encoding of A C coefficients Chapter 2. Background 9 runs [2]. These magnitude categories increase logarithmically in exactly the same way as the D P C M difference categories, and the coding strategy is therefore very similar. For example, the first five A C coefficients in Table 2.2 are encoded by a V L code representing four runs of zeros and the size four. These codes are followed by appended bits that fully specify the sign and magnitude of the nonzero coefficient in exactly the same way as in the DC encoding process. Consequently, except for very long runs of zeros, V L codes are assigned to all possible combinations of runs of zeros and magnitude categories. Because high frequency A C coefficients are mostly small in magnitude, they are mapped to zero after quantization. This is directly incorporated into the A C statis-tical model by means of a special symbol called the end-of-block (EOB). E O B means that the rest of the coefficients in the block are zero. The last 54 A C coefficients in Table 2.2 are encoded using E O B codes. There are two types of sequential DCT-based J P E G . A simpler form of the sequential DCT-based mode of operation is called the "baseline J P E G . " It represents a minimum capability that must be present in all DCT-based J P E G coders [12]. Only 8-bit input precision and two sets of Huffman tables (each set includes one DC table and one A C table) for entropy coding are allowed in baseline J P E G . A sequential DCT-based J P E G that has capabilities beyond the baseline J P E G is called "extended sequential J P E G , " which allows 12-bit input precision and 4 sets of Huffman tables. The J P E G standard includes a set of "typical" Huffman and quantization tables based on the results of several experiments using many images. While these tables are used in many implementations and yield reasonable performance, an image can be scanned prior to quantization and entropy encoding to generate image-specific quantization and optimized V L coding tables, respectively. The implementations using these optimal tables (although more complex) result in further compression performance gains, and are called "optimized J P E G " implementations [2, 12]. Chapter 2. Background 10 2.3 Progressive DCT-Based J P E G In the progressive J P E G (PJPEG) mode, 8 x 8 blocks are formed in the same order, D C T transformed and (optionally) quantized to a specific number of bits. The quantized D C T coefficients are then partially encoded in multiple scans, each of which corresponds to a few bits of one or more D C T coefficients, and therefore represents a portion of the image being encoded/decoded. The DC coefficients are the average values of the 8 x 8 blocks, and an image containing uniform gray-level blocks is obtained when just the DC coefficients are decoded. The A C coefficients in the first few rows and columns of the 8x8 coefficients represent well the vertical and horizontal edges in the image, respectively. The high frequency A C coefficients represent fine, often non-existent, details of an image. The D C T coefficients, each of which contributing differently to improving the image quality, can be divided into many groups or decomposed into bits as specified in the P J P E G mode, such that the reconstructed image quality increases as more groups or bits of the D C T coefficients are received and decoded. Two different procedures are defined in P J P E G by which the quantized coefficients may be partially encoded within a scan. In one procedure, called spectral selection, the D C T coefficients from the zigzag sequence in each 8 x 8 block are segmented into frequency bands of variable lengths. Each frequency band is then encoded in one scan. In the other procedure, called successive approximation, the D C T coefficients need not be encoded to their full accuracy in each scan. The precision of the D C T coefficients of the 8 x 8 blocks is reduced by dividing them by a power of two (or shifting right their binary representations) before encoding such coefficients. The received D C T coeffi-cients are decoded and multiplied by the same power of two (or shifting left their binary representations) before the IDCT operation. The precision of the D C T coefficients is increased by one bit in each of the subsequent scans. These two procedures can be used Chapter 2. Background 11 DC values • 7.' 8 10 -5 7 -9 5 0 .. After point transform (>>1) 3 4 5 -3 3 -5 2 0 .. D P C M 3 1 1 -8. 6. -8 7 -2 .. Size 2 1 1 4 3 4 3 2 .. Appended bits 11 1 1 0111 110 0111 111 01 .. Table 2.3: First DC scan (first 9 bits) using successive approximation independently or combined in a flexible manner to yield many combinations [1, 2, 4]. Since they involve essentially the same operation, spectral selection and the first scan in successive approximation are discussed together in the next subsection. This is followed by a detailed description of the subsequent successive approximation scans. 2.3.1 Spectral Selection and Successive Approximation First Scan The encoding methods for spectral selection and for the first scan of successive approx-imation are the same as that of the sequential J P E G mode, except that in P J P E G , the DC and A C coefficients are always encoded separately in different scans. Moreover, the end-of-block marker used in the sequential mode is extended to become a series of end-of-band markers in the P J P E G mode. When encoding the first DC scan, the precision of all the DC coefficients is reduced by a point-transform (a right-shift). Table 2.3 illustrates the encoding process of the first DC scan, where the precision of the DC coefficients is reduced by one bit. D P C M differences are then obtained from the DC values of reduced magnitude, and the minimum number of bits required to represent the difference is found (Size in Table 2.3) and variable length encoded, followed by the actual difference in binary representation. Negative differences are represented in the same way as in the sequential J P E G mode. The first A C scan is also encoded in a similar manner as in the sequential mode. Chapter 2. Background 12 A C index 1 2 3 4 5 6 7 8 9 10 . . 62 63 A C values 0 1 7 -5. 2 0 -4 0 1 0 . . 0 0 After point transform (>>1) 0 0 3 -2 1 0 -2 0 0 0 . . 0 0 Zero run-length counter 1 2 0 0 0 1 0 EOB0 Size - • - 2 2 1 - 2 Appended bits - - 11 01 1 - 01 Table 2.4: First A C scan (first 9 bits) using successive approximation DC values 7 8 10 -5 7 -9 5 0 .. Previously coded x2 6 8 10 -6 6 -10 4 0 .. Appended bit (correction bit) ' 1 0 0 1 1 1 1 0 .. Table 2.5: Subsequent DC scan (last bit) using successive approximation The precision is reduced by a point-transform (division by a power of two). Table 2.4 illustrates the procedure in the case where the precision of the A C coefficients is reduced by 1 bit. Consecutive coefficients of zero magnitude are counted until a nonzero coeffi-cient is encountered, and the minimum number of bits required to represent the nonzero coefficient is obtained. The combination of the run of zeros and the magnitude of the nonzero coefficient is then variable length encoded, followed by the actual coefficient in binary form. Negative numbers are handled in the same way as when encoding the DC scan. 2.3.2 Successive Approximation Subsequent Scans Both of the subsequent DC and A C scans in successive approximation increase the pre-cision of the D C T coefficients by one bit at a time. In the subsequent DC scans, the encoding method is rather simple: the codewords are simply the bit values themselves as shown in Table 2.5. In the subsequent A C scans, the bits are coded in three different Chapter 2. Background 13 A C index 1 2 3 4 5 6 7 8 9 10 . 62 63 A C values 0 1 7 -5 2 0 -4 0 1 0 . 0 0 Previously coded x2 0 0 6 -4 2 0 -4 0 0 0 . 0 0 Zero run-length counter 1 0 0 0 1 0 1 - EOB0 Size - 1 - - - - - - 1 Appended bits (correction bits) - - 1 1 0 - 0 - -Table 2.6: Subsequent A C scan (last bit) using successive approximation ways: partial run-length coding, nonzero coding, and appending of correction bits. Table 2.6 illustrates the encoding procedure of the subsequent A C scans. The Os are counted until a new 1 is encountered. A bit can have a new value of 1 when it converts the decoded value in the previous scans from zero to a nonzero value. The run-length and the size (equal to 1) of the new 1 bit is variable length encoded, followed by a sign bit (0 for negative and 1 for positive). The Os to be added to the nonzero decoded values from the previous scans are skipped when counting the Os, i.e., the run-length coding of zeros is partial. Such Os, as well as the nonzero bits to be added to the nonzero decoded value, are appended to the variable length codeword and sign bit. Based on the encoding process described above, it is clear that P J P E G imposes a few constraints on the progressive encoder's structure. Some are generic and are identical to those imposed on the sequential J P E G encoder's structure, and others are imposed mainly to reduce the complexity of P J P E G compliant decoders. Such constraints, which must be satisfied by our proposed P J P E G design algorithm, are stated below: • The first scan to be encoded/decoded must contain only DC information. Such a scan represents all DC coefficients in spectral selection, and it represents a group of the most significant bits of the DC coefficients in successive approximation. • The DC and A C coefficients cannot be encoded in the same scan. Chapter 2. Background 14 • The coefficients in the same spectral band must be consecutive, i.e., their locations can be described by just two points: 1) start of band and 2) end of band. • Only the first scan of any D C T coefficient can represent more than one precision bit, and the subsequent scans must represent only one precision bit at a time. • Encoding/decoding of a bit of a coefficient requires that the preceding bits of that coefficient be encoded/decoded in advance. 2.4 Hierarchical J P E G H J P E G provides the other J P E G modes of operations with a high level structure for a progressive encoding of an image at multiple spatial resolutions. Between progressive stages, spatial resolution of the reconstructed image increases. Figure 2.3 shows a block diagram of H J P E G . Before any encoding is done, the original image is downsampled by a desired number of multiples of 2 in each dimension. The first stage (lowest resolution image) can then be encoded using one of the progressive, sequential, or lossless J P E G modes of operations. This encoded image is transmitted to the receiver, but, at the same time, is decoded, upsampled and lowpass filtered to become the prediction for the second stage. In the second stage, the corresponding difference image is encoded and transmitted. Again, at the encoder, this difference image is decoded, added to the predicted image from the first stage, upsampled and lowpass filtered to become the prediction for the third stage. Likewise, the difference image obtained in the third stage is encoded and transmitted. After each stage, the receiver is able to progressively reconstruct the image with increasing spatial resolution following the same upsampling and filtering procedure, used at the encoder. Chapter 2. Background 15 Original Image Downsample - (^) j^ Encode^ j • o Upsample • Downsample FDCT/Q J^Encode^ J • DQ/IDCT Upsample Encode Q: Quantization DQ: Dequantization • HJPEG Encoder • j^ DecodeJ — ( ^ ) Reconstructed Image + Upsample ^DecodsTj — ( 3 + Upsample Decode • HJPEG Decoder Figure 2.3: H J P E G encoding/decoding procedure. One of the progressive, sequential, or lossless encoding modes can be used in the [Encode] steps, and corresponding decoding modes can be used in the [Decode] steps. Chapter 3 Rate Distortion Optimized P J P E G Understanding the relationship between the D C T coefficients and the reconstruction image quality can help us greatly design an efficient and effective P J P E G compliant encoder. For example, as discussed in Section 2.3, only the first few rows and columns of the 8 x 8 blocks can be transmitted for the images whose content involve mostly vertical and horizontal edges, respectively. However, ordering of 64 D C T coefficients in an.8 x 8 block according to subjective importance is a very difficult task, and such algorithm requires information about the image content to be available prior to the compression process. Moreover, the bit rate usually supported by typical low-bandwidth channels is generally so low that only the first few most significant bits of a single D C T coefficient in 8x8 blocks can be transmitted. To address the above problems, all the bits of the 64 D C T coefficients in an 8 x 8 block are here assigned importance levels that involve the size (i.e., bit rate) of the resulting scan and the scan's contribution to reducing the reconstruction distortion, expressed in terms of the sum of squared errors. More specifically, each bit in a D C T coefficient is assigned a distortion-rate ratio, whose numerator is the reduction in reconstruction distortion, and denominator is the scan's size in bytes. The resulting value is then used to associate with each bit a priority level. Taking into account PJPEG's general encoder structure and specific constraints, the prioritized bits are then grouped, achieving higher compression efficiency and lower computational requirements. Details about each of the above steps are presented next. 16 Chapter 3. Rate Distortion Optimized PJPEG 17 D C A C 1 A C 5 A C 6 A C 1 4 A C 1 S A C 2 7 A C 2 8 A C 2 A C 4 A C 7 A C 1 3 A C 1 6 A C 2 6 A C 2 9 A C 4 2 A C 3 A C 8 A C 1 2 A C 1 7 A C 2 S A C 3 0 A C 4 1 A C 4 3 A C 9 A C 1 1 A C 1 8 A C 2 4 A C 3 1 A C 4 0 A C 4 4 A C 5 3 A C 1 0 A C 1 9 A C 2 3 A C 3 2 A C 3 9 A C 4 5 A C 5 2 A C 5 4 A C 2 0 A C 2 2 A C 3 3 A C 3 8 A C 4 6 A C 5 1 A C S S A C S O A C 2 1 A C 3 4 A C 3 7 A C 4 7 A C 5 0 A C 5 6 A C 5 9 A C 6 1 A C 3 5 A C 3 6 A C 4 8 A C 4 9 A C 5 7 A C 5 8 A C 6 2 A C 6 3 Figure 3.4: An 8 x 8 x 10 parallelepiped. 3.1 Prioritization of Bits , Assuming 8-bit input precision, each D C T coefficient will have 11-bit precision, 10 bits for magnitude and one bit for sign 1. The sign bit is not considered separately when prioritizing bits since it is always encoded with the first non-zero most significant bit as explained in Section 2.3. Excluding the sign bit for the moment, an 8 x 8 D C T coefficient block can be considered as a parallelepiped which contains 8 x 8 x 10 bits. Figure 3.4 shows an 8 x 8 x 10 parallelepiped. The decoder assumes parallelepipeds of all Os corresponding to 8 x 8 blocks of zero-valued coefficients. The Os are replaced by the actual values of the bits as they are decoded. Therefore, to quantify the effect of each bit on the quality of the decoded image, the distortion associated with each bit being considered 0 is computed based on the fact that more significant bits are more important 1 J P E G also allows 12-bit input precision as described in Chapter 2. Our algorithm can be easily extended to support 12-bit input precision since the input precision is simply a parameter that can take on many values. Chapter 3. Rate Distortion Optimized PJPEG 18 than less significant bits. For example, if the tenth bit (left-most or most significant bit) of the DC coefficient is equal to 1 in value, and the decoder assumes it to be 0, the resulting distortion2 (or squared error) will be equal to (2 9) 2 . When the value of a bit is 0, its associated distortion will be equal to zero regardless of its position, as the decoder has already assumed all the bits to have a 0 value, and decoding a 0 bit will not reduce the reconstruction distortion. In the P J P E G mode of operation, each bit of a parallelepiped should be transmitted along with the bits at the same position of all the parallelepipeds in an image. Thus, the overall distortion reduction value should be equal to the sum of the distortion reduction values associated with the bits at the same position in all the parallelepipeds in the image, that is, the overall distortion reduction AD{ associated with the ith bit should be equal to A- = E ^ - - 1 ) 2 , (3.3) 3 = 1 where i = 1,2, ...,640, iVis the number of 8 x 8 blocks in the input image, is the value of the ith bit of the jth parallelepiped and p; is the position of the ith bit. Encoding of bits at different locations in the parallelepipeds also results in different scan sizes or bit rates. Thus, a rate-distortion optimized importance measure should involve the total distortion and bit rate values, and an appropriate importance measure /; is then given where A D ; and AR{ are the priority value, distortion reduction and scan size (or bit rate increase) associated with the ith bit, respectively. In this manner, the bit that con-tributes the most distortion reduction while requiring the least number of bits (smallest scan size) is assigned the greatest priority value. Table 3.7 shows the priority values of 2Since the DCT is a unitary transform, the distortion in the DCT domain is theoretically equal to that in the spatial domain. Chapter 3. Rate Distortion Optimized PJPEG 19 (a) (b) Figure 3.5: Original images: (a) 512 x 512 image LENA and (b) 512 x 512 image MANDRIL. all the bits of the first 30 D C T coefficients (obtained by tracing the 64 D C T coefficients in the normal J P E G zig-zag way) of the 512 x 512 image LENA shown in Figure 3.5 (a). Throughout the chapter, the popular standard image LENA will be used whenever subjective quality illustrations are provided. However, in the last section, our algorithm will also be applied to the image MANDRIL 3 shown in Figure 3.5 (b). As expected, the less significant bits, especially those of coefficients occuring at high frequencies, are assigned small priority values. Such bits contribute only high frequency or noise-like information, which generally reduce only slightly the objective/subjective distortion. Based on the priority value defined in equation (3.4), all the 640 bits in an 8 x 8 block can be sorted in terms of decreasing order of priority values. Figure 3.6 shows the priority values of the 20 most important bits, for the 512 x 512 image LENA. Notice that the priority values associated with the first few bits are very large, as compared to 3Such an image is known to contain much more statistically random information than the other standard images. Chapter 3. Rate Distortion Optimized PJPEG D C T index M S B LSB 0 270719 136142 55399 15099 3874 966 246 62 16 4 1 6899 44227 16384 5163 1626 463 127 36 10 3 2 0 21845 13749 4233 1392. 417 117 32 9 3 3 0 0 4029. 2538 1039 318 92 27- 8 2 4 0 4795 9732 3857 1241 384 106 29 8 2 5 0 5958 10377 3941 1272 373 111 31 8 2 6 0 1725 3218 2658 1020 325 92 28 8 2 7 0 0 5532 2983 1093 322 96 27 8 2 8 0 0 2731 2933 1009 309 95 26 8 2 9 0 0 0 1154 724 245 79 24 7 2 10 0 0 0 286' 337 186 65 22 7 2 11 0 0 0 1069 706 267 81 24 7 2 12 0 0 2249 2464 1024 300 91 26 7 2 13 0 0 1707 2268 969 297 90 25 7 2 14 0 0 431 1682 848 273 81 24 7 2 15 0 0 0 972 595 222 72 23 7 2 16 0 0 0 793 691 256 78 23 7 2 17 0 0 0 1601 820 268 78 25 7 2 18 0 0 0 991 711 269 79 24 7 2 19 0 0 0 . 0 256 223 65 22 7 2 20 0 0 0 0 89 118 56 20 6 2 21 0 0 0 0 0 63 44 18 6 2 22 0 0 0 0 91 121 57 20 6 2 23 0 0 0 0 293 209 74 21 7 ' 2 24 0 0 0 512 613 249 76 23 7 2 25 0 0 0 562 603 226 74 22 7 2 26 0 0 0 .286 451 217 73 21 7 2 27 0 0 0 108 394 179 64 22 6 2 28 0 0 0 0 120 137 59 21 • 7 2 29 0 0 0 205 218 164 64 21 7 2 Table 3.7: Priority values of the bits of the coefficients of the 512 x 512 image LENA. Chapter 3. Rate Distortion Optimized PJPEG 21 Figure 3.6: First 20 bits in descending order of priority level for the image LENA. Chapter 3. Rate Distortion Optimized PJPEG 22 those of the other bits. Such a feature is well-suited for low-bandwidth channels, such as those used in wireless applications. By transmitting the bits according to the proposed prioritization method, acceptable reproduction quality is quickly obtained after receiving the first 1—4 bits. Unfortunately, however, the order obtained above cannot be used as-is if P J P E G compliance is to be maintained. As stated in Section 2.3.2, for a specific bit of a D C T coefficient to be encoded/decoded, P J P E G requires that all the more significant (in terms of bit position in the binary representation) bits of the subject coefficient be encoded/decoded first. Figure 3.7 shows the first 20 bits to be transmitted satisfying the constraint. As expected, the graph in the figure is not as smooth as the one in Figure 3.6. 3.2 Grouping of Bits Encoding bits according to their priority values and the above P J P E G constraint is still far from "optimal", as such a method does not take into account the P J P E G header and encoding structures. In P J P E G , a header is assigned to each scan, carrying information such as the number of image components. If each scan contained a single bit plane, the additional header information would be excessive, offsetting most of the compression gain that would have been obtained using such a method. Moreover, encoding/decoding 640 bit planes is not necessarily the most compression efficient method in light of the specific coding methods used in P J P E G . For example, encoding the first DC bit following the P J P E G method requires at least one bit per pixel, leading to expansion instead of compression, i.e., subsequent DC bits are coded as-is, and extra header information is added to each scan. Another example is PJPEG's across-block run-length method, which becomes very inefficient when the high frequency A C bits are encoded. Fortunately, P J P E G allows a group of consecutive most significant bits of a coefficient and/or bits at Chapter 3. Rate Distortion Optimized PJPEG 23 Figure 3.7: First 20 bits in descending order of priority levels for the image LENA where the last P J P E G constraint listed in Section 2.3.2 is satisfied. Chapter 3. Rate Distortion Optimized PJPEG 24 the same position of consecutive coefficients to be encoded in a single scan. Considering the fact that the header size is essentially the same regardless of how many bits are represented in a single scan, grouping bits will reduce substantially the number of header bits. As will be shown later, grouping of bits may also result in improved compression efficiency of PJPEG's across-block run-length coder. To determine which bits should be grouped together, we begin by grouping only con-secutive bits in the same DC or A C coefficient. First, it is better (in terms of compression efficiency) to combine all the bits of the DC coefficient into one scan. This is partly be-cause 1) the first group of bits is encoded as in sequential J P E G , and 2) all subsequent scans must represent only one bit, which is sent as-is for each 8 x 8 block. Another reason is that only a single header is needed when one scan represents all the bits of the DC co-efficients. The optimal number of bits to be grouped can be determined experimentally. Figure 3.8 shows the distortion reduction vs. rate for 4 cases: no grouping and groupings of 2, 3, and 4 most significant bits of the DC coefficient for the image LENA. When no bits are grouped, there are 10 scans, each scan representing-a bit. When the first 2, 3, and 4 first bits are grouped into one scan, we obtain 9, 8, and 7 scans, respectively. Clearly, the amount of reduction in distortion increases (for the same bit rate) as more and more bits are grouped, but the rate of increase levels off after grouping 2 — 3 bits. Figure 3.9 shows the results when the first 2, 3, and 4 first bits of the first A C coefficient are grouped into one scan. In this case also, the rate of increase of distortion reduction ievels off after grouping 2 — 3 bits. The number of bits that should be grouped in a scan appears to be increasing as a function of the A C position along the zig-zag direction. Determining the "best" grouping method for the D C T bits, along the direction from the most to the least significant bit, is not difficult since there are only 10 possible different cases for each coefficient. However, by combining successive approximation with spectral selection, grouping A C bits can then be performed in two directions simultaneously, going Chapter 3. Rate Distortion Optimized PJPEG 25 Figure 3.8: Reduction in distortion vs. Rate for the first 4 groupings of DC coefficient using successive approximation for the image LENA. Chapter 3. Rate Distortion Optimized PJPEG 26 3000 Rate (bytes) 6000 Figure 3.9: Reduction in distortion vs. Rate for groupings of the first A C coefficient for the image LENA. Chapter 3. Rate Distortion Optimized PJPEG 27 from the most to the least significant bit and in the zig-zag direction from the first A C coefficient to the highest frequency A C coefficient. Obviously, the number of possible groupings is very large. On the one hand, one can construct two groups, yielding a DC scan arid an A C scan. Such a procedure is almost equivalent to that of the sequential J P E G mode. On the other hand, one can construct 640 scans representing 10 DC bits and 630 A C bits. Clearly, a design method exists, and is here developed by imposing some additional constraints, that yields good tradeoffs between compression efficiency and progressiveness resolution. First, the analysis described above, using the priority values and taking into account the last P J P E G syntax constraint listed in Section 2.3.2, is applied to group the most significant bits of the DC and A C coefficients separately. The results of the first step are shown in Table 3.8 for the 512 x 512 image LENA. Second, grouped bits of neighboring (in the zig-zag direction) A C coefficients having similar grouping structures and priority values can be further combined to form spectral bands, as shown in Table 3.9 for the same image. Finally, subsequent bits of the A C coefficients, located at the same bit position level and having similar priority values, can also be combined to yield larger groups of bits, as shown in Table 3.10. Note the last two steps are expected to improve compression efficiency, since 1) combining groups with similar priority values would not theoretically decrease rate-distortion performance significantly, 2) many of the significant bits of the A C coefficients typically have zero values, improving across-block run-length coding efficiency, and 3) grouping many bits into a single scan would reduce the number of required header bits. Looking at Tables 3.8 and 3.10, it is clear that the priority values increase monotonically as more bits are grouped, indicating improved compression efficiency. In summary, the overall P J P E G grouping algorithm is given below: 1.. Compute the priority values of all 640 bits in the parallelepiped using equation 3.4. Chapter 3. Rate Distortion Optimized PJPEG 28 DCT index MSB LSB 0 (DC) 15099 3874 966 246 62 16 4 1 5163 1626 463 127 36 10 3 2 • 4233 1392 417 117 32 9 3 3 2538 1039 318 92 27 8 2 4 3857 1241 384 106 29 8 2 5 3941 1272 373 111 31 8 2 6 2658 1020 325 92 28 8 2 7 2983 1093 322 96 27 8 2 8 1009 309 95 26 8 2 9 724 245 79 24 7 2 10 337 186 65 22 7 2 11 706 267 81 24 7 2 12 1024 300 91 26 7 2 13 969 297 90 25 7 . 2 14 848 273 81 24 7 2 15 595 222 72 23 7 2 16 691 256 78 23 7 2 17 820 268 78 25 7 2 18 711 269 79 24 7 2 19 256 .223 65 22 7 2 20 89 118 56 20 6 2 21 0 63 44 18 6 2 22 91 121 57 20 6 2 23 209 74 21 •7 2 24 249 76 23 7 2 25 226 74 22 7 2 26 217 73 21 7 2 27 179 64 22 6 2 28 137 59 21 7 2 29 164 64 21 7 2 30 200 63 21 7 2 31 212 69 21 7 2 32 215 70 21 7 2 33 62 20 6 2 34 45 19 6 2 35 35 18 6 2 36 34 17 6 2 37 43 17 6 2 38 59 20 6 2 39 59 19 6 2 40 60 20 7 2 41 58 21 6 2 42 55 19 6 2 43 51 19 6 2 44 51 19 6 2 45 54 19 6 2 46 57 19 6 2 47 44 18 6 2 48 17 6 2 49 16 6 2 62 15 6 2 63 15 2 2 Table 3.8: The first step significant bits of DC and of the grouping procedure for the image LENA. The most A C coefficients are grouped separately. Chapter 3. Rat e Distortion Optimized PJPEG 29 DCT index MSB LSB 0 (DC) 223695 15099 3874 966 246 62 16 4 1 5163 1626 463 127 36 10 3 2 4233 1392 417 117 32 9 3 3 2538 1039 318 92 27 8 2 4 24016 3857 1241 384 106 29 8 2 5 3941 1272 373 111 31 8 2 6 2658 1020 325 92 28 8 2 7 2983 1093 322 96 27 8 .2 8 1009 309 95 26 8 2 9 724 245 79 24 7 2 10 337 186 65 22 7 2 11 706 267 81 24 7 2 12 1024 300 91 26 7 2 13 969 297 90 25 7 2 14 848 273 81 24 7 2 15 3867 595 222 72 23 7 2 16 691 256 78 23 7 2 17 820 268 78 25 7 2 18 711 269 79 24 7 2 19 256 223 65 22 7 2 20 89 118 56 20 6 2 21 0 63 44 18 6 2 22 91 121 57 20 6 2 23 209 74 21 7 2 24 249 76 23 7 2 25 226 74 22 7 2 26 217 73 21 7 2 27 940 179 64 22 6 2 28 137 59 21 7 2 29 164 64 21 7 2 30 200 63 21 7 2 31 212 69 21 7 2 32 215 70 21 7 . 2 33 62 20 6 2 34 45 19 6 2 35 35 18 6 2 36 34 17 6 2 37 43 17 6 2 38. 59 20 6 2 39 59 19 6 2 40 251 60 20 7 2 41 _ 58 21 6 2 42 55 19 6 2 43 51 19 6 2 44 51 19 6 2 45 54 19 6 2 46 57 19 6 2 47 44 18 6 2 48 17 6 2 49 16 6 2 70 62 15 6 2 63 15 2 2 Table 3.9: The second step of the grouping procedure for the image LENA. Grouped bits of neighboring A C coefficients with similar grouping structure are combined to form spectral bands. Chapter 3. Rate Distortion Optimized PJPEG 30 DCT index MSB LSB 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 62 63 223695 I 15099 3874 966 246 62 16 24016 5236 3867 1566 940 426 251 110 70 28 Table 3.10: The final step of the grouping procedure for the image LENA. Bits of the A C coefficients, located at the same bit position level and with similar priority values are combined to yield large group of bits. Chapter 3. Rate Distortion Optimized PJPEG 31 High priority values are assigned to the bits reducing substantially the reconstruc-tion distortion and costing only a few bits. 2. Obtain the best individual coefficient grouping structure, i.e. the structure yielding the best tradeoff between compression efficiency and progressiveness resolution. 3. Combine groups of bits of consecutive A C coefficients in the zig-zag direction. 4. Compute the new priority values using as distortion reduction the value obtained by computing the sum of the individual distortion reduction values of the constituent bit groups, and as cost the number of code bits required by P J P E G to represent the resulting scan. 5. Generate scans in the order of decreasing priority values. The performance of our RD-optimized P J P E G encoder is presented in Section 3.4. 3.3 Complexity, Memory and Computation Requirements Vs. Robustness In this section, we discuss the complexity, memory and computation requirements at both the encoder and decoder. As the proposed encoding algorithm is P J P E G compliant, de-coding can be performed using existing hardware/software decoders, whose complexity, memory and computation demands are clearly larger than those of sequential J P E G de-coders, but are arguably smaller than those of today's high performance subband/wavelet decoders [13]. Despite the fact that quantization can be avoided in P J P E G , sequential J P E G en-coders are still simpler and less memory intensive. However, the computational demands of conventional P J P E G encoders are comparable to those of sequential J P E G encoders, and are substantially smaller than those of state-of-the-art subband/wavelet encoders. Chapter 3. Rate Distortion Optimized PJPEG 32 For the benefit of better rate-distortion performance and precise rate/distortion control, the proposed encoder appears to be more complex and much more demanding in terms of computations. Making P J P E G useful in constant bit rate applications is worth the additional complexity, due mainly to the above rate control algorithm. However, the gain in compression performance may not be worth the seemingly two orders of magnitude increase in encoding computations. The surge in computational demands by the encoder is due mainly to the computa-tions required to estimate the priority values. Such computations are required to encode 640 scans (for 8-bit precision) and produce corresponding distortion and bit rate (scan size) values. Another source is the re-computation of priority values of newly grouped bits. While all the computations can sometimes be performed off-line, a fast algorithm is still desired that can be implemented on-line. By applying the algorithm obtained in the previous section to a set of training images4, a general grouping structure can be obtained. When such a general grouping structure is used, complexity, computational demands, and memory requirements of our algorithm reduce to levels comparable to those of conventional P J P E G encoders. As will be discussed in Section 3.4, our proposed algorithm is very robust with respect to changes in input image content and resolution. Thus, the above grouping structure yields only a slight loss in compression performance. 3.4 Experimental Results In this section, we present experimental results that illustrate the performance of our RD optimized P J P E G compliant encoding algorithm. Figure 3.10 shows the PSNR for the P J P E G encoders based on our algorithm, a baseline J P E G coder, a well optimized 4Averaging of the grouping structures obtained separately is very difficult and often not systematic. In our approach, the algorithm is applied only once to a big combined image of a set of training images. Chapter 3. Rate Distortion Optimized PJPEG 33 sequential J P E G coder, and the SPHIT coder [13], at medium and.high bit rates for the 512 x 512 image LENA. As it is evident, our algorithm outperforms both the baseline and optimized sequential J P E G coders at high bit rates. At medium bit rates, however, the performance of the algorithms are comparable. Note that the sequential J P E G coders were designed for such medium bit rates. Moreover, notice that the wavelet SPHIT coder outperforms significantly (by 1-3 dB) the J P E G / P J P E G ones at most bit rates. Figure 3.11 shows the PSNR for the same coders at low bit rates. Below 0.2 bpp, the difference in performance between sequential J P E G encoders and our encoder is considerable. Some of the bit rates cannot even be achieved by sequential J P E G . Also illustrated in the figure is the fact that SPHIT is much more compression efficient than the other algorithms. Figure 3.12 shows the PSNR for our P J P E G coder, sequential and optimized sequen-tial J P E G , and SPHIT at medium and high bit rates for the 512 x 512 image MANDRIL. Again, the performance of our algorithm is better than that of the sequential J P E G . Figure 3.13 shows the same information at low bit rates. As in the previous case, the difference between sequential J P E G and our algorithm is considerable below 0.2 bpp. As expected, SPHIT-EZW algorithm, again, outperforms the J P E G / P J P E G coders, al-though by less than 1 dB at most bit rates. Figures 3.14 and 3.16 show the 512 x 512 images LENA and MANDRIL coded at 0.075 bpp and 0.055 bpp, 5 respectively using an optimized sequential J P E G algorithm and our P J P E G algorithm. Clearly, the visual quality of the image coded using our algorithm is much better than the other one. Figures 3.15 and 3.17 show the same images coded at 0.1 bpp using the same optimized sequential J P E G and our P J P E G algorithm. The superior performance of our algorithm is clearly illustrated. To demonstrate the robustness of the our grouping algorithm with respect to varying 5The two images were encoded at different bit rates since the sequential JPEG does not provide rate control. However, the images encoded using RD optimized PJPEG were obtained with precise rate control algorithm discussed in Chapter 4. Chapter 3. Rate Distortion Optimized PJPEG 34 Figure 3.10: Comparison of the proposed algorithm with the sequential J P E G and with E Z W at high bit rates for the image LENA. Chapter 3. Rate Distortion Optimized PJPEG 35 0.15 0.2 Rate (bits/pixel) 0.35 Figure 3.11: Comparison of the proposed algorithm with the sequential J P E G and with E Z W at low bit rates for the image LENA. Figure 3.12: Comparison of the proposed algorithm with the sequential J P E G and with E Z W at high bit rates for the image MANDRIL. Figure 3.13: Comparison of the proposed algorithm with the sequential J P E G and with E Z W at low bit rates for the image MANDRIL. Chapter 3. Rate Distortion Optimized PJPEG 38 Figure 3.15: Image LENA coded using (a) optimized sequential J P E G and (b) proposed P J P E G at 0.1 bpp. Chapter 3. Rate Distortion Optimized PJPEG 39 Figure 3.17: Image MANDRIL coded using (a) optimized sequential J P E G and (b) pro-posed P J P E G at 0.1 bpp. Chapter 3. Rate Distortion Optimized PJPEG 40 image content and resolution, two grouping patterns, one obtained using a set of four 512 X 512 training images (BIRD, FIGHTER, PEPPERS, TRUCK) and another using the same images at 256 x 256 resolution, are applied to the 512 x 512 images LENA and MANDRIL. The selected training images differ substantially in content, as shown in Figure 3.18. Figures 3.19 and 3.20 show the different PSNR values at medium and high bit rates for the images LENA and MANDRIL, respectively. Figures 3.21 and 3.22 show the same information at low bit rates. Clearly, the performance of the grouping pattern obtained using the training images is very close to that of the grouping pattern designed using both of the test images LENA and MANDRIL. Figures 3.19 through 3.22 illustrate that our grouping algorithm performs quite well for different images and resolutions. However, note that substantial changes in content (e.g., natural vs medical) or resolution (e.g., 64 x 64 vs 2048 x 2048) do impact the compression performance more significantly. Nevertheless, different patterns can still be designed off-line that achieve consistently good compression performance for a very wide range of image types and resolutions. Figure 3.18: Four training images: BIRD, FIGHTER, PEPPERS, and TRUCK. Chapter 3. Rate Distortion Optimized PJPEG 42 Figure 3.19: Comparison between the performance of image specific and non-specific grouping algorithms at medium and high bit rates for the image LENA. ' Figure 3.20: Comparison between the performance of image specific and non-specific grouping algorithms at medium and high bit rates for the image MANDRIL. Figure 3.21: Comparison between the performance of image specific arid non-specific grouping algorithms at low bit rates for the image LENA. Figure 3.22: Comparison between the performance of image specific and non-specific grouping algorithms at low bit rates for. the image MANDRIL. Chapter 4 Constrained Rate Control in P J P E G Rate control is the ability of coding an image up to a specified rate [14]. Our goal here is to seek the best possible image reproduction quality given the user-specified bit rate. As is well known, encoders that are sequential J P E G compliant cannot achieve automatic rate control in the sense that a desired bit rate can only be achieved approximately by using a trial-and-error approach, i.e., each time quantization table elements are scaled up/down, D C T coefficients are scaled down/up, resulting in decreased/increased bit rate. The lack of a rate control functionality in sequential J P E G encoders limits their usefulness in many important applications. Rate control has thus become a key desired feature of the emerging JPEG-2000 standard [15]. Despite the various syntax constraints of P J P E G , precise rate control can still be achieved. In fact, shown in the following sections is a method that allows the above proposed P J P E G compliant encoder to maintain high reproduction quality subject to the constraint of a fixed bit rate. Figure 4.23 shows the high level block diagram of the rate control algorithm, which is applied to scans ordered according to the grouping procedure described above. Let L be the maximum number of bits available anytime during the encoding process, H be the size in bits of the PJPEG ' s frame header, Hs be the size in bits of the current scan's header, and N is the number of 8 x 8 blocks. First, the value of L , which is initially set to the maximum allowed number of bits (i.e. desired bit rate), is compared to the size H of PJPEG's frame header, and if L < H, the encoding procedure is terminated. This case represents the very unlikely event that the 46 Chapter 4. Constrained Rate Control in PJPEG 47 STOP Yes Proceed Yes. STOP Next Scan STOP Yes L = L-Hs Subsequent AC Scans Subroutine First AC Scan Subroutine First DC Scan Subroutine Subsequent DC Scans Subroutine STOP Figure 4.23: High level block diagram of the rate control algorithm. L is the maximum number of bits available anytime, initially set to the maximum allowed number of bits, H is the size of PJPEG' s frame header in bits, Hs is the size of the current scan header in bits. Chapter 4. Constrained Rate Control in PJPEG 48 maximum allowed number of bits is too small to accommodate even PJPEG' s header. Otherwise, L is decremented by H, and we then proceed by 1) making sure that at least one scan is available and 2) comparing the new value of L to Hs, the size of the current scan header. In the very probable event that L > Hs, L is further decremented by Hs, and we then proceed encoding the D C T bit values in the subject scan in a way that depends on its type (DC or A C , first or subsequent). Figures 4.24, 4.25, 4.26 and 4.27 illustrate the algorithms corresponding to the first DC scan, subsequent DC scans, first A C scan, and subsequent A C scans, respectively. Each of the algorithms is described in detail in the following sections. 4.1 First D C Scan J P E G does not allow encoding of only a few DC coefficients of an image. Therefore, the size of the bit stream after encoding all the DC coefficients is the minimum rate achievable by conventional J P E G encoders. However, we can "fool" the decoder by replacing some of the actual DC bits with zeros which require fewer code bits to achieve lower rate. As. illustrated in Figure 4.24, we first verify that we have at least 2A^ code bits available, where N is the number of 8 x 8 blocks in the image being coded. The number 2N is the minimum number of bits required to encode the fist DC scan since the variable length codeword for the DC difference of zero is "00". Satisfying this condition, we encode the group of bits of the first non-coded 8 x 8 block, decrement L by the corresponding B code bits used to encode the current block, and compare L to 2ra, where 2n is the number of code bits needed to represent n "0"s, and n is the number of non-coded blocks. If L < 2n, the code bits representing the current block are removed, 2n code bits (which happen to be "0"'s) are added to the bit stream, and the encoding procedure is terminated. Otherwise, we encode the following block and so on until we exhaust all Chapter 4. Constrained Rate Control in PJPEG 49 n = N Encode the group of bits of next 8x8 block L = L - B n = n - 1 I Remove the last bits added to bit stream Add Code bits for 2n zeros to the bit stream Go to Proceed Go to Stop Figure 4.24: Block diagram of the rate control algorithm used for the first DC scan, n is the number of non-coded blocks, Af is the number of 8 x 8 blocks and B is the number of code bits generated by encoding the current 8 x 8 block. Chapter 4. Constrained Rate Control in PJPEG 50 available bits or the last block in the image has been encoded. In the latter case, we proceed as shown in Figure 4.23 to encode the next available scan. When the image with some of the DC coefficients replaced with zeros is decoded, its reconstructed image will contain blocks whose gray levels are vividly different from their neighboring blocks. Although such degradation in image quality is not desired, the above algorithm can be useful when only extremely low bandwidths are available. 4.2 Subsequent D C Scan For all subsequent DC scans, we simply verify if we have N available code bits to represent as-is [4] the DC bits. If there are enough code bits, the DC scan is copied to the bit stream in a sequential order (left to right, top to bottom), and we then proceed to encode the following scan. Otherwise, L is incremented by Ha, the current subsequent DC scan is discarded, and the following scan (which would hopefully be an A C scan) is considered. Figure 4.25 illustrate the rate control algorithm when a subsequent DC scan is being encoded. 4.3 First A C Scan As shown in Figure 4.26, we here do not encode 8 x 8 blocks individually, rather, we start and continue processing A C sub-coefficients or groups of bits until the first nonzero sub-coefficient is encountered. At such a point, we encode all zero sub-coefficients and the nonzero sub-coefficient as described in Chapter 2, which produces B bits and processes m blocks. Then,' L is decremented by B bits. To determine the number of non-coded blocks, we verify if the nonzero sub-coefficient is the last element of the spectral band of the subject block. If it is not the last element of the band, the number of non-coded blocks should include the current block (there still exist some sub-coefficients to be coded in the Chapter 4. Constrained Rate Control in PJPEG 51 Yes No L = L + Hs Encode the whole scan 1 L = L - N Go to Proceed Figure 4.25: Block diagram of the rate control algorithm used for the subsequent DC scans. Chapter 4. Constrained Rate Control in PJPEG 52 First AC scan subroutine No Yes Go to Proceed n = N L = L - B n = n - m .Yes * Process next group of bits until the first non-zero bit is encountered -> B bits, m blocks jYes n = n - m n = n - m + 1 -Yes Remove the coded bits of current block(s) n = n + m Add h(EOBn)+s(n) to the bit stream Go to Stop Figure 4.26: Block diagram of the rate control algorithm used for the first A C scan. B is the number of code bits generated by encoding the current 8 x 8 block or blocks, m the number of blocks processed, h(EOBn) the length in bits of the variable length code representing E O B n and s(n) the number of bits required to represent n. Chapter 4. Constrained Rate Control in PJPEG 53 current block). Therefore, the current number n of non-coded blocks is decremented by m — 1 if the nonzero sub-coefficient is not the last element of the corresponding band, and by m otherwise. The parameter L is subsequently compared to h(EOBn) + s(n). h(EOBn) is the length in bits of the variable length code representing EOBn, the end-of-band when n blocks of all zeros are encountered, and s(n) is the number of bits required to represent n. h(EOBn) + s(n) is the minimum number of code bits required to encode the sub-coefficients in non-coded blocks that were ignored by simply assuming them to be zero. If L < h(EOBn) + s(n), the code bits representing the current block(s) are removed, n is restored to its previous value and code bits for EOBn and n in binary are added to the bit stream, and the encoding procedure is terminated. Otherwise, we encode the following block and so on until we exhaust all available bits or the last block in the image has been encoded. , 4.4 Subsequent A C Scan As shown in Figure 4.27, for all subsequent A C scans, we first verify if we have h(EOB(N))+ s(N) + nz available code bits, where nz is the number of the D C T coefficients that have been partially encoded with a nonzero history.' [4]. This is the minimum number of bits required to represent the coefficients with nonzero history and force the other coeffi-cients to zeros. h(EOB(N)) is the length in bits of the variable length code representing EOB(N), the end-of-band when N (initial number of blocks in an image) blocks of zeros are encountered, and s(N) is the number of bits required to represent N. Additional nz bits are required since the D C T coefficients with nonzero history are not variable length coded, but appended as-is. If the total number of available bits is too small, we continue processing A C sub-coefficients until the first nonzero sub-coefficient with zero history is encountered. At such a point, we encode all zero sub-coefficients and the nonzero Chapter 4. Constrained Rate Control in PJPEG 54 Subsequnet AC scan subroutine n = n -m nz = nz - mz Yes * Process next group of bits until the first non-zero bit with zero history is encountered -> B bits, mz coefficients with non-zero history ** Add h(EOBn), n in binary, and correction bits of coefficients with non-zero history to the bit-stream Yes Remove the coded bits of current block n = n + m nz = nz + mz ** Go to Proceed Go to Stop Figure 4.27: Block diagram of the rate control algorithm used for the subsequent A C scans, nz is the number of D C T coefficients with nonzero history and mz the number of D C T coefficients with nonzero history which were processed. Chapter 4. Constrained Rate Control in PJPEG 55 sub-coefficient as described in Section 2.3, producing B bits. Then, L is decremented by B bits. It is assumed that processing of the coefficients up to this point caused mz coefficients with nonzero history and m blocks to be coded. Therefore, the values of n and nz are decremented by m and mz respectively. L is subsequently compared to h(EOB{n)) + s(n) + nz. If L < h,(EOB(n)) + s(n) + nz, the code bits representing the current block(s) are removed, n and nz are restored to their previous values, code bits for EOBn and n in binary representation and correction bits of the coefficients with nonzero history are added to the bit stream, and the encoding procedure is terminated. Otherwise, we encode the following blocks and so on until we exhaust all available bits or the last block in the image has been encoded. To summarize, the above algorithm can achieve relatively precise rate control through sending both complete and partial scans. The partial scans are obtained by appropriately forcing the encoding of "0" 's once the size of the encoded image is approaching the desired maximum number of bits. The same algorithm can also be used, in a straightforward way, to achieve distortion or quality control. Although P J P E G imposes many constraints on the encoder, more precise rate/distortion control can still be achieved by RD optimizing the control algorithm explicitly. That is, we should not wait until all available bits would be exhausted before forcing the rest of the coefficient bits to be "0"'s. Instead, we can employ a Lagrangian multiplier, whose value is adaptively changed during the encoding process, to decide optimally in the rate-distortion sense whether a "1" should be converted to a "0". However, the gain in rate/distortion control precision does not, so far, appear to justify the additional complexity. The performance of the proposed constrained rate control algorithm is shown in Table 4.11. The table shows the target bit rates of 0.0625, 0.125, 0.25, 0.5 bpp, as well as achievable bit rates and the corresponding PSNR values for the image LENA. Using our rate control algorithm, the actual bit rates are very close to the desired bit rates. Chapter 4. Constrained Rate Control in PJPEG 56 Desired bit rate (bpp) Obtained bit rate (bpp) Obtained PSNR 0.0625 0.0618 ' 24.74 0.125 0.125 26.43 0.25 0.249 29.14 0.5 0.499 32.42 Table 4.11: Rate Control Results for the image LENA. Chapter 5 Hierarchically Embedded P J P E G In Chapter 3, we developed a stand-alone RD optimized P J P E G compliant encoding algorithm. In this chapter, the algorithm is employed as a submode of H J P E G to allow quality progressive control for each resolution layer. Figure 5.28 shows the block diagram of the hierarchically embedded P J P E G (HPJPEG) coding procedure with three resolu-tion layers. In H J P E G , an image is encoded as a sequence of frames, while each frame is an independently encoded J P E G image. These frames provide reference reconstructed images which are needed for prediction of subsequent frames. Except for the first frame, difference frames, obtained by computing the difference between the original image and the reference reconstructed image from a frame at a lower resolution, are encoded. Down-sampling and upsampling filters are used to provide the reference reconstructed images. The design of an efficient H P J P E G coding algorithm involves two steps: 1) selection of appropriate filters and 2) selection of proper transition points from lower to higher resolutions. These steps are discussed in detail in following sections. 5.1 Selection of Filters The selection of downsampling and upsampling filters is not specified in the J P E G stan-dard and is left to specific encoder implementations. In order to evaluate its influence on the performance of our H P J P E G encoder, an experiment is performed where different set of simple filters are compared. The first set of filters consists of simple averaging (for downsampling) and bi-linear 57 Chapter 5. Hierarchically Embedded PJPEG 58 AA / V +i A I - • O w Reconstructed Image Spatial Domain PJPEG Encoding PJPEG Decoding Progressive Scans Upsampling/ Downsampling Figure 5.28: Encoding and decoding procedures for hierarchically embedded P J P E G with three layers. interpolation (for upsampling) as shown in Figure 5.29. The center sample in Figure 5.29 (a) is aligned with the left-most column and top row of the original image when computing the left-most column and top row of the lower resolution image, respectively. Filter output values are normalized by the sum of the filter weights. Sample values at the boundaries are replicated when filter weights are outside of the image boundaries. Each 1-D filter can be applied sequentially to achieve 2-D downsampling. A n interpolated value using Figure 5.29 (b) can be calculated by applying the simple equation to the adjacent sample values, RA and RB] RA + RB Px = (5.5) where Px is the newly interpolated value. When the filter value is outside of the image, opposite boundary conditions are applied, as in the downsampling case. Chapter 5. Hierarchically Embedded PJPEG 59 2_ 1 A X B A_ B (a) (b) Figure 5.29: Downsampling and upsampling niters: (a) 1-D averaging downsampling filter and (b) 1-D bi-linear interpolation upsampling filter. A good alternative is a set of filters that are obtained using spline interpolation and approximation techniques [16]. Splines are polynomial functions that can be used to interpolate discrete sample values. In [16], an upsampling filter using a specific type of spline, called a B-spline, is designed to minimize the least mean squared error for a specific lowpass filter. Figures 5.30 and 5.31 show the downsampled/upsampled versions of the 512 x 512 image L E N A and the difference images using the two sets of filters, respectively. As can be seen from the difference images, the B-spline filters clearly outperform the averaging and bi-linear interpolation filters. It can be easily predicted that coding of the difference image in Figure (b) requires a fewer number of bits than that in Figure (a), as the difference image in Figure (b) contains more zeros and less high frequency information (e.g., edges) than Figure (a). As a result, we employ the B-spline filters as part of out H P J P E G encoder. Note that better filters can always replace the existing ones to improve our encoder's complexity-performance tradeoffs. Chapter 5. Hierarchically Embedded PJPEG 60 Figure 5.31: (a) The 512 x 512 image LENA sampled and reconstructed using B-spline filters, and (b) the corresponding difference image. Chapter 5. Hierarchically Embedded PJPEG 61 5.2 Selection of Transition Points In conventional H J P E G , transition from the current layer to the higher resolution layer occurs only after the frame at the current layer has been fully decoded and upsampled, since the differential frames at the higher layers have been already constructed based on the frame at the current layer. In such a case, even after the desired image quality has been achieved at the current resolution, the decoder has to wait until it receives the rest of the information before proceeding to the higher resolution layer. The H P J P E G algorithm, however, allows transitions from one layer to the higher one by allowing the decoder to decide on the reference image quality, i.e., the encoder transmits the encoded differential frames based on the reconstructed image at the decoder at any time. Although our algorithm can be used as-is to provide the receiver with full progressive-ness control, it would then require many computations because prioritized scans would have to be generated at each layer as discussed in Section 3.3. Also transition from one layer to the higher one, if not done properly, can cause a decrease in overall coding effi-ciency. Therefore a default coding procedure is desired that maintains high compression performance at all bit rates and resolutions. Figures 5.32 shows the PSNR vs. bit rate for the image L E N A at three resolutions, each of which is encoded using the RD optimized P J P E G algorithm obtained in Chapter 3. Evidently, the performance of the P J P E G encoder encoding the 128 x 128 resolution image is better than the other two resolutions at most bit rates up to the potential transition point marked as T l . Beyond the point T l , it is outperformed by the P J P E G encoder encoding the next higher resolution image. Naturally, T l would be the first desired transition point for the H P J P E G encoder. Likewise, the second desired transition point would be T2 as shown in the Figure 5.32. Although the first transition can occur exactly at T l , the second transition would not necessarily have to be T2, since T2 is the Chapter 5. Hierarchically Embedded PJPEG 62 Figure 5.32: PSNR vs. Rate for 128 x 128, 256 x 256, and 512 x 512 LENA images coded using P J P E G at low and medium bit rates. Chapter 5. Hierarchically Embedded PJPEG 63 point generated originally by the P J P E G encoder encoding the 256 x 256 image and is not generally reachable by encoding the differential frame during the hierarchical coding procedure. Nevertheless, the best second transition point obtained by differential frame encoding using the same P J P E G algorithm occurs approximately at the same position as T2. The performance of the H P J P E G encoder is presented in the next section. 5.3 Experimental Results Figure 5.33 shows the PSNR vs. rate of the H P J P E G encoder with three layers and the RD optimized P J P E G encoder for the 512 x 512 image L E N A at low and medium bit rates. While the H P J P E G coder can generate compressed images at all the bit rates supported by each of the RD optimized P J P E G encoders, its performance is nearly the same as or better than the "best" of the three stand-alone RD optimized P J P E G encoders with more progressiveness stages as shown in the figure. Also shown in the figure are the performance levels of the SPIHT encoder and the optimized J P E G encoder. Likewise in Section 3.4, the H P J P E G encoder outperforms the optimized J P E G encoder at all bit rates, while achieving some of the very low bit rates not supported by optimized J P E G . However, SPIHT algorithm is still more compression efficient than H P J P E G , as shown in the figure. Figure 5.34 also shows the performance levels of the same encoders at high bit rates. In Figure 5.35, the image LENA was reconstructed using conventional H J P E G and H P J P E G , respectively. Figure 5.35 (a) shows the reconstructed image corresponding to point A l (0.3 bpp) in Figure 5.32. In conventional H J P E G , the receiver has to wait until all the scans beyond the point A l are received before it can make a transition to a higher resolution layer. However, receiving further scans does not improve the reconstructed image quality, as shown in Figure 5.32. Such inefficiency is avoided in our H P J P E G Chapter 5. Hierarchically Embedded PJPEG 64 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 Rate (bits/pixel) Figure 5.33: PSNR vs. Rate for 128 x 128, 256 x 256, and 512 x 512 LENA images encoded using P J P E G , and the 512 x 512 L E N A image coded using H P J P E G , SPIHT and optimized J P E G at low and medium bit rates. Chapter 5. Hierarchically Embedded PJPEG 65 651 1 1 1 1 r— 1 1 1 —r j i i i i i _ J i L " 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 Rate (bits/pixel) Figure 5.34: PSNR vs. Rate for 128 x 128, 256 x 256, and 512 x 512 L E N A images encoded using P J P E G , and the 512 x 512 LENA image coded using H P J P E G , SPIHT and optimized J P E G at all bit rates. Chapter 5. Hierarchically Embedded PJPEG 66 (a) (b) Figure 5.35: The 512 x 512 image LENA reconstructed (a) at 0.3 bpp using conventional H J P E G , and (b) at 0.25 bpp using H P J P E G . algorithm. Using the default transition points obtained in the previous section in our algorithm, a higher PSNR value is achieved at even a lower bit rate for point A2 (0.25 bpp) in Figure 5.33. The corresponding reconstructed image is shown in Figure 5.35 (b). The difference in image quality is vivid in areas of high frequency information such as hair and edges. Our H P J P E G encoder outperforms conventional H J P E G both objectively and sub-jectively. The main advantage of our H P J P E G encoder, however, is that reconstructed images at very low bit rates can be achieved using resolution control. Chapter 6 Conclusions Progressive J P E G encoders are well suited in many image coding applications. We developed an RD optimized hierarchically embedded progressive J P E G encoder that yields progression in both reproduction image quality and resolution. First, a stand-alone RD optimized P J P E G encoder was designed that generates a sequence of single-bit scans, ordered based on their priority values. The D C T bits are then grouped using an efficient design algorithm which provides a good balance between compression efficiency and progressiveness resolution. In addition, we have also developed a rate control algorithm by incorporating some simple control routine into the P J P E G encoding process. The P J P E G compliant encoders outperform sequential J P E G encoders substantially at low and high bit rates, and they achieve precise rate/distortion control. Second, the encoding algorithm obtained in the first step was applied as a submode of hierarchical J P E G to allow spatial resolution control. The resulting hierarchical P J P E G (HPJPEG) encoder allows automatic transition from one layer to the next layer at the receiver's request without fully decoding the lower resolution image. The H P J P E G en-coder using the default transition points achieves high rate-distortion performance at all progressive stages, while allowing the decoder to reconstruct images at a wide range of resolutions and bit rates. Substantially better compression performance and precise rate control, achieved by our J P E G compliant hierarchically embedded progressive encoding algorithms, are two highly desired features currently sought for the emerging JPEG-2000 standard. 67 Bibliography K . R. Rao and J . J . Hwang, Techniques and Standards for Image, Video and Audio Coding. Englewood Cliffs, NJ : Prentice-Hall, 1996. W. Pennebaker and J . Mitchell, JPEG Still Image Compression Standard. New York: Van Nostrand Reinhold, 1993. J . K . Han and G. C. Polyzos, "Networking applications of the hierarchical mode of the J P E G standard," in IPCCC'96, (Pheonix, A Z , USA) , Mar. 1996, ISO/IEC, Digital compression and coding of continuous tone still images: require-ments and guidelines. International Standard Organization, 1994. B. Chitprasert and K . R. Rao, "Human visual weighted progressive image transmis-sion," IEEE Transactions on Communications, vol. 38, pp. 1040-1044, A 1990. C. Pires, J . Afonso, and F. Pereira, "Progressive still images coding; transform analysis," in 6th Mediterranean Electrotechnical Conference, pp. 22-24, May 1991. S. Sridharan, A . Ginige, and D. Lowe, "Progressive image transmission," in Pro- . ceedings of the International Conference on Image Processing and its Applications, pp. 115-118, Mar. 1992. Y . Huang, H . M . Dreizen, and N . P. Galatsanos, "Prioritized D C T for compression and progressive transmission of images," IEEE Transactions on Image Processing, vol. 1, pp. 477-487, Oct. 1992. J . L i , J . L i , and C. J . Kuo, "Layerd D C T still image compression," IEEE Trans-actions on Circuits and Systems for Video Technology, vol. 7, pp. 440-443, Apri l 1997. A . Jain and S. Panchanathan, "Scalable compression for image browsing," in Proc. of International Conference on Consumer Electronics, pp. 394-400, August 1994. C. Loefner, A . Ligtenberg, and G. Moschytz, "Practical fast 1-D D C T algorithms with 11 multiplications," in ICASSP'89, 1989. G. K . Wallace, "The jpeg still picture compression standard," Communications of the ACM, vol. 34, pp. 30-44, Apr. 1991. 68 Bibliography 69 [13] A . Said and W. A . Pearlman, "A new fast and efficient image codec based on set partitioning in hierarchical trees," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, pp. 243-250, June 1996. [14] T. Chiang and Y . Q. Zhang, "A new rate control scheme using quadratic rate dis-tortion model," IEEE Transactions on Circuit and Systems for Video Technology, vol. 7, pp. 246-250, Feb. 1997. [15] ISO/IEC JTC1/SC29/WG1, Call for contributions for JPEG 2000 (JTC 1.29.14, 15444): Image Coding System. ISO/IEC JTC1/SC29/WG1, 1997. [16] M . Unser, A . Aldroubi, and M . Eden, "B-spline signal processing: Part II-efficient design and applications," IEEE Transactions on Signal Processing, vol. 41, pp. 834-848, Feb 1993.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Rd optimized progressive image coding using JPEG
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Rd optimized progressive image coding using JPEG In, Jaehan 1998
pdf
Page Metadata
Item Metadata
Title | Rd optimized progressive image coding using JPEG |
Creator |
In, Jaehan |
Date Issued | 1998 |
Description | The JPEG standard allows four modes of operation. They are the hierarchical (HJPEG), progressive (PJPEG), sequential (SJPEG), and lossless modes1: The HJPEG and PJPEG modes inherently support progressive image coding. In HJPEG, an image is decomposed into subimages of different resolution, each of which is then coded using one of the other three modes of JPEG. Progressiveness within a resolution in HJPEG can be achieved when each subimage is coded using PJPEG. An image coded using PJPEG consists of scans, each of which contributes to a portion of the reconstructed image quality. While SJPEG yields essentially the same level of compression performance for most encoder implementations, the performance of PJPEG depends highly upon the designed encoder structure. This is due to the flexibility the standard leaves open in designing PJPEG encoders. In this thesis, an efficient progressive image, coding algorithm is developed that is compliant with the JPEG still image compression standard. The JPEG-compliant progressive image encoder is a HJPEG encoder that employs a rate-distortion optimized PJPEG encoding algorithm for each image resolution. Our encoder outperforms an op- timized SJPEG encoder in terms of compression efficiency, substantially at low and high bit rates. Moreover, unlike existing J P EG compliant encoders, our encoder can achieve precise rate control for each fixed resolution. Such good compression performance at low bit rates and precise rate control are two highly desired features currently sought for the emerging JPEG-2000 standard. 1 Lossless JPEG algorithms are rarely used since their performance levels are significantly lower than those of other lossless image compression algorithms, and are therefore not widely used. |
Extent | 6640636 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-05-19 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0064806 |
URI | http://hdl.handle.net/2429/7921 |
Degree |
Master of Applied Science - MASc |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1998-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_1998-0250.pdf [ 6.33MB ]
- Metadata
- JSON: 831-1.0064806.json
- JSON-LD: 831-1.0064806-ld.json
- RDF/XML (Pretty): 831-1.0064806-rdf.xml
- RDF/JSON: 831-1.0064806-rdf.json
- Turtle: 831-1.0064806-turtle.txt
- N-Triples: 831-1.0064806-rdf-ntriples.txt
- Original Record: 831-1.0064806-source.json
- Full Text
- 831-1.0064806-fulltext.txt
- Citation
- 831-1.0064806.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0064806/manifest