I M A G E R E D U C T I O N A N D E D G E - B A S E D E X P A N S I O N by Hongjian Shi B. Sc. Henan Normal University, China, 1984 M. Sc. Peking University, China, 1987 Ph. D. Simon Fraser University, Canada, 1997 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in THE FACULTY OF GRADUATE STUDIES DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA September 2001 © H o n g j i a n Sh i , 2001 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. The University of British Columbia Vancouver, Canada D E - 6 (2/88) Abstract Image resizing plays an important role in image processing and video stream trans-mission, specially in the commercial video industries. The most imporstant point of image resizing is that the resized images or videos should keep similar key features such as geometric patterns and the closeness of edges of the original images so that they fit the perception of human vision. Image resizing consists of two parts: image reduction and image expansion. For image reduction purposes, some pixels should be removed or a unified shrinking transforms should be taken. For image expansion, some new pixel values should be added. Image resizing has been an exciting topic in digital image processing due to its extensive uses in industries. Several popular methods for image reduction and expansion methods have been proposed. For image reduction, popular methods are alternative downsampling, average filtering, median filtering and wavelet transform. For image expansion, pixel replication, linear inerterpolation, and cubic interpolation are frequently used. In this thesis, for image reduction, an observation on the image reduction method using Daubechies wavelet transform is made. This observation keeps high frequncy components in the reduced picture and so the reduced picture is obviously sharper than that of the top left corner picture of a wavelet transform. For image expansion, a computationally efficient edge dection method is developed. This new edge detection n method generates edge pictures that are very similar to the edge pictures generated by the very known Canny edge detector in closeness, one pixel width and non-zigzagging. Furthermore, the computation of this new edge detection method is much less but more robust than the Canny edge detector. Based on this new edge detection method, a new image expansion method is proposed. This expansion method preserves the edge information very well. The generated picture using our proposed expansion method appears less zigzagged on the edge regions of the picture. The idea is to change the neighborhood pixel values of the edges in a picture expanded by a simple expansion method. The changed pixel values are fitter to human vision than the values generated by the simple expansion method. An important application of our reduction and expansion methods is in video compression. After image reduction, only 25% of the stream source is left for encoding. The encoding process can save 75% of the standard encoding time. Implementation and testing also show that overall 20% to 30% improvements over standard well-known MPEG2 and H.263 methods are achived in acceptable low bits media stream transmmission using this new method. Above all, our proposed image reduction method gives better quality pictures than those generated by Dauchies wavelet transform, alternative downsampling, average filtering and median filtering methods. Our proposed expansion method outperforms the pixel replication, the linear interpolation and the cubic interpolation methods. It gives crisp and less zigzag pictures. Also these methods used for compression give better compression quality than the standard MPEG2 and H.263 methods. in Contents Abstract ii Table of Contents • iv List of Figures v " Acknowledgements xiv 1 Introduction 1 1.1 Background 1 1.2 Thesis Objective and Accomplishments 3 1.3 Thesis Organization 4 2 Image Reduction .• • • 5 2.1 Introduction 5 2.2 Alternative downsampling 6 2.3 Average filtering 7 2.4 Median filtering 7 2.5 Daubechies wavelet transform method 7 2.6 An observation on the reduction method using Daubechies wavelet transform 8 2.7 Comparisons of the reduced images 10 3 Classical image expansion methods revisited 20 iv 3.1 Introduction 20 3.2 Pixel replication 21 3.3 Linear interpolation 22 3.3.1 Mathematical representation 24 3.4 Cubic interpolation 26 3.4.1 Mathematical representation 28 3.5 Expansion using the inverse Daubechies wavelet transform . . 30 4 Image expansion based on edge detection 32 4.1 Introduction 32 4.2 Classical edge detection 33 4.2.1 Gradient-based edge detectors 35 4.2.2 The Sobel edge detector 37 4.2.3 The Canny edge detector 38 4.3 Proposed edge detector 39 4.3.1 Proposed edge detection method 40 4.3.2 Analysis of the proposed edge detection 45 4.4 Proposed edge-based image expansion 66 4.4.1 Expansion algorithms 66 4.4.2 Comparison of expansion results 71 5 Conclusions and Future Work 98 5.1 Image reduction methods 98 5.2 Comparisons of classical image expansion methods 99 5.3 Edge detection 99 5.4 Proposed image expansion method 100 v 5.5 Future work Bibliography List of Figures 2.1 The original Lena image with resolution 512 x 512 11 2.2 The original Baboon image with resolution 512 x 512 12 2.3 Reduced Lena using alternative sampling method 13 2.4 Reduced Lena using 4 point averageing filter 13 2.5 Reduced Lena using Daubechies wavelet transform 13 2.6 Reduced Lena using modified Daubechies wavelet transform 13 2.7 Reduced Baboon using alternative sampling method 14 2.8 Reduced Baboon using 4 point averaging filter 14 2.9 Reduced Baboon using Daubechies wavelet transform 14 2.10 Reduced Baboon using modified Daubechies wavelet transform . . . . 14 2.11 4 times pixel replication of a portion from the reduced Lena by the alternative downsampling method 15 2.12 4 times pixel replication of a portion from the reduced Lena by the averaging filter 15 2.13 4 times pixel replication of a portion from the reduced Lena by the Daubechies wavelet transform 15 2.14 4 times pixel replication of a portion from the reduced Lena by the modified Daubechies wavelet transform 15 vii 2.15 Daubechies wavelet transform of the Lena image 16 2.16 Pixel values greater than zero in LH, HL and HH parts of Daubechies wavelet transform of Lena image are changed to 255 17 2.17 Pixel values greater than 5 in LH, HL and HH parts of Daubechies wavelet transform of Lena image are changed to 255 and others to zero 18 2.18 Pixel values greater than 10 in LH, HL and HH parts of Daubechies wavelet transform of Lena image are changed to 255 and others to zero 19 3.1 Illustration of indices and value relations of the original pixels and the expanded pixels for the expansion factor 1.2 in one dimensional case . 27 4.1 Original Lena image with resolution 256 x 256 58 4.2 Part of Baboon image with resolution 256 x 256 58 4.3 Cameran man image with resolution 256 x 256 58 4.4 Part of a Letters image with resolution 256 x 256 58 4.5 Edges of Lena image using difference edge detector with threshold 40 59 4.6 Edges of Lena image using Sobel edge detector with threshold 60 . . 59 4.7 Edges of Lena image using Canny edge detector with low threshold 0.01 and high threshold 0.03 59 4.8 Edges of Lena image using the proposed edge detector with threshold value 5 59 4.9 Edges of the centeral part of Baboon image using difference edge detec-tor with threshold 40 60 4.10 Edges of the central part of Baboon image using Sobel edge detector with threshold 60 60 viii 4.11 Edges of the central part Baboon image using Canny edge detector with low threshold 0.01 and high threshold 0.03 60 4.12 Edges of the central part Baboon image using the proposed edge detec-tor with threshold 5 60 4.13 Edges of Cameraman image using difference edge detector with thresh-old 40 61 4.14 Edges of Cameraman image using Sobel edge detector with threshold 60 61 4.15 Edges of Cameraman image using Canny edge detector with low thresh-old 0.01 and high threshold 0.03 61 4.16 Edges of Cameraman image using the proposed edge detector with threshold 5 61 4.17 Edges of the enteral part of Letters image using difference edge detector with threshold 40 62 4.18 Edges of the central part of Letters image using Sobel edge detector with threshold 60 62 4.19 Edges of the central part Letters image using Canny edge detector with low threshold 0.01 and high threshold 0.03 62 4.20 Edges of the central part Letters image using the proposed edge detector with threshold 5 62 4.21 Edges of Lena image using difference edge detector with threshold 60 63 4.22 Edges of Lena image using Sobel edge detector with threshold 80 . . 63 4.23 Edges of Lena image using Canny edge detector with low threshold 0.02 and high threshold 0.06 63 ix 4.24 Edges of Lena image using the proposed edge detector with threshold value 10 63 4.25 Edges of the centeral part of Baboon image using difference edge detec-tor with threshold 60 64 4.26 Edges of the central part of Baboon image using Sobel edge detector with threshold 80 64 4.27 Edges of the central part Baboon image using Canny edge detector with low threshold 0.02 and high threshold 0.06 64 4.28 Edges of the central part Baboon image using the proposed edge detec-tor with threshold 10 64 4.29 Edges of Cameraman image using difference edge detector with thresh-old 60 65 4.30 Edges of Cameraman image using Sobel edge detector with threshold 80 65 4.31 Edges of Cameraman image using Canny edge detector with low thresh-old 0.02 and high threshold 0.06 65 4.32 Edges of Cameraman image using the proposed edge detector with threshold 10 65 4.33 Lena (256x256) 73 4.34 Cameraman (256x256) 73 4.35 Expansion of Lena image using pixel replication with expansion factor 2 74 4.36 Expansion of Lena image using linear interpolation with expansion fac-tor 2 75 4.37 Expansion of Lena image using cubic interpolation with expansion fac-tor 2 76 x 4.38 Expansion of Lena image using inverse Daubechies wavelet transform with expansion factor 2 77 4.39 Expansion of Lena image using Canny edge detector and proposed ex-pansion method with expansion factor 2 78 4.40 Expansion of Lena image using proposed edge detector and proposed expansion method with expansion factor 2 79 4.41 Expansion of Cameraman image using pixel replication with expansion factor 2 80 4.42 Expansion of Cameraman image using linear interpolation with expan-sion factor 2 81 4.43 Expansion of Cameraman image using cubic interpolation with expan-sion factor 2 82 4.44 Expansion of Cameraman image using inverse Daubechies wavelet trans-form with expansion factor 2 83 4.45 Expansion of Cameraman image using Canny edge detector and pro-posed expansion method with expansion factor 2 84 4.46 Expansion of Cameraman image using proposed edge detector and pro-posed expansion method with expansion factor 2 85 4.47 The original Lena image with resolution 256x256 86 4.48 Pixel replication after alternative downsampling 86 4.49 4 times reduction-expansion using the linear interpolation 87 4.50 4 times reduction-expansion using the cubic interpolation 87 4.51 4 times reduction-expansion using the Daubechies wavelet transform and inverse Daubechies wavelet transform 87 xi 4.52 4 times reduction-expansion using the modified Daubechies wavelet transform and proposed expansion method 87 4.53 The original Cameraman image with resolution 256x256 88 4.54 Pixel replication after alternative downsampling 88 4.55 4 times reduction-expansion using the linear interpolation 89 4.56 4 times reduction-expansion using the cubic interpolation 89 4.57 4 times reduction-expansion using the Daubechies wavelet transform and inverse Daubechies wavelet transform 89 4.58 4 times reduction-expansion using the modified Daubechies wavelet transform and proposed expansion method 89 4.59 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 90 4.60 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 91 4.61 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 92 4.62 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 93 4.63 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 94 4.64 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 95 4.65 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 96 xii 4.66 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps X l l l Acknowledgments I wish to express my gratitude and appreciation to my supervisor, Dr. Rabab Ward who introduced me to the digital image processing world. Her valuable guidance, encouragement and patience are greatly appreciated. My thanks also go to my graduate student fellows who gave me help and shared some of their expertise. xiv Chapter 1 Introduct ion 1.1 B ackgr ound Image resizing plays an important role in image and video processing. It has found many important applications in video industries such as video conferencing, HDTV and DVD. Many researchers aim to find proper image resizing methods so that the resized images have close resemblance with the original ones. Image resizing includes image reduction and image expansion. An important application of image resizing to image or video compression is to encode size reduced video frames. The encoded pic-tures can be stored or transmitted using less space or less bandwidth. The consumers decode the encoded video and expand the low resolution pictures to the normal format. The expansion method showed yields an acceptable quality picture level. Also most graph editor tools use image resizing (zooming) for image analysis and comparison such as Adobe Photoshop, Paint and Photo Editor. The most popular image reduction methods are alternative downsampling (dis-carding some rows and columns), average filtering over 2x2 neighborhoods, median filtering over 2x2 neighborhoods [10] and Daubechies wavelet transform method 1 [14]. These methods are simple to implement. However, they tend to generate pic-tures with some blocky or blurring appearance when later expanded. The reduction method of alternative downsampling needs the least computations but its expanded pictures using pixel replication are more blocky than those generated by other meth-ods. The average filtering and median filtering reduction method generally produce more blurred pictures. The reduction method of Daubechies wavelet transform is better than the above simple methods but its reduced pictures have low contrast appearance. Its generated pictures contain less high frequency components. Some adaptive image reduction methods improve image blockiness and blurring (see [9], [12], [17], [16] and [5]). Image expansion has been a popular topic for researchers. Frequently used meth-ods are pixel replication, linear interpolation, cubic interpolation and their variants. The pixel replication method generates pictures that are more blocky than those ex-panded by the linear and the cubic interpolation methods. The cubic interpolation method generates pictures that are less blurred than those generated bt the linear interpolation method and less blocky than those produced by the pixel replication. However, the cubic interpolation method uses the most computations among these three methods. The linear interpolation and the cubic interpolation methods usually yield better acceptable pictures than the pixel replication method. Most common graph editor tools such as Adobe Photoshop, Paint and Photo Editor use linear inter-polation, cubic interpolation and their variants. Many adaptive expansion methods ([9], [12] and [17] etc.) have been developed for various purposes. Most of them excell in some aspects but are not good in other aspects. In many applications, image reduction and expansion methods are often used in 2 combination. For example, image reduction and expansion methods can be used for video stream transmission and broadcast. By transmitting size reduced pictures and then expanding these pictures at the receiver, reduced frames will require less number of bits when encoded. Transmitting media streams at lower bit rates but achieving good picture quality has been a target in video processing and commercial video industries. 1.2 Thesis Objective and Accomplishments The objective of this thesis is to investigate popular image reduction and expansion methods and to propose a computationally efficient image expansion method that yields good expanded pictures. To get sharp or unblurred expanded images, our aim is to obtain the picture edges and enhance the appearance of the picture at these regions. Thus our image expansion method is based on our proposed edge detector. The edge detector should be efficient, i.e., computationally inexpensive. Our edge detection method has similar properties to the well known Canny edge detector in detecting edges that are close to the real ones, are of one pixel width. Our edge detector is not very sensitive to the threshold value chosen. Based on this proposed edge detection method a new expansion method is developed. The latter performs well specially in the edge neighborhoods of the expanded pictures. An observation on the Daubechies wavelet transform allows us to propose a reduction method. A video compression scheme composed of reducing the video frames by this method before MPEG2 encoding and then using our proposed expansion method on the decoded frames is shown to improve the compression rate by 20% and cut down the computational time up to 75% of the original. 3 1.3 Thesis Organization In Chapter 2 , several popular image reduction methods are reviewed and investi-gated. An observation on the image reduction method using Daubechies wavelet transform is made. Based on this obseration an improved image reduction method using Daubechies wavelet transform is proposed. The reduced images on several test images are provided using this observation, alternative downsampling, averaging filter and Daubechies wavelet transform. In Chapter 3, popular image expansion methods and their properties are investi-gated and compared. In Chaper 4, the well known Difference edge detector, Sobel edge detector and Canny edge detector are reviewed. A new edge detection method is proposed for image expansion purposes. Based on our proposed edge detector and some classical image expansion method, an image expansion method is proposed. Our proposed expansion method performs better than the pixel replication, the linear interpolation and the cubic interpolation methods. An application of the proposed image reduction and expansion method to video compression is proposed. In Chapter 5, conclusions on the image reduction, image expansion algorithms and their application to video compression are made, and possible applications and suggested future work are mentioned. 4 Chapter 2 Image R e d u c t i o n 2.1 Introduction In many applications it is useful to obtain images with reduced resolutions from those of the original ones. Processing the reduced i.e. smaller sized images involves much less computations than processing the original larger sized images. When images are size reduced, there is loss of information in the picture details. Image reduction fol-lowed by expansion that is matched to recover the information loss introduced by reduction can perceptually recover some of the original image information, so that human vision could perceive little change. This is very useful for video transmission if the methods used for reduction followed by expansion match each other well. The most important factor in successfully reducing an image is the preservatioon of edges in the generated lower resolution image. It is expected that continuous patterns or features in the reduced image keep their similar shapes as in the original image. In the following sections, we will review several popular image reduction techniques. These methods are average filtering, median filtering based on 2 x 2 arrays, alterna-tive downsampling and Daubechies wavelet transformation method. Also a modified 5 method using Daubechies wavelet wi l l be proposed. In the last section of this chap-ter, some comparasion results of these image reduction methods wi l l be given through observations of the implementation results on pictures. Usually an image could be sampled at any low resolution. For most applications, the reduction factor 2 is usually adopted. Our discussions on image reduction methods wi l l focus on this image reduction factor. A color picture is generated from three components, the luminance Y and the chrominances U and V. These represent the blue B, green G and red R components. In this thesis, the same procedures that can be applied to the three components Y, U and V can also be applied to the blue _£?, G and R components. So al l our discussions wi l l be only on the intensity component Y. 2.2 A l t e r n a t i v e d o w n s a m p l i n g Alternative downsampling method is the simplest reduction method. The procedure deletes every other row and every other column of an original image. The reduced image is this: / r ( t , j ) = / 0 (2 i ,2 j ) . Where fr(i,j) represents the pixel value of the reduced image at location and /0(2i, 2j) represents the pixel value of the original image at location (2i, 2j). Pictures in Figures 2.3 and 2.7 are the reduced images in Figures 2.1 and 2.2 using this method. This method needs the least computational effort among all image reduction methods. However, this method generates the most blockiness i n the reduced images. This can be seen when one uses graph editor tools to enlarge the reduced image. 6 2.3 Average filtering Average filtering here involves taking the average of the four neighboring pixel values as the pixel value in the reduced image at corresponding location: fr(i,J) = \[fo(2i,2j) + f0(2i,2j + 1) + fQ(2i + l,2j) + f0(2i + l,2j + 1)]. In the one dimensional case, the average filtering takes the average of every other two pixel values. For two dimensional images, the average filtering can be realized by one dimensional row average filtering followed by one dimensional column average filtering. The reduced images using this method are blurred but not blocky. Pictures in Figures 2.4 and 2.8 are the reduced pictures of pictures in Figure 2.1 and 2.2 respectively using this four point average filter. 2.4 Median filtering Instead of taking the average of the four neighboring pixel values, the median of the four neighboring pixel values is taken. The median of 4 points is the average of the two middle pixel values in the four pixel values queued from small to large. The reduced images using this method are less blurred than those using the average filter but more blurred than those obtained by the alternative downsampling method. 2.5 Daubechies wavelet transform method Wavelet transforms have been extensively used in image processing and image com-pression. A discrete wavelet transform decomposes an image into a set of successively smaller orthonormal images. The reduction factor could be 2~N(N > 1). For N = 1, 7 the left top corner image in a wavelet tranform is usually considered the reduced im-age wi th reduction factor 0.5. For N > 1, the top left top corner image in the wavelet transform is the reduced image with reduction factor 2~N. Wavelet transforms offer the promise of compact representation and efficient detection of image components that match the wave shape of the chosen wavelet. For lossy compression purposes, the detection of specific components such as edges in an image is an important aspect. It is important to select a wavelet that is similar to the components of interests. The most popular and frequently used wavelet transform is the Daubechies wavelet transform. The impulse response of Daubechies lowpass filter ho(k) (tap 2) is as follows. f (1 + 7(3)) k = 0 Ayf(2)h0(k) = (3 + /(3)) k = l (3-/(3)) k = 2 (1-/(3)) k = 3 0 otherwise. This filter is widely used for finger print. Figures 2.5 and 2.9 show the reduced pictures wi th resolution 256 x 256 through taking the top left corner component of the Daubechies wavelet transform with step N — 1. 2.6 An observation on the reduction method using Daubechies wavelet transform From the discussions of the above image reduction techniques, the method using Daubechies wavelet transform is very competitive. However, there st i l l exist some points that need to be improved. The key point is that the reduced picture using the Daubechies wavelet transform has lost the low high L H , high low H L and high high H H frequency components of the original image. These components are perceived by 8 human vision. The Daubechies wavelet transform decomposes an image into low low LL, low high LH, high low HL and high high HH four parts. The reduced image is taken as the LL frequncey components. As the LH, HL and HH frequency components are discarded, the reduced picture appears plain. Our point here for a modified reduction method using the Daubechies wavelet transform is to consider the other three frequency components LH, HL and HH. The Daubechies wavelet transform has a very good energy compaction property. The picture in Figure 2.15 is the Daubechies wavelet transform of Lena image. We see that in the LH, HL and HH parts the pixel values are small. Shaddow of the LL part, specially the edges of LL part, can be seen in the LH, HL and HH parts. In fact, if the pixel values greater than zero in the LH, HL and HH parts are changed to 255, we can see that these pixel values almost fill the LH, HL and HH components (see Figure 2.16). However, if the pixel values greater than 5 in the LH, HL and HH parts are changed to 255 and others to zero, then simple edge pictures of the LL part appear in the LH, HL and HH parts (see 2.17). Figure 2.18 shows the case where the pixel values greater than 10 in the LH, HL and HH are changed to 255 and others in these parts are changed to 0. In each case, The pixel values in the LH, HL and HH parts are the edges of the reduced Lena image in the top left corner. So it can be expected that if all these edge pixel values in the LH, HL and HH parts are added to the LL part, a sharper reduced picture would appear since this addition would strength the edges of the picture in the reduced image. Our procedure is to add the frenquency components greater than a given threshold in parts LH, HL and HH to the LL parts. The whole energy of thus reduced picture is of course increased and specially the edges of the reduced image in the top left corner is strengthed. The reduced pictures using this modification of the Daubechies 9 wavelet transformed images are crisper than those of the top left component only. Our simulation results confirm this observation and that this nonlinear process does not add zigzag artifacts to the edges of the image in the top left corner. We here call this process the modified Daubechies wavelet transform. Pictures in Figures 2.6 and 2.10 are the reduced pictures in Figures 2.1 and 2.2 using this modified Daubechies wavelet transform. 2.7 Comparisons of the reduced images In the previous sections several image reduction methods were discussed. An obser-vation on the Daubechies wavelet transform was discussed. Although we can see that the images reduced by using different methods have subtle differences, these changes are hard to perceive by human vision and it is also hard to see clearly what the ge-ometrical structures of the edges look like. For this purpose, we replicate each pixel on a portion of the reduced images. That is, using replication method to enlarge the reduced images. This will make the edges of the enlarged images look zig-zag due to the increased space for each individual pixel. However, it is a natural way to show the performance of each image reduction techniques. Pictures in Figures 2.11, 2.12, 2.13 and 2.14 are the expanded images by replicating each pixel in a same portion (64 x 64) of the reduced Lena images to 16 pixels. The reduced Lena images are the images using alternative downsampling, average filtering, Daubechies wavelet trans-form and our modified wavelet transform respectively. From these pictures we can see that our observation on Daubechies wavelet transform recovers or increases high frequency components. This can be seen specially compared to the 4 times expanded part of the portion reduced from the alternative downsampling method. 10 It is also helpful for us to choose a reduction method which matches well for video compression purpose as outlined at the beginning of this thesis. ure 2.2: The original Baboon image with resolution 512 x 512 12 Figure 2.3: Reduced Lena using alternative sampling method Figure 2.5: Reduced Lena using Daubechies wavelet transform Figure 2.4: Reduced Lena using 4 point averageing filter Figure 2.6: Reduced Lena using modified Daubechies wavelet transform 13 Figure 2.7: Reduced Baboon using alterna- Figure 2.8: Reduced Baboon using 4 point tive sampling method averaging filter Figure 2.11: 4 times pixel replication of a portion from the reduced Lena by the al-ternative downsampling method Figure 2.13: 4 times pixel replication of a portion from the reduced Lena by the Daubechies wavelet transform Figure 2.12: 4 times pixel replication of a portion from the reduced Lena by the av-eraging filter Figure 2.14: 4 times pixel replication of a portion from the reduced Lena by the mod-ified Daubechies wavelet transform 15 Figure 2.15: Daubechies wavelet transform of the Lena image 16 Figure 2.16: Pixel values greater than zero in LH, HL and HH parts of Daubechies wavelet transform of Lena image are changed to 255 17 Figure 2.17: P i x e l values greater than 5 in L H , H L and H H parts of Daubechies wavelet transform of Lena image are changed to 255 and others to zero 18 Figure 2.18: Pixel values greater than 10 in LH, HL and HH parts of Daubechies wavelet transform of Lena image are changed to 255 and others to zero 19 Chapter 3 Classical image expansion methods revisited 3.1 Introduction Image expansion has been a popular topic in digital pictures and video processing. Many methods have been developed (e.g. [2], [3], [5], [6], [9], [12], [13], [15], [16] and [17]). The pixel replication method and the linear interpolation method have been very popular and frequently used due to their simple implementation. Generally, there are two types of expansion methods. The first is of the non-edge based image expansion ([6], [9], [13], [16] and [17]). This kind of image expansion does not use any imformation related to image edges and estimates a pixel value according to its pixel location and its global invariant relation with its neighborhood pixel values. The pixel replication method, the linear interpolation and the cubic interpolation are all non-edge based expansion methods. The other type of image expansion methods is edge based ([2], [3], [5] [12] and [15]). These methods expand or fill the pixel values according to the edges of the original images or some other images i.e. these methods 20 depend largely on the edge information. In this chapter, several known and frequently used expansion methods will be revisited, discussed and implemented. 3.2 Pixel replication Pixel replication is the simplest image expansion method. The method simply maps each pixel in the original image onto a n f x F block of pixels in the expanded image, where F is the expansion factor in the horizontal direction and vertical direction and each pixel in the expanded block has the value of the original pixel. For example, if the expansion factor is 3 then each pixel in the original image is mapped onto a 3 x 3 block of pixels in the expanded image. So a pixel which has a value b will be mapped as follows. b b b b=> b b b b b b It is easily seen that the pixel replication algorithm is separable. It can be im-plemnted as one-dimentional row pixel replication followed by one-dimensional col-umn pixel replication, or as one-dimensional column replication followed by one-dimensional row pixel replication. In the literature, this replication method is also called zero-order interpolation. This pixel replication expansion method plays a very important role in analysizing and viewing image processing results. When an image is enlarged or zoomed to certain levels, the structures or patterns of the original im-age remain the same. From this, it can be clearly seen what the pattern of some important local region looks like so that proper processing would be taken to achieve better quality (see [3]). This pixel replication method is the simplest and cheapest one to implement. So, if the computational requirement is restricted, this replication 21 method for expansion is a good choice. However, for image expansion purposes only, this pixel replication method introduces jaggy or blocking artifacts specially at large expansion factors and could not be used directly for fractional image expansion. Pictures in Figures 4.1 and 4.3 are the original Lena and Cameraman images with resolution 256 x 256 respectively. Pictures in Figures 4.35 and 4.41 show their expanded images with the expansion factor 2 using the pixel replication method. 3.3 Linear interpolation Linear interpolation for image expansion estimates the unknown pixels by linear in-terpolation using the known pixel values. The linear interpolation can be considered for integer expansion factors such as 2,3,4 and for fractional factors such 1.1,1.3,2.6 etc. For integer expansion factors such as 3, in the one-dimensional case, the linear interpolation estimates the values of two other unknown pixels lying between any two neighboring original pixels by fitting a straight line between the two original pixels. For example, the two unknown pixel values between two original neighboring pixel values a and b are shown as follows. 2 1 , 1 2. , a, o a , 3"' 3 ° + jj6' b-In the two dimensional case, the linear interpolation is also called bilinear interpola-tion. The process interpolates pixel values on the one dimensional columns followed by interpolating the one dimensional rows (or vice versa). For the fractional expansion, the idea is similar to the linear expansion with in-teger expansion factor. In the one dimensional case, for example, assume the linear expansion factor is 1.2. That means that we need to expand 5 pixel values to 6 pixel 22 values. The idea is to estimate 6 even spaced pixel values through the piecewise linear connection of the five known pixel values. For example, the five original pixel values are I0,Ii,I2,l3 a n d h- The immediate following two pixel values are 75 and 75. The expanded 6 pixel values will be x0 = /o, Xi " 12 / o + 1271' X2 - 5 * + 12 /2' X3 + 12 /3' X4 = h>* + 12 /5' x5 + 12/6-We see that, from the same pattern recursively appears. Figure 3.1 shows the relation of the original pixel values and the filled pixel values if we use small intervals to express pixel values in the expanded case. The formula to be derived or results are also correct for the image reduction method of the linear interpolation. The idea of linear interpolation is well known and widely used. Here, we will derive the general formula for any expansion factor and implement it for our expansion purpose. The process of linear interpolation is separable. It is also a relatively simple, fast and computationally inexpensive method. However, from the human visual aspect, it introduces blurring and contouring effects around edges in the expanded image and also does not recover small details of the original image, thus the expanded image looks soft and plain. 23 3.3.1 Mathematical representation Since linear interpolation is a separable process, we only need to discuss the one dimensional linear interpolation case. First, we just ignore the indices of the pixel values to see how to evaluate these values and then find their corresponding indices for implementation. We use (x,f) to express one pixel with its location x and pixel value / . Given two neighboring pixels (x0, fo) and the linear interpolation is to find a straight line p\(x) such that p\(x0) = fQ and pi(xi) = f\. Then p\(x — XQ) can be written as pi(x - x0) = f0 + (x - x0)— —, xi - x0 where x is between XQ and x\. Note that x\ — XQ = 1. Thus the above equation can be written as pi(x - x0) = fo + {x - x0)(/i - fo)-Let r = x — xo, then Pi(r) = fo + r(h - fo) = (1 - r ) / 0 + rfr. (3.1) The pixels (xo,/o) a n ( l (xiifi) contribute (1 — r)f0 and rf to the interpolated pixel value pi(r) respectively. For the two dimensional case, assume the neighboring pixels are (xno, /oo), (xni, /oi)> (xio,/io) and (xn,/n). Let s and r denote the horizontal and vertical distances re-spectively from the interpolated pixel (x,pi(x)) to the point (£nn,/oo)- Then Pi(r) = (1 - r)foo + r/oi and P2{r) = (1 - r)fw + rfn. 24 Then, the one dimensional interpolation in the vertical direction between pi(r) and P2{f) yields the desired interpolated value p(r, s) as follows. p{r,s) = (1 - s)pi(r) + sp2(r) = (l-s)((l-r)f00 + rf01) + s{{l-r)fw + rfu) = (l-s)(l-r)/oo+ ( l -5 ) r /o i+a ( l - r ) / 1 0 + 3 r / i i . (3.2) From this it can be seen clearly that the original pixels (x00, /no)) (^01, /oi), (xio, fw) and (xn,fu) contribute to the interpolated value p(r, s) (1 — s)(l — r)/oo, (1 — s)rfoi, s(l — r)/io and srfu respectively. We now find the relation of an interpolated value and the original pixel values so as to efficiently implement the linear interpolation. For an integer expansion factor F, linear interpolation interpolates F — 1 pixel values between two neighboring original pixel values. To find the estimate of a pixel value X{ with index i, we need first to find the two neighboring original pixels Iq and Iq+\ for which X{ belongs to the interval [Iq,Iq+i\. Assume that the distance between the two original pixels is 1 and the distance between two new neighboring pixels in the expanded image is 1/F, then we can show that the pixel value X{ should lie between the two original pixels whose indices are q and 17 + 1 where q is the integer part of i (the index of X{) divided by F. Furthermore, if we use r to denote the decimal fraction of i divided by F, then r is the distance of X{ from the original pixel whose index is q. Then using (3.1), the estimate of re, in terms of Iq and Iq+i is Xi = (1 - r)Iq + rlq+1. (3.3) From the above formula we can evaluate all the pixel values in a whole row in terms of the original pixel values. In the case that the expansion factor is an integer, q and r are the quotient and the decimal fraction of i divided by F respectively. To 25 implement linear interpolation expansion on an image, we can use this formula to implement the linear interpolation on all rows first and then on all columns. In the following we use examples to verify the above formula. For the one dimensional linear interpolation with the expansion factor F = 4, the interpolated pixel values obtained geometrically are x0 Xi x2 X4 = / o , 3 r 1 r 4 7 o + 4 7 1 ' 2 2 1 , 3 , = h, It can be easily verified that these pixel values are the same as the ones derived from formula (3.3). For fractional expansion, we use the one dimensional linear inter-polation with the expansion factor F = 1.2 as illustrated in Figure 3.1. The gefnetrical figure 3.1 shows the index and value relation of each original pixel and expanded one. In the figure, the Fs are the original pixels and the x's represent the expanded ones. Pictures in Figure 4.36 and 4.42 are the expanded images of Lena with resolution 256 x 256 using the linear interpolation with expansion factor 2. 3.4 Cubic interpolation Cubic interpolation uses a third order degree polynomial to interpolate the unknown pixels in the expanded image (in the one dimensional case). There are many variants 26 I 0 1_1 I_2 I 4 I 5 I 6 I 7 I 9 x 11 x 12 113 x 4 1 x 15 1 x 16 1 x 17 x_18 1 i_ 19 1 xlO ' x_21 I 9 I 10 I 11 1 12 I 13 I_14 I_15 I_16 I 17 I 18 Figure 3.1: Illustration of indices and value relations of the original pixels and the expanded pixels for the expansion factor 1.2 in one dimensional case of the polynomial used and their relations to the original pixel values differ from one method to the other ([1], [8]). However, here the definition of the cubic interpolation is the one naturally thought of and frequently used. For a two dimensional image, similar to the linear interpolation, cubic interpolation can be first applied to all rows and then to all columns. Thus it is also called the bicubic interpolation. Bicubic interpolation should be better than the linear interpolation since it uses more information, four pixel values to estimate unknown pixel values rather than the two pixel values used in linear interpolation. Implementation also shows this is correct. Bicubic interpolation generates sharper and less blocky pictures specially at edge neighborhoods. The overall quality of the generated pictures is better than those generated by linear interpolation. Although it yields less blurred and less blocky pictures, it still removes 27 some small details. Also blocky artifacts can be seen when the generated pictures are enlarged using Adobe Photoshop. On the other hand, bicubic interpolation uses much more computation than linear interpolation. 3.4.1 Mathematical representation Since bicubic interpolation can be implemented by applying cubic interpolation on the one dimensional rows and then on the one dimensional columns, our following discussion will focus on the one dimensional cubic interpolation. Let (x-i, / _ i ) , (xo, /o), (xi, fi) and (x2, $2) be the locations and pixel values of 4 equally spaced neighboring pixels. The distance from one pixel to the other three pix-els is 1, or 2 or 3. Cubic interpolation finds a polynomial p${x) such that pz{x-\) = /_!, pz{xQ) = /o, Pz{xf) = fi and pz(x2) = f2. The purpose is to find an interpo-lated value for a pixel x between Xo and Xi. The third order polynomial for cubic interpolation, [8] and [1], is Pa(x) fo + (x- x0)f(x0, xi) + (x- x0)(x - x i ) f ( x 0 , Xi, x2) +(x - x0)(x - xi)(x - X2)f(X-i,X0,Xi,X2). Here f { x 0 , x i ) / 1 - / 0 — fi — fo, Xl - x0 f ( x 0 , x i , x 2 ) f(xi,x2) - f(x0,xi) = 0.5(/2 - 2fi + f0) X2 - XQ and f ( x - 1 , x 0 , x i , x 2 ) f(x0, Xi, x2) - f ( x - i , x 0 , Xi) x2 — x _ i 0.5(/2 - 2fi + f0) - 0.5(/! - 2/o + 3 28 Let xQ = 0 and r = x — XQ. Simplifying the expression of pz(x) yields the following formula. Pa(r) = -^(r-2)(r- l )r /_ 1 + l(r-2)(r-l)(r + l)/ 0 _I(r _ 2)r(r + 1)/! + i(r - l)r(r + l)/ 2. (3.4) From this formula, for the interpolated pixel x between XQ and xi, /_ i contributes — |(r — 2)(r — l)r/_i to its interpolated pixel value, fo contirbutes |(r — 2)(r — l)(r + l)/o, / i contributes — |(r — 2)r(r + l)/i and f2 contirbutes |(r — l)r(r + l)/2 geometrically. For the cubic interpolation with a fractional expansion factor, the above formula can still be used. The indices relation of the original pixels and the pixels in the expanded image is the same as in the case of the linear interpolation which was discussed in the last section. So an explicit formula of the pixel values in the expanded image and the original pixel values with indices denoted can be written as follows. X i = _I(r-2)(r-l)r/,_1 + ^(r-2)(r-l)(r + l)/, b I - i ( r - 2)r(r + + l-(r - l)r(r + l) / , + 2 . (3.5) Where q and r are the integral part and the decimal fraction part of i divided by the expansion factor F respectively and where Fs and x's are the original pixel values and the pixel values in the expanded image respectively. For example, assume the original four neighboring pixel values are [4 1 3 2]. If the expansion factor is 4, the interpolated values between 1 and 3 will use all the four pixel values 4,1,3,2. For a two dimensional image, as noted earlier, one dimensional row cubic interpolation is implemented followed by one dimensional column cubic interpolation. For the inter-polated pixel values between the boundary and the right original pixels, we use the 29 cubic interpolation polynomial for the first four original pixels in the proper direction to calculate their values. Similar process works for the other boundary pixels. Pictures in Figure 4.37 and 4.43 are the expanded images of Lena and Cameraman with resolution 256 x 256 images using cubic interpolation with the expansion factor 2. 3.5 Expansion using the inverse Daubechies wavelet transform It is known that the Daubechies wavelet transform with step 2 decomposes an image into four parts, low low frequency components LL, low high frequency components LH, high low frequency components HL and high high frequency components HH. The low low frequency part LL is the reduced picture of the original picture with reduction factor 0.5. With step N the Daubechies wavelet transform yields a reduced picture with the reduction factor 2~N. For expansion purposes, using the inverse Daubechies wavelet transform to expand an image with the expansion factor 2, we first consider the low low part as the original picture and all pixel values in other parts as zero and then use the inverse Daubechies wavelet transform. The generated pictures contains no high frequency components and so the picture is somewhat blurred and some details are lost. This method is comparable to linear interpolation in the quality of the generated pictures. However, it is computationally expensive and applies only when the factor is a power of 2. Of course, the best result for expansion using the inverse Daubechies wavelet transform is the case for the expansion factor 2 since more information is utilized. For step N(N > 1), the inverse Daubechies wavelet transform will be used N times. 30 The expansion result is certainly worse. Pictures in Figures 4.38 and 4.44 are the expanded images of Lena and Cameraman with resolution 256 x 256 using the inverse Daubechies wavelet transform with expan-sion factor 2. These expanded pictures appear less zigzag than those expanded images by linear interpolation and cubic interpolation. The shoulder part of the expanded Lena image of Figure 4.38 obviously has less zigzag edges than those of Figures 4.36 and 4.37. Similar results can be observed from the expanded Cameraman images in Figures 4.44, 4.42 and 4.43. These images also show that the images expanded by the inverse Daubechies wavelet transform are crisper than those expanded by linear interpolation and cubic interpolation. 31 Chapter 4 Image expansion based on edge detection 4.1 I n t r o d u c t i o n In Chapter 3, we visited several image expansion methods: pixel replication, linear interpolation, cubic interpolation and the inverse Daubechies wavelet transform. All those methods have disadvantages such as blockiness, blurring or zigzag edges in the expanded image. Our purpose in this chapter is to propose an edge-based expansion method to be applied on images expanded by one of these methods and it reduces these artifacts i.e. blockiness, blurring and zigzag edge noises of the expanded image. An image is composed of nonstationary and homogenous regions. Homogenous regions are those where the pixel values are close, and so linear interpolation can be used for expansion. Nonstationary regions are those where the pixel values vary largely, and so these regions are difficult to expand. If linear interpolation is used to expand these regions the expanded image would be blurred at the edge neighborhoods. For different edge detectors, the nonstationary regions will be expanded different due 32 to the differences in the detected edges. So our expansion method will generate pictures with different qualities. If the detected edges are of one pixel width and their closeness to the true edges is good, then the proposed method will generate crisp pictures specially at the edge neighborhoods. Many popular edge detection methods use thresholding measures to determine if a point is an edge point. For our proposed edge detector, if the threshold is properly small, the boundaries of small features or details can be detected as edges. So the procedures of our proposed method are used extensively on images expanded using some simple expansion methods such as linear interpolation or cubic interpolation. Also, this proposed edge detector, detailed in Section 4.3, is simple as a computationally efficient edge detector is used. This edge detector has similar features to the Canny edge detector in the properties of yielding edges that are close to the true ones and are of one pixel width. Based on this edge detection method, the proposed image expansion method produces comparably sharp and less zigzag edges. Since our proposed image expansion method depends on the edge detector used, we first discuss some classical edge detection methods and then propose our computationally efficient edge detection method in Section 4.3. In Section 4.4, details of our proposed expansion method that is based on edge detection are described. The edge detectors to be discussed and used are the gradient-based difference edge detector, the Sobel edge detector, the Canny edge detector [14, 7] and our proposed edge detector. 4.2 Classical edge detection This section introduces some classical edge detectors such as the gradient-based dif-ference edge detectors, the Sobel edge detector and the Canny edge detector. Then we 33 propose a new and computationally effect edge detection method in Section 4 .3. These methods will be used for our new segmentation based image expansion in Section 4.4. Variations or discontinuities in the luminance or chrominance components in an image are fundamentally important characteristics of an image. The human vision system is very sensitive to these discontinuities since they provide an indication of the ranges of objects within the image and make humans perceive the structures of the image. So edge detection is very important in image or video processing. In many applications, edge detection forms the first step in image or video processing and has been the subject of much research. New edge detection algorithms continue to be developed. Display of good contrast edge neighborhoods or enhancing the edges of an image can make the image look sharper as the contrast of differnt objects within the image is well perceived by the human vision. Classical expansion methods usually make the enlarged image look blurred, blocky or zigzag to some extent. So image processing during the expansion process is a good adaptive measure to avoid blurring, blockiness or zigzagging. For these kinds of processing, edge detection is important to obtain high quality pictures. Generally, a pixel in an image is called an edge point if it is on the boundaries of objects within the image. An edge pixel is the point on which the gradient in one of eight directions has relatively larger changes than the other two neighboring pixels in the same direction. An edge point exists if the gradient in one of eight directions achieves maximum or minimum. Usually a point is also called an edge if the gradient is greater than a specified value. In this section we will review some classical edge detectors. These are the differ-ence edge detector (also called Robert edge detector), the Sobel edge detector and the 34 Canny edge detector. These edge detection methods will be used for our proposed image expansion method in Section 4 . 4 . A new edge detector will be proposed in Section 4 . 3 . The performance of this new edge detector is similar or close to that of the well known Canny edge detector. Our new edge detector is more computa-tionally efficient and robust and the detected edges are mostly one pixel width as that of Canny's. Since it is computationally efficient and can detect weak edges, it is potentially useful for various real-time image or video processing applications. In Section 4 . 4 , we will propose a new image expansion method based on edge detection. This expansion method based on our proposed edge detection method outperforms frequently used classical image expansion methods. Our proposed expansion methods based on the Canny edge detector and the proposed edge detector are implemented. Comparison results are also provided. 4.2.1 Gradient-based edge detectors The simplest edge detector is the gradient-based difference edge detector. It is also called Robert edge detector. An edge segment F(i,j) in an image can be detected by the gradient G(i,j) along a line orthogonal to the edge, which is at an angle 9 with respect to the horizontal axis. The gradient along the line orthogonal to the edge slope can be computed in terms of row edge gradient and column edge gradient as follows. G M ) = ^ m t + z * j i l > l « n e . (4.1) Ol OJ The spatial gradient amplitude is given by \G(j,k)\ = { [ G r ( j , * ) ] a + [ G c ( j , * ) ] a } 1 / a - (4 -2) Gr(j,k) = F(j,k)-F(j,k-l) (4 .3 ) 3 5 and Gc(j,k) = F(j,k)-F(j-l,k). (4.4) Equation (4.2) can be approximated by \GA(j,k)\ = \Gr(j,k)\ + \Gc(j,k)\ (4.5) for computation efficiency. After the edge gradient is computed, the approximate gradient is compared to a specified threshold value to determine if a pixel is an edge pixel. The threshold value determines the sensitivity of the edge detector. For image processing purposes the threshold value might be chosen so that some small variations of the amplitude can be detected. However, such selection usually depends on the specific application e.g. to reduce some noise such as for compression purposes. For noisy images, the selection of the threshold value forms a tradeoff between missing real edges and noise-induced false edges. Pictures in Figures 4.1, 4.2, 4.3 and 4.4 are our test pictures. Here the resolution of Lena is 256 x 256. The resolution of the previous Lena image used in Chapter 2 is 512 x 512. Pictures in Figures 4.5, 4.9, 4.13 and 4.17 show the detected edges using the difference edge detector with the threshold value 40. Pictures in Figures 4.21, 4.25 and 4.29 also show the edge pictures obtained using the threshold value 60. If the threshold value is changed to a lower value than 40, the edges detected are wider. If the threshold value is changed to a larger value then weaker edges are lost. This is also true for the Sobel edge detector discussed in the next subsection. The difference edge detector is not very sensitive to small fluctuations. This can be predicted from the formulation of this edge detector. We see from the edge pictures 36 that the edges detected may be three pixels width, or more pixels width. The detected edges are rarely less than three pixels width. Furthermore, when the threshold value is large, many small details are completely lost. The other edge detectors to be discussed are better than the difference edge detector in this respect. Also, the edge picture noise is not well suppressed even with a relative large threshold value. For most processing we require less noisy edges so as to obtain crisp pictures. 4.2.2 The Sobel edge detector Based on the idea of using gradient amplitudes, many similar edge detectors can be generated, for example, diagonal edge detector, or the one using the differnce of two alternative neighboring pixel values instead of the gradient defined above. Generally the Sobel edge detector is also a gradient-based edge detector which is generated by convolving the image with the following two matrice: 1 2 l " 0 0 0 . -1 -2 -1 _ Sh and Sv are the Sobel vertical edge detector and the Sobel horizontal edge detector respectively. The sum of the absolute values of the Sobel horizontal gradient and the Sobel vertical gradient forms the so called Sobel edge detector. Since the Sobel edge detector uses three rows and three columns centered at the current pixel to calculate the mixed gradient, the detected edges should more accurately reflect the edge orientation. Also it is better in detecting the weak edges than the difference edge detector. However, the Sobel edge detector might not detect weak edges and the detected edges are usually three to six pixels width. Pictures in Figures 4.6, 4.10, 4.14 and 4.18 show the edge pictures of our test images obtained using the Sobel edge 37 sh = -1 0 -1 2 0 - 2 1 0 -1 and Sv = -8 detector with the threshold value 60. Pictures in Figures 4.22, 4.26 and 4.30 show the edge pictures of our test images detected using the Sobel edge detector with the threshold value 80. 4.2.3 The Canny edge detector The Canny edge detector [4] is well known and has been considered as a good edge detector if not the best. We include the Canny edge detector for comparison with the proposed edge detector. Unlike the gradient-based edge detectors, the Canny edge detector is not a linear one. The Canny edge detection linearly filters an image with a Gaussian kernel to mask or smooth the noise and then computes the edge amplitude and direction for each pixel in the smoothed image. This is done by differentiating the image in the horizontal and vertical directions and computing the gradient amplitude as the root of the square sum of the horizontal gradient and the vertical gradient. The gradient direction can be computed by the ratio of the vertical gradient to the horizontal gradient. Candidate edge pixels are identified as the edge pixels that survive after a thinning process called non-maximal suppression. In this thinning process, the edge amplitude of each candidate edge pixel is set to zero if its amplitude is not larger than the amplitudes of the two adjacent pixels in the gradient direction. Thresholding is then done on the thinned edge magnitude using hysteresis. In hysteresis, two edge amplitude thresholds are used. All canadidate edge pixels below the lower threshold are disqualified as a true edge point and all pixels above the low threshold that can be connected to any pixel above the high threshold through a chain of edge pixels are qualified as true edge pixels. 38 The Canny edge detection is a good edge detection in three aspects. The amplitude signal-to-noise ratio (SNR) of the gradient is maximized to obtain a low probablity of failure to mark real edge points and a low probability of falsely marking non-edge points. The edge points are identified as close as possible to the center of the edge. The detected edges are of one pixel width and the closeness of the edge pixel values is surprisingly better than those of the difference edge detector and the Sobel edge detector. Furthermore, the Canny edge detector can also detect small details of an object with much less noise if the threshold value is small. It is much better than the difference edge detector and the Sobel edge detector in keeping the rich textures of the image. Also, the Canny edge detector has been considered the best in detecting the closeness of edges and in yielding one pixel width edges. It has been extensively researched and used in image processing. Pictures in Figures 4.7, 4.11, 4.15 and 4.19 show the edge pictures of the test images obtained using the Canny edge detector with the lower threshold value 0.01 and higher threshold value 0.03 . Pictures in Figures 4.23, 4.27 and 4.31 show the edge pictures of the test images Lena, part of Baboon and Cameraman detected using the Canny edge detector with the lower threshold value 0.02 and higher threshold value 0.06. 4.3 P r o p o s e d edge detector In this section, a new edge detector will be proposed. This edge detector has some similar properties to the Canny edge detector such as edge closeness and one pixel width edges. Our proposed edge detector uses the average filter to mask noise instead of using the more computationally demanding Gaussian filter used by the Canny 39 edge detector. Also we use the geometrical properties of local maximal points to significantly reduce the number of computations to detect the edges instead of using an additional thining process as in the Canny edge detection. The threshold value used is only one fixed value instead of two, and can be set at the beginning. Experiments show that the good values for the threshold are 4 to 10, for expansion purposes. 4.3.1 Proposed edge detection method Now we begin to introduce this new edge detector step by step as follows. Let the following matrix be a part of pixel values from an image 116 108 95 105 111 118 135 102 102 101 110 112 116 118 119 115 130 [l63~ 132 190 117 125 143 |178| 204 204 178 176 166 181 164 186 190 179 166 174 191 171 180 187| 184 180 168 171 182 189 179 182 186 191 192 187 192 188 187 190 192 192 188 186 177 184 188 191 Obviously, the boxed pixel values or their immediate neighboring pixel values should be the edge pixel values if the edge detection method is good. To detect the true edges and their real tendancy at one edge pixel, we usually need to mask or remove some possible noise that may affect the judgement of the edge orientation. The Canny edge detector uses several times the Gaussian mask which involves complicated computations. Instead, we use the 3x3 average filter or a quasi-average filter to mask or remove some noise which may affect our judgement of the edge orientation. This process will make the picture more blurred but the detected edges will be less jaggy 40 and smoother. After the average filtering, the data matrix becomes 108 102 95 105 105 118 117 102 110 129 112 117 118 119 128 116 132 142 159 190 140 151 143 154 164 190 166 177 166 177 174 180 162 178 181 168 179 181 164 178 177 185 181 183 183 186 178 181 184 188 184 187 191 192 188 188 192 188 204 176 189 192 190 186 Now we find the gradient sum for each point. For example, we want to find the gradient sum at pixel value O. The four adjacent pixel values are A,B,C and D as follows. A B O C D Then we define the gradient sum at 0 as GSO = \0 - A\ + \0 - D\ + \0 - B\ + \0 - C\. It is easy to see that when the variation of pixel values is large, the corresponding GSO is also large. In this way, we do not need to determine the orientations of the edges since the gradient sum GSO itself contains enough information about each pixel value. The edge direction can be determined by the GSO. Now let us see the matrix of the gradient sums. We only calculate the gradient sums at the non-boundary pixels 41 of the image. So we have the following 6x6 matrix of the gradient sums 30 39 46 58 86 110 62 67 67 64 62 66 85 64 52 43 34 26 44 38 24 16 10 13 28 11 7 10 12 14 22 14 16 15 10 4 In the illustration we exclude the pixels on the four boundaries due to no enough information to determine an edge point. Of course, for a whole image the only affected part is just the four boundaries of the frame. From this data array we define the edge points in the 6x6 original part from an image. First, a threshold value, usually a number between 4 to 10, will be used. We now judge if the pixel O is an edge point. If the GSO value at a point is less than this fixed threshold value, this pixel is excluded from being an edge pixel. Otherwise, we consider some conditions necessary for a pixel to qualify as an edge point. Let us first assume 8 adjacent pixel values A, B, C, D, E, F, G, H with 0 as the center as follows: The GSH,GSG,GSF,GSA,GSO,GSE,GSB,GSC and GSD are the gradient sums of H, G, F, A, 0, E, B, C and D respectively. We record four 'if conditions as follows: H G F A 0 E BCD GSO = irmx(GSA,GSO,GSE) (4.6) GSO = m<ix{GSG,GSO,GSC) (4.7) 42 GSO = max(GSH,GSO,GSD) (4.8) GSO = m&x(GSF,GSO,GSB) (4.9) If either (4.6 and 4 .8) , or (4.6 and 4 .9) , or (4.7 and 4 .8) , or (4.7 and 4 .9) , is satisfied, the pixel value O is considered as an edge pixel value. Otherwise, O is not qualified for an edge pixel value. This qualification condition is configured according to our desire that the edges detected would be of one pixel width. To illustrate what we mean, consider the eight pixel values H, G, F, A, (9, E, B, C and D as denoted above. Assume H, A, B and F, E, D belong to two different homogeneous regions connected by the ramp GOC. From the definition of the gradient sum, GSO > GSA,GSO > GSE, GSO > GSH, GSO > SGD, GSO > GSB and GSO > GSF. GSO is the maximum gradient in three directions AOE, BOF and HOD and so 0 is qualified as an edge pixel value as desired. Similarly, G and C are also qualified as edge pixel values. From the definition of the gradient sum and GSO > GSA, GSA can only be the maximum in two diagonal directions. A is not desired, and also not qualified as an edge pixel value from the requirement. Similarly, H, B, F, E and D are not desired, and not qualified as edge pixel values. In the qualification requirement, we do not consider those pixel values as edge pixel values whose gradient sums are maximum in only two diagonal directions. Later, we will test various patterns to justify the requirement of the qualification as an edge pixel value. From this requirement of the qualification we see that the edge points in our illustration data are the pixels with the same value 1 as follows. 43 o o o o \T\ [T] o miiiii o o \T\ o o o o o 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 The corresponding detected edge points in the original data are the boxed pixels as follows. 108 95 105 118 102 105 117 102 110 129 112 117 118 128 119 116 132 142 159 190 140 151 143 154 164 190 166 177 166 177 178 184 174 180 181 188 162 178 181 184 191 168 179 181 187 192 164 177 181 183 188 192 178 185 183 186 188 188 204 176 189 192 190 186 Comparing these detected edge points with the original data we can see these detected edge points form a smoothly connected edge. This edge is basically our expected edge. As will be seen in the following section, this proposed edge detection method detects one pixel width edges for real ramp edges but at most two pixel width edges for steep edges. However, since the textures of an image is complicated and the pixel values vary from time to time, so at almost all cases the generated edge pictures are one or two pixels width. We will illustrate our statements and analyze this proposed edge detection method in the next subsection. 44 4.3.2 Analysis of the proposed edge detection In this section we will use some possible edge patterns to analyze the performance of our proposed edge detection method. Usually, image edges fall into two categories, ramp edges and steep edges. Here we analyze typical edge patterns using our proposed edge detection method. Ramp edges. A ramp edge is an edge in a data pattern which has the transition pixels between two homogeneous regions. These edges can be slant lines, horizontal lines or vertical lines. For convenience, we use 0, 2 for the pixel values in the two homogeneous regions and 1 for the pixel values in the transition line. Also, we will not consider thresholding which does not affect our discussions here as the variation of the gradient sum is continuous along edges. Notice that our proposed edge detection method is symmetric about the edge no matter where the higher and lower pixel values lie. So, we just list one case for the two cases with the same edge pattern but the higher pixel values lie in different parts of the edge. Also, on the first row, last row, leftmost column and rightmost column, we will assume their neighboring rows and columns have the tendancy to form the whole pattern. For example, consider the data pattern in the following Pattern 1 and Pattern 3. In Pattern 1, we consider the row above the first row has data like 0,0,0,1,2,2,2,2 and the column left to the leftmost column has data like 0,0,0,0,0,0,0,0. In Pattern 3, we consider the row above the first row has data like 0,0,0,0,0,0,0,0,1, • • •. We first consider the four most important edges Pattern 1, Pattern 2, Pattern 3 and Pattern 4. 45 Pattern 1 is a vertical ramp edge with the following data pattern: 0 0 0 1 2 2 2 2 0 0 0 1 2 2 2 2 0 0 0 1 2 2 2 2 0 0 0 1 2 2 2 2 0 0 0 1 2 2 2 2 0 0 0 1 2 2 2 2 0 0 0 1 2 2 2 2 0 0 0 1 2 2 2 2 Pattern 1 From this data pattern we desire the proposed method to detect the vertical tran-sition column with value 1 (the fourth column) as an edge line. Using the proposed edge detection method, after average filtering the corresonding gradient sum matrix is: 0 1/3 1 4/3 1 1/3 0 0 1/3 1 4/3 1 1/3 0 0 1/3 1 4/3 1 1/3 0 0 1/3 1 4/3 1 1/3 0 0 1/3 1 4/3 1 1/3 0 0 1/3 1 4/3 1 1/3 0 0 1/3 1 4/3 1 1/3 0 0 1/3 1 4/3 1 1/3 0 qualification conditions it is easy column could be the edge line. 46 Patten 2 is a horizontal ramp edge with the following data pattern: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 Pattern 2 From the data pattern we wish the proposed method to detect the horizontal transition row with value 1 (the fourth row) as an edge line. Using the proposed edge detection method, after average filtering the corresonding gradient sum matrix is as follows. 0 0 0 0 0 0 0 0 1/3 1/3 1/3 1/3 1/3 1/3 1/3 1/3 1 1 1 1 1 1 1 1 4/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 1/3 1/3 1/3 1/3 1/3 1/3 1/3 1/3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 So, according to the qualification conditions it is easy to see that only the fourth row could be the edge line. 47 Patten 3 is a diagonal ramp edge with the following pattern: 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 2 0 0 0 0 0 1 2 2 0 0 0 0 1 2 2 2 0 0 0 1 2 2 2 2 0 0 1 2 2 2 2 2 0 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 Pattern 3 From the data pattern, we expect the diagonal line of pixels with value 1 to be detected as the edge. After average filtering, the gradient sum matrix is as follows. 0 0 0 0 2/9 8/9 16/9 20/9 0 0 0 2/9 8/9 16/9 20/9 16/9 0 0 2/9 8/9 16/9 20/9 16/9 8/9 0 2/9 8/9 16/9 20/9 16/9 8/9 2/9 2/9 8/9 16/9 20/9 16/9 8/9 2/9 0 8/9 16/9 20/9 16/9 8/9 2/9 0 0 16/9 20/9 16/9 8/9 2/9 0 0 0 20/9 16/9 8/9 2/9 0 0 0 0 It is easy to see that our proposed edge detection method detects the diagonal line with pixel values 1 as an edge line. 48 Pattern 4 is a diagonal ramp edge with the following pattern: 1 2 2 2 2 2 2 2 0 1 2 2 2 2 2 2 0 0 1 2 2 2 2 2 0 0 0 1 1 2 2 2 0 0 0 0 1 2 2 2 0 0 0 0 0 1 2 2 0 0 0 0 0 0 1 2 0 0 0 0 0 0 0 1 Pattern 4 After average filtering, the gradient sum matr ix is as follows. 20/9 16/9 8/9 2/9 0 0 0 0 16/9 20/9 16/9 8/9 2/9 0 0 0 8/9 16/9 20/9 16/9 8/9 2/9 0 0 2/9 8/9 16/9 20/9 16/9 8/9 2/9 0 0 2/9 8/9 16/9 20/9 16/9 8/9 2/9 0 0 2/9 8/9 16/9 20/9 16/9 8/9 0 0 0 2/9 8/9 16/9 20/9 16/9 0 0 0 0 2/9 8/9 16/9 20/9 Our proposed edge detection method detects the diagonal line wi th the pixel value 1 as an edge line, which was our expected edge line. The above four patterns are the most important ones to human vision. Besides the four types of edges, there are other types of edges with different slopes other than 0, 45, 90 and 135 degrees. Since al l discussions of them are similar, we just discuss four edge patterns of these. 49 Pattern 5 is a slant ramp edge with the following pattern. 0 0 0 0 0 1 2 2 0 0 0 0 0 1 2 2 0 0 0 0 1 2 2 2 0 0 0 0 1 2 2 2 0 0 0 1 2 2 2 2 0 0 0 1 2 2 2 2 0 0 1 2 2 2 2 2 0 0 1 2 2 2 2 2 Pattern 5 After average filtering, the gradient sum matrix is as follows. 0 0 1/9 7/9 15/9 17/9 11/9 3/9 0 0 3/9 11/9 17/9 15/9 7/9 1/9 0 1/9 7/9 15/9 17/9 11/9 3/9 0 0 3/9 11/9 17/9 15/9 7/9 1/9 0 1/9 7/9 15/9 17/9 11/9 3/9 0 0 3/9 11/9 17/9 15/9 7/9 1/9 0 0 7/9 15/9 17/9 11/9 3/9 0 0 0 11/9 17/9 15/9 7/9 1/9 0 0 0 Using our proposed edge detection method, the detected edge data array is as follows. 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 50 Where those pixels with value 1 are the detected edge pixels. The detected edge is basically our expected edge except the displacement by one pixel in some locations. Pattern 6 is a slant ramp edge with the following pattern: 0 0 0 0 0 1 2 2 0 0 0 0 1 2 2 2 0 0 0 0 1 2 2 2 0 0 0 0 1 2 2 2 0 0 0 1 2 2 2 2 0 0 0 1 2 2 2 2 0 0 0 1 2 2 2 2 0 0 1 2 2 2 2 2 Pat te rn 6 After average filtering, the gradient sum matrix is as follows. 0 0 1/9 7/9 14/9 15/9 1 2/9 0 0 2/9 1 15/9 14/9 7/9 1/9 0 0 4/9 12/9 14/9 12/9 3/9 0 0 1/9 7/9 14/9 15/9 1 2/9 0 0 2/9 1 15/9 14/9 7/9 1/9 0 0 4/9 12/9 14/9 12/9 3/9 0 0 1/9 7/9 14/9 15/9 1 2/9 0 0 2/9 1 15/9 14/9 7/9 1/9 0 0 Using our proposed edge detection method, the detected edge data array is as 51 follows. 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 Where those pixels with value 1 are the detected edge pixels. Once again the detected edge is the same as expected. Pattern 7 is a slant ramp edge with the following pattern: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 2 2 1 1 0 0 0 0 2 2 2 2 1 1 0 0 2 2 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 Pattern 7 After average filtering, the gradient sum matrix is as follows. 11/9 7/9 3/9 1/9 0 0 0 0 17/9 15/9 11/9 7/9 3/9 1/9 0 0 15/9 17/9 17/9 15/9 11/9 7/9 3/9 1/9 7/9 11/9 15/9 17/9 17/9 15/9 11/9 7/9 1/9 3/9 7/9 11/9 15/9 17/9 17/9 15/9 0 0 1/9 3/9 7/9 11/9 15/9 17/9 0 0 0 0 1/9 3/9 7/9 11/9 0 0 0 0 0 0 1/9 3/9 52 Using our proposed edge detection method, the detected edge data array is as follows. 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Where those pixels with value 1 are the detected edge pixels. So the detected edge is basically as expected except the displacement by one pixel for some locations. Pattern 8 is a slant ramp edge wi th the following pattern: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 2 1 1 1 0 0 0 0 2 2 2 2 1 1 1 0 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 Pattern 8 53 After average filtering, the gradient sum matrix is as follows. 2/9 1/9 0 0 0 0 0 0 1 7/9 4/9 2/9 1/9 0 0 0 15/9 14/9 12/9 1 7/9 4/9 2/9 1/9 14/9 15/9 14/9 15/9 14/9 12/9 1 2/9 7/9 1 12/9 14/9 15/9 14/9 15/9 14/9 1/9 2/9 4/9 7/9 1 12/9 14/9 15/9 0 0 0 1/9 2/9 4/9 7/9 1 0 0 0 0 0 0 1/9 2/9 Using our proposed edge detection method, the detected edge data array is as follows. 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Where those pixels with value 1 are the detected edge pixels. So the detected edge is the same as expected. The edge detected using our proposed method is one pixel width and is what we expected compared to the original data. So, for ramp edges in an image our proposed edge detection method gives one pixel width edges with good closeness and smooth transition properties. Steep edges Step edges mean the edges formed by two neighboring homogeneous parts con-nected together without transition area or pixels. For our symbols, a steep edge is 54 formed by two neighboring parts, one part with lower pixel value 0 and one with higher pixel values 2. For this case, our proposed edge detection method wi l l detect edges wi th two pixel width. Here, for illustration purpose, we take two patterns as examples to discuss. Other cases are similar to the cases for ramp edges except the two pixels width of the detected edges. Patten 9 is a horizontal steep edge with the following data pattern: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 Pattern 9 Using the proposed edge detection method, after average filtering the corresonding gradient sum matr ix is as follows. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2/3 2/3 2/3 2/3 2/3 2/3 2/3 2/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 4/3 2/3 2/3 2/3 2/3 2/3 2/3 2/3 2/3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 So, according to the qualification condition of the proposed edge detector it is easy to see that the fourth and the fifth rows both are the detected edges. The detected edges are of two pixels width. 55 Patten 10 is a slant steep edge with the following pattern: 0 0 0 0 0 0 0 2 0 0 0 0 0 0 2 2 0 0 0 0 0 2 2 2 0 0 0 0 2 2 2 2 0 0 0 2 2 2 2 2 0 0 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 Pattern 10 After average filtering, the gradient sum matrix is as follows. 0 0 0 0 4/9 12/9 20/9 20/9 0 0 0 4/9 12/9 20/9 20/9 12/9 0 0 4/9 12/9 20/9 20/9 12/9 4/9 0 4/9 12/9 20/9 20/9 12/9 4/9 0 4/9 12/9 20/9 20/9 12/9 4/9 0 0 12/9 20/9 20/9 12/9 4/9 0 0 0 20/9 20/9 12/9 4/9 0 0 0 0 20/9 12/9 4/9 0 0 0 0 0 Applying the proposed edge detection method, the detected edges are the pixels with the value 1 and of two pixels width in the slant direction as follows. 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 56 The detected edge is what we expected. Pictures in Figures 4.8, 4.12, 4.16 and 4.20 show the edge pictures of test images detected using the proposed edge detector with the threshold value 5. Pictures in Fig-ures 4.24, 4.28 and 4.32 show the edge pictures of the first three test images detected using the proposed edge detector with the threshold value 10. Since the gradient sums of the Letters images are zero or 255, so the edge picture of the Letters image remains the same no matter what threshold values are used. When the threshold for our proposed edge detector is lowered, more weak edges will be detected and the edge closeness will be strengthed but edge noise will be increased. When the threshold for our proposed edge detector is increased, some weak edges will be lost and the edge closeness will be reduced but the edge noise will be reduced. The same is true for the Canny edge detector. The edges detected by the Canny edge detector keep strictly one pixel width and its closeness is a little better than our proposed edge detector. Above all, our edge detector is computationally efficient and robust against threshold settings. For the difference detector and the Sobel edge detector, if the threshold is lowered, the edges become much wider and so the edges will lose the meaning of edges. Also, weak edges are not detected as in the case of the Canny's and our proposed edge detectors. When the threshold is raised for the difference edge detector and the Sobel edge detector, almost all weak edges are lost and the edge closeness becomes worse. Also, the edge width is reduced but is still bigger than two or three pixels width. 57 Figure 4.1: Original Lena image with res-olution 256 x 256 Figure 4.3: Cameran man image with res-olution 256 x 256 Figure 4.2: Part of Baboon image with res-olution 256 x 256 Q R ! \t 1AI Figure 4.4: Part of a Letters image with resolution 256 x 256 58 Figure 4.5: Edges of Lena image using dif-ference edge detector with threshold 40 til!-'- .-MJ W 1'-. -- / Figure 4.7: Edges of Lena image using Canny edge detector with low threshold 0.01 and high threshold 0.03 l1 JP > Figure 4.6: Edges of Lena image using So-bel edge detector with threshold 60 Figure 4.8: Edges of Lena image using the proposed edge detector with threshold value 5 59 Figure 4.9: Edges of the centeral part of Baboon image using difference edge detec-tor with threshold 40 Figure 4.10: Edges of the central part of Baboon image using Sobel edge detector with threshold 60 Figure 4.11: Edges of the central part Ba-boon image using Canny edge detector with low threshold 0.01 and high threshold 0.03 Figure 4.12: Edges of the central part Ba-boon image using the proposed edge detec-tor with threshold 5 60 Figure 4.13: Edges of Cameraman image using difference edge detector with thresh-old 40 Figure 4.15: Edges of Cameraman image using Canny edge detector with low thresh-old 0.01 and high threshold 0.03 Figure 4.14: Edges of Cameraman image using Sobel edge detector with threshold 60 Figure 4.16: Edges of Cameraman im-age using the proposed edge detector with threshold 5 61 D ) ( r \ \ <; D ) ( r\\ < W/7 W/7 Figure 4.17: Edges of the enteral part of Figure 4.18: Edges of the central part of Letters image using difference edge detec- Letters image using Sobel edge detector tor with threshold 40 with threshold 60 Figure 4.19: Edges of the central part Let- Figure 4.20: Edges of the central part Let-ters image using Canny edge detector with ters image using the proposed edge detec-low threshold 0.01 and high threshold 0.03 tor with threshold 5 62 Figure 4.21: Edges of Lena image using dif- Figure 4.22: Edges of Lena image using So-ference edge detector with threshold 60 bel edge detector with threshold 80 Figure 4.23: Edges of Lena image using Canny edge detector with low threshold 0.02 and high threshold 0.06 Figure 4.24: Edges of Lena image using the proposed edge detector with threshold value 10 63 Figure 4.25: Edges of the centeral part of Baboon image using difference edge detec-tor with threshold 60 Figure 4.26: Edges of the central part of Baboon image using Sobel edge detector with threshold 80 Figure 4.27: Edges of the central part Ba-boon image using Canny edge detector with low threshold 0.02 and high threshold 0.06 Figure 4.28: Edges of the central part Ba-boon image using the proposed edge detec-tor with threshold 10 64 Figure 4.29: Edges of Cameraman image Figure 4.30: Edges of Cameraman image using difference edge detector with thresh- using Sobel edge detector with threshold old 60 80 Figure 4.31: Edges of Cameraman image using Canny edge detector with low thresh-old 0.02 and high threshold 0.06 Figure 4.32: Edges of Cameraman im-age using the proposed edge detector with threshold 10 65 4.4 P r o p o s e d edge-based image e x p a n s i o n 4.4.1 Expansion algorithms Our image expansion algorithm is based on linear interpolation and our proposed edge detection method. It is as follows. First, using some simple expansion method we expand an original image into an image with the desired large format. Second, we use our proposed edge detector to detect the edges of this large format image. Third, based on the detected edges and the orientations of the edges we change the pixel values in the neighborhoods of the detected edges. After these three steps are performed the resultant image is our expanded image. In the first step of our expansion procedure, we use the linear interpolation method to expand the original image. This is because the edges in the expanded image by linear interpolation are ramp edges. Applying our proposed edge detector to the expanded image by linear interpolation yields edges of one pixel width. Cubic inter-polation can also be chosen instead of linear interpolation. Since the expanded image by cubic interpolation appears sharper than that by linear interpolation, sometimes the edges in the expanded image by cubic interpolation are not ramp edges. The edges detected by our proposed edge detector might be of two pixel width. This would reduce the picture quality in the third step. Experiments confirm that linear interpolation is a better choice. In the second step of our expansion procedure, our proposed edge detection method could detect very small details if the threshold value is small enough. So most pixel values perceived by humans in the edge neighborhoods will be changed according to our third step for expansion. This will improve the picture quality in the detected 66 edge neighborhoods. If the threshold value is too small there will be some edge noises which affect our expansion result. However, experiments show that if the threshold value for our edge detection method is between 4 to 10, the expansion result is pretty good and as desired. It can be seen that our edge-based expansion method performs well on the edge neighborhoods. Since the Canny edge detector has similar features of closeness and one pixel width detected edges, it can also be used in this procedure instead of our proposed edge detector if the computational efficiency is not considered. The difference edge detector and the Sobel edge detector can not detect weak edges and the detected edges are three or more pixels width. They are of no use for us since we change the pixel values in the edge neighborhoods. In the third step of our expansion procedure, we change the pixel values in the neighborhood of the detected edges. This change involves the immediate neighboring pixel values. The Sobel horizontal and vertical derivates are used to determine the edge orientation and which neighboring pixel values are to be changed. This change will make the image look crisper than the expanded image by linear interpolation and cubic interpolation. In the following, details of the three steps of our expansion procedure are described. Let f(i,j) be a detected edge pixel in the following 3x3 array. /(* - 1,j - 1), f(i-hj), f ( i - l j + l) f{i,j), + /(* + i , i — l), /(* + i , i ) , f(i+ + The Sobel horizontal derivate and vertical derivate at the location are SDh(iJ) = \[f(i-l,j-l)-f(i-l,j + l)} + \[f(i,j-l)-f(i,j + l)} +![ / ( •+ u - i ) - / ( i + u + i)] 67 and SDv(i,j) = + + + respectively. If \SDh(i,j)\ >= \SDv(i,j)\, we change the pixel values of f(i,j — 1) and f(i,j + 1) to /(», j - 1) = |(/(f, J - 1) + /(», J - 2)) and / ( * , i + i ) = ^ ( / ( * ' , i + i ) + / ( * , i + 2)) respectively, where /(z, j — 2) is the value of the pixel to the left of f(i,j — 1) and f(i,j + 2) is the value of the pixel to the right of f(i,j + l). Ii\SDh,(i,j)\ < \SDv(i,j)\, we assign the pixel values f(i — l,j) and f(i + l,j) to f(i-hj) = \(f(i-hj) + f(i-2,j)) and f(* + hj) = \(f(i + l,j) + f(i + 2,j)) respectively. In the following we will use the 5x5 array taken from the data of a picture to illustrate our expansion method. 12 11 76 176 165 12 12 151 170 167 11 30 174 170 169 11 100 174 169 169 17 152 170 166 167 68 Here we want to expand the 5x5 array to 10 x 10. The linearly interpolated data of the above array is 12 11 11 43 76 126 176 170 165 165 12 11 11 62 113 143 173 169 166 166 12 12 12 81 151 160 170 168 167 167 11 16 21 91 162 166 170 168 168 168 11 20 30 102 174 172 170 169 169 169 11 37 65 119 174 171 169 168 168 168 11 55 100 137 174 171 169 168 168 168 14 69 126 149 172 169 167 167 167 167 17 84 152 161 170 168 166 166 167 167 17 84 152 161 170 168 166 166 167 167 Here the last row and the last column are replicated of the previous row and column respectively. Now we apply our proposed edge detection method to find the edges in this interpolated data array. After average filtering and the estimation of the gradient sum matrix and deleting the four boundaries we get 20 75 113 141 85 34 13 5 28 79 119 111 52 19 4 3 42 95 124 98 62 5 0 2 65 112 121 101 103 22 1 2 93 122 112 63 37 5 1 1 116 122 90 43 14 2 1 1 126 111 70 26 9 3 3 1 124 115 51 18 5 4 3 1 The detected edge pixel locations are the same as those boxed elements in the gradient sum matrix. The linear interpolated matrix with the boxed pixel values of detected 69 edge is as follows. 12 12 12 11 11 11 11 14 17 17 11 11 12 16 20 37 55 11 11 12 21 30 43 62 76 126 113 81 91 102 65 100 69 84 84 126 152 152 119 137 149 161 161 151 162 174 174 174 172 170 170 143 160 166 172 171 171 169 168 168 176 170 165 165 173 169 166 166 170 168 167 167 170 168 168 168 170 169 169 169 169 168 168 168 169 168 168 168 167 167 167 167 166 166 167 167 166 166 167 167 The two boxed pixel values on the boundaries are determined as edge points by calculating the rows not listed here. According to this detected edge pattern, and using the Sobel derivates to modify the pixel values around the edge, the expanded data are as follows. 12 12 12 11 11 11 11 11 11 12 16 20 11 11 43 59 126 173 36 113 158 12 18 25 81 91 102 155 164 173 24 33 65 100 146 155 14 _17 T r i 69 84 84 137 156 156 149 161 161 174 174 172 170 170 160 166 172 171 171 169 168 168 173 170 170 170 169 169 167 166 166 170 165 165 169 166 166 168 167 167 168 168 168 169 169 169 168 168 168 168 168 168 167 167 167 166 167 167 166 167 167 In the above array, the boxed pixel values are the new pixel values assigned. We can see that the contrast on the two sides of the edge is higher than the original. So the picture expanded in this way looks sharper than the originally linear interpolated one. 70 We include here some pictures to show our proposed edge-based expansion. Pic-tures in Figures 4.39, 4.40 are the expanded images of Lena (resolution 256 x 256) with factor 2 using proposed expansion method based on the Canny edge detector and our proposed edge detector respectively. Pictures in Figures 4.45, 4.46 are the expanded images of Cameraman (resolution 256 x 256) with factor 2 using proposed expansion method based on the Canny edge detector and our proposed edge detector respectively. 4.4.2 Comparison of expansion results In this part, we propose two methods to show the performance of the proposed edge-based expansion method. The first method is to use matched image reduction and expansion methods to reduce an image to a quater size image and then blow up the reduced image to the original size. The matched reduction and expansion pairs used here are alternative downsampling with pixel replication (0 order reduction and expansion), reduction and expansion by linear interpolation, reduction and expan-sion by cubic interpolation, reduction using the Daubechies wavelet transform and expansion using the inverse Daubechies wavelet transform, and reduction using the modified Daubechies wavelet transform and expansion using our proposed edge-based expansion method and edge detector. Generally, good image reduction and expan-sion methods generate good reduced and then good expanded images. Our modified Daubechies wavelet transform reduction method increases high frequency components in the reduced image. Our proposed expansion method first uses linear interpolation as the basic expansion. This blurs the picture to some extent. Thus we propose 71 reducing the image using the modified wavelet transform and then expanding it us-ing our proposed expansion. Experiments show that this is a good combination for reducing-blowing pictures and for video compression. Simulation results show that the downsampling and replication pair yields the most blocky and zigzag pictures. The reduced and then expanded images, by linear interpolation and by cubic inter-polation both yield blurred and zigzag pictures that are better in that regard than the downsampling-replication method. The reduced and then expanded images using the modified wavelet transform and our proposed expansion method are less blurred and less blocky than the other four combination pairs. But, our method also gen-erates some small artifacts on edges. However, fewer such artifacts appear if the threshold for the expansion process is large. Pictures in Figures 4.48, 4.54, 4.49, 4.55, 4.50, 4.56, 4.51, 4.57, 4.52 and 4.58 are the images using these combination pairs of reduction-expansion methods. The second method which shows the performance of our method is to compare two video compression results. We used the standard MPEG2 scheme for encoding and decoding an original video sequence at a certain bit rate. Ligos MPEG2 encoder is used. To further compare the compression performance, we reduce the same video se-quence using the modified Daubechies wavelet transform before the MPEG2 encoding. Then after decoding we expand the decoded sequence using our proposed expansion based on our proposed edge detector. Pictures in Figures 4.59, 4.60, 4.61, 4.62, 4.63, 4.64, 4.65 and 4.66 are a few side to side comparisons of MPEG2 and our proposed compression scheme at the bit rate of 2 Mbits per second. We get improvement in bits rates of 20% for bit rates less than 3 Mbits per second and also allow saving up to 75% encoding time. 72 Figure 4.33: Lena (256x256) Figure 4.34: Cameraman (256x256) 73 Figure 4.35: Expansion of Lena image using pixel replication with expansion factor 2 74 4.36: Expansion of Lena image using linear interpolation wi th expansion factor 75 76 Figure 4.38: Expansion of Lena image using inverse Daubechies wavelet transform with expansion factor 2 77 Figure 4.39: Expansion of Lena image using Canny edge detector and proposed pansion method wi th expansion factor 2 78 Figure 4.40: Expansion of Lena image using proposed edge detector and proposed expansion method with expansion factor 2 79 Figure 4.41: Expansion of Cameraman image using pixel replication with expansion factor 2 80 Figure 4.42: Expansion of Cameraman image using linear interpolation wi th expan-sion factor 2 81 Figure 4.43: Expansion of Cameraman image using cubic interpolation wi th expansion factor 2 82 Figure 4.44: Expansion of Cameraman image using inverse Daubechies wavelet trans-form with expansion factor 2 83 84 Figure 4.46: Expansion of Cameraman image using proposed edge detector and pro-posed expansion method with expansion factor 2 85 86 Figure 4.49: 4 times reduction-expansion Figure 4.50: 4 times reduction-expansion using the linear interpolation using the cubic interpolation Figure 4.51: 4 times reduction-expansion using the Daubechies wavelet transform and inverse Daubechies wavelet transform Figure 4.52: 4 times reduction-expansion using the modified Daubechies wavelet transform and proposed expansion method 87 Figure 4.53: The original Cameraman im- Figure 4.54: Pixel replication after alt age with resolution 256x256 native downsampling 88 Figure 4.57: 4 times reduction-expansion using the Daubechies wavelet transform and inverse Daubechies wavelet transform Figure 4.58: 4 times reduction-expansion using the modified Daubechies wavelet transform and proposed expansion method 89 90 Figure 4.60: 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 91 92 Figure 4.62: 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 93 Figure 4.63: 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 94 Figure 4.64: 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 95 Figure 4.65: 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 96 Figure 4.66: 90 degree rotated side to side MPEG2 and proposed method at the bit rate 2.0Mbps 97 Chapter 5 Conclusions and Future Work This thesis addresses image reduction and edge-based expansion. Usually image re-sizing methods result in some blockiness, blurring and zigzag noises in the resized images or video. The objective of this thesis is to analyze classical image resizing methods and find better image resizing methods to reduce blockiness, blurring, and zigzag noises. Also one of the objective is to find matched reduction and expansion methods for better video compression results. We show that image enhancement based on edge detection is very helpful in reducing these noises and processing the images. This thesis covers the following topics. 5.1 Image reduction methods Classical image reduction methods such as alternative downsampling, average fil-tering, median filtering, and the Daubechies wavelet transform were analyzed. A modified Daubechies wavelet transform method is proposed. The reduced picture us-ing the modified Daubechies wavelet transform looks sharper, specially at the edge neighborhoods due to the addition of the high frequence components. 98 5.2 Comparisons of classical image expansion meth-ods In this part, we simulated and analyzed some popular expansion methods, pixel repli-cation, linear interpolation, cubic interpolation and the inverse Daubechies wavelet transform. The advantages and disadvantages of these methods provided us with usefull insights in developing our new edge detection and expansion methods. The pixel replication method generates blocky and zigzag edges pictures. The linear inter-polation method and the cubic interpolation method generate less blocky and zigzag edges but blurred images. The cubic interpolation method is the best among these three methods. The inverse Daubechies wavelet transform generates crisper pictures than those pictures expanded by the previous three methods. 5.3 Edge detection To expand an image, the most important part is how to expand the nonstationary region, specially the neighborhoods of edges. So edge detection is important and helpful for image expansion purposes. In this part, we reviewed several frequently used edge detection methods which were used for our proposed expansion method. A new computationally efficient edge detection method was proposed. This proposed edge detector has similar properties of closeness and one pixel width edges to the well-known Canny edge detector. This proposed edge detector is computationally efficient compared to the Canny edge detector. Also only one threshold value is needed for our edge detector against the choice of two threshold values of the Canny edge detection. Experiments showed a threshold value of 4 to 10 is good. The detected edge is one 99 pixel width for a ramp edge and two pixel width for a steep abrupt edge. If the threshold value is small enough, the edges of small details can be detected. In this respect, it is also similar to the Canny edge detector. In determining if a pixel is an edge pixel, geometical properties of pixel values are used to get one or two pixel width edges instead of the additional thinning process used in the Canny edge detection. 5.4 Proposed image expansion method In this part, we proposed an image expansion method based on edge detection. The Canny edge detector and our proposed edge detector can detect small weak edges when the threshold values are small enough and thus they are efficient for use by our edge-based image expanion method. Pictures using our proposed expansion method and our proposed edge detector are generated. We tested our results first on reduced then expanded images, and then on MPEG2 encoded decoded pictures. For the later, reduction was carried before encoding the video and then expansion after decoding it. The results showed that such a compression scheme using our new reduction before encoding, and our proposed expansion after decoding is a good application to video compression. This compression scheme can get 20% improvement of the bit rates compared to the standard MPEG2 compression results and also save 75% of the encoding time. The numbers provided here has been tested off-line using the Ligos MPEG2 encoder and the UBC H.263 encoder. The time saving can potentially allow MPEG2 software encoding to be processed in real time. 100 5.5 Future work • Testing the performance of our proposed image reduction (or proposed image expansion) method when combined with other expansion (or reduction) methods for video compression. • In our proposed expansion method we used linear interpolation as our basic expansion to first enlarge the image. It is good to see if there is other basic expansion method that can be used instead of linear interpolation so that the final expansion results outperforms the current one. • Examining how our edge detector can be used for post-processing of decoded images from MPEG2 compression or H.263 compression • Since our preprocessing saves 75% encoding time, applying the proposed image reduction and expansion to write a software encoder that runs in real time. • Our proposed image reduction, edge detection and image expansion has been al-ready tested in software. It would be a worthwhile project to use hardware tools to simulate our proposed reduction, edge detection and expansion algorithm to realize the algorithms in real time. 101 Bibliography [1] M. Abramowitz and I. A. Stegun, "Handbook of Mathematical Functions", Dover Edition, General Publishing Company, Toronto, Canada, 1965. [2] A. K. Murad Agha, R. K. Ward, and S. Zahir, "Image Expansion Using Segmentation-based method", Proc. IEEE Communications, Computers and Sig-nal Processing Conference, 1999. [3] G. H. Atwood, W. A. Davis, "Image Expansion Using Interpolation and Heuristic Edge Following", Proc. IEE Image Processing and its Applications Conference, Warwick, UK, 1989. [4] J. Canny, "A computational approach to edge detection", IEEE Trans, on Pattern Analysis and Machine Intelligence, PAMI-8, no.6, pp.679-698, 1986. [5] Jong-Ki Han and Seung-Ung Baek, "Parametric cubic convolution scaler for en-largement and reduction of image", IEEE Transactions on Consumer Electron-ics Vol. 46, Issue 2, pp. 247-256, May 2000. [6] Nicos Herodotou, A. N. Venetsanopoulos, "Image interpolation using a simple Gibbs random field model", Proc. International Conference on image processing, pp. 494-497, 1995. 102 [7] Bernd Jahne, Digital Image Processing, 4th edition, Springer-Verlag Berlin Hei-delberg, Germany, 1997. [8] Kreyszic, "Advanced Engineering Mathematics", sixth edition, John Wiley & Sons, 1988. [9] Chulhee Lee, M. Eden and M. Unser, "High-quality image resizing using oblique projection operators", IEEE Transactions on Image Processing, Vol.7, Issue 5, pp. 679-692, May 1998 [10] Ze-Ning Li and Gongzhu Hu, "Quantitative evalution on edge preservation in multiresolution image", Third International Conference on Image Processing and its Application, pp. 623-627, 1989. [11] D. Marr and E. C. Hildreth, "Theory of edge detection", Proc. of Royal Society London B., Vol. 207, pp. 187-217, 1980. [12] A. Munoz, T. Blu and M. Unser, "Efficient image resizing using finite differences", ICIP 1999, Vol.3, pp. 662-666, 1999. [13] F. G. B. De Natale, G. S. Desoli, D. D. Giusto and G. Vernazza, "A Spline-Like Scheme for Least-Square Bilinear Interpolation of Images", Proc. IEEE Acoustic, Speech, and Signal Processing Conference, New York, NY, USA, April 27-30, 1993. [14] W. K. Pratt, Digital Image Processing, New York, 1978. [15] M. Sakalli, Hong Yan, and A. M. N. Fu, "A fuzzy-Bayesian approach to image expansion", Proc. International Joint Conference on Neural Networks, 1999. 103 [16] K. Sugahara and R. Konishi, "Complex-log mapping approach to rotation and enlargement or reduction of digital inages and its performance evaluation", Pro-ceeding of the IEEE IECON 22nd International Conference on Industrial Elec-tronics, Control and Instrumentation", Vol. 3, pp. 1655-1660, 1996. [17] M. Unser, A. Aldroubi and M. Eden, "Enlargement or reduction of digital images with minimum loss of information", IEEE Transactions on Image Processing, Vol. 4, Issue 3, pp. 247-258, May 1995. [18] Saif Zahir, Shih-Wen Lee, and Morton Kanefsky, "A Segmentation Based Pre-diction Algorithm", Proc. IEEE Data Compression Conference, Snowbird, UT, USA, March 30-April 1, 1993. 104
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Image reduction and edge-based expansion
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Image reduction and edge-based expansion Shi, Hongjian 2002
pdf
Page Metadata
Item Metadata
Title | Image reduction and edge-based expansion |
Creator |
Shi, Hongjian |
Date Issued | 2002 |
Description | Image resizing plays an important role in image processing and video stream transmission, specially in the commercial video industries. The most imporstant point of image resizing is that the resized images or videos should keep similar key features such as geometric patterns and the closeness of edges of the original images so that they fit the perception of human vision. Image resizing consists of two parts: image reduction and image expansion. For image reduction purposes, some pixels should be removed or a unified shrinking transforms should be taken. For image expansion, some new pixel values should be added. Image resizing has been an exciting topic in digital image processing due to its extensive uses in industries. Several popular methods for image reduction and expansion methods have been proposed. For image reduction, popular methods are alternative downsampling, average filtering, median filtering and wavelet transform. For image expansion, pixel replication, linear inerterpolation, and cubic interpolation are frequently used. In this thesis, for image reduction, an observation on the image reduction method using Daubechies wavelet transform is made. This observation keeps high frequncy components in the reduced picture and so the reduced picture is obviously sharper than that of the top left corner picture of a wavelet transform. For image expansion, a computationally efficient edge dection method is developed. This new edge detection method generates edge pictures that are very similar to the edge pictures generated by the very known Canny edge detector in closeness, one pixel width and non-zigzagging. Furthermore, the computation of this new edge detection method is much less but more robust than the Canny edge detector. Based on this new edge detection method, a new image expansion method is proposed. This expansion method preserves the edge information very well. The generated picture using our proposed expansion method appears less zigzagged on the edge regions of the picture. The idea is to change the neighborhood pixel values of the edges in a picture expanded by a simple expansion method. The changed pixel values are fitter to human vision than the values generated by the simple expansion method. An important application of our reduction and expansion methods is in video compression. After image reduction, only 25% of the stream source is left for encoding. The encoding process can save 75% of the standard encoding time. Implementation and testing also show that overall 20% to 30% improvements over standard wellknown MPEG2 and H.263 methods are achived in acceptable low bits media stream transmmission using this new method. Above all, our proposed image reduction method gives better quality pictures than those generated by Dauchies wavelet transform, alternative downsampling, average filtering and median filtering methods. Our proposed expansion method outperforms the pixel replication, the linear interpolation and the cubic interpolation methods. It gives crisp and less zigzag pictures. Also these methods used for compression give better compression quality than the standard MPEG2 and H.263 methods. |
Extent | 6131229 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-08-14 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0065300 |
URI | http://hdl.handle.net/2429/12235 |
Degree |
Master of Applied Science - MASc |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2002-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_2002-0238.pdf [ 5.85MB ]
- Metadata
- JSON: 831-1.0065300.json
- JSON-LD: 831-1.0065300-ld.json
- RDF/XML (Pretty): 831-1.0065300-rdf.xml
- RDF/JSON: 831-1.0065300-rdf.json
- Turtle: 831-1.0065300-turtle.txt
- N-Triples: 831-1.0065300-rdf-ntriples.txt
- Original Record: 831-1.0065300-source.json
- Full Text
- 831-1.0065300-fulltext.txt
- Citation
- 831-1.0065300.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0065300/manifest