UBC Theses and Dissertations


A sonar imaging system Smayra, Michael K. A. 1992

Full Text


A SONAR IMAGING SYSTEM

by

MICHAEL K. A. SMAYRA

M.Sc., The University of British Columbia, 1992

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in THE FACULTY OF GRADUATE STUDIES, Department of Electrical Engineering

Vancouver, British Columbia

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
April 1992

Copyright © Michael Smayra, 1992

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

(Signature)

Department of Electrical Engineering
The University of British Columbia
Vancouver, Canada

Abstract

In this thesis, a sonar mine detection system is presented. The three stages of the system (image enhancement, edge detection, and data compression) are analyzed. For each stage, different approaches are studied and the appropriate algorithm is chosen.

For image enhancement, a cascaded morphological enhancement algorithm is developed. Our algorithm is shown to preserve high-detail regions, including edges surrounding features of concern (suspect mines) and their corresponding shadows.
This is accomplished without introducing any blurring, which is an inherent side effect of conventional enhancement algorithms.

For edge detection, amongst various methods, we have found the two-dimensional morphological edge detector, the Alpha-Trimmed Morphological (ATM) filter, to be immune to noise, a major problem encountered in sonar imagery; consequently, this filter has outperformed the other conventional edge detectors.

At the compression stage, three algorithms are examined. The Vector Quantization (VQ) method, using a new codebook generation algorithm which we have developed, is found to yield very high quality reconstructed images. The decompressed images have their high-detail regions well preserved; furthermore, the low-detail regions are minimally distorted, to the extent that the reconstructed images are virtually indistinguishable from the originals. With the new compression algorithm, a compression ratio of 10:1, or equivalently 0.8 bits per pixel (bpp), is achieved. Furthermore, the processing time for codebook generation is greatly shortened.

Contents

Abstract
List of Figures
List of Tables
Acknowledgment
1 Introduction
2 Side-Scan Sonar System
3 Image Enhancement
  3.1 Conventional Smoothing Algorithms
    3.1.1 Spatial Averaging
    3.1.2 Order Statistics Filtering
    3.1.3 Out-Range Noise Cleaning
  3.2 Cascaded Morphological Smoothing Filters
    3.2.1 Principles of Mathematical Morphology
    3.2.2 The Enhancement Filter
  3.3 Sonar Image Enhancement Results
4 Edge Detection
  4.1 Introduction
    4.1.1 Zero-Crossing of the Laplacian of Gaussian
    4.1.2 Step Function Edge Detector
    4.1.3 Histogram Thresholding
    4.1.4 Multidimensional Morphological Edge Detection
    4.1.5 Other Edge Detectors
  4.2 Multidimensional Morphological Edge Detector
  4.3 Results of Edge Detection
    4.3.1 Edge Detection Success Measurement
    4.3.2 Results Using Different Pre-Processing Enhancement Filters
    4.3.3 Results
5 Image Compression
  5.1 History and Definitions
    5.1.1 Predictive Coding
    5.1.2 Transform Coding
    5.1.3 Block-Based Coding
  5.2 Block Truncation Coding
  5.3 JPEG Algorithm
    5.3.1 JPEG Encoding
    5.3.2 JPEG Decoding
    5.3.3 Experimental Results
  5.4 Vector Quantization
    5.4.1 Introduction to Vector Quantization
  5.5 Summary of Compression Results
6 Conclusions
  6.1 Original Contribution
  6.2 Suggestions for Future Work
A References

List of Figures

Figure 1: Sidescan sonar apparatus.
Figure 2: Emitted sonar beam geometry.
Figure 3: Original sonar image.
Figure 4: A three-dimensional view of Fig. 3.
Figure 5: Fourier transform of Fig. 3.
Figure 6: Fourier transform of a noise-free image.
Figure 7: Effect of averaging on an edge.
Figure 8: Output image of a mean filter on the sonar image of Fig. 3.
Figure 9: Output of an order statistics filter using a 3x3 window and the 10th percentile.
Figure 10: Output of an order statistics filter using a 3x3 window and the 50th percentile (median filter).
Figure 11: Output of an order statistics filter using a 5x5 window and the 50th percentile (median filter).
Figure 12: Out-range noise cleaning, using a high threshold value.
Figure 13: Dilation of object A by structuring element B (from [6]).
Figure 14: Erosion of object A by structuring element B (from [6]).
Figure 15: Morphological opening (from [6]).
Figure 16: Morphological closing (from [6]).
Figure 17: Block diagram of the cascaded morphological enhancement filter.
Figure 18: Output of cascaded filter with a horizontal bar as structuring element.
Figure 19: Output of cascaded filter with a vertical bar as structuring element.
Figure 20: Output of cascaded filter using a square window as a structuring element.
Figure 21: Output of cascaded filter using a circular window as a structuring element.
Figure 22: Output of a zero-crossing edge detector using a narrow standard deviation range.
Figure 23: Output of a zero-crossing edge detector using a wide standard deviation range.
Figure 24: Output of a step function edge detector searching for intensity steps larger than 20.
Figure 25: Histogram thresholding.
Figure 26: Sensitivity of histogram thresholding: threshold changed by 0.1% of that of Fig. 25.
Figure 27: Block diagram of the "primitive" morphological edge detector.
Figure 28: Block diagram of the ATM edge detector.
Figure 29: ATM edge detection, with α = 1 and preceded by a morphological enhancement step.
Figure 30: ATM edge detection without the enhancement step.
Figure 31: ATM edge detection after spatial averaging.
Figure 32: ATM edge detection after out-range noise removal.
Figure 33: ATM edge detection after median filtering.
Figure 34: Parallel array shift architecture (from [9]).
Figure 35: Reconstructed sonar image after BTC encoding and decoding.
Figure 36: Block diagram of the JPEG encoder.
Figure 37: Zig-zag scanning used to get long sequences of zeros.
Figure 38: Block diagram of the JPEG decoder.
Figure 39: The zoomed original image (zooming factor = 2).
Figure 40: The zoomed reconstructed image after JPEG compression, with 5:1 compression ratio.
Figure 41: The zoomed reconstructed image after VQ compression, with 10:1 compression ratio.
Figure 42: The zoomed reconstructed image after JPEG compression with quality factor of 30 (10:1 compression).
Figure 43: The zoomed reconstructed image after VQ compression (10:1 compression).
Figure 44: Difference image after JPEG compression.
Figure 45: Difference image after VQ compression.

List of Tables

Table 1: Tabulated results of edge detection.
Table 2: Visual "measurement" using a quality factor of 75, for images 1 and 2.
Table 3: Visual "measurement" using a quality factor of 75, for images 3 and 4.
Table 4: Quantitative comparison of the JPEG compression algorithm.
Table 5: Results of the VQ method for images 1 and 2.
Table 6: Results of the VQ method for images 3 and 4.
Table 7: Quantitative comparison of the modified VQ compression algorithm.

Acknowledgment

I wish to express my sincere gratitude to Professor R. Ward for her advice, guidance, and many useful comments and suggestions throughout the duration of this thesis.

I wish to dedicate this work to my parents, Albert and Libuse, and to my sister Mona, who have been providing me with so much love and inspiration.

1 Introduction

In recent years, many countries have expressed interest in developing sonar systems that would make their harbors and coastal waters safer in case of emergencies such as wars. Sharing this concern, Canada established a multi-billion dollar Mine Sweeping Counter Measure (MSCM) project [1] for the development of such a system.
This system will require the development of many complicated sub-systems, such as sonar sensors and remotely controlled tow-ships. This thesis is concerned with a small but vital part of that wider project: image analysis and storage. Typically, an operator will monitor the bottom of the ocean by analyzing sonar images displayed in front of him or her on a screen. Since this will be done on a real-time basis, the operator will see the sea bed scrolling on the screen while the operating ship scans the area. Once these images are displayed, it is required that they then be compressed and stored on tapes for future use.

Specifically, this thesis examines the three major components of such a system, namely:

1. Image enhancement.
2. Edge detection.
3. Image compression and decompression.

[1] The Canadian government refers to this program as the Maritime Coastal Defense Vessel program.

The organization of this thesis is as follows. Chapter 2 provides an overview of the MSCM system with reference to its components and the way in which they interact. This chapter does not provide any new results or analysis, but introduces readers to the subject and establishes a logical framework within which subsequent chapters are placed.

One unfortunate characteristic of sonar images is their noisiness, and one of the challenges to overcome in the MSCM system is to identify and locate features (suspect mines) in the presence of a very noisy background. For this reason, before applying intelligent analysis tools such as edge detectors, a preliminary step is required. The purpose of this step is to enhance the image by suppressing noise and sharpening the edges that delimit features, paving the way for easy edge detection. Chapter 3 is devoted to this pre-processing step and to examining traditional enhancement algorithms. Many of these traditional algorithms use some sort of averaging in order to suppress noise.
Unfortunately, this has the side effect of blurring, since high spatial frequencies are lost or weakened. Although the averaging operation is used in a wide spectrum of applications, it is found to be inappropriate for sonar images, since it causes a blurring that removes small features, an outcome that is unacceptable by military standards (no suspect mine can be overlooked).

Morphological enhancement filters have been used in very noisy imaging applications, and they have led to promising results [15, 41, 43, 59]. We develop the cascaded morphological filter, which is shown to preserve the required features and at the same time to suppress the noise. Since the purpose of such enhancement filters is to aid the edge detection process, the "best" enhancement filter is chosen after applying edge detectors to the enhanced images.

The next step in the display process is to detect edges and overlay them on top of the original image. This edge detection operation makes the detection task easier for the operator, since the features are highlighted. In Chapter 4, we apply different edge detectors to the enhanced images and then compare the results. It is shown that the ATM edge detector outperforms all the other industry-used detectors. This filter is not recent; in fact, its roots go back to 1960, when Mitsubishi used it to locate defects in integrated circuit boards. Since then, a few research papers investigating the usefulness of this detector on a wide range of imagery have appeared in the literature. For sonar imagery, the good performance of the ATM detector is mainly attributed to its immunity to noise. Because most edge detectors are based on finding intensity steps or gradients, it is evident that these algorithms will fail in sonar image applications. This chapter ends with tabulated results of the different investigated algorithms.

Another important step in this system is to compress the sonar images and archive them on tape.
A high compression ratio is desirable, since this would cut down the cost of the tape required to store data representing the sea bed near harbors and coastal waters, which amounts to very vast areas. In Chapter 5 we compare three compression methods: Block Truncation Coding (BTC), the Joint Photographic Experts Group (JPEG) algorithm, and Vector Quantization (VQ). Excellent feature preservation and very low bit rates are achieved by using an improved VQ algorithm with a new codebook generation method which we developed. In addition to the high image quality obtained after reconstruction, the processing time of the codebook generation method is significantly shortened, making VQ our recommended compression/decompression algorithm.

Finally, we reserve Chapter 6 for concluding remarks, the original contributions of this thesis, and some ideas for future research.

2 Side-Scan Sonar System

A brief overview of the MSCM is provided here to introduce the system as a whole, and to frame the different chapters of this thesis.

The Canadian marine forces consider certain areas prone to mine attacks. Harbors must be secure at all times, especially in an emergency situation like a war, since they provide a vital link to the movement of warships and submarines, as well as to commercial ships providing food supplies and other essential commodities. Other areas considered strategically important are the shallow coastal waters, where mines could potentially sabotage the safe passage of ships.

The mine tracking and counter measure operation would function as follows: a ship would tow a towfish carrying two transducers, one on the port side and one on the starboard (the left and right sides of a ship).
Each transducer would emit pulses of sonar energy, which would travel through the water until they are reflected back to the towfish by objects in the water and by the sea bed.

Figure 1: Sidescan sonar apparatus.

The sonar pulses cover a long and narrow strip of the ocean floor, as illustrated in Figure 2.

Figure 2: Emitted sonar beam geometry.

The wavefront, travelling in circular waves, reaches the sea bed directly beneath the towfish first, and then hits it further and further away from the towfish. Due to propagation loss in water, the sound energy is rapidly attenuated as it travels greater distances. The echoes measured by the transducers describe the shape and composition of the sea bed. The reflected received pulses are then transformed into images. In these images, high intensities represent objects that are at higher altitudes than their surroundings, and low intensities represent areas that are obscured by rocks, mines, and other objects that may be lying on the sea bed.

As the towfish is pulled by the ship, or is remotely controlled by it, an operator positioned in the ship would monitor the sea floor, looking for features or suspect mines. In this thesis, we use the terms feature and suspect mine interchangeably, since they both refer to the same object. Monitoring these images could potentially be a slow and time-consuming process. In order to aid the operator in this task, an analysis tool is provided. This tool detects edges and highlights them to the operator. If the operator suspects the presence of a mine, he or she would be able to view, on a different screen, images of the sea bed taken on an earlier mission over the same region, and would be able to make a visual comparison to determine whether or not that object was previously present, or if it might be a newly deposited mine. All images scanned on a particular day are then compressed and saved on tapes. These tapes are re-displayed on the next mission over the same region, as explained earlier.
For this purpose, the decompression speed must work on a real-time basis.

The images used in this thesis are 256 by 256 pixels in size, and each pixel represents an area of approximately 100 centimeters square. The intensities of the pixels contained in these images range in value from 0 (black) to 255 (white), although these extreme values are seldom found, i.e. pixel values are not normalized.

3 Image Enhancement

The main objective of image smoothing is to accentuate and sharpen features contained in an image. In the case of sonar imagery, any suspect mine is considered a feature. Figure 3 depicts a sonar image that will be used throughout this thesis, since it contains three features: two are obvious, while the third may be noticed by its shadow only (upper left quadrant of the image). Each feature is recognized by its locally higher intensity and by its shadow, recognized by locally lower intensities and adjacency to the feature. Depending on the position of the suspect mine, the locally higher intensities may not be obvious, making the task of detection even harder.

Figure 3: Original sonar image.

As mentioned earlier, sonar images are inherently very noisy. The noise intensity ranges from high values comparable to those of the feature pixels to low values comparable to those of the shadow pixels. In trying to attenuate the noise, some feature and shadow information will be lost. To illustrate the high levels of noise in sonar images, Figure 4 shows a three-dimensional meshed view of the original image, and Figure 5 depicts its Fourier transform. From the meshed picture, it is obvious that the greatest difficulty in the smoothing process will be to attenuate the noise without attenuating the already degraded suspect mines. It is sometimes beneficial to study an image in the frequency domain, but from Figure 5 it is evident that the outcome of such a study would be limited.
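Such a frequency-domain inspection can be sketched with a naive 2D discrete Fourier transform. This is an illustrative sketch only, not part of the thesis pipeline: the O(N^4) loop restricts it to tiny made-up patches rather than the 256x256 sonar frames, and the two test images are hypothetical.

```python
import cmath

def dft2_magnitude(img):
    """Naive 2D DFT magnitude of a small grayscale image (list of lists).

    O(N^4): fine for tiny illustrative patches, far too slow for real frames.
    """
    rows, cols = len(img), len(img[0])
    mag = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            s = 0j
            for m in range(rows):
                for n in range(cols):
                    s += img[m][n] * cmath.exp(-2j * cmath.pi * (u * m / rows + v * n / cols))
            mag[u][v] = abs(s)
    return mag

# The (0, 0) bin is the DC component: the sum of all pixel intensities.
flat = [[10, 10], [10, 10]]        # constant, noise-free patch
noisy = [[0, 255], [255, 0]]       # checkerboard: pure high-frequency content
print(dft2_magnitude(flat)[0][0])  # -> 40.0, with every other bin zero
print(dft2_magnitude(noisy)[0][0]) # -> 510.0 (the rest of the energy sits in the highest-frequency bin)
```

The flat patch puts all its energy in the DC bin, while the checkerboard splits it between DC and the highest spatial frequency, mirroring the interpretation of Figures 5 and 6.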
The central spike in the middle of the image reflects the DC component, the neighboring region represents the low spatial frequencies present in the image, and all the other impulses represent spatial frequencies of edges mainly caused by noise. A noise-free image would contain the DC component and a very limited number of high spatial frequencies (impulses). Figure 6 illustrates the Fourier transform of a smooth image (the image of Lenna, from [55]). It can be noticed that it contains a high DC component and some low spatial frequencies, but a very limited number of high frequencies.

We first applied conventional smoothing methods to our sonar images and then applied edge detectors to them. Since the results were not satisfactory, we then developed our own smoothing filter, built on the image morphology approach. At this point, we would like to clarify the requirements imposed on such a detection system:

1. No suspect mines can be overlooked.
2. The number of false suspect mines (background pixels of high intensities) should be minimized.

Figure 4: A three-dimensional view of Fig. 3.

Figure 5: Fourier transform of Fig. 3.

Figure 6: Fourier transform of a noise-free image.

3.1 Conventional Smoothing Algorithms

There are several studies and algorithms that aim at the attenuation of noise. For our study, we first considered the following three conventional techniques:

1. Spatial Averaging [2][3][4].
2. Order Statistics Filtering [1][3].
3. Out-Range Noise Removal [2].

Because of the high intensity levels of noise in sonar images, it is almost impossible to discriminate noise pixels (background pixels of high intensities) from feature pixels, or even from background (low intensity) pixels. What identifies a feature is an aggregation of high intensity pixels, whereas noise pixels tend to be more isolated. Therefore, the techniques that we considered do not discriminate noise pixels; rather, they process all pixels in a unified fashion.
It should be evident that the smoothing process must be moderate, due to the stringent condition imposed on the detection scheme: not to miss any suspect mine.

3.1.1 Spatial Averaging

One of the simplest ways to smooth images is to use a spatial averaging filter. This image enhancement method replaces each pixel in the image by a weighted average of its neighborhood pixels:

$y(m,n) = \frac{1}{|W|} \sum_{k \in W} \sum_{l \in W} a(k,l)\, x(m-k,\, n-l)$

where x and y are the original and processed images, respectively, and W is a scanning window that traverses the original image. Although this smoothing filter tends to attenuate the noise, it also smooths the edges surrounding features and their respective shadows, an operation that is highly undesirable. Figure 7 illustrates the effect of averaging on an edge, which translates into blurring noticeable by the human eye.

Figure 7: Effect of averaging on an edge. Averaging attenuates high intensity jumps (spatial frequencies) between the suspect mine intensity level and the background level, which causes blurring.

Figure 8 shows the output image of a mean filter, using a 3x3 scanning window.

Figure 8: Output image of a mean filter on the sonar image of Fig. 3.

The main advantages of this linear filter are its simplicity and its effectiveness in noise suppression, but the fact that feature pixels are attenuated shows that spatial averaging is unsuitable for sonar images.

3.1.2 Order Statistics Filtering

Order statistics filters are a class of non-linear, translation-invariant filters that are very popular in digital image processing, because they can be applied to different types of imagery, ranging from medical images to meteorological images.
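The averaging filter above and the order statistics filters introduced here share the same sliding-window machinery; only the reduction applied to each window differs. A minimal sketch in plain Python on a list-of-lists image (leaving border pixels unchanged is an assumption, since the thesis does not state its border handling):

```python
def window_filter(img, size, reduce_fn):
    """Slide a size x size window over img and replace each interior pixel
    by reduce_fn(window pixels).  Border pixels are copied unchanged."""
    r = size // 2
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for m in range(r, rows - r):
        for n in range(r, cols - r):
            win = [img[m + dm][n + dn]
                   for dm in range(-r, r + 1)
                   for dn in range(-r, r + 1)]
            out[m][n] = reduce_fn(win)
    return out

def mean(w):
    # spatial averaging with equal weights (integer grey levels)
    return sum(w) // len(w)

def percentile(p):
    # order statistics: replace the centre by the p-th percentile of the
    # sorted window (p = 50 gives the median filter)
    return lambda w: sorted(w)[(len(w) - 1) * p // 100]

img = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]          # one isolated noise spike
print(window_filter(img, 3, mean)[1][1])           # -> 28: spike smeared, not removed
print(window_filter(img, 3, percentile(50))[1][1]) # -> 0: the median deletes the spike
```

The mean smears the isolated spike into its neighbourhood (the blurring objection raised above), while the median removes it outright, at the cost, on real sonar data, of the feature blurring discussed next.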
Oneother attractive feature of this class of filters is that they are easily implemented,and they suppress impulsive noise, while preserving the edges of the signal [1].As in the previous averaging filter, a scanning window is positioned at each pixelin the image, and the central pixel is replaced by a percentile (in a 3x3 scanningwindow, the 5th ordered pixel is the fiftieth percentile), after having ordered the17pixels in the window in increasing intensities. A special case of an order statisticsfilter is the median filter, where the central pixel in a window is replaced by the50th percentile of the ordered pixels. Figures 9, 10 and 11 represent samples ofthe filtered images. Using a low percentile caused high intensities (suspect mineshave high intensities) to be filtered out, and consequently loss of information. Onthe other extreme, using high percentiles caused low-intensity pixels (shadowsbelong to this set) to be removed. Compromising between these two extremes, amedian filter (50th percentile) produced blurred suspect mines. As shown earlier,blurring is caused by the disappearance of high spatial frequencies, which alsotranslates into the attenuation of suspect mines pixels.Figure 9 Output of an order statistics filter using a 3x3 window, and the 10th percentile.18Figure 10 Output of an order statistics filter using a 3x3 window, and the 50th percentile (Median filter).Figure 11 Output of an order statistics filter using a 5x5 window, and the 50th percentile (Median filter).19Out-Range Noise CleaningUnlike the conventional methods described above, this technique attempts toidentify noise pixels [2], and is therefore more selective. Each pixel is sequentiallyexamined, and if its intensity is greater than the average of its neighboring pixelsby some threshold, the pixel is defined as a noise pixel, and consequently itis replaced by the neighboring average. At first glance this technique seemedpromising, although experimental results showed otherwise. 
Figure 12 depicts a resulting image of this noise removal method.

Figure 12: Out-range noise cleaning, using a high threshold value.

Furthermore, choosing a constant threshold value is an undesirable operation, since the intensities contained in a sonar image may change from region to region, depending on the composition of the sea bed: sand seabeds have higher reflectiveness ratios than mud seabeds.

3.2 Cascaded Morphological Smoothing Filters

The morphological filtering of images was initiated by Matheron and Serra in the late 1960s, at the School of Mines in Paris, at Fontainebleau. Their early work culminated in the book Image Analysis and Mathematical Morphology [5], in which the set-theoretical methodology for image analysis was first introduced. The objective of mathematical morphology is the qualitative description of geometrical structures. Its foundations lie in the disciplines of integral geometry and geometrical probability, also drawing upon harmonic analysis and algebraic topology.

Mathematical morphology has been applied to a wide range of image analysis problems, including smoothing and edge detection. Some of these applications include range imagery feature extraction [59], geological sample analysis [5], and circuit board inspection [58]. Morphological operators are suited for very local, pixel-oriented operations such as noise removal. Several papers have been dedicated to noise removal using morphological operations [6-8].
The encouraging results described in those papers prompted us to pursue this smoothing method. A short overview of mathematical morphology is first provided below for readers who have not previously encountered this subject; the ideas presented are essential for the comprehension of this section and of the ATM edge detector section. Easy-to-read, morphologically oriented introductions can be found in [6, 9-15].

3.2.1 Principles of Mathematical Morphology

There are two fundamental morphological operations upon which the entire filtering process is founded: dilation and erosion. For ease of understanding, we will consider binary images (features have a value of 1, and the background is 0). Let A be a binary object and B be a mask, which mathematically is referred to as a structuring element. Let us define Minkowski addition and subtraction.

Let A and B be two sets. Minkowski addition is the following set-theoretical operation:

$A \oplus B = \bigcup_{b \in B} (A + b)$

$A \oplus B$ is constructed by translating A by each element of B and then taking the union of all the resulting translates.

Similarly, Minkowski subtraction is defined as the following operation:

$A \ominus B = \bigcap_{b \in B} (A + b)$

In this operation, A is translated by every element of B and then the intersection is taken. Based on Minkowski addition and subtraction, we can define the most basic morphological operations, namely dilation and erosion:

$D : Z^2 \to Z^2$, where Z is the set of integers, with

$D(A,B) = A \oplus B$

and

$E : Z^2 \to Z^2$, with

$E(A,B) = A \ominus B$

Geometrically speaking, dilation and erosion can be interpreted as follows:

1. Dilation: take the original object A, translate A by each element of B and then take the union of all the resulting translates.
2.
Erosion: take the original object A, translate A by each element of B and then take the intersection of all the resulting translates.

Perhaps the best way to illustrate these mathematical operations is with the following figures:

Figure 13: Dilation of object A by structuring element B (from [6]).

Figure 14: Erosion of object A by structuring element B (from [6]).

For completeness, a summary of Minkowski's properties is listed [6]; they resemble the usual Euclidean properties:

1. Commutativity: $A \oplus B = B \oplus A$
2. Associativity: $A \oplus (B \oplus C) = (A \oplus B) \oplus C$
3. Translation invariance: $A \oplus (B + x) = (A \oplus B) + x$, $x \in R^2$
4. Duality: $A \ominus B = [A^c \oplus (-B)]^c$
5. If $A_1 \subset A_2$, then $D(A_1, B) \subset D(A_2, B)$

For grey scale operations, the union and intersection operations are replaced by maximum and minimum operations [15]. Also, in the grey scale case, the object A represents the whole image. Based on the two primitive morphological operations, there are two additional operations, namely opening and closing, which are defined as follows:

$O(A,B) = [A \ominus (-B)] \oplus B$

where $-B$ is simply B rotated 180° around the origin, and

$C(A,B) = [A \oplus (-B)] \ominus B$

Employing erosion and dilation terminology, we have

$O(A,B) = D[E(A,B),\, B]$

and

$C(A,B) = E[D(A,-B),\, -B]$

The opening and closing are illustrated in Figures 15 and 16, respectively.

Figure 15: Morphological opening (from [6]).

Figure 16: Morphological closing (from [6]).

3.2.2 The Enhancement Filter

We applied morphological opening and closing to the original image, using three structuring elements: a 3x3 window, a 3x1 window, and a 1x3 window. We noticed that applying these filters gradually removed the noise, without greatly affecting the features contained in these images. Encouraged by these results, we used a cascaded morphological filter, with a block diagram as shown in Figure 17. The structuring element in each cascaded filter has the same shape but different dimensions.
Figure 17: Block diagram of the cascaded morphological enhancement filter: the original image passes through N cascaded closings, by a 2x1 structuring element, then a 3x1 structuring element, and so on up to an (N+1)x1 structuring element, producing the enhanced image.

The larger the structuring element, the more information loss occurs (Minkowski properties). But also, the larger the structuring element, the more noise is removed, at the expense, of course, of removing features as well. There is therefore a trade-off between noise removal and feature preservation. We noticed that using no more than 3 cascaded filters led to satisfactory results. Figures 18, 19, 20, and 21 show the enhanced images, using cascaded filters of order 3, with a horizontal bar (2x1, followed by 3x1, followed by 4x1 structuring elements), a vertical bar (1x2, 1x3, then 1x4), a square window (2x2, 3x3, then 4x4), and a circular window (radius 2, radius 3, then radius 4) as structuring elements. Two-dimensional and large structuring elements cause higher attenuation of noise, and also higher attenuation of features, than one-dimensional and small structuring elements. Since we want to remove noise in a moderate fashion, a one-dimensional window (bar) achieves our goal. Since features and shadows are usually longer than they are wide (particularly true for shadows, due to the geometry), the information loss caused by horizontal structuring elements has a lesser effect than that caused by vertical bars.

Figure 18: Output of cascaded filter with a horizontal bar as structuring element.

Figure 19: Output of cascaded filter with a vertical bar as structuring element.

Figure 20: Output of cascaded filter using a square window as a structuring element.

Figure 21: Output of cascaded filter using a circular window as a structuring element.

3.3 Sonar Image Enhancement Results

From an enhancement point of view, all the filters that we considered attenuated noise. In addition, they all attenuated feature and shadow pixels.
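As a concrete reference point for comparing these filters, the cascaded closing filter of Figure 17 can be sketched in a few lines. This is a minimal flat grey-scale sketch: skipping out-of-image offsets at the borders is an assumed border policy, and the one-row test image is made up.

```python
def _morph(img, se, op):
    """Flat grey-scale morphology: `op` is max for dilation, min for erosion;
    `se` is a list of (dy, dx) offsets.  Offsets falling outside the image
    are simply skipped (an assumed border policy)."""
    rows, cols = len(img), len(img[0])
    return [[op(img[m + dy][n + dx]
                for dy, dx in se
                if 0 <= m + dy < rows and 0 <= n + dx < cols)
             for n in range(cols)]
            for m in range(rows)]

def dilate(img, se): return _morph(img, se, max)
def erode(img, se):  return _morph(img, se, min)

def close(img, se):
    # Closing: dilation, then erosion by the 180-degree reflection of se.
    reflected = [(-dy, -dx) for dy, dx in se]
    return erode(dilate(img, se), reflected)

def hbar(k):
    """Horizontal 1 x k bar as offsets around the origin."""
    return [(0, dx) for dx in range(-(k // 2), k - k // 2)]

def cascaded_close(img, stages=3):
    """Cascade of closings by growing horizontal bars: 2x1, 3x1, 4x1, ..."""
    for k in range(2, 2 + stages):
        img = close(img, hbar(k))
    return img

# One row: a one-pixel dark pit (noise) and a wider bright region (feature).
row = [[50, 50, 50, 50, 0, 50, 50, 50, 200, 200, 50, 50, 50, 50]]
print(cascaded_close(row)[0])
# -> [50, 50, 50, 50, 50, 50, 50, 50, 200, 200, 50, 50, 50, 50]
```

Each closing fills dark gaps narrower than its bar, so the isolated dark pixel disappears while the two-pixel bright feature survives the whole cascade, the feature-preservation behaviour the comparison below is concerned with.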
However, some attenuated these informative pixels more than others. The goal of the enhancement step is to remove noise while preserving features as much as possible. This preservation of features is needed by the edge detectors, since they base their search on the edges that surround features.

4 Edge Detection

4.1 Introduction

After having preprocessed the sonar images with a smoothing filter, the task of locating features, or equivalently suspect mines, is tackled. Much research has been done on this subject [18-40]. Edge detection can be defined as the process that locates the boundary between two adjacent regions of an image which differ in some characteristic. In this thesis, this translates into edges that surround high-intensity objects (suspect mines) and adjacent low-intensity regions (shadows). No single edge detector (ED) algorithm will work successfully on all types of images. Each class of imagery, including optical, biomedical, meteorological, and sonar, has different characteristics that require specialized algorithms for locating the edges contained in those images. The task of finding edges is more complex, and sometimes impossible, if noise is present. Since sonar images are noisy, we investigated edge detection research papers concerned with overcoming the noise problem. In addition, there are different kinds of noise, such as white Gaussian, impulsive, additive, and multiplicative. Therefore, we were very selective in finding an ED that might suit our set of sonar imagery.

Suk and Hong [17] described a new edge extraction technique specifically developed for noisy images.
A couple of years later, another paper was presented by Feehs and Arce [41], which also dealt with immunity to noise.

At the beginning of our research, we felt that only a few edge detectors might be suitable for the sonar set, and out of 4 promising methods briefly described below, only one was identified as successful for the sonar images application.

Zero-Crossing of the Laplacian of Gaussian

This technique was introduced by Marr and Hildreth [30]. It is based on combining zero-crossings from the output of a Laplacian of Gaussian filter for several spatial frequency channels. The Gaussian filter is first used for smoothing the image, and is followed by the Laplacian, which is a nondirectional second-derivative 2D operator. Since the Gaussian is a low-pass filter, the smoothing step of this method suppresses noise and edges indiscriminately. The outcome is a blurred image, which is what we tried to avoid in the previous chapter, since this information loss is non-recoverable.

Mathematically, the Laplacian of Gaussian operator is defined as:

\nabla^2 G(x, y) = \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4} \, e^{-\frac{x^2 + y^2}{2\sigma^2}}

where \sigma is the standard deviation of the Gaussian.

Although this approach proved to be successful in many applications, and many improvements have been appended to this ED [24, 31], in the sonar images case it failed. The next figures show the output of this edge detector using narrow and wide standard deviation ranges for the Gaussian processing.

Figure 22 Output of a zero-crossing edge detector using a narrow standard deviation range.
Figure 23 Output of a zero-crossing edge detector using a wide standard deviation range.

The failure is attributed to the high levels of noise, i.e. the comparable intensities of feature and noise pixels on the one hand, and of shadow and noise pixels on the other. The space constant \sigma of this filter was made very large in order to ignore the small edges caused by noise pixels. But even with a large \sigma, the resulting image did not meet our preset specifications (see FIG.
23).

Step Function Edge Detector

As its name suggests, the step function edge detector's search is oriented towards finding an intensity step. After implementation, we realized that this technique is highly inappropriate for sonar images. The three-dimensional view of the image (fig. 4) shows how many step functions it contains. The task of labeling the step functions that correspond to the edges that surround features and shadows is impossible. Figure 24 illustrates how inappropriate this method is.

Figure 24 Output of a step function edge detector searching for intensity steps larger than 20.

Histogram Thresholding

Easy in concept and easy to implement, this method bases its search for edges on thresholding. It does not specifically look for edges; rather, it looks for features and shadows. Therefore, histogram thresholding is not a true edge detector, but rather a feature locator. Nevertheless, in this thesis it was classified as an edge detector since it leads to the same outcome: locating suspect mines. The histogram of the original image is built, and two values are extracted from it: a lower threshold, chosen from the low-intensity end of the histogram, and an upper intensity threshold, chosen from the high-intensity end. The filtering proceeds as follows: if the intensity of any given pixel is either lower than the lower threshold or higher than the upper threshold, then the pixel is unchanged; otherwise, a value of 128 (grey value) is assigned for background distinction.

After thresholding the image, many noise pixels were filtered out, but some isolated ones remained. In order to remove these isolated noise pixels, a morphological opening was performed. Figure 25 illustrates the output of this filter. It appears that the drawback of this technique is its use of thresholding values, an operation that is not recommended, since the choice of threshold values is highly dependent on the image under study.
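The thresholding rule just described can be sketched as follows. The percentile cut-offs used to pick the two thresholds from the histogram are assumptions for illustration; the thesis extracts its thresholds from the histogram ends without stating exact fractions:

```python
import numpy as np

def histogram_threshold(image, low_frac=0.02, high_frac=0.98):
    """Keep only the darkest pixels (shadow candidates) and the
    brightest pixels (feature candidates); map everything else to
    grey (128) for background distinction."""
    lo = np.percentile(image, 100 * low_frac)   # lower threshold
    hi = np.percentile(image, 100 * high_frac)  # upper threshold
    out = np.where((image < lo) | (image > hi), image, 128)
    return out.astype(image.dtype)

img = np.arange(256, dtype=float).reshape(16, 16)  # toy intensity ramp
out = histogram_threshold(img)
```

Because the thresholds are read off the image's own histogram, the rule itself is operator-independent, but the result remains sensitive to where exactly the cut-offs fall.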
In this case, however, both upper and lower thresholds were extracted from the histogram of the image, which is operator-independent. Just to illustrate the sensitivity of thresholding, Figure 26 shows the output of a histogram thresholder with threshold values that differ by 60 (0.1% deviation) pixels from the ones used in Figure 25.

Figure 25 Histogram thresholding.
Figure 26 Sensitivity of histogram thresholding: threshold changed by 0.1% of that of FIG. 25.

Multidimensional Morphological Edge Detection

For noisy images, this class of edge detectors outperformed most mask- and differentiation-based edge detectors [7]. The reason for this technique's success stems from the fact that it is based on analyzing the geometric structures contained in images. Lee, Haralick, and Shapiro [8] introduced a more primitive morphological edge detector, which showed considerably better results at low signal-to-noise ratios than any of the previously developed edge detectors. We found that this approach gave us encouraging results, better than any other edge detector that we considered. Section 4.2 is dedicated to the analysis of this ED.

Other Edge Detectors

Other edge detection work has concentrated on finding different smoothing filters to deal with noise [17, 18], or on optimizing positional accuracy by using directional operators [26, 27].

Chen, Huertas and Medioni [26, 37] have developed an iterative approach for the removal of juxtaposed edges. A fairly recent and sophisticated scheme was published by Canning, Kim, and Rosenfeld [40], which consists of two passes, one for the search of 1-pixel ridges, and the other for "double checking" purposes. Other sophisticated edge detectors were suggested but, as we mentioned earlier, our judgement was that they were inappropriate for sonar imagery.

2 Multidimensional Morphological Edge Detector

As mentioned in the previous section, we found the multidimensional morphological edge detector to be suitable for sonar imagery. A more primitive type of edge detector of the same category was designed and tested by Lee, Haralick
A more primitive type ofedge detector of the same category was designed and tested by Lee, Haralick39DilationErosion...Blurring step Min—OB.and Shapiro [8] in 1986. In their paper, they applied this ED to noisy images, butnot comparable to the amount of noise present in sonar images. This ED was effi-cient and robust, and more importantly immune to noise (small amounts of noise).It consisted of two major processing stages. The first step was to blur the image,using a blurring filter, which had a side effect of attenuating the noise present inthese images. It also caused feature loss, which was tolerated in [8], since thenoise level was small. The second step consisted of morphological operations.Figure 27 illustrates these steps:Figure 27 Block diagram of the "primitive" morphological edge detector.A similar edge detector was later introduced by Feehs and Arce [7], the Alpha-Trimmed Morphological Edge Detector. In this filter, the image is first prepro-cessed through an Alpha Trimmed operator [42], and the detection is done with40more sophisticated cascaded morphological filters. Perhaps the best way to illus-trate the idea behind this edge detector is by an example: (for simplicity, considerone line of the image with the following intensities)noiseEdge withnoise pixels:noiseOriginal: 3 3 9 3 3 3 3 3 9 9 9 9 9 3 9 9Blurred: 3 5 5 5 3 3 3 5 7 9 9 9 7 7 7 9Dilation: 5 5 5 5 5 3 5 7 9 9 9 9 9 7 9 9Erosion: 3 3 5 3 3 3 3 3 5 7 9 7 7 7 7 7Closing: 3 5 5 5 3 3 3 5 7 9 9 9 7 7 7 9opening: 3 5 5 5 3 3 3 5 7 9 9 9 7 7 7 9lower branch: 2 0 0 0 2 0 2 2 2 0 0 0 2 0 2 0upper branch: 0 2 0 2 0 0 0 2 2 2 0 0 0 0 0 2Output of ED: 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 0Edge location:41DilateErodeUpper BranchMina-trimDilate^ ErodeLower BranchFigure 28 Block diagram of the ATM edge detector:The results of this ED are shown in Figure 29. The appropriate valueof a was easily determined. Relatively large values of a would cause thea—trimmed filter to behave like a spatial averaging filter. 
As shown in the concluding remarks of this chapter, spatial averaging followed by ATM edge detection performed poorly. Choosing small values of \alpha gave much better results (FIG. 29). Specifically, a value of \alpha = 1 gave the best results.

Figure 29 ATM Edge Detection, with \alpha = 1 and preceded by a morphological enhancement step.

Mathematically, the ATM edge detector can be defined as follows. Let the input and output images of this ED be denoted f_i(z) \in Z^2 and f_o(z) \in Z^2, respectively. Moreover, let f(z) \in Z^2 denote the alpha-trimmed mean blur of the input image, defined as

f(z) = \sum_{i=1+\alpha}^{k-\alpha} \frac{f_{(i)}(z)}{k - 2\alpha}

where k = (2n+1)^2 is the number of samples spanned by the basis function B, n is the order of the filter, and f_{(i)}(z) is the ith smallest sample of f_i(z) spanned by B. In simpler terminology, f_{(i)}(z) is the ith ordered pixel in a scanning window of dimension k (for example, if the scanning window is 3x3, then k = 3x3 = 9). The ATM edge detector output is defined as

f_o(z) = \min \left\{ \left[ f_B(z) - (f \ominus B)(z) \right], \; \left[ (f \oplus B)(z) - f^B(z) \right] \right\}

where (f \oplus B)(z) and (f \ominus B)(z) are, respectively, the dilation and erosion of the image f(z) with the structuring element B, f_B is an erosion followed by a dilation operation (an opening), and f^B is a dilation followed by an erosion operation (a closing).

A more elaborate analysis of morphological edge operators can be found in [43]. Just to illustrate the need for a smoothing step before applying an ED, we have included the output of the ATM ED applied to the original image in Figure 3. Figure 30 shows the result of that operation, where it can be seen that many false edges (edges caused by noise pixels) have been detected. These results violate the specifications of this system.

Figure 30 ATM Edge Detection without the enhancement step.

3 Results of Edge Detection

Edge Detection Success Measurement

Our measurement of success for edge detection filters was based on the detection requirements introduced earlier. These are:
1. No suspect mines can be overlooked.
2.
The number of false suspect mines (background pixels of high intensities) should be minimized.

A feature was considered successfully extracted if at least 2/3 of the perimeter of the feature was detected. We also applied a continuity judgement: the continuity constraint means that a feature must not have large holes (relative to the extracted perimeter segments).

Results Using Different Pre-Processing Enhancement Filters

Since all edge detectors, with the exception of the ATM edge detector, failed to meet the requirements of this system both with and without the enhancement step, we chose to compare the different smoothing filters using the ATM edge detector only. Figures 31, 32 and 33 show the results of the ATM edge detector after spatial averaging, out-range noise removal, and median filtering, respectively. It can easily be concluded that these pre-processing filters attenuated feature and shadow edges more than the cascaded morphological filter did. This edge attenuation caused the ATM edge detector to fail to meet the requirements.

Figure 31 ATM Edge Detection after spatial averaging.
Figure 32 ATM Edge Detection after out-range noise removal.
Figure 33 ATM Edge Detection after median filtering.

Results

In preparing for this stage of our research, we studied many technical papers describing "new" edge detectors that work well in the presence of noise; however, we found no papers that dealt with images as noisy as those we were considering.

After extensive testing, it was found that the ATM edge detector extracted all the edges that surround features and their corresponding shadows. All detection operations were performed on the enhanced images that were discussed in Chapter 3. Without a smoothing step, all edge detectors, including the ATM edge detector, performed poorly since they introduced false suspect mines (by false suspect mines, we mean background pixels that have high intensities). Therefore, the preprocessing enhancement step is mandatory.
Obviously, the success of this edge detector is attributed to its immunity to high levels of noise. Another factor that is crucial in this filter's success is the choice of the structuring element. In [9, 10, 16], it is shown that the structuring element must belong to a subset of eroded features, a product of skeletization, or in simple terms, must be "similar in shape" to the feature. This filter has another important advantage over conventional filters: it is efficient. It could be implemented in a real-time image processing operation, which is required by the Mine Counter Measure System (cf. Chapter 2). Two efficient implementations of morphological filters are described in [9] and [11]. The practical method for performing morphological transformations takes advantage of the parallel shifting array architecture, as depicted in Figure 34.

Figure 34 Parallel array shift architecture (from [9]): a master control unit issues shift instructions to a shifting array holding the active image; the structuring element is held in a reference array, and the result, addressed by black pixel coordinates, is accumulated in a stationary array.

The following table summarizes the results of four edge detectors, two of which showed good results, while the other two were incompatible with images as noisy as sonar images.

Edge Detector                 | Advantages                                   | Disadvantages
Laplacian of Gaussian         |                                              | 1. Too many false edges. 2. Slow edge detection (20 seconds)
Step Function                 |                                              | 1. Too many false edges
Histogram Thresholding        | 1. Fast algorithm (2 seconds)                | 1. The value of the threshold is picture-dependent. 2. Extremely sensitive to deviations from the "right" threshold
Multidimensional Morphological| 1. Detected all feature edges (with our      |
Edge Detector                 |    pre-processing enhancement step).         |
                              | 2. Fast algorithm (3 seconds)                |

Table 1 Tabulated results of edge detection.

5 Image Compression

1 History and Definitions

Motivation for the development of data compression algorithms stems from a continuing demand for cost-effective and higher-capacity data storage.
Since one of the goals of the MSCM is to store a library of images that would cover the major Canadian ports and coastal waters, it is mandatory to compress the images in order to render that goal feasible. In addition, this goal would prove to be very costly if some sort of data compression were not employed.

Although the principle of data compression is not new, there has been worldwide renewed interest, triggered by the recent deployment of the Integrated Services Digital Network (ISDN) and the need for a global image compression standard. The field of data compression can be partitioned into two fields: lossless and lossy coding. Even though lossless compression is used in applications that cannot tolerate information loss, the achieved compression ratio is small, in the order of 2 to 1. Such a compression ratio is too small for this application, and information loss can be tolerated as long as no features are lost after reconstruction.

Lossy compression algorithms achieve much higher compression ratios, obviously at the expense of image degradation after reconstruction. Lossy compression techniques fall into three categories:

1. Predictive Coding.
2. Transform Coding.
3. Block-based Coding.

Predictive Coding

In a predictive image coding system, the value of each pixel is predicted based on the history of previously processed pixels. The estimate is then subtracted from the actual pixel value, and the difference is quantized. In the reconstruction process, a similar prediction is done, which should produce the same prediction obtained at the encoder; the quantized difference is added to the prediction, leading to an estimated value very close to the original pixel value.

Transform Coding

This method is simple in concept and yields good compression ratios. It takes advantage of the high energy concentration in the lower coefficients of the image representation in the transformed domain. This coding technique has become one of the most widely used techniques in the industry.
The International Consultative Committee for Telephone and Telegraph (CCITT), the international body in charge of the standardization of image compression and transmission (ISDN application), has adopted a compression technique that is based on transform coding [44, 45]. Recently, the Joint Photographic Experts Group (JPEG) has submitted a proposal towards establishing the first international digital image compression standard for still greyscale and colored images [46]. The proposed algorithm uses the discrete cosine transform as one of its major processing steps.

Block-Based Coding

Block Truncation Coding (BTC) and Vector Quantization (VQ) are the two main coding techniques that fall under this category. In these methods, the image is subdivided into non-overlapping squares, which are sometimes referred to as blocks or vectors. Instead of processing each pixel individually, each block is considered as one entity. BTC is a relatively old technique which seems to have reached its maturity in terms of research [47-50]. Nonetheless, its simplicity and ease of implementation make it a very attractive coding/decoding scheme.

A more recent, and still highly researched, block-based scheme is Vector Quantization. This scheme maps a sequence of blocks, mathematically referred to as vectors, into a digital sequence.

In the above paragraphs, we have very briefly described the different compression schemes that have been in use or are still being studied by the research community. Since our goal is to compress sonar images, we opted for studying the following compression methods:

1. BTC
2. JPEG
3. VQ

2 Block Truncation Coding

BTC is an image coding method introduced in 1979 by Delp and Mitchell [47]. As mentioned earlier, the idea behind this coding method is simple: each image is divided into 4x4 non-overlapping blocks of pixels, and each block is coded individually.
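The per-block scheme detailed in the next section (mean, variance, and a 16-entry bit plane, with a two-level reconstruction that preserves the block's first two moments) can be sketched as follows; this is a minimal illustration of classical Delp-Mitchell BTC, not the thesis' adaptive variant:

```python
import numpy as np

def btc_encode(block):
    """BTC for one 4x4 block: keep the mean, the standard deviation,
    and a bit plane marking pixels at or above the mean."""
    mean, std = block.mean(), block.std()
    bits = block >= mean
    return mean, std, bits

def btc_decode(mean, std, bits):
    """Two-level reconstruction preserving the block mean and variance."""
    q = int(bits.sum())            # number of pixels at or above the mean
    m = bits.size                  # 16 for a 4x4 block
    if q in (0, m) or std == 0:    # flat block: reproduce the mean
        return np.full(bits.shape, mean)
    a = mean - std * np.sqrt(q / (m - q))      # level below the mean
    b = mean + std * np.sqrt((m - q) / q)      # level above the mean
    return np.where(bits, b, a)

block = np.array([[2, 2, 2, 2],
                  [2, 2, 2, 2],
                  [8, 8, 8, 8],
                  [8, 8, 8, 8]], float)
rec = btc_decode(*btc_encode(block))
```

A block that already has only two intensity levels is reconstructed exactly; for general blocks the reconstruction matches the mean and variance but not the individual pixels.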
The mean and the variance of each block are stored, along with 16 bits, each bit taking the value 0 or 1 depending on whether the corresponding pixel is less than or greater than the mean of that block. Only 4 bytes are needed to represent the block (one byte for the mean, one for the variance, and 16 bits as explained above), instead of 16 bytes, assuming the values of the pixels range from 0 to 255. Therefore, each smooth block leads to a compression ratio of 16:(2+2), or 4 to 1.

At the reconstruction stage, if the block was smooth, then each pixel that had a value higher than the mean of that block is assigned the value

B = \bar{x} + \sigma \sqrt{\frac{16 - q}{q}}

and each pixel that had a value below the mean of that block is assigned the value

A = \bar{x} - \sigma \sqrt{\frac{q}{16 - q}}

where \bar{x} and \sigma are the block mean and standard deviation, and q is the number of pixels above the mean.

Nasiopoulos, Ward and Morse [50] devised an adaptive BTC algorithm which preserves edges: the algorithm calculates a smoothness measure (the variance of the block), and if the block is not smooth enough (the variance is compared against a user-selected threshold), then all 16 pixels are stored without modification. If the block is smooth, then BTC is used.

Applying this scheme to sonar images, we realized that it is highly inadequate for our application. The achieved compression ratio was of the order of 2 to 1, because only a few blocks were smooth, due to the presence of noise. In addition to the small compression ratio, the quality of the reconstructed image was very poor, as illustrated in Figure 35.

Figure 35 Reconstructed sonar image after BTC encoding and decoding.

3 JPEG Algorithm

As mentioned in an earlier section, JPEG is a transform coding scheme, and furthermore a proposed still-image compression standard that has been submitted to the CCITT. Even though this proposed standard has not yet been approved, the industry has already implemented chips that execute the JPEG compression and decompression algorithms [51].
In this thesis, we used the most recent software release [version 2.3] of this algorithm, which was generously made available to the public by JPEG. We did not modify any part of the algorithm; instead, we used it to compare results against the VQ method as modified by us to suit sonar imagery. For completeness, we present a brief overview of the different processing steps of the JPEG coding/decoding algorithm. Figures 36 and 38 depict the block diagrams of the encoder and the decoder, respectively.

JPEG Encoding

Figure 36 Block diagram of the JPEG encoder: the source image data is split into 8x8 blocks and passed through the DCT, a quantizer (controlled by a quality factor), and an entropy encoder, yielding the compressed image data.

In the encoding step, the image is first divided into 8x8 non-overlapping blocks, and the Discrete Cosine Transform is performed on each block, using the following equation:

F(u, v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}

where C(u), C(v) = 1/\sqrt{2} for u, v = 0 and 1 otherwise, and f(x, y) is the pixel value.

The output of the DCT is a matrix of coefficients that represent the energy in the 2D frequency domain. Generally, most of the energy is concentrated at the upper left corner of that matrix (in other words, the weights assigned to the low harmonics are significantly higher than those of the high harmonics). If this matrix is scanned in a zig-zag manner, starting at the upper left corner and ending at the bottom right corner, as shown in Figure 37, then the resultant sequence will contain long strings of zeros, especially towards the end of the sequence.

Figure 37 Zig-zag scanning used to get long sequences of zeros.

At this point, a quantization step is performed in order to achieve further compression by representing the DCT coefficients with less precision, which causes a slight degradation in the image. The quantized resultant sequence is then represented by the non-zero terms and the number of consecutive zero terms.
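The zig-zag scan and a simplified quantizer can be sketched as follows. A single uniform quantization step is an assumption for illustration; the actual JPEG algorithm uses a full 8x8 quantization table scaled by the quality factor:

```python
import numpy as np

def zigzag(block):
    """Scan an 8x8 coefficient matrix in zig-zag order: traverse the
    anti-diagonals i + j = 0..14, alternating direction so low
    frequencies come first and high frequencies trail."""
    n = block.shape[0]
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda ij: (ij[0] + ij[1],
                                   ij[0] if (ij[0] + ij[1]) % 2 else ij[1]))
    return np.array([block[i, j] for i, j in order])

def quantize(coeffs, step=16):
    """Uniform quantization: represent each coefficient with less
    precision by rounding to the nearest multiple of the step."""
    return np.round(coeffs / step).astype(int)
```

Scanning the coefficient matrix this way places the small, mostly-zero high-frequency terms at the tail of the sequence, which is what makes the run-of-zeros representation effective.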
To further increase the compression ratio, lossless entropy coding (Huffman, for example) is then used to encode the number of zero terms in every sequence of zeros. The released encoding algorithm uses two user-specified parameters. The first parameter specifies the quantization step, which translates into a quality factor ranging in value from 0 to 100, where a value of 100 represents minimal compression. The other parameter specifies which of the two available entropy coding methods (Huffman or arithmetic entropy coding) is to be used.

JPEG Decoding

Figure 38 Block diagram of the JPEG decoder: the compressed image data passes through an entropy decoder, a dequantizer, and the inverse DCT, yielding the reconstructed image data.

Similarly to the encoding step, except in reverse order, entropy decoding is first performed, followed by a dequantization step, followed by the Inverse DCT:

f(x, y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u) C(v) F(u, v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}

where C(u) and C(v) are as before.

Experimental Results

As discussed before, there are two parameters that control the encoder: the quality factor parameter, related to the required compression ratio, and the entropy coding algorithm parameter. We noticed that using arithmetic coding leads to lower compression ratios than Huffman coding, without introducing any visual degradation in the reconstructed images. For the compression ratio parameter, we decided upon a value that caused the reconstructed image to be minimally distorted, and therefore used a quality factor of 75. As subjective as it may seem, our main concern was to preserve all features. Apart from this visual "measurement", we used the Root Mean Squared (RMS) error as a measure of success. We also took into consideration the performance of the encoding and decoding stages. Tables 2 and 3 illustrate the visual measurement: the left-hand column contains the original images, whereas the right-hand column contains the reconstructed images, using a quality factor of 75.
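The DCT/IDCT pair given above can be checked numerically. The sketch below uses SciPy's ortho-normalized 2D DCT, which agrees with the stated equations up to constant factors, and an assumed uniform quantization step to show that quantization is the only lossy stage:

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(float)

# forward 8x8 DCT followed by the inverse: an exact round trip
F = dctn(block, norm='ortho')
back = idctn(F, norm='ortho')

# quantization is the lossy step: coarser steps lose precision
Fq = np.round(F / 16) * 16          # assumed uniform step of 16
approx = idctn(Fq, norm='ortho')
err = np.sqrt(np.mean((approx - block) ** 2))   # RMS reconstruction error
```

Because the ortho-normalized transform preserves energy, the RMS pixel error equals the RMS coefficient error introduced by rounding, so it stays well below half the quantization step's worst case.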
For a higher compression ratio, the degradation of the reconstructed image was evident, and consequently we decided not to include those results in this thesis, except for comparison purposes. Table 4 summarizes the measure of success for each image (4 images in total), along with some performance measurements.

Table 2 Visual "measurement" using a quality factor of 75, for images 1 and 2 (original sonar images on the left, JPEG-reconstructed images on the right).

Table 3 Visual "measurement" using a quality factor of 75, for images 3 and 4 (original sonar images on the left, JPEG-reconstructed images on the right).

Original sonar image | RMS error* | Compression ratio | Bits per pixel | Encoding time | Decoding time
Image 1              | 5.66       | 5.0:1             | 1.59 bpp       | 5 seconds     | 11 sec.
Image 2              | 9.11       | 3.3:1             | 2.37 bpp       | 5 seconds     | 12 sec.
Image 3              | 7.13       | 5.2:1             | 1.54 bpp       | 5 seconds     | 10 sec.
Image 4              | 7.23       | 5.0:1             | 1.59 bpp       | 5 seconds     | 12 sec.

Table 4 Quantitative comparison of the JPEG compression algorithm.
* RMS error between the original and reconstructed image; the maximum RMS error is 256.62.

These tests were carried out on a Sun 4/60 computer, with the following specifications:
1. 1.8 Specification Marks (SM)
2. 12 Million Instructions per Second (MIPS)
3. 1.9 Mega FLoat OPerations (MFLOPS)

4 Vector Quantization

Introduction to Vector Quantization

Originally used for speech coding, the vector quantization method now dominates research in image compression. Many papers have appeared on this subject, some of them aiming at improving the quality of the reconstructed images, and others aiming at making the encoding step faster and more efficient. We found vector quantization attractive for two reasons: the image reconstruction is faster than with any other reconstruction algorithm, and the compression ratio that can be achieved by this scheme is larger. The drawback is the slow encoding speed.

In this thesis, we are introducing a new codebook generation algorithm.
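The block-to-codeword mapping that VQ performs, detailed below, can be sketched as follows. Squared Euclidean distance is used here as the distortion measure, standing in for the (Hamming) distance discussed in the text:

```python
import numpy as np

def to_vectors(image, bs=4):
    """Split an image into non-overlapping bs x bs blocks, each
    flattened into a (bs*bs)-dimensional vector."""
    h, w = image.shape
    blocks = image.reshape(h // bs, bs, w // bs, bs).swapaxes(1, 2)
    return blocks.reshape(-1, bs * bs)

def vq_encode(vectors, codebook):
    """Map each image vector to the index of its nearest codeword
    (squared Euclidean distance assumed as the distortion measure)."""
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def vq_decode(indices, codebook, shape, bs=4):
    """Reconstruct by plain codebook lookup; decoding is just
    indexing, which is why VQ decompression is so fast."""
    h, w = shape
    blocks = codebook[indices].reshape(h // bs, w // bs, bs, bs)
    return blocks.swapaxes(1, 2).reshape(h, w)

# toy image: one bright 4x4 block in a dark field, two codewords
img = np.zeros((8, 8))
img[:4, :4] = 10.0
codebook = np.stack([np.zeros(16), np.full(16, 10.0)])
idx = vq_encode(to_vectors(img), codebook)
rec = vq_decode(idx, codebook, img.shape)
```

The stored representation is one codebook index per block plus the codebook itself, which is where the compression comes from.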
This algorithm, when used with VQ, results in large compression ratios, larger than those obtained by the JPEG method. Furthermore, features and their corresponding shadows are better preserved, and the blurring effect produced after JPEG reconstruction is absent in this method. In addition, information theory implies that one can always obtain better performance by coding vectors, as is the case in VQ, rather than scalars [52], as is the case in JPEG.

Mathematically, the image is subdivided into 4x4 blocks, or vectors. Let each vector be represented by x = (x_0, x_1, ..., x_15). In the case of the sonar image that we have been analyzing, the image is 256x256 pixels, and therefore it consists of 64x64 vectors. An alphabet, or codebook, is a set of vectors that represents the whole image. By choosing a codebook that has fewer vectors than the image itself, data compression is achieved. Consequently, encoding is simply a mapping between each image vector x and one vector y of the codebook. The Hamming(2) distances between each image vector and all codebook vectors are calculated, and the codebook vector closest to the image vector is chosen to represent it. In other words, many image vectors will map to one codebook vector.

(2) Other algebraic distances may be used as well.

There are many methods for building codebooks [52], the best known being the one suggested by Linde, Buzo, and Gray (the LBG algorithm) in 1980 [53]. This algorithm proceeds as follows:

1. Choose a set of codebook vectors from the set of image vectors. In this initial set, choose the vectors so that they are evenly distributed, and no two vectors have adjacent rows or columns.

2. Assign the image vectors to cells, where the number of cells equals the number of codebook vectors required. Use the Hamming distance as the criterion for deciding which image vector belongs to which cell. It is eventually desired to have each codebook vector be the centroid of its cell.

3. A measure of error on this choice of codebook vectors is performed.
If the error is too large, then the centroid of each cell is calculated, and the new centroids are used as a new set of codebook vectors.

4. This operation of calculating the centroid of each cell is repeated until the measure of error is small (a user-specified threshold is used for comparison).

It can be seen from this layout that the set of codebook vectors will be an optimal set to represent the whole image. This optimality is based on the Hamming distance. This optimality has a drawback, especially in the case of sonar images, since it treats all vectors alike, i.e. without processing feature and shadow vectors differently from the other (background) vectors. For sonar images, it is preferable to compress the feature and shadow areas minimally and the background maximally. The resulting compression ratio of this method was very encouraging, in the order of 8 to 1. However, the reconstructed image had all of its features distorted.

This prompted us to try a new way of generating codebooks, one that processes feature vectors differently from background vectors. This new algorithm, which we called the MAS algorithm, is simple in concept, easy to implement, leads to better compression ratios than those achieved by the LBG algorithm, preserves features and their shadows, and is faster than the LBG algorithm by orders of magnitude (it takes 3 seconds to create a codebook using the MAS algorithm, whereas it takes about 3 minutes using the LBG algorithm).

The outline of the MAS algorithm is as follows:

1. Calculate the mean intensities of all image vectors.
2. Order all vectors according to their increasing mean values.
3. Choose the codebook vectors from this ordered list as follows:
a. Choose the first 30(3) vectors, the vectors with the lowest mean values in the ordered list. These vectors should represent all the shadows contained in the image.
b.
Choose the last 30 vectors, the vectors with the highest mean values in the ordered list. These vectors should represent all features contained in the image.
c. Choose the rest of the codebook vectors from the ordered list so that they are evenly distributed in that list. These vectors should represent the background surrounding features and their corresponding shadows.

(3) Choosing any value greater than 20 would lead to the same results. A value of 30 was chosen to avoid sensitivity to a particular threshold.

Assuming that the original image is 320x320 pixels, and therefore 80x80 vectors, and that the chosen codebook consists of 256 vectors (30 vectors with the lowest mean values, 30 vectors with the highest mean values, and 196 vectors chosen evenly from the ordered vector set), then the achieved compression ratio would be about 10:1:

CR = \frac{80 \times 80 \text{ (vectors)} \times 16 \text{ (bytes/vector)}}{256 \text{ (codebook vectors)} \times 16 \text{ (bytes/vector)} + 80 \times 80 \text{ (lookup addresses)}} = 9.76

We used the same measure-of-success criteria to judge the reconstructed images using VQ as we did for the JPEG method. The images are presented in two columns: the left column contains the original sonar images, and the right column contains the reconstructed images using the vector quantization method.

Table 5 Results of the VQ method for images 1 and 2.
Table 6 Results of the VQ method for images 3 and 4.

Original sonar image | RMS error | Compression ratio | Bits per pixel | Coding time (s) | Decoding time (s)
Image 1              | 8.96      | 10:1              | 0.8            | 54              | 1
Image 2              | 21.72     | 10:1              | 0.8            | 48              | 1
Image 3              | 13.26     | 10:1              | 0.8            | 50              | 1
Image 4              | 12.89     | 10:1              | 0.8            | 51              | 1

Table 7 Quantitative comparison of the modified VQ compression algorithm.

5 Summary of Compression Results

The first method we tried, the adaptive BTC, gave very poor results. Much more encouraging results were achieved by using the JPEG encoding/decoding algorithm. Depending on a user-specified quality factor, the compression ratio ranged from 5:1 to about 3:1.
The variation in compression ratio, with the quality factor set at a constant value of 75, was caused by the texture differences of the images considered. The last compression algorithm that we considered was based on Vector Quantization, with a new codebook generation method introduced. This codebook generation gave good-quality reconstructed images.

It is difficult to be objective when comparing the JPEG and the VQ methods. Both methods have their advantages and disadvantages. We do not believe that one method outperforms the other in the absolute sense. The Mean Squared Error is not a very good measure for comparison, since the VQ method tries to compress the sonar images without compressing the features contained in the image. Therefore, the features are much better preserved in the reconstructed image after VQ compression, whereas the background is not so well preserved. The MSE of the reconstructed features is almost zero in the case of VQ and is much larger in the JPEG case. It would be safe to say that VQ is tailored to sonar images, whereas JPEG is in fact a generalized compression/decompression algorithm that fits its international standardization status very well.

Figures 39, 40, and 41 illustrate our claim. At this point, it is noteworthy that Figure 40 was reconstructed from a 5:1 compressed image, whereas Figure 41 was reconstructed from a 10:1 compressed image.

In order to compare JPEG and VQ more objectively, such that both images have the same compression ratio, the JPEG method was used with a quality factor of 30, resulting in a 10:1 JPEG compression ratio. Figure 42 illustrates the reconstructed image after JPEG compression with a quality factor of 30. Comparing Figures 42 and 43 shows the obvious advantage of the VQ compression method. In the latter case, both the compression ratio and the reconstructed image quality make it the more favorable compression method.
Reconstruction after JPEG compression cuts off high spatial frequencies, causing the resulting images to be blurred. This blurring effect is accompanied by a loss of the edges that surround suspect mines, and could potentially remove small features altogether, an event that is unacceptable for such a system.

Another useful measure of reconstructed image quality is the difference image. In a difference image, the reconstructed image is subtracted from the original image, and a value of 128 is added to the difference in order to highlight the discrepancies. Figures 44 and 45 illustrate the difference images for the JPEG reconstruction (Figure 40) and the VQ reconstruction (Figure 41). In Figure 44, it is seen that the JPEG algorithm treats all regions equally: the degradation of feature and shadow regions is equivalent to the degradation of the background regions.

For the VQ, it can be seen that feature and shadow areas are much better preserved than those in the JPEG reconstructed images. In those regions, the value of the difference is zero, as desired. This shows that the VQ compression method meets the requirement that feature and shadow regions be 100% preserved.

Figure 39 The zoomed original image (zooming factor = 2).
Figure 40 The zoomed reconstructed image after JPEG compression, with 5:1 compression ratio.
Figure 41 The zoomed reconstructed image after VQ compression, with 10:1 compression ratio.
Figure 42 The zoomed reconstructed image after JPEG compression with a quality factor of 30 (10:1 compression).
Figure 43 The zoomed reconstructed image after VQ compression (10:1 compression).
Figure 44 Difference image after JPEG compression.
Figure 45 Difference image after VQ compression.

6 Conclusions

1 Original Contribution

For the pre-processing step of image enhancement, we developed the Morphological Cascaded Smoothing Filter, which was found to enhance the sonar imagery by suppressing noise and sharpening edges that surround features and shadows.
This enhancement was done gradually, in order to preserve feature pixels.

Even though no original work was presented in the edge detection stage, an alpha-trimmed mean (ATM) edge detector was found to suit sonar imagery. The other three edge detectors that were considered had major drawbacks, making them unsuitable for sonar images. The success of this edge detector was attributed to its immunity to high levels of noise and to the smoothing effect caused by the Morphological Cascaded filter.

For the final stage, image compression and storage, a new codebook generation algorithm was introduced in association with VQ coding. The VQ with our codebook led to a very low bit rate of 0.8 bpp, where features and shadows were very well preserved after reconstruction. Results of this compression method were compared to an internationally proposed compression algorithm, the JPEG algorithm, and it was shown that in the case of sonar imagery the VQ coding algorithm gives better reconstructed feature quality and lower bit rates.

2 Suggestions for Future Work

The next logical step would be to investigate the performance of the proposed compression algorithm using an efficient methodology, such as quad-trees [60]. Recent research work in Vector Quantization has addressed different ways of making this compression/decompression methodology more efficient [54, 55, 57].

Having investigated the performance of this algorithm, a hardware implementation would constitute a very interesting and challenging research topic.

Another suggested topic would be to investigate the reusability of codebooks in order to achieve higher compression ratios. The calculation of the compression ratio took the codebook size into consideration. It may not be necessary to store a codebook for each image, a simplification that would lead to a 16:1 compression ratio. Adjacent images may be represented by one codebook, preventing the unnecessary storage of a separate codebook for each image.

A REFERENCES

1.
P.A. Maragos, A Unified Theory of Translation Invariant Systems with Applications to Morphological Analysis and Coding of Images, Ph.D. Thesis, Georgia Institute of Technology, July 1985.
2. W.K. Pratt, Digital Image Processing, J. Wiley and Sons, New York, 1978.
3. A.K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, 1989.
4. A. Rosenfeld and A.C. Kak, Digital Picture Processing, Academic Press, New York, 1976.
5. J. Serra, Image Analysis and Mathematical Morphology, Academic Press, New York, 1982.
6. C.R. Giardina and E.R. Dougherty, Morphological Methods in Image and Signal Processing, Prentice Hall, New Jersey, 1988.
7. R.J. Feehs and G.R. Arce, Multidimensional Edge Detection, SPIE Vol. 845, 1987, pp. 285-292.
8. S.J. Lee, R.M. Haralick, and G. Shapiro, Morphologic Edge Detection, Machine Vision International, Ann Arbor, Michigan 84104, Aug. 1986.
9. J.F. Bronskill, Multidimensional Shape Description and Recognition Using Mathematical Morphology, Master Thesis, University of Toronto, July 1986.
10. Z. Zhou, Morphological Skeleton Representation and Shape Recognition, Master Thesis, University of Toronto, July 1988.
11. M. Jelavic and T. Whalen, Morphology Changes the Shape of Machine Vision, The Electronic System Design Magazine, November 1987, pp. 67-75.
12. P. Maragos and R. Schafer, Morphological Systems for Multidimensional Signal Processing, Proceedings of the IEEE, Vol. 78, No. 4, April 1990.
13. T.R. Esselman and J.G. Verly, Feature Extraction from Range Imagery Using Mathematical Morphology, SPIE Vol. 845, Visual Communications and Image Processing II, 1987, pp. 233-240.
14. S.R. Sternberg, Biomedical Image Processing, IEEE Computer, Jan. 1983, pp. 22-34.
15. S.R. Sternberg, Grayscale Morphology, Computer Vision, Graphics, and Image Processing, Vol. 35, 1986, pp. 333-355.
16. M.K. Smayra, The Effect of Structuring Elements on Shape Description and Recognition, Bachelor Thesis, University of Toronto, April 1989.
17. M. Suk and S.
Hong, An Edge Extraction Technique for Noisy Images, Computer Vision, Graphics, and Image Processing, 25, 1984, pp. 24-45.
18. Mashua and Gilbert, Finding Edges in Noisy Scenes.
19. V. Nalwa and T. Binford, On Detecting Edges, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-8, No. 6, Nov. 1986, pp. 699-714.
20. W.H. Lunscher and M.P. Beddoes, Optimal Edge Detector Evaluation, IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-16, No. 2, March 1986, pp. 304-312.
21. G.F. McLean and M.E. Jernigan, Hierarchical Edge Detection, Computer Vision, Graphics, and Image Processing 44, 1988, pp. 350-366.
22. A. Rosenfeld, The Max Roberts Operator is a Hueckel-Type Edge Detector, IEEE PAMI, Vol. 3, No. 1, Jan. 1981, pp. 101-111.
23. Suk and T. Cho, Object Detection Algorithm Based on Region-Adjacency Graph, Proceedings of the IEEE, Vol. 72, No. 7, July 1984, pp. 985-986.
24. W. Grimson and E.C. Hildreth, Comments on "Digital Step Edges from Zero Crossings of the Second Directional Derivatives", IEEE PAMI, Vol. 7, No. 1, Jan. 1985, pp. 121-199.
25. A.N. Venetsanopoulos, Edge Detectors based on Nonlinear Filters, IEEE PAMI-8, No. 4, July 1986.
26. J.S. Chen and G. Medioni, Detection, Localization, and Estimation of Edges, IEEE PAMI, Vol. 11, No. 2, Feb. 1989, pp. 191-198.
27. J. Canny, A Computational Approach to Edge Detection, IEEE PAMI, Vol. 8, No. 6, Nov. 1986, pp. 679-699.
28. V. Torre and T. Poggio, On Edge Detection, IEEE PAMI, Vol. 8, No. 2, March 1986, pp. 147-163.
29. E. Hildreth, Edge Detection, The Encyclopedia of Artificial Intelligence, J. Wiley and Sons, New York, 1987.
30. D. Marr and E. Hildreth, Theory of Edge Detection, Technical Report AI Memo 518, MIT AI Lab, April 1979, pp. 187-217.
31. R.M. Haralick, Digital Step Edges from Zero Crossing of Second Directional Derivatives, IEEE PAMI, Vol. 6, No. 1, Jan. 1984, pp. 58-68.
32. R. Nevatia, Linear Feature Extraction, Technical Report, Image Processing Institute, University of Southern California, pp. 73-76.
33. K.
Sijmons, Computer-assisted Detection of Linear Features from Digital Remote Sensing Data, ITC Journal, Vol. 1, 1987.
34. A. Bovik, On Detecting Edges in Speckle Imagery, IEEE Transactions on ASSP, Vol. 36, No. 10, Oct. 1988, pp. 1618-1627.
35. Y. Leclerc and P. Fua, Finding Object Boundaries Using Guided Gradient Ascent, Technical Report, AI Centre, SRI International, 1984.
36. V. Nalwa, On Detecting Edges, Technical Report, AI Lab, Stanford University, pp. 157-164.
37. A. Huertas and G. Medioni, Detection of Intensity Changes with Subpixel Accuracy Using Laplacian-Gaussian Masks, IEEE PAMI, Vol. 8, No. 5, Sep. 1986, pp. 651-663.
38. W.H.H.J. Lunscher, The Asymptotic Optimal Frequency Domain Filter for Edge Detection, IEEE PAMI, Vol. 5, No. 6, Nov. 1983, pp. 678-680.
39. R. Nevatia and K. Babu, Linear Feature Extraction and Description, Computer Graphics and Image Processing 13, 1980, pp. 257-269.
40. J. Canning, J. Kim, and A. Rosenfeld, Symbolic Pixel Labeling for Curvilinear Feature Detection, Image Understanding Workshop, Feb. 1987, pp. 242-256.
41. S. Zucker and R. Hummel, A Three-Dimensional Edge Operator, IEEE PAMI, Vol. 3, No. 3, May 1981, pp. 324-331.
42. Bednar and Watt, Alpha-Trimmed Means and Their Relationship to Median Filters, IEEE Transactions on ASSP, Vol. 32, No. 1, Feb. 1984, pp. 145-153.
43. R. Stevenson and G. Arce, Theoretical Analysis of Morphological Filters, 24th Annual Allerton Conference on Communications, Control, and Computing, Oct. 1986.
44. M.L. Liou, Visual Telephony as an ISDN Application, IEEE Communications Magazine, Feb. 1990, pp. 30-38.
45. A. Wong, C. Chen, D.L. Le Gall, F. Jeng, and K. Uz, A Video Coding Algorithm for Transmission and Storage Applications, IEEE Communications Magazine, Nov. 1990, pp. 24-31.
46. G.K. Wallace, The JPEG Still Picture Compression Standard, Communications of the ACM, Vol. 34, No. 4, April 1991, pp. 30-45.
47. E.J. Delp and O.R. Mitchell, Image Compression Using Block Truncation Coding, IEEE Transactions on Communications, Vol. 27, No. 9, Sep.
1979, pp. 1335-1342.
48. D.R. Halverson and N.C. Griswold, A Generalized Block Truncation Coding Algorithm for Image Compression, IEEE Transactions on ASSP, Vol. 32, No. 3, June 1984, pp. 664-668.
49. D.R. Halverson, On the Implementation of Block Truncation Coding Algorithm, IEEE Transactions on Communications, Vol. 30, No. 11, Nov. 1982.
50. P. Nasiopoulos, R. Ward, and D. Morse, Adaptive Compression Coding, IEEE Transactions on Communications, Vol. 39, No. 8, Aug. 1991, pp. 1245-1254.
51. M. Leonard, IC Executes Still-Picture Compression Algorithms, Electronic Design, May 1991, pp. 49-53.
52. R.M. Gray, Vector Quantization, IEEE ASSP Magazine, Vol. 1, No. 2, April 1984, pp. 4-29.
53. Y. Linde, A. Buzo, and R.M. Gray, An Algorithm for Vector Quantizer Design, IEEE Transactions on Communications, Vol. 28, No. 1, Jan. 1980, pp. 84-95.
54. D.J. Vaisey and A. Gersho, Variable Block-Size Image Coding, Proc. ICASSP, 1987.
55. Y. Ho and A. Gersho, Variable-Rate Multi-Stage Vector Quantization for Image Coding, Proc. ICASSP, 1988, pp. 1156-1159.
56. Z. Zhou and A.N. Venetsanopoulos, Directional Decomposition of Morphological Pyramid Representations for Low Bit-Rate Image Coding, Conference on Digital Signal Processing, 1991, pp. 404-409.
57. P. Yu and A.N. Venetsanopoulos, Hierarchical Multirate Vector Quantization for Image Coding, Canadian Conference on Electrical and Computer Engineering, Sep. 1991.
58. L. Sutro and J. Lerman, Robot Vision, internal report, CSD Laboratory, Cambridge, Massachusetts, April 1973.
59. J. Verly and T. Esselman, Some Applications of Mathematical Morphology to Range Imagery, Electronic Imaging '88, Boston, Oct. 1988.
60. H. Samet, The Quad Tree and Related Hierarchical Data Structures, ACM Computing Surveys, Vol. 16, No. 2, June 1984, pp. 187-260.

