AN AUTOMATED VISION SYSTEM FOR DETECTION AND COUNTING OF UNEATEN FOOD PELLETS IN A FISH SEA CAGE

by

Michael David Foster
B.A.Sc., The University of Waterloo, 1986

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
Department of Electrical Engineering

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
June 1993
© Michael David Foster, 1993

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Abstract

A system which quantifies the number of food pellets eaten by salmon in a sea cage would be beneficial to both fish farmers and researchers. Such a system could be used for reducing food wastage, determining time-related feeding patterns, ensuring fish receive the correct dosage of drugs, etc. We developed algorithms for detection and counting of food pellets from recorded video image sequences, which could be used to determine actual feeding rates in a sea cage. The number of food pellets eaten over time is determined by counting the number of pellets not eaten. The method involves counting the number of pellets of a known size falling through the view area of an underwater video camera. The size of the view area of the camera varies with the size of the food pellets used. The pellets, which appear white underwater, are counted as they enter the view area of the underwater camera, and are tracked in this area to avoid recounting. For each video frame in the sequence, image preprocessing is done, followed by object detection, object classification, and object tracking and counting. Original algorithms were developed for this project to automatically threshold images, track objects in consecutive frames, and count the objects entering the view area of the camera. The algorithms were implemented on a personal computer based image processing system. Experiments were carried out to test the algorithms with pellet densities used in actual feeding situations. The utility of the algorithms was confirmed by the experimental results. The average count error for the tests performed was approximately +/-10%. The recommended improvements to the counting algorithms should significantly reduce this error.
Table of Contents

Abstract
Acknowledgement  vii
1 Introduction  1
1.1 Motivation  1
1.2 Problem Statement  3
1.3 Objective and Scope  3
2 Goals and Constraints  7
3 Equipment for Acquisition and Analysis  9
3.1 Introduction  9
3.2 Image Acquisition and Storage  9
3.3 Image Analysis  12
4 Detecting Objects in Images  15
4.1 Introduction  15
4.2 Imaging of Food Pellets  15
4.3 Correlation Matching  17
4.4 Thresholding and Region Growing  20
4.5 Dilation and Erosion  28
4.6 Separating Overlapping Objects  31
4.6.1 Detecting the Boundary Pixels of an Object  31
4.6.2 Estimating the Tangent at each Boundary Pixel  35
4.6.3 Determining the Curvature at each Boundary Pixel  37
4.6.4 Using Zero Crossings in Curvature to Divide the Object if Necessary  39
4.6.5 Case of more than two overlapping objects  41
5 Measuring Features / Classifying Objects  42
5.1 Introduction  42
5.2 Features used to Distinguish Food Pellets from Other Objects  42
5.2.1 Circularity  43
5.2.2 Bounding Area Ratio  44
5.2.3 Minor to Major Axis Ratio  47
5.2.4 Minimum to Maximum Radius Ratio  47
5.3 Classification Using Feature Values  48
6 Counting Objects using Object Tracking  56
6.1 Introduction  56
6.2 Overview  56
6.3 Object Counting Algorithm  57
6.4 Object Matching Algorithms  60
6.4.2 Small Storage, Maximum Number of Distance Calculations Algorithm  64
6.4.3 Small Storage, Intermediate Number of Distance Calculations Algorithm  68
6.5 Distance Measure Between Objects  70
6.5.1 Combining Distance Measures  73
7 Experimental Results and Discussion  75
7.1 Introduction  75
7.2 Image Acquisition  75
7.3 Preprocessing  79
7.4 Object Detection  80
7.5 Feature Extraction and Classification  84
7.6 Object Tracking  90
7.7 Counting Experiments Performed and Results  97
8 Conclusion  102
8.1 Future Considerations  104
Nomenclature  106
Bibliography  109

List of Tables

1 Theoretical area of pellet objects in pixel units vs distance from object to camera  17
2 Number of Food Pellets Entering Camera View per Minute  98
3 Results of Counting Trials  99
4 Frequency of Occurrence of Different Counting Errors  100

List of Figures

1 View Area of Video Camera  4
2 Pellet Counting Process  5
3 Image Sequence Acquisition Equipment  11
4 Image Analysis Equipment  12
5 Digital Image Coordinate System  13
6 Correlation Matching  18
7 Histogram of Image without Objects  21
8 Histogram of Image with Objects  21
9 Results from Object Detection  27
10 4 and 8-neighbours of a pixel  28
11 Example of Dilation Operation  29
12 Example of Erosion Operation  30
13 Movement Codes  32
14 Object Boundary Determination  34
15 Example of Boundary Detection  35
16 Example of Tangent Estimate  37
17 Curvature for Concave and Convex Objects  38
18 Curvature Plots for Two Different Objects  39
19 Results of Object Division for Two Objects  40
20 Moments of Inertia used to Determine Rotation Angle  45
21 Calculation of Bounding Box Area  46
22 Feature Space Transformation  49
23 Euclidean Distances used for Classification  50
24 Feature Space Transformations  52
25 Equidistant MICD Contours on Original Feature Space  54
26 Food Pellet Motion  57
27 Two Consecutive Image Frames  58
28 Object Matching Example  61
29 Distance Array for Large Storage Algorithm  63
30 Motion Vector between Objects in Consecutive Frames
72  vi  31 Measure of Deviation of Motion Vector from the Center Point ^ 73 32 Transparent Shield on Camera ^  78  33 Object Separation with 3 Overlapping Objects ^  83  34 Error in Bounding Box Determination ^  86  35 Histogram of MICD Distances for Valid Pellet Objects ^  88  36 Two Object Matching Approaches ^  91  37 Example of Counting Error ^  92  38 Example of Counting Error ^  93  39 Object Tracking Example ^  94  40 Example of Error in Object Matching ^  95  vii  Acknowledgement I would like to thank my supervisors, Dr. Mabo R. Ito, Dr. Royann J. Petrell, and Dr. Rabab K. Ward for their guidance and support on this project. I want to thank the people at B.C. Packers, and Pacific Biological Station for their assistance with the project, and for allowing me to use their sites for data collection. I would also like to thank Colin Savage for his assistance with the computer and data acquisition equipment and Trey Neufeld for his help collecting video sequences for analysis.  viii  Chapter 1 Introduction 1.1 Motivation Feed wastage affects fish farm operating costs, profits, appropriate additive delivery, and the magnitude of the environmental impact. Salmon feed averages $1.25/kg and represents 45-60% of the operating costs on a commercial sized sea cage farm. The feed conversion ratio FCR is used as a measure of feed usage and is defined as the ratio of the mass of the food ration placed in the net to the mass gain of the salmon in the net, over a given time period. The daily ration is judged by observing fish feeding activity at the surface. Feeding is usually discontinued when the fish feeding activity greatly diminishes. The FCR in the industry averages 1.5 for Atlantic salmon and 2.0 for Chinook salmon (British Columbia Salmon Farmers Association). Under more controlled farming conditions, a FCR of 1.0 for Atlantic salmon was achieved without negatively affecting growth (Austreng et al., 1987; Storebaldcen and Austreng, 1987). The optimal feed conversion would be the one which produces the biggest fish using the least amount of feed in the shortest time period.  In order to determine the optimal FCR, the true ration must be known. This is the ration actually consumed by the fish. Salmon farmers in British Columbia currently do not measure or control food pellet wastage; although wastage have been reported to range between 15-40% on commercial sea cage farms (Seymour and Bergheim, 1991; Thorpe et al., 1990).  A reduction in the quantity of uneaten pellets would benefit the industry in a number of ways. Decreased operating costs would be one benefit of reducing wastage. Feed pellet loss adversely affects the environment, fish health, and fish quality because uneaten pellets accumulate underneath farms. Knowledge of pellet loss is critical for assuring  proper dose delivery when the pellets contain colouring agents, vaccines, vitamins, and antibiotics. Another benefit of reducing the quantity of uneaten food pellets could be an increased use of automatic feeders. The manual feeding of salmon on a sea cage farm is very time consuming. Currently, salmon are fed by hand for 4 hours a day in the winter months, and 8 hours a day in the summer months. Automatic feeders are not commonly used because feed wastage has been reported to be as high as 40% (Thorpe et al., 1990). To accurately measure the ration and feed wastage, fish farmers need a measurement tool to give them feedback on pellet loss during a feeding period. 
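To make the effect of wastage on the feed conversion ratio concrete, the short C sketch below compares the apparent FCR, computed from the ration placed in the net, with the FCR based on the true ration (the ration actually consumed). The figures used are illustrative only; the 20% wastage value simply falls within the 15-40% range reported above.

#include <stdio.h>

/* Feed conversion ratio: mass of feed divided by mass gained by the fish.
 * The apparent FCR uses the ration placed in the net; the true FCR uses
 * the ration actually consumed, i.e. the offered ration minus wastage. */
static double fcr(double feed_kg, double mass_gain_kg)
{
    return feed_kg / mass_gain_kg;
}

int main(void)
{
    double offered_kg   = 1500.0;  /* feed placed in the net (illustrative) */
    double gain_kg      = 1000.0;  /* mass gained by the salmon             */
    double wastage_frac = 0.20;    /* 20% wastage, within the reported 15-40% */

    double consumed_kg = offered_kg * (1.0 - wastage_frac);

    printf("apparent FCR = %.2f\n", fcr(offered_kg, gain_kg));   /* 1.50 */
    printf("true FCR     = %.2f\n", fcr(consumed_kg, gain_kg));  /* 1.20 */
    return 0;
}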
Aquaculture researchers would also benefit if a system were available to provide data on how much feed is not being eaten when they run feeding experiments. Hydroacoustic detection of food pellets (Juell, 1991) is one method used to detect food waste. A hydroacoustic sensor is used to detect food waste (a group of uneaten food pellets) and signal an automatic feeder. The automatic feeder is turned off when any amount of food waste is detected. The system does not count the uneaten food pellets, and therefore cannot be used for more complex automatic feeder control where some waste is acceptable. Another method of estimating food waste is to suspend a sheet below the sea cage during the feeding period (Shepherd and Bromage, 1988), retrieve it after feeding, and count the food pellets that were wasted. This method does not provide immediate feedback during feeding. Only after feeding can the wastage be assessed. Food pellets are available in various sizes and are dark brown to black in colour. However, they appear white when viewed underwater using a low light camera pointing straight down. This effect makes it possible to detect the pellets using underwater video cameras and analyze the images using digital image processing algorithms  2  1.2 Problem Statement In this study, we addressed the problem of developing an economical video camera based pellet detection and counting system, for use on salmon farms. The system could be used to study feed wastage and its effects on operating costs, additive delivery, and the magnitude of environmental impact. Count data from a single camera, or a number of cameras throughout the sea cage could be provided to an automatic feeding system. The feeder system would be automatically controlled using count data, and many man-hours of labour would be eliminated. This application is not the primary goal of this project, but can incorporate the results presented in this thesis.  1.3 Objective and Scope The objectives of this study were to determine the applicability of a manual video camera based pellet detection and counting system, and to develop an automatic version of this manual system using image analysis algorithms A farmer would be able to lower a video camera (pointing straight down) to a desired depth, and using video replay, count the number of pellets that are falling through the view area of the camera (Figure 1) at that depth. With this information, the farmer would have some numerical data on how much feed is being eaten by the fish above camera level. By placing the camera near the bottom of the sea cage, an estimate of how much feed is being wasted would be obtained. This system could be used to record the change in pellet loss during a feed event, as well as calculate the total amount lost.  3  Video Camera  Figure 1 - View Area of Video Camera  The manual approach to counting food pellets from video replay could be laborious depending on the length of the feeding period. An operator would have to watch a videotape of the entire feeding period and keep an accurate count of the food pellets falling past the camera. An automatic pellet counting system was therefore developed and tested.  There were two major steps involved in the development of the automatic counting system. These were, detecting the food pellets in the images, and correctly tracking food pellets from one image to the next. A typical image from the underwater camera would contain food pellets and salmon. 
The salmon are dark in colour when viewed from above, but parts of the fish's body appear white if a salmon turns on its side. Occasionally, other objects which appear light in colour are present in the water. It was important to distinguish salmon and other objects from food pellets so they were not included in the food pellet count. Tracking objects was necessary to ensure each pellet was only counted once.  4  The automatic algorithms were used to 1) isolate the pellet and non-pellet objects in the image, 2) measure features of the objects and classify them as food pellets or other objects, 3) track pellets from one frame to another, and 4) maintain a count of the number of pellets that have passed through the view area of the underwater camera. The system was tested in order to estimate and identify counting errors. The automatic food pellet counting process is shown in Figure 2.  Camera Positioning Acquisition of Images on Videotape Extraction of Frame Sequences for Analysis Preprocessing of Frame Sequence viv  Object Detection on Frame Sequence  [Object Feature Extraction V  Object Classification Object Matching Pellet Counting  Figure 2 - Pellet Counting Process  5  The goals of the project, and the constraints on development of the system are discussed in Chapter 2. The equipment used in both the manual and automatic food pellet counting systems are described in Chapter 3. Details on how the food pellets are imaged by the digital camera, and different methods of automatically segmenting and detecting the food pellets in an image are discussed in Chapter 4. The algorithms used to classify objects in the image as either food pellets, or other objects using measurable features of the objects, are presented in Chapter 5. Food pellet tracking between images and problems associated with tracking and maintaining an accurate count of the number of food pellets passing through the view area of the camera are described in Chapter 6. In Chapter 7, the experimental results obtained using the algorithms developed are discussed.  6  Chapter 2 Goals and Constraints  The goal of this project was to discover methods that could be used to manually and automatically detect and count food pellets falling through the view area of an underwater camera. Some initial investigations were done at the beginning of the project to determine the development possibilities given the time constraints. The following constraints were placed on the development of the counting system.  A real-time implementation of the automatic pellet counting system was not attempted because the development hardware was not powerful enough to digitally process the acquired images in real time. At this point in the project it was not practical to purchase equipment capable of real time processing, until the capabilities of the algorithms were established.  The pellet counting systems were designed to accurately count food pellets passing through the view area of the camera in densities which will occur in typical feeding situations. These densities can be determined using standard feeding tables which list the amount of food that should be feed to fish of a certain mass.  The food pellet counting system was designed to work under a variety of weather and natural illumination conditions. Only natural light is used for image acquisition in order to reduce the complexity of the image acquisition hardware. 
The light level at a given depth is dependent on many factors including weather (clear, cloudy, raining), the time of year, water visibility, and density of fish above camera level (fish block light coming down through the water column). Problems associated with insufficient illumination will not be addressed in this project. It is assumed that a more expensive, higher sensitivity 7  camera could be purchased or an underwater light source could be used to eliminate any insufficient illumination problems.  The camera system used in the development of this project was designed for experimental use only. For a commercial pellet counting system, the camera and rig should be easy to move in and out of the water, and therefore must be lightweight and transportable. The camera system and rig are described in section 3.2.  Determining the optimum placement of cameras in the sea cage in order to get an accurate sample of the total number of pellets falling is beyond the scope of this project. The issue of changing feeding methods in order to get uniform feeding at a known rate will not be addressed.  8  Chapter 3 Equipment for Acquisition and Analysis 3.1 Introduction  In this section, the equipment used for image sequence acquisition and analysis is described. The image acquisition equipment is required by both the manual and automatic pellet counting systems. The image analysis equipment is required for the automatic pellet counting system. The image acquisition equipment is used to acquire and storing the image sequences at the site. The image analysis equipment is used to transfer the image sequences to the computer system and perform analysis. The equipment required to carry out these two functions is described in this section. 3.2 Image Acquisition and Storage  The image acquisition and storage equipment is used to capture and store image sequences on videotape when the salmon are being fed in the sea cage. The equipment consists of three main components, a externally controlled digital camera, a super VHS videotape recorder, and the underwater camera housing and support rig. The camera used was the Panasonic WV-B400 CCD (Charge-coupled Device) camera. The camera outputs a video signal with 525 lines of horizontal resolution at a rate of 30 frames per second. It is equipped with a wide angle lens with a focal length of 4.8 mm The view area of the camera covers 76.5 degrees. It has low light capabilities and can operate at a minimum illumination of 0.5 lux (20 lux recommended). The camera aperture is automatically controlled, and compensates for the illumination level. The shutter speed of the camera is controlled manually from an external controller on the surface (Figure 3).  The super VHS videotape recorder used for the project was the JVC AG-1960. In record mode, the video recorder takes the signal from the camera as input, and stores in on super VHS videotape. The super VHS capability of the video recorder enables it to store better quality (higher resolution) images than a standard VHS recorder. A standard VHS recorder stores approximately 240 lines while a super VHS recorder records approximately 400 lines. When a video frame is transferred to the computer for analysis, it is digitized. Digitization involves determining the intensity value of each pixel in the image by interpolating the data in the video frame. A video frame is digitized into an image with 512 pixels horizontally by 480 pixels vertically. 
In order to ensure that the digitized image is an accurate representation of the original image, the high resolution of super VHS videotape is required. Resolution is very important for this image processing application since small objects (food pellets) are being imaged. When high resolution video frames are used, detection and analysis are much easier to perform, and the results are more accurate than if the same scene were digitized from a low resolution video image.  In order to use the camera in the sea cages, an underwater camera housing encases the camera to protect it from moisture (Figure 3). The dome port of the underwater camera housing was designed by International Hardsuits to produce optically correct images (i.e. the dome port negates the effect of the refraction of light due to the air, glass, water interface). The rig is used to mount the camera system so that its position and orientation can be controlled. In the sea cage, the rig system is lowered to the desired depth and positioned using ropes.  10  Ropes Used for Positioning and Support Image Monitor  v Vp  S-VHS Videotape  Sea Cage  Figure 3 - Image Sequence Acquisition Equipment  11  3.3 Image Analysis  The image analysis equipment is used to convert the video images stored on videotape to digitized images on the computer system, and analyze them to determine pellet counts. The equipment consists of three main components, an IBM compatible personal computer, an Imaging Technologies image processing board and software, and a super VHS time code generator/reader video recorder (Figure 4). The IBM compatible personal computer has a 80486 processor, a 130Mb hard disk and dual displays (computer monitor and image monitor). An Imaging Technologies hardware accessory board plugs into one of the expansion slots in the computer. The board has a video signal input. Video frames are extracted and digitized from the video input. The board contains extra memory for image frame storage, specialized hardware for image processing operations, and a video output port so images can be displayed on the image monitor.  Computer S-VHS Videotape  Monitor  Image Monitor  Video Signal  Computer Control  Video Signal  Figure 4 - Image Analysis Equipment  12  (Inside PC)  An Imaging Technologies software library was used for program development. This library contains hundreds of software functions that can be used in development programs. The library contains functions that interface with the Imaging Technologies hardware accessory board to load and save images, perform convolutions on images, and many other operations. The combination of the computer, the hardware accessory board, and the software library allows a programmer to write 'C' programs to extract data from images, and perform other numerical operations on images. Figure 5 shows the frame coordinate system used by the Imaging Technologies image processing system. This coordinate system will be used in this document. The size of the images used are 512 pixels horizontal by 480 pixels vertical.  • (x,y) Image Frame  Figure 5 - Digital Image Coordinate System A super VHS time code generator/reader video recorder is used to extract sequences of images from super VHS videotape. The machine that was used was the JVC BR-S822U with the time code generator/reader expansion card, the time base correction card, and the RS-232 interface port. This machine is capable of laying time codes on one of the audio tracks of the videotape. 
This time code is used to identify each individual frame on the videotape. The computer is used to control the machine via the RS-232 port on the video recorder. A computer program can be used to search for a specific frame on the 13  videotape by reading the time codes on the tape. When the desired frame has been reached, it can be acquired by the Imaging Technologies image processing card, and stored on the hard disk for later analysis. Using the time codes on the tape, a sequences of frames can be captured and stored at a desired sampling rate automatically.  14  Chapter 4 Detecting Objects in Images 4.1 Introduction Object detection involves separating the objects in the image from the background and determining the position of each object in the image. Correctly identifying objects which represent food pellets in an image is one of the most important operations in the counting system. If objects other than food pellets are mistakenly counted as food pellets, or objects that are food pellets are not counted, the pellet count will be inaccurate. The first stage of object identification is object detection. In this stage, all objects, food pellets and other particles are separated from the background and their positions are recorded. This information is then used by the object classification algorithms. In this section, an explanation of how food pellets are imaged by the digital camera is given, and different methods of object detection are discussed and evaluated. 4.2 Imaging of Food Pellets Food pellets are manufactured using an extrusion process. The pellets are cylindrical in shape. The sides are very smooth, and the ends where the pellets were broken off after extrusion are rough. Although the food pellets appear dark brown to black on the surface, they appear white in images taken in the water with the underwater camera. This effect is observed because the camera is highly sensitive (0.5 lux), and automatically compensates for the amount of light available. The camera opens the aperture in order to make the overall light level about 18% gray. When the camera is looking straight down into the water column, the background is very dark, so the aperture of the camera is opened wide. When the food pellets are in view, they reflect the sunlight (food pellets are approximately 5-6% reflective), and are therefore much lighter than the background, and appear white relative to the background. This is very important for detection, since if the pellets appeared dark gray, they would be virtually undetectable in the image.  15  As the camera is lowered into the water column, the sunlight reaching the pellets is attenuated by the water above the camera and blocked by fish above the camera. Under certain conditions, there will not be enough light for the system to work properly. The camera will be operating at its limits, and food pellets will not appear white and will be undetectable. In these cases, a more sensitive camera or an underwater light source is required.  In order for the food pellets to appear white in the captured images there must be approximately 3.5 meters (dependent on water visibility and light conditions) between the camera and the bottom of the net. This is required because the bottom of the net, which is white, must be very dark in colour relative to the food pellets. Since the camera must be at least 4 meters below the surface of the water, the pellet counting systems can only be used in sea cages which are at least 7.5 meters deep.  
The maximum coverage of the camera depends on the pellet size used and the resolution of the video camera. As a result of using a wide angle lens on the underwater camera (to maximize the coverage area of the camera), food pellets in the captured images decrease in size rapidly as the pellets move away from the camera. Pellets smaller than 7 mm in diameter become virtually undetectable at very small distances. This means that the coverage area of the camera is very small when using small pellets. Table 1 shows the theoretical area of objects imaged by the camera in pixel units corresponding to different object-camera distances (512 x 480 pixel image, view angle of camera 76.5 degrees). In practice, these areas are approximately 1 to 2 pixels larger due to CCD bloom (cells in the CCD which detect high intensity light levels increase the intensity values detected by neighbouring cells). In order to accurately detect a food pellet object in the image, the object should be represented by at least 4 pixels. When using 7 mm pellets (area 38.5-75.7 mm², depending on orientation), pellets 1.00 to 1.25 meters away from the camera and closer can be accurately detected. This corresponds to camera coverage of approximately 2.5-3.9 square meters.

Table 1 - Theoretical area of pellet objects in pixel units vs. distance from object to camera
(columns: distance from object to camera in meters, with the corresponding camera coverage area in square meters in parentheses)

Object Area (mm²)
(Approximate Pellet Size)   0.50    0.75    1.00    1.25    1.50    1.75    2.00
                            (0.62)  (1.4)   (2.5)   (3.9)   (5.6)   (7.6)   (10.0)
1.00 (1 mm)                 0.42    0.19    0.11    0.07    0.05    0.03    0.03
4.00 (2 mm)                 1.69    0.75    0.42    0.27    0.19    0.14    0.11
9.00 (3 mm)                 3.79    1.69    0.95    0.61    0.42    0.31    0.24
16.00 (4 mm)                6.74    3.00    1.69    1.08    0.75    0.55    0.42
25.00 (5 mm)                10.5    4.68    2.63    1.69    1.17    0.86    0.66
36.00 (6 mm)                15.2    6.74    3.79    2.43    1.69    1.24    0.95
49.00 (7 mm)                20.7    9.18    5.16    3.30    2.29    1.69    1.29
64.00 (8 mm)                27.0    12.0    6.74    4.32    3.00    2.20    1.69
81.00 (9 mm)                34.1    15.2    8.54    5.46    3.79    2.79    2.13
100.00 (10 mm)              42.2    18.7    10.5    6.74    4.68    3.44    2.63

4.3 Correlation Matching

One traditional method of object detection is correlation matching. This method involves matching a prototype image of the object to be detected with sub-sections of the input image. If a positive match is made with a sub-section of the input image, then there is a high probability that the object is present in that sub-section of the image. This prototype can be any size, but is usually small to minimize computation. The intensity values of the prototype image w(x,y) are correlated with the intensity values of the sub-section of the input image as follows:

R(m,n) = Σ_x Σ_y f(x,y) · w(x−m, y−n)    (4.1)

where:
f(x,y) is an image of size MxN, m = 0,1,...,M−1, n = 0,1,...,N−1
w(x,y) is a prototype image of size JxK, J < M, K < N
the value of the prototype image is equal to zero outside the defined region

The correlation is repeated for every sub-section of the input image (Figure 6). When the correlation value is high, it is very likely that the object is present in that sub-section. One, or any number of, prototype images representing the objects being searched for can be used. For example, the same object in a number of different orientations, or a number of different sizes, can be searched for using a number of prototypes, each representing the object in a different orientation or a different scale.
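To make the cost of equation (4.1) concrete, a direct implementation of the correlation search is sketched below in C (the language used elsewhere in this project for image processing programs). The row-major 8-bit buffer layout and the function name are assumptions made for illustration rather than code from this project.

#include <stdint.h>

/* Direct implementation of equation (4.1):
 *   R(m,n) = sum_x sum_y f(x,y) * w(x-m, y-n)
 * f is an M x N image (x = 0..M-1, y = 0..N-1), w is a J x K prototype
 * that is zero outside its defined region, so the inner sums only cover
 * the J x K window anchored at (m,n).  Buffers are row-major, 8-bit
 * gray levels; R must have room for M*N results.  Names are illustrative. */
void correlate(const uint8_t *f, int M, int N,
               const uint8_t *w, int J, int K,
               long *R)
{
    for (int n = 0; n < N; n++) {
        for (int m = 0; m < M; m++) {
            long sum = 0;
            for (int dy = 0; dy < K; dy++) {
                for (int dx = 0; dx < J; dx++) {
                    int x = m + dx;                 /* so x - m = dx */
                    int y = n + dy;                 /* so y - n = dy */
                    if (x < M && y < N)
                        sum += (long)f[y * M + x] * w[dy * J + dx];
                }
            }
            R[n * M + m] = sum;   /* a large R(m,n) suggests the prototype
                                     occurs at (m,n) in the input image   */
        }
    }
}

Even in this unnormalized form the work is proportional to M·N·J·K per prototype, which is why the approach becomes impractical once many orientations and scales must each be represented by a separate prototype.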
w(x,y) Prototype  f(x,y)  Input Image  Prototype Occured in Original Image at this Point  Correlation Result  R(m,n)  Figure 6 - Correlation Matching  18  Initially, correlation matching seemed to be a good method to use to detect food pellets in an image. However, there are number of reasons why correlation matching was not used. The first reason why correlation matching was not used is that there is a large variation in the length of the food pellets. Since the pellets are made using an extrusion process, the width of the pellets is consistent, but the length is not. Since there is a large variation in length is it difficult to determine a prototype image that will yield high correlation values for pellets of different lengths. The second reason for not using correlation matching is the fact that the pellets are being imaged as they fall through the water column. The image of a pellet decreases in size as the pellet falls away from the camera. Also pellets fall in different orientations. Since the pellets are cylindrical, the shape appears very different as the orientation changes. All these factors make it very difficult to use correlation matching. A very large number of prototypes would be necessary. These would include prototypes representing a number of different orientations, and a number of different sizes for each orientation. The number of prototypes necessary would make it very computationally expensive to detect the food pellets in an image. The final reason correlation matching is not suitable is the fact that food pellets are not always imaged as distinct objects. In some cases, the images of two or more food pellets overlap. There are methods to separate the pellets (described in section 4.6), but after separation, the shape of the pellet is severely altered, and would not match any of the shape prototypes used for correlation matching.  19  Correlation matching is not a good method to use to detect food pellets in an image because the length of the pellets is highly variable, it is too computationally expensive to take into account the number of shape prototypes necessary, and when the images of two or more food pellets overlap, correlation matching is impossible even after separation.  4.4 Thresholding and Region Growing Since it is not feasible to use correlation matching to detect objects in the digital image, another method of detection was developed. This method consists of two processes, thresholding and region growing. Thresholding is done to separate the objects from the background, and transform the 8 bit image with 256 levels of gray to a two bit (binary) image with two levels of gray - black and white. Region growing is used in the object detection algorithm to determine the locations and sizes of all objects in the binary image.  The threshokling operation involves choosing a gray level (threshold value) and then remapping each pixel in the image based on the threshold value. All pixels with gray levels greater than the threshold value are set to 255 (white) and pixels with gray levels less than or equal to the threshold are set to 0 (black). The difficulty arises in choosing the threshold value that will remap the image such that the background becomes black, and the food pellets and all other objects become white, with as little noise as possible (i.e. white pixels not corresponding to food pellets or other objects). A method was developed for this project to determine an acceptable threshold value automatically.  
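The remapping step described above is simple to implement; a minimal C sketch for the 512 x 480, 8-bit images used in this project is given below (the buffer layout and function name are illustrative assumptions). The choice of the threshold value T is the subject of the automatic method described next.

#include <stdint.h>
#include <stddef.h>

/* Remap an 8-bit gray-level image to a binary image: pixels brighter than
 * the threshold T become 255 (object), all others become 0 (background).
 * The image is assumed to be a row-major buffer of width*height bytes. */
void threshold_image(uint8_t *img, size_t width, size_t height, uint8_t T)
{
    size_t npix = width * height;
    for (size_t i = 0; i < npix; i++)
        img[i] = (img[i] > T) ? 255 : 0;
}

On the 245,760 pixels of a 512 x 480 frame this pass is negligible next to the later region growing.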
A histogram of an image is a graph of the frequency of occurrence of each gray level in the image. In an image that contains no food pellets or other objects, all the pixels have values near the black (0) end of the histogram. In this case the histogram values are typically normally distributed with a mean near 0 and a small variance (Figure 7).

Figure 7 - Histogram of Image without Objects (frequency f(x) versus grey level)

In an image that does contain food pellets and other objects, since the pellets and objects are lighter gray than the background, the histogram contains values above the background levels (Figure 8).

Figure 8 - Histogram of Image with Objects

In order to properly threshold the image, a threshold value must be chosen to transform all background pixels to black, and all pixels representing food pellets or other objects to white. If an estimate of the normal distribution of the background gray levels is obtained, it can be used to find an estimate of the maximum gray level in the background. If the image is thresholded at this value, all pixels representing background will be transformed to black while all pixels representing objects will be turned to white.

The equation of a normal distribution with mean u and variance σ² is:

f(x) = (1 / (√(2π) σ)) exp(−(1/2)((x − u)/σ)²)    (4.2)

Determining the threshold to use to separate the food pellets from the background can be done by fitting a normal distribution to the gray levels representing the background pixels in the histogram. The mean of the normal distribution, u, is taken as the gray level where f(x) is maximum. The threshold is chosen as (u + 3σ).

Determining the Optimum Threshold for Object Separation
1. Low pass filter the histogram 3 times to smooth any noisy data.
2. Find the gray level that corresponds to the maximum value of the histogram in the black end - denote this gray level by index u.
3. Find f(u); it can be expressed as follows:

f(u) = 1 / (√(2π) σ)    (4.3)

4. Choose an index i near u. Using 4.3, solve the normal equation (4.2) for σ:

σ = (i − u) / √(−2(ln f(i) − ln f(u)))    (4.4)

5. Repeat step 4 for 5 other points near u and average the σ values obtained - σavg.
6. Since 99.7% of the normal distribution lies in the range (u − 3σavg, u + 3σavg), we should threshold the image at u + 3σavg. This sets the majority of the background pixels black, and most objects in the image white.

In a sequence of images, the threshold value used to separate objects from the background is found to be fairly consistent throughout the sequence. In an image that contains many food pellets, the histogram may become skewed towards the white end. A skew can be determined by comparing the σ values obtained using points (step 5) on opposite sides of the mean (u). In this case, the algorithm could be improved by using an average of the threshold values used for previous frames in the sequence, instead of the threshold value determined using the method described above.

Now that the image is reduced to a binary one, the object detection algorithm is used to determine, from a binary image, a list of the objects present in the image. The centroid and the area of each object are also stored. In the algorithm, the image is scanned row by row.
If a white pixel (pixel with a value of 255) is found (signifying the location of an object), the region growing procedure is initiated with this pixel's (seed pixel) coordinates as input. The region growing procedure is used to determine the size of the object, as well as the objects centroid. The object detection algorithm is presented below.  23  Object Detection Algorithm 1. Read pixel value at location (0,0) 2. If pixel has value of 255, start region growing procedure using current pixel location as seed location - Increment object count by one 3. Read the pixel value at the next location (x-coordinate increased by 1, if at end of row set x-coordinate to 0, increment y-coordinate by 1 - row by row scan). If at end of image STOP Else - GOTO step 2  The region growing procedure consists of repetitive searches for pixels with a value of 255 which are connected. Initially the 4-neighbours (Figure 10) of the pixel sent to the procedure (seed pixel) are checked to see if any of them have a value of 255. For each neighbour that does have a value of 255, its 4-neighbours are checked to see if any of them have a value of 255. After the 4-neighbours of a pixel have been checked, its location is stored in a data structure, a count of the number of pixels in the object is incremented, and a value of 128 is written to the pixel so it is not counted again. This repetitive search procedure continues until all the pixels in the object have been accounted for (none of the pixels in the object have 4-neighbours which have a value of 255). The centroid is then calculated using the coordinate values stored in the data structure. The coordinates of the centroids, and the number of pixels in the object are then stored. The region growing procedure then ends. The region growing procedure is presented below. The region growing function takes the coordinates of the seed pixel as input.  24  Region Growing Procedure 1. Set the seed pixel to the value 128. Put its coordinates on the queue. Set the number in the queue to 1. 2. If the queue is not empty - get the pixel coordinates off the queue - decrement the number in queue by 1 - increase the pixel count by one. Else GOTO step 5. 3. Read the gray levels of the 4-neighbours of the pixel. For each neighbour that has a value of 255, store its coordinates in a data structure, put its coordinates on the queue, and set its value to 128. Increase the number in the queue by 1 for each of the 4-neighobours that had a value of 255. 4. GOTO step 2. 5. Use the pixel coordinates in the data structure to determine centroid of object. Store centroid and number of pixels in object. 6. Return to the object detection algorithm.  When the region growing procedure ends, control returns to the object detection algorithm. The scan for a pixel with a value of 255 is resumed where it left off before the region growing procedure was initiated. The pixels that form the object detected by the region growing procedure were all set to a value of 128. Therefore none of these pixels will be used to initiate another region growing procedure. The object previously detected will be ignored for the rest of the object detection algorithm. The object detection algorithm ends when the scan reaches the end of the image. When the algorithm ends, a 25  list of the objects present in the image, and the centroid and size of each object have been stored. Figure 9 shows a binary image with the results from the object detection algorithm.  
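Taken together, the object detection scan and the region growing procedure amount to a queue-based connected-component search. The following C sketch follows the description above, using the same labelling convention (255 for object pixels, 128 for pixels already counted); the fixed-size queue, buffer layout, and function names are illustrative assumptions rather than the implementation used in this project.

#include <stdint.h>

#define MAX_OBJECT_PIXELS 65536   /* assumed upper bound on object size */

/* One detected object: centroid and size in pixels. */
struct object {
    double cx, cy;
    int    npixels;
};

/* Region growing: starting from a seed pixel with value 255, repeatedly
 * visit 4-neighbours, relabel each visited pixel to 128 so it is not
 * counted again, store its coordinates, and finally compute the centroid
 * from the stored coordinates.  img is a row-major width x height buffer. */
static int grow_region(uint8_t *img, int width, int height,
                       int seed_x, int seed_y, struct object *obj)
{
    static int qx[MAX_OBJECT_PIXELS], qy[MAX_OBJECT_PIXELS];   /* queue  */
    static int px[MAX_OBJECT_PIXELS], py[MAX_OBJECT_PIXELS];   /* pixels */
    int head = 0, tail = 0, count = 0;

    img[seed_y * width + seed_x] = 128;
    qx[tail] = seed_x; qy[tail] = seed_y; tail++;

    while (head < tail) {
        int x = qx[head], y = qy[head]; head++;
        px[count] = x; py[count] = y; count++;

        /* 4-neighbours: right, left, below, above */
        static const int dx[4] = { 1, -1, 0, 0 };
        static const int dy[4] = { 0, 0, 1, -1 };
        for (int k = 0; k < 4; k++) {
            int nx = x + dx[k], ny = y + dy[k];
            if (nx < 0 || nx >= width || ny < 0 || ny >= height)
                continue;
            if (img[ny * width + nx] == 255 && tail < MAX_OBJECT_PIXELS) {
                img[ny * width + nx] = 128;       /* mark as visited */
                qx[tail] = nx; qy[tail] = ny; tail++;
            }
        }
    }

    /* Centroid from the stored coordinates. */
    long sx = 0, sy = 0;
    for (int i = 0; i < count; i++) { sx += px[i]; sy += py[i]; }
    obj->cx = (double)sx / count;
    obj->cy = (double)sy / count;
    obj->npixels = count;
    return count;
}

/* Row-by-row scan of the binary image; every remaining 255 pixel seeds a
 * new region.  Returns the number of objects written to objs[]. */
int detect_objects(uint8_t *img, int width, int height,
                   struct object *objs, int max_objs)
{
    int nobj = 0;
    for (int y = 0; y < height && nobj < max_objs; y++)
        for (int x = 0; x < width && nobj < max_objs; x++)
            if (img[y * width + x] == 255)
                grow_region(img, width, height, x, y, &objs[nobj++]);
    return nobj;
}

Relabelling each visited pixel to 128 is what prevents it from seeding a second region when the row-by-row scan reaches it later.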
This algorithm requires a significant amount of storage space when operating on large objects. The algorithm could be implemented more efficiently by eliminating the storage of the pixel coordinates in the data structure. A total of the x coordinates, a total of the y coordinates, and the total number of pixels in the object could be maintained. After all the pixels in the object were determined, the centroid could be determined from these totals.  26  Object^Position^Number^of Pixels Object 1^(301,155)^8 Object 2^(250,176)^132 Object 3^(433,218)^127 Object 4^(339,222)^86 Object 5^(309,268)^17 Object 6^(288,275)^16 Object 7^(315,288)^44 Object 8^(360,295)^211 Object 9^(388,321)^65 Object 10 (212,398)^13 10 objects detected Figure 9 - Results from Object Detection Algorithm  27  ^  4.5^Dilation and Erosion In order to preprocess objects for separation of overlapping objects, and feature extraction and classification, dilation and erosion are performed to smooth the edges of the objects and remove noise in the image. These operations use the values of the neighbours of each pixel in the image to determine its new value. The following figures show two different types of neighbours used in calculations.  2^ 4^3^2 (x,y-1)^ (x-1,y-1)^(x,y-1) (x+1,y-1) 3^A^1^5^A^1 (x-1,y)^(x,y)^(x+1,y)^(x-1,y)^(x,y)^(x+1,y) 4^ 6^7^8 (x,y+1)^ (x-1,y+1) (x,y+1) (x+1,y+1) 4-neighbours of pixel A^8-neighbours of pixel A Figure 10 - 4 and 8-neighbours of a pixel  Dilation is used to fill in small gaps in the objects. If four or more of the 8-neighbours of a pixel have a value of 255, and the pixel itself has a value of 0 or 255, then the pixel is set to a value of 255. Otherwise its value is unchanged. This is achieved by convolving the image with a mask with the following form. 111 141 111 followed by a division by 4. If the result is greater than or equal to 255, the pixel corresponding to the center weight in the mask (4) is set to 255, otherwise the value of the pixel is unchanged. An example of the dilation process is shown in Figure 11. 28  Original Image  Image After Dilation  Figure 11 - Example of Dilation Operation  Erosion is used to remove extraneous pixels from the frame. If five or more of the 8neighbours of a pixel have a value of 0, then the pixel's value is set to 0. If less than five of the 8-neighbours have a value of 0, then the value of the pixel is unchanged. This operation is achieved by convolving the image with a mask with the following form. 111 181 111  29  followed by a division by 16. If the result is less than 176, then the pixel corresponding to the center weight in the mask (8) is set to 0, otherwise the pixel is set to 255. An example of the erosion process is shown in Figure 12.  Figure 12 - Example of Erosion Operation  To preprocess the image for separation of overlapping objects, dilation is performed, followed by erosion. These operations smooth the edge of the object so the object separation process can be performed if necessary.  30  4.6^Separating Overlapping Objects Since the underwater camera is viewing a three dimensional medium, it is possible that two or more food pellets will appear to overlap in the camera image. In order to maintain an accurate count of food pellets falling past the underwater camera, overlapping objects must be detected and separated before counting.  Given a list of pixels that make up an object, an algorithm has been developed to determine if the object represents two or more overlapping pellets, and if so determines how to separate them. 
The algorithm was developed by improving an algorithm described by Poon (Poon, 1989; Poon et al., 1992). After the dilation and erosion operations, the separation of overlapping objects can be done. The algorithm consists of four main steps:  Separating Overlapping Objects 1. Detect the boundary pixels of the object 2. Calculate an estimate of the tangent at each boundary point in the object 3. Determine the curvature at each boundary point in the object 4. Use the zero crossings in curvature to divide the object if necessary  Each of these steps are explained in the sections following.  4.6.1 Detecting the Boundary Pixels of an Object After the object detection algorithm has been executed, a list of objects, their sizes, and their centroids have been stored. The boundary pixel detection routine uses the centroid of the object as a starting point. The routine contains the following steps: 31  ^  Boundary Pixel Detection 1. start at the centroid of object (xc,yc) 2. increase the x-coordinate until the pixel at that coordinate (xn,yc) has a value of 0. Pixel at coordinates (xn-1,yc) is a boundary pixel. 3. follow the boundary of the object moving counter clockwise, storing coordinates until coordinate (xn-1,yc) is reached again.  After the initial boundary pixel is determined, the boundary is followed using codes to determine the direction of movement from the current boundary pixel to the adjacent valid boundary pixel. In order to be a valid boundary pixel, the pixel must have a value of 255 (must be part of the object), and it must be an edge pixel. To be an edge pixel, at least one of the 4-neighbours of the pixel must have a value of 0 (not part of the object). Codes are assigned based on the position of the next valid boundary pixel relative to the current boundary pixel as follows: Code 3^Code 2^Code 1 (x-1,y-1)^(x,y-1)^(x+1,y-1) Current Boundary ^Code 4^Pixel^Code 0 (x-1,y)^(x,y)^(x+1,y) ^Code  5^Code 6^Code 7 (x-1,y+1)^(x,y+1)^(x+1,y+1) Figure 13 - Movement Codes  In order to maintain the same direction of movement (counterclockwise) around the object, the next boundary pixel is determined using the previous movement code, and the movement codes of all valid adjacent boundary pixels. At the beginning of the algorithm, 32  the previous movement code is set to 4. This starts the movement in a counterclockwise direction. The next boundary pixel is determined as follows: The search for a valid boundary pixel begins at the movement code one counter clockwise position from the movement code directly opposite the previous movement code (i.e. if the previous movement code was 5, the movement code directly opposite code 5 is code 1, so the search begins one position counter clockwise from that position - movement code 2). If the pixel corresponding to that movement is a valid boundary pixel, then the current position is updated to this new boundary pixel. If the pixel in that position is not a valid boundary pixel, then the pixel corresponding to the movement code one counter clockwise position from that is checked. This search is continued until the next valid boundary point is found.  Figure 14 shows an example of the operation of the algorithm. At boundary pixel 4, there are three valid boundary points corresponding to movement codes 0,3, and 4. Since the previous movement code was 4, the search for a valid boundary pixel starts at movement code 1. 
There is not a valid boundary point corresponding to movement code 1, so the movement code one counter clockwise position away (movement code 2) is checked. There is not a valid boundary pixel corresponding to movement code 2, so movement code 3 is checked. There is a valid boundary pixel corresponding to movement code 3, so the pixel corresponding to movement code 3 becomes the next boundary pixel.  33  First Boundary Pixel  Starting Point  Movement Codes: 2,3,4,3,6,5,6,6,6,7,0,0,1,2,2  Figure 14 - Object Boundary Determination  At the end of the procedure, the coordinates of all the boundary pixels have been stored in order as they occur around the boundary of the object. The number of boundary pixels in the object has also be stored. These coordinates are used in the next section. Figure 15 shows an example of the results of the boundary detection routine.  34  Original Object  Boundary of Object  Figure 15 - Example of Boundary Detection  4.6.2 Estimating the Tangent at each Boundary Pixel Since the object consists of discrete pixels, an exact calculation of the tangent at each point in the boundary cannot be computed. However, an estimate of the tangent can be calculated. An accurate estimate of the tangent cannot be obtained by using only two adjacent boundary pixels. Since a boundary pixel can only occur in one of eight locations relative to a neighbouring boundary pixel, the tangent angle computed would be one of only eight angles. Clearly, more than two boundary pixels must be used to determine a 35  tangent estimate at each pixel in the boundary. The tangent estimate at a boundary point i is determined using the neighbouring four boundary points on either side of the boundary point i as follows: Let the x[1],x[2],...x[i],...x[n-1],x[n] represent the x-coordinates of the n boundary pixels Let the y[1],y[2],...y[i],...y[n-1],y[n] represent the y-coordinates of the n boundary pixels 4  dx[i] = Ex[i +j] —x[i —j] J=1 4  dY[i] = Ey[1.+A —y[i —j] J=1  (4.5)  (4.6)  Since the boundary points form a closed curve, wraparound is used when an index is greater than n or less than 1. For example x[0]=x[n] and x[n+2]=x[1].  The tangent angle, theta[i] at boundary point i can then be calculated as follows: if dx[i]=0^theta[i]=90.0*dy[i]/abs(dy[i])^(4.7) otherwise^theta[i]=atan(dy[i]/dx[i])^(4.8) Figure 16 shows results of determining the tangent at each boundary point in the given image. The tangent points theta[1...n] are used in the next section.  36  Boundary of Object Pixel  Tangent Estimate  (434,336) (434,335) (433,334) (433,333) (432,332) (431,331) (431,330) (430,329)  -87.1 -76.0 -68.2 -61.2 -57.0 -55.0 -51.7 -45.0  (431,343) (432,342) (433,341) (433,340) (433,339) (433,338) (434,337)  63.4 65.8 68.2 70.7 73.3 78.7 84.3  Figure 16 - Example of Tangent Estimates  4.6.3 Determining the Curvature at each Boundary Pixel The change in the tangent is an estimate of the second derivative of positions of the boundary points, or the curvature of the boundary points. An estimate of the curvature at boundary pixel i is obtained as follows: curvaturel[i]=--theta[i+1]-theta[i]^(4.9) curvature2N—theta[i]-theta[i-1] ^(4.10)  37  The curvatures are then adjusted to be in the range -90 degrees to 90 degrees as follows: if curvaturel[i]<-90^curvaturel[i]=curvaturel[i]+180^(4.11) if curvaturel[i]>90^curvaturel[i]=curvaturel[i]-180^(4.12) The same adjustment is done for curvature2[i].  
The curvature at boundary pixel i is then calculated as follows:  curvature[i]=0.5*(curvature1[i]+curvature2N) ^(4.13)  Figure 17 shows how curvature values change as the object boundary changes from concave to convex. The curvature values are used to divide the object if necessary.  Curvature  + W ±  + Figure 17 - Curvature for Concave and Convex Objects  38  4.6.4 Using Zero Crossings in Curvature to Divide the Object if Necessary Figure 18 shows plots of the curvature values for two objects. Object 1 is a single object and division is not necessary. Object 2 consists of two overlapping objects, and division is desired. In the plot for Object 1, there are no zero crossings. In the plot for Object 2, there are two sets of zero crossings. The zero crossings correspond to a change in curvature, and identify which boundary pixels should be used to divide the object. To divide the object, a boundary pixel with the minimum curvature from within each of the two sets of zero crossings are connected with a black line. Figure 19 shows the results of executing overlapping object separation on the two objects. Object 1 has not been divided, and Object 2 has been divided. These are the correct results for these cases.  Curvature  Object I  Boundary Pixel Zero Crossings  Curvature  Boundary Pixel  Object 2  Figure 18 - Curvature Plots for Two Different Objects  39  Object I  Object 2  Figure 19 - Results of Object Division for Two Objects  In some cases, small negative curvature values are calculated for boundary points that do not correspond to points where the object should be divided. When there are more than two areas of negative curvature, the two areas containing the largest negative curvatures are used for object separation. When curvature is calculated for large objects that should not be divided, often there are two or more areas of negative curvature. This does not occur often for small objects. The reason that large objects are more likely to have areas of negative curvature is that as object size increases, the average curvature of the object decreases, and therefore small variations in curvature can result in areas of negative curvature. This does not occur for small objects. Since the average curvature is large for small objects, small variations from the average curvature will still be positive. In order to prevent large objects that should not be divided from being divided, the curvature values are adjusted before zero crossing detection is done. The adjustment depends of the size of the object. If the object is small, the curvature values should not be changed a great deal. It is acceptable to subtract a small constant curvature from all the pixels in a small objects. This constant amount should not be large enough to cause objects to be divided that should not be divided. If the object is large, a constant curvature should be added to prevent objects that should not be divided from being divided. The constant amount should not be so large that large objects that should be divided are not divided.  40  Equation 4.14 shows the curvature adjustment which was used. For large objects, the average curvature will be small, so the second term should be large (larger than K2), and a positive constant will be added. For small objects, the average curvature will be large, so the second term will be small (slightly smaller than K2) and a small negative constant will be added. The determination of the constants is discussed in the results section. 
curvature[i] = curvature[i] + K1 / CAVG − K2    (4.14)

where:
CAVG = average curvature of all boundary points with positive curvature
K1 and K2 are constants

4.6.5 Case of more than two overlapping objects

In the case of more than two overlapping objects, the separation procedure can be repeated a number of times. The division procedure using zero crossings will be more complex in order to properly separate one object at a time from the group. This procedure has not been developed for this thesis.

Chapter 5 Measuring Features / Classifying Objects

5.1 Introduction

After objects have been detected in the image, it must be determined which objects represent food pellets, and which objects represent fish and other particles or objects in the water. Properties of the objects are measured, and these values are used to classify each object as either a food pellet or something else. In this section, the features chosen to classify the objects are presented, the methods of measuring these features are discussed, and classification using these features is described.

5.2 Features used to Distinguish Food Pellets from Other Objects

In the classification process, an object is identified as being a member of one of a number of groups (or classes) based on its characteristics. Features of the object are measured in order to quantify the characteristics of the object. The measured characteristics can then be numerically compared to the classes and a decision regarding which class the object is a member of can be made. In this application, objects are classified as being a member of one of two groups: food pellet objects, or other objects.

Ideal features for classifying objects into one of two groups have values that are very different for the two groups. The difficulty in classification lies in determining features that have significant differences in value for the two classes. In this application, the objects of interest are food pellets. These pellets have a cylindrical shape. Since the size of a food pellet in the image depends on the distance between the camera and the pellet, and food pellets can appear in different orientations in the image, the feature measurements taken from objects in the image must be invariant to both rotation and scaling. The features must measure properties that food pellet objects exhibit, but other objects do not, or vice versa, in order to achieve an accurate classification.

In this section, the four features used for classification are examined. These features are 1) circularity, 2) bounding area ratio, 3) minor to major axis ratio, and 4) minimum to maximum radius ratio.

5.2.1 Circularity

Circularity is a measure of the roundness of the object, and is defined as:

Circularity = Perimeter² / (4π · Area)    (5.1)

Circular objects will have circularity values close to 1.0. As object shape varies from circular, the circularity value will vary from 1.0. Food pellet objects are close to circular in most orientations, so this feature is useful for discriminating food pellet objects from other objects that are not circular.

To determine the perimeter of an object, the following steps are performed:

Calculating the Perimeter of an Object
1. determine the boundary pixels of the object (described previously in section 4.6.1)
2.
2. Start at one of the boundary pixels, and move around the object, incrementing the perimeter by a value that depends on the direction of movement from pixel to pixel as shown below (movements are from the center pixel to one of its eight neighbours):

   1.414   1.0   1.414
   1.0      X    1.0
   1.414   1.0   1.414

5.2.2 Bounding Area Ratio
The bounding area ratio is defined as:

Bounding Area Ratio = Object Area / Bounding Box Area    (5.2)

where the bounding box is a rectangle surrounding the object (Figure 21).

After the objects in an image have been detected, the area and centroid of each object are known. To calculate the bounding area ratio for each object, the bounding box area for each object must be determined. The steps required to calculate the bounding box area [19] are presented below:

Calculation of the Bounding Box of an Object
1. Determine the rotation angle θ of the object using moments of inertia. The central moment U(p,q) is defined as:

U(p,q) = Σx Σy (x - x̄)^p (y - ȳ)^q f(x,y)    (5.3)

where (x̄, ȳ) are the coordinates of the centroid of the object and f(x,y) is the intensity of the object at (x,y). The row moment of inertia is U(2,0), the column moment of inertia is U(0,2), and the row-column cross moment of inertia is U(1,1). The moment of inertia covariance matrix is defined as:

U = [ U(2,0)  U(1,1) ]
    [ U(1,1)  U(0,2) ]    (5.4)

By performing singular value decomposition on U, the diagonal matrix Λ is obtained:

E^T U E = Λ    (5.5)

E = [ e11  e12 ]
    [ e21  e22 ]    (5.6)

Λ = [ λ1   0  ]
    [ 0    λ2 ]    (5.7)

The columns of E contain the eigenvectors of U and the diagonal of Λ contains the eigenvalues of U. If the object is an ellipse, the eigenvectors and eigenvalues have the physical meaning implied in Figure 20.

Figure 20 - Moments of Inertia used to Determine Rotation Angle (eigenvectors (e11, e21) and (e12, e22) with lengths related to λ1^0.5 and λ2^0.5)

The rotation angle θ is determined as follows:

if λ1 > λ2,  θ = tan⁻¹(e21 / e11)    (5.8)

if λ2 > λ1,  θ = tan⁻¹(e22 / e12)    (5.9)

2. Determine the extremities (αmin, αmax, βmin, βmax) of the bounding box (the (α, β) coordinates shown in Figure 21) of the object from the rotation angle and the boundary pixels of the object:

α = cos θ (x - x̄) + sin θ (y - ȳ) + x̄
β = -sin θ (x - x̄) + cos θ (y - ȳ) + ȳ    (5.10)

where (x, y) are the coordinates of one of the boundary pixels.

3. Calculate the area of the bounding box from the extremities:

bounding box area = (αmax - αmin)(βmax - βmin)    (5.11)

The bounding area ratio will always be less than or equal to 1.0. Food pellet objects typically have bounding area ratios around 0.74. This feature can be used to help discriminate food pellet objects from objects which have a smaller bounding area ratio.

Figure 21 - Calculation of Bounding Box Area

5.2.3 Minor to Major Axis Ratio
The minor to major axis ratio is defined as:

Minor to Major Axis Ratio = Length of Minor Axis / Length of Major Axis    (5.12)

The minor to major axis ratio will always be less than or equal to 1.0. Food pellet objects typically have minor to major axis ratios between 0.65 and 1.0. This feature can be used to help discriminate food pellet objects from objects which have a small minor to major axis ratio.

5.2.4 Minimum to Maximum Radius Ratio
The minimum to maximum radius ratio is defined as:

Minimum to Maximum Radius Ratio = Minimum Radius of Object / Maximum Radius of Object    (5.13)

The minimum and maximum radii are determined by calculating the distance from the centroid of the object to each of the boundary pixels and storing the minimum and maximum distances calculated.
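As an illustration of this measurement, a minimal sketch in Python is given below. It assumes the boundary pixel coordinates and the centroid are already available; the function and variable names are illustrative only and are not taken from the actual implementation.

import math

def min_max_radius_ratio(boundary, centroid):
    # boundary: list of (x, y) boundary pixel coordinates
    # centroid: (x, y) centroid of the object
    cx, cy = centroid
    # Distance from the centroid to every boundary pixel
    radii = [math.hypot(x - cx, y - cy) for (x, y) in boundary]
    # Ratio of the smallest to the largest centroid-to-boundary distance
    return min(radii) / max(radii) if radii and max(radii) > 0 else 0.0

The value returned by such a routine is used directly as the minimum to maximum radius ratio feature.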
The minimum to maximum radius ratio will always be less than or equal to 1.0. Food pellet objects typically have minimum to maximum radius ratios between 0.6 and 0.8. This feature can be used to help discriminate food pellet objects from objects which have a larger or smaller minimum to maximum radius ratio.  47  5.3^Classification Using Feature Values Using the four feature values measured from an object, the object can be classified as belonging to either the food pellet object group or the other object group. Measurements are taken in feature space to determine which class the unknown object is a member of. Feature space is the n-dimensional space defined by the n features. In this case the feature space is a four dimensional space. Classification is done by calculating the distance in feature space from the object to the class mean for the food pellet object group. The smaller this distance is, the more likely that the object belongs to the food pellet object group. The first step in the classification process is to determine the class mean for valid food pellet objects. This is done by collecting the feature values for a large number of objects that are known to be food pellets and calculating the means for each feature as follows.  For N valid objects that are known to be food pellets, the class mean is defined as:  m=—1 N^ Ex.i =  M f2  N i=1 — m f 3  (5.14)  where:  f 1,  f 2, x, =^ f3,  (5.15)  is the vector which contains the feature values fl, f2i,f3i, and f4i for the ith food pellet object and where mn, mc,me, and mf4 are the means for the four features.  48  The Euclidean distance from the origin of the feature space to a point x in the feature space is defined as: dE (x,0)  =[(x)T(.1)]°.5^ (5.16)  In order to use this Euclidean distance measure, the following transformation is required. x' =(x —m)^  (5.17)  This transforms the feature space so the class mean is at the origin. The distance measure then becomes the distance from the class mean m to a point x in the feature space and can be determined as follows: d E (2c_,Ln_) =[(^=RI —111)T (-1.rn)] 05  (5.18)  Figure 22 shows the effect of this transformation on the feature space (2-D feature space shown).  Feature Space Feature Space^ Before Transformation^ After Translation Figure 22 - Feature Space Transformation  49  Using the Euclidean distance from a class mean to an object (represented by a point in the feature space) as a measure of the probability that the object is a member of that class, is adequate in the case where the features are uncorrelated and have equal variance. Figure 23 shows why the Euclidean distance is not adequate when the features are correlated and do not have equal variance.  Figure 23 - Euclidean Distances used for Classification  In Figure 23 a class of objects is represented by two features. The features are correlated and their variances are not equal. The Euclidean distance from the class mean to P1 is the same as the Euclidean distance from the class mean to P2. However, P1 is not a member of the class but P2 is. Therefore, a classifier based on Euclidean distance alone cannot accurately classify P2 as a member of the class, and exclude P1 from the class.  A transformation of the feature space is necessary to uncorrelate the features and scale the variances of the features so both features carry equal weight in the distance measure. The covariance matrix of the class can be determined as:  -N^— r(x. 
—m)(x, — N  i=1  50  (5.19)  The feature space can be transformed by a weighting matrix of the following form (2-D case)  w .[ W11 W12 ]  (5.20)  W21 W22  The transformations applied to the feature space are defined as: x' =(x —ni) x" =Wx' =W(x —m)  (5.21) (5.22)  Figure 24 shows the desired results of these transformations on the feature space. After the second transformation, the features become uncorrelated, and the variances become equal.  51  Feature Space Feature Space^ Before Transformation^ After Translation x2"  Feature Space After Rotation and Scaling Figure 24 - Feature Space Transformations Incorporating these transformations into the distance measure, it becomes: dw(x,m) =[(W(x —m))T W(x —1)]"^(5.23) dw(x,m) =Rx _my wz w (x _n)r^(5.24)  To uncorrelate the features and equate the variances of the features, a weighting matrix that transforms the covariance matrix to the identity matrix is required. WSW T^  52  (5.25)  From this, it can be found that (5.26) Therefore the distance measure becomes:  cl,(x,m) =[(x m)T (x m)r^(5.27) —  —  This distance measure is called the Minimum Intra-Class Distance (MICD) measure. It is used to get an unbiased measure of distance from the class mean when the features are correlated and the variances of the features are not equal. Using the MICD distance measure, objects can be compared to the class mean without effects from correlated features or unequal variances affecting the measure. Figure 25 shows a plot of some equal MICD distance contours on the original feature space. The equation of an equidistant contour is:  [(x. m.)T (x m)]°5 =k^(5.28) —  —  where k is a constant.  53  Di stance=a Distance=b Distance=c Distance=d  ^• Xi MICD Equidistant Contours a>b>c>d Figure 25 - Equidistant MICD Contours on Original Feature Space In order to classify unknown objects as either food pellet objects, or other objects, the following steps are performed. Object Classification I. Obtain a large set of known pellets objects and determine the feature values for each object. 2. Use the feature values from the known pellet objects to calculate the class mean in and the covariance matrix S. 3. For each of the known pellet objects, calculate the MICD distance from the class mean. 4. Choose a MICD distance that is larger than most if not all of the distances obtained in step 3. This is the classification distance. 5. For an unknown object, measure the feature values, and calculate the MICD distance from the pellet object class mean. If the MICD distance obtained is smaller  54  than the classification distance obtained in step 4 (in feature space, the object is inside the equidistant contour defined by the classification distance), then classify the object as a pellet object, otherwise, classify it as a non-pellet object.  Steps 1 to 4 are only performed once. The class mean m, the covariance matrix S, and the classification distance are stored, and used to calculate the MICD distance for each unknown object.  55  Chapter 6 Counting Objects using Object Tracking 6.1 Introduction The goal of this project is to count the number of food pellets that fall through the view area of the underwater camera. After objects representing food pellets have been detected in a sequence of frames, the objects must be counted in such a way as to avoid counting a single food pellet more than once. Some method of object tracking must be implemented in order to track a single pellet throughout the sequence of frames so it is only counted once. 
In this chapter, the algorithm that was developed to accomplish this goal is presented. 6.2 Overview There are many different methods of tracking objects through a video sequence. Many methods, [2],[13],[14],[16] assume the size and shape of the object, or targets on the object, remain constant throughout the sequence. They then use predictor methods to estimate the position of the object from its past motion history. A search for the object is carried out in an area around this position. Searching is done using template matching, histogram matching, or other gray level comparison methods.  The problem with using a tracking method of this type for this project is that the food pellet objects change in size and shape between consecutive image frames. Since the food pellets are falling past the camera, the food pellet objects generally get smaller in consecutive frames. A method of tracking is required which can accommodate changes in the size and shape of the object between consecutive frames. Horton [7] describes a method of matching objects in consecutive frames using a cost function which is based on measurements of the objects shape and size. The weightings of the different measurements that make up the cost function can be varied. This is the type of matching  56  method that was implemented for this project. A distance function (similar to Hortons cost function but including other measurements in addition to shape and size type measurements) is used to match objects in consecutive frames. The tracking and counting algorithm takes into account factors specific to this application. 6.3 Object Counting Algorithm The geometry of pellet motion is shown in Figure 26. It can be seen that when food pellets fall through the view area of the underwater camera, they will always enter the image frame at one of the edges. If a food pellet falls straight down the water column, the pellet motion in the image frame will be from the edge of the frame inward. These two properties were used to develop the object counting algorithm used in this project. To examine the operation of the algorithm an example will be explained.  Falling Food Pellet  Video Camera  Frame at T1  • •  •  Frame at T2  •  Frame at T3  Frame at T4  • Figure 26 - Food Pellet Motion  In this example, it will be assumed that two frames, Frame i and Frame i+1 have undergone object detection and classification, and a list of valid objects representing food pellets is available for each frame. Figure 27 shows Frame i and Frame i+1 with food pellets, the New Object Area and the Center Area labeled.  57  Figure 27 - Two Consecutive Image Frames  The object counting algorithm involves tracking objects from Frame i to Frame i+1 and counting new objects in Frame i+1 that have just entered the view area of the underwater camera. The new objects that are counted are objects that were not tracked from Frame i, and are located in a thin band around the edge of the frame (New Object Area). The algorithm is presented below:  Object Counting Algorithm for Frame i and Frame 1+1 1. Execute the object matching algorithm between objects in Frame i and objects in Frame i+1 2. All objects in the New Object Area of Frame i+1 that were not matched with objects in Frame i are counted, and the resulting number is added to the total pellet count. 3.  i=i+ 1  4. 
if frame i+1 exists GOTO STEP 1, else STOP  58  Object tracking is done to ensure that if a pellet does not move out of the New Object Area between frames, or a pellet moves back into the New Object Area from the Center Area, it will not be counted again. The width of the New Object Area depends on the sampling frequency of the frames. The longer the sampling frequency, the wider the New Object Area must be. This is true because if the New Object Area is too narrow, a food pellet could enter the frame, and move through the New Object Area and into the Center Area between frames, and the algorithm would not count it. The wider the New Object Area, the greater the possibility of an error in the object count due to loss of track of a pellet. The width of the New Object Area is proportional to the maximum distance a food pellet can move between two frames, which varies with the frame sampling rate.  The choice of frame sampling rate affects the complexity of the object tracking algorithm. The higher the sampling rate, the easier tracking becomes because objects move very small distances between frames. Therefore, the object matching algorithm would not need to be as complex. However, as the sampling frequency is increased, the number of frames to be processed increases, and the computational power required increases. There is a trade-off between the sophistication of the object matching algorithm, and the processing power required.  One cause of error in pellet count occurs when the computer loses track of food pellets as they move through the view area of the camera. Loss of track of a food pellet could occur in a number of situations. If a fish or other object moved between the camera and a food pellet between Frame i and Frame i+1, the pellet would not be visible in Frame i+1, and a loss of track would occur. If the fish or other object moved out of the way between Frame i+1 and Frame i+2, the pellet would reappear in Frame i+2. If the pellet reappeared in central area of the frame, there would not be a problem, tracking would 59  resume on the pellet through subsequent frames. If however, the pellet reappeared in the New Object Area, the algorithm would not be able to differentiate the pellet from a new pellet entering the view, and would recount it. Therefore the New Object Area should be as narrow as possible in order to minimize the possibility of recounting error due to loss of track of objects. The use of a narrow New Object Area requires a high sampling frequency between frames, and consequently more processing power. If a pellet fell out of range of the camera between Frame i and Frame 1+1, the pellet would not be visible in Frame i+1, and a loss of track would occur. In this case, tracking would simply cease for this object, as there is no possibility of the object reappearing in future frames, and no counting error would occur.  The object matching algorithm is the most important step of the object counting algorithm, and will be described in the next section.  6.4^Object Matching Algorithms The object matching algorithm is used to determine the new positions of objects detected in Frame i, in Frame i+1. This information can be used to determine if new objects have entered the view area of the camera in Frame i+1. In order to match objects in Frame i with objects in Frame i+1, a distance measure is used to determine the probability that an object in in Frame i should be matched with an object (i+l)m in Frame i+1. 
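Before the matching algorithms are described in detail, the counting loop of Section 6.3 can be sketched with the matching step treated as a black box. The sketch below is a minimal Python illustration only; match_objects, the per-object records, and the New Object Area test are hypothetical names and do not correspond to the actual implementation.

def in_new_object_area(obj, frame_width, frame_height, band):
    # True if the object's centroid lies in the thin band around the frame edge
    x, y = obj["centroid"]
    return (x < band or y < band or
            x >= frame_width - band or y >= frame_height - band)

def count_pellets(frames, frame_width, frame_height, match_objects, band=100):
    # frames: one list of classified pellet objects per sampled frame
    # match_objects(prev, curr): hypothetical helper returning the indices of
    # objects in curr that were matched to some object in prev
    # band=100 pixels is the New Object Area width used in the experiments
    total = 0
    for prev, curr in zip(frames, frames[1:]):
        matched = match_objects(prev, curr)
        for idx, obj in enumerate(curr):
            # Count only unmatched objects that have just entered the view
            # area, i.e. objects lying in the New Object Area
            if idx not in matched and in_new_object_area(obj, frame_width,
                                                         frame_height, band):
                total += 1
    return total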
In Figure 28, two frames, Frame i and Frame i+1 are shown with their respective objects. An overlay of Frame i and Frame i+1 is also shown, along with the desired matching result.  60  Figure 28 - Object Matching Example  The probability that two objects should be matched is measured by a distance function. The distance function can incorporate many measures such as Euclidean distance, relative sizes of the objects, and other factors. The smaller the value of the distance function, the 'closer' the two objects are, and the higher the probability that they should be matched.  61  An object in in Frame i is matched with an object (i+l)m in Frame i+1 if: 1) object i+lm is the closest object in Frame i+1 to object in 2) object in is the closest object in Frame i to object i+lm 3) the distance between objects in and i+lm is smaller than a maximum distance.  This can be stated as: ^ (6.1) distance(in,(i+l)m) <= distance(in,(i+l)k) V (i+l)k E Frame i+1 ^ (6.2) distance(in,(i+l)m) <= distance(ii,(i+l)m) V i1 E Frame i ^ (6.3) distance(in,(i+l)m) <a certain maximum distance where the maximum distance dependent on the maximum Euclidean distance a pellet can move between frame times (dependent on sampling frequency)  distance(in,i+1m) is dependent on a number of factors which are explained in section 6.5.  Three algorithms were developed for this project that can be used to match objects using the above distance measures. Each algorithm has its advantages and disadvantages. These three algorithms are presented in the sections following.  62  6.4.1 Large Storage, Minimum Number of Distance Calculations Algorithm In this algorithm, all distances from each object in Frame i to all objects in Frame i+1 are calculated. The distances are stored in an array (Figure 29). Objects in Frame i 1 Objects in Frame i+1  2  •••  n  1  distance(1,1) distance(2,1)  .^.^.  distance(n,l)  2  distance(1,2) distance(2,2)  .^.^.  distance(n,2)  • .  . .  .^.^.  distance(n,m)  . . m  . .  . .  distance(1,m) distance(2,m)  Figure 29 - Distance Array for Large Storage Algorithm  The array is then used to match objects in Frame i with objects in Frame i+1 as follows. Object s in Frame i+1 should be matched with object t in Frame i if the following conditions are met: 1. distance(t,^) is the minimum distance in row s 2. distance(t,^) is the minimum distance in column t 3. distance(t,^) is less than a certain maximum distance  The algorithm starts on the first row and finds the column with the minimum entry. If this entry is also the minimum in that column, and the distance is less than the maximum distance, then the objects are matched. If two objects are matched, then the maximum distance is written into all the entries in row s and column t to prevent these objects from being used for subsequent matches. If the objects are not matched, the algorithm moves  63  on to the next row. The algorithm cycles through the rows repeatedly until no matches are made in a cycle. The algorithm is presented below. Large Storage Object Matching Algorithm 1. match=0, s=1 2. if object s is unmatched, fmd the minimum distance in row s - found in column t 3. if entry (t,^) is the minimum in column t, object t in Frame i is unmatched, and distance(t,^) is less than the maximum distance then object t in Frame i is matched with object s in Frame i+1, match=match+1 all entries in row s and column t are changed to the maximum distance 4. Increment s and repeat STEPS 2 and 3 until the last row is reached. 
If the last row is reached and match=0 (no matches in last cycle) - END, else GOTO STEP 1  This algorithm has the advantage that it requires the minimum number of calculations of distances between objects in the two frames. For any matching algorithm with n objects in Frame i and m objects in Frame i+1, the minimum number of distance calculations is n*m. If (n-m) is close to zero, then the number of distance calculations required is in the order of n2 (0(n2)). The disadvantage of this algorithm is the large amount of storage it requires. For n objects in Frame i and m objects in Frame i+1, the algorithm requires m*n units of storage. If large amounts of storage are available, this algorithm is the best of the object matching algorithms presented. 6.4.2 Small Storage, Maximum Number of Distance Calculations Algorithm In this algorithm, a match for each of the m objects in Frame i+1 is determined one object at a time as follows: The distances from an object s in Frame i+1 to all objects in Frame i are calculated, and the minimum distance is determined. Let the object in Frame 64  i corresponding to the minimum distance be object t. The distances from object t in Frame i to all objects in Frame i+1 are calculated. If the minimum distance corresponds to object s in Frame i+1, and the distance is less than the maximum distance, then object s in Frame i+1 is matched with object t in Frame i, and both objects are marked as being matched so they are not used in subsequent matches. The algorithm then moves to the next unmatched object in Frame i+1. The algorithm cycles through the objects in Frame i+1 repeatedly until no matches are made in a cycle. The algorithm is presented below. Small Storage Object Matching Algorithm I. match=0; s=1 2. if object s in Frame i+ I is unmatched, calculate the distance to all unmatched objects in Frame i and store the results 3.  Sort the distances in ascending order.  4.  for object in Frame i corresponding to minimum distance (object t), calculate distances to all unmatched objects in Frame i+1 and store the results.  5.  Sort the distances in ascending order.  6. If the minimum distance corresponds to object s in Frame i+1, and the distance is less than the maximum distance, match object t in Frame i with object s in Frame i+1. Mark the objects as being matched so they are not used in subsequent matches. match=match+ 1 7. Increment s and repeat STEPS 2 to 6 until the last object in Frame i+1 is reached. If the last object is reached and match=0 (no matches in last cycle) - END, else GOTO STEP 1  65  This algorithm has the advantage that it requires a very small amount of storage. For n objects in Frame i and m objects in Frame i+1, the algorithm requires only 2m+2n units of storage. 2n units to store the n distances from objects in Frame i+1 to all objects in Frame i, and their corresponding object numbers. 2m units to store the m distances from objects in Frame i to all objects in Frame i+ 1, and their corresponding object numbers.  The disadvantage of the algorithm is the large number of distance calculations that must be performed. For n objects in Frame i and m objects in Frame i+1, the minimum number of distance calculations can be determined as follows:  Step 1  ^  Calculate n distances to objects in Frame i. Calculate m distances to objects in Frame i+1 two objects are matched  Step 2  ^  Calculate n-1 distances to objects in Frame i. 
Calculate m-1 distances to  objects in Frame i+1 two objects are matched  Step m-1 Calculate 2 distances to objects in Frame i. Calculate 2 distances to objects in Frame i+1 - two objects are matched  The total number of distance calculations (if n<m) is:  n(n +1) m(m +1) (m —n)(m —n +1) —mn +n Ek^ + Ek k=1^k=m-n+1^2^2^2  (6.4)  The total number of distance calculations is approximately equal to n*m, the same number of calculations required by the large storage algorithm. If (n-m) is close to zero, the number of distance calculations is 0(n2). However, this is the best case. In the best case, for each object s in Frame i+1, the closest object in Frame i to that object (object t)  66  is always its match (object s in Frame i+1 is the closest object to object t in Frame i, and the distance is less than the maximum distance). This is very unlikely and therefore the algorithm will most likely require many more calculations than required in the best case.  In the worst case, only one match is made for each cycle through the m objects in Frame i+1. The total number of distance calculations is:  E(n —k)(m —k) +(n —k) = Enm —nk —mk +k2 +n —k k=0^  k=0  =n2m +n2 —(m +n +1) Ek + Ek2 k=0^k=0  n(2n +1)(n +1)  =n 2^2 m +n —(m +n +1) n(n +1) 2^6  (6.5)  If (n-m) is close to zero, the number of distance calculations will be 0(n3). This is much larger than the number of calculations required in the best case. In the average case, the number of distance calculations will be between 0(n2) and 0(n3). If large amounts of storage are not available, this algorithm is an alternative to the large storage algorithm, but requires much more computation. However, this algorithm can be modified to maintain the low storage requirement, and reduce the amount of computation required. This modification is presented in the next section.  67  6.4.3 Small Storage, Intermediate Number of Distance Calculations Algorithm The small storage algorithm can be modified to reduce the number of distance calculations required, and maintain the low storage requirements. The algorithm is described in this section.  In this algorithm, matches are made between objects using the same method as was used in the previous algorithm, but if the closest object in Frame i (object t in STEP 4 of previous algorithm) is not matched with object s in Frame i+1, the second, and third closest objects are tried before moving on to the next object in Frame i+1. Since it is unlikely that for each object in Frame i+1, the closest object in Frame i to that object is its match, the closest three objects are tested. It is likely that one of the closest three objects will be its match. The algorithm is presented below.  Second Small Storage Object Matching Algorithm 1. match=0; s= 1 2. if object s in Frame i+1 is unmatched, calculate the distance to all unmatched objects in Frame i and store the results 3. Sort the distances in ascending order. count=1 4. OBJECT = object in Frame i corresponding to minimum distance (object t) 5. calculate distances from OBJECT to all unmatched objects in Frame i+ 1 and store the results. 6. Sort the distances in ascending order.  68  7. If the minimum distance corresponds to object s in Frame i+1, and the distance is less than the maximum distance, match object t in Frame i with object s in Frame i+1. Mark the objects as being matched so they are not used in subsequent matches. match=match+ 1 8. if a match was not made, and count=1 - repeat STEPS 4-7 for OBJECT = second closest object in Frame i from STEP 3. count=2 9. 
if a match was not made, and count=2 - repeat STEPS 4-7 for OBJECT = third closest object in Frame i from STEP 3. count=3 10. Increment s and repeat STEPS 2 to 6 until the last object in Frame i+1 is reached. If the last object is reached and match=0 (no matches in last cycle) - END, else GOTO STEP I  This algorithm has the same storage requirements as the previous algorithm (2m+2n units of storage). In the worst case, the algorithm will perform exactly the same number of distance calculations as the previous algorithm, and in the average case, it will perform fewer calculation than the previous algorithm. The exact number of distance calculations required will depend on how often an objects match is one of the three closest unmatched objects.  69  6.5 Distance Measure Between Objects The distance measure between two objects (one in Frame i at (x ,yi), the other in Frame i+1 at (x2,y2)) is inversely proportional to the probability that the two objects should be matched (i.e. the probability that the objects represent the same particle moving through the view area of the camera). The distance measure is composed of a number of individual measurements. The measurements are:  1. Euclidean distance ratio measure - The Euclidean distance between the centroids of the two objects (centroids at (xi,yi) and (x2,y2)). It is likely that the position of an object moving through the view area of the camera will not change by a large amount between the two images. Therefore the Euclidean distance between two objects in consecutive frames will be small if the objects should be matched. The Euclidean distance is defined as:  Euclidean Distance =V((x2 —x1 )2  ^ +(y2 —Y1 )2)  (6.6)  As a result of the camera geometry, objects near the edge of the frame tend to move larger distances between frames than objects closer to the center of the frame. The maximum distance an object can move between frames is dependent on its position within the frame. This distance is used to prevent matching objects that are too far apart. The maximum distance is defined as follows: The center of the frame has coordinates (xc,ye) The top right corner of the frame has coordinates (xmax,0) The maximum distance from center is defined as: Maximum Distance from Center = Euclidean Distance ( (xc,ye), (xmax,0) ) The maximum movement at the edge of the frame is Me The maximum movement at the center of the frame is Mc  70  The maximum distance an object can move depending on position (Distance from Center is the Euclidean distance from the center of frame to the centroid of the object in Frame i) within the frame is defined as:  (M —M ) xDistance from Center Maximum Distance — " Maximum Distance from Center  (6.7)  The Euclidean distance ratio Er is then defined as: Euclidean Distance(( x ), (x2 , y2 Euclidean Distance Ratio^=^ Maximum Distance ,  ))  (6.8)  This ratio is a measure of the closeness of the objects. A low value indicates a high probability that the objects should be matched, while a value near 1.0 indicates a lower probability of a match.  2. Area ratio measure - The ratio of the areas of the two objects. It is likely that the size of an object moving through the view area of the camera will not change by a large amount between the two images. Therefore the ratio of the areas of two objects in consecutive frames (areai and areali+i) should be close to 1.0 if the objects should be matched. The size ratio is defined as:  if area, >area,+, Area Ratio A,. — if area  areai+, area;  area; ^Area Ratio A, = area,+,  71  (6.9)  3. 
Motion vector between two objects - The motion vector is the vector formed by joining the centroids of two objects in consecutive frames. As a result of the camera geometry, objects generally tend to move from the edge of the frame inward, towards the center of the frame. Measurements can be made to determine if the motion of the object is towards or away from the center of the frame, and how close to the center of the frame the motion vector lies.  In most situations, if the distance from Object 2 in Frame i+1 to the center is further than the distance from Object 1 in Frame i to the center, then the motion vector does not point inward, and the two objects should not be matched (Figure 30).  Center (x0Y.)  Distance(Center, Object 2)  Object 2 (x2,Y2) Distance(Center,Objectl)  Motion Vector  Object 1 (x1,311)  Figure 30 - Motion Vector between Objects in Consecutive Frames A measure of the deviation of the motion vector from the center point can be obtained by calculating the distance from the center to the closest point on the vector path (Figure 31).  72  Figure 31 - Measure of Deviation of Motion Vector from the Center Point  The minimum distance to the center Dc is calculated as follows: equation of vector line: y = m xx + b  (6.10)  —Y 1 slope of vector =m = Y2^ x2 -.X 1  (6.11)  y —intercept =b =y2 —mx2  (6.12)  D, =  abs(mx, —y, +b)  (6.13)  1/M2 +1  The larger the minimum distance to the center, the less likely that Object 1 and Object 2 should be matched.  6.5.1 Combining Distance Measures The total distance used for object matching can be calculated as a weighted combination of the individual distance measures as follows: Distance =1/(weE,. )2 +(wa (1.0 —A„))2 +(w,D, )2^(6.14) 73  where: Er,Ar,Dc are Euclidean distance ratio, area ratio, minimum distance from motion vector to center we,wa,wc are weightings for each component distance measurement The weightings can be adjusted for different conditions. In some situations, use of one or more of the individual distance measures may adversely affect matching, so they should be eliminated from the total distance measure by weighting them by 0.  74  Chapter 7 Experimental Results and Discussion 7.1 Introduction  In this chapter, the performance of each of the various components of the food pellet counting system are discussed. Problems encountered in each part are described and possible solutions are suggested. Overall system tests are described, and counting accuracy results are presented.  7.2 Image Acquisition  The quality of the images analyzed by the computer is an important factor in the food pellet counting process. If the quality of the images is low, the information extracted from the images is noisy, and analysis becomes more difficult. In digital images, quality can be described by two factors. The first factor is accuracy in measuring the intensity level of each pixel in the image. As the accuracy decreases, the image will become more noisy, and the quality will decrease. The second factor in image quality is the number of pixels used to represent the image (resolution). As the number of pixels used to represent the image decreases, the value of the image from an image analysis standpoint decreases. The image sequences were acquired by a standard RS-170 CCD camera, and stored on SVHS videotape. A frame grabber card was then used to extract images from the videotape, digitize them, and store them on the computer. The images available for computer analysis were 512 by 480 pixels. 
Noise is added to the images in the process of storing images on videotape, and in the process of transferring them from videotape to the computer. In addition, the quality of the images output from the camera is not high. The camera is made for television quality output, not for applications which require highly accurate gray level information. Therefore, the quality of the images stored on the  75  computer is significantly lower than desirable. Higher quality images can be acquired using better equipment.  If a digital camera were used to capture and store images at a given sampling rate, the images available for computer analysis would be of much higher quality than those currently available. A digital camera can produce high resolution images with highly accurate pixel intensity values. Digital cameras can capture images 1000 by 1000 pixels, 2000 by 2000 pixels and larger. Using a digital camera bypasses the process of storing images on videotape, and transferring the images from videotape to the computer for analysis. Therefore the noise introduced by these processes is eliminated.  Another problem in image acquisition occurred in extracting images from videotape. The videotape was time coded so individual frames could be acquired by the frame grabber and stored. Unfortunately when a specific frame was cued up on the VCR, and held in STILL mode, the VCR interpolated one of the fields of the image (i.e. one of the fields was from data off the videotape, and the other field was interpolated from this data). This meant that all the data stored on the tape was not being used. To resolve this problem, frames were acquired from the video signal while the VCR was in PLAY mode. The computer received the time codes from the VCR as the tape played, and when the desired time code was returned, the frame grabber was activated to capture the desired frame. The problem with using this method was that the frame grabber did not consistently grab a complete frame. Often it grabbed one field from one frame, and the other field from the following frame. This caused a problem, because if an object moved between frames, it appeared in slightly different locations in the two fields.  76  It was decided that the inaccuracy introduced by interpolating fields was the lesser of the two problems and therefore images were acquired while the VCR was in STILL mode. For better results, the problem with acquiring image frames must be corrected.  Image sequences of falling food pellets were acquired at a number of different sites, in different weather conditions, and with and without fish in the view area of the camera. Some initial image sequences were acquired in grow-out tanks at the Department of Fisheries labs in West Vancouver. These tanks are cylindrical, 3 meters in diameter and 1 meter deep. The bottom of the tank was painted black so the food pellets would appear white in the images. The tanks were not deep enough to do tracking and counting experiments because food pellets never fell out of the view area of the camera. This footage was used to determine if measurements taken from pellet objects in the image could be used to identify the orientation of the food pellet, and the distance from the camera to the food pellet. This data was obtained to be used for a counting method which involved counting food pellets as they passed through a thin layer at a known distance from the camera. 
It was determined that since a pellet object is quite small in the image (represented by a small number of pixels), measurements taken from the object would not be accurate enough to reliably determine the orientation of the pellet and the distance from the camera to the pellet. Therefore, this method of counting food pellets was abandoned.  Another problem that occurred during image acquisition was that when food pellets passed very close to the front of the camera, they were shadowed from the overhead illumination by the camera rig, and therefore did not appear white. Under these circumstances, these food pellets would not be detected by the object detection routines, and would therefore not be counted. One possible solution to this problem would be to  77  attach a transparent shield (Figure 32) around the camera housing to prevent food pellets from coming too close to the front of the camera.  Food pellets 7mm in diameter and larger were used in the development of the automatic counting system. The system can be used with any size of pellet, but the pellet size will determine the effective coverage area of the camera. In farming practice, the size of the food pellet used is increased as the fish increase in size. In order to use the pellet counting systems in a sea cage, the sea cage must be of sufficient depth. The camera must be far enough away from the bottom of the net so that in the images, the net appears dark relative to the food pellets. Prior experiments indicated that the camera should be at least 4 meters below the surface of the water in order to avoid the back scatter of light off suspended particles in the first 4 meters of water. These constraints define a minimum sea cage depth for which the systems will work properly.  78  Some sample data was acquired in a commercial sea cage owned by BC Packers on Broughton Island. This cage measured 15 meters by 15 meters by 21 meters deep. Sequences of food pellets with and without fish were acquired in this cage. This footage was used to develop the current tracking and counting algorithm.  Image footage was also acquired at the Pacific Biological Station in Nanaimo, B.C. The sea cages at this site measured 7.62 meters by 7.62 meters by 7.62 meters deep. This footage was used to test the tracking and counting algorithm. 9.5mm food pellets were distributed in the view area of the camera in amounts consistent with actual feeding rates.  7.3 Preprocessing After images sequences were transferred from the videotape to the computer, each frame was preprocessed before performing object detection. Preprocessing involved automatic thresholding to separate the objects in the image from the background, and dilation and erosion to clean up the thresholded image. After each frame was preprocessed, it was stored for future use.  The automatic threshold determination algorithm performed well in most cases. In some cases where there were large fish in the frame, or many food pellets, the optimal threshold was not automatically determined. The algorithm requires a good estimate of the background gray level distribution to function properly. Since there were many objects in the frame, the gray level distribution of the image was skewed, and an accurate estimation of the background gray level distribution could not be determined. Therefore the correct threshold value was not chosen.  
In order to get a more accurate estimation of the background gray level distribution, the image could be divided in to a number of sections, and for each section an intensity 79  histogram could be calculated. The automatic threshold deteithination algorithm could be executed on each section, and a threshold estimate would be calculated for each section. These estimates could then be used to determine the best threshold to use for the entire image. The benefits of this approach can be illustrated by considering an image frame in which a small number of the sections contained many food pellets or lightly coloured fish, but the rest of the sections contain few food pellets. By taking an average of the thresholds returned for each section by the automatic threshold determination algorithm, a threshold would chosen which was very close to the threshold determined for the sections with few food pellets. This threshold would be close to the optimal threshold for the image. Instead of simply using the average of the thresholds determined for each section, more sophisticated algorithms for determining the optimal threshold to use could be developed.  The dilation and erosion operators are useful for cleaning up the image after thresholding. The dilation operator fills in holes in objects and the erosion operator removes extra pixels which are not part of objects. The problem with these operators is that the execution time is relatively long. However, with the current image quality, they are necessary.  7.4 Object Detection In the object detection procedures, the objects in the image are first located, then any overlapping objects are separated. After overlapping object separation, all the objects in the image are located and their centroid positions and sizes are stored. It is necessary to repeat object detection after overlapping object separation because if a single object has been divided into two objects, the positions of the centroids of the two objects must be stored instead of the position of the original object.  80  As mentioned previously, the object detection algorithm could be implemented more efficiently. However, in some uses of the object detection algorithm, a list of the coordinates of all the pixels in the object is desirable. For example, when calculating moments of inertia for object classification (5.3), it is necessary to know the coordinates of each pixel in the object as well as the centroid of the object. Therefore the choice of which type of implementation of the algorithm to use is dependent on the necessity of storing the coordinates of all the pixels in the object.  Separating overlapping objects involves determining the boundary pixels of the object, and calculating the curvature at each boundary pixel. The curvature values are then used to determine if the object should be separated, and if so, how it should be separated. Since the object is a discrete representation of the actual object, and the object is small (not represented by many pixels), the calculated curvature values are often inaccurate. This inaccuracy can result in negative curvature values being calculated for boundary pixels which should have positive curvature values. In the separation procedure, areas of negative curvature are joined with a black line to separate objects. If the curvature values for some boundary pixels are determined to be negative when they should be positive, the object may be divided incorrectly, or divided when it should not be divided. 
One solution to this problem would be to use higher resolution images. When a food pellet is represented by more pixels, the curvature estimate would tend to be more accurate, and separation errors would decrease. Otherwise, perhaps a more accurate estimate of curvature could be obtained using a different method.  As described in Chapter 4, the curvature values are adjusted before determining if an object should be divided. The constants used for curvature adjustment were determined by gathering a small set of objects, some for which division was desired, and others for which division was not desired. The areas of minimum curvature and the average 81  positive curvature for each object were calculated. The curvature adjustment constants were then adjusted so the division algorithm would correctly divide the objects for which division was desired, while not dividing the objects for which division was not desired (i.e. the non-overlapping objects). The curvature adjustment was performed as follows:  .^140.0 curvaturek I =curvature[i] + ^ 15.0 C AVG  (7.1)  A detailed study into setting the curvature adjustment constants was not performed. Tests with a large set of objects could be performed to optimize the values of the constants for better overlapping object separation.  The method developed for separating overlapping objects can correctly divide only two objects which are overlapping. It three or more object are overlapping, the object separation algorithm will fail. The algorithm will fail because it will not be able to identify which sections of negative curvature correspond to which objects. Figure 33 shows three overlapping objects with a representation of the curvature along the boundary.  82  The first figure shows a set of three overlapping objects. The graph shows the curvature along the boundary of the set of objects. It can be seen that the curvature along the boundary of the set of objects becomes negative in four places. The second figure shows the desired object separation result; negative curvature point 1 has been joined to negative curvature point 4 with a white line, and negative curvature point 2 has been joined to negative curvature point 3. This division results in the correct separation of the three objects. However, applying the overlapping object separation algorithm in its current form would not necessarily result in this correct separation. The third figure shows another possible separation. Point 1 has been joined to point 2, and point 3 has been joined to point 4. This division results in an incorrect separation. Other separation results are also possible. The problem is that the algorithm does not know which negative curvature sections to join together to achieve correct separation. The problem becomes  83  greater when more than three objects are overlapping. When overlapping objects are not correctly separated, the pellet counting algorithm cannot operate correctly, and an error in the pellet count is introduced. The frequency of food pellets overlapping in an image is dependent on the density of pellets falling through the water column. In the experiments that were carried out for this thesis, two food pellets overlapping was fairly common, but three or more pellets overlapping was not common. If higher densities of food pellets than were used for these experiments are used, a method of separating three or more overlapping food pellets would likely be required.  7.5 Feature Extraction and Classification Four features were used for object classification. 
The features used were circularity, bounding box ratio, minor to major axis ratio, and minimum to maximum radius ratio. All the features are rotation and scaling invariant. Invariance to rotation and scaling is required for this application as food pellets appear in different rotations and are scaled depending on their distance from the camera.  Circularity measures the 'roundness' of an object. On most pellet objects, the circularity value was an accurate measure of the 'roundness' of the object. However, when objects were small (less than 10 pixels in area), the circularity measure was inaccurate. This inaccuracy was a result of the discrete nature of the digital images. When objects are small, their shapes are not well represented by a small number of pixels. If a small circular object is represented by a small number of relatively large pixels, the perimeter is measured to be proportionally smaller than it should be for the measured area. The circularity value will then be smaller than 1.0, and this may cause the object to be classified incorrectly. If circularity is to be used as a feature for classification, higher resolution images should be used, so feature measurements are more accurate.  84  The bounding box ratio measures the ratio of the area of the object to the area of the minimum size rectangle that completely encompasses the object. This feature is useful for discriminating between pellet objects, and objects which may not be convex (a line between any two points on the object lies entirely inside the object). The majority of pellet objects are convex. For most pellet objects, the bounding box ratio should be near 1.0. If an object had a hole in it (non-convex object), the bounding box ratio would be less than 1.0, and should therefore be classified as a non-pellet object. The key in calculating the bounding box ratio is determining the rotation of the minimum area bounding rectangle that completely encompasses the object. In the current implementation, the orientation of the object is estimated using moments of inertia, and the dimensions of the bounding rectangle are determined using this orientation. However, this method does not always yield the minimum area bounding rectangle. Figure 34 shows an object for which the currently used algorithm calculates the bounding box ratio incorrectly. The figure shows an object, the best fit bounding box, and the bounding box calculated by the currently used algorithm. It can be seen that the calculated bounding box has a much larger area than the best fit bounding box. In this case, the bounding box ratio should be close to 1.0, but because the calculated bounding box is so large, the ratio will be much less than 1.0.  85  Calculated Bounding Box  Object Best Fit Bounding Box Figure 34 - Error in Bounding Box Determination  The reason why the bounding box is calculated incorrectly is that the object is symmetrical in both the X and Y dimensions. The row and column moments of inertia are therefore both 0 or very close to 0, and the orientation is then calculated be close to 0 degrees. Since the orientation is calculated incorrectly, the bounding box is calculated as shown. One solution to this problem would be to abandon the algorithm using moments of inertia to calculate orientation. The bounding box areas for orientations from -89 degrees to 90 degrees could be calculated, and the orientation which yielded the minimum bounding box area could be used. 
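A minimal sketch of this exhaustive search is given below, assuming the boundary pixel coordinates and centroid are available; the one-degree step and the function name are illustrative choices rather than part of the current implementation.

import math

def min_area_bounding_box(boundary, centroid):
    # Try every orientation from -89 to 90 degrees and keep the rotated
    # bounding box with the smallest area (see equations 5.10 and 5.11)
    cx, cy = centroid
    best_angle, best_area = 0, float("inf")
    for deg in range(-89, 91):
        t = math.radians(deg)
        # Rotated coordinates of each boundary pixel; the centroid offsets
        # of equation 5.10 are omitted since they do not change the extents
        alphas = [math.cos(t) * (x - cx) + math.sin(t) * (y - cy)
                  for (x, y) in boundary]
        betas = [-math.sin(t) * (x - cx) + math.cos(t) * (y - cy)
                 for (x, y) in boundary]
        area = (max(alphas) - min(alphas)) * (max(betas) - min(betas))
        if area < best_area:
            best_angle, best_area = deg, area
    return best_angle, best_area

The bounding area ratio would then be formed with the area returned by this search in place of the area obtained from the moments-of-inertia orientation.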
This approach would be more computationally expensive, but would yield more accurate results.

The minor to major axis ratio is a measure of shape used to discriminate pellet objects from non-pellet objects. The major axis is the axis defined by the orientation of the object, and the minor axis is the axis 90 degrees from the major axis. The orientation of the object used for this measure is the orientation calculated for the bounding box ratio. As described previously, the method used for calculating orientation has some problems. It is therefore necessary to get a better estimate of orientation to obtain a more useful feature value for the minor to major axis ratio.

The minimum to maximum radius ratio is another measure of shape used to discriminate pellet objects from non-pellet objects. The minimum radius is the minimum distance from the centroid to the boundary of the object, and the maximum radius is the maximum distance. There were no problems in measuring these distances, but for very small objects the distances are small, and a single pixel change in the distance can drastically affect the radius ratio value. It would be more accurate to use higher resolution images so the radius distances would be larger, and the radius ratio would be more accurate.

In order to accurately classify objects using the four features, some estimate of the values of these features for valid pellet objects is required. A sample set of 1686 valid pellet objects was collected from image sequences on videotape, and the four feature values were measured for each pellet object. The means of each feature measurement and the 4x4 covariance matrix were then calculated.

The mean vector (5.14) was calculated to be:

m = [ 0.851  0.739  0.806  0.656 ]^T    (7.2)
If a pellet object is classified as a non-pellet object, the calculated pellet count will be smaller than the actual pellet count, and wasted feed will be unaccounted for. Feeding may not be stopped soon enough, and extra food will be wasted. The size of the equidistant contour used for object classification will depend on which type of error it is more important to reduce. If it is more important to reduce false positive errors, the size of the equidistant contour should be reduced to encompass the majority of the valid pellet objects, but eliminate a few valid pellet objects. If it is more important to reduce the false negative errors, the size of the equidistant contour should be expanded to encompass all of the valid pellet objects, and as a result also encompass some non-pellet objects. There are tradeoffs involved in both approaches, and the MICD classification distance (size of equidistant contour) should be modified according to the requirements of the individual.  The object classification algorithm was tested on both valid food pellet objects and on non-pellet objects. The algorithm successfully classified most valid pellet objects, but did not perform well on non-pellet objects. Most of the non-pellet objects were fish. When fish are separated from the background using the preprocessing algorithms, often the entire fish is not separated from the background. Small pieces of the fish have gray levels above the threshold value used for preprocessing. Many of these small pieces look very similar to valid pellet objects. Therefore, it is not surprising that these small pieces are often classified as valid pellet objects. There are a number of methods which may be used to eliminate this problem.  A more sophisticated object detection algorithm could be used to completely separate fish from the background. The fish object would not be similar to a valid pellet object, and would therefore be classified as a non-pellet object. The object classification 89  algorithm could be improved. By using more information, such as measurements which use the original gray levels of the object and the area around the object, classification may be improved so pieces of fish would not be classified as valid pellet objects. Another method to prevent pieces of fish from being classified as valid pellet objects would be to prevent fish from entering the view area of the camera. This might be accomplished by placing the camera in a netted structure below or inside the sea cage. This would make the image acquisition equipment more cumbersome, and may eliminate the need for object classification altogether.  7.6 Object Tracking The pellet counting system was implemented on a personal computer, the small storage object matching algorithm described in section 6.4.3 was used. This algorithm requires a small amount of memory with some sacrifice in speed. Since the personal computer does not have a lot of memory, this algorithm was the best choice. The matching algorithm performed well in most circumstances.  One of the problems with the current object matching algorithm is that it optimizes object matching locally instead of globally. 
Using this algorithm, matches are made on the basis of the distance between objects only; there is no overall goal of optimizing all the object matches between the frames. Figure 36 shows the results of executing two object matching algorithms on the same set of objects (the circles represent the positions of objects in frame i, and the squares represent the positions of objects in frame i+1).

Figure 36 - Two Object Matching Approaches (left: local optimization; right: global optimization)

In the local optimization case, object 2 is matched with object 3 because the distance between them satisfies the above criteria. Object 1 is then matched with object 4 (assuming the distance between them is less than the maximum distance) because object 1 is the only unmatched object available. By matching objects 2 and 3, the algorithm is forced into the long-distance (and therefore unlikely) match of objects 1 and 4. Because object matching is locally optimized, the algorithm does not take into account the effect that making a match will have on other objects in the frames. This demonstrates the need for a global optimization algorithm.

In the global optimization example in Figure 36, object 1 would be matched with object 2, and object 3 would be matched with object 4. This matching solution globally minimizes the total distance between matched objects in the frames. To eliminate the matching errors made by a local optimization object matching method, a global optimization object matching method could be implemented. A global optimization object matching algorithm would likely be more computationally expensive than the local optimization algorithm, but could reduce errors in object matching.
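One possible realization of the suggested global optimization (again an illustrative sketch, not part of the thesis implementation) is to treat matching as an assignment problem and solve it with the Hungarian algorithm, for example via SciPy. Pairs whose distance exceeds the maximum are simply discarded afterwards, which is a simplification of a full treatment of unmatched objects.

import numpy as np
from scipy.optimize import linear_sum_assignment

def global_match(objects_i, objects_j, max_distance, dist):
    # Globally optimal matching: choose the pairing that minimizes the total
    # distance between matched objects, then drop pairs whose distance exceeds
    # the maximum allowed distance.
    if not objects_i or not objects_j:
        return []
    cost = np.array([[dist(a, b) for b in objects_j] for a in objects_i])
    rows, cols = linear_sum_assignment(cost)   # rectangular cost matrices are handled
    return [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] < max_distance]

For the small number of objects present in a single frame, the extra cost of the assignment step should be modest.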
There are several situations which will cause the object tracking and counting algorithm to return incorrect results. These errors are explained below, and possible solutions to eliminate them are described.

One possible error in object counting occurs if a food pellet object is occluded in the frame in which it is in the New Object Area. Figure 37 shows a frame sequence which illustrates this problem. In the first frame, the food pellet object is not visible. In frame i+1, the food pellet has entered the New Object Area, but is occluded by a fish. In frame i+2, the fish has moved away, and the food pellet is in the Center Area. Since the food pellet was never detected in the New Object Area, it will not be counted, and the overall pellet count will therefore be incorrect.

Figure 37 - Example of Counting Error (frames i, i+1 and i+2, with the frame border and New Object Area marked; in frame i+1 a fish occludes the food pellet)

One possible solution to this type of problem is to increase the width of the New Object Area. With a wider New Object Area, a food pellet would be in this area for a longer period of time, and there would be less chance of another object occluding it for the entire time. Another solution might be to shield the view area of the camera, so that food pellets could get in but other large objects such as fish could not.

Another type of error occurs when a food pellet in the New Object Area in one frame is tracked to a different food pellet in the New Object Area in the next frame. Figure 38 illustrates this problem.

Figure 38 - Example of Counting Error (three objects in the New Object Area over two frames)

Object 1 represents a food pellet object in the first frame. Object 2 is the same food pellet in the second frame, and object 3 is a new food pellet in the second frame. For a correct object count, object 1 should be matched with object 2, and object 3 should be counted as a new object since it is in the New Object Area. In this case, however, object 1 is matched with object 3. Since object 1 has been matched with object 3, object 3 will not be counted as a new object, and the object count will not be incremented.

One possible solution to this type of problem would be to increase the sampling rate so that objects do not move large distances between frames. However, with large pellet densities, this type of error will occur occasionally using the current method of food pellet counting.

Figures 39 and 40 show two examples of actual object tracking results (note that the direction of motion is towards the center of the frame). Figure 40 shows an example of a food pellet in the New Object Area in one frame being tracked to a different food pellet in the New Object Area in the next frame - the second type of error described above.

Figure 39 - Object Tracking Example

Figure 40 - Example of Error in Object Matching

The sampling rate is chosen based on the time available for processing. A higher sampling rate makes object tracking more accurate because objects move smaller distances between frames, but processing takes more time.

The width of the New Object Area is chosen to match the sampling rate used. The width has to be large enough that a pellet entering the view area of the camera will always be in the New Object Area in at least one frame. Therefore, the width should be a little larger than the maximum distance a food pellet can move inward from the edge of the image in one frame period. Using the sample image sequences, this distance was estimated by measuring the movement of food pellets in a number of cases. The width of the New Object Area was set to 100 pixels, with a sampling rate of approximately 4 frames per second.

The distance measure used for object matching is composed of three measurements (section 6.5): the Euclidean distance ratio, the area ratio, and the distance of the motion vector from the center. Each of these measurements is weighted, and the weighted measurements are combined to form the total distance. The individual weightings can be adjusted to favour different situations. For example, if there is an upward current or turbulence, the motion vectors of falling pellets may not point towards the center, so the distance of the motion vector from the center should carry very little weight in the total distance measure.

The weightings used for the total distance measure are very important for accurate object matching. If a relatively insignificant measurement is weighted highly and a significant measurement is weighted low, there will be many object matching errors, and the object count will likely be inaccurate. Experimentation with the three weights in typical situations is required to arrive at a useful weighting scheme. Other individual measurements, with corresponding weights, can be added to the total distance measure if the three measurements are not sufficient. For the image sequences used in this project, it was found that the best weighting scheme was to weight the Euclidean distance ratio by 1.0 and the area ratio by 0.75. The distance of the motion vector from the center was weighted differently for different image sequences. In some sequences it was weighted by 0.1, and in others it was weighted by 0. In the sequences for which it was weighted by 0, the camera was moving from side to side with the motion of the water, so the direction of motion of the food pellets was not towards the center.
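As a small sketch of the weighting scheme just described (assuming the three component measurements have already been computed as defined in section 6.5, which is not reproduced here), the total distance might be combined as follows; the function and parameter names are illustrative only.

def total_distance(euclidean_distance_ratio, area_ratio, motion_vector_from_center,
                   w_euclidean=1.0, w_area=0.75, w_center=0.1):
    # Weighted combination of the three component measurements of section 6.5.
    # w_center is set to 0.0 for sequences in which the camera sways with the
    # water and pellet motion is no longer directed towards the image center.
    return (w_euclidean * euclidean_distance_ratio
            + w_area * area_ratio
            + w_center * motion_vector_from_center)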
7.7 Counting Experiments Performed and Results

To test the food pellet tracking and counting system, a number of image sequences acquired under different conditions were used. These sequences were sampled and the images were stored on the computer system for analysis. An operator went through each sequence frame by frame, counting the number of food pellets entering the view area of the camera in every frame. The computer algorithm was then executed on the same image sequence, and the pellet count after analysis of each frame was recorded. The computer algorithm output could then be compared to the count recorded by the operator after each frame to determine when the algorithm yielded the correct result and when it made errors. The causes of individual errors could then be examined.

The image sequences used as input were designed to approximate the pellet densities occurring in actual feeding situations. These densities were calculated from standard feeding tables, which list the mass of food a fish requires per day as a percentage of the fish's body mass. Table 2 shows the approximate number of food pellets entering the view area of the camera per minute for different cage densities (number of kilograms of fish per cubic meter of water). According to standard feeding tables, in 16°C water, 800g fish should be fed 1.55% of their body mass per day using 9.5mm food pellets (Source: White Crest Mills Feed Chart). These calculations assume a 9.5mm food pellet can be detected up to 1.5m from the camera (corresponding to a camera coverage of 5.6 square meters) and a sea cage size of 15 meters by 15 meters by 21 meters deep.

Table 2 - Number of Food Pellets Entering Camera View per Minute

Cage Density (kg/m3) | Number of 800g Fish in Cage | Mass of Feed per Day (kg) | Number of 9.5mm Pellets per Day for Entire Cage | Pellets Passing Camera per Day (None Eaten) | Pellets/min Passing Camera (4 Hours Feeding, None Eaten)
 5 | 30972 | 384 | 345646 | 12914 |  54
 6 | 37166 | 460 | 414775 | 15496 |  65
 7 | 43360 | 537 | 483904 | 18079 |  75
 8 | 49554 | 614 | 553034 | 20662 |  86
 9 | 55749 | 691 | 622163 | 23244 |  97
10 | 61943 | 768 | 691292 | 25827 | 108
11 | 68137 | 844 | 760421 | 28410 | 118
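The chain of calculation behind Table 2 can be sketched as follows. The mass of an individual 9.5mm pellet and the fraction of falling pellets that pass through the camera's coverage are not stated explicitly in the text, so the default values below are assumptions, and the function only approximates the tabulated figures.

def pellets_per_minute(cage_density_kg_per_m3,
                       cage_volume_m3=15 * 15 * 21,   # cage size quoted above
                       fish_mass_kg=0.8,              # 800 g fish
                       feed_fraction=0.0155,          # 1.55% of body mass per day
                       pellet_mass_g=1.1,             # assumed mass of one 9.5 mm pellet
                       camera_fraction=0.037,         # assumed fraction of pellets passing the camera
                       feeding_minutes=4 * 60):       # 4 hours of feeding per day
    # Approximate number of uneaten pellets passing the camera per minute,
    # assuming none of the feed is eaten (the worst case shown in Table 2).
    fish_count = cage_density_kg_per_m3 * cage_volume_m3 / fish_mass_kg
    feed_g_per_day = fish_count * fish_mass_kg * feed_fraction * 1000.0
    pellets_per_day = feed_g_per_day / pellet_mass_g
    passing_camera_per_day = pellets_per_day * camera_fraction
    return passing_camera_per_day / feeding_minutes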
Table 3 shows the results of using the computer algorithm to count food pellets falling through the view area of the camera. A set of 25 frames sampled at approximately 4 frames per second was used for each test. In tests 1 to 10, food pellets were falling past the camera at a rate of approximately 120 pellets per minute. In tests 11 to 18, food pellets were falling past the camera at a rate of approximately 200 pellets per minute. There were no fish in the water during these tests.

Table 3 - Results of Counting Trials

Test | Actual Food Pellet Count | Computer Food Pellet Count | Number of Counting Errors | Overall Error in Computer Count
 1 |  8 |  8 | 2 | 0%
 2 | 14 | 16 | 2 | +14.29%
 3 | 15 | 15 | 4 | 0%
 4 | 11 | 11 | 2 | 0%
 5 | 14 | 13 | 1 | -7.14%
 6 | 12 | 10 | 2 | -16.67%
 7 | 10 | 13 | 4 | +30.00%
 8 | 18 | 17 | 3 | -5.56%
 9 |  9 | 12 | 3 | +33.33%
10 | 12 | 14 | 2 | +16.67%
11 | 23 | 24 | 2 | +4.35%
12 | 26 | 24 | 2 | -7.69%
13 | 22 | 20 | 4 | -9.09%
14 | 17 | 19 | 2 | +11.76%
15 | 20 | 21 | 3 | +5.00%
16 | 20 | 20 | 2 | 0%
17 | 23 | 21 | 2 | -8.70%
18 | 23 | 21 | 2 | -8.70%

The average count error was approximately +/-10%. It should be noted that the overall error figure on its own is not very informative. Certain errors cause the computer count to be too low, and other types of errors cause it to be too high. These two types of error can cancel each other out and produce a misleading overall error measurement. This is the case in test 1: although two errors were made, they cancelled each other out, so the computer count was correct. The number of errors the computer algorithm made in each test is a more significant indication of the accuracy of the algorithm. It is also important to determine what kinds of errors occurred. Table 4 shows the frequencies of the different types of errors that occurred in the 18 tests.

Table 4 - Frequency of Occurrence of Different Counting Errors

Cause of Counting Error | Number of Occurrences | % of Total Errors
Incorrectly tracking an object | 12 | 27.27%
Classifying a valid pellet object as a non-pellet object | 11 | 25.00%
Incorrectly dividing, or failing to divide, an object | 9 | 20.45%
Error in object detection | 7 | 15.91%
Valid pellet object moved through New Object Area between frames and was therefore undetected | 3 | 6.82%
Classifying a non-pellet object as a valid pellet object | 2 | 4.55%

Incorrect tracking of an object was the most frequently occurring error. It is followed by classifying a valid pellet object as a non-pellet object, incorrectly dividing or failing to divide an object, and errors in object detection. These four types of errors make up 88.63% of the errors that occurred in the tests.

From these results it can be seen that object tracking should be improved. This may be accomplished by increasing the frame sampling rate: if objects move smaller distances between frames, object tracking will become less error prone. This adjustment will also reduce the errors which occurred when an object moved through the New Object Area between frames and was therefore undetected. The distance measure used for object matching can also be adjusted to improve tracking. Object classification is another area in which improvements are required. If the view area of the camera were enclosed in a net structure as described previously, object classification may be unnecessary. Otherwise, a more accurate object classification method is required.

Object division was another major cause of counting errors. A more detailed study into determining the values of the constants used for curvature adjustment may be necessary.

Errors in object detection were the final major cause of error in the computer count. Many of these errors occurred when food pellets were too close to the front of the camera. By adding a transparent shield around the camera (described in section 7.2), the frequency of this type of error should be reduced.
Chapter 8 Conclusion

In this study, the problem of counting the number of food pellets falling through the view area of an underwater camera has been investigated. A solution to this problem is required in order to give fish farmers a measurement tool for determining the number of food pellets that are not eaten during a feeding period. This tool could be used to reduce food pellet wastage, and at the same time ensure the fish are getting enough food. A manual pellet counting system was developed, and image analysis algorithms intended to automate the manual counting system were designed and tested.

The uneaten food pellets falling past an underwater camera can be counted manually. Manual counting of food pellets from video replay can be laborious, depending on the length of the feeding period; however, the manual pellet counting system will provide valuable data until the automatic counting system is fully developed.

Given a sequence of images of food pellets falling through the view area of an underwater camera, an automatic counting system was designed to 1) isolate the objects in the image, 2) measure features of the objects and use these features to classify the objects as food pellets or other objects, 3) track pellets from one frame to another, and 4) maintain a count of the number of pellets that have passed through the view area of the underwater camera. The effective camera coverage is dependent on the size of the food pellets used.

A new automatic threshold determination algorithm was developed for this project. This algorithm accurately separated objects in the image from the background. The dilation and erosion operators improved the quality of the images by filling in holes and eliminating noise. The overlapping object detection and separation algorithm was developed for this project by improving an algorithm developed by Poon (Poon, 1989; Poon et al., 1992). The algorithm performed well on the majority of objects. The object classification algorithms performed well in classifying pellet objects, but did not accurately classify non-pellet objects. Recommendations were made that may improve classification, or completely eliminate the need for classification. An algorithm used for matching objects in consecutive frames was designed for this project. The algorithm succeeded in correctly matching objects in most images. Some improvements to the object matching algorithm were suggested.

The algorithms developed for determining whether a new food pellet has entered the view area of the camera, and for tracking previously counted pellets to avoid recounting, were original methods developed for this project.

A number of tests were performed using image sequences containing food pellets falling through the view area of the camera at rates likely to be encountered in practice. The utility of the algorithms was confirmed by these experimental results. The average count error was approximately +/-10%. A set of 450 frames was used in the pellet counting tests, and a total of 44 errors were made by the counting algorithms during these tests. These results, as well as observations from the development work, were used to identify areas of the food pellet counting process which could be improved in the future.

8.1 Future Considerations

The algorithms we developed for this project represent the first stage in the development of a commercial automatic pellet counting system. Future work on the automatic system will focus on reducing the count error.
Areas of the automatic pellet counting system requiring improvement are identified below, and some suggested improvements are described.

Image acquisition is the first step in the pellet counting process. It was observed that as the camera is lowered in the water column, the light reaching the camera may be insufficient to illuminate the food pellets enough for them to be detected by the camera. A large number of fish above camera level greatly reduces the available illumination. There are two possible solutions to this problem: an underwater light source could be attached to the camera, or a more light-sensitive camera could be used.

The use of a high resolution digital camera would improve the accuracy of many of the algorithms used in the pellet counting process. By switching to a digital camera, the quality of the image sequences used for analysis would be higher, and the need for the video equipment would be eliminated.

If food pellets come too close to the front of the camera, the camera blocks the light, the pellets do not appear white, and they are not detected. A transparent shield should be placed around the camera during image acquisition to prevent food pellets from coming too close to the front of the camera.

The overlapping object detection and separation algorithm requires improvements to eliminate the problems of separating objects that should not be separated, and of not separating objects that should be separated.

The object classification algorithm developed does not classify non-pellet objects very accurately. Non-pellet objects such as fish will be present in an actual feeding situation. One possible solution would be to improve the classifier to use additional information such as the original gray levels of the object. Another solution would be to enclose the view area of the camera in a netted structure to prevent fish from entering the view area. This would likely eliminate the need for object classification.

The object matching algorithm is used to match objects in consecutive frames. A local optimization object matching algorithm is currently used; it may be more accurate to implement a global optimization object matching algorithm. Additional components can be added to the distance measure used for object matching, and investigations should be carried out to determine the best weighting scheme to use for the distance measure. Increasing the sampling rate would require more processing power, but would make the object matching algorithm more accurate.

Nomenclature

4-Neighbours - The 4-neighbours of a pixel are the 4 pixels to the North, South, East and West of the pixel.

8-Neighbours - The 8-neighbours of a pixel include the 4-neighbours of the pixel, and the 4 pixels to the NE, NW, SE, and SW of the pixel.

Binary Image - An image containing only two shades of gray, black and white.

Bounding Box - The bounding box of an object is defined as the minimum area rectangle which completely encloses the object.

CCD Bloom - The effect that occurs when a CCD element is saturated, and the intensities of neighbouring CCD elements are increased above their correct values.

CCD Camera - A Charge Coupled Device camera. The imaging element is composed of an array of CCD elements; each element measures the intensity of a small area of the image.

Camera Coverage - The area which is imaged by the camera at the maximum distance from the camera for which objects can be adequately resolved.
Digital Image - An image represented by a number of discrete points (pixels), each with an associated intensity value (0-black to 255-white).

Digitization - The process of converting an analogue image to a digital image of a certain resolution. The value of each pixel in the digital image is determined from the analogue image.

Features - Attributes of an object that can be quantified.

Feed Conversion Ratio - Ratio of the mass of food fed to an animal to the mass gain of the animal.

Frame Grabber - A device which is capable of real time digitization and capture of images from a video source.

Gray Level - The intensity value of a pixel, which can be in the range of 0 to 255 (0-black, 255-white).

Grow-Out Tanks - Small cylindrical tanks used to raise small salmon to a certain size before transferring them to larger nets.

Image Fields - A video image is composed of two interlaced fields, the odd field and the even field. The fields are displayed one after the other, and each field is displayed at a rate of 60Hz.

Image Histogram - A display of the number of occurrences of each gray level from 0 to 255 in the image.

Loss of Track - Correctly tracking an object involves determining the motion path of the object through a sequence of frames. Loss of track occurs when the position of the object is not known for one or more frames in the sequence.

Motion Vector - A vector describing the motion path an object takes between two image frames.

Neighbouring Pixels - Pixels adjacent to a pixel; either the 4-neighbours or the 8-neighbours. If unspecified, the 8-neighbours are assumed.

Non-Pellet Object - An object in an image which represents anything other than a food pellet.

Object Tracking - Tracking a single object throughout a sequence of image frames.

Pellet Object - An object in an image which represents a food pellet.

Pixel - The smallest unit of a digital image. Each pixel has an associated intensity value.

Resolution - The number of pixels used to represent a digital image. The greater the number of pixels used to represent the image, the higher the resolution of the image.

Sampling Rate - The rate at which digital frames are extracted from the videotape; it determines the time difference between consecutive digital image frames.

Sea Cage - A netted enclosure used to raise salmon in the open ocean.

Thresholding - Setting all the pixels with intensity values higher than the threshold value to 255 (white), and all pixels with intensity values less than or equal to the threshold value to 0 (black). Thresholding creates a binary (2-intensity) digital image from a gray level digital image.

Visibility - A measure of the clarity of the water, often measured with a Secchi disc, which is a white circular object. The Secchi disc is lowered down the water column until it is no longer visible, and the depth at which it was last visible is called the visibility.

Bibliography

[1] Austreng, E., Storebakken, T., Asgard, T. (1987). Growth Rate Estimates for Cultured Atlantic Salmon and Rainbow Trout. Aquaculture, 60, 157-160.

[2] Burdick, D.C., Marcuse, M.L., Mislan, J.D. (1990). Computer Aided Target Tracking in Motion Analysis Studies. Proceedings of the SPIE - International Society for Optical Engineering, 1356, 89-100.

[3] Donohoe, G.W., Hush, D.R., Ahmed, N. (1988). Change Detection for Target Detection and Classification in Video Sequences. Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech, and Signal Processing, 1084-1087.
[4] Freeman, M.H. (1990). Optics - Tenth Edition. Butterworths, Toronto.

[5] Gonzalez, R.C., Wintz, P. (1987). Digital Image Processing. Addison-Wesley, Reading, Massachusetts.

[6] Gray, S.B. (1971). Local Properties of Binary Images in Two Dimensions. IEEE Transactions on Computers, C-20(5), 551-561.

[7] Horton, R.D. (1991). A Target Cueing and Tracking System (TCATS) for Smart Video Processing. IEEE AES Systems Magazine, 3, 8-13.

[8] Jain, A.K. (1989). Fundamentals of Digital Image Processing. Prentice Hall, Englewood Cliffs, NJ.

[9] Jernigan, M.E. (1990). Pattern Recognition - Course Notes - SD 372. University of Waterloo, Waterloo, Ontario.

[10] Juell, J. (1991). Hydroacoustic Detection of Food Waste - A Method to Estimate Maximum Food Intake of Fish Populations in Sea Cages. Aquaculture Engineering, 10, 207-217.

[11] Kalbfleisch, J.G. (1985). Probability and Statistical Inference - Volume 1: Probability. Springer-Verlag, New York.

[12] Kulpa, Z. (1971). Area and Perimeter Measurement of Blobs in Discrete Binary Pictures. Computer Graphics and Image Processing, 6(5), 434-451.

[13] Lee, H.J., Huang, L.F., Chen, Z. (1990). Multi-Frame Ship Detection and Tracking in an Infrared Image Sequence. Pattern Recognition, 23(7), 785-798.

[14] Lu, S.W. (1990). A Multiple Target Tracking System. Proceedings of the SPIE - International Society for Optical Engineering, 1388, 299-305.

[15] Mertens, L.E. (1970). In-water Photography: Theory and Practice. John Wiley & Sons, New York, pp. 49-52.

[16] Mostafavi, H. (1990). An Image Sequence Analysis Workstation for Multi-Point Motion Analysis. Proceedings of the SPIE - International Society for Optical Engineering, 1356, 19-24.

[17] Poon, S. (1989). Algorithms for Detecting and Segmenting Nucleated Blood Cells. Master's Thesis, Department of Electrical Engineering, University of British Columbia.

[18] Poon, S., Ward, R.K., Palcic, B. (1992). Automated Image Detection and Segmentation in Blood Smears. Cytometry, 13(7), 766-774.

[19] Pratt, W.K. (1991). Digital Image Processing. John Wiley & Sons, New York, pp. 449-484, 597-605.

[20] Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T. (1988). Numerical Recipes in C. Cambridge University Press, Cambridge.

[21] Sedgewick, R. (1990). Algorithms in C. Addison-Wesley, Reading, Massachusetts.

[22] Seymour, E.A., Bergheim, A. (1991). Towards a Reduction of Pollution from Intensive Aquaculture with Reference to the Farming of Salmonids in Norway. Aquaculture Engineering, 10, 73-88.

[23] Shepherd, J.S., Bromage, N. (1988). BSP Professional Books, Oxford, pp. 78-81.

[24] Storebakken, T., Austreng, A. (1987). Ration Level for Salmonids 1. Growth, Survival, Body Composition, and Feed Conversion in Atlantic Salmon Fry and Fingerlings. Aquaculture, 60, 189-205.

[25] Thorpe, J.E., Talbot, C., Miles, M.S., Rawlings, C., Keay, D.S. (1990). Food consumption in 24 hours by Atlantic Salmon (Salmo salar L.) in sea cage. Aquaculture, 90, 41-47.
