A MODEL-BASED APPROACH TO COMPLEX CONTOUR GENERATION FOR PROCESS AUTOMATION USING COMPUTER VISION By Lalith D. K. B. Gamage BSc. (Hons), Moratuwa, Sri Lanka; MSc., Leicester, U.K. A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES MECHANICAL ENGINEERING We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA August 1993 © Lalith D. K. B. Gamage, 1993 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. (Signature) Department of mECHAN/c4L En/6/NecR/A16 The University of British Columbia Vancouver, Canada Date^1g 41)6 /993 DE-6 (2/88) Abstract The research described in this thesis addresses the development of methodology for fast extraction of complex processing information of an object using relatively simple on-line measurements and prior knowledge and information that can be generated off-line on the object. Even though the approaches investigated here are quite general, the techniques are developed with respect to the specific application of flexible automation of a fish processing cell. Robotics and computer vision have been combined to produce useful results in a variety of applications. Vision guidance may be applied at least in two ways here. Cutting contours of fish could be generated off-line and these could be used to generate the reference signals for the cutting controller. Alternatively, cameras mounted on the cutter could be used to guide the cutter in real time. The present research concentrates on the first approach since the objective is to generate the cutting contour sufficiently fast to meet the process speed requirements. The complexity of the vision problem is eased to some extent by the fact that the object (fish) is already recognized, at least in a generic sense. Low-to-medium-level computer vision techniques involving image transformation, enhancement and analysis of basic features such as edges, regions, shape, colour and texture, are reasonably well established and widely applied. The application of these techniques to directly generate the cutting contour of an arbitrary fish would be computationally slow and unacceptable in terms of process speed requirements. The present research effort is directed at expediting vision-based generation of cutting contours through the use of model-based vision techniques. One of the goals of this research is to develop a knowledge-base and a model-base to support the complex feature extraction procedure. Use of low-level image analysis techniques to obtain non-complex features at high speeds from the fish images is a further goal, as this is related to the first goal. In the model-based approach, the fish models are generated using a representative database of measurements on fish. The data includes a cutting contour and some dimensional measurements which are referred to as attributes or features. The cutting contours are non-dimensionalized, transformed, and grouped into an appropriate number of models using a systematic grouping procedure. Each model contains a nondimensional cutting contour for the corresponding group of fish, and a set of attributes. In real-time operation, the measured attributes have to be matched to the model. The main contribution of this research is the methodology for the generation of rules for model matching or classification. The a priori probability distribution of attributes in each group is used to generate the rules for the model that corresponds to the group. Rules generated for all models are collected in a rule base and used for classification. A systematic method for integrating expert and heuristic knowledge is developed in order to improve the efficiency of the classification process. An extensive process of error analysis of the present methodology is also presented. The techniques developed in the present research were implemented in a prototype fish processing cell that is being developed in the Industrial Automation Laboratory. This cell consists of a vision station, a knowledge-based system, a robotic cutter, and a motorized conveyor unit. In a heading (head removal) operation, each fish on the conveyer is imaged at the vision station and the corresponding cutting contour is determined with the help of the knowledge-based system. This cutting contour is transformed into a drive signal for the robotic cutter. At the cutting station, a fish will be gripped and cut according to the trajectory generated for that particular fish. In the present prototype system, however, the robot draws the corresponding cutting contour on a board placed on the conveyor bed. The vision system, the cutter/gripper controls, and the conveyer controls communicate and synchronize the activities of various components in the fish processing system. Specific ways of transferring the new technology to the Canadian fish processing industry are currently being explored. This research also compares the performance of the developed rule-based methodology with other classification and estimation methodologies. These include neural networks, Bayesian inferences, k-nearest neighbour method, and multiple regression. The results of these comparisons are presented. Also, as a preliminary requirement for implementation, algorithms have been developed for on-line measurement of the attributes using low-level image analysis techniques. iv Table of Contents Abstract^ ii List of Tables^ x List of Figures^ xii Acknowledgement^ 1 2 3 xvi Introduction 1 1.1 Objectives of the Research ^ 2 1.2 The Specific Application ^ 3 1.3 Approach and Justification ^ 4 1.4 Contributions of the Research ^ 6 1.5 Overview of the Thesis ^ 7 Literature Review 9 2.1 Statistical Decision Theory and Pattern Classification ^ 10 2.2 Knowledge-Based Systems, and Rule-Based Systems ^ 11 2.3 Computer Vision ^ 14 2.4 Knowledge-Based Computer Vision ^ 18 Data Acquisition and Analysis 20 3.1 The Approach ^ 21 3.2 Data Acquisition ^ 22 v 3.3 Data Analysis ^ 3.4 Model Grouping Through Mismatch Analysis 3.5 4 ^ 32 3.4.1^Model Comparison ^ 35 Feature Selection ^ 38 3.5.1^Development of Rules 40 ^ Rule Generation 42 4.1 Formal Analysis of Rule Generation ^ 42 4.1.1^On-line Inference ^ 48 4.1.2^Error Analysis ^ 52 4.1.3^Comparison of Bayesian Error to Rule Base Error ^ 52 4.1.4^Rule Base Error for Multi-Class, Multi-Modal Case ^ 53 4.1.5^Density Estimation Kernel and Size of Subrange (h) ^ 58 Rule Generation Through Connectivity Analysis ^ 58 4.2.1^Threshold Level Selection 59 4.2 4.3 4.4 5 26 Illustrative Example ^ ^ 62 4.3.1^Rule Prioritization and Threshold Level Selection ^ 65 4.3.2^Results ^ 69 Integration of Heuristics and Expert Knowledge ^ 71 4.4.1^Integration of Knowledge into Fish Rule Base: An Illustration 72 76 Attribute Measurement 5.1 Attribute Measurement Using Projection Profile Method ^ 77 5.2 Attribute Measurement Using Region Boxing ^ 83 5.3 Comparative Evaluation of the Techniques ^ 83 5.4 Measurement of Orientation ^ 84 vi ^ ^5.4.1^Least Rectangle Method ^ 6 7 9 5.4.2^2nd Moment of Area Method ^ 90 5.4.3^Fundamental Ellipse Method ^ 92 5.4.4^Results and Comparison ^ 94 Pattern Recognition Techniques 98 6.1 Pattern Recognition in Automated Fish Processing ^ 101 6.2 Neural Networks ^ 101 6.3 A Neural Network for Fish Classification ^ 108 6.4 Multiple Regression ^ 111 6.5 Bayesian Approach ^ 124 6.6 k-Nearest Neighbor Method ^ 128 Comparison of Recognition Techniques 131 Rule-Based Method ^ 131 7.1.1^Real-Time Complexity Comparison to Bayesian Approach 133 7.2 Neural Network ^ 134 7.3 Multiple Regression ^ 136 7.4 k-Nearest Neighbour Method ^ 138 7.5 Summary of Classification Results ^ 139 7.1 8 85 Pattern Recognition for Estimation of Cutter Position 140 8.1 Off-line Data Acquisition and Analysis ^ 141 8.2 Development of an Estimation Model ^ 142 Laboratory Implementation 149 9.1^"V-cut" Butchering Machine ^ 149 vii 9.2 Contour Cutting System ^ 150 9.3 Automatic Rule Generation ^ 155 9.4 Cutter Technology ^ 159 9.4.1 Water-jet Cutter ^ 159 9.4.2 Laser Cutter ^ 160 9.4.3 Wire-Cutter ^ 160 9.4.4 Rotary Blade Cutter ^ 163 9.4.5 Reciprocating Blade Cutter ^ 164 9.4.6 Comparison and Recommendations ^ 165 9.5 Gripper Considerations ^ 9.5.1 Comparison and Recommendations ^ 10 Conclusions and Future Work^ 165 167 170 10.1 Conclusions ^ 170 10.2 Main Contributions of the Thesis ^ 172 10.3 Future Work ^ 172 Nomenclature^ 174 Bibliography^ 178 A Overview of OPS/83^ 188 A.1 Rule-Based Programming ^ 188 A.2 High Level Programming ^ 189 A.3 Working Memory ^ 189 A.3.1 Element Declarations ^ 189 A.3.2 The make, modify, and remove Statements ^ 190 viii A.4 Rules ^ 190 B ESHED SCORBOT-ER VII Robotic Manipulator ^ 192 B.1 Robot Specifications ^ 192 B.2 Robot Controller Specifications ^ 192 B.3 Programming Considerations ^ 198 C SD512 Vision System^ 202 C.1 System Configuration and Specifications ^ 202 C.2 Available Functions ^ 203 C.2.1 Feature Enhancement ^ 203 C.2.2 Image Segmentation ^ 205 C.2.3 Feature Extraction ^ 206 C.2.4 Template Matching and Correlation ^ 207 C.2.5 Pyramid Processing ^ 207 C.2.6 Graphics ^ 208 C.2.7 Video Control and Image Acquisition ^ 208 C.2.8 IKS/VR Variables ^ 209 C.3 Application Software Development ^ D Interface Circuitry and Communication Software ^ 210 213 D.1 The Interface Circuit ^ 213 D.2 Synchronization Circuit ^ 213 D.3 Communication and Synchronization Software ^ 215 ix List of Tables 3.1 Typical Data of Fish Dimensions ^ 25 3.2 Typical Values of Index of Mismatch Compared with Fish No. 2. ^. 30 3.3 Indices of Mismatch between the Models ^ 37 3.4 Percentages of Wastage and/or Unwanted Material Left in the Body to ^ 37 4.1 A Sample of Data on Fish Dimensions for Group 5. ^ 63 4.2 The Medians y of the Subranges of the Relationships for Group 5 ^ 63 4.3 Probabilities of the Multinomial Distribution of Density Values of Rela- the Total Weight of fish tionships for Group 5. 4.4 66 ^ Probabilities of the Multinomial Distribution of Density Values of Relationships for All Groups ^ 4.5 67 Probabilities of the Multinomial Distribution of Density Values of Relationships for All Groups ^ 4.6 69 Rule-Based Method Classification Results for Resubstitution and Leave70 One-Out Methods. ^ 5.1 Typical Times for Various Image Processing Operations. 5.2 Typical Optimal Angles of Cut. ^ 5.3 Typical Results from the Three Methods of Computing the Angle of Ori- ^ 84 85 97 entation ^ x 6.1 Neural Network Classification Results for Resubstitution and Leave-OneOut Methods ^ 6.2 112 Multiple Regression Estimation Results for Resubstitution and Leave-OneOut Methods ^ 6.3 123 k-Nearest Neighbour Classification Results for Resubstitution and LeaveOne-Out Methods. 130 ^ 7.1 Summary of Results. ^ 8.1 Sample Test Results of Estimate Using Multiple Regression. 8.2 Sample Test Results of Estimate Using a Neural Network B.1 Robot Arm Specifications. ^ 139 ^ 144 ^ 145 193 B.2 Robot Controller Specifications ^ 195 B.3 ACL Command Summary^ 199 xi List of Figures 3.1 The System for the Off-line Development of a Model-Base for Salmon. 23 3.2 A Distributed Control Structure for On-line Fish Processing ^ 24 3.3 Nomenclature for Geometric Data of a Fish 25 3.4 Nomenclature for Geometric Data on Fish Head ^ 3.5 Reference Frames for Coordinate Transformation in the Cutting Contour ^ Analysis ^ 3.6 26 27 Representation of the Index of Mismatch as the Minimum Absolute Area Between Two Contours. ^ 31 3.7 First Stage of the Mismatch Analysis Procedure. ^ 34 3.8 Second Stage of the Mismatch Analysis Procedure. 36 4.1 Probability Density Function of a Single-mode, Two-Class Case ^ 4.2 Probability Density Function of a Single-mode, Two-class Case and the ^ Selection of Windows for Rule Generation 4.3 47 47 Graphical Illustration of the Error Difference Between the Rule Base and Bayesian Classification Methods. ^ 54 4.4 The Generalized Connectivity Diagram of a Model. ^ 60 4.5 The Connectivity Diagram for Group 5. 64 4.6 Probability Distribution of Density Values of Relationships for Group 5. . 66 4.7 Probability Distribution of Density Values for All Groups. ^ 68 4.8 Illustration of a Decision Tree for the Fish Classification System ^ 73 xii ^ 5.1 Captured Image Under Structured Lighting. ^ 77 5.2 The Isolated Largest Continuous Image Segment (Fish in the Image). . ^78 5.3 A region of Interest of the Fish Image. ^79 5.4 Gaussian-Smoothed Fish Head Image. ^ 80 5.5 Edge-Enhanced Head Image. ^ 81 5.6 Projection Profile of the Head Image ^81 5.7 Two Angles of Importance for Recovery Improvement in Head Removal in Fish Processing ^ 86 5.8 The Least Rectangle Enclosing a Fish. ^ 87 5.9 The Case of Mouth Closed. ^ 89 5.10 The Case of Mouth Open ^ 90 5.11 Illustration of the Parameters Used in Calculation of Second Moment of Area ^ 91 5.12 Distances to a Reference Point and the Corresponding Angle ^ 95 5.13 The Graph of Distance to a Reference Point Vs. the Angle. ^ 95 5.14 An Illustration of Three Methods to Calculate the Orientation ^96 6.1 Schematic Representation of Pattern Classification ^98 6.2 (a) Learning the Mapping Function Using a Training set. (b) Estimation Using the Learned Mapping Function. ^ 6.3 Schematic Representation of a Linear Discriminant ^ 100 102 6.4 Schematic Representation of a Multiclass Linear Discriminant ^ 103 6.5 Schematic Representation of a Semilinear Feedforward Net ^ 106 6.6 Normal Probability Plot of Residuals for Coefficient al ^ 114 6.7 Scatter Plot of Residuals for Coefficient al. ^ 115 6.8 Normal Probability Plot of Residuals for Coefficient a2 ^ 116 6.9 Scatter Plot of Residuals for Coefficient a2. ^ 117 6.10 Normal Probability Plot of Residuals for Coefficient a3 ^ 118 6.11 Scatter Plot of Residuals for Coefficient a3. ^ 119 6.12 Normal Probability Plot of Residuals for Coefficient a4 ^ 120 6.13 Scatter Plot of Residuals for Coefficient a4. ^ 121 7.1 Schematic Representation of a Node in the Neural Network ^ 134 7.2 Schematic Representation of Patterns in 2-D Feature Space. ^ 135 7.3 Density of Patterns for Two-Class, Single-Feature Classification ^ 139 8.1 Illustration of the Features of a Fish Head ^ 141 8.2 Normal Probability Plot of Residuals of Point Estimate Using a Neural Network ^ 146 8.3 Scatter Plot of Residuals of Point Estimate. ^ 147 9.1 Schematic Representation of the Model-Based Contour Cutting Workcell. 153 9.2 An Image Captured by the Vision System ^ 154 9.3 A Typical Output of the Model-Based Contour Generation ^ 154 9.4 An Image of a Salmon with the Cutting Contour Superimposed ^ 155 9.5 P-Rules with Density 9 (above), and Density 6 (below) ^ 157 9.6 P-Rules with Density 4 (above), and Density 2 (below) ^ 158 9.7 Cutting Geometry of Fish in Contour Cutting for Recovery Improvement. 161 9.8 Illustration of Two Stage Cutting in Head Removal ^ 162 9.9 Proposed Wire-Cutting Assembly for Robotic Contour Cutting. ^ 162 9.10 Proposed Rotary-Blade Cutter Assembly for Robotic Contour Cutting. ^ 163 xiv 9.11 Proposed Reciprocating-Blade Cutter Assembly for Robotic Contour Cutting ^ 164 9.12 Schematic Diagram of the Synchronous Gripper ^ 166 9.13 Schematic Diagram of the Intermittent Gripper ^ 168 B.1 Robot Arm Axes ^ 194 B.2 Operating Range of the Robot. ^ 194 C.1 SD512 System Block Diagram. ^ 204 D.1 Interface Circuit between the Robot Controller and SD512 ^ 214 D.2 Synchronization Circuit of the Conveyor and the Vision Software. . . . . 216 D.3 Complete Wiring Diagram of the Interface and the Synchronization Circuit.218 XV Acknowledgement I would like to thank Dr. C. W. de Silva for his constant supervision and guidance throughout this research. Special thanks to my supervisory committee, Dr. A. S. Atadan, Dr. R. G. Cosine, Dr. P. D. Lawrence, and Dr. V. J. Modi, for many valuable suggestions and comments. I would also like to thank the Advanced Systems Institute of British Columbia for providing a research grant (Grant No. 90-2151C) to carry out the research described in this thesis. Additional funding has been provided by Natural Sciences and Enginering Research Council's Industrial Automation Chair held by Dr. C. W. de Silva. Last but not least I would like to thank my parents and my wife for their constant support and encouragement. xvi Chapter 1 Introduction Generation of processing information is important in many industrial automation tasks. Processing in this context essentially refers to mechanical processing. Applications may include arc welding, inspection, sealant or glue applications, sewing, spray painting, and mechanical processing of fish. In many of these applications, the processing information has to be determined on-line through direct gauging of the workpiece or object. In such situations, the speed and the reliability of the information generated depends on the gauging technique or the sensing mechanism. Fast and accurate methods of generating processing information are important in this regard. In many of these application areas, there is well established knowledge in the form of experience or expertise. For example, an expert in welding may relate that if part A is to be welded to part B, then the rod type must be X and the gap between the two must be Y. An experienced shoe quality control inspector may express that if the shoe type is P, then the inspection seam trajectory must be Q. Likewise, knowledge can be acquired from experienced personnel or from experts, generally known as domain experts, by interviewing or questioning them. In some instances, the knowledge acquired from the domain experts may be complete, but in some other instances the domain expert may be unable to describe his/her knowledge completely. The definition of complete in this context is that everything relevant to the problem can be obtained from the experts or can be derived from the acquired knowledge. Also, 1 Chapter 1. Introduction^ 2 there may be cases where knowledge cannot be acquired from the domain experts nor is available from any other form. In cases where complete knowledge of the problem can be captured from the domain experts, the most common approach is to build an expert system. In instances where no knowledge can be captured and where there is no algorithmic solution to the problem, a statistical approach is the usual practice. Also, there are methods such as neural networks which utilize "learn from examples" approach. In cases where the knowledge captured from the domain experts is not complete, statistical or neural computing methods can be applied. However, partial but useful knowledge about the problem can be incorporated into the statistical approach such that it enhances the reliability and the performance of the system. 1.1 Objectives of the Research The main objective of the research here is to develop a methodology for fast extraction of complex processing information of an object. The term complex is used because the information pertaining to the processing is complex to extract using conventional sensors. Use of an approach such as computer vision alone to accomplish this may either limit the processing speeds or be impossible due to the nature of the processing information. The objective, therefore, is to use knowledge acquired from the domain experts and/or from the examples to extract complex processing information. Knowledge gained from the domain experts is usually in the form of if-then rules. Statistical knowledge captured from sample data is usually in the form of analytical rules. Development of a rulebased system, which utilizes knowledge acquired from both of these methods, is another objective of this research. Also, how expert knowledge and statistical knowledge together Chapter 1. Introduction^ 3 can give rise to better performance and higher reliability is explored. Alternatively, the problem can be addressed as a pattern recognition problem. Comparison of several pattern recognition techniques for extraction of complex processing information of an object with the developed rule-based methodology is also an objective of this research. Development of a practical workcell to demonstrate the feasibility of the developed methodology is the final objective of the research. This may involve transformation of processing information into necessary signals to drive the various components of the system. Also, control and synchronization signals to establish proper communication and synchronization among various components of the system must be generated. 1.2 The Specific Application Even though the approaches investigated here are general, the techniques are developed with respect to the particular application of flexible automation of a fish processing cell. In this system, functional flexibility is achieved through knowledge-based computer vision and a robotic cutter [de Silva, 1991]. In a heading operation, each fish on a conveyer will be imaged at the vision station and the corresponding cutting contour must be determined. This will be transformed into a drive signal for the robotic cutter. The cutting contour of a fish is essentially a complex feature since it cannot be determined by a computer vision system alone without any a priori knowledge of the object. Ideally, the cutting contour should follow the collarbone, and in fact part of the collarbone can be estimated from the gill contour obtained from the vision data. The other part, however, is not visible and must be derived. In particular, the latter part of the contour should remove the pectoral fin. The cutting contour should also be determined by the system Chapter 1. Introduction^ 4 such that it maximizes the yield for a given fish. With a standard throughput of 120 fish per minute, a butchering operation has to be carried out in 500ms. A cutting contour for a fish, therefore, has to be generated also in less than 500ms. Due to the complexity of the contour, however, this is not feasible using current technology. Prior knowledge of fish along with few on-line measurements will therefore be used to extract this feature. This leads us to the secondary objectives of this research, which are the generation of the knowledge-base and the establishment of the on-line measurements. 1.3 Approach and Justification As described in the above section, the main objective of the proposed research is to develop a methodology to extract complex processing information of an object using simple on-line measurements and prior knowledge. The approach used in the present research is to first address a specific problem, and then attempt to generalize the procedures that are developed to solve the specific problem. Two major components are emphasized, namely on-line measurements and prior knowledge. The knowledge-base contains two components, the knowledge acquired from the domain experts and the knowledge gained from examples. The knowledge from the domain experts is captured by interviewing or questioning them, while the knowledge from examples is captured using statistical data analysis. It is possible that additional knowledge can be gained by studying and working with the problem. The knowledge captured from the experts and the knowledge gained by working with the problem are most commonly expressed in the form of if-then rules. As mentioned earlier, this knowledge may not be sufficient to solve the problem. This knowledge, however, can be used to improve the Chapter 1. Introduction^ 5 performance and the reliability of the system. It is, therefore, important to integrate this knowledge with the knowledge captured from statistical data analysis. In the present approach, statistical knowledge is also represented by if-then rules so that they can be conveniently integrated into the rule base. The development of the statistical component of the knowledge-base is divided into three subtasks. Firstly, a database of fish data is developed from still pictures and video tapes of fish, collected at a local fish processing plant. This database contains the details of the cutting contours of fish and some important dimensional measurements. These measurements are referred to as attributes or features. Secondly, the cutting contours in this database are non-dimensionalized and transformed to a reference frame that, allows a comparison of contours and grouping of similar contours. Finally, a knowledge-base is developed to identify the group of a particular fish based on simple on-line measurements. Statistical analysis of the database of fish measurements is carried out to develop the rules for the knowledge-base. In particular, relationships between the fish measurements are investigated during the development of the rules. A suitable set of fish measurements, which can also be quickly determined by conventional computer vision algorithms, must be identified. These measurements are chosen such that they best represent a particular group. Low-to-medium level image analysis techniques are used to establish these measurements. This involves image transformation, enhancement and analysis of basic features such as edges, regions, and shapes. Once the cutting contour is established as the inference of the knowledge-based system, it is dimensionalized and transformed to a drive signal for the robotic cutter. At the cutting station, the fish is gripped and cut according to the trajectory generated for that particular fish. Two commonly used approaches can be adopted. The first approach requires the position, velocity, and acceleration of the manipulator's joints at selected Chapter 1. Introduction^ 6 locations along the trajectory. The second approach requires the path that the cutter must traverse as an analytical function. Formal procedures for developing rules for the knowledge-based system from data are formulated. This is essentially the generalization of the procedures that are used to solve the specific problem. A systematic scheme is developed for the system to incorporate knowledge in the field to improve efficiency and reliability. Furthermore, conflicts and redundancies may appear in the rule-base. The probabilities of the relationships between the attributes can be used to approach this problem. Finally, all components of the system are integrated to form a single processing cell. The vision system, the cutter/gripper controls, and the conveyer controls communicate and synchronize the activities of various components through data, control, and synchronization signals. 1.4 Contributions of the Research Generation of rules for a knowledge-based system using probability theory is the main contribution of this research. Multivariate probability density distributions of attributes are used for rule generation as well as for rule prioritization. Combining the knowledge already available from the expertise and experience with the knowledge gained from statistical analysis to enhance the reliability and the performance of the system is another contribution of this research. This research also provides a set of algorithms to compute attributes for the rulebased system using image data. Comparison of knowledge representation techniques for this type of industrial application is also a contribution of the research. The main contributions to the industry are the new technology and a practical workcell for fish Chapter 1. Introduction ^ 7 processing. 1.5 Overview of the Thesis Chapter 2 describes the literature related to this thesis. In particular three major areas, namely computer vision, statistics, and expert systems, are reviewed. In Chapter 3, procedures developed for data acquisition and analysis are presented. In the same chapter, algorithms developed to group objects are also described. Chapter 4 describes the rule generation methodology developed in the present research. It gives a formal analysis of the methodology first, and then describes the rule generation through connectivity analysis. An extensive error analysis of the method is also presented. An illustrative example given in this chapter describes rule generation through connectivity analysis. Finally in Chapter 4, a method to integrate expert and heuristic knowledge with statistical knowledge is presented. Chapter 5 gives the vision algorithms developed for feature extraction. Here, analytical background as well as examples are presented. Commonly used pattern recognition techniques such as neural networks, Bayesian approach, k-nearest neighbour method, and multiple regression are described in Chapter 6. These methods were applied to the fish processing problem and the results are shown. Chapter 7 compares the methods developed in Chapter 6 with the developed rule-based methodology. In Chapter 8, statistical estimation system is developed to estimate a point in the head of a fish. This point will be used in head removal of fish processing. A neural network estimator is also developed and the results of the two methods are compared. The laboratory implementation of the systems developed in the present research are presented in Chapter 9. The feasibility of a prototype workcell with an integrated rule Chapter 1. Introduction^ 8 base, a model base, a vision system, and robot is demonstrated. An automatic method to generate the rules is also developed in this chapter. Finally, Chapter 10 concludes the thesis and highlights some of the continuing work from this thesis. Chapter 2 Literature Review From an initial feasibility analysis of an existing indexing mechanism which uses tactile sensing of the fish anatomy, it was concluded that a tactile sensing mechanism is not suited for this application. Since the indexing mechanism did not work efficiently and resulted in considerable wastage, it was decided to investigate vision based techniques as a sensor in the removal of fish heads prior to canning. High speed computer vision is also attractive since the processing time for this operation is critical (—, 300ms). Further study of the problem revealed that the knowledge about fish cutting can be obtained from experienced butchers. The area of knowledge-based systems was studied in order to use this knowledge to aid the computer vision procedures. Although this knowledge is very useful in developing an automated butchering system, it was incomplete and inadequate. Therefore, statistical methods were investigated to extract new knowledge that can be used with the existing knowledge to develop a system for automated fish processing. In view of the relevant topics for this work, the literature review includes a discussion of selected papers from the areas of statistics, knowledge-based systems, general computer vision, and knowledge-based computer vision. Knowledge-based vision area was also researched to study the methodologies that are available to incorporate knowledge into a computer vision system. 9 Chapter 2. Literature Review^ 10 2.1 Statistical Decision Theory and Pattern Classification The field of statistical decision theory is very well researched and major contributions to the field are documented in [Neyman and Pearson, 1928], [Wald, 1950], [Ferguson, 1967], and [Blackwell and Cirshick, 1954]. In addition, statistics and decision theory is closely related to game theory which was developed greatly due to [Von Neumann and Morgenstern, 1944]. The pioneering decision theory work by [Neyman and Pearson, 1928] is based on hypothesis testing and the probability of error, and this was generalized by Wald [Wald, 1950] to include loss and risk. The fundamental Bayesian approach to pattern recognition has been avoided by many statisticians mainly because there is no reasonable way to determine the a priori probabilities for many problems. The approach assumes that the decision problem is posed in probabilistic terms and that all the relevant probability values are known [Duda and Hart, 1973]. Anderson [Anderson, 1958,1984] develops the Bayesian theory of classification in general terms and applies it to multivariate normal distributions. He proves that the Bayesian procedure is optimal and derives a quadratic discriminant function. The use of discriminant functions for classification was introduced by [Fisher, 1936] One of the first researchers to apply the Bayesian approach for pattern recognition is Chow [Chow, 1957], who introduced a provision for rejection and developed a fundamental relation between error and rejection rates. More recent work on Bayesian classification involves error estimation and effects of sample size on classification performance [Raudys and Pikelis, 1980]. Jain and Hoffman [Jain and Hoffman, 1988] describe an evidence-based recognition system to recognize 3-D objects from range images. In this system, they employ a few Chapter 2. Literature Review^ 11 crucial (notable) features of the object, and each object is represented as one or more rules in a rule base developed using these features. An object in the range image is identified by computing a similarity measure between the observed features and those represented in rule form. It recognizes only those objects represented in the rule base and any new object is refuted. Although some of the concepts of this paper are relevant to the present research, the present problem is much more complicated in the sense that the objects in this case are naturally varying. Also, it is impossible to assign a rule for each object let alone many rules for an object. As previously mentioned, a statistical approach is the most common for problems with either incomplete or no heuristic knowledge. Neural networks are commonly investigated for these types of applications. In the present research, the generation of complex process information is treated as a pattern classification problem in which processing information for a given object is selected from one of the several categories on the basis of some key measurements on the object. 2.2 Knowledge-Based Systems, and Rule-Based Systems Knowledge-based systems are programs which use collection of facts, rules of thumb, and other knowledge about a given field coupled with methods of applying those rules to make inferences. Expert systems have been built in a wide variety of domains [Hoffman et. al., 1986] [Hays-Roth et. al., 1983] and some of the applications include medical [Buchanan and Shortliffe, 19841 [Buchanan and Shortliffe, 1975], process planning [Schaefer et. al., 1988], process diagnosis [Ishida, 1988], education [Tompsett, 1988], Chapter 2. Literature Review ^ 12 and mineral exploration [Duda et. el., 19791. They differ substantially from conventional computer programs because their tasks are ill-defined, knowledge intensive, and have no algorithmic solutions. Production rules (also called IF THEN rules) are the most popular representation of domain knowledge needed for an expert system. Hence, an expert system built using production rules is also known as a rule-based system. Production rules were brought into Artificial Intelligence (Al) by Newell, who had seen their power and simplicity demonstrated in Floyd's [Floyd, 1961] work on formal languages and compilers. Feigenbaum began advocating the use of production rules to encode domain specific knowledge in DENDRAL [Feigenbaum et. el., 1971]. Waterman picked up on the suggestion but decided to work with rules and heuristics of the game of poker [Waterman, 1970]. As described in chapter 1, the objectives of the present research is to use simple online measurements and prior knowledge on the object to extract complex features. It is of paramount importance, therefore, to represent prior knowledge in a knowledge-base. The approach in the current research is to represent knowledge in the form of rules. Literature in the area of rule-based expert systems was primarily reviewed to study how knowledge can be represented as rules. There are numerous rule-based systems available for knowledge representation, however, rule-based systems that incorporate both statistical and heuristic or expert knowledge have not been reported. Buchanan and Shortliffe [Buchanan and Shortliffe, 1984] describe, in detail, a consultant expert system, MYCIN, which provides medical diagnoses and prescribes drugs for bacterial infections. Gayle and Dankel [Gayle and Dankel, 1986] present a knowledgebased system for drug interactions which helps medical personnel in identifying the potential of harmful drug induced effects in patient care. Hoffman and Carviedes Chapter 2. Literature Review^ 13 [Hoffman et. al., 1986] present a rule-based expert system for repair domain which diagnoses defects in electronics systems, guides a technician through appropriate repair procedures, verifies diagnoses, monitors repair effectiveness, and maintains records for subsequent evaluation and quality control. A rule-based system for geometrical reasoning for the interpretation of line drawings is presented by Whitaker [Whitaker, 1986]. The system derives the object on the basis of three orthogonal two dimensional views and attempts to represent a draftsman's knowledge in deducing the structure of the object. A study of the structure of these rule-based systems, and in particular the arrangement of rules to improve efficiency, provides the necessary background in rule-based systems. Gottinger [Gottinger, 1988] introduces two types of expert systems which involve statistical expertise: statistical consulting programs and pattern finding programs in databases. Pattern finding systems use statistical techniques such as correlations and discriminants as well as knowledge about their domain to search a database and discover relationships. For example, the RX program [Blum, 1982] searches through a medical database for possible causal relationships and uses medical knowledge to rule out common causes and spurious correlations. The work in this paper led to the idea of extracting relationships from a database. Birman [Birman, 1982] presents a rule-based method for ECG analysis that identifies cardiac abnormalities. In this system, each waveform is represented as a tree of rule activations. Howard and Rehak [Howard and Rehak, 1989] discuss the importance of interfacing expert systems with databases and they present a knowledge-aided database management system (KADBASE). This is a flexible interface in which multiple databases and knowledge-based systems can communicate as independent, self descriptive components. An expert system for labeling segments in infrared imaging is presented by Roberts Chapter 2. Literature Review^ 14 [Roberts, 1986]. This system labels high priority targets, low priority targets, roads, trees, forests, and potential clearings. Watanabe et. al. [Watanabe et. al., 1989] compare ART [ART, 1984], STROBE [Lafue and Smith, 1985], LOOPS [Stefik et. al., 1983], and OPS83 [Forgy, 1985] with their CL - Conceptual network-based Language - a LISPbased tool kit to assist knowledge engineers. This paper is of interest to this research because it compares 0PS83, which is the language used in the present implementation, with other languages. Carbonell, Cullingford, and Gershman [Carbonell et. al., 1981] discuss a knowledge-based system to translate languages. 2.3 Computer Vision Although human vision has been an academic subject of interest since seventeenth century, computer vision and digital image processing have largely evolved only during the last 30 years [Levine, 1985]. In particular, during the past 20 years there has been a considerable growth of interest in problems of computer vision [Fu, 1982]. The need has created an increasing activity in the development of theoretical methods and experimental software and hardware for use in vision systems and the design of such systems [Fu and Young, 1986]. Fu [Fu, 1982] defines computer vision as the process of extracting, characterizing, and interpreting information from images of a three dimensional world. This process may be subdivided into six principal areas (1) Sensing, (2) Preprocessing, (3) Segmentation, (4) Description, (5) Recognition, and (6) Interpretation. Sensing is the process that yields a visual image. Preprocessing deals with techniques such as noise reduction and enhancement of details. Segmentation is the process that partitions the image into objects of interest. Description deals with the computation of features (eg. size, shape) suitable for differentiating one type of object from another. Recognition Chapter 2. Literature Review ^ 15 is the process that identifies these objects (eg. wrench, bolt, engine block). Finally, interpretation assigns meaning to an ensemble of recognized objects. In more recent publications [Marr, 1982], Marr and Hilderith describe computer vision in a different way. Marr's view of vision is from raw primal sketch --> full primal sketch 21D --> 3-D models. A contribution of the raw primal sketch are edges, bars, blobs, and terminators, and these are characterized by orientation, contrast, length, width, and position. The full primal sketch deals with issues such as virtual contours or subjective contours in the image. The 21D sketch describes the properties of a surface including spatial properties, reflectance, and illumination. Deriving the 3-D models from 2-21D sketch is task and situation dependent and it requires knowledge about the scene. Marr's theory of object representation is an important part of much work on computer vision and is, therefore, relevant to the research described in this thesis. Horn [Horn, 1986] describes the computer vision problem as a two stage process. In the first stage (generally called the image analysis stage), a symbolic description of the image is produced. In the second stage, the scene analysis stage, a symbolic description of the scene is produced from the symbolic description of the image. This description is closely related to the present research where an image analysis stage extracts the attributes of an object from the image, and a scene analysis stage classifies an object described by the attributes to a given class. Although pattern classification and scene analysis are generally considered as two different issues, it is also possible to consider classification as a special case of scene analysis. In classification, however, the results of image analysis are used to make a class or group assignment rather than to define a complete description in scene analysis. In the specific application that is considered in the present work, the generation of a cutting contour for a fish by recognizing the pattern of the gill plate in a conventional Chapter 2. Literature Review^ 16 manner was initially investigated. Three methods have been discussed to recognize a pattern by [Fu(3), 1982]: (1) template matching, (2) decision theoretic approach, and (3) structural and syntactic approach. In template matching approach, a set of templates or prototypes, one for each pattern class, is stored in the machine. The input pattern with known classification is matched or compared with the template of each class, and the classification is based on a preselected matching criterion or similarity measure. In decision theoretic approach, a pattern is represented by a set of N features and the decision-making process is based on a similarity measure (between features) which, in turn, is expressed in terms of a distance measure. The classification of patterns is based on a statistical decision rule or membership function. In the structural and syntactic approach, a pattern is often represented as a string, a tree, or a graph of pattern primitives and their relations. The decision-making process, in general, is a syntax analysis. Much work has been done to describe boundaries or shapes of the objects in an image [Rosenfeld, 1975]. Shapes or boundary of an object can be applied in two ways for the problem of cutting contour generation in fish processing. On-line measurements can be transformed into parameters describing the shape of the object, and analysis of boundary contour of an object can be used to determine its orientation. [Brady, 1982] describes a vision system for parts description and acquisition. The paper reports that many objects in stable configuration can be recognized from their bounding contour and a procedure is developed for the extraction of two dimensional shapes and contours. [Persoon and Fu, 1977] use Fourier descriptors to describe boundary curves or shapes. In this paper they deal only with nonoverlapping boundaries that are completely known. The classes are described by a few fixed prototypes that may be rotated, translated, or scaled. [Chellappa and Bagdazian, 1982] describe a transform coding technique for closed 2-D image boundaries. The boundary is represented by either (x,y) co-ordinates Chapter 2. Literature Review^ . 17 of the end points of the segments or by magnitude of successive radii vectors that are equi-spaced in angle around the given boundary. Moment invariant functions have the desired property that they are invariant under image translation and rotation. Sensed objects can be matched with the predefined models through this process. [Dudani et. al., 1977] discuss a method to identify aircrafts by moment invariants extracted from binary images. This paper particularly concerned with the recognition of complex man-made objects rather than natural objects or simple solids. [Dirilten and Newman, 1977] address the problem of matching patterns modified by intensification, translation, magnification, and rotation-reflection. [Reddi, 1981] derives a set of radial and angular moment invariant functions which are capable of identifying rotated, translated, reflected, and scaled images. This technique is of interest to the fish processing application since fish have a significant size variation despite a similarity in cutting contour shape. Vision guided inspection is also widely used in industry today. Reinhold illustrates an installation of an automatic vision inspection system to inspect sheet metal parts [Reinhold, 1982]. This paper addresses the crucial issues of transfer of technology to the industry such as packaging processing electronics for the factory environment, reliably illuminating and imaging the part under test, providing a satisfactory user interface, and designing equipment to be maintained by factory personnel. Although the primary objective of this research is to develop a knowledge-based methodology for fast extraction of complex processing information of an object, the issues described in this paper are also addressed because they are some of the goals of the Industrial Automation Laboratory. Chapter 2. Literature Review^ 18 2.4 Knowledge-Based Computer Vision Incorporating knowledge into a vision system can help in many ways. Specifically, in the present research it is incorporated to speed up the process. Bolles and Cain [Bolles and Cain, 19821 highlight the major problem associated with currently available machine vision systems: they only recognize and locate isolated parts against a contrasting background. These systems recognize binary patterns by measuring global features of a region such as area, elongation and perimeter length, and compare these values with stored models. Darwish and Jain [Darwish and Jain, 1988] compare two methods of visual pattern inspection: image comparison and rule based n.• Lhods. The first method involves the well known template matching technique in which a pattern under test is compared with a reference pattern. The disadvantages of this method include the requirement for precise alignment prior to comparison and the difficulty in matching templates in the presence of tolerances in specifications. The rule-based method identifies defective patterns if they do not conform with the specified design rules. In the popular model-based object recognition approach, the primary objective is an efficient match between features that have been extracted from sensory data and corresponding features in object models. Perkins [Perkins, 19781 presents a model-based vision system for industrial parts that can determine the position and orientation of complex curved objects. The paper, however, does not consider the speed of the processes. Kjeldsen, Bolle, and Sabbah [Kjeldsen et. al., 1988] discuss a model based approach to recognize complex shapes using primitive shapes such as planes, cylinders, cones, and spheres. First, partial geometric descriptions of the primitives are extracted, Chapter 2. Literature Review ^ 19 and based upon these partial descriptions a more complete geometric description is obtained. This information is then used to index into an object database. Hence, complex recognition is achieved in stepwise fashion. A model-based vision system is presented by [Suzuki and Arimoto, 19881 to recognize continuous handwritten letters. The letter recognizer uses a parallelogram that covers the maximal letter and processes a sequence of letters while shifting the window. Recognition is achieved by calculating a similarity measure. Chapter 3 Data Acquisition and Analysis The main objective of the present research is to develop vision-based techniques for generating complex processing information of an object using measured features and a priori knowledge of the object. This will lead to a knowledge-based computer vision system for process trajectory generation in industrial automation. Although the system will be developed with emphasis on automated fish processing, the theory can as well be adopted to other automated processes. The formal analysis of the theory is described in Chapter 4. A majority of commonly available computer vision systems is incapable of real-time processing, particularly, when the position, orientation, and other features of a moving object have to be computed for this purpose at speeds of greater than 2Hz. Hence, a knowledge-based approach is developed to solve this problem. The approach used in the present research is to first address a specific problem, and then attempt to generalize the procedures that are developed to solve the specific problem. The specific approach employed has the following components: 1. Off-line analysis of fish data to develop a database of fish models. This database is developed using static pictures and video tapes of fish, collected at a local fish processing plant. The database consists of a set of geometric dimensional measurements and a cutting contour for each fish. 20 Chapter 3. Data Acquisition and Analysis^ 21 2. Mathematical analysis of the cutting contours to compare and group similar contours. The cutting contours represented in the database are dimensional, and therefore, they should be non-dimensionalized and transformed to a reference frame in order to compare and group similar contours. 3. Identify a suitable set of attributes which can be quickly determined by conventional image analysis algorithms. These attributes have to be carefully chosen such that they best represent the corresponding model. Also, they should be simple enough in order to determine them sufficiently fast by the vision system. Low to medium level image analysis techniques involving image transformation, enhancement, and analysis of basic features such as edges, regions, and shapes are adopted. 4. Develop an attribute driven rule-based system to identify a model of a particular fish. Statistical analysis of the attribute data, which are in the database, is carried out to develop these rules. Specifically, relationships between attributes are examined. Also, heuristics and experience are integrated into the rule base to improve the performance of inference. 5. Investigate the possibilities of developing faster algorithms to determine the above attributes. 6. Develop a procedure to generate cutting trajectories. 3.1 The Approach The method described below for the generation of cutting contours utilizes an integrated model-based and rule-based (knowledge-based) approach [Gamage and de Silva, 1991a]. Consider the system shown in Figure 3.1. Suppose that a series of models representing all Chapter 3. Data Acquisition and Analysis ^ 22 possible variations of fish that are normally expected at a processing machine is generated. Each entry in the database will consist of a cutting contour and a set of attribute values which identifies the particular model. A set of attribute values is processed at the vision station during real time operation of the process (Figure 3.2). These attribute values, when transmitted into the working memory (input) of a rule-based system, would fire a rule which in turn would identify a model of fish. The model provides the necessary cutting contour to the robotic cutter. The time taken to compute the characteristic attribute values is very much shorter than the time needed to compute a cutting contour from an image of a fish. The generation of the model base and the rule base are necessarily carried out off-line. Hence, computer overhead does not present a serious obstacle here. 3.2 Data Acquisition The development of fish models was carried out by first recording data on representative samples of fish and then analyzing the data to compare and group them. A large number of fish images, in the form of still pictures and video, was recorded at a local fish processing plant. These images were, subsequently, read into a vision workstation in order to identify a suitable cutting contour and a set of geometric features (attributes) for each image. This process is schematically illustrated in Figure 3.1. The best cutting contour (the cutting contour that recovers most meat from the head while removing the pectoral fin) was identified with the help of fish cutting experts. The cutting contour, at this stage, is in terms of pixel co-ordinates as shown in Figure 3.3. All possible geometric features of the fish that can be identified through a vision system were also recorded. These features are highlighted in Figure 3.3 and Figure 3.4. The dimensional measurements of samples of fish together with the cutting contour were Oti I-% CD 5.4 i--■ .. H CD C/D ‘< el ,rCD 5 ..., o'-1 c-tCD 0 .--. ..... • cl CI CD < ro o 5co <-4- 0 ..., P, 4 o CD , to P, CD I-1-, 0 I-t CID P) 5 0 Chapter 3. Data Acquisition and Analysis^ Figure 3.2: A Distributed Control Structure for On-line Fish Processing. 24 Chapter 3. Data Acquisition and Analysis^ stored in a database. A typical set of data of fish dimensions are shown in Table 3.1. 1^Nose 2^Eye 3 - 8^Cutting Contour 9 - 10 Width 11^Tail Figure 3.3: Nomenclature for Geometric Data of a Fish. Table 3.1: Typical Data of Fish Dimensions. Fish No. Nose-Eye Nose-Gill Nose- • - • Length Width (cm) (cm) (cm) (cm) (cm) 12.6 Si 3.4 10.2 59.0 ••^• 13.4 3.3 59.5 S2 10.7 ••^• 14.3 3.9 11.7 -^•^63.7 S3 12.6 S4 3.6 10.8 -^-^• 56.4 11.6 -^•^• 60.5 13.8 S5 4.5 P4 -^-^• 48.6 12.3 3.2 9.7 14.5 ••^• 77.8 18.4 4.8 CH5 25 Chapter 3. Data Acquisition and Analysis^ 26 1 Nose 2. Eye 3 Flap 4,5 Flap Width 7,8 Gill Width 6 Gill 9 Fin Top 10 Fin Center Figure 3.4: Nomenclature for Geometric Data on Fish Head. 3.3 Data Analysis Prior to the analysis, the data must be processed such that all measurements are nondimensionalized with respect to a common reference frame. Data processing, therefore, includes co-ordinate transformation, curve-fitting, and scaling. A cutting contour, as mentioned previously, is a set of pixel co-ordinates obtained from the fish image. In order to match and group similar contours, it is necessary to transform all contours to a common reference frame. This transformation was accomplished by fixing a local co-ordinate frame at one end of the contour, and moving this frame to a known reference frame through a translation and a rotation as illustrated in Figure 3.5. The cutting contours were then nondimensionalized so that the chord length of the contour is unity. This process can be mathematically Chapter 3. Data Acquisition and Analysis^ 27 expressed as follows: A Figure 3.5: Reference Frames for Coordinate Transformation in the Cutting Contour Analysis. Suppose that the pixel co-ordinates obtained from the image are (4,^, (x:, y:), . ,^), then translation is simply Xt = X - X The chord length of the contour is ^ 1* (3.1) Chapter 3. Data Acquisition and Analysis ^ 28 L =^— yD2 + (xin — )2 = v y;r + 42.^ (3.2) Now, the contour can be non-dimensionalized as follows: xi xi (3.3) The Sine and the Cosine values of the angle of rotation (0) can be expressed as Sin0 = y" and Cos0 = x"'^ (3.4) because the chord length of the contour is unity. The process of rotation can be expressed in matrix form as follows: yi^C os0^Sin01[ yr ^ (3.5) x-^—Sin() Cos0 Since the angle of rotation has to be 9 in the opposite direction, the transformation matrix becomes ^ Chapter 3. Data Acquisition and Analysis^ Y,^Cos° —Sine l[yr 29 (3.6) xi^Sine^Cost9^xr and can also be expressed as Yi ^1 i^Y: 1 —^(Yn (x72 xi ] (4 — Y1)2 +^— 4)2 (Yni — Y1)^(xn' —^x: — .^(3.7) The comparison procedure is facilitated by first curve fitting each set of pixel coordinates in to a polynomial expression to obtain an analytical cutting contour. Several methods were tried for this purpose including spline and least squares approaches. The most satisfactory cutting contour was obtained by polynomial curve fitting as shown below: Yi n-1^n-2 x1 1 Y2 n-2 x2 n-1^ x2 1 yn ,n-1 xn-2 n ^ al (3.8) an Chapter 3. Data Acquisition and Analysis^ 30 which is of the form y = ,V.a^ where y e Rn, a 6 ^X 6 (3.9) WnXn, and (xi, yz) are the coordinates of the cutting contour. The analytic form of the cutting contour is then given by the polynomial y(x)^a1xn -1^a2xn- 2^. .. 4_ an^ (3.10) The comparison of cutting contours is accomplished by first picking an arbitrary contour and matching it with all remaining contours. A parameter called index of mismatch (IM) is computed to determine how well two curves correlate with each other. This index of mismatch is the minimum absolute area between the two curves. In other words, this is the area under the two curves that is not common to both curves as shown in Figure 3.6, and it is obtained by minimizing the absolute area between the contours through a process of frame rotation (SO). A sample of typical values of index of mismatch is shown in Table 3.2. Table 3.2: Typical Values of Index of Mismatch Compared with Fish No. 2. Fish No. Si S2 S3 S4 S5 Index of Mismatch 0.1177 0.0000 0.0295 0.0207 0.0527 This minimum value of the area is selected as the index of mismatch and it is calculated according to the following formula. Chapter 3. Data Acquisition and Analysis ^ 31 Figure 3.6: Representation of the Index of Mismatch as the Minimum Absolute Area Between Two Contours. Chapter 3. Data Acquisition and Analysis ^ 32 IM = Min^Abs(yi(x) — y (x))dx^(3.11) 40) In which yz denotes the figuration of the jth ith contour which is transformed into the closest-match con- contour (y3( x)) through a sequence of incremental rotations (SO), as illustrated in Figure 3.6. A tolerance value is selected for the index of mismatch and all data below this value are grouped into one model. The tolerance value selected is directly related to the wastage, therefore, this value is chosen so that the wastage incurred by selecting a particular contour (model) is acceptable (less than 1% of the total weight of the fish). The grouping process is repeated with another contour and so on until all the data have been grouped into models. Note that, some samples of fish may belong to more than one group due to tolerance value selected. In such a case, the model which gives the lowest index of mismatch is selected. 3.4 Model Grouping Through Mismatch Analysis Grouping data into a reasonably compact set of models is done by a procedure of mismatch analysis for the complex pattern of interest[Gamage and de Silva, 199114 In the present implementation, the complex pattern is a cutting contour, and each data item in the database contains this contour with some dimensional measurements related to the contour. The grouping of contours into models is done in two stages. In the first stage, each contour is compared with all the remaining contours and a group of contours similar to that contour is determined. In the second stage, the groups of similar contours which may have common elements are compared to determine the most appropriate group to which a common contour should be assigned. In this manner, a smaller number of distinct Chapter 3. Data Acquisition and Analysis ^ 33 groups with no common elements is developed. The main steps of the first stage of the procedure are 1. Pick a contour and compare it with all the remaining contours contained in the data. 2. Compute the index of mismatch (IM) which represents the level of deviation of one contour with the other. 3. Select a suitable threshold level for the index of mismatch and group all the data items (contours) which have their indices of mismatch below the threshold into one model. 4. Repeat steps 1, 2, and 3 for all the contours. This process will generate a large number of groups, equal in number to the number of contours. The flow chart given in Figure 3.7 illustrates this procedure. The second stage involves the following steps 1. Pick a group which was generated in the first stage and examine each of its elements to check whether it is repeated in any other group. 2. If the particular contour is common to two or more groups, compare the indices of mismatch of this contour within these groups, assign the contour to the group which has the lowest index of mismatch for this contour, and delete it from the remaining groups. 3. If not repeated, there is no conflict and retain the element in the same group. Chapter 3. Data Acquisition and Analysis ^ 34 Pick a Contour Compare with other contours and compute the index of mismatch (IM) Ignore this contour Include the contour in the group 1Yes iYes (-Stop D Figure 3.7: First Stage of the Mismatch Analysis Procedure. Chapter 3. Data Acquisition and Analysis ^ 35 This procedure will, in general, reduce the number of groups considerably. The flow chart given in Figure 3.8 illustrates the second stage of the model grouping process. This grouping method was compared with a clustering method known as k-means clustering [Wilkinson, 1990] and it produced similar results. The k-means algorithm, however, does not guarantee that all the data in a particular group satisfies the specified tolerance level. Furthermore, the grouping method described here allows us to use any feature as the matching parameter. In the present case, it was the absolute area between two contours. 3.4.1 Model Comparison As a result of the grouping procedure described above, the present fish data set was clustered into five groups with each having a representational contour for the group. This representational contour of each group is called the model. In order to estimate the costs associated by selecting an incorrect model, it is possible to use the index of mismatch between two contours. As defined previously, the index of mismatch is the minimum absolute area between two contours. This value, therefore, relates to the wastage of useful meat and/or unwanted material left in the body of the processed fish as a result of selecting an incorrect model. Both wastage of useful meat and leaving unwanted material in the body have their associated costs. In particular, the latter case needs further processing. Table 3.3 shows the indices of mismatch between models for all five groups. Note that the last column of this table shows the area difference between the corresponding model and a straight "V-cut" (please refer to Figure 5.7). After manually butchering several medium size (1.5kg — 2.0kg) fish, it was possible to relate the indices of mismatch given in Table 3.3 to a percentage of ratio of wastage and/or unwanted material in the body to total weight of the fish. These percentage 36 Chapter 3. Data Acquisition and Analysis Pick a Group Read an element from the group Compare the IM in a common roup Higher Delete it from the group Lower or Same Are the common groups exhausted? Retain in the same group Keep the elements in the group which has the lowest IM Are all the elements rocessed? Yes Figure 3.8: Second Stage of the Mismatch Analysis Procedure. Chapter 3. Data Acquisition and Analysis ^ 37 Table 3.3: Indices of Mismatch between the Models Model 1 2 3 4 5 1 X 2 0.0352 X 3 0.0496 0.0180 X 4 0.0277 0.0294 0.0226 X 5 0.0239 0.0388 0.0529 0.0314 X Straight Cut 0.0959 0.1236 0.1400 0.1174 0.0986 values are shown in Table 3.4. As in Table 3.3, the last column of this table indicates the approximate wastage of useful meat if a straight "V-cut" is performed rather than the respective model. The "V-cut" butchering machine is also a prototype machine which is being developed in the Industrial Automation Laboratory. In existing butchering machines, head removal is performed by a rotary chopping knife which causes about 10% ,--, 15% wastage of useful meat on average. Table 3.4: Percentages of Wastage and/or Unwanted Material Left in the Body to the Total Weight of fish Model 1 2 3 4 5 1 X 2 2.0% X 3 2.2% 1.0% X 4 1.6% 1.7% 1.3% X 5 1.4% 2.0% 2.2% 1.8% X Straight Cut 3.4% 4.2% 4.8% 4.0% 3.2% As an example, a 1.6kg fish which belongs to model 1 was manually butchered according to all five models. The following figures show the actual wastage in grams, while in the cases of model 2 and model 3 operations, in addition to these figures, the contours cut into the head and left some gill and pieces of collarbone in the body resulting further Chapter 3. Data Acquisition and Analysis ^ 38 processing. Model 2 - 26g + gill and collarbone pieces Model 3 - 30g + gill and collarbone pieces Model 4 - 25g Model 5 - 22g 3.5 Feature Selection One of the basic problems in pattern recognition is to identify a set of features (attributes) that gives the best classification results. The purpose of feature selection is to recognize these features for the problem concerned. One way to approach this problem is to collect all possible features that can be measured using the existing sensing device. When the number of features increases, however, the procedures that are analytically and computationally manageable in lowdimensional spaces can become completely impractical in large (50 ,,, 100) dimensional spaces [Duda and Hart, 1973]. This problem is known as curse of dimensionality. A variety of methods for dimensionality reduction have been proposed in the literature [Meisel, 1972] [Kendall and Stuart, 1966] [Harman, 1967], because the curse of dimensionality poses a major problem for many pattern recognition procedures. The most common methods include principal component analysis and factor analysis. The Principal Component Analysis (PCA) finds a lower-dimensional representation that accounts for the variance of the features. The Factor Analysis (FA) finds a lower-dimensional representation that accounts for the correlation among the features. In the present research, the latter approach was employed to reduce the number of geometric features. Chapter 3. Data Acquisition and Analysis ^ 39 The process of dimensionality reduction through factor analysis can be thought of as the process of removing or combining highly correlated features. Also, this can be treated as a clustering procedure. For example, consider a data matrix of n rows with ddimensional samples. On the one hand, ordinary clustering can be considered as grouping rows to form a smaller number of cluster centers which can be used to represent data. On the other hand, dimensionality reduction can be thought of as combining columns (features) to form a reduced number of columns to represent the data. Duda and Hart [Duda and Hart, 19731 describe a hierarchical clustering procedure to reduce the dimensionality and it uses correlation between two features. The correlation between two features can be defined as: 23 where a ^the covariance of the features i and j, and u. and (3.12) CI 3 are the variances of features i and j respectively. Also, —1 < pi, < 1 and, therefore, 0 < 4 < 1 with pi22 = 0 for completely uncorrelated features and 4 = 1 for completely correlated features. Hence, it is clear that correlation is a measure of similarity between two features. For example, if the correlation between two features is large then they are good candidates to be merged into one, or one can be removed leaving the other to represent data. This procedure leads to the following hierarchical dimensionality reduction procedure. 1. Initialize dimensionality (d) with the number of features in the original data set. 2. Compute the correlation matrix of features (acz). 3. Find the most correlated pair of features (aci and ac, Chapter 3. Data Acquisition and Analysis ^ 40 4. Merge act and ac, or delete one of them (say ac„). 5. Decrement d by one. 6. Go to step 2. Duda and Hart [Duda and Hart, 1973] describes this procedure to continue until the desired number of features are established. In our case however, this procedure was continued until the correlation coefficients between features fell below a selected threshold. Geometric features such as nose-to-eye, nose-to-flap, nose-to-gill, nose-to-the center of the pectoral fin, nose-to-the edge of the pectoral fin, width at flap position, width at gill position, width at the center of the fish (width), and length of the fish were initially chosen for the problem of contour selection for fish butchering. With the hierarchical dimensionality reduction procedure described it was possible to reduce the number of features to four. These features are nose-to-eye length, nose-to-gill length, width, and length of the fish. The next phase is to develop a set of rules to identify a particular model using these features (characteristic attributes). 3.5.1 Development of Rules The production of rules is considered to be one of the most difficult and time consuming stage in the development of rule-based systems [Hand, 1985] [Rowley, 1986]. Even where there are experts willing to assist the knowledge engineer in the development of the expert system, it may be that the expert has great difficulty in making the rules explicit [Hand, 1985]. Data analysis for database design has been advocated for all system design and has been accepted in conventional database applications [Held and Carlis, 1989]. It is still Chapter 3. Data Acquisition and Analysis ^ 41 foreign to the design of rule-bases, however. In the present research, the rules are developed by analyzing data to establish the relationships between attributes. As mentioned in the previous section, the attributes selected are nose-to-eye length, nose-to-gill length, width, and length of the fish which are respectively represented as aci, ac2, ac3, and ac4. In general, the attributes aci, ac2, ac3, and ac4 may have the most general many <-4 many many 4-) many relationships among each other. This means that for a particular group, all the possible values of attribute aci may be associated with all the possible values of attribute ac2, all the possible values of attribute ac2 may be associated with all the possible values of attribute ac3, and so on. Further examination of data, however, reveals that the many to many relationships are not most general as described. The entire set of fish in a group is divided into subsets on the basis of the multivariate probability density of the candidate attributes as described in Chapter 4. The same results (subsets) can be obtained by setting up a connectivity diagram which describes the relationships between the attributes. Although it is possible for all attribute values to participate in the relationships, there can be hidden dependencies which restrict this participation. In other words, any arbitrary combination of attribute values may not correspond to a set of valid fish data. Once the proper relationships and the corresponding subsets are identified, they can be grouped such that a logical combination of these determine a unique model. The formal analysis of this process and the generation of rules through connectivity analysis are described in Chapter 4. Chapter 4 Rule Generation In the off-line analysis of rule generation, the data were grouped (clustered) through mismatch analysis as described in Chapter 3. In this chapter, the formal analysis of the rule generation methodology is first described. Secondly, a practical rule generation system using connectivity analysis is presented. The connectivity analysis, which is a direct application of the theory developed in the formal analysis section, is illustrated by a simple example. Also in the formal analysis section, an error analysis for the developed methodology and a comparison of error between the present method and the Bayesian approach are developed. Finally, a scheme for integrating expert knowledge and heuristics into the statistical rule base is developed. 4.1 Formal Analysis of Rule Generation After grouping data with similar patterns (feature of interest) into the respective groups, it is necessary to generate rules for these groups. Possible relationships among the features within each group are identified to generate the rule base for classification. Consider a set of m features ack, k = 1,2, , m. For each feature in a group or class, the measured (non-dimensionalized) feature values are arranged in a column in the ascending order of the numerical value. A relationship is defined as a set of feature values for a data item (a tuple of features). For example, a relationship comprises one value from each of the m columns and it forms a vector in m-dimensional space (arm). The subranges 42 Chapter 4. Rule Generation ^ 43 for each feature column are determined by the number of relationships falling into them. Specifically, if a particular set of subranges (subranges for aci, ac2, ,acn.,) has a larger number of relationships than a set of subranges slightly perturbed in their respective columns then the set of subranges which gives the maximum number of relationships in the neighborhood is considered for rule generation. In multi-dimensional space (in this case m-dimensional space), these subranges form hyper-cubes. The rules for a particular group are determined by the density of the relationships. The objective is to determine if a given feature vector of an object belongs to a high density region of arn feature space. In order to do this, it is necessary to determine the density functions of each group. The procedure for estimating the density functions is described below. The probability P that a feature vector ac belongs to the region 7Z, of a particular group C given by P = p(a_cri)dac (4.13) where p(ac)C3) is the multivariate conditional probability density function, the probability density function of ac given that the group or class is C3. The probability P can also be described as the smoothed or averaged version of p(aciC3). Now, suppose that n samples of ac (acl, ac , ,) are drawn according to the probability law p(acr 3). Then the probability that k samples belong to region 7Z is given by binomial law Pk = Pk(1 — P)n_k (4.14) with E(k) = nP as the expected value of k. Again, P is the probability that a sample belongs to the region R given that the group is Cr With the binomial law, the maximum Chapter 4. Rule Generation ^ 44 likelihood estimate for P is (4.15) If the region R is assumed to be small, then p(acr3)dac^p(acri).V^ (4.16) where ac is the point within R and V is the volume of the hyper-cube enclosed by 1Z. Combining equations 4.13, 4.15, and 4.16 p(acIC3) can be estimated as follows: k In 75(aciC j)^ V (4.17) If the sample size is large, then k converge in probability But for fixed volume V, f)(ac C3 is an average of p(acr3). P fRp(ac1C3)dac fl(acIC^= ^ (4.18) V^fR dac For finite sample case, V can be specified as a function of n. A general result for density estimates is that as n increases V should decrease. Rudemo[Rudemo, 1982] stated that for univariate case the length h should decrease as n-1. For m-dimensional case with n samples the volume can be expressed as follows: (4.19) Chapter 4. Rule Generation^ 45 Here, h is the size of each subrange. Without loss of generality, h, which is the size of each subrange, can be taken to be the same for each class provided that the attribute values are normalized by their respective ranges. The parameter h can also be defined as the length of each side of the hyper-cube. While the size of h can be found by cross validation, automatic methods are also available (although this is still an active research area in statistics) for choosing h. This issue will be discussed in Section 4.1.5. Now define the function 6b(u) as follows: 1 Ink! <^k = 1,2, ^ ,m OCtS)^ 0 otherwise (4.20) The function 0(u) defines a unit hyper-cube in m-dimensional space with its center at the origin. Therefore, the function (ac — ac h ) where ac, ac E C3^ (4.21) will be unity if ac belongs to the hyper-cube of volume 14, centered at ac and is zero otherwise. Hence, the number of samples in this hyper-cube is given by kr, E n (ac —hac 1=.1 where ac, ac E C3^(4.22) Now, the estimate 75(acIC3), 1 n 1 (ac — ac 15(acjC3) —E -0 n i=1^h ^(4.23) In order to find the high density regions, it is necessary to find the local maxima of fin(acrj). These maxima are found by taking the partial derivatives of 13,2(aciC 3) with Chapter 4. Rule Generation^ 46 respect to each feature (ack) and equating the derivatives to zero. Furthermore, the second derivatives (partial) with respect to all features must be negative for a maximum. The criteria for local maxima are aiin ( g_c i c; ) Oack =0^for k = 1,2,....,m a2Mgcic3) < 0 OacZ (4.24) for V k = 1,2, ...., m^(4.25) Next, the concept of rule generation is addressed. The basic idea here is to center each rule at the local maximum of the probability density functions isr,(acjC3)P(C.,). Note according to Bayes relation that, this amounts to maximizing P(C3Iac), which is an intuitively appealing criterion for generating rules for a particular class C3. In addition, each rule is assigned an activation region and a priority. The assignment of the activation region and the priority is best understood by considering a simple 1-D distribution with single mode for a two class case as shown in Figure 4.1. In establishing the rule-base, certain regions of the density function are deemed significant, as illustrated in Figure 4.2. Specifically, lobes of the density function which are above a predefined threshold are considered for rule generation. In view of this, the rule boundaries are set at the points of intersection of this threshold with the density functions, and the height of the window (rule priority) w, is selected so as to preserve the probability within the window. The required window height is calculated as = 1^a.2 ^p(acri)P(Ci)dac^(4.26) (ac2 — ac1) Chapter 4. Rule Generation 47 Figure 4.1: Probability Density Function of a Single-mode, Two-Class Case. I p (ac CdP(C1) p (acj C2)P(C2) 2 Threshold ' • • •^• • • :^ • •• •^ • • • ••• • • • •^" • •^ • •^• • •• • • : • • • ^ • •^" • •^• CI C I^OC2 OC3 •^•^. :• Figure 4.2: Probability Density Function of a Single-mode, Two-class Case and the Selection of Windows for Rule Generation. Chapter 4. Rule Generation ^ 48 where aci and ac2 are selected at the points of intersection of the density function and the threshold. Similarly, 1^fac4 W2^ (ac4 — ac3)J ac3 p(ac1C2)P(C2)dac^(4.27) In practice, the density functions p(ac)C3)P(C3) will each have more than one modal point and this will lead to a number of rules centered at the local maxima of the density functions. The priority of each rule will be determined by the height of the window associated with each peak, which is calculated as in Equations 4.26 and 4.27. For many problems, there may be several classes and several attributes associated with them. Also, there may be several peaks within the same class. For the general case (ith case of the jth class), therefore, the Equations 4.26 and 4.27 can be expressed as Wii 1 1[1][ni. — p(aAC;)P(Ci)dac (4.28) where [I] is the m x m identity matrix, the subscripts u and I correspond to the upper and lower bounds of the region 1? for each attribute ack in the attribute vector ac. 4.1.1 On-line Inference In general, the equations 4.24 and 4.25 have many solutions corresponding to the modal points of the density function p(acIC3). Each modal point i will have an associated hyperrectangle which corresponds to a rule in the rule base. Define a function 12 such that ^ Chapter 4. Rule Generation ^ 49 „^1 1Z1 uk and hk^k = 1,2, ^ ,m 21 V hk 0 otherwise Then, fl ( ac — 71, .i) ^hkij defines a hyper-rectangle at ^and i corresponds to the z' solution of the equations 4.24 --43 and 4.25 for the jth class. Also, ik can be thought as the cluster centers corresponding to each solution in the ith class. The length (hk„) of each side of the hyper-rectangle can be expressed in terms of the corresponding attribute values as follows: hki; —ackiju — acko k 1, 2, ^ , m and {Vi > 0}^(4.29) Also, the cluster centers Okij op _,3 ) can be expressed as ackii. ackii. k = 1, 2, ^ , m and {Vi > 0} 2^ (4.30) where k is the attribute number, and u and I correspond to the upper and lower bounds of the attribute of the selected cluster (cluster i). The attribute values ack„u and ackifi are selected such that the least hyper-rectangle formed by these ranges enclose the projection of the intersected density function at the threshold level on to the m-dimensional coordinate system. This process is better understood for the 1-D case shown in Figure 4.2. A graphical representation of this procedure is the connectivity diagram as shown in Figure 4.4. Examination of this diagram reveals that high density areas are represented as densely concentrated relationships. It is possible, therefore, to use this diagram to form ^ Chapter 4. Rule Generation ^ 50 rules for the group or class. Rule generation through connectivity analysis is described, in detail, in Section 4.2. Now, for a given feature vector ac the values of 12,3 can be computed for i = 1, 2, ^ PJ where p, is the number of peaks (modal points) or solutions in the jth class, j^1, 2, ^ q) and q is the number of classes. Three possible cases exist. 1. f2,3 = 1 for unique i and j. 2. I-2,3 = 1 for more than one j. 3. fi,; = 0 V i and j. Case 1 is straightforward since a given feature vector belongs to the group or class C3. In case 2, the given feature vector belongs to more than one class, and the class C, that gives maximum MacIC3) can be selected. That is, select class C3 if 23„(acr ) fin(acICT)^Vr j^ (4.31) This criterion is appropriate if all classes or groups have the same a priori probability. If the a priori probabilities are different the following rule can be used. j(acIC3)P(C3) 73„(aciC,.)P(C,.)^Vr j^(4.32) where P(C) and P(C3) are the a priori probabilities of groups C,. and C3 respectively. Note that, this rule is of the form of Bayesian decision rule. In the present research, however, the probability values at each point of the probability distribution are not stored because this is not practical, in particular, with large Chapter 4. Rule Generation ^ 51 dimensional spaces. Hence, in the current approach, only the significant regions (i.e., the regions with high probability density) are represented and used for on-line inferencing. In order to determine the class when there is ambiguity, the following rule is adopted. Select class C, if w., > w .,^Vr ^ (4.33) Here, • stands for any i value of the corresponding group. Again, if the class a priori probabilities are different, w..;_13(C.;)> w,P(Cr)^ (4.34) can be used. In case 3, the feature vector ac does not belong to any of the high density regions. In this case, the present methodology does not classify the feature vector into any of the groups, instead a default class is assumed. The regions of the density functions that correspond to this case are the regions that are eliminated by the threshold level. The present approach prefers a dominant default class to a less certain class. In other words, this methodology assigns a new object, which has attribute values corresponding to a low probability density area, to a class that has the highest a priori probability (P(C3)). The reason for this is that, when there is no other evidence, assigning a feature vector to the class with highest a priori probability reduces the probability of error, as shown in Section 4.1.4. The height of the hyper-rectangles (w,J) which corresponds to the relationship density in the connectivity diagram can be used to prioritize the rules of the rule base. The rule generation procedure using connectivity diagram is described in Section 4.2. ^ Chapter 4. Rule Generation^ 52 4.1.2 Error Analysis Again for simplicity, a 1-D distribution with single mode for a two class case as shown in Figure 4.1 is considered. As illustrated in this figure, the Bayesian error for these distributions is given by P(errorjac) = { P(C2jac) if ac E C1 (4.35) P(Cilac) if ac C C2 and a total classification error of P(error) =^p(acIC2)P(C2)dac + j'cc p(acri)P(Ci)dac.^(4.36) acB The total error for the rule-based representation of the distribution functions shown in Figure 4.2 is calculated as oci P(error)^p(acICOP(Ci)dac +^p(acIC2)P(C2)dac J ac, P(aciC2)P(C2)dac + ^p(acri)P(Ci)dac (2ci^ acs^ ^J ■lc2 ac4 p(acIC2)P(C2)dae +^p(acICOP(Ci)dac + a p(acri)P(Ci)dac + ^p(acr2)P(C2)dac c4^ ac4 lac'.^ ac2 =^p(ac)dac + I p(ac1C2)P(C2)dac , acs^ac4^ oo + j p(ac)dac + I p(aciCi)P(Ci)dae + I p(ac)dac. (4.37) ..c2^,2C3^ -co aci I1C4 4.1.3 Comparison of Bayesian Error to Rule Base Error Bayesian classification error for the two class, 1-D case as shown in Figure 4.1 can be expressed as given in Equation 4.36. This can also be expanded as given below. 53 Chapter 4. Rule Generation^ aci P(Bayes Error) , I .p(acr2)P(C2)dac + I p(acr2)P(C2)dac aci ac3 acB^ +^p(acr2)P(C2)dac + j p(acri)P(Ci)dac Jac2^ ' i.c.B . oo c4^ I P(acri)P(Ci)dac (4.38) + P(acri.)P(Ci)dac a + f ac3^ ac4 where aci, ac2, ac3, ac4, and acB are as defined in Figures 4.1 and 4.2. With Figure 4.2, the probability of rule base classification error can be expressed as follows: P(Rule Base Error) = j aci aci^ ao p(ac1C2)P(C2)dac+ 1 p(acri)P(Ci)dac -.0 ac2^ ac B . P(aCiC2)P(C2)daC + I p(acIC2)P(C2)dac + j^ aci^ (2c2 ac B^ ac3 p(acri)P(Ci)dac + .1. p(acjC2)P(C2)dac + I^ .2clit 42C2^ ac4 ac3^ + j.^ p(acri)P(Ci)dac + j p(acICOP(Ci)dac (.e3 ^ cL.13 .^ 0. + I p(aciC2)P(C2)dac + f p(acri)P(Ci)dac (4.39) ac4^ ac4 The difference between Bayesian and rule base classification error is, therefore, the difference between Equations 4.38 and 4.39 and is given below. P(Error Di f f.) = faci^ acB L. p(acri)P(Ci)dac + f p(acri)P(Ci)dac ac3^ aC2 oo + f p(acIC2)P(C2)dac + I p(acr2)P(C2)dac (4.40) acB^ ac4 These errors are highlighted in Figure 4.3. 4.1.4 Rule Base Error for Multi-Class, Multi-Modal Case In this section, a general result for classification error for multi-class, multi-model, and multi-dimensional case is derived. ^ ^ Chapter 4. Rule Generation^ 1 11 54 p (a c l Cd P(Cd pocic2)p(c2) 2 C4 O CI^PC2 0C3 Threshold OC Figure 4.3: Graphical Illustration of the Error Difference Between the Rule Base and Bayesian Classification Methods. Theorem : For the multi-class, multi-model case, the probability of rule base classification error is 1-EE q P.i^ q Pk [f p(acri)P(Cj)clac^EE^p(acrk)P(COdad^(4.41) ;=1 i=1^Rij^ k=1 1=1^,0,1,1, where i - Cluster number of class j. j - Dominant class of cluster i. p; - Number of clusters of class j. q - Number classes. - ith region or hyper-rectangle of the jth class. Chapter 4. Rule Generation^ 55 1 - Cluster number of class k. k - Class of an intersecting cluster with class j. A - Intersection indicator 1 if cluster i of class j intersects with cluster 1 of class k, A =^and j is the dominant class of all intersecting classes 0 otherwise - Intersection of Ri j with RIA Proof : Consider a region i, R2,3, for the jth class. This region is a hyper-rectangle in which the density function corresponding to the jth class (p(acIC3) has higher values than all the other classes. Therefore, the summation term q Pi >J p(acICJ)P(Cj)dac (4.42) is the summation of the probabilities of the dominant class of each hyper-rectangle. If two clusters intersect, the probability of the non-dominant class within the intersected region should also be counted for the probability of error. Therefore the last term on the right hand side, E E^p(acrk)P(Ck)dac, q Pk gives the total probability of all intersecting regions. This quantity will be added only if cluster i of class j is dominant (has the largest window height of all intersecting clusters). Chapter 4. Rule Generation ^ 56 Also, since the total probability ^fc° p(acICJ)P(C;)dac = 1,^ j=1 (4.43) -0° the probability of classification error for the general case is P3^ q Pk P(Error) = 1 — EE^p(cicIC3)P(C3)dac — AEE^p(q_c_ICk)P(Ck)daci . j=1 i=1 R'a^ k=1 1=1 (4.44) As described in Section 4.1.1, when a new object does not belong to any class it assumes a default class (the class that has the highest a priori probability, say P(Ch )). This further reduces the probability of rule base classification error. Theorem : For a multi-class, multi-model case with a default class, the probability of rule base classification error is 1 — P(Ch)^Ir^f p(acri)P(C;)dac — p(aciCh)P(Ch)dac j=i i=1 [^i,.i 9 Pk — AEE I^p(acrk)P(Ck)dac — p(asiCh)P(Ch)dad} k=1 1=1 R,..1,1,14 where all the parameters are as previously defined. Proof Since all the non-peak regions (all regions below the threshold level) are assigned class Ch, the probability incurred by this class must be subtracted. Now consider the following term. q P(QC_ICIL)P(Ch )d_QC — E^f^p(aCICh )P(Ch )Ckt_C] J ,, k=1 1=1 1?i,jlk 1=1 s=1 Pt.i Chapter 4. Rule Generation^ 57 This term accounts for the probabilities of class Ch within the high density regions. As in the previous case, the term preceded by A subtracts intersecting regions. The above term has to be deducted from the total probability of class Ch in order to account for the probability of this class within the non-peak regions. Hence, the term -EE[f q^P3 P(Ch) j=1 i=i q^Pk p(aclCh)P(Ch)dac — EE p(aclCh)P(Ch)dad k.1 1=1 where P(Ch ) is the total probability of class C3. Now, by subtracting the above term from the probability of error given in Expression 4.44 and by rearranging the terms, the probability of error, q Pj P(Error) = 1 — P(Ch) — IEELID P(f_z_gri)P(Ci)dac — p(ac1COP(Ch)dac 3=1..1 —A Ic=1 1=1^ Ri .j.l.k p(acICOP(Ck)dac — p(acICOP(Ch)dad} can be obtained. For a multi-dimensional distribution with no parametric form, the probability values of the complete multi-dimensional space have to be estimated and stored in order to use the Bayesian classification rule. When inferencing on-line, these probability values should be retrieved and compared for each class in order to classify a new object. In the present approach, however, only the significant regions are stored. This improves the efficiency of the on-line inferencing process by a considerable factor. For example, consider a 3-D feature space (as with the fish classification case) with values are sampled at each 0.1 interval for density representation. Also, assume the range of each value is from 0 - 10. This gives 1,000,000 discretized probability values to be stored in a database for on-line classification using Bayesian rule. However, if only the significant values were considered for on-line inferencing, only a fraction of this number has to be stored. For example, for Chapter 4. Rule Generation^ 58 the fish contour classification case, only 22 significant regions were stored resulting in 22 entries. 4.1.5 Density Estimation Kernel and Size of Subrange (h) In the analysis of rule generation, the kernel used (0(u)) is a uniform function in the range (-1, 1). Research in the area of density estimation presents other types of kernel which may give improved performance. An alternative to the uniform function is the Gaussian which takes all available data samples into account, giving appropriate weighting to each, to estimate the value of the distribution at a particular point. In contrast, the uniform kernel only considers the samples falling in a particular window, and all samples in this window are given the same weighting. The Gaussian kernel has the following form. W h(x) = 1^.2 ^ e- 270- h\/271- (4.45) where the parameter h is the bandwidth of the estimating kernel and it controls the smoothness of the distribution estimate. Also, it is related to the size of the subrange and therefore to the shape of the distribution. If h is too small the estimated relationship distribution will be ragged; if it is too large the distribution will be oversmoothed. An empirical choice for h is discussed in [Rudemo, 1982]. 4.2 Rule Generation Through Connectivity Analysis Consider a set of m features ack,k = 1,2, ,m For each feature (attribute) in a group or class, the measured (non-dimensionalized) feature values are arranged in a column in the ascending order of the numerical value. The feature values of each data item in the group are plotted in their respective columns. The plotted attribute values in adjacent Chapter 4. Rule Generation ^ 59 columns are joined by line segments. There are m-1 such line segments for one data item (a tuple of attributes) in a group and the completely connected line segments for all attributes form a relationship. If there are n data items in a group there will be n such relationships for the model that is formed by the particular group. In this manner, a connectivity diagram is obtained for the particular group, as shown in Figure 4.4. The subranges for each attribute column are determined by a window defined by 0(x) (the function 0(x) was introduced in Section 4.1) and maximizing the number of relationships that fall within this window. For example, if a particular set of local regions of attribute columns (local regions of columns aci, ac2, ,acrn) give a peak of occurrence of the relationships, this set of local regions is taken as subranges. The subranges are denoted by ACk23. The subscripts k, i, and j represent the attribute number, the subrange number, and the group number respectively. Initially, the size of the subrange is chosen to be small and it is iteratively increased to obtain the optimum value. The optimum size is usually obtained by cross validation. Automatic methods are also available as discussed in [Rudemo, 1982]. As previously mentioned, if the size of the subrange is too small the relationship distribution of the connectivity diagram will be too ragged (scattered); if it is too large it will be over-smoothed and important properties that are useful for rule generation will be obscured. Once, the subranges are determined in this manner, they can be logically combined with the aid of the connectivity diagram to formulate the rules for the group. 4.2.1 Threshold Level Selection The statistical procedure used for selecting the relationships having adequate density levels is described now. First, the number of repetitions of distinct relationships (connectivities) in a connectivity diagram is determined, and the relationships are arranged in Chapter 4. Rule Generation ^ Attribute 2 Attribute 1 (aci)^A Relationship line SegTent (ac2) 1 L.y...i t A C14 A CI3i Attribute m (ac3) ACm2J AC 11J ^x= Number of occurrence of the relationship 60 I A relationship AC 24 ACm3i 'I Overlapped relationships (densely present) • ACip1 1 AC2P II Figure 4.4: The Generalized Connectivity Diagram of a Model. 61 Chapter 4. Rule Generation^ the ascending order of the frequency (density) repetitions. Suppose that the corresponding density-variable values are xl, x2, ..., xm. The number of distinct relationships that have each of these density values are counted and this frequency data is used to fit an appropriate probability distribution (such as Poisson or multinomial) to this data. The discrete probability function for the Poisson distribution is e'Axi f(xi) = ^ xi! (4.46) and for the multinomial distribution, it is n! TT ^ H x,! xi • (4.47) The parameters A or pi are estimated from the frequency data obtained from the connectivity diagram. The Poisson distribution can be fitted to some of the models, but the more general multinomial can be fitted to most models. The subsequent steps involved are : 1. Estimate the parameters of the distribution (eg., using the method of moment or maximum likelihood estimate) 2. Test the goodness of the fit 3. Select the cut-off value c for the relationship density (stochastic) X such that the probability P(X < c) <^ (4.48) Chapter 4. Rule Generation ^ where 62 p is the cumulative probability associated with the low density relationships that will be discarded (typically 0.05 ^0.10). 4. Select c to be the threshold level for relationship density, i.e., If X < c, disregard the corresponding relationships If X >= c, extract the corresponding regions from the connectivity diagram for rule generation The purpose of disregarding the values of X less than c is a) to improve the efficiency of on-line inferencing by considering only the significant regions, b) to eliminate the possible noisy data that might be present in the group. This procedure of rule base generation is carried out off-line. When new data is available, the parameters of the distribution are estimated again. Depending on the new value of the estimated parameters, a new cut off value is computed. In this manner, the system can learn which rules should be deleted and perhaps which new rules should be included. The knowledge base is continuously updated by this procedure. 4.3 Illustrative Example An example of the process of rule generation using connectivity analysis is described in this section. In this example, the fish data for a particular group (Group 5) is chosen. A sample of measured attributes in this group is shown in Table 4.1. The attributes ac2, and ac3, are Nose-Eye/Length, Nose-Gill/Length, and Width/Length of fish respectively, and these attributes are normalized with respect to their range so that they vary from 0.0 to 1.0. Figure 4.5 shows the connectivity diagram for Group 5 which was drawn using the data for this group according to the procedure described in Section 4.2. In this figure, Chapter 4. Rule Generation ^ 63 Table 4.1: A Sample of Data on Fish Dimensions for Group 5. Fish No. 1 2 3 4 5 6 7 8 9 10 aci 0.53 0.35 0.85 0.34 0.33 0.31 0.79 0.32 0.32 0.78 ac2 0.56 0.27 0.73 0.32 0.29 0.32 0.62 0.37 0.28 0.68 ac3 0.78 0.31 0.39 0.32 0.35 0.36 0.45 0.41 0.27 0.49 the "bold faced" numbers in each attribute column represent the median (711,23) of the subranges ACk13. Table 4.2 shows the medians of subranges of the relationships for Group 5. Table 4.2: The Medians 77 of the Subranges of the Relationships for Group 5. k 1 2 3 1 0.90 0.70 0.45 i 2 0.65 0.60 0.85 3 0.35 0.30 0.30 The subrange ACk, is defined in terms of n as ii,,i hk,,) Acko = nkil ^2 '71k13 + 2 ( where hk,3 is length of a side of a hyper-rectangle as defined in Equation 4.29 (4.49) Chapter 4. Rule Generation^ Figure 4.5: The Connectivity Diagram for Group 5. 64 Chapter 4. Rule Generation ^ 65 Now, the rules for Group 5 can be formulated with the aid of the connectivity diagram and Table 4.2. Rules that assign a new fish to Group 5 are shown below. Rule 1 AC115 AND AC215 AND AC316 Model 5 Rule 2 AC125 AND AC225 AND AC326 AC235 AND ---) Model 5 Rule 3 AC135 AND AC335 ---+ Model 5 4.3.1 Rule Prioritization and Threshold Level Selection In Figure 4.5, the number on the relationship clusters represents the peak height of the relationship density within the highlighted regions. In other words, this number is the number of relationships within the window formed by the kernel h defined in Function 4.21. The threshold level chosen for this diagram is 2, i.e., the relationships with density value 1 are omitted. Table 4.3 summarizes the probabilities of the relationship density values of the connectivity diagram shown in Figure 4.5 while Figure 4.6 gives the multinomial probability distribution of the density values. Once the connectivity diagrams for all groups are drawn, a table similar to Table 4.3 and a distribution similar to Figure 4.6 for all groups can be formulated. The next step of the process is to fit a suitable probability distribution for the relationship density values. A Poisson model was attempted first, but it was found that it does not provide a good fit for the data. A multinomial distribution was then fitted and Table 4.4 shows the Chapter 4. Rule Generation^ 66 Table 4.3: Probabilities of the Multinomial Distribution of Density Values of Relationships for Group 5. No. of Occurrence X, 1 2 3 4 5 6 7 8 9 Probability P(X2) 0.094 0.133 0.171 0.110 0.120 0.076 0.151 0.000 0.146 A a _o iilii ili 1^2^3^4^5^6^7^8^9 Relationship Density Figure 4.6: Probability Distribution of Density Values of Relationships for Group 5. Chapter 4. Rule Generation^ 67 resulting probabilities. Figure 4.7 shows the multinomial probability distribution. Table 4.4: Probabilities of the Multinomial Distribution of Density Values of Relationships for All Groups. No. of Occurrence Xi 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Probability P(X2) 0.090 0.123 0.152 0.157 0.110 0.094 0.079 0.035 0.042 0.023 0.070 0.010 0.007 0.008 The relationships with density values less than 2 are omitted because this gives the required value for p- according to the relation 4.48. Also, they do not present any consistent association of the subranges for formulating the rules. What this means is that these are the valleys or the non peak regions of the density function (p(acIC3)). Therefore, according to the notation given in relation 4.48 c = 2, i.e., the threshold level for inclusion of the relationships is 2. In the connectivity diagram shown in Figure 4.5, only the regions with densities higher than or equal to this value are shown. In order to assign priorities to these regions, the window heights (w13) as given in Equation 4.28 must be computed. For example, the probability Lp(acIC;)P(C;)dac 1 = 0.33 Chapter 4. Rule Generation^ I 68 0.1^ 1 I I I I I^ ii. MIN NM ■ 2^3^4^5^6^7^8^9^10^11^12 13^14 Relationship Density Figure 4.7: Probability Distribution of Density Values for All Groups. for the region 1 of Group 5 (i.e, the region with a peak height 7). The volume of the hyper-rectangle formed by the subranges associated with this region is 8 x 10. Therefore, 0.33 w).5 = ^ 8 x 10-3 = 41.25 Since the probability used in this computation is class conditional probability, the above value must be multiplied by the class a priori probability to obtain the actual priority. The priority value for this is therefore, Priority15 = wi5P(C5) = 41.25 x 0.16 = 6.6 The priorities for the three rules of Group 5 are summerized in Table 4.5. Chapter 4. Rule Generation^ 69 Table 4.5: Probabilities of the Multinomial Distribution of Density Values of Relationships for All Groups. Rule (Region) 1 (R.1) 2 (7.2) 3 (R.3) Probability 0.33 0.15 0.43 Volume 8.0 x 10 6.0 x 10' 8.0 x 10' Priority 6.6 4.0 8.6 4.3.2 Results Rules for all five groups were generated in a manner described in the previous sections and implemented in a rule base. The system was tested using two common methods, namely resubstitution method and leave one out method. For the resubstitution method, a test set of data was generated by selecting five samples from each group of data that were used to generate the rule base. This data set was tested with the system and the results are shown in Table 4.6. In this table, fl, f2, and f3 represent the ratios of features (nose-eye length)/(length), (nose-gill length)/(length), and (width)/(length) of fish respectively. These values are normalized with their respective ranges such that they vary from 0.0 to 1.0. The group column indicates the actual group of the corresponding data item (fish). The result columns show the group number if the data item is correctly classified else "x" or "nc" which respectively means that the data item is incorrectly classified or not classified. In leave one out approach, one data sample is left out and the system is trained with the rest of the data. The system is then tested with this sample. This process repeated for all the samples in the database. The results of this approach for the test set generated for resubstitution are also shown in Table 4.6. Chapter 4. Rule Generation^ 70 Table 4.6:^Rule-Based Method Classification Results for Resubstitution and Leave-One-Out Methods. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 fl 0.25 0.51 0.54 0.41 0.52 0.77 0.17 0.53 0.53 0.82 0.49 0.70 0.61 0.55 0.42 0.66 0.57 0.43 0.40 0.88 0.53 0.35 0.85 0.34 0.32 Sample f2 f3 0.23 0.35 0.53 0.44 0.47 0.42 0.43 0.23 0.54 0.47 0.86 0.46 0.38 0.54 0.54 0.65 0.53 0.47 0.87 0.42 0.47 0.53 0.69 0.39 0.61 0.34 0.40 0.27 0.31 0.17 0.53 0.38 0.56 0.43 0.46 0.32 0.49 0.37 0.85 0.63 0.56 0.78 0.27 0.31 0.73 0.39 0.32 0.32 0.28 0.27 Group 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 Classification Result Resub. Lye. 1 Out 1 1 1 1 1 1 nc nc 1 1 2 2 2 2 2 2 x x 2 2 x x 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 nc nc 5 5 5 5 5 5 5 5 5 Chapter 4. Rule Generation^ 4.4 71 Integration of Heuristics and Expert Knowledge Once the statistical rules are generated as described in Section 4.3, they can be implemented in an expert system shell to infer the model or the group of a new object. The efficiency of this process can be further improved by integrating heuristics or expert knowledge in to the system. This process is described now. Consider a situation where there are q number of groups and m attributes for each object in any group. If the rules for this system were generated as described in Section 4.3, they will have the following form. < ac < A (ac2, t < ac < ac213.) where ac are the attributes, S, is the ith A • • • A (acmiji < ac < acini3.) statistical rule of the jth class, 1 and u are the lower and upper bounds of the subranges formed by attributes ac, and A denotes the AND operation. Rules for Vi and Vj are collected in the rule base system. Although these rules are adequate to infer the class or group of a new object, the rule search can be considerably improved, resulting in faster inference speed, if additional available knowledge about the problem is integrated into the system. Now consider the following additional facts given by an expert in the field. • Object type A can only be in groups jA = {1, 3, • • -} • Object type B can only be in groups jB^{2,4, • • •} • • Object type T can only be in groups jT = {1, 5, • • •} Chapter 4. Rule Generation^ 72 These facts can be used as a hierarchical preprocessor prior to searching for a statistical rule. For example, rules can be expressed as follows: Mi Type A Sij Once the left term of the right hand side of this rule is unified, the rule will search for a statistical rule that satisfies the right term of the right hand side in order to achieve the goal (i. e. M3). Since the number of elements in each of the sets jA, jB, • •, jT smaller than the number of total groups (because jA, jB, • • •, jT C J, where J = {1, 2, • • , q}), the search space for rules is considerably smaller in comparison to the total search space resulting in faster inference speed. Identification of the object being Typet is common sense knowledge, however, the facts considered above are expert knowledge. 4.4.1 Integration of Knowledge into Fish Rule Base: An Illustration The following facts were obtained from the fish cutting experts regarding the cutting contours. • Sockeye Salmon can only have contour models js = {1, 3, 5} • Pink Salmon can only have contour models jp = {2, 3} • Coho Salmon can only have contour models j, = {1, 4} • Chum Salmon can only have contour models j ch = {4,5} A decision tree can be developed for the fish classification rule base as shown in Figure 4.8. This tree can be directly translated into Prolog rules as follows. The tree shown in Figure 4.8 is a simplified version of the actual decision tree. (.. /-11 ,-I CD g=• bo 1—■ cn c-rI-I cl-i- ..... 0 A; .-0 (7.--, t4=Pzi = co pb co ,-, D, c-, ..... o 0 P ti co r, in- • ..... 0 H ,-, ro co F-1-, 0 ,-1 <-1- co 't .... ;,.., n ET' rn cn .— . r, 6 cf) 'c ;17 ---.1 cA., Chapter 4. Rule Generation^ fish(M) : — sockeye, ss(M). fish(M) : — pink, sp(M). fish(M) : — coho, sc(M). fish(M) : — chum, sm(M). ss(M) : — in_range(AC1, ac111/, aclllu), in_range(AC2, ac211/, ac211u), in_range(AC3, ac311/, ac311u), M = 1. ss(M) : — in_range(AC1, ac1211, ac121u),in_range(AC2, ac2211, ac221u), in_range(AC3, ac3211, ac321u), M = 1. ss(M) : — in_range(AC1, ac113/, ac113u), in_range(AC2, ac2131, ac213u), in_range(AC3, ac3131, ac313u), M = 3. 74 Chapter 4. Rule Generation^ 75 ss(M) : — in_range(AC1, ac115/, ac115u), in_range(AC2, ac2151, ac215u), in_range(AC3, ac3151, ac315u), M = 5. in_range(X, X1, X2) : — X > Xl, X < X2. Here, functors and constants are in lower case letters and variables are in upper case letters. Chapter 5 Attribute Measurement The attributes of a fish that are necessary for model matching have to be measured online quickly and accurately by the vision system. Low-to-medium level image analysis techniques are adopted for this purpose. These attributes serve as the context for determining a model from the model-base using the rule-base of the system as described in previous chapters. The attributes selected from factor analysis of a more extensive set of features are nose-to-gill length, nose-to-eye length, length, and width of the fish. Two image processing algorithms are being used for accomplishing the nose-to-gill length. One method, as described in [Riahi and de Silva, 1990][de Silva, Riahi, and Gamage, 1991], is based on the isolation and enhancement of an elongated edge-like structure formed by the shadow effect of the gill plate of a fish. In this approach, the blobs formed by thresholding the enhanced image and grouping edges through a connectivity procedure are boxed. The longest box with the largest aspect ratio (length/width) determines the gill coordinates. In the other method, an enhanced 2-D image of the gill region is transformed into a 1-D representation through directional averaging of pixel intensity in the direction close to the edge direction. Characteristic peaks in the transformed image (projection profile) establish the gill position and the eye position. The theory behind the projection profile method is described in Section 5.1. This method is generally faster than the boxing method. 76 Chapter 5. Attribute Measurement ^ 77 Figure 5.1: Captured Image Under Structured Lighting. aributes, length and width of the fish are measured by isolating the largest continuous segment (corresponds to the fish) in the image and identifying the smallest rectangle which encloses this segment. By ensuring the length direction of the fish is parallel to a known axis (x-axis) of the image, the length and the width are determined by the co-ordinates of the rectangle. 5.1 Attribute Measurement Using Projection Profile Method To measure the attributes nose-to-eye length and nose-to-gill length, it is necessary to locate the eye position and gill position of the fish in the image. Therefore, these features of the fish are enhanced by using structured lighting as shown in Figure 5.1. Notice that in Figure 5.1 that the gill and eye of the fish are enhanced, while the other features in the head region of the fish image are washed out by the flood lights. Chapter 5. Attribute Measurement ^ 78 e Isolated Largest Continuous Image Segment (Fish in the I An image captured under this lighting condition is thresholded and segmented to isolate the largest box with a high aspect ratio. This box corresponds to the fish in the image as shown in Figure 5.2. Once the co-ordinates of this box are determined, relatively trivial processing such as run length encoding within the box will give the length and width of the fish. Statistical surveys on salmon show that the gill of the fish lies within first quarter from the nose. In other words, nose to gill distance is less than or equal to of the length of the fish. Therefore, only the first quarter of the identified box, as shown in Figure 5.3, is processed in order to determine the location of the gill and eye features. The head image is, first, convolved with a Gaussian filter to eliminate the random high frequency noise. Chapter 5. Attribute Measurement^ 79 Figure 5.3: A region of Interest of the Fish Image. Now, let the image be i(x, y), then ii(x,y), i(x,y)*g(x,y) = 1°. .17:i(u,v)g(x — u,y — v)dudv ^ (5.50) is the resultant Gaussian smoothed image, as shown in Figure 5.4. The function g(x, y) is the 2-D Gaussian function, and can be expressed as given below. g(x,Y) - 1^(.2+y2) e^22 2ircr2 (5.51) Features enhanced using structured lighting give rise to step changes in the intensity (edges) in the image. These intensity changes (edges) are enhanced by convolving the resultant image obtained above with a gradient operator (filter). The resultant image is shown in Figure 5.5. Let the gradient operator be d(x, y), then Chapter 5. Attribute Measurement ^ 80 Figure 5.4: Gaussian-Smoothed Fish Head Image. 00 z"(x,y) =- iI(x,y)* d(x,y) = - r i(u, v)d(x — u, y — v)dudv^(5.52) 00 - 00 where d(x, y) is the gradient operator of the form d(x, y) = a f (x ,y) (5.53) ax This type of directional gradient operator is used to enhance the edges only in a direction of interest. The resultant image i,"(x, y) obtained in this manner is projected on to an aids (1-D signal) to give the projection profile. Since the curvature of the gill is low and its direction is approximately parallel to the y-axis of the image, the projection of 2-D image ill(x, y) on to x-axis gives a large peak as shown in Figure 5.6 at the position of the gill. 00 Projection Profile = p(x) =.0 (x, y)dy (5.54) Chapter 5. Attribute Measurement^ Figure 5.5: Edge-Enhanced Head Image. Figure 5.6: Projection Profile of the Head Image. 81 Chapter 5. Attribute Measurement^ 82 A secondary peak of p(x), not so large as gill peak but large enough to distinguish from other features, in between the gill and the nose gives the eye position of the fish (Figure 5.6). Although the above equations are given in continuous domain to emphasize the mathematics, all computations are essentially carried out in discrete domain as follows. If the digital image captured is /(X, Y), then [M/2] [M/2] P(X, Y) /(X, Y) * G(X,Y)^ E E i(x + i,Y j)G(—i, —j)^(5.55) -LM/2i - [M/2f where G(X, Y) is the M x M Gaussian filter and P(X, Y) is the resultant smoothed image. Convolving the derivative of a Gaussian will give the same result as convolving the image first with the Gaussian, and then convolving the result with the derivative operator. a^ ag(x,y) . ^(i(x y) * g(x y)) = ^ * 2.(x, y) ' (5.56) ' This result is true because both the convolution and the derivative are linear operators. Therefore, in actual implementation, the first derivative of the Gaussian is convolved with the digital image to enhance the edges. Since the edges of interest lie in x-direction, the derivative of the Gaussian filter is also taken in the same direction. [M/2_1 [M/2] /11(X, Y) = /(X, Y) * Gx(X,Y) = E E^+ i,Y j)Gx(—i,—j)^(5.57) - [M/2J - [M/2] where Gx(X, Y) is the first derivative of the Gaussian with respect to x. Now, the projection profile, P(X), is Chapter 5. Attribute Measurement ^ P(X) = 83 N E i"(x, Y)^ (5.58) Y=0 where in(X, Y) is the N x N enhanced image. The features of the image are determined by the peaks of P(X). 5.2 Attribute Measurement Using Region Boxing This method was primarily developed to compute the nose-to-gill length of a fish and, as mentioned previously, is based on the isolation and enhancement of an elongated edge-like structure formed by the shadow effect of the gill plate of a fish. As in the projection profile case, an image captured under structured lighting is first convolved with a directional edge operator (first derivative of a Gaussian). The resultant image is thresholded and the edges in the binary image are grouped in order to form contour segments. Edge grouping on a binarized image is much faster than on a gray scale image. Next, the contour segments are boxed and the longest (in the direction of the gill) box with largest aspect ratio is chosen as the box that contains the gill contour. The coordinates of this box can be used to determine the position of the gill, hence, the nose-to-gill length. 5.3 Comparative Evaluation of the Techniques The edge grouping process in the region boxing method is computationally expensive and time consuming. This problem can, however, be reduced if a region of interest (a region which contains the gill contour) is selected. As mentioned in Section 5.1, through statistical analysis of the dimensional measurements on Pacific Salmon, it is estimated that the distance from nose to the gill-plate edge is approximately 20% of the length of Chapter 5. Attribute Measurement^ 84 the fish with ±5% variation. Edge grouping into contour segments within the region of interest reduces the computational cost by 50%. Directional averaging method is much faster than the region boxing method. Table 5.1 shows typical processing times associated with various steps of each method. These times were obtained by implementing each algorithm on a vision station (IRI SD512) available in the Industrial Automation Laboratory. The accuracy of both methods is approximately ±1.2mm, and the robustness of both methods are acceptable. The directional averaging method is chosen because of its speed. Table 5.1: Typical Times for Various Image Processing Operations. Complete Frame Reduced Window Process 200 x 200 512 x 512 N/A Image Capture 33ms 137ms 3 x 3 Convolution 25ms Edge Grouping N/A 100ms Directional Averaging 40ms 8ms 5.4 Measurement of Orientation The algorithms described in Section 5.1 assume that the angle of fish is known, specifically, they assume that the fish is aligned with one of the axes of the image plane of the camera. For accurate computation of the attributes as well as the positioning of the cutter, the orientation of the fish on the conveyer has to be established. Furthermore, if the contour cutting system is not in operation a default angle cut has to be performed. Two angles of importance are shown as 0,, and 0. in Figure 5.7. Specifically, 0,, represents the angle of V-cut with associated recovery of useful meat within the head region. This angle will be fixed in this particular implementation and will be achieved by a double Chapter 5. Attribute Measurement^ 85 angular cutter. The cutting angle, Oa, of an angle cut is chosen to recover the meat in the top region of a head while eliminating the pectoral fin. Statistics show that (see Table 5.2) the optimum angle of Oa is approximately 600 . Table 5.2: Existing Angles Oa Ot, 70 60 70 50 70 60 70 60 60 75 Typical Optimal Angles of Cut. New Angles Nose-Gill Meat Saving Oa (cm) Ot, (oz) 60 35 11.4 2.5 35 2.0 60 10.7 11.4 60 35 2.0 60 1.5 11.4 35 11.4 1.0 60 35 The concept of orientation is examined in the context of a particular application. Three techniques of measuring orientation of fish are developed [Carnage and de Silva, 1990]. The methods are applied to samples of Pacific salmon and the results are compared with the values obtained through direct measurement of orientation in these samples. 5.4.1 Least Rectangle Method The first method employs a direct geometrical definition of orientation. In this method, the nose and the tail co-ordinates in a digitized image of fish are determined through image enhancement and processing. The image is first thresholded to make the image binary. This eliminates the sensor noise and the background illumination variations [Jain, 1989]. Next, the image is segmented through a process known as labeling which groups continuous image segments together. Following segmentation, each continuous segment is boxed with the least rectangle and the largest of which is selected as the fish in the image (Figure 5.8). Chapter 5. Attribute Measurement ^ 86 Figure 5.7: Two Angles of Importance for Recovery Improvement in Head Removal in Fish Processing. Chapter 5. Attribute Measurement^ 87 Figure 5.8: The Least Rectangle Enclosing a Fish. Following is a high level description of the steps associated with this technique. 1. Synchronize the camera with the conveyer. 2. Snap an image if a fish is detected on the conveyer. 3. Threshold and store the image in the frame buffer. 4. Label the image regions into continuous segments (blobs) through connectivity analysis. 5. Box the continuous segments. 6. Isolate the largest box and assign a unique pixel intensity value for the corresponding blob. 7. Scan the edge region of the box and determine the tail and nose co-ordinates. Chapter 5. Attribute Measurement^ 88 8. Compute the angle of orientation. Once the largest box is isolated systematic scanning of the edge regions of the rectangle will identify the co-ordinates of the nose and the tail. When the mouth of a fish is open, two nose points are obtained by this procedure. The mid point of these two locations is used as the nose point. Similarly, two tail points are obtained for the two corners of the tail fin. Again the mid point of the two is assifmed the tail point. The orientation is then computed as follows using the nose and tail co-ordinates. Consider the two possible cases, mouth open and mouth closed, as shown in Figure 5.9 and 5.10. The Case of Mouth Closed: If the two tail co-ordinates are (x1, y1) and (x2, y2), then the mid point of the tail is ((xi 4-x2) (Y1 4-Y2)) 2^2 (5.59) and if the nose co-ordinates are N — (x3, y3) (5.60) the angle of the orientation is given by = tan-1 [1Y3 (y1 +Y2)/21 1x3 — (xi + x2)/21] • The Case of Mouth Open: The tail co-ordinates are same as above and the nose co-ordinates are (5.61) Chapter 5. Attribute Measurement^ N ((x3 + x4) (Y3 + Y4)) 2^2 89 (5.62) the angle of the orientation is = tan-1 1(Y3 Y4)/2 (Yi + y2)/211 1(x3 + x4)/2 — (x1 + x2)/21] • (5.63) Figure 5.9: The Case of Mouth Closed. This method assumes that the complete image of the fish is available. Structured strobe lighting is used to improve the contrast between the background and the object and to avoid the blur due to the motion of the conveyer. Chapter 5. Attribute Measurement ^ 90 Figure 5.10: The Case of Mouth Open. 5.4.2 2nd Moment of Area Method This method uses second moment of area to establish orientation of an image [Jain, 1989] [Faber and Stokely, 1988]. In this method, first, the head region is enhanced and processed as before (Steps 1-3 of method 1) to obtain the silhouette image of the region. The area within this silhouette is assigned a uniform pixel intensity, and the second moments of area about the orthogonal co-ordinate axes of a reference frame through the centroid and the corresponding cross moment of area are computed as follows (Figure 5.11). The second moment of area about an axis through the centroid is given by IC ,_ E Er2(m,n) E E [(n — ii)cos(0) — (m — iff.)sin(0)]2 ^(5.64) (m,n) eR (man) eR Chapter 5. Attribute Measurement^ 91 Figure 5.11: Illustration of the Parameters Used in Calculation of Second Moment of Area. where (fri, n) is the mass center and is defined as M= 1 E Ern — N (inn) ER 1 if= N EE n. (5.65) (5.66) (m,n) eR Note that, N is the number of pixels which represents the area of the image. It is well known that the second moment of area about orthogonal co-ordinate axes and cross moment of area provide the direction of the major and minor principle axes with respect to the reference frame. The two principle axes are orthogonal. The angle of the orientation, 0, is given by 1^{ 2M11 1 0 = –tan' 2^M20 — MO2 i (5.67) Chapter 5. Attribute Measurement^ 92 where M13 E E(m - ffi)2 (n -^ (5.68) (m,n) el? This method assumes that the orientation of the fish is given by the major principal axis. The assumption is sometimes questionable because, for example, for fish in the same direction, the computed moments of area will depend on many factors such as whether the mouth is open or closed, fins are trimmed or not, and so on. Furthermore, the moments calculated are very sensitive to the noise present in the image. For example, if there is noise at a distant point from the center of gravity of the image, the moments calculated will not be accurate. 5.4.3 Fundamental Ellipse Method In this method, the orientation of an object is defined as the orientation of the major axis of the fundamental ellipse that fits the border of the object. The fundamental ellipse can be extracted by the Fourier series expansion of the borderline contour of the image. Fourier descriptors of the borderline contour of an object, in the past, has been used to describe the shape of objects in an image [Lin and Chellappa, 1987]. A special feature extraction algorithm was developed to extract the bounding contour of an image. This algorithm is given below. 1. Locate any point on the boundary. 2. Go to one of the adjoining pixels. 3. Check whether this pixel is on the boundary. 4. If on the boundary Chapter 5. Attribute Measurement ^ 93 (a) Add this pixel to the boundary. (b) Go to step 2. 5. Else (a) Check whether this pixel is in the body or out of the body of the image. (b) If in the body i. Move towards the edge. ii. Go to step 3. (c) Else i. Move towards the body. ii. Go to step 3. 6. Repeat Until this pixel = stating pixel. This array of pixels along with the distances to a reference point in the body of the image and the corresponding angle to this pixel from a reference axis are stored (Figure 5.12). If we now plot these distances against their angle to the reference axis, we will obtain a graph as shown in Figure 5.13. It is apparent from this figure that there are two cycles within the period of 27r. Hence, the coefficients corresponding to the first harmonic (half the fundamental period) provide the fundamental ellipse from which the orientation of its major axis can be computed in a straightforward manner as follows: f (9) = c--t-;- + i (an cos(n8) + bn sin(nO)) n=1 (5.69) Chapter 5. Attribute Measurement^ 94 where N an = — AO) cos(n8)6 ,=1 1 =— E f(8) sin(n9)58 (5.70) (5.71) For the case of n=2, we have N a2 = —^f (0) cos(20)60 2=1. 1 b2^—f(8)sin(29)89 71. (5.72) (5.73) where N is the number of discrete points on the contour. Consequently, the angle of orientation is given by = an 2^a2 • (5.74) This approach works well even when there are unknown irregularities (due to collapsed fin or broken tail) on the borderline of the image. These irregularities, which generally correspond to the higher harmonics of the Fourier series, will not affect the fundamental ellipse. 5.4.4 Results and Comparison The actual angle of the fish was compared with the results obtained from the three methods. Table 5.3 shows some of the angles compared (in degrees), and Figure 5.14 illustrates the results obtained using the three methods. The three methods discussed above have approximately the following execution times. 95 Chapter 5. Attribute Measurement ^ Figure 5.12: Distances to a Reference Point and the Corresponding Angle. r[j] 80j (N Points) Figure 5.13: The Graph of Distance to a Reference Point Vs. the Angle. Chapter 5. Attribute Measurement ^ Boxina Method Anale 36.016 (a) Captured Image. (b) Least Rectangle Method. Moment Method (c) Second Moment of Area Method.^(d) Fundamental Ellipse Method. Figure 5.14: An Illustration of Three Methods to Calculate the Orientation. 96 Chapter 5. Attribute Measurement^ Table 5.3: Typical Results from the Three Methods of tion. Box Moment Ellipse -2.85 -4.83 -4.04 2.14 1.41 2.45 10.63 9.51 9.57 16.04 15.46 13.90 20.82 19.32 20.17 25.58 24.92 23.03 36.01 35.02 35.43 42.41 43.82 43.92 46.52 47.58 48.08 97 Computing the Angle of OrientaActual -3.82 2.23 9.67 15.86 20.14 25.42 34.53 43.11 47.75 Region Boxing Method^200ms 2nd Moment of Area Method - 150ms Fundamental Ellipse Method - 250ms The accuracy, speed, and the robustness of the least rectangle method is acceptable, but it requires a clean straight fish image. The moment method is the fastest method, however, it is very sensitive to the threshold value selected and the noise present in the image. The noise problem can be reduced by convolving the image with a Gaussian filter prior to the calculation of moments, and automatic thresholding can be employed to improve the accuracy of the computed moments. It is also assumed in this method that the object under consideration (fish) is symmetric. The main advantage of this method is its speed and the robustness. The ellipse method was found to be superior to the other methods due to its accuracy of the results. The robustness of this method is also acceptable. If the processing time is not critical, this method is recommended to measure the orientation. However, if time is critical, the moment method is recommended. Chapter 6 Pattern Recognition Techniques Pattern recognition, in general, is concerned with pattern classification as well as estimation of parameters. The classification procedure can be defined as a mapping from feature space to a class-membership space[Pao, 19891. This is illustrated in Figure 6.1. x -l x2 R(f(5))^p. c(X) xk Generic object Instantiations of generic object Patterns:^Mapping^Class representations^function^index of instatiations in terms of features Figure 6.1: Schematic Representation of Pattern Classification. Here, X is the object of interest and it may have different instantiations represented by X1, 2C2 ....Xk . The process of classification comprises two steps. In the first step a specific appearance of the object is described in terms of features, i.e., X f (X), while in the second step, a mapping ( R(f(X)) ) is carried out to determine the class membership. For example, consider the process to be classification of apples. Let X1, X2, ...., Xk be apples moving on a conveyor. With this set up a suitable set of features to represent apples may be the colour, the shape, and the size. Therefore, f(x) will be {colour, shape, 98 Chapter 6. Pattern Recognition Techniques ^ 99 size}. With a different application, however, the optimal features may be different. The importance of the function f(), therefore, is to select the appropriate set of features for each application. Developing a classification model involves feature extraction and learning the mapping function RO. Feature extraction deals with how and what features should best describe the object in the form of f(). Learning involves training the mapping using a set of labeled (already classified) data. Pattern recognition also deals with estimation of an object parameters from a given set of features. Estimation of parameters is a more general procedure than classification. In fact, classification can be viewed as a special case of estimation in which the class index can be thought of as the only parameter being estimated. The advantage of estimation over classification is that the former gives a continuous value to the parameter being estimated in contrast to the latter which picks the best suited class for the given case. Figure 6.2a illustrates how a training set can be used to train the mapping function, R(), off-line for an estimation model. Figure 6.2b shows how this mapping function is exercised to estimate the parameters of the object. The notation used here is same as before except, A(xi)....A(x_k) represent the parameters (attributes) of the pattern. To illustrate this, consider the problem of determining the price, quality, and size of apples. The possible features for a particular instantiation (eg. apples on a conveyor) may be the colour, the shape, and, the weight. The attributes that are to be estimated using these features are clearly the price, the quality, and the size. According to the notation used, f(xT) = {colour, shape, weight} and A(xT) 2= {price, quality, size} where T denotes the transpose. Chapter 6. Pattern Recognition Techniques ^ 100 A(5i)Ns4 ff=v2) A(X2;Niik x r Machine learning -0.. Rtri(x) xk Mapping function Training set (a) x — I ^111.^f(x1 ) -4* f(42) Rtrf(x) * ------------4'. xk (b) Figure 6.2: (a) Learning the Mapping Function Using a Training set. (b) Estimation Using the Learned Mapping Function. Chapter 6. Pattern Recognition Techniques^ 101 6.1 Pattern Recognition in Automated Fish Processing Real world situations cannot be assessed in terms of isolated facts, rather they can be described in terms of patterns of interrelated facts. This interrelation may, in some cases, be implicit. For example, it may be implicitly understood that a given set of facts corresponding to the same object or situation[Pao, 19891. In other cases, a pattern may be meaningful only because of explicit relationships among the various features of the pattern. For example in fish classification, since the relationships between cutting contours and the dimensional measurements of the fish are not implicit, explicit relationships have to be established between the cutting contours and the other dimensional features. Various combinations of features of fish were tested for possible patterns relating to the cutting contours. These features include length, width, nose-eye length, nose-gill length, and so on, and the problem was approached in terms of both the classification of contours and the estimation of the parameters of the contours. Classification is carried out using several techniques including a rule-based method, neural networks, Bayesian inference, and k-nearest neighbour classifiers. Multiple regression and neural network methods were used to estimate the parameters of the contours. The theoretical development as well as a practical approach to rule generation through connectivity analysis of the rule base system is described in Chapter 4. In this chapter, the alternative approaches are described in detail, and the results of these approaches are presented as well. 6.2 Neural Networks A network of elemental processors arranged in a manner similar to biological neural nets that is capable of learning how to recognize patterns is known as a neural network. The 102 Chapter 6. Pattern Recognition Techniques^ elemental processors are considered to be the neurons and the network is said to be a Perceptron[Pao, 1989][Duda and Hart, 1973]. Neural networks are also referred to as Parallel Distributed Processing or Connectionism in the literature. The present research is based on the concept of Generalized Delta Rule (GDR) formulated by Rumelhart, Hinton, and Williams[Rumelhart, et. al., 19861. GDR is known to be a practical way of implementing a Perceptron like system and the basic theory of it is discussed later in this section. A simple linear discriminant network is an array of multipliers and summing junctions as shown in Figure 6.3. The lines in the box of this figure act as multipliers and the common node of these lines is a summing junction. The factor of multiplication for each line is given by w1. When the pattern feature values are presented to the inputs of this two class classification system, the output indicates if the feature belongs to a certain class, provided that the correct weighting factors are known. These weights are determined using a training set during the initial learning phase. output b Input pattern weight vector w Figure 6.3: Schematic Representation of a Linear Discriminant. Chapter 6. Pattern Recognition Techniques^ 103 A generalized multiclass classification system consists of several outputs as shown in Figure 6.4. In this figure, i and o represent the input and output layer nodes. Also, the values xi....x„ represent the input pattern, bi....bk represent the output class, and w(ai) represent the weights of the connections. This generates a matrix of weights in contrast to the two class problem where weights form a vector. Again, the weight matrix is learned using a training data set. To form an analytical method to determine weights, we shall consider the two class (c1 and c2) classification problem as modeled below. xl --ill• Xn t A node (input) Input Layer ^ Output Layer Figure 6.4: Schematic Representation of a Multiclass Linear Discriminant. Xw = b^ (6.75) where X is the matrix of training set patterns (x), w is a column vector of weights, and b is the output values specified by the training set. Each row of matrix X is a feature vector x of the form XT = (X1, X2, ...Xn) and the number of rows of this matrix is equal to Chapter 6. Pattern Recognition Techniques^ 104 the number of training data. For example, if the number of features is n and the number of training data is m, X will be an 'in x n matrix. Also, x E c1 jib, is positive (+1) ^ (6.76) x^El if bi is negative (-1)^ (6.77) where c1 is one of the two possible classes and Ei means that not c1. Said another way, if b is positive, then the given feature vector is classified to be in class c1 else it is not in class c1 which means it belongs to class c. Equation 6.75 can be solved using pseudo-inverse of X to obtain the weighting factors, w = Xt b^ (6.78) where Xt is the pseudo-inverse and is defined as xt (x Tx) -1 x T.^ (6.79) In practice, this analytical procedure may lead to inaccurate results[Pao, 19891, and researchers formulated the following rules to find w through iterative procedures. Let w1 = arbitrary (the initial values of the weights), then the value of weight vector at the kth step of iteration is tiik^j )c pok wT .xk )xk ^ (6.80) where bk is the value of the output b for pattern Xk and might be +1 for patterns in a class and —1 for patterns not in that class, and b is the output vector for all patterns, Chapter 6. Pattern Recognition Techniques^ i.e., bT^(bi, b2, ....bk....bm). The term Xk 105 is the pattern being considered at step k. Expression 6.80 is a rule for updating the weight factor until all pattern vectors are classified correctly. The procedure is to add a certain amount of Xk to wk if Xk is not classified correctly. The proportionality factor, p(bk _ wT.xk)xk, is zero or vanishingly small if Xk is classified correctly, as clear from equation 6.75. Equation 6.80 can be rewritten as Aw ziSx^ (6.81) where (5. = (b — wT x). Equation 6.81 is known as the Generalized Delta Rule. The above described algorithm to find the weight vector w is called the linear Perceptron algorithm. In the linear case, there is no advantage of having more than one layer because the weights of the other layers can be simplified to obtain the same results as with one layer. The linear Perceptron algorithm, however, is not capable of yielding a discriminant that can solve more general problems. For example, the "exclusive-or" or the parity problem cannot be solved using a linear discriminant or Perceptron. The present research, therefore, uses a semilinear feedforward net which uses GDR with back propagation of errors. This net uses a semilinear activation function to compute the output of each node. Although the activation function is nonlinear, it is considered to be semilinear because of its approximate linear regions. The nonlinear activation function provides a better discrimination of the input patterns. The net is called feedforward because data flows only in the forward direction. In the learning phase of this net, the errors of weights are propagated backwards, i.e., the errors of the weights of layer i, for example, are considered when computing the weights of the z — 1 layer. The system architecture Chapter 6. Pattern Recognition Techniques^ 106 for such a net is schematically shown in Figure 6.5. While the lengthy derivation of mathematical formulae for a semilinear feedforward net with back propagation of error is not illustrated here, the important results are given below. Output Pattern Input Pattern Input Layer Hidden Layer Output Layer Figure 6.5: Schematic Representation of a Semilinear Feedforward Net. With reference to Figure 6.5, the net input to a node in layer j is net; = Ew ^ 2 (6. 82) The output of node j is o;^f(net;)^ (6.83) where f is the activation function. For the sigmoidal activation function the output of node j, Chapter 6. Pattern Recognition Techniques^ 1 1 + e —(netj ej )/90 107 (6.84) where t is a threshold or bias and Bo determines the the shape of the function. In the training phase, the network adjusts its weights according to the input features provided by the training set so as to reach the desired output. The objective in this phase is to minimize the system error given by the following expression: 1^ E^E E(tpk — 2P P k Opk )2 (6.85) where to, is the target or desired output and opk is the observed output at the kth layer for the pth case. In the learning phase, the following iteration is carried out until the system error falls below a threshold. Awii(n + 1) = where n(s, + ^ (6.86) n is known as the learning rate and a is the momentum term. The term 8 is aE defined as^and tv32 are the weights of the links between layers i and j. The term allet, 7 ^ Awiz(n + 1) is the incremental (or decremental) change in weight at the (n 1)th step. The first term on the right hand side of equation 6.86 simply changes the weight of a link in proportion to the change in error with respect to the corresponding weight, i.e., E, while the second term specifies that the change at (n 1)th step should be somewhat similar to to the change undertaken at the nth step. The momentum term a controls the momentum from the nth step to the (n 1)th step. The terms y and a are determined by a trial and error procedure for the case in question. Chapter 6. Pattern Recognition Techniques ^ 108 6.3 A Neural Network for Fish Classification This section explains the application of neural network theory to the fish classification problem. In this process, a set of 200 fish were used to train a classifier that could select one of five classes of cutting contour. The procedure for grouping fish into the classes is described in Chapter 3. The five classes were determined by setting a threshold to the wastage within a particular class. Clearly, this is a multiclass classification problem, hence, the following model was formulated. f(FW)= B^ (6.87) where F is the matrix of training fish pattern features (ac) and f is the activation function. The feature vector ac was chosen as follows: acT^(nose_eye/length,nose_gill/length,width/length). ^(6.88) W is the weight matrix that has to be learned during the training phase, and B is the target output matrix which gives the class of the input pattern. A neural network with one input layer, one hidden layer, and an output layer was constructed and made to converge by experimenting with the number of nodes for the hidden layer as well as the learning rate and the momentum rate. Since a network with one hidden layer required a large number of hidden nodes, a second hidden layer was added. It was clear from these experiments that the number of clusters in the input feature space has a direct relationship to the number of nodes that has to be present in the network. The number of nodes that were experimentally found to give the best convergence was 20 for the first hidden layer and 8 for the second hidden layer. The Chapter 6. Pattern Recognition Techniques^ 109 number of nodes in the input layer is the same as the number of input features and the number of nodes in the output layer is equal to the number of classes. Another network was developed to facilitate dimensional input features rather than non-dimensional features used in the previous case. In this case, features were not divided by the length, rather the length was taken as a separate feature. The number of output nodes used in this network is same as the previous case. The number of nodes in the first hidden layer was increased to 40 because this resulted in faster convergence. The number of nodes in the input layer was 4 as there were four input features in this case. This seemed to converge better than the previous case, however, both networks were not adaptive enough to cope with the new fish. This may be due to the large number of nodes used. A large number of nodes, on the one hand, may give rise to good partitioning of the input feature space, but these regions may be too small or too crisp for the system to adapt to a new fish. A small number of nodes, on the other hand, will be more general but for the present set of data, it was impossible to converge the system to give an acceptable error level. Various learning rates and momentum terms were investigated in order to speed up the convergence. High learning rates correspond to rapid learning but may result in oscillations in convergence. A large momentum term tends to dampen the oscillation but may cause to slow the rate of learning. The optimum values for learning rate and momentum term were obtained experimentally and they were 1.0 and 0.1 respectively. The major problem encountered during training is that the system did not converge with the complete set of data. Deleting the fish that were eliminated for the rule-based system (i.e. fish data below the selected threshold level) and training the network with the remaining fish showed reasonably good performance in convergence as well as in classification. It is reasonable, therefore, to assume that this data is noisy, and may have Chapter 6. Pattern Recognition Techniques^ 110 caused random variations in the feature space. The elimination process is discussed in Chapter 4. The system was tested in two ways. The first method is the resubstitution method mentioned in Chapter 4. In this method, a sample of data that is used to train the network is tested against the system and the results are tabulated in Table 6.1. The same sample of data, that was used to test the rule base system, is used in order to compare the results. Furthermore, the same sample will be used to test other systems as well. In Table 6.1, fl, f2, and f3 as previously mentioned represent the ratios of features (nose-eye length)/(length), (nose-gill)/(length), and (width)/(length) respectively. The group column indicates the group of the corresponding features or the group of the fish. The two results columns indicate the classification results of the two methods. The corresponding cell is marked "a", "x", or "nc" if the classification of the feature vector is ambiguous, incorrect, or unsuccessful (not classified) respectively, it shows the group number of the fish if it is correctly classified. An ambiguous classification occurs when the classifier indicates that the feature vector belongs to two or more classes or groups. An incorrect classification labels the input feature vector to a wrong class while no classification does not label the input vector into any class or group (all outputs of the classifier are zero or close to zero). In the leave one out approach, one data sample is left out and the remaining data set was used to train the network. Once the network is trained, it is tested with the sample that was left out. This process is repeated for all samples in the database and the results for the test sample are tabulated in Table 6.1. In another approach to this problem, a neural network was trained to estimate the coefficients of the cutting contour polynomials. The model used was the same as for classification except that the output matrix now consists of the values of the coefficients. Chapter 6. Pattern Recognition Techniques^ 111 The behavior of this network is similar to that of the classification network. 6.4 Multiple Regression Multiple regression is an extension of simple regression, in that there are several independent variables rather than just one independent variable which is the case in simple regression, to estimate a dependent variable[Weisberg, 1980] [Draper and Smith, 1966] [Rice, 1988]. For example, a regression model with two independent variables to estimate a dependent variable is as follows. z = + 0 1 x + 02y ^ (6.89) In the present case, the cutting contours are represented by polynomial functions, and the coefficients of these polynomials can be directly estimated using multiple regression. Therefore, the dependent variables are the polynomial coefficients of the cutting contours and the independent variables are the features of the fish. The following regression model was formed to estimate the polynomial coefficients. ai = Oio^Onaci Oizacz 013ac3 (6.90) az — 1320^Pziaci 022aC2 023aC3 (6.91) a3^030 + 031ac1 1332aC2 033aC3 (6.92) + Anaci /342ac2 /343ac3 (6.93) a4 = 040 in which a, are the polynomial coefficients and ac, are the features or attributes of the fish. There are four coefficients in the polynomial model of the cutting contours, hence, the four expressions in the above model. All fish data in the database were initially used Chapter 6. Pattern Recognition Techniques^ 112 Table 6.1: Neural Network Classification Results for Resubstitution and Leave-One-Out Methods. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 fl 0.25 0.51 0.54 0.41 0.52 0.77 0.17 0.53 0.53 0.82 0.49 0.70 0.61 0.55 0.42 0.66 0.57 0.43 0.40 0.88 0.53 0.35 0.85 0.34 0.32 Sample f2 f3 0.23 0.35 0.53 0.44 0.47 0.42 0.43 0.23 0.54 0.47 0.86 0.46 0.38 0.54 0.54 0.65 0.53 0.47 0.87 0.42 0.47 0.53 0.69 0.39 0.61 0.34 0.40 0.27 0.31 0.17 0.53 0.38 0.56 0.43 0.46 0.32 0.49 0.37 0.85 0.63 0.56 0.78 0.27 0.31 0.73 0.39 0.32 0.32 0.28 0.27 Croup 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 Classification Result Resub. Lye. 1 Out 1 1 1 x 1 1 1 1 1 1 2 2 2 2 2 2 a nc 2 2 x x 3 3 3 x 3 3 3 3 4 4 4 nc 4 4 4 4 4 a x 5 5 5 5 5 5 5 5 5 Chapter 6. Pattern Recognition Techniques ^ 113 to compute the parameters /310 043. Although the model was acceptable according to the residual analysis, the variances of the coefficients of the features (as) were significantly large. Some of the outliers (possible noisy data), therefore, were eliminated using the procedure described in the development of the rule-based system (this is described in Chapter 4). Again, these outliers correspond to the noisy data mentioned in the previous section. The regression, after eliminating the outliers, improved considerably resulting in acceptable p-values, F-ratios, and variances. The p-value and the F-ratio of regression analysis indicate the goodness of the model[Rice, 1988]. Furthermore, the validity of the model built was checked in several ways including normal probability plots and scatter plots of residuals. These are shown in Figures 6.6 to 6.13. The normal probability plot is used to observe the distribution pattern of the residuals. If the residuals are normally distributed the model cannot be further improved, for example, by introducing non linear terms. A straight line in a normal probability plot indicates that the residuals are normally distributed. It is clear from these figures that the above model resulted approximately normally distributed residuals. The scatter plot also allows us to graphically observe the randomness of the residuals. The coefficients of the features computed during the regression phase were implemented in an estimation model to estimate the cutting contour coefficients using the above model as given below. al = —6.612 — 0.032ac1 + 0.147ac2 + 0.031ac3 (6.94) a2 = 13.000 + 0.111ac1 — 0.289ac2 — 0.108ac3 (6.95) a3 = —9.038 — 0.101ac1 + 0.182ac2 + 0.119ac3 (6.96) a4 =^2.194 + 0.049ac1 — 0.042ac2 — 0.028ac3 (6.97) 114 Chapter 6. Pattern Recognition Techniques ^ Normal Probability Plot for Coefficient a 1 1 / / o i 1 .: -2 1 -2 ^^ 1 o ^^ 1 2 Residual Figure 6.6: Normal Probability Plot of Residuals for Coefficient al. ▪ 115 Chapter 6. Pattern Recognition Techniques^ Residual Plot for Coefficient a I 2 • • . „^ . a 1 • ^ • •^ • •^•^ •% • o _ .^ • • . ' . •^• ... • • • • • • • ,. . .. • •• . .. %. .. .^• • • • • .^.. • • .. - • a • a • • s • s -2 - • • % • - -3 ^ -2.2 -3.2^-3.0^-2.8^-2.6^-2.4 Estimate of Coefficient a 1 Figure 6.7: Scatter Plot of Residuals for Coefficient al. Chapter 6. Pattern Recognition Techniques ^ 116 Normal Probability Plot for Coefficient a2 3 2 1 0 1 -2 -3 -2 —1 o 1 2 3 Residual Figure 6.8: Normal Probability Plot of Residuals for Coefficient a2. Chapter 6. Pattern Recognition Techniques^ 117 Residual Plot for Coefficient a2 3 2 - 1 .^.^.^. .^ o - .•^. • • . a .• • • •• •• .. • d, .. .^ . . .^.•^ . . • _ .^. . .^... :. • . .• . ,• .^. • • le^ .• • • • .^. • • -1 - • • •• - .•^1^• .^.^. • •• • . •^• -2 3 ^ ^ ^ 4 5 6 Estimate of Coefficient a2 Figure 6.9: Scatter Plot of Residuals for Coefficient a2. 118 Chapter 6. Pattern Recognition Techniques ^ Normal Probability Plot for Coefficient a3 2 1 1 f : 1 o —1 ./ -2 1 -2 ^^ 1 ^^ I^ I o 1 2 Residual Figure 6.10: Normal Probability Plot of Residuals for Coefficient a3. 119 Chapter 6. Pattern Recognition Techniques^ Residual Plot for Coefficient a3 2 .. • • 0 1 •• • . .^ • • • • • - • a • : . .. . ^. a s • o • W.' • • .. •^. • . .. : . • . • 1. • • • • • • g : . •^• • • •^ • • • • - e .... '^ •^e . • . .a .•^ . • • -2 • a • • -3 ^ ^ ^ -3.0 -3.5 -2.5 -2.0 Estimate of Coefficient a3 Figure 6.11: Scatter Plot of Residuals for Coefficient a3. Chapter 6. Pattern Recognition Techniques ^ 120 Normal Probability Plot for Coefficient a4 3 2 1 -2 -3 -3 -2 1 —1 2 3 Residual Figure 6.12: Normal Probability Plot of Residuals for Coefficient a4. Chapter 6. Pattern Recognition Techniques^ 121 Residual Plot for Coefficient a4 3 2 a • • 1 • • • • • • • • • : • •0 • • :^• • a •• a • • a^• •• • • a • I o. • • a •• • • a 1 • • • • : • -2 • -3 • • 0.7^0.8^0.9^1.0^1.1 ^ 1.2 Estimate of Coefficient a4 Figure 6.13: Scatter Plot of Residuals for Coefficient a4. Chapter 6. Pattern Recognition Techniques ^ 122 The system was tested with the test data set prepared earlier, and the polynomial coefficients were estimated. The polynomials constructed with the estimated coefficients were compared with the model cutting contour polynomial of the corresponding group to compute the error. The errors or the indices of mismatch[Gamage and de Silva, 1991a][Gamage, 1990] were computed for both resubstitution and leave one out methods, and the results are tabled in Table 6.2. If the index of mismatch between the estimated contour and the model contour for the corresponding group is within the selected threshold level, the corresponding group number is marked in the results column, an "x" is marked otherwise. The tolerance level chosen when grouping fish was 0.0150. Therefore, if the index of mismatch for a particular sample is greater than the tolerance level it is considered to be unacceptable. About 72% of the test data were estimated within the specified tolerance level for the resubstitution case and 68% for the leave one out method. The fact that about 28% is out of the limits is mainly due to the very nature of regression. Regression attempts to average two extreme values to minimize the square error, which results in one common (average) contour for two extremely different contours. It is clear from Table 6.2 that all samples from group 1 are out of the specified tolerance level. The reason for this is because the number of data in group 1 was considerably smaller than the number of data in each of the other groups. The best estimated coefficients, therefore, tend to bias towards the groups with large number of data. A very important feature of regression is that it provides a basis for understanding the importance of the features being used. For example, if the parameter of a particular feature (standardized) is low it means that this feature is rarely significant or insignificant for the estimate of the dependent variable. The opposite is also true, that is, if the feature coefficient has a high value, the feature itself has greater influence on the estimate. Chapter 6. Pattern Recognition Techniques^ 123 Table 6.2: Multiple Regression Estimation Results for Resubstitution and Leave-One-Out Methods. No. f1 1 0.25 2 0.51 3 0.54 4 0.41 5 0.52 6 0.77 7 0.17 8 0.53 9 0.53 10 0.82 11 0.49 12 0.70 13 0.61 14 0.55 15 0.42 16 0.66 17 0.57 18 0.43 19 0.40 20 0.88 21 0.53 22 0.35 23 0.85 24 0.34 25 0.32 Sample f2 f3 0.23 0.35 0.53 0.44 0.47 0.42 0.43 0.23 0.54 0.47 0.86 0.46 0.38 0.54 0.54 0.65 0.53 0.47 0.87 0.42 0.47 0.53 0.69 0.39 0.61 0.34 0.40 0.27 0.31 0.17 0.53 0.38 0.56 0.43 0.46 0.32 0.49 0.37 0.85 0.63 0.56 0.78 0.27 0.31 0.73 0.39 0.32 0.32 0.28 0.27 Group 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 Classification Result Resub. Lye. 1 Out x x x x x x x x x x 2 2 2 2 2 2 2 2 2 2 x x x x x 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 Chapter 6. Pattern Recognition Techniques^ 124 6.5 Bayesian Approach This approach was investigated in order to classify fish into various groups. The objective of the Bayesian approach is to express a posteriori probability in terms of a priori probability and class conditional probability. Consider the following probabilities; P(C2) = the a priori probability that a pattern belongs to class C,. P(x)^The probability that a pattern is x. P(x}C,) = The class conditional probability that the pattern is x given that it belongs to class C. P(C,Ix) = the a posteriori conditional probability that the patterns class membership is C. given that pattern is x. P(C,, x) = the joint probability that the pattern is x and that its class membership is The Bayes relation is P(Cilx)^P(ICi)P(Ci)/ P(_)^ (6.98) and for continuous valued patterns it is P(Ciix) = p(IC)P(C)/p() ^ (6.99) where p(x1C1) and p(x) are the conditional probability density that the pattern is x given that it belongs to class C1, and the probability density of pattern x respectively. Chapter 6. Pattern Recognition Techniques ^ 125 The classification is based on the value of the conditional probability for an individual class. For example, if for classes C, and C3, the conditional probabilities P(C,Ix) > P(Cx), then we conclude that the given set of features belong to the class C3. The major problem of Bayesian classification is to compute the values of p(xiCi), i.e., to estimate the probability density function of p(xiCz). However, if the features are assumed to be normally distributed the following expression, which is the multidimensional Gaussian density, can be used to compute p(xlCi): = ((2 - 1 1 exp[ (x )711(x 1.±.)1 (6.100) where x is the N component feature vector, is the N component mean vector, and E is the N x N covariance matrix. Even with the assumption that the features are normally distributed, expression 6.100 still a computer intensive task to evaluate. In most cases, an additional assumption is made to simplify this task as given in Anderson [Anderson, 1958,1984], i.e., E for all classes are assumed to be the same. Although these two conditions are not imposed by the original Bayesian relationship, they are necessary for any real time application to be feasible. Now we shall consider the Bayesian classification as given in [Anderson, 1958,1984] for a case similar to the fish classification problem. In this case, however, it is assumed that distribution is multivariate normal with equal covariance matrices, though this was not the case for the fish data. These assumptions are made in order to compare the computational complexity of the Bayesian approach with the present rule-based approach. The Bayesian approach selects the class which gives the highest a posteriori conditional probability for a given feature vector, i.e. select class C3 if Chapter 6. Pattern Recognition Techniques^ 126 P(C,Ix) > P(Crjac) dr = 1, • • • ,q;r^j^(6.101) This is same as P(aIC;)P(C3) > p(IC,)P(Cr) dr = 1,•• • ,q;r j. (6.102) P(C4) = 13(_igi)P(C3) p4)^• ( 6. 10 3 ) since Inequality 6.102 can also be expressed as p(x1C3)^P(Cr) r = 1,•• • ,q;r^j.^(6.104) ^P(C)^ r3) p(IC) The density function p(Cr) for the multivariate normal distribution with equal covariance matrices is given below. 2,41(7r)^((270N/21E11/2) 1 exp[-4— )T-1(x_ IL)]^(6.105) With this distribution function, Inequality 6.104 can be expressed as follows: exp [-1(x — p 3)TE-1(x — 113)] P(Cr) Ltr)TE_1(x^ > p(u„3) dr = 1, • • ,q;r j. exp [-(x — )} (6.106) Since the logarithmic functions are monotonically increasing, Inequality 6.106 can be written in log terms as 1 P(CT) _ [(x_ )TE-1(_ 113) _^JTE-1(_ /)]^ L. > logP(C dr = l,,q;r ,) j. (6.107) - Chapter 6. Pattern Recognition Techniques^ 127 Now, the left-hand side of 6.107 can be expanded as 1 [T-1x _ x T1_ itTE-1x xT E-1 x xr E-1^ear E-1 x^E-1 ] by simplifying and rearranging the terms the following is obtained. 1 xTE-1(113 _ )_ 2013 + it )71-1013^) (6. 10 8 ) This is the discriminant function for the multivariate normal case. Now, if the discriminant function (6.108) is represented as up., the classification rule is, select class C3 if P(C) up. > logCT Vr = 1, • • • ,q; r^j .^(6.109) P(3) ^The terms E-1(p3 — ) and ^)TE'(iij — ) of 6.108 can be pre-computed and implemented in a classification system. For a system with three features, the first term is a 3 x 1 vector and the second term is a constant. Now, if there are five classes, the discriminant functions u3k can be written as ^U12 ^U13 — 0-012^crinaci^a212ac2^0-312aC3 0-013^0-213ac2^cr313ac3 u45 a045^a145ac1^cr245ac2^C734542C3 ^ ( 6.1 1 0) Chapter 6. Pattern Recognition Techniques ^ 128 where akr. k = 1, 2, 3 are the components of the vector E-1(// — ) and a037. are the --3 constants computed from --1(p,^)TE-1(y — ). Although there are twenty up. functions,only ten functions have to be computed because u3,^—urj. Once the up.s are computed, the following rules will assign a new feature vector to the corresponding class. : if u12 - P12) U13 > P13) U14 > P14) U15 > P15 C2 : if u21 > P212 U23 > P23) U24 > P24) U25 > P25 C5 : if u51 > P51 U52 > P52 U53 > P53) U54 > P54 where pir P(C,.) = log P(C3)• ( 6. 11 2 ) 6.6 k-Nearest Neighbor Method As one would expect from the name, this method classifies a feature vector x by assigning it the class most frequently represented among the k-nearest samples]Therrieu, 19891 [Duda and Hart, 1973]. In other words, a decision is made by examining the groups of the k-nearest neighbors and taking a vote. Consider a feature vector x in a feature space. The euclidean distances from x to its k nearest neighbours are computed and the decision is based on the groups of the neighbours. For example, consider case where k is equal to 5, i.e., the 5 nearest neighbours are considered for classification. Now, for a particular feature vector x, if the resulted groups of the 5 neighbours are arranged according to the ascending order of the resultant euclidean distances as follows Chapter 6. Pattern Recognition Techniques ^ 129 Neighbour Group 1 1 2 1 3 1 4 2 5 4 it is possible to conclude that the feature vector belongs to Group 1 by taking a vote. In making the decision, the groups of closer neighbours can be given larger leverage than the distant neighbours. This type of classification is known as nonparametric techniques because it does not require the parameters of the probability distributions of the training data. The k-nearest neighbor algorithm was implemented and tested with the set of test data used previously. The results are shown in Table 6.3. As it is clear from this table that the classification rate is about 76% for the resubstitution method and 68% for the leave one out method. The low classifying rate is due to the inability to generalize the training data. Furthermore, the classification is based on a "look up table" approach, hence, it is sensitive to noisy data. The other drawback of this method is that it needs to access the whole database on-line in order to obtain the neighbours for a given feature vector. This may considerably slow down the processing in particular with a large database. Chapter 6. Pattern Recognition Techniques ^ 130 Table 6.3:^k-Nearest Neighbour Classification Results for Resubstitution and Leave-One-Out Methods. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 fl 0.25 0.51 0.54 0.41 0.52 0.77 0.17 0.53 0.53 0.82 0.49 0.70 0.61 0.55 0.42 0.66 0.57 0.43 0.40 0.88 0.53 0.35 0.85 0.34 0.32 Sample f2 f3 0.23 0.35 0.53 0.44 0.47 0.42 0.43 0.23 0.54 0.47 0.86 0.46 0.38 0.54 0.54 0.65 0.53 0.47 0.87 0.42 0.47 0.53 0.69 0.39 0.61 0.34 0.40 0.27 0.31 0.17 0.53 0.38 0.56 0.43 0.46 0.32 0.49 0.37 0.85 0.63 0.56 0.78 0.27 0.31 0.73 0.39 0.32 0.32 0.28 0.27 Group 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 Classification Result Resub. Lye. 1 Out 1 1 1 x x x x x 1 1 2 2 2 2 2 2 x x 2 2 x x 3 3 3 3 3 3 3 3 4 4 x x x x 4 4 4 4 5 5 5 5 5 5 5 5 x 5 Chapter 7 Comparison of Recognition Techniques 7.1 Rule-Based Method As shown by the test results, the overall performance of the the rule-based system is very good. The speed of the method is high and satisfies the necessary real-time requirement for the application in fish processing. The real-time requirement of the system is to process two fish per second which leaves only 500ms for processing and controls. These include feature extraction from image analysis, classification, dimensionalization, co-ordinate transformation, determination of robot actuator commands, and communication. Even though the time left for classification is approximately 200ms, the rule-based system is capable of achieving this constraint. With 22 rules and 3 features for each rule in the present implementation, the classification time is approximately 2ms on a PC 486 computer. The accuracy of classification, in this context, can broadly be viewed in two ways; the percentage of correct classification and the percentage of incorrect classification. Apart from these two classifications, however, there are two other possible classifications; am- biguous classification and no classification. An ambiguous classification occurs when a feature vector is classified into more than one class. If the feature vector does not belong to any class it is considered as being not classified. To illustrate this, consider the following example of outputs of a classifier for a feature vector that belongs to class 2, for four different test cases. 131 Chapter 7. Comparison of Recognition Techniques ^ Case 1 Case 2 Case 3 Case 4 1 0.2 0.1 0.9 0.1 2 1.0 0.1 0.9 0.0 3 0.0 0.2 0.1 0.1 4 0.1 0.9 0.0 0.2 5 0.0 0.0 0.2 0.1 Output 132 Class Now, if a threshold level of 0.5 is selected for the classification to a particular class to be true, case 1 is clearly correctly classified because the only output greater than 0.5 is the output 2 which is same as the class of the feature vector. Case 2 has classified the feature vector to class 4, hence, the classification is wrong. The outputs 1 and 2 are both set to be true in case 3, resulting in an ambiguous classification, while none of the outputs of case 4 is set above the threshold level giving no classification. In the cases of ambiguous and no classifications, a default action (or no action) can be performed. In the case of wrong classification, however, the system has no knowledge about its error, therefore, a wrong action is performed. In fish processing this may not be a serious matter, but in some other cases, such as in hazardous environments, it may be more critical. As can be seen from Tables 4.6, 6.1, and 6.3, the rule-based system gives lowest percentage of wrong classification for both resubstitution and leave one out methods. The percentage of correct classification of this method is also high. Ambiguous classification is not possible with this method because the inference engine of the rule base resolves the conflicts and gives only one output. The robustness of the method is also good resulting in about 80% classification rate for the leave one out test. The learning phase of the rule-based methodology was carried out by setting up a connectivity diagram as described in Chapter 4. The methodology is somewhat similar to Chapter 7. Comparison of Recognition Techniques ^ 133 the density estimates using Parzen windows. The advantages of the present method are: a) it uses only the significant regions of the density resulting in efficient classification, b) it represents the high density regions in the form of rules which can be integrated with expert or heuristic knowledge which are already in this form, c) it allows the knowledge engineer or the designer to view high density regions graphically and understand the patterns or the relationships in the database. Also, this allows a better understanding of the possible noisy data in the database. The validity of the solution is, therefore, graphically supported rather than blindly arriving at it. In order to update the rule-base or learn continuously, it is necessary to collect more data to update the connectivity diagrams of each group and to construct new diagrams if necessary. The updated connectivity diagrams will enable the engineer to add more rules, change priorities of the existing rules, and to delete the redundant rules. 7.1.1 Real-Time Complexity Comparison to Bayesian Approach A Bayesian classifier for the fish problem is given in Expressions 6.110 and 6.111 of Chapter 6. In this system, however, it was assumed that the distribution of the features were multivariate normal with equal covariance matrices. The complexity for this system is 30 multiplications, 30 additions, 20 comparisons, and 15 logical AND operations. All operations are floating point. For the rule base system with 22 rules and three features for each rule, the complexity is 132 comparisons and 66 logical AND operations. These operations were simulated on a PC 486 computer in order to compare the speed of the two methods. From this experiment, it was clear that the rule base system runs twice as fast as the Bayesian classification system. Chapter 7. Comparison of Recognition Techniques ^ 134 7.2 Neural Network The performance of the neural network classifier depends on the complexity of the network. The complexity, in this context, refers to the number of nodes, hence, the number of inter-connections in the network. To illustrate this, consider a node as shown in Figure 7.1. Figure 7.1: Schematic Representation of a Node in the Neural Network. As it is clear from Figure 7.1, each node has to perform (n — 1) additions and n multiplications, ignoring the computations in the activation function. Here, n is the number of inputs to the node. It is, therefore, obvious from this example that as the number of nodes increases the computation overhead may become a serious obstacle for real-time applications such as head removal in fish processing. The on-line classification time for the fish contour classification case is approximately 50ms. The accuracy of classification of this method is better than the rule base system for the resubstitution test, however, it was about 72% for the leave one out method. The percentage of incorrect classification is 16%, which is also high compared to the rule base Chapter 7. Comparison of Recognition Techniques^ 135 approach. The robustness of the neural network is one of its key features but it depends on the level of training and the number of nodes. For example, if a network is trained with a number of nodes which is considerably larger than the required number of nodes to partition the feature space, and if the network is overtrained to produce very sharp boundaries to separate the regions of the feature space, the robustness of the network will be very poor. To illustrate this, consider the following example of 2-D feature space shown in Figure 7.2. xi aaa aaa aa aaaaa aaaaaa aaaaa aaaa bbbbbb bbbbbbb bbbbbbb bbbbbbb bbbbbbb bbbbbbbb • bbbbbbb bbbbbbb • bbbbb x2 Figure 7.2: Schematic Representation of Patterns in 2-D Feature Space. Now, if a network is trained with large number of nodes, as described in the previous paragraph, to partition the feature space with boundaries as shown, a test pattern "x" will not be classified, even though one may classify it as being "a". The robustness of the fish classification network was acceptable giving 72% classifying rate for the leave Chapter 7. Comparison of Recognition Techniques ^ 136 one out test. The initial learning of the network is slow and hard because of the convergence method currently used. The convergence method used in training the weights is known as gradient descent method. In fact, most of the negative implications of the neural network methodology come from this training technique. The gradient descent method is not always guaranteed to converge to the global minimum. Instead, the solution may be trapped in a local minimum resulting in incorrect weights. Even if it does converge to a global minimum, it may take a long winding path resulting in a very large convergence time. A typical convergence time for the fish data was about 24 hours on a Sun SparcStation. Subsequent training can be achieved by training the weights with new data but previous weights can be used as initial weights to improve the learning time. 7.3 Multiple Regression The fastest of all techniques is the multiple regression estimator, because it requires only 3 multiplications and 3 additions to estimate one of the four coefficients of the cutting contour polynomial. Therefore, the total number of multiplications and additions required to estimate all four coefficients are 12 and 12 respectively, as it is clear from Equations 6.94, 6.95, 6.96, and 6.97. The accuracy in this case can only be viewed as the similarity or the closeness of the estimated contour to the actual contour. To compute the error between the estimated contour and the actual contour the index of mismatch between the two is computed. The complete set of training data was tested initially and computed the indices of mismatch between each estimated contour and the actual contour. About 65% of the original data was estimated within the specified tolerance level. The reason for this low success rate is something inherent to linear regression. Chapter 7. Comparison of Recognition Techniques ^ 137 It attempts to fit the best hyperplane for the independent variables (features) in the training data to estimate the dependent variable. For example, in the present case, there are three independent variables, (nose-eye length)/(length), (nose-gill length)/(length), and (width)/(length), to estimate the dependent variable, the cutting contour polynomial coefficient. Regression, therefore, fits the best hyperplane in 4-D space to estimate each coefficient of the polynomial function by minimizing the square errors in each dimension. If a particular group has a considerably larger number of data than some other group, the estimated hyperplane will be biased towards the larger group resulting in a large error for members in the smaller group. This became very clear when the test set of data was tested resulting in unacceptable indices of mismatch for the members in group 1. Here, the indices of mismatch were computed between the test sample and the model contour of the group of that sample. This comparison was essentially carried out to compare the regression model with other techniques. There is not much difference between the results for the two test methods. Even if two or three samples were left out and tested, the results were very consistent. The overall accuracy of the method for resubstitution was about 72%, and for leave one out it was 68%. Unlike all the other classification techniques, the regression model was able to come up with a reasonably good contour even when the system was presented with a completely new set of data. The initial learning of the coefficients of the model is fairly trivial and further improvements can be obtained with increased number of data. Chapter 7. Comparison of Recognition Techniques ^ 138 7.4 k-Nearest Neighbour Method The speed of k-nearest neighbour algorithm is directly dependent on the number of data in the database and the number of features for each data item. As the number of data items and the number of features grow larger, the speed of the method rapidly decreases. The algorithm has to search for the nearest neighbours by computing the euclidean distance between each of the features of the test sample and the corresponding features of each of the data items in the database. As it is clear from this, the complexity of the algorithm rapidly increases with the increased number of data items and the number of features per data item resulting in poor speed performance. The on-line inference time for the fish classification case is about 55ms. The accuracy of the algorithm, on the other hand, increases with the number of data in the database. For the test set of data given in Table 6.3, the algorithm produced only about 76% correct classifications for the resubstitution method, and 68% for the leave one out method. The major drawback of the method is the inability to generalize the training set of data. To illustrate this, consider a single feature (1-D) example shown in Figure 7.3. Now, if the unknown data to be classified is xl, and k is chosen to be 3, the algorithm will determine the neighbours to be 2, 2, and 1. This will result in the class of the unknown data x1 to be 2, although the actual class may be 1. On the other hand, if a smoothed version of the density is used for classification as in rule-based method, the probability of x1 being in class 1 will be larger, hence, the classification will be class 1. There is no initial learning phase for this algorithm and new data could be easily added to the database with minimum effort making even on-line learning possible. ^ Chapter 7. Comparison of Recognition Techniques^ zz c a) 0 22 222 222 2222 2222 22222 222222 2222222 22222222 222222222 2222222222 1 111 11111 1^1^1^1^1^1^1^1 111 1 1111 ^1221 Xi 2222222222222 22222222222222222 Figure 7.3: Density of Patterns for Two-Class, Single-Feature Classification. 7.5 Summary of Classification Results Table 7.1: Summary of Results. Classification Rule-Based Method Test Method Correct Incorrect Not Classified Resub. 88% 8% Lye. 1 Out 80% 8% Neural Net Method KNN Method Multiple Regress. Resub. Lye. 1 Out Resub. Lye. 1 Out Resub. Lye. 1 Out 92% 72% 76% 68% 72% 68% 4% 16% 24% 32% 28% 32% 4% 12% 4% 12% 0% 0% 0% 0% 139 Chapter 8 Pattern Recognition for Estimation of Cutter Position In many fish processing applications, machines employing fixed average settings are used to perform a series of operations on products which contain substantial variations. Since such an approach leads to product wastage, the development of advanced techniques such as computer vision and adaptive pattern recognition in order to reduce wastage and maximize yield and revenue, is important. A good example of such a process is the head removal stage in a fish canning line. By manually butchering a set of fish, and examining them through consultation with fish processing experts, the optimal location of the cutter has been established as the point of intersection of the line along the back of the roof of the mouth, and the center plane of the head. Once the location of the point is established, a cutter can be moved to this position to remove the head of a fish [Carnage et. al., 1993]. The direct use of image analysis to establish the location of the cutting point is not possible because the point itself is not a physical attribute that is externally visible. This chapter describes how a pattern recognition system can be employed to estimate this crucial reference point with a computer image processing system at the front end. The resulting information is directly used to drive the cutter in the head removal process. A multiple regression model and a neural network were developed to estimate the reference point. The results of the two methods are also presented. 140 Chapter 8. Pattern Recognition for Estimation of Cutter Position^141 8.1 Off-line Data Acquisition and Analysis About 500 heads of Sockeye and Pink salmon were collected at B. C. Packers and brought to the Industrial Automation Laboratory to generate the database for data analysis. Images of each fish head were captured to a vision workstation (SD512) to record the important features of the head that would be useful for estimation of the point. Various measurements such as eye-to-nose, eye-to-flap, eye-to-gill, eye-to-the center of the pectoral fin, eye-to-the edge of the pectoral fin, width at flap position, width at gill position, and eye-to-the reference point were taken with the help of the vision system. These features are illustrated in Figure 8.1 1 Nose 2. Eye 3 Flap 4,5 Flap Width 7,8 Gill Width 6 Gill 9 Fin Top 10 Fin Center P Point of interest Figure 8.1: Illustration of the Features of a Fish Head. The data were then transferred into a database in a PC/486 computer to analyze and to examine the relationships between the reference point and the other features. A Chapter 8. Pattern Recognition for Estimation of Cutter Position^142 statistical package (SYSTAT) was used for this purpose. Various statistical parameters including covariances and correlations were used to examine the dependence of measurement of the point to the other measurements, and to compute the interdependence of the features. If, on the one hand, the correlation between a feature and the point measurement is strong (i.e. close to 1) the feature itself is a good representation of the point of interest. On the other hand, if two features are strongly correlated, then one of the features is redundant. Therefore, correlation was used to include the important features and to eliminate the redundant features from the model of the estimate. 8.2 Development of an Estimation Model Since the problem is an estimation of a point in a real time industrial application, a multiple regression model was initially chosen because of its speed and the robustness. As mentioned in Chapter 6, multiple regression is an extension of simple regression, in that there are several independent variables compared to one independent variable as in simple regression, to estimate a dependent variable. In the present case, the point in the head is the dependent variable and is estimated using some dimensional features of the fish. In fact, these dimensional features are the independent variables in the regression model. Therefore, the following regression model was formed to estimate the point of interest. = +thfi + 02f2 + ^ + Onf.^(8.113) where /3 is the point estimated, fi , ^ fn are the dimensional features, and Os are the coefficients of the features. After analyzing correlation coefficients among the features and between the features ^ Chapter 8. Pattern Recognition for Estimation of Cutter Position^143 and the reference point, it was found that some of the features are redundant and some of the features are insignificant. Therefore, the above model was reduced to the following. = Oxo +13.1(eye-to-fin edge) + f3x2(eye-to-gill)^(8.114) 13y = i3 ^edge) +132(eye-to-gill) ^(8.115) The coefficients of the features computed during the regression phase were implemented in an estimation model to estimate the x and y co-ordinates of the point p as given above. The origin of the co-ordinate system was fixed at the position of the eye. The system was tested with a set of fish data and the errors px — Px and py — Py were computed. The average error in both x and y directions are about ±2.5mm. This was a very small percentage (4%) in x direction, but was a large percentage (60%) in y direction because the estimate of y itself is around 4.0mm. Therefore, the distance from the eye to the point is considered only in the x direction for implementation. This distance is approximately the same as the distance from the eye to the point along the center line of the fish. Following is the actual model of the estimate, while Table 8.1 shows a sample of the training data and the results of the estimate. = 7.582 -I- 0.192 * (eye — to — fin edge) + 0.43 * (eye — to^— gill)^(8.116) The validity of the model built was checked in several ways including normal probability plots and scatter plots of residuals. The normal probability plot was used to observe the distribution pattern of the residuals. If the residuals are normally distributed, the model cannot be further improved. This was true for the above model. The scatter plot also allows us to graphically observe the random nature of the residuals. Figure 8.2 Chapter 8. Pattern Recognition for Estimation of Cutter Position^144 Table 8.1: Sample Test Results of Estimate Using Multiple Regression. Fish No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Eye-Gill (mm) 87.11 89.72 84.84 88.17 77.72 80.04 84.52 78.06 76.66 78.53 78.15 72.73 73.65 69.69 78.15 75.97 81.59 83.76 91.53 84.56 Eye-Fin (mm) 86.24 87.08 84.25 85.97 78.11 79.36 79.28 77.11 65.31 77.66 75.63 75.32 74.82 67.57 77.42 76.39 81.25 82.06 85.42 82.76 Eye-Point (Actual)(mm) 60.64 66.70 55.69 56.61 52.46 56.60 53.43 52.26 49.41 56.98 56.33 51.77 48.08 47.52 53.66 52.14 56.96 58.36 65.23 59.55 Eye-Point (Estimate)(mm) 61.60 62.88 60.24 62.00 56.00 57.24 59.15 55.95 53.09 56.26 55.71 53.31 53.61 50.52 56.05 54.91 58.27 59.35 63.34 59.83 Error Abs(mm) 0.959 3.819 4.549 5.392 3.539 0.637 5.716 3.689 3.672 0.720 0.619 1.548 5.538 2.995 2.392 2.768 1.300 0.991 1.896 0.278 Chapter 8. Pattern Recognition for Estimation of Cutter Position^145 and 8.3 show the normal probability plot and the scatter plot of residuals. As it is dear from these plots the residuals are random and normally distributed supporting the validity of the model. A neural network was also trained to estimate the distance to the point p. The same features that were chosen for the multiple regression were used for the neural network. The results of this method are shown in Table 8.2. Table 8.2: Sample Test Results of Estimate Using a Neural Network. Fish No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Eye-Gill (mm) 87.11 89.72 84.84 88.17 77.72 80.04 84.52 78.06 76.66 78.53 78.15 72.73 73.65 69.69 78.15 75.97 81.59 83.76 91.53 84.56 Eye-Fin (mm) 86.24 87.08 84.25 85.97 78.11 79.36 79.28 77.11 65.31 77.66 75.63 75.32 74.82 67.57 77.42 76.39 81.25 82.06 85.42 82.76 Eye-Point (Actual)(mm) 60.64 66.70 55.69 56.61 52.46 56.60 53.43 52.26 49.41 56.98 56.33 51.77 48.08 47.52 53.66 52.14 56.96 58.36 65.23 59.55 Eye-Point (Estimate)(mm) 61.60 64.89 58.98 63.07 55.69 56.75 58.61 55.98 52.63 56.26 56.18 56.06 52.69 48.90 56.01 54.70 57.28 58.26 65.96 58.77 Error Abs(mm) 0.959 1.806 3.289 6.464 3.232 0.152 5.176 3.717 3.224 0.720 0.153 4.291 4.610 1.378 2.343 2.560 0.325 0.099 0.730 0.783 The average absolute error of the neural network estimator is about 2.1mm. The accuracy of this method is better than the multiple regression method, however, real time Chapter 8. Pattern Recognition for Estimation of Cutter Position ^146 Normal Probability Plot for Point Estimation Residual of Estimate Figure 8.2: Normal Probability Plot of Residuals of Point Estimate Using a Neural Network. ▪▪^ Chapter 8. Pattern Recognition for Estimation of Cutter Position^147 Residual Scatter Plot 3 2 - 1 ^.. ^. ▪ •. • • - • ° • ..^.^ .. • • • ..:•. i^. .^ • .^- _ - • • • a •• ea • :".•.e . • • • - • ..^c • . :^..^• . • •^. .^.I.^• • • •• :•••• • ••^•^ • • - .• • ... • • • •^•^• 'lb • • .^.^. •„ • 45 • - • a • a • % -3 _ • -^.^ .^I.^ • . •^• • • .. %..•^ -2 - - ^% : V^• • • •^. • •^a^• . .^, • • .. d' ....• • ..^..• .8 . . • ••• sir ••• •^• • • • .. • . a .^•^i^• . .^• . •• • • .^• • % ^ 50 ^ • a • • 55^60 ^ 65 ^ 70 Estimate of Point Figure 8.3: Scatter Plot of Residuals of Point Estimate. Chapter 8. Pattern Recognition for Estimation of Cutter Position^148 computation will be more intensive than the regression. Models were also developed to estimate the point for each species separately rather than all types represented in one model. However, no improvements were observed in the errors of estimate with this separation. In the practical automated fish processing system, the front end image processing system extracts the required features from the fish images that are captured on-line from a CCD camera. These features are then passed in to the reference point estimator (regression or neural network estimator) which, in turn, estimates the reference point for head removal using the features. Chapter 9 Laboratory Implementation As mentioned in Chapter 3, existing fish butchering machines waste about 10% 15% of useful meat. About 8%^10% of this meat can be recovered by accurately placing a special double angular ("V-shaped") cutter at the gill position and removing the head. In order to recover the rest of the meat, however, it is necessary to perform a contour cut. Section 9.1 describes a "V-cut" butchering machine that is being developed in the Industrial Automation Laboratory of the University of British Columbia [de Silva, 1990] [de Silva, Riahi, and Carnage, 1991]. Section 9.2 gives a detail description of the development and the implementation of a prototype contour cutting system while Section 9.3 describes the automatic rule generation scheme for this system. Section 9.3 and 9.4 respectively describe the cutter technology and the gripper considerations. 9.1 "V-cut" Butchering Machine After manually butchering a large number of fish in consultation with fish cutting experts, it was established that most of the meat that is wasted in the head can be recovered by butchering fish according to the angles shown in Figure 5.7. There are two important angles (By and O.), as shown in this figure. The angle 0„ represents the angle of V-cut with associated recovery of useful meat within the head region. This angle is fixed and is achieved by a double angular cutter. The cutting angle Oci of an angle cut is chosen 149 Chapter 9. Laboratory Implementation ^ 150 to recover the meat in the dorsal side of the head while eliminating the pectoral fin. Statistics show that (see Table 5.2) the optimum values for the angles 0, and Oa are approximately 350 and 600 respectively. In the operation of this machine, each fish on the conveyor is gripped and the head region is imaged in order to compute the gill position. Low-to-medium level image analysis techniques are employed for this purpose. Once the gill position is determined the double angular cutter assembly is positioned at the gill position such that the head is removed from the body just behind the gill (collarbone). Note that, the fish on the conveyor is placed at a 30° angle to the motion of the conveyor in order to remove the pectoral fin while recovering most of the meat in the dorsal side of the head. Also, the thickness of the fish is measured in order to align the center of the fish with the center of the double angular cutter. The cutter assembly is driven by a DC motor through a lead screw. The height of the fish (to align with the cutter) is adjusted by an elevating platform which is also driven by a DC motor. Both motors are controlled by a GAUL two axis motion controller. Also, the cutter has two rotary blades. These are arranged to form the angle au, and are driven by two 3 phase motors. In order to reduce the weight of the cutter assembly, the motors are mounted separately and the blades are driven through flexible shafts. 9.2 Contour Cutting System The rule-based methodology was used for on-line generation of cutting contours in a head removal operation for Pacific Salmon [Carnage and de Silva, 1992]. The integrated system for the contour generation consists of a CCD camera, a commercial image processing system, and a knowledge base computer. The model base and the rule base are located Chapter 9. Laboratory Implementation ^ 151 in the knowledge base computer. During operation, a fish is imaged and the features are computed by the image analyzer. The feature values are then transmitted to the knowledge base computer, and during the rule matching process a rule is fired within the rule base and a model is matched from the model base. The model contains the required cutting contour, which is then transmitted to the controller of the fish cutting robot. Along with the contour, the length and the width of the fish are also transmitted in order to dimensionalize the non-dimensional model contour. This dimensionalization is carried out in the robot controller in the present implementation, since only the model number (not the contour) of the corresponding contour is sent to the robot controller in order to save the communication time. The model base is stored in the robot controller, and upon receiving the model number, the length, and the width of the fish, a program in the controller dimensionalizes the contour retrieved from the model base and drives the robot according to the trajectory generated for the corresponding model. A cutter attached to the end effector of the robot removes the head along a path described by the model contour. A suitable cutter to perform the contour cutting has to be developed. Section 9.4 describes the proposed cutters along with their advantages and disadvantages. Approximately 200 recorded images were processed and analyzed by the SD512 vision workstation in order to select possible features which are further analyzed by a computer (HP Apollo DN4500 and PC386) for model development. The specifications and system configuration of SD512 vision station along with its vision functions and application development are described in Appendix C. The rules for each model are developed by setting up a connectivity diagram. An automatic method for rule generation is being developed and it is described in Section 9.3. A laboratory demonstration to verify the operation of the knowledge based contour generation system, and the accuracy of the trajectory of the robot (SCORBOT-ER VII) Chapter 9. Laboratory Implementation ^ 152 with respect to a determined contour, was performed. A block diagram of the integrated system is shown in Figure 9.1. The specifications of the robot and robot controller, and programming considerations are given in Appendix B. An image obtained by the CCD camera is sent to an image buffer of the SD512 vision computer. The image analysis program on this computer reads an image from the buffer and computes the geometric features that are needed for the knowledge-based system. This program utilizes the C library and the image processing routines library in its computation of these features. In the present implementation, these features are the noseto-eye length, the nose-to-gill length, the length, and the width of the fish. The features are then communicated to the PC486 computer through one of the serial ports of the vision workstation. The serial communication has been established using three wires and XON/XOFF handshake. The serial communication software running on the PC486 reads the features sent by SD512, and transfers them to the working memory of the 0PS83 rulebased program. 0PS83 is a programming language for developing rule base systems, and an overview of it is presented in Appendix A. The feature values in the working memory provides the context for the rule base, and it will fire a rule if satisfied. Otherwise, a default action is followed. The inference of a rule is the cutting contour model, and it is sent to the SD512 computer through the serial link, to be transferred to the SCORBOTER VII robot controller together with the length and the width information of the fish. The interface circuit between SD512 and the robot controller, and the communication and synchronization software are described in Appendix D. An image captured by the vision station is shown in Figure 9.2. A typical inference of the model-based contour generation system is shown in Figure 9.3. The inferred contour superimposed on the corresponding image of a salmon is shown in Figure 9.4. Chapter 9. Laboratory Implementation ^ 5D512 VISION WORKSTATION^ 153 PC 486 COMPUTER OPS 83 C Library (MICROSOFT) PrOgICIM Rules Attrbut Attributes 4^ RS232C Model No. —11• WWI Commun. Software 4-Model No. Image Buffer A Pascal User interface Parallel I/O Comm. Software ^I^ I Model No.^parallel data and handshake Buffer CCD Camera Ash Image Non Dimensional Model Base Digttal Commun. Software Convey Motion ACL Program . DimendonalzatIon • Determinations of joint M011011$ Joint Actuator Interface Joint Actuator Signals Chapter 9. Laboratory Implementation^ Figure 9.2: An Image Captured by the Vision System. Figure 9.3: A Typical Output of the Model-Based Contour Generation. 154 Chapter 9. Laboratory Implementation ^ 155 Cutting Contour Figure 9.4: An Image of a Salmon with the Cutting Contour Superimposed. 9.3 Automatic Rule Generation A two-stage automatic rule generation process, which is carried out on a per class basis, is readily executed using a set of training data stored in the data-base. During the first stage a set of potential rules (P-Rules) is established, and subsequently during the second stage the P-Rules are merged into a smaller number of final rules (F-Rules). With reference to the analytical basis for rule generation, recall that the probability density f:)(cAcri) (or 13(g,c1Cj)P(C3)) is estimated within fixed sized hypercubes in the attribute space. We can consider the attribute space to comprise many adjacent hypercubes, each constituting a potential rule (P-Rule) defined by the bounding values of the hypercube dimensions. In addition, each fish in the training set will be contained within one or more of these hypercubes. In practice, however, multiple fish contribute to a single PRule, thereby decreasing the number of P-Rules and increasing the density of P-Rules containing multiple fish. The initial stage of generating P-Rules involves simply taking Chapter 9. Laboratory Implementation^ 156 each fish in the training data set and either initiating a new P-Rule or incrementing the density of an existing P-Rule. The P-Rules are maintained in an ordered list according to density, and a density threshold is established in order to deactivate low priority P-Rules. A graphical interface was developed to help understand the effect of the threshold level. Figures 9.5, 9.6 show the P-Rules at various threshold levels. In the second stage of rule-generation, neighbouring P-Rules are merged to form the reduced set of F-Rules. This procedure starts by initializing a new F-Rule with the highest density P-Rule. Next, the P-Rules are considered, in the order of their density, for merger with existing F-Rules or for the initiation of new F-Rules. The neighbouring hypercubes of the P-Rule having the next highest density are examined in order to determine the action to be taken. If one of the neighbouring hypercubes corresponds to an existing F-Rule, the corresponding boundary of the F-Rule is extended to include the current P-Rule, and the current P-Rule is deleted. In addition, any remaining P-Rules, including deactivated rules, that fall within the new F-Rule boundary are included in the F-Rule and are deleted. If the current P-Rule does not neighbour an existing F-Rule, the P-Rule remains on the list in order to be reconsidered. If no remaining P-Rules can contribute to an existing F-Rule, the highest density P-Rule is used to initiate a new F-Rule and the process is repeated. The process described continues until all P-Rules have been incorporated into FRules. During this process, the number of training data and the number of hypercubes contributing to each F-Rule are recorded in order to set the priority of the F-Rule. The result of this process is a reduced set of F-Rules describing each class of cutting contour. Chapter 9. Laboratory Implementation^ 157 Treshold Level : 9 adl ac2 ac3 9.0 9.0 9.0 8.0 8.0 8.0 7.0 7.0 .0 6.0 6.0 6.0 5. 0 5.0 5.0 4.0 4.0 4.0 3.0 3.0 3.0 2.0 2.0 2.0 1.0 1, 0 0.0 0. 0 0.0 Treshold Li: 6 act de:2 etc3 9.0 9.0 7.0 7.0 7.0 6.0 6.0 6.0 5.0 5.0 5.0 4.0 4.0 4.0 3.0 3.0 3.0 2.0 2.0 2.0 1.0 1.0 1. 0 0,0 0.0 0.0 9.0 8.0 Figure 9.5: P-Rules with Density 9 (above), and Density 6 (below). Chapter 9. Laboratory Implementation Figure 9.6: P-Rules with Density 4 (above), and Density 2 (below). 158 Chapter 9. Laboratory Implementation^ 159 9.4 Cutter Technology Once the cutting contour for head removal is determined by the knowledge-based system, it has to be transformed to necessary cutting signals for a robotic cutter. The robotic cutter must be able to cut the fish according to the contour generated, within the specified time limits. The cutter can be attached as the end effector of a robot which can provide the necessary trajectory in order to follow the required path. The design of a cutter is a non-trivial task because of the cutting geometry and the material properties of fish. In particular, the cutter will undergo different type of material within a fish, as well as in different batches. For example, it has to cut flesh as well as bones within a fish and also fish may be frozen, fresh, cultured, or wild salmon. With these constraints in mind, the feasibility of following types of cutter was investigated. 1. Water-jet Cutter 2. Laser Cutter 3. Wire Cutter 4. Rotary Blade Cutter 5. Reciprocating Blade Cutter 9.4.1 Water-jet Cutter Although a water-jet cutter can provide necessary cutting power, speed, and accuracy, it is considerably expensive for this application. Their prices can range from $80,000 to $150,000, depending on the pressure, for this application. A low pressure water-jet cutter uses abrasives to improve the cutting ability. This type of cutter is, therefore, not suitable Chapter 9. Laboratory Implementation ^ 160 for fish processing. On the other hand, a high pressure cutter may not be economically viable in head removal of fish. However, the prices of high pressure water-jet cutters are becoming less expensive and this may lead to an economically feasible solution in time. Also, water-jet cutters are currently being used in fish processing industry proving its technical feasibility for this type of application. 9.4.2 Laser Cutter Lasers are available for wide variety of applications. The power output of these is one of the key features that decides the cutting capacity and it ranges from low power 15W to high power 5kW. The low power lasers will not be able to penetrate 70mm thick flesh, which is about the maximum thickness of the fish that are processed using the present machine, to remove the head in contour cutting. Although the medium power lasers (500W to lkW) are able to provide such penetration power, the speed of the operation will not be satisfactory. The only type that can provide deep cutting at the required speed is the high power CO2 laser. Their power output is about 5kW and they cost about $300,000. However, the new Nd : YAG lasers can generate beams with 3kW power output, which may be adequate for the present purpose. These lasers are less expensive and they cost about $80,000 to $100,000. Even with these prices, the use of lasers in head removal will also not be economically feasible. Furthermore, the laser cutting has the added disadvantage that it burns the cutting surface of the fish. 9.4.3 Wire-Cutter Wire-cutting may be another possible solution to the head removal problem. Cutting a fish according to the contour generated along with the necessary "V" angle, as shown in Figure 9.7, will be a tedious task using a wire-cutter. However, a two stage cut may ease Chapter 9. Laboratory Implementation ^ Dorsal Side 161 ^Stomach V - Cut Angle Dorsal Fin Cross Section Figure 9.7: Cutting Geometry of Fish in Contour Cutting for Recovery Improvement. the problem to some extent. Consider the fish as shown in Figure 9.8. To remove most of the material in the head, a cut, as shown by line 1 in this figure, can be performed prior to the contour cutting. The second stage of the cut can be utilized to achieve the required contour pattern using a device as shown in Figure 9.9. Here, the wires are arranged in such a way that it forms the necessary angle for the "V-Cut". The cutter can be attached to an end effector of a robot. Also, the whole assembly may have to be oscillated in a vertical plane to achieve better penetration. The technical feasibility of this type of cutter is yet to be proven by laboratory experiments. Chapter 9. Laboratory Implementation ^ 162 Figure 9.8: Illustration of Two Stage Cutting in Head Removal. Sharp Wires Figure 9.9: Proposed Wire-Cutting Assembly for Robotic Contour Cutting. Chapter 9. Laboratory Implementation ^ 163 9.4.4 Rotary Blade Cutter Two small rotating blades can be arranged to form the necessary angle for "V-Cut" along with two more degrees of freedom to obtain the depth of the cut and the curvature of the contour pattern. This arrangement is shown in Figure 9.10. The blades are free to move in the z direction to achieve the required depth and can also rotate about the same axis to follow the contour curvature. The blades have to be small enough to provide an adequate approximation to the curvature of the actual contour pattern. As in the previous case, this assembly will be suitable only for trimming operations to obtain the final contour cut after the major part of the head is removed by an initial cut. The technical feasibility of this type of cutter is also yet to be observed. Rotary Blades-. Figure 9.10: Proposed Rotary-Blade Cutter Assembly for Robotic Contour Cutting. Chapter 9. Laboratory Implementation ^ 164 9.4.5 Reciprocating Blade Cutter Two reciprocating blades can be arranged as shown in Figure 9.11 to perform the contour cutting. Reciprocating saws are commonly available and can be modified to suit the present requirement. In order to reduce the weight of the cutting assembly, the motors of the saws can be installed separately at a remote location, and the rotational motion can be transmitted through flexible shafts. Although the gear that converts the rotary motion to the reciprocating motion has to be mounted on the cutter assembly, it will not present a serious problem as the weight of the gear together with the saw will be about lkg. If this cutter is technically viable, it will be the most economical solution to the contour cutting problem. Reciprocating Blades Figure 9.11: Proposed Reciprocating-Blade Cutter Assembly for Robotic Contour Cutting. Chapter 9. Laboratory Implementation ^ 165 9.4.6 Comparison and Recommendations Although both the high pressure water-jet cutter and the high power laser cutter are technically feasible, they are considerably expensive for the present application. In other words, they are not economically feasible from the industry point of view. The costs of these devices are, however, becoming less, and eventually they may be economically feasible. The technical feasibility of the other cutters are not yet known, but the cost of these are far less than that of the water-jet or laser cutter. The continuing work of the present research, therefore, is to investigate the technical feasibility of these cutters. In particular, the reciprocating blade cutter will be experimented with first, because of its simplicity and the availability of its components. 9.5 Gripper Considerations Proper holding of fish during the head removal process is important in order to achieve the desired cut. For example, if the fish is not properly held, it may be dragged towards the cutter and/or it may not maintain the optimal orientation with respect to the cutter resulting in a poor cut. Designing an appropriate gripper is, therefore, crucial and this section describes two grippers that are being developed in the Industrial Automation Laboratory. The first gripper, as shown in Figure 9.12, is an active gripper which means that the holding rollers of the gripper rotate in synchronism with the conveyor. In particular, the linear speed of the rollers at the holding position is same as the conveyor speed. With this mechanism, a firm grip can be achieved without obstructing the motion of the fish on the conveyor. Chapter 9. Laboratory Implementation ^ Driving Chain (Synchronized with the Conveyor) Conveyor Motion Conveyor Bed Fish Holding Pan Side View A—Driving Gear Driving String (Rubber) Flexible Plastic Holding Rollers Fish Front View Figure 9.12: Schematic Diagram of the Synchronous Gripper. 166 Chapter 9. Laboratory Implementation ^ 167 In this gripper mechanism, the driving chain, which is linked to the main driving motor of the conveyor, drives all of the gear wheels which in turn drive the independent rollers. The outer surface of the plastic rollers are grooved in order to increase grip. Also, the rollers are flexible to a certain extent which allows them to conform to the shape of an arbitrary fish. The springs attached to each lever are adjusted such that they apply sufficient force in order to hold fish firmly. Note that the fish are conveyed on pans which constrain the rotation of fish around a vertical axis. The second gripper, that is being designed, is a simple hold down mechanism to hold fish while the head is being removed. This gripper is primarily designed to operate with an intermittent (run, stop, run, • • -) conveyor. Figure 9.13 shows a schematic diagram of this gripper. In this gripper mechanism, the solenoid is synchronized to the intermittent motion of the conveyor. In particular, during the moving period the solenoid is activated (moved up) and, therefore, the fish is free to move to the next position. During the stationary period, the solenoid is not activated, and the spring forces the plastic holder down such that the fish is properly gripped. There are three such holders shaped and placed accordingly in order to achieve the optimum grip. 9.5.1 Comparison and Recommendations As mentioned above, the first gripper is an active gripper and, therefore, is best suited for a continuous motion conveyor. Although it may well work with an intermittent conveyor, it is not recommended for such a system because of its high cost and moderate complexity. The second gripper is recommended for an intermittent type conveyor because of its simplicity and low cost. Also, this gripper is only suited for this type (Figure 9.13). One issue concerning gripping is that the initial gripping may change the position Chapter 9. Laboratory Implementation ^ 168 Solenoid Push down Spring Plastic Holder Pan Conveyor Bed Side View Front View Figure 9.13: Schematic Diagram of the Intermittent Gripper. Chapter 9. Laboratory Implementation ^ 169 and/or the angle of the fish with respect to the camera (image) coordinates. The latter problem is minimized by introducing holding pans, however, the position problem still remains. One way to approach this problem is to extend the gripping mechanism up to the camera position (as shown in Figure 9.12) so that the fish is gripped prior to imaging. Although the nose, the eye, and the gill are visible to the camera and, therefore, nose-toeye length and nose-to-gill length can be computed with this arrangement, the length and the width cannot be computed because the whole fish not visible. In order to overcome this problem, it may be necessary to introduce a secondary camera or an alternative sensing mechanism. Further consideration of this problem is, however, beyond the scope of this thesis. Chapter 10 Conclusions and Future Work 10.1 Conclusions The developed rule-based method combines a production system with the techniques of probability and statistics. Production systems, on the one hand, are inherently inefficient when manipulating numerical data but extremely efficient in logical deductions. Statistical systems, on the other hand, have well established techniques to manipulate knowledge available in numerical form, but it is poor in logical deductions. The methodology developed in the present research provides method for expressing statistical knowledge in the form of production rules. These rules can be integrated with expert and heuristic knowledge which are readily available in the rule form. The integrated knowledge-based system provides more efficient and accurate classification than a statistical system or an expert system alone. In the development of off-line knowledge, an approach similar to density estimate using Parzen windows is followed. The generation of rules is based on a priori probabilities like Bayesian method, however, unlike Bayesian inferencing, only the regions with significant a priori probabilities are used. Also, in most Bayesian classification systems a priori probabilities are usually determined by a parametric method. In the present approach, however, the connectivities among features are used to determine these probabilities. This is much simpler and it allows for the examination of the significance of the features being used. 170 Chapter 10. Conclusions and Future Work^ 171 On-line inferencing of the rule base system is computationally efficient because only the significant regions of the a priori probability density function are used to build the rule base. Also, the on-line computational complexity is linearly proportional to the number of features. The thesis presents an extensive error analysis of the rule base methodology, and gives a comparison of the rule base error to the Bayesian error. This thesis also discusses several pattern recognition techniques and compares them with the rule-based method. The comparison is achieved by applying these techniques and methodologies to generate cutting contours for head removal in fish processing. The neural networks method produced good results with resubstitution and was adaptable to small variations in the feature space but it was not acceptably adaptive to the new data. The performance of the rule-based system was better than neural network in terms of successful classification, robustness, and speed. Although it produced good results for the leave one out approach, the adaptability to new fish was not adequate. This may be, however, due to lack of training data. This may be the reason for poor adaptability of the neural network as well. Estimation of cutting contour polynomials using multiple regression was very robust but the accuracy of a significant number of contours was not within the specified tolerance level. The major drawback of the k-nearest neighbor algorithm was its poor classification rate. Also, the on-line computation cost of this method is high. This increases with the number of features and the number data samples in the database. All these methods were experimented with the existing data of 200 fish samples. A portion of this data may be unreliable due to the poor quality of static pictures. Methodology have been developed, however, to eliminate noisy data. A prototype laboratory system incorporating a knowledge base, a vision system, and a robot was developed. The Chapter 10. Conclusions and Future Work^ 172 feasibility of the approach is demonstrated to perform well in the laboratory environment. 10.2 Main Contributions of the Thesis In summary, the main contribution of this research is a methodology for rule generation for knowledge-based systems. In particular, rules are generated and prioritized using multivariate probability density distributions of the features. This methodology enhances the reliability and the performance of the system by combining the knowledge available from experts and others with experience in the process, with the knowledge gained from statistical analysis. This research also provides a set of algorithms to compute features for the rulebased system using image data. Comparison of knowledge representation techniques for this type of industrial application is a further contribution of the research. The main contributions to the industry are the new technology and a practical workcell for fish processing. 10.3 Future Work The present method of rule generation uses a square kernel to estimate the density. The density estimate may be improved by using other kernels such as a Gaussian kernel or a triangular kernel. Also, in rule prioritization a constant value is used for one region in the density function, i.e., for one rule. Accuracy of classification can further be improved if a variable priority value is used depending on the distance from the peak value of the high density region of the density function. The transfer of technology to industry and the demonstration of the ability of the system to function as practical workcell in fish processing are also continuing work from Chapter 10. Conclusions and Future Work^ 173 this research. The major component of this work involves developing a robotic cutter that is able to perform a contour cut at the speed specified by the industry. Feasibility of employing water-jet, hot-wire, or reciprocating blade type cutter is being investigated. Nomenclature Symbol Description A(x)^Parameters of an object ao^Fourier series constant (DC) term a,^Coefficients of cutting contour polynomial Fourier series coefficients of the Cosine terms ACk„^Subset i of attribute k for group j ack^Feature or attribute k bi^Output i of neural network Fourier series coefficients of the Sine terms Ch^Class with highest a priori probability C^Class j CH^Chum Salmon Threshold level of relationship density Dimensionality d(x , y)^2-D directional derivative operator Total error of a neural network E (k)^Expected value of k f (x)^Discrete probability distribution of x f (x)^Feature vector x fi^Ratio (nose-eye length)/ (length) 12^Ratio (nose-gill length)/(length) 13^Ratio (width)/(length) 174 Nomenclature^ G(X,Y) Discrete Gaussian function Gx(X, Y) First derivative of Gaussian function g(x,y)^2-D Gaussian function Size of a subrange hki,^Size of a side, corresponding to feature k, of ith hyper-rectangle of ith class Identity matrix Second moment of area of an image about the center /(X, Y)^2-D digital image IM^Index of mismatch, the minimum absolute area between two cutting contours i(x, y)^Continuous 2-D image Chord length of cutting contour Model j rn^Number of features Output of layer j of a neural network Pink Salmon Probability /5^Estimated probability Probability density p(aciC3)^Conditional probability density of feature vector ac given class C3 P3^Number of clusters in group j Estimated probability density Number of classes R(f())^Mapping function between feature vector and class index 175 Nomenclature^ 176 7?^A region in the feature space Real numbers Sockeye salmon k^Target or desired output at the kth layer uz3^Discriminant function between class i and class j V^Volume of feature space Wh(x) Gaussian kernel with parameter (variance) h w,^Window height of a selected region i in feature space wo„^Weights of neural network connections between layer o and layer i X^Coordinate axis X x coordinate of a point in x-y plane Coordinate axis Y y coordinate of a point in x-y plane Momentum term of a neural network Parameters of the multiple regression model 0(ac)^A unit hyper-cube centered at the origin of the feature space Learning rate of a neural network n 3^Median vector of ith cluster of jth class A^Intersection indicator; the value of this is 1 if two clusters intersect, and 0 otherwise Mean vector of class i Angle of rotation ea^Cutting angle of a straight cut V-cut angle Correlation between features i and j Nomenclature^ Et^Covariance matrix of i azz^ Variance of feature i crij^Covariance of features i and j O(ac) A hyper-rectangle centered at the origin of the feature space ith cluster center of ith class 177 Bibliography [Anderson, 1958,19841 Anderson T. W., "An Introduction to Multivariate Statistical Analysis," Wiley, New York, 1958, 1984. [ART, 1984] ART User's Manual, Inference Corp., 5300 W. Century Blvd., Los Angeles, Calif., 1984. [Ayache and Faugerus, 1986] Ayache N., and Faugerus 0. D., "HYPER: A New Approach for the Recognition and Position of Two Dimensional Objects," IEEE Trans. Pattern Anal. Machine Intell., Vol. PAMI-8, pp 44-54, Jan 1986. [Barnea and Silverman, 1972] Barnea D. I., and Silverman H. F., "A class of Algorithms for fast Digital Image Registration," IEEE Trans. Computers, Vol. C-21, pp 179186, Feb 1972. [Birman, 1982] Birman K. P., "Rule-Based Learning for More Accurate ECG Analysis," IEEE Trans. Pattern Anal. Machine Intel!., Vol. 4, No. 4, pp. 369-380, July 1982. [Blackwell and Girshick, 1954] Blackwell D., and Girshick M. A., "Theory of Games and Statistical Decisions," Wiley, New York, 1954. [Blum, 1982] Blum R. L., "Discovery Confirmation and Incorporation of Causal Relationships from Large Time Oriented and Clinical Database: The RX Project," Computers and Biomedical Research, 15, 1982. [BoIle et. al., 1989] Bolle R. M., Kjeldsen R., and Taylor R. W., "Visual Recognition Using Concurrent and Layered Parameter Networks," IEEE Conf. Pattern Recognition, pp. 625-631, 1989. [Bolles and Cain, 1982] Bolles R. C., and Cain R. A., "Recognizing and Locating Partially Visible Objects: The Local-Feature-Focus Method," Int. J. Robotics Research, Vol. 1, No. 3, pp. 57-82, 1982. [Brady, 1982] Brady M., "Parts Description and Acquisition Using Vision," Proceedings of SPIE - Robot Vision, Vol. 336, pp. 20-28, May 6-7, 1982. [Buchanan and Shortliffe, 1975] Buchanan B. G., and Shortliffe E. H., "A Model of Inexact Reasoning in Medicine," Mathematical Biosciences, Vol. 23, pp. 351-379, 1975. 178 Bibliography^ 179 [Buchanan and Shortliffe, 1984] Buchanan B. G., and Shortliffe E. H., "Rule-Based Expert Systems," Addison Wesley Publishing Company, 1984. [Carbonell et. al., 1981] Carbonell J. G., Cullingford R. E., and Gershman A. V., "Steps Toward Knowledge-Based Machine Translation," IEEE Trans. Pattern Anal. Machine Intell., Vol. 3, No. 4, pp. 376-392, July 1981. [Chellappa and Bagdazian, 1982] Chellappa R., and Bagdazian R., "Fourier Coding of Image Boundaries," IEEE Trans. Pattern Anal. Machine Intel!., Vol. PAMI-6, pp. 102-105, 1982. [Chow, 1957] Chow C. K., "An Optimum Character Recognition System Using Decision Functions," IRE Trans. on Elec. Comp., EC-6, pp. 247-254, 1957. [Darwish and Jain, 1988] Darwish A. M., and Jain A. K., "A Rule-Based Approach for Visual Pattern Inspection," IEEE Trans. Pattern Anal. Machine Intell., Vol. 10, No. 1, pp. 56-68, 1988. [de Silva, 1989] de Silva C. W., "Control Sensors and Actuators," Prentice Hall, Englewood Cliffs, New Jersey, 1989. [de Silva, 1990] de Silva C. W., "Research Laboratory for Fish Processing Automation," Proc. 3rd International Symp. on Robotics and Manufacturing, Vancouver, Canada, 1990. [de Silva, 1991] de Silva C. W., "An Analytical Framework for Knowledge-Based Tuning of Servo Controllers," Engng. Applic. Artif. Intell. Vol. 4, No. 3, pp. 177-189, 1991. [de Silva, Riahi, and Gamage, 1991] de Silva C. W., Riahi N., and Gamage L. "Image Processing Techniques for Position and Orientation Measurements in Fish Processing Automation," Modeling and Simulation, Vol. 22, Part 3, Instrument Society of America Publication, pp. 1528-1535, May, 1991. [Dirilten and Newman, 1977] Dirilten H., and Newman T. G., "Pattern Matching Under Affine Transformations," IEEE Trans. Computers, Vol. C-26, pp. 314-317, March 1977. [Draper and Smith, 1966] Draper N. R., Smith H., "Applied Regression Analysis," Wiley, 1966. [Duda and Hart, 1973] Duda R. 0. and Hart P. E., "Pattern Classification and Scene Analysis," Wiley, New York, 1973. Bibliography^ 180 [Duda et. el., 1979] Duda R. 0., Gaschnig J., and Hart P. E., "Model Design in the Prospector Consultant System for Mineral Exploration," Expert Systems in the Microelectronic Age, Ed. Michie D., Edinburgh University Press, 1979. [Dudani et. al., 1977] Dudani S. A., Breeding K. J., and McGhee R. B., "Aircraft Identification by Moment Invariants," IEEE Trans. Computers, Vol. C-26, pp. 39-45, Jan 1977. [ESHED ROB OTEC, 1988] "ACL, Advanced Control Language, Reference Guide," ESHED ROBOTEC Inc., Princeton, NJ, 1988. [ESHED ROBOTEC, 1990] "SCORBOT-ER VII User Manual," ESHED ROBOTEC Inc., Princeton, NJ, 1990. [Faber and Stokely, 1988] Faber T. L., and Stokely E. M., "Orientation of 3-D Structures in Medical Images," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 10 No. 5 pp. 626- 633, September 1988. [Feigenbaum et. el., 1971] Feigenbaum E. A., Buchanan B. G., and Lederberg J., "On Generality and Problem Solving: A Case Study Involving DENDRAL Program," Machine Intelligence 6, pp. 165-190, New York, 1971. [Ferguson, 1967] Ferguson T. S., "Mathematical Statistics: A Decision Theoretic Approach," Academic Press, New York, 1967. [Fisher, 1936] Fisher R. A., "The Use of Multiple Measurements in Taxonomic Problems,"Ann. Eugenics, 7, part II, pp. 179-188, 1936. [Floyd, 1961] Floyd R., "A Descriptive Language for Symbolic Manipulation," Journal of the Association for Company Machinery, 1961. [Forgy, 1985] Forgy C. L., "OPS83: User's Manual and Report," Tech. Report, Production System Technologies Inc., Pittsburgh, Pa, 1985. [Fu, 1982] Fu K. S., "Syntactic Pattern Recognition and Applications," Prentice Hall, Englewood Cliffs, NJ, 1982. [Fu and Young, 1986] Fu K. S., and Young T. Y., "Hand Book of Pattern Recognition and Image Processing," Academic Press Inc., 1986. [Fu(3), 1982] Fu K. S., "Pattern Recognition for Automatic Visual Inspection," Proceedings of the SPIE, May 6-7, pp. 12-19, 1982. Bibliography^ 181 [Carnage and de Silva, 1990] Carnage L., and de Silva C. W., "Use of Image Processing for the Measurement of Orientation with Application to Automated Fish Processing," Proc. of IEEE Industrial Electronics Society, pp. 482-487, November, 1990. [Carnage, 1990] Carnage L., "PhD Research Proposal," Dept. of Mechanical Engineering, UBC, November, 1990. [Carnage and de Silva, 1991al Carnage L., and de Silva C. W., "Model-Based Computer Vision Approach for the Cutting Contour Generation in Automated Fish Processing," Modeling and Simulation, Vol. 22, Part 3, Instrument Society of America Publication, pp. 1544-1551, May, 1991. [Carnage and de Silva, 1991b] Carnage L., and de Silva C. W., "A Model-Based Approach for On-line Extraction of Complex Features Through Image Processing," Advances in Instrumentation, DSC Vol. 30, ASME, NY, pp. 73-78, 1991. [Carnage and de Silva, 1992] Gamage L., and de Silva C. W., "Rule-Based Contour Generation for Robotic Processes," Proc. of IEEE Region 10 International Conference, TENCON '92, Melbourne, Australia, pp. 649-653, 1992. [Carnage et. al., 1993] Carnage L., de Silva C. W., and Cosine R. G., "Statistical Pattern Recognition for Cutter Positioning in Automated Fish Processing," Proc. of IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, Victoria, Canada, pp. 786-789, 1993. [Gayle and Dankel, 1986] Gayle B. G., and Dankel D. D., "RxPERT : An Intelligent Computer System for Drug Interactions," Applications of Artificial Intelligence SPIE, Vol. 635, pp. 599-605, 1986. [Gilmore, 1986] Gilmore J. F., "A Comprehensive Evaluation of Expert System Tools," Applications of Artificial Intelligence iii, SPIE, Vol. 635, pp. 2-16, 1986. [Gottinger, 1988] Gottinger H. W., "Statistical Expert Systems," Expert Systems, Vol. 5, No. 3, August 1988. [Goshtasby, 1985] Goshtasby A., "Template Matching in Rotated Images," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. PAMI-7, No. 3, pp. 338-344, May 1985. [Hand, 1985] Hand D. J., "Artificial Intelligence and Psychiatry," Cambridge, Cambridge University Press, 1985. [Harman, 1967] Harman H. H., "Modern Factor Analysis," University of Chicago Press, Chicago and London, 1967. Bibliography^ 182 [Hays-Roth et. al., 1983] Hays-Roth F., Waterman D. A, and Lenat D., "Building Expert Systems," Addison Wesley, Reading, MA, 1983. [Held and Carlis, 1989] Held J. P., and Carlis J. V., "Data Modeling of Expert Systems," IEEE Expert, spring 1989. [Hoffman et. al., 1986] Hofmann M., Carviedes J., Bourne J., G. Beale G., and Brodersen A., "Building Expert Systems for Repair Domains," Expert Systems, Vol. 3, No. 1, pp. 4-11, Jan 1986. [Horn, 1986] Horn B. K. P., "Robot Vision," MIT Press, Cambridge, MA, 1986. [Howard and Rehak, 1989] Howard H. C. and Rehak D. K., "Interfacing Expert Systems with Databases," IEEE Expert, pp. 65-76, Fall 1989. [Hsieh and Fu, 1979] Hsieh Y. Y., and Fu K. S., "A Method for Automatic IC Chip Alignment and Wire Bonding," IEEE Computer Society Conf. Pattern Recognition and Image Processing, pp. 87-92, Aug. 1979. [Hutchinson et. al., 1989] Hutchinson S. A., Cromwell R. L., and Kak A. C., "Applying Uncertainty Reasoning to Model-Based Object Recognition," IEEE Conf. Pattern Recognition, pp. 541-548, 1989. [IRI, 1988] "SUPRAVISION User's Manual," International Robomation Intelligence, Carlsbad, CA, 1988. [IRI, 1989] "IKS/VR Programming Manual," International Robomation Intelligence, Carlsbad, CA, 1989. [Ishida, 1988] Ishida Y., An Application of Qualitative Reasoning to Process Diagnosis: Automatic Rule Generation by Qualitative Simulation," IEEE Conference on Artificial Intelligence, pp. 124-129, 1988. [Jain, 19891 Jain A. K., "Fundamentals of Digital Image Processing," Prentice Hall, Englewood Cliffs, 1989. [Jain and Hoffman, 1988] Jain A. K., and Hoffman R., "Evidence-Based Recognition of 3-D Objects," IEEE Trans. Pattern Anal. Machine Intel, Vol. PAMI-10, pp 783-802, November 1988. [Kendall and Stuart, 1966] Kendall M. G., and Stuart A., "The Advanced Theory of Statistics," Vol. 3, Hafner, New York, 1966. [Kjeldsen et. al., 1988] Kjeldsen R., Bolle R. M., and Sabbah D., "A Connectionist Approach to Primitive Shape Recognition in Range Data," IEEE Conference on Artificial Intelligence, pp. 70-75, 1988. Bibliography^ 183 [Klein and Breeding, 1979] Klein C. A., and Breeding K. J., "Automatic Optical Identification of Faults in Bubble Memory Overlay Patterns," IEEE Computer Society Conf. Pattern Recognition and Image Processing, pp. 87-92, Aug. 1979. [Lafue and Smith, 1985] Lafue G. M. E., and Smith R. G., "A Modular Tool Kit for Knowledge Management," Proc. Ninth IJCAI, Morgan Kaufmann, Palo, Alto, CA, pp. 46-52, Aug 1985. [Levine, 1985] Levine M. D., "Vision in Man and Machine," McGraw Hill, 1985. [Levine and Shaheen, 1981] Levine M. D. and Shaheen S. I., "A Modular Computer Vision System for Segmentation and Interpretation," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. PAMI-3, No. 5 pp. 540-556, September 1981. [Lin and Chellappa, 1987] Lin C. C., Chellappa R., "Classification of Partial 2-D Shapes Using Fourier Descriptors," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. PAMI-9, No. 5 pp. 686-690, September 1987. [Machuca and Gilbert, 1981] Machuca R., and Gilbert A. L., "Finding Edges in Noisy Scenes," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. PAMI3, No. 1, pp. 103-111, January 1981. [Marr, 1982] Marr D., "Vision," W. H. Freeman and Company, New York, 1982. [Meisel, 1972] Meisel W. S., "Computer-Oriented Approaches to Pattern Recognition," Academic Press, New York and London, 1972. [Modestino and Zhang, 1989] Modestino J. W., and Zhang J., "A Markov Random Field Model-Based Approach to Image Interpretation," IEEE Conf. Pattern Recognition, pp. 458-465, 1989. [Movich, 1979] Movich R. C., "Vision - Controlled Robotic Cell," Proceedings of SPIE Robot Vision, Vol. 336, pp. 59-66, May 6-7, 1982. [Nazif and Levine, 1984] Nazif A. M., and Levine M. D., "Low Level Image Segmentation : An Expert System," IEEE Trans. Pattern Anal. Machine Intell., Vol. 6, No. 5, pp. 555-577, September 1984. [Neyman and Pearson, 1928] Neyman J., and Pearson E. S., "On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference," Biometrica, 20A, pp. 175-240, 1928. [Otsu, 1979] Otsu N., "A Threshold Selection Method from Gray-Level Histograms," IEEE Trans. Syst., Man, Cybern., Vol. SMC-9, No. 1, pp. 62-66, Jan. 1979. Bibliography^ 184 [Pao, 1989] Pao Yoh-Han, "Adaptive Pattern Recognition and Neural Networks," Addison Wesley, 1989. [Perkins, 1978] Perkins W. A., "A Model-Based Vision System for Industrial Parts," IEEE Trans. Comput., Vol. C-27, pp. 126-143, 1978. [Perkins, 1982] Perkins W. A., "Vision System That Learns How to Inspect Parts," Proceedings of SPIE - Robot Vision, Vol. 336, pp. 50- 58, May 6-7, 1982. [Persoon and Fu, 1977] Persoon E., and Fu K. S., "Shape Discrimination Using Fourier Descriptors," IEEE Trans. Syst., Man, Cybern., Vol. SMC-7, pp. 170-179, May 1977. [Porter et. al., 1979] Porter G. B., Cipolla T. M., and Mundy J. L., "Automatic Visual Inspection of Blind Holes in Metal Surfaces," IEEE Computer Society Conf. Pattern Recognition and Image Processing, pp. 83-86, Aug. 1979. [Provan, 1988] Provan G. M., "Model-Based Object Recognition: A Truth Maintenance Approach," IEEE Conference on Artificial Intelligence, pp. 230-235, 1988. [Raudys and Pikelis, 1980] Raudys S. and Pikelis V., "On Dimensionality, Sample Size, Classification Error, and Complexity of Classification Algorithm in Pattern Recognition," IEEE Trans. Pattern Anal. Machine Intell., Vol. PAMI-2, pp 242252, May 1980. [Reddi, 1981] Reddi S. S., "Radial and Angular Moment Invariants for Image Identification," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. PAMI-3, No. 2 pp. 240-242, March 1981. [Reinhold, 1982] Reinhold A. G., "Automatic Vision Inspection of Sheet Metal Parts," Proceedings of SPIE - Robot Vision, Vol. 336, pp. 84- 90, May 6-7, 1982. [Riahi and de Silva, 1990] Riahi N., and de Silva C. W., "Fast Image Processing Techniques for the Gill Position Measurement in Fish," Proc. of IEEE Industrial Electronics Society, November, 1990. [Rice, 1988] Rice J. A., "Mathematical Statistics and Data Analysis," Wadsworth and Brooks, 1988. [Roberts, 1986] Roberts G. A., "An Expert System for Labeling Segments in Forward Looking Infrared (FUR) Imagery," Applications of Artificial Intelligence iii, SPIE, Vol. 635, pp. 50-57, 1986. [Rosenfeld, 1975] Rosenfeld A., "Survey : Picture Processing 1974," Computer Graphics and Image Processing, Vol. 4, pp. 133-155, 1975. Bibliography^ 185 [Rosenfeld and Danker, 19811 Rosenfeld A., and Danker A. J., "Blob Detection by Relaxation," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. PAMI-3, No. 1 pp. 79-92, January 1981. [Rosenfeld and Smith, 1981] Rosenfeld A., and Smith R. C., "Thresholding Using Relaxation," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. PAMI-3, No. 5 pp. 598-606, May 1981. [Rosenfeld and Vanderbrug, 1977] Rosenfeld A., and Vanderbrug, "Coarse-Fine Template Matching," IEEE Trans. Syst., Man, Cybern., Vol. SMC-7, pp. 104-107, Feb. 1977. [Rosenfeld and Vanderbrug(2), 19771 Rosenfeld A., and Vanderbrug, "Two-Stage Template Matching," IEEE Trans. on Computers, Vol. C-26, No. 4, pp.384-393, April 1977. [Rowley, 1986] Rowley D., "Statistical Approaches to the Generation of Rules for Expert Systems," 2nd International Expert System Conference, London, pp. 369-75, 1986. [Rudemo, 1982] Rudemo M., "Empirical Choice of Histograms and Kernel Density Estimators," Scandinavian J. Statistics, Vol. 9, pp. 65-78, 1982. [Rumelhart, et. al., 1986] Rumelhart D. C., Hinton G. E. and Williams R. J., "Learning Internal Representation by Error Propagation in Parallel Distributed Processing," Explorations in the Microstructures of Cognition, Vol. 1., pp. 318-362 MIT Press, Cambridge, MA, 1986. [Schaefer et. al., 1988] Schaefer R. M., Colmer J. S., and Miley N., "CABPRO: A RuleBased Expert System for Process Planning of Assembled Multiwire Cables," IEEE Conference on Artificial Intelligence, pp. 181-186, 1988. [Stefik et. al., 1983] Stefik M., Bell A. G., and Bobrow D. G., "Rule-Based Programming in LOOPS," Tech. Report KB VLSI 82-22 Xerox Corp., Palo, Alto, CA, June 1983. [Sterling, 1979] Sterling W. M., "Automatic Non-Reference Inspection of Printed Wiring Boards," IEEE Computer Society Conf. Pattern Recognition and Image Processing, pp. 93-100, Aug. 1979. [Stockman et. al., 1982] Stockman G. et. al., "Matching Images to Models for Registration and Object Detection via Clustering," IEEE Trans. Pattern Anal. Machine Intel, Vol. 4, No. 3, pp. 229-241, May 1982. Bibliography^ 186 [Suetens et. al., 1989] Suetens P., et. al., "Recognition of the Coronary Blood Vessels on Angiograms Using Hierarchical Model-Based Iconic Search," IEEE Conf. Pattern Recognition, pp. 576-579, 1989. [Suzuki and Arimoto, 1988] H. Suzuki and S. Arimoto, "Self-Organizing Model for Pattern Learning and its Application to Robot Eyesight," IEEE Conference on Artificial Intelligence, pp. 236-246, 1988. [Svedlow et. al., 1979] Svedlow M., McGillem C. D., and Anuta P. E., "Experimental Examination of Similarity Measures and Processing Methods Used for Image Registration," Machine Processing of Remotedly Sensed Data, 1976. [Taubin, 1989] Taubin C., Bolle R. M., and Cooper D. B., "Representing and Comparing Shapes Using Polynomials," IEEE Conf. Pattern Recognition, pp. 510-516, 1989. [Therrieu, 1989] Therrieu C. W., "Decision Estimation and Classification," Wiley, 1989. [Tompsett, 1988] Tompsett C. P., "Education, Training and Knowledge Base Design," Expert Systems, Vol. 5, November, 1988. [Torimoto and Pavlidis, 1975] Torimoto S., and Pavlidis T. "A Hierarchical Data Structure for Picture Processing," Computer Graphics and Image Processing, Vol. 4, pp. 104-119, 1975. [Von Neumann and Morgenstern, 1944] Von Neumann J , and Morgenstern 0., "Theory of Games and Economic Behavior," Princeton University Press, Princeton, N.J., 1944. [Wald, 1950] Wald A., "Statistical Decision Functions," Wiley, New York, 1950. [Watanabe et. al., 1989] Watanabe M., Yamanouchi T., Iwamoto M., and Ushida Y., "CL : A Flexible and Efficient Tool for Constructing Knowledge-Based Expert Systems," IEEE Expert, pp. 41-50, Fall 1989. [Waterman, 1970] Waterman D. A., "Generalization Learning Techniques for Automating the Learning of Heuristics," Artificial Intelligence 1, pp. 121-170, 1970. [Weisberg, 1980] Weisberg S., "Applied Linear Regression," Wiley, 1980. [Whitaker, 1986] Whitaker E. T., "Rule-Based Geometrical Reasoning for the Interpretation of Line Drawings," Applications of Artificial Intelligence iii, SPIE, Vol. 635, pp. 621-627, 1986. [Wilkinson, 1990] Wilkinson L., "SYSTAT : The System for Statistics," SYSTAT Inc., Evanston, IL, 1990. Bibliography^ 187 [Wong and Hall, 1978] Wong R. Y., and Hall E. L., "Scene Matching with Invariant Moments," Computer Graphics Image Processing, Vol. 8, pp 16- 24, 1978. [You, 1986] You Y. C., "Expert System for Model Management," Applications of Artificial Intelligence iii, SPIE, Vol. 635, pp. 17-24, 1986. Appendix A Overview of OPS/83 OPS/83 is programming language for developing expert systems. It supports the ordinary programming paradigm of languages such as C and PASCAL as well as the rule-based paradigm. A.1 Rule-Based Programming An OPS/83 program consists of two components, a collection of IF THEN rules and a global database called working memory. Each rule contains a conditional expression called the rule's LHS (Left Hand Side) and an unconditional sequence of actions called the rule's RHS (Right Hand Side). An LHS consists of one or more patterns. An LIE is considered to be satisfied when every pattern in the LHS is matched with an element from working memory. The rule interpreter executes a production system by performing a sequence of operations called recognize-act cycle. The standard recognize-act cycle is: 1. Match: Evaluate the LHSs of the rules to determine which rules are satisfied given the current contents of the working memory. 2. Conflict Resolution: Select one rule from among the ones with satisfied LHSs. If no rules have satisfied LHSs, halt execution. 3. Act: Perform the operations specified in the RHS of the selected rule. 188 Appendix A. Overview of OPS/83^ 189 4. Go to Step 1. A.2 High Level Programming The high level programming in OPS/83 is very similar to programming in C or PASCAL. It has all the variables, constants, operators data structures, control structures, subprograms, and file structures which are available in a conventional high level language. A.3 Working Memory Working memory consists of a set of record structures called working elements. The elements are created and placed into working memory using the make statement. Elements can be modified by the modify statement and deleted from working memory using the remove statement. A.3.1 Element Declarations Working memory elements are declared exactly like records. type goal = element type : symbol; integer; object : integer; status : integer; ), Appendix A. Overview of OPS/83^ 190 A.3.2 The make, modify, and remove Statements Make statement is used to create new elements and add them to the working memory. make(goal); This will make an element called goal. modify(goal type=find object=fish id=sockeye); This modifies the current contents of goal to the new specified values. rem ove(goal); deletes the element from the working memory. A.4 Rules A rule consists of • rule key word • the name of the rule • open brace (0 • LHS of the rule • right pointing arrow (--0 • RHS of the rule • closing brace (1) Appendix A. Overview of OPS/83^ Example: rule leave { SiG(goal status=end); -remove 8/G }; 191 Appendix B ESHED SCORBOT-ER VII Robotic Manipulator ESHED SCORBOT-ER VII is a five degree of freedom, vertically articulated robot. The robot is controlled by MC68010 based controller which is also manufactured by ESHED ROBOTEC. The robot controller has its own internal control language, ACL, Advanced Control Language, and it offers a wide range of capabilities for programming the robot and/or other axes of automation. Although the robot has only five joints, the controller is able to control eight axes. The ACL is on an EPROM within the controller, and is accessed via the RS-232 port using any ASCII terminal This appendix provides the specifications of the robot, the robot controller, and also provides a description of the ACL. B.1 Robot Specifications Table B.1 gives the specifications of the SCORBOT-ER VII robot. Figure B.1 shows the robot arm axes, and Figure B.2 shows the operating range without the gripper [ESHED ROBOTEC, 1990]. B.2 Robot Controller Specifications The SCORBOT-ER VII controller has four driver cards. There are three cards which drive the robot axes and the gripper, and the other card can be used to drive accessories or other external motors. Table B.2 gives the specifications of the SCORBOT-ER VII 192 Appendix B. ESHED SCORBOT-ER VII Robotic Manipulator^ 193 controller. Table B.1: Robot Arm Specifications. Item Specification Mechanical Structure Vertical articulated robot Five axes of motion Harmonic drive transmission on each axis Operating range: Axis 1: Base rotation 250° Axis 2: Shoulder rotation +130° —35° Axis 3: Elbow rotation +130° Axis 4: Wrist pitch +130° Axis 5: Wrist roll Unlimited (mechanically) +180° (electrically) Radius of operation 851mm (33.5in) with gripper 691mm (27.2in) without gripper End effector Electrical gripper Hard Home Fixed reference positions on all axes Actuators Electrical DC servo motors Transmission Harmonic drives Maximum Workload 2000g (4.41b), including gripper Repeatability 0.2mm (0.079in) Maximum Speed 1000mm/sec (39.37in/sec) Weight Approx. llkg (241b) Flexible Hose Factory installed for pneumatic end effector applications Appendix B. ESHED SCORBOT-ER VII Robotic Manipulator Figure B.1: Robot Arm Axes. Figure B.2: Operating Range of the Robot. 194 Appendix B. ESHED SCORBOT-ER VII Robotic Manipulator ^ Table B.2: Robot Controller Specifications. Specification Comments Real time, Multi-tasking, Terminal is required for stand alone, PID programming stage No. of Servo Axes Maximum - 11 Standard - 8 Groups of Control 11 axes can be divided into 3 Each group of control is independent. Full groups: A and B and a group of independent axes (group C) flexibility in operation. Axes interpolation in groups A and B Axes Drive PWM, 20kHz PWM: Pulse Width Modulation PTP Path Control PTP: Point to Point CP - linear, circular paths CP: Continuous Path calculated by mathematical formulas; e.g., parabola Path Control Paraboloid, Trapezoid, Open With acceleration/ Profiles deceleration. Can be Loop Free Running defined separately for each group of control Interpolation Articulation, linear, Functions circular, math. formula Programmable to a Definable either by speed or Motion Speed by travel time between points percentage value within speed range Open or closed loop, integral, Control Parameters proportional, differential, offset Motorola 68010 CPU 384K bytes EPROM 64K bytes Work RAM 32K bytes - standard User RAM is battery User RAM 128K bytes - expanded backed up (up to 12,000 (optional) lines or 6300 positions) Each position requires No. of Program lines Standard RAM: 2960 prog. lines or 1450 positions (or memory space equal to and positions any combination) two program lines Item Controller Type 195 Appendix B. ESHED SCORBOT-ER VII Robotic Manipulator Item Table B.2 contd. Specification ^ Comments Expanded RAM: 12800 prog. lines or 6375 positions (or any combination); e.g., 4800 lines and 4000 positions No. of programs in Hundreds user RAM Multi-tasking Depends on length of programs. Runs up to 20 independent programs simultaneously. User can edit a program while another program runs. Inputs/Outputs 16 inputs All inputs with LEDs OV-1.5VDC: Input ON 2.5V-24VDC: Input OFF 16 Outputs All outputs with LEDs 4 relay outputs 12 open collector outputs 24VDC maximum Home Limit switch for each axis Up to 11 limit switches Communication RS-232 serial port D25 female connector. XON/XOFF handshake Programming Methods Using any terminal or PC ACL: Advanced Control with terminal emulation Language software. More than 100 commands available. 196 Appendix B. ESHED SCORBOT-ER VII Robotic Manipulator^ 197 Table B.2 contd. Item Specification Positions teaching Using ACL Comments Using a teach pendant Coordinate systems Sensors Emergency Brake Joint coordinates Absolute and relative XYZ coordinates definitions for both Sensors interrupt Sensors are connected capabilities via input tabs. Emergency ON/OFF switch Red light when ON brake is activated Disconnects power to motors and resets the system Protection Hardware: Current limit on Factory adjusted; each axis user adjustable. Automatic fuse on each axis Software: Impact protection Thermal Protection Trouble shooting Upper and lower range limits Factory set; for each axis user definable Built in test routine Activated from terminal/computer Or from TP. Tests servo axis, inputs, outputs, and limit switches. Appendix B. ESHED SCORBOT-ER VII Robotic Manipulator Item ^ Table B.2 contd. Specification Comments LED indicator on each servo Externally visible 198 axis and on each input/output Commands for testing the The commands: performance of each motor in "SET ANOUT[i]" open loop and of each encoder "SHOW ENCO" Power requirements 110VAC, 60Hz 500W maximum +5% Power switching Power ON/OFF switch LED indicators Motor ON/OFF switch for both +12VDC, 2A Regulated, external User's Power Supply tabs for each pole Motors Power Supply +24VDC, 8A Operating Temp. 5 — 40°C (41 — 104°F) Weight Approx. 19kg (421b) Dimensions 490mm(L)X445mm(W)X150mm(H) Not regulated 19.3in(L)X17.5in(W)X5.9in(H) B.3 Programming Considerations The ESHED robot is programmed using a special programming language called ACL Advanced Control Language. ACL provides the following functions: • Directly executes robot commands. • Controls Input/Output data. • Enables the user to program the robot. Appendix B. ESHED SCORBOT-ER VII Robotic Manipulator ^199 • Runs all selected programs concurrently (full multi-tasking support). • Synchronizes the execution of programs. • Provides a simple file management system. Access to ACL is established through an RS-232 port on the controller. Table B.3 gives a list of commonly used commands and their usage. There are over 100 ACL program commands. A detail description of these commands can be found in the ACL Reference Guide [ESHED ROBOTEC, 19881. Command CLR COFF CON CONTINUE COPY DEFINE DEF DELAY DELP DELVAR DIM DIR EDIT END ENDIF ENDFOR FOR Table B.3: ACL Command Summary. Format Description Clears the value of encoder n CLR n Clears all encorders CLR * Turns off servo control to all axes COFF Turns on servo control to all axes CON CONTINUE pry Continues the run of the program pry COPY prgl prg2 Copies user program prgl to a new program prg2 DEFINE vi Defines local variables, up to 8 at once •- • v8 Creates a position name p for group A DEFPA p Creates a position name p for group B DEFPB p DELAY t Halts the program for a specified time (t) DELP p Deletes the position p from the position table Deletes the variable v from the variable table DELVAR v Defines a local vector of n variables DIM v[n] Displays a list of all user programs DIR Starts editing program pry EDIT pry End of a program END End if IF branching section ENDIF End of FOR loop ENDFOR Loop command FOR vl=v2 TO v3 Appendix B. ESHED SCORBOT-ER VII Robotic Manipulator ^ 200 Table B.3 contd. Command Format Description GLOBAL GLOBAL vi Defines a global variable up to 8 at once • - • v8 GOSUB GOSUB prg Passes control to another program (prg) GOTO GOTO label Continues program execution from commands after label HERE HERE p Records the actual location of the robot as an absolute position, in axes encoder coordinates HOME HOME Drives the robot to its home position IF IF vi Checks the condition and branches accordingly condition v2 INIT INIT CONTROL Initializes all system parameters LABEL LABEL n Marks a program label n for use with GOTO command LIST LIST prg Listing of program prg LISTP LISTP Displays user position table LISTPV LISTPV p Displays the coordinates of position p MOVE MOVE p Moves the robot to position p MOVE p t Moves the robot to position p subject to the time t MOVED p Same as MOVE except that the program waits until the robot reaches position p MOVES MOVES pv s e Moves the robot smoothly along a vector of positions pv from start s to end e PRINT PRINT "str" Displays a string "str" on the screen Appendix B. ESHED SCORBOT-ER VII Robotic Manipulator ^ 201 Format Table B.3 contd. Description PRINT v Displays the value v on the screen READ READ "str" v Displays the string "str" and inputs the value v RUN RUN pry Executes the program pry SET SET vl=v2 Assigns value v2 to vi SETP SETP pl=p2 Assigns value of position p2 to pl SETPV SETPV p a v Sets a joint value for axis (a) in the position p SETPVC SETPVC p c v Sets a cartesian value for coordinate c of position p SHIFT SHIFT p BY a v Adds value v to axis joint (a) of position p SHIFTC SHIFTC p BY v c Adds value v to cartesian coordinate (c) of p SHOW SHOW DIN Displays status of all digital input lines SHOW DOUT Displays status of all 16 digital output lines SHOW ENCO Displays values of all encoders SHOW SPEED Displays current speed setting SPEED SPEED v Sets speed to value v STOP STOP pry Stops execution of user program pry STOP stops execution of all running programs TRIGGER pry Sets the condition for the execution of the program BY IN/OUT n s pry. The condition is an I/O state s on line n WAIT vi Suspends program execution until the condition is Command TRIGGER WAIT condition v2 satisfied Appendix C SD512 Vision System The SD512 Development Workstation is a UNIX' machine, running under the REGULUS2 operating system. REGULUS is a UNIX-compatible operating system which, unlike the standard UNIX, has a real-time kernel [IRI, 1988]. In order to facilitate vision application program development, the IRI (International Robomation Intelligence) REGULUS version is enhanced by the IKS/VR vision procedure library and its interactive equivalent, INIKS. In addition, IRI provides program development tools such as VSL Vision Support Library to supplement IKS/VR [IRI, 1989]. C.1 System Configuration and Specifications The basic SD512 vision system consists of three circuit boards, called SCPU (SUPRAVISION CPU), FBB (Frame Buffer Board), and IPB (Iconic Processor Board). These components of the SD512 workstation are described below, while Figure C.1 shows a logical block diagram of the main components. • The SCPU encompasses the vision system CPU with its coprocessor, the memory management unit (MMU), up to 8MB DRAM, the I/O interfaces, and all other necessary "glue logic". The CPU is 32 bit MC68020 and the coprocessor is MC68881. I/O interfaces include SCSI for hard disks and streamers, floppy interface, eight 1UNIX is a trademark of AT&T 2REGULUS is a trademark of ALCYON, Inc. 202 Appendix C. SD512 Vision System^ 203 RS232 serial channels, and one parallel Centronix interface. Also, there are seven single line vectored interrupts and two independent timers. • The FBB accommodates the frame buffer memory plus some more memory for look up tables. In addition, the FBB contains all the special circuits needed to handle the video input, synchronize the cameras with the system, and generate the video output for monitors. There are two independent frame buffer memories on FBB with 1MB each, organized to store images of variable resolution. These frame buffers operate in conjunction with a variable resolution digitizer, which can be programmed to interface to matrix cameras with 256 x 256, 512 x 512, or 1024 x 1024 resolution. The content of any frame buffer or a segment of it can be displayed on a monitor screen. • The IPB comprises three iconic processors to perform a variety of iconic operations. These include feature enhancement, image segmentation, and feature extraction. C.2 Available Functions C.2.1 Feature Enhancement Monadic point transformations • Iselm ta b(s,t) Selects a monadic table, s - pixel stream, t - table • Im kmtab(t,f(p)) Makes a monadic table, t - table, f(p) - function of pixel values. Diadic point transformations • ladd() Computes the arithmetic sum of each pair of corresponding pixel values in Source 1 and Source 2 windows. Appendix C. SD512 Vision System^ Figure C.1: SD512 System Block Diagram. 204 Appendix C. SD512 Vision System^ 205 • Isub() Computes the arithmetic difference of each pair of corresponding pixel values in Source 1 and Source 2 windows. • land() Computes the logical AND of each pair of corresponding pixel values in Source 1 and Source 2 windows. • lor() Computes the logical OR of each pair of corresponding pixel values in Source 1 and Source 2 windows. • lcopy() Copies the contents of Source 1 window to Source 2 window. Convolution • Iconv30 Performs 3 x 3 convolution. The coefficient kernel is specified by lcoef, e.g., C_3LAPALACE, C_3GAUSS, and so on. • Iconv50 Performs 5 x 5 convolution. • Iconv70 Performs 7 x 7 convolution. C.2.2 Image Segmentation Labeling • Ilabel() Detects blobs (sets of 8 connected adjacent pixels with identical value) in Source 1 window and assigns each one a label. • Ibbox0 Calculates the minimum-sized rectangle enclosing each labeled blob in the Source 1 window. The coordinates of the bounding boxes are stored in the arrays Ibbrminn (min row), Ibbrmax[] (max row), IbbcminU (min column), and Ibbcmax[] (max column). Appendix C. SD512 Vision System^ 206 C.2.3 Feature Extraction The histogram • 'hist() Computes the frequency of occurrence of each pixel value in the Source 1 window. • ImaskhistO Simultaneously computes a histogram in up to 16 arbitrary-shaped areas. The profile • Ivprof0 Computes a vertical profile of the Source 1 window by summing the pixel values in each column. • Ihprof0 Computes a horizontal profile of the Source 1 window by summing the pixel values in each row. • Ineprof0 Same as above but in the North-East direction. • Inwprof0 Same as above but in the North-West direction. • Imaskprof0 Computes the profile in which mask defines the regions where the pixel values are summed. The pixel statistics memory • Ipsm[i] Contains the results of several iconic processor functions. These functions are; histograms; profiles; labeling; and bounding box. Appendix C. SD512 Vision System^ 207 Run-length encoding • 'runlength() Finds the transitions from black to white or white to black in a binary image and stores their coordinates in a list for further analysis. Moments • Imoments0 Computes the moments (zero through third order) of the image in Source 1 window. C.2.4 Template Matching and Correlation • Iptcorr0 Computes the point correlation of the images in the Source 1 and Source 2 windows. • Inormcorr0 Computes the normalized correlation of the contents of the Source 1 and Source 2 windows. C.2.5 Pyramid Processing • Icomp21O Compresses the dimensions of an image by a factor of two. A 3 x 3 convolution reduction is performed (filtering and sampling). • Icomp4_10 Compresses the dimensions of an image by a factor of four. A 5 x 5 convolution reduction is performed (filtering and sampling). • Isamp41. Shrinks the dimensions of the Source 1 window by selecting every fourth pixel. • lzoom() Enlarges the dimensions of an image by a factor of two. Appendix C. SD512 Vision System^ 208 C.2.6 Graphics Plotting fuctions • !pixel() Changes the value of a pixel; the value and the coordinate are specified. • Hine() Draws a line between the specified start and end coordinates; the pixel value is also specified. • Irect() Draws an outline of a rectangle; upper-left and lower-right coordinates and the pixel value are specified. • lbox() Same as Irect() but draws a filled rectangle. • 'circle() Draws a circle; center coordinates, radius, and the pixel value are specified. Text • Ichar() Prints a character; font is specified. • [text() Prints a string of characters; font is specified. C.2.7 Video Control and Image Acquisition • Ilive0 Turns video going directly from the camera to the monitor on or off. • 'snap() Captures an image from the currently selected camera video port into the currently selected frame buffer. • 'camera() Selects the active camera port. • 'frame() Selects the active frame buffer. • lvideo() Turns the video display on or off. Appendix C. SD512 Vision System^ C.2.8 IK S/VR Variables • Isrc1 Source 1 window upper-left corner. • Isrclb Source 1B window upper-left corner. • Isrc2 Source 2 window upper-left corner. • Idst destination window upper-left corner. • Idstb destination B window upper-left corner. • lwrk Work window upper-left corner. • Irows Number of rows in window. • 'cols Number of columns in window. • Icoef Coefficient array index. • Imisc Misc parameter. • Isres1 Source 1 resolution. • Isreslb Source 1B resolution. • Isres2 Source 2 resolution. • Idres Destination resolution. • Idresb Destination B resolution. • lwres Work resolution. 209 Appendix C. SD512 Vision System^ 210 C.3 Application Software Development Application software on SD512 is developed using the C language. Each and every program that uses an IKS/VR function or variable should have the following line included in the program file. #include <iksvr.h> This header file declares variable and function return value data types, defines constants, and defines macros. The most important IKS/VR function is Iopen(). This function must be the first IKS/VR function called by a program that uses any other IKS/VR function or variable. It is important to check the return value of Iopen() just to make sure that the vision hardware is working correctly. A return value of zero or more indicates that everything is working correctly. A negative value means that something is wrong and the application program should terminate. The following example illustrates how the IKS/VR functions and variables can be combined to build an application program. In particular, this example captures two images and subtracts the second image from the first and the resulting image is displayed. The first image is captured into frame buffer 0, the second image is captured into frame buffer 1, and the result is stored in frame buffer 3. #include <iksvr.h> main() if (lopen() < 0) Appendix C. SD512 Vision System^ 211 exit(-1); Ilive(ON); ^ /* switches video on */ printf(''Press ENTER to snap the first image\n"); getchar(); Ifra me(0); /* selects frame 0 */ /* snaps the first image */ lsnap(); Ilive(ON) printfC Press ENTER to snap the second image\n"); getchar(); Iframe(1); /* selects frame 1 */ /* snaps the second image */ Isnap(); Isrc1^lwloc(0,0,0); /* set srcl to frame 0, row 0, column 0 */ Isrc2^lwloc(1,0,0); /* set src2 to frame 1, row 0, column 0 */ Idst = lwloc(3,0,0); Isub(); /* set dst to frame 0, row 0, column 0 */ subtract src2 from src1, put result in dst */ Ifra m e(3) /* select destination frame */ It is necessary to compile the program and link the object program with the required libraries in order to create the executable program. Suppose that the source program was saved in a file called main.c. The following command is used to compile and link main. c: cc main .c -I/usr/include/iri -liksvr Appendix C. SD512 Vision System^ 212 If the source program is successfully compiled and linked, the executable program will be saved as a.out. Running the program is then just a matter of typing a.out. Appendix D Interface Circuitry and Communication Software This appendix describes the electronic circuits and communication software designed and implemented for the prototype butchering machine. Section D.1 describes the interface circuit that was built to establish communication between the vision station and the robot controller. Section D.2 describes the circuit built to establish synchronization of the conveyor with the vision software. Software developed for communication and synchronization is discussed in Section D.3. D.1 The Interface Circuit An interface circuit between the vision station and the robot controller was built to establish communication between the two. A parameter command from the vision software is converted to two 7 bit digital signals and transferred to the robot controller through this interface circuit shown in Figure D.1. SD512 sends data out through its Special Input/Output Port (SPIO). The 8th bit of each data byte sent out is used as a strobe signal, and it indicates the robot controller that a data set is ready to be read. D.2 Synchronization Circuit Synchronization of the vision software with the conveyor is established through the synchronization circuit shown in Figure D.2. A trigger switch mounted on the conveyor gives 213 Appendix D. Interface Circuitry and Communication Software ^ 214 Ul - 74LS244 U2 - 74LS244 Figure D.1: Interface Circuit between the Robot Controller and SD512. Appendix D. Interface Circuitry and Communication Software ^ 215 an ON/OFF signal depending on whether there is a fish on the conveyor to be imaged. The vision software polls for this signal and proceeds with the processing, if an active signal is detected. The signal obtained from the trigger switch is very noisy. Therefore, two retriggerable monostable multivibrators are serially used to overcome this problem. The first multivibrator is triggered to the first low going edge and masks any other edges for a period of about 200ms. This period can be adjusted by changing the variable resistor R2 in Figure D.2. The second multivibrator is triggered by the high going output of the first multivibrator and outputs a pulse. The duration of this pulse is about 5ms. D.3 Communication and Synchronization Software There are three parameters that must be sent by the vision station to the robot controller. • Length of the fish • Width of the fish • Model No. The length and the width parameters are each converted to two 7 bit words. The model number is converted to one 7 bit word because it ranges only from 1 to 5. The 8th bit of each byte, as mentioned above, is used as a Strobe signal. The vision system sends the first byte of a parameter and waits for an acknowledge (ACK) signal from the robot controller. It sends the next byte of the parameter if a successful ACK is received. This process is continued until all five bytes, corresponding to the above parameters, are sent to the robot controller. Appendix D. Interface Circuitry and Communication Software ^ 216 RI - 510 RI - 10k R3 - 200k CI - 10p1 C2 - 0 11AF C3- lpF U3 - 74LS123 Figure D.2: Synchronization Circuit of the Conveyor and the Vision Software. Appendix D. Interface Circuitry and Communication Software ^ 217 A synchronization pulse from the synchronization circuit is polled from the vision software. An active pulse from the circuit will cause the vision system to capture an image and perform the relevant analysis. This signal is input through one of the digital input channels of SPIO. A high-level description of the communication and synchronization software is as follows: • Repeat for each fish 1. Initialize the SPIO port 2. Wait for the trigger signal 3. Perform image processing 4. Repeat for length and width parameter — Compute the two words (7 bit) of the corresponding parameter — Activate Strobe and send the first byte — Wait for ACK — Activate Strobe and send the second byte 5. Compute the word (7 bit) for the model number 6. Activate Strobe and send the first byte 7. Wait for ACK Figure D.3 shows the complete wiring diagram of the interface circuit and the synchronization circuit. Appendix D. Interface Circuitry and Communication Software ^ 218 Figure D.3: Complete Wiring Diagram of the Interface and the Synchronization Circuit.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- A model-based approach to complex contour generation...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
A model-based approach to complex contour generation for process automation using computer vision Gamage, Lalith D. K. B. 1993
pdf
Page Metadata
Item Metadata
Title | A model-based approach to complex contour generation for process automation using computer vision |
Creator |
Gamage, Lalith D. K. B. |
Date Issued | 1993 |
Description | The research described in this thesis addresses the development of methodology for fast extraction of complex processing information of an object using relatively simple on-line measurements and prior knowledge and information that can be generated off-line on the object. Even though the approaches investigated here are quite general, the techniques are developed with respect to the specific application of flexible automation of a fish processing cell. Robotics and computer vision have been combined to produce useful results in a variety of applications. Vision guidance may be applied at least in two ways here. Cutting contours of fish could be generated off-line and these could be used to generate the reference signals for the cutting controller. Alternatively, cameras mounted on the cutter could be used to guide the cutter in real time. The present research concentrates on the first approach since the objective is to generate the cutting contour sufficiently fast to meet the process speed requirements. The complexity of the vision problem is eased to some extent by the fact that the object (fish) is already recognized, at least in a generic sense. Low-to-medium-level computer vision techniques involving image transformation, enhancement and analysis of basic features such as edges, regions, shape, colour and texture, are reasonably well established and widely applied. The application of these techniques to directly generate the cutting contour of an arbitrary fish would be computationally slow and unacceptable in terms of process speed requirements. The present research effort is directed at expediting vision-based generation of cutting contours through the use of model-based vision techniques. One of the goals of this research is to develop a knowledge-base and a model-base to support the complex feature extraction procedure. Use of low-level image analysis techniques to obtain non-complex features at high speeds from the fish images is a further goal, as this is related to the first goal. In the model-based approach, the fish models are generated using a representative database of measurements on fish. The data includes a cutting contour and some dimensional measurements which are referred to as attributes or features. The cutting contours are non-dimensionalized, transformed, and grouped into an appropriate number of models using a systematic grouping procedure. Each model contains a non dimensional cutting contour for the corresponding group of fish, and a set of attributes. In real-time operation, the measured attributes have to be matched to the model. The main contribution of this research is the methodology for the generation of rules for model matching or classification. The a priori probability distribution of attributes in each group is used to generate the rules for the model that corresponds to the group. Rules generated for all models are collected in a rule base and used for classification. A systematic method for integrating expert and heuristic knowledge is developed in order to improve the efficiency of the classification process. An extensive process of error analysis of the present methodology is also presented. The techniques developed in the present research were implemented in a prototype fish processing cell that is being developed in the Industrial Automation Laboratory. This cell consists of a vision station, a knowledge-based system, a robotic cutter, and a motorized conveyor unit. In a heading (head removal) operation, each fish on the conveyer is imaged at the vision station and the corresponding cutting contour is determined with the help of the knowledge-based system. This cutting contour is transformed into a drive signal for the robotic cutter. At the cutting station, a fish will be gripped and cut according to the trajectory generated for that particular fish. In the present prototype system, however, the robot draws the corresponding cutting contour on a board placed on the conveyor bed. The vision system, the cutter/gripper controls, and the conveyer controls communicate and synchronize the activities of various components in the fish processing system. Specific ways of transferring the new technology to the Canadian fish processing industry are currently being explored. This research also compares the performance of the developed rule-based methodology with other classification and estimation methodologies. These include neural networks, Bayesian inferences, k-nearest neighbour method, and multiple regression. The results of these comparisons are presented. Also, as a preliminary requirement for implementation, algorithms have been developed for on-line measurement of the attributes using low-level image analysis techniques. |
Extent | 11141715 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Date Available | 2008-09-17 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0080874 |
URI | http://hdl.handle.net/2429/2154 |
Degree |
Doctor of Philosophy - PhD |
Program |
Mechanical Engineering |
Affiliation |
Applied Science, Faculty of Mechanical Engineering, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 1993-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-ubc_1993_spring_phd_gamage_lalith.pdf [ 10.63MB ]
- Metadata
- JSON: 831-1.0080874.json
- JSON-LD: 831-1.0080874-ld.json
- RDF/XML (Pretty): 831-1.0080874-rdf.xml
- RDF/JSON: 831-1.0080874-rdf.json
- Turtle: 831-1.0080874-turtle.txt
- N-Triples: 831-1.0080874-rdf-ntriples.txt
- Original Record: 831-1.0080874-source.json
- Full Text
- 831-1.0080874-fulltext.txt
- Citation
- 831-1.0080874.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0080874/manifest