UBC Theses and Dissertations


ROLL: a robotic on-board localization system using landmarks. Brewster, Jeffery Charles, 1996.

Full Text


ROLL — A ROBOTIC ON-BOARD LOCALIZATION SYSTEM USING LANDMARKS

by

JEFFERY CHARLES BREWSTER

B.Eng., Carleton University, 1992

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, Department of Computer Science

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

December 1996

© Jeffery Charles Brewster, 1996

In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Computer Science
The University of British Columbia
2366 Main Mall
Vancouver, BC
Canada V6T 1Z4

ABSTRACT

This thesis presents the design of a system for robot localization named ROLL. The ROLL system was designed for use in a dynamic real-time robot testbed, where it could provide location information to experimental robot control architectures.

Many localization systems for robots already exist, using a wide variety of techniques and sensors. All of them, however, either require custom computer hardware, run far too slowly, or require the robot to stop for a period of time before the localization system can produce output. The system presented in this thesis performs localization without requiring the robot to stop, and without the need for expensive custom hardware.

The ROLL system uses a forward-facing camera mounted on the robot to detect special landmarks placed in the environment.
From the shape of a landmark in the camera image, the position of the camera relative to the landmark can be calculated. This position can then be transformed into absolute frame coordinates using the known absolute frame position of the landmark.

To test the ROLL system, it was implemented in the ACE programming environment on a stand-alone workstation. This implementation was fed sequences of stored images containing a visible landmark, simulating a real robot trajectory. The system was also implemented on a Texas Instruments TMS320C40 DSP board to demonstrate that the algorithm can run at acceptable real-time speeds.

The system detected the landmark in all but one of the images in two series of 11 images each. The position of the camera was calculated to within 4.2 cm of the known position for each image. The TMS320C40 implementation provided location information at an average rate of approximately 8 Hz.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS

1. Introduction
   1.1 Robot Localization
       1.1.1 What is Robot Localization?
       1.1.2 Why is Localization Necessary?
   1.2 Context
   1.3 Work
   1.4 Summary Of Results
   1.5 Overview

2. Previous Work
   2.1 Localization System Tasks
   2.2 Localization System Characteristics
       2.2.1 Coordinate Systems
       2.2.2 Landmarks
       2.2.3 Sensing System
       2.2.4 Processing System
       2.2.5 Maps
       2.2.6 Multiple Views
       2.2.7 Partial Views
       2.2.8 Landmark Tracking
       2.2.9 Dynamic vs. Static World
       2.2.10 Stop and Sense vs. Continuous Sensing
       2.2.11 Accuracy
       2.2.12 Robustness
       2.2.13 Integration With An Odometry System
   2.3 The Ideal Localization System
   2.4 Localization With Naturally Occurring Landmarks
       2.4.1 Eric Krotkov
       2.4.2 AECL Ark
   2.5 Localization With Fabricated Landmarks
       2.5.1 Global Positioning System (GPS)
       2.5.2 Murata and Hirose
   2.6 Camera Calibration
       2.6.1 The Ideal Camera Model
       2.6.2 Tsai - Camera Calibration

3. System Outline
   3.1 System Specifications
   3.2 Design Highlights
       3.2.1 General Approach
       3.2.2 Localization With A Single Landmark
       3.2.3 Landmark Design
       3.2.4 Landmark Detection
       3.2.5 Landmark Identification
       3.2.6 Plant Modeling
   3.3 System Overview
       3.3.1 Task Decomposition

4. System Detail
   4.1 The ACE Environment
       4.1.1 Events, Actions and Components
       4.1.2 ACE Configurations
       4.1.3 Design and Development with ACE
   4.2 The ROLL Configuration
       4.2.1 The Main Pipeline
       4.2.2 The Supplementary Pipeline
       4.2.3 Loop Control
   4.3 Data Types
       4.3.1 Image Plane Coordinate Data Types
       4.3.2 Frame Buffer Coordinate Data Types
       4.3.3 Cartesian Coordinate Data Types
       4.3.4 GreyScaleImage Data Type
   4.4 Components
       4.4.1 The Frame Grabber
       4.4.2 The Map
       4.4.3 The Landmark Predictor
       4.4.4 The Landmark Detector
       4.4.5 The Relative Position Calculator
       4.4.6 The Absolute Position Calculator
       4.4.7 The Position Estimator
   4.5 Embedded System Implementation

5. Results And Evaluation
   5.1 Localization System Evaluation
       5.1.1 Issues / Difficulties
       5.1.2 Experiment Procedure
       5.1.3 Obtaining a Sequence of Images
       5.1.4 Camera Calibration Procedure
       5.1.5 Camera Calibration Results
       5.1.6 Localization Test Run Procedure
       5.1.7 Analysis Of Localization Test Runs
   5.2 ACE Environment Evaluation
       5.2.1 Advantages
       5.2.2 Disadvantages

6. Conclusions
   6.1 Specifications Evaluation
       6.1.1 Continuous Sensing
       6.1.2 Real-Time Performance
       6.1.3 Minimal Hardware
       6.1.4 Single Camera
       6.1.5 Localization Range Requirements
       6.1.6 No Odometry Required
   6.2 Viability
       6.2.1 Single Landmark Localization
       6.2.2 The Current ROLL System
       6.2.3 ACE for Image Processing
   6.3 Future Work
       6.3.1 Robustness Testing
       6.3.2 A Scale Invariant Landmark
       6.3.3 Large Scale Maps
       6.3.4 Full Real-Time Implementation

BIBLIOGRAPHY
APPENDIX A CONSTANTS
APPENDIX B TEST RUN #1 LOG
APPENDIX C TEST RUN #2 LOG

LIST OF TABLES

Table 1 Simulated Test Run #1 - Full Light
Table 2 Simulated Test Run #2 - Half Light

LIST OF FIGURES

Figure 1 The Ideal Camera Model
Figure 2 Camera geometry with perspective projection and radial lens distortion
Figure 3 The First Landmark Design
Figure 4 The Landmark As It Appears In The Ideal Image Plane
Figure 5 Calculating Relative Position For Landmark Type 1
Figure 6 Image Plane Parallelism
Figure 7 Landmark Design 2
Figure 8 Calculating the Relative Position of Landmark Type 2
Figure 9 The Corner Detectors
Figure 10 The Second Landmark Type
Figure 11 Landmark Identification Marker
Figure 12 System Structure
Figure 13 The ROLL Configuration
Figure 14 Frame Grabber Component
Figure 15 The Image Sequence Loader Component
Figure 16 The Map Component
Figure 17 The Landmark Predictor Component
Figure 18 The Landmark Detector Subconfiguration Interface
Figure 19 The Landmark Detector Sub-Configuration
Figure 20 The Frame Landmark Splitter Component
Figure 21 The Ellipse Detector Component
Figure 22 The Landmark Shape Verifier Component
Figure 23 The Landmark Identification Verifier
Figure 25 The Absolute Position Calculator Component
Figure 26 The Position Estimator Component
Figure 27 The Physical Implementation
Figure 28 The Camera Mounting Platform
Figure 29 Optical Bench Motion Stage Setup
Figure 30 The Calibration Target
Figure 31 Test Run #1 - Full Light
Figure 32 Test Run #2 - Dark
Figure 33 Landmark Not Detected - Test Run #2

ACKNOWLEDGEMENTS

I would first like to thank my lovely wife, Mari-Jane, for her inexhaustible patience during the last few years. In the darkest moment, when I thought I had lost everything, she calmed me with her words, even though heavy sedatives might have been more appropriate.

For their continuing encouragement I would also like to thank my family, Scott, Cheryl and Mike Brewster, as well as Mari-Jane's family, Michael and Nora Medenwaldt and Betty Tasaka.

My gratitude to my supervisor, Jim Little, who possesses an astounding ability to maintain high levels of optimism and enthusiasm. There were many months when I had to draw upon his reservoirs of hope when mine were depleted.

I would like to thank the reviewing reader, David Lowe, for taking the time to review this thesis.

Rod Barman and Stewart Kingdon contributed significant technical prowess, without which this thesis would have been impossible.

Many thanks to my current employer, International Submarine Engineering, and my managers Eric Jackson and Mike Borghardt for the flexible scheduling that allowed me to complete my thesis.
The majority of financial support for this thesis was provided by the Natural Sciences and Engineering Research Council. Many thanks to Betty Tasaka for her personal financial contributions; it is very likely this thesis would not have been completed without her support. Thanks to my employers International Submarine Engineering for their contribution towards my tuition costs.

Chapter One - Introduction

1. Introduction

This thesis presents a design for a mobile robot localization system named ROLL. ROLL is an acronym for Robot On-board Localization using Landmarks. The ROLL system uses on-board video sensing to determine the position of the robot in world coordinates. The position calculations are based on the shape of special landmarks that appear in the image.

The emphasis of this thesis is on the design and evaluation of the system in the simulated testbed implementation. Although the primary topic is robot localization, the emphasis is not on providing a rigorous proof of any new theoretical material on that topic. The ACE development environment itself is not presented as part of the work done for this thesis, but its impact on the design of the system warrants a description of ACE and a discussion of its effect on the design methodology.

1.1 Robot Localization

Robot localization is the primary topic of this thesis. Localization deals with finding the current position of a robot in its environment. The topic is related to camera calibration and mobile robot navigation; camera calibration and its relation to robot localization is discussed in the chapter on previous work.

Robot localization is usually considered a component of navigation. A navigation system usually has components to perform path planning, localization, obstacle avoidance and driving control. For specific systems, localization may be tightly integrated with other components.
For instance, a real-time localization system may provide the location information for the main driving control loop. If an obstacle avoidance system uses the same sensors as the localization system, then the obstacle avoidance system may be integrated with the localization system.

The localization system presented in this thesis makes no assumptions about any overlying navigation system. Its design is completely modular, but is open enough that it could easily be modified for tight integration with any other navigation component, should a suitable match be found.

1.1.1 What is Robot Localization?

Robot localization, or camera localization, is the process of determining the location of the robot, or camera, from sensed information. While the general localization problem is 3-D, with six degrees of freedom, this thesis is concerned only with 2-D localization, with three degrees of freedom. Krotkov [Kro89], who is also concerned with 2-D localization, provides the following definition of localization:

Let W be a fixed two-dimensional reference frame, and let R be a two-dimensional reference frame attached to the mobile robot. ... For a rigid robot, the transformation from W to R has three parameters: two for translation and one for rotation. These parameters define the pose T = (x, y, θ) of the robot.

We assume that the robot is equipped with a camera, and that it moves on a flat surface. Further, we assume that the robot has a map of its environment. The map marks the positions, referred to W, of vertically oriented parts of objects such as doors, desks, equipment racks, etc. The robot is to take an image and extract vertical edges from it. These edges by themselves do not contain enough information to locally determine which vertical edge corresponds to which point on the map. The problem is to identify T by establishing the correspondence between the vertical edges and points in the map.

Krotkov's definition was given specifically for his method of localization. It includes references to video and to vertical landmarks, which are not present in every localization system. A more general definition can be derived from it that applies to nearly all two-dimensional localization systems.

Let W be a fixed two-dimensional reference frame, and let R(t) be a two-dimensional reference frame attached to the robot. For a rigid robot, the transformation from W to R(t) has three parameters: two for translation and one for rotation. These parameters define the pose T(t) = (x(t), y(t), θ(t)) of the robot.

The robot has some sensory information I(t), and some information about its present environment M(t). The problem is to identify T(t) from I(t) and M(t).

The latter definition generalizes the definition of localization to include systems that do not use video. It even applies to localization systems that do not use landmarks, or that do not start with a map of landmarks, given by the cases M(t) = {} and M(0) = {} respectively.

1.1.2 Why is Localization Necessary?

Localization is necessary for any task in which the robot needs to know its precise position relative to some object or location. Typically this information is required for precise path planning, or sometimes for object manipulation. For example, the task of pushing an object to a specified region requires that the robot know its position relative to the specified region and the position of the object relative to the camera. The former is obtained by a localization system, while the latter is obtained by an object recognition system, which may or may not use the same sensing information as the localization system.

1.2 Context

This thesis was motivated by the work done at the University of British Columbia Laboratory for Computational Intelligence on a dynamic physical testbed for robot control architectures. Sahota [Sah93] describes a testbed in which remote-control model race cars are controlled by computer to play soccer against each other.
In the current system, localization information is provided by an all-seeing off-board camera mounted above the playing field. Each robot is given a continuous stream of location information for the whole system, including the location and heading of itself, its opponent, and the soccer ball. This system has been successful in proving the capability of an architecture to operate in a multiple-agent, real-world, dynamic environment.

The effort in this thesis is based on the desire to add more complexity to this testbed. If each robot were equipped with its own on-board sensing system, new stresses would be placed upon the architecture. The architecture may then have to accommodate any issues presented by a specific sensing system. For example, most on-board sensing systems do not provide complete information like the off-board camera in the current testbed. Sensing information may vary in availability and quality; it may arrive at unpredictable intervals, and with varying degrees of accuracy. For sensing that is not robust, the controller must be able to deal with failure of the sensing system. This may entail performing some procedure in order to reinitialize the sensing system, like a football player who must regain his bearings after a nasty tackle. In effect, an architecture would have to be evaluated in conjunction with the sensing system that it uses.

The ROLL design has been implemented on two platforms. The first is Windows NT, using a real-time data-flow environment called ACE as a simulation testbed; the second is a real-time partial implementation using a single Texas Instruments TMS320C40 (hereafter a C40) DSP processor and a cylindrical robot, named Spinoza, from Real World Interfaces Ltd. The latter implementation serves only to provide proof of the real-time, real-world practicality of the design. At the time of writing, no significant evaluation of the performance of the latter system could be performed, although some rudimentary qualitative evaluation is given.

1.3 Work

The purpose of this thesis is to provide a design for an on-board localization system for a robot in a dynamic environment. The design incorporates continuous sensing and location calculation from a single landmark, with minimal processing hardware. Though the design is implemented in a simulated environment, a physical implementation of the system is provided as proof of the viability of the design for real-time localization.

The thesis has a sub-emphasis on the implementation of systems using asynchronous data-flow paradigms instead of procedural decomposition. This issue is explored in the ACE environment, property of International Submarine Engineering Ltd. This design approach conveys numerous advantages to designing and documenting a control system.

1.4 Summary Of Results

Two test runs were performed in the simulated implementation with a landmark height of 10 cm. These tests showed that the system was capable of detecting landmarks from at least 1.5 m away. The maximum error in the two test runs was approximately 3.0 cm tangential error and 3.1 cm radial error.[1] The maximum orientation error was 0.0021 radians, or 0.12 degrees. The landmark was successfully detected in all twelve full light images. The landmark was not detected in one of the twelve half light images.

One of the test runs was performed with all laboratory lights on, and the second was performed with the lights off. The first test run resulted in an average error of (-0.5, -0.01, 0.005),[2] with maximum errors of (3.0, 1.1, 0.018) and standard deviations of (1.32, 0.46, 0.0091). The second test run resulted in an average error of (0.8, 1.3, 0.007), with maximum errors of (2.9, 3.1, 0.021) and standard deviations of (1.47, 1.06, 0.0111). These numbers show some deterioration in the darker test run, although the system clearly still operates under vastly different lighting conditions.

The ACE environment was successful in illuminating the elements of the design of the system, which facilitated the development and debugging of the system. The disadvantage of the ACE system is that it isolated the system from the facilities used to capture the images, and thus prevented testing with a large number of images.

[1] Here tangential error is approximated as the x-direction error and radial error as the z-direction error. These directions are approximately aligned.

[2] Error vectors are shown as (x-dimension error in cm, z-dimension error in cm, orientation error in radians).

1.5 Overview

Chapter 2 presents a discussion of issues relevant to localization, including a general task decomposition and general system characteristics. It also summarizes several notable papers that present localization designs.

Chapter 3 contains the requirements and explores the design decisions behind the design proposed in this thesis. The specific landmark design and all relevant equations are included.

Chapter 4 runs through the detailed design as it is implemented in ACE, including data types and a task decomposition into ACE components. A small section at the end of Chapter 4 discusses the implementation of the system on a C40 processor board.

Chapter 5 presents descriptions of the tests performed to evaluate the system, along with the results and a discussion of them. Two kinds of tests are performed: accuracy tests and a pair of simulated test runs using real images.

Chapter 6 presents the conclusions drawn from the tests in Chapter 5, and proposes future work for improving the system.

Chapter Two - Previous Work

2. Previous Work

This section presents a summary of several localization systems, some related work, and a framework by which to classify these systems.
Included is work by Krotkov, Nickerson, Hirose and Murata, and Tsai. The last is related to camera calibration, while the others are mobile robot localization schemes. A discussion of the GPS system is also included, although it operates at a scale unsuitable for the mobile robotics described in this thesis. The framework is given in three sections. The first section lists a set of tasks that every localization system must perform. The second section characterizes how localization systems perform these tasks. The third section presents an ideal localization system based on selecting the most desirable characteristics described in section two. Sections four and five present case studies of particular localization systems. Section six describes the topic of camera calibration, and in particular the camera calibration method of Tsai [Tsa87].

2.1 Localization System Tasks

Most landmark localization systems can be decomposed into roughly the same tasks. The following is a general task decomposition for localization systems. Some of the tasks are debatably optional, but some are clearly required. For example, every localization system must perform landmark detection and pose calculation, or it would not be a localization system. Even GPS (the Global Positioning System) performs a form of landmark detection when it demodulates the satellite signals.

Initialization

Initialization is so common to all systems, not just localization, that it could be omitted. However, two initialization sub-tasks pertain to localization systems in particular: maps must be initialized, and the robot's initial position must be obtained. In some cases, some calibration is also performed at initialization.

If the localization system does not track landmarks between sensing samples, finding the initial position would not be considered part of the initialization task. Finding the initial position could be the same as finding the location during steady-state operation.

If tracking is used, initialization may be performed every time tracking is lost, which greatly increases the importance of system initialization. If tracking is frequently lost and system initialization takes too long, or requires the robot to perform some specific actions, then the utility of the localization system degrades substantially. Map initialization is discussed in the following section.

Map Storage and Access

Every localization system that provides global pose information must store the absolute pose of each landmark in a map. The actual form of a map may vary in complexity from a simple array, storing only the pose of each landmark, to a relatively sophisticated database which must also store the information necessary to detect or recognize each landmark.

Maps also vary in their method of initialization. Most maps are calculated or measured off-line and stored in a static file. Some systems may be able to generate new landmark map entries on-line. In this case, map initialization may include a robot "wander" state which locates several landmarks to enter into the map.

Landmark Detection

All systems must detect landmarks. Typically the landmark detection algorithm is the salient feature of a localization system. Methods of landmark detection vary greatly depending on the sensors used and the landmark design. Localization systems that use video may use a complex object recognition algorithm, while GPS simply monitors a given frequency for a modulated signal.

Probably the most complex localization schemes are those that use naturally occurring landmarks. These systems must not only detect landmarks that are somewhat arbitrary, they must also have algorithms for selecting and storing new landmarks.

Landmark Tracking

Some systems may track landmarks between sensing samples to increase performance. Typically it is necessary to track landmarks if detecting a landmark from scratch takes longer than the time available between samples.

There are two typical methods of tracking landmarks. The first is to retain the image coordinates of the landmark in the last sample and look for the landmark within a certain range of these coordinates. The first and second order derivatives of the image coordinate information may be used to increase the accuracy of the tracking.

Another method is to derive the expected position of the landmark in the image from the last calculated location of the robot. Again, derivative information can be used for a more accurate prediction. This method, although it incurs additional processing, can make use of location information from other sources, such as a dead reckoning system. This leads to increased reliability.

While landmark tracking leads to very large performance gains, it also incurs the disadvantage that if tracking is lost, the system will have to reinitialize itself to regain position.
For example, a localization scheme that detects naturally occurring landmarks may use the unique features of a landmark to detect the landmark. Thus the identity of a landmark is automatically known once the landmark is detected.  Pose Calculation Once landmarks are detected and identified, the pose of the camera must be calculated. Typically this involves calculating the pose of the camera relative to the landmark and then transforming the pose to the absolute reference frame.  These transformations usually involve numerous triangulation and trigonometric calculations. Although these calculations are usually relatively straightforward compared to the  11  usually complex landmark detection systems, these calculations must be performed with care since high sensitivities may cause large errors in the calculated location.  Position Filtering and Plant Modeling For many systems each location calculation from an individual sample or set of samples contains a high degree of random errors. Assuming there are no systematic errors, these errors can be greatly reduced by filtering the location information.  Filtering schemes may vary from a simple moving average to a Kalman filter system, complete with a model of the plant and complex error calculations. [May90] and [Gel74] give introductions into Kalman filters. Kalman filters may also combine location information from other sources such as dead reckoning, or odometry to produce a better location calculation than with the localization system alone. For most systems, the Kalman filter is considered part of the localization system, although it could be considered a separate component.  2.2 Localization System Characteristics There are many different types of localization systems; nearly every type of mobile robot that is created must have some form of localization system. This section enumerates the possible characteristics of a localization system.  
2.2.1  Coordinate Systems All localization systems produce pose information relative to some frame of reference.  Typically this information is produced in the Cartesian coordinate system, with either 3 or 6 degrees of freedom. GPS [Hur93], since it must produce localization information worldwide, produces position information in < latitude, longitude, altitude> tuples.  Local vs. Global  12  The coordinate system may be local or global. GPS is a good example of a global coordinate system , since all position tuples are relative to a single absolute frame of reference 3  <the equator, the first meridian, sea level> . Some systems may produce pose tuples that are 4  relative to some local frame, whose location relative to some absolute frame is known. Since any output produced can be easily transformed to absolute frame coordinates, this type of system is still considered to produce output in global coordinates.  A local coordinate localization system would be one that produces output in coordinates relative to some landmark , whose location is unknown. Because the location of the landmark is 5  unknown it is obviously impossible to determine the location of the camera relative to some absolute frame. For these types of systems, the landmark is usually the/a point of interest for some task, and the absolute location of the robot is irrelevant.  2.2.2 Landmarks Every localization system must have at least one landmark. The composition of the chosen landmark varies widely from system to system. Krotkov [Kro89] uses only naturally occurring vertical lines, Murata and Hirose use specially coded vertical poles, and GPS uses satellites. A l l landmarks have the common characteristic that they are easily detected . They may 6  also be classified in the following ways.  Active vs. Passive An active landmark originates a signal to the robot to be sensed, while a passive landmark does not. 
GPS is an active system, since the satellites are responsible for sending out radar signals that can be received from anywhere in the earth's atmosphere. A l l other systems  3  Hence the "global" in the name global positioning system.  4  Oblivious to the fact that the world is hurtling through space, rotating and orbiting like a mad hunk of space debris.  5  Or more typically to object of interest for local coordinate system localization system.  13  listed in the previous work chapter use passive landmarks. Active landmarks are more easily detected, but are more expensive.  Fabricated  vs. Natural  A localization system may either rely on objects occurring naturally in the target environment, or require that specially fabricated objects be placed in the environment to serve as landmarks. Using fabricated landmarks is disadvantageous for several reasons.  •  Landmarks may clutter the environment, becoming an obstacle to motion for the robot or for humans.  •  Placing landmarks will incur costs for the landmarks themselves, and the work  required for a human to place the landmarks in the environment . 7  •  The robot will not be able to move outside of an area that has been furnished with the  landmarks.  Naturally occurring landmarks do not have these disadvantages, but have one distinct disadvantage of their own. Naturally occurring landmarks are either difficult to detect or difficult to identify. The A E C L A R K robot uses manually picked "distinct" objects in the environment. Although these objects are selected for ease of detection, landmarks may have no features in common. Therefore a general object recognition system is required.  Identification Fabricated landmarks usually include some coded identification number in their shape. Often this encoding is similar to bar codes. For Murata and Hirose [Mur93] the identification is  6  In the case of Krotkov, landmarks are extremely easily detected, but there are a huge number of false matches.  
completely combined with the detection, such that anything that can be identified as a valid landmark identification is determined to be a landmark.

2.2.3 Sensing System

Localization may use many different types of sensing systems. Radio beacons, laser range finders, sonar, and video are common sensing equipment used in localization systems. Each has advantages and disadvantages.

Radio sensing, such as GPS, can be inexpensive, quick and accurate, but can only be used at the global scale. An accuracy of 100m or 10m for various grades of GPS may be extremely accurate when calculating latitude and longitude, but it cannot be used to navigate a robot through a building. Custom radio triangulation can also be accurate and possibly cheap, but it can pollute the airwaves and may not be extensible.

Laser range finders can produce accurate results but are slow and expensive. Landmark detection and identification is very difficult with laser range finders, since they output depth maps, which require 3-D object recognition techniques.

Video is relatively cheap and has the added advantage that the image produced can be simultaneously shared with other systems such as obstacle avoidance and object recognition/manipulation. Its disadvantage is that the software needed to process the output is very sophisticated and often very slow.

2.2.4 Processing System

Each system requires some amount of processing power to process the samples provided by the sensing system. The processing system includes any hardware or software involved in detecting landmarks and calculating the pose of the robot from these landmarks.

2.2.5 Maps

Every global localization scheme must have some sort of a map.
The position of landmarks relative to the absolute frame of reference must be known in order to calculate the absolute position of the robot. At the simplest level a map can be a simple array of landmark poses, where the index of the landmark pose in the array is the identification number of the landmark.

On the other end of the spectrum, a map may be stored in a database that also provides a model of the landmark necessary for recognizing it. Additional complexity may be introduced if the map is divided into subsections, or must be otherwise capable of selecting only landmarks visible from the current pose of the robot. If a system is capable of learning the positions of new landmarks, the database must also allow new landmarks to be added dynamically.

2.2.6 Multiple Views

Most localization systems require that two landmarks be visible to calculate robot pose. Systems can be characterized by their ability to incorporate superfluous landmarks in the image, i.e., if three landmarks appear in the image, can the system calculate the location with any increase in accuracy?

2.2.7 Partial Views

Some systems may be able to gather information from a landmark that is only partly visible, while most cannot. This may have a significant impact on the performance of the localization system. A full analysis of this effect is beyond the scope of this thesis, but it is described here to convince the reader of the effect that sensing partial views of landmarks might have.

Each localization system requires some number of landmarks to be present to calculate robot position. A system could be analyzed to find the mean number of landmarks appearing in an image, the percentage of time the required number of landmarks are present, and finally the increase to this percentage when partial views are allowed. This figure could then be used to calculate the overall effect on system robustness and accuracy.
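The analysis just described can be made concrete. The following sketch (all names are illustrative; the per-frame view counts would come from logged test runs, not from this code) computes the three figures mentioned above from a log of (full, partial) landmark view counts per image:

```python
def visibility_stats(view_log, required=2):
    """view_log: list of (full_views, partial_views) counted in each image.
    Returns (mean full views per image,
             % of images localizable from full views alone,
             % localizable if partial views also count)."""
    frames = len(view_log)
    mean_full = sum(full for full, _ in view_log) / frames
    full_ok = sum(1 for full, _ in view_log if full >= required)
    any_ok = sum(1 for full, part in view_log if full + part >= required)
    return mean_full, 100.0 * full_ok / frames, 100.0 * any_ok / frames

# A hypothetical five-frame log: (full, partial) landmark views per image.
log = [(2, 0), (1, 1), (0, 1), (3, 0), (1, 0)]
mean_full, pct_full, pct_with_partial = visibility_stats(log)
# Here accepting partial views raises the localizable fraction from 40% to 60%.
```

The gap between the last two percentages is exactly the robustness gain that partial-view handling would buy for this (hypothetical) data set.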
2.2.8 Landmark Tracking

Some systems may track landmarks from one sensing sample to another in order to aid detection. This includes both retaining the last position of the landmark in the sample, and using the last known pose of the robot to predict the location of new landmarks in the next sample. The former approach may be simpler, but will not by itself aid in the detection of landmarks that become visible in the next image. The latter may be more susceptible to becoming lost when the current pose estimate of the robot is erroneous.

2.2.9 Dynamic vs. Static World

A system may assume that it is the only agent in a given world, or it may account for differences in the environment due to the presence of other agents. This may amount to simply ignoring any failed detection, which may be caused by some other agent obscuring the view of the landmark. Most systems make the implicit assumption that the positions of the landmarks do not change, or the localization problem would become very difficult indeed.

2.2.10 Stop and Sense vs. Continuous Sensing

Many localization schemes require the robot to cease any movement in order to perform localization. This may be due to the slow speed of the sensors, the time required to process sensed samples, or the robot's need to "look around" for landmarks. This approach is only acceptable if the robot's task is not time-dependent or time-critical, i.e., it is not a real-time task. It is also not feasible if the robot is not allowed to stop. Therefore many systems are able to sense location information while in motion.

2.2.11 Accuracy

The accuracy of information provided by the localization system is a characteristic of each system. Different applications will require different accuracy, and different sensing equipment and calculation methods can provide different levels of accuracy, usually as a tradeoff against required processing time or cost of processing hardware.
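The pose-based tracking described in Section 2.2.8 can be sketched as follows. All names and the focal length are illustrative, not taken from any particular system; the robot pose is (x, z, theta) in the plane (z forward, x lateral), and the predicted image column would seed the search window in the next frame:

```python
import math

def predict_column(pose, landmark_xz, f=500.0):
    """Predict the image column of a landmark at known map position
    landmark_xz, given the robot's last pose estimate (x, z, theta).
    f is the focal length in pixel units."""
    x, z, theta = pose
    dx, dz = landmark_xz[0] - x, landmark_xz[1] - z
    # Rotate the world-frame offset into the camera frame (camera looks along +z).
    cx = dx * math.cos(-theta) - dz * math.sin(-theta)
    cz = dx * math.sin(-theta) + dz * math.cos(-theta)
    if cz <= 0:
        return None           # landmark is behind the camera: not visible
    return f * cx / cz        # perspective projection onto the image plane

# A landmark 2m straight ahead of an unrotated robot projects to column 0.
col = predict_column((0.0, 0.0, 0.0), (0.0, 2.0))
```

An erroneous pose estimate shifts every predicted column, which is precisely how this style of tracking becomes lost.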
2.2.12 Robustness

Robustness for localization systems can be summed up in a single measure: what percentage of the time does the system provide accurate location information? Additionally, robustness also involves any behaviour required to reinitialize the localization system after it becomes lost. Systems that track landmark position within images from frame to frame are most susceptible to becoming lost. Systems that can find landmarks without tracking image position are not as susceptible to becoming lost, but still must deal with samples that do not contain any visible landmarks.

2.2.13 Integration With An Odometry System

An odometry system by itself actually constitutes a localization system, but all odometry systems accumulate errors. Thus using odometry by itself is not practical.

Many localization systems for mobile robots are tightly integrated with an odometry system. Typically the odometry system is used as the primary source of location information, and a visual system is used to correct any accumulated errors when the required number of landmarks becomes visible.

If an extended Kalman filter is used to merge location information from the odometry system with the visual system, then the visual and odometry systems become somewhat decoupled. Each localization system feeds input to the Kalman filter at its own rate, and each also consumes pose information at its own rate. The vision localization system will use the pose information for landmark tracking purposes, and the odometry system will use the pose information to correct for drift.

2.3 The Ideal Localization System

The design process always involves trading off characteristics against other characteristics. For example, processing requirements, representing cost, must be traded off against real-time performance. Using fabricated landmarks trades versatility for simplicity.

Each application has requirements that dictate which tradeoffs must be made.
However, if we ignore cost and processing requirements we can come up with an "ultimate localization system specification" that would be suitable for nearly all applications. This exercise is useful to point out what characteristics are generally desirable in any localization system. Existing localization systems can be compared to the ultimate localization system to expose their design limitations. (Of course the ideal localization system does not exist, and will not for some time, if ever. Comparing current localization systems with this ideal is not meant as a harsh criticism of any of them. It must also be remembered that such a comparison is not fair, because this ideal has ignored its drawback: cost.)

The ideal localization system would have the following characteristics:

1. Pose would be determined with 6 degrees of freedom. Any system that required less could simply ignore the superfluous information.

2. The system would provide pose information relative to some absolute frame. This could be easily translated into local pose if required.

3. The system would use naturally occurring passive landmarks. It would also be able to select new landmarks from its environment and add them to its map.

4. The system would be able to deal with the effects of other agents in the world.

5. The system would provide continuous sensing of location information.

6. The system would provide pose information that is 100% accurate (an obvious impossibility).

7. The system would never become lost, i.e., its robustness would be 100%.

2.4 Localization With Naturally Occurring Landmarks

2.4.1 Eric Krotkov

Krotkov [Kro89] presents a localization algorithm that uses vertical lines naturally present in the environment to locate the robot. However, since vertical lines cannot be uniquely identified by any salient features, a search must be used to find the correspondence between lines in the image and vertical lines on the map. This search has complexity O(kn⁴) in time, where k is the number of vertical lines in the image and n is the number of vertical lines recorded as landmarks in the map. Sugihara [Sug88] introduces a constraint that can reduce this search to O(kn³).
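To give a feel for these bounds, the following sketch (illustrative numbers only, not from Krotkov's experiments) evaluates the two complexity terms for a fixed number of image lines and growing map sizes:

```python
# Growth of the correspondence search bounds k*n^4 and k*n^3,
# for k = 5 image lines and increasing numbers n of mapped lines.
def search_cost(k, n, power):
    return k * n ** power

sizes = [10, 50, 100]
naive = [search_cost(5, n, 4) for n in sizes]        # O(k n^4)
constrained = [search_cost(5, n, 3) for n in sizes]  # O(k n^3), Sugihara
# At n = 100 the constraint cuts the bound by a factor of n = 100.
```

Even the constrained bound grows by the cube of the map size, which is why a single map can only cover a limited range.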
This may not be a problem if there is only a limited number of landmarks, but it does place a hard ceiling on the robot range that can be serviced with a single map.

Krotkov only gives results for simulated experiments. He also states that these complexities may not be acceptable for some applications. This author agrees that this localization scheme could not be used in its present form for an application that requires continuous sensing localization, such as playing soccer. This problem illustrates the proposition that a continuous sensing localization system must have landmarks whose identity can be sensed rather than inferred. However, the ability to use naturally occurring landmarks makes this scheme more suitable for environments where it is not possible to place landmarks, or where placing landmarks would unsuitably alter the environment's characteristics.

2.4.2 AECL ARK

A paper by Nickerson et al. [Nic93] describes the ARK robot, which locates itself with a combination laser range finder and camera, both mounted on a pan-tilt head. Landmarks are manually selected permanent objects in the environment. As the robot moves along a preplanned path, it keeps track of its location using a dead-reckoning system. When it believes it is in an appropriate spot to view two or more of the known landmarks, it stops and attempts to localize itself using these landmarks. First a landmark is identified and the direction to it is recorded. The range finder is then used to determine the distance to the landmark. The head is then rotated to face another visible landmark.
Once the second landmark is detected and ranged, the position of the robot can be calculated by triangulation.

This stop and sense localization system is appropriate for the environment ARK was designed for: monitoring and maintenance in a nuclear power plant. It is not suitable for a dynamic environment, such as the soccer playing robot testbed. Also, while the laser range finder is capable of precise measurements, it is neither fast nor economical.

2.5 Localization With Fabricated Landmarks

The next few systems use landmarks that are specifically created to be easily detected and identified. Typically, these landmarks are designed so that position calculations can be performed using directly measured characteristics. For example, if the robot is confined to move in a plane, the height of a landmark can be used to calculate the distance to the landmark projected onto the optical axis of the camera. Also, the horizontal centroid of the object can be used to measure the direction to the landmark relative to the optical axis.

2.5.1 Global Positioning System (GPS)

GPS is perhaps, or will be, the most widely used localization system in existence. By receiving radio signals from satellites, an airplane, ship or land vehicle can calculate its position, including altitude, to within 10m [Hur93]. The attitude of the craft (roll, pitch and yaw) must still be calculated by onboard sensors. Of course this system operates on a scale not suitable for autonomous robots, with the exception of perhaps autonomous attack jet fighters and tanks, which are beyond the scope of this thesis. Still, GPS is an invaluable navigation aid to pilots, ship captains and others, giving testament to the usefulness of location information. (During the writing of this thesis, developments in GPS to increase its accuracy, such as differential GPS, have drastically increased the suitability of GPS for use in mobile robotics.)

2.5.2 Murata and Hirose

Satoshi Murata and Takeshi Hirose [Mur93] present a localization system that uses video with landmarks to correct the accumulation of errors of the main dead-reckoning system.
The video system uses pairs of bar coded posts sparsely placed in the robot environment. A high rate of processing is achieved by using specialized hardware. The posts appear as vertical bars of black and white segments that can be matched to a bar coding scheme. The camera is turned sideways so that each scan line is fed into a hardwired detector system that can recognize bar coded patterns of variable size. The relative size of the landmark, and its scan line number, can be used to calculate the distance and direction to the landmark. If a bar code is detected in several consecutive scan lines, a landmark is considered to be detected. If the bar's mate is also detected, then the localization system can calculate the position of the robot using triangulation.

Since the bar codes used to detect the landmarks also provide the identity of the landmarks, there is no need for an iterative search. Their results indicate the localization system provided smooth location information (no frequency given) for a robot moving at 20cm/s. Although the speed of the robot was relatively slow, it must be noted that continuous sensing was provided, thus eliminating the need for the robot to stop to sense its environment. The drawbacks of this system are the need for hardwired video processing and the need for landmarks, which may be obstructions in the working environment.

2.6 Camera Calibration

Camera calibration is a topic closely related to robot localization. It deals with modeling camera images with respect to the environment. Typically a calibration system provides a model of a camera, and an algorithm for finding the camera parameters of a specific camera with respect to the given model (e.g., the focal length).
Camera models usually include the pose of the camera relative to some known object, which is localization. The model also provides camera parameters such as focal length, and a translation from frame coordinates to ideal image plane coordinates. Camera calibration is rarely performed in real time, however, and the algorithms for localizing the camera involve very high order calculations. This makes them unsuitable for continuous sensing localization systems.

Still, calibration systems can provide some direct benefits to a localization system. If a camera is calibrated off-line, the model of the camera can be used when performing calculations on the position of objects in subsequent images. If the startup sequence for the localization system is allowed to be sufficiently long, calibration could be performed during startup, and the pose information obtained could be used as a starting point for the localization system. This would be a bit difficult, since landmarks used for localization can rarely be used for calibration and vice-versa.

Finally, research into camera calibration provides insight into calculating pose from camera images. Even if the localization algorithms are not useful, the mathematical derivation of these algorithms may provide localization systems with equations that relate the position of objects within an image to the pose of the camera.

2.6.1 The Ideal Camera Model

The ideal camera model provides equations for calculating where on the image plane a point in space will appear. The equations are based on perspective projection. [Hor91], page 19, gives a discussion of the ideal camera model. Figure 1 gives a graphical representation: a point at (x, y, z) in space is projected through the optical center onto the image plane (at z = focal length), appearing at (x_i, y_i).

Figure 1: The Ideal Camera Model

The equations that transform <x, y, z> to <x_i, y_i> are:
Equation 1
    x_i = focal_length · x / z
    y_i = focal_length · y / z

2.6.2 Tsai - Camera Calibration

Because all cameras exhibit non-ideal behavior, precise measurement from images requires a more sophisticated camera model than the ideal camera model. Tsai [Tsa87] gives an appropriate model, and some transformations between an ideal image plane and the actual image frame buffer of the camera. This allows the camera to be used as if it were an ideal camera, without knowledge of the more complicated model that is being used.

Tsai's camera calibration model and algorithm were selected for this thesis because they are widely accepted and were immediately available in the lab. Relevant parts of Tsai's work are presented here and referred to in future sections.

Tsai's Camera Model

A camera model is used to transform coordinates in the frame buffer (pixel coordinates) into real world rays. These rays can then be converted into real world coordinates by establishing the z coordinate of the real world point (along the z axis).

Tsai's camera model consists of extrinsic parameters and intrinsic parameters. The extrinsic parameters are those parameters that are not a function of the construction of the camera itself, but of its placement in the environment. The intrinsic parameters are therefore those parameters that derive from the physical makeup of the camera and the CCD itself.

Tsai describes his camera model as four transformations needed to go from world coordinates to camera frame buffer coordinates. Figure 2 shows the geometry of the problem space used for the following transformations.

Step 1: From world coordinates to camera coordinates.

In Figure 2, the real world point P is given two labels. P(x, y, z) represents its coordinates in the camera frame, indicated by the axes emanating from origin O. P(x_w, y_w, z_w) represents the point's coordinates in the real world absolute frame, indicated by the axes emanating from origin O_w.
This can be determined by the well known rotation and translation operations.

Equation 2
    [x]       [x_w]
    [y] = R · [y_w] + T
    [z]       [z_w]

Here R is a 3x3 rotation matrix and T is a 3x1 translation vector. R and T together form the extrinsic parameters of the camera model.

Step 2: From camera coordinates to ideal image plane coordinates.

From the ideal camera model equations we find:

Equation 3
    X_u = f · x / z
    Y_u = f · y / z

Here f represents the focal length (an intrinsic parameter) for the ideal camera, and (X_u, Y_u) represents the coordinates of the point P projected onto the image plane using perspective projection.

Step 3: From ideal image plane coordinates to distorted image plane coordinates.

The salient part of Tsai's model for this thesis is the transformation from ideal image plane coordinates to distorted image plane coordinates, which models the effects of radial distortion. This transformation is modeled by these equations.

Equation 4
    X_d + D_x = X_u
    Y_d + D_y = Y_u

D_x and D_y represent the distortion, and are given in the following equations.

Equation 5
    D_x = X_d · (K_1·r² + K_2·r⁴ + ···)
    D_y = Y_d · (K_1·r² + K_2·r⁴ + ···)

where r² = X_d² + Y_d². Equation 5 introduces two new internal parameters, K_1 and K_2, called the distortion coefficients. Tsai states that these coefficients are part of an infinite series, but that for accurate results only the first term, K_1, is required.

Step 4: From distorted image plane coordinates to frame buffer coordinates.

The last step is to transform the distorted image plane coordinates to the row and column offsets of the frame buffer as it is stored in computer memory. This transformation involves numerous intrinsic parameters related to the physical dimensions of the CCD chip itself. This transformation is given by the following equations.
Equation 6
    X_f = s_x · d'_x⁻¹ · X_d + C_x
    Y_f = d_y⁻¹ · Y_d + C_y

Tsai lists the meaning of each variable ([Tsa87], page 226):

    (X_f, Y_f)   row and column of the image pixel in the computer frame buffer,
    (C_x, C_y)   row and column of the center of the image,

Equation 7
    d'_x = d_x · N_cx / N_fx

    d_x    center to center distance between adjacent sensor elements in the X (scan line) direction,
    d_y    center to center distance between adjacent CCD sensors in the Y direction,
    N_cx   number of sensor elements in the X direction,
    N_fx   number of pixels in a line as sampled by the computer,
    s_x    the uncertainty image scale factor.

The Calibration Algorithm

Tsai's calibration algorithm is beyond the scope of this thesis, and is well documented in his paper [Tsa87]. It is sufficient to note that the algorithm is not suitable for a continuous sensing localization system, for two reasons. The most important is that the algorithm requires eleven images of the target landmark to provide location information. The second is that the algorithm requires human intervention to detect several dots.

Chapter Three - System Outline

3. System Outline

The prime goal of this thesis is to explore the possibility of a real-time localization system using minimal hardware that is capable of localization with only a single visible landmark. This chapter is structured to present a specification of the required system, a discussion of all salient design issues, and an overview of the final design. The detailed design is presented in the following chapter.

3.1 System Specifications

As discussed in section 1.2, the design presented in this thesis is an exploration of the issues relevant to creating an on-board localization system for a dynamic, real-time robot testbed environment. This dictates many hard constraints on the system. It is unlikely that all of these constraints could be fully satisfied even in a final working version.
Academically, however, it is these deficiencies that would prove most interesting from the point of view of research. If the sensing and plant systems worked perfectly, there really would be no need to test robot control architectures on a physical testbed.

1. Continuous Sensing - The robot controller should not have to stop to sense position. Requiring it to do so would impair the flow of the soccer game and put the robot at a distinct disadvantage to any other robots that do not need to stop to sense position. This restriction does not apply to reinitializing the localization system should it become unstable.

2. Real-Time Performance - The soccer playing testbed is a rapid multiple-agent testbed. Decisions made by the controller must include a notion of time when evaluating correctness, i.e., the right decision too late is the wrong decision. It would be impossible for the robot to make decisions in a real-time frame if sensing were not guaranteed to provide information at a reasonable rate. For the purposes of this thesis a "reasonable rate" is set at 10Hz.

3. Non-Obtrusive Landmarks - The playing field for the soccer playing robots is a simple rectangular pen made of wood. The landmarks cannot become obstacles on the playing field. They can (actually they must), however, alter the visual characteristics of the playing field. This means that landmarks may be placed flush with the sides of the playing field.

4. Minimal Hardware Requirements - The localization system must run concurrently with the controllers of both robots and with any other sensing systems used. Processing resources will always be limited. For this thesis, the localization system is allocated a single Texas Instruments TMS320C40 processor. This requirement is obviously in conflict with requirement 2. In the end some acceptable tradeoff must be reached.

5. Single Camera Localization - This restriction is not a requirement of the soccer playing testbed, but rather a constraint on available resources. The hardware allocated for this project includes only a single camera. Any future attempt to perform on-board localization may use multiple cameras if such hardware is available.

6. Range Required - An approximate maximum range for the testbed is set at 3 meters. The minimum distance at which the landmark is detectable is set at 15cm. The required angular range from which a landmark is detectable is set at +/- 70° from "dead-on".

7. No Odometry Required - The robots in the soccer playing testbed are simple remote control model race cars, which are not capable of supporting an odometry system. It is not known what platform will be used for an on-board sensing testbed. This means that the system cannot count on the availability of an odometry system to provide supplementary or primary location information. The design should, however, allow for the addition of such a system.

3.2 Design Highlights

This section introduces salient points of the Robot Onboard Localization using Landmarks (hereafter ROLL) design. The first subsection outlines the equations governing localization with a single landmark view. The second deals with landmark design. The third outlines algorithms for detecting landmarks. The fourth discusses the means of identifying the landmarks. The fifth presents a second order model of the RWI mobile robot that is the current target platform for this system.

3.2.1 General Approach

Requirements 2 and 4 are clearly competing and/or contradictory. (These requirements are contradictory if it is impossible to satisfy both, and competing because real-time performance can usually be improved by adding processing hardware, and hardware can be reduced at the expense of real-time performance.) The real-time performance of the design will not be known until the system is implemented on a physical system; however, it is expected that it will be difficult to satisfy both requirements. Design of the system will be primarily concerned with minimizing the required processing.
All algorithms and methods are to be kept as simple as possible. Each localization subtask could be performed with a great deal of sophistication, but it is not known which subtasks actually require more than the simplest approach. The sophistication of individual subtasks can be increased when required, usually independently of the other subtasks.

Iterative searches should be avoided when possible. If it is not possible to avoid an iterative search, some consideration must be given to how to bound the search. This will keep performance fairly constant, a key characteristic for real-time systems.

Error handling should be minimized, or avoided altogether. For example, if a landmark is not detected in the current image where it is expected, it is better to simply move on to the next image than to enlarge the search area indefinitely.

3.2.2 Localization With A Single Landmark

Localization with a single landmark increases the percentage of time that a robot is able to calculate location. For any localization system, it is likely that for some percentage of time no landmarks, or only a single landmark, will be visible. If this percentage of time is too high, a localization system that requires two landmarks to calculate position will suffer a degradation of performance. This may interfere with the system's ability to perform continuous sensing. The decision to use a single landmark localization system is a result of the specification that the system allow continuous sensing.

All of the localization systems described in Sections 2.4.1, 2.4.2, 2.5.1, and 2.5.2 require two visible landmarks to calculate position.
These systems require two landmarks because they sense only two measurements from each landmark: direction and distance. Because the pose of the robot has at least three unknowns, two landmarks must be visible to have a completely defined system of equations. However, if it were possible to obtain orientation information from the landmark, then it should be possible to determine the pose of the camera from a single landmark.

The following equations show the translations required to calculate the absolute pose of the camera, starting with the pose, including orientation, of the landmark relative to the camera. These equations are simple two dimensional homogeneous transformations based on those found in [McK91]. An extra row and column were added to provide for orientation information, so that the transformation becomes a pose transformation rather than a position transformation. The general pose transformation can be represented by:

Equation 8
    p_2 = T(m, n, φ) · p_1

This is a short form representation for:

Equation 9
          [ cos φ   -sin φ   0   m ]
    p_2 = [ sin φ    cos φ   0   n ] · p_1
          [   0        0     1   φ ]
          [   0        0     0   1 ]

Where:
    T(m, n, φ) is a transformation that performs a translation of (m, n) followed by a rotation of φ.
    p_1 is the pose before the transformation.
    p_2 is the pose after the transformation.

For these equations, rotations are assumed to be positive in the counterclockwise direction, and the axes have been renamed, so that the typical x axis has been renamed to z and the y axis has been renamed to x. This is done to align the Cartesian coordinates used for robot pose with the typical Cartesian coordinate axes used in the ideal camera model.

Equation 8 can be used to show how the absolute pose of the camera is calculated, starting with the pose of the landmark relative to the camera, where the absolute pose of the landmark is given. First, the pose of the camera is calculated relative to the landmark.
Equation 10
    p_cr = T(0, 0, -θ_lr) · T(-z_lr, -x_lr, 0) · o

Where:
    p_cr is the pose of the camera relative to the landmark.
    [z_lr, x_lr, θ_lr, 1]ᵀ is the pose of the landmark relative to the camera.
    o is the origin: [0, 0, 0, 1]ᵀ.

Two general transformations are needed since this equation represents a rotation followed by a translation, rather than a translation followed by a rotation.

The absolute pose of the camera can then be found using the absolute pose of the landmark by applying a single general homogeneous transformation.

Equation 11
    p_ca = T(z_la, x_la, θ_la) · p_cr

Where:
    p_ca is the absolute pose of the camera.
    [z_la, x_la, θ_la, 1]ᵀ is the absolute pose of the landmark.

Though it is not difficult to find the absolute pose of the camera given the pose, including orientation, of the landmark relative to the camera, it is difficult to accurately measure the orientation of the landmark from a single image. This was found to be the largest problem to overcome with the current design. The following subsections describe an initial landmark design that proved insufficient due to a condition called the image plane parallelism problem, and another landmark that avoids this problem. The latter landmark was selected for the design in this thesis.

3.2.3 Landmark Design

The design of the landmark itself dictates much of the rest of the system. Landmarks should be specifically designed to be easily detectable, easily identifiable, and to allow easy calculation of pose from their appearance in the image. Much research is being done, in the object recognition field, on how to detect and calculate the pose of general objects in images. These methods are usually very sophisticated, and not suitable for real-time systems. The point of designing a landmark is to avoid the need to use these object recognition methods.

The First Landmark

Figure 3 shows the appearance of the first landmark design.
The main body of the landmark consists simply of a black square, 10cm by 10cm, with a number of black rectangles above it for identification purposes. The white circle in the bottom corner indicates a later improvement: white circles were added in each corner of the main body. Once the landmark was detected, the centroids of these circles were calculated. The position of the centroids could be measured more accurately than the position of the corners.

Figure 3 The First Landmark Design (10cm by 10cm main body; number of bars above indicates identification; circle added for increased accuracy)

This simple landmark allowed calculation of its orientation, and also had several notable advantages. First, it was easily detectable using corner detector matched filters [Der90] and [Ran89], which could also locate the identifying rectangles. The landmark is completely flat, and thus, if mounted on an existing wall, would not be a physical obstruction in the environment. It was also extremely easy to produce, requiring only a draw program and a suitable printer.

The following equations can be used to calculate the pose of the landmark relative to the camera. Figure 4 and Figure 5 serve as a graphical reference for the physical values represented by the variable names. The positions of the corners in Figure 4 are given in ideal image plane coordinates. These values must first be found in the actual image, and translated into ideal image plane coordinates through the transformations provided by Tsai's camera model as discussed in 2.6.2. This step is not shown in the following calculations. These equations were derived from the ideal camera model [Hor91] pages 19-20.

Equation 12

z_l = (focal length · 10cm) / (y_ul − y_ll)        z_r = (focal length · 10cm) / (y_ur − y_lr)

Equation 13

x_l = ((x_ul + x_ll)/2) · z_l / focal length        x_r = ((x_ur + x_lr)/2) · z_r / focal length

Equation 14

x_la = (x_l + x_r)/2        z_la = (z_l + z_r)/2

Equation 15

θ_la = tan⁻¹((z_l − z_r)/(x_l − x_r)) + π
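Equations 12 through 15 can be collected into a short routine. The sketch below is my own illustration, assuming ideal image plane coordinates (after the Tsai corrections mentioned above), a 10cm square landmark, and corner labels `ul`, `ur`, `ll`, `lr` for upper-left through lower-right; the final `atan2` call absorbs the +π branch of the orientation equation.

```python
import math

def landmark1_relative_pose(corners, focal_length, side=0.10):
    """Recover (x_la, z_la, theta_la) for the first landmark type.
    corners: dict with keys 'ul', 'ur', 'll', 'lr' mapping each corner to its
    (x, y) position in ideal image plane coordinates."""
    # Depth of each vertical edge from its apparent height (Equation 12).
    z_l = focal_length * side / (corners['ul'][1] - corners['ll'][1])
    z_r = focal_length * side / (corners['ur'][1] - corners['lr'][1])
    # Lateral position of each edge (Equation 13).
    x_l = (corners['ul'][0] + corners['ll'][0]) / 2 * z_l / focal_length
    x_r = (corners['ur'][0] + corners['lr'][0]) / 2 * z_r / focal_length
    # Landmark centre as the midpoint of the two edges (Equation 14).
    x_la, z_la = (x_l + x_r) / 2, (z_l + z_r) / 2
    # Face orientation (Equation 15); atan2 supplies the +pi branch.
    theta_la = math.atan2(z_l - z_r, x_l - x_r) % (2 * math.pi)
    return x_la, z_la, theta_la
```

For a landmark squarely facing the camera this returns θ_la = π, the 180° head-on case.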
Figure 4 The Landmark As It Appears In The Ideal Image Plane

Figure 5 Calculating Relative Position For Landmark Type 1 (landmark left edge location (x_l, z_l); landmark location (x_la, z_la); landmark orientation θ_la; landmark right edge location (x_r, z_r); camera position (0, 0, 0°))

This landmark suffered from two problems: identification was unreliable and, more importantly, the orientation of the landmark could not be accurately measured at around 180°, which is when the camera is looking directly at the landmark. This was found to be related to the sensitivities of Equation 19. It was also found that this result can be generalized into a condition called the image plane parallelism problem.

The Image Plane Parallelism Problem

It is practically impossible to calculate the orientation of any line segment or plane segment that is parallel to the image plane with a single camera. This is because the sensitivity of the orientation calculation grows as the measured object becomes parallel with the image plane, while the quantity being measured vanishes.

Figure 6 Image Plane Parallelism (left edge (x_l, z_l) and right edge (x_r, z_r) as seen from the optical center)

From Equation 15 we know that

Equation 16

θ_la = tan⁻¹((z_l − z_r)/(x_l − x_r)) + π

Clearly it can be seen that

Equation 17

lim (z_l → z_r) θ_la = π

and given that

Equation 18

lim (x → 0) d(tan⁻¹ x)/dx = 1

it follows that:

Equation 19

lim (z_l → z_r) ∂θ_la/∂z_l = 1/(x_l − x_r)

This means that as the landmark becomes parallel to the image plane, the quantity being measured, z_l − z_r, approaches zero while the sensitivity of θ_la to each measured value does not; small errors in the measurement of x_l, x_r, z_l, and z_r therefore cause very large errors in the calculated value of θ_la. To summarize, it is impossible to accurately measure the orientation of an object that is nearly, or completely, parallel to the image plane. This effect is known as the image plane parallelism problem.
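The practical effect can be seen numerically. The sketch below is my own illustration, with assumed parameters (a 500-pixel focal length, a 10cm landmark at 1m, and 0.3-pixel Gaussian noise on each corner coordinate): it recovers the orientation from noisy projections at two attitudes and measures the scatter of the result.

```python
import math
import random

# Assumed parameters for this illustration: focal length in pixels, landmark
# width in metres, landmark distance in metres, pixel noise std deviation.
F, SIDE, Z0, NOISE = 500.0, 0.10, 1.0, 0.3

def project(alpha):
    """(u, v_top) for the left and right landmark edges when the face is
    rotated alpha radians away from parallel to the image plane.
    By symmetry v_bottom = -v_top for each edge."""
    half = SIDE / 2
    edges = [(-half * math.cos(alpha), Z0 + half * math.sin(alpha)),  # left
             (half * math.cos(alpha), Z0 - half * math.sin(alpha))]   # right
    return [(F * x / z, F * half / z) for x, z in edges]

def recover(alpha, rng=None):
    """Apply the depth-from-height and orientation equations to a (noisy)
    projection; rng=None gives the noiseless reference angle."""
    g = (lambda: rng.gauss(0.0, NOISE)) if rng else (lambda: 0.0)
    xs, zs = [], []
    for u, v in project(alpha):
        v_top, v_bot = v + g(), -v + g()
        z = F * SIDE / (v_top - v_bot)   # depth from apparent edge height
        zs.append(z)
        xs.append((u + g()) * z / F)     # lateral edge position
    return math.atan2(zs[0] - zs[1], xs[0] - xs[1])

def angdiff(a, b):
    """Smallest signed difference between two angles."""
    return (a - b + math.pi) % (2 * math.pi) - math.pi

def spread(alpha, trials=4000, seed=1):
    """Standard deviation of the recovered orientation under pixel noise."""
    rng = random.Random(seed)
    ref = recover(alpha)
    errs = [angdiff(recover(alpha, rng), ref) for _ in range(trials)]
    mean = sum(errs) / trials
    return math.sqrt(sum((e - mean) ** 2 for e in errs) / trials)
```

With these numbers the angular scatter roughly doubles as the face goes from 60° oblique to parallel, and it grows further with distance, which matches the behaviour observed with the first landmark.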
The New Landmark

It is possible to avoid the image plane parallelism problem if the landmark is made so that orientation measurements are taken from a part of the landmark that cannot be parallel to the image plane. Figure 7 shows a landmark composed of three black dots and an identifying pattern. The middle dot is offset from the face of the landmark so that the three dots form a triangle. This triangle will only be parallel to the image plane when the landmark orientation is perpendicular to the optical axis. At this angle, the dots themselves will not be visible, so detection is impossible anyway. Thus the image plane parallelism problem is avoided.

Figure 7 Landmark Design 2 (identification pattern above the upper dot; middle dot offset 5cm from the face; upper and lower dots 10cm apart)

The equations for measuring the relative pose of this landmark are derived below. Figure 8 shows the physical values that the variables represent. The given values are the locations of the three landmark dots in image coordinates: (x_ud, y_ud), (x_md, y_md), and (x_ld, y_ld) are the coordinates of the centroids of the upper, middle and lower dots respectively. The constants focal length, landmark height, and middle dot offset are also known.

Figure 8 Calculating the Relative Position of Landmark Type 2 (camera C at (0, 0); landmark L at (x_l, z_l); middle dot M; shadow point S at (x_s, z_s) on the line z = z_l; point B on the line x = x_l; shadow ray R through C and M)

Calculating point L is a straightforward application of the ideal camera model.

Equation 20

z_l = (f · h_landmark) / (y_ud − y_ld)

Equation 21

x_l = ((x_ud + x_ld)/2) · z_l / f

Where f is the focal length of the camera, and h_landmark is the height of the landmark.

Calculating the relative orientation of the landmark θ_l requires the use of similar triangles. In Figure 8, triangle CAS is similar to triangle BLS. Once point S has been found, the length of line segment BL can be calculated, since it is proportional to line segment CA.
The x value of point S is calculated using the ideal camera model,

Equation 22

x_s = (x_md · z_l) / f

and the z value of point B can be calculated as

Equation 23

z_b = (x_l / x_s) · z_l

Now ∠SBL = cos⁻¹((z_l − z_b) / √((x_s − x_l)² + (z_l − z_b)²)), and the lengths of line segments BL and SL are known, which means angle θ can be calculated using the sine law and the sum of the angles of a triangle rule for triangle LMB.

Equation 24

∠LMB = sin⁻¹(((z_l − z_b) · sin(∠SBL)) / middle dot offset)

Equation 25

θ = ∠SBL + ∠LMB

3.2.4 Landmark Detection

Landmark detection using video is an object recognition problem. Features in the image must be matched to a model of the object to be detected. Many sophisticated object recognition techniques use explicit models of objects with several different feature types. These techniques are not suitable for real-time localization because of the amount of processing required. A landmark detection scheme must instead exploit implicit knowledge of the landmark model in order to detect landmarks quickly.

The detection methods for both landmark types are described in the following sections. Although the first landmark type was rejected due to accuracy problems, its detection algorithm is included to contrast with the detection method for the second landmark. This illustrates that detection methods are highly dependent on the landmark design. It is also worthwhile to examine the other problems with this landmark.

Implicit Knowledge For Both Landmarks

For both landmark types, the immediate vicinity of the landmark, excluding shadows, contains only white and black.¹² Assuming a binary image simplifies landmark detection greatly. Within the immediate vicinity of the landmark, all black pixels can be assumed to be part of the landmark. Since the image is actually greyscale, some threshold value will be assumed to determine which pixels are white and which are black.
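The binary-image assumption reduces to a single fixed threshold test per pixel. A minimal sketch (the threshold value here is an arbitrary assumption, not one taken from the system):

```python
def binarize(grey, threshold=100):
    """Classify each greyscale pixel (0-255) as black (True) or white (False).
    Within the landmark's vicinity every black pixel is then treated as
    part of the landmark."""
    return [[pixel < threshold for pixel in row] for row in grey]

def is_black(grey, row, col, threshold=100):
    """Single-pixel test, for search routines that avoid a full-image pass."""
    return grey[row][col] < threshold
```

In practice only the single-pixel test is needed, since the dot search touches a small fraction of the image.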
Another important piece of implicit knowledge is that landmarks are mounted so that the upper and lower dots are aligned vertically. Also, the landmark's middle dot must be horizontally aligned with the camera optical axis. Calculations could be added to remove the need for this accurate landmark mounting specifications, but this would add a level of sophistication to the calculations that would increase processing time. The First Landmark Type Two designs were created for the R O L L system. The first proved unsuccessful. This section details the design of the first landmark and the reason for its failure.  12  Each landmark is pure black, on a 8.5" x 11" sheet of white paper.  45  The first landmark, shown in Figure 3, on page 37, is a simple square. Therefore it can be modeled as four corners. Each of these corners can be detected by convolving a corner detector over the image. Detected corners are then matched into a list of rectangles. Rectangles that may be landmarks are selected and position is calculated. The following algorithm describes the landmark detection procedure.  1.  2.  For each corner type:  1.1.  Convolve the image with the particular corner detector.  1.2.  Scan the convolution for local peaks, adding each peak to a list.  For each comer in the "upper left" list, find matching "upper right", "lower left", and "lower right", corners. Combine these corners to form a rectangle and add it to a "rectangle" list.  3.  Scan the "rectangle" list for rectangles that may be landmarks.  4.  Calculate camera location from each recognized landmark. Corner Detectors Figure 9 shows the corner detectors that were used for this landmark. These detectors are 4x4 matched filters modified to provide a 0 response to constant regions and reduce the filter's response to lines. In each filter, the quadrants adjacent to the "corner", denoted by the quadrant of -5's, are given heavier weighting that the quadrant opposite the comer. 
This heavier weighting reduces the corner detector's response to lines.  Note that these filters are zero-sum, but not balanced. This means that the corner detectors are directional so several are required to detect corners at different orientations.  46  These detectors are designed specifically for right-angled corners, oriented along the vertical and horizontal axis. While it is known that the landmarks are mounted vertically, perspective effects make many comers appear to be either wider or sharper than 90°. It is assumed that these effects will be negligible for most landmark views. It is noted that this is generally true for almost all poses except for landmarks that are very close and at oblique angle to the camera.  1  2  2  1  2  2  2  -5  -5  2  -5  -5  2  2  1  1  2  2  1  1  -5  -5  2  2  -5  -5  2  2  2  -5  -5  2  -5  -5  1  2  2  1  2  2  upper left filter =  upper right filter =  lower left filter =  -5  -5  2  2  -5  -5  2  2  2  2  1  1  2  2  1  1  lower right filter =  Figure 9 The Corner Detectors  47 Landmark Verification When selecting rectangles from the rectangle list, a number of characteristics of the landmarks can be used to eliminate false rectangles:  1. Landmarks can be taller that they are wide, but not vice-versa.  2. Upper corners must be directly above their respective lower corners, within some tolerance.  3. Upper and lower corners for the same side should be equidistant from horizontal, within some tolerance.  4. Landmarks should have some shape pattern above them for unique identification.  Once landmarks have been identified and the relative position calculated, this position could be compared with an estimate of where the landmark should be, based on a current estimate of robot position. This would be a very effective test, but was not implemented for the first landmark type. Efficiency Considerations Without hardware to perform the comer detection convolutions, this detection method is very slow. 
Even with hardware, the processing required to search the entire image four times in order to find all the local maximums for the convolutions is considerable. It is necessary to restrict searches to some small regions of the image. Thus some kind of tracking or prediction is essential.  The procedure to match corners into rectangles also causes some concern. In order to be done properly, a combinatorial search, 0(n(n-l)(n-2)(n-3)), must be performed. This could lead to an unacceptable search time, should the corner list become too long. Either the number of  48  corners located in the image must be somehow constrained, or the comers must be matched using heuristics. The latter was selected since no method of eliminating false comers was discovered. Notable Problems This landmark type was eventually rejected due to inaccuracies caused by the image plane parallelism problem described in Section There were however several other problems with this detection method.  The heuristics used to match corners into rectangles proved unreliable for the identifying rectangles above each landmark. In order to rectify this situation, landmarks were identified by traversing a vertical line up from the middle of the top of the landmark and the number of black segments crossed were counted. This in turn brought up the problem that for the identifying rectangles to be large enough to be detected, they ran the risk of disappearing off the top of the image if the landmark was too close.  Even if the search regions of the image were reduced, the convolutions required too much processing power to be performed without specialized hardware.  The assumption that perspective effects were negligible in most cases proved invalid. Although the corner detectors proved very resilient to the perspective effects, accuracy decreased. In order to improved the accuracy of corner detection, which is a single pixel at best , 13  white circles were added inside each corner. 
The centroid of these circles could be calculated more accurately than the location of the corner via corner detectors, but this required a significant amount of additional processing.

¹³ Subpixel interpolation schemes were considered but not implemented, since it was determined that the increased accuracy still would not help solve the image plane parallelism problem.

The Second Landmark Type

Figure 10 The Second Landmark Type (dimension labels: 30cm, 1cm, 30cm, 10cm)

The second type of landmark is modeled as a triangle of black dots. Since each dot is the same, the upper, lower and middle dots cannot immediately be distinguished from each other. It is also not possible to use a matched filter to detect the circles, because matched filters are not scale invariant. Despite these restrictions, it is relatively easy to detect the landmarks by assuming that any black pixel is part of a circle, and by providing a prediction of the circle locations. The general algorithm for detecting the second landmark type is as follows:

1. Predict the location of each of the dots for a given landmark.
2. For each dot:
   2.1. Attempt to detect a dot near the predicted location.
3. If more than one dot was not found:
   3.1. End. (Failure)
4. If one dot was not found:
   4.1. Form several new predictions based on the two dots that were found.
   4.2. Repeat from step 2 for the missing dot, ending at step 4 if all dots are still not found.
5. Verify that the three dots form a landmark.
6. Identify the landmark.
7. Calculate the position of the camera from the location of the three dots.

Step 4 was added when it was found that the original algorithm failed much too often; at the same time, the landmarks were mounted with more care. Step 4 represents a second try for cases where the predictions were off by enough to miss one of the dots. It was found that when a single dot was missed, the second attempt nearly always found the missing dot.
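The retry logic of steps 1 through 5 can be sketched as a small driver. Everything here is schematic: `find_dot`, `repredict` and `verify` are hypothetical stand-ins for the search, re-prediction, and verification routines described in the following sections.

```python
def detect_landmark(predicted, find_dot, repredict, verify):
    """predicted: {'upper'|'middle'|'lower': (row, col)} predicted centroids.
    find_dot(name, guess) -> centroid or None; repredict(name, found) yields
    fresh guesses for the one missing dot; verify(found) -> bool."""
    found = {name: find_dot(name, guess) for name, guess in predicted.items()}
    missing = [name for name, dot in found.items() if dot is None]
    if len(missing) > 1:                      # step 3: give up on this landmark
        return None
    if missing:                               # step 4: second try for one miss
        name = missing[0]
        for guess in repredict(name, found):
            dot = find_dot(name, guess)
            if dot is not None:
                found[name] = dot
                break
        else:
            return None
    return found if verify(found) else None   # step 5
```

The second attempt, driven by the two located dots, is what made the method usable in practice, as noted above.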
Dot Detection Dot detection amounts to finding convex regions of black pixels, which is a fairly simple and straightforward process.  1. Search over some region in the vicinity near the predicted location for a black pixel.  2. Starting with a rectangle enclosing only the black pixel, expand the rectangle until it encloses all black pixels in the dot.  3. Verify that the dot may be part of a landmark.  4. Calculate the centroid of the dot. Search Patterns In order to search for a black pixel near the predicted location of a dot, some search strategy must be implemented. The effectiveness of the search pattern has a significant impact on the performance of the system in general. The search should be restricted to some area, and need  51  not examine every pixel since dots will be more than one pixel in size. It is also desirable to search the where the centroid is most likely to appear within the search area.  A grid must be established that samples the region. Only pixels on the grid points should be checked. In accordance with sampling theory, the grid width must be no bigger than the expected width of the dot, and the height must be no bigger than the expected height. For a safety margin, half of the height and width are selected as the grid width and height.  There are two reasonable search approaches available to search the grid. The first is to order grid points in order of proximity to the predicted dot centroid, and then search them linearly. This is approximately equivalent to creating a square around the predicted location, and expanding its width and height by one grid point until it intersects a black pixel. The second approach is to search horizontally first, shifting vertically until a black pixel is found.  The latter method was selected because variance in the horizontal direction is more common than in the vertical direction. 
This is because rotation can cause rapid horizontal movement when the landmark is far away from the camera, while the vertical movement of dots, due to forward or backward movement of the robot, tends to be more subdued. Thus it is more likely that the vertical coordinate of the prediction is correct while the horizontal coordinate is erroneous.

Dot Verification

We can make several assumptions about the landmarks that will facilitate detection and help eliminate false matches.

• Dots will be fully visible: no occlusion, no partial views.

• The landmark will be within a specified range.

• The landmark will be mounted at camera level and vertical.

• Perspective effects will be negligible.

With these assumptions, there are several checks that can be made to reduce false dot detection. First, because the robot is confined to a horizontal plane, and the dots are mounted on vertical two-dimensional planes, the landmark dots will appear as ellipses that can never be wider than they are tall. Secondly, the area of the detected dot should not differ from the predicted area by more than some tolerance. Lastly, a compactness test can be applied.

Compactness Test

The compactness of a dot is the ratio of its area to the area of the smallest rectangle that completely covers the dot. This test helps to eliminate any detected dots that are not convex. A non-convex dot, more appropriately called a blob, will have a relatively small compactness value, while a convex dot will approach π/4. A perfect rectangle has a compactness of 1. The ideal compactness of an ellipse is given by the following equation:

Equation 26

compactness = πab / (4ab) = π/4

where a is the primary radius and b is the secondary radius, so that the ellipse has area πab and its bounding rectangle has area 2a · 2b = 4ab.

Note that the compactness of an ellipse is independent of its radii; thus this test is scale independent.
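The test amounts to one ratio per candidate region. A sketch follows; the rejection threshold is the π/4 − ε rule from the text, but the value of ε used here is an assumption.

```python
import math

def compactness(pixels):
    """Area of a detected blob divided by the area of its bounding rectangle.
    pixels: iterable of (row, col) coordinates belonging to the blob."""
    pts = list(pixels)
    rows = [r for r, _ in pts]
    cols = [c for _, c in pts]
    box = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return len(pts) / box

def accept_dot(pixels, eps=0.08):
    """Reject blobs whose compactness falls below the ideal pi/4 of an
    ellipse by more than the dot compactness test tolerance eps."""
    return compactness(pixels) >= math.pi / 4 - eps
```

A rasterized disc scores close to π/4 and is accepted, while an L-shaped blob of the same bounding box scores far lower and is rejected.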
The compactness test may also reject valid dots that are occluded, so we must assume either that the landmark dots will not be occluded or simply accept that landmarks may be rejected if occlusion occurs. While ideally a localization algorithm would be able to extract location information from partially occluded landmarks, for this system it is likely better to reject partially occluded landmarks because the occlusion would shift centroids, and thus incur error.  53  The compactness of a proposed dot could be tested for being too large or too small, but the difference between the compactness of an ellipse and the compactness of a square is not large enough to allow a safe comparison. Thus only dots that have a compactness ratio smaller than 7t/4 - £ are rejected, where £ is some tolerance, named the dot compactness test tolerance' . 4  Landmark Verification Once three suitable dots are found, several checks can be performed to reduce false detection, based on implicit knowledge of the landmark shape.  1. The horizontal coordinate of the upper and lower dot differ by less than some tolerance - the vertical dot alignment tolerance. The value of this constant is 5.  2. The middle dot is not further away from the line connecting the upper and lower dots than physically possible due to the dimensions of the landmark. These dimensions are specified in the constants landmark middle dot offset, and landmark height. The value of these constants is 5cm and 10cm as can be seen in Figure 7.  3. The distances from the middle dot to the upper and lower dots are equal to within some tolerance, named the maximum middle dot vertical deviation factor. The value of this constant is 16. Efficiency Considerations Even though the elimination of the convolutions makes this detection method faster, the majority of the processing is still devoted to detecting the landmark. The majority of the time taken to detect a landmark is used to find the black dots, and calculate their centroids.  
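In image coordinates the three landmark checks reduce to a few comparisons. The sketch below works on pixel centroids; the tolerance values mirror the constants quoted above, but the pixel-space simplification of check 2 (a bound on the middle dot's horizontal offset, supplied by the predictor from the landmark's expected scale) and the reading of the symmetry factor in check 3 are my own assumptions.

```python
import math

def plausible_landmark(upper, middle, lower,
                       align_tol=5.0, max_offset_px=None, sym_factor=16.0):
    """upper/middle/lower: (x, y) dot centroids in image coordinates.
    Returns True when the triple passes the three geometric checks."""
    # Check 1: upper and lower dots vertically aligned within tolerance.
    if abs(upper[0] - lower[0]) > align_tol:
        return False
    # Check 2: middle dot no further from the upper-lower line than the
    # landmark dimensions allow (bound supplied in pixels by the predictor).
    if max_offset_px is not None:
        line_x = (upper[0] + lower[0]) / 2.0
        if abs(middle[0] - line_x) > max_offset_px:
            return False
    # Check 3: middle dot roughly equidistant from the upper and lower dots.
    du = math.dist(middle, upper)
    dl = math.dist(middle, lower)
    return abs(du - dl) <= (du + dl) / sym_factor
```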
14  Appendix A contains a list of all constants and tolerances in the system.  54  There are two ways to minimize the time taken to find a black dot. The first is to improve the prediction calculation . The better the prediction is, the less time taken to find a black pixel 15  that is part of the dot. When the prediction is perfect, the search takes no time, since the first dot checked is black.  The second way to reduce the processing requirements is to reduce the area of the search. However, reducing the search area has the adverse affect of allowing the robot to become lost more frequently . 16  Step 4 in the landmark detection algorithm states that new predictions and searches should be undertaken if only one of the three landmark dots was not found. This greatly increases the time taken to process a single image, should the first detection fail. It was found that step 4 of the algorithm is necessary, since if the landmark is improperly mounted, errors will continuously occur. There are many ways to improve this situation. One way is to flag the situation, allowing a human to remount the offending landmark properly. Another way is to not perform this extra step if there is another landmark visible in the image. Possibly the best alternative is to store information about the improperly mounted landmark in the map, and use this information to make a better prediction.  One very significant unanswered question is whether it is better to attempt to detect all landmarks that are predicted to appear in the image, or to stop and capture a new image as soon as a single landmark has been detected.  15  There are generally three ways to improve the prediction. The first is to make the prediction smarter, by increasing the order of the plant model, or otherwise making it more accurate. The second is to decrease the time taken between successive images, but that's why we want an improved prediction. 
The third is to allow the map to "learn" mounting parameters for each landmark, i.e., if the landmark is mounted x cm too high, or at an angle y° off vertical.  16  This would be an interesting relationship to examine, but would require a large amount of data gathering from a fully operational system.  55  It is expected that the answer to this question lies in the relative processing time required to capture an image versus to process a single landmark. At 30 frames per second, capturing an image may take 1/30* of a second. This "cost of sensing" could lead to the decision to make the most of each image if the "cost of detecting" is relatively small. If the image can be captured without using a significant amount of processor time, then it may be best to stop after a single landmark has been detected. In the current design, it is assumed that the image capture will take the full 1/30* of a second of processor time and thus the system attempts to detect all landmarks visible in the image. This issue represents possibly the single largest performance increase for the system should asynchronous image capture become possible. This would mean that image capture occurs automatically via interrupt control. This would require some non-trivial synchronization between the frame grabber and the processor. Noted Problems There are two major known problems with the current system-tracking may be lost, and the identification pattern suffers from perspective effects. The latter is dealt with in Section  The fact that tracking be lost represents a fundamental instability in the system. The significance of this instability rests with the "cost of re-establishing the tracking", and the relative frequency that the robot loses its tracking. In other words, losing tracking, or becoming lost, is not a major problem if it only requires slightly more time to re-establish the tracking, via scanning the entire image for landmarks. 
Currently, using a rather naive initialization procedure, re-establishing the tracking requires approximately 10-20 seconds. The significance of losing tracking also decreases as the frequency of losing tracking decreases, i.e., if the tracking is only lost 1/day, the effect is insignificant. Early experiments have shown however, that tracking is lost fairly frequently.  56 Landmark Tracking Landmark tracking can be performed by tracking the landmark in the image plane or by tracking the position of the robot and predicting the position of the landmark in the next frame. The latter approach has be adopted, since tracking a landmark in the image plane does not provide for new landmarks coming into view.  The plant model for the system plays an important part in landmark position prediction. The plant model must supply an estimate of the current absolute pose of the robot at the time the image was captured. The greater the error in the current position estimate, the greater the chance that a landmark will not be detected in the allotted search region.  3.2.5 Landmark Identification Landmarks are identified, when required, by the identification pattern located directly above the upper dot. This identification marker consists of two black reference bars, and 10 bit squares which may be either black, representing a 1, or white representing a 0. Figure 11 shows the identification marker reference bars and bit regions. The lower middle bit region is always white so that the height of the lower reference bar can be measured by moving directly up from the upper dot of the landmark.  Reference Bars  Figure 11 Landmark Identification Marker  57  The reference bars serve two purposes. First, the height, width and position of the lower reference bar is used to calculate the position of the bit squares. Second, the presence of the reference bars prevents the bit regions from being accepted as valid landmark dots. Any black bit square will be attached to a reference bar. 
Together they will have a low compactness value.  To increase the reliability of the identification scheme, bits 0,3, and 6 are checksum bits which must ensure that the number of bit squares mod 4 is always 3. Perspective Problem The current algorithm for "reading" an ID marker assumes that the shape of the marker is not distorted by perspective. As the effect of perspective increases, the id marker cannot be read. This problem can be ignored with little consequence because landmarks do not have to be identified except at startup. The identity of a landmark is known prior to detection by the prediction subtask. It is still a good idea to attempt to verify the identity of each detected landmark. If the identity marker of the landmark is properly read , and is different from the 17  expected identity, then it can be assumed that the robot was lost, but the new location can be immediately calculated.  3.2.6  Plant Modeling The plant model subtask is responsible for combining multiple pose calculations for a  given instant into a single pose and for providing predictions of the robot pose when requested. The actual implementation of the plant model depends on what information is available in the final system. For the plant model proposed in this thesis it is assumed that odometry and control information is not available. This means that the internal state variables for the model must be derived from observations from the video localization system alone.  17  i.e. The checksum is correct.  58  Ideally the plant model would incorporate, or be attached to, an extended Kalman filter. However the design of an extended Kalman filter is fairly involved and was left for future work. The plant is modeled with a simple second order model. The internal state variables shall be updated with every new pose received. The current pose estimate for the robot is calculated by averaging all pose calculations available in the current image. 
The previous position of the robot is not taken into account.

The plant model is for the RWI robot. This robot may rotate, and move forward or backward in the direction it is currently facing. This leads to a system that is defined by the following equations.

Equation 27

ẋ = v · sin θ
ż = v · cos θ

These two equations form a simple second order model of the robot. Equation 28 shows the set of state variables for the model. The last three state variables are maintained internally.

Equation 28

S(t) = [x(t), z(t), θ(t), θ̇(t), v(t), v̇(t)]ᵀ

The plant model can provide a prediction of the robot pose for time t_{n+1}, given that the state variables were last updated at time t_n. Equation 30 gives the equation to calculate S(t_{n+1}), the estimate of the robot pose at time t_{n+1}. Note that the internal state variables are included even though they are not used by the current system. They may be required for future system improvement.

Equation 29

Δt = t_{n+1} − t_n

Equation 30

x(t_{n+1}) = x(t_n) + sin(θ(t_n) + θ̇(t_n)·Δt/2) · (v(t_n) + v̇(t_n)·Δt/2) · Δt
z(t_{n+1}) = z(t_n) + cos(θ(t_n) + θ̇(t_n)·Δt/2) · (v(t_n) + v̇(t_n)·Δt/2) · Δt
θ(t_{n+1}) = θ(t_n) + θ̇(t_n)·Δt
θ̇(t_{n+1}) = θ̇(t_n)
v(t_{n+1}) = v(t_n) + v̇(t_n)·Δt
v̇(t_{n+1}) = v̇(t_n)

When new pose measurements are received, the state variables are updated. Equation 31 shows the calculation of S(t_{n+1}), given the new pose measurement P_m = [x_m, z_m, θ_m].

Equation 31

x(t_{n+1}) = x_m
z(t_{n+1}) = z_m
θ(t_{n+1}) = θ_m
θ̇(t_{n+1}) = (θ_m − θ(t_n)) / Δt
v(t_{n+1}) = √((x_m − x(t_n))² + (z_m − z(t_n))²) / Δt
v̇(t_{n+1}) = (v(t_{n+1}) − v(t_n)) / Δt

3.3 System Overview

The system is designed as an iterative pipeline with feedback. Figure 12 shows this pipeline structure proceeding from top to bottom. Processing starts with the Landmark Map Control block signaling the Image Capture block to provide a current image. The Landmark Map Control block also sends the absolute position of a potentially visible landmark to the Landmark Prediction block. The Landmark Prediction block in turn sends a prediction of the image coordinates of that landmark to the Landmark Detection block, which locates the landmark within the image.
The Position Calculation block calculates the position of the camera from the shape and position of the landmark in the image and sends the position to the Plant Model block. The Plant Model block, accepts the position, and updates its internal model of the robot. Once the Plant Model has finished, the Landmark Map Control block is notified. The Landmark Map Control block then either sends another potential landmark to the Landmark Prediction block causing another iteration through the pipeline with the same image, or it signals the Image Capture block to provide another image. In the latter case, processing repeats as described above.  61  Landmark Map Control  t  Landmark Prediction  Image Capture  Landmark t Detection  Position ' Calculation  Plant Modelling  System Output  ]  Figure 12 System Structure  3.3.1 Task Decomposition Figure 12 represents a functional decomposition of the system. Each block can be considered a transformation from input to output. The tasks are listed below with a general description of the inputs, outputs and transformation performed. Image Capture /Image Sequence Loader This block, labeled Image Capture, is responsible for providing the current image to any other functional blocks that require it, primarily the Landmark Detection block. It has no inputs, except for a signal that initiates the capture of another image. In a working system, this  62  functional block would encapsulate the hardware frame grabber. For the development system, this functional block is implemented as an image sequence loader that loads consecutive images that have been captured and stored off-line. Landmark Map and Loop Control This block is responsible for managing the landmark map database and controlling the flow of the system pipeline. It accepts a signal from the Plant Model block indicating that the current iteration through the pipeline has completed. It has two outputs. The first is a signal which is used to initiate the capture of a new image. 
This output is sent to the Image Capture block. The second output is the location of a potentially visible landmark. This output is fed to the Landmark Prediction block, and initiates an iteration through the pipeline.

The control rules that govern the Landmark Map Control block may vary depending on implementation, without affecting the proper functioning of other control blocks. For instance, whether the Landmark Map Control block limits the number of landmarks to check before initiating the capture of a new image, or whether it sends out all possible landmarks before initiating capture of a new image, will not affect any other functional block in the system.

Landmark Prediction/Tracking

This block has two inputs: the current position of the robot and the absolute position of some landmark. It has two outputs. The first is a prediction of where the landmark will appear in the current image. The second is a signal used when it is predicted that the landmark will not be visible in the current image. This signal can be used to abort the current iteration through the pipeline. Each dot is put through the following transformations.

Global Position to Relative Position

Equation 32

p_lr = T(−z_c, −x_c, −θ_c) · p_lg

Where:

p_lr is the pose of the landmark relative to the camera.
p_lg is the pose of the landmark relative to the global frame.
T is the general 2D pose transformation.
z_c, x_c, θ_c is the pose of the camera relative to the global frame.

Relative Position to Ideal Image Plane

Equation 33

x_i = f · x_lr / z_lr
y_i = f · y_lr / z_lr

Where:

x_i, y_i, z_i is the position of the dot in 3D space projected onto the image plane.
x_lr, y_lr, z_lr is the position of the dot in 3D space relative to the camera.
f is the focal length of the camera.

Note that y_lr = y_lg, since the height is independent of the translation and rotation.
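The two transformations above can be sketched in C++. This is an illustrative sketch only: the rotation sign convention shown here (0 radians along the z axis, rotating by the negative of the camera orientation) is an assumption consistent with the text, and the Tsai frame-buffer step is not included.

```cpp
#include <cmath>

// Sketch of the first two landmark projection steps described above.
struct Pose2D { double x, z, theta; };

// Equation 32: express a global pose in the camera frame by translating
// by the negative camera position, then rotating by -theta_c.
// The exact rotation convention is assumed, not copied from the thesis.
Pose2D ToCameraFrame(const Pose2D& global, const Pose2D& cam) {
    const double dx = global.x - cam.x;
    const double dz = global.z - cam.z;
    const double c = std::cos(-cam.theta), s = std::sin(-cam.theta);
    return { c * dx + s * dz, -s * dx + c * dz, global.theta - cam.theta };
}

// Equation 33: pinhole projection onto the ideal image plane at z = f.
// The y value carries the landmark height through unchanged.
void ProjectToImagePlane(double xr, double yr, double zr, double f,
                         double* xi, double* yi) {
    *xi = xr * f / zr;
    *yi = yr * f / zr;
}
```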
Ideal Image Plane to Frame Buffer (via Tsai's Model)

Equation 34

[row, col] = Tsai(x_i, y_i)

Where:

row, col are the coordinates of the pixel in the frame buffer.
Tsai(x_i, y_i) is the transformation specified in Equation 3, Equation 4, and Equation 5. Together these equations form a transformation from the ideal image plane to the frame buffer, using the intrinsic parameters of Tsai's camera model (for a specific camera).

Landmark Detection

This block is the most complex and time consuming of all blocks. It is responsible for locating a landmark within the image. It is also responsible for preventing false matches. This block has three inputs: the current image, the predicted position of a landmark in the image, and the id of the landmark to be detected. It has two outputs. The first is the detected location of the landmark in the image. The second is a signal that is used to abort the current iteration through the pipeline if the landmark was not detected.

Position Calculation

This block calculates the absolute position of the camera from the location and shape of the landmark in the image. The block takes two inputs, the position of the landmark in the image and the absolute position of the landmark according to the map. The transformations for this block are listed as Equation 11.

Plant Modeling / System Output

This functional block is responsible for modeling the plant in order to maintain an estimate of the current position at any time. Although this block should ultimately be integrated with a Kalman filter, the current version simply models the plant with a second order model and does not filter new measurements with previous knowledge. The exception is for outliers. If a new measurement arrives that would require the plant to have exceeded acceleration and/or velocity constraints, the measurement is declared an outlier and discarded.

Chapter Four - System Detail

4.
System Detail

This chapter presents a detailed design of the ROLL system implemented on the ACE environment. The chapter begins with a brief description of the ACE environment, with a section on design methodology. An overview of the implementation is given, showing general data flow control. The design is broken down into data types and functional components. (The presentation of data types is an important part of the design methodology of Events & Actions, a paradigm upon which the ACE environment is based.) Each component is described, and its function is related to the overall localization task decomposition given in section 2.1. The chapter finishes with a brief description of a second implementation that was connected to a mobile robot to operate in real time.

The ACE implementation is used as a test bed for the localization algorithm. Images are stored off-line and loaded into the system as if they were obtained in real time by a frame grabber. The ACE environment was selected as a platform for the design because it enforces a high degree of modularity and has a clear data flow architecture. These characteristics aid the creation of an easily understood and modifiable design.

4.1 The ACE Environment

The ACE environment is an implementation of the Events & Components paradigm, created by ISE Ltd. for use in robotics control. (The ACE system was created by the author of this thesis while employed at International Submarine Engineering Limited, and is currently still under development as an in-house software tool. Its potential as a commercial software product is being evaluated.) The Events & Components paradigm is an object-oriented, event-driven data flow environment that supports asynchronous data and control flow. This paradigm is comprised of three major objects: events, actions and components.

4.1.1 Events, Actions and Components

Events can be considered pipes through which a specified type of data can flow.
Components are functional objects that encapsulate a transformation of data from some input events to some output events. Components can be producers of events, consumers of events, or (most often) both. Actions are contained within component objects to perform the preemptive task scheduling.

Events

An event is a pipe through which a specific type of data can be passed from a producer component to a consumer component. A producer outputs data to the event by signaling the event. This is done using the Signal() function that is a member of the event object. Signaling an event incorporates two functions. First, the data being fed through the pipe is changed, and second, any actions that have been attached to the event are scheduled for execution. It is this mechanism that gives ACE its asynchronous event-driven nature.

Events are strongly typed, and any data type that is definable in C++ can be incorporated in an event. Each event stores a single value of its data type for consumers to retrieve. There is also an event that has no data type and is solely used to schedule actions (control flow).

(The name Events & Actions originates from the original implementation of the paradigm, which had only two of the three main objects: events and actions. Components have since been added, and have superseded actions in terms of visibility to the programmer, yet the name of the paradigm remains unchanged.)

Actions

Actions represent a single executable task. They are attached to specific events and are executed each time that event is signaled. Each action has a function to execute and an associated component that provides local data.

Components

A component is a base class from which specific functional blocks can be created by the programmer. A user component consists of inputs, outputs, local data, member functions, and one or more actions that perform the transformation from input to output.
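A minimal sketch of these three objects is given below. This is not the ISE ACE API; the class names, the use of std::function for actions, and the example component are all assumptions made for illustration.

```cpp
#include <functional>
#include <vector>

// Illustrative sketch of the Events & Components mechanism described above.
template <typename T>
class Event {
public:
    // An "action" here is simply a callable attached to the event.
    void Attach(std::function<void(const T&)> action) {
        actions_.push_back(std::move(action));
    }
    // Signal(): store the new value, then schedule (here: run) every
    // attached action -- the two functions of signaling described above.
    void Signal(const T& value) {
        value_ = value;
        for (auto& a : actions_) a(value_);
    }
    const T& Value() const { return value_; }
private:
    T value_{};
    std::vector<std::function<void(const T&)>> actions_;
};

// A component consumes input events and produces output events.
// It attaches its action to its input during construction.
class Doubler {
public:
    Doubler(Event<int>& in, Event<int>& out) : out_(out) {
        in.Attach([this](const int& v) { out_.Signal(v * 2); });
    }
private:
    Event<int>& out_;
};
```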
Component inputs and outputs are references to events. A component is responsible for attaching actions to inputs during initialization. During steady state operation, components are static objects with no direct functionality. All active processing is provided through actions, which call a member function of the component.

4.1.2 ACE Configurations

An ACE configuration is an application created using the ACE environment. A configuration can be defined as the set <E, C, I, O>, where E is a set of events, C is a set of components, I is the set of connections between events and component inputs, and O is the set of connections between events and component outputs.

The actual specification of the configuration is implemented as a single C++ source file. (Due to current C++ standards, the single source file is a restriction that cannot be easily overcome.) Sets E and C appear as statically declared event and component objects. Sets I and O are included in the constructor parameters of the components, i.e., in order to declare a specific component, constructor parameters must be provided to define which events to connect to the input(s) and output(s) of the component. Actions within the components are constructed automatically by the components themselves.

The Integrated Circuit Analogy

An ACE configuration can be considered analogous to a hardware circuit design. Components can be thought of as ICs, and events are the wires that connect the ICs. This analogy can be carried far, and is most useful in understanding ACE configurations. For example, in an ACE configuration, just as in an IC design, it is usually incorrect to have two components (ICs) driving the same event (wire).

4.1.3 Design and Development with ACE

Design with ACE should be performed in two major parts: a functional decomposition of subtasks and a data type decomposition of the objects involved in the system.
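A configuration <E, C, I, O> of the kind described in section 4.1.2 might look like the following as static declarations in a single source file. The stand-in classes here are purely illustrative; only the pattern (events declared first, components wired through constructor parameters) reflects the text.

```cpp
// Sketch of an ACE-style configuration as static declarations.
struct EventInt { int value = 0; };  // set E: a statically declared event

struct Producer {                     // set C: a component...
    EventInt& out;                    // ...set O: its output wiring
    explicit Producer(EventInt& o) : out(o) {}
    void Fire(int v) { out.value = v; }
};

struct Consumer {
    EventInt& in;                     // set I: its input wiring
    explicit Consumer(EventInt& i) : in(i) {}
    int Read() const { return in.value; }
};

// The configuration itself: events first, then components whose
// constructor parameters define the connections between them.
static EventInt middle;
static Producer producer(middle);
static Consumer consumer(middle);
```

As in the IC analogy, `middle` is the wire, and exactly one component drives it.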
The first major part of design is to define the data types passed between the components in the form of events. This part is performed in an object-oriented approach, and is critical to the success of the design. If the data decomposition is performed well, the designer will find that any errors in the functional decomposition are isolated and can be easily corrected by changing a single component.

Second, the problem should be divided into subtasks using functional decomposition. Any subtask can be further broken down. When a subtask can be broken down no further, a component can be created to encapsulate the subtask.

As with all design processes, design in ACE is an iterative process. The functional decomposition and data decomposition will affect each other, although the designer should attempt to let the data decomposition carry more weight than the functional decomposition, since the functional decomposition is more likely to change, due to future requirement changes and unforeseen deficiencies in algorithms, etc.

(The integrated circuit analogy is actively pursued at ISE. Current plans for ACE programming tools include graphical programming environments similar to currently available hardware design aids.)

Asynchronous Data Flow

A special note must be included about the asynchronous nature of ACE. Actions are scheduled as a result of an event being signaled, not on a global clock tick. However, just as ICs are all inherently asynchronous and are made synchronous by the presence of a clock input, an ACE component, and a whole ACE configuration, can be made synchronous by having "trigger" inputs. These triggers can then be connected to "clock" events which are signaled periodically by a timer component.

The ROLL design presented in this thesis is asynchronous in nature. The asynchronous nature is a large benefit when designing a component that can combine pose input from independent sources, such as the video localization system and an odometry system.
The asynchronous nature allows these two systems to provide their information to this component as it becomes available, completely independently of each other. (This component would encapsulate an extended Kalman filter, although in the design presented in this thesis only a simple second order plant model is implemented for this purpose.)

4.2 The ROLL Configuration

Figure 13 shows the ACE configuration for the ROLL system. This figure shows the system components as rectangles, with the component name on the inside. Events are shown as lines connecting the components, with arrowheads pointing to the components that accept these events for input. The name of the event and its type are labeled beside the line. Events with no data are labeled with their name only. The system output, CurrentPosition, is shown as a thick line.

[Figure 13 diagram: the Map, Landmark Image Predictor, Landmark Detector, Relative Position Calculator and Absolute Position Calculator form the main pipeline; the Frame Grabber, Time Latch and Position Estimator form the supplementary pipeline, with Or Gates feeding termination signals back to the Map. Events include Next Landmark (LandmarkPoseStc), Image Captured (GreyScaleImage), Current Time (Seconds Floating Point), Landmark Image Prediction and Landmark Image Detection (LandmarkFrameStc), Relative Position and Absolute Position (PoseStc), and the system output Current Position (PoseStc).]

Figure 13 The ROLL Configuration

The configuration figure appears very similar to Figure 12, on page 62, with some extra components and more detailed data/control flow information. The system is organized into a main pipeline, with a side pipeline to provide supplemental data.

4.2.1 The Main Pipeline

The main pipeline appears as the center column of components in Figure 13.
Processing starts with the Map component, and proceeds in a linear fashion through the Landmark Image Predictor, Landmark Detector, Relative Position Calculator, and Absolute Position Calculator components. Each iteration through this pipeline calculates the position of the robot from a single landmark.

4.2.2 The Supplementary Pipeline

The supplementary pipeline is responsible for providing the main pipeline with the current image and current position estimate. It is also responsible for providing the current position as an external output of the system. This pipeline is shown as the vertical column of components on the right in Figure 13. Processing through this short pipeline begins with the Frame Grabber component obtaining the current image, which is provided to the Landmark Detector component. Updating the current image also fires the Time Latch component, which records the time of capture. This time is fed to the Plant Model component, which responds by providing an estimate of the current robot location to the Landmark Predictor component.

Hereafter, components will be referred to by their names alone; the trailing "component" will be omitted.

4.2.3 Loop Control

The Map is responsible for coordinating the activities of the two pipelines as well as providing the positions of particular landmarks. When the system is started, the Map signals the Frame Grabber, and waits until the image is obtained. At this point, the Map will start an internal iteration through its database/array of known landmarks, sending them, one at a time, through the main pipeline. Once the main pipeline has completed an iteration, the Map has the option of firing the main pipeline with the next landmark in its internal iteration, or firing the supplementary pipeline to obtain a new image.

On the left side of Figure 13, several events are shown leading from components in the main pipeline to the Map.
All of these events indicate termination of the pipeline, regardless of whether an absolute position was actually obtained. Both the Landmark Predictor and the Landmark Detector are capable of terminating the pipeline prematurely should the current landmark, provided by the Map, not be obtainable.

4.3 Data Types

As mentioned in section 4.1.3, specifying the data types is a critical part of configuration design. The ROLL system has many inherent data types, some of which appear as event types, and some of which are used internally to components. These data types provide a level of data abstraction that allows the system to be developed at a fairly high level. Separating data types from the components that produce and consume them also provides flexibility when rearranging the system in order to explore its characteristics and performance.

Most of these data types deal with representing the same information in one of three coordinate systems: Cartesian coordinates, both relative to the camera or landmark and relative to a designated absolute frame; the image plane frame; and the frame buffer coordinates.

All of these data types are simple C structures, since they represent pure data. Operations on the data are more closely associated with the components than with the data itself. The one exception is the GreyScaleImage, which is a C++ class because it has several functions that should be available regardless of which component accepts a GreyScaleImage event as input. One could imagine that if other visual systems were added to the robot's sensing system, they could use the same GreyScaleImage event as input, and would require the same functions that the ROLL system requires.

4.3.1 Image Plane Coordinate Data Types

The image plane coordinate system is used for calculations using the ideal camera model. The origin is the camera optical center, and the z axis is aligned with the optical axis.
The image plane is then located at z = focal length. The direction to an object in this coordinate system is usually given relative to the optical axis, thus relative to the z axis. All values in this coordinate system are inherently real numbers.

ImagePositionStc

The ImagePositionStc represents a single point on the image plane. It encapsulates two values, an x and a y value. Since the point is assumed to be on the image plane, the z value is assumed to be the focal length of the camera.

ImageLandmarkStc

A landmark appears as three dots in an image. The ImageLandmarkStc encapsulates this as three ImagePositionStc's representing the centroids of each dot, and a value representing the radius of the dots. The radii are assumed to be identical, even though the center dot will be slightly different; this difference is assumed to be negligible.

4.3.2 Frame Buffer Coordinate Data Types

The frame plane refers to the representation of an image in a computer. A typical frame buffer consists of a segment of memory used to store a matrix of pixel values. There are many different representations of pixel values. The ROLL system assumes that each pixel value is a single number in the range [0, 255], representing the relative greyscale intensity of the pixel.

Points in a frame are referred to by a <row, column> tuple. Each value can range from 0 to the size of the matrix in either the row or column direction. Because of the discrete nature of a matrix, the row and column values are usually represented as integer numbers. In the ROLL system, real number values are also required for centroid calculations. Therefore the ROLL system includes both real and integer representations of points in the frame buffer.

RealFramePositionStc and IntegerFramePositionStc

These two structures represent a single point in the frame buffer. They encapsulate the row and column values, as row and column fields.
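One possible C declaration of the point types described so far is sketched below. Field names follow the text where given (x, y, row, col); the exact layout is otherwise an assumption.

```cpp
// Illustrative declarations of the coordinate data types described above.
typedef struct {
    double x, y;  // point on the ideal image plane (z is the focal length)
} ImagePositionStc;

typedef struct {
    ImagePositionStc upper, middle, lower;  // centroids of the three dots
    double radius;  // shared dot radius; per-dot differences are negligible
} ImageLandmarkStc;

typedef struct {
    double row, col;  // real-valued frame buffer point (for centroids)
} RealFramePositionStc;

typedef struct {
    int row, col;     // integer frame buffer point (for pixel access)
} IntegerFramePositionStc;
```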
In the RealFramePositionStc these values are stored as the C data type double, and in the IntegerFramePositionStc they are stored as the C data type int. These data types are represented using 64 and 32 bits respectively on the Win32 platform using the Visual C++ 4.0 compiler.

FrameEllipseStc

This structure represents an ellipse in the frame buffer. The structure contains a RealFramePositionStc and an integer area. The area represents the number of pixels contained wholly within the ellipse.

FrameLandmarkStc

Just as an ImageLandmarkStc represents a landmark in the ideal camera image plane by three points and a radius, the FrameLandmarkStc represents a landmark in the frame buffer by three RealFramePositionStc's and an area. Again, although the areas of the dots may vary, this variance is assumed to be insignificant, and the average is stored as a single value representing the area of all three dots.

4.3.3 Cartesian Coordinate Data Types

The Cartesian coordinate frame is represented by a right-handed <x, y, z> tuple. For the ROLL system, the xz plane is aligned with the horizontal, and the y axis points directly upwards. This corresponds to the normal conception of the Cartesian coordinate system. Unlike the typical Cartesian coordinate system, the ROLL system assumes that 0° in the xz plane is aligned with the z axis instead of the x axis. This change is made in order to align the Cartesian coordinate axes with the traditional alignment of axes in the ideal camera model, where the z axis is set to the camera's optical axis. The direction of objects from the position of the camera then corresponds directly to orientation values in the Cartesian plane. This makes translation between the coordinate systems not only easier to perform, but easier to understand. All values in the Cartesian coordinate system are real valued. All angles are in radians.

PoseStc

The PoseStc represents a robot or landmark position in the Cartesian coordinate system.
Each robot or landmark is assumed to be fixed to the horizontal plane, which eliminates the need for a y coordinate. Each robot or landmark also has an orientation relative to the z axis. Thus the PoseStc contains an <x, z, orientation> tuple.

LandmarkStc

A LandmarkStc is used to represent a landmark in a Cartesian plane. It contains a pose, and adds an integer identification field. All landmarks have a unique integer identification number.

4.3.4 GreyScaleImage Data Type

The GreyScaleImage data type was created to encapsulate a frame buffer. Unlike the previous data types, a GreyScaleImage is an object type and thus has some associated operations. The GreyScaleImage is actually an abstract base class. This was done to allow simulated and real images to flow through the system without modifying those components that require access to the image.

The most notable operations that must be provided by a GreyScaleImage are GetPixel(), SetPixel(), Width(), and Height(). It also contains a GetBuffer() operation that is useful for performing lower-level time critical operations. Time performance is not critical for proving the operation of the ROLL system in the ACE environment, so all pixel values are obtained through the GetPixel() member function.

4.4 Components

Each component shown in Figure 13 is described in this section. Inputs, outputs and attached actions are listed for each component. The functions called by any actions are described using a terse pseudo code algorithm. Components that perform critical localization subtasks are described in detail, since much of the design work for the ROLL system lies in the algorithms contained within these components.

4.4.1 The Frame Grabber

[Figure 14 diagram: the Frame Grabber component, with a Get Image input and a Current Image (GreyScaleImage) output.]

Figure 14 Frame Grabber Component

The frame grabber is an abstract base class that encapsulates the responsibility of generating the current image from the system's camera.
It has a single, dataless input, GetImage, and a single output, CurrentImage, of type GreyScaleImage. There is a single action connected to its input that is responsible for generating/obtaining the image.

For a physical implementation of the ROLL system, a component derived from this base class would be created to interface with a specific hardware frame grabber. Although constructor parameters might be added to deal with initialization of the hardware, the runtime operation of the component would remain the same, i.e., its "event interface" remains the same. Components are C++ objects and can therefore use C++ inheritance mechanisms to provide specialization for different hardware, or even, in this case, simulation of hardware.

The Image Sequence Loader

[Figure 15 diagram: the Image Sequence Loader component, with a Get Image input and a Current Image (GreyScaleImage) output.]

Figure 15 The Image Sequence Loader Component

For simulation in the ACE testbed, images are generated off-line and must be loaded during simulation. The Image Sequence Loader is a component that mimics a Frame Grabber. It loads images, in sequence, from disk whenever the GetImage input is signaled. The function that performs this operation is a simple four-step process. Each step involves a large amount of detail, but these details are of no academic interest. The high level algorithm is:

1. Load the next image.
2. Increment the current image counter.
3. Convert the image to a GreyScaleImage.
4. Signal the output event with the new image.

The current implementation uses a simple IFF greyscale format for the image files. Image pixels are stored exactly as they are in a GreyScaleImage, so that the file is simply mapped to memory.

4.4.2 The Map

[Figure 16 diagram: the Map component, with Loop Terminated and Image Grabbed inputs, and Next Image and Landmark Absolute Position (LandmarkPoseStc) outputs.]

Figure 16 The Map Component

The Map has a dual responsibility. It must load and maintain the database of known landmarks, and it must synchronize iteration through the two pipelines.
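The four-step loader algorithm above can be sketched as follows. The file naming scheme, the stand-in GreyScaleImage type, and the method name OnGetImage are assumptions; the thesis only specifies the IFF greyscale format and the four steps.

```cpp
#include <cstdio>
#include <string>

// Sketch of the Image Sequence Loader's four-step algorithm.
struct GreyScaleImage { std::string path; };  // stand-in for the real class

class ImageSequenceLoader {
public:
    // Called when the GetImage input is signaled.
    GreyScaleImage OnGetImage() {
        // 1. Load the next image (here: just form its file name;
        //    the real loader maps the IFF file straight into memory).
        char name[64];
        std::snprintf(name, sizeof(name), "frame%04d.iff", counter_);
        // 2. Increment the current image counter.
        ++counter_;
        // 3. Convert the image to a GreyScaleImage (trivial here).
        GreyScaleImage img{name};
        // 4. Signal the output event with the new image (returned here).
        return img;
    }
private:
    int counter_ = 0;
};
```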
(In retrospect, the author feels this component should have been separated into two components: a Map and a Pipeline Controller.)

It has two inputs: LoopTerminated and ImageGrabbed. Both of the inputs are dataless. It has two outputs: CurrentLandmark and NextImage. The former has data type LandmarkPoseStc, and the latter is dataless.

The current implementation of the Map stores all of the known landmarks in an array of PoseStc, where the index of an element in the array is the id of the landmark. It has two actions: NextLandmark and StartMainIteration. There is a single local variable, currentIndex, that retains an index into the array of landmarks. The StartMainIteration action follows this algorithm:

1. Set currentIndex to the first element of the landmark array.
2. Signal the CurrentLandmark output with the first element of the array.

The NextLandmark action follows this algorithm:

1. Increment the currentIndex.
2. If the currentIndex exceeds the bounds of the landmark array:
2.1. Signal the GrabImage output.
2.2. End.
3. Signal the CurrentLandmark output with the landmark at currentIndex.

4.4.3 The Landmark Predictor

[Figure 17 diagram: the Landmark Predictor component, with Landmark Pose (LandmarkStc), Camera Pose (PoseStc) and Trigger inputs, and Not Visible and Landmark Image Position (FrameLandmarkStc) outputs.]

Figure 17 The Landmark Predictor Component

The Landmark Predictor is responsible for predicting whether a given landmark will appear in the image, and if so, where. It has two inputs. The first is CurrentLandmark, of type LandmarkStc. This input should hold the absolute position and id of some landmark. The second input, of type PoseStc, should contain an estimate of the current camera pose.

The Landmark Predictor has two outputs. The first is a dataless output, NotInImage, that should be signaled if the landmark will not appear in the current image, based on the current position estimate for the robot.
The second output, of type LandmarkFrameStc, is called LandmarkPrediction. This output should be signaled with a prediction of where the landmark will appear in the image, if it is determined that the landmark will appear in the image.

There is a single action, PredictLandmark, which performs the computations of the following algorithm:

1. If the landmark will not appear in the current image:
1.1. Signal the NotInImage output.
1.2. End.
2. Calculate the predicted position of the landmark in the current image.
3. Signal the LandmarkPrediction output with the calculated prediction.

Predicting If the Landmark Will Appear in the Image

The landmark will appear in the image if a ray traced from the predicted position of the camera to the absolute position of the landmark falls within a given range.

Equation 35

| tan⁻¹( (x_ldmk − x_camera) / (z_ldmk − z_camera) ) − θ_camera | < FIELD OF VIEW

The FIELD OF VIEW constant depends on the camera and lens that are used. For the particular camera used in this system, the value of this constant is approximately 0.40 radians, or 23°. In addition, the landmark must be facing the camera, i.e., the orientation of the landmark relative to the predicted pose of the camera must be within π/2 of −π.

Equation 36

| θ_ldmk − θ_camera + π | < π/2

If the above two conditions are true, then it is assumed that the landmark will appear in the current image.

Estimating the Position of the Landmark in the Image

Calculating the expected position of the landmark in the current image is a fairly lengthy procedure. The general approach to projecting a point in absolute space onto an ideal image plane was introduced earlier; these calculations are repeated here in more detail.

First, the x-z positions of the upper and lower dots of the landmark are calculated relative to the camera. This is done by translating the landmark position by the negative of the camera position, and then rotating by the negative of the camera orientation.
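The two visibility conditions can be sketched as a single predicate. The 0.40 radian field of view comes from the text; Equation 36 was reconstructed from a garbled original, so its exact form here (wrapped into (−π, π], where ±π are the same angle) should be treated as an assumption.

```cpp
#include <cmath>

// Sketch of the landmark visibility test (Equations 35 and 36 above).
const double kFieldOfView = 0.40;  // radians; camera- and lens-dependent
const double kPi = 3.14159265358979323846;

// Wrap an angle into (-pi, pi].
double WrapAngle(double a) {
    while (a > kPi) a -= 2.0 * kPi;
    while (a <= -kPi) a += 2.0 * kPi;
    return a;
}

bool LandmarkMayBeVisible(double xL, double zL, double thetaL,
                          double xC, double zC, double thetaC) {
    // Equation 35: the bearing to the landmark (0 rad along +z) must lie
    // within the camera's field of view.
    double bearing = std::atan2(xL - xC, zL - zC);
    if (std::fabs(WrapAngle(bearing - thetaC)) >= kFieldOfView) return false;
    // Equation 36 (reconstructed): the landmark must face the camera,
    // i.e. its relative orientation must be within pi/2 of pi (= -pi).
    return std::fabs(WrapAngle(thetaL - thetaC - kPi)) < kPi / 2.0;
}
```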
This requires a simple application of the modified general pose transformation given as Equation 8.

Equation 37

p_ldmk,camera = T(−z_camera, −x_camera, −θ_camera) · p_ldmk,absolute

From this position it is possible to calculate the expected x coordinate of the upper and lower dots of the landmark in the image plane. This is done through straight application of the ideal camera equations.

Equation 38

x_image = x_ldmk,camera · focal length / z_ldmk,camera

Knowing that the landmark is placed at exactly the same height as the camera, we can calculate the expected y position of the upper and lower dots in the image by calculating the expected height of the landmark in the image plane and placing the upper and lower dots equidistant from the y = 0 line in the image plane.

Equation 39

height_image = actual landmark height × focal length / z_ldmk,camera

Equation 40

y_image,upper = −y_image,lower = height_image / 2

The next major step is to calculate the position of the middle dot of the landmark in the image plane. First the absolute position of the landmark middle dot must be calculated. Clearly this value could be stored to avoid repeating the calculation of a constant value, but it was left as a calculation in the simulated environment.

Equation 41

x_offset,absolute = x_ldmk + sin(θ_ldmk) · offset length

Equation 42

z_offset,absolute = z_ldmk + cos(θ_ldmk) · offset length

Again, the general pose transformation of Equation 8 is used to transform the absolute position into the position relative to the camera.

Equation 43

p_offset,camera = T(−z_camera, −x_camera, −θ_camera) · p_offset,absolute

From the assumption that the landmark is mounted at the same height as the optical axis, we can assume that the y position of the middle (offset) dot is 0. The x position can be calculated using the ideal camera equation.
Equation 44

x_image,middle = x_offset,camera · focal length / z_offset,camera

Finally, the last step for calculating the dot locations is to take all of the ideal image positions and translate them into frame buffer coordinates. This is done using the Tsai transformation specified in Section 2.6.

Equation 45

[row_upper, col_upper] = Tsai(x_image,upper, y_image,upper)

Equation 46

[row_middle, col_middle] = Tsai(x_image,middle, y_image,middle)

Equation 47

[row_lower, col_lower] = Tsai(x_image,lower, y_image,lower)

Now all that is left for the landmark prediction output is to provide an estimate of the radius of each dot in frame buffer pixels. This is approximated using the ideal camera equations, the Tsai transformation near the origin of the image plane, and the known diameter of the landmark dots.

Equation 48

[row₁, col₁] = Tsai( dot diameter × focal length / z_ldmk,camera , dot diameter × focal length / z_ldmk,camera )

Equation 49

[row₂, col₂] = Tsai(0, 0)

Equation 50

radius = √( (row₁ − row₂)² + (col₁ − col₂)² ) / 2

4.4.4 The Landmark Detector

[Figure 18 diagram: the Landmark Detector, with Current Landmark (LandmarkStc), the predicted landmark (FrameLandmarkStc) and Current Image (GreyScaleImage) inputs, and Not Detected and Detected Landmark (FrameLandmarkStc) outputs.]

Figure 18 The Landmark Detector Subconfiguration Interface

The Landmark Detector component is responsible for finding a specific landmark in an image, given a prediction of where the landmark will appear in the image, and a reference to the image itself. This is by far the most complex component, because it contains an entire, though rudimentary, object recognition system. Because of its complexity, the landmark detector is further decomposed into a sub-configuration. This sub-configuration is comprised of four different component types: a structure splitter, an ellipse detector, a landmark shape verifier, and a landmark id verifier. These components are connected in the configuration shown in Figure 19.
The thick arrows represent the inputs and outputs to the landmark detector sub-system.

Figure 19 The Landmark Detector Sub-Configuration
(Tier 1: a frame landmark splitter feeding three ellipse detectors; tier 2: a landmark shape verifier; tier 3: three further ellipse detectors; tier 4: three landmark shape verifiers; tier 5: a landmark identification verifier and an AND component.)

The Frame Landmark Splitter

Figure 20 The Frame Landmark Splitter Component

The frame landmark splitter component simply takes a FrameLandmarkStc as an input and splits it into three FrameEllipseStc structures for output. Splitting this structure up allows each centroid to be detected using a single type of component that attempts to detect a dot independently of the other dots in the landmark.

The Ellipse Detector

Figure 21 The Ellipse Detector Component

The ellipse detector component accepts a predicted ellipse location and a reference to the current image, and searches for an ellipse in the image in the vicinity of the predicted centroid. A general description of the algorithm used can be found on page 51.
If an ellipse cannot be detected within the vicinity of the prediction, the output is marked as an invalid ellipse by setting the row and column frame buffer coordinates to -1.

The Landmark Shape Verifier

Figure 22 The Landmark Shape Verifier Component

The landmark shape verifier component takes three FrameEllipseStc events as inputs and has seven outputs. Several tests are performed on the three input ellipses. First, each ellipse is checked to see if it is valid; an invalid ellipse has its x and y coordinates set to -1. If all three ellipses are valid, they are subjected to several further tests, detailed in the section entitled Landmark Verification on page 48. If the ellipses pass all of the tests, they are aggregated and output in the Valid Landmark output event.

If exactly one ellipse is found to be invalid, it is possible to continue the search by predicting three more ellipse centroid positions based on the locations of the two valid ellipses. These predictions can then be passed on to more ellipse detectors.

Should two or more of the ellipses be invalid, or should three valid ellipses fail one of the landmark verification tests, the dataless event BadShape is signaled.
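The validity check and the fallback prediction described above can be sketched as follows. This is an illustrative reconstruction, not the ROLL source: the function names are invented, and the fallback assumes the three dots are predicted to lie collinear and evenly spaced, so a missing dot can be extrapolated from the two valid ones.

```python
def is_valid(ellipse):
    """An ellipse is invalid when the detector set its coordinates to -1."""
    row, col = ellipse
    return row != -1.0 and col != -1.0

def predict_missing(valid_a, valid_b, gap):
    """Extrapolate a missing dot centroid from two valid dot centroids.

    `gap` is how many dot spacings separate the missing dot from valid_b
    (e.g. upper and middle valid, lower missing -> gap = 1).  Assumes the
    three landmark dots are collinear and evenly spaced in the image.
    """
    (ra, ca), (rb, cb) = valid_a, valid_b
    return (rb + gap * (rb - ra), cb + gap * (cb - ca))
```

For example, with a valid upper dot at (100, 50) and a valid middle dot at (120, 51), the lower dot would be predicted at (140, 52).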
The Landmark Identification Verifier

Figure 23 The Landmark Identification Verifier

The landmark identification verifier component takes a verified landmark and searches the area above it in the current image for a valid identifier block. The number read from this search is compared to the value given in the identification field of the current landmark input. If the numbers match, the detected landmark is copied to the verified landmark output. If the numbers do not match, and the identifier block was correctly read, the wrong id output is signaled.

The specification of a valid identifier block is found in Section 3.2.5 on page 57.

Operation of the Landmark Detector Sub-Configuration

Referring to Figure 19, one can trace the operation of the Landmark Detector Sub-Configuration. This sub-configuration is composed of five sequential tiers, each containing one or more components. Within a tier, the components may operate concurrently. Although true concurrency cannot be achieved without a multiprocessor machine, it is notable that this potential concurrency arises naturally when designing systems using ACE.

In tier 1, the predicted landmark event is fed into a frame landmark structure splitter, which produces three FrameEllipseStc events, each holding the predicted centroid of a single dot of the landmark. Each of these FrameEllipseStc events is fed into an Ellipse Detector component, which searches the vicinity of the predicted centroid in the current image to find a black dot. If none is found, the Ellipse Detectors output FrameEllipseStcs with -1 in the row and column fields.
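The per-dot search in tier 1 can be sketched as a windowed dark-blob centroid search. This is only an illustrative reconstruction under assumed details (the actual algorithm is the one described on page 51): here a fixed threshold marks dark pixels inside a square window around the prediction, and the blob centroid and pixel area are returned, with -1 values signalling failure.

```python
def find_dark_blob(image, pred_row, pred_col, window=20, threshold=64):
    """Search a (2*window)-pixel neighbourhood of the predicted centroid.

    `image` is a 2-D list of greyscale values (0 = black).  Returns
    (row, col, area): the centroid and count of dark pixels, or
    (-1, -1, -1) if no pixel in the window is darker than `threshold`.
    """
    rows, cols = len(image), len(image[0])
    sum_r = sum_c = area = 0
    for r in range(max(0, pred_row - window), min(rows, pred_row + window)):
        for c in range(max(0, pred_col - window), min(cols, pred_col + window)):
            if image[r][c] < threshold:
                sum_r += r
                sum_c += c
                area += 1
    if area == 0:
        return (-1, -1, -1)
    return (sum_r / area, sum_c / area, area)
```

The window size and threshold here are placeholders; in ROLL such values are among the hard-coded constants discussed in Chapter 6.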
This component performs some tests on the three ellipses to see if they form a valid landmark. If these three FrameEllipseStc events form a valid landmark, the subconfiguration skips to tier 5. Otherwise, if one of the ellipses is invalid, there is still a chance that the third dot may be found. To do this, the first Landmark Shape Verifier calculates three new ellipse predictions. These new predictions are passed on to tier 3 which consists of three more Ellipse Detectors.  The output of the tier 3 Ellipse Detectors, along with the original ellipses from tier 2 that were valid, are passed to tier 4 which contains three Landmark Shape Verifiers. If any of the tier 3 Ellipse Detectors managed to find a valid ellipse, one of the tier 4 Landmark Shape Verifiers will receive three valid ellipses and can verify the shape.  In tier 5, if any of the Landmark Shape Verifiers from tier 2 or tier 4 has found that its inputs form a valid landmark, that landmark is passed on to the Landmark Identification Verifier . This component can then verify the identity of the landmark. If the identification proceeds satisfactorily, the fully detected and verified landmark is output from the Landmark Identification Verifier in the Detected Landmark event. If none of the Landmark Shape Verifiers  97  if tier 2 or tier 4 receive ellipses that form a valid landmark, or if the identification of the detected landmark is found to be wrong, the Not Detected event is signaled. This condition is governed by the AND component which will signal its output if all three tier 4 Landmark Shape Verifiers do not find a valid landmark on its inputs.  98  4.4.5 The Relative Position Calculator  Relative Position Calculator (FrameLandmarkStc)  (PoseStc) Frame  Position  Relative  Position  Figure 24 The Relative Position Calculator Component  The Relative Position Calculator component takes a valid landmark in the frame buffer and calculates the pose of the landmark relative to the camera. 
The section starting on page 41 details the equations used to calculate the relative position of the landmark. The final relative position is output in a PoseStc in the Relative Position event.

4.4.6 The Absolute Position Calculator

Figure 25 The Absolute Position Calculator Component

The Absolute Camera Position Calculator component takes the pose of a landmark relative to the camera, and the known absolute pose of the detected landmark, and calculates the absolute position of the camera. This is done using the general pose transformation equation listed as Equation 8; this equation is shown again in Equation 11 with the appropriate parameters to calculate the absolute position of the camera.

4.4.7 The Position Estimator

Figure 26 The Position Estimator Component

The position estimator component is responsible for maintaining an estimate of the current position. The implementation of this component may range from simply copying the detected position to the output, to a Kalman filter that incorporates odometry or dead reckoning with the visual localization system. Section 3.2.6 gives a set of equations that provide a second order kinematic model of the robot considered for this system.

If the system were to be integrated with a full test-bed, this component must be able to provide a position estimate when requested, rather than at discrete asynchronous intervals. The equations given in Section 3.2.6 do allow for a pose estimate at any time.

4.5 Embedded System Implementation

Although this thesis centers around the simulated version of the ROLL system, the simulated version was actually preceded by a physical implementation. An overview of the system is shown in Figure 27.
A small camera was mounted on top of a cylindrical robot, nicknamed Spinoza. The image from this camera was transmitted using two transceivers. All of the localization was performed on a single Texas Instruments TMS320C40 DSP board with a frame grabber. Position data obtained from the localization algorithm was passed on to a second TMS320C40 board with a frame buffer, whose contents were displayed on an available monitor. The robot was manually driven using a host computer serial port connection and another set of transceivers not shown in Figure 27.

Preliminary performance analysis revealed that a single C40 was able to process the images at approximately 8 Hz. At this rate it was possible to drive the robot only at speeds lower than 1 meter/s. Rotational speed also had to be limited. The maximum range at which landmarks were detectable was about 2.5 meters.

Figure 27 The Physical Implementation

The system showed sensitivity to the many constants involved in the algorithm. These constants are listed in Appendix A. The system also experienced a high frequency of "misses", where the robot would fail to detect the landmark for several frames and become lost. Several rough re-initialization schemes were attempted, but none proved both reliable and quick.

Unfortunately, before these deficiencies could be investigated and the system characteristics quantified properly, several factors came to the forefront that instigated the move to the simulated test bed using the ACE programming environment.

The first factor was that the code was not well organized, and debugging and maintenance were quickly becoming impractical. Secondly, after a prolonged absence from the project, the robot mountings were changed, rendering it impossible to use the robot further.

Chapter Five - Results and Evaluation

5. Results And Evaluation

This chapter presents the results from system testing and a qualitative evaluation of system performance.
It is split into two sections: Localization System Evaluation and ACE Environment Evaluation. The first and larger section concerns itself with the evaluation of the localization system and algorithms, while the second, smaller section relates to the sub-theme of this thesis, which is programming in the ACE environment.

5.1 Localization System Evaluation

Accurate system evaluation proved to be very difficult for the ROLL system. There are a very large number of possible metrics that would be desirable, but for one reason or another each of these metrics was not practically obtainable.

5.1.1 Issues / Difficulties

There were two main problems that thwarted testing. The first was the difficulty of measuring absolute location. The second was the effort involved in obtaining images.

Measuring Errors

Even with a metered optical bench, there were several large sources of error that prevented an accurate measurement of absolute pose.

Figure 28 The Camera Mounting Platform

Because the camera was required to be at the same height as the landmark middle dot, a vertically adjustable mounting for the camera was required. This vertical mounting was in fact a two degree of freedom mounting. The platform was adjustable in the horizontal plane. In turn, when the camera was mounted on the mounting platform, it was also free to rotate in the horizontal plane. Both of these rotational freedoms could be secured, but not at a specific angle. Because the camera is mounted at an offset from the vertical pole, this inaccuracy affected all three pose variables: x, z, and orientation.

The landmark mounting also had similar problems. In particular, on the motion stage shown in Figure 29, the landmark had no fastenings and had to be aligned by simple inspection. Also, while it was very easy to get very accurate relative motion from the motion stages, the absolute coordinate of the starting position was difficult to measure, resulting in more error.
Image Capture Difficulties

The second problem was the rather slow rate at which images could be generated and analyzed. Because the system was moved to a stand-alone Windows NT simulated environment, each image in any image sequence had to be manually transferred either through floppy disk or modem. Each image used approximately 256 Kbytes of disk space. Another problem was the lack of automation possible in the frame grabber utility: capturing each image required input from the user.

With these image capture problems, it became clear that analyses requiring significant numbers of images were impractical.

5.1.2 Experiment Procedure

For the above reasons the decision was made to test the system using a small number of test runs under relatively controlled conditions.

To minimize inaccuracies in measuring the absolute position of the camera, the camera was calibrated using the Tsai method immediately before capturing images. The camera was kept immobile after calibration and the calibration target was replaced by the localization target. An assumption was made that the results from the calibration algorithm were perfectly accurate. This reduced the source of measurement error to the error incurred by replacing the calibration target with the localization target.

Experiment Setup

Figure 29 Optical Bench Motion Stage Setup

Figure 29 depicts an overhead view of the experimental setup. The optical bench provided a secure surface on which to mount the camera platform and the motion stage. The motion stage consisted of a platform that was driven by an electric motor with a single screw gear along a precise track. This platform could be moved using a few script utilities provided by Stewart Kingdon at the UBC LCI.
5.1.3 Obtaining a Sequence of Images

For each run, the motion stage was "homed" by moving it to the extreme far end from the motor, and then moving it a known length backwards to arrive at the starting point. Each run consisted of eleven images taken at different positions, beginning 2 inches in front of the starting point and continuing at 2 inch intervals thereafter. At each stop an image was obtained.

5.1.4 Camera Calibration Procedure

The system described in this section is the result of the efforts of Rod Barman, who implemented the calibration algorithm described in [Tsa87]. Beginning with the setup referred to in Figure 29, the localization target, shown in Figure 7, was temporarily mounted on the motion platform, and the motion platform was moved adjacent to the camera mounting pole. The camera mounting was then adjusted so that the camera was at the exact height of the middle dot of the landmark. The camera position was not allowed to be altered from this step forward.

The motion stage was moved away from the camera to the opposite corner of the optical bench. The localization landmark was then replaced with the calibration landmark, and a sequence of images was captured. This sequence was processed to obtain the centroid of each dot in each image, using a utility that required the user to specify a few dots manually before the rest could be detected automatically.

Figure 30 The Calibration Target

The array of dots was then processed by a utility that actually performed the Tsai calibration calculations. This utility was applied to the data multiple times until the results converged.

5.1.5 Camera Calibration Results

The results of the calibration run are the internal and external parameters of the Tsai camera model. The exact parameters are listed below.
# Frame buffer center (Cx, Cy)
240.106566062225 231.027618810727
#
# CCD sensor dimensions in meters (dx, dy)
1.14583333333333e-05 1.32530120481928e-05
#
# Frame buffer to CCD scale factor (Sx)
0.682694160379097
#
# Radial distortion coefficients (k1, k2)
2954.18720841782 0
#
# Focal length (f)
0.00812931290009322
#
# Yaw, pitch and roll (theta, phi, psi)
0.379992612801381 3.14827952707384 -0.0104168964014736
#
# Translation vector (tx, ty, tz)
-0.331945854525235 0.111682017910593 1.75903951865051
#
# Camera origin: (0.961970) (0.089901) (1.511096)
#
# RMS pixel error (0.108176) x, (0.156792) y.
#
# Maximum pixel error (0.412734) x, (0.499458) y.
#

The internal parameters were converted into data accessible to the localization system so that the Tsai ideal image plane to frame buffer transformations could be applied. The external parameters, the yaw and the translation vector, were used as the absolute position measurement of the camera relative to the world origin. The values given in the translation vector are relative to the bottom right dot of the calibration target, and constitute the T vector shown below in Equation 51. The yaw value is used to form the 2D rotation matrix R.

Equation 51
[ x ]       [ X_w ]
[   ] = R * [     ] + T
[ z ]       [ Z_w ]

The world coordinates of the camera can be found by rearranging Equation 51 and inserting the known values. Equation 52 shows the resulting linear system; the result of solving this system is shown in Equation 53.

Equation 52
[ X_w ]   [  cos(-0.38)  sin(-0.38) ]   [  0.332 ]
[     ] = [                         ] * [        ]
[ Z_w ]   [ -sin(-0.38)  cos(-0.38) ]   [ -1.760 ]

Equation 53
[ X_w ]   [  0.96 ]
[     ] = [       ]
[ Z_w ]   [ -1.51 ]

Equation 54
P_camera^w = ( x, z, orientation ) = ( 0.96 m, -1.51 m, pi - 0.38 = 2.762 rad )

5.1.6 Localization Test Run Procedure

The calibration target was replaced with the localization target.
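As a numerical check on the camera-position calculation in Equations 52 and 53 above, the inverse 2D rotation can be applied to the negated translation vector. This sketch is illustrative only: the function and variable names are mine, and only the yaw component of the full Tsai external model is used, so the result agrees with the camera origin in the calibration printout to about a millimetre rather than exactly.

```python
import math

def camera_world_position(yaw, tx, tz):
    """Recover the camera's (X_w, Z_w) from the Tsai external parameters.

    Solves [x; z] = R * [X_w; Z_w] + T for the camera origin (x = z = 0),
    i.e. [X_w; Z_w] = R^-1 * (-T), using only the 2D (yaw) rotation.
    """
    c, s = math.cos(-yaw), math.sin(-yaw)
    nx, nz = -tx, -tz          # -T, restricted to the x-z plane
    xw = c * nx + s * nz       # first row of R^-1
    zw = -s * nx + c * nz      # second row of R^-1
    return xw, zw

xw, zw = camera_world_position(0.379992612801381,
                               -0.331945854525235, 1.75903951865051)
```

With the calibration values above this yields roughly (0.961, -1.510), consistent with the printed camera origin of (0.961970, ..., 1.511096) up to the sign convention on z.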
The localization target's top and bottom dots were vertically aligned with the center front of the motion stage platform. The landmark orientation was also aligned with the center front of the motion stage. These alignments were made with only the assistance of the human eye, and are therefore believed to have incurred significant error. Despite this source of error, the other sources of error were minimized by the use of the calibration routine to measure camera position, and by keeping the camera position fixed through calibration and the test runs.

The image sequence for the test run started from the same spot along the motion stage track as the calibration, and proceeded forward in steps of 2 inches. At each point, an image was captured and saved to disk. The images were then transferred to a Windows NT workstation, where they were fed to the simulated system.

Light Level Variance

The test run was repeated at a lower light level, achieved by turning off half of the lights in the laboratory. This second run is used to demonstrate the system's robustness under varying light levels. Figure 31 and Figure 32 show a sample of the images taken in the high light level and low light level test runs respectively. These images are 256 colour greyscale images, but were only shown with 64 colours by the image viewing software used to display them.

Figure 31 Test Run #1 - Full Light

Figure 32 Test Run #2 - Dark

5.1.7 Analysis Of Localization Test Runs

The simulated system was configured to produce a text file log of the results of each component. The complete log for each test run is included in Appendices B and C. Excerpts from each run are listed in the sections below to explain any error conditions.

Simulated Test Run #1 - Full Light

Table 1 summarizes the results from the first simulated test run.
Each row shows the position calculated by the system, and the position calculated from the initial position obtained in the Tsai calibration; the latter value is shown in the "Known Pose" column of the table. The name of the image file is shown in the leftmost column. The error pose, in the fourth column, is found by subtracting the measured pose from the known pose. The final column adjusts the error by subtracting the average error for the test run; this approximates a correction for any systematic errors.

Table 1 Simulated Test Run #1 - Full Light

Image         Measured Pose            Known Pose               Cartesian Error
              (x, z, orientation)      (x, z, orientation)      (x, z, orientation)
ldmk15.iff    (-68.3, 152.0, 2.755)    (-67.0, 151.9, 2.762)    ( 1.3, -0.1,  0.007)
ldmk13.iff    (-66.8, 145.7, 2.761)    (-67.0, 146.8, 2.762)    (-0.2,  1.1,  0.001)
ldmk11.iff    (-64.0, 141.6, 2.780)    (-67.0, 141.7, 2.762)    (-3.0,  0.1, -0.018)
ldmk09.iff    (-68.3, 136.0, 2.754)    (-67.0, 136.7, 2.762)    ( 1.3,  0.7,  0.008)
ldmk07.iff    (-67.4, 131.9, 2.762)    (-67.0, 131.6, 2.762)    ( 0.4, -0.3,  0.000)
ldmk05.iff    (-65.6, 126.8, 2.773)    (-67.0, 126.5, 2.762)    (-1.4, -0.3, -0.011)
ldmk03.iff    (-67.6, 121.4, 2.760)    (-67.0, 121.4, 2.762)    ( 0.6,  0.0,  0.002)
ldmk01.iff    (-66.9, 116.5, 2.765)    (-67.0, 116.3, 2.762)    (-0.1, -0.2, -0.003)
ldmk-1.iff    (-65.3, 111.6, 2.777)    (-67.0, 111.3, 2.762)    (-1.7, -0.3, -0.015)
ldmk-3.iff    (-66.2, 106.3, 2.770)    (-67.0, 106.2, 2.762)    (-0.8, -0.1, -0.008)
ldmk-5.iff    (-65.2, 101.6, 2.780)    (-67.0, 101.1, 2.762)    (-1.8, -0.5, -0.018)
ldmk-7.iff    (-66.4,  96.2, 2.770)    (-67.0,  96.0, 2.762)    (-0.6, -0.2, -0.008)

average                  (-0.5, -0.01, -0.005)
max error magnitude      ( 3.0,  1.1,   0.018)
standard deviation (s)   ( 1.32, 0.46,  0.0091)

Commentary

As the above table shows, all of the landmarks in the sequence were detected, and there were no overly large errors.
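The error column and the summary rows of Table 1 can be reproduced directly from the measured and known poses. The sketch below is illustrative (the names are mine, not from the ROLL code); it computes error = known - measured per component, exactly as the table description states.

```python
def pose_error(known, measured):
    """Cartesian error: known pose minus measured pose, component-wise."""
    return tuple(k - m for k, m in zip(known, measured))

def mean_error(errors):
    """Component-wise average of a list of (x, z, orientation) errors."""
    n = len(errors)
    return tuple(sum(e[i] for e in errors) / n for i in range(3))
```

For example, for ldmk15.iff, pose_error((-67.0, 151.9, 2.762), (-68.3, 152.0, 2.755)) gives (1.3, -0.1, 0.007), matching the first row of the table.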
An approximation of any residual systematic errors is shown at the bottom of the table in column 4. The errors are then adjusted for this residual systematic error in column 5, which shows that the variance of the values is limited.

Simulated Test Run #2 - Half Light

The results from the second simulated test run are presented in the same way as the first. Table 2 shows the raw pose measurements along with the error calculations.

Table 2 Simulated Test Run #2 - Half Light

Image         Measured Pose            Known Pose               Cartesian Error
              (x, z, orientation)      (x, z, orientation)      (x, z, orientation)
ldmk15.iff    (-69.9, 149.7, 2.741)    (-67.0, 151.9, 2.762)    ( 2.9,  2.2,  0.021)
ldmk13.iff    (-69.1, 144.5, 2.745)    (-67.0, 146.8, 2.762)    ( 2.1,  2.3,  0.017)
ldmk11.iff    (-68.9, 138.6, 2.743)    (-67.0, 141.7, 2.762)    ( 1.9,  3.1,  0.019)
ldmk09.iff    Not Detected             (-67.0, 136.7, 2.762)    N/A
ldmk07.iff    (-67.9, 129.9, 2.753)    (-67.0, 131.6, 2.762)    ( 0.9,  1.7,  0.009)
ldmk05.iff    (-67.4, 124.7, 2.755)    (-67.0, 126.5, 2.762)    ( 0.4,  1.8,  0.007)
ldmk03.iff    (-68.9, 120.8, 2.750)    (-67.0, 121.4, 2.762)    ( 1.9,  0.6,  0.012)
ldmk01.iff    (-65.8, 116.0, 2.771)    (-67.0, 116.3, 2.762)    (-1.2,  0.3, -0.009)
ldmk-1.iff    (-68.6, 109.5, 2.746)    (-67.0, 111.3, 2.762)    ( 1.6,  1.8,  0.016)
ldmk-3.iff    (-66.6, 106.2, 2.768)    (-67.0, 106.2, 2.762)    (-0.4,  0.0, -0.006)
ldmk-5.iff    (-66.6, 101.0, 2.767)    (-67.0, 101.1, 2.762)    (-0.4,  0.1, -0.005)
ldmk-7.iff    (-65.5,  95.6, 2.766)    (-67.0,  96.0, 2.762)    (-1.5,  0.4, -0.004)

average                  ( 0.75, 1.3,  0.007)
max error magnitude      ( 2.9,  3.1,  0.021)
standard deviation (s)   ( 1.47, 1.06, 0.0111)

Image Anomalies

Missed Landmark

In the image stored in the file ldmk09.iff, no landmark was detected. The log file excerpt for ldmk09.iff is shown below.
1795722 CurrentImage
1795731 AbsoluteLandmarkPosition   Id: 0   x: 0.0000   z: 0.0000   theta: 0.0000
1795758 PredictedLandmark
        upper   211.2  209.6  50
        middle  231.0  202.3  50
        lower   250.9  209.6  50
1795787 EllipseDetectionTop          row= 229    col= 205    area= 1
1795797 EllipseDetectionMiddle       row= 238.1  col= 195.1  area= 32
1795807 EllipseDetectionBottom       row= 259.4  col= 202.8  area= 33
1795818 Tier2HighestEllipse          row= 216.8  col= 199    area= 32
1795873 Tier2HighEllipse             row= 238.1  col= 195.1  area= 32
1795882 Tier2MiddleEllipse           row= 248.7  col= 199    area= 32
1795896 Tier2LowEllipse              row= 259.4  col= 202.8  area= 33
1795907 Tier2LowestEllipse           row= 280.7  col= 199    area= 33
1795918 Tier2HighestDetectedEllipse  row= 228    col= 198    area= 3
1795929 Tier2MiddleDetectedEllipse   row= -1     col= -1     area= -1
1795939 Tier2LowestDetectedEllipse   row= -1     col= -1     area= -1
1795950 NotDetected

This log file shows that the top ellipse was not detected from the first prediction. The second prediction, shown as Tier2HighestEllipse, is at approximately row 199 and column 205. From the image, shown below in Figure 33, we find that the approximate centroid of the top ellipse is at row 217 and column 203. These values are rather far apart.

From the image, one can spot two very small "ellipses" detected between the middle ellipse and the top ellipse. These are not visible in the black and white image below, but were visible when displayed in colour (detected ellipses were marked in colour).

Figure 33 Landmark Not Detected - Test Run #2

Consulting the log file again, it is apparent that the shape verifier improperly used the ellipse found in EllipseDetectionTop (row = 229, col = 205, area = 1) in the calculation of the second predictions. This constitutes a bug in the code or the algorithm. With this bug fixed, the prediction for Tier2HighestEllipse would have been row = 216.9, column = 199.0. This prediction would most likely have led to the detection of the proper ellipse.
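The fix described above can be sketched as a guard that refuses to use implausibly small detections when computing the second-round predictions. This is an illustrative patch, not the actual ROLL code; the minimum-area threshold and the names are assumptions (in this log, the real dots measured areas around 30 pixels, while the spurious detection had area 1).

```python
MIN_DOT_AREA = 10  # assumed threshold; genuine dots here had area ~30

def usable(ellipse):
    """Treat an ellipse as usable for prediction only if it was actually
    detected and its pixel area is plausibly a landmark dot, not noise."""
    row, col, area = ellipse
    return row != -1 and area >= MIN_DOT_AREA

def pick_prediction_basis(top, middle, bottom):
    """Return the detected ellipses that may be used to predict a missing dot."""
    return [e for e in (top, middle, bottom) if usable(e)]
```

Applied to the log above, the top detection (229, 205, area 1) would be rejected, leaving the middle and bottom dots as the basis for re-prediction.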
5.2 ACE Environment Evaluation

There are other environments comparable to ACE, some of which resemble embedded real-time kernels and some of which resemble graphical data flow tools. ACE has advantages and disadvantages with respect to each of these environments, but a reasonably comprehensive comparison is not practical within this thesis. In this section, the ACE environment is compared only to coding from scratch in C on a Texas Instruments TMS320C40 DSP chip.

5.2.1 Advantages

Better Hierarchical Designs

The largest advantage of the ACE system lies in the way it forces the designer to separate the functional elements of a system from its control flow. This occurs because an ACE component is a purely functional abstraction. The flow of data and control is encapsulated in the ACE events, which can be arbitrarily rearranged to change the overall functionality or timing characteristics of the system. The components are completely removed from connectivity issues, resulting in a building-block approach to assembling the system. This in turn has the following two advantages:

•  more reuse - An arbitrary piece of software cannot practically be reused unless it has functional cohesion ([McC93], page 81) and is easily reconnected. The ACE environment guarantees the latter by its requirement that components use events for their data input and output.

•  easier maintenance - It is very easy to change one part of the system without causing errors in another part, because each component is an individual entity affected only through its explicit inputs and outputs.

An experienced programmer or designer could create a hand-coded solution following good software design principles that provided easy maintenance and a large amount of reuse. It is likely, however, that the same designer/programmer could achieve higher levels of maintainability and reuse using ACE.

Better Debugging Tools
Using ACE on Windows NT rather than a C40 allows the use of an integrated debugging tool. A debugger, which was not available for the particular C40 system that was used, can provide very significant time savings during development. This is not an advantage of the ACE environment itself; rather, it is a function of the operating system on which the ACE environment runs.

5.2.2 Disadvantages

Most of the disadvantages of the ACE environment were related to its lack of support tools. As was noted above, its most significant tool is a commercial off-the-shelf integrated development environment (which contained the debugger referenced above). ACE is in its infancy. As it stands, it is a custom in-house real-time scheduling framework that is promisingly powerful and simple, but it lacks the support infrastructure, including documentation, that would allow it to be used off-the-shelf.

No Support For Image Processing

ACE has no pre-designed image processing components. This meant that nearly all components in this configuration had to be created from scratch, which in turn minimized the amount of reuse, although some components were reused within the configuration, e.g. the EllipseDetector component.

Separation From Video Hardware

There was no video frame buffer hardware available for the PC that was used to run the ACE system. This meant that the capture of images had to be performed manually. Pictures were captured, stored, and then transferred to the PC. The ACE system then simulated a frame grabber by loading each image, in sequence, from the hard disk. It was not practical to automate this procedure. The result was that large volume testing was not possible, i.e., testing was confined to approximately 20 images.

Unfamiliarity

In this case, the author of this thesis was also the programmer who created ACE. Thus there was no learning curve necessary.
Also, there was no unknown functionality, which meant all potentially hidden pitfalls were avoided. Other users of ACE would clearly have an initial learning curve and might also run into several unforeseen pitfalls.

Lack of a Graphical Editor

The data-flow nature of the ACE environment lends itself naturally to graphical programming: the components would be represented by blocks and the events by lines (or "wires") that connect the components. While this is not necessarily a disadvantage of the ACE environment with respect to hand-coding in C, it is a distinct disadvantage with respect to more mature off-the-shelf data flow environments.

Chapter Six - Conclusions

6. Conclusions

In this chapter, the system presented is evaluated against the specifications outlined in Section 3.1. These comparisons will show that the current implementation of the ROLL system did not fully meet all of the specifications, but could meet them if development continued.

Section 6.1 evaluates the current ROLL implementation against each of the specifications outlined in Section 3.1. Section 6.2 includes a discussion of the viability of the main tenets of this thesis. Section 6.3 enumerates several improvements that could be made to the system.

6.1 Specifications Evaluation

6.1.1 Continuous Sensing

This specification said that the system must not require the robot to stop in order to carry out the localization algorithm. The ROLL algorithm clearly satisfies this specification. The only case in which the system would force the robot to stop is if the robot becomes "lost"; in this condition no landmarks would appear within the predicted tracking area.

6.1.2 Real-Time Performance

After the move to the simulated environment, this specification became impossible to prove. It is noted that the original C40-based version of the ROLL system did operate at approximately 8 Hz. Although it must be said that the ROLL system technically did not meet the
Although it must be said that the ROLL system technically did not meet the specification, it is fairly clear that with additional optimization or increased processing power, the ROLL system could exceed the 10Hz specification.

6.1.3 Minimal Hardware

The system was required to run on a single Texas Instruments TMS320C40 DSP chip without any special processing hardware, except for a frame buffer. This specification was met, but it is clearly a trade-off with the previous Real-Time Performance specification: more processing power, if available, could be used to satisfy that specification.

6.1.4 Single Camera Localization

This specification was satisfied. At no time was a second camera ever needed.

6.1.5 Range Requirements

Unfortunately, the testing difficulties did not allow proof that the range requirements were met. The range specifications required the system to be able to detect landmarks from 15cm to 3 meters away from the camera.

6.1.6 No Odometry Required

Technically this requirement was satisfied. The system does not inherently rely on any odometry. However, odometry could be used to make the system more robust, thus avoiding the need to stop when the robot becomes lost.

6.2 Viability

In this section, issues are discussed that were not directly addressed by the specifications. The main tenets of this thesis are a) that single camera, single landmark, real-time localization is practical with minimal hardware, b) that the particular landmark and algorithms used in the ROLL system are practical, and c) that the ACE environment is suitable for designing image processing applications.

6.2.1 Single Landmark Localization

The ROLL system has clearly shown that single landmark localization is possible with small errors.
The key to minimizing the errors lies in avoiding the image plane parallelism problem described earlier. If this problem is not tackled by the landmark design, the errors radial to the ray from the camera to the landmark will render the data practically useless.

6.2.2 The Current ROLL System

The ROLL system successfully implements a single camera, single landmark localization system. Because testing did not allow many images to be taken, there was no real way to verify robustness.

Dependence on Parameters

One issue associated with robustness is the dependence on parameters that have been hard-coded as constants. During development, there were many occasions on which a landmark could not be detected. The cause was sometimes a bug in the code, but more often changing a constant provided the solution. These changes were usually made by guessing what would solve the current problem. This represents a serious reliability problem, since it is unlikely that these values are converging on a set of values that are best for all possible angles, ranges, and lighting situations.

Appendix A contains a table of all the constants required by the ROLL system. The table shows the name of each constant, its final value, and a brief indication of how that value was obtained. The constants listed as manually tuned are the ones that could be changed if the current image sequence showed an error.

The ones marked as real world measurements are the ones that had a direct correspondence to a measurable value in the real world. This includes the camera calibration parameters that were obtained from the Tsai calibration algorithm, since the Tsai calibration is a form of measurement.

6.2.3 ACE for Image Processing

Despite the serious disadvantages listed in Section 5.2.2, the advantages listed for ACE outweigh them. This is because the data flow nature of ACE conforms very closely to the way in which image processing systems are designed.
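As a rough illustration of that correspondence, the data-flow style can be sketched as a minimal event-driven pipeline. The component and event names here (Dispatcher, frame_grabber, "CurrentImage", "EllipseDetection") are hypothetical and are not the actual ACE API; the "detector" is a toy centroid computation, not the ROLL ellipse detector.

```python
# Minimal event-driven data-flow sketch in the spirit of ACE (hypothetical API).
# Components subscribe to named events and emit new ones; the dispatcher
# delivers each event to every subscriber, so a frame-grabber component can
# feed a detector without either one knowing about the other.

class Dispatcher:
    def __init__(self):
        self.subscribers = {}          # event name -> list of handler callables

    def subscribe(self, event, handler):
        self.subscribers.setdefault(event, []).append(handler)

    def emit(self, event, payload):
        for handler in self.subscribers.get(event, []):
            handler(payload)

def frame_grabber(dispatcher, frames):
    # Simulates the disk-based frame grabber: pushes stored images in sequence.
    for frame in frames:
        dispatcher.emit("CurrentImage", frame)

def make_ellipse_detector(dispatcher):
    # Toy "detector": reports the centroid of nonzero pixels in a 2-D grid.
    def detect(image):
        points = [(r, c) for r, row in enumerate(image)
                         for c, v in enumerate(row) if v]
        if points:
            row = sum(p[0] for p in points) / len(points)
            col = sum(p[1] for p in points) / len(points)
            dispatcher.emit("EllipseDetection", (row, col))
    dispatcher.subscribe("CurrentImage", detect)

# Wiring a configuration together and running it on one tiny 2x2 "image".
bus = Dispatcher()
make_ellipse_detector(bus)
detections = []
bus.subscribe("EllipseDetection", detections.append)
frame_grabber(bus, [[[0, 1], [1, 0]]])
```

Each stage is declared independently and connected only through event names, which is the property that makes this style a natural fit for image processing pipelines.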
ACE also has the advantage that it seamlessly handles asynchronous input from multiple sources. For example, if there were a Kalman filter subsystem, it could be programmed in ACE and would then be able to accept input from the video localization system and odometry at whatever rate those input systems can provide data. This information could then be fed back to each system as it becomes available. This would provide the localization system with a more accurate estimate of the current position, which would make the system more robust.

6.3 Future Work

This section lists potential avenues of future work. For each item it gives the motivation and a possible approach that could be taken.

6.3.1 Robustness Testing

Given the minimal testing reported in this thesis, there is a need for testing with a large number of image sequences. This would allow a clearer determination of the system's robustness. If the system fails consistently, then two approaches could be taken to rectify the situation.

The first is to reduce the number of manually tuned constants. Some of these constants could be derived from the real world measurement type constants.

The other approach would be to improve the tracking performance. This can be done by adding a better kinematic model or by reducing the mean main loop iteration time.

6.3.2 A Scale Invariant Landmark

One would like the system to work over a very large range (camera-to-landmark distance). Whether or not the current landmark will allow the system to reach the 15cm to 3 meter range requirement, it would be desirable to continually increase the range over which localization can be performed.

The landmark itself is the limiting factor in this effort. Increasing the size of the landmark would allow the system to work from farther away, but the minimum usable range would also increase.

If a landmark could be found that is visible at several scales, then much of the range problem would be overcome.
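The range trade-off described above follows directly from pinhole projection: a landmark of height H appears roughly f*H/z pixels tall at distance z, so the smallest and largest image heights the detector can handle fix the working range. The sketch below illustrates this; the 500-pixel focal length and the 20 to 400 pixel detection limits are illustrative assumptions, not measured ROLL values (only the 10 cm landmark height comes from the thesis).

```python
def working_range(landmark_height_cm, focal_length_px,
                  min_image_height_px, max_image_height_px):
    """Distance interval over which a landmark is detectable.

    Pinhole model: image_height = focal_length * landmark_height / distance,
    so distance = focal_length * landmark_height / image_height.
    """
    far = focal_length_px * landmark_height_cm / min_image_height_px
    near = focal_length_px * landmark_height_cm / max_image_height_px
    return near, far

# 10 cm landmark with an assumed 500 px focal length and 20-400 px limits.
near, far = working_range(10, 500, 20, 400)

# Doubling the landmark pushes both limits out proportionally, which is why
# enlarging the landmark also raises the minimum usable range.
near2, far2 = working_range(20, 500, 20, 400)
```

Under these assumed numbers the 10 cm landmark would work from 12.5 cm to 250 cm, and the 20 cm landmark from 25 cm to 500 cm: the whole interval scales with landmark size, so only a multi-scale landmark breaks the trade-off.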
6.3.3 Large Scale Maps

The current testing uses only a single landmark. It has yet to be shown that the system will work in a reasonably sized environment with multiple landmarks. For this to happen, a "course" could be set up to demonstrate the localization system being used for navigation with multiple landmarks.

6.3.4 Full Real-Time Implementation

This system could be used to leverage other research into robot control architectures by providing a test bed. To do this, the system would have to be integrated into an existing robot on which a camera can be mounted.

Use of LCI's Colour Tracking System

Within the LCI there exists a Colour Tracking System that could be integrated into the ROLL system. The LCI colour tracking system finds coloured blobs in the current image and reports the centroid of each distinct coloured blob. The current landmark could be modified so that the top, middle, and lower dots are coloured, e.g. all middle dots could be orange, and all top and bottom dots green.

The LCI colour tracking system could then almost entirely replace the tracking and detection parts of the ROLL system. This would likely increase the robustness of the system significantly.

The one disadvantage of the LCI colour tracking system is that it requires a custom transputer network including some specialized data flow image processing hardware. This would violate the specification that the system run on a single C40 with only a frame buffer as extra hardware.
APPENDIX A

CONSTANTS

Name | Value | Units | Method | Notes
DOT COMPACTNESS TEST TOLERANCE | 1.25 | factor | Manual Tuning | Maximum deviation from the ideal compactness of an ellipse for landmark dots.
DOT COMPACTNESS TEST MEDIAN AREA | 20 | pixels | Manual Tuning | Minimum area for a compactness test to be valid.
VERTICAL DOT ALIGNMENT TOLERANCE | 5 | pixels | Manual Tuning | Number of pixels deviation allowed between the columns of the upper and lower dots on a landmark.
MAXIMUM MIDDLE DOT VERTICAL DEVIATION | 16 | pixels | Manual Tuning | Maximum number of horizontal pixels between the columns of middle dots and the upper and lower dots.
LANDMARK MIDDLE DOT OFFSET | 5 | cm | Real World Measurement | Landmark physical characteristic.
LANDMARK HEIGHT | 10 | cm | Real World Measurement | Landmark physical characteristic.
LANDMARK WIDTH | 10 | cm | Real World Measurement | Landmark physical characteristic.
DOT DIAMETER | 2 | cm | Real World Measurement | Landmark physical characteristic.
CENTROID AREA TOLERANCE | 20 | pixels | Manual Tuning | Maximum difference in area between dots in the same landmark.
CENTROID AREA RATIO TOLERANCE | 0.33 | factor | Manual Tuning | Maximum ratio deviation of areas of dots in the same landmark.
PREDICTION DEVIATION TOL | 40 | pixels | Manual Tuning | Maximum distance from a prediction that the searching algorithm will search.
WIDTH FRACTION | 1.5 | factor | Manual Tuning | Search parameter.
HIEGHT FRACTION | 0.5 | factor | Manual Tuning | Search parameter.

APPENDIX B  TEST RUN #1 LOG

Comments are marked in bold text. Numbers on the left are a relative time measure in milliseconds.

236203778 236204611 236204684  Start CurrentImage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000  236204729  PredictedLandmark upper 214.6 middle 231 lower 247.5  236204753  EllipseDetectionTop row= 220.3 col= 222.4 EllipseDetectionMiddle row= 257.1 col= 222.6 EllipseDetectionBottom row= -1 col= -1 Tier2HighestEllipse row= 183.5 col= 222.5 Tier2HighEllipse row= 220.3 col= 222.4 Tier2MiddleEllipse row= 238.7 col= 222 .
5 Tier2LowEllipse row= 2 57.1 COl= 222.6 Tier2LowestEllipse row= 2 93.9 c o l = 222.5 Tier2HighestDetectedEl l i p s e row= -1 col= -1 Tier2MiddleDetectedEll ipse row= 237.7 c o l = 216.3 Tier2LowestDetectedEll ipse row= -1 col= -1 DetectedLandmark upper 22 0.3 222 .4 middle 237.7 216.3 lower 257.1 222.6  236204762 236204769 236204780 236204838 236204847 236204855 236204862 236204873 236204882 236204892 236204901  236204972 236205062 236205680 236205704  236205735  236205761 236205817 236205827 236205838  236205861  -ldmkl5.iff  259 254.5 259  RelativePositionEvent x=-6.0470 z=166.5319 AbsolutePosition x=-68.3376 z=151.9849 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 212.6 middle 231 lower 249.4  34 34 34 area=17 area=2 8 area=-l area=17 area=17 . area=22 area=2 8 area=2 8 area=-l area=24 area=-l 23 23 23  theta= -2.7553 theta= 2 .7553  222.5 216.3 222 .5  EllipseDetectionTop row= 219.6 c o l = 216.3 EllipseDetectionMiddle row= 257.9 c o l = 216.2 EllipseDetectionBottom row= 238 col= 209.7 DetectedLandmark upper 219.6 216.3 middle 238 209.7 lower 257.9 216.2 RelativePositionEvent  132  <—ldmkl3.iff  43 43 43  area=2 0 area=2 9 area=2 6 25 25 25  236205953 236206570 236206599  x=-7.8735 z=160.0473 AbsolutePosition x=-66.7967 z=145.6548 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 211.9 middle 231 lower 250.2  236206699 236206709 236206718 236206729  236206753 236206825 236207508 236207515  EllipseDetectionTop row= 219.1 col= EllipseDetectionMiddle row= 237.8 col= EllipseDetectionBottom row= 258.6 col = DetectedLandmark upper 219.1 209 middle 237.8 202 lower 258.6 209  236207571  EllipseDetectionTop row= 218.6 col= EllipseDetectionMiddle row= 23 8 col= EllipseDetectionBottom row= 258.9 cbl= DetectedLandmark upper 218.6 202 middle 238 195 lower 258.9 202  236207664  236207689 236207708 236208389 236208396  236208517 236208528 236208537 236208547  46 46 
46 area=23  202.9  area=29  209.8  area=32  5 9 8  28 28 28  theta=-2.7799 theta= 2.7799  209.6 202.9 209.6  <--ldmk09.iff  49 49 49  202.7  area=24  195.2  area=28  202 . 8  area=35  7 2 8  RelativePositionEvent x=-11.7309 z=151.7592 AbsolutePosition x=-68.2864 z=136.0347 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 210.8 middle 231 lower 251.2  <--ldmkll.iff  209 . 5  RelativePositionEvent x=-9.7699 z=155.1413 AbsolutePosition x=-64.0299 z=141.6490 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 211.3 middle 231 lower 250.8  236207652  theta= 2.7608  216.3 209.7 216.3  236207545  236207581  theta=-2.7608  29 29 29 theta=-2.7535 theta= 2.7535  202.7 195.2 202.7  EllipseDetectionTop row= 218.1 c o l = 195.3 EllipseDetectionMiddle row= 237.8 c o l = 187.5 EllipseDetectionBottom row= 259.6 c o l = 195.3 DetectedLandmark upper 218.1 195.3 middle 237.8 187.5 lower 259.6 195.3  133  -ldmk07.iff  52 52 52 area=27 area=31 area=37  32 32 32  236208571 236208592 236209341 236209348  23 6209377  236209405 236209416 236209426 236209437  236209519 236209540 236210214 236210222  23 6210251  236210279 236210345 236210356 236210368  236210390 236210411 236211099 236211107  RelativePositionEvent x=-13.6682 z=147.4392 AbsolutePosition x=-67.3626 z=131.8614 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 210.3 middle 231 lower 251.8  236211175 236211184 236211194 236211209  theta=  195.3 187.5 195.3  RelativePositionEvent x=-15.5283 z=141.8968 AbsolutePosition x=-65.6022 z=126.7761 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 209.5 middle 231 lower 252.6  55 55 55 area=2 9 area=31 area=41 34 34 34  theta=-2.7731 theta=  2.7731 <--ldmk03.iff  187.2 179 187.2  EllipseDetectionTop row= 217 col= EllipseDetectionMiddle row= 238 col= EllipseDetectionBottom row= 261.3 
col= DetectedLandmark upper 217 178 middle 238 169 lower 261.3 178  59 59 59  178 9  area=29  169 9  area=33  178 9  area=44  9 9 9  RelativePositionEvent x=-17.4907 z=137.8561 AbsolutePosition x=-67.6233 z=121.3974 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000  EllipseDetectionTop row= 216.2 EllipseDetectionMiddle row= 238.1 EllipseDetectionBottom row= -1 Tier2HighestEllipse row= 194.4 Tier2HighEllipse row= 216.2  2.7617 <--ldmk05.iff  EllipseDetectionTop row= 217.4 col= 187.1 EllipseDetectionMiddle row= 238 col= 179 EllipseDetectionBottom row= 260.6 col= 187.3 DetectedLandmark upper 217.4 187.1 middle 238 179 lower 260.6 187.3  PredictedLandmark upper 2 08.9 middle 231 lower 2 53.2 236211165  theta=-2.7617  35 35 35 theta=-2.7596 theta=  2.7596 -ldmkOl.iff  178.9 169.9 178.9  63 63 63  col=  169.8  area=31  col=  160.2  area=37  col=  -1  area=-l  col=  165  area=31  col=  169.8  area=31  134  236211218 236211289 236211299 236211309 236211319 236211330 236211341  236211418 236211437 236212124 236212132  Tier2MiddleEllipse row= 227.2 col= Tier2LowEllipse row= 23 8.1 col= Tier2LowestEllipse row= 259.9 col= Tier2HighestDetectedEllipse row= -1 col= Tier2MiddleDetectedEllipse row= -1 col= Tier2LowestDetectedEllipse row= 2 62.2 col= DetectedLandmark upper 216.2 169 middle 238.1 160 lower 2 62.2 169  236212187  EllipseDetectionTop row= 215.5 col= EllipseDetectionMiddle row= 238.1 col= EllipseDetectionBottom row= 2 63.3 col= DetectedLandmark upper 215.5 160 middle 238.1 150 lower 2 63.3 160  236212276  236212299 236212388 236213292 236213300  236213331  236213415 236213427 236213436 236213448  236213473 236213493 236214362 236214389  160.2  area= 37  165  area= 37  -1  area= -1  -1  area= -1  169.7  area= 45  8 2 7  theta= 2 .7652  169 7 160 2 169 7  67 67 67 area= 34  150 .1  area= 40  160.2  area= 49  1 1 2  41 41 41 theta=- 2 . 
7767 theta= 2.7767  160.2 150.1 160.2  -ldmk-3.iff  73 73 73  EllipseDetectionTop row= 214.8 c o l = 149.8 EllipseDetectionMiddle row= 238 c o l = 138.8 EllipseDetectionBottom row= 264.3 c o l = 149.7 DetectedLandmark upper 214.8 149.8 middle 238 138.8 lower 264.3 149.7 RelativePositionEvent x=-23.1108 z=123.0362 AbsolutePosition x=-66.1708 z=106.2706 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000  -ldmk-l.iff  160.1  RelativePositionEvent x=-21.1741 z=127.5785 AbsolutePosition x=-65.3080 2=111.6221 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 207.1 middle 231 lower 254.9  38 38 38 theta=- 2.7652  %  P r e d i c tedLandmark upper 208 middle 231 lower 254  236212266  area= 34  RelativePositionEvent x=-19.3923 ' z = 132.8861 AbsolutePosition x=-66.8813 z=116.4546 CurrentImage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000  236212162  236212196  165  area=40 area=42 area=5 6 46 46 46  theta= -2.7703 theta= 2.7703  135  -ldmk-5-iff  theta:  0.0000  236214419  P r e d i c tedLandmark upper 206.3 middle 231 lower 255.8  236214447  EllipseDetectionTop row= 213.7 col= 138.6 EllipseDetectionMiddle row= 238.3 col= 126.9 EllipseDetectionBottom row= 265.2 col= 138.5 DetectedLandmark upper 213.7 138. middle 238.3 126. lower 265.2 138.  236214456 236214657 236214675  236214760 236214779 236215389 236215416  236215446  236215474 236215534 236215545 236215553  236215576 236215598  RelativePositionEvent x=-24.9931 z=118.1437 AbsolutePosition x=-65.2111 z=101.6371 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 205.3 middle 231 lower 256.8  79 79 79  149 . 
8 138.8 149.8  area=40 area=49 area=56  theta=-2.7796 theta=  2.7796  <--ldmk-7.if£  85 85 85  138.5 126.9 138.5  EllipseDetectionTop row= 213.1 col= 126.6 EllipseDetectionMiddle row= 238.3 col= 113.9 EllipseDetectionBottom row= 266.4 col= 126.8 DetectedLandmark upper 213.1 126.6 middle 238.3 113.9 lower 266.4 126.8  53 53 53  RelativePositionEvent x=-26.9237 z=113.7144 AbsolutePosition x=-66.3915 z=96.1667  2.7696  area=46 area=52 area=61  theta=-2.769E theta=  136  A P P E N D I X C T E S T R U N #2 L O G  Comments are marked in bold text. Numbers on the left are a relative time measure in milliseconds. 10855142 10855937 10856037  Start Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000  10856090  P r e d i c tedLandmark upper 214.6 middle 231 lower 247.5  10856114  EllipseDetectionTop row= 220.3 col= 222.4 EllipseDetectionMiddle row= 237.7 c o l = 216.3 EllipseDetectionBottom row= 257.1 col= 222.6 DetectedLandmark upper 220.3 222.4 middle 237.7 216.3 lower 257.1 222.6  10856122 10856132 10856141  10856223 10856239 10856846 10856877  10!356909  10!356939 10!356949 10!357015 10!357024  10!357048 10857069 10857677 10857704  10857736  10857821  ldmkl5.iff  RelativePositionEvent x=-6.0470 z=166.5319 AbsolutePosition x=-68.3376 z=151.9849 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 P r e d i c tedLandmark upper 212.6 middle 231 lower 249.4  area=17 area=24 area=2 8  23 23 23  theta=-2.7553 theta=  222 5 216 3 222 5  EllipseDetectionTop row= 219.6 col= EllipseDetectionMiddle row= 257.9 col= EllipseDetectionBottom row= 23 8 col= DetectedLandmark upper 219.6 216 middle 238 209 lower 257.9 216  2.7553  <--  ldmkl3.iff  43 43 43  216 3  area=2 0  216 2  area=2 9  209 7  area=2 6  3 7 2  RelativePositionEvent x=-7.8735 z=160.0473 AbsolutePosition x=-66.7967 z=145.6548 CurrentImage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 211.9 middle 231 lower 250.2  34 34 34  259 
254.5 259  25 25 25 theta=-2.760! theta=  216.3 209.7 216.3  EllipseDetectionTop  137  2.760!  46 46 46  ldmkll.iff  10857829 10857840 10857850  10857873 10857970 10858727 10858734  10858763  10?S5! 3791 10!35!3800 10!35!3807 10!35!3818  10858896 10858916 10859601 10859607  10859635  10 359665 10 359736 10!359747 10!359757  10859782 10859804 10860497 10860505  row= 219.1 c o l = 209.5 EllipseDetectionMiddle row= 237.8 c o l = 202.9 EllipseDetectionBottom row= 258.6 c o l = 209.8 DetectedLandmark upper 219.1 209.5 middle 237.8 202.9 lower 258.6 209.8 RelativePositionEvent x=-9.7699 z=155.1413 AbsolutePosition x=-64.0299 z=141.6490 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 211.3 middle 231 lower 250.8  theta=  2.7799  202 .7  area=24  195.2  area=28  202.8  area=3 5  7 2 8.  29 29 29 theta=-2.7535 theta=  2.7535 <—  ldmk07.iff  52 52 52  195 3  area=27  187 5  area=31  195 3  area=37  3 5 3  RelativePositionEvent x=-13.6682 z=147.4392 AbsolutePosition x=-67.3626 2=131.8614 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 210.3 middle 231 lower 251.8  28 28 28  49 49 49  202.7 195.2 202.7  EllipseDetectionTop row= 218.1 col= EllipseDetectionMiddle row= 237.8 col= EllipseDetectionBottom row= 259.6 col= DetectedLandmark upper 218.1 195 middle 237.8 187 lower 259.6 195  area=32  ldmk09.iff  RelativePositionEvent x=-11.7309 2=151.7592 AbsolutePosition x=-68.2864 z=136.0347 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 210.8 middle 231 lower 251.2  area=29  theta=-2.7799  209.6 202.9 209.6  EllipseDetectionTop row= 218.6 col = EllipseDetectionMiddle row= 238 col= EllipseDetectionBottom row= 258.9 col= DetectedLandmark upper 218.6 202 middle 238 195 lower 258.9 202  area=23  32 32 32 theta=-2.7617 theta=  2.7617 <--  195.3 187.5 195.3  138  55 55 55  ldmk05.iff  10860566 10860577 10860587 10860596  
10860621 10860644 10861340 10861348  EllipseDetectionTop row= 217.4 col= 187.1 EllipseDetectionMiddle row= 238 col= 179 EllipseDetectionBottom row= 260.6 col= 187.3 DetectedLandmark upper 217.4 187.1 middle 238 179 lower 2 60.6 187.3 RelativePositionEvent x=-15.5283 z=141.8968 AbsolutePosition x=-65.6022 z=126.7761 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000  theta=  PredictedLandmark upper 209.5 middle 231 lower 252.6  10861403  EllipseDetectionTop row= 217 col= 178.9 EllipseDetectionMiddle row= 238 col= 169.9 EllipseDetectionBottom row= 261.3 col= 178.9 DetectedLandmark upper 217 178.9 middle 238 169.9 lower 261.3 178.9  10861421 10861434  10861457 10861476 10862195 10862203  10862232  10862320 10862330 10862340 10862351 10862365 10862374 10862386 10862397 10862450 10862459 10862469 10862478 10863183 10863191  187.2 179 187.2  RelativePositionEvent x=-17.4907 z=137.8561 AbsolutePosition x=-67.6233 z=121.3974 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 208.9 middle 231 lower 253.2  area=41 34 34 34  2.7731  ldmk03.iff  59 59 59 area=29 area=33 area=44 35 35 35  theta=-2.7596 theta=  178.9 169.9 178.9  EllipseDetectionTop row= 216.2 col= EllipseDetectionMiddle row= 238.1 col= EllipseDetectionBottom row= -1 col= Tier2HighestEllipse row= 194.4 col= Tier2HighEllipse row= 216.2 col= Tier2MiddleEllipse row= 227.2 col= Tier2LowEllipse row= 238.1 col= Tier2LowestEllipse row= 259.9 col= Tier2HighestDetectedEllipse row= -1 col= Tier2MiddleDetectedEllipse row= -1 col= Tier2LowestDetectedEllipse row= -1 col= NotDetected Currentlmage AbsoluteLandmarkPosition Id: 0  area=31  theta=-2.7731  10861376  10861411  area=29  2.7596  ldmkOl.iff  63 63 63  169.8  area=31  160.2  area=37  -1  area=-l  165  area=31  169.8  area=31  165  area=34  160.2  area=37  165  area=37 area=-l area=-l area=-l ldmk-l.iff  139  x: z: theta:  0.0000 0.0000 0.0000  •  PredictedLandmark upper 2 08.9 middle 231 
lower 253.2 10863305 10863315 10863326 10863336  10863360 10863379 10864082 10864090  108 64187  10864213 10864222 10864231 10864242  10864313 10864330 10865018 10865027  10S365163 10S365228  101365242  10865267 10865366 10866154 10866175  "  EllipseDetectionTop row= 215.5 col= 160.1 EllipseDetectionMiddle row= 238.1 col= 150.1 EllipseDetectionBottom row= 263.3 col= 160.2 DetectedLandmark upper 215.5 160.1 middle 238.1 150.1 lower 263.3 160.2 RelativePositionEvent x=-21.1741 z=127.5785 AbsolutePosition x=-65.3080 z=111.6221 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000 PredictedLandmark upper 207.1 middle 231 lower 254.9  theta=  160.2 150.1 160.2  RelativePositionEvent x=-23.1108 z=123.0362 AbsolutePosition x=-66.1708 z=106.2706 Currentlmage AbsoluteLandmarkPosition Id: 0 x: 0.0000 z: 0.0000 theta: 0.0000  RelativePositionEvent x=-24.9931 z=118.1437 AbsolutePosition x=-65.2111 z=101.6371 Currentlmage AbsoluteLandmarkPosition  area=34 area=40 area=49 41 41 41  2.7767  <--  ldmk-3.iff  73 73 73 area=40 area=42 area=56 46 46 46  theta=-2.7703 theta=  149.8 138.8 149.8  EllipseDetectionTop row= 213.7 col= EllipseDetectionMiddle row= 238.3 col= EllipseDetectionBottom row= 2 65.2 col= DetectedLandmark upper 213.7 138 middle 238.3 126 lower 265.2 138  63 63 63  theta=-2.7767  EllipseDetectionTop row= 214.8 col= 149.8 EllipseDetectionMiddle row= 238 col= 138.8 EllipseDetectionBottom row= 264.3 col= 149.7 DetectedLandmark upper 214.8 149.8 middle 238 138.8 lower 264.3 149.7  PredictedLandmark upper 2 06.3 middle 231 lower 255.8 10!365154  178.9 169.9 178.9  2.7703  <—  ldmk-5.iff  79 79 79  138 6  area=40  126 9  area=49  138 5  area=56  6 9 5  48 48 48 theta=-2.7796 theta=  140  2.7796  <--  ldmk-7 . 
' i f f  Id: x: z: theta: 10866206  10866272 10866282 10866289 10866301  10866325 10866348  0 0.0000 0.0000 0.0000  PredictedLandmark upper 205.3 middle 231 lower 256.8  138.5 126.9 138.5  85 85 85  EllipseDetectionTop row= 213.1 col= 126.6 EllipseDetectionMiddle row= 238.3 c o l = 113.9 EllipseDetectionBottom row= 266.4 col= 126.8 DetectedLandmark upper 213.1 126.6 middle 238.3 113.9 lower 266.4 126.8 RelativePositionEvent x=-26.9237 z=113.7144 AbsolutePosition x=-66.3915 z=96.1667  area=46 area=52 area=61 53 53 53  theta=-2.7698 theta= 2.7698  141  
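For reference, the geometry implicit in these log entries (detected upper and lower dot rows plus a dot column, mapping to a RelativePositionEvent with x and z) can be sketched with a simple pinhole model. The focal length and image centre used here are illustrative assumptions, and the actual system used Tsai-calibrated camera parameters, so the numbers will not reproduce the logged values; the landmark orientation theta is not estimated in this sketch.

```python
def relative_position(upper_row, lower_row, middle_col,
                      landmark_height_cm=10.0,
                      focal_px=500.0, center_col=256.0):
    """Rough single-landmark range and lateral offset from dot coordinates.

    Range comes from the landmark's vertical pixel span (pinhole model:
    span_px = focal_px * height / z), and the lateral offset comes from the
    middle dot's horizontal distance from the assumed image centre.
    """
    span = lower_row - upper_row                  # landmark height in pixels
    z = focal_px * landmark_height_cm / span      # range along the optical axis
    x = z * (middle_col - center_col) / focal_px  # lateral offset, same units
    return x, z

# A landmark spanning rows 200-250 and centred horizontally would sit
# directly ahead at 100 cm under these assumed camera parameters.
x, z = relative_position(200.0, 250.0, 256.0)
```

The inverse relationship between pixel span and range is why the detected dot areas in the logs grow steadily as the logged z values shrink.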

