ROLL — A ROBOTIC ON-BOARD LOCALIZATION SYSTEM USING LANDMARKS

by

JEFFERY CHARLES BREWSTER
B.Eng., Carleton University, 1992

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, Department of Computer Science

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
December 1996
© Jeffery Charles Brewster, 1996

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Computer Science
The University of British Columbia
2366 Main Mall
Vancouver, BC, Canada V6T 1Z4

ABSTRACT

This thesis presents the design of a system for robot localization named ROLL. The ROLL system was designed to be used in a dynamic real-time robot testbed, where it could provide location information to experimental robot control architectures. There are many existing localization systems for robots, using many different techniques and sensors. All of these systems either require custom computer hardware, are much too slow, or require the robot to stop for a certain period of time for the localization system to produce output. The system in this thesis allows localization without having the robot stop, and without the need for expensive custom hardware.

The ROLL system uses a forward-facing camera mounted on the robot to detect special landmarks placed in the environment. From the shape of a landmark in the camera image, the position of the camera relative to the landmark can be calculated. This position can be transformed into absolute frame coordinates using the known absolute frame position of the landmark.

To test the ROLL system, it was implemented in the ACE programming environment on a stand-alone workstation. This implementation was fed sequences of stored images that included a visible landmark to simulate a real robot trajectory. The system was also implemented on a Texas Instruments TMS320C40 DSP board to prove that the algorithm could run at acceptable real-time speeds. The system detected the landmark in all but one of the images in two series of 11 images each. The position of the camera was calculated to within 4.2 cm of the known position for each image. The TMS320C40 implementation was able to provide location information at an average of approximately 8 Hz.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS
1. Introduction
   1.1 Robot Localization
      1.1.1 What is Robot Localization?
      1.1.2 Why is Localization Necessary?
   1.2 Context
   1.3 Work
   1.4 Summary Of Results
   1.5 Overview
2. Previous Work
   2.1 Localization System Tasks
   2.2 Localization System Characteristics
      2.2.1 Coordinate Systems
      2.2.2 Landmarks
      2.2.3 Sensing System
      2.2.4 Processing System
      2.2.5 Maps
      2.2.6 Multiple Views
      2.2.7 Partial Views
      2.2.8 Landmark Tracking
      2.2.9 Dynamic vs. Static World
      2.2.10 Stop and Sense vs. Continuous Sensing
      2.2.11 Accuracy
      2.2.12 Robustness
      2.2.13 Integration With An Odometry System
   2.3 The Ideal Localization System
   2.4 Localization With Naturally Occurring Landmarks
      2.4.1 Eric Krotkov
      2.4.2 AECL ARK
   2.5 Localization With Fabricated Landmarks
      2.5.1 Global Positioning System (GPS)
      2.5.2 Murata and Hirose
   2.6 Camera Calibration
      2.6.1 The Ideal Camera Model
      2.6.2 Tsai - Camera Calibration
3. System Outline
   3.1 System Specifications
   3.2 Design Highlights
      3.2.1 General Approach
      3.2.2 Localization With A Single Landmark
      3.2.3 Landmark Design
      3.2.4 Landmark Detection
      3.2.5 Landmark Identification
      3.2.6 Plant Modeling
   3.3 System Overview
      3.3.1 Task Decomposition
4. System Detail
   4.1 The ACE Environment
      4.1.1 Events, Actions and Components
      4.1.2 ACE Configurations
      4.1.3 Design and Development with ACE
   4.2 The ROLL Configuration
      4.2.1 The Main Pipeline
      4.2.2 The Supplementary Pipeline
      4.2.3 Loop Control
   4.3 Data Types
      4.3.1 Image Plane Coordinate Data Types
      4.3.2 Frame Buffer Coordinate Data Types
      4.3.3 Cartesian Coordinate Data Types
      4.3.4 GreyScaleImage Data Type
   4.4 Components
      4.4.1 The Frame Grabber
      4.4.2 The Map
      4.4.3 The Landmark Predictor
      4.4.4 The Landmark Detector
      4.4.5 The Relative Position Calculator
      4.4.6 The Absolute Position Calculator
      4.4.7 The Position Estimator
   4.5 Embedded System Implementation
5. Results And Evaluation
   5.1 Localization System Evaluation
      5.1.1 Issues / Difficulties
      5.1.2 Experiment Procedure
      5.1.3 Obtaining a Sequence of Images
      5.1.4 Camera Calibration Procedure
      5.1.5 Camera Calibration Results
      5.1.6 Localization Test Run Procedure
      5.1.7 Analysis Of Localization Test Runs
   5.2 ACE Environment Evaluation
      5.2.1 Advantages
      5.2.2 Disadvantages
6. Conclusions
   6.1 Specifications Evaluation
      6.1.1 Continuous Sensing
      6.1.2 Real-Time Performance
      6.1.3 Minimal Hardware
      6.1.4 Single Camera Localization
      6.1.5 Range Requirements
      6.1.6 No Odometry Required
   6.2 Viability
      6.2.1 Single Landmark Localization
      6.2.2 The Current ROLL System
      6.2.3 ACE for Image Processing
   6.3 Future Work
      6.3.1 Robustness Testing
      6.3.2 A Scale Invariant Landmark
      6.3.3 Large Scale Maps
      6.3.4 Full Real-Time Implementation
BIBLIOGRAPHY
APPENDIX A  CONSTANTS
APPENDIX B  TEST RUN #1 LOG
APPENDIX C  TEST RUN #2 LOG

LIST OF TABLES

Table 1  Simulated Test Run #1 - Full Light
Table 2  Simulated Test Run #2 - Half Light

LIST OF FIGURES

Figure 1  The Ideal Camera Model
Figure 2  Camera geometry with perspective projection and radial lens distortion
Figure 3  The First Landmark Design
Figure 4  The Landmark As It Appears In The Ideal Image Plane
Figure 5  Calculating Relative Position For Landmark Type 1
Figure 6  Image Plane Parallelism
Figure 7  Landmark Design 2
Figure 8  Calculating the Relative Position of Landmark Type 2
Figure 9  The Corner Detectors
Figure 10  The Second Landmark Type
Figure 11  Landmark Identification Marker
Figure 12  System Structure
Figure 13  The ROLL Configuration
Figure 14  Frame Grabber Component
Figure 15  The Image Sequence Loader Component
Figure 16  The Map Component
Figure 17  The Landmark Predictor Component
Figure 18  The Landmark Detector Subconfiguration Interface
Figure 19  The Landmark Detector Sub-Configuration
Figure 20  The Frame Landmark Splitter Component
Figure 21  The Ellipse Detector Component
Figure 22  The Landmark Shape Verifier Component
Figure 23  The Landmark Identification Verifier
Figure 25  The Absolute Position Calculator Component
Figure 26  The Position Estimator Component
Figure 27  The Physical Implementation
Figure 28  The Camera Mounting Platform
Figure 29  Optical Bench Motion Stage Setup
Figure 30  The Calibration Target
Figure 31  Test Run #1 - Full Light
Figure 32  Test Run #2 - Dark
Figure 33  Landmark Not Detected - Test Run #2

ACKNOWLEDGEMENTS

I would first like to thank my lovely wife, Mari-Jane, for her inexhaustible patience during the last few years. In the darkest moment, when I thought I had lost everything, she calmed me with her words, even though heavy sedatives might have been more appropriate. For their continuing encouragement I would also like to thank my family, Scott, Cheryl and Mike Brewster, as well as Mari-Jane's family, Michael and Nora Medenwaldt and Betty Tasaka.

My gratitude goes to my supervisor, Jim Little, who possesses an astounding ability to maintain high levels of optimism and enthusiasm. There were many months when I had to draw upon his reservoirs of hope when mine were depleted. I would like to thank the reviewing reader, David Lowe, for taking the time to review this thesis. Rod Barman and Stewart Kingdon have contributed significant technical prowess, without which this thesis would have been impossible. Many thanks to my current employers, International Submarine Engineering, and my managers, Eric Jackson and Mike Borghardt, for flexible scheduling that allowed me to complete my thesis.

The majority of financial support for this thesis was provided by the Natural Sciences and Engineering Research Council.
Many thanks to Betty Tasaka for her personal financial contributions. It is very likely this thesis would not have been completed without her support. Thanks to my employers, International Submarine Engineering, for their contribution towards my tuition costs.

Chapter One - Introduction

1. Introduction

This thesis presents a design for a mobile robot localization system named ROLL. ROLL is an acronym for Robot On-board Localization using Landmarks. The ROLL system uses on-board video sensing to determine the position of the robot in world coordinates. The position calculations are based on the shape of special landmarks that appear in the image. The emphasis of this thesis is on the design and evaluation of the system in the simulated testbed implementation. Although the primary topic of this thesis is robot localization, the emphasis is not on providing a rigorous proof of any new theoretical material related to robot localization. The ACE development environment itself is not presented as part of the work done for this thesis, but its impact on the design of the system warrants a description of ACE, and a discussion of its impact on the design methodology is included.

1.1 Robot Localization

Robot localization is the primary topic of this thesis. Localization deals with finding the current position of a robot in its environment. The topic is related to camera calibration and mobile robot navigation. Camera calibration and its relation to robot localization is discussed in the chapter on previous work.

Robot localization is usually considered a component of navigation. Any navigation system usually has components to perform path planning, localization, obstacle avoidance and driving control. For specific systems, localization may be tightly integrated with other components. For instance, a real-time localization system may provide the location information for the main driving control loop. If an obstacle avoidance system uses the same sensors as the localization system, then the obstacle avoidance system may be integrated with the localization system. The localization system presented in this thesis makes no assumptions about any overlying navigation system. Its design is completely modular, but it is open enough that it could easily be modified to be tightly integrated with any other navigation component, should a suitable match be found.

1.1.1 What is Robot Localization?

Robot localization, or camera localization, is the process of determining the location of the robot or camera from sensed information. While the general localization problem is 3-D, with six degrees of freedom, this thesis is only concerned with 2-D, or three degrees of freedom. Krotkov [Kro89], who is also concerned with 2-D localization, provides the following definition of localization:

   Let W be a fixed two-dimensional reference frame, and let R be a two-dimensional reference frame attached to the mobile robot. ... For a rigid robot, the transformation from W to R has three parameters: two for translation and one for rotation. These parameters define the pose T = (x, y, θ) of the robot.

   We assume that the robot is equipped with a camera, and that it moves on a flat surface. Further, we assume that the robot has a map of its environment. The map marks the positions, referred to W, of vertically oriented parts of objects such as doors, desks, equipment racks, etc. The robot is to take an image and extract vertical edges from it. These edges by themselves do not contain enough information to locally determine which vertical edge corresponds to which point on the map. The problem is to identify T by establishing the correspondence between the vertical edges and points in the map.
Krotkov's definition was given specifically for his method of localization. It includes references to video and to vertical landmarks, which are not present in every localization system. A more general definition can be derived from it that applies to nearly all two-dimensional localization systems:

   Let W be a fixed two-dimensional reference frame, and let R(t) be a two-dimensional reference frame attached to the robot. For a rigid robot, the transformation from W to R(t) has three parameters: two for translation and one for rotation. These parameters define the pose T(t) = (x(t), y(t), θ(t)) of the robot. The robot has some sensory information I(t), and some information about its present environment M(t). The problem is to identify T(t) from I(t) and M(t).

The latter definition generalizes localization to include systems that do not use video. The definition even applies to those localization systems that do not use landmarks, or that do not start with a map of landmarks, given by the cases M(t) = {} and M(0) = {} respectively.

1.1.2 Why is Localization Necessary?

Localization is necessary for any task which needs to know the precise position of the robot relative to some object or location. Typically this information is required for precise path planning, or sometimes for object manipulation. For example, the task of pushing an object to a specified region requires that the robot know its position relative to the specified region and the position of the object relative to the camera. The former is obtained by a localization system, while the latter is obtained by an object recognition system, which may or may not use the same sensing information as the localization system.

1.2 Context

This thesis was motivated by work done at the University of British Columbia Laboratory for Computational Intelligence on a dynamic physical testbed for robot control architectures. Sahota [Sah93] describes a testbed in which remote-control model race cars are controlled by computer to play soccer against each other. In the current system, localization information is provided by an all-seeing off-board camera mounted above the playing field. Each robot is given a continuous stream of location information for the whole system, including the location and heading of itself, its opponent, and the soccer ball. This system has been successful in proving the capabilities of an architecture to operate in a multiple-agent real-world dynamic environment.

The effort in this thesis is based on the desire to add more complexity to this testbed. If each robot were equipped with its own on-board sensing system, new stresses would be placed upon the architecture. The architecture may then have to accommodate any issues presented by a specific sensing system. For example, most on-board sensing systems do not provide complete information like the off-board camera in the current testbed. Sensing information may vary in availability and quality. It may arrive at unpredictable intervals, and with varying degrees of accuracy. For sensing that is not robust, the controller must be able to deal with failure of the sensing system. This may entail performing some procedure to reinitialize the sensing system, like a football player that must regain his bearings after a nasty tackle. In effect, an architecture would have to be evaluated in conjunction with the sensing system that it uses.
The ROLL design has been implemented on two platforms. The first is Windows NT, using a real-time data-flow environment called ACE as a simulation testbed; the second is a real-time partial implementation using a single Texas Instruments TMS320C40 (hereafter a C40) DSP processor and a cylindrical robot, named Spinoza, from Real World Interfaces Ltd. The latter implementation serves only to provide proof of the real-time, real-world practicality of the design. At the time of writing, no significant evaluation of the performance of the latter system could be performed, although some rudimentary qualitative evaluation is given.

1.3 Work

The purpose of this thesis is to provide a design for an on-board localization system for a robot in a dynamic environment. The design incorporates continuous sensing, and location calculation from a single landmark with minimal processing hardware. Though the design is implemented in a simulated environment, a physical implementation of the system is provided as proof of the viability of the design for real-time localization.

The thesis has a sub-emphasis on the implementation of systems using asynchronous data-flow paradigms instead of procedural decomposition. This issue is explored in the ACE environment, property of International Submarine Engineering Ltd. This design approach conveys numerous advantages for designing and documenting a control system.

1.4 Summary Of Results

Two test runs were performed in the simulated implementation with a landmark height of 10 cm. These test results showed that the system was capable of detecting landmarks from at least 1.5 m away. The maximum error in the two test runs was approximately 3.0 cm tangential error and 3.1 cm radial error [1]. The maximum orientation error was 0.0021 radians, or 0.12 degrees. The landmark was successfully detected in all twelve full-light images. The landmark was not detected in one of the twelve half-light images. One of the test runs was performed with all laboratory lights on, and the second was performed with the lights off. The first test run resulted in an average error of (-0.5, -0.01, -0.005) [2], with maximum errors of (3.0, 1.1, 0.018) and standard deviations of (1.32, 0.46, 0.0091). The second test run resulted in an average error of (0.8, 1.3, 0.007), with maximum errors of (2.9, 3.1, 0.021) and standard deviations of (1.47, 1.06, 0.0111). These numbers show some deterioration in the darker test run, although the system clearly still operates in vastly different lighting conditions.

The ACE environment was successful in illuminating the elements of the design of the system. This facilitated the development and debugging of the system. The disadvantage of the ACE system is that it isolated the system from the facilities used to capture the images, and thus prevented testing with a large number of images.

[1] Here the tangential error is approximated as the x-direction error and the radial error is approximated as the z-direction error. These directions are approximately aligned.
[2] This error vector is shown as (x-dimension error in cm, z-dimension error in cm, orientation error in radians).
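As a minimal illustration of how such per-run summaries are computed from logged error vectors, the sketch below takes a list of (x, z, θ) error tuples and produces per-component mean, maximum-magnitude, and standard-deviation values. It is only an illustration of the arithmetic behind the numbers above; the reported figures come from the test logs in the appendices, not from this code, and the function name is hypothetical.

```python
import math

def summarize_errors(errors):
    """Per-component (mean, max-magnitude, std) for (x_cm, z_cm, theta_rad) errors."""
    n = len(errors)
    stats = []
    for i in range(3):
        comp = [e[i] for e in errors]
        mean = sum(comp) / n
        std = math.sqrt(sum((c - mean) ** 2 for c in comp) / n)
        stats.append((mean, max(comp, key=abs), std))
    return stats  # [(mean, max, std)] for x, z, theta in turn
```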
1.5 Overview

Chapter 2 presents a discussion of issues relevant to localization, including a general task decomposition and general system characteristics. It also summarizes several notable papers that present localization designs. Chapter 3 contains the requirements and explores design decisions for the design proposed in this thesis. The specific landmark design and all relevant equations are included. Chapter 4 runs through the detailed design as it is implemented in ACE, including data types and a task decomposition into ACE components. A small section at the end of Chapter 4 discusses the implementation of the system on a C40 processor board. Chapter 5 presents descriptions of the tests performed to evaluate the system, along with the results and a discussion of the results. Two general tests are performed: accuracy tests and a pair of simulated test runs using real images. Chapter 6 presents the conclusions inferred from the tests in Chapter 5, and proposes future work for improving the system.

Chapter Two - Previous Work

2. Previous Work

This section presents a summary of several localization systems, some related work, and a framework by which to classify these systems. Included is work by Krotkov, Nickerson, Hirose and Murata, and Tsai. The latter is related to camera calibration, while the former three are all mobile robot localization schemes. A discussion of the GPS system is also included, although the scale of this system is not suitable for the scale of mobile robotics described in this thesis.

The framework is given in three sections. The first section lists a set of tasks that every localization system must perform. The second section characterizes how localization systems perform these tasks. The third section presents an ideal localization system based on selecting the most desirable characteristics described in section two. Sections four and five present case studies of particular localization systems. Section six describes the topic of camera calibration, and in particular the camera calibration method of Tsai [Tsa87].

2.1 Localization System Tasks

Most landmark localization systems can be decomposed into roughly the same tasks. The following is a general task decomposition for localization systems. Some of the tasks are debatably optional, but some are clearly required. For example, every localization system must perform landmark detection and pose calculation, or it would not be a localization system. Even GPS (Global Positioning System) performs some form of landmark detection when it demodulates the satellite signals.

Initialization

Initialization is so common in all systems, not just localization, that it could be omitted. However, there are two initialization sub-tasks that pertain to localization systems in particular: maps must be initialized, and the robot's initial position must be obtained. In some cases, some calibration is also performed at initialization.

If the localization system does not track landmarks between sensing samples, finding the initial position would not be considered part of the initialization task; finding the initial position could be the same as finding the location during steady-state operation. If tracking is used, initialization may be performed every time tracking is lost, which greatly increases the importance of system initialization.
If tracking is frequently lost and system initialization takes too long, or requires the robot to perform some specific actions, then the localization system's utility degrades substantially. Map initialization is discussed in the following section.

Map Storage and Access

Every localization system that provides global pose information must store the absolute pose of each landmark in a map. The actual form of a map may vary in complexity, from a simple array storing only the pose of each landmark, to a relatively sophisticated database which must store the information necessary to detect or recognize each landmark. Maps also vary in their method of initialization. Most maps are calculated or measured off-line and stored in a static file. Some systems may be able to generate new landmark map entries on-line. In this case, map initialization may include a robot "wander" state which locates several landmarks to enter into the map.

Landmark Detection

All systems must detect landmarks. Typically the landmark detection algorithm is the salient feature of a localization system. Methods of landmark detection may vary greatly depending on the sensors used and the landmark design. Localization systems that use video may use a complex object recognition algorithm, while GPS simply monitors a given frequency for a modulated signal. Probably the most complex localization schemes are those that use naturally occurring landmarks. These systems must not only detect landmarks that are somewhat arbitrary, they must also have algorithms for selecting and storing new landmarks.

Landmark Tracking

Some systems may track landmarks between sensing samples to increase performance. Typically it is necessary to track landmarks if it takes longer to detect a landmark from scratch than the time available between samples.

There are two typical methods of tracking landmarks. The first is to retain the image coordinates of the landmark in the last sample and look for the landmark within a certain range of these coordinates. The first and second order derivatives of the image coordinate information may be used to increase the accuracy of the tracking. Another method is to derive the expected position of the landmark in the image from the last calculated location of the robot. Again, derivative information can be used for a more accurate prediction. This method, although it incurs additional processing, can make use of location information from other sources, such as a dead-reckoning system. This leads to increased reliability.

While landmark tracking leads to very large performance gains, it also incurs the disadvantage that if tracking is lost, the system will have to reinitialize itself to regain position. An obvious modification would be that if tracking is lost, a larger search region could be specified. In any case, once tracking is thoroughly lost, there may be a blackout period of location information, which may require the robot to remain immobile for the duration of the blackout.

Landmark Identification

For many systems, detecting a landmark is not sufficient: each detected landmark must be identified. Some landmarks are designed to contain an identifying feature that distinguishes one landmark from another; various bar-code identifiers are common amongst localization systems. Other systems use searches or assumptions to determine the identity of a landmark. For instance, a system that detects only vertical lines must use a combinatorial search to match detected lines to known lines in the environment. Finally, some systems automatically identify each landmark in the detection phase. For example, a localization scheme that detects naturally occurring landmarks may use the unique features of a landmark to detect it, so the identity of a landmark is automatically known once the landmark is detected.

Pose Calculation

Once landmarks are detected and identified, the pose of the camera must be calculated. Typically this involves calculating the pose of the camera relative to the landmark and then transforming that pose into the absolute reference frame. These transformations usually involve numerous triangulation and trigonometric calculations. Although these calculations are usually relatively straightforward compared to the usually complex landmark detection systems, they must be performed with care, since high sensitivities may cause large errors in the calculated location.

Position Filtering and Plant Modeling

For many systems, each location calculation from an individual sample or set of samples contains a high degree of random error. Assuming there are no systematic errors, these errors can be greatly reduced by filtering the location information. Filtering schemes may vary from a simple moving average to a Kalman filter system, complete with a model of the plant and complex error calculations. [May90] and [Gel74] give introductions to Kalman filters. Kalman filters may also combine location information from other sources, such as dead reckoning or odometry, to produce a better location estimate than the localization system alone. For most systems, the Kalman filter is considered part of the localization system, although it could be considered a separate component.

2.2 Localization System Characteristics

There are many different types of localization systems; nearly every mobile robot that is created must have some form of localization system. This section enumerates the possible characteristics of a localization system.

2.2.1 Coordinate Systems

All localization systems produce pose information relative to some frame of reference. Typically this information is produced in the Cartesian coordinate system, with either 3 or 6 degrees of freedom. GPS [Hur93], since it must produce localization information worldwide, produces position information in <latitude, longitude, altitude> tuples.

Local vs. Global

The coordinate system may be local or global. GPS is a good example of a global coordinate system [3], since all position tuples are relative to a single absolute frame of reference: <the equator, the first meridian, sea level> [4]. Some systems may produce pose tuples that are relative to some local frame whose location relative to an absolute frame is known. Since any output produced can be easily transformed to absolute frame coordinates, this type of system is still considered to produce output in global coordinates. A local-coordinate localization system would be one that produces output in coordinates relative to some landmark [5] whose location is unknown. Because the location of the landmark is unknown, it is obviously impossible to determine the location of the camera relative to an absolute frame. For these types of systems, the landmark is usually the point of interest for some task, and the absolute location of the robot is irrelevant.

[3] Hence the "global" in the name Global Positioning System.
[4] Oblivious to the fact that the world is hurtling through space, rotating and orbiting like a mad hunk of space debris.
[5] Or, more typically, relative to the object of interest for a local-coordinate localization system.

2.2.2 Landmarks

Every localization system must have at least one landmark. The composition of the chosen landmark varies widely from system to system. Krotkov [Kro89] uses only naturally occurring vertical lines, Murata and Hirose use specially coded vertical poles, and GPS uses satellites. All landmarks have the common characteristic that they are easily detected [6]. They may also be classified in the following ways.

[6] In the case of Krotkov, landmarks are extremely easily detected, but there are a huge number of false matches.

Active vs. Passive

An active landmark originates a signal to the robot to be sensed, while a passive landmark does not. GPS is an active system, since the satellites are responsible for sending out radio signals that can be received from anywhere in the earth's atmosphere. All other systems listed in the previous work chapter use passive landmarks. Active landmarks are more easily detected, but are more expensive.

Fabricated vs. Natural

A localization system may either rely on objects occurring naturally in the target environment, or require that specially fabricated objects be placed in the environment to serve as landmarks. Using fabricated landmarks is disadvantageous for several reasons:

- Landmarks may clutter the environment, becoming an obstacle to motion for the robot or for humans.
- Placing landmarks incurs costs for the landmarks themselves, and for the work required for a human to place them in the environment [7].
- The robot will not be able to move outside of an area that has been furnished with the landmarks.

[7] If location accuracy is paramount, the placement of the landmarks must be exact, or some "map-learning" sequence must be followed to ensure that the robot has an accurate map of landmark locations.

Naturally occurring landmarks do not have these disadvantages, but have one distinct disadvantage of their own: they are either difficult to detect or difficult to identify. The AECL ARK robot uses manually picked "distinct" objects in the environment. Although these objects are selected for ease of detection, the landmarks may have no features in common; therefore a general object recognition system is required.

Identification

Fabricated landmarks usually include some coded identification number in their shape. Often this encoding is similar to bar codes. For Murata and Hirose [Mur93], the identification is completely combined with the detection, such that anything that can be identified as a valid landmark identification is determined to be a landmark.

2.2.3 Sensing System

Localization may use many different types of sensing systems. Radio beacons, laser range finders, sonar, and video are common sensing equipment used in localization systems. Each system has advantages and disadvantages.

Radio sensing, such as GPS, can be inexpensive, quick and accurate, but can only be used at the global scale. An accuracy of 100 m or 10 m for various grades of GPS may be extremely accurate when calculating latitude and longitude, but it cannot be used to navigate a robot through a building. Custom radio triangulation can also be accurate and possibly cheap, but it can pollute the airwaves and may not be extensible.

Laser range finders can produce accurate results but are slow and expensive. Landmark detection and identification is very difficult with laser range finders, since they output depth maps, which require 3-D object recognition techniques.

Video is relatively cheap and has the added advantage that the image produced can be simultaneously shared with other systems, such as obstacle avoidance and object recognition/manipulation. Its disadvantage is that the software needed to process the output is very sophisticated and often very slow.

2.2.4 Processing System

Each system requires some amount of processing power to process the samples provided by the sensing system. The processing system includes any hardware or software involved in detecting landmarks and calculating the pose of the robot from these landmarks.

2.2.5 Maps

Every global localization scheme must have some sort of a map. The position of each landmark relative to the absolute frame of reference must be known in order to calculate the absolute position of the robot. At the simplest level, a map can be a simple array of landmark poses, where the index of a landmark's pose in the array is the identification number of the landmark. At the other end of the spectrum, a map may be stored in a database that also provides a model of the landmark necessary for recognizing it. Additional complexity may be introduced if the map is divided into subsections, or must otherwise be capable of selecting only the landmarks visible from the current pose of the robot. If a system is capable of learning the positions of new landmarks, the database must also allow new landmarks to be added dynamically.

2.2.6 Multiple Views

Most localization systems require that two landmarks be visible to calculate robot pose. Systems can be characterized by their ability to incorporate superfluous landmarks in the image; i.e., if three landmarks appear in the image, can the system calculate the location with any increase in accuracy?

2.2.7 Partial Views

Some systems may be able to gather information from a landmark that is only partly visible, while most cannot. This may have a significant impact on the performance of the localization system. An analysis of this effect is beyond the scope of this thesis, but the following is offered to convince the reader of the effect that sensing partial views of landmarks might have. Each localization system requires some number of landmarks to be present to calculate robot position. A system could be analyzed to find the mean number of landmarks appearing in an image, the percentage of time the required number of landmarks are present, and finally the increase to this percentage when partial views are allowed. This figure could then be used to calculate the overall effect on system robustness and accuracy.

2.2.8 Landmark Tracking

Some systems may track landmarks from one sensing sample to another in order to aid detection. This includes both retaining the last position of the landmark in the sample, and using the last known pose of the robot to predict the location of new landmarks in the next sample. The former approach may be simpler, but will not by itself aid in the detection of landmarks that become visible in the next image. The latter may be more susceptible to becoming lost when the current pose estimate is erroneous.

2.2.9 Dynamic vs. Static World

A system may make the assumption that it is the only agent in a given world, or it may account for differences in the environment due to the presence of other agents. This may amount to simply ignoring any failed detection, which may be caused by some other agent obscuring the view of the landmark. Most systems make the implicit assumption that the positions of the landmarks do not change, or the localization problem would become very difficult indeed.

2.2.10 Stop and Sense vs. Continuous Sensing

Many localization schemes require the robot to cease any movement in order to perform localization. This may be due to the slow speed of the sensors, the time required to process sensed samples, or the robot's need to "look around" for landmarks. This approach is only acceptable if the robot's task is not time-dependent or time-critical, i.e., it is not a real-time task. It is also not feasible if the robot is not allowed to stop. Therefore many systems are able to sense location information while in motion.

2.2.11 Accuracy

The accuracy of the information provided by the localization system is a characteristic of each system. Different applications will require different accuracy, and different sensing equipment and calculation methods can provide different levels of accuracy, usually as a trade-off against required processing time or the cost of processing hardware.

2.2.12 Robustness

Robustness for localization systems can be summed up in a single measure: what percentage of the time does the system provide accurate location information? Additionally, robustness also involves any required behaviour involved in reinitializing the localization system after it becomes lost. Systems that track landmark position within images from frame to frame are most susceptible to becoming lost. Systems that can find landmarks without tracking image position are not as susceptible to becoming lost, but still must deal with samples that do not contain any visible landmarks.

2.2.13 Integration With An Odometry System

An odometry system by itself actually constitutes a localization system, but all odometry systems accumulate errors; thus using odometry by itself is not practical. Many localization systems for mobile robots are tightly integrated with an odometry system. Typically the odometry system is used as the primary source of location information, and a visual system is used to correct any accumulated errors when the required number of landmarks becomes visible.

If an extended Kalman filter is used to merge location information from the odometry system with the visual system, then the visual and odometry systems become somewhat decoupled. Both localization systems feed input to the Kalman filter at their own rates, and each also consumes pose information, again at its own rate. The vision localization system will use the pose information for landmark tracking purposes, and the odometry system will use the pose information to correct for drift.

2.3 The Ideal Localization System

The design process always involves trading off characteristics against other characteristics. For example, processing requirements, representing cost, must be traded off against real-time performance. Using fabricated landmarks trades versatility for simplicity. Each application has requirements that dictate which tradeoffs must be made. However, if we ignore cost and processing requirements, we can come up with an "ultimate localization system specification" that would be suitable for nearly all applications. This exercise is useful for pointing out which characteristics are generally desirable in any localization system. Existing localization systems can be compared to the ultimate localization system to expose their design limitations [8].

[8] Of course the ideal localization system does not exist, and will not for some time, if ever. Comparing current localization systems with this ideal is not meant as a harsh criticism of any localization system. It must also be remembered that such a comparison is not fair, because this ideal has ignored its drawback: cost.

The ideal localization system would have the following characteristics:

1. Pose would be determined with 6 degrees of freedom. Any system that required less could simply ignore the superfluous information.
2. The system would provide pose information relative to some absolute frame. This could easily be translated into local pose if required.
3. The system would use naturally occurring passive landmarks. It would also be able to select new landmarks from its environment and add them to its map.
4. The system would be able to deal with the effects of other agents in the world.
5. The system would provide continuous sensing of location information.
6. The system would provide pose information that is 100% accurate [9].
7. The system would never become lost, i.e., its robustness would be 100%.

[9] An obvious impossibility.

2.4 Localization With Naturally Occurring Landmarks

2.4.1 Eric Krotkov

Krotkov [Kro89] presents a localization algorithm that uses vertical lines naturally present in the environment to locate the robot. However, since vertical lines cannot be uniquely identified by any salient features, a search must be used to find the correspondence between lines in the image and vertical lines on the map. This search has complexity O(kn^4) in time, where k is the number of vertical lines in the image and n is the number of vertical lines recorded as landmarks in the map. Sugihara [Sug88] introduces a constraint that can reduce this search to O(kn^3). This may not be a problem if there is only a limited number of landmarks, but it does place a hard ceiling on the robot range that can be serviced with a single map.

Krotkov gives results only for simulated experiments. He also states that these complexities may not be acceptable for some applications. This author agrees that this localization scheme could not be used in its present form for an application that requires continuous-sensing localization, such as playing soccer. This problem illustrates the proposition that a continuous-sensing localization system must have landmarks whose identity can be sensed rather than inferred. However, the ability to use naturally occurring landmarks makes this scheme more suitable for environments where it is not possible to place landmarks, or where placing landmarks would unsuitably alter the environment's characteristics.

2.4.2 AECL ARK

A paper by Nickerson et al. [Nic93] describes the ARK robot, which locates itself using a combination laser range finder and camera, both mounted on a pan-tilt head. Landmarks are manually selected permanent objects in the environment. As the robot moves along a pre-planned path, it keeps track of its location using a dead-reckoning system. When it believes it is in an appropriate spot to view two or more of the known landmarks, it stops and attempts to localize itself using these landmarks. First a landmark is identified and the direction to it is recorded. The range finder is then used to determine the distance to the landmark. The head is then rotated to face another visible landmark. Once the second landmark is detected and ranged, the position of the robot can be calculated by triangulation.

This stop-and-sense localization system is appropriate for the environment ARK was designed for: monitoring and maintenance in a nuclear power plant. It is not suitable for a dynamic environment, such as the soccer-playing robot testbed. Also, while the laser range finder is capable of precise measurements, it is neither fast nor economical.

2.5 Localization With Fabricated Landmarks

The next few systems use landmarks that are specifically created to be easily detected and identified. Typically, these landmarks are designed so that position calculations can be performed using directly measured characteristics. For example, if the robot is confined to move in a plane, the height of a landmark in the image can be used to calculate the distance to the landmark projected onto the optical axis of the camera. Also, the horizontal centroid of the object can be used to measure the direction to the landmark relative to the optical axis.

2.5.1 Global Positioning System (GPS)

GPS is, or perhaps will become, the most widely used localization system in existence. By receiving radio signals from satellites, an airplane, ship or land vehicle can calculate its position, including altitude, to within 10 m [Hur93]. The attitude of the craft (roll, pitch and yaw) must still be calculated by on-board sensors. Of course, this system operates on a scale not suitable for autonomous robots, with the exception of perhaps autonomous attack jet fighters and tanks, which are beyond the scope of this thesis. Still, GPS is an invaluable navigation aid to pilots, ship captains and others, giving testament to the usefulness of location information [10].

[10] During the writing of this thesis, developments in GPS to increase its accuracy, such as differential GPS, have drastically increased the suitability of GPS for use in mobile robotics.

2.5.2 Murata and Hirose

Satoshi Murata and Takeshi Hirose [Mur93] present a localization system that uses video with landmarks to correct the accumulated errors of the main dead-reckoning system. The video system uses pairs of bar-coded posts sparsely placed in the robot environment. A high rate of processing is achieved by using specialized hardware. The posts appear as vertical bars of black and white segments that can be matched to a bar-coding scheme. The camera is turned sideways so that each scan line is fed into a hardwired detector system that can recognize bar-coded patterns of variable size. The relative size of the landmark, and its scan-line number, can be used to calculate the distance and direction to the landmark. If a bar code is detected in several consecutive scan lines, a landmark is considered to be detected. If the bar's mate is also detected, then the localization system can calculate the position of the robot using triangulation. Since the bar codes used to detect the landmarks also provide the identity of the landmarks, there is no need for an iterative search.

Their results indicate the localization system provided smooth location information (no frequency given) for a robot moving at 20 cm/s. Although the speed of the robot was relatively slow, it must be noted that continuous sensing was provided, thus eliminating the need for the robot to stop to sense its environment. The drawbacks of this system are the need for hardwired video processing and the need for landmarks, which may be obstructions in the working environment.
The extrinsic parameters are those parameters that are not a function of the construction of the camera itself, but of its placement in the environment. The intrinsic parameters are therefore those parameters that derive from the physical makeup of the camera and the C C D itself. 25 Tsai describes his camera model by describing four transformations needed to go from world coordinates to camera frame buffer coordinates. Figure 2 shows the geometry of the problem space used for the following transformation. Step LFrom world coordinates to camera coordinates. In Figure 2, the real word point P is given two labels. P(x, y, z) represents its coordinates in the camera frame, indicated by the axes emanating from origin O. P(xw, yw, zw) represents the point's coordinates in the real world absolute frame, indicated by the axes emanating from origin O w ) . This can be determined by the well known rotation and translation operations. 26 Equation 2 X w y = R y w + T z z w Here R is a 3x3 rotation matrix and 7 is a 3x1 translation matrix. R and T together form the extrinsic parameters of the camera model. Step 2: From camera coordinates to ideal image plane coordinates. From the ideal camera model equations we find: Here,/ represents the focal length, (an intrinsic parameter) for the ideal camera and (Xu, Yu) represents the coordinates of the point P projected onto the image plane using perspective projection. Step 3:From ideal image plane coordinates to distorted image plane coordinates. The salient part of Tsai's model for this thesis is in the transformation from ideal image plane coordinate to distorted image plane coordinates that models the effects of radial distortion. This transformation is modeled by these equations. Equation 3 Equation 4 Yd+Dy=Yu u 27 The Dx and Dy represent the distortion, and are given in the following equations. Equation 5 Dx = Xd(K]r2 + K2r4+--.) Dy=Ycl(Klr2 + K2r4+--) Equation 5 introduces two new internal parameters Kj and K2. These parameters are called the distortion coefficients. Tsai states that these coefficients are part of an infinite series, but for accurate results, only the first term, K;, is required. Step 4:From distorted image plane coordinates to frame buffer coordinates. The last step is to transform the distorted image plane coordinates to the row and column offsets of the frame buffer as it is stored in the computer memory. This transformation involved numerous intrinsic parameters related to the physical dimensions of the C C D chip itself. This transformation is given by the following equations. 28 Equation 6 Xf=sxd"\Xd+Cx Yf=dy-%+Cy Tsai lists the meaning of each variable [Tsa87], page 226: (Xfi Yf) row and column or the image pixel in computer frame buffer, (C„ Cy) row and column of the center of the image, Equation 7 N.. line) direction, d' = d. dx center to center distance between adjacent sensor elements in X (scan direction, dy center to center distance between adjacent C C D sensors in the Y Ncx number of sensor elements in the X direction, NfX number of pixels in a line as sampled by the computer. sx the uncertainty image scale factor. The Calibration Algorithm Tsai's calibration algorithm is beyond the scope of this thesis, and is well documented in his paper [Tsa87]. It is sufficient to note that the algorithm was not suitable for a continuous sensing localization system for several reasons. The most important reason is that the algorithm requires eleven images of the target landmark to provide location images. 
The second reason is that the algorithm requires human intervention to detect several dots. 29 Chapter Three - System Outline 3. System Outline The prime goal of this thesis is to explore the possibility of a real-time localization system using minimal hardware that is capable of localization with only a single visible landmark. This chapter is structured to present a specification of the required system, a discussion all salient design issues, and an overview of the final design. The detailed design is presented in the following chapter. 3.1 System Specifications As discussed in section 1.2, the design presented in this thesis is an exploration of the issues relevant to creating an on-board localization system for a dynamic, real-time robot testbed environment. This dictates many hard constraints on the system. It is unlikely that all of these constraints could be fully satisfied even in a final working version. Academically, however, it is these deficiencies that would prove most interesting from the point of view of research. If the sensing and plant systems worked perfectly, there really would be no need to test robot control architectures on a physical testbed. 1. Continuous Sensing - The robot controller should not have to stop to sense position. Requiring the robot controller to do so would impair the flow of the soccer game and put the robot at a distinct disadvantage to any other robots that do not need to stop to sense position. This restriction does not apply to reinitializing the localization system should it become unstable. 30 2. Real-Time Performance - The soccer playing test bed is a rapid multiple-agent testbed. Decisions made by the controller must include a notion of time when evaluating correctness, i.e., the right decision too late is the wrong decision. It would be impossible for the robot to make decisions in a real-time frame if sensing was not guaranteed to provide information at a reasonable rate. For the purpose of this thesis a "reasonable rate" is set at 10Hz. 3. Non-Obtrusive Landmarks - The playing field for the soccer playing robots is a simple rectangular pen made of wood. The landmarks cannot become obstacles on the playing field. They can (actually they must), however, alter the visual characteristics of the playing field. This means that landmarks may be placed flush with the sides of the playing field. 4. Minimal Hardware Requirements - The localization system must be run concurrently with the controllers of both robots and with any other sensing systems used. Processing resources will always be limited. For this thesis, the localization system is allocated a single Texas Instruments TMS320C40 processor. This requirement is obviously in conflict with requirement 2. In the end some acceptable tradeoff must be reached. 5. Single Camera Localization - This restriction is not a requirement of the soccer playing testbed, rather a constraint on available resources. The hardware allocated for this project includes only a single camera. Any future attempts to perform on-board localization may use multiple cameras if such hardware is available. 6. Range Required - An approximate maximum range for the testbed will be set at 3 meters. The minimum distance at which the landmark is detectable shall be set at 15cm. The required angular range from which a landmark is detectable shall be set at +/- 70° from "dead-on". 31 7. 
7. No Odometry Required - The robots in the soccer playing testbed are simple remote control model race cars, which are not capable of supporting an odometry system. It is not known what platform will be used for an on-board sensing testbed. This means that the system cannot count on the availability of an odometry system to provide supplementary or primary location information. The design should, however, allow for the addition of such a system.

3.2 Design Highlights

This section introduces salient points of the Robot Onboard Localization using Landmarks (hereafter ROLL) design. The first section outlines equations governing localization with a single landmark view. The second section deals with landmark design. The third section outlines algorithms for detecting landmarks. The fourth section discusses the means of identifying the landmarks. The fifth section presents a second order model of the RWI mobile robot that is the current target platform for this system.

3.2.1 General Approach

Requirements 2 and 4 are clearly competing and/or contradictory (contradictory if it is impossible to satisfy both, and competing because real-time performance can usually be improved by adding processing hardware, and hardware can be reduced at the expense of real-time performance). The real-time performance of the design will not be known until the system is implemented on a physical system; however, it is expected that it will be difficult to satisfy both requirements. Design of the system will be primarily concerned with minimizing the required processing. All algorithms and methods are to be kept as simple as possible. Each localization subtask could be performed with a great deal of sophistication, but it is not known which actually require more than the simplest approach. The sophistication of individual subtasks can be increased when required, usually independently of the other subtasks.

Iterative searches should be avoided when possible. If it is not possible to avoid an iterative search, some consideration must be made on how to bound the search. This will keep performance fairly constant, a key characteristic for real-time systems. Error handling should be minimized, or avoided altogether. For example, if a landmark is not detected in the current image where it is expected, it is better to simply move on to the next image than to enlarge the search area indefinitely.

3.2.2 Localization With A Single Landmark

Localization with a single landmark increases the percentage of time that a robot is able to calculate location. For any localization system, it is likely that for some percentage of time no landmarks or only a single landmark will be visible. If this percentage of time is too high, a localization system that requires two landmarks to calculate position will suffer a degradation of performance. This may interfere with the system's ability to perform continuous sensing. The decision to use a single landmark localization system is a result of the specification that the system allow continuous sensing.

All of the localization systems described in Sections 2.4.1, 2.4.2, 2.3.1, and 2.5.2 require two visible landmarks to calculate position. These systems require two landmarks because they only sense two measurements from each landmark: direction and distance. Because the pose of the robot has at least three unknowns, two landmarks must be visible to have a completely defined system of equations.
However, if it were possible to obtain orientation information from the landmark, then it should be possible to determine the pose of the camera from a single landmark. The following equations show the transformations required to calculate the absolute pose of the camera, starting with the pose, including orientation, of the landmark relative to the camera. The equations presented here are simple two dimensional homogeneous transformations based on those found in [McK91]. An extra row and column were added to provide for orientation information, so that the transformation becomes a pose transformation rather than a position transformation. The general pose transformation can be represented by:

Equation 8
$$p_2 = T(m, n, \phi) \cdot p_1$$

This is a short form representation for:

Equation 9
$$\begin{bmatrix} z_2 \\ x_2 \\ \theta_2 \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\phi & -\sin\phi & 0 & m \\ \sin\phi & \cos\phi & 0 & n \\ 0 & 0 & 1 & \phi \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} z_1 \\ x_1 \\ \theta_1 \\ 1 \end{bmatrix}$$

Where:
T(m, n, φ) is a transformation that performs a translation of (m, n) followed by a rotation of φ.
p_1 is the pose before the transformation.
p_2 is the pose after the transformation.

For these equations, rotations are assumed to be positive in the counter clockwise direction, and the axes have been renamed, so that the typical x axis has been renamed to z and the y axis has been renamed to x. This is done to align the Cartesian coordinates used for robot pose with the typical Cartesian coordinate axes used in the ideal camera model.

Equation 8 can be used to calculate the absolute pose of the camera, starting with the pose of the landmark relative to the camera, where the absolute pose of the landmark is given. First, the pose of the camera is calculated relative to the landmark.

Equation 10
$$p_{cr} = T(0, 0, -\theta_{lr}) \cdot T(-z_{lr}, -x_{lr}, 0) \cdot o$$

Where:
p_cr is the pose of the camera relative to the landmark.
[z_lr, x_lr, θ_lr, 1]^T is the pose of the landmark relative to the camera.
o is the origin - [0, 0, 0, 1]^T.

Two general transformations are needed since this equation represents a rotation followed by a translation, rather than a translation followed by a rotation. The absolute pose of the camera can then be found using the absolute pose of the landmark by applying a single general homogeneous transformation.

Equation 11
$$p_{ca} = T(z_{la}, x_{la}, \theta_{la}) \cdot p_{cr}$$

Where:
p_ca is the absolute pose of the camera.
[z_la, x_la, θ_la, 1]^T is the absolute pose of the landmark.

Though it is not difficult to find the absolute pose of the camera given the pose, including orientation, of the landmark relative to the camera, it is difficult to accurately measure the orientation of the landmark from a single image. This was found to be the largest problem to overcome with the current design. The following subsections describe an initial landmark design that proved insufficient due to a condition called the image plane parallelism problem, and another landmark that avoids this problem. The latter landmark was selected to be used in the design in this thesis.
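To make the bookkeeping concrete, the following sketch (illustrative names, not taken from the ROLL source) applies the homogeneous matrix of Equation 9 and chains Equations 10 and 11 to recover the absolute camera pose:

```cpp
// Pose transformations of Equations 8-11 (sketch under the thesis's
// conventions: poses are [z, x, theta], rotations counter-clockwise).
#include <cmath>

struct Pose { double z, x, theta; };

// Apply the homogeneous matrix T(m, n, phi) of Equation 9.
Pose transform(double m, double n, double phi, const Pose& p) {
    Pose out;
    out.z = std::cos(phi) * p.z - std::sin(phi) * p.x + m;
    out.x = std::sin(phi) * p.z + std::cos(phi) * p.x + n;
    out.theta = p.theta + phi;
    return out;
}

// Equations 10 and 11: camera pose relative to the landmark, then the
// absolute camera pose given the landmark's absolute pose.
Pose cameraAbsolutePose(const Pose& landmarkRelCamera,
                        const Pose& landmarkAbsolute) {
    const Pose origin{0.0, 0.0, 0.0};
    Pose camRelLandmark = transform(0.0, 0.0, -landmarkRelCamera.theta,
        transform(-landmarkRelCamera.z, -landmarkRelCamera.x, 0.0, origin));
    return transform(landmarkAbsolute.z, landmarkAbsolute.x,
                     landmarkAbsolute.theta, camRelLandmark);
}
```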
3.2.3 Landmark Design

The design of the landmark itself dictates much of the rest of the system. Landmarks should be specifically designed to be easily detectable, easily identifiable, and to allow easy calculation of pose from their appearance in the image. Much research is being done in the object recognition field on how to detect, and calculate the pose of, general objects in images. These methods are usually very sophisticated, and not suitable for real time systems. The point of designing a landmark is to avoid the need to use these object recognition methods.

3.2.3.1 The First Landmark

Figure 3 shows the appearance of the first landmark design. The main body of the landmark consists simply of a black square, 10cm by 10cm, with a number of black rectangles above the landmark for identification purposes. The white circle in the bottom corner indicates a later improvement. White circles were added in each corner of the main body. Once the landmark was detected, the centroids of these circles were calculated. The position of the centroids could be measured more accurately than the position of the corners.

Figure 3: The First Landmark Design (a 10cm by 10cm black square; the number of bars above indicates identification; a circle was added in each corner for increased accuracy)

This simple landmark allowed calculation of its orientation, and also had several notable advantages. First, it was easily detectable using corner detector matched filters [Der90] and [Ran89], which could also locate the identifying rectangles. The landmark is completely flat, and thus, if mounted on an existing wall, would not be a physical obstruction in the environment. It was also extremely easy to produce, requiring only a draw program and a suitable printer.

The following equations can be used to calculate the pose of the landmark relative to the camera. Figure 4 and Figure 5 serve as a graphical reference to the physical values represented by the variable names. The positions of the corners in Figure 4 are given in ideal image plane coordinates. These values must first be found in the actual image, and translated into ideal image plane coordinates through the transformations provided by Tsai's camera model as discussed in 2.6.2. This step is not shown in the following calculations. These equations were derived from the ideal camera model [Hor91] pages 19-20.

Equation 12
$$z_l = \frac{f \cdot 10\mathrm{cm}}{y_{ul} - y_{ll}}, \qquad z_r = \frac{f \cdot 10\mathrm{cm}}{y_{ur} - y_{lr}}$$

Equation 13
$$x_l = \frac{(x_{ul} + x_{ll})/2}{f}\,z_l, \qquad x_r = \frac{(x_{ur} + x_{lr})/2}{f}\,z_r$$

Equation 14
$$x_{la} = \frac{x_l + x_r}{2}, \qquad z_{la} = \frac{z_l + z_r}{2}$$

Equation 15
$$\theta_{la} = \tan^{-1}\!\left(\frac{z_l - z_r}{x_l - x_r}\right) + \pi$$

where f is the focal length and the subscripts ul, ur, ll, and lr denote the upper left, upper right, lower left, and lower right corners respectively.

Figure 4: The Landmark As It Appears In The Ideal Image Plane

Figure 5: Calculating Relative Position For Landmark Type 1 (camera position at the origin (0, 0, 0°); landmark left edge location (x_l, z_l); landmark right edge location (x_r, z_r); landmark location (x_la, z_la); landmark orientation θ_la)

This landmark suffered from two problems: identification was unreliable, and, more importantly, the orientation of the landmark could not be accurately measured at around 180°. This is when the camera is looking directly at the landmark. This was found to be related to the sensitivities of Equation 19. It was also found that this result can be generalized into a condition called the image plane parallelism problem.

3.2.3.2 The Image Plane Parallelism Problem

It is practically impossible to calculate the orientation of any line segment or plane segment that is parallel to the image plane with a single camera. This is because the sensitivity of the orientation calculation approaches infinity as the measured object becomes parallel with the image plane.
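To see how severe this becomes in practice, consider a small numeric experiment (the focal length and pixel pitch below are assumed values chosen only for illustration, not the parameters of the ROLL camera): viewing the landmark dead-on at 3m, a single-pixel error in the measured image height of one edge shifts the computed z for that edge by roughly 18cm, and Equation 15 turns this into tens of degrees of orientation error.

```cpp
// Numeric illustration of the image plane parallelism problem (sketch).
#include <cmath>
#include <cstdio>

int main() {
    const double f = 0.0048;      // assumed focal length: 4.8mm
    const double pixel = 10e-6;   // assumed pixel pitch: 10um
    const double height = 0.10;   // landmark height: 10cm
    const double width = 0.10;    // landmark width: 10cm
    const double range = 3.0;     // landmark viewed dead-on at 3m

    // Dead-on, both edges are at the same range, so z_l - z_r = 0 and
    // Equation 15 gives exactly 180 degrees. Now let the measured image
    // height of the left edge be off by a single pixel.
    double imageHeight = f * height / range;
    double zLeft = f * height / (imageHeight + pixel);  // Equation 12
    double thetaErr = std::atan((range - zLeft) / width);
    std::printf("z_l error: %.3fm -> orientation error: %.1f degrees\n",
                range - zLeft, thetaErr * 180.0 / 3.14159265358979);
    return 0;
}
```

With these numbers the single-pixel error produces roughly 60° of orientation error, consistent with the observation that orientation cannot be measured reliably near 180°.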
Figure 6: Image Plane Parallelism (the camera optical center viewing landmark edges at (x_l, z_l) and (x_r, z_r))

From Section 3.2.3.1, Equation 15 we know that

Equation 16
$$\theta_{la} = \tan^{-1}\!\left(\frac{z_l - z_r}{x_l - x_r}\right) + \pi$$

Clearly it can be seen that

Equation 17
$$\lim_{\theta_{la} \to \pi} \left(z_l - z_r\right) = 0$$

and given that a fixed measurement error becomes unboundedly large relative to a quantity that approaches zero,

Equation 18
$$\lim_{x \to 0^{+}} \frac{1}{x} = \infty$$

it follows that:

Equation 19
$$\lim_{(z_l - z_r) \to 0} \frac{\partial\,\theta_{la}}{\partial (z_l - z_r)} = \infty$$

This equation means that as the landmark becomes parallel to the image plane, the calculation of θ_la becomes infinitely sensitive to the values of x_l, x_r, z_l, and z_r; that is, small errors in the measurement of the latter values cause very large errors in the calculated value of θ_la. To summarize, it is impossible to accurately measure the orientation of an object that is nearly, or completely, parallel to the image plane. This effect is known as the image plane parallelism problem.

3.2.3.3 The New Landmark

It is possible to avoid the image plane parallelism problem if the landmark is made so that orientation measurements are taken from a part of the landmark that cannot be parallel to the image plane. Figure 7 shows a landmark composed of three black dots and an identifying pattern. The middle dot is offset from the face of the landmark so that the three dots form a triangle. This triangle will only be parallel to the image plane when the landmark orientation is perpendicular to the optical axis. At this angle, the dots themselves will not be visible, so detection is impossible anyway. Thus the image plane parallelism problem is avoided.

Figure 7: Landmark Design 2 (three dots with an identification pattern above; middle dot offset 5cm from the face; upper-to-lower dot spacing 10cm)

The equations for measuring the relative pose of this landmark are derived below. Figure 8 shows the physical values that the variables represent. The given values are the locations of the three landmark dots in image coordinates. (x_ud, y_ud), (x_md, y_md), and (x_ld, y_ld) are the coordinates of the centroids of the upper, middle and lower dots respectively. The constants focal length, landmark height, and middle dot offset are also known.

Figure 8: Calculating the Relative Position of Landmark Type 2 (camera C at (0, 0); middle dot M; shadow ray R from C through M to shadow point S at (x_s, z_s); landmark L at (x_l, z_l); point A; line X: x = x_l; line Z: z = z_l)

Calculating point L is a straightforward application of the ideal camera model.

Equation 20
$$z_l = \frac{f \cdot h_{landmark}}{y_{ud} - y_{ld}}$$

Equation 21
$$x_l = \frac{(x_{ud} + x_{ld})/2}{f}\,z_l$$

Where f is the focal length of the camera, and h_landmark is the height of the landmark.

Calculating the relative orientation of the landmark θ requires the use of similar triangles. In Figure 8, the triangle CAS is similar to triangle BLS. Once point S has been found, the length of line segment BL can be calculated, since it is proportional to line segment CA. The x value of point S is calculated using the ideal camera model,

Equation 22
$$x_s = \frac{x_{md} \cdot z_l}{f}$$

and the z value of point B can be calculated in the same manner. Now

Equation 23
$$\angle SBL = \cos^{-1}\!\left(\frac{x_s - x_l}{z_s - z_b}\right)$$

and the lengths of line segments BL and SL are known, which means the angle θ can be calculated using the sine law and the sum of the angles of a triangle rule for triangle LMB.

Equation 24
$$\angle LMB = \sin^{-1}\!\left(\frac{(z_l - z_b)\,\sin(\angle SBL)}{\text{middle dot offset}}\right)$$

Equation 25
$$\theta = \angle SBL + \angle LMB$$
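The following sketch implements the relative pose calculation for this landmark. Equations 20-22 are implemented as given; for the orientation step, rather than reproduce the thesis's similar-triangle construction (Equations 23-25), an equivalent direct formulation is used: the middle dot must lie on its viewing ray at a distance of one dot offset from the face centre. All names are illustrative.

```cpp
// Relative pose of the second landmark type (sketch, not the ROLL code).
#include <cmath>

struct RelativePose { double x, z, theta; };

RelativePose landmarkRelativePose(
    double xud, double yud,   // upper dot centroid (ideal image plane)
    double xld, double yld,   // lower dot centroid
    double xmd,               // middle dot centroid, x only
    double f,                 // focal length
    double hLandmark,         // 10cm: upper-to-lower dot spacing
    double dotOffset)         // 5cm: middle dot offset from the face
{
    // Equations 20 and 21: range and lateral offset of the face centre.
    double zl = f * hLandmark / (yud - yld);
    double xl = ((xud + xld) / 2.0) * zl / f;

    // The middle dot lies on the ray x = (xmd / f) * z, at distance
    // dotOffset from (xl, zl); solve the resulting quadratic and take
    // the smaller root, which is the point in front of the face.
    double s = xmd / f;
    double a = 1.0 + s * s;
    double b = -2.0 * (xl * s + zl);
    double c = xl * xl + zl * zl - dotOffset * dotOffset;
    double zm = (-b - std::sqrt(b * b - 4.0 * a * c)) / (2.0 * a);
    double xm = s * zm;

    // The face normal points from the face centre toward the middle dot;
    // dead-on this gives theta = pi, matching the thesis's convention.
    double theta = std::atan2(xm - xl, zm - zl);
    return RelativePose{xl, zl, theta};
}
```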
3.2.4 Landmark Detection

Landmark detection using video is an object recognition problem. Features in the image must be matched to a model of the object to be detected. Many sophisticated object recognition techniques use explicit models of objects with several different feature types. These techniques are not suitable for real-time localization because of the amount of processing required. Forming a landmark detection scheme must instead involve using implicit knowledge of the landmark model in order to quickly detect landmarks. The detection methods for both landmark types are described in the following sections. Although the first landmark type was rejected due to accuracy problems, its detection algorithm is included to contrast with the detection method for the second landmark. This illustrates that detection methods are highly dependent on the landmark design. It is also worthwhile to examine other problems with this landmark.

3.2.4.1 Implicit Knowledge For Both Landmarks

For both of the landmark types, the immediate vicinity, excluding shadows, contains only white and black (each landmark is pure black, on an 8.5" x 11" sheet of white paper). Assuming a binary image simplifies the landmark detection greatly. Within the immediate vicinity of the landmark, all black pixels can be assumed to be part of the landmark. Since the image is actually greyscale, some threshold value will be assumed to determine which pixels are white and which are black.

Another important piece of implicit knowledge is that landmarks are mounted so that the upper and lower dots are aligned vertically. Also, the landmark's middle dot must be horizontally aligned with the camera optical axis. Calculations could be added to remove the need for these accurate landmark mounting specifications, but this would add a level of sophistication to the calculations that would increase processing time.

3.2.4.2 The First Landmark Type

Two designs were created for the ROLL system. The first proved unsuccessful. This section details the design of the first landmark and the reason for its failure.

The first landmark, shown in Figure 3, on page 37, is a simple square. Therefore it can be modeled as four corners. Each of these corners can be detected by convolving a corner detector over the image. Detected corners are then matched into a list of rectangles. Rectangles that may be landmarks are selected and position is calculated. The following algorithm describes the landmark detection procedure.

1. For each corner type:
1.1. Convolve the image with the particular corner detector.
1.2. Scan the convolution for local peaks, adding each peak to a list.
2. For each corner in the "upper left" list, find matching "upper right", "lower left", and "lower right" corners. Combine these corners to form a rectangle and add it to a "rectangle" list.
3. Scan the "rectangle" list for rectangles that may be landmarks.
4. Calculate camera location from each recognized landmark.

3.2.4.2.1 Corner Detectors

Figure 9 shows the corner detectors that were used for this landmark. These detectors are 4x4 matched filters modified to provide a 0 response to constant regions and to reduce the filters' response to lines. In each filter, the quadrants adjacent to the "corner", denoted by the quadrant of -5's, are given heavier weighting than the quadrant opposite the corner. This heavier weighting reduces the corner detector's response to lines. Note that these filters are zero-sum, but not balanced. This means that the corner detectors are directional, so several are required to detect corners at different orientations.

These detectors are designed specifically for right-angled corners oriented along the vertical and horizontal axes. While it is known that the landmarks are mounted vertically, perspective effects make many corners appear to be either wider or sharper than 90°.
It is assumed that these effects will be negligible for most landmark views. It is noted that this is generally true for almost all poses except for landmarks that are very close and at an oblique angle to the camera.

upper left filter =
-5 -5  2  2
-5 -5  2  2
 2  2  1  1
 2  2  1  1

upper right filter =
 2  2 -5 -5
 2  2 -5 -5
 1  1  2  2
 1  1  2  2

lower left filter =
 2  2  1  1
 2  2  1  1
-5 -5  2  2
-5 -5  2  2

lower right filter =
 1  1  2  2
 1  1  2  2
 2  2 -5 -5
 2  2 -5 -5

Figure 9: The Corner Detectors

3.2.4.2.2 Landmark Verification

When selecting rectangles from the rectangle list, a number of characteristics of the landmarks can be used to eliminate false rectangles:

1. Landmarks can be taller than they are wide, but not vice-versa.
2. Upper corners must be directly above their respective lower corners, within some tolerance.
3. Upper and lower corners for the same side should be equidistant from horizontal, within some tolerance.
4. Landmarks should have some shape pattern above them for unique identification.

Once landmarks have been identified and the relative position calculated, this position could be compared with an estimate of where the landmark should be, based on a current estimate of robot position. This would be a very effective test, but was not implemented for the first landmark type.

3.2.4.2.3 Efficiency Considerations

Without hardware to perform the corner detection convolutions, this detection method is very slow. Even with hardware, the processing required to search the entire image four times in order to find all the local maxima for the convolutions is considerable. It is necessary to restrict searches to some small regions of the image. Thus some kind of tracking or prediction is essential.

The procedure to match corners into rectangles also causes some concern. In order to be done properly, a combinatorial search, O(n(n-1)(n-2)(n-3)), must be performed. This could lead to an unacceptable search time, should the corner list become too long. Either the number of corners located in the image must be somehow constrained, or the corners must be matched using heuristics. The latter was selected since no method of eliminating false corners was discovered.

3.2.4.2.4 Notable Problems

This landmark type was eventually rejected due to inaccuracies caused by the image plane parallelism problem described in Section 3.2.3.2. There were, however, several other problems with this detection method. The heuristics used to match corners into rectangles proved unreliable for the identifying rectangles above each landmark. In order to rectify this situation, landmarks were identified by traversing a vertical line up from the middle of the top of the landmark; the number of black segments crossed was counted. This in turn brought up the problem that for the identifying rectangles to be large enough to be detected, they ran the risk of disappearing off the top of the image if the landmark was too close. Even if the search regions of the image were reduced, the convolutions required too much processing power to be performed without specialized hardware.

The assumption that perspective effects were negligible in most cases proved invalid. Although the corner detectors proved very resilient to the perspective effects, accuracy decreased. In order to improve the accuracy of corner detection, which is a single pixel at best (subpixel interpolation schemes were considered but not implemented, since it was determined that the increased accuracy still would not help solve the image plane parallelism problem), white circles were added inside each corner.
The centroid of these circles could be calculated more accurately than the location of the corner via corner detectors, but this required a significant amount of increased processing.

3.2.4.3 The Second Landmark Type

Figure 10: The Second Landmark Type (marked dimensions: 30cm, 30cm, 1cm, and 10cm)

The second type of landmark is modeled as a triangle of black dots. Since each dot is the same, we cannot immediately distinguish upper, lower and middle dots from each other. It is also not possible to use a matched filter to detect the circles, because matched filters are not scale invariant. Despite these restrictions, it is relatively easy to detect the landmarks by assuming that any black pixel is part of a circle, and providing a prediction of the circle locations. The general algorithm for detecting the second landmark type is as follows:

1. Predict the location of each of the dots for a given landmark.
2. For each dot:
2.1. Attempt to detect a dot near the predicted location.
3. If more than one dot was not found:
3.1. End. (Failure)
4. If one dot was not found:
4.1. Form several new predictions based on the two dots that were found.
4.2. Repeat from step 2 for the missing dot, ending at step 4 if all dots are not found.
5. Verify that the three dots form a landmark.
6. Identify the landmark.
7. Calculate the position of the camera from the location of the three dots.

Step 4 was added when it was found that the algorithm otherwise failed much too often. To rectify the situation, step 4 was added and the landmarks were mounted with more care. Step 4 represents a second try if the predictions were off by enough to miss one of the dots. It was found that when a single dot was missed, the second attempt nearly always found the missing dot.

3.2.4.3.1 Dot Detection

Dot detection amounts to finding convex regions of black pixels, which is a fairly simple and straightforward process (a code sketch follows these steps):

1. Search over some region in the vicinity of the predicted location for a black pixel.
2. Starting with a rectangle enclosing only the black pixel, expand the rectangle until it encloses all black pixels in the dot.
3. Verify that the dot may be part of a landmark.
4. Calculate the centroid of the dot.
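The sketch below illustrates these four steps, assuming a thresholded 8-bit greyscale image; the names, threshold, and tolerance values are illustrative, not taken from the ROLL implementation. The grid search of step 1 is elaborated in the next subsection, and the verification of step 3 anticipates the shape tests described under Dot Verification.

```cpp
// Dot detection sketch: grid search, rectangle growing, verify, centroid.
#include <cstdint>

struct Dot { double row = 0, col = 0; int area = 0; bool found = false; };

static bool isBlack(const uint8_t* img, int w, int r, int c) {
    return img[r * w + c] < 128;  // assumed black/white threshold
}

static bool rowHasBlack(const uint8_t* img, int w, int r, int c0, int c1) {
    for (int c = c0; c <= c1; ++c) if (isBlack(img, w, r, c)) return true;
    return false;
}

static bool colHasBlack(const uint8_t* img, int w, int c, int r0, int r1) {
    for (int r = r0; r <= r1; ++r) if (isBlack(img, w, r, c)) return true;
    return false;
}

Dot detectDot(const uint8_t* img, int w, int h,
              int predRow, int predCol, int half, int step) {
    Dot dot;
    // Step 1: sample a grid around the prediction for a seed black pixel.
    int seedR = -1, seedC = -1;
    for (int r = predRow - half; r <= predRow + half && seedR < 0; r += step)
        for (int c = predCol - half; c <= predCol + half; c += step)
            if (r >= 0 && r < h && c >= 0 && c < w &&
                isBlack(img, w, r, c)) { seedR = r; seedC = c; break; }
    if (seedR < 0) return dot;

    // Step 2: expand a rectangle from the seed until it encloses the dot.
    int top = seedR, bot = seedR, left = seedC, right = seedC;
    bool grew = true;
    while (grew) {
        grew = false;
        if (top > 0 && rowHasBlack(img, w, top - 1, left, right)) { --top; grew = true; }
        if (bot < h - 1 && rowHasBlack(img, w, bot + 1, left, right)) { ++bot; grew = true; }
        if (left > 0 && colHasBlack(img, w, left - 1, top, bot)) { --left; grew = true; }
        if (right < w - 1 && colHasBlack(img, w, right + 1, top, bot)) { ++right; grew = true; }
    }

    // Steps 3 and 4: verify shape and compute the centroid.
    long sumR = 0, sumC = 0; int area = 0;
    for (int r = top; r <= bot; ++r)
        for (int c = left; c <= right; ++c)
            if (isBlack(img, w, r, c)) { sumR += r; sumC += c; ++area; }
    double compactness = double(area) / ((bot - top + 1) * (right - left + 1));
    if (area == 0 || compactness < 0.785 - 0.15 ||  // pi/4 minus an assumed tolerance
        (right - left + 1) > (bot - top + 1))       // wider than tall: reject
        return dot;
    dot.row = double(sumR) / area; dot.col = double(sumC) / area;
    dot.area = area; dot.found = true;
    return dot;
}
```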
3.2.4.3.1.1 Search Patterns

In order to search for a black pixel near the predicted location of a dot, some search strategy must be implemented. The effectiveness of the search pattern has a significant impact on the performance of the system in general. The search should be restricted to some area, and need not examine every pixel, since dots will be more than one pixel in size. It is also desirable to search first where the centroid is most likely to appear within the search area.

A grid must be established that samples the region. Only pixels on the grid points should be checked. In accordance with sampling theory, the grid width must be no bigger than the expected width of the dot, and the grid height must be no bigger than the expected height. For a safety margin, half of the expected height and width are selected as the grid height and width.

There are two reasonable approaches available to search the grid. The first is to order grid points by proximity to the predicted dot centroid, and then search them linearly. This is approximately equivalent to creating a square around the predicted location, and expanding its width and height by one grid point until it intersects a black pixel. The second approach is to search horizontally first, shifting vertically until a black pixel is found. The latter method was selected because variance in the horizontal direction is more common than in the vertical direction. This is because rotation can cause rapid horizontal movement when the landmark is far away from the camera, while the vertical movement of dots due to forward or backwards movement of the robot tends to be more subdued. Thus it is more likely that the vertical coordinate of the prediction is correct while the horizontal coordinate is erroneous.

3.2.4.3.1.2 Dot Verification

We can make several assumptions about the landmarks that will facilitate detection and help eliminate false matches.

• Dots will be fully visible - no occlusion, no partial views.
• The landmark will be within a specified range.
• The landmark will be mounted at camera level and vertical.
• Perspective effects will be negligible.

With these assumptions, there are several checks that can be made to reduce false dot detection. First, because the robot is confined to a horizontal plane, and the dots are mounted on vertical 2 dimensional planes, the landmark dots will appear as ellipses that can never be wider than they are tall. Secondly, the area of the detected dot should not differ from the predicted area by more than some tolerance. Lastly, a compactness test can be applied.

Compactness Test

The compactness of a dot is the ratio of its area to the area of the smallest rectangle that completely covers the dot. This test helps to eliminate any detected dots that are not convex. A non-convex dot, more appropriately called a blob, will have a relatively small compactness value while a convex dot will approach π/4. A perfect rectangle will have a compactness of 1. The ideal compactness of an ellipse is given by the following equation:

Equation 26
$$\text{compactness} = \frac{\pi a b}{4 a b} = \frac{\pi}{4}$$

where a is the primary radius and b is the secondary radius. Note that the compactness of an ellipse is independent of its radii; thus this test is scale independent.

The compactness test may also reject valid dots that are occluded, so we must assume either that the landmark dots will not be occluded, or simply accept that landmarks may be rejected if occlusion occurs. While ideally a localization algorithm would be able to extract location information from partially occluded landmarks, for this system it is likely better to reject partially occluded landmarks because the occlusion would shift centroids, and thus incur error.

The compactness of a proposed dot could be tested for being too large or too small, but the difference between the compactness of an ellipse and the compactness of a square is not large enough to allow a safe comparison. Thus only dots that have a compactness ratio smaller than π/4 - ε are rejected, where ε is some tolerance, named the dot compactness test tolerance (Appendix A contains a list of all constants and tolerances in the system).

Landmark Verification

Once three suitable dots are found, several checks can be performed to reduce false detection, based on implicit knowledge of the landmark shape.

1. The horizontal coordinates of the upper and lower dots differ by less than some tolerance - the vertical dot alignment tolerance. The value of this constant is 5.
2. The middle dot is not further away from the line connecting the upper and lower dots than is physically possible given the dimensions of the landmark. These dimensions are specified in the constants landmark middle dot offset and landmark height. The values of these constants are 5cm and 10cm, as can be seen in Figure 7.
3. The distances from the middle dot to the upper and lower dots are equal to within some tolerance, named the maximum middle dot vertical deviation factor. The value of this constant is 16.

3.2.4.3.2 Efficiency Considerations

Even though the elimination of the convolutions makes this detection method faster, the majority of the processing is still devoted to detecting the landmark. The majority of the time taken to detect a landmark is used to find the black dots and calculate their centroids.

There are two ways to minimize the time taken to find a black dot. The first is to improve the prediction calculation (there are generally three ways to improve the prediction: the first is to make the prediction smarter, by increasing the order of the plant model or otherwise making it more accurate; the second is to decrease the time taken between successive images, but that is why we want an improved prediction; the third is to allow the map to "learn" mounting parameters for each landmark, i.e., whether the landmark is mounted x cm too high, or at an angle y° off vertical). The better the prediction is, the less time is taken to find a black pixel that is part of the dot. When the prediction is perfect, the search takes no time, since the first pixel checked is black. The second way to reduce the processing requirements is to reduce the area of the search. However, reducing the search area has the adverse effect of allowing the robot to become lost more frequently (this would be an interesting relationship to examine, but would require a large amount of data gathering from a fully operational system).

Step 4 in the landmark detection algorithm states that new predictions and searches should be undertaken if only one of the three landmark dots was not found. This greatly increases the time taken to process a single image, should the first detection fail. It was found that step 4 of the algorithm is necessary, since if the landmark is improperly mounted, errors will continuously occur. There are many ways to improve this situation. One way is to flag the situation, allowing a human to remount the offending landmark properly. Another way is to not perform this extra step if there is another landmark visible in the image. Possibly the best alternative is to store information about the improperly mounted landmark in the map, and use this information to make a better prediction.

One very significant unanswered question is whether it is better to attempt to detect all landmarks that are predicted to appear in the image, or to stop and capture a new image as soon as a single landmark has been detected.
This would mean that image capture occurs automatically via interrupt control. This would require some non-trivial synchronization between the frame grabber and the processor. 3.2.4.3.3 Noted Problems There are two major known problems with the current system-tracking may be lost, and the identification pattern suffers from perspective effects. The latter is dealt with in Section 3.2.5.1. The fact that tracking be lost represents a fundamental instability in the system. The significance of this instability rests with the "cost of re-establishing the tracking", and the relative frequency that the robot loses its tracking. In other words, losing tracking, or becoming lost, is not a major problem if it only requires slightly more time to re-establish the tracking, via scanning the entire image for landmarks. Currently, using a rather naive initialization procedure, re-establishing the tracking requires approximately 10-20 seconds. The significance of losing tracking also decreases as the frequency of losing tracking decreases, i.e., if the tracking is only lost 1/day, the effect is insignificant. Early experiments have shown however, that tracking is lost fairly frequently. 56 3.2.4.4 Landmark Tracking Landmark tracking can be performed by tracking the landmark in the image plane or by tracking the position of the robot and predicting the position of the landmark in the next frame. The latter approach has be adopted, since tracking a landmark in the image plane does not provide for new landmarks coming into view. The plant model for the system plays an important part in landmark position prediction. The plant model must supply an estimate of the current absolute pose of the robot at the time the image was captured. The greater the error in the current position estimate, the greater the chance that a landmark will not be detected in the allotted search region. 3.2.5 Landmark Identification Landmarks are identified, when required, by the identification pattern located directly above the upper dot. This identification marker consists of two black reference bars, and 10 bit squares which may be either black, representing a 1, or white representing a 0. Figure 11 shows the identification marker reference bars and bit regions. The lower middle bit region is always white so that the height of the lower reference bar can be measured by moving directly up from the upper dot of the landmark. Reference Bars Figure 11 Landmark Identification Marker 57 The reference bars serve two purposes. First, the height, width and position of the lower reference bar is used to calculate the position of the bit squares. Second, the presence of the reference bars prevents the bit regions from being accepted as valid landmark dots. Any black bit square will be attached to a reference bar. Together they will have a low compactness value. To increase the reliability of the identification scheme, bits 0,3, and 6 are checksum bits which must ensure that the number of bit squares mod 4 is always 3. 3.2.5.1 Perspective Problem The current algorithm for "reading" an ID marker assumes that the shape of the marker is not distorted by perspective. As the effect of perspective increases, the id marker cannot be read. This problem can be ignored with little consequence because landmarks do not have to be identified except at startup. The identity of a landmark is known prior to detection by the prediction subtask. It is still a good idea to attempt to verify the identity of each detected landmark. 
3.2.5.1 Perspective Problem

The current algorithm for "reading" an ID marker assumes that the shape of the marker is not distorted by perspective. As the effect of perspective increases, the ID marker cannot be read. This problem can be ignored with little consequence because landmarks do not have to be identified except at startup. The identity of a landmark is known prior to detection by the prediction subtask. It is still a good idea to attempt to verify the identity of each detected landmark. If the identity marker of the landmark is properly read (i.e., the checksum is correct) and is different from the expected identity, then it can be assumed that the robot was lost, but the new location can be immediately calculated.

3.2.6 Plant Modeling

The plant model subtask is responsible for combining multiple pose calculations for a given instant into a single pose, and for providing predictions of the robot pose when requested. The actual implementation of the plant model depends on what information is available in the final system. For the plant model proposed in this thesis it is assumed that odometry and control information is not available. This means that the internal state variables for the model must be derived from observations from the video localization system alone.

Ideally the plant model would incorporate, or be attached to, an extended Kalman filter. However, the design of an extended Kalman filter is fairly involved and was left for future work. The plant is modeled with a simple second order model. The internal state variables are updated with every new pose received. The current pose estimate for the robot is calculated by averaging all pose calculations available in the current image. The previous position of the robot is not taken into account.

The plant model is for the RWI robot. This robot may rotate, and move forward or backwards in the direction it is currently facing. This leads to a system that is defined by the following equations.

Equation 27
$$\dot{x} = v \sin\theta, \qquad \dot{z} = v \cos\theta$$

These two equations form a simple second order model of the robot. Equation 28 shows the set of state variables for the model. The last three state variables are maintained internally.

Equation 28
$$S(t) = \begin{bmatrix} x(t) \\ z(t) \\ \theta(t) \\ \dot{\theta}(t) \\ v(t) \\ \dot{v}(t) \end{bmatrix}$$

The plant model can provide a prediction of the robot pose for time t_{n+1}, given that the state variables were last updated at time t_n, where

Equation 29
$$\Delta t = t_{n+1} - t_n$$

Equation 30 gives the equation to calculate Ŝ(t_{n+1}), the estimate of the robot pose at time t_{n+1}. Note that the internal state variables are included even though they are not used by the current system. They may be required for future system improvement.

Equation 30
$$\hat{S}(t_{n+1}) = \begin{bmatrix} x(t_n) + \sin\!\left(\theta + \dfrac{\dot{\theta}\,\Delta t}{2}\right)\left(v + \dfrac{\dot{v}\,\Delta t}{2}\right)\Delta t \\[2mm] z(t_n) + \cos\!\left(\theta + \dfrac{\dot{\theta}\,\Delta t}{2}\right)\left(v + \dfrac{\dot{v}\,\Delta t}{2}\right)\Delta t \\[2mm] \theta(t_n) + \dot{\theta}\,\Delta t \\ \dot{\theta}(t_n) \\ v(t_n) \\ \dot{v}(t_n) \end{bmatrix}$$

When new pose measurements are received, the state variables are updated. Equation 31 shows the calculation of S(t_{n+1}), given a new pose measurement P_m = [x_m, z_m, θ_m].

Equation 31
$$S(t_{n+1}) = \begin{bmatrix} x_m \\ z_m \\ \theta_m \\[1mm] \dfrac{\theta_m - \theta(t_n)}{\Delta t} \\[2mm] \dfrac{\sqrt{(x_m - x(t_n))^2 + (z_m - z(t_n))^2}}{\Delta t} \\[2mm] \dfrac{v(t_{n+1}) - v(t_n)}{\Delta t} \end{bmatrix}$$
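The following sketch implements the prediction and update steps (illustrative names; the outlier test described in Section 3.3.1.6 would be applied before accepting a measurement into update()):

```cpp
// Second order plant model of Equations 27-31 (sketch).
#include <cmath>

struct State { double x, z, theta, thetaDot, v, vDot; double t; };

// Equation 30: predict the pose at time tNext from the state at s.t.
State predict(const State& s, double tNext) {
    const double dt = tNext - s.t;                     // Equation 29
    const double midTheta = s.theta + s.thetaDot * dt / 2.0;
    const double midV = s.v + s.vDot * dt / 2.0;
    State p = s;
    p.x = s.x + std::sin(midTheta) * midV * dt;
    p.z = s.z + std::cos(midTheta) * midV * dt;
    p.theta = s.theta + s.thetaDot * dt;
    p.t = tNext;
    return p;
}

// Equation 31: replace the state with a new pose measurement, deriving
// the internal variables from successive observations.
State update(const State& s, double xm, double zm, double thetam, double tm) {
    const double dt = tm - s.t;
    State n;
    n.x = xm; n.z = zm; n.theta = thetam; n.t = tm;
    n.thetaDot = (thetam - s.theta) / dt;
    n.v = std::hypot(xm - s.x, zm - s.z) / dt;
    n.vDot = (n.v - s.v) / dt;
    return n;
}
```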
3.3 System Overview

The system is designed as an iterative pipeline with feedback. Figure 12 shows this pipeline structure proceeding from top to bottom. Processing starts with the Landmark Map Control block signaling the Image Capture block to provide a current image. The Landmark Map Control block also sends the absolute position of a potentially visible landmark to the Landmark Prediction block. The Landmark Prediction block in turn sends a prediction of the image coordinates of that landmark to the Landmark Detection block, which locates the landmark within the image. The Position Calculation block calculates the position of the camera from the shape and position of the landmark in the image and sends the position to the Plant Model block. The Plant Model block accepts the position and updates its internal model of the robot. Once the Plant Model has finished, the Landmark Map Control block is notified. The Landmark Map Control block then either sends another potential landmark to the Landmark Prediction block, causing another iteration through the pipeline with the same image, or it signals the Image Capture block to provide another image. In the latter case, processing repeats as described above.

Figure 12: System Structure (Landmark Map Control → Landmark Prediction → Landmark Detection → Position Calculation → Plant Modelling → System Output, with Image Capture feeding Landmark Detection)

3.3.1 Task Decomposition

Figure 12 represents a functional decomposition of the system. Each block can be considered a transformation from input to output. The tasks are listed below with a general description of the inputs, outputs and transformation performed.

3.3.1.1 Image Capture / Image Sequence Loader

This block, labeled Image Capture, is responsible for providing the current image to any other functional blocks that require it, primarily the Landmark Detection block. It has no inputs, except for a signal that initiates the capture of another image. In a working system, this functional block would encapsulate the hardware frame grabber. For the development system, this functional block is implemented as an image sequence loader that loads consecutive images that have been captured and stored off-line.

3.3.1.2 Landmark Map and Loop Control

This block is responsible for managing the landmark map database and controlling the flow of the system pipeline. It accepts a signal from the Plant Model block indicating that the current iteration through the pipeline has completed. It has two outputs. The first is a signal which is used to initiate the capture of a new image. This output is sent to the Image Capture block. The second output is the location of a potentially visible landmark. This output is fed to the Landmark Prediction block, and initiates an iteration through the pipeline. The control rules that govern the Landmark Map Control block may vary depending on implementation, without affecting the proper functioning of other control blocks. For instance, whether the Landmark Map Control block limits the number of landmarks to check before initiating the capture of a new image, or whether it sends out all possible landmarks before initiating capture of a new image, will not affect any other functional block in the system.

3.3.1.3 Landmark Prediction/Tracking

This block has two inputs: the current position of the robot and the absolute position of some landmark. It has two outputs. The first is a prediction of where the landmark will appear in the current image. The second is a signal used when it is predicted that the landmark will not be visible in the current image. This signal can be used to abort the current iteration through the pipeline. Each dot is put through the following transformations.

Global Position to Relative Position

Equation 32
$$p_{lr} = T(-z_c, -x_c, -\theta_c) \cdot p_{lg}$$

Where:
p_lr is the pose of the landmark relative to the camera.
p_lg is the pose of the landmark relative to the global frame.
T is the general 2D pose transformation (Equation 8).
(z_c, x_c, θ_c) is the pose of the camera relative to the global frame.

Relative Position to Ideal Image Plane

Equation 33
$$x_i = \frac{x_{lr}}{z_{lr}}\,f, \qquad y_i = \frac{y_{lr}}{z_{lr}}\,f$$

Where:
(x_i, y_i) is the position of the dot in 3D space projected onto the image plane.
(x_lr, y_lr, z_lr) is the position of the dot in 3D space relative to the camera.
f is the focal length of the camera.

Note that y_lr = y_lg, since the height is independent of the translation and rotation.

Ideal Image Plane to Frame Buffer (via Tsai's model)

Equation 34
$$[\,row \ \ col\,] = \mathrm{Tsai}(x_i, y_i)$$

Where (row, col) are the coordinates of the pixel in the frame buffer, and Tsai(x_i, y_i) applies the transformation equations specified in Equation 3, Equation 4, and Equation 5. Together these equations form a transformation from the ideal image plane to the frame buffer, using the intrinsic parameters of Tsai's camera model (for a specific camera).
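A sketch tying Equations 32-34 together follows: predicting the frame buffer location of one landmark dot from its global pose and the current camera pose estimate. Pose and transform() repeat the sketch from Section 3.2.2 so the block is self-contained, and idealToFrame() stands in for the (assumed) forward form of Tsai's transformation, which is only declared here.

```cpp
// Landmark prediction sketch (Equations 32-34).
#include <cmath>

struct Pose { double z, x, theta; };

Pose transform(double m, double n, double phi, const Pose& p) {
    return Pose{std::cos(phi) * p.z - std::sin(phi) * p.x + m,
                std::sin(phi) * p.z + std::cos(phi) * p.x + n,
                p.theta + phi};
}

// Assumed helper: ideal image plane to frame buffer (Equation 34).
void idealToFrame(double xi, double yi, double& row, double& col);

struct FramePrediction { double row, col; bool visible; };

FramePrediction predictDot(const Pose& landmarkGlobal,
                           const Pose& cameraGlobal,
                           double dotHeight, double f) {
    // Equation 32: landmark pose relative to the camera.
    Pose rel = transform(-cameraGlobal.z, -cameraGlobal.x,
                         -cameraGlobal.theta, landmarkGlobal);
    if (rel.z <= 0.0) return FramePrediction{0.0, 0.0, false}; // behind camera

    // Equation 33: perspective projection onto the ideal image plane.
    const double xi = rel.x / rel.z * f;
    const double yi = dotHeight / rel.z * f;

    // Equation 34: ideal image plane to frame buffer via Tsai's model.
    FramePrediction out{0.0, 0.0, true};
    idealToFrame(xi, yi, out.row, out.col);
    return out;
}
```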
3.3.1.4 Landmark Detection

This block is the most complex and time consuming of all blocks. It is responsible for locating a landmark within the image. It is also responsible for preventing false matches. This block has three inputs: the current image, the predicted position of a landmark in the image, and the id of the landmark to be detected. It has two outputs. The first is the detected location of the landmark in the image. The second is a signal that is used to abort the current iteration through the pipeline if the landmark was not detected.

3.3.1.5 Position Calculation

This block calculates the absolute position of the camera from the location and shape of the landmark in the image. The block takes two inputs: the position of the landmark in the image and the absolute position of the landmark according to the map. The transformations for this block are given by Equation 11.

3.3.1.6 Plant Modeling / System Output

This functional block is responsible for modeling the plant in order to maintain an estimate of the current position at any time. Although this block should ultimately be integrated with a Kalman Filter, the current version simply models the plant with a second order model and does not filter new measurements with previous knowledge. The exception is for outliers. If a new measurement arrives that would require the plant to have exceeded acceleration and/or velocity constraints, the measurement is declared an outlier and discarded.

Chapter Four - System Detail

4. System Detail

This chapter presents a detailed design of the ROLL system implemented in the ACE environment. The chapter begins with a brief description of the ACE environment with a section on design methodology. An overview of the implementation is given, showing general data flow control. The design is broken down into data types (the presentation of data types is an important part of the design methodology of Events & Actions, a paradigm upon which the ACE environment is based) and functional components. Each component is described and its function is related to the overall localization task decomposition given in section 2.1. The chapter finishes with a brief description of a second implementation that was connected to a mobile robot to operate in real time.

The ACE implementation is used as a testbed for the localization algorithm. Images are stored off-line and loaded into the system as if they were obtained in real time by a frame grabber. The ACE environment was selected as a platform for the design because it enforces a high degree of modularity and has a clear data flow architecture. These characteristics aid the creation of an easily understood and modifiable design.

4.1 The ACE Environment

The ACE environment is an implementation of the Events & Components paradigm, created by ISE Ltd. for use in robotics control (the ACE system was created by the author of this thesis while employed at International Submarine Engineering Limited, and is currently still under development as an in-house software tool; its potential as a commercial software product is being evaluated).
The Events & Components paradigm is an object-oriented, event-driven data flow environment that supports asynchronous data and control flow. This paradigm is comprised of three major objects: events, actions and components.

4.1.1 Events, Actions and Components

Events can be considered pipes through which a specified type of data can flow. Components are functional objects that encapsulate a transformation of data from some input events to some output events. Components can be producers of events, consumers of events, or (most often) both a producer and a consumer of events. Actions are contained within component objects to perform the preemptive task scheduling.

4.1.1.1 Events

An event is a pipe through which a specific type of data can be passed from a producer component to a consumer component. A producer outputs data to the event by signaling the event. This is done using the Signal() function that is a member of the event object. Signaling an event incorporates two functions. First, the data being fed through the pipe is changed, and second, any actions that have been attached to the event are scheduled for execution. It is this mechanism that gives ACE its asynchronous event-driven nature. Events are strongly typed, and any data type that is definable in C++ can be incorporated in an event. Each event stores a single value of its data type for consumers to retrieve. There is also an event that has no data type and is solely used to schedule actions (control flow). (The name Events & Actions originates from the original implementation of the paradigm, which had only two of the three main objects: events and actions. Components have since been added, and have superseded actions in terms of visibility to the programmer, yet the name of the paradigm remains unchanged.)

4.1.1.2 Actions

Actions represent a single executable task. They are attached to specific events and are executed each time that event is signaled. Each action has a function to execute and an associated component that provides local data.

4.1.1.3 Components

A component is a base class from which specific functional blocks can be created by the programmer. A user component consists of inputs, outputs, local data, member functions, and one or more actions that perform the transformations from input to output. Component inputs and outputs are references to events. A component is responsible for attaching actions to inputs during initialization. During steady state operation components are static objects with no direct functionality. All active processing is provided through actions which call a member function of the component.

4.1.2 ACE Configurations

An ACE configuration is an application created using the ACE environment. A configuration can be defined as the set <E, C, I, O>, where E is a set of events, C is a set of components, I is the set of connections between events and component inputs, and O is a set of connections between events and component outputs. The actual specification of the configuration is implemented as a single C++ source file (due to current C++ standards, the single source file is a restriction that cannot be easily overcome). Sets E and C appear as statically declared event and component objects. Sets I and O are included in the constructor parameters of each component, i.e., in order to declare a specific component, constructor parameters must be provided to define which events to connect to the input(s) and output(s) of the component. Actions within the components are constructed automatically by the components themselves.
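Since the thesis describes only the Signal() member of the real ACE API, the following is a hypothetical sketch of the paradigm's flavour rather than the actual library: an event carries a typed value and dispatches attached actions when signaled, and a component wires itself to events through its constructor, exactly as sets I and O are described above.

```cpp
// Hypothetical Events & Components sketch (not the real ACE API).
#include <functional>
#include <vector>

template <typename T>
class Event {
public:
    void Attach(std::function<void(const T&)> action) {
        actions_.push_back(action);
    }
    void Signal(const T& value) {           // update data, run actions
        data_ = value;
        for (auto& a : actions_) a(value);  // immediate dispatch for brevity;
                                            // real ACE schedules preemptively
    }
    const T& Data() const { return data_; }
private:
    T data_{};
    std::vector<std::function<void(const T&)>> actions_;
};

// A trivial component that doubles every number flowing through it.
class Doubler {
public:
    Doubler(Event<double>& in, Event<double>& out) : out_(out) {
        in.Attach([this](const double& v) { out_.Signal(v * 2.0); });
    }
private:
    Event<double>& out_;
};

int main() {
    Event<double> input, output;    // set E of the configuration
    Doubler doubler(input, output); // set C, with I and O via constructor
    input.Signal(21.0);             // drives the pipeline
    return output.Data() == 42.0 ? 0 : 1;
}
```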
4.1.2.1 The Integrated Circuit Analogy

An ACE configuration can be considered analogous to a hardware circuit design (this analogy is actively pursued at ISE; current plans for ACE programming tools include graphical programming environments similar to currently available hardware design aids). Components can be thought of as ICs, and events are the wires that connect the ICs. This analogy can be carried far, and is most useful in understanding ACE configurations. For example, in an ACE configuration, just as in an IC design, it is usually incorrect to have two components (ICs) driving the same event (wire).

4.1.3 Design and Development with ACE

Design with ACE should be performed in two major parts: a functional decomposition of subtasks and a data type decomposition of the objects involved in the system.

The first major part of design is to define the data types passed between the components in the form of events. This part is performed in an object-oriented approach, and is critical to the success of the design. If the data decomposition is performed well, the designer will find that any errors in the functional decomposition are isolated and can be easily corrected by changing a single component.

Second, the problem should be divided into subtasks using functional decomposition. Any subtask can be further broken down. When a subtask can be broken down no further, a component can be created to encapsulate the subtask.

As with all design processes, design in ACE is an iterative process. The functional decomposition and data decomposition will affect each other, although the designer should attempt to let the data decomposition carry more weight than the functional decomposition, since the functional decomposition is more likely to change, due to future requirement changes and unforeseen deficiencies in algorithms, etc.

4.1.3.1 Asynchronous Data Flow

A special note must be included about the asynchronous nature of ACE. Actions are scheduled as a result of an event being signaled, not on a global clock tick. However, just as ICs are actually all asynchronous, made synchronous by the presence of a clock input, an ACE component, and a whole ACE configuration, can be made synchronous by having "trigger" inputs. This trigger can then be connected to "clock" events which are signaled periodically by a timer component.

The ROLL design presented in this thesis is asynchronous in nature. The asynchronous nature is a large benefit when designing a component that can combine pose input from independent sources (this component would encapsulate an extended Kalman filter, although in the design presented in this thesis only a simple second order plant model is implemented for this purpose), such as the video localization system and an odometry system. The asynchronous nature allows these two systems to provide their information to this component, when it becomes available, completely independently of each other.

4.2 The ROLL Configuration

Figure 13 shows the ACE configuration for the ROLL system. This figure shows the system components as rectangles, with the component name on the inside. Events are shown as lines connecting the components, with arrowheads pointing to the components that accept these events for input. The name of the event and its type are labeled beside the line. Events with no data are labeled with their name only. The system output, CurrentPosition, is shown as a thick line.
Figure 13: The ROLL Configuration (the Map, Landmark Image Predictor, Landmark Detector, Relative Position Calculator, Absolute Position Calculator, Frame Grabber, Time Latch, Position Estimator and two Or Gate components, connected by events including Next Landmark (LandmarkPoseStc), Landmark Image Prediction and Landmark Image Detection (LandmarkFrameStc), Image Captured (GreyScaleImage), Current Time (seconds, floating point), Relative Position and Absolute Position (PoseStc), Not Visible, Iteration Terminated, Next Image, and the system output Current Position (PoseStc); the legend distinguishes control-flow-only connections, data-and-control connections, and unconnected outputs)

The configuration figure appears very similar to Figure 12, on page 62, with some extra components and more detailed data/control flow information. The system is organized into a main pipeline, with a side pipeline to provide supplemental data.

4.2.1 The Main Pipeline

The main pipeline appears as the center column of components in Figure 13. Processing starts with the Map component, and proceeds in a linear fashion through the Landmark Image Predictor, Landmark Detector, Relative Position Calculator, and Absolute Position Calculator components. Each iteration through this pipeline calculates the position of the robot from a single landmark.

4.2.2 The Supplementary Pipeline

The supplementary pipeline is responsible for providing the main pipeline with the current image and current position estimate. It is also responsible for providing the current position as an external output of the system. This pipeline is shown as the vertical column of components on the right in Figure 13. Processing through this short pipeline begins with the Frame Grabber component obtaining the current image, which is provided to the Landmark Detector component. Updating the current image also fires the Time Latch component, which records the time of capture. This time is fed to the Plant Model component, which responds by providing an estimate of the current robot location to the Landmark Predictor component.

Hereafter, components will be referred to by their names alone. The trailing 'component' will be omitted.

4.2.3 Loop Control

The Map is responsible for coordinating the activities of the two pipelines as well as providing the positions of particular landmarks. When the system is started, the Map signals the Frame Grabber, and waits until the image is obtained. At this point, the Map will start an internal iteration through its database/array of known landmarks, sending them, one at a time, through the main pipeline.
The R O L L system has many inherent data types, some of which appear as event types, and some of which are used internally to components. These data types provide a level of data abstraction, that allows the system to be developed at a fairly high level. Separating data types independently from which components produce and consume them also provides flexibility when rearranging the system in order to explore its characteristics and performance. Most of these data types deal with representing the same information in one of three coordinate systems-Cartesian coordinates, both relative to the camera or landmark, and relative to a designated absolute frame; the image plane frame; and the frame buffer coordinates. All of these data types are simple C structures, since they represent pure data. Operations on the data are more closely associated with the components than the data itself. The one exception is the GreyScalelmage, which is a C++ class because it has several functions which should be available regardless of whichever component accepts a GreyScalelmage event as input. One could imagine that if other visual systems were added to the robot's sensing system, they 74 could use the same GreyScalelmage event as input, and would require the same functions that the R O L L system requires. 4.3.1 Image Plane Coordinate Data Types The image plane coordinate system is used for calculations using the ideal camera model. The origin is the camera optical center, and the z axis is aligned with the optical axis. The image plane is then located at z = focal length. The direction to an object in this coordinate system is usually given relative to the optical axis, thus relative to the z-axis. Al l values in this coordinate system are inherently real numbers. 4.3.1.1 ImagePositionStc The ImagePositionStc represents a single point on the image plane. It encapsulates two values, an x and a y value. Since the point is assumed to be on the image plane, the z value is assumed to be the focal length of the camera. 4.3.1.2 ImageLandmarkStc A landmark appears as three dots in an image. The ImageLandmarkStc encapsulates this as three ImagePositionStc''s representing the centroids of each dot, and a value representing the radius of the dots, which are assumed to be identical, regardless of the fact that the center dot will be slightly different. This difference is assumed to be negligible. 4.3.2 Frame Buffer Coordinate Date Types The frame plane refers to the representation of an image in a computer. A typical frame buffer consists of a segment of memory used to store a matrix of pixel values. There are many different representations of pixel values. The R O L L system assumes that the pixel values are a single number in the range [0,255], representing the relative greyscale intensity of the pixel. 75 Points in a frame are referred to by a <row, column> tuple. Each value can range from 0 to the size of the matrix in either the row or column direction. Because of the discrete nature of a matrix, the row and column values are usually represented as integer numbers. In the R O L L system, real number values are also required for centroid calculations. Therefore the R O L L system includes both real and integer representations of points in the frame buffer. 4.3.2.1 RealFramePositionStc and IntegerFramePositionStc These two structures represent a single point in the frame buffer. They encapsulate the row and column values, as row and column fields. 
4.3.2 Frame Buffer Coordinate Data Types

The frame buffer refers to the representation of an image in a computer. A typical frame buffer consists of a segment of memory used to store a matrix of pixel values. There are many different representations of pixel values. The ROLL system assumes that each pixel value is a single number in the range [0, 255], representing the relative greyscale intensity of the pixel.

Points in a frame are referred to by a <row, column> tuple. Each value can range from 0 to the size of the matrix in either the row or column direction. Because of the discrete nature of a matrix, the row and column values are usually represented as integers. In the ROLL system, real number values are also required for centroid calculations. Therefore the ROLL system includes both real and integer representations of points in the frame buffer.

4.3.2.1 RealFramePositionStc and IntegerFramePositionStc

These two structures represent a single point in the frame buffer. They encapsulate the row and column values, as row and column fields. In the RealFramePositionStc these values are stored as the C data type double, and in the IntegerFramePositionStc they are stored as the C data type int. These data types are represented using 64 and 32 bits respectively on the Win32 platform using the Visual C++ 4.0 compiler.

4.3.2.2 FrameEllipseStc

This structure represents an ellipse in the frame buffer. The structure contains a RealFramePositionStc and an integer area. The area represents the number of pixels contained wholly within the ellipse.

4.3.2.3 FrameLandmarkStc

Just as an ImageLandmarkStc represents a landmark in the ideal camera image plane by three points and a radius, the FrameLandmarkStc represents a landmark in the frame buffer by three RealFramePositionStc's and an area. Again, although the areas of the dots may vary, this variance is assumed to be insignificant and the average is stored as a single value representing the area of all three dots.

4.3.3 Cartesian Coordinate Data Types

The Cartesian coordinate frame is represented by a right handed <x, y, z> tuple. For the ROLL system, the xz plane is aligned with the horizontal, and the y axis points directly upwards. This corresponds to the normal conception of the Cartesian coordinate system. Unlike the typical Cartesian coordinate system, the ROLL system assumes that 0° in the xz plane is aligned with the z axis instead of the x axis. This change is made in order to align the Cartesian coordinate axes with the traditional alignment of axes in the ideal camera model, where the z axis is set to the camera's optical axis. The direction of objects from the position of the camera then corresponds directly to orientation values in the Cartesian plane. This makes translation between the coordinate systems not only easier to perform, but easier to understand. All values in the Cartesian coordinate system are real valued. All angles are in radians.

4.3.3.1 PoseStc

The PoseStc represents a robot or landmark position in the Cartesian coordinate system. Each robot or landmark is assumed to be fixed to the horizontal plane, which eliminates the need for a y coordinate. Each robot or landmark also has an orientation relative to the z axis. Thus the PoseStc contains an <x, z, orientation> tuple.

4.3.3.2 LandmarkStc

A LandmarkStc is used to represent a landmark in a Cartesian plane. It contains a pose, and adds an integer identification field. All landmarks have a unique integer identification number.

4.3.4 GreyScaleImage Data Type

To encapsulate a frame buffer, the GreyScaleImage data type was created. Unlike the previous data types, a GreyScaleImage is an object type and thus has some associated operations. The GreyScaleImage is actually an abstract base class. This was done to allow simulated and real images to flow through the system without modifying the components that require access to the image. The most notable operations that must be provided by a GreyScaleImage are GetPixel(), SetPixel(), Width(), and Height(). It also contains a GetBuffer() operation that is useful for performing lower-level, time critical operations. Time performance is not critical for proving the operation of the ROLL system in the ACE environment, so all pixel values are obtained through the GetPixel() member function.
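The remaining plain data types and the GreyScaleImage interface could be declared as in the sketch below; the exact signatures are assumptions consistent with the operations named above.

```cpp
// Illustrative declarations of the frame buffer and Cartesian types, plus
// the GreyScaleImage abstract base class; parameter lists are assumed.
struct RealFramePositionStc    { double row, col; };
struct IntegerFramePositionStc { int row, col; };
struct FrameEllipseStc  { RealFramePositionStc centroid; int area; };
struct FrameLandmarkStc { RealFramePositionStc upper, middle, lower; int area; };
struct PoseStc     { double x, z, orientation; };  // orientation in radians from z axis
struct LandmarkStc { PoseStc pose; int id; };

class GreyScaleImage {
public:
    virtual ~GreyScaleImage() {}
    virtual unsigned char GetPixel(int row, int col) const = 0;
    virtual void SetPixel(int row, int col, unsigned char value) = 0;
    virtual int Width() const = 0;
    virtual int Height() const = 0;
    virtual unsigned char* GetBuffer() = 0;  // raw access for time-critical loops
};
```

Derived classes (a hardware frame grabber image, or a file-backed simulated image) then satisfy the same interface, which is what lets simulated and real images flow through the system interchangeably.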
4.4 Components

Each component shown in Figure 13 is described in this section. Inputs, outputs and attached actions are listed for each component. The functions called by any actions are described using a terse pseudo-code algorithm. Components that perform critical localization subtasks are described in detail, since much of the design work for the ROLL system lies in the algorithms contained within these components.

4.4.1 The Frame Grabber

Figure 14  The Frame Grabber Component

The Frame Grabber is an abstract base class that encapsulates the responsibility of generating the current image from the system's camera. It has a single, dataless input, GetImage, and a single output, CurrentImage, of type GreyScaleImage. There is a single action connected to its input that is responsible for generating or obtaining the image. For a physical implementation of the ROLL system, a component derived from this base class would be created to interface with a specific hardware frame grabber. Although constructor parameters might be added to deal with initialization of the hardware, the runtime operation of the component would remain the same, i.e., its 'event interface' remains the same. Components are C++ objects and can therefore use C++ inheritance mechanisms to provide specialization for different hardware, or even, in this case, simulation of hardware.

4.4.1.1 The Image Sequence Loader

Figure 15  The Image Sequence Loader Component

For simulation in the ACE testbed, images are generated off-line and must be loaded during simulation. The Image Sequence Loader is a component that mimics a Frame Grabber. It loads images, in sequence, from disk whenever the GetImage input is signaled. The function that performs this operation is a simple four-step process. Each step involves a large amount of detail, but these details are of no academic interest. The high level algorithm is:

1. Load the next image.
2. Increment the current image counter.
3. Convert the image to a GreyScaleImage.
4. Signal the output event with the new image.

The current implementation uses a simple IFF greyscale format for the image files. Image pixels are stored exactly as they are in a GreyScaleImage, so that the file is simply mapped to memory.

4.4.2 The Map

Figure 16  The Map Component

The Map has a dual responsibility: it must load and maintain the database of known landmarks, and it must synchronize iteration through the two pipelines. (In retrospect, the author feels this component should have been separated into two components: a Map and a Pipeline Controller.) It has two inputs, LoopTerminated and ImageGrabbed, both of which are dataless. It has two outputs: CurrentLandmark and NextImage. The former has data type LandmarkPoseStc, and the latter is dataless.

The current implementation of the Map stores all of the known landmarks in an array of PoseStc, where the index of an element in the array is the id of the landmark. It has two actions: NextLandmark and StartMainIteration. There is a single local variable, currentIndex, that retains an index into the array of landmarks.

The StartMainIteration action follows this algorithm:

1. Set currentIndex to the first element of the landmark array.
2. Signal the CurrentLandmark output with the first element of the array.

The NextLandmark action follows this algorithm:

1. Increment currentIndex.
2. If currentIndex exceeds the bounds of the landmark array:
   2.1. Signal the GrabImage output.
   2.2. End.
3. Signal the CurrentLandmark output with the landmark at currentIndex.
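The two actions are small enough to render directly; the sketch below is illustrative, with the signal* calls standing in for the ACE event-signalling mechanism.

```cpp
// Sketch of the Map's two actions; signalCurrentLandmark/signalGrabImage
// are placeholders for the ACE event mechanism, not the real API.
#include <cstddef>
#include <vector>

struct PoseStc { double x, z, orientation; };

class Map {
public:
    void StartMainIteration() {            // runs when ImageGrabbed fires
        if (landmarks.empty()) { signalGrabImage(); return; }
        currentIndex = 0;
        signalCurrentLandmark(landmarks[currentIndex]);
    }

    void NextLandmark() {                  // runs when the main pipeline terminates
        ++currentIndex;
        if (currentIndex >= landmarks.size()) {
            signalGrabImage();             // map exhausted: request a new image
            return;
        }
        signalCurrentLandmark(landmarks[currentIndex]);
    }

private:
    std::vector<PoseStc> landmarks;        // index doubles as the landmark id
    std::size_t currentIndex = 0;

    void signalCurrentLandmark(const PoseStc&) { /* fire CurrentLandmark event */ }
    void signalGrabImage()                     { /* fire the dataless GrabImage event */ }
};
```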
4.4.3 The Landmark Predictor

Figure 17  The Landmark Predictor Component

The Landmark Predictor is responsible for predicting whether a given landmark will appear in the image, and if so, where. It has two inputs. The first is CurrentLandmark, of type LandmarkStc; this input should hold the absolute position and id of some landmark. The second input, of type PoseStc, should contain an estimate of the current camera pose. The Landmark Predictor has two outputs. The first is a dataless output, NotInImage, that should be signaled if the landmark will not appear in the current image, based on the current position estimate for the robot. The second output, of type LandmarkFrameStc, is called LandmarkPrediction. This output should be signaled with a prediction of where the landmark will appear in the image, if it is determined that the landmark will appear in the image. There is a single action, PredictLandmark, which performs the computations of the following algorithm:

1. If the landmark will not appear in the current image:
   1.1. Signal the NotInImage output.
   1.2. End.
2. Calculate the predicted position of the landmark in the current image.
3. Signal the LandmarkPrediction output with the calculated prediction.

4.4.3.1 Predicting If the Landmark Will Appear in the Image

The landmark will appear in the image if a ray traced from the predicted position of the camera to the absolute position of the landmark falls within a given range:

Equation 35
$$\left|\tan^{-1}\!\left(\frac{x_{ldmk}-x_{camera}}{z_{ldmk}-z_{camera}}\right)\right| < \mathit{FIELD\ OF\ VIEW}$$

The FIELD OF VIEW constant depends on the camera and lens that are used. For the particular camera used in this system, the value of this constant is approximately 0.40 radians, or 23°.

In addition, the landmark must be facing the camera, i.e., the orientation of the landmark relative to the predicted pose of the camera must be within π/2 of −π:

Equation 36
$$\left|\left(\theta_{ldmk}-\theta_{camera}\right)+\pi\right| < \pi/2$$

If the above two conditions are true, then it is assumed that the landmark will appear in the current image.
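A minimal sketch of these two visibility tests is shown below. The angle-wrapping helper and the interpretation of Equation 35 relative to the camera's heading are assumptions made for the example.

```cpp
// Sketch of the visibility tests (Equations 35 and 36); wrapToPi and the
// camera-relative interpretation of Equation 35 are assumptions.
#include <cmath>

struct PoseStc { double x, z, orientation; };
struct LandmarkStc { PoseStc pose; int id; };

const double kPi          = 3.14159265358979;
const double kFieldOfView = 0.40;       // radians, for the camera used here

double wrapToPi(double a) {             // fold an angle into (-pi, pi]
    while (a >  kPi) a -= 2.0 * kPi;
    while (a <= -kPi) a += 2.0 * kPi;
    return a;
}

bool landmarkMayAppear(const PoseStc& cam, const LandmarkStc& ldmk) {
    // Equation 35: the ray from camera to landmark must fall inside the
    // field of view (0 rad points along the z axis, hence atan2(x, z)).
    double bearing = std::atan2(ldmk.pose.x - cam.x, ldmk.pose.z - cam.z);
    if (std::fabs(wrapToPi(bearing - cam.orientation)) >= kFieldOfView)
        return false;

    // Equation 36: the landmark must face the camera, i.e. its relative
    // orientation must lie within pi/2 of -pi.
    return std::fabs(wrapToPi(ldmk.pose.orientation - cam.orientation + kPi)) < kPi / 2.0;
}
```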
4.4.3.2 Estimating the Position of the Landmark in the Image

Calculating the expected position of the landmark in the current image is a fairly lengthy procedure. Section 3.3.1.3 introduces the general approach to projecting a point in absolute space onto an ideal image plane. These calculations are repeated here in more detail.

First, the x-z positions of the upper and lower dots of the landmark are calculated relative to the camera. This is done by translating the landmark position by the negative of the camera position, and then rotating by the negative of the camera orientation. This requires a simple application of the modified general transformation equation given as Equation 8:

Equation 37
$$p^{camera}_{ldmk} = T\!\left(-\theta_{camera},\,-x_{camera},\,-z_{camera}\right)\,p^{absolute}_{ldmk}$$

From this position it is possible to calculate the expected x coordinate of the upper and lower dots of the landmark in the image plane. This is done through straight application of the ideal camera equations:

Equation 38
$$x_{image} = \frac{x_{camera_{ldmk}} \cdot \mathit{focal\ length}}{z_{camera_{ldmk}}}$$

Knowing that the landmark is placed at exactly the same height as the camera, we can calculate the expected y positions of the upper and lower dots in the image by calculating the expected height of the landmark in the image plane and placing the upper and lower dots equidistant from the y = 0 line in the image plane:

Equation 39
$$\mathit{height} = \frac{\mathit{actual\ landmark\ height} \cdot \mathit{focal\ length}}{z_{camera_{ldmk}}}$$

Equation 40
$$y_{image_{upper}} = \frac{\mathit{height}}{2}, \qquad y_{image_{lower}} = -\frac{\mathit{height}}{2}$$

The next major step is to calculate the position of the middle dot of the landmark in the image plane. First the absolute position of the landmark middle dot must be calculated. Clearly this value could be stored to avoid repeating the calculation of a constant value, but it was left as a calculation in the simulated environment:

Equation 41
$$x_{offset} = x_{ldmk} + \sin(\theta_{ldmk}) \cdot \mathit{offset\ length}$$

Equation 42
$$z_{offset} = z_{ldmk} + \cos(\theta_{ldmk}) \cdot \mathit{offset\ length}$$

Again, the general pose transformation equation, Equation 8, is used to transform the absolute position into the position relative to the camera:

Equation 43
$$p^{camera}_{offset} = T\!\left(-\theta_{camera},\,-x_{camera},\,-z_{camera}\right)\,p^{absolute}_{offset}$$

From the assumption that the landmark is mounted at the same height as the optical axis, we can assume that the y position of the middle (offset) dot is 0. The x position can be calculated using the ideal camera equation:

Equation 44
$$x_{image_{middle}} = \frac{x_{camera_{offset}} \cdot \mathit{focal\ length}}{z_{camera_{offset}}}$$

Finally, the last step for calculating the dot locations is to take all of the ideal image positions and translate them into frame buffer coordinates. This is done using the Tsai transformation specified in Section 2.6:

Equation 45
$$[\mathit{row}_{upper}\ \mathit{col}_{upper}] = Tsai(x_{image_{upper}},\ y_{image_{upper}})$$

Equation 46
$$[\mathit{row}_{middle}\ \mathit{col}_{middle}] = Tsai(x_{image_{middle}},\ y_{image_{middle}})$$

Equation 47
$$[\mathit{row}_{lower}\ \mathit{col}_{lower}] = Tsai(x_{image_{lower}},\ y_{image_{lower}})$$

All that is left for the landmark prediction output is to provide an estimate of the radius of each dot in frame buffer pixels. This is approximated using the ideal camera equations, the Tsai transformation near the origin of the image plane, and the known diameter of the landmark dots:

Equation 48
$$[\mathit{row}_1\ \mathit{col}_1] = Tsai\!\left(\frac{\mathit{dot\ diameter} \cdot \mathit{focal\ length}}{z_{camera_{ldmk}}},\ \frac{\mathit{dot\ diameter} \cdot \mathit{focal\ length}}{z_{camera_{ldmk}}}\right)$$

Equation 49
$$[\mathit{row}_2\ \mathit{col}_2] = Tsai(0,\,0)$$

Equation 50
$$\mathit{radius} = \frac{(\mathit{row}_1 - \mathit{row}_2) - (\mathit{col}_1 - \mathit{col}_2)}{2}$$
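The sketch below condenses the upper/lower dot portion of this procedure. The tsai() stub stands in for the calibrated transformation, the constants are taken from the calibration run and Appendix A, and the rotation sign convention is an assumption.

```cpp
// Condensed sketch of Equations 37-47 for the upper and lower dots;
// tsai() is a placeholder for the real calibrated transform, and the
// rotation sign convention is assumed.
#include <cmath>

struct PoseStc { double x, z, orientation; };
struct FramePoint { double row, col; };

FramePoint tsai(double /*xImage*/, double /*yImage*/) {
    return FramePoint{0.0, 0.0};   // placeholder: calibrated Tsai transform goes here
}

const double kFocalLength    = 0.00813;   // metres, from the calibration run
const double kLandmarkHeight = 0.10;      // metres (LANDMARK HEIGHT)

// Equations 37/43: translate by the negative camera position, then rotate
// by the negative camera orientation (angles measured from the z axis).
void toCameraFrame(const PoseStc& cam, double xAbs, double zAbs,
                   double& xCam, double& zCam) {
    const double dx = xAbs - cam.x, dz = zAbs - cam.z;
    const double c = std::cos(-cam.orientation), s = std::sin(-cam.orientation);
    xCam =  c * dx + s * dz;
    zCam = -s * dx + c * dz;
}

void predictUpperLowerDots(const PoseStc& cam, const PoseStc& landmark,
                           FramePoint& upper, FramePoint& lower) {
    double xCam = 0.0, zCam = 0.0;
    toCameraFrame(cam, landmark.x, landmark.z, xCam, zCam);
    const double xImage = xCam * kFocalLength / zCam;             // Equation 38
    const double height = kLandmarkHeight * kFocalLength / zCam;  // Equation 39
    upper = tsai(xImage,  height / 2.0);                          // Equations 40, 45
    lower = tsai(xImage, -height / 2.0);                          // Equations 40, 47
}
```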
4.4.4 The Landmark Detector

Figure 18  The Landmark Detector Subconfiguration Interface

The Landmark Detector component is responsible for finding a specific landmark in an image, given a prediction of where the landmark will appear in the image, and a reference to the image itself. This is by far the most complex component, because it contains an entire, though rudimentary, object recognition system. Because of its complexity, the Landmark Detector is further decomposed into a sub-configuration. This sub-configuration is comprised of four different component types: a structure splitter, an ellipse detector, a landmark shape verifier, and a landmark id verifier. These components are connected in the configuration shown in Figure 19, in which the thick arrows represent the inputs and outputs of the landmark detector sub-system.

Figure 19  The Landmark Detector Sub-Configuration

4.4.4.1 The Frame Landmark Splitter

Figure 20  The Frame Landmark Splitter Component

The Frame Landmark Splitter component simply takes a FrameLandmarkStc as an input and splits it into three FrameEllipseStc's for output. Splitting this structure up allows each centroid to be detected using a single type of component that attempts to detect a dot independently of the other dots in the landmark.

4.4.4.2 The Ellipse Detector

Figure 21  The Ellipse Detector Component

The Ellipse Detector component accepts a predicted ellipse location and a reference to the current image, and searches for an ellipse in the image in the vicinity of the predicted centroid. A general description of the algorithm used can be found on page 51. If an ellipse cannot be detected within the vicinity of the prediction, the output is marked as an invalid ellipse by setting the row and column frame buffer coordinates to -1.

4.4.4.3 The Landmark Shape Verifier

Figure 22  The Landmark Shape Verifier Component

The Landmark Shape Verifier component takes three FrameEllipseStc events as inputs and has seven outputs. Several tests are performed on the three input ellipses. First, each ellipse is checked to see if it is valid; if it is not valid, its row and column coordinates will be -1. If all three ellipses are valid, they are subject to several further tests, detailed in the section entitled Landmark Verification on page 48. If the ellipses pass all of the tests, then they are aggregated and output in the Valid Landmark output event. If one ellipse is found to be invalid, it is possible to further the search by providing three more ellipse centroid predictions based on the locations of the two valid ellipses. These predictions can then be passed on to more Ellipse Detectors. Should two or more of the ellipses be invalid, or should three valid ellipses fail one of the landmark verification tests, the dataless event BadShape is signaled.
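The -1 sentinel convention that the Ellipse Detector and Landmark Shape Verifier share can be written out as two small helpers; this is an illustrative rendering only.

```cpp
// Illustrative helpers for the -1 sentinel convention used between the
// Ellipse Detector and the Landmark Shape Verifier.
struct RealFramePositionStc { double row, col; };
struct FrameEllipseStc { RealFramePositionStc centroid; int area; };

inline FrameEllipseStc invalidEllipse() {
    FrameEllipseStc e = { { -1.0, -1.0 }, -1 };  // row, col and area all flagged
    return e;
}

inline bool isValidEllipse(const FrameEllipseStc& e) {
    return e.centroid.row >= 0.0 && e.centroid.col >= 0.0;
}
```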
4.4.4.4 The Landmark Identification Verifier

Figure 23  The Landmark Identification Verifier

The Landmark Identification Verifier component takes a proposed landmark and searches the area above it in the current image for a valid identifier block. The number read from this search is compared to the value given in the identification field of the current landmark input. If the numbers match, then the detected landmark is copied to the Verified Landmark output. If the numbers do not match, and the identifier block was correctly read, then the Wrong Id output is signaled. The specification of a valid identifier block is found in Section 3.2.5 on page 57.

4.4.4.5 Operation of the Landmark Detector Sub-Configuration

Referring to Figure 19, one can trace the operation of the Landmark Detector sub-configuration. This sub-configuration is composed of five sequential tiers, each containing one or more components. Within each tier, these components may operate concurrently. Although true concurrency cannot be achieved without a multiprocessor machine, it is notable that this potential concurrency occurs naturally when designing systems using ACE.

In tier 1, the predicted landmark event is fed into a Frame Landmark Splitter, which produces three FrameEllipseStc events, each holding the predicted centroid of a single dot of the landmark. Each of these FrameEllipseStc events is fed into an Ellipse Detector component, which searches the vicinity of the predicted centroid in the current image to find a black dot. If none is found, the Ellipse Detector outputs a FrameEllipseStc with -1 in the row and column fields.

The three detected FrameEllipseStc events are then passed to a Landmark Shape Verifier component in tier 2. This component performs some tests on the three ellipses to see if they form a valid landmark. If these three FrameEllipseStc events form a valid landmark, the sub-configuration skips to tier 5. Otherwise, if one of the ellipses is invalid, there is still a chance that the third dot may be found. To do this, the first Landmark Shape Verifier calculates three new ellipse predictions. These new predictions are passed on to tier 3, which consists of three more Ellipse Detectors. The output of the tier 3 Ellipse Detectors, along with the original valid ellipses from tier 2, is passed to tier 4, which contains three Landmark Shape Verifiers. If any of the tier 3 Ellipse Detectors managed to find a valid ellipse, one of the tier 4 Landmark Shape Verifiers will receive three valid ellipses and can verify the shape.

In tier 5, if any of the Landmark Shape Verifiers from tier 2 or tier 4 has found that its inputs form a valid landmark, that landmark is passed on to the Landmark Identification Verifier, which can then verify the identity of the landmark. If the identification proceeds satisfactorily, the fully detected and verified landmark is output from the Landmark Identification Verifier in the Detected Landmark event. If none of the Landmark Shape Verifiers in tier 2 or tier 4 receives ellipses that form a valid landmark, or if the identification of the detected landmark is found to be wrong, the Not Detected event is signaled.
This condition is governed by the AND component, which signals its output only if none of the three tier 4 Landmark Shape Verifiers finds a valid landmark on its inputs.

4.4.5 The Relative Position Calculator

Figure 24  The Relative Position Calculator Component

The Relative Position Calculator component takes a valid landmark in the frame buffer and calculates the pose of the landmark relative to the camera. Section 3.2.3.3, starting on page 41, details the equations used to calculate the relative position of the landmark. The final relative position is output in a PoseStc in the Relative Position event.

4.4.6 The Absolute Position Calculator

Figure 25  The Absolute Position Calculator Component

The Absolute Position Calculator component takes the relative pose of a landmark, and the known absolute pose of the detected landmark, and calculates the absolute position of the camera. This is done using the general pose transformation equation listed as Equation 8. This equation is shown again in Equation 11 with the appropriate parameters to calculate the absolute position of the camera.
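One plausible rendering of this step is sketched below: the camera pose measured relative to the landmark is composed with the landmark's known absolute pose. The sign and ordering conventions are assumptions, not a transcription of Equation 8/11.

```cpp
// Sketch of the Absolute Position Calculator's computation; the rotation
// and composition conventions here are assumed for illustration.
#include <cmath>

struct PoseStc { double x, z, orientation; };

PoseStc cameraAbsolute(const PoseStc& landmarkAbs, const PoseStc& camRelLandmark) {
    const double c = std::cos(landmarkAbs.orientation);
    const double s = std::sin(landmarkAbs.orientation);
    PoseStc out;
    // Rotate the relative offset into the absolute frame, then translate.
    out.x = landmarkAbs.x + c * camRelLandmark.x + s * camRelLandmark.z;
    out.z = landmarkAbs.z - s * camRelLandmark.x + c * camRelLandmark.z;
    out.orientation = landmarkAbs.orientation + camRelLandmark.orientation;
    return out;
}
```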
4.4.7 The Position Estimator

Figure 26  The Position Estimator Component

The Position Estimator component is responsible for maintaining an estimate of the current position. The implementation of this component may range from simply copying the detected position to the output, up to a Kalman filter that integrates odometry or dead reckoning with the visual localization system. Section 3.2.6 gives a set of equations that provide a second order kinematic model of the robot considered for this system. If the system were to be integrated with a full test-bed, this component would have to be able to provide a position estimate whenever requested, rather than at discrete asynchronous intervals. The equations given in Section 3.2.6 do allow for a pose estimate at any time.

4.5 Embedded System Implementation

Although this thesis centers around the simulated version of the ROLL system, the simulated version was actually preceded by a physical implementation. An overview of the system is shown in Figure 27. A small camera was mounted on top of a cylindrical robot, nicknamed Spinoza. The image from this camera was transmitted using two transceivers. All of the localization was performed on a single Texas Instruments TMS320C40 DSP board with a frame grabber. Position data obtained from the localization algorithm was passed on to a second TMS320C40 board with a frame buffer, whose contents were displayed on an available monitor. The robot was manually driven using a host computer serial port connection and another set of transceivers not shown in Figure 27.

Preliminary performance analysis revealed that a single C40 was able to process the images at approximately 8Hz. At this rate it was possible to drive the robot only at speeds lower than 1 meter/s, and rotational speed also had to be limited. The maximum range at which landmarks were detectable was about 2.5 meters.

Figure 27  The Physical Implementation

The system showed sensitivity to the many constants involved in the algorithm. These constants are listed in Appendix A. The system also experienced a high frequency of "misses", where the robot would fail to detect the landmark for several frames and become lost. Several rough re-initialization schemes were attempted, but none proved both reliable and quick.

Unfortunately, before these deficiencies could be investigated and the system characteristics quantified properly, several factors came to the forefront that instigated the move to the simulated test bed using the ACE programming environment. The first factor was that the code was not well organized, and debugging and maintenance were quickly becoming impractical. Secondly, after a prolonged absence from the project, the robot mountings were changed, rendering it impossible to use the robot further.

Chapter Five - Results and Evaluation

5. Results And Evaluation

This chapter presents the results from system testing and a qualitative evaluation of system performance. It is split into two sections: Localization System Evaluation and ACE Environment Evaluation. The first and larger section concerns itself with evaluation of the localization system and algorithms, while the second, smaller section relates to the sub-theme of this thesis, which is programming in the ACE environment.

5.1 Localization System Evaluation

Accurate system evaluation proved to be very difficult for the ROLL system. There are a large number of metrics that would have been desirable, but for one reason or another each of these metrics was not practically obtainable.

5.1.1 Issues / Difficulties

There were two main problems that thwarted testing. The first was the difficulty of measuring absolute location. The second was the effort involved in obtaining images.

5.1.1.1 Measuring Errors

Even with a metered optical bench, there were several large sources of error that prevented an accurate measurement of absolute pose. Because the camera was required to be at the same height as the landmark middle dot, a vertically adjustable mounting for the camera was required. This vertical mounting was in fact a two degree of freedom mounting: the mounting platform was adjustable in the horizontal plane, and in turn, when the camera was mounted on the platform, it was also free to rotate in the horizontal plane. Both of these rotational freedoms could be secured, but not at a specific angle. Because the camera is mounted at an offset from the vertical pole, this inaccuracy affected all three pose variables: x, z, and orientation.

Figure 28  The Camera Mounting

The landmark mounting had similar problems. In particular, on the motion stage shown in Figure 29, the landmark had no fastenings and had to be aligned by simple inspection. Also, while it was very easy to get very accurate relative motion from the motion stages, the absolute coordinate of the starting position was difficult to measure, resulting in more error.

5.1.1.2 Image Capture Difficulties

The second problem was the rather slow rate at which images could be generated and analyzed. Because the system was moved to a stand-alone Windows NT simulated environment, each image in any image sequence had to be manually transferred either through floppy disk or modem. Each image used approximately 256 Kbytes of disk space. Another problem was the lack of automation possible in the frame grabber utility: capturing each image required input from the user. With these image capture problems, it became clear that any analysis requiring significant numbers of images was impractical.
5.1.2 Experiment Procedure

For the above reasons, the decision was made to test the system using a small number of test runs under relatively controlled conditions. To minimize inaccuracies in measuring the absolute position of the camera, the camera was calibrated using the Tsai method immediately before capturing images. The camera was kept immobile after calibration, and the calibration target was replaced by the localization target. An assumption was made that the results from the calibration algorithm were perfectly accurate. This reduced the source of measurement error to the error incurred by replacing the calibration target with the localization target.

5.1.2.1 Experiment Setup

Figure 29  Optical Bench Motion Stage Setup

Figure 29 depicts an overhead view of the experimental setup. The optical bench provided a secure surface on which to mount the camera platform and the motion stage. The motion stage consisted of a platform that was driven by an electric motor with a single screw gear along a precise track. This platform could be moved using a few script utilities provided by Stewart Kingdon at the UBC LCI.

5.1.3 Obtaining a Sequence of Images

For each run, the motion stage was "homed" by moving it to the extreme far end from the motor, and then moving it a known length backwards to arrive at the starting point for the run. Each run consisted of eleven images taken at different positions, beginning 2 inches in front of the starting point and continuing at 2 inch intervals thereafter. At each stop an image was obtained.

5.1.4 Camera Calibration Procedure

The system described in this section is the result of the efforts of Rod Barman, who implemented the calibration algorithm described in [Tsa87]. Beginning with the setup referred to in Figure 29, the localization target, shown in Figure 7, was temporarily mounted on the motion platform, and the motion platform was moved adjacent to the camera mounting pole. The camera mounting was then adjusted so that the camera was at the exact height of the middle dot of the landmark. The camera position was not allowed to be altered from this step forward.

The motion stage was moved away from the camera to the opposite corner of the optical bench. The localization landmark was then replaced with the calibration landmark, and a sequence of images was captured. This sequence of images was processed to obtain the centroid of each dot in each image, using a utility that required the user to specify a few dots manually before the rest could be detected automatically.

Figure 30  The Calibration Target

The array of dots was then processed by a utility that performed the actual Tsai calibration calculations. This utility was applied to the data multiple times until the results converged.

5.1.5 Camera Calibration Results

The results of the calibration run are the internal and external parameters of the Tsai camera model. The exact parameters are listed below.
# Frame buffer center (Cx, Cy)
240.106566062225 231.027618810727
#
# CCD sensor dimensions in meters (dx, dy)
1.14583333333333e-05 1.32530120481928e-05
#
# Frame buffer to CCD scale factor (Sx)
0.682694160379097
#
# Radial distortion coefficients (k1, k2)
2954.18720841782 0
#
# Focal length (f)
0.00812931290009322
#
# Yaw, pitch and roll (theta, phi, psi)
0.379992612801381 3.14827952707384 -0.0104168964014736
#
# Translation vector (tx, ty, tz)
-0.331945854525235 0.111682017910593 1.75903951865051
#
# Camera origin: (0.961970) (0.089901) (1.511096)
# RMS pixel error (0.108176) x, (0.156792) y.
# Maximum pixel error (0.412734) x, (0.499458) y.

The internal parameters were converted into data accessible to the localization system so that the Tsai ideal image plane - frame buffer transformations could be applied. The external parameters, the yaw and the translation vector, were used as the absolute position measurement of the camera relative to the world origin. The values given in the translation vector are relative to the bottom right dot of the calibration target, and constitute the T vector shown below in Equation 51. The yaw value is used to form the 2D rotation matrix R:

Equation 51
$$\begin{bmatrix} x \\ z \end{bmatrix} = R \begin{bmatrix} x_w \\ z_w \end{bmatrix} + T$$

The world coordinates of the camera can be found by rearranging Equation 51 and inserting the known values. Equation 52 shows the resulting linear system:

Equation 52
$$\begin{bmatrix} \cos(-0.38) & -\sin(-0.38) \\ \sin(-0.38) & \cos(-0.38) \end{bmatrix} \begin{bmatrix} x_w \\ z_w \end{bmatrix} = \begin{bmatrix} 0.332 \\ -1.760 \end{bmatrix}$$

The results of solving this system are shown in Equation 53:

Equation 53
$$\begin{bmatrix} x_w \\ z_w \end{bmatrix} \approx \begin{bmatrix} 0.96\,\mathrm{m} \\ -1.51\,\mathrm{m} \end{bmatrix}$$

The measured camera pose then follows, with the orientation given by the yaw offset by −π:

Equation 54
$$P_{camera} = \begin{bmatrix} x_w \\ z_w \\ \mathit{yaw} - \pi \end{bmatrix} \approx \begin{bmatrix} 0.96\,\mathrm{m} \\ -1.51\,\mathrm{m} \\ -2.762\,\mathrm{rad} \end{bmatrix}$$
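As a cross-check, the rearrangement of Equation 51 can be carried out numerically from the calibration dump above. The short program below is illustrative only; the sign conventions follow the reconstruction of Equations 52-54.

```cpp
// Cross-check of Equations 52-54: recover the camera's world coordinates
// from the Tsai yaw and translation vector printed above.
#include <cmath>
#include <cstdio>

int main() {
    const double kPi = 3.14159265358979;
    const double yaw = 0.379992612801381;     // theta from the calibration run
    const double tx  = -0.331945854525235;    // translation vector, x component
    const double tz  =  1.75903951865051;     // translation vector, z component

    // Rearranging [x z]^T = R(-yaw)[xw zw]^T + T with [x z]^T = 0 gives
    // [xw zw]^T = R(yaw) * (-T), since a rotation's inverse is its transpose.
    const double c = std::cos(yaw), s = std::sin(yaw);
    const double xw = c * (-tx) - s * (-tz);  // ~  0.96 m
    const double zw = s * (-tx) + c * (-tz);  // ~ -1.51 m
    const double orientation = yaw - kPi;     // ~ -2.762 rad

    std::printf("camera pose: x=%.3f m  z=%.3f m  theta=%.3f rad\n",
                xw, zw, orientation);
    return 0;
}
```

Running this reproduces the camera origin values printed in the calibration dump, which supports the reconstruction of the linear system.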
5.1.6 Localization Test Run Procedure

The calibration target was replaced with the localization target. The localization target's top and bottom dots were vertically aligned with the center front of the motion stage platform, and the landmark orientation was also aligned with the center front of the motion stage. These alignments were made with only the assistance of the human eye, and are therefore believed to have incurred significant error. Despite this source of error, other sources of error were minimized by the use of the calibration routine to measure camera position, and by keeping the camera position fixed through calibration and the test runs.

The image sequence for the test run started from the same spot along the motion stage track as the calibration, and proceeded forward in steps of 2 inches. At each point, an image was captured and saved to disk. The images were then transferred to a Windows NT workstation, where they were fed to the simulated system.

5.1.6.1 Light Level Variance

The test run was repeated at a lower light level, achieved by turning off half of the lights in the laboratory. This second run is used to demonstrate the system's robustness to varying light levels. Figure 31 and Figure 32 show a sample of the images taken in the high light level test run and the low light level test run respectively. These images are 256-level greyscale images, but were shown with only 64 levels by the image viewing software used to display them.

Figure 31  Test Run #1 - Full Light

Figure 32  Test Run #2 - Dark

5.1.7 Analysis Of Localization Test Runs

The simulated system was configured to produce a text file log of the results of each component. The complete log for each test run is included in Appendices B and C. Excerpts from each run are listed in the sections below to explain any error conditions.

5.1.7.1 Simulated Test Run #1 - Full Light

Table 1 summarizes the results from the first simulated test run. Each row shows the position measured by the system, and the position calculated from the initial position obtained in the Tsai calibration; the latter value is shown in the "Known Pose" column of the table. The name of each image file is shown in the leftmost column. The error pose, found by subtracting the measured pose from the known pose, is shown in the final column; subtracting the run's average error from each entry approximates a correction for any systematic errors.

Table 1  Simulated Test Run #1 - Full Light

Image        Measured Pose            Known Pose               Cartesian Error
             (x, z, orientation)      (x, z, orientation)      (x, z, orientation)
ldmk15.iff   (-68.3, 152.0, 2.755)    (-67.0, 151.9, 2.762)    ( 1.3, -0.1,  0.007)
ldmk13.iff   (-66.8, 145.7, 2.761)    (-67.0, 146.8, 2.762)    (-0.2,  1.1,  0.001)
ldmk11.iff   (-64.0, 141.6, 2.780)    (-67.0, 141.7, 2.762)    (-3.0,  0.1, -0.018)
ldmk09.iff   (-68.3, 136.0, 2.754)    (-67.0, 136.7, 2.762)    ( 1.3,  0.7,  0.008)
ldmk07.iff   (-67.4, 131.9, 2.762)    (-67.0, 131.6, 2.762)    ( 0.4, -0.3,  0.000)
ldmk05.iff   (-65.6, 126.8, 2.773)    (-67.0, 126.5, 2.762)    (-1.4, -0.3, -0.011)
ldmk03.iff   (-67.6, 121.4, 2.760)    (-67.0, 121.4, 2.762)    ( 0.6,  0.0,  0.002)
ldmk01.iff   (-66.9, 116.5, 2.765)    (-67.0, 116.3, 2.762)    (-0.1, -0.2, -0.003)
ldmk-1.iff   (-65.3, 111.6, 2.777)    (-67.0, 111.3, 2.762)    (-1.7, -0.3, -0.015)
ldmk-3.iff   (-66.2, 106.3, 2.770)    (-67.0, 106.2, 2.762)    (-0.8, -0.1, -0.008)
ldmk-5.iff   (-65.2, 101.6, 2.780)    (-67.0, 101.1, 2.762)    (-1.8, -0.5, -0.018)
ldmk-7.iff   (-66.4,  96.2, 2.770)    (-67.0,  96.0, 2.762)    (-0.6, -0.2, -0.008)

average                                                        (-0.5, -0.01, -0.005)
max error magnitude                                            ( 3.0,  1.1,  0.018)
standard deviation (s)                                         ( 1.32, 0.46, 0.0091)

5.1.7.1.1 Commentary

As the above table shows, all of the landmarks in the sequence were detected, and there were no overly large errors. An approximation of any residual systematic error is given by the average error at the bottom of the error column; once the errors are adjusted for this residual systematic error, the variance of the remaining values is limited.
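For clarity, the error quantities used in these tables can be expressed directly in code. The helpers below are illustrative and mirror the definitions just given (error = known − measured; adjusted error = error − run average).

```cpp
// Illustrative computation of the per-image error and the run-average
// error used in Tables 1 and 2.
#include <vector>

struct PoseStc { double x, z, orientation; };

PoseStc poseError(const PoseStc& known, const PoseStc& measured) {
    PoseStc e = { known.x - measured.x,
                  known.z - measured.z,
                  known.orientation - measured.orientation };
    return e;
}

PoseStc meanError(const std::vector<PoseStc>& errors) {
    PoseStc m = { 0.0, 0.0, 0.0 };
    for (const PoseStc& e : errors) {
        m.x += e.x; m.z += e.z; m.orientation += e.orientation;
    }
    const double n = static_cast<double>(errors.size());
    m.x /= n; m.z /= n; m.orientation /= n;
    return m;
}
```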
5.1.7.2 Simulated Test Run #2 - Half Light

The results from the second simulated test run are presented in the same way as those of the first test run. Table 2 shows the raw pose measurements along with the error calculations.

Table 2  Simulated Test Run #2 - Half Light

Image        Measured Pose            Known Pose               Cartesian Error
             (x, z, orientation)      (x, z, orientation)      (x, z, orientation)
ldmk15.iff   (-69.9, 149.7, 2.741)    (-67.0, 151.9, 2.762)    ( 2.9,  2.2,  0.021)
ldmk13.iff   (-69.1, 144.5, 2.745)    (-67.0, 146.8, 2.762)    ( 2.1,  2.3,  0.017)
ldmk11.iff   (-68.9, 138.6, 2.743)    (-67.0, 141.7, 2.762)    ( 1.9,  3.1,  0.019)
ldmk09.iff   Not Detected             (-67.0, 136.7, 2.762)    N/A
ldmk07.iff   (-67.9, 129.9, 2.753)    (-67.0, 131.6, 2.762)    ( 0.9,  1.7,  0.009)
ldmk05.iff   (-67.4, 124.7, 2.755)    (-67.0, 126.5, 2.762)    ( 0.4,  1.8,  0.007)
ldmk03.iff   (-68.9, 120.8, 2.750)    (-67.0, 121.4, 2.762)    ( 1.9,  0.6,  0.012)
ldmk01.iff   (-65.8, 116.0, 2.771)    (-67.0, 116.3, 2.762)    (-1.2,  0.3, -0.009)
ldmk-1.iff   (-68.6, 109.5, 2.746)    (-67.0, 111.3, 2.762)    ( 1.6,  1.8,  0.016)
ldmk-3.iff   (-66.6, 106.2, 2.768)    (-67.0, 106.2, 2.762)    (-0.4,  0.0, -0.006)
ldmk-5.iff   (-66.6, 101.0, 2.767)    (-67.0, 101.1, 2.762)    (-0.4,  0.1, -0.005)
ldmk-7.iff   (-66.5,  95.6, 2.766)    (-67.0,  96.0, 2.762)    (-1.5,  0.4, -0.004)

average                                                        ( 0.75, 1.3,  0.007)
max error magnitude                                            ( 2.9,  3.1,  0.021)
standard deviation (s)                                         ( 1.47, 1.06, 0.0111)

5.1.7.2.1 Anomalies

5.1.7.2.1.1 Missed Landmark

In the image stored in the file ldmk09.iff, no landmark was detected. The relevant portion of the log file for this image is shown below.

1795722 CurrentImage
1795731 AbsoluteLandmarkPosition  Id: 0  x: 0.0000  z: 0.0000  theta: 0.0000
1795758 PredictedLandmark   upper  211.2  209.6  50
                            middle 231    202.3  50
                            lower  250.9  209.6  50
1795787 EllipseDetectionTop          row= 229    col= 205    area= 1
1795797 EllipseDetectionMiddle       row= 238.1  col= 195.1  area= 32
1795807 EllipseDetectionBottom       row= 259.4  col= 202.8  area= 33
1795818 Tier2HighestEllipse          row= 216.8  col= 199    area= 32
1795873 Tier2HighEllipse             row= 238.1  col= 195.1  area= 32
1795882 Tier2MiddleEllipse           row= 248.7  col= 199    area= 32
1795896 Tier2LowEllipse              row= 259.4  col= 202.8  area= 33
1795907 Tier2LowestEllipse           row= 280.7  col= 199    area= 33
1795918 Tier2HighestDetectedEllipse  row= 228    col= 198    area= 3
1795929 Tier2MiddleDetectedEllipse   row= -1     col= -1     area= -1
1795939 Tier2LowestDetectedEllipse   row= -1     col= -1     area= -1
1795950 NotDetected

This log shows that the top ellipse was not detected from the first prediction, and that the ellipse detected at the second prediction (Tier2HighestDetectedEllipse, at approximately row 228 and column 198, with an area of only 3) is not the top dot. From the image, shown below in Figure 33, we find that the approximate centroid of the top ellipse is at row 217 and column 203; these values are rather far apart. From the image, one can spot two very small "ellipses" detected between the middle ellipse and the top ellipse. These will not be visible in the black and white image below, but were visible when the image was displayed in colour, since detected ellipses were marked in colour.

Figure 33  Landmark Not Detected - Test Run #2

Consulting the log file again, it is apparent that the shape verifier improperly used the ellipse found in EllipseDetectionTop (row = 229, col = 205, area = 1) in the calculation of the second predictions. This constitutes a bug in the code or the algorithm. With this bug fixed, the prediction for Tier2HighestEllipse would have been row = 216.9, column = 199.0, which would most likely have led to the detection of the proper ellipse.

5.2 ACE Environment Evaluation

There are other environments comparable to ACE, some of which resemble embedded real-time kernels and some of which resemble graphical data flow tools. ACE has advantages and disadvantages with respect to each of these environments, but a reasonably comprehensive comparison is not practical within this thesis. In this section, the ACE environment is compared only to coding from scratch in C on a Texas Instruments TMS320C40 DSP chip.

5.2.1 Advantages
Better Hierarchical Designs. The largest advantage of the ACE system lies in the way it forces the designer to separate the functional elements of a system from its control flow. This occurs because an ACE component is a purely functional abstraction. The flow of data and control is encapsulated in the ACE events, which can be arbitrarily rearranged to change the overall functionality or timing characteristics of the system. The components are completely removed from connectivity issues, resulting in a building-block-like approach to assembling the system. This in turn has the following two advantages:

• more reuse - An arbitrary piece of software cannot practically be reused unless it has functional cohesion ([McC93], page 81) and is easily reconnected. The ACE environment guarantees the latter by its requirement that components use events for their data input and output.

• easier maintenance - It is very easy to change one part of the system without causing errors in another part of the system. This is because each component is an individual entity affected only through its explicit inputs and outputs.

An experienced programmer or designer could create a hand-coded solution, following good software design principles, that provided easy maintenance and a large amount of reuse. It is likely that the same designer/programmer could achieve higher levels of maintainability and reuse using ACE.

Better Debugging Tools. Using ACE on Windows NT rather than on a C40 allows the use of an integrated debugging tool. A debugging tool, which was not available for the particular C40 system that was used, can provide a very significant time savings during development. This advantage is not a function of the ACE environment itself; rather, it is a function of the operating system on which the ACE environment runs.

5.2.2 Disadvantages

Most of the disadvantages of the ACE environment are related to its lack of support tools. As was noted above, its most significant tool is a commercial off-the-shelf integrated development environment (which contained the debugger referenced above). ACE is in its infancy. As it stands, it is a custom in-house real-time scheduling framework that is promisingly powerful and simple, but it lacks the support infrastructure, including documentation, that would allow it to be used off-the-shelf.

No Support For Image Processing. ACE has no pre-designed image processing components. This meant that nearly all components in this configuration had to be created from scratch, which in turn limited the amount of reuse, although some components were reused within the configuration, e.g., the EllipseDetector component.

Separation From Video Hardware. There was no video frame buffer hardware available for the PC that was used to run the ACE system. This meant that the capture of images had to be performed manually: pictures were captured and stored, the images were transferred to the PC, and the ACE system then simulated a frame grabber by loading each image, in sequence, from the hard disk. It was not practical to automate this procedure. The result was that large volume testing was not possible, i.e., testing was confined to approximately 20 images.

Unfamiliarity. In this case, the author of this thesis was also the programmer who created ACE. Thus there was no learning curve necessary, and there was no unknown functionality, which meant all potentially hidden pitfalls were avoided.
Other users of ACE would clearly have an initial learning curve and might also run into several unforeseen pitfalls.

Lack of a Graphical Editor. The data-flow nature of the ACE environment lends itself naturally to graphical programming: the components would be represented by blocks, and the events by lines (or "wires") that connect them. While the lack of such an editor is not necessarily a disadvantage of the ACE environment with respect to hand-coding in C, it is a distinct disadvantage with respect to more polished off-the-shelf data flow environments.

Chapter Six - Conclusions

6. Conclusions

In this chapter, the system presented is evaluated against the specifications outlined in Section 3.1. These comparisons show that the current implementation of the ROLL system did not fully meet all of the specifications, but could meet them if development continued. Section 6.1 evaluates the current ROLL implementation against each of the specifications outlined in Section 3.1. Section 6.2 discusses the viability of the main tenets of this thesis. Section 6.3 enumerates several improvements that could be made to the system.

6.1 Specifications Evaluation

6.1.1 Continuous Sensing

This specification said that the system must not require the robot to stop in order to carry out the localization algorithm. The ROLL algorithm clearly satisfies this specification. The only case in which the system would force the robot to stop is if the robot becomes "lost", a condition in which no landmarks appear within the predicted tracking area.

6.1.2 Real-Time Performance

After the move to the simulated environment, this specification became impossible to prove. It is noted that the original C40-based version of the ROLL system did operate at approximately 8Hz. Although it must be said that the ROLL system technically did not meet the specification, it is fairly clear that with additional optimization or increased processing power, the ROLL system could exceed the 10Hz specification.

6.1.3 Minimal Hardware

The system was required to run on a single Texas Instruments TMS320C40 DSP chip without any special processing hardware, except for a frame buffer. This specification was met, but it is clearly a trade-off against the preceding real-time performance specification: more processing power could be used to satisfy that specification if it were available.

6.1.4 Single Camera Localization

This specification was satisfied. At no time was a second camera ever needed.

6.1.5 Range Requirements

Unfortunately, the testing difficulties did not allow proof that the range requirements were met. The range specifications required the system to be able to detect landmarks from 15cm to 3 meters away from the camera.

6.1.6 No Odometry Required

Technically this requirement was satisfied: the system does not inherently rely on any odometry. However, odometry could be used to make the system more robust, thus avoiding the need to stop when the robot becomes lost.

6.2 Viability

In this section, issues are discussed that were not directly addressed by the specifications. The main tenets of this thesis are a) that single camera, single landmark, real time localization is practical with minimal hardware, b) that the particular landmark and algorithms used in the ROLL system are practical, and finally, c) that the ACE environment is suitable for designing image processing applications.
6.2.1 Single Landmark Localization

The ROLL system has clearly shown that single landmark localization is possible with small errors. The key to minimizing the errors lies in avoiding the image plane parallelism problem described in Section 3.2.3.2. If this problem is not tackled by the landmark design, the errors radial to the ray from the camera to the landmark will render the data practically useless.

6.2.2 The Current ROLL System

The ROLL system successfully implements a single camera, single landmark localization system. Because the testing did not allow a large number of images to be taken, there was no real way to verify the system's robustness.

6.2.2.1 Dependence on Parameters

One subject associated with robustness is the dependence on parameters that have been hard-coded as constants. Many times during development, a landmark could not be detected. The solution was sometimes a bug fix in the code, but more often changing a constant provided the solution. These changes were usually made by guessing what would solve the current problem. This represents a serious reliability problem, since it is likely that these values were not converging on a set of values that is best for all possible angles, ranges and lighting situations.

Appendix A contains a table of all the constants required by the ROLL system. This table shows the name of each constant, its final value, and a brief indication of how that value was arrived at. The constants listed as manually tuned are the ones that could be changed if the current image sequence showed an error. The ones marked real world measurement are the ones that had a direct correspondence to a measurable value in the real world; this includes the camera calibration parameters obtained from the Tsai calibration algorithm, since the Tsai calibration is a form of measuring.

6.2.3 ACE for Image Processing

Despite the several serious disadvantages listed in Section 5.2.2, the advantages listed for ACE outweigh the disadvantages. This is because the data flow nature of ACE conforms very closely to the way in which image processing systems are designed. ACE also has the advantage that it seamlessly handles asynchronous input from multiple sources. For example, if there were a Kalman filter subsystem, it could be programmed in ACE and would then be able to accept input from the video localization system and from odometry at whatever rate each input system can provide data. This information could then be fed back to each system as it becomes available, providing the localization system with a more accurate estimate of the current position and making the system more robust.

6.3 Future Work

This section lists potential avenues of future work. For each item it gives the motivation, and a possible approach that could be taken.

6.3.1 Robustness Testing

Given the minimal testing reported in this thesis, there is a need for testing with a large number of image sequences. This would allow a clearer determination of the system's robustness. If the system fails consistently, two approaches could be taken to rectify the situation. The first is to reduce the number of manually tuned constants; some of these could be derived from the real world measurement type constants. The other is to improve the tracking performance, either by adding a better kinematic model or by reducing the mean main loop iteration time.
6.3.2 A Scale Invariant Landmark

One would like the system to work over a very large range of camera-to-landmark distances. Whether or not the current landmark will allow the system to reach the 15cm to 3 meter range requirement, it would be desirable to continually increase the range over which localization can be performed. The landmark itself is the limiting factor in this effort. Increasing the size of the landmark would allow the system to work from farther away, but the minimum usable range would also increase. If a landmark could be found that is visible at several scales, then much of the range problem would be overcome.

6.3.3 Large Scale Maps

The current testing uses only a single landmark. It has yet to be shown that the system will work in a reasonably sized environment with multiple landmarks. For this to happen, a "course" could be set up to demonstrate the localization system used in navigation with multiple landmarks.

6.3.4 Full Real-Time Implementation

This system could be used to leverage other research into robot control architectures by providing a test bed. To do this, the system would have to be integrated into an existing robot on which a camera can be mounted.

6.3.4.1 Use of the LCI's Colour Tracking System

Within the LCI there exists a colour tracking system that could be integrated into the ROLL system. The LCI colour tracking system finds coloured blobs in the current image and reports the centroid of each distinct coloured blob. The current landmark could be modified so that the top, middle and lower dots are coloured, e.g., all middle dots could be orange, and all top and bottom dots green. The LCI colour tracking system could then almost entirely replace the tracking and detection parts of the ROLL system, which would likely increase the robustness of the system significantly. The one disadvantage of the LCI blob tracking system is that it requires a custom transputer network, including some specialized data flow image processing hardware. This would violate the specification that the system run on a single C40 with only a frame buffer as extra hardware.

BIBLIOGRAPHY

[Ati91] Atiya, Sami and Hager, Greg, "Real-Time Vision-Based Robot Localization," in Proceedings of the 1991 IEEE International Conference on Robotics and Automation, pages 639-644, April 1991.

[Bal82] Ballard, D. H. and Brown, C. M., "Computer Vision," Prentice-Hall, 1982.

[Cap90] Caprile, B. and Torre, V., "Using Vanishing Points for Camera Calibration," in International Journal of Computer Vision, volume 4, pages 127-140, Kluwer Academic Publishers, Netherlands, 1990.

[Der90] Deriche, Rachid and Giraudon, Gerard, "Accurate Corner Detection: An Analytical Study," in IEEE International Conference on Computer Vision, pages 66-70, IEEE, 1990.

[Gel74] Gelb, Arthur (ed.), "Applied Optimal Estimation," The M.I.T. Press, Cambridge, 1974.

[Han93] Hanks, Steve, Pollack, Martha E. and Cohen, Paul R., "Benchmarks, Test Beds, Controlled Experimentation and the Design of Agent Architectures," AI Magazine, Winter 1993, pages 17-41.

[Har] Harris, Chris and Stephens, Mike, "A Combined Corner and Edge Detector," unpublished, Plessey Research, Roke Manor, United Kingdom.

[Hor91] Horn, Berthold, "Robot Vision," McGraw-Hill, 1991.

[Hur93] Hurn, Jeff, "Differential GPS Explained," Trimble Navigation, 1993.

[Kro89] Krotkov, Eric, "Mobile Robot Localization Using A Single Image," in Proceedings of the IEEE International Conference on Robotics and Automation, pages 978-983, IEEE Press, 1989.
[Leo91] Leonard, John J. and Durrant-Whyte, Hugh F., "Mobile Robot Localization by Tracking Geometric Beacons," in IEEE Transactions on Robotics and Automation, vol. 7, no. 3, pages 376-382, IEEE Press, 1991.

[Lit90] Little, James J., "Vision Servers and Their Clients," in International Conference on Pattern Recognition, vol. 3, pages 295-299, Jerusalem, Oct. 1994.

[Liu90] Liu, Yuncai, Huang, Thomas S. and Faugeras, Olivier D., "Determination of Camera Location from 2-D to 3-D Line and Point Correspondences," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pages 28-37, IEEE Press, January 1990.

[Mac93] Mackworth, Alan K., "On Seeing Robots," UBC Technical Report 93-5, March 1993, Vancouver, BC.

[May90] Maybeck, Peter S., "The Kalman Filter: An Introduction to Concepts," in Autonomous Robot Vehicles, Cox & Wilfong (eds.), Springer-Verlag, 1990.

[McC93] McConnell, Steve, "Code Complete," Microsoft Press, 1993.

[McK91] McKerrow, Phillip John, "Introduction to Robotics," Addison-Wesley, 1991.

[Mur93] Murata, S. and Hirose, T., "Onboard Locating System Using Real-Time Image Processing for a Self-Navigating Vehicle," in IEEE Transactions on Industrial Electronics, vol. 40, no. 1, pages 145-154, February 1993.

[Nic93] Nickerson, S. B. et al., "ARK: Autonomous Navigation of a Mobile Robot in a Known Environment," in Intelligent Autonomous Systems IAS-3, pages 288-296, Groen, Hirose and Thorpe (eds.), IOS Press, Amsterdam, 1993.

[Ran89] Rangarajan, Krishnan, Shah, Mubarak and Van Brackle, David, "Optimal Corner Detector," in Computer Vision, Graphics and Image Processing, vol. 48, pages 230-245, 1989.

[Sah93] Sahota, Michael K., "Real-Time Intelligent Behaviour in Dynamic Environments: Soccer-playing Robots," Master's Thesis, University of British Columbia, August 1993.

[Sug88] Sugihara, K., "Some Location Problems for Robot Navigation Using a Single Camera," in Computer Vision, Graphics, and Image Processing, vol. 42, no. 1, pages 112-129, April 1988.

[Tsa87] Tsai, Roger Y., "A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses," in IEEE Journal of Robotics and Automation, vol. RA-3, no. 4, pages 323-344, August 1987.

APPENDIX A  CONSTANTS

Name                                   Value  Units   Method                  Notes
DOT COMPACTNESS TEST TOLERANCE         1.25   factor  Manual Tuning           Maximum deviation from the ideal compactness of an ellipse for landmark dots.
DOT COMPACTNESS TEST MEDIAN AREA       20     pixels  Manual Tuning           Minimum area for a compactness test to be valid.
VERTICAL DOT ALIGNMENT TOLERANCE       5      pixels  Manual Tuning           Number of pixels of deviation allowed between the columns of the upper and lower dots on a landmark.
MAXIMUM MIDDLE DOT VERTICAL DEVIATION  16     pixels  Manual Tuning           Maximum number of horizontal pixels between the column of the middle dot and the columns of the upper and lower dots.
LANDMARK MIDDLE DOT OFFSET             5      cm      Real World Measurement  Landmark physical characteristic.
LANDMARK HEIGHT                        10     cm      Real World Measurement  Landmark physical characteristic.
LANDMARK WIDTH                         10     cm      Real World Measurement  Landmark physical characteristic.
DOT DIAMETER                           2      cm      Real World Measurement  Landmark physical characteristic.
CENTROID AREA TOLERANCE                20     pixels  Manual Tuning           Maximum difference in area between dots in the same landmark.
APPENDIX B  TEST RUN #1 LOG

Comments (the name of the stored image being processed) are marked on the right. Numbers on the left are a relative time measure in milliseconds. The cycle for the first image reads:

    236203778  Start
    236204611  CurrentImage                                     <-- ldmk15.iff
    236204684  AbsoluteLandmarkPosition   Id: 0  x: 0.0000  z: 0.0000  theta: 0.0000
    236204729  PredictedLandmark          upper  214.6  259    34
                                          middle 231    254.5  34
                                          lower  247.5  259    34
    236204753  EllipseDetectionTop        row= 220.3  col= 222.4  area= 17
    236204762  EllipseDetectionMiddle     row= 257.1  col= 222.6  area= 28
    236204769  EllipseDetectionBottom     row= -1     col= -1     area= -1
    236204780  Tier2HighestEllipse        row= 183.5  col= 222.5  area= 17
    236204838  Tier2HighEllipse           row= 220.3  col= 222.4  area= 17
    236204847  Tier2MiddleEllipse         row= 238.7  col= 222.5  area= 22
    236204855  Tier2LowEllipse            row= 257.1  col= 222.6  area= 28
    236204862  Tier2LowestEllipse         row= 293.9  col= 222.5  area= 28
    236204873  Tier2HighestDetectedEllipse  row= -1     col= -1     area= -1
    236204882  Tier2MiddleDetectedEllipse   row= 237.7  col= 216.3  area= 24
    236204892  Tier2LowestDetectedEllipse   row= -1     col= -1     area= -1
    236204901  DetectedLandmark           upper  220.3  222.4  23
                                          middle 237.7  216.3  23
                                          lower  257.1  222.6  23
    236204972  RelativePositionEvent      x= -6.0470   z= 166.5319  theta= -2.7553
    236205062  AbsolutePosition           x= -68.3376  z= 151.9849  theta=  2.7553

[The log continues in the same format for ldmk13.iff through ldmk-7.iff. The landmark was detected in every image of this run, with the Tier 2 search again needed for ldmk01.iff; the final entry, at 236215598, is AbsolutePosition x=-66.3915 z=96.1667.]
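Each RelativePositionEvent in the log above gives the camera's pose relative to the landmark, and the following AbsolutePosition line gives the result of mapping it through the landmark's absolute pose. The sketch below shows a 2-D pose composition that reproduces the logged numbers for a landmark at the origin; the axis and sign conventions, and the general form for a landmark away from the origin, are inferences from the log rather than the thesis's actual equations.

    #include <math.h>
    #include <stdio.h>

    typedef struct { double x, z, theta; } Pose2D;

    static const double PI = 3.14159265358979;

    /* Map a camera pose measured relative to a landmark into the
     * absolute frame: rotate the (x, z) offset by (pi + theta_rel),
     * translate by the landmark position, and negate the heading. */
    static Pose2D to_absolute(Pose2D lm, Pose2D rel)
    {
        Pose2D a;
        double phi = lm.theta + PI + rel.theta;  /* rotation applied to the offset */
        a.x = lm.x + cos(phi) * rel.x - sin(phi) * rel.z;
        a.z = lm.z + sin(phi) * rel.x + cos(phi) * rel.z;
        a.theta = lm.theta - rel.theta;
        return a;
    }

    int main(void)
    {
        /* First RelativePositionEvent of Test Run #1; landmark at the origin. */
        Pose2D lm  = { 0.0, 0.0, 0.0 };
        Pose2D rel = { -6.0470, 166.5319, -2.7553 };
        Pose2D a   = to_absolute(lm, rel);
        printf("x=%.4f z=%.4f theta=%.4f\n", a.x, a.z, a.theta);
        /* Prints approximately x=-68.3376 z=151.9849 theta=2.7553,
         * matching the logged AbsolutePosition line. */
        return 0;
    }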
APPENDIX C  TEST RUN #2 LOG

Comments (the name of the stored image being processed) are marked on the right. Numbers on the left are a relative time measure in milliseconds. The run processes the same image sequence as Test Run #1; the cycle for the first image reads:

    10855142  Start
    10855937  CurrentImage                                     <-- ldmk15.iff
    10856037  AbsoluteLandmarkPosition   Id: 0  x: 0.0000  z: 0.0000  theta: 0.0000
    10856090  PredictedLandmark          upper  214.6  259    34
                                         middle 231    254.5  34
                                         lower  247.5  259    34
    10856114  EllipseDetectionTop        row= 220.3  col= 222.4  area= 17
    10856122  EllipseDetectionMiddle     row= 237.7  col= 216.3  area= 24
    10856132  EllipseDetectionBottom     row= 257.1  col= 222.6  area= 28
    10856141  DetectedLandmark           upper  220.3  222.4  23
                                         middle 237.7  216.3  23
                                         lower  257.1  222.6  23
    10856223  RelativePositionEvent      x= -6.0470   z= 166.5319  theta= -2.7553
    10856239  AbsolutePosition           x= -68.3376  z= 151.9849  theta=  2.7553

[Entries for ldmk13.iff through ldmk03.iff follow the same pattern.] For ldmk01.iff, neither the first-pass ellipse detection nor the Tier 2 search produced a landmark, and the frame was logged as NotDetected:

    10862195  CurrentImage                                     <-- ldmk01.iff
    10862203  AbsoluteLandmarkPosition   Id: 0  x: 0.0000  z: 0.0000  theta: 0.0000
    10862232  PredictedLandmark          upper  208.9  178.9  63
                                         middle 231    169.9  63
                                         lower  253.2  178.9  63
    10862320  EllipseDetectionTop        row= 216.2  col= 169.8  area= 31
    10862330  EllipseDetectionMiddle     row= 238.1  col= 160.2  area= 37
    10862340  EllipseDetectionBottom     row= -1     col= -1     area= -1
    10862351  Tier2HighestEllipse        row= 194.4  col= 165    area= 31
    10862365  Tier2HighEllipse           row= 216.2  col= 169.8  area= 31
    10862374  Tier2MiddleEllipse         row= 227.2  col= 165    area= 34
    10862386  Tier2LowEllipse            row= 238.1  col= 160.2  area= 37
    10862397  Tier2LowestEllipse         row= 259.9  col= 165    area= 37
    10862450  Tier2HighestDetectedEllipse  row= -1  col= -1  area= -1
    10862459  Tier2MiddleDetectedEllipse   row= -1  col= -1  area= -1
    10862469  Tier2LowestDetectedEllipse   row= -1  col= -1  area= -1
    10862478  NotDetected

[For the remaining images, ldmk-1.iff through ldmk-7.iff, the previous prediction was reused and the landmark was again detected in every image; the final entry, at 10866348, is AbsolutePosition x=-66.3915 z=96.1667 theta= 2.7698.]
