MULTIAGENT MANIPULATOR CONTROL

By Simon Philip Monckton
B. Sc. (Mechanical Engineering) University of Alberta
M. Sc. (Mechanical Engineering) University of Alberta

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
MECHANICAL ENGINEERING

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September 1997
© Simon Philip Monckton, 1997

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Mechanical Engineering
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1Z1

Date:

Abstract

The objective of this research is to define, specify, and implement a new, robust, and extensible manipulator control founded upon recent developments in multiagent robot control architectures. Historically, manipulator controllers serve within an idealized, monolithic "sense-model-plan-act" (SMPA) control cycle that is both difficult and expensive to design for real time implementation. Recently, however, robotic systems have achieved remarkable performance through the combination of multiple, relatively simple, task specific controllers. These agents are arguably more reliable, robust, and extensible than SMPA architectures exhibiting similar performance. Furthermore, complex tasks have been achieved through multiagent teams, often exhibiting self organizing or emergent behaviour.
Despite these benefits and the growing popularity of these techniques, a formal model of agent and/or multiagent systems has not been proposed, nor has any such architecture been applied to classical manipulation robotics. This thesis attempts to address these omissions through the analysis and application of multiagent design principles to manipulator control.

After an introduction to problems in real time supervisory robot control, an overview of manipulator and controller dynamics establishes a reference robot model. With this model as background, experimental high performance robot architectures are examined and concepts common to these systems identified. Multiagent manipulator control strategies are then discussed and a global goal distribution mechanism introduced. The design and implementation of a complementary multiprocess manipulator simulator is then described. With a global goal distribution definition, the design of a manipulator model-free global goal generator is discussed. Results from a multiagent manipulator control simulation are then presented and evaluated. The focus then turns to multiple global goal operation, discussing self organization, multiple global goals, and the impact of simultaneously active local and global goal systems on stability and arbitration. Results demonstrate multiple global and local goal operation including combinations of end effector trajectory tracking, joint failure, obstacle avoidance, joint centering, and joint limit avoidance. Finally, the significance of these results is discussed in the context of general multiagent control.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgement
Nomenclature
1 Introduction
  1.1 Motivation
  1.2 Survey of Related Research
    1.2.1 Multiagent Control
    1.2.2 Real-Time Manipulator Control
  1.3 Contributions of the Thesis
  1.4 Thesis Outline
2 The Manipulator Model
  2.1 Introduction
  2.2 The Manipulator Model
    2.2.1 The Kinematic Model
    2.2.2 Manipulator Inertia Matrix
    2.2.3 Manipulator Dynamics
    2.2.4 Properties of The Manipulator Jacobian
    2.2.5 The Motor Model
3 Real Time Robot Control
  3.1 Introduction
  3.2 Task Functions
  3.3 Manipulator Control and the Inverse Solution
    3.3.1 Resolved Motion Position Control
    3.3.2 Resolved Motion Rate Control
    3.3.3 Resolved Motion Acceleration Control
    3.3.4 Jacobian Transpose Control, JTC
  3.4 Real Time Supervisory Control Architectures
    3.4.1 Manipulation
    3.4.2 High Speed Motion
    3.4.3 Mobile Robot Navigation
    3.4.4 Computer Graphics
    3.4.5 Common Denominators in Real Time Robot Control
  3.5 Multiple Robot Control
    3.5.1 Common Denominators in Real Time Robot Team Control
  3.6 Remarks
4 The Agent and Multiagent Model
  4.1 Introduction
  4.2 Behaviour Control
    4.2.1 Goals
    4.2.2 Behaviours
    4.2.3 Behaviour Control
  4.3 A General Model of Agency
    4.3.1 Information Exchange
    4.3.2 Arbitration
    4.3.3 Example: A Linear Arbitration Model
    4.3.4 From Lone Agents to Agent Teams
  4.4 A General Model of Multiagency
    4.4.1 Local Behaviour
    4.4.2 Global Behaviour
    4.4.3 Multiagent Coordination
    4.4.4 Global Goals
    4.4.5 Multiagent Control
    4.4.6 Emergent Multiagent Control
  4.5 Linear Arbitration and Multiagent Control
  4.6 Multiagent Manipulation
    4.6.1 Links as Agents
    4.6.2 Manipulators as Multiagent Systems
    4.6.3 Global Goal Distribution
    4.6.4 The Global Goal Proxy
  4.7 A Link Agent Specification
  4.8 Summary
5 The Multiprocess Manipulator Simulator
  5.1 Introduction
    5.1.1 Requirements Overview
    5.1.2 Design Specification Overview
  5.2 Foundations
    5.2.1 Quaternions
    5.2.2 Cartesian State
    5.2.3 Cartesian Trajectories and Paths
    5.2.4 Parsers
    5.2.5 LinkModel
    5.2.6 Interprocess Communications
  5.3 The Process Manager and Synthetic Clock, SI
    5.3.1 Execution
  5.4 The Dynamic Modeler, DM
    5.4.1 ManipulatorModel
    5.4.2 Obstacle Modelling
    5.4.3 Physical Readability
    5.4.4 Execution
  5.5 The Global Goal Process, GP
    5.5.1 The GlobalAgent Object
    5.5.2 Execution
  5.6 The Link Process, LP
    5.6.1 The Agent Object
    5.6.2 The Kinematic Bus
    5.6.3 The Global Goal IOStream
    5.6.4 Controller Objects
    5.6.5 Arbitration
    5.6.6 Execution
  5.7 Operational Overview
  5.8 Data Logs and Viewing
  5.9 Verification
    5.9.1 A 3R Reference Model
  5.10 Summary
6 Global Goal Generation
  6.1 Introduction
  6.2 Global Goal Generation
    6.2.1 Resolved Motion Decentralized Adaptive Control
  6.3 Multiagent Controller Performance
    6.3.1 Trajectory Tracking Performance
    6.3.2 Model Free Goal Regulation
    6.3.3 Actuator Saturation
  6.4 Summary
7 Multiple Goal Systems
  7.1 Introduction
  7.2 Auxiliary Behaviour and the Jacobian Null Space
  7.3 Auxiliary Global Goal Systems
    7.3.1 Obstacle Avoidance
    7.3.2 Multiagent Obstacle Avoidance
    7.3.3 Results
    7.3.4 Discussion
  7.4 Local and Global Goals Combined
    7.4.1 Why Local Goals?
    7.4.2 Continuous Nonlinear Joint Limit Repulsion
  7.5 Linear Local Goal Design
  7.6 Local Goal Strategies
    7.6.1 Fail Safe Locking and Robustness
    7.6.2 Constant Compliance Centering
    7.6.3 Constant Compliance Retraction
    7.6.4 Variable Compliance Centering
    7.6.5 Combinations: Trajectory Tracking, Obstacle Avoidance, and Centering
  7.7 Arbitration and Compliance
  7.8 Summary
8 Multiagent Control Compared
    8.0.1 Multiagent vs. Centralized Control
    8.0.2 Multiagent Manipulator Control and the Multiagent Context
  8.1 Demonstrating The Advantages
    8.1.1 Performance
    8.1.2 Structure
  8.2 Stability
    8.2.1 Combining Locally and Globally Compliant Goal Systems
    8.2.2 Combining Globally Adaptive and Locally Compliant Goal Systems
  8.3 Summary
9 Conclusions, Contributions, and Recommendations
  9.1 Conclusions
    9.1.1 Caveats
  9.2 Contributions
  9.3 Recommendations
Bibliography

Appendices
A Quaternions
  A.1 Definition
  A.2 Properties
  A.3 Relation to the Rotation Matrix
B Orientation Error
  B.1 Differential Motion Vector or Euler Rotations
  B.2 Quaternion Orientation Error
C Scenario Specification Database
  C.1 Manipulator Description Files (MDF): theRobot.rbt
  C.2 Global Goal Description Files (GDF): theGoals.dat
  C.3 Trajectory Description Files: theTrajectory.spt
  C.4 Obstacle Description Files: theObstacle.obs
D RMDAC Gain Selection
  D.1 Control Rates and RMDAC
E Global Goal Generators Compared
  E.1 The Operational Space Formulation
    E.1.1 Theoretical Performance
    E.1.2 Simulation
    E.1.3 Discussion
  E.2 Resolved Motion Force Control
    E.2.1 Theoretical Performance
    E.2.2 Simulation
    E.2.3 Discussion

List of Tables

3.1  The temporal scope of the NASREM layers. Each layer represents an increasingly (approximately an order of magnitude) longer planning horizon.
5.2  Structure of the CartesianState.
5.3  Structure of the LinkModel.
5.4  Reference Link, Joint, and Motor parameters. Centroid is relative to link frame.
6.5  Reference Link, Joint, and Motor parameters. Centroid is relative to link frame.
6.6  Reference Payload parameters. Centroid is relative to gripper frame.
6.7  Knotpoints on the reference trajectory.
6.8  The reference RMDAC 'Best Gains' determined through trial and error.
6.9  'Best Gains' RMS error in position and orientation for 'step', 'reference' and 'spiral' trajectories.
6.10  Tabulation of position and orientation RMS $L_2$ error and normalized execution time as a function of manipulator degrees of freedom.
7.11  RMS error for a combined tracking and obstacle avoidance strategy.
7.12  Comparison of end effector RMS position and orientation error performance between reference and fail-safe behaviour of link 9.
7.13  Tabulation of local joint centering PD gains versus RMS task space position error and RMS centering error for a simple joint centering strategy.
7.14  Tabulation of joint centering PD gains and setpoint for the constant compliance retraction local goal strategy.
7.15  RMS error for the constant compliance retraction strategy.
7.16  Tabulation of increasing and decreasing joint centering PD gains for the variable compliance local goal strategy.
7.17  RMS error for "increasing" and "decreasing" variable centering gain strategies.
7.18  Comparison of RMS errors for tracking, obstacle avoidance ($\eta = 10.0$), and centering ($k_p = 100$, $k_d = 20$) combined strategies.
8.19  RMAC and Multiagent Control RMS end effector and joint centering performance.
D.20  A tabulation of initial values (italic) and integral gains (bold) for a set of scenarios. Each integral can use up to three gains, $u_0$, $u_1$, and $u_2$. In this study $u_0$ is omitted while $u_2$ never varies, thus the listed gains represent the value of $u_1$ in each integral.
D.21  Tabulation of position and orientation RMS $L_2$ error with variation in error weights, $W_p$.
D.22  Tabulation of position and orientation RMS $L_2$ error as a function of variation of the weight $W_v$ given $w_{pi} = 900$.
D.23  Tabulation of position and orientation RMS $L_2$ error as a function of the Feedback integral gain, $\alpha_{1i}$, given $\beta_{1i} = 1.0$.
D.24  Tabulation of position and orientation RMS $L_2$ error as a function of the Feedback integral gain, $\beta_{1i}$, given $\alpha_{1i} = 1.0$.
D.25  Tabulation of position and orientation RMS $L_2$ error as a function of the Feedforward integral gain, $\lambda_{1i}$, given $\nu_{1i} = 0.0$, $\gamma_{1i} = 0.0$.
D.26  Tabulation of position and orientation RMS $L_2$ error as a function of the Feedforward integral gain, $\gamma_{1i}$, given $\nu_{1i} = 0.0$, $\lambda_{1i} = 2.0$.
D.27  Tabulation of position and orientation RMS $L_2$ error as a function of the Feedforward integral gain, $\nu_{1i}$, given $\gamma_{1i} = 2.0$, $\lambda_{1i} = 2.0$. Note that cases 23 and 24 were unstable in the step response.
D.28  Tabulation of position and orientation RMS $L_2$ error as a function of the auxiliary integral gain, $\delta_{1i}$, given $\gamma_{1i} = 2.0$, $\lambda_{1i} = 2.0$.

List of Figures

1.1  The Sense-Model-Plan-Act (SMPA) manipulator control cycle. An example of the prototypical robot control cycle born out of early AI research.
2.2  A 6 degree of freedom elbow manipulator. Note coordinate systems conform to Denavit Hartenberg conventions.
2.3  The Range and Null spaces of the Jacobian and Jacobian Transpose.
2.4  An armature controlled dc drive joint motor and gear train model.
3.5  Fisher's corrected Hybrid Position Force Control. $S$ is the selection matrix. $J^+$ is the pseudoinverse of the Jacobian and $z_q$ and $z_\tau$ are arbitrary vectors.
3.6  A block diagram of the Visual servoing approach to target tracking. Desired reference values are in feature space $f_{ref}$. Image feature changes are computed through a feature Jacobian $J_{feat}$.
3.7  Khatib's operational space formulation. Feedforward cartesian dynamics and obstacle avoidance forces are summed to drive a manipulator along a stable collision free trajectory.
     Colbaugh's configuration control replaces the potential field model with sensing and the feedforward dynamics with adaptive control.
3.8  The NASA/NIST Standard Reference Model for telerobot control.
3.9  Carnegie Mellon's Task Control Architecture for the Ambler hexapod. Note the centralized planner and distributed reactive controller.
3.10  Carnegie Mellon's Distributed Architecture for Mobile Navigation for the NAVLAB series of robots. Each behaviour votes on every possible command (e.g. steering radii) and all votes are processed within the arbiter.
3.11  A subsumption network. Perception (P) drives augmented finite state machines (M#) to output messages. Suppression nodes substitute horizontal line messages with vertical (tap) messages. Similarly, Inhibition nodes disable line messages if a tap message is received.
3.12  A SAN network example. Note the interconnections between sensor (S) nodes, hidden (H) nodes, and Actuator (A) nodes. Each node has the structure expanded at right. The hidden nodes act as an interactuator information mechanism.
3.13  By applying both direct ($\oplus$) and temporal ($\otimes$) combination to the same basic behaviours, higher-level behaviours can be generated. In this example, safe wandering is used to generate both flocking and foraging. Similarly, aggregation is used in both foraging and in surrounding.
4.14  A generic plant, $G_j(\cdot)$, with control input $u_i(t)$ and behaviour $y_j(t)$.
4.15  A goal $r_i(t)$ input to the generic controller, $H_i(\cdot)$, with output $u_i(t)$, controls the plant, $G_j(\cdot)$, making the behaviour, $y_j(t)$.
4.16  A goal $r_i(t)$ input to the generic controller, $H_i(\cdot)$, observes disturbances $v(t)$ to output $u_i(t)$ and control the plant, $G_j(\cdot)$, making the behaviour, $y_j(t)$.
4.17  A set of goals $r_i(t) : 1 \le i \le b$ input to generic controllers, $H_i(\cdot) : 1 \le i \le b$, each observing select disturbances $v(t)$ to output $u_i(t) : 1 \le i \le b$ to the arbitrator $A_j(\cdot)$.
     The composite output $u_{cj}$ controls the plant, $G_j(\cdot)$, to make the behaviour, $y_j(t)$.
4.18  A set of behaviour controllers $r_i(t)$ and $H_i(\cdot)$, each observing select disturbances $v(t)$ to output $u_i(t)$ to the arbitrator $A_j(\cdot)$. $A_j(\cdot)$ arbitrates amongst these controllers to track a desired goal $r_{cj}(t)$. The composite output $u_{cj}(t)$ controls the plant, $G_j(\cdot)$, to make the behaviour, $y_j(t)$.
4.19  A simple linear agent arbitration model.
4.20  A multiagent controller model. Note that each agent acts upon the global plant $G$, though each agent may apply control effort to only a portion of the plant, $G_j$.
4.21  A decomposed model of a multiagent controller. Though each agent acts upon a local plant $G_j$, disturbances dynamically bind plants to form a single global plant $G$. The aggregate relation $f(\cdot)$ represents the binding between the local and global behaviour states $y_j$ and $y$ respectively.
4.22  The rows of the Jacobian transpose are a projection operator from the global goal (a force vector) to the local goal space (an actuator moment or thrust vector).
4.23  With a global goal $f_d$ and the global goal proxy, $J^T(p_{i-1}, p_N)$, manipulator end effector control can be distributed amongst $N$ agents, each with a local goal and the global goal $x_d$.
5.24  The interprocess data flow of the Multiprocess Manipulator Simulator. As GP is not a mandatory process, it appears as a dashed circle.
5.25  Interprocess data flow between the link processes $1 \ldots N$, global goal process GP, process manager SI, and dynamic model DM.
     Note that the link state $x_i = [q_i \; \dot{q}_i]^T$.
5.26  The 3R planar reference model used to verify DM against NEDYN.
5.27  Here, the superimposed displacement responses of both the symbolic NEDYN and numerical 3R planar reference manipulator (link 1, top; link 2, middle; link 3, bottom) indicate excellent agreement between the two methods.
6.28  The Reference Manipulator: a ten degree of freedom planar revolute manipulator initially lying along the x axis.
6.29  The Reference end effector position (top) and orientation (bottom) trajectories. Each segment employs a parabolic velocity profile in both position and orientation. The position trajectory plot is annotated with time milestones.
6.30  End effector trajectory tracking under 'Best Gains' RMDAC. Note oscillation prior to convergence.
6.31  Trajectory tracking history for the reference trajectory under 'Best Gains' RMDAC.
6.32  End effector tracking performance over an unusual spiral trajectory in which the radius and angular displacement vary according to a parabolic velocity profile.
6.33  Trajectory tracking history for the spiral trajectory under 'Best Gains' RMDAC.
6.34  End effector trajectory tracking using 'Best Gains' RMDAC goal generator for a 5 degree of freedom planar manipulator.
6.35  End effector trajectory tracking using 'Best Gains' RMDAC goal generator for a 15 degree of freedom planar manipulator.
6.36  End effector trajectory tracking using 'Best Gains' RMDAC goal generator for a 20 degree of freedom planar manipulator.
6.37  End effector trajectory performance for the reference manipulator and payload under RMDAC end effector and joint force saturation ($u_{sat} = 1000$ N-m).
7.38  A simple link agent controller model.
7.39  Semilog plots of the potential field (top) and repulsive force (bottom) as a function of normalized range.
7.40  The reference manipulator avoids a small stationary sphere while tracking the reference trajectory.
     The sphere is 0.250 m in diameter at $x_{obs} = [1.00 \; -0.750]^T$. The RMOA controller uses a clearance of 0.75 m and gain of $\eta = 100.0$.
7.41  End effector trajectory tracking performance of the reference manipulator engaging in multiagent trajectory tracking and obstacle avoidance.
7.42  Evolution of the Global Goals within links 1 (top) and 2 (bottom).
7.43  Evolution of the Global Goals within links 3 (top) and 4 (bottom).
7.44  Evolution of the Global Goals within links 5 (top) and 6 (bottom).
7.45  Evolution of the Global Goals within links 7 (top) and 8 (bottom).
7.46  Evolution of the Global Goals within links 9 (top) and 10 (bottom).
7.47  End effector trajectory performance for the reference manipulator and payload under RMDAC end effector and joint limit avoidance ($\eta = 5.0$ N-m).
7.48  Both RMDAC end effector global and joint limit local goals are engaged.
7.49  End effector trajectory error for the reference manipulator and payload under the failure of link 9.
7.50  The manipulator configurations at the knotpoints of the reference trajectory for the reference manipulator and payload under the failure of link 9. The 'fail safe' braking behaviour locks link 9 to link 8.
7.51  The structure of a multiagent system in which each agent has both local centering and global trajectory tracking (and other) behaviours.
7.52  End effector trajectory for the reference manipulator and payload under both RMDAC end effector global and joint centering local goals.
7.53  The manipulator configurations at the knotpoints of the reference trajectory for the reference manipulator and payload under both RMDAC end effector global and joint centering local goals.
     Note the manipulator simultaneously adopts a 'leaf spring' configuration while tracking the reference trajectory, indicating that joint centering forces are acting in $N(J)$.
7.54  Centering and tracking torques for links 1 (top) and 2 (bottom) of the reference manipulator tracking the reference trajectory.
7.55  Centering and tracking torques for links 3 (top) and 4 (bottom) of the reference manipulator tracking the reference trajectory.
7.56  Centering and tracking torques for links 5 (top) and 6 (bottom) of the reference manipulator tracking the reference trajectory.
7.57  Centering and tracking torques for links 7 (top) and 8 (bottom) of the reference manipulator tracking the reference trajectory.
7.58  Centering and tracking torques for links 9 (top) and 10 (bottom) of the reference manipulator tracking the reference trajectory.
7.59  Both RMDAC end effector global and joint centering local goals with alternating setpoints are engaged.
7.60  End effector trajectory performance for the reference manipulator and payload under RMDAC end effector and joint centering 'retraction' local goals.
7.61  End effector trajectory performance for the reference manipulator and payload under RMDAC end effector and joint centering local goals of linearly increasing stiffness.
7.62  Both RMDAC end effector global and joint centering local goals with increasing stiffness are engaged.
7.63  End effector trajectory performance for the reference manipulator and payload under RMDAC end effector and joint centering local goals of linearly decreasing stiffness.
7.64  Both RMDAC end effector global and joint centering local goals with decreasing stiffness are engaged.
7.65  End effector trajectory error for the reference manipulator and payload under RMDAC end effector global, RMOA obstacle avoidance, and joint centering local goals.
7.66  Reference manipulator configurations at reference trajectory knotpoints under RMDAC end effector and obstacle avoidance global goals ($\eta = 10.0$) and joint centering local goals ($k_q = 100$, $k_w = 20$). Note 'leaf spring' configuration and avoidance reside in $N(J)$.
7.67  Range to surface, avoidance, centering, and tracking torques for links 1 (top) and 2 (bottom) of the reference manipulator tracking the reference trajectory.
7.68  Range to surface, avoidance, centering, and tracking torques for links 3 (top) and 4 (bottom) of the reference manipulator tracking the reference trajectory.
7.69  Range to surface, avoidance, centering, and tracking torques for links 5 (top) and 6 (bottom) of the reference manipulator tracking the reference trajectory.
7.70  Range to surface, avoidance, centering, and tracking torques for links 7 (top) and 8 (bottom) of the reference manipulator tracking the reference trajectory.
7.71  Range to surface, avoidance, centering, and tracking torques for links 9 (top) and 10 (bottom) of the reference manipulator tracking the reference trajectory.
7.72  End effector tracking performance overconstrained by RMOA and $N$ Centering controllers ($K_q = \mathrm{diag}(100)$, $K_w = \mathrm{diag}(20)$) and the agents' arbitration vector $k_A = [k_{track} \; k_{avoid} \; k_{centering}] = [1.0 \; 1.0 \; 0.5]$.
8.73  Convergence of techniques that form Multiagent Manipulator Control.
8.74  End effector trajectory for the reference manipulator and payload under centralized RMAC end effector and joint centering task assigned to the null space.
8.75  End effector trajectory for the reference manipulator and payload under centralized RMAC end effector and decentralized joint centering auxiliary tasks ($K_q = \mathrm{diag}(100.0)$, $K_w = \mathrm{diag}(20.0)$).
8.76  About the end effector position $x(t)$, the desired global goal $x_d$ is related to the desired local goal through a set of vectors.
C.77  A robot Specification Database flat (ASCII) file.
     Note that the link list order is from manipulator base (frame 1) to end effector (frame $N$).
C.78  A Global Goal Database flat (ASCII) file.
C.79  A Trajectory Database flat (ASCII) file.
C.80  An Obstacle Database flat (ASCII) file.
D.81  Absolute end effector step response under 'Best Gains'.
E.82  RMAC end effector trajectory tracking performance of the reference trajectory with exact manipulator parameter estimates.
E.83  RMAC (equivalent to OSF) configuration history of the reference trajectory for the reference manipulator and payload. Note that despite precise trajectory tracking, the manipulator adopts convoluted configurations, a clear indication of free motion in $N(J)$.
E.84  The Force Convergent Controller.
E.85  End effector trajectory error performance during the reference trajectory without Force Convergent Control at 120 Hz. The manipulator carries the reference payload and the controller employs PD gains of $K_p = 100$ and $K_d = 20$.
E.86  End effector trajectory tracking performance during the reference trajectory under Force Convergent Control (RMFC nominal rate of 120 Hz, actual control rate 480 Hz).
E.87  End effector trajectory performance during the reference trajectory under Force Convergent Control (RMFC nominal rate of 30 Hz, actual control rate 120 Hz).
E.88  Trajectory tracking history for the reference trajectory under Force Convergent Control (RMFC nominal rate of 30 Hz, actual control rate 120 Hz).

Acknowledgement

Entering a doctoral engineering program, a daunting task at best, would have been unthinkable without the care, thoughtful guidance, and support of my supervisor, Dr. Dale Cherchas. Dale's kindness, optimism, and vision remain a standard I can only hope to follow. The understanding, patience, and advice of the Research and Examination Committees, Dr. Vinod Modi, Dr. C. Ma, Dr. Clarence de Silva, Dr. Peter Lawrence, Dr. William Gruver, Dr. Farrokh Sassani, Dr. Jim Little, and Dr. Xi of the National Research Council supplied thought provoking questions and suggestions for which I am grateful. I am grateful, too, for the contributions of both the National Research Council and NSERC to this research. However, the Alberta Research Council deserves special recognition for their collaboration and support. In particular, I should thank Keith Crystall both for supporting the collaboration and, with Pat Feighan, suffering my intrusions in their busy schedule. Their perspective and sense of humour did much to preserve my sanity.

Above all, I must thank my better half, Simone, for her enduring patience and boundless confidence; our precocious daughter, Valerie, a source of vital distraction; and of course, Mum, Dad, Stephen and Elizabeth - providers of solid advice and often a good debate!

Like many before me, I must finally pay tribute to Isaac Asimov, the father of robotics, for his legacy:

Asimov's Three Laws of Robotics
I. (Safety) A robot may not injure a human being, or, through inaction, allow a human being to come to harm.
II. (Service) A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
III. (Prudence) A robot must protect its own existence, as long as such protection does not conflict with the First or Second Laws.
-Isaac Asimov, Runaround, Astounding Science Fiction, March 1942.

Nomenclature

In the following nomenclature section significant acronyms are listed alphabetically. This is followed by a nomenclature listing of significant symbols grouped into sections by topic. With few exceptions, this dissertation adopts the following notational conventions:
• italic roman symbols (e.g. $x$) signify scalars.
• lower case bold roman symbols (e.g. $\mathbf{x}$) signify column vectors.
• capitalized bold roman symbols (e.g. $\mathbf{X}$) signify matrix quantities.

Acronyms
APF  Artificial Potential Field (from OSF).
OSF  the Operational Space Formulation (resolved motion computed torque).
JTC  Jacobian Transpose Control.
FCC  Force Convergent Control (from RMFC).
PSH  Point Subject to Hazard (similar to PSP).
PSP  Point Subject to Potential (from OSF).
RMAC  Resolved Motion Acceleration Control.
RMDAC  Resolved Motion Decentralized Adaptive Control or Configuration Control.
RMFC  Resolved Motion Force Control.
RMOA  Resolved Motion Obstacle Avoidance.
RMPC  Resolved Motion Position Control.
RMRC  Resolved Motion Rate Control.

Manipulator Kinematics and Dynamics
$q$  the configuration space position vector.
$x$  the task space position vector.
$f(q)$  the Forward Solution, a map from configuration space to task space.
$J(q) = \partial f(q) / \partial q$  the Jacobian of the Forward Solution.
$D(q)$  the inertia matrix in configuration space coordinates.
$g(q)$  the gravitational forces in configuration space coordinates.
$C(q, \dot{q})$  coriolis, centrifugal forces in configuration space coordinates.
$M(q)$  the inertia matrix in cartesian space coordinates.
$N(q)$  the gravitational forces in cartesian space coordinates.
$P(q, \dot{q})$  coriolis, centrifugal forces in cartesian space coordinates.
$u$  link control forces.
$P(x)$  the gravitational forces in task space coordinates.
$t$  time.
$\tau$  manipulator joint torque.

Denavit Hartenberg Parameters
$\theta_i$  revolute actuator displacement about $z_i$.
$d_i$  prismatic actuator displacement along $z_i$.
$\alpha_i$  link twist.
$a_i$  link length.

The Link Model
$r_k$  gear train coefficient for the kth link.
$r$  the redundancy rate, $r = n - m$.
$d_k$  link disturbance forces.
$J_m$  link motor inertia.
$J_{eff}$  effective link inertia.
$p_i$  the position of link frame $i$ in world coordinates.
     the position of the ith link's centroid in link coordinates.
$R_i$  the orientation of link frame $i$ in world coordinates.
$\omega_i$  the angular velocity of link frame $i$ in world coordinates.

The Agent Model
$Q$  a goal space.
$r(t)$  a trajectory or goal in $Q$.
$B$  a behaviour space.
$y(t)$  a trajectory or behaviour (or response) of some dynamic system in $B$.

Agent Definitions
$A(\cdot)$  an agent.
$b$  the number of behaviour controllers in agent control system.
$G_j(\cdot)$  the jth subplant of a dynamic system.
$y_j(t)$  the behaviour (or response) of $G_j(\cdot)$.
$H_i(\cdot)$  an agent's ith controller.
$r_i(t)$  the goal of $H_i(\cdot)$.
$u_i(t)$  the output of $H_i(\cdot)$.
$r_c(t)$  a composite goal of $A$.
$u_c(t)$  the composite output of $A$.
$r_A(t) = [r_1(t) \ldots r_b(t)]^T$  a column vector of all $r_i$.
$u_A(t) = [u_1(t) \ldots u_b(t)]^T$  a column vector of all $u_i$.
$k_A(t)$  a linear arbitration vector such that $u_c(t) = k_A^T(t) u_A(t)$.

Multiagent Definitions
$N$  the number of agents in multiagent system.
$A_j(\cdot)$  the jth agent where $1 \le j \le N$.
$b_j$  the number of behaviour controllers in $A_j(\cdot)$.
$H_{ij}(\cdot)$  the ith controller of $A_j$.
$r_{ji}(t)$  the goal of $H_{ij}(\cdot)$.
$u_{ji}(t)$  the output of $H_{ij}(\cdot)$.
$r_{A_j}(t) = [r_{j1}(t) \ldots r_{jb_j}(t)]^T$  a column vector of all $r_i$ in $A_j$.
$u_{A_j}(t) = [u_{j1}(t) \ldots u_{jb_j}(t)]^T$  a column vector of all $u_i$ in $A_j$.
$r_{cj}(t)$  a composite goal of $A_j(t)$.
$r_{nA}(t) = [r_{A_1}(t) \ldots r_{A_N}(t)]^T$  the vector of controller setpoints for all $A_j(t) : 1 \le j \le N$.
$r_{cnA}(t) = [r_{c1}(t) \ldots r_{cN}(t)]^T$  the vector of the composite goals for all $A_j(t) : 1 \le j \le N$.
$u_{cj}(t)$  the composite output of $A_j(t)$.
$u_{nA}(t) = [u_{A_1}(t) \ldots u_{A_N}(t)]^T$  the vector of controller outputs for all $A_j(t) : 1 \le j \le N$.
$K_{nA}(t)$  a linear arbitration matrix such that $u_{cnA}(t) = K_{nA}(t) u_{nA}(t)$.
$u_{cnA}(t) = [u_{c1}(t) \ldots u_{cN}(t)]^T$  the vector of composite output for all $A_j(t) : 1 \le j \le N$.
$Q_j$  the goal space of $A_j(t)$.
     the image of $Q_j$ in $Q$.
$B_j$  the behaviour space of $A_j(t)$.
     the image of $B_j$ in $B$.
$G(\cdot)$  a plant composed of $n$ subplants.
$y_{nA}(t) = [y_1 \ldots y_N]^T$  a vector of the behaviours (or dynamic response) of $N$ agents.
$y(t) = f(y_{nA})$  the global or aggregate behaviour (or dynamic response) of $G(\cdot)$.
$f(y_{nA})$  e.g. $y(t) = f(y_{nA})$: $B_1 \times \ldots \times B_N \to B$, an aggregate relation.
$g(r(t))$  e.g. $r_{cnA}(t) = g(r(t))$: $Q \to Q_1 \times \ldots \times Q_N$, an inverse aggregate relation.

Cartesian Controllers
$G$  a global goal couplet.
     the basis vectors of the ith coordinate frame.
$p_{app}$  the objective position of a global goal couplet.
$x(t)$  a state in task space $\{p, R, \dot{p}, \omega\}$, often $x(t) = x_N(t)$, the end effector state.
$x_d(t)$  the desired task space position with elements $x_{di}(t) : 1 \le i \le M$.
$x_e(t) = x_d(t) - x(t)$  the task space position error vector.
$x_i = [q_i \; \dot{q}_i]^T$  the joint state.
$x = [x_N \; \dot{x}_N]^T$  the end effector state.
$f_d(t)$  the desired force at the end effector.
$f_c(t)$  the applied force at the end effector.

OSF
$U_{art}$  an artificial potential field in OSF.
$\rho$  the range to an obstacle surface.
$\rho_0$  the trigger range to an obstacle surface.
$\nabla$  the vector gradient.

RMDAC
$d(t)$  an auxiliary gain.
$C(t)$  a feedforward gravitational gain.
$B(t)$  a feedforward Coriolis gain.
$A(t)$  a feedforward Inertia gain.
$K_p(t)$  an adaptive cartesian position error gain.
$K_v(t)$  an adaptive cartesian velocity error gain.
$d_i(t)$  the ith auxiliary gain.
$c_i(t)$  the ith feedforward gravitational gain.
$b_i(t)$  the ith feedforward Coriolis gain.
$a_i(t)$  the ith feedforward Inertia gain.
$K_{pi}(t)$  the ith cartesian position error gain.
$K_{vi}(t)$  the ith cartesian velocity error gain.
$r_i(t)$  the ith component of the weighted cartesian error.
$w_{pi}$  the ith component of the position error weighting gain.
$w_{vi}$  the ith component of the velocity error weighting gain.

RMFC
$b$  a convergence coefficient for the Robbins-Monro Stochastic Estimator.
$k$  a counter for the Robbins-Monro Stochastic Estimator.

Chapter 1
Introduction

As robot control engages progressively more complex problems, centralized supervisory architectures encounter barriers to real time performance, specifically: computational complexity coupled with insufficient computing power and impoverished sensor resources. Despite startling advances in hardware and software technology and similarly surprising cost reductions, these fundamental barriers remain unchanged. This thesis investigates an alternative to centralized supervisory control of manipulation robots known as multiagent control.
Centralized supervisory robot control architectures are usually based on variants of the sense-model-plan-act (SMPA) cycle [Brooks, 1989c]. The dominance of this architecture can be traced to early AI^1 [Brooks, 1991a], in which artificial intelligence was broken into four distinct problems: sensing, modelling, planning, and action. This decomposition stemmed from the belief that cognition was a monolithic process in which the relatively simple phases of sensing and action supported complex symbolic modelling and planning engines. This 'reductionist' view of intelligent action focussed research on symbolic manipulation of the environment and task-level robot programming techniques (e.g. Stanford Research Institute's 'Shakey', in which logic and planning were of primary interest). Task-level programming describes tasks through a hierarchy of symbolic operators founded upon basic geometric relationships (i.e. homogeneous transformations, trajectories, etc.) within some coherent representation of the environment, a world model.

Within a structured, slowly varying environment like an assembly line, world modelling and task level programming methods are more than sufficient to support safe, accurate, predictable, and cost effective automation. The advantage of SMPA is that, if the environment is known a priori, so too is the robot's behaviour. The disadvantage, of course, is that unstructured environments (UE) are intrinsically unpredictable. Real time SMPA relies on a fragile chain of events: fast planning, precise modelling, and rich sensing. Since the latter tends to be expensive and/or unreliable in a UE, strictly following an SMPA model ensures that a robotic system's performance never exceeds that of its worst sensor. Such systems become input and compute bound, awaiting all relevant data prior to distilling plans and world models into a single action. In maximizing SMPA performance in a UE, computing resources can become sufficiently expensive and/or physically large that elements must be housed remotely, adding the burdens of communication delay, bandwidth limits, and tethered operations to input and compute bound performance.

^1 Brooks' original papers describe this as the Sense-Think-Act cycle.

Figure 1.1: The Sense-Model-Plan-Act (SMPA) manipulator control cycle. An example of the prototypical robot control cycle borne out of early AI research.

It is not surprising, then, that research into practical, real time robotics has diverged from the SMPA model. Research in artificial intelligence [Minsky, 1990], mobile robotics [Brooks, 1989b]-[Brooks, 1991b], [Connell, 1990], and high performance manipulation [Khatib, 1987, Colbaugh, 1989, Seraji, 1996a] in particular has challenged the reductionist model. While not entirely rejecting the utility of recognition, modelling, and compensation of environmental events, practical real time controllers are often designed to react to the world through the adoption of control arbitration networks. Using high speed sensors and minimal signal post processing, multiple inexpensive embedded computers execute one or more control strategies. Such sensor driven, distributed, control arbitration processes are increasingly referred to as agents. In agent based systems, the semantics or meaning traditionally embodied in a world model is condensed into a sense-act relationship distributed over multiple application specific computing elements. In effect, task level cognitive elements are transformed into a set of control level instinctive responses or behaviours and combined through some arbitration mechanism, a greatly simplified model-planning system.
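As a rough illustration of the arbitration idea described above, the following Python sketch combines two behaviours for a one dimensional point robot through a linear arbitration vector, following the form $u_c = k_A^T u_A$ listed in the Nomenclature. The behaviours, gains, trigger range, and function names are hypothetical and chosen only for illustration; they are not the controllers developed in this thesis.

```python
import numpy as np

def arbitrate(weights, outputs):
    """Linear arbitration: composite output u_c = k^T u_A."""
    weights = np.asarray(weights, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    return float(weights @ outputs)

def track_behaviour(position, goal, gain=2.0):
    """Hypothetical proportional controller driving the robot toward a goal."""
    return gain * (goal - position)

def avoid_behaviour(position, obstacle, gain=1.0):
    """Hypothetical reactive controller pushing the robot away from an obstacle."""
    return gain * np.sign(position - obstacle)

def avoid_active(position, obstacle, trigger_range=0.5):
    """Arbitration weight: 1 inside the (assumed) trigger range, 0 outside."""
    return 1.0 if abs(position - obstacle) < trigger_range else 0.0

# One control step: sense, evaluate each behaviour, then arbitrate.
position, goal, obstacle = 0.0, 1.0, 0.3
u_A = [track_behaviour(position, goal), avoid_behaviour(position, obstacle)]
k_A = [1.0, avoid_active(position, obstacle)]
u_c = arbitrate(k_A, u_A)
print(u_c)
```

No world model appears anywhere in this loop: each behaviour maps a sensed quantity directly to a command, and the arbitration weights decide how the commands are combined.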
It is important to note that this method seeks not to remove centralized cognitive or supervisory elements from the cycle but to distribute model based control to simpler application specific controllers and, in so doing, both reduce expense and improve autonomous performance. Proponents argue that agent architectures both simplify the design and improve the robustness of robot systems by basing behaviour on the interaction of the robot with its environment. Based on his research experience, Rodney Brooks, an outspoken critic of traditional AI, proposed a set of properties [Brooks, 1991a] he believed crucial to intelligent real time behaviour^2:

Situatedness: The world is its own best model. A robot should 'model' the world through sensing.
Embodiment: The world grounds regress. A robot should reside within a physical system. Simulated robot capabilities usually regress in the real world.
Intelligence: Intelligence is determined by the dynamics of the interaction with the world.
Emergence: Intelligence is in the eye of the beholder. Intelligence emerges from the interaction of parts of the system.

^2 Italicized remarks are quotes from Intelligence without Reason [Brooks, 1991a], perhaps his most controversial paper.

Though unproven, these properties are, nevertheless, a useful summary of this new breed of simple, capable, real time robots. An example from manipulator control clarifies the difference between SMPA and agent control.

An Example

Consider two common tasks in redundant manipulator control: end effector trajectory tracking and obstacle avoidance. At each control time step, an SMPA system uses high level machine vision and world modelling to capture and extract surrounding hazards for construction of object models. The manipulator's configuration space trajectory is determined through a centralized process such as a resolved motion acceleration controller (RMAC).
In RMAC the desired cartesian acceleration, $\ddot{x}_d(t)$, is transformed into joint torques, $u$, based on the pseudoinverse of the end effector Jacobian, $J^\dagger(x)$, a PD cartesian controller with gains $k_p$ and $k_v$, and an estimate of the manipulator dynamic model ($D(q)$, $C(q,\dot{q})$, and $g(q)$):

$$\ddot{q}_d(t) = J^\dagger\left[\ddot{x}_d(t) - \dot{J}\dot{q}(t) + k_v\dot{x}_e + k_p x_e\right] - (I - J^\dagger J)\left(K_{obs}\sum_{j} J_j^T f_{model}\right)$$

$$u = D(q)\ddot{q}_d + C(q,\dot{q})\dot{q} + g(q)$$

where $\eta$ and $n$ are the number of joints and obstacles respectively (the limits of the summation), $f_{model}$ is a model based obstacle avoidance strategy, and $J_j$ is the Jacobian of a nearest point of contact on link $j$. In this approach the obstacle avoidance control is explicitly inserted into the null space of the Jacobian, ensuring the end effector trajectory is not perturbed. To achieve obstacle free trajectory tracking, sensing and modelling must work in concert to supply the centralized controller with accurate data.

In contrast, an agent-like system employs embedded low level range sensing in each link to trigger obstacle avoidance behaviours. No attempt at obstacle mapping or representation is made. By combining obstacle avoidance and end effector trajectory tracking control behaviours, an obstacle free configuration space trajectory can be achieved. In this thesis this is achieved through an adaptive end effector trajectory controller, $f_d(x_d(t))$, and a reactive obstacle avoidance controller embedded into each link, $f_{psn}(\cdot)$. For the $j$th link:

$$u_{c_j} = \begin{bmatrix} 1 & \delta_{avoid}(t) \end{bmatrix} \begin{bmatrix} u_{track} \\ u_{avoid} \end{bmatrix}$$

$$\delta_{avoid}(t) = \begin{cases} 1 & \text{if } |x_{ik}| < c \\ 0 & \text{if } |x_{ik}| > c \end{cases}$$

where $c$ is a triggering range and $|x_{ik}|$ is the magnitude of the distance from the $k$th link to the $i$th obstacle. The tracking controller is:

$$u_{track} = J_j^T(p_N, x_{j-1})\, f_d(x_d(t))$$

with the avoidance term $u_{avoid}$ generated by the link's reactive controller $f_{psn}(\cdot)$, where $J_j^T$ is the $j$th row of a generic Jacobian Transpose or global proxy and $N$ is the number of links. In this approach, the adaptive end effector controller serves two purposes normally requiring a manipulator model.
It ensures accurate trajectory tracking and forces all additional behaviours into the null space of the Jacobian. Sensing and control are decentralized and need not be synchronous, while modelling of both manipulator and environment is drastically reduced.

1.1 Motivation

This research was motivated by the desire to understand multiagent systems and by the demand for more capable 'real world' robotic manipulation systems. The inspiration for this work, and perhaps the most famous example of agent based robotic systems, is Brooks' subsumption architecture [Brooks, 1989c]. While subsumption represents only one approach to agent design, the simplicity and effectiveness of the methodology represented a wake up call to the robotics community. Similarly, the early work of Wu and Paul [Wu, 1982] and Khatib [Khatib, 1985], and more recent research by Seraji et al. [Seraji, 1989b, Colbaugh, 1989, Glass, 1995] at JPL, acted as both foundation and guiding light for much of the new material presented here.

Despite the remarkable behaviour exhibited by agent and multiagent systems and the simplicity of their design, the theoretical foundations for these systems remain relatively shallow. While significant effort has been made to set agent control in the context of hybrid systems (e.g. Zhang and Mackworth [Zhang, 1995]), the fundamental problems of arbitrator and controller design and coupling are unresolved. Few investigators have explicitly described their design methods in control theoretic terms. Thus agent design remains inconsistent, ad hoc, and experimental.

In real time manipulation, Jacobian Transpose systems have achieved an important balance between computing resources, performance, and cost within monolithic reactive architectures. This thesis will show that these same systems represent an avenue to an area gaining popularity in mobile robotics: the use of multiagent teams to control complex systems.
A key motivation for this work was the potential impact of a genuine autonomous multiprocess manipulator control architecture on manipulator performance and capabilities. If possible, the configuration of a given manipulator need not be limited to static nonredundant serial manipulators. Indeed the arrangement and construction of the manipulator can be both arbitrary and time varying, presenting some interesting possibilities including expandable, modular manipulators. All of these pose significant barriers to traditional, centralized supervisory control (or, for that matter, single agent) systems. Thus the application of multiagent control to manipulation expands the definition of manipulation systems while providing simple, robust manipulation architectures at lower cost.

1.2 Survey of Related Research

1.2.1 Multiagent Control

The term agent originally appeared in distributed computation research (e.g. agent oriented programming [Shoham, 1993]) and decentralized artificial intelligence. In these systems, independent processes or intelligent agents act autonomously to perform tasks in the interest of a user (e.g. an electronic mail filter). As applied to robots, agent often implies a number of relatively recent techniques, including reactive control, behavioural control, subsumption, sensor actuator networks, motor schema, discrete event or hybrid systems, and others.

Robots as Agents

The concept of a robot agent has evolved from a large number of noteworthy 'agent-like' robotic systems. Perhaps the earliest demonstration of this design philosophy appears in Walter's published account [Walter, 1950a, Walter, 1950b] of the reactive mobile robots Elmer and Elsie in 1951. These entirely analog 'shoe box'^3 robots exemplified pure reactive control. With the widespread use of computers in the 1960s and 1970s, such analog methods were supplanted by model based techniques.
Perhaps the most famous of these are SRI's Shakey and the allied planner STRIPS, the top down philosophy of which has dominated research robotics. Despite early successes, the demand for real world performance has driven the evolution of robot architectures away from serial or single threaded, model driven SMPA architectures to parallel or multithreaded, reactive architectures. In 1984 Raibert and Brown implemented a hopping robot that adopted a pragmatic, sensor driven, finite state machine architecture. By interacting with the surroundings, the robot could hop in place, travel, and leap obstacles. In 1986 Brooks applied subsumption, a hierarchical behaviour arbitration system, to a mobile cart, Allen [Brooks, 1989c], and later refined it on a walking robot, Genghis [Brooks, 1989b, Brooks, 1989a]. A student of Brooks, Connell, published an extensive description of a can collecting robot, Herbert, based on a heterogeneous behaviour network [Connell, 1990]. Further experiments using this architecture include learning in Mahadevan's mobile carts [Mahadevan, 1992]. Similarly, Verschure investigated self organization and learning within a mobile cart [Verschure, 1992]. Hartley and Pipitone investigated subsumption and carrier aircraft landing systems [Hartley, 1991]. Using motor schema, or behaviours driven by a simple potential field world model, Arkin implemented a navigation system for mobile robots [Arkin, 1987]. All demonstrated that autonomous behaviour often arises from the interaction of instinctive controllers, simple arbitration networks, and the real world. The success of these techniques encouraged Brooks to make fervent arguments against traditional AI [Brooks, 1991b, Brooks, 1991a] and others to extend the technique [Jones, 1993].

^3 Walter inserts these into a mock biological species, Machina Speculatrix.
A new robot control design philosophy seems to be emerging in which parallel behaviours, some reactive and some model driven, are combined, algebraically or discretely, in real time to achieve a composite behaviour. Despite the current popularity of such approaches, a number of technical issues are unresolved or poorly defined:

i. What is a behaviour? The term is variously understood to mean an input plan, an applied control effort, or the response of a robotic system. Terminology remains a lingering problem in agent research.
ii. What is an agent? There is no formal definition of agent other than 'a behaviour arbitration process' (e.g. [Mataric, 1994]), though agent characteristics seem widely recognized.
iii. How should agent activity evolve over time? The popularity of subsumption has encouraged Kosecka and Bajcsy [Kosecka, 1994] to apply Discrete Event Systems [Ramadge, 1989] to behaviour sequence design. Similarly, Zhang and Mackworth have placed hybrid systems (i.e. systems possessing both discrete and continuous dynamics) into a formal mathematical context, Constraint Nets [Zhang, 1995], though arbitration remains unexamined. Van de Panne and Fiume determine sequencing and behaviour design in Sensor-Actuator-Networks (SAN) through genetic algorithms [van de Panne, 1992]. A formal relationship between behaviours, the environment, and arbitration strategies has not been established.
iv. What constitutes emergent behaviour? Most agree emergence is a qualitative, even subjective, property of an agent based system [Brooks, 1989c, Mataric, 1994]. Indeed, only Steels [Steels, 1991] has addressed the problem, and then only informally.

Multiagent Systems

Perhaps the first recognizable account of an artificial multiagent system appears in Walter's description of the analog robots Elmer and Elsie [Walter, 1950a, Walter, 1950b].
Significantly, Walter documents emergent or spontaneous organized behaviour in which simple controllers are combined to generate complex navigation, avoidance, and exploration behaviours. In a seminal 1987 paper, Reynolds, at Symbolics Corp., devised a multiprocess reactive method for animating flocking behaviour and obstacle avoidance in bird-like agents, boids [Reynolds, 1987]. This inspired Mataric's effort to assess agent interaction and behaviour [Mataric, 1992, Mataric, 1994] and Parker's investigation of troop movement [Parker, 1992b, Parker, 1992a]. Similarly, Tu and Grzeszczuk [Tu, 1994] designed surprisingly realistic reactive fish animations. To date all multiagent research has focussed on subjective behaviour assessment of teams (or herds) of mobile robots, leaving additional unknowns to be itemized for multiagent control:

i. What constitutes global behaviour? If agent behaviour is ill defined, so too is global multiagent behaviour. Though it is generally accepted that agent systems possess both individual behaviour and global team behaviour, there is no formal definition of collective behaviour.
ii. What is a multiagent system? Specifically, what properties of an agent colony form a recognizable, coordinated team of agents?
iii. How do agents interact? Agent interaction is the crucial mechanism that enables agent teams to accomplish group objectives. Though agent interaction is widely observed, the mechanisms of interaction remain largely unscrutinized.
iv. What is the origin of emergent multiagent behaviour? Without knowledge or consistent representation of interagent dynamics, exploration of emergent (or desirable, unplanned) group behaviour has been limited to broad characterizations such as [Steels, 1991].

1.2.2 Real-Time Manipulator Control

Though the literature on manipulator control is enormous (e.g.
[Lewis, 1993]), virtually all control methods rely on at least one of four fundamental supervisory control strategies: joint position control, resolved motion rate control, resolved motion acceleration control, and Jacobian transpose control. Traditional industrial position control, first thoroughly treated by Paul [Paul, 1981], remains the foundation of modern industrial manipulator control. Resolved motion rate control was established by Whitney [Whitney, 1969] and extended to resolved motion acceleration control by Luh, Walker, and Paul [Luh, 1980b]. The latter now forms the basis of most redundant manipulator control schemes. Achieving the desired performance from these techniques characterizes virtually all research in manipulator control, from simple PD control to linearizing feedforward, feedback, optimal, and adaptive centralized and decentralized control strategies. Another, less common, class of manipulator end effector control, Jacobian Transpose Control, was initially popularized by both Khatib [Khatib, 1985] and Wu [Wu, 1982] in the early 1980s and later promoted by Seraji at JPL [Seraji, 1989c].

1.3 Contributions of the Thesis

The primary contribution of this thesis is the development of a manipulation system that controls an $N$ d.o.f. manipulator in the execution of multiple tasks using $N$ independent processes or agents, relying neither upon a centralized manipulator parameter model nor upon an auxiliary task distribution mechanism. The models and techniques used in the course of this research leading to the achievement of this task include:

Theoretical Foundation

Agent and multiagent systems theory has little foundation in traditional control theory and, furthermore, is often obscured by inconsistent, vague terminology.
To classify and formalize both agent and multiagent architectures required the application of standard control theoretic definitions to terms and components commonly referenced in reactive, behavioural, and other agent based control schemes. The translation of these terms into a formal control theoretic context is new and facilitates precise specification, comparison, and reproduction of agent architectures.

Agent and Multiagent Architectures

With inconsistent terminology and weak theoretical foundations, a control theoretic specification of agent and multiagent architectures was not previously possible. By formally translating and defining terms and components, a formal architecture could be proposed in which an arbitrator combines a set of task specific behaviour controllers into a composite command which, applied to the plant, generates a response or behaviour. It was further shown that, to achieve a desired behaviour, an arbitrator could use explicit, implicit, and emergent arbitration to combine controllers through model based local goal assignment, broadcast global goal assignment, or autonomous local goal assignment, respectively. A formal multiagent architecture was proposed based on this agent model. A set of agents was shown to generate global behaviour through the combination of local behaviours through a binding or aggregate relation. It was also shown that inversions of this relation, governed by the Inverse Function Theorem, could be classified according to explicit, implicit, and emergent coordination strategies.

Cartesian Decentralization

To create a multiagent manipulation system composed of $N$ link agents, manipulator control must be decomposed into $N$ independent processes.
By examining the problem of end effector control it was shown that traditional inverse kinematic solution methods such as resolved motion position, rate, and acceleration controllers (RMPC, RMRC, and RMAC respectively) were fundamentally centralized systems. However, Jacobian Transpose Control (JTC) was shown to be a function solely of link and end effector coordinate frames. By assuming a distributed Newton-Euler kinematic computation structure, JTC was shown to be cartesian decentralizable and a distributed manipulation architecture was specified.

Multiagent Manipulator Control

Using $N$ autonomous link processes, this thesis demonstrated for the first time the parallel control of an $N$ d.o.f. manipulator without estimation of, or reliance upon, a centralized manipulator parameter model. In combination with a well established task space adaptive controller, the innovation of cartesian decentralization removes the robot's structural description from the goal generation system. In effect, no manipulator model of any kind is required to derive setpoints for link agents to track a desired end effector trajectory. Rather, the combination of "natural" manipulator dynamics, cartesian decentralization, and adaptive task space control is sufficient to determine the necessary control forces for end effector trajectory tracking.

Behaviours

With each link agent acting autonomously to fulfill the end effector trajectory, it was shown how individual agents may act on local goals and "assert" global goals to the agent team. In selecting a simple PD joint space position controller as a local goal generator, new manipulator configuration policies were demonstrated, including near minimal joint displacement trajectory tracking, fail safe trajectory tracking, retraction, and variable compliance behaviours. A global goal assertion protocol enabled multiagent obstacle avoidance. Though some of these behaviours are well established (e.g.
joint centering and obstacle avoidance), none have been decentralized over $N$ processes.

Multiple Goal Interaction

Auxiliary task tracking is not new to redundant manipulator control. However, these auxiliary tasks, traditionally developed and assigned to regions of configuration space through a centralized supervisory controller, were distributed over the multiagent manipulator control system without any centralized task assignment process. This study identified a number of significant properties of goal interaction that made this possible:

1. Compliance as Priority. Adaptive task space goals always suppress local compliant (PD) goal systems. Similarly, conflicting compliant systems may be biased through manipulation of the PD gains in each system.
2. Null Space Self Organization. Adaptive global goal enforcement was shown to be functionally equivalent to an explicit Jacobian null space task assignment. If local goals perturb the global goal system, adaptive global goal enforcement drives these goals automatically into the Jacobian null space.

Emergent Coordination

The combination of global goal regulation and compliant local goals results in an emergent manipulator configuration policy. By enforcing a global goal, local disturbances that propagate to the global behaviour are corrected and transmitted (as a correction) to the agent team. This local perturbation/global correction mechanism thus becomes a communication medium between agents.

1.4 Thesis Outline

To clarify agent and multiagent structures and permit an in depth dynamic analysis, chapter two introduces a standard plant by reviewing the kinematics, dynamics, and joint motor models of an arbitrary serial manipulation system. Chapter three then examines the real time control of these robots by examining the common supervisory centralized manipulator control strategies. This is contrasted with a brief overview of high performance single and multirobot real time supervisory controllers.
Relating these agent-like controllers to traditional control theory, chapter four develops a theoretical foundation and introduces structural components to provide a consistent understructure for agent based control. The resulting agent and multiagent models are then applied to manipulator control and candidate global goals are considered. Chapter five documents the design and structure of a manipulator simulation platform, the Multiprocess Manipulator Simulator. Chapter six presents a detailed exploration of a global goal implementation and is followed in chapter seven by results that document several global goal and combined local and global goal systems and identify a mechanism for apparent emergent behaviour in combined systems. Chapter eight offers a comparison between centralized model based manipulator control and multiagent control, citing the advantages of simplicity, extensibility, and freedom from a manipulator parameter model, and exploring the stability of multiple goal systems. Chapter nine summarizes the conclusions of the research and points out new directions possible through the techniques developed.

Chapter 2

The Manipulator Model

2.1 Introduction

While producing capable and robust robots, the investigation of agent and multiagent systems has been hampered by unique, underdocumented robot platforms, poorly understood agent and interagent dynamics, and inconsistent, confusing terminology. Despite the success of these systems, the variety of robots upon which agent and multiagent techniques are demonstrated greatly complicates the independent corroboration of results and the exploration of intra- and interagent dynamics. Terminology, hopefully clarified in chapter 4, further obscures this otherwise promising area of investigation. Therefore, this chapter introduces the dynamic structure of a well established 'benchmark' robotic platform, a serial manipulator, to which multiagent control will be applied in subsequent chapters.
Compared to most agent or multiagent systems, typically one or more mobile robots, the selection of a manipulator robotic platform may seem unusual. After all, manipulator control is well established and few manipulators operate in time varying unstructured environments. These observations are certainly true. The kinematics and dynamics of manipulators are well understood, with an overwhelming selection of capable industrial manipulator controllers. Yet the same problems that once limited mobile robots to cautious laboratory excursions continue to limit manipulation systems to slowly changing, structured environment applications. As discussed earlier, reliable motion in a changing, unknown environment requires close integration of real time sensing, modelling, planning, and control components. However, in assuming that configuration space positions, velocities, or forces are specified externally, the great majority of industrial manipulator control systems surrender sensing, modelling, and planning to other systems. In short, the attraction of a serial manipulator model as a foundation for multiagent control is threefold:

i. Manipulator tasks are well formed, i.e. the movement of the end effector along a prescribed, collision free, task space trajectory.
ii. Kinematics and dynamics of serial kinematic chains are well understood.
iii. SMPA manipulator real time controllers for unstructured environments are complex, compute intensive, and scale poorly.

Manipulator trajectory tracking stands as an effective benchmark against which controllers of all kinds have been tested. By applying new controllers to this benchmark, results can be placed within a familiar, control theoretic context and compared against established methods. So the selection of a manipulator model as a basis for multiagent control can be justified as a well defined control problem that will profit from such methods.
This chapter will, therefore, briefly review the kinematics and dynamics of manipulators and introduce the joint motor model.

2.2 The Manipulator Model

The 'atomic' unit of the serial manipulator is the link which, depending on the research objective, can have many characteristics such as flexible motor shafts and/or elastic material properties. For simplicity, this study will treat a link as a rigid body connecting two single degree of freedom revolute (rotating) or prismatic (linear) joints driven by some joint motor system. The manipulator model describes the dynamic characteristics of a kinematic chain of link/motor systems [Fu, 1987, de Silva, 1989].

2.2.1 The Kinematic Model

A manipulator is a serial kinematic chain composed of an arbitrary number of links driven by either prismatic or revolute joints. The base of the manipulator is fully constrained or fixed while the end effector is unconstrained or free. A manipulator with $N$ such links may adopt a configuration described by a joint displacement vector $q \in Q \subset \mathbb{R}^N$ composed of either angular or linear displacements, $\theta_i$ or $d_i$ respectively. The position of the link in cartesian space is described by a homogeneous transformation from an arbitrary world coordinate system to a link frame. A homogeneous transformation, $A$, is composed of a rotation matrix $R \in SO(3)$, a position vector $p \in \mathbb{R}^3$, and a scaling vector $s \in \mathbb{R}^4$ arranged in the following $4 \times 4$ matrix:

$$A = \begin{bmatrix} R & p \\ \multicolumn{2}{c}{s^T} \end{bmatrix} \tag{2.1}$$

Since the position of the $i$th link frame is a function of superior link frame positions, or those between the base link, 0, and the link $i$, $0 < i \le N$, it is convenient to employ relative transformations between adjacent link frames, $A_i^{i-1}$.

Figure 2.2: A 6 degree of freedom elbow manipulator. Note coordinate systems conform to Denavit Hartenberg conventions.
The transformation process is further simplified through the adoption of the widely used Denavit Hartenberg homogeneous transformation [Denavit, 1955], characterizing both revolute and prismatic joints with the standardized matrix expression:

$$A_i^{i-1} = \begin{bmatrix} \cos(\theta_i) & -\cos(\alpha_i)\sin(\theta_i) & \sin(\alpha_i)\sin(\theta_i) & a_i\cos(\theta_i) \\ \sin(\theta_i) & \cos(\alpha_i)\cos(\theta_i) & -\sin(\alpha_i)\cos(\theta_i) & a_i\sin(\theta_i) \\ 0 & \sin(\alpha_i) & \cos(\alpha_i) & d_i \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{2.2}$$

given revolute displacements, $\theta_i$, or prismatic displacements, $d_i$, and the relative rotation, $\alpha_i$, about the $x$ axis between adjoining frames. Conventions for the placement of the link frame vary, but most apply the link frame at the distal (inferior) joint. This differs from the convention in dynamic analysis, which places the frame at the link centroid.

The Forward Solution

Given a fixed world coordinate system at the base of the manipulator, the world position of the $i$th frame in the manipulator is the product of all the transformations from frame 0, the base coordinate frame, to the link:

$$A_i = \prod_{j=1}^{i} A_j^{j-1}(q_j) \tag{2.3}$$

With a common abuse of notation, the world position and orientation of any point is presented as a vector in $\mathbb{R}^6$, though it is clearly a $4 \times 4$ matrix in the above expression. The world position of the end effector, $x$, may be computed through the forward solution, $f(q)$, a function of the joint displacements, $q$:

$$x = f(q) = \prod_{i=1}^{N} A_i^{i-1}(q_i) \tag{2.4}$$

Thus the forward solution may be characterized as a map between configuration space, $Q \subset \mathbb{R}^N$, and task space, $X \subset \mathbb{R}^M$, or $f(q): Q \to X$.

The Manipulator Jacobian

The end effector velocity vector, $\dot{x}$, may be computed through the chain rule:

$$\dot{x} = \frac{\partial f(q)}{\partial q}\dot{q} = J(q)\dot{q} \tag{2.6}$$

where $J(q)$ is the Jacobian of the forward solution.
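The forward solution and its Jacobian can be illustrated numerically. The sketch below is a minimal Python example, assuming a hypothetical planar chain of revolute joints with unit link lengths, zero link offsets, and zero twist angles; it builds the Denavit Hartenberg matrix of equation (2.2), multiplies the relative transformations as in equation (2.4), and approximates the position part of the Jacobian by central differences. The function names and the finite difference approach are illustrative choices, not the method used in this thesis.

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Denavit Hartenberg relative transformation between adjacent link frames."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -ca * st,  sa * st, a * ct],
        [st,  ca * ct, -sa * ct, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_solution(q, link_lengths):
    """Product of relative transformations from the base frame to the end effector.
    Assumes a planar revolute chain (d = 0, alpha = 0 for every link)."""
    A = np.eye(4)
    for theta, a in zip(q, link_lengths):
        A = A @ dh_transform(theta, 0.0, a, 0.0)
    return A

def end_effector_position(q, link_lengths):
    """World position of the end effector (last column of the transformation)."""
    return forward_solution(q, link_lengths)[:3, 3]

def numerical_position_jacobian(q, link_lengths, eps=1e-6):
    """Central difference approximation of the position Jacobian (3 x N)."""
    q = np.asarray(q, dtype=float)
    J = np.zeros((3, q.size))
    for i in range(q.size):
        dq = np.zeros_like(q)
        dq[i] = eps
        forward = end_effector_position(q + dq, link_lengths)
        backward = end_effector_position(q - dq, link_lengths)
        J[:, i] = (forward - backward) / (2.0 * eps)
    return J

q = np.array([0.0, np.pi / 2.0])
lengths = [1.0, 1.0]
print(end_effector_position(q, lengths))
print(numerical_position_jacobian(q, lengths))
```

For this two link configuration the end effector position should be close to (1, 1, 0), and the end effector velocity for any joint velocity follows from multiplying the printed Jacobian by that joint velocity vector.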
However, the rate of change of orientation (a matrix) complicates this expression, which is better expressed through a position subJacobian $J_p$ and an orientation subJacobian $J_o$ [Lewis, 1993]:

$$\dot{x} = \begin{bmatrix} J_p(q) \\ J_o(q) \end{bmatrix}\dot{q} \tag{2.8}$$

$$J_o(q) = \begin{bmatrix} \kappa_1 z_0 & \ldots & \kappa_N z_{N-1} \end{bmatrix} \tag{2.10}$$

where $\kappa_i$ is a selection operator (1 for revolute, 0 for prismatic) and $z_{i-1}$ is the axis of rotation for the $i$th link in world coordinates.

A still more useful expression reveals the column-wise structure of a generic Jacobian [Spong, 1989]:

$$J_i = \begin{cases} \begin{bmatrix} k_{i-1} \times (p_N - p_{i-1}) \\ k_{i-1} \end{bmatrix} & \text{if revolute} \\[2ex] \begin{bmatrix} k_{i-1} \\ 0 \end{bmatrix} & \text{if prismatic} \end{cases} \tag{2.11}$$

Just as in the forward solution, the Jacobian is a map $J(q): \dot{Q} \to \dot{X}$. By reapplying the chain rule, the end effector acceleration, $\ddot{x}$, may be determined:

$$\ddot{x}(t) = J(q)\ddot{q} + \dot{J}(q)\dot{q} \tag{2.12}$$

2.2.2 Manipulator Inertia Matrix

The link inertia matrix, $I_i$, for link $i$ in inertial coordinates is a $3 \times 3$ diagonal matrix of the constant principal inertias $I_{x_i}$, $I_{y_i}$, and $I_{z_i}$ of the $i$th link about the centroid. Assuming the principal axes are aligned with the coordinate frame and the Denavit-Hartenberg conventions are followed, $I_{z_i}$ is related to $J_{eff_i}$, the effective moment of inertia about the actuator axis, through the parallel axis theorem, while $I_{x_i}$ and $I_{y_i}$ are a function of link geometry. The inertia matrix of the $i$th link about its centroid in world coordinates, $D_i$, is therefore:

$$D_i = \begin{bmatrix} m_i \mathbf{1}_{3 \times 3} & 0 \\ 0 & R_i I_i R_i^T \end{bmatrix} \tag{2.13}$$

where $m_i$ is the mass of the link and $R_i$ is the orientation of the link frame with respect to the world coordinate system. The manipulator inertia matrix, $D(q)$, is an expression of the inertia of the kinematic chain expressed in world coordinates. The matrix is a function of the masses and the link inertias, $D_i$, of the $N$ links. From the principle of conservation of energy:

$$\dot{q}^T D(q)\dot{q} = \sum_{i=1}^{N} \dot{x}_{c_i}^T D_i \dot{x}_{c_i} \tag{2.14}$$

$$= \sum_{i=1}^{N} \dot{q}^T J_{c_i}^T D_i J_{c_i} \dot{q} \tag{2.15}$$

$$D(q) = \sum_{i=1}^{N} J_{c_i}^T D_i J_{c_i} \tag{2.16}$$

where $J_{c_i}$
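is the $6 \times N$ Jacobian of the ith centroid, the subscript $c_i$ indicates the centroid of the ith link, and $D_i$ is the ith inertia matrix in world coordinates.

2.2.3 Manipulator Dynamics

As stated earlier, the manipulator dynamics equations of motion are usually formed through either Newton-Euler methods, a recursive link by link application of Newton's Laws, or Lagrange methods, in which potential and kinetic energy is conserved within the kinematic chain. Developing the manipulator Lagrangian equations [Spong, 1989, Goldstein, 1981], the Lagrangian, $L$, is:

\begin{align}
L &= K - V \tag{2.17} \\
&= \tfrac{1}{2}\,\dot{q}^T D(q)\,\dot{q} - V(q) \tag{2.18} \\
&= \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij}(q)\,\dot{q}_i \dot{q}_j - V(q) \tag{2.19}
\end{align}

where $K$ is kinetic energy, $V(q)$ is potential energy, and $d_{ij}$ is the ijth element of $D(q)$. The Lagrangian equations are:

$$ \tau_k = \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} - \frac{\partial L}{\partial q_k} \tag{2.20} $$

Computing the components of this equation:

\begin{align}
\frac{\partial L}{\partial \dot{q}_k} &= \sum_{j=1}^{n} d_{kj}\,\dot{q}_j \tag{2.21} \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} &= \sum_{j=1}^{n} d_{kj}\,\ddot{q}_j + \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial d_{kj}}{\partial q_i}\,\dot{q}_i \dot{q}_j \tag{2.22} \\
\frac{\partial L}{\partial q_k} &= \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial d_{ij}}{\partial q_k}\,\dot{q}_i \dot{q}_j - \frac{\partial V(q)}{\partial q_k} \tag{2.23}
\end{align}

Thus the LE equations can be rewritten:

\begin{align}
\tau_k &= \sum_{j=1}^{n} d_{kj}\,\ddot{q}_j + \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial d_{kj}}{\partial q_i}\,\dot{q}_i \dot{q}_j - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial d_{ij}}{\partial q_k}\,\dot{q}_i \dot{q}_j + \frac{\partial V(q)}{\partial q_k} \tag{2.24} \\
&= \sum_{j=1}^{n} d_{kj}\,\ddot{q}_j + \sum_{i=1}^{n}\sum_{j=1}^{n} \left[ \frac{\partial d_{kj}}{\partial q_i} - \frac{1}{2}\frac{\partial d_{ij}}{\partial q_k} \right] \dot{q}_i \dot{q}_j + \frac{\partial V(q)}{\partial q_k} \tag{2.25}
\end{align}

realizing that, through reorganization, the term $\sum_{i}\sum_{j} \frac{\partial d_{kj}}{\partial q_i}\dot{q}_i\dot{q}_j$ becomes

$$ \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \left[ \frac{\partial d_{kj}}{\partial q_i} + \frac{\partial d_{ki}}{\partial q_j} \right] \dot{q}_i \dot{q}_j \tag{2.26} $$

which, when substituted into the second term of (2.25), becomes:

$$ \sum_{i=1}^{n}\sum_{j=1}^{n} \left[ \frac{\partial d_{kj}}{\partial q_i} - \frac{1}{2}\frac{\partial d_{ij}}{\partial q_k} \right] \dot{q}_i \dot{q}_j = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \left[ \frac{\partial d_{kj}}{\partial q_i} + \frac{\partial d_{ki}}{\partial q_j} - \frac{\partial d_{ij}}{\partial q_k} \right] \dot{q}_i \dot{q}_j \tag{2.27} $$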
This term produces the Christoffel symbols:

$$ c_{ijk}(q) = \frac{1}{2}\left[ \frac{\partial d_{kj}}{\partial q_i} + \frac{\partial d_{ki}}{\partial q_j} - \frac{\partial d_{ij}}{\partial q_k} \right] \tag{2.28} $$

The last term of equation (2.25), the change in potential energy, is often expressed as:

$$ \phi_k(q) = \frac{\partial V(q)}{\partial q_k} \tag{2.29} $$

the LE equations finally becoming:

$$ \tau_k = \sum_{j=1}^{n} d_{kj}\,\ddot{q}_j + \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ijk}(q)\,\dot{q}_i \dot{q}_j + \phi_k(q) \tag{2.30} $$

where $\tau \in Q$. In a gravitational potential field $\phi_k(q) = g_k(q)$. However, if the end effector applies an external force, $f_{ext} \in X$, then the work done is:

\begin{align}
V(q) &= \delta x^T f_{ext} \tag{2.31} \\
&= \delta q^T J^T(q)\, f_{ext} \notag
\end{align}

and the $\phi_k(q)$ attributed to this work is:

\begin{align}
\phi_k(q) &= \frac{\partial V(q)}{\partial q_k} \tag{2.32} \\
&= J_k^T(q)\, f_{ext} \tag{2.33}
\end{align}

where $J_k^T(q)$ is the kth row of the Jacobian transpose. The manipulator joint forces expressed in matrix form are:

$$ \tau = D(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + g(q) + J^T f_{ext} \tag{2.34} $$

where the $k,j$ element of $C(q,\dot{q})$ is

$$ c_{kj} = \sum_{i=1}^{n} \frac{1}{2}\left[ \frac{\partial d_{kj}}{\partial q_i} + \frac{\partial d_{ki}}{\partial q_j} - \frac{\partial d_{ij}}{\partial q_k} \right] \dot{q}_i \tag{2.35} $$

Through appropriate control of joint forces, $\tau$, a manipulator can execute a desired task at the end effector.

2.2.4 Properties of the Manipulator Jacobian

Since the properties of the Jacobian are frequently cited throughout this text, it is important to establish a clear understanding of the concepts surrounding Jacobian null and range spaces. Referring to figure 2.3, recall that the instantaneous domain of the configuration space is the volume that bounds all possible joint velocities, $\dot{q} \in Q$, within a particular configuration, $q \in Q$.

Figure 2.3: The Range and Null spaces of the Jacobian and Jacobian Transpose.

The range of the Jacobian, $R(J)$, is the space occupied by all end effector velocities $\dot{x} \in X$ produced through joint motion. The null space of the Jacobian, $N(J)$, is the space occupied by all joint velocities $\dot{q}$ for which no end effector motion results. Conversely, the range of the Jacobian transpose, $R(J^T)$, is the space within which all joint forces must be applied to balance an end effector force.
Similarly, the null space of the Jacobian transpose, $N(J^T)$, is the region of configuration space for which no joint forces are required to balance an end effector force. Now the duality [Asada, 1986] of manipulator kinematics and statics means that $R(J^T)$ and $N(J)$ are orthogonal complements, or $R(J^T) \oplus N(J) = Q$.

2.2.5 The Motor Model

Each link is driven by either a linear or revolute powered mechanism. A popular model for direct drive joint motors is the armature controlled dc servomotor [Fu, 1987]. In this model, the servomotor dynamics are described by:

$$ \tau_{m_k} = J_{m_k}\,\ddot{q}_{m_k} + B_{m_k}\,\dot{q}_{m_k} \tag{2.36} $$

where $J_{m_k}$ is the motor shaft inertia and $B_{m_k}$ is a viscous friction coefficient. An identical expression governs the link itself:

$$ \tau_{l_k} = J_{l_k}\,\ddot{q}_{l_k} + B_{l_k}\,\dot{q}_{l_k} \tag{2.37} $$

The torque required to drive the motor-link system is simply the sum:

$$ \tau_k = \tau_{m_k} + \tau^{*}_{l_k} \tag{2.38} $$

where $\tau^{*}_{l_k}$ is the torque applied by the link via the gear train on the motor shaft. Given the gear train ratio:

$$ r_k = \frac{N_m}{N_l} \tag{2.39} $$

where $N_m$ and $N_l$ are (for example) the number of teeth on each gear (see figure 2.4a), $\tau^{*}_{l_k}$ can be rewritten in terms of $q_{m_k}$, producing the expression:

$$ \tau^{*}_{l_k} = r_k^2 \left( J_{l_k}\,\ddot{q}_{m_k} + B_{l_k}\,\dot{q}_{m_k} \right) \tag{2.40} $$

For the total torque:

\begin{align}
\tau_k &= (J_{m_k} + r_k^2 J_{l_k})\,\ddot{q}_{m_k} + (B_{m_k} + r_k^2 B_{l_k})\,\dot{q}_{m_k} \tag{2.41} \\
&= J_{\mathrm{eff}_k}\,\ddot{q}_{m_k} + B_{\mathrm{eff}_k}\,\dot{q}_{m_k} \tag{2.42}
\end{align}

Small gear ratios $r_k$ tend to isolate the motor from the kth link's dynamics.

Now it is common to assume that the motor torque is directly proportional to armature current, $i_{a_k}$:

$$ \tau_k = K_{a_k}\, i_{a_k} \tag{2.43} $$

Applying Kirchhoff's law to the motor circuit in figure 2.4a:

$$ V_{a_k}(t) = R_{a_k}\, i_{a_k}(t) + L_{a_k} \frac{d i_{a_k}(t)}{dt} + e_{b_k}(t) \tag{2.44} $$

The back electromotive force, $e_{b_k}(t)$, is proportional to the motor's angular shaft velocity:

$$ e_{b_k}(t) = K_{b_k}\,\dot{q}_{m_k} \tag{2.45} $$

Taking the Laplace transform of equations (2.42), (2.43), (2.44), and (2.45), a transfer function relating shaft position to motor voltage can be developed [Fu, 1987]:

$$ \frac{Q_{m_k}(s)}{V_{a_k}(s)} = \frac{K_{a_k}}{s\left[ s^2 J_{\mathrm{eff}_k} L_{a_k} + (L_{a_k} B_{\mathrm{eff}_k} + R_{a_k} J_{\mathrm{eff}_k})\,s + R_{a_k} B_{\mathrm{eff}_k} + K_{a_k} K_{b_k} \right]} \tag{2.46} $$

Figure 2.4: An armature controlled dc drive joint motor and gear train model.

Since the mechanical time constants are generally far larger than the electrical time constants, the motor inductance $L_{a_k}$ is usually ignored. Thus the servo control block diagram in figure 2.4b is usually reduced to the transfer function [Fu, 1987]:

$$ \frac{Q_{m_k}(s)}{V_{a_k}(s)} = \frac{K_k}{s^2 J_{\mathrm{eff}_k} + s\,(B_{\mathrm{eff}_k} + K_k K_{b_k})} \tag{2.47} $$

where $K_k = K_{a_k}/R_{a_k}$.

Within a manipulator, the link does not act in isolation. Unmodelled disturbances $d_k$ include off-axis inertial and Coriolis/centrifugal forces (i.e. those imposed by other links on link $k$) arising from the manipulator dynamics:

$$ d_k = \sum_{j \neq k} d_{kj}\,\ddot{q}_j + \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ijk}\,\dot{q}_i \dot{q}_j + \phi_k(q) \tag{2.48} $$

With these link disturbances included, the link motor dynamic equation becomes:

$$ J_{\mathrm{eff}_k}\,\ddot{q}_{m_k} + (B_{\mathrm{eff}_k} + K_k K_{b_k})\,\dot{q}_{m_k} = K_k u_k - r_k d_k(t) \tag{2.49} $$

where $u_k = V_{a_k}(t)$, the command voltage. The state space equation for the kth link is:

$$ \dot{\mathbf{x}}_k = \mathbf{A}_k \mathbf{x}_k + \mathbf{b}_k U_k + \mathbf{d}_k v_k \tag{2.50} $$

with

$$ \mathbf{x}_k = \begin{bmatrix} q_{m_k} \\ \dot{q}_{m_k} \end{bmatrix}, \qquad \mathbf{A}_k = \begin{bmatrix} 0 & 1 \\ 0 & -J_{\mathrm{eff}_k}^{-1}(B_{\mathrm{eff}_k} + K_k K_{b_k}) \end{bmatrix} \tag{2.51} $$

$$ \mathbf{b}_k = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad U_k = J_{\mathrm{eff}_k}^{-1} K_k u_k \tag{2.52} $$

$$ \mathbf{d}_k = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad v_k = -J_{\mathrm{eff}_k}^{-1} r_k d_k \tag{2.53} $$

It is not surprising that, given the relative simplicity of link servo control and the dynamic regime of industrial robots, most manipulator controllers ignore link dynamics entirely. Nonlinear coupling and manipulator inertia are significant only during high velocity or acceleration, commonly found in low precision maneuvers in industrial applications. High precision maneuvers are generally performed at low velocity and with minimal acceleration. Therefore, pure PD control without dynamic compensation (the cancellation of $d_k$) is adequate for the majority of industrial manipulator tasks. However, high performance manipulation (e.g.
high speed, precise positioning), redundant manipulation, and unstructured environments (UEs) challenge such joint space robot control methods, as explained in the following chapter.

Chapter 3

Real Time Robot Control

3.1 Introduction

This chapter explores the relationship between the inverse problem, the specification of robot action based on a desired robot behaviour, and real time control. In so doing, it argues that real time robot controller design and robotic system design are closely related problems. Though directed at manipulation robotics in particular, much of this chapter is applicable to robotics in general.

As mentioned in the last chapter, the primary objective of most manipulation systems is to track an end effector trajectory in task space. Since control of the end effector is performed indirectly through the control of joint motor controllers, a desired task space trajectory must first be transformed into configuration space coordinates before it can be realized at the end effector. This transformation must invert the relationship between configuration space and task space, a process complicated by the nonlinear structure of the forward kinematic solution.

For typical manipulators, the transformation between task and configuration spaces is usually a geometric inversion of the forward kinematic solution, the inverse kinematic solution, and is often considered separate from the control problem. However, if the number of degrees of freedom available to the manipulator, $N$, exceeds the number required to accomplish the task, $M$, the manipulator becomes redundant, capable of additional simultaneous tasks. Unfortunately, as the degree of redundancy, $N - M$, grows, geometric inverse solution strategies become increasingly complex. In general, the solution embeds the inverse solution within the manipulator controller.
To explore these issues further, this chapter adopts a task formalism to discuss inversion strategies and show that, in general, model based inverse solution methods influence the structure of redundant manipulation systems. Finally, a discussion of general robot control architectures will show that idealized, single threaded, SMPA robot control is rarely used in practical real time systems and that robot architectures are evolving towards pragmatic multithreaded reactive designs: an agent architecture.

3.2 Task Functions

The colloquial definition of the term task, much like the term goal, is simply a desired action. However, in robotics, tasks are more precisely defined in terms of one or more trajectories in some task space that, with the application of appropriate tools, accomplish some change in the environment. Typically, tasks reside in task space, $\mathbb{R}^M$, often (but not limited to) a subset of cartesian space, $\mathbb{R}^6$. Though a number of task specification techniques exist, the objective of task generation is the construction of a desired task space trajectory, $x_d(t)$, and velocity, $\dot{x}_d(t)$.

Samson et al. [Samson, 1991] further formalize this specification within Task Functions. Briefly, a task function is the error between the desired task space coordinate and the forward solution over a time interval $[0, T]$, or:

$$ x_e(q,t) = x_d(t) - f(q) \tag{3.54} $$

Samson et al. further define the feasibility of a task by:

$$ x_e(q,t) = 0 \quad \forall\, t \in [0, T] \tag{3.55} $$

By recognizing that the specification of configuration space positions based on a task space coordinate requires the inversion of the forward solution, Samson applies the inverse function theorem [Vidyasagar, 1993] to characterize inverse solutions. Reiterating the inverse function theorem:

Theorem 1 (Inverse Function Theorem) Suppose $f : \mathbb{R}^n \to \mathbb{R}^n$ is $C^1$ at $x_0 \in \mathbb{R}^n$ and let $y_0 = f(x_0)$. Suppose $\left.\frac{\partial f}{\partial x}\right|_{x = x_0}$ is nonsingular.
Then there exist open sets $U \subset \mathbb{R}^n$ containing $x_0$ and $V \subset \mathbb{R}^n$ containing $y_0$ such that $f$ is a diffeomorphism [1] of $U$ onto $V$. If, in addition, $f$ is smooth and $f^{-1}$ is smooth, then $f$ is a smooth diffeomorphism.

This theorem was then used by Samson to describe three factors that influence task feasibility:

i A unique solution exists. If $\partial x_e(q,t)/\partial q$ is invertible [2], a position $x_d$ can be mapped to a unique position $q_d$. This task is feasible, since a unique inverse kinematic solution exists that will map $x_d$ onto $q_d$.

ii Infinite solutions exist. If $\partial x_e(q,t)/\partial q$ is not invertible but onto [3], a position $x_d$ can be mapped to a region of dimension $\mathrm{corank}(\partial x_e(q,t)/\partial q)$. In effect, an infinite number of positions $q_d$ exist. The manipulator has fewer constraints, $m$, than degrees of freedom, $n$. The manipulator is said to possess $r = n - m$ redundant degrees of freedom. This case is feasible if the redundant degrees of freedom are somehow constrained to determine a unique solution.

iii No solution exists. If $\partial x_e(q,t)/\partial q$ is neither invertible nor onto, the task is singular and, by the Inverse Function Theorem, no inverse, $g(x_d(t))$, exists.

So task feasibility is a measure of whether an inverse solution exists. If a task is feasible and is either invertible or simply onto, some form of inverse kinematic solution $g : X \to Q$ can be constructed that will map desired task space coordinates into configuration space, or

$$ q_d(t) = g(x_d(t)) \tag{3.56} $$

The next section will examine four methods of inverse solution.

[1] A differentiable mapping with a differentiable inverse.
[2] or injective: For every member in the range there is a unique member in the domain.
[3] or surjective. Define the codomain of a function as the simply connected region that bounds the values of the function, and the set of values itself as the range of the function. Then the surjective function, $\partial x_e/\partial q$, associates two sets, $X_e$ and $Q$, such that every member, $x_e$, of the codomain, $X_e$, is the image of at least one member, $q$, of the domain, $Q$, though there may be members of the domain that are not mapped to the codomain (i.e. some values of $q$ may not map to any $x_e$). Thus the range of the surjective mapping, $R(\partial x_e/\partial q)$, is the entire codomain, $X_e$.

3.3 Manipulator Control and the Inverse Solution

Given the feasibility of a task, four basic inversion methods are used to map desired task space coordinates into configuration space through three classes of inverse function: geometric inverse, integrable inverse, and projection inverse solutions. Geometric solutions map task space positions directly to configuration space positions and here will be referred to as Resolved Motion Position Control. Integrable solutions include Resolved Motion Rate and Acceleration Control, both of which map task space to configuration space derivatives of position. Finally, Jacobian Transpose Control projects task forces onto configuration space generalized force axes.

3.3.1 Resolved Motion Position Control

The most common form of inverse kinematic solution is the inverse geometric solution. The analytical inverse map:

$$ q_d(t) = g(x_d(t)) \tag{3.57} $$

is derived off-line, enabling fast on-line determination of $q_d$. However, as the forward solution is usually a nonlinear product of transcendental functions, this off-line inversion process requires both mathematical and geometric insight to rapidly achieve an efficient solution [Paul, 1981, Fu, 1987] and is unique to a particular manipulator design. Indeed, inverse solution functions are rarely simple mathematical procedures and usually include considerable flow control logic to achieve physically realizable solutions.
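To make the role of flow control logic concrete, the following is a minimal sketch of an analytical inverse for a planar 2R arm. The arm, its unit link lengths, and the names `forward_2r`, `inverse_2r`, and `flip_elbow` are assumptions introduced for this illustration, not a solution taken from the cited references.

```python
import math

def forward_2r(q1, q2, l1=1.0, l2=1.0):
    """Forward solution of a hypothetical planar 2R arm."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y

def inverse_2r(x, y, l1=1.0, l2=1.0, flip_elbow=False):
    """Geometric inverse; returns (q1, q2), or None when the target is unreachable."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0:
        return None  # outside the annular workspace: no physically realizable solution
    s2 = math.sqrt(1.0 - c2 * c2)
    if flip_elbow:
        s2 = -s2  # select the second of the two solution branches
    q2 = math.atan2(s2, c2)
    q1 = math.atan2(y, x) - math.atan2(l2 * s2, l1 + l2 * c2)
    return q1, q2
```

Even in this simplest case the function needs a reachability test and a branch selection rule, and both grow quickly with the complexity of the manipulator geometry.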
Given the complexity of manipulator geometry, errors in inverse kinematic solutions are common and thorough testing of these solutions throughout the workspace is mandatory prior to installation.

Once established, the desired configuration space vector, $q_d$, is passed to the manipulator's joint control processes. The majority of manipulator control techniques rely on such setpoint specification. In particular, industrial manipulator control often applies basic PD control to the setpoints $q_{d_k}$:

$$ u_k = k_q\, q_{e_k} + k_\omega\, \dot{q}_{e_k} \tag{3.58} $$

where $k_q$ and $k_\omega$ are position and velocity gains, respectively, and $q_{e_k} = q_{d_k} - q_{m_k}$ is the joint error for the kth link.

Redundancy Resolution

As mentioned earlier, there exist tasks for which either an exact inverse kinematic solution exists, an infinite number of solutions exist, or no solutions exist. When an exact inverse kinematic solution exists, the manipulator is fully constrained and any inverse kinematic solution method will produce unique solutions. However, most realistic end effector trajectories enter regions of task space for which the manipulator becomes redundant, possessing degrees of freedom in excess of those required to accomplish the task. Technically, joint space becomes divided between two regions, the range and the null space of the manipulator Jacobian, $R(J)$ and $N(J)$ respectively.

A manipulator becomes redundant when an infinite number of solutions form a dense set in joint space [Samson, 1991]. While a 2R planar manipulator may adopt one of two configurations for any point in planar cartesian space, it is not possible to move on a continuous path in configuration space from one solution to the other while maintaining the desired end effector position. Thus the solution space for this manipulator is bifurcated but not redundant. In contrast, a 3R planar manipulator can adopt a continuous path through a subspace of $Q$ while maintaining a planar end effector position.
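The continuous solution path of the 3R planar arm can be illustrated numerically: a small joint displacement along the null space of the $2 \times 3$ position Jacobian leaves the end effector position unchanged to first order. The link lengths, function names, and the cross product construction of the null vector are assumptions of this sketch.

```python
import math

LINKS = (1.0, 1.0, 1.0)  # hypothetical link lengths

def forward_3r(q):
    """Planar end effector position of a 3R arm."""
    x = y = angle = 0.0
    for qi, li in zip(q, LINKS):
        angle += qi
        x += li * math.cos(angle)
        y += li * math.sin(angle)
    return x, y

def jacobian_3r(q):
    """2 x 3 position Jacobian of the planar 3R arm."""
    phis = [sum(q[:i + 1]) for i in range(3)]  # absolute link angles
    J = [[0.0] * 3, [0.0] * 3]
    for j in range(3):
        for i in range(j, 3):  # joint j moves link j and every link after it
            J[0][j] -= LINKS[i] * math.sin(phis[i])
            J[1][j] += LINKS[i] * math.cos(phis[i])
    return J

def null_vector(J):
    """Unit vector spanning N(J): the cross product of the two rows.

    Assumes J has full row rank; at a singularity the norm vanishes.
    """
    a, b = J
    n = [a[1] * b[2] - a[2] * b[1],
         a[2] * b[0] - a[0] * b[2],
         a[0] * b[1] - a[1] * b[0]]
    norm = math.sqrt(sum(v * v for v in n))
    return [v / norm for v in n]

def self_motion_step(q, step=1e-3):
    """Move the joints a small distance along the null space direction."""
    n = null_vector(jacobian_3r(q))
    return [qi + step * ni for qi, ni in zip(q, n)]
```

Repeating `self_motion_step` traces an approximate self-motion of the arm; a practical scheme would also correct the small second order drift of the end effector at each step.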
While these 'extra' degrees of freedom permit a wide selection of configurations and therefore flexibility, determining inverse solutions for such systems can be complex.

The simple answer to finding a unique inverse solution for an underconstrained manipulator is to add constraints to $N(J)$. By fully constraining a redundant manipulator through heuristics or task space augmentation, a unique inverse kinematic solution may be generated. In RMPC, a geometric solution of a redundant manipulator requires the adoption of joint displacement selection rules or heuristics. Another approach is to augment the forward solution with additional or secondary tasks that constrain the manipulator's unconstrained degrees of freedom, redefining the task space as:

$$ x(t) = \begin{bmatrix} f_e(q) \\ f_c(q) \end{bmatrix} \tag{3.59} $$

where $f_e(q)$ and $f_c(q)$, $q \in Q$, are the end effector and additional constraint forward solutions, of dimension $m$ and $r$ respectively. With an appropriately augmented task space, an inverse solution can again become one-to-one. Despite this useful technique, geometric inversion remains a complicated process. Therefore, redundant systems are usually controlled through numerical inverse methods such as resolved motion rate control (RMRC) or resolved motion acceleration control (RMAC) techniques.

Configuration space position setpoints are a double edged sword. Though the determination of $q_d$ provides a clear specification of both the joint and end effector trajectories under stable control, the inversion process becomes increasingly complex with robot geometry. Thus RMPC provides an easy method of inversion for simple robot geometries combined with the security of predetermined trajectories. Conversely, RMPC becomes very complex for redundant manipulation, precisely when the robot exhibits the most versatility.
3.3.2 Resolved Motion Rate Control

An alternative to the geometric inverse kinematic solution for position, $q$, is to solve an inverse kinematic solution for joint rates, $\dot{q}$:

$$ \dot{q}(t) = J^{-1}(q)\,\dot{x}(t) \tag{3.60} $$

First introduced by Whitney [Whitney, 1969], Resolved Motion Rate Control, RMRC, exploits the relationship between cartesian and joint space velocities in equation (2.6). Since RMRC is a velocity controller, the joint controller is of the form:

\begin{align}
u_k &= k_\omega\, \dot{q}_{e_k} \tag{3.61} \\
&= k_\omega\, J_k^{-1}(q)\,\dot{x}_e(t) \tag{3.62}
\end{align}

where $J_k^{-1}$ is the kth row of the Jacobian inverse.

Unfortunately, this method is prone to numerical instability near Jacobian singularities. These usually occur near the extremities of a manipulator's work space or in underconstrained, redundant, regions of configuration space.

Redundancy Resolution

Whitney [Whitney, 1969] recognized these hazards and recommended the use of the pseudoinverse for redundant systems:

$$ \dot{q}(t) = J^{+}\dot{x}(t) \tag{3.63} $$

to solve for joint rates. Defining the pseudoinverse as:

$$ J^{+} = J^T (J J^T)^{-1} \tag{3.64} $$

produces a least squares velocity solution. Additional constraints can then be inserted into $N(J)$ using the well established technique:

$$ \dot{q}(t) = J^{+}\dot{x}(t) + (I - J^{+}J)\,h \tag{3.65} $$

where $(I - J^{+}J)$ selects the Jacobian null space and $h$ is an arbitrary vector inserted into $N(J)$ based on some secondary optimization criteria (e.g. [Hollerbach, 1987, Nakamura, 1984, Nakamura, 1991, Kazerounian, 1988, Sung, 1996, Whitney, 1969]).

3.3.3 Resolved Motion Acceleration Control

Similarly, Resolved Motion Acceleration Control, RMAC, exploits the relationship in equation (2.12):

$$ \ddot{q}(t) = J^{-1}\left[ \ddot{x}(t) - \dot{J}\dot{q}(t) \right] \tag{3.66} $$

With a prescribed twice differentiable task space trajectory, $x_d(t)$, an acceleration control law may be devised [Luh, 1980b]:

\begin{align}
\ddot{q}_d(t) &= J^{-1}\left[ \ddot{x}_d(t) - \dot{J}\dot{q}(t) + k_v \dot{x}_e + k_p x_e \right] \tag{3.67} \\
&= -k_v \dot{q} + J^{-1}\left[ \ddot{x}_d(t) - \dot{J}\dot{q}(t) + k_v \dot{x}_d + k_p x_e \right] \tag{3.68}
\end{align}

Clearly, equation (3.68) is only possible if $k_p$ and $k_v$ are scalars.
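Returning to the rate solution, the pseudoinverse (3.64) and the null space insertion (3.65) can be sketched for a $2 \times 3$ Jacobian using only elementary arithmetic. The function names, the singularity threshold, and the choice of secondary vector $h$ are assumptions made for this example.

```python
def pseudoinverse_2x3(J):
    """Right pseudoinverse J^T (J J^T)^-1 of a full rank 2 x 3 Jacobian, as in (3.64)."""
    a, b = J
    aa = sum(v * v for v in a)
    ab = sum(u * v for u, v in zip(a, b))
    bb = sum(v * v for v in b)
    det = aa * bb - ab * ab
    if abs(det) < 1e-12:  # assumed threshold for this sketch
        raise ValueError("Jacobian is singular or nearly singular")
    inv = [[bb / det, -ab / det], [-ab / det, aa / det]]  # (J J^T)^-1
    return [[a[i] * inv[0][k] + b[i] * inv[1][k] for k in range(2)]
            for i in range(3)]

def rmrc_rates(J, xdot, h=(0.0, 0.0, 0.0)):
    """Joint rates J^+ xdot + (I - J^+ J) h, as in (3.65)."""
    Jp = pseudoinverse_2x3(J)
    primary = [sum(Jp[i][k] * xdot[k] for k in range(2)) for i in range(3)]
    Jh = [sum(J[k][j] * h[j] for j in range(3)) for k in range(2)]
    secondary = [h[i] - sum(Jp[i][k] * Jh[k] for k in range(2)) for i in range(3)]
    return [p + s for p, s in zip(primary, secondary)]
```

Whatever $h$ is supplied, the resulting joint rates reproduce the commanded end effector velocity, because the second term lies entirely in $N(J)$. The explicit determinant test is where the numerical difficulty near singularities shows up.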
From (2.34), the acceleration vector computed in equation (3.67) can be substituted into a feedforward dynamic model to provide the necessary control effort for each actuator:

$$ u = D(q)\,\ddot{q}_d + C(q,\dot{q})\,\dot{q} + g(q) \tag{3.69} $$

Since RMAC relies on the Jacobian inverse, it, too, becomes numerically unstable near Jacobian singularities.

Redundancy Resolution

Differentiating equation (3.65) produces the general solution for joint accelerations in a redundant manipulator:

$$ \ddot{q}(t) = J^{+}\left[ \ddot{x}(t) - \dot{J}\dot{q}(t) \right] + (I - J^{+}J)\left( \dot{J}^{+}\dot{x}(t) - \dot{J}^{+}J\,h + \dot{h} \right) \tag{3.70} $$

where the second term again inserts secondary tasks into the null space. Though setting $h = \dot{h} = 0$ in equation (3.65) results in a least squares joint velocity solution, clearly it does not necessarily produce a least squares acceleration solution in equation (3.70). Indeed, [Kazerounian, 1988] shows that ignoring the second term of (3.70), locally minimizing joint accelerations, globally minimizes joint velocities.

Since the computation of both the dynamic model, equation (3.69), and the pseudoinverse [Press, 1988] can be a slow process, the computing resources required to exploit redundant manipulators in real time position control are significant.

3.3.4 Jacobian Transpose Control, JTC

Now, the setpoint controllers in equations (3.58), (3.62), and (3.67) ultimately specify a torque setpoint related through some map to task space position. So PD, RMRC, or RMAC controllers first invert the forward solution and then apply an error regulator to ultimately generate the necessary joint forces to move the end effector along the desired trajectory. Though effective, the control of end effector trajectories through inversion, joint space error regulation, and finally joint space force application is somewhat indirect. A more immediate method is to regulate end effector forces directly through joint space forces, which is exactly the mechanism behind Jacobian Transpose Control (JTC).
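A minimal kinematic sketch of this projection idea follows: a task space error is treated as a virtual end effector force and mapped onto the joints through the Jacobian transpose, with no matrix inversion. The planar 2R arm, the proportional force law, and the gain and iteration count are assumptions chosen for illustration; this is not the controller developed in this thesis.

```python
import math

def forward_2r(q, l1=1.0, l2=1.0):
    """Forward solution of a hypothetical planar 2R arm."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return x, y

def jacobian_2r(q, l1=1.0, l2=1.0):
    """2 x 2 position Jacobian of the planar 2R arm."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def jacobian_transpose_solve(x_d, q0, gain=0.1, steps=2000):
    """Drive the end effector toward x_d by repeatedly applying J^T f."""
    q = list(q0)
    for _ in range(steps):
        x, y = forward_2r(q)
        f = (x_d[0] - x, x_d[1] - y)  # virtual force proportional to task error
        J = jacobian_2r(q)
        for k in range(2):
            # kth row of the Jacobian transpose projected onto joint k
            q[k] += gain * (J[0][k] * f[0] + J[1][k] * f[1])
    return q
```

Each joint update needs only its own column of the Jacobian and the shared task space error, which is what makes the method attractive for distributed implementations, at the cost of many more iterations than an inverse based step.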
Unlike joint space methods, Jacobian Transpose Control applies torques to the manipulator by projecting a task-dependent end effector force trajectory onto each joint through the following equation:

$$ u_k = J_k^T\, f(\cdot) \tag{3.71} $$

where $u_k$ and $f(\cdot)$ are generalized forces in configuration and cartesian space respectively, and $f(\cdot)$ is a function of the desired and error trajectories, $x_d(t)$ and $x_e(t)$. Common in force control (e.g. [Raibert, 1981, Fisher, 1992]), JTC is less well known as an on-line position control scheme [Wu, 1982, Hogan, 1985c, Khatib, 1987, Seraji, 1987b] or as an off-line inverse kinematic solution scheme [Asada, 1986, Slotine, 1988]. In general, JTC is less compute intensive than "Newton's method" numerical solutions (i.e. RMRC and RMAC) but is slower to converge [Slotine, 1988]. Compute intensive disturbance compensation (i.e. through feedforward dynamics or adaptive control) must also be added to JTC to achieve precision position control [Khatib, 1985].

Redundancy Resolution

The Jacobian Transpose solution does not suffer numerical instability near Jacobian singularities as do solutions based on the Jacobian inverse. Since the required end effector force is projected onto each actuator, redundant manipulators are treated no differently than fully constrained manipulators. However, kinematic singularities, in which desired end effector forces have no projection on any of the actuators, produce no response in the manipulator. Such singularities occur near the extremities of the robot work space and can be avoided through simple techniques [Slotine, 1988].

3.4 Real Time Supervisory Control Architectures

Control represents one of four phases in the sense-model-plan-act (SMPA) cycle that persists as the context for 'intelligent' robotic systems. Of the four, the symbolic model-plan portion of the cycle has traditionally been considered the crux of the AI problem.
The sensing and action phases of the cycle are generally considered services to this 'intelligent' symbolic system. As reviewed above, manipulation systems use symbolic engines that often include model driven inverse kinematics solutions (i.e. centralized RMPC, RMRC, RMAC, and JTC solutions), the output of which is a deterministic trajectory in joint space. As a consequence, the traditional responsibility of manipulator control is to guarantee exact joint space trajectory tracking.

Contrary to early optimistic assessments of the problem, many now believe that autonomous robot performance is limited not by inadequate symbolic reasoning but by impoverished sensor fusion and control. Initially, investigators assumed that sensor fusion would eventually yield simple methods for detection and modelling of the workspace [4]. However, building and maintaining a sufficiently capable sensor fusion system for autonomous operation has proven to be complex, often unreliable and, in general, prohibitively expensive. Hence the interest in alternative control architectures that do not rely on extensive signal processing, modelling, and symbolic reasoning to achieve real time autonomous performance.

[4] A view traceable from early systems such as PLANNER and Winograd's SHRDLU [Winograd, 1972] to existing task specification/assembly modelling languages such as RAPT [Ambler, 1975, Ambler, 1980, Thomas, 1988].

3.4.1 Manipulation

Real time manipulation has many aspects, such as real time navigation and force control, that often place multiple demands on manipulator performance in a changing workspace. Since it is possible that multiple, unpredictable constraints may arise over the course of a single task, redundant robots are more attractive than traditional fully constrained manipulators in an unstructured environment.
To leverage this flexibility, a growing number of redundant robotic systems incorporate additional sensing into RMRC, RMAC, or JTC controllers. These sensors engage task specific auxiliary controllers through switching logic and/or provide data for environmental servos, in effect embedding a limited modelling and planning portion into the control system. Some examples from manipulation and mobile robotics follow.

Hybrid Position/Force Control

Manipulation often requires not only simple end effector positioning but also force control, for example in machining tasks. Hybrid position/force control, in figure 3.5, [Raibert, 1981, Anderson, 1988, Fisher, 1992] employs a switching architecture to select appropriate control techniques for either position or force control. By incorporating end effector force sensors to control the manipulator's control effort, force controllers can regulate end effector forces in real time. Hogan formalized this concept further within impedance control [Hogan, 1985a]. From [Fisher, 1992], the equations for (differential) position and force control are, respectively:

\begin{align}
q_e &= (SJ)^{+} x_e + [I - J^{+}J]\, z_q \tag{3.72} \\
\tau_e &= (S^{\perp}J)^{T} f_e + [I - J^{+}J]\, z_\tau \tag{3.73}
\end{align}

where the matrix $S$ is the selection matrix and $S^{\perp} = I - S$.

Visual Servoing

For target tracking in real time (e.g. grasping a moving object), even simple model-based vision can be too slow in unstructured environments. By embedding image processing into the controller, a task specific vision servo can be devised. Weiss demonstrated visual servoing [Weiss, 1987] in which a desired image feature such as area or centroid was compared against actual image features and a mobile camera servoed to regulate the error between the features. This structure is depicted in figure 3.6. The system does not track an object in task space but a feature in image space; a geometric object model does not exist.
A more ambitious system, MURPHY, by Mel [Mel, 1990] left feature extraction and inverse kinematics to a neural network to implement fast vision servoing and vision based obstacle avoidance. With a trained neural network, image, robot motion, and obstacle semantics became internal to the control system and unknowable to an external observer.

Figure 3.5: Fisher's corrected Hybrid Position Force Control. $S$ is the selection matrix. $J^{+}$ is the pseudoinverse of the Jacobian and $z_q$ and $z_\tau$ are arbitrary vectors.

Figure 3.6: A block diagram of the Visual servoing approach to target tracking. Desired reference values are in feature space, $f_{ref}$. Image feature changes are computed through a feature Jacobian, $J_{feat}$.

Obstacle Avoidance

SMPA based obstacle avoidance requires world models supplied with high speed, rich, range images over the robot's work space. At least two methods of real time manipulator obstacle avoidance [Colbaugh, 1989, Khatib, 1985] dispense both with the modelling of obstacle boundaries and with the planning of collision free trajectories. These algorithms simplify the obstacle avoidance process by relying on real time sensing and reactive control. By identifying points on the manipulator threatened by collision, "points subject to hazard" (PSH) [5], and applying an evasive force to each PSH, both schemes avoided collisions with workspace obstacles.

Khatib's Operational Space Formulation (OSF, see figure 3.7) [Khatib, 1987] employed a potential field obstacle model embedded into a simple control level obstacle avoidance scheme. By adopting a nonlinear potential field about an obstacle with a specified clearance range, manipulator motion was unconstrained while outside the clearance range of the obstacle.
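As a hedged illustration of such a field, the sketch below uses one commonly cited repulsive potential, $U(\rho) = \tfrac{1}{2}\eta\,(1/\rho - 1/\rho_0)^2$ for $\rho \le \rho_0$ and zero otherwise, where $\rho$ is the distance to the obstacle and $\rho_0$ the clearance range. The point obstacle, the gain $\eta$, and the function names are assumptions of this example and not necessarily the exact form used in the cited work.

```python
import math

def repulsive_potential(rho, rho0, eta=1.0):
    """Potential that is zero at or beyond the clearance range rho0."""
    if rho >= rho0:
        return 0.0
    return 0.5 * eta * (1.0 / rho - 1.0 / rho0) ** 2

def repulsive_force(point, obstacle, rho0, eta=1.0):
    """Negative gradient of the potential at a point, for a point obstacle."""
    dx = [p - o for p, o in zip(point, obstacle)]
    rho = math.sqrt(sum(v * v for v in dx))
    if rho >= rho0 or rho == 0.0:
        # outside the clearance range (or direction undefined): no force
        return [0.0] * len(dx)
    magnitude = eta * (1.0 / rho - 1.0 / rho0) / rho ** 2
    return [magnitude * v / rho for v in dx]  # directed away from the obstacle
```

Because the force vanishes continuously at $\rho_0$, a point on the manipulator feels nothing until it enters the clearance range and is pushed away increasingly hard as it approaches the obstacle.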
In OSF, Khatib circumscribed obstacles with artificial potential fields (APFs), $\Phi$. By determining the negative gradient of these potential fields at the PSH, a repulsive force could be prescribed and applied to the PSH. The Jacobian transpose of the PSH could then be used to project the repulsive forces onto the manipulator's actuators to generate an evasive maneuver.

Recognizing the computational burden in computing and assembling $\Phi$, Colbaugh at JPL [Colbaugh, 1989] used range sensing, state augmentation, and an adaptive controller to achieve similar results. In this implementation, range sensors on each link determine the range between the PSH and the nearest obstacle. By augmenting the state of the end effector with the state of the PSH to form an augmented task space vector, $x_a$, and monitoring the range sensor values continuously, an augmented Jacobian transpose could be applied to the augmented force vector.

Teleoperation

In recognition of practical real time system requirements, Albus et al. [Albus, 1990, Lumia, 1994] proposed a multilevel SMPA model of robot control that operates in parallel at multiple time scales, the NASA/NIST Standard Reference Model for telerobot control system architecture (NASREM). In NASREM, the sense-model-plan-act cycle is spread over six time scale levels, successive scales differing by approximately an order of magnitude.

[5] A convention similar to Khatib's "points subject to potential" or PSP.

Figure 3.7: Khatib's operational space formulation. Feedforward cartesian dynamics and obstacle avoidance forces are summed to drive a manipulator along a stable collision free trajectory. Colbaugh's configuration control replaces the potential field model with sensing and the feedforward dynamics with adaptive control.
Figure 3.8: The NASA/NIST Standard Reference Model for telerobot control. (Diagram columns: Sensory Processing, World Modelling, Task Decomposition; shared data: maps, object lists, state variables, evaluation functions, program files; operator interface.)

Level                 Scope
servo                 coordinate transforms
primitive             dynamics
elementary movement   path planning
task                  actions on objects
service bay           assembly tasks
service mission       mission scheduling

Table 3.1: The temporal scope of the NASREM layers. Each layer represents an increasingly longer (by approximately an order of magnitude) planning horizon.

All time scales access sensor data streams in the modelling and planning of tasks. Similarly, all time scales contribute to the manipulator's final motion. At any time, a human operator can interrupt the system's operation at any or all time scales. NASREM represents an important step in the formalized specification of robot system components and, like any specification, avoids detailed design issues. Only one published implementation conforms to this model [Lumia, 1994], though presumably NASREM is sufficiently flexible to accept many of the control systems reviewed here.

3.4.2 High Speed Motion

Real time robot control is naturally not limited to manipulation robotics, and much can be learned by reviewing high performance robots in other applications. The following is a brief overview of some particularly successful real time robotic systems.

Raibert's Hopping Robot

Raibert [Raibert, 1984] and others (e.g. [Pratt, 1995]) have designed robots to explore hopping and leaping dynamics. Raibert's robot was controlled through a sequencer, or finite state machine, driven by data streams from pressure, inclinometer, angle, and position sensors.
By observing incoming data streams, the sequencer could coordinate height, velocity, and attitude controllers with the timing of the machine's support and flight phases. Raibert's hopping robot employed a double acting pneumatic cylinder connected to a pair of pneumatic 'hip' actuator joints, with the entire leg/hip assembly attached to a large inertia balance beam. By tethering the balance beam to a rigid aluminum boom, the robot was constrained to hop within a spherical surface. Two controllers cooperated in the motion of the robot. Each phase of the hopping sequence was triggered by specific sensor thresholds determined through experimentation and analysis. The modelling portion of each phase was limited to estimating the dynamics of the hopping frame, while the planning portion used the dynamic model to plan an angle/thrust response. Raibert's robot is significant for its use of a combination of dynamic analysis, control, and finite state strategy sequencing rather than an SMPA-like planning of the robot's running stride. Quoting from [Raibert, 1984]: "The back and forth motions were not explicitly programmed, but resulted from interactions between the velocity controller that operated during flight and the attitude controller that operated during stance."

Andersson's Ping-Pong Player

Andersson's ping-pong player [Andersson, 1989] used an 'expert controller', a hybrid between an expert system and a controller. Andersson employed a 'figure of merit' system to trigger activity within the expert controller. Through an analysis of the ping-pong ball dynamics, Andersson identified a set of 'free variables' upon which the ping-pong task was dependent, including paddle orientation, ball velocity, manipulator settle time, and others. In each control cycle these values would be run, in parallel, through a set of simple models, each generating a figure of merit.
The model returning the highest figure of merit would be executed as the next task in the system. In effect, Andersson condensed the model-plan portion of the cycle into a set of parallel processes that simultaneously examined the free variable stream and developed correspondence measures.

3.4.3 Mobile Robot Navigation

Mobile robotics stands as one of the most challenging autonomous robotic tasks. Unlike manipulators, the robot is no longer holonomic and must rely on sensing and dead reckoning to establish its position (though differential global positioning systems make this task increasingly easy). The environment challenges sensors and actuators with inconsistent lighting, irregular surface properties and topographies, and complex obstacle morphologies. In an olympian contest, autonomous navigation systems pit state of the art hardware and software against changing environments only to reach a target location in one piece. Understandably, many of these systems are only partially successful, but all provide useful lessons in real time robot architectures.

Task Control Architecture (TCA)

Figure 3.9: Carnegie Mellon's Task Control Architecture for the Ambler hexapod. Note the centralized planner and distributed reactive controller. (Diagram labels: Ambler, Gait Planner, Footfall Planner, Leg Recovery Planner, Error Recovery Planner, User Interface.)

The Task Control Architecture, in figure 3.9, a product of CMU's Robotics Institute, sought to unite deliberative and reactive systems through an SMPA-like layer. TCA acts as a planner/overseer on top of task specific reactive systems. Perhaps the best known implementation of TCA is on the Ambler hexapod. After initially using an SMPA cycle for Ambler, Carnegie Mellon's six legged robot, the Task Control Architecture [Simmons, 1991, Simmons, 1994] was modified into an asynchronous reactive layer combined with traditional AI modelling and planning elements.
Distributed Architecture for Mobile Navigation (DAMN)

The DAMN architecture, depicted in figure 3.10, also developed at CMU, again sought to integrate deliberative planning with reactive control. In DAMN, a discrete set of control actions on a group of actuators (e.g. pan/tilt camera, vehicle steering motors, engine speed) forms a command space in which multiple modules concurrently share robot control. By voting for or against alternatives in the command space, each module contributes to the control commands for the robot. DAMN employs an arbiter for the resolution of the voting process on each device. In the case of the UGV project [Leahy Jr, 1994], these were a turn arbiter, a speed arbiter, and a 'field of regard' arbiter. To explain the arbitration process, the turn arbitration procedure will be described. Each behaviour votes (ranging between -1 and +1) on every member of a discretized set of radii, $R_i$. This means that each behaviour's vote is actually a distribution over all the possible steering radii. The arbiter collects vote distributions from all participating behaviours and performs a gaussian smoothing on each, followed by a 'normalized weighted sum' for each of the i radii candidates:

$v_i = \frac{\sum_j w_j v_{j,i}}{\sum_j w_j}$ (3.74)

where $w_j$ is a behaviour weight and $v_{j,i}$ is the vote of the jth behaviour for the ith radius. The radius with the highest vote, $v = \max_i(v_i)$, is sent to the controller. 'Field of regard' and velocity arbiters perform similar smoothing and selection operations. This approach allows multiple modules operating at multiple frequencies to vote on various command spaces. DAMN runs on a number of platforms [Leahy Jr, 1994, Rosenblatt, 1995a, Rosenblatt, 1995b].

Figure 3.10: Carnegie Mellon's Distributed Architecture for Mobile Navigation for the NAVLAB series of robots. Each behaviour votes on every possible command (e.g. steering radii) and all votes are processed within the arbiter.
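The turn arbitration procedure described above can be sketched in a few lines of Python. This is only an illustrative reading of the description: the edge-normalized gaussian kernel, the smoothing width expressed in index units, and all function names are assumptions, not details of the DAMN implementation.

```python
import math


def gaussian_smooth(votes, sigma=1.0):
    """Smooth one behaviour's vote distribution over the candidate radii.
    The kernel is renormalized at each index (an assumption), so a
    uniform vote distribution is left unchanged."""
    n = len(votes)
    smoothed = []
    for i in range(n):
        kernel = [math.exp(-0.5 * ((i - k) / sigma) ** 2) for k in range(n)]
        total = sum(kernel)
        smoothed.append(sum(w * v for w, v in zip(kernel, votes)) / total)
    return smoothed


def arbitrate(vote_distributions, behaviour_weights, sigma=1.0):
    """Return the index of the winning candidate and the combined votes.
    Each combined vote is a normalized weighted sum over behaviours."""
    smoothed = [gaussian_smooth(v, sigma) for v in vote_distributions]
    weight_sum = sum(behaviour_weights)
    n = len(smoothed[0])
    combined = [
        sum(w * s[i] for w, s in zip(behaviour_weights, smoothed)) / weight_sum
        for i in range(n)
    ]
    best = max(range(n), key=lambda i: combined[i])
    return best, combined


# Two hypothetical behaviours voting over five candidate steering radii.
# The second behaviour carries three times the weight of the first.
goal_seeking = [0.0, 1.0, 0.0, 0.0, 0.0]
obstacle_avoidance = [0.0, 0.0, 0.0, 1.0, 0.0]
winner, combined_votes = arbitrate([goal_seeking, obstacle_avoidance], [1.0, 3.0])
```

Because every behaviour supplies a full distribution rather than a single preferred command, the arbiter can settle on a compromise candidate that no behaviour ranked first, which a simple switching scheme cannot do.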
Motor Schemas

Motor schemas [Arkin, 1987, Arkin, 1991] are small processes that correspond to primitive behaviours that, when combined with other motor schemas, yield more complex behaviour. Arkin employed two kinds of schema: perceptual schemas, which observed and represented the environment through sensing and potential field models respectively, and motor schemas, which devised responses to classes of events. A central move-to-goal or move-ahead schema sums the responses and commands the robot motors. Thus if a find-obstacle schema detected an obstacle, an avoid-obstacle schema was instantiated that produced a velocity vector based upon a repellent potential field around the obstacle (similar to Khatib [Khatib, 1985]). By summing the output velocity vectors from a collection of such schemas, a move-robot schema could navigate through the environment.

Figure 3.11: A subsumption network. Perception (P) drives augmented finite state machines (M#) to output messages. Suppression nodes substitute horizontal line messages with vertical (tap) messages. Similarly, inhibition nodes disable line messages if a tap message is received. (Diagram legend: Inhibition: tap messages block line messages. Suppression: tap message replaces line messages.)

Subsumption Architecture

Physically, subsumption is a hierarchical network of simple sensors, controllers, and actuators that can be embedded into relatively small robots. In Ferrell's 14 inch, 3 kilogram hexapod, 19 degrees of freedom were controlled through 100 sensors, including leg mounted foil force sensors, joint angle and velocity sensors, foot contact sensors, and an inclinometer. Applications include aircraft flight and landing systems [Hartley, 1991], heterarchical subsumption (Connell's Herbert [Connell, 1990]), and hexapod motion (Ferrell's Hannibal [Ferrell, 1993]).
Mataric's Nerds [Mataric, 1994] showed that behaviour arbitration could be learned through repeated trials. Each subsumption network node, an augmented finite state machine (AFSM), consists of registers, a combinatorial network, an alarm clock, a regular finite state machine, and an output. Sensors are connected to specific registers while actuators receive commands from the output of specific AFSMs. A message arriving at a register or an expired timer can trigger the AFSM into one of three states: wait, branch, or combine register contents. Results of combinatorial operations may be sent to an input register or an output port. Since each AFSM uses an internal clock, output messages can decay over time. AFSMs can inhibit inputs and suppress outputs of other AFSMs through inhibition and suppression 'side taps' placed on input or output connections in the network. Inhibition side taps prevent transmission of original messages along an input connection if an inhibition message has been received from an AFSM. When a suppression message is sent to a suppression side tap from an AFSM, the original output message is substituted by messages from the AFSM. Inhibition and suppression side taps encourage layered subsumption (as in figure 3.11) in which basic behaviours are embodied within a fundamental layer of AFSMs. Through judicious use of side taps, additional behaviours can be built over the basic set (e.g. 'leg down', 'walk', 'prowl'). Mataric developed a set of qualitative criteria to aid in the selection of basic behaviours. Each behaviour should be: Simple; Local, through local rules and sensors; Correct, by provably attaining the desired objective; Stable, through insensitivity to perturbations; Repeatable; Robust, by tolerating bounded sensor or actuator error; and Scalable, by scaling well with group size.
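The inhibition and suppression side taps described above lend themselves to a short sketch. The Python below is a minimal illustration, not a model of any particular subsumption implementation: the `Message` type, the lifetime based decay, and the function names are hypothetical and introduced only to make the two rules concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """A hypothetical message with a finite lifetime, standing in for
    the time-decaying output of an AFSM."""
    value: float
    sent_at: float
    lifetime: float

    def active(self, now: float) -> bool:
        return now - self.sent_at < self.lifetime


def inhibition_tap(line: Optional[Message], tap: Optional[Message],
                   now: float) -> Optional[Message]:
    """Inhibition: an active tap message blocks the line message."""
    if tap is not None and tap.active(now):
        return None
    return line


def suppression_tap(line: Optional[Message], tap: Optional[Message],
                    now: float) -> Optional[Message]:
    """Suppression: an active tap message replaces the line message."""
    if tap is not None and tap.active(now):
        return tap
    return line


# A basic 'walk' command on the line and a short-lived override on the tap.
walk = Message(value=1.0, sent_at=0.0, lifetime=10.0)
override = Message(value=2.0, sent_at=0.0, lifetime=1.0)
during = suppression_tap(walk, override, now=0.5)   # override wins
after = suppression_tap(walk, override, now=2.0)    # override has decayed
```

Because the tap message expires, the lower layer regains control without any explicit release signal, which is one way to read the layering that side taps are said to encourage.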
3.4.4 Computer Graphics

In order to produce complex, believable movement in computer animation, a number of novel motion control strategies have appeared. While most ignore the details of real world dynamics, these systems apply a 'sensor' driven switching architecture to simplify the programming of flocks of birds [Reynolds, 1987], schools of fish [Tu, 1994], and planar running machines [van de Panne, 1992].

Boids

The realistic animation of large collections of entities (e.g. crowds of people, schools of fish) becomes time consuming and inflexible if the trajectories of each entity are specified a priori. In a novel solution, Reynolds [Reynolds, 1987] employed a set of three behaviours (collision avoidance, velocity matching, and flock centering) to model formation flying within every bird-like graphical construct, bird-oids or boids. Each behaviour generated an acceleration vector and was placed in a priority list. As each behaviour contributed a desired acceleration to the arbitrator, the arbitrator would accumulate both the acceleration vectors and the magnitudes of each output behaviour over time in order of priority. When the sum of the accumulated magnitudes exceeded a fixed acceleration value, the acceleration vector components would be apportioned to each behaviour in order of priority. Under normal flight conditions in which each behaviour is of approximately equal priority, this is equivalent to vector averaging. However, if one behaviour experiences an emergency and issues large magnitude vectors, this method effectively suppresses lower priority requests. Similar strategies were used by Terzopoulos et al. [Tu, 1994] to produce realistic behaviour within more sophisticated animated fish.

Sensor-Actuator Networks

Figure 3.12: A SAN network example. Note the interconnections between sensor (S), hidden (H), and actuator (A) nodes. Each node has the structure expanded at right.
The hidden nodes act as an interactuator information mechanism.

The realistic animation of complex running or galloping entities, like the real world equivalent, is difficult to coordinate and control. Van de Panne and Fiume tackled this problem through the creation of Sensor Actuator Networks (SAN) [van de Panne, 1992]. Given a mechanical configuration and the location of binary sensors and PD actuators on the mechanism, a generate-and-test evolution method modulates weights connecting sensor nodes, hidden nodes, and actuator nodes of the network. In a complex information exchange, all sensor nodes are linked to all hidden and actuator nodes, all hidden nodes to one another and to actuator nodes, and all actuator nodes to all sensor and hidden nodes. Each link is unidirectional and weighted. Within each node, a weighted sum is performed on all connections, then thresholded, integrated, and filtered through a simple hysteresis function. The SAN structure is depicted in figure 3.12.

3.4.5 Common Denominators in Real Time Robot Control

Many of these architectures adopt common, pragmatic solutions to shared problems of computational complexity, limited time, and unstructured environments. While an SMPA controller might be feasible given sufficient computing and communication resources, in general the only means of implementing 'real world' autonomous robots (or real time animations) is through embedded, often multiprocess, control exploiting fast simple data streams. A summary of these shared characteristics follows.

Application Specific Sensing

While cost and precision requirements make task specific sensor suites more attractive than higher level general purpose sensors, time and computing constraints limit the amount of signal processing and modelling available for control. Together, these constraints drive the adoption of application specific signal processing and control.
Multiple Data Streams

Using multiple data streams, a single controller can perform limited sensor fusion based entirely on task specific criteria. Rather than performing broad based signal processing and recognition to develop a multipurpose symbolic model, behavioural systems embed meaning into sensor data by responding through simple signal processing methods (e.g. thresholding) to specific stimuli. Thus each controller establishes a partial, self-interested view of the available data streams.

Arbitration of Multiple Controllers

Using task specific sensing, a controller's output represents both an actuator command value and a confidence measure within a multicontroller environment. In effect, the output is a measure of the correspondence between the controller's task specific world model and the sensor data stream. In this way multiple behaviours classify the sensor data stream, merging the database-like world models and symbolic planners of the SMPA into a smaller, simpler action model. 'Correct' action selection is left to simple arbitration procedures (e.g. weighted averaging or switching).

Parallel Computation

A number of factors promote the distribution of control over multiple processors, including sensor and controller complexity, disparate time scales, and robustness. Signal processing complexity can vary considerably within a single system, e.g. from topographical LADAR mapping to inclinometers in the CMU Ambler project. Similarly, controllers vary from simple PD-like control to sophisticated model based systems. Concurrent sensing and control removes I/O boundedness within the system during complex computation by allowing the system to respond over many scales. Furthermore, physically distributed computation makes the system less susceptible to catastrophic damage to the control system.
In general, distributing computing cycles over multiple CPUs substantially reduces the cost of such systems, though some additional complexity may be encountered in interprocess communication.

In summary, practical constraints on cost, time, and performance are slowly driving robot control techniques that exploit inexpensive simple sensing, multiple control behaviours, and behaviour arbitration. These are the basic constituents of the modern robotic agent and, to a significant extent, describe many existing real time manipulation systems. However, just as agent control methods may simplify the control of real time robots, teams of agents may simplify the control of complex, multiplant systems.

3.5 Multiple Robot Control

Using multiple robots to achieve a single task is not new. Indeed, multiple manipulator control is a key issue for the International Space Station Special Purpose Dextrous Manipulator (SPDM), a twin armed manipulation system mounted on a remote manipulation system (RMS). However, the difficulties of real time manipulator performance become further magnified in such unstructured multirobot environments. Centralized methods must accommodate modelling and planning for multiple manipulators that carry common or separate payloads through a possibly cluttered work space, a task made more difficult by complex sensors and, possibly, remote computing installations. This section will review some examples of distributed robot team control and draw some conclusions on the significance of these architectures.

Reynolds' Boids revisited

Based on beliefs about actual bird behaviour, Reynolds reasoned that given similar, if simplified, sensing and flight dynamics, behaviours such as velocity matching, collision avoidance, and flock centering should be sufficient to 'hold together' a group of boids in a flock as well as produce realistic flock trajectories.
In effect, Reynolds believed that desired team behaviour could be achieved through careful design of a team member. After trial and error, this approach generated realistic flocking behaviour. Perhaps more interesting than the final flock performance are some of Reynolds' observations on the design process:

• "Flock behaviour depends on a localized view". 'Realistic' behaviour did not require complete flock models within each boid.
• Linear collision avoidance rules (e.g. PD rules) produce oscillatory 'unrealistic' behaviour while nonlinear 'inverse square' rules promoted stability. In observing that linear flocking rules produced spring-like behaviour, Reynolds demonstrated that agent teams are dynamic systems.
• Apparent flock complexity is due to environmental complexity (e.g. obstacles), a phenomenon observed by others (e.g. the Intelligence property identified by Brooks [Brooks, 1989a]).
• The control algorithm is O(N) in the number of boids, due primarily to the implementation of the behaviours. Autonomous agents in a multiagent system should be constant time.

The central conclusion is that multiagent systems can exhibit desirable behaviour without explicit coordination from an external supervisor.

Figure 3.13: By applying both direct and temporal combination to the same basic behaviours, higher-level behaviours can be generated. In this example, safe wandering is used to generate both flocking and foraging. Similarly, aggregation is used in both foraging and surrounding.

Ferrell's Hannibal: single or multiagent?

Though subsumption architecture has resisted classification, the approach falls within the definition of agency. In some subsumption networks it is arguable that individual AFSMs are themselves agents and that these hierarchies are multiagent systems.
Indeed, Ferrell refers to AFSMs in Hannibal as agents, distributed over the legs of the hexapod and coordinated loosely by a set of global agents. Each leg of the hexapod is controlled by a discrete AFSM network and the legs as a whole are coordinated by a central timing agent. In this instance coordination between agents is explicit: timing signals set the walking pace of the machine. However, the implementation of each step is unique to each leg. This demonstrated that while a global agent might explicitly synchronize agent activities, it need not explicitly specify the activity itself. Strict subsumption systems such as Hannibal show that multiagent systems can be hierarchical and yet autonomous within any given layer. Indeed, viewed as a multiagent system, Connell's Herbert shows that multiagent systems can be heterarchical.

Mataric's Nerd Herd

Mataric investigated the interaction of agents within a group of 20 mobile robots, 'the Nerd Herd'. As mentioned earlier, each robot was equipped with a set of basic behaviours: Safe-Wandering, Following, Dispersion, Aggregation, and Homing. These controllers used input from contact sensors, piezo-electric bump sensors, IR range sensors, and IR break beam sensors. Each shoe box sized robot could collect and stack pucks with a simple fork-lift gripper/actuator assembly. By expressing basic behaviours in terms relative to the agent or the environment, global group behaviour was distilled into each agent. For example, safe wandering was expressed as:

$\frac{dp_j}{dt} \neq 0 \quad \text{and} \quad d_{ij} > \delta_{avoid}$ (3.75)

where $p_j$ is the absolute position of the jth agent, $\frac{dp_j}{dt}$ is the velocity, $\delta_{avoid}$ is a range threshold, and $d_{ij}$ is the magnitude of the distance to the ith agent or obstacle. Similarly, following minimized the angle $\theta$ between the relative distance vector and the agent's velocity vector:

$(p_i - p_j) \cdot \frac{dp_j}{dt} = \|(p_i - p_j)\| \left\|\frac{dp_j}{dt}\right\| \cos\theta$ (3.76)

where i is the leading and j is the following agent.
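To make the two agent-relative rules concrete, the following Python sketch checks a safe wandering condition and computes the following angle. It assumes that safe wandering means a nonzero velocity with every sensed distance above the avoidance threshold, and that the following angle is the ordinary angle between the relative position vector and the follower's velocity. Function names and the example values are hypothetical.

```python
import math


def norm(vector):
    """Euclidean length of a vector given as a sequence of floats."""
    return math.sqrt(sum(c * c for c in vector))


def is_safe_wandering(velocity, distances, delta_avoid):
    """True when the agent is moving and every sensed agent or obstacle
    lies beyond the avoidance threshold."""
    return norm(velocity) > 0.0 and all(d > delta_avoid for d in distances)


def following_angle(p_leader, p_follower, v_follower):
    """Angle between the leader-relative position vector and the
    follower's velocity, from the dot product identity
    a . b = |a| |b| cos(theta)."""
    relative = [a - b for a, b in zip(p_leader, p_follower)]
    denominator = norm(relative) * norm(v_follower)
    if denominator == 0.0:
        raise ValueError("angle is undefined for a zero-length vector")
    cos_theta = sum(r * v for r, v in zip(relative, v_follower)) / denominator
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


# A follower at the origin heading straight at a leader on the x axis
# has zero angular error; heading along y gives an error of pi / 2.
aligned = following_angle((1.0, 0.0), (0.0, 0.0), (1.0, 0.0))
sideways = following_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
```

A following controller in this spirit would steer to reduce the returned angle; how the steering command is generated is not specified in the text and is left out here.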
Similarly, dispersion and aggregation employed attraction and repulsion controllers. By modulating the velocity vector through the use of these rules, desirable global behaviours were generated. Figure 3.13 summarizes the behaviour hierarchy developed through behavioural and temporal sequencing in a single Nerd.

Troops

Parker [Parker, 1994] successfully explored the use of robot systems in the remediation of (simplified) toxic spills. The goal of the research was to establish a software architecture that enabled explicit cooperation between robots. Each robot possessed a set of low level subsumption behaviours. These behaviours are grouped into sets, each set representing a high level task (e.g. cleaning the floor). A set of prioritized motivational behaviours receives information through sensors, communication, and feedback from other behaviours and selects behaviour sets based on:

• Behaviour Set Priority $\alpha_i$ (integer).
• Sensor Feedback $F_i(t)$: the applicability of the behaviour set to the current situation (boolean).
• Impatience $I_i(t)$: the period this agent will wait for another agent to accomplish the same behaviour (real).
• Acquiescence $A_i(t)$: the period this agent will maintain the behaviour before yielding to another agent (boolean).
• Suppression $S_i(t)$: whether another behaviour suppresses this behaviour set (boolean).

The motivation calculation:

$m_i(0) = \alpha_i \cdot F_i(t)$ (3.77)

$m_i(t) = (m_i(t-1) + I_i(t)) \cdot A_i(t) \cdot S_i(t) \cdot F_i(t)$ (3.78)

If $m_i(t) > \theta_i$, a threshold of activity, then the behaviour set is activated and, possibly, a message is broadcast to other agents. Based upon observed broadcasts and the environment, an agent could either engage a task achieving behaviour set, wait while another agent performed the task, or, if an agent failed to perform, take over a task. In this model of multiagent activity, agents inform one another through broadcast communication and do not know or assume the abilities of other agents present.
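The motivation calculation can be sketched as a simple recurrence. The Python below assumes the form suggested by the listed terms: motivation starts at the priority gated by sensor feedback, grows by the impatience term each step, and is multiplied by the acquiescence, suppression, and sensor feedback gates. Encoding each gate as 1 (permit) or 0 (block) is an assumption, as are the function names.

```python
def initial_motivation(priority, sensor_feedback):
    """Assumed starting value: priority gated by sensor feedback."""
    return priority * sensor_feedback


def update_motivation(previous, impatience, acquiescence, suppression,
                      sensor_feedback):
    """One step of the assumed recurrence. Each gate is 1 to permit the
    behaviour set or 0 to block it, so any blocking gate resets the
    motivation to zero."""
    return (previous + impatience) * acquiescence * suppression * sensor_feedback


def steps_until_active(priority, impatience, threshold, max_steps=100):
    """Count the steps until motivation exceeds the activity threshold,
    with all gates held open. Returns None if it never does."""
    motivation = initial_motivation(priority, 1)
    for step in range(1, max_steps + 1):
        motivation = update_motivation(motivation, impatience, 1, 1, 1)
        if motivation > threshold:
            return step
    return None


# With priority 1 and impatience 0.5 per step, motivation runs
# 1.0 -> 1.5 -> 2.0 -> 2.5 and first exceeds a threshold of 2.0 at step 3.
activation_step = steps_until_active(priority=1, impatience=0.5, threshold=2.0)
```

The multiplicative gates are what make the scheme opportunistic: an agent whose sensor feedback drops out, or whose behaviour set is suppressed, loses its accumulated motivation and must build it up again.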
Cooperation arises through opportunistic agent activity. Agents act only if they are ready and able to respond. Since there is no central coordinator or agent-to-agent information exchange, the system is robust to agent failure.

3.5.1 Common Denominators in Real Time Robot Team Control

Again, facing problems similar to those of robot team control, multiagent systems possess common solutions worth summarizing. In particular: Global and Local goals, Interagent Communication, Robustness, Locality in Sensing, Linearity in Computation, and Interagent Dynamics.

Global Goals

Agent teams often fulfill a team or global objective. The objective is either a specific goal (e.g. a trajectory) expressed by an external process to the team, or arises from interagent dynamics (e.g. Reynolds' boids) and the changing environment (e.g. Ferrell's Hannibal).

Local Goals

Local goals are often used to ensure safe operation. Global goals often coexist with (and even arise from) the maintenance of local goal systems. Local goals are usually explicitly stated relative to the agent and not some global coordinate system (e.g. move-forward). Though a number of similar goal arbitration methods exist, there does not seem to be a consensus on how to establish goal priorities or how local goal systems should be selected and/or designed.

Interagent Communication

Individual agents often communicate either through one-to-one (directed) transmission between agents, through broadcast, or through interagent sensing. Parker's Troops used directed messages between agents over RF bands while Mataric's Nerd Herd did not generally communicate. Similarly, Ferrell's Hannibal did not use interleg communication to achieve stable gaits. A common feature amongst most systems is that if a global objective is set, it is done so externally through some goal generation system and communicated either to all the agents or through some lead agent (e.g.
boid's flock leader and Ferrell's synchronizing agent).

Global Robustness

The fulfilment of a global goal is often robust to individual agent failures. Mataric's system demonstrated considerable robustness to failures in positioning and communications, achieving the puck collecting objective in spite of these problems. Similarly, Ferrell's Hannibal was robust to leg failures and environmental irregularities, while flocking in Reynolds' boids was not influenced by flock fission or fusion.

Locality in Sensing

Individual agents sense only neighbouring team members and do not monitor the entire team. In general, sensing is kept local to each agent: short range sensing, limited boid fields of view, joint angles, forces, etc. This limits the impact of distant events on individual agents and, subsequently, lowers the computational burden. This means that global goals must be expressed either in terms of locally sensible events or communicated to each agent. There are exceptions to this rule, however. Hannibal adopted a single inclinometer.

Constant Time Computation

Additional agents within a team do not significantly add to the computational burden on an individual agent. The short sensor horizon, limited modelling, and embedded computation within each agent mean that (ideally) the number of agents should not affect the response time of a lone agent. In effect a multiagent system becomes a distributed, parallel computer.

Interagent Dynamics

Reynolds' Boid system demonstrated that multiagent teams are dynamic systems. Ferrell's Hannibal relies on dynamic interaction with the environment to achieve stability. Mataric's Nerd Herd, too, demonstrates characteristics of dynamic systems, though, surprisingly, she felt analysis of the team as a particle system had little merit.
3.6 Remarks

In this chapter we have reviewed the structure of supervisory manipulator control and shown that, in context, real time control of robots in an unstructured environment often requires novel robot architectures, most of which bear little resemblance to SMPA supervisory control. Clearly, real time control of robots and robot teams requires integrated design of the four components of SMPA. It seems that rather than generating a monolithic model-plan system, real time systems classify and reduce environmental events into a set of controller processes. The response of each controller is often used as a measure of correspondence between the classification and the observed world. Though there is no consensus on action selection or combination from the controller set, voting schemes, simple combination, and switching (through AFSMs) seem the most common. The next chapter will take these features and place them into a familiar, concise, control theoretic representation, laying the basic foundations for both Agent and Multiagent Control Theory.

Chapter 4

The Agent and Multiagent Model

4.1 Introduction

Previous chapters have discussed the SMPA architecture, its origins in AI, and the difficulties in extending it to real time environments. In particular, the last chapter reviewed architectures that break away from the SMPA model to a class of task specific architectures. Furthermore, a set of features common to real time robotics was identified, reinforcing the perception that a genuinely new class of controller, an agent, is emerging. Agent based control has spawned interest in multiagent systems, agents acting not only on the environment, but on and with other agents. The objective of this chapter is to propose a control theoretic agent model based on observations of common features in existing agent and multiagent systems. In so doing, terminology will be defined, relating the characteristic features of agent control to traditional control theory.
Within this framework, the reputed abilities of these systems will then be summarized within a set of hypotheses. Given these properties of agency, further definitions of multiagency will be offered and hypotheses on multiagent control proposed.

4.2 Behaviour Control

Though the basic philosophy of multithreaded behaviour arbitration that underlies agent control is widely acknowledged, the theoretical foundations remain weak due in large part to the wide variety of implementations, interpretations, and nomenclature of agent control. These applications range from production scheduling systems and autonomous mobile robot teams to software avatars (virtual personas) and combat simulators. Terminology is often the source of confusion in any growing field, and agent and multiagent control is no exception. Recognizing the breadth of agent oriented applications [Shoham, 1993], this thesis focuses exclusively on the control of robotic systems. Even within this narrow context, the definition and structure of an agent has yet to be formally fixed, though most agree that an agent contains some kind of 'behaviour arbitration mechanism' capable of perception and action [Mataric, 1994, Parker, 1992b, Kosecka, 1994]. Mataric [Mataric, 1994] proposes the following qualitative definition: "An agent is a process capable of perception, computation and action within its world...". Mataric goes on to develop a spectrum of agent control schemes based on modelling and state retention, including reactive systems possessing no memory, behavioural systems with limited state retention, hybrid methods relying partially on symbolic systems, and purely symbolic deliberative systems. MacKenzie, Cameron, and Arkin [MacKenzie, 1995] also proposed that an agent is "...
a distinct entity capable of exhibiting a behavioural response to stimulus", providing the symbolic expression:

$$\text{Agent} = \text{Behaviour}(\text{Stimulus}) \tag{4.79}$$

defining an agent as a behaviour, itself a function of some stimuli. To complicate matters, the term behaviour is rarely used consistently. From Steels [Steels, 1991], behaviour is "... a regularity in the interaction dynamics between the agent and the environment", while Mataric suggests behaviour is [Mataric, 1994]: "...a control law that clusters a set of constraints in order to achieve and maintain a goal". This latter definition seems to equate control and behaviour, as does Parker [Parker, 1992a]. MacKenzie et al [MacKenzie, 1995] are more specific. Behaviour is $y$ where:

$$y = f(v_1, v_2, \ldots, v_m) \tag{4.80}$$

where $v_i$ are some input variables and $f(\cdot)$ is defined $\forall v_i$. Others are careful to discriminate between control and observed behaviour (e.g. [Colombetti, 1996]). Despite some confusion of terminology, behaviour is clearly related to the response of some system to control input.

Some have defined agents as a hybrid control problem, one in which discrete and continuous systems coexist, and have applied Discrete Event Systems (DES) theory [Kosecka, 1994] and a notable variant, Constraint Net (CN) theory [Zhang, 1995], to agent analysis. Discrete event systems experience asynchronous state changes at discrete points in time. Each state in the system corresponds to some continuous evolution of the system, while each event is a discontinuous transition between qualitative changes in the system's behaviour. In this model the definition of behaviour is a trace [Zhang, 1995] or "an event trajectory" [Kosecka, 1994] and the agent a Supervisory Controller (or Constraint Solver in [Zhang, 1995]). By issuing a trace or event stream, these controllers can modulate the output of the plant, itself an event stream or trace, to follow a desired event trajectory.
Since each symbol represents a control strategy, pure DES controllers are a means of sequencing action towards a desired objective. Though powerful in uniting analog and switching controllers, neither DES nor CN theory represents a new technique in the design of individual controllers, nor does either provide insight into the interaction between these controllers and the environment.

Agency and Control Theory

Though few investigators dispute Brooks' [Brooks, 1991a] assertion that 'the world should be its own model' and Steels' [Steels, 1991, Steels, 1994] observation that agents are dynamic systems, most continue to believe that furthering agent control depends on better computational and linguistic apparatus and not on the application of control theory. Some have concluded that agent based systems can only be designed through field trials and that control theoretical approaches have limited value. From Mataric [Mataric, 1994]:

"The exact behaviour of an agent situated in a nondeterministic world, subject to real error and noise, and using even the simplest of algorithms, is impossible to predict exactly..." [1]

"Precise analysis and prediction of the behaviour of a single situated agent, specifically, a mobile robot in the physical world, is an unsolved problem in robotics and AI." [2]

While it is true the exact trajectory of any dynamic system is not exactly knowable, Mataric's statement underestimates the value of a dynamics and controls analysis of even simplified problems. Indeed, amongst subsumption researchers in particular there is an open prejudice against simulation [Parker, 1994]. Imprecise, confusing terminology has obscured the specification of agent designs and hindered the formalization of a plainly successful real time control technique. The result is a collection of systems that have many common structural attributes but, surprisingly, little theoretical common ground.
To enable a theoretical treatment, this section will review the concepts behind agent control from the control theoretic standpoint, and establish formal definitions where possible. Through this process a set of hypotheses will be proposed that characterize agent design and performance objectives.

4.2.1 Goals

The term goal has many interpretations and is discussed in detail by Weir [Weir, 1984]. Colloquially understood to represent a desirable, possibly time varying, state of some system, the philosophical arguments reviewed in [Weir, 1984] center mostly on whether such states are external or internal to the construction of goal directed systems. For the purposes of machine control, however, a goal is generally equivalent to the desired behaviour of a controlled mechanical plant. Such behaviours are sometimes expressed as the abstract "wander" [Brooks, 1989c] while others are more precisely expressed as a static setpoint or time varying trajectory through some relevant coordinate system (e.g. position, velocity, temperature, etc.). A broad-based definition can then be proposed:

Definition: D4.1 (Goal) If a trajectory, $r_i(t)$, is specified to traverse through some space, $\mathcal{G}_i \subset \mathbb{R}^n$, then $r_i(t)$ is a goal and $\mathcal{G}_i$ a goal space.

There are no preconditions on the function $r_i(t)$. However, manipulator end effector goal functions $r_i(t) = x_d(t) \in \mathbb{R}^6$ are usually at least $C^1$ trajectories through task space. Since mobile systems often leave path planning to an on board reactive system, goals are usually discretely changing planar positions or perhaps intervening waypoints, or $r_i(t) = x_d(t) \in \mathbb{R}^3$ [Gat, 1994].

Footnotes to the quotation from [Mataric, 1994]: 1. page 27, paragraph 3. 2. page 29, paragraph 4.

Figure 4.14: A generic plant, $G_j(\cdot)$, with control input $u_i(t)$ and behaviour $y_j(t)$.

4.2.2 Behaviours

The artificial intelligence community uses the term behaviour to describe the observable evolution of an automaton's state.
In robot control, behaviour is understood to mean the performance of the robot within its environment. The colloquial use of "behaviour" in the context of machine control generally refers to the response of some plant. So to clarify the concept of behaviours, consider the plant in figure 4.14. In this figure a control effort, $u_i(t)$, and a disturbance, $v(t)$, drive a plant, $G_j(\cdot)$, to produce a response $y_j(t)$. To an observer, the behaviour of a plant is simply the observable response, $y_j(t)$.

Definition: D4.2 (Behaviour) Consider an arbitrary plant, $G_j$, with state, $x_j \in \mathbb{R}^{l_j}$, that is perturbed by both control effort, $u_i(t) \in \mathbb{R}^n$, and environmental disturbances $v(t) \in \mathbb{R}^n$. Then the plant's behaviour is the observed output, $y_j(t) : \mathbb{R}^{l_j} \times \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^+ \to \mathbb{R}^6$:

$$y_j(t) = G_j(t, x_j(t), u_i(t), v(t)) \tag{4.81}$$

and the state of the system evolves according to some relation:

$$\dot{x}_j(t) = f[t, x_j(t), u_i(t), v(t)], \quad \forall t > 0 \tag{4.82}$$

For example, given a linear time invariant system, a familiar multivariable description of the system can be composed:

$$\dot{x}(t) = A x(t) + B u(t) + D v(t) \tag{4.83}$$

$$y(t) = C x(t) + F u(t) \tag{4.84}$$

Through appropriate selection of $C$ and $F$, the system's behaviour, $y(t)$, can be defined. For a linear time invariant manipulator, $u(t)$ could be a vector of desired joint torques and, with $C = I$, the identity matrix, and $F = 0$, the observed behaviour could be defined as the end effector state $y(t) = [x \ \dot{x}]^T$. However, this expression could just as easily describe the $i$th time invariant link, in which case $u_i(t) = \tau_i$, a scalar, and $y_i(t) = [q_i \ \dot{q}_i]^T$.

D4.2 reflects the consensus that "behaviour" is the observed response of the plant. Since any realistic plant produces a bounded range of observable behaviours, a behaviour space can also be defined, formally:

Definition: D4.3 (Behaviour Space) If the set of all possible output trajectories, $y_j(t)$, are bounded within a region, $\mathcal{B}_j \subset \mathbb{R}^6$, then $\mathcal{B}_j$ is the behaviour space of $y_j(t)$.

4.2.3
Behaviour Control

In agent and multiagent control literature, confusion arises when investigators use the term behaviour. Though differences in desired and actual response in a plant are minimized through control, "behaviour" is usually used indiscriminately, equating control and response. For example, in describing an obstacle avoidance behaviour in a mobile robot it is often unclear whether the robot's final response or the driving control algorithm is described. In using such terminology, the process of controller design is ignored and, more significantly, the dynamics between control and environment overlooked. Nevertheless, the intent of this usage is clear: controllers can be designed to generate a specific predictable response from the system under specific stimuli. So to clarify this usage, a more precise concept, behaviour control, will be examined in detail.

The purpose behind behaviour control is the generation of a predictable response to a specific, but narrow, set of stimuli. Suppose a controller can be designed to react to a specific stimulus. The controller must map the stimulus through some setpoint to a control effort that generates the desired response from the plant. The setpoint, a point or trajectory in a controller's input space, is clearly a goal description and the plant's response, as described earlier, a behaviour.

Figure 4.15: A goal $r_i(t)$ input to the generic controller, $H_i(\cdot)$, with output $u_i(t)$ controls the plant, $G_j(\cdot)$, making the behaviour, $y_j(t)$.

Immediately, an elementary necessary condition of reachability can be applied to behaviour control. A goal is reachable by the plant if the goal resides within the range of all possible plant behaviours.
Definition: D4.4 (Reachable Goal) Given a behaviour space, $\mathcal{B}_j$, and a goal space, $\mathcal{G}_i$, a goal $r_i(t) \in \mathcal{G}_i$ is reachable if $r_i(t) \in \mathcal{G}_i \cap \mathcal{B}_j$.

A controller, as depicted in figure 4.15, achieves a desired response by modulating a control input to a plant such that the desired and actual responses converge, specifically:

Definition: D4.5 (Control) If a process $H_i(\cdot, r_i(t))$ producing a control effort $u_i(t)$ can be designed such that, given a reachable goal $r_i(t) \in \mathcal{G}_i$, the output behaviour, $y(t)$, is stable about $r_i(t)$, then the function, $H_i(\cdot, r_i(t))$, is a controller.

Thus control ensures that desired and actual responses are convergent. Clearly, goal seeking behaviour is synonymous with convergence:

Definition: D4.6 (Goal Seeking Behaviour) If the behaviour $y_j(t) \in \mathcal{B}_j$ is stable [Vidyasagar, 1993] about a reachable goal $r_i(t) \in \mathcal{G}_i$, then the behaviour is goal seeking.

This definition equates goal seeking with stability, including its progressively strict definitions such as local, global, asymptotic, and exponential stability [Vidyasagar, 1993]. D4.5 establishes an artificial, causal relationship between a goal state and the observed behaviour and imposes an artificial dynamic equilibrium on the system through the application of control forces. In contrast, the more general D4.6 states that goal seeking behaviour can arise from any dynamic equilibrium - no causal relationship has been assumed (though one may exist nevertheless). In short, goal seeking behaviour does not imply control, though control does imply goal seeking behaviour.

Figure 4.16: The generic controller, $H_i(\cdot)$, takes a goal $r_i(t)$ as input and observes disturbances $v(t)$ to output $u_i(t)$ and control the plant, $G_j(\cdot)$, making the behaviour, $y_j(t)$.

For example: given stable turn-right-to-avoid and turn-left controllers, a mobile robot will spontaneously follow a wall.
Thus the apparent goal seeking follow-wall behaviour arises out of equilibrium with the environment and the two controllers. Unless an explicit follow-wall goal was issued, one cannot say the robot is controlled by a follow-wall controller, only that the robot exhibits follow-wall behaviour.

Ideally, perfect control is the result of complete knowledge of the plant's dynamics and disturbances. When condensed into a model, this knowledge can be used within the controller to cancel undesirable dynamics and disturbances. In an unstructured environment, however, such complete knowledge is often too complex to maintain in real time. If only partial, task specific models are available, a task specific controller may be used to manage a particular disturbance or behaviour. In short, such behaviour controllers conforming to the structure in figure 4.16 respond to particular stimuli (e.g. wall collisions) using appropriate sensing (e.g. range finders), a suitable control algorithm, and a setpoint strategy to either minimize disturbances to the system or engage specific behaviours:

Definition: D4.7 (Behaviour Control) If a process $H_i(\cdot, r_i(t), v(t))$ can be designed such that, given a reachable goal $r_i(t)$, the output of $H_i(\cdot)$, $u_i(t)$, minimizes the effect of a specific disturbance $v(t)$ and if the behaviour $y_j(t)$ is stable about $r_i(t)$, then the function, $H_i(\cdot, r_i(t), v(t))$, is a behaviour controller.

Some good examples of behaviour control include:

• Hartley and Pipitone's [Hartley, 1991] aircraft landing gear deployment/retraction triggered at a fixed altitude.
• Arkin's [Arkin, 1987] obstacle avoidance triggered by obstacle range.
• Reynolds' [Reynolds, 1987] boid velocity matching behaviour.

Given these definitions, behaviour controlled systems seem to be simple variations on traditional control, albeit under new nomenclature. How does agent based control differ from traditional systems?
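The reachability test of D4.4 and the behaviour controller of D4.7 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration rather than part of the thesis: the plant is a scalar integrator ($\dot{x} = u + v$, $y = x$), the goal and behaviour spaces are closed intervals, and the control law is a proportional term plus cancellation of the sensed disturbance.

```python
"""Toy illustration of D4.4 (reachable goal) and D4.7 (behaviour control).

Assumptions (not from the thesis): a scalar integrator plant x' = u + v,
interval-valued goal and behaviour spaces, and a proportional control law.
"""


def is_reachable(goal, goal_space, behaviour_space):
    """D4.4: a goal is reachable if it lies in both the goal and behaviour spaces."""
    g_lo, g_hi = goal_space
    b_lo, b_hi = behaviour_space
    return max(g_lo, b_lo) <= goal <= min(g_hi, b_hi)


def behaviour_controller(goal, y, v, gain=2.0):
    """H_i(., r_i, v): pull y towards the goal and cancel the sensed disturbance."""
    return gain * (goal - y) - v


def simulate(goal, disturbance, x0=0.0, dt=0.01, steps=1000):
    """Forward-Euler integration of the plant x' = u + v with output y = x."""
    x = x0
    for k in range(steps):
        v = disturbance(k * dt)
        u = behaviour_controller(goal, x, v)
        x += dt * (u + v)
    return x


if __name__ == "__main__":
    goal = 1.0
    print(is_reachable(goal, (-2.0, 2.0), (-1.5, 1.5)))
    print(simulate(goal, lambda t: 0.5))
```

In this sketch the output converges on the goal despite the constant disturbance, so the behaviour is goal seeking in the sense of D4.6; a goal outside the behaviour interval fails the reachability test and no controller could make the plant track it.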
4.3 A General Model of Agency

Of course, agents are traditional control systems that exploit multiple goal systems to achieve desired behaviour. In highly structured environments, it is possible to devise a model based traditional closed loop control system to detect and compensate for all likely disturbances. However, as the environment becomes more complex so too do the model dynamics, greatly challenging centralized closed loop architectures in performance, cost, and complexity. One solution to the growing complexity of centralized model-based controllers is to arbitrate between a set of controllers, each coping with a specific contingency, to generate desirable plant behaviour. Though unproven, it is widely accepted that arbitrary behaviours can be decomposed into simultaneous and/or sequential 'atomic' behaviours. Usually, atomic behaviours respond to a physically accountable stimulus, producing behaviour through either attractive or repulsive controllers.

All arbitrators seem to fall between two extremes: combination and switching. Combined arbitration mixes the outputs of behavioural controllers algebraically to achieve the composite behaviour. Conversely, switching arbitrators switch from one controller to another based on some criterion. In either instance, arbitration between behaviours plays a critical role in the generation of final desired behaviour. Figure 4.17 documents this multicontroller structure while figure 4.18 depicts a condensed shorthand representation.
The set of controllers and the arbitration process together form a new control element known as an agent:

Definition: D4.8 (Agent) Given a plant, $G_j(t, x_j(t), u_i(t))$, and a set of behaviour controllers, $H_i(t, \cdot) : 1 \le i \le b$, an agent, $A_j$, combines $u_i : 1 \le i \le b$ to form a composite control effort, $u_{cj}(t)$, to generate the behaviour $y_j(t)$ or:

$$u_{cj}(t) = A_j(u_1(t), \ldots, u_b(t)) \tag{4.85}$$

$$y_j(t) = G_j(t, x(t), u_{cj}(t), v(t)) \tag{4.86}$$

By mapping the relatively informal definitions and structures from literature to more stringent terms within control theory, it is possible to specify an agent as a component composed of three fundamental parts:

• a plant: a controlled dynamic system acting within some environment.
• a set of behaviour controllers that respond to specific environmental stimuli.
• an arbitrator that combines behaviour controllers to control the plant.

Figure 4.17: A set of goals $r_i(t) : 1 \le i \le b$ input to generic controllers, $H_i(\cdot) : 1 \le i \le b$, each observing select disturbances $v(t)$ to output $u_i(t) : 1 \le i \le b$ to the arbitrator $A_j(\cdot)$. The composite output $u_{cj}$ controls the plant, $G_j(\cdot)$, to make the behaviour, $y_j(t)$.

In effect, the combination process condenses and segments the model-plan portion of the SMPA cycle into smaller subproblems. From the earlier literature review, it is clear that many believe that this represents a fundamental control strategy, specifically:

Hypothesis: H1 (Agent Control) Given a reachable goal, $r_{cj}(t)$, and a set of behaviour controllers, $u_i(t) = H_i(t, \cdot, r_i(t)) : 1 \le i \le b$, an agent, $A_j$, can be designed to combine $u_i(t)$ into $u_{cj}(t)$ such that the composite behaviour $y_j(t)$ is stable about $r_{cj}(t)$.

In effect, $A_j$ acts as a composite controller with output, $u_{cj}(t)$, or:

$$u_{cj}(t) = A_j(r_{cj}(t), y(t), u_i(t), v(t)) \tag{4.87}$$

Based on control theoretic arguments, a general model of agency has been proposed.
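The three-part structure of D4.8 (plant, behaviour controllers, arbitrator) can be made concrete with a small sketch. The specifics are assumptions for illustration only: a single joint modelled as an integrator ($\dot{q} = u_c$), two hypothetical behaviour controllers named `track_goal` and `avoid_limit`, hypothetical gains and limits, and a plain summing arbitrator standing in for $A_j$ of equation (4.85).

```python
"""Sketch of D4.8: an agent as plant, behaviour controllers, and arbitrator.

Assumptions (not from the thesis): a single joint modelled as an integrator
(q' = u_c), two hypothetical behaviour controllers, and a summing arbitrator.
"""

GOAL = 0.9    # desired joint position (rad), hypothetical
LIMIT = 1.0   # joint limit (rad), hypothetical
MARGIN = 0.2  # distance from the limit at which avoidance engages (rad)


def track_goal(q, gain=2.0):
    """Attractive behaviour controller: pull the joint towards GOAL."""
    return gain * (GOAL - q)


def avoid_limit(q, gain=6.0):
    """Repulsive behaviour controller: push back once inside the margin."""
    return -gain * max(0.0, q - (LIMIT - MARGIN))


def arbitrator(efforts):
    """A_j(u_1, ..., u_b) of equation (4.85), here a plain sum."""
    return sum(efforts)


def run_agent(q0=0.0, dt=0.01, steps=2000):
    """Integrate the plant under the composite effort u_c (forward Euler)."""
    q = q0
    for _ in range(steps):
        u_c = arbitrator([track_goal(q), avoid_limit(q)])
        q += dt * u_c
    return q


if __name__ == "__main__":
    print(run_agent())
```

With these particular gains the joint settles near 0.825 rad, where the attractive and repulsive efforts balance: the composite behaviour is stable, but about a compromise between the two controller goals rather than about either one alone. This is why the choice of arbitrator is treated as a design problem in its own right.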
Given a fixed plant, this model highlights the two fundamental design problems of agent based systems: behaviour controller and arbitrator design. Though apparently distinct in this model, behaviour controller and arbitrator design are not always distinct in practice and are rarely simple in structure. While some architectures draw clear distinctions between control and arbitration (e.g. the DAMN architecture), others appear to closely couple decision and control (e.g. subsumption or voting models). Furthermore, in some systems (e.g. subsumption and the SAN networks) arbitration is driven by individual controllers (a bottom up approach) while others rely on a centralized decision making process (e.g. Andersson's Ping Pong Player). Structurally, arbitration mechanisms have been formed from linear (e.g. Motor Schemas) and nonlinear (subsumption), discrete (e.g. AFSMs) and continuous functions of controller output, sensor data, and time.

Figure 4.18: A set of behaviour controllers, $r_i(t)$ and $H_i(\cdot)$, each observe select disturbances $v(t)$ to output $u_i(t)$ to the arbitrator $A_j(\cdot)$. $A_j(\cdot)$ arbitrates amongst these controllers to track a desired goal $r_{cj}(t)$. The composite output $u_{cj}(t)$ controls the plant, $G_j(\cdot)$, to make the behaviour, $y_j(t)$.

A first step to understanding arbitrator design and the coupling between arbitration and control can be made by examining the information flow or communication between these controllers and the arbitrator. The following section examines these data flow relationships to aid in the further classification of arbitration schemes.

4.3.1 Information Exchange

Information exchange can be classified according to the relationship between sender and receiver of a message. Fundamentally, a transmission has a source or sender, an audience or receiver, and possesses data content, a message. The act of communication reflects two assumptions: 1.
a sender exists that can assemble a meaningful message.
2. a qualified receiver exists to disassemble the message.

Between a number of entities these assumptions are complicated by the potential of one-to-one or one-to-many transmission events. In one-to-one (or one-to-few) communication, the sender must be able to discriminate between potential receivers (e.g. through frequency selection, message tagging, encryption, etc.). To do so requires that the sender maintain a model of the intended receiver (e.g. the receiver's frequency, identifier, or encryption key). In one-to-many communication, this need not be the case; the sender need only maintain a common communication standard (e.g. through a common frequency, message identifier, or encryption algorithm). From this basic description some definitions may be proposed that assist in the characterization of agent arbitration (and, later, multiagent coordination) strategies. First, the concept of a message can be formalized as a representation built out of some consistent language (here drawing briefly from elementary formal-language theory [Cormen, 1990]):

Definition: D4.9 (Message) Given
• a set of symbols or alphabet, $\Sigma$, and
• a set of strings composed of the alphabet $\Sigma$ or language, $L \subseteq \Sigma^*$,
a message is simply an element $m \in L$.

Thus a message can range from a lone 8 bit character to an elaborate set of data structures, the meaning of which is determined by mutual agreement of sender and receiver. Transmission of a message is simply the transfer of an interpretation of a representational structure from one entity to another or:

Definition: D4.10 (Transmission) Given a sender, $S$, a receiver, $R$, and a message, $m$, a transmission of $m$ is the act of conveying $m$ from $S$ to $R$ through some medium.

The correct assembly of the message is the duty of the sender while interpretation of the same message is the duty of the receiver.
Transmission could range from something as basic as variable sharing within a single program to interprocess communication between hosts (i.e. independent computers) over some network connection. A sender may possess the ability to discriminate a subset of receivers from a field of candidates and transmit to them alone - an act of one-to-one or one-to-few communication, here termed directed transmission:

Definition: D4.11 (Directed Transmission) Given
• a sender, $S$,
• a set of receivers, $R_j : 1 \le j \le N_r$,
• a message, $m \in L$,
then if $S$ somehow identifies a unique receiver $d \in \{1, \ldots, N_r\}$ and transmits $m$ to $R_d$, the transmission of $m$ is a directed transmission and the message $m$ is private.

In other words there exist messages for which the sender must know the identity of the receiver prior to transmission. Clearly, private messages exploit particular capabilities within particular receivers to interpret and/or act upon a message. So the act of directed transmission requires that the sender refer to an internal model of the receiver to both identify the receiver and compose a message that the receiver will understand. Within a multiprocess agent (i.e. an agent composed of many processes), the implications of such actions are obvious: the sending process must know the location of the receiving process (e.g. a UNIX socket or memory address) and must send a legible message (e.g. a byte format). In single process agents, directed transmission amounts to explicit variable sharing between functions or objects. Conversely, it is possible the sender does not know the identity of the receiver(s) prior to transmission, in which case the content of the message must adhere to some common or public standard, i.e.
a broadcast transmission:

Definition: D4.12 (Broadcast Transmission) Given
• a sender, $S$,
• a group of receivers, $R_j : 1 \le j \le N_r$,
• a message, $m \in L$,
then if $S$ does not identify a unique receiver and transmits $m$ to all $R_j : 1 \le j \le N_r$, the transmission of $m$ is a broadcast transmission and the message $m$ is public.

This definition implies that a public message contains data that may be translated and/or acted upon by any or all receivers. These definitions can now be applied to the classification of agent arbitration and control strategies. Mataric [Mataric, 1994] offers similar definitions of directed communication but does not distinguish between one-to-one and one-to-many. Interestingly, she defines indirect communication as the act of one agent observing another's behaviour, known as stigmergic communication in biology.

4.3.2 Arbitration

The combination of behaviours to form a composite behaviour is commonly referred to as arbitration. Many strategies have been discussed in literature, from subsumption [Ferrell, 1993] and task priority [Nakamura, 1984] to weighted summation [Rosenblatt, 1995a] and discrete event systems [Kosecka, 1994]. No single technique is definitively more applicable to autonomous operation than any other, though subsumption has enjoyed wider application than most arbitration techniques.

Behaviour arbitration usually falls between concurrent behavioural combination and serial temporal sequencing. Between these extremes, an arbitrator combines the output of concurrent behaviours, changing the mixture over time. However, most arbitrators occupy the extreme regions of this design spectrum, resulting in some common arbitration strategies. Mixtures of these systems generally employ combination in the basic controllers and build meta controllers through switching [Mataric, 1994]. Colombetti et al [Colombetti, 1996] supply a classification notation descriptive of many multiagent systems:

• Independent Sum.
Two or more behaviours (e.g. $\alpha$ and $\beta$) act independently (i.e. on different actuators):

$$\alpha \,|\, \beta \tag{4.88}$$

• Combination. Two or more behaviours are combined into a new behaviour on a single actuator:

$$\alpha + \beta \tag{4.89}$$

• Suppression. One behaviour inhibits another, not necessarily on the same actuator:

$$\frac{\alpha}{\beta} \tag{4.90}$$

• Sequence. A behavioural pattern is built as a sequence, $\sigma$, of simpler behaviours, again not necessarily on the same actuator:

$$\sigma = \alpha \cdot \beta \cdot \ldots \tag{4.91}$$

and a repeating pattern may be suffixed with an asterisk, or $\sigma^*$.

In the transformation of controller output, $u_i$, into composite output, $u_c$, all arbitration algorithms draw from one or more of these tools. As mentioned earlier, the integration of arbitration and control varies considerably in practice. Some arbitrators rely on a centralized model to select a specific controller. Others broadcast a single objective to the controller set and rely on a voting or summation strategy to determine the appropriate response. Finally, some arbitrators rely exclusively on environmental interaction to select the appropriate controller. Based on these observations the following classes of arbitration are proposed:

• explicit arbitration
• implicit arbitration
• emergent arbitration

Explicit arbitration transforms $r_c$ into a set of subgoals, $r_i$, executed by the controllers, $H_i$, as directed by the arbitrator. Implicit arbitration occurs when each controller determines the appropriate response to the composite goal without direction from the arbitrator. Finally, emergent arbitration occurs when the controllers drive the plant towards $r_c$ based purely on environmental interaction and without the application of an externally defined goal system. The communication definitions developed earlier can now be applied to form definitions that formally characterize these arbitration approaches:

Definition: D4.13 (Explicit Arbitration) If an agent, $A$, with behaviour controllers $H_i(t, \cdot, r_i(t), v(t)) : 1 \le i \le b$
1.
directly transmits $r_i(t)$ to controllers $H_i(t, \cdot, r_i(t), v(t))$,
2. combines the output $u_i(t)$ to form a composite output $u_c(t)$,
3. achieves a stable trajectory $y_c(t)$ about $r_c(t)$,
$A$ exercises control through explicit arbitration or simply explicit control.

In effect, the agent determines which controller can best achieve the desired behaviour. To do so requires that the arbitrator maintain an accurate model of the participating controllers and that the desired behaviour can be explicitly decomposed into controller setpoints. The direct transmission of setpoints $r_i(t)$ is symptomatic of model maintenance since each setpoint is correlated with each controller. An example of explicit arbitration is common in manipulation control, where a supervisory controller determines the necessary setpoints for each link controller. Since each setpoint is correlated with a particular link controller, setpoints must be transmitted directly to each controller (often through a simple moveto procedure call). The less strict implicit arbitration permits each controller to determine suitable setpoints autonomously:

Definition: D4.14 (Implicit Arbitration) If an agent, $A$, with behaviour controllers $H_i(t, \cdot, r_i(t), v(t)) : 1 \le i \le b$
1. broadcasts the composite goal $r_c(t)$ to controllers $H_i(t, \cdot, r_c(t), v(t))$,
2. combines the output $u_i(t)$ to form a composite output $u_c(t)$,
3. achieves a stable trajectory $y_c(t)$ about $r_c(t)$,
$A$ exercises control through implicit arbitration or simply implicit control.

Again, broadcast information transmission is symptomatic of a relatively model-free arbitrator. With a broadcast goal, there is no explicit one-to-one correspondence between goal and setpoint assignment by the agent. In effect, the agent leaves the applicability of a given control strategy to the controller, and arbitrates between the results using some selection criteria (e.g. votes, etc.).
Implicit arbitration is rare in traditional control but common in daily activity. Consider the combat pilot's task of formation flying. The flight leader does not specify a trajectory for each wingman (in fact the leader may not know how many wingmen are present), but relies on the wingmen's ability to track the flight leader and maintain intervals autonomously. Emergent arbitration permits each controller to respond purely to environmental stimuli - the environment becomes the arbitrator:

Definition: D4.15 (Emergent Arbitration) If an agent, $A$, with behaviour controllers $H_i(t, \cdot, r_i(t), v(t)) : 1 \le i \le b$
1. only combines the output $u_i(t)$ to form a composite output $u_c(t)$,
2. achieves a stable trajectory $y_c(t)$ about $r_c(t)$,
$A$ exercises control through emergent arbitration or simply emergent control.

With no arbitrator to controller transmissions, an emergent arbitrator becomes a simple algebraic function or switching protocol of stimulus driven controller responses. Emergent controllers rely on a complex interaction of three dynamic systems to achieve the desired behaviour: the environment, the combination mechanism, and the controllers. Though this last form may seem somewhat contrived, in fact it represents a very common method of generating useful composite behaviours such as the wall following combination: avoid-obstacle and move-ahead [Arkin, 1987]. By fixing the arbitrator as a finite state machine switched between active behaviours, this stimulus driven controller combination will drive the system parallel to obstructing walls.

This classification strategy dissects predictive modelling and planning out of arbitration. The specification of setpoints based on a centralized model is a form of a priori arbitration, implying that a monolithic internal representation of the world is sufficiently accurate to plan future actions. By classifying setpoint transmissions as directed or broadcast, arbitration can, therefore, be categorized as explicit or implicit.
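The link between transmission type and arbitration class in D4.13 to D4.15 can be summarized in a short sketch. The `Transmission` record, its field names, and the `"mixed"` label for an arbitrator that uses both directed and broadcast messages are hypothetical, introduced here only for illustration.

```python
"""Sketch: classify an arbitrator by the setpoint transmissions it makes.

Assumptions (not from the thesis): the Transmission record and its fields,
and the "mixed" label for arbitrators using both kinds of transmission.
"""
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Transmission:
    """A message sent from the arbitrator to its controllers (D4.10)."""
    message: str
    receiver: Optional[int] = None  # controller index if directed, None if broadcast

    @property
    def directed(self) -> bool:
        """D4.11: a directed transmission names a unique receiver."""
        return self.receiver is not None


def classify_arbitration(sent: Sequence[Transmission]) -> str:
    """Map the arbitrator's outgoing transmissions to an arbitration class."""
    if not sent:
        return "emergent"   # D4.15: no arbitrator-to-controller transmissions
    if all(t.directed for t in sent):
        return "explicit"   # D4.13: setpoints sent directly to each controller
    if not any(t.directed for t in sent):
        return "implicit"   # D4.14: composite goal broadcast to all controllers
    return "mixed"          # both kinds operating within one agent


if __name__ == "__main__":
    per_link = [Transmission("q1_d=0.3", receiver=0), Transmission("q2_d=1.1", receiver=1)]
    print(classify_arbitration(per_link))
    print(classify_arbitration([Transmission("x_d=(0.5, 0.2)")]))
    print(classify_arbitration([]))
```

The per-link setpoints correspond to the supervisory manipulator example, where each setpoint is tied to one link controller; the single broadcast goal corresponds to the formation flying example, where the leader need not know who is listening.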
If no a priori setpoints are transmitted, then arbitration must be emergent or purely stimulus driven. Emergent arbitration does not preclude feedforward or predictive models within each controller; rather it limits their use to specific, narrowly defined problems. Similarly, models within an emergent arbitrator (i.e. those based solely on controller responses) are limited to action selection alone.

In theory, the spectrum of explicit, implicit, through emergent arbitration represents a migration from centralized to decentralized modelling; relatively fragile single threaded control to relatively robust parallelism; and reliance on computational determinism to dependence on environmental/controller interaction. In practice, agent-like systems often exhibit one or more of these arbitration strategies. In other words, explicit, implicit and stimulus driven arbitrators may operate concurrently within a given agent.

Figure 4.19: A simple linear agent arbitration model.

4.3.3 Example: A Linear Arbitration Model

Despite the significance of the arbitrator in most real time systems, few choose to explicitly describe the arbitrator as part of the control system, preferring to classify the arbitrator as a computing and not a control element. To explore arbitration more deeply, a simple agent model can be presented that distinguishes the control from the arbitration while retaining a control theoretic representation. Consider a model of $A$ in which the outputs of $b$ controllers are linearly combined to form a composite control action or:

$$u_c(t) = A(r(t), y(t), v(t), u(t)) = k^T(t)\, u(t, r(t)) \tag{4.92}$$

where $k(t) = [k_1(t), \ldots, k_b(t)]^T$ is a set of $b$ gains and $u(t) = [u_1(t), \ldots, u_b(t)]^T$.
Since the gain vector represents any class of arbitrating mechanism, from finite state machine to continuous combinatorial system, $k$ should evolve as a nonlinear function of environmental events, $v(t)$, time, and possibly its current state:

\[ \dot{k}(t) = f(t, k(t), r(t), y(t), v(t)) \tag{4.93} \]

as depicted in fig. 4.19. Though $f(\cdot)$ has been chosen to map continuous time to a continuous domain $k(t)$, plainly many arbitrators use discrete domains and discrete time measures [Zhang, 1995]. Indeed, the evolution of $k(t)$ might be defined as a function of either a finite state machine (discrete time and domain), a difference equation (discrete time and continuous domain), an asynchronous circuit (continuous time and discrete domain), or a differential equation (continuous time and continuous domain).

Recalling Colombetti's [Colombetti, 1996] composition operators, the outputs of controllers $H_\alpha(\cdot)$ and $H_\beta(\cdot)$, $u_\alpha$ and $u_\beta$ respectively, might be combined through linear arbitration vectors as:

• Independent Sum:
\[ \begin{bmatrix} u_{c\alpha}(t) \\ u_{c\beta}(t) \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u_\alpha \\ u_\beta \end{bmatrix} \tag{4.94} \]

• Combination:
\[ u_c(t) = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} u_\alpha \\ u_\beta \end{bmatrix} \tag{4.95} \]

• Suppression:
\[ u_c(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} u_\alpha \\ u_\beta \end{bmatrix} \tag{4.96} \]

• Sequence:
\[ u_c(t) = k^T(t) \begin{bmatrix} u_\alpha \\ u_\beta \end{bmatrix} \tag{4.97} \]
\[ k^T(t) = \delta(k(t), t) = \begin{cases} [\,1 \;\; 0 \;\; \ldots\,] & t < t_\beta \\ [\,0 \;\; 1 \;\; \ldots\,] & t \ge t_\beta \end{cases} \tag{4.98} \]

where $\delta(\cdot)$ is the delta function indicating a discrete change in state. Therefore $k$ is a temporal sequencer or finite state machine within which mutual suppression, combination and/or independent action can be expressed.

Now if $r_i$ is based on a centralized model, it must be transmitted directly to each agent, symptomatic of explicit arbitration. If $r(t)$ is unknown a priori and broadcast to all agents, then arbitration is implicit. Finally, if $r_i(t)$ is internal to each controller, arbitration must be emergent.

Equivalency

Most of the models described here use elementary combination or sequencing.
Indeed, Arkin's motor schemas, Steels' behaviours, Van de Panne's SAN, and the DAMN architecture are plainly behaviour summation strategies. In contrast, subsumption and DES are plainly discrete, using finite or augmented finite state machines to evolve $k$ over time (e.g. [Mahadevan, 1992]).

4.3.4 From Lone Agents to Agent Teams

During the design evolution of a complex single agent system, it is conceivable that individual controllers within an agent might themselves become sufficiently complex that they acquire the attributes of agents. Indeed, many multiagent systems are composed of agent hierarchies, in which a set of agents operates within a larger agent (e.g. subsumption). Alternatively, tasks may exist that are practically impossible to fulfill with a single agent (e.g. lacking time, energy, or degrees of freedom). In either case, close examination of the goal system and the available resources may suggest that more than one agent is necessary to achieve the desired behaviour.

Not surprisingly, real time constraints make centralized control of agent teams impractical. For this reason, the coordination and control of agent based robot teams has become a growing area of investigation and, as we shall see in the next section, shares many of the attributes and definitions of agent based control systems.

4.4 A General Model of Multiagency

In the previous section, definitions of behaviour, control, behaviour control, and agents were established and an Agent Control hypothesis proposed. Within a team these definitions remain largely unchanged: the team is composed of a set of plants, each perturbed by disturbances and controlled by an agent controller. The advantage that homogeneous teams of agents hold over lone agents is that the goal trajectory need not be reachable by all team members all the time. If members cooperate or compete to fulfill the goal trajectory, a degree of robustness and flexibility can be afforded.
The behaviour of a multiagent team, or global behaviour, is an aggregation of the local behaviour of every agent in the team. Global goals are, of course, the desired behaviour of the agent team and are analogous to composite goals in single agent systems. Achieving a global goal is more complex than achieving a composite goal, since controller arbitration within a single agent is less complex than behaviour coordination amongst an agent team. As observed by Mataric [Mataric, 1994], the size of a discrete state space balloons from $s$ states for a lone agent to $s^a$ for a team of $a$ identical agents.

Just as in the lone agent case, information exchange governs the classifications of explicit, implicit, and emergent coordination. Historically, the ideal global behaviour arises from emergent coordination, in which a set of agents produces the desired behaviour through interagent dynamics and environmental interaction. Since both behaviours and the manner in which they are combined (e.g. temporally, algebraically, etc.) greatly influence the output of such systems, the design of local behaviours becomes critically important. Without an explicit global goal definition, local behaviour design has historically been an exercise in trial and error. Thus emergent behaviour is as potentially powerful as it is difficult to implement.

Therefore, this thesis attacks the multiagent control problem assuming an explicit a priori global goal statement. This approach is supported by two arguments. The first is that many tasks exist that must be executed with some predictable guarantees of stability and precision. The second is that through a top down analysis and decomposition of a global goal system, a clearer understanding of multiagent dynamics might be achieved. Given this investigation strategy, the following sections must again address structure and nomenclature in a manner consistent with the foregoing definitions, in particular:

• What is global behaviour?
• What is a global goal?
• What is the structure of a multiagent system?
• How are global behaviours derived from local behaviours?
• How are local goals derived from global goals?

4.4.1 Local Behaviour

In a multiagent system, a group of agent controllers generate behaviours through their respective plants. Recalling that each agent arbitrates between $b$ local controllers, $H_i : 1 \le i \le b$, we can define local behaviour in a familiar form:

Definition: D4.16 (Local Behaviour) Consider the $j$th subplant, $G_j : 1 \le j \le N$, within a composite system perturbed by both control effort, $u_{ij}(t) : 1 \le i \le b$, and environmental disturbance $v_j(t)$. Then the plant's local behaviour is the observed output, $y_j(t) \in B_j$.

In a flock of boids, for example, the trajectory of boid $i$ is the local behaviour of boid $i$. Flock behaviour, on the other hand, is clearly an aggregate characteristic or global behaviour of the collection of boids.

4.4.2 Global Behaviour

In [Mataric, 1994], global behaviour is termed ensemble, collective, or group behaviour and is defined loosely as:

"...an observer defined temporal pattern of interactions between multiple agents."

If the dynamics of a team of agents were enclosed within a 'black box', the system's behaviour could be defined through definition D4.2 as the behaviour of the black box and the structure would be indistinguishable from figure 4.14.
Thus the definition of Global Behaviour is simply an extension of D4.2:

Definition: D4.17 (Global Behaviour) Given plants, $G_j : 1 \le j \le N$, perturbed by control efforts, $u_{cj}(t) : 1 \le j \le N$, and disturbances $v_j(t) : 1 \le j \le N$, global behaviour is the observed output, $y(t) : \mathbb{R}^{+} \times \mathbb{R}^{n} \times \mathbb{R}^{k} \to \mathbb{R}^{N}$, of the aggregate system:

\[ y(t) = G(t, x(t), u_c(t), v(t)) \tag{4.99} \]

where $x(t) = [x_1(t), \ldots, x_N(t)]^T$ and $u_c(t) = [u_{c1}(t), \ldots, u_{cN}(t)]^T$ and the state of the system evolves according to some relation:

\[ \dot{x}(t) = f[t, x(t), u_c(t), v(t)], \quad \forall t \ge 0 \tag{4.100} \]

and a definition of the global behaviour space may be similarly defined as:

Definition: D4.18 (Global Behaviour Space) If the set of all possible output trajectories, $y(t)$, are bounded within a region, $B$, then $B$ is the behaviour space of $y(t)$.

Observing a large flock of birds, global behaviour is often observable as a flockwide change in direction and the behaviour space, possibly, as a complex mixture of gross motion and flock shape. To the observer, a multiagent system may appear to behave as a single 'black box' plant, $G$ (i.e. at great distance a flock may appear as a coherent mass). On closer inspection the behaviour exhibited by the system arises from the combination of agent action and interaction (or birds in a flock). Thus the black box structure of definition D4.2, depicted in figure 4.20, is more precisely described by the complex structure of figure 4.21, in which the global behaviour is the product of some binding aggregate relation applied to agent behaviours (footnote 3).

Figure 4.20: A multiagent controller model. Note that each agent acts upon the global plant $G$, though each agent may apply control effort to only a portion of the plant $G_j$.
Drawing again on the flock of birds, the aggregate relation is some function that relates the properties of each individual bird to the behaviour of the flock (e.g. bird position and flock centroid).

Footnote 3: Note that it is not apparent in this figure that as each subplant $G_j(t)$ generates behaviour $y_j(t)$, it may also disturb other subplants in the system. The term, $v(t)$, however, embodies all such disturbances.

Definition: D4.19 (Aggregate Relation) Given the behaviour of $N$ agents, an aggregate relation is a function that maps the set of behaviours $y_j(t) : 1 \le j \le N$ of the combined system into an observable global behaviour $y(t)$ in some global behaviour space $B$ or:

\[ y(t) = f(y_1(t), \ldots, y_N(t)) \tag{4.101} \]

Thus, aggregate relations express constraints on the behaviour of a multiagent system, relating the behaviour of individual plants to the behaviour of the global plant. Such constraints can be physical equality or inequality constraints (e.g. end effector position, obstacle boundaries, etc.) or performance constraints (e.g. minimum time or minimum energy). For example, holonomic constraints on a multiagent system of dimension $N$ described in some coordinate system $\eta$ can be expressed through the equality constraint:

\[ y - f(\eta_1, \ldots, \eta_N) = 0 \tag{4.102} \]

Such expressions are greatly simplified through the adoption of a set of generalized coordinate systems. The behaviour of the system, $y \in B$, and the generalized coordinates of the $j$th of $N$ agents, $y_j \in B_j$, are often related by a nonlinear transformation $f : B_1 \times \ldots \times B_N \to B$ or:

\[ y = f(y_1, \ldots, y_N) \tag{4.103} \]

where the coordinates $y_j$ are independent of the constraints.
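The flock example can be made concrete with a small sketch in which the aggregate relation of D4.19 is taken to be the flock centroid. The planar positions, the function names `flock_centroid` and `constraint_residual`, and the sample data are hypothetical choices for illustration; the thesis does not prescribe a particular aggregate relation for flocks.

```python
from typing import Sequence, Tuple

Vec2 = Tuple[float, float]


def flock_centroid(positions: Sequence[Vec2]) -> Vec2:
    """Aggregate relation f: maps N local behaviours y_j (bird positions)
    to a single global behaviour y (the flock centroid)."""
    n = len(positions)
    if n == 0:
        raise ValueError("at least one agent is required")
    return (sum(p[0] for p in positions) / n,
            sum(p[1] for p in positions) / n)


def constraint_residual(y: Vec2, positions: Sequence[Vec2]) -> Vec2:
    """Residual y - f(y_1, ..., y_N); zero when y satisfies the constraint."""
    fx, fy = flock_centroid(positions)
    return (y[0] - fx, y[1] - fy)


birds = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
centroid = flock_centroid(birds)
print(centroid, constraint_residual(centroid, birds))
```

Any other function of the local behaviours (flock extent, heading, and so on) could be substituted for the centroid without changing the structure of the sketch.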
For example, the cartesian position of a manipulator end effector, $y$, is clearly a function of the cartesian positions of the manipulator's links, forming the constraint:

\[ y - f(x_1, \ldots, x_N) = 0 \tag{4.104} \]

Since the $j$th joint's cartesian position, $x_j$, is a function of joint displacements, this constraint can be further simplified through configuration space generalized coordinates, $\eta_j = q_j$, or:

\[ y - f(q_1, \ldots, q_N) = 0 \tag{4.105} \]

the forward kinematic solution. The role of the aggregate relation is depicted in figure 4.21. Many examples can be drawn such as:

• mobile robot troops: local robot trajectories combine to form troop formations.
• legged walkers: local leg trajectories combine to form gait patterns.
• vehicle control: individual motor outputs combine to form vehicle thrust.

Of course, there exist systems for which establishing a desired behaviour is trivial in comparison to generating the aggregate relation necessary to describe the aggregate behaviour (footnote 4).

Footnote 4: e.g. Stewart platforms [Fichter, 1996], a parallel mechanism in which a known platform position easily specifies leg lengths but known leg lengths do not easily specify platform position.

4.4.3 Multiagent Coordination

From the definition of aggregate relation, any function of agents' behaviours is an acceptable aggregate relation. Thus, given the local behaviours of an agent team, aggregate relations describe a global behaviour. Achieving a desired global behaviour requires that local behaviours be organized into coordinated behaviour. More precisely:

Definition: D4.20 (Coordinated Behaviour) If agents $A_j : 1 \le j \le N$ demonstrate a global behaviour, $y(t)$, stable about $r(t)$, given a set of local composite goals $r_{cj} : 1 \le j \le N$, then the agents exhibit coordinated global behaviour through some aggregate relation.

Figure 4.21: A decomposed model of a multiagent controller. Though each agent acts upon a local plant $G_j$, disturbances dynamically bind plants to form a single global plant $G$. The aggregate relation $f(\cdot)$ represents the binding between the local and global behaviour states $y_j$ and $y$ respectively.

Just as an agent's arbitration strategy may be classified as explicit, implicit, or emergent, so too can multiagent coordination. The definitions of behaviour, D4.2, and global behaviour, D4.17, are virtually identical and the control of such systems raises similar issues of centralized and decentralized arbitration. Indeed, it is this very duality that allows single agent systems to be decomposed into multiagent systems.

In explicit coordination, a centralized process composes a unique instruction set, $r_{cj}(t)$, for the $j$th of $N$ agents in a multiagent system or:

Definition: D4.21 (Explicit Coordination) If agents, $A_j : 1 \le j \le N$, achieve coordinated behaviour through direct transmission (i.e. private message) of composite goals $r_{cj}(t)$ then agents $A_j : 1 \le j \le N$ exhibit control through explicit coordination.

In order to collate goal messages and receivers prior to a directed transmission, the centralized sending process must possess a description or model of the receiving process. Explicit coordination identifies a particular agent receiver as capable of understanding and acting upon a goal message of a particular type. For example, air traffic controllers assign unique instructions to each aircraft (e.g. altitude, heading, take off and landing order, etc.) on approach or take off, but do not necessarily instruct each aircraft how to achieve these goals (e.g. stick/rudder/throttle positions). To achieve such explicit coordination requires an accurate centralized modelling system (e.g. radar imaging, transponders, air to ground communication, etc.).

Implicit coordination, however, disposes of a centralized model of the multiagent system while retaining a centralized goal generation process. Such coordination requires that each agent is able to interpret and act locally to pursue some form of global goal expression. Furthermore, implicit coordination suggests that the goal generation process does not maintain a model of each receiver, permitting the same message to be broadcast to any number of receivers.

Definition: D4.22 (Implicit Coordination) If agents, $A_j : 1 \le j \le N$, achieve coordinated behaviour through broadcast transmission (i.e. a public message) of a goal, $r_c(t)$, then the agents $A_j$ exhibit control through implicit coordination.

Again drawing on air traffic control, air strikes in ground support are sufficiently uncertain to render explicit air traffic control impossible (however desirable). Though general directives (e.g. targeting, safe flight corridors, etc.) may be supplied by ground and air controllers, individual aircraft behaviour is determined by pilots as conditions warrant - an example of implicit coordination.

Given sufficient communication resources any single agent system could be decomposed into an explicitly coordinated multiagent system. In such systems a centralized process engages in bilateral information exchanges to achieve the global goal. Implicit systems presumably require lower bandwidth, unilateral communications between the goal generation and agent processes.

In emergent coordination, the ideal system falls 'naturally' into the desired goal state through interagent dynamics:

Definition: D4.23 (Emergent Coordination) If agents, $A_j : 1 \le j \le N$, achieve coordinated behaviour, $y(t)$, through neither directed nor broadcast transmissions of goals $r_{cj}(t)$ or $r_c(t)$ respectively, then the agents $A_j : 1 \le j \le N$ exhibit control through emergent coordination.
In a dogfight, air traffic is still less structured and pilots often must act independently based on local observation and experience (and rarely with external air traffic control) to achieve local air superiority. Since air traffic models are necessarily local to each pilot, this global objective is often difficult to observe, but is achievable with sufficient training and equipment.

In reality, multiagent systems often exhibit the full spectrum of coordination methods. Agents within multiagent systems have been known to communicate to negotiate explicit local plans and pursue broadcast objectives, while exploiting reactive behaviours (e.g. [Parker, 1992a]) - all of which support the global objective.

In Mataric's PhD thesis [Mataric, 1994], similar though not identical definitions are used to describe cooperation (footnote 5) that often requires directed communication to assign particular tasks to the participants. Explicit cooperation is defined as:

"...a set of interactions which involve exchanging information or performing actions in order to benefit another agent"

Implicit cooperation is defined as:

"...consists of actions that are part of the agent's own goal achieving repertoire, but have effects in the world that help other agents achieve their goals."

Implicit cooperation has more in common with emergent coordination in that global behaviour arises from local goal achieving behaviour.

4.4.4 Global Goals

Thus far the properties of global behaviour and multiagent coordination have been discussed, often with passing reference to a global goal. Global behaviour is, in general, the result of team behaviour. Global goals follow this definition by specifying a desired global behaviour.
A necessary condition in the composition of a global goal is (as in the single agent case) reachability:

Definition: D4.24 (Globally Reachable Goal) Given a global behaviour space, $B$, and a global goal space, $\mathcal{G}$, a goal $r(t) \in \mathcal{G}$ is globally reachable if $r(t) \in \mathcal{G} \cap B$.

Given that agents can only pursue locally reachable goals (D4.4), for a goal to be globally reachable it must be locally reachable for at least one agent throughout the interval of the global goal. Therefore, one possible definition of global goal is:

Definition: D4.25 (Global Goal (basic)) If a trajectory $r(t) \in \mathcal{G}$ is locally reachable by two or more agents then the trajectory is a global goal.

Footnote 5: [Mataric, 1994]: p. 23, paragraphs 4 and 5.

Now recall that agent behaviour spaces $B_j : 1 \le j \le N$ are mapped into $B$, the global behaviour space, through the aggregate relation, $f(\cdot)$. Acting in isolation, the behaviour of the $j$th agent, when projected through the aggregate relation, prescribes an agent behaviour subspace, $B^j$, within $B$. In other words, at any instant there exists a finite region of $B$ reachable by the $j$th agent alone. The union of the $N$ agent subspaces $B^j$ is the global behaviour space or $B = \bigcup_{j=1}^{N} B^j$. Defining $y_{nA} = [y_1 \cdots y_N]^T$, the rate of change of global behaviour with respect to local behaviours may be expressed as:

\[ \dot{y} = \frac{\partial f}{\partial y_{nA}}\, \dot{y}_{nA} \tag{4.106} \]
\[ \dot{y} = J(y_{nA})\, \dot{y}_{nA} \tag{4.107} \]

where $J(y_{nA})$ is the Jacobian of the aggregate relation. Given bounds on the local behaviour 'velocity', this relation gives the instantaneous extent of an agent's global 'velocity' subspace.

To achieve the desired global behaviour, the global goal must be decomposed into local goals, $r_{cnA}(t) = [r_{c1}(t) \ldots r_{cN}(t)]^T$, through an inverse of the aggregate relation from the global space into the agents' local goal spaces:

\[ r_{cnA}(t) = g(r(t)) \tag{4.108} \]

and $g(\cdot) : \mathcal{G} \to \mathcal{G}_1 \times \ldots \times \mathcal{G}_N$. Ideally, $g(\cdot)$ would be a simple inverse of the aggregate relation or:

\[ g(r(t)) = f^{-1}(r(t)) \tag{4.109} \]

For example, in resolved motion position control, a geometric inverse of the forward solution (an aggregate relation) is an inverse aggregate relation. However, aggregate relations may not be easily invertible. There may be too few or too many agents available to exactly achieve the desired global goal. Recalling Samson's task function approach, it is clear that, just as in manipulation end effector tasks, global goals must be instantaneously feasible. So a more stringent definition of global goal can be proposed:

Definition: D4.26 (Global Goal) Given a trajectory $r(t) \in \mathcal{G}$, local behaviours $y_j : 1 \le j \le N$, and an aggregate relation $y = f(y_1 \cdots y_N)$, if $r(t)$ is feasible through $f(\cdot)$ then the trajectory is a global goal.

If the aggregate relation is invertible, explicit coordination can use a one-to-one mapping to decompose the global goal trajectory into a unique set of trajectories for each agent. In implicitly coordinated systems, the inverse aggregate is distributed amongst the agents, each agent determining autonomously the appropriate contribution to the global goal. Finally, in emergent control, an inverse aggregate is unnecessary, since the system falls naturally into the desired global behaviour.

Regardless of the coordination method, one might argue that the perfect multiagent system contains exactly the right number and configuration of agents to produce the desired global behaviour (i.e. the aggregate is invertible). However, it is often beneficial for a multiagent system to contain agents in excess of the global goal's requirements. In this case the aggregate relation is not uniquely invertible, requiring the extension of an explicit coordination system with additional rules to distribute subgoals between 'redundant' agents.
So while redundant agent teams complicate explicit coordination, they provide flexibility and robustness to multiagent systems and encourage the adoption of implicit or emergent coordination methods. Indeed, robustness through redundancy is a key attribute of many published multiagent systems (e.g. [Mataric, 1994, Parker, 1992a]).

4.4.5 Multiagent Control

It has been demonstrated by a number of investigators [Mataric, 1994, Reynolds, 1987, Parker, 1992a] that through the arbitration of multiple local control strategies, each agent within a population of such agents can contribute to useful global behaviour with little or no modelling of the entire system. The foregoing definitions provide the necessary components to construct a second hypothesis that characterizes the intent of multiagent control systems:

Hypothesis: H2 (Multiagent Control) Given a global goal, $r(t)$, a set of Agents, $A_j : 1 \le j \le N$, can be designed such that the global behaviour of the system, $y(t)$, is stable about $r(t)$.

Encompassing both implicit and explicit varieties of coordinated agent behaviour, figure 4.20 and hypothesis H2 imply the existence of some form of goal statement $r(t)$. However, D4.23 implies that explicit global goals and aggregate relations need not be specified in some systems, though desired global behaviours and aggregate relations may be observed nevertheless (footnote 6). This is an example of emergent multiagent control, discussed in the following section.

4.4.6 Emergent Multiagent Control

Though often credited with "something for nothing" [Mataric, 1994] qualities, emergent coordinated behaviour is attractive because it does not require a central controller to produce a global behaviour. Unfortunately, this reputation masks the considerable difficulty in designing systems that 'naturally' converge towards a desirable behaviour. Historically, the attributes of a given emergent behaviour have been subjective and rarely explicitly defined or measured.
Hence the following broad hypothesis on emergent control:

Hypothesis: H3 (Emergent Multiagent Control) Given a global goal, $r(t)$, a set of Agents can be designed to exhibit emergent coordination such that the global behaviour of the system, $y(t)$, is stable about a desired trajectory $r(t)$.

Footnote 6: A good example is the Flock Centering behaviour in Reynolds' Boids [Reynolds, 1987]. At no time was an explicit flock description provided, though the local goal 'attempt to stay close to nearby flockmates' produced the desired grouping behaviour.

So important are the implications for redundant, fault tolerant control systems that emergent behaviour, originally an interesting side effect of multiagent interaction (e.g. [Walter, 1950b]), is now pursued as a performance objective in its own right ([Steels, 1991]). Unfortunately, the mechanisms of emergent behaviour remain largely unexplored, few investigators having modelled the action and interaction of agent teams as dynamic systems. Though condensing individual agent dynamics into a single team expression permits the exploration of multiagent dynamics, the number and variety of agent implementations makes a generally descriptive model difficult. Nevertheless, one approach to this condensation is to adopt the linear agent model described earlier and assemble a linear multiagent model.

4.5 Linear Arbitration and Multiagent Control

Suppose that $N$ agents, $A_j$, coexist in a multiagent system. From figure 4.20, $u_c(t)$ may be defined as:

\[ u_{cnA}(t) = [u_{c1}(t) \ldots u_{cN}(t)]^T \tag{4.110} \]

Substituting equation (4.92) for each $u_{cj}$:

\begin{align}
u_{cnA}(t) = \begin{bmatrix} u_{c1}(t) \\ \vdots \\ u_{cN}(t) \end{bmatrix}
&= \begin{bmatrix} k_1^T(t)\, u_{A_1}(t) \\ \vdots \\ k_N^T(t)\, u_{A_N}(t) \end{bmatrix} \tag{4.111} \\
&= \begin{bmatrix} k_1^T(t) & & 0 \\ & \ddots & \\ 0 & & k_N^T(t) \end{bmatrix}
\begin{bmatrix} u_{A_1}(t) \\ \vdots \\ u_{A_N}(t) \end{bmatrix} \tag{4.112} \\
&= K_{nA}(t)\, u_{nA}(t) \tag{4.113}
\end{align}

where $u_{cnA}(t)$ is the composite control vector, $K_{nA}$ is a linear arbitration matrix and $u_{nA}(t)$ is the agent control input vector.
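The block structure of equations (4.111) to (4.113) can be sketched as follows. The sketch assumes scalar controller outputs and uses plain Python lists; the names `arbitration_matrix` and `composite_controls` and the numerical values are hypothetical and serve only to illustrate how each agent's gain vector occupies its own block of $K_{nA}$.

```python
from typing import List, Sequence


def arbitration_matrix(gains: Sequence[Sequence[float]]) -> List[List[float]]:
    """Build K_nA: row j holds agent j's gain vector k_j^T in its own
    block of columns and zeros everywhere else."""
    width = sum(len(k) for k in gains)
    rows: List[List[float]] = []
    offset = 0
    for k in gains:
        row = [0.0] * width
        row[offset:offset + len(k)] = list(k)
        rows.append(row)
        offset += len(k)
    return rows


def composite_controls(K: Sequence[Sequence[float]],
                       u_stacked: Sequence[float]) -> List[float]:
    """Composite control vector u_cnA = K_nA u_nA."""
    return [sum(kij * uj for kij, uj in zip(row, u_stacked)) for row in K]


# Two agents with two controllers each: agent 1 sums its controller
# outputs, agent 2 suppresses its second controller.
K = arbitration_matrix([[1.0, 1.0], [1.0, 0.0]])
u_nA = [0.5, -1.0, 2.0, 3.0]  # stacked controller outputs of both agents
print(composite_controls(K, u_nA))
```

Because $K_{nA}$ is block diagonal, each row could equally be evaluated inside its own agent; assembling the full matrix is only needed when the team is analysed as a single system.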
The condensation of a multiagent system into a single expression flattens the control hierarchy and conceals individual agent structures. Theoretically, if an aggregate relation is defined for a multiagent system as in equation (4.103) and the agent plants, $G_j$, (or global plant, $G$) can be established and combined with the above linear arbitration model, the response of the system can be determined. In practice, however, this is rarely undertaken, primarily due to the complexity of the (real world) plant dynamics.

Thus further discussion of multiagent systems is difficult without greater understanding of the nature of the plant $G(\cdot)$, the disturbances $v(t)$ and the aggregate relation $f(\cdot)$. Clearly, the exploration of agent and multiagent systems would be facilitated by the adoption of a standard plant. The next section will do exactly this by adopting a serial manipulator as a reference plant and applying the foregoing definitions and structures to explore the multiagent control hypotheses in greater detail.

4.6 Multiagent Manipulation

A logical starting point in the development of a multiagent manipulator control system is to identify agents that together generate an aggregate behaviour. Fortunately, this is a problem with a simple solution. As reviewed earlier, a serial manipulator is the physical combination of a set of links, each driven by a single degree of freedom actuator. The selection of individual links as agents automatically relates a number of manipulator properties to multiagent system components.

4.6.1 Links as Agents

Recalling D4.2 and reiterating the time invariant linear link state space equation (2.50) for the $j$th link, we have:

\[ G_j : \tag{4.114} \]
\[ \dot{x}_j = A_j x_j + b_j u_j + d_j v_j \tag{4.115} \]
\[ y_j = C_j x_j \tag{4.116} \]

with control input $u_j$ and disturbance $v_j$.
Selecting the output matrix, $C_j$, as the identity matrix and assuming a direct drive robot (the gear transmission ratio in equation (2.50), $\eta_j = 1$), one can define the behaviour of the $j$th link as the output $[q_j \;\; \dot{q}_j]^T$. Given bounds on $y_j$, the maximum and minimum joint displacements and velocities, the behaviour space $B_j$ can also be prescribed. A reachable goal for the $j$th agent is then a trajectory that intersects this behaviour space - a joint position and velocity setpoint within the bounds of $q_j$ and $\dot{q}_j$ respectively.

By selecting a stable controller (e.g. a PD controller with LHP characteristic roots) and providing a reachable goal trajectory, $r_j(t)$, goal seeking behaviour can be observed. By providing multiple stable controllers that respond to local and global goal objectives, this link controller is transformed into a link agent.

4.6.2 Manipulators as Multiagent Systems

Collecting the link agents into a single model, the nonlinear manipulator plant can be recovered. Expressed (for example) in the linear time invariant Brunovsky canonical form:

\[ G : \dot{x} = A x + b u + d v \tag{4.117} \]

where

\[ x = \begin{bmatrix} q \\ \dot{q} \end{bmatrix}, \quad A = \begin{bmatrix} 0 & I \\ 0 & 0 \end{bmatrix}, \quad b = d = \begin{bmatrix} 0 \\ I \end{bmatrix}, \quad u = D(q)^{-1} \tau, \quad v = -D(q)^{-1} \left[ C(q, \dot{q}) + g(q) + J^T f_{ext} \right] \tag{4.118} \]
\[ y = C x \tag{4.119} \]

Again, selecting the output matrix $C$ as the identity matrix and the transmission coefficient $\eta = 1$, one can define the behaviour of the manipulator as the output $[x \;\; \dot{x}]^T$. Given bounds on $y$, the maximum and minimum end effector displacements and velocities, the behaviour space $B$ can also be prescribed.

The definition of the aggregate relation is dependent upon the desired global behaviour. Though not directly controlled by most manipulator controllers, the end effector ultimately performs tasks expressed in cartesian coordinates. Adopting generalized coordinates for each link agent, the aggregate relation maps the agent states to the end effector coordinate, the forward kinematic solution. Reiterating equation (2.4):

\[ y = f(q) = \prod_{i=1}^{N} A_i^{i-1}(q_i) \tag{4.120} \]

If, for example, the global behaviour was the manipulator shape, the aggregate relation could be stated as:

\[ y = f(q) = I_N\, q \tag{4.121} \]

and the global behaviour: $y = q$.

4.6.3 Global Goal Distribution

Once a feasible global goal trajectory has been established, the goal expression must somehow be transformed from $\mathcal{G}$ to the $\mathcal{G}_j : 1 \le j \le N$ generalized spaces of the contributing agents through some inverse of the aggregate mapping as in equation (4.108). End effector trajectory inverse solution methods, discussed earlier, are commonly implemented as explicit coordination strategies. In these methods joint positions, velocities, accelerations or forces are uniquely correlated to each joint of the manipulator. These methods may be characterized as monolithic centralized inverse solutions often requiring substantial dynamic modelling and are, therefore, a poor foundation for multiagent control. However, a closer examination of these methods reveals subtle differences that ultimately provide insight into an effective decentralized multiagent manipulator control strategy.

Inverse Aggregate Maps

Recall that RMPC, RMRC and RMAC establish a one-to-one mapping of end effector position to joint displacement, velocity, and acceleration respectively. RMPC performs this through a geometric inverse kinematic solution, transforming end effector position from task space to configuration space. Usually nonlinear functions, independent of the manipulator's configuration space history, RMPC inverse solutions tend to be monolithic solutions in which model errors are not easily corrected on line. Therefore, it is imperative that this solution exactly model both manipulator and environment.

Like RMPC, RMRC and RMAC are centralized inverse solutions that correlate unique setpoints with each joint.
These integrable methods rely on the Jacobian inverse or pseudoinverse to solve for joint velocities or accelerations respectively. Unlike RMPC, however, the Jacobian inversion process uses the manipulator's configuration space history to correct for modelling errors. As reviewed in chapter 3, RMAC and RMRC use this model to implement an outer, task space control loop, moving centralized end effector tracking from joint space to task space. In redundant systems, this outer loop frees inner loop joint controllers to pursue local goals in the end effector Jacobian null space as described in detail in the next chapter.

Finally, JTC methods apply a similar outer loop control strategy. Provably identical to RMAC given an exact feedforward dynamic model in task space, the Operational Space Formulation (OSF) appears to be little more than a transformation of integrable methods into task space.

This summary suggests that all of these are explicit coordination methods, in which local goals and link controllers are uniquely correlated. Indeed, RMPC is restrictive even for explicit multiagent coordination, since all decision making is bound to a centralized geometric arbitration strategy. Conversely, RMRC and RMAC seem reasonable candidates for explicitly coordinated multiagent manipulator control of redundant systems. Through the use of outer task space control loops, both permit local activity with little global interference.

And yet, the requirement of centralized models for both RMRC and RMAC seems to defeat the purpose of a multiagent system. If the state of each agent must be collected, and a centralized aggregate relation assembled and inverted, prior to forming a set of local setpoints, what advantages are there to decomposing a monolithic RMAC controller into a multiagent system? Computationally and architecturally, there is no advantage. Indeed, the overhead of interprocess communications would add to the computational burden.
To reduce or remove the burden of information exchange would require a substantial reduction in reliance upon a centralized model. However, if a model-free, agent-decentralized global objective could be formulated based on a task space control strategy, the computational disadvantages would vanish, replaced by significant gains in robustness, extensibility, and, possibly, real time response.

Does a model-free decentralized global goal system exist for manipulation robotics? Given the complex nonlinearity of manipulators one would not think so. However, with some minor compromises a decentralized global goal system is possible. Consider the problem from the standpoint of the lone link agent. What action must this agent take to perform an end effector task if acting alone? Referring to figure 4.22 the required action becomes apparent. Suppose the desired end effector motion can be expressed as a differential distance and rotation:

$$\delta\mathbf{r}(t) = [\delta\mathbf{r}_{trans}(t)^T \;\; \delta\mathbf{r}_{rot}(t)^T]^T, \qquad \delta\mathbf{r}_{trans}(t) = [dx \; dy \; dz]^T, \qquad \delta\mathbf{r}_{rot}(t) = [d\theta_x \; d\theta_y \; d\theta_z]^T$$

and that a lone actuator is to provide this motion. A prismatic actuator, oriented arbitrarily along vector $\mathbf{k}_{i-1}$, must translate:

$$\delta q_i = \mathbf{k}_{i-1} \cdot \delta\mathbf{r}_{trans}(t) + \mathbf{0} \cdot \delta\mathbf{r}_{rot}(t) \qquad (4.122)$$

but cannot rotate (hence the second term on the right hand side). For a revolute actuator at $\mathbf{p}_{i-1}$,

$$\delta q_i = \mathbf{k}_{i-1} \times (\mathbf{p}_n - \mathbf{p}_{i-1}) \cdot \delta\mathbf{r}_{trans}(t) + \mathbf{k}_{i-1} \cdot \delta\mathbf{r}_{rot}(t) \qquad (4.123)$$

represents the differential rotation of the ith agent's actuator. Using this line of reasoning the following global differential motion projection operator can be developed:

$$\delta q_i = g_i(\mathbf{p}_n, \mathbf{p}_{i-1}, \mathbf{k}_{i-1}) \cdot \delta\mathbf{r}(t) \qquad (4.124)$$

$$g_i(\mathbf{p}_n, \mathbf{p}_{i-1}, \mathbf{k}_{i-1}) = \begin{cases} [\,\mathbf{k}_{i-1} \times (\mathbf{p}_n - \mathbf{p}_{i-1}) \;\;\; \mathbf{k}_{i-1}\,] & \text{if revolute} \\ [\,\mathbf{k}_{i-1} \;\;\; \mathbf{0}\,] & \text{if prismatic} \end{cases} \qquad (4.125)$$

Comparison of this equation with equation (2.11) reveals that $g_i(\cdot)$ is, in fact, the column vector of a revolute or prismatic Jacobian respectively. Thus $g_i(\cdot)$ is the ith row of the Jacobian Transpose, $\mathbf{J}_i^T$.
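The projection in equations (4.122) to (4.125) can be illustrated with a short C++ sketch. It is only a sketch under stated assumptions: the names `projectionOperator`, `projectMotion`, `JointType`, `Vec3` and `Vec6` are hypothetical and do not come from the thesis software, and all vectors are assumed to be expressed in a common world frame.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Vec6 = std::array<double, 6>;  // [translational part; rotational part]

enum class JointType { Revolute, Prismatic };  // hypothetical label for the link type

// Standard right-handed vector cross product a x b.
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]};
}

// g_i of equation (4.125): [k x (p_n - p_{i-1}), k] for a revolute joint,
// [k, 0] for a prismatic joint. All arguments are in world coordinates.
Vec6 projectionOperator(JointType type, const Vec3& pn,
                        const Vec3& pPrev, const Vec3& kPrev) {
    if (type == JointType::Revolute) {
        const Vec3 arm = {pn[0] - pPrev[0], pn[1] - pPrev[1], pn[2] - pPrev[2]};
        const Vec3 lin = cross(kPrev, arm);
        return {lin[0], lin[1], lin[2], kPrev[0], kPrev[1], kPrev[2]};
    }
    return {kPrev[0], kPrev[1], kPrev[2], 0.0, 0.0, 0.0};
}

// Equation (4.124): the differential joint motion is the inner product of
// g_i with the desired differential end effector motion.
double projectMotion(const Vec6& g, const Vec6& dr) {
    double dq = 0.0;
    for (std::size_t i = 0; i < 6; ++i) dq += g[i] * dr[i];
    return dq;
}
```

For a revolute joint at the origin turning about the world z axis with the end effector at (1, 0, 0), a differential end effector translation of 0.01 along y maps to a joint rotation of 0.01 rad, which is the corresponding Jacobian column applied to the motion.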
Apparently, the product of the end effector force trajectory and the ith row of the Jacobian transpose prescribes the generalized force required from the ith actuator. Furthermore, the components of the row elements are simply the local and end effector frames in task space.

Jacobian Transpose Control

Jacobian Transpose control also employs equation (4.120) as the aggregate behaviour and adopts a variation on the inverse solution by determining required forces and torques rather than position, velocity, or acceleration:

$$\boldsymbol{\tau} = \mathbf{J}^T(\mathbf{q})\,\mathbf{f}_d \qquad (4.126)$$

where $\mathbf{f}_d$ is a desired force profile. On the surface, this seems little different than RMPC, RMRC, or RMAC, since the joint torque vector is computed through a central process that models the complete manipulator kinematics in the Jacobian, $\mathbf{J}$. As we have seen, however, the transpose has a unique rowwise structure equivalent to a differential projection of the end effector's motion onto each actuator or:

$$\tau_i = \mathbf{J}_i^T(\mathbf{p}_{i-1}, \mathbf{p}_n)\,\mathbf{f}_d \qquad (4.127)$$

Encapsulated within a joint controller, $\mathbf{J}_i^T(\cdot)$ can be applied to a common global goal $\mathbf{f}_d$, acting as a proxy on behalf of the global goal.

4.6.4 The Global Goal Proxy

Jacobian Transpose Control offers an avenue for implicitly coordinated multiagent manipulator control. By assuming that $\mathbf{p}_n$, $\mathbf{p}_{i-1}$ and $\mathbf{f}_d$ are communicated to or sensed by the agent, the Jacobian Transpose may be decomposed row-wise into n separate, parallel computations. Unlike decentralized controllers in which setpoints and controllers are uniquely correlated, distributed Jacobian Transpose Control enables all the controllers to apply the same global goal, $\mathbf{f}_d$, to $\mathbf{p}_n$. In effect, $\mathbf{J}_i^T(\cdot)$ is a projection operator that identifies that portion of the global goal that the ith joint can perform. Since it is decentralized⁷, the row can be incorporated into each link agent as a proxy controller, a local controller acting on behalf of a global goal:

$$\mathbf{J}_i^T(\mathbf{p}_{i-1}, \mathbf{p}_{app}) = \begin{cases} [\,\mathbf{k}_{i-1} \times (\mathbf{p}_{app} - \mathbf{p}_{i-1}) \;\;\; \mathbf{k}_{i-1}\,] & \text{if revolute} \\ [\,\mathbf{k}_{i-1} \;\;\; \mathbf{0}\,] & \text{if prismatic} \end{cases} \qquad (4.128)$$

where $\mathbf{p}_{app}$ is the position of application - in the case of trajectory tracking, $\mathbf{p}_n$. Rather than decentralizing the controller into functions of local joint state, this technique decentralizes the controller into functions of local cartesian state, $\mathbf{p}_i$ and $\mathbf{k}_i$.

Figure 4.22: The rows of the Jacobian transpose are a projection operator from the global goal (a force vector) to the local goal space (an actuator moment or thrust vector). Panel a) Prismatic Link.

To acquire this state information, each link process must either sense, or receive through communication, the necessary information. If the jth agent link stores a communications directory (arguably a rudimentary model) of agents j − 1 and j + 1, directed transmission can be used to pass kinematic packets containing cartesian state data between neighbours⁸. These messages are equivalent to shared sensory data upon which the agent may act at its own discretion. To support the proxy controller, message packets (composed from the language of real numbers) describing the distal coordinate frame of link agent $A_{j-1}$:

$$\text{Kinematic Packet, } H_{j-1} : \{\mathbf{p}_j, \mathbf{R}_j, \ldots\} \qquad (4.129)$$

must be assembled and directly transmitted by $A_{j-1}$ to agent $A_j$. Interpreted by agent $A_j$ as the jth proximal coordinate frame, $A_j$ must transform the data to describe the distal frame of j and transmit a new packet to $A_{j+1}$. Through this recursive communication, the communication bus distributes the forward solution (the aggregate relation) over the manipulator.

With the proxy controller and the distributed forward solution, an implicitly coordinated multiagent system is possible. The global goal process derives and broadcasts a message composed of a point of application and a desired applied force, together forming a global goal couplet

$$\mathcal{G} = \{\mathbf{p}_{app}, \mathbf{f}_d\} \qquad (4.130)$$

For end effector tracking, $\mathbf{p}_{app} \in \mathbb{R}^6$ is the end effector position vector and $\mathbf{f}_d \in \mathbb{R}^6$ the force goal, both in global coordinates. Once assembled, $\mathcal{G}$ may be broadcast to the agent team. The global goal generator performs no arbitration and, as detailed later in this study, need not maintain a model of the manipulation system.

⁷ Strictly speaking decentralized control refers to the independent control of generalized coordinates. Clearly task space link states, functions of geometric constraints, are not generalized coordinates. However, from the data flow (or parallel computing) standpoint, global proxies are solely dependent on the task space objective and link frame positions and, therefore, are decentralized inverse aggregate relations. In contrast, traditional decentralized manipulator controllers (e.g. [Stokic, 1984]) require explicit centralized inverse kinematic solutions.

⁸ Alternatively a unilateral communications bus structure could be implemented to transport packets from manipulator base to end effector, 'hard wiring' a communications model into the agent.

Trajectory tracking under Jacobian transpose control is not without difficulties, however. It is trivial to construct conditions under which the transpose becomes 'singular', the row product collapsing to zero:

$$\tau_i = \begin{cases} [\,\mathbf{k}_{i-1} \times (\mathbf{p}_{app} - \mathbf{p}_{i-1}) \;\;\; \mathbf{k}_{i-1}\,]\,\mathbf{f}_d & \text{if revolute} \\ [\,\mathbf{k}_{i-1} \;\;\; \mathbf{0}\,]\,\mathbf{f}_d & \text{if prismatic} \end{cases} \qquad (4.131)$$

Though rare, such conditions arise in revolute links through the colinearity of

1. $\mathbf{f}_d$ and $\mathbf{k}_{i-1}$.
2. $\mathbf{f}_d$ and $(\mathbf{p}_{app} - \mathbf{p}_{i-1})$.
3. $(\mathbf{p}_{app} - \mathbf{p}_{i-1})$ and $\mathbf{k}_{i-1}$.

and in prismatic links if $\mathbf{f}_d$ and $\mathbf{k}_{i-1}$ are perpendicular. Rarely completely freezing link motion, these conditions can produce a stalled or wallowing response as $\mathbf{p}_{app}$ leaves the collapsed region.
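The packet relay and proxy torque computation described above might be sketched in C++ as follows. This is an illustrative sketch, not the thesis implementation: `KinematicPacket`, `GoalCouplet` and `RevoluteLinkAgent` are hypothetical names, only revolute joints are covered, and each joint is assumed to turn about the z axis of its proximal frame with a fixed link offset expressed in the distal frame.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;    // row-major rotation matrix
using Vec6 = std::array<double, 6>;  // [force; moment] in world coordinates

// Hypothetical kinematic packet: the distal frame of the sending link.
struct KinematicPacket {
    Vec3 p;  // frame origin, world coordinates
    Mat3 R;  // frame orientation, world coordinates
};

// Hypothetical global goal couplet broadcast to every agent.
struct GoalCouplet {
    Vec3 pApp;  // point of application
    Vec6 fd;    // desired applied force and moment
};

Mat3 identity() {
    Mat3 I{};
    I[0][0] = I[1][1] = I[2][2] = 1.0;
    return I;
}

Vec3 rotate(const Mat3& R, const Vec3& v) {
    Vec3 out{};
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j) out[i] += R[i][j] * v[j];
    return out;
}

Mat3 compose(const Mat3& A, const Mat3& B) {
    Mat3 C{};
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            for (std::size_t k = 0; k < 3; ++k) C[i][j] += A[i][k] * B[k][j];
    return C;
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]};
}

// Assumed link agent: a revolute joint about the proximal z axis followed
// by a fixed offset to the distal origin, expressed in the distal frame.
struct RevoluteLinkAgent {
    Vec3 offset;               // link vector in the distal frame
    KinematicPacket proximal;  // last packet received from the inboard neighbour

    // Store the neighbour's packet and build the packet to forward outboard.
    KinematicPacket relay(const KinematicPacket& in, double q) {
        proximal = in;
        Mat3 Rz = identity();
        Rz[0][0] = std::cos(q);  Rz[0][1] = -std::sin(q);
        Rz[1][0] = std::sin(q);  Rz[1][1] = std::cos(q);
        KinematicPacket out{};
        out.R = compose(in.R, Rz);
        const Vec3 d = rotate(out.R, offset);
        out.p = {in.p[0] + d[0], in.p[1] + d[1], in.p[2] + d[2]};
        return out;
    }

    // One row of the Jacobian transpose applied to the broadcast goal.
    double proxyTorque(const GoalCouplet& g) const {
        const Vec3 k = {proximal.R[0][2], proximal.R[1][2], proximal.R[2][2]};
        const Vec3 arm = {g.pApp[0] - proximal.p[0], g.pApp[1] - proximal.p[1],
                          g.pApp[2] - proximal.p[2]};
        const Vec3 m = cross(k, arm);
        return m[0] * g.fd[0] + m[1] * g.fd[1] + m[2] * g.fd[2] +
               k[0] * g.fd[3] + k[1] * g.fd[4] + k[2] * g.fd[5];
    }
};
```

For a fully extended two link arm with unit link lengths, a force goal perpendicular to the arm yields proxy torques of 2 and 1 at the first and second joints, while a force goal directed along the arm (condition 2 above) yields zero torque at both joints, which is the collapsed case the text warns about.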
With these caveats in mind, Jacobian transpose techniques permit the specification of a variety of global goals, automatically fulfilling the implicit coordination definition.

4.7 A Link Agent Specification

Having identified a distributed trajectory tracking global goal, the specification of an implicitly coordinated link agent can now be formed based on the previously defined structure of generic agent and multiagent systems. The jth agent acts on a subplant, $G_j$, first defined in chapter 2:

$$\dot{\mathbf{x}}_j(t) = \mathbf{A}_j \mathbf{x}_j(t) + \mathbf{b}\,u_j(t) + \mathbf{d}\,v(t) \qquad (4.132)$$

where:

[equation (4.133): definition of $\mathbf{A}_j$; entries illegible in the source]

Note that since $r_j$, the motor transmission coefficient, is set to 1.0,

$$q_j = q_{mj} \qquad (4.134)$$

The behaviour of this subplant is simply the current link's joint:

[equation (4.135); illegible in the source]

For end effector trajectory tracking, the behaviour of the multiagent system is simply the end effector position $\mathbf{p}_n(t)$, velocity $\dot{\mathbf{p}}_n(t)$, and force, $\mathbf{f}_n(t)$:

$$\mathbf{y}(t) = \begin{bmatrix} \mathbf{p}_n(t) \\ \dot{\mathbf{p}}_n(t) \\ \mathbf{f}_n(t) \end{bmatrix} \qquad (4.136)$$

The instantaneous local behaviour space, $B_j$, is bounded by the joint position and velocity limits:

$$B_j = \begin{cases} q_{j,low} < q_j < q_{j,high} & \text{Joint position} \\ \dot{q}_{j,min} < \dot{q}_j < \dot{q}_{j,max} & \text{Joint velocity} \end{cases} \qquad (4.137)$$

Despite the vector notation, the control effort in this case applied to the jth link is a scalar generalized force or torque, $u_{cj}$. The combination and/or selection of behaviour controllers will be performed through the linear arbitration model defined earlier:

$$u_{cj}(t) = A(\mathbf{k}(t), u_{ij}(t) : 1 \le i \le b) = \mathbf{k}^T(t)\,\mathbf{u}_j(t) \qquad (4.138)$$

Though the details of global and local goal design are yet to be discussed, the form of the global goal is known. The global goal proxy discussed earlier provides the foundation for decentralized global goal seeking. A trajectory tracking global goal is a force trajectory applied to the end effector. The global goal $\mathbf{r}(t)$:

$$\mathbf{r}(t) = \begin{bmatrix} \mathbf{f}_d(\mathbf{p}_n(t), \dot{\mathbf{p}}_n(t)) \\ \mathbf{p}_n(t) \end{bmatrix} \qquad (4.139)$$

where $\mathbf{p}_n$ is the location of the end effector in world coordinates.
Therefore, the multiagent system's behaviour space, $B$, is ultimately limited to the bounds of end effector position and velocity performance as well as the maximum applied force. The latter is a product of the saturation properties of each link motor, $\mathbf{u}_{sat}$, and the manipulator dynamics:

$$B = \begin{cases} \mathbf{p}_n \subset \text{Work Volume} & \text{Cartesian position} \\ \dot{\mathbf{p}}_n \in R(\mathbf{J}) & \text{Cartesian velocity} \\ |\mathbf{f}_n| < |\mathbf{J}^{-T}(\mathbf{D}\ddot{\mathbf{q}} + \mathbf{C}\dot{\mathbf{q}} + \mathbf{g} - \mathbf{u}_{sat})| \end{cases} \qquad (4.140)$$

The bounds on $\mathbf{f}_d$ simply indicate that the bounds on the available applied force are dependent on the manipulator's instantaneous joint state and joint actuator saturation limits. The global proxy becomes:

$$u_j = H_j(\mathbf{r}_c(t), \mathbf{p}_{j-1}) \qquad (4.141)$$

$$\;\;\;\; = \mathbf{J}_j^T(\mathbf{p}_{j-1}(t), \mathbf{p}_n(t))\,\mathbf{f}_d(t) \qquad (4.142)$$

where $\mathbf{J}_j^T(\mathbf{p}_{j-1}(t), \mathbf{p}_n(t))$ is defined in equation (4.128). This prototype link agent structure is depicted in figure 4.23.

Figure 4.23: With a global goal and the global goal proxy, $\mathbf{J}_j^T(\mathbf{p}_{j-1}, \mathbf{p}_n)$, manipulator end effector control can be distributed amongst N agents, each with local and global goals. (Diagram blocks: End Effector Global Goal Generator, Global Goal Couplet Formation, Other Agents, Local Behaviour Controllers.)

4.8 Summary

In this chapter, a number of common terms in both agent and multiagent control have been discussed and formal definitions proposed. In particular, behaviours have been defined as the response of some plant; a behaviour controller seeks to produce a particular behaviour in response to a specific environmental condition; and an agent combines behaviour controllers through some arbitration mechanism. Definitions of public and private messaging clarified three forms of agent control: explicit, implicit, and emergent.
These definitions have enabled the clear statement of three hypotheses that drive interest in agent and multiagent control systems. An Agent Control Hypothesis posits that an agent can be devised to achieve a desired composite behaviour, the product of selection and/or combination of control action to achieve a desired net behaviour.

The global behaviour of a multiagent team was defined as the product of some aggregate relation, a function of some set of constraints imposed on the system. Just as agent control was classified according to explicit, implicit and emergent arbitration strategies, multiagent control was categorized according to explicit, implicit and emergent coordination. From these definitions, multiagent hypotheses were formulated. A Multiagent Control Hypothesis proposed that a set of agent controllers may be designed such that the global behaviour of the system achieves a desired global behaviour. By using a global goal generation process, explicit and implicit coordination methods were identified as mechanisms for the production of desired global behaviour of the team. Emergent multiagent coordination was discussed as the product of a multiagent system without such central processes, with the Emergent Multiagent Control Hypothesis proposing that such control was possible.

With these tools in hand, the manipulator control problem was reexamined to discover a mechanism that decomposes a monolithic manipulation process into a set of link agents. After discussing the significance of aggregate and inverse aggregate relations in manipulation, a global goal distribution operator was derived and shown to be exactly equivalent to the Jacobian Transpose. Given these results, it seems that multiagent manipulator control is indeed possible, though many questions remain unanswered.
While multiple control strategies have been combined within Khatib's OSF, Seraji's Configuration Control, and numerous RMAC based redundant resolution controllers, these combinations have been strictly controlled through a centralized arbitration mechanism such as task prioritization and/or null space selection. Implicitly coordinated agents do not have the luxury of a centralized null space selection technique. Naturally, this raises questions on the stability and practicality of multiagent manipulator control.

Decentralized manipulator control is not new. However, decentralization of manipulator control through the Jacobian Transpose has, to this author's knowledge, neither been identified nor demonstrated previously. The next step is to demonstrate that multiagent trajectory tracking is possible and to explore the effects of additional behaviours within each agent on global behaviour and stability.

Manipulator simulation, like manipulation in general, is typically performed through monolithic simulators. The next chapter will discuss the structure of a multiprocess manipulator simulation system assembled specifically for this research.

Chapter 5

The Multiprocess Manipulator Simulator

5.1 Introduction

In the previous chapter the structure of agent and multiagent systems was examined from a control theoretic standpoint. It was shown that multiagent systems are composed of multiple processes each capable of achieving independent local goals but cooperating towards a collective, global objective. Exploring these issues further, a manipulator was chosen as a benchmark dynamic system upon which these concepts could be applied. With the identification of an implicit coordination architecture, the dynamics of a multiagent system can now be explored through simulation.

Traditional control simulations are usually monolithic, centralized processes.
Though such simulations are often implemented in a high level language such as Mathematica™ or Matlab™, the real world controller is implemented in a lower level language such as C or assembler. Of course high level languages speed the investigation of controller-plant dynamics by ignoring details such as hardware architecture, data flow and communications limitations. However, these details are crucial to the successful control of a real world plant. For this reason, the software system was designed with a view to the simulation of a real world multicontroller environment in the belief that some of these architectural concerns might be clarified.

The objective of this chapter is to explain the structure of the Multiprocess Manipulator Simulator (MMS) and its relevance to multiagent control. Though any simulator remains only an approximation of real world conditions, this multiprocess simulator represented a significantly more realistic and demanding software development environment than monolithic simulators, bearing a striking resemblance to distributed, embedded controller development platforms [RTI, 1996].

5.1.1 Requirements Overview

Functional Requirements

The functional requirements of any manipulator control simulation fall into two areas: environmental simulation and controller logic. The environmental simulation must manage real world events such as temporal flow, manipulator dynamics, and other environmental features such as obstacle kinematics. The control portion of the simulator represents some form of controller that, ultimately, applies a force through each actuator in the simulator. Fundamentally, a manipulator simulator must faithfully generate the dynamic response of a robot given the application of an arbitrary number of linear or revolute actuator forces.
In short, the simulator must be capable of modelling any serial configuration of links that drive an arbitrary constant payload along any feasible end effector trajectory. Of course manipulation is not limited to end effector trajectory tracking. A versatile simulator must allow for controllers based on environmental sensing such as force, range, and machine vision sensors, all operating within disparate time scales. Furthermore, the system must accommodate the possibility of world modelling and path planning extensions.

In reality a system of this complexity would likely be distributed over multiple computers, becoming a multiprocess controller. To simulate parallel multicontrollers, each controller and dynamic model must become a distinct process, ideally each executing on separate CPUs. This multiprocess architecture places additional requirements on the simulation system in the design of both process and communication structures.

In reality, control processes communicate with the environment through sensors and actuators. In monolithic simulators this 'data flow' is scarcely visible, embedded within a common set of symbols shared between simulator and controller. In multiprocess controllers, however, interprocess data flows must be explicitly implemented through interprocess communication (IPC) or shared memory structures. An inescapable reality of all control processes, and one that presents a significant barrier to SMPA architectures, is that sensing and actuation usually occur at different rates, often by orders of magnitude. Similarly, data transfer between domains (e.g. from controller to motor or from controller to controller) may not occur synchronously or at identical rates, and may not always be reliable. Simulated data flow implementations should reflect these realities.

5.1.2 Design Specification Overview

In examining a generic parallel multiprocess control problem, a number of key processes can be identified as fundamental to a simulated manipulator control system:

• a process manager and synthetic clock
• a generic manipulator and obstacle modelling process
• a generic link process
• a global goal process

Clearly, the launch and synchronization of the simulator requires a central simulation management module or process manager within which a clock can be simulated. For speed, the manipulator model that mimics the response of a real manipulator and its environment must be contained within a single process. Each link agent is, ideally, an independent process. Global goals are, themselves, independent processes performing tasks such as trajectory generation, visual servoing, and obstacle avoidance.

Despite the multiprocess architecture of the system, each process type shares a number of common elements, such as interprocess communications, vector and matrix algebra, and cartesian state representation to name a few. Coherent design and implementation of a class library is crucial to the simplification of the simulator's design and implementation. With basic software design specifications in place, a number of nonfunctional requirements impose constraints on the implementation of the multiprocess manipulator simulator.

Nonfunctional Requirements

To take advantage of available computing platforms (Sun SPARCstations, NextStations, and SGI Indigo workstations), the simulator should be designed for extensibility, robustness and portability. Extensibility is greatly enhanced through the adoption of object oriented computing languages such as C++ or Objective-C and the adherence to code libraries and class hierarchies. Robustness should be guaranteed through modular testing and industrial debugging tools.
Portability can be guaranteed through the use of ANSI standard languages, particularly ANSI 2.0 C++. The selection of an object oriented language such as C++ is based on portability and the ability to build on tested, proven code with little difficulty.

To ease portability between imaging platforms (e.g. Display PostScript or X windows) and between simulation and hardware implementation, animation, rendering, and analysis should be separated from simulation. By using an ASCII data log format any third party data analysis tool may be used for performance assessment.

Another portability issue is the form of IPC or interprocess communication. On the grounds of performance and simplicity, shared memory appears to be a logical choice for interprocess data transfer. Unfortunately, standard shared memory is based on local machine memory protocols for which there is no standardized network support, the most likely hardware option. This forces the selection of widely supported, though slower, UNIX datagram socket techniques.

With these basic requirements for a multiprocess manipulator simulator, this chapter will overview the design of the simulation system and discuss the implications of this architecture on both simulator and controller design.

5.2 Foundations

Written in C++, any manipulation simulator will be composed of numerous classes that, if well designed, fall into a class hierarchy. The Multiprocess Manipulator Simulator is no exception, composed of approximately 20 object classes. This section will briefly overview the technical details around the most significant members of the MMS class hierarchy. In particular, attention will be focussed on the most complex abstractions including: quaternions, cartesian state and trajectory, LinkModel and ManipulatorModel classes. Where possible every class possessed a dedicated testing module and passed through the industrial-strength bug filter, Purify™.
5.2.1 Quaternions

Crucial to the implementation of a cartesian controller is the representation of orientation and, in particular, orientation error. Orientation, unlike translation, presents special problems not the least of which is that elements of the familiar 3 x 3 rotation matrix are not generalized coordinates. A number of alternatives exist including a number of Euler angle representations [Goldstein, 1981]. In representing orientation error, Paul [Paul, 1981] employed a differential orientation system only useful over small displacements (from Appendix B equation (B.399)):

$$\mathbf{x}_e = \begin{bmatrix} dx \\ dy \\ dz \\ \delta x \\ \delta y \\ \delta z \end{bmatrix} = \begin{bmatrix} \mathbf{n}_a \cdot (\mathbf{p}_d - \mathbf{p}_a) \\ \mathbf{o}_a \cdot (\mathbf{p}_d - \mathbf{p}_a) \\ \mathbf{a}_a \cdot (\mathbf{p}_d - \mathbf{p}_a) \\ \frac{1}{2}(\mathbf{a}_a \cdot \mathbf{o}_d - \mathbf{a}_d \cdot \mathbf{o}_a) \\ \frac{1}{2}(\mathbf{n}_a \cdot \mathbf{a}_d - \mathbf{n}_d \cdot \mathbf{a}_a) \\ \frac{1}{2}(\mathbf{o}_a \cdot \mathbf{n}_d - \mathbf{o}_d \cdot \mathbf{n}_a) \end{bmatrix}$$

where $\mathbf{n}$, $\mathbf{o}$, and $\mathbf{a}$ are the column vectors of the end effector orientation matrix and the subscripts d and a refer to desired and actual respectively. Alternatively, Yuan [Yuan, 1988] developed a quaternion expression for orientation error, $Q_e = Q_1^{-1} Q_2$, where the quaternion $Q$ is a couplet $\eta, \mathbf{q}$:

$$\delta\eta = \eta_1 \eta_2 + \mathbf{q}_1^T \mathbf{q}_2 \qquad (5.143)$$

$$\delta\mathbf{q} = \eta_1 \mathbf{q}_2 - \eta_2 \mathbf{q}_1 - \mathbf{q}_1 \times \mathbf{q}_2 \qquad (5.144)$$

and determines that two coordinate systems coincide if and only if $\delta\mathbf{q} = \mathbf{0}$. Thus cartesian error can be represented as:

$$\mathbf{x}_e = [\,dx \;\; dy \;\; dz \;\; \delta q_x \;\; \delta q_y \;\; \delta q_z\,]^T \qquad (5.145)$$

(see Appendix A for more detail on quaternion algebra and Appendix B for details on orientation error). Since this representation is compact and accurate over large orientation errors, MMS regulates cartesian orientation error through quaternion representations. By providing a quaternion class with the necessary addition, subtraction and inversion operators, a uniform interface to both vectors and quaternions was established. This interface greatly simplifies the implementation of the Cartesian State and Trajectory classes to follow.
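The quaternion orientation error of equations (5.143) and (5.144) can be sketched in a few lines of C++. The `Quaternion` struct and the helper names below are hypothetical and are not the MMS quaternion class; unit quaternions are assumed throughout, so that the inverse of the first quaternion is simply its conjugate.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;

// Hypothetical unit quaternion stored as the couplet (eta, q).
struct Quaternion {
    double eta;  // scalar part
    Vec3 q;      // vector part
};

// Build a unit quaternion from a unit rotation axis and an angle in radians.
Quaternion fromAxisAngle(const Vec3& unitAxis, double angle) {
    const double s = std::sin(0.5 * angle);
    return {std::cos(0.5 * angle), {unitAxis[0] * s, unitAxis[1] * s, unitAxis[2] * s}};
}

// Orientation error Q_e = Q_1^{-1} Q_2 for unit quaternions:
//   delta_eta = eta1*eta2 + q1 . q2
//   delta_q   = eta1*q2 - eta2*q1 - q1 x q2
Quaternion orientationError(const Quaternion& q1, const Quaternion& q2) {
    const Vec3& a = q1.q;
    const Vec3& b = q2.q;
    const Vec3 c = {a[1] * b[2] - a[2] * b[1],
                    a[2] * b[0] - a[0] * b[2],
                    a[0] * b[1] - a[1] * b[0]};
    Quaternion e{};
    e.eta = q1.eta * q2.eta + a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    for (std::size_t i = 0; i < 3; ++i)
        e.q[i] = q1.eta * b[i] - q2.eta * a[i] - c[i];
    return e;
}
```

Two identical orientations give a zero vector part, matching the coincidence condition in the text. For two rotations about the same axis, the vector part of the error corresponds to a rotation by the difference of the two angles about that axis.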
5.2.2 Cartesian State

The Cartesian State is universally understood to be a position and velocity vector in $\mathbb{R}^6$, though, as stated above, orientation is usually represented as a 3 x 3 matrix. In building a Cartesian State, quaternions were incorporated into the object definition summarized in table 5.2. With equality and subtraction operators, arithmetic manipulation of cartesian states (e.g. cartesian error vectors, trajectory generation, etc.) is straightforward.

Class        Object   Representation
vector       P        position
vector       V        velocity
vector       A        acceleration
Quaternion   Q        orientation
vector       w        angular velocity
vector       Wd       angular acceleration

Table 5.2: Structure of the CartesianState.

5.2.3 Cartesian Trajectories and Paths

Cartesian Trajectories used four simpler trajectory objects to build both step and cubic linear trajectories in both position and orientation. A CartesianTrajectory could be completely described through a time interval and starting and endpoint cartesian states. Once instantiated, CartesianTrajectory data members could be updated through an update(int TimeStamp) method. The class has been designed to be easily extensible to other trajectory types.

Each element of the cartesian position vector was described by a trajectory object. Since the difference between two quaternion orientations is itself a quaternion, an orientation trajectory can be described as a time varying angle of rotation, $\phi$, about the quaternion error axis $\delta\mathbf{q}$; this angle was also specified as a trajectory object. Hence a trajectory in $\mathbb{R}^6$ requires the maintenance of four time varying trajectory objects $x_{ex}(t), x_{ey}(t), x_{ez}(t), \phi_e(t)$ and the preservation of an orientation error axis, $\delta\mathbf{q}$, and start (or end) point positions.
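One of the four scalar trajectory objects mentioned above might look like the following C++ sketch. The thesis does not give the polynomial used, so this assumes a cubic with zero velocity at both ends; `CubicSegment` and its members are hypothetical names, and time stamps are taken to be integer milliseconds.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar trajectory object: a cubic blend from x0 to x1 over
// [startMs, endMs], assuming zero velocity at both end points.
struct CubicSegment {
    int startMs;  // time stamp at which the segment begins (ms)
    int endMs;    // time stamp at which the segment ends (ms)
    double x0;    // starting value
    double x1;    // end value

    // Value of the trajectory at the given time stamp. Outside the interval
    // the segment holds its start or end value.
    double position(int timeStampMs) const {
        if (timeStampMs <= startMs) return x0;
        if (timeStampMs >= endMs) return x1;
        const double tau = static_cast<double>(timeStampMs - startMs) /
                           static_cast<double>(endMs - startMs);
        const double s = tau * tau * (3.0 - 2.0 * tau);  // 3*tau^2 - 2*tau^3
        return x0 + (x1 - x0) * s;
    }
};
```

Under these assumptions, three such segments would describe the x, y and z position components and a fourth would describe the rotation angle about the fixed orientation error axis, running from zero to the total angle between the start and end orientations.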
Since realistic manipulator trajectories are usually chains of smooth trajectories sometimes joined by discrete changes, a CartesianPath list class was defined that automatically managed the trajectory over changing segments. New trajectory segments could be appended to the trajectory list at any time.

5.2.4 Parsers

The system relies heavily on a flat file database format that describes the physical and architectural characteristics of each link in the manipulator. Though flat files are often adopted for legibility, large streams of numbers remain extraordinarily difficult to read without classification. A flat file format based on a simple BNF syntax implemented through Lex and Yacc¹ enabled the assignment of numbers, strings, vectors, and matrices to named standard variables such as mass and inertia or 'user defined parameters' for unique variables (e.g. in a particular controller implementation). This data file system also allowed the definition of more abstract structures such as globalgoal, knotpoint, controller and link, the details of which are discussed later.

In general, manipulator control experiments test the response of systems given variations in robot type, payloads, trajectories, control algorithms, and, sometimes, obstacle trajectories. Within the Multiprocess Manipulator Simulator, these entities have been formed into an abstraction called a scenario.

Experiments as Scenarios

A scenario is formed by bundling five database files:

• a robot database: manipulator link and controller specifications.
• a payload database: payload mass and inertia.
• an obstacle database: obstacle geometries and knotpoint lists.
• a goal trajectory database: end effector knotpoint list.
• a global goal database: a globalgoal list.

together within a single directory structure.
By maintaining libraries of robot types, payloads, trajectories, end effector goal descriptions, and obstacle trajectories, scenarios can be rapidly assembled for experimental trials. Furthermore, automated scripting enables rapid construction of experimental series in which a single database type may be varied over a number of scenarios (e.g. varying payload mass). So significant is the scenario concept that each process in the MMS is launched with an internal representation of this file bundle, a Scenario, retrieving initialization data as required and writing execution and data logs to the Scenario directory.

5.2.5 LinkModel

The abstraction of a 'link', though straightforward, is fundamental, appearing as a parent class to both the link agent and link rendering concepts. The LinkModel class contains fundamental link descriptive types and scenario based self initialization methods. Table 5.3 details the root structure of the LinkModel family tree.

¹ Lex and YACC (Yet Another Compiler Compiler), both standard Unix utilities.

Class    Object   Representation
char     Name     link label
matrix   I        inertia
vector   C        centroid
vector   X        state vector
char     Type     revolute or prismatic
double   Mass     link mass
double   theta    revolute link rotation
double   offset   offset displacement
double   length   link length
double   alpha    link twist

Table 5.3: Structure of the LinkModel.

5.2.6 Interprocess Communications

Interprocess communications play a central role in the multiprocess manipulator simulator. IPC acts in three capacities: the application of forces computed by link processes to the link joint motors in the dynamic model, the simulation of sensor data flow in link agents, and communication between link and global goal processes. In Berkeley UNIX, the main mechanism for interprocess communication is the socket, a named file-like device with a well deserved reputation as difficult both to implement and to debug.
To simplify and encapsulate the communications protocol a Socket object was developed as the base of a communications hierarchy that is ultimately equivalent to a distributed shared memory system. The first layer of this hierarchy is the abstraction of data sources and sinks, or Measurands and Sensors, into segments of shared data: SensorVectors and MeasurandVectors.

Sensor and Measurand Vectors

MeasurandVectors are the means through which an MMS process sends data streams to other processes (thus acting as a measurand). Each MeasurandVector is composed of a linked list of data types (double, vector, or matrix) and a transmission buffer composed of an array of doubles. Doubles, vectors, or matrices of specified dimensions allocated through a MeasurandVector with a specific update rate are mirrored in the buffer. Allocated data elements may then be transmitted through an update(int TimeStamp) message to the MeasurandVector. With a specific update rate, stale buffer elements can be updated prior to transmission.

Similarly, SensorVectors are the means through which a process receives data streams from other processes (thus acting as a sensor). Just as in the MeasurandVector, doubles, vectors, or matrices of specific dimensions can be allocated with a specific update rate and are mirrored in a receive buffer. Again sensed data can be updated through an update(int TimeStamp) message in which only stale variables are updated.

To establish the equivalent of a distributed shared memory, a transmitting process must establish a MeasurandVector and the receiving process a SensorVector. Once both processes have allocated identical data sets in identical orders, continuous data updates can be facilitated through update messages on each side of the communication connection.

Sensing and Actuation

By default, each link has joint position and velocity sensors.
These are implemented through SensorVector connections to the dynamic model and are updated at a specified rate. Other sensors can also be implemented: end effector force data for example. Again this is provided through a SensorVector connection to the dynamic model. Conversely, joint actuator forces are exerted on the link through a MeasurandVector connection to the dynamic model.

The foregoing overview provides enough background to simplify the description of the individual processes in the Multiprocess Manipulator Simulator. In the following sections, each MMS process structure will be briefly reviewed and the execution life cycle explained.

5.3 The Process Manager and Synthetic Clock, SI

The Process Manager launches, safely initializes, and synchronizes each process in the system. Given a scenario, SI launches and initializes the necessary link processes, the dynamic modelling process, and the goal process. The process manager also establishes communication links to each process, used to synchronize processes to the synthetic clock. In the MMS, an artificial clock was adopted to maintain synchronization and to ensure that all the time dependent processes receive a common time stamp.

5.3.1 Execution

The process manager and synthetic clock are contained within the Simulation Initialization process, SI. Launched with the command:

SI <aScenarioName>

SI reads the robot database file and spawns a link process, LP, for every link in the manipulator description. A goal process, GP, and dynamic model, DM, are then spawned successively. Therefore at any time in the simulation cycle at least N + 2 processes are active for an N link manipulator. Once initialized, SI creates a set of synchronization sockets through which TimeStamps can be transmitted. The link and goal processes then await the time stamps from SI, the number of milliseconds (a long integer) from simulation $T_0$, broadcast at 1 ms intervals.
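The way a common time stamp could drive the rate-limited update(int TimeStamp) messages of the Sensor and Measurand vectors described earlier is sketched below in C++. This is an assumption-laden illustration rather than the MMS code: `RatedValue` and `refreshStale` are hypothetical names, the socket transport is replaced by a plain vector of source values, and update periods are taken to be whole milliseconds.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for one buffered element of a SensorVector or
// MeasurandVector, carrying its own update period.
struct RatedValue {
    double value;      // mirrored data element
    int periodMs;      // required update period in milliseconds
    int lastUpdateMs;  // time stamp of the most recent refresh

    // An element is stale once a full period has elapsed since its last refresh.
    bool stale(int timeStampMs) const {
        return timeStampMs - lastUpdateMs >= periodMs;
    }
};

// Refresh only the stale elements of the buffer from the source values and
// return how many elements were refreshed on this time stamp.
std::size_t refreshStale(std::vector<RatedValue>& buffer,
                         const std::vector<double>& source, int timeStampMs) {
    if (buffer.size() != source.size())
        throw std::invalid_argument("buffer and source must hold identical data sets");
    std::size_t refreshed = 0;
    for (std::size_t i = 0; i < buffer.size(); ++i) {
        if (buffer[i].stale(timeStampMs)) {
            buffer[i].value = source[i];
            buffer[i].lastUpdateMs = timeStampMs;
            ++refreshed;
        }
    }
    return refreshed;
}
```

With a 1 ms element and a 10 ms element in the same buffer, a time stamp of 1 refreshes only the fast element, while a time stamp of 10 refreshes both, which is one simple way to emulate sensors and actuators operating at different rates from a single synthetic clock.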
Execution commences with a call to the method execute(StartTime, EndTime, TimeStep). Only SI uses the experiment's duration from StartTime to EndTime; all other processes cycle ad infinitum. During execution, the process manager transmits time stamps to both goal and link processes, synchronizing them to a common, synthetic clock. Before sending the next timestamp, the process manager awaits an 'echo' of the timestamp from the dynamic modeler. If an error occurs, due to spawn failures or a late 'end-of-cycle' message, or if the simulation is concluded, the system invokes a clean shutdown by transmitting a kill notice, that is, a negative timestamp. During error based termination the system simply completes writing to execution logs and terminates socket connections in order to exit cleanly. During normal termination, the system awaits an 'end-of-cycle' message from DM before shutting down. The overall structure of the multiprocess manipulator simulator is illustrated in figure 5.24.

Figure 5.24: The interprocess dataflow of the Multiprocess Manipulator Simulator. As GP is not a mandatory process, it appears as a dashed circle.

5.4 The Dynamic Modeler, DM

The dynamic modeler represents the core of the simulation portion of the MMS. The purpose of the Dynamic Modeler is to accurately compute the response of a specific robot given physical characteristics such as the mass, inertia, relative geometry, and state of each link in the manipulator and the forces applied by each link motor. The dynamic modeler must also manage the motion of obstacles in the environment.

5.4.1 ManipulatorModel

An abstraction of a manipulator model is important both for the dynamic modelling aspect of the MMS and for controllers that rely on models for dynamic cancellation. Unlike other C++ modules in the class hierarchy that compromise speed for ease of use, the ManipulatorModel class has been optimized exclusively for speed. Object oriented methods are powerful programming tools and greatly simplify the design and testing process. However, OOP methods are often slower than traditional C procedural methods. Therefore, a compute intensive dynamic model [Luh, 1980a, Walker, 1982] was implemented based on proven 'C' tools [Press, 1988]. The manipulator was represented by an array of link structures, similar to the LinkModel object above. This approach permitted fast access to a given link's parameters through a single structure rather than simpler methods that aggregate manipulator parameters into monolithic arrays indexed by link number.

Kinematics and Dynamics

Kinematics and dynamics were computed using established Newton-Euler methods [Luh, 1980b, Walker, 1982]. Given link state values, $[q_i\ \dot{q}_i\ \ddot{q}_i]^T : 1 \le i \le N$, and the link Type, link cartesian position, velocity, and acceleration, $[\mathbf{x}_i\ \dot{\mathbf{x}}_i\ \ddot{\mathbf{x}}_i]^T : 1 \le i \le N$, were developed through a recursive outward computation from the manipulator base to the end effector.

$$\mathbf{R}_i = \mathbf{R}_{i-1}\,\mathbf{R}_i^{i-1}(q_i) \tag{5.146}$$

$$\mathbf{p}_i = \mathbf{R}_i\,\mathbf{p}_i^{i-1} + \mathbf{p}_{i-1} \tag{5.147}$$

$$\boldsymbol{\omega}_i = \begin{cases} \boldsymbol{\omega}_{i-1} + \mathbf{R}_{i-1}\dot{q}_i\mathbf{k} & \text{if revolute} \\ \boldsymbol{\omega}_{i-1} & \text{if prismatic} \end{cases} \tag{5.148}$$

$$\dot{\mathbf{p}}_i = \begin{cases} \dot{\mathbf{p}}_{i-1} + \mathbf{R}_{i-1}\dot{q}_i\mathbf{k} \times (\mathbf{p}_i - \mathbf{p}_{i-1}) & \text{if revolute} \\ \dot{\mathbf{p}}_{i-1} + \mathbf{R}_{i-1}\dot{q}_i\mathbf{k} & \text{if prismatic} \end{cases} \tag{5.149}$$

$$\dot{\boldsymbol{\omega}}_i = \begin{cases} \dot{\boldsymbol{\omega}}_{i-1} + \mathbf{R}_{i-1}\ddot{q}_i\mathbf{k} + \boldsymbol{\omega}_{i-1} \times \mathbf{R}_{i-1}\dot{q}_i\mathbf{k} & \text{if revolute} \\ \dot{\boldsymbol{\omega}}_{i-1} & \text{if prismatic} \end{cases} \tag{5.150}$$

$$\ddot{\mathbf{p}}_i = \begin{cases} \ddot{\mathbf{p}}_{i-1} + \dot{\boldsymbol{\omega}}_i \times (\mathbf{p}_i - \mathbf{p}_{i-1}) + \boldsymbol{\omega}_i \times \left(\boldsymbol{\omega}_i \times (\mathbf{p}_i - \mathbf{p}_{i-1})\right) & \text{if revolute} \\ \ddot{\mathbf{p}}_{i-1} + \mathbf{R}_{i-1}\ddot{q}_i\mathbf{k} + \dot{\boldsymbol{\omega}}_i \times (\mathbf{p}_i - \mathbf{p}_{i-1}) + 2\boldsymbol{\omega}_i \times \mathbf{R}_{i-1}\dot{q}_i\mathbf{k} + \boldsymbol{\omega}_i \times \left(\boldsymbol{\omega}_i \times (\mathbf{p}_i - \mathbf{p}_{i-1})\right) & \text{if prismatic} \end{cases} \tag{5.151}$$

Recall that the ith joint displacement (and its derivatives) is expressed as a vector directed along the prismatic or revolute axis of frame i - 1 (hence the frequent terms such as $\mathbf{R}_{i-1} q_i \mathbf{k}$). Link forces were developed in a recursive inward computation from end effector to manipulator base.
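The inertial forces and moments in each link are:

$$\mathbf{f}_{total_i} = m_i \ddot{\mathbf{p}}_i \tag{5.152}$$

$$\mathbf{m}_{total_i} = \mathbf{J}_i \dot{\boldsymbol{\omega}}_i + \boldsymbol{\omega}_i \times (\mathbf{J}_i \boldsymbol{\omega}_i) \tag{5.153}$$

where $\mathbf{J}_i$ is the link inertia in world coordinates. Applying Newton's Second Law to the ith link:

$$\mathbf{f}_{total_i} = \mathbf{f}_i - \mathbf{f}_{i+1} \tag{5.154}$$

$$\mathbf{m}_{total_i} = \mathbf{m}_i - \mathbf{m}_{i+1} + (\mathbf{p}_{i-1} - \mathbf{p}_{c_i}) \times \mathbf{f}_i - (\mathbf{p}_i - \mathbf{p}_{c_i}) \times \mathbf{f}_{i+1} \tag{5.155}$$

where $\mathbf{f}_{i+1}$ is the force applied to the ith link by link i + 1 and $\mathbf{p}_{c_i}$ is the position of the center of mass in world coordinates. Solving for $\mathbf{f}_i$ and $\mathbf{m}_i$:

$$\mathbf{f}_i = \mathbf{f}_{i+1} + \mathbf{f}_{total_i} \tag{5.156}$$

$$\mathbf{m}_i = \mathbf{m}_{i+1} + \mathbf{p}_i^* \times \mathbf{f}_i - (\mathbf{p}_i^* - \mathbf{p}_{c_i}^*) \times \mathbf{f}_{total_i} + \mathbf{m}_{total_i} \tag{5.157}$$

where $\mathbf{p}_i^* = \mathbf{p}_i - \mathbf{p}_{i-1}$ and $\mathbf{p}_{c_i}^* = \mathbf{p}_{c_i} - \mathbf{p}_{i-1}$ locates the ith link's center of mass (a constant when expressed in link i coordinates). The forces at the joint may then be extracted from $\mathbf{f}_i$ and $\mathbf{m}_i$ through:

$$\tau_i = \begin{cases} (\mathbf{R}_{i-1}\mathbf{k})^T \mathbf{m}_i & \text{if rotational} \\ (\mathbf{R}_{i-1}\mathbf{k})^T \mathbf{f}_i & \text{if prismatic} \end{cases} \tag{5.158}$$

The origin of the world coordinate system is fixed to the base of the manipulator and assigned a null velocity (i.e. $\dot{\mathbf{p}}_0 = 0$, $\boldsymbol{\omega}_0 = 0$). Gravity is simulated by applying an acceleration of magnitude g to the world coordinate system (e.g. $\ddot{\mathbf{p}}_0 = g\mathbf{k}$, $\dot{\boldsymbol{\omega}}_0 = 0$). Dynamics are initialized with either known end effector forces (e.g. $\mathbf{f}_{n+1} = \mathbf{f}_{ee}$, $\mathbf{m}_{n+1} = \mathbf{m}_{ee}$) or inertial payload forces (e.g. assuming a fixed end effector payload: $\mathbf{f}_{n+1} = m_{payload}\ddot{\mathbf{p}}_n$, $\mathbf{m}_{n+1} = \mathbf{J}_{payload}\dot{\boldsymbol{\omega}}_n + \boldsymbol{\omega}_n \times (\mathbf{J}_{payload}\boldsymbol{\omega}_n)$).

For clarity, the manipulator equations of motion have been presented in world coordinates. However, the actual dynamic model adopts a faster, though less intuitive, form of these equations composed in link coordinates. For a complete discussion of this well established technique consult [Luh, 1980a].

Model Integration

The objective of a dynamic simulator is to compute the accelerations of the joints given the applied torques, displacements, and velocities, integrating the results to develop the next time step's position and velocity.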
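As a minimal sketch of this compute-then-integrate step, the fragment below advances a single joint with one fixed-step fourth order Runge Kutta update. It assumes a scalar inertia and a scalar bias term in place of the full manipulator model, so that the acceleration is simply the applied torque less the bias, divided by the inertia. The names State, jointAcceleration, and rk4Step are illustrative only and are not taken from the MMS source.

```cpp
#include <array>
#include <cassert>
#include <functional>

// State of one joint: x[0] is position q, x[1] is velocity qd (hypothetical layout).
using State = std::array<double, 2>;

// Scalar stand-in for the joint acceleration: qdd = (tau - bias) / inertia.
// inertia and bias are assumed, caller-supplied values, not MMS quantities.
double jointAcceleration(double tau, double inertia, double bias) {
    assert(inertia > 0.0);
    return (tau - bias) / inertia;
}

// One fixed-step fourth order Runge Kutta update of [q, qd].
// accel(q, qd) must return qdd with the torque held constant over the step.
State rk4Step(const State& x, double h,
              const std::function<double(double, double)>& accel) {
    auto deriv = [&](const State& s) -> State {
        return State{s[1], accel(s[0], s[1])};
    };
    auto advance = [](const State& s, double a, const State& d) -> State {
        return State{s[0] + a * d[0], s[1] + a * d[1]};
    };
    const State k1 = deriv(x);
    const State k2 = deriv(advance(x, 0.5 * h, k1));
    const State k3 = deriv(advance(x, 0.5 * h, k2));
    const State k4 = deriv(advance(x, h, k3));
    return State{
        x[0] + h / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]),
        x[1] + h / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])};
}
```

Calling rk4Step once per millisecond of simulated time, with the torque supplied by the controllers, reproduces the pattern described here on a much smaller scale; a full model would replace the scalar division with the solution of a linear system.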
At each timestep (every 0.001 seconds of simulation time) the MMS manipulator model was integrated through a fixed timestep fourth order Runge Kutta integrator [Press, 1988]. Joint accelerations are determined from the robot equation:

$$\mathbf{D}(\mathbf{q})\ddot{\mathbf{q}} + \mathbf{C}(\mathbf{q}, \dot{\mathbf{q}})\dot{\mathbf{q}} + \mathbf{G}(\mathbf{q}) + \mathbf{J}^T\mathbf{f}_e = \boldsymbol{\tau} \tag{5.159}$$

The robot equation can be rewritten:

$$\mathbf{D}(\mathbf{q})\ddot{\mathbf{q}} = \boldsymbol{\tau} - \mathbf{b} \tag{5.160}$$

$$\mathbf{b} = \mathbf{C}(\mathbf{q}, \dot{\mathbf{q}})\dot{\mathbf{q}} + \mathbf{G}(\mathbf{q}) + \mathbf{J}^T\mathbf{f}_e \tag{5.161}$$

where $\mathbf{b}$ is a bias vector, the output of the recursive Newton Euler dynamics algorithm when $\ddot{\mathbf{q}}$ has been set to zero. The accelerations may then be computed simply through:

$$\ddot{\mathbf{q}} = \mathbf{D}(\mathbf{q})^{-1}\left[\boldsymbol{\tau} - \mathbf{b}\right] \tag{5.162}$$

The final problem is the determination of the mass matrix $\mathbf{D}(\mathbf{q})$. For a manipulator of known geometry an analytical solution may be determined manually a priori. However, for arbitrary manipulators a numerical method is required.

Determining the Manipulator Mass Matrix

The mass matrix algorithm (from [Walker, 1982]) employs an interesting technique to determine the manipulator mass matrix, the most CPU intensive task of a generic manipulator modelling system. Basically, the technique uses the relation that each element, ij, of the mass matrix is equivalent to the internal forces present in link i during a unit acceleration of link j, all other links being motionless. For a given accelerated link j, the inferior links > j are treated as a rigid body, requiring the computation of an effective mass center and moment of inertia for this rigid body. The forces exerted on the rigid body above link j are:

$$\mathbf{F}_j = M_j \mathbf{a}_j \tag{5.163}$$

where $M_j$ and $\mathbf{a}_j$ are the mass and acceleration of the composite rigid body in the jth coordinate system. Since only the jth link is accelerating and all others are motionless, this may be rewritten as:

$$\mathbf{F}_j = M_j \ddot{q}_j\, \mathbf{z}_{j-1} \times \mathbf{c}_j \tag{5.164}$$

where $\mathbf{c}_j$ is the rigid body's center of mass.
Rewritten for a unit acceleration:

$$\mathbf{F}_j = \begin{cases} \mathbf{z}_{j-1} \times M_j \mathbf{c}_j & \text{revolute joint } j \\ M_j \mathbf{z}_{j-1} & \text{prismatic joint } j \end{cases} \tag{5.165}$$

Similarly the applied moment becomes:

$$\mathbf{M}_j = \begin{cases} \mathbf{E}_j \mathbf{z}_{j-1} & \text{revolute joint } j \\ 0 & \text{prismatic joint } j \end{cases} \tag{5.166}$$

where $\mathbf{E}_j$ is the effective moment of inertia of the composite rigid body. These forces, $\mathbf{f}_i$, and moments, $\mathbf{m}_i$, are propagated down to the base of the manipulator employing the recursive relations begun at joint j:

$$\mathbf{f}_j = \mathbf{F}_j \tag{5.167}$$

$$\mathbf{m}_j = \mathbf{M}_j + \mathbf{c}_j \times \mathbf{F}_j \tag{5.168}$$

and propagated down to the base of the manipulator through all the links i < j:

$$\mathbf{f}_i = \mathbf{f}_{i+1} \tag{5.169}$$

$$\mathbf{m}_i = \mathbf{m}_{i+1} + \mathbf{p}_i \times \mathbf{f}_{i+1} \tag{5.170}$$

where $\mathbf{p}_i$ is the vector from joint i - 1 to joint i, remembering that all the links i < j are motionless. With the values of $\mathbf{f}_i$ and $\mathbf{m}_i$ known for each link i = 1 ... j, the mass matrix elements for the jth column may be determined by:

$$D_{ij} = \begin{cases} \mathbf{z}_{i-1} \cdot \mathbf{m}_i & \text{revolute joint } i \\ \mathbf{z}_{i-1} \cdot \mathbf{f}_i & \text{prismatic joint } i \end{cases} \tag{5.171}$$

The key in this process is designing this algorithm to be completely recursive, thus conserving CPU resources. Walker and Orin [Walker, 1982] describe the necessary recursive technique in detail, finally recommending a "method 3", used in DM, that has the least computations per degree of freedom. The total minimum computations to compute N joint accelerations using "method 3" becomes [Walker, 1982]:

multiplications :
additions :

Thus, at best, arbitrary manipulator dynamic model computations are $O(N^3)$.

5.4.2 Obstacle Modelling

In this simulation, obstacles were represented as rigid physical objects having both geometry and movement specified in the obstacle database. An ObstacleManager acted as a single interface between the dynamic modelling process and a set of obstacle objects. Updates or queries posted to the manager were broadcast to all obstacles.
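The manager pattern described above can be sketched as follows. This is an illustration under stated assumptions rather than the MMS implementation: the class layout, the method names update and nearestSurface, and the constant velocity motion of the sphere are all hypothetical, chosen only to show one interface forwarding every update and query to each obstacle it owns.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <memory>
#include <utility>
#include <vector>

using Vec3 = std::array<double, 3>;

// Hypothetical obstacle interface: each obstacle can be advanced in time and
// queried for the vector from a point to its nearest surface.
class Obstacle {
public:
    virtual ~Obstacle() = default;
    virtual void update(double time) = 0;
    virtual Vec3 nearestSurface(const Vec3& from) const = 0;
};

// A sphere moving at constant velocity (an assumed, simplified trajectory).
class SphereObstacle : public Obstacle {
public:
    SphereObstacle(Vec3 start, Vec3 velocity, double radius)
        : start_(start), velocity_(velocity), centre_(start), radius_(radius) {
        assert(radius > 0.0);
    }
    void update(double time) override {
        for (int i = 0; i < 3; ++i) centre_[i] = start_[i] + velocity_[i] * time;
    }
    Vec3 nearestSurface(const Vec3& from) const override {
        const Vec3 d{centre_[0] - from[0], centre_[1] - from[1], centre_[2] - from[2]};
        const double dist = std::sqrt(d[0] * d[0] + d[1] * d[1] + d[2] * d[2]);
        if (dist == 0.0) return Vec3{0.0, 0.0, 0.0};  // direction undefined at the centre
        const double scale = (dist - radius_) / dist;  // shorten the centre vector by one radius
        return Vec3{d[0] * scale, d[1] * scale, d[2] * scale};
    }
private:
    Vec3 start_, velocity_, centre_;
    double radius_;
};

// Single interface that forwards every update or query to all obstacles.
class ObstacleManager {
public:
    void add(std::unique_ptr<Obstacle> o) { obstacles_.push_back(std::move(o)); }
    void update(double time) {
        for (auto& o : obstacles_) o->update(time);
    }
    std::vector<Vec3> nearestSurfaces(const Vec3& from) const {
        std::vector<Vec3> out;
        for (const auto& o : obstacles_) out.push_back(o->nearestSurface(from));
        return out;
    }
private:
    std::vector<std::unique_ptr<Obstacle>> obstacles_;
};
```

Because the manager holds only base class pointers, a new obstacle geometry would need nothing more than another subclass of the hypothetical Obstacle interface.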
Obstacle objects were specified in database entries standardized about the obstacle type, current position, and path of the obstacle frame to form a trajectory script. Specific characteristics, unique to obstacle subclasses, were specified through nonstandard 'user defined variables'. Though only spheres were implemented (SphereObstacles), the obstacle interface was designed to be generic and could support arbitrary obstacle geometry.

In addition to motion, obstacles must also respond to sensor probes of the environment. Fortunately, such probing produces data of limited dimension: scalars, linear arrays, or matrices of values. For a given sensor type, the magnitude of these values depends entirely on the relevant obstacle property (e.g. shape, temperature, etc.). Thus sensor implementations reside not in a Sensor class but within the obstacle itself. In this way a Ping method applied to a particular obstacle returns the vector to the surface nearest to the sensor source, much like an ultrasound, lidar, or radar sensor. It is up to the obstacle implementation to develop the correct return vector for a particular obstacle morphology.

5.4.3 Physical Realizability

Though every attempt has been made to ensure that the manipulator and motor dynamic models are realistic, one important real world characteristic has been omitted from DM: interference checking. Interferences occur when two objects simultaneously occupy identical regions of space. Of course, in reality colliding objects do not intersect, but rebound according to the kinematic and material properties of both bodies. In manipulation simulation, interference is frequent as revolute manipulators rotate ad infinitum about their axes and links intersect both work space obstacles and other links. The solution is to model and maintain the boundaries of each object in the system and, if interference is detected, apply collision forces to interfering bodies.
So for N simulated bodies, each body requires N - 1 interference checks, or N(N - 1) checks per timestep, and, if necessary, collision forces must be applied. Collision modelling, too, is nontrivial, requiring accurate surface geometric and material descriptions to compute incident angles and return forces. Neither the link nor the obstacle modelling systems adopt interference checking (its omission is a common practice in many manipulator simulations), nor is a collision model provided. The computational cost of such checking, the subsequent difficulty of developing a realistic collision model, and the limited value of such extensions to an investigation of multiagent control are sufficient cause to leave these features to future manipulator simulation systems.

So while manipulator dynamics have been developed according to standard, well established simulation techniques, DM is unable to accurately portray manipulator collision dynamics during interference events. Though theoretically valuable, the manipulator's response during such events is, nevertheless, physically unrealizable.

5.4.4 Execution

When DM is spawned by SI:

    DM <aScenarioName>

the robot database is used to build the ManipulatorModel object discussed above and to construct two socket connections to each link process: a measurand connection to provide each link with joint state data and a dynamic stream connection to receive motor control commands. DM also inspects the global goal database for global goal sensor requirements (e.g. force sensing in RMFC). Obstacles and their trajectories are created based on the contents of the obstacle database. DM then creates both an execution log and a result history within the scenario directory. Once initialized, DM transmits initial sensor data to the link processes. The process then enters the main event loop by awaiting a controller command, a timestamp and force value pair.
If either a kill notice is received or a discrepancy in the timestamps is observed, an error condition has occurred and DM begins shutdown and cleanup. Otherwise, the model proceeds with the integration process, a call to the RungeKutta4(.) method.

The result of a successful integration, a new manipulator state, is immediately transmitted to the link processes as position sensor data. Similarly, force data computed through the Newton Euler dynamic model methods may also be transmitted, if required. Integration fails only if the inversion of the mass matrix, $\mathbf{D}(\mathbf{q})$, in equation 5.162 (via LU decomposition) fails, a sure sign of lurking algorithmic errors or controller instability. Such failures trigger kill notice transmissions and subsequent process shutdowns. If the state transmissions are successful, an end of cycle message is passed to SI and the system returns to the start of the main event loop.

The dynamic modeler executes the main event loop ad infinitum. However, any kill notice frees DM from this loop and invokes a clean shutdown of the process, including the transmission of the final timestamp to SI, the storage to file of historical data, the deallocation of memory, and the closure of socket connections.

Thus far the simulator portion of MMS, the basic infrastructure of the system, has been described. The following sections detail the control portion and represent a significant divergence from traditional monolithic control architectures.

5.5 The Global Goal Process, GP

The global goal process serves a key function in MMS: to compute and distribute goals to link processes. The design of the global goal process accommodates multiple global goals through the adoption of a GoalArray, a simple array of goal generation objects. As indicated in definition D4.1, a goal is a trajectory through some goal space. The goal space varies between control strategies.
Traditional RMPC systems translate desired end effector locations into local goals for each link process, position setpoints in joint space. RMAC systems, however, translate these same locations into force setpoints for each link process. JTC, as discussed earlier, uses a goal couplet containing both end effector position and desired force (equation (4.130)).

5.5.1 The GlobalAgent Object

The GlobalAgent forms the core of the global process. This object has two important data structures: the Goal array and the AgentIOStream array. The Goal array contains a list of active global goals, and the AgentIOStream is an array composed of SensorVectors and MeasurandVectors, each pair of which comprises an input and output data flow between global and link processes. The purpose of the GlobalAgent is to generate and distribute global goals to all of the link processes.

There are two sources of goal generation. The first is through local goal generators, reviewed in the next chapter, such as:

• Resolved Motion Force Control
• Resolved Motion Decentralized Adaptive Control
• Resolved Motion Acceleration Control

The second mechanism is through remote goal generation in which link processes assert global goals to the agent team, demonstrated in Resolved Motion Obstacle Avoidance in the next chapter. In this case, the global goal process acts as a rebroadcast medium, propagating a single goal expression to the entire agent group.

To support multiple goals, the GlobalAgent employs a Goal array. Given goals specified within the Global Goal database file, the GlobalAgent instantiates objects inheriting from the Controller class into the Goal array.

The Controller Abstract Superclass

The Controller class encapsulates any object possessing a time dependent response, such as goals and controllers, within an abstract superclass.
By adopting an abstract superclass for various goals, goal subclasses can be placed into a single array of Controllers. Practically speaking, this means that all Goal array members will respond to the update() method regardless of their subclass. For a further description of the abstract superclass concept consult [Lippman, 1991].

IOStreams

To communicate global goals and to receive goal assertions and kinematic bus data, a general purpose interprocess input/output data structure, or IOStream, implements a bilateral communication connection. The GlobalAgent object can pass these connections to members of the Goal array, allowing each to establish input or output data streams with the agent team at will. In this way an end effector trajectory tracking task can simultaneously receive end effector state data and obstacle avoidance goal assertions without interference from the GlobalAgent object.

The only requirements for successful mating of goal and agent communication are that the agent proxies and the global goals employ identical data formats and that space is allocated in SensorVector and MeasurandVector structures for both the link and the global goal processes. Fortunately, the latter is easily ensured by adopting identical sequences in both the link controller and global goal lists in the robot and global goal database files respectively.

5.5.2 Execution

At launch GP parses the robot in the Scenario and creates the necessary number of IOStreams. Based on their order of appearance in the global goal database, GP instantiates the necessary goal generators into the Goal array, passing the IOStreams to each generator.
Within each goal generator, data is allocated from the SensorVector and MeasurandVector components of the IOStreams. This allocation step, when mirrored in the link processes, establishes the equivalent of a shared memory connection between the two processes. Though not all goal generators allocate SensorVector memory, all allocate MeasurandVector space through which the global goal is transmitted.

Once initialized, GP enters an event loop by awaiting the arrival of a timestamp from SI. On receipt of a timestamp, the input IOStream is updated. If successful, the GlobalAgent successively calls the update() method for each Controller subclass in the Goal array. This method either updates or copies goal assertions into the appropriate allocated variables in the output IOStream. The event loop is concluded by updating the output IOStream.

Just as the receipt of a kill notice invokes shutdown and cleanup, so too will an IOStream transmission or receipt failure. Like the other processes, termination does not occur until the execution and historical data logs are completed.

5.6 The Link Process, LP

The link process draws on the scenario database to construct an agent that receives sensor data streams from the dynamic model, acts on an arbitrary number of both local and global goals, and returns a force output back to the dynamic model.
The core of LP is the Agent object.

5.6.1 The Agent Object

The Agent Object implements the concept of an agent as defined in section 4.8. Since the purpose of an agent is to arbitrate between behaviours employing some decision or combination strategy, a means of storing and invoking generic behaviours must be provided, as well as a mechanism for the collection and supply of goal and sensor data to and from external processes. Structurally, an agent object is composed of four fundamental entities:

• the Kinematic Bus
• the IOStream
• a set of Controllers
• an arbitrator

exploiting six fundamental data flows:

• Kinematic bus streams (input/output)
• Sensor data streams (input)
• Global goal IOStreams (input/output)
• Control effort (output)

5.6.2 The Kinematic Bus

LP decentralizes control by distributing cartesian state computations over each link of the manipulator. This is accomplished by supplying each link process with locally sensed displacement data, a local geometric model, and distributed Newton Euler equations coupled to an interagent communication infrastructure, or Kinematic Bus. This infrastructure has a flow through recursive architecture in which the ith link successively receives and transforms cartesian state information packets from link i - 1 and transmits the transformed data in a packet to link i + 1. The Kinematic Packet sent from i to i + 1 is a message with syntax:

$$\text{Kinematic Packet, } E_i : \{\mathbf{p}_i, \mathbf{R}_i, \dot{\mathbf{p}}_i, \boldsymbol{\omega}_i\} \tag{5.172}$$

Since values are in double precision (8 bytes) the kinematic packet is 144 bytes:

• $\mathbf{R}_i$, an orientation matrix (72 bytes).
• $\mathbf{p}_i$, a position vector in base coordinates (24 bytes).
• $\dot{\mathbf{p}}_i$, a velocity vector in base coordinates (24 bytes).
• $\boldsymbol{\omega}_i$, an angular velocity vector in base coordinates (24 bytes).

The Kinematic Bus encapsulates the reception, transformation, and transmission of cartesian state data, as well as sensed data collection from the dynamic model, DM, within a single KinematicBus class. Communications are through SensorVector and MeasurandVector objects updated at every timestep.

On receipt, a kinematic packet is transformed from the cartesian state of $A_{i-1}$ to $A_i$ through the following set of recursive Newton Euler equations:

$$\mathbf{R}_i = \mathbf{R}_{i-1}\,\mathbf{R}_i^{i-1}(q_i) \tag{5.173}$$

$$\mathbf{p}_i = \mathbf{R}_i\,\mathbf{r}_i + \mathbf{p}_{i-1} \tag{5.174}$$

$$\boldsymbol{\omega}_i = \begin{cases} \boldsymbol{\omega}_{i-1} + \mathbf{R}_{i-1}\dot{q}_i\mathbf{k} & \text{if revolute} \\ \boldsymbol{\omega}_{i-1} & \text{if prismatic} \end{cases} \tag{5.175}$$

$$\dot{\mathbf{p}}_i = \begin{cases} \dot{\mathbf{p}}_{i-1} + \mathbf{R}_{i-1}\dot{q}_i\mathbf{k} \times (\mathbf{p}_i - \mathbf{p}_{i-1}) & \text{if revolute} \\ \dot{\mathbf{p}}_{i-1} + \mathbf{R}_{i-1}\dot{q}_i\mathbf{k} & \text{if prismatic} \end{cases} \tag{5.176}$$

where $\mathbf{k}$ is the link's revolute axis of rotation or the prismatic axis of translation (according to the Denavit Hartenberg convention [Denavit, 1955]) in link coordinates and $(q_i, \dot{q}_i)$ are the sensed joint positions and velocities respectively (see footnote 2). The local geometric model is limited to Denavit Hartenberg parameters.

5.6.3 The Global Goal IOStream

As mentioned in the description of GP, the global goal and link processes communicate through IOStream connections, that is, pairs of SensorVector and MeasurandVector structures. Controller proxies receive global goal directions (such as global setpoints) through the SensorVector, while data (such as end effector locations, obstacle avoidance assertions, etc.) can be transmitted to a global goal process through the MeasurandVector structure.

5.6.4 Controller Objects

A key feature of the MMS in general and LP in particular is controller modularity. This was greatly simplified through the adoption of the Controller abstract superclass for the description of local controllers. All controllers employed within an agent were subclasses of Controller.
Though most local controllers used fixed setpoints specified within the robot database file, another variety of controller acted on behalf of the global goal process to influence the agent's actions. When a global objective is being pursued, SI launches the global control process, GP, which generates and transmits instructions to the link processes, LP. This requires that each link has a mechanism for receiving and interpreting externally generated goal process instructions. Controllers that implement these reception protocols are proxies, acting on behalf of the global goal by specifying a control effort to the agent. Controller proxy objects allow external goal processes to act as a local controller to a link object. These proxies are indistinguishable from any other controller object from the Link object's standpoint. However, a controller proxy differs from local controllers by establishing a communication link between an independent controller process and the link object. The goal process (described earlier) continuously broadcasts goal objectives to the client controller proxies. Semantically one can interpret such transmissions as the arrival of global sensory data to a local controller. Depending on the control rate of the sensor, the controller proxy may or may not act upon the information received from the global goal.

Footnote 2: As a second derivative of position, cartesian accelerations are considered too volatile to incorporate into the kinematic packet.

The Controller Array

Active controllers (and controller proxies) are stored within an array of M Controller objects, $\mathbf{u}_A$:

$$\mathbf{u}_A = [u_1\ u_2\ \ldots\ u_M]^T \tag{5.177}$$

After the link process parses the link database file, legal controller database definitions are instantiated as subclasses of the Controller class and added to the controller array.
Once instantiated, the link process has no mechanism for discriminating between controller objects, treating each controller identically by invoking an update() message in all members of the controller array at each timestep.

5.6.5 Arbitration

In this agent implementation a simple linear arbitrator of the form described in equation (4.92) is used to weight controller array responses. The jth controller definition in the link database file can assign a constant arbitration coefficient $k_j$ (by default $k_j = 1.0$). These coefficients are, in turn, stored as an arbitration vector $\mathbf{k}_A$:

$$\mathbf{k}_A = [k_1\ k_2\ \ldots\ k_M]^T \tag{5.178}$$

The agent output, $u_c$, becomes:

$$u_c = \sum_{j=1}^{M} k_j u_j \tag{5.179}$$

As discussed in the following chapters, even this simple strategy can produce interesting 'self arbitrating' behaviour between rival goal systems.

5.6.6 Execution

At launch, link processes establish bus connections through the SensorVector and MeasurandVector objects to neighbouring links, and the terminal link (the end effector) establishes a similar connection through an IOStream connection to the global goal process. The link cartesian state is updated upon confirmation of incoming data from the Kinematic State Bus (KSB) describing the cartesian kinematic state of the lower link frame. Once this data has been retrieved, the Newton Euler transformations (either revolute or prismatic) are applied to the cartesian state and the result is transmitted immediately to the next link process via the KSB. If global goals have been specified, each link process awaits any global goal broadcasts. Each controller is then anonymously invoked and the responses combined within the linear arbitrator to form $u_c$. Finally, $u_c$ is transmitted to the dynamic model as a motor torque.

As in other processes, the receipt of a kill notice invokes shutdown and cleanup, as does the failure of interprocess communications.
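The anonymous invocation and linear arbitration steps can be sketched as follows. The Controller superclass and its update() message follow the description above, but the two subclasses, their constructors, and the arbitrate function are hypothetical stand-ins written for this illustration; the actual MMS class interfaces may well differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Abstract superclass: any object with a time dependent response.
class Controller {
public:
    virtual ~Controller() = default;
    virtual double update(long timeStamp) = 0;  // returns the control effort u_j
};

// Hypothetical local controller: proportional action toward a fixed setpoint.
class ProportionalController : public Controller {
public:
    ProportionalController(double setpoint, double gain, const double& sensed)
        : setpoint_(setpoint), gain_(gain), sensed_(sensed) {}
    double update(long) override { return gain_ * (setpoint_ - sensed_); }
private:
    double setpoint_;
    double gain_;
    const double& sensed_;  // refers to the caller's sensed joint position
};

// Hypothetical proxy: relays the effort most recently received from a global goal.
class ControllerProxy : public Controller {
public:
    explicit ControllerProxy(const double& received) : received_(received) {}
    double update(long) override { return received_; }
private:
    const double& received_;
};

// Linear arbitration: u_c = sum over j of k_j * u_j.
// Every controller is invoked through the base class, with no knowledge of its subclass.
double arbitrate(const std::vector<std::unique_ptr<Controller>>& controllers,
                 const std::vector<double>& k, long timeStamp) {
    assert(controllers.size() == k.size());
    double uc = 0.0;
    for (std::size_t j = 0; j < controllers.size(); ++j) {
        uc += k[j] * controllers[j]->update(timeStamp);
    }
    return uc;
}
```

With a sensed position of 0.25, a setpoint of 1.0 at gain 2.0, and a proxy relaying an effort of -0.5 weighted by 0.5, the combined output is 1.0(1.5) + 0.5(-0.5) = 1.25.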
Again, termination occurs only after execution and historical data logs are completed.

5.7 Operational Overview

Figure 5.25 provides an overview of the operation of the Multiprocess Manipulator Simulator. At the beginning of each cycle a timestamp is transmitted to the GP and all LP processes. The timestamp triggers the base LP process to transmit the kinematic packet $E_1$ to $LP_2$, in turn triggering similar recursive transmissions down the kinematic chain to $LP_N$ and, if necessary, GP. With updated end effector data, GP is free to compute and broadcast the global goal couplet to all $LP_i$. On receipt of global goal couplet(s) each $LP_i$ computes and transmits a response to the dynamic modelling process, DM. DM incorporates these control efforts into the dynamic model and integrates the system to the next time step. Finally, the model transmits sensed data back to each link process.

Figure 5.25: Interprocess data flow between the link processes 1 ... N, global goal process GP, process SI, and dynamic model DM. Panels: TimeStamp Broadcast, Kinematic Update, Tracking Goal Generation, Agent Response, Model Integration. Note that the link state $\mathbf{x}_i = [q_i\ \dot{q}_i]^T$.

5.8 Data Logs and Viewing

Only GP, DM, and LP processes generate experimental data, all stored in historical data logs. Rather than archiving results from every 1 ms time step, all results data streams are sampled every five timesteps to reduce potentially enormous data files to manageable sizes. Nevertheless, running a 10 degree of freedom manipulator over a 40 second simulation run consumes 12 Mbytes.

Digesting and analyzing these data files was, itself, a significant undertaking.
The MMS class hierarchy, in conjunction with Pixar's excellent QuickRenderMan™ image rendering library [Upstill, 1990] and the superb NEXTSTEP™ [NeXT, 1993] 'AppKit' libraries, greatly facilitated the rapid recovery, representation, manipulation, and archiving of robot images from these data files. Data plots were generated through the powerful MATLAB™ environment.

5.9 Verification

Verification of such complex software systems must be a cumulative process, with each software layer tested independently. By ensuring that vector mathematics modules (for example) are functioning to specification, only the algorithmic correctness of later modules (such as the dynamic modeler) need be verified. Extensive cumulative testing played a crucial role in establishing confidence in the simulator software.

While cumulative tests are necessary and sufficient for service utilities, they are ineffective in testing the algorithmic correctness of physical simulations. Therefore, the final verification step in the simulator development process is to validate the kinematic, dynamic, and mass matrix computations by comparing the simulator against a standard, verified, reference model.

To speed the verification process and ensure accurate implementation of a reference model, the dynamic modelling facilities of the simulator were verified against an established symbolic dynamics generator, NEDYN, provided by Toogood [EL-Rayyes, 1990]. This remarkable software generates symbolic dynamic models for either rigid or flexible manipulator models in FORTRAN, further optimizing these models into the minimum number of operations. At once, this software provides arbitrary dynamic models, optimized and verified through previous testing. After minor conversion to C++, NEDYN's output was used to verify the arbitrary dynamics generator, DM, through comparative testing.
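Comparative testing of this kind reduces to checking that two models return the same numbers, within a tolerance, for the same inputs. The helper below is a minimal sketch of such a check, assuming both models deliver their results (joint accelerations, say) as equally sized vectors of doubles. The function names and the notion of a caller-chosen tolerance are assumptions made for this illustration and do not describe the procedure actually used with NEDYN.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest absolute difference between two equally sized result vectors.
double maxAbsDifference(const std::vector<double>& tested,
                        const std::vector<double>& reference) {
    assert(tested.size() == reference.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < tested.size(); ++i) {
        worst = std::max(worst, std::fabs(tested[i] - reference[i]));
    }
    return worst;
}

// True when every element of the tested model agrees with the reference
// to within the caller's tolerance.
bool agreesWithReference(const std::vector<double>& tested,
                         const std::vector<double>& reference,
                         double tolerance) {
    assert(tolerance >= 0.0);
    return maxAbsDifference(tested, reference) <= tolerance;
}
```

Reporting the largest discrepancy, rather than a bare pass or fail, helps localize which joint or matrix element has diverged.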
Though the specific verification reference model must be sufficiently rich to exercise all aspects of the modelling system, it must not be overly complex. Indeed, as an error detection tool, it is important that discrepancies between the reference and tested models be easily resolved. For this reason a relatively simple planar manipulator model is sufficient to exercise the dynamic modeler and mass matrix computation system and yet is simple enough to provide a useful debugging tool. Based on previous matrix algebra verifications, planar manipulators provided sufficient complexity to demonstrate algorithmic correctness.

Parameter Name       Symbol           Value (link i)
Mass                 m_i              1.000e+01 kg
Pr. Inertias         I_jj             [+0.0500 +0.4938 +0.4938] kg-m^2
Centroid             c                [-0.3750 +0.0000 +0.0000] m
Type                 -                R (revolute)
Init. Displacement   θ_i (or d)       +0.000e+00° (or m)
Init. Velocity       θ̇_i              +0.000e+00°/s (or m/s)
Twist                α_i              +0.000e+00°
Length               d_i              7.500e-01 m
Back EMF             -                1.0
Saturation           -                +1.000e+06 N-m (unsaturated)

Table 5.4: Reference Link, Joint, and Motor parameters. Centroid is relative to link frame.

Figure 5.26: The 3R planar reference model used to verify DM against NEDYN.

5.9.1 A 3R Reference Model

One such example is the planar three degree of freedom robot depicted in figure 5.26 and composed of three identical links. Within the robot database file, each link entry clearly identifies all the relevant parameters for a given link. A link name distinguishes each entry and provides a source of unique socket names. Link parameters such as the link mass, inertial and centroidal data (in local coordinates), and the Denavit-Hartenberg parameters, θ, d, a, α, provide critical geometric data for the assembly of both kinematic and dynamic models. Saturation is included to provide real world constraints on torque/force output from each actuator. These values are summarized in table 5.4. These physical parameters are followed by controller specifications.
For the verification runs, a simple PD controller, PID.joint.center (described in detail later), was used with a setpoint of +1.57 rad and position and velocity gains set to 100 and 20 respectively, operating at 120 Hz. The payload data file specifies a 10 kilogram payload centered on the end effector. Since PID.joint.center controllers pursue the setpoint specified in the database entry, they do not require a central global goal generator, and GP is never launched.

[Figure 5.27 plots: joint displacement versus Time (seconds) for each of the three links]

Figure 5.27: Here, the superimposed displacement responses of both the symbolic NEDYN and numerical 3R planar reference manipulator (link 1, top; link 2, middle; link 3, bottom) indicate excellent agreement between the two methods.

The experimental results depicted in figure 5.27 demonstrate excellent agreement between the symbolically optimized NEDYN and the equivalent numerical methods described by Walker and Orin. Since NEDYN and the numerical modeler are based on identical Newton-Euler methods and use identical manipulator models, identical output from the two systems is to be expected.

5.10 Summary

In this chapter we have reviewed the design and implementation of the Multiprocess Manipulator Simulator and verified its performance against a published numerical simulation system. In the following chapters, this simulator will be applied to the multiagent manipulator control problem and the goal and link processes expanded to implement multiple local and global goals. The next chapter, however, will examine the performance of the global goal system for trajectory tracking, through comparison of goal generation methods and demonstration of the global goal proxy concept.
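As an illustration of the joint controller used in the verification runs of section 5.9.1, the following minimal sketch applies a PD law with the stated setpoint, gains, and update rate. The class name, the exact form of the control law (position gain on the error, velocity gain on the measured joint velocity), and the zero-order hold between 120 Hz updates are assumptions made for illustration only; they are not taken from the PID.joint.center implementation.

```python
class PDJointController:
    """Hypothetical PD joint controller; not the thesis's PID.joint.center code."""

    def __init__(self, setpoint=1.57, kp=100.0, kv=20.0, rate_hz=120.0):
        self.setpoint = setpoint      # desired joint angle (rad)
        self.kp = kp                  # position gain
        self.kv = kv                  # velocity gain
        self.period = 1.0 / rate_hz   # controller update period (s)
        self._next_update = 0.0       # time of the next scheduled update (s)
        self._torque = 0.0            # last command, held between updates

    def torque(self, t, q, qd):
        """Return the commanded torque at simulation time t (s)."""
        if t >= self._next_update:
            # Assumed PD law: proportional on position error, damping on velocity.
            self._torque = self.kp * (self.setpoint - q) - self.kv * qd
            self._next_update = t + self.period
        return self._torque
```

Because the simulator integrates at 1 ms while the controller runs at 120 Hz, the held command is returned unchanged for the several integration steps that fall between controller updates.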
Chapter 6

Global Goal Generation

6.1 Introduction

In chapter 4, resolved motion position control, resolved motion rate and acceleration control, and Jacobian transpose control were evaluated as global goal distribution mechanisms. Based on an examination of the structure of these inverse aggregate methods, RMRC and RMAC were shown to be less desirable than Jacobian transpose methods. Further, it was suggested that if each link's cartesian state could be sensed or computed, Jacobian transpose control becomes effectively decentralized. By projecting a global goal, an end effector force, onto the ith link through the ith row of the Jacobian transpose, a global goal couplet was defined in equation (4.130):

\[ \mathcal{G} = (p_n, f_d) \tag{6.180} \]

where $p_n$ is the end effector position. This chapter will explore a multiagent manipulation system based on these precepts.

Having identified a distribution mechanism for $f_d$ in the global goal proxy, a method of generating $f_d$ must be selected from the following possible techniques: Resolved Motion Force Control, the Operational Space Formulation, and Configuration Control. Each of these develops joint forces based on an end effector force vector, $f_d$, a function of a desired task space trajectory $x_d(t)$:

\[ f_d = f_d\big(x_d(t),\, x_e(t),\, \dot{x}_e(t)\big) \tag{6.181} \]

where $x_e(t) = x_d(t) - x_a(t)$, and each, to varying degrees, employs a centralized kinematic model of the manipulation system. In the Operational Space Formulation, cartesian trajectory tracking forces are generated through a task space feedforward manipulator model. In a simpler technique, Resolved Motion Force Control develops a force trajectory based on a feedforward payload model, compensating for 'unmodelled' dynamics with a force feedback stochastic estimator. Direct Adaptive Control (for clarity renamed here Resolved Motion Decentralized Adaptive Control or RMDAC) employs model reference adaptive controllers to compensate for both payload and manipulator dynamics.
Appendix E briefly reviews the pros and cons of OSF and RMFC as goal generator candidates, while the details of Resolved Motion Decentralized Adaptive Control, the preferred goal generator, are described below.

6.2 Global Goal Generation

A number of force trajectory generation methods have been proposed for Jacobian Transpose Control, including Khatib's Operational Space Formulation (OSF), Wu and Paul's Resolved Motion Force Control (RMFC), and Seraji's Direct Adaptive Control (DAC). Intended as centralized end effector control architectures that maintain a single, coherent, manipulator model, these techniques must be reevaluated as global goal generation systems. In short, what constitutes an appropriate global goal generator for multiagent manipulator control?

Recall that three forms of coordination span multiagent system control: explicit, implicit, and emergent, each representing a compromise between deterministic behaviour and robustness. Explicit coordination strategies regulate each agent's behaviour through a centralized model based process. Implicit coordination strategies control global behaviour without any team model. Emergent coordination methods produce global behaviour purely through agent interaction, without any global goal specification or modelling.

Local and global behaviour are deterministic for systems under explicit coordination, though such systems tend to be fragile in the face of unmodelled changes. By sacrificing some local determinism, implicitly coordinated systems maintain globally deterministic behaviour and gain robustness to unmodelled events. Local behaviour under emergent coordination is often unpredictable, while global behaviour often cannot be analytically guaranteed. Nevertheless, these systems routinely fulfill global objectives while demonstrating resilience to environmental change. For multiagent manipulator control, a balance between determinism and robustness must be maintained.
With less modelling to achieve a desired behaviour comes less computation, lower expense, and robustness to change. Yet, to be useful, global behaviour must also be deterministic. Clearly a compromise exploiting some form of implicit coordination is necessary. In decentralizing Jacobian transpose control, a centralized kinematic model has been removed from the goal distribution mechanism, a significant step towards this compromise. However, the global goal generator, $f_d$, too, must limit model dependency. Therefore, establishing model independence within a global goal generator is an important objective in implicit multiagent controller design.

Retaining centralized feedforward dynamic models of the manipulator, both OSF and RMFC are less attractive implicit coordination strategies (see appendix E) than Seraji's Direct Adaptive Control. This latter controller replaces the last vestige of a feedforward dynamic model with an adaptive task space controller that models an ideal generic trajectory tracking process rather than a specific manipulator geometry. In so doing this global goal generator becomes manipulator independent or model-free [footnote 1]. In the following section DAC will be explored in detail and its performance within a multiagent controller demonstrated.

6.2.1 Resolved Motion Decentralized Adaptive Control

Though the internal operation of any global goal generator, such as RMDAC, is a separate issue from multiagent control, the global performance of the multiagent team is dependent on the performance of this generator. Furthermore, the less manipulator modelling embedded within the generator, the more robust the global goal system is to environmental changes. A close examination of this global goal generator is, therefore, worthwhile.

In 1987 Seraji [Seraji, 1987b] introduced Direct Adaptive Control and later Configuration Control of manipulators in cartesian space.
To differentiate this technique from a further variation, Improved Configuration Control for joint space controllers, these methods are henceforth referred to as Resolved Motion Decentralized Adaptive Control (RMDAC) in recognition of their cartesian adaptive control heritage. This technique differs from OSF and RMFC in that a model reference adaptive controller (MRAC) generates the desired end effector forces based on an idealized trajectory tracking process, devoid of any manipulator model.

The controller may be derived from a linearized manipulator model about an initial condition P. At P, the manipulator is at joint positions $q_0$, end effector position $x_0$, and joint torque $\tau_0$. By perturbing $\tau_0$ slightly by $\delta\tau(t)$ to form $\tau(t) = \tau_0 + \delta\tau(t)$, the joint and end effector positions change by $\delta q(t)$ and $\delta x(t)$ to form $q(t) = q_0 + \delta q(t)$ and $x(t) = x_0 + \delta x(t)$. In [Seraji, 1987a], Seraji shows that by linearizing equation (2.34) about P:

\[ A\,\delta\ddot{q}(t) + B\,\delta\dot{q}(t) + C\,\delta q(t) = \delta\tau(t) \tag{6.182} \]

where A, B, and C are constant n × n matrices dependent on P. By using equations (2.33) and (2.7), this expression can be transformed into cartesian space [footnote 2]:

\[ A\,\delta\ddot{x}(t) + B\,\delta\dot{x}(t) + C\,\delta x(t) = \delta f(t) \tag{6.183} \]

Seraji then defines $\delta r(t)$ as an "incremental reference trajectory" in cartesian space. To track this trajectory, feedback and feedforward controllers are developed. The feedback controller provides a stable closed loop control, ensuring asymptotic convergence of any error to zero:

\[ \delta f_1(t) = K_p\,\delta x_e(t) + K_v\,\delta\dot{x}_e(t) \tag{6.184} \]

where $\delta x_e(t) = \delta r(t) - \delta x(t)$.

Footnote 1: Model-free in the sense that the global goal neither estimates nor guarantees convergence of any manipulator parameters.

Footnote 2: Significantly, Seraji points out the system could be derived entirely in cartesian space and, therefore, irrespective of any manipulator model.
The feedforward controller ensures that $\delta x(t)$ tracks the reference trajectory $\delta r(t)$ and is simply the inverse of the end effector model in equation (6.183):

\[ \delta f_2(t) = A\,\delta\ddot{r}(t) + B\,\delta\dot{r}(t) + C\,\delta r(t) \tag{6.185} \]

Substituting $\delta f_1(t)$ and $\delta f_2(t)$ into $\delta f(t)$ and rewriting equation (6.183):

\[ A\,\delta\ddot{x}_e(t) + (B + K_v)\,\delta\dot{x}_e(t) + (C + K_p)\,\delta x_e(t) = 0 \tag{6.186} \]

Now this control scheme works well for regulating small errors about the nominal operating point, P. However, the force required to control the end effector over a general trajectory, $r(t)$, is the sum of the incremental control forces required to maintain the control at $\delta r(t)$ and the nominal force $f_0(t)$ required to drive the end effector along the trajectory, $r(t)$:

\[ f = f_0 + \delta f \tag{6.187} \]

\[ f = f_0 + K_p\,\delta x_e(t) + K_v\,\delta\dot{x}_e(t) + A\,\delta\ddot{r}(t) + B\,\delta\dot{r}(t) + C\,\delta r(t) \tag{6.188} \]

Now defining the total reference trajectory and end effector positions as $r(t) = r_0 + \delta r(t)$ and $x(t) = x_0 + \delta x(t)$ respectively, equation (6.188) can be rewritten as:

\[ f = d + K_p\,x_e(t) + K_v\,\dot{x}_e(t) + A\,\ddot{r}(t) + B\,\dot{r}(t) + C\,r(t) \tag{6.189} \]

\[ d = f_0 - A\,\ddot{r}_0 - B\,\dot{r}_0 - C\,r_0 \tag{6.190} \]

By generalizing these equations to control the nonlinear end effector dynamics, Seraji establishes the following system:

\[ M(x,\dot{x})\,\ddot{x} + N(x,\dot{x})\,\dot{x} + G(x,\dot{x})\,x = d(t) + K_p\,x_e(t) + K_v\,\dot{x}_e(t) + A\,\ddot{r}(t) + B\,\dot{r}(t) + C\,r(t) \tag{6.191} \]

where $d(t)$ corresponds to the operating point term $d$ and is synthesized by the adaptive controller. Substituting $x = r - x_e$ and rewriting in terms of the total tracking error $x_e$:

\[ M\,\ddot{x}_e + (N + K_v)\,\dot{x}_e + (G + K_p)\,x_e = -d(t) + (M - A)\,\ddot{r}(t) + (N - B)\,\dot{r}(t) + (G - C)\,r(t) \tag{6.192} \]

From this equation it is clear that $d(t)$ and $r(t)$ drive the error system of the total controller. Therefore it is essential to adapt the gains of the system. Rearranging equation (6.192) into a state variable form:

\[ \dot{z}(t) = \begin{bmatrix} 0 & I_n \\ -M^{-1}(G + K_p) & -M^{-1}(N + K_v) \end{bmatrix} z(t) + \begin{bmatrix} 0 \\ -M^{-1} \end{bmatrix} d(t) + \begin{bmatrix} 0 \\ M^{-1}(G - C) \end{bmatrix} r(t) + \begin{bmatrix} 0 \\ M^{-1}(N - B) \end{bmatrix} \dot{r}(t) + \begin{bmatrix} 0 \\ M^{-1}(M - A) \end{bmatrix} \ddot{r}(t) \tag{6.193} \]

where $z(t) = [\,x_e^T \;\; \dot{x}_e^T\,]^T$.
Now a simple linear reference model can be developed:

\[ \ddot{x}_{em}(t) + D_2\,\dot{x}_{em}(t) + D_1\,x_{em}(t) = 0 \tag{6.194} \]

Defining $z_m(t) = [\,x_{em}^T \;\; \dot{x}_{em}^T\,]^T$, the state space reference model becomes:

\[ \dot{z}_m(t) = \begin{bmatrix} 0 & I_n \\ -D_1 & -D_2 \end{bmatrix} z_m(t) \tag{6.195} \]

Since equation (6.195) is stable, there exists a positive definite 2n × 2n matrix, P, which satisfies the Lyapunov equation:

\[ P D + D^T P = -Q \tag{6.196} \]

where D is the system matrix in equation (6.195). By defining $e = z_m(t) - z(t)$ and subtracting (6.193) from (6.195), a model reference error equation in $e$ can be assembled. Seraji then uses a simple design method [Seraji, 1989c] to generate a Lyapunov function based on this equation and, in solving for P, produces adaptation laws ensuring the convergence of $z(t)$ to $z_m(t)$:

\[ v(t) = \mathrm{diag}(w_{pi})\,x_e + \mathrm{diag}(w_{vi})\,\dot{x}_e \tag{6.197} \]

\[ d(t) = d(0) + \delta_1 \int_0^t v(t)\,dt + \delta_2\,v(t) \tag{6.198} \]

\[ K_p(t) = K_p(0) + \alpha_1 \int_0^t v(t)\,x_e^T(t)\,dt + \alpha_2\,v(t)\,x_e^T(t) \tag{6.199} \]

\[ K_v(t) = K_v(0) + \beta_1 \int_0^t v(t)\,\dot{x}_e^T(t)\,dt + \beta_2\,v(t)\,\dot{x}_e^T(t) \tag{6.200} \]

\[ C(t) = C(0) + \nu_1 \int_0^t v(t)\,r^T(t)\,dt + \nu_2\,v(t)\,r^T(t) \tag{6.201} \]

\[ B(t) = B(0) + \gamma_1 \int_0^t v(t)\,\dot{r}^T(t)\,dt + \gamma_2\,v(t)\,\dot{r}^T(t) \tag{6.202} \]

\[ A(t) = A(0) + \lambda_1 \int_0^t v(t)\,\ddot{r}^T(t)\,dt + \lambda_2\,v(t)\,\ddot{r}^T(t) \tag{6.203} \]

where $\{\delta_1, \alpha_1, \beta_1, \nu_1, \gamma_1, \lambda_1\}$ are positive scalar integral gains, $\{\delta_2, \alpha_2, \beta_2, \nu_2, \gamma_2, \lambda_2\}$ are zero or positive gains, and $\{w_{pi}, w_{vi}\}$ are positive scalar weighting factors. By adaptively modifying the force vector $f(t)$ in this way, $z(t)$ will converge to $z_m(t)$:

\[ f(t) = d(t) + C(t)\,r(t) + B(t)\,\dot{r}(t) + A(t)\,\ddot{r}(t) + K_p(t)\,x_e(t) + K_v(t)\,\dot{x}_e(t) \tag{6.204} \]

Later, in 1989, Seraji [Seraji, 1989b] described a further variation on this approach, Configuration Control, by decentralizing the computations in cartesian space.
In effect, the expressions in equations (6.197) through (6.204) appear virtually identical, but are scalar equations applied in each cartesian direction:

\[ f_i(t) = d_i(t) + c_i(t)\,r_i(t) + b_i(t)\,\dot{r}_i(t) + a_i(t)\,\ddot{r}_i(t) + K_{pi}(t)\,x_{ei}(t) + K_{vi}(t)\,\dot{x}_{ei}(t) \tag{6.205} \]

where, as before, $\{a_i, b_i, c_i\}$ are adaptive feedforward gains, $\{K_{pi}, K_{vi}\}$ are adaptive PD feedback gains, $d_i$ is an auxiliary signal to improve transient performance, $i = 1 \ldots n$, and $r(t) \in R^n$. As before, these coefficients are an integral of the generic form:

\[ k_i(v(t), e(t), t) = k_i(0) + u_{1i} \int_0^t v_i(t)\,e_i(t)\,dt + u_{2i}\,v_i(t)\,e_i(t) \tag{6.206} \]

where $u_{1i}$ are positive scalar gains, $u_{2i}$ are zero or positive, and $e_i(t)$ is the variable modified by the coefficient (e.g. $k_i(t) = K_{vi}$, $e_i(t) = \dot{x}_{ei}(t)$). Again $v_i(t)$ is a weighted error of the form:

\[ v_i(t) = w_{pi}\,x_{ei} + w_{vi}\,\dot{x}_{ei} \tag{6.207} \]

where $w_{pi}$ and $w_{vi}$ are weighting gains.

In summary, Seraji noted the following characteristics of this algorithm:

• Computation is extremely simple, using discrete trapezoidal integration. For the kth time step:

\[ f_i(k) = d_i(k) + c_i(k)\,r_i(k) + b_i(k)\,\dot{r}_i(k) + a_i(k)\,\ddot{r}_i(k) + K_{pi}(k)\,x_{ei}(k) + K_{vi}(k)\,\dot{x}_{ei}(k) \tag{6.208} \]

\[ v_i(k) = w_{pi}\,x_{ei}(k) + w_{vi}\,\dot{x}_{ei}(k) \tag{6.209} \]

\[ k_i(k) = k_i(k-1) + u_{1i}\,\frac{\Delta T}{2}\,\big[v_i(k)\,e_i(k) + v_i(k-1)\,e_i(k-1)\big] + u_{2i}\,\big[v_i(k)\,e_i(k) - v_i(k-1)\,e_i(k-1)\big] \tag{6.210} \]

Governed by the task space dimension, M, the computational cost is O(M).

• The manipulator parameters are assumed to be 'slowly time varying'.

• Initial manipulator parameters are not required to ensure convergence.

• Convergence is guaranteed through the Lyapunov Design method [Seraji, 1989b].

• The rate of convergence is governed by the integral adaptive gains.

• The rate of convergence is independent of initial values.

Indeed, Seraji demonstrated that even gross violations of the 'slowly time varying' parameter assumption do not destabilize the system.
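The discrete update of equations (6.208) through (6.210) can be sketched for a single cartesian axis as follows. This is a minimal illustration rather than the RMDAController implementation: the class names are hypothetical, all coefficients start at zero, and one pair of gains (`u1`, `u2`) is shared by every coefficient, whereas the thesis assigns separate integral and proportional gains to each.

```python
class AdaptiveGain:
    """One adaptive coefficient k_i, advanced by the trapezoidal rule of (6.210)."""

    def __init__(self, k0, u1, u2, dt):
        self.k = k0          # current coefficient value k_i(k)
        self.u1 = u1         # integral adaptation gain (positive)
        self.u2 = u2         # proportional adaptation gain (zero or positive)
        self.dt = dt         # time step, Delta T
        self._prev = 0.0     # v_i(k-1) * e_i(k-1)

    def update(self, v, e):
        g = v * e
        self.k += (self.u1 * 0.5 * self.dt * (g + self._prev)
                   + self.u2 * (g - self._prev))
        self._prev = g
        return self.k


class AxisRMDAC:
    """Scalar adaptive force generator for one task space direction, eq. (6.208)."""

    NAMES = ("d", "c", "b", "a", "kp", "kv")

    def __init__(self, dt, wp, wv, u1, u2=0.0):
        self.wp, self.wv = wp, wv   # error weights w_pi and w_vi
        # Assumption: every coefficient shares the same gains and starts at zero.
        self.gains = {n: AdaptiveGain(0.0, u1, u2, dt) for n in self.NAMES}

    def force(self, r, rd, rdd, x, xd):
        xe, xed = r - x, rd - xd               # tracking error and its rate
        v = self.wp * xe + self.wv * xed       # weighted error, eq. (6.209)
        # Signal multiplied by each coefficient (the auxiliary term d uses 1).
        signals = {"d": 1.0, "c": r, "b": rd, "a": rdd, "kp": xe, "kv": xed}
        k = {n: self.gains[n].update(v, signals[n]) for n in self.NAMES}
        return sum(k[n] * signals[n] for n in self.NAMES)
```

Each call to `force` costs a fixed number of multiplications per axis, which is consistent with the O(M) cost noted above when one such object is used per task space direction.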
6.3 Multiagent Controller Performance

Within the context of the multiprocess manipulator simulator, the RMDAC global goal generator is implemented as a subclass of Controller within the global goal process, GP. This subclass, RMDAController, retrieves the end effector location, $p_n$, through the kinematic bus, computes $f_d$ through equations (6.208) through (6.210), and broadcasts both $p_n$ and $f_d$ as the global goal couplet:

\[ \mathcal{G} = (p_n, f_d) \tag{6.211} \]

to link agents $A_j$, $1 \le j \le N$. Within each agent process, LP, the trajectory tracking behaviour controller observes this broadcast and commands the joint motor (or dynamic model) with the output:

\[ u_j = u_{track} = J_j^T\big(p_{j-1}(t),\, p_n(t)\big)\, f_d(t) \tag{6.212} \]

In the following section, a simulated manipulator (see figure 6.28) under multiagent control will demonstrate this technique through the implicit coordination of a reference manipulator and payload along a specified trajectory. Below, the reference manipulator, payload, and trajectory used in this simulation are described.

The Manipulator

The reference robot is a redundant planar revolute manipulator composed of ten links. Each link is of the form described in table 6.5. Inertia parameters assume a 0.75 metre cylindrical link 0.2 m in diameter with a mass of 10 kilograms.

Parameter Name                  Symbol        Value (link i)
Mass                            m_i           1.000e+01 kg
Pr. Inertias [Ixx, Iyy, Izz]    I_jj          [+0.0500 +0.4938 +0.4938] kg-m^2
Centroid                        c             [-0.3750 +0.0000 +0.0000] m
Type                            -             R (revolute)
Initial Displacement            θ_i (or d)    +0.000e+00°
Initial Velocity                θ̇_i           +0.000e+00° per sec. (or m/s)
Twist                           α_i           +0.000e+00°
Length                          d_i           7.500e-01 m
Back emf constant               K_emf         1.0
Saturation                      -             +1.000e+06 N-m (unsaturated)

Table 6.5: Reference Link, Joint, and Motor parameters. Centroid is relative to link frame.

Figure 6.28: The Reference Manipulator: a ten degree of freedom planar revolute manipulator initially lying along the x axis.
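For a planar revolute joint, the proxy of equation (6.212) reduces to the moment of the broadcast end effector force about the joint axis, plus any end effector moment. The sketch below illustrates that special case; the function name and the wrench layout `(fx, fy, mz)` are assumptions introduced here, and the thesis's general Jacobian row is not reproduced.

```python
def proxy_torque(p_joint, p_end, wrench):
    """Torque commanded by one link agent for a planar revolute joint.

    p_joint: (x, y) position of the agent's joint axis, p_{j-1}
    p_end:   (x, y) position of the end effector, p_n (from the couplet)
    wrench:  (fx, fy, mz) desired end effector force and moment, f_d
    """
    fx, fy, mz = wrench
    rx = p_end[0] - p_joint[0]
    ry = p_end[1] - p_joint[1]
    # z component of r x f, plus the moment about z.
    return rx * fy - ry * fx + mz


# A force perpendicular to the arm produces torque; a force along it does not.
print(proxy_torque((0.0, 0.0), (1.0, 0.0), (0.0, 2.0, 0.0)))
print(proxy_torque((0.0, 0.0), (1.0, 0.0), (3.0, 0.0, 0.0)))
```

The agent needs only its own joint position and the two items in the broadcast couplet, which is why no central kinematic model is required to evaluate the proxy.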
Parameter                       Symbol    Value
Mass                            m_p       1.000e+01 kg
Pr. Inertias [Ixx, Iyy, Izz]    I_jj      [+0.4938 +0.4938 +0.4938] kg-m^2
Centroid                        c         [+0.00 +0.00 +0.00] m

Table 6.6: Reference Payload parameters. Centroid is relative to gripper frame.

Time (sec.)    X (m)        Y (m)         Angle (rad.)
10.00          1.00e+00     2.00e+00      0.00e+00
16.00          1.00e+00     1.00e+00      1.57e+00
22.00          2.00e+00     1.00e+00      1.57e+00
28.00          2.00e+00     -1.00e+00     1.57e+00
34.00          1.00e+00     -1.00e+00     1.57e+00
40.00          1.00e+00     -2.00e+00     1.57e+00

Table 6.7: Knotpoints on the reference trajectory.

As mentioned previously, the motor model of chapter 2 is included within the MMS dynamic modeler, DM. The three motor parameters of that model may be set within the robot database file, but are left at the default (and arbitrary) values 0.5, 1.0, and 0.5 respectively.

The Payload

The payload, centered on the gripper (the coordinate frame of link n) of the manipulator, has been chosen to be identical to the mass and inertia of a manipulator link (see table 6.6). In this way, payload sensitivity studies may be assessed in terms of simple manipulator/payload mass ratios.

The Trajectory

The reference trajectory specification ("SquareWave.spt") describes a planar trajectory composed of 6 cubic segments (i.e. straight line segments with a parabolic velocity profile) with a total duration of 40 seconds. A simple orientation trajectory rotates the end effector counterclockwise 90 degrees along the first leg and holds it constant during the remaining trajectory. See table 6.7 and figure 6.29.

The trajectory presents some difficulties for the Jacobian based algorithms. Since the manipulator is initially homed along the x axis, Jacobian inverse algorithms may encounter a kinematic singularity if motion along the x axis is required. Though the trajectory does not specify such motions, controller dynamics may nevertheless demand them, resulting in numerical instability. The shallow inclination of the first leg to the x-axis, relatively small initial velocity and acceleration vectors, and poor initial manipulator position mean that Jacobian transpose algorithms tend to produce a delayed, wallowing response. This has the effect of exaggerating integral wind up in the adaptive controller, as shown later.

[Figure 6.29 plots: Y versus X (metres), top; orientation versus Time (seconds), bottom]

Figure 6.29: The Reference end effector position (top) and orientation (bottom) trajectories. Each segment employs a parabolic velocity profile in both position and orientation. The position trajectory plot is annotated with time milestones.

6.3.1 Trajectory Tracking Performance

Having selected a set of integral gains for the RMDAC goal generator through a trial and error process (see table 6.8 and appendix D), the first experiment examines the trajectory tracking performance of the multiagent team. Initialized with the desired reference trajectory, the global goal process GP generates the global goal couplet, $(x_n, f_d)$, through an RMDAController object. The couplet is broadcast to a team of ten link agents, each controlling a link by observing and responding to the global goal couplet through a global goal proxy.

The results of this demonstration, depicted in figure 6.31, clearly show that, despite decentralization, the manipulation system exhibits coordinated trajectory tracking behaviour. Note that while the end effector appears to track the desired trajectory, the manipulator seems to adopt arbitrary, 'jumbled', configurations. Since a geometric interference engine was not implemented in DM, physically impossible configurations are often observed. Nevertheless, this experiment demonstrates that global behaviour can be deterministic without resorting to local behaviour specification.
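The effect of the home configuration on a Jacobian transpose team can be illustrated with a small planar example. In this sketch, each of ten agents responds to the same broadcast force using only its own joint position and the end effector position. The function names and the planar moment formula are assumptions made for illustration; link lengths follow table 6.5.

```python
import math


def joint_positions(angles, lengths):
    """Positions p_0..p_n of a planar revolute chain (p_0 base, p_n end effector)."""
    points = [(0.0, 0.0)]
    heading = 0.0
    for q, length in zip(angles, lengths):
        heading += q
        x, y = points[-1]
        points.append((x + length * math.cos(heading),
                       y + length * math.sin(heading)))
    return points


def team_response(angles, lengths, wrench):
    """Torque chosen independently by each link agent for a broadcast wrench."""
    points = joint_positions(angles, lengths)
    p_n = points[-1]
    fx, fy, mz = wrench
    torques = []
    for j in range(len(angles)):
        rx = p_n[0] - points[j][0]
        ry = p_n[1] - points[j][1]
        torques.append(rx * fy - ry * fx + mz)   # planar moment about joint j
    return torques


home = [0.0] * 10          # manipulator lying along the x axis
links = [0.75] * 10        # ten 0.75 m links
along_x = team_response(home, links, (-1.0, 0.0, 0.0))
along_y = team_response(home, links, (0.0, 1.0, 0.0))
```

With the arm along the x axis, `along_x` contains only zeros: a force directed along the arm produces no joint torque at any agent, while `along_y` ranges from about 7.5 N-m at the base to 0.75 N-m at the last joint. A force only slightly inclined to the x axis therefore yields small initial torques, which is consistent with the delayed response described for the first leg.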
A root mean square (RMS) measure characterizes the end effector's tracking accuracy over the reference trajectory and is defined as:

\[ e_{rms} = \left[ \frac{1}{N} \sum_{k=0}^{N} \big(p_d(k) - p_n(k)\big)^T \big(p_d(k) - p_n(k)\big) \right]^{1/2} \tag{6.213} \]

where N is the total number of timesteps. Similarly, the planar orientation error is measured through the RMS of the scalar rotation about the global z axis:

\[ \theta_{rms} = \left[ \frac{1}{N} \sum_{k=0}^{N} \big(\theta_d(k) - \theta_n(k)\big)^T \big(\theta_d(k) - \theta_n(k)\big) \right]^{1/2} \tag{6.214} \]

Figure 6.30 depicts the trajectory tracking error and end effector trajectory, while table 6.9 documents the RMS position and orientation error over a step trajectory, the reference trajectory, and a spiral trajectory. The step trajectory simply jumps from $x_{start} = [7.5 \;\; 0.0 \;\; 0.0]$ to $x_{finish} = [6.5 \;\; 1.0 \;\; 0.0]$ and from $\theta_z = 0.0$ to $\theta_z = 1.57$ radians in rotation at t = 1.0 s. This trajectory was primarily used to compare goal generation strategies (see appendix E).

Since the initial end effector position lies near a kinematic singularity and the initial end effector forces are small, initial tracking of the reference trajectory is characterized by large overshooting oscillations as RMDAC works to discover the parameters of the idealized trajectory tracking process. Trajectory tracking improves significantly as the desired trajectory exits this region. Despite these excursions, the RMS error over the total trajectory is limited to 2.8 mm. Similar, though smaller, end effector perturbations may be observed at segment transitions in the reference trajectory (16.00, 22.00, 28.00, and 34.00 seconds).

The global goal generator is equally effective over arbitrary trajectories. Figures 6.32 and 6.33 demonstrate the multiagent system's trajectory tracking performance over a 'spiral' trajectory. In this trajectory both the radius and the angular displacement about an axis $x_{axis} = [4.5 \;\; 0.0 \;\; 0.0]$ vary according to a parabolic velocity profile from $x_{start} = [4.5 \;\; 2.0 \;\; 0.0]$ to $x_{finish} = [4.5 \;\; -0.5 \;\; 0.0]$. RMS position and orientation results are reported in table 6.9.
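The RMS position measure of equation (6.213) can be computed from logged samples along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, positions are taken to be equal-length tuples, and the sum is normalized by the number of logged samples.

```python
import math


def rms_error(desired, actual):
    """RMS of the Euclidean tracking error over a sequence of logged samples.

    desired, actual: equal-length sequences of position tuples p_d(k), p_n(k).
    """
    if len(desired) != len(actual) or not desired:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for p_d, p_n in zip(desired, actual):
        # Squared Euclidean norm of the error at this sample.
        total += sum((d - a) ** 2 for d, a in zip(p_d, p_n))
    return math.sqrt(total / len(desired))
```

The same function applies to the orientation measure of equation (6.214) if each sample is passed as a one-element tuple holding the rotation about the z axis.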
It is apparent that centripetal accelerations encountered between 15 and 25 seconds, at the maximum angular velocity of the trajectory about $x_{axis}$, drive the position and orientation errors on this trajectory.

The simulation does not occur in real time, as it is dominated by relatively slow socket interprocess communications and model integration. Typically, a 40.00 second run of the reference manipulator on an SGI Indigo2 using an integration step size of 1 millisecond takes approximately 15 minutes. Define the normalized execution time as:

\[ \Delta T_N = \frac{T_{finish} - T_{start}}{N \times (\text{number of timesteps})} \tag{6.215} \]

where $T_{start}$ and $T_{finish}$ are real clock start and end times in seconds and N is the manipulator's degrees of freedom. The normalized execution time, the time committed to simulate a degree of freedom per timestep, is 2.2 ms for this reference scenario (roughly 900 s spread over 40 000 timesteps and 10 degrees of freedom). Such measures are greatly affected by interprocess communications (i.e. disk and network) loads.

6.3.2 Model Free Goal Regulation

A lingering question, however, is whether the global goal generator, RMDAC, is actually a centralized manipulator controller harbouring a manipulator model. If this were so, RMDAC's performance would be governed by changes to the configuration and/or nonlinearity of the manipulator model. The following experiment demonstrates that RMDAC is insensitive to these factors, reinforcing that this goal generator is regulating an ideal dynamic tracking process and not any particular robot dynamic model. In effect, RMDAC is manipulator model free.

By examining the tracking performance of a variety of planar manipulator configurations, the model-free characteristics of RMDAC can be demonstrated. In this experiment 5, 10, 15, and 20 degree of freedom manipulators are applied to identical reference payloads and trajectories.
Each manipulator has the same total mass and fully extended length as the reference manipulator, and all links within a given manipulator possess identical physical parameters. Thus as N rises, link inertias, lengths, and centroid dimensions fall while agents remain computationally identical. This method permits an investigation of the sensitivity of RMDAC to manipulator design.

Table 6.10 documents RMDAC's relative insensitivity to changes in the robot model. Indeed, higher degrees of freedom (and therefore smaller links) appear to improve tracking slightly, probably due to the smaller link inertias in the higher degree of freedom manipulators. The end effector tracking of the three additional cases may be observed in figures 6.34, 6.35, and 6.36. Note the exaggerated end effector error in the five link case in comparison to the relatively smooth error curvature of the twenty link run. End effector error magnitudes are smaller and of shorter duration as the number of links rises.

Given identical integral gains, varying manipulator inertia influences initial trajectory tracking (i.e. the convergence rate of the adaptive controller) as expected. Once converged, however, the end effector tracks the desired trajectory regardless of the manipulator's order of nonlinearity. As long as these parameters remain slowly time varying, convergence is guaranteed regardless of manipulator complexity.

Table 6.8: The reference RMDAC 'Best Gains' determined through trial and error. The table lists the weights $w_{pi}$ and $w_{vi}$ and the integral adaptation gains for the linear and rotation axes; the values, in the order extracted from the source, are 1800, 500, 800, 100, 6, 6, 4, 4, 4, 4, 1, 1, 1, 1, 2, 2.

Trajectory     e_rms        θ_rms
Step           1.132e-01    9.179e-02
Reference      2.779e-03    2.850e-02
Spiral         4.348e-03    7.372e-03

Table 6.9: 'Best Gains' RMS error in position and orientation for 'step', 'reference' and 'spiral' trajectories.
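The equal-mass, equal-length scaling used in this experiment can be illustrated by computing per-link parameters for a given number of links. The sketch assumes a uniform solid cylinder with the 0.2 m diameter held constant as the link count changes; that constant diameter is an assumption made here, as are the function name and dictionary keys. For ten links the result reproduces the inertia and centroid values of table 6.5.

```python
def link_parameters(n_links, total_mass=100.0, total_length=7.5, diameter=0.2):
    """Per-link mass, length, principal inertias, and centroid for an N link arm."""
    mass = total_mass / n_links
    length = total_length / n_links
    radius = diameter / 2.0
    # Solid cylinder: inertia about its long axis and about a transverse
    # axis through the centroid.
    i_axial = 0.5 * mass * radius ** 2
    i_transverse = mass * (3.0 * radius ** 2 + length ** 2) / 12.0
    return {
        "mass": mass,
        "length": length,
        "inertias": (i_axial, i_transverse, i_transverse),
        "centroid": (-length / 2.0, 0.0, 0.0),   # relative to the link frame
    }


for n in (5, 10, 15, 20):
    p = link_parameters(n)
    print(n, p["mass"], p["length"], p["inertias"])
```

Doubling the number of links halves each link's mass and length, so the transverse inertia of each link falls much faster than its mass, which is in line with the smaller link inertias cited for the higher degree of freedom manipulators.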
Normalized execution times are almost constant, indicating near linearity in degrees of freedom and that a real world, multiprocess implementation of a multiagent manipulator architecture is constant in time [footnote 3]. In a real world system, the dominant computational load of the MMS, an O(N^3) dynamic model [Walker, 1982], disappears. Similarly, the larger interprocess communications load, of O(N), that dominates MMS overall could be drastically reduced or removed through more efficient design. An efficient implementation of multiagent control would exploit multiple processors and an asynchronous kinematic bus, rendering execution time independent of agent team size. Unlike the MMS synchronous bus, in which communications are triggered by an external clock, asynchronous buses allow agents to read and/or write shared data at any time.

In short, this experiment demonstrates the feasibility of a decentralized, multiagent manipulator architecture and the relative independence of the RMDAC global goal generator to manipulator configuration.

Footnote 3: Ideally, a multiprocess task that exhibits O(N) execution time on a single CPU (such as MMS) is constant time in a process per CPU environment.

Figure 6.30: End effector trajectory tracking under 'Best Gains' RMDAC. Note oscillation prior to convergence.

Figure 6.31: Trajectory tracking history for the reference trajectory under 'Best Gains' RMDAC.

Figure 6.32: End effector tracking performance over an unusual spiral trajectory in which the radius and angular displacement vary according to a parabolic velocity profile.

Figure 6.33: Trajectory tracking history for the spiral trajectory under 'Best Gains' RMDAC.

Figure 6.34: End effector trajectory tracking using 'Best Gains' RMDAC goal generator for a 5 degree of freedom planar manipulator.

Figure 6.35: End effector trajectory tracking using 'Best Gains' RMDAC goal generator for a 15 degree of freedom planar manipulator.

Figure 6.36: End effector trajectory tracking using 'Best Gains' RMDAC goal generator for a 20 degree of freedom planar manipulator.

Reference Trajectory
D.o.F.    Position error    Orientation error    ΔT_N (sec.)
5         5.331e-03         1.737e-02            0.0024
10        3.059e-03         6.014e-03            0.0022
15        2.404e-03         5.376e-03            0.0025
20        1.965e-03         5.196e-03            0.0027

Table 6.10: Tabulation of position and orientation RMS L2 error and normalized execution time as a function of manipulator degrees of freedom.

6.3.3 Actuator Saturation

In previous demonstrations, each agent's motor could generate arbitrarily large torques. In reality, motor performance saturates at some torque value, $u_{sat}$. In the following demonstration, the effects of saturation on trajectory tracking will be examined. With a saturation limit of $u_{sat} = 1000$ N-m applied to all joint actuators, each link agent will, again, adopt equation (6.212) as the global goal proxy, modified to prevent torque values from exceeding $u_{sat}$:

\[ u_j = \begin{cases} u_c & \text{if } -u_{sat} \le u_c \le u_{sat} \\ u_{sat} & \text{if } u_c > u_{sat} \\ -u_{sat} & \text{if } u_c < -u_{sat} \end{cases} \qquad u_c = J_j^T\big(p_{j-1}(t),\, p_n(t)\big)\, f_d(t) \tag{6.216} \]

The results depicted in figure 6.37 show that end effector performance is virtually identical to figure 6.30. This is the first indication that the global goal generator's maintenance of the desired end effector trajectory tends to drive local disturbances (such as torque error due to saturation) into the Jacobian null space, a phenomenon discussed in the next chapter.

Figure 6.37: End effector trajectory performance for the reference manipulator and payload under RMDAC end effector and joint force saturation ($u_{sat}$ = 1000 N-m).

6.4 Summary

In this chapter, a global goal generator was specified and the implicit coordination of a multiagent manipulator controller demonstrated. The following points can be taken from this discussion:

• the RMDAC global goal generator models not a manipulator, but an ideal tracking process.

• the generated global goal may be broadcast as a desired force-position couplet.

• a simulated manipulator can be controlled by N autonomous processes without a centralized manipulator model.

• with each agent acting on the global goal, end effector control becomes linear in time on a single CPU, or constant in time if distributed.

By regulating an idealized trajectory tracking process rather than estimating and compensating for a specific manipulator architecture, RMDAC was shown to be an effective global goal generator free of any manipulator model. The combination of this goal generation method and the global goal proxy described in the last chapter enables an arbitrary number of link processes to observe a single broadcast global goal couplet.
This simple computational architecture was successfully applied to the reference scenario through the multiprocess manipulator simulator and demonstrated, for the first time, the control of an end effector trajectory by a set of autonomous link processes devoid of any manipulator model. Given the definitions in chapter 4, this behaviour is clearly an example of an implicit coordination strategy. The scalability and model independence of this strategy were explored in a performance comparison between manipulators of various configurations. Multiagent manipulator control was shown to be manipulator independent and, given one CPU per agent, constant in time. The next chapter will examine the design and performance of additional goal systems within this architecture and discuss the arbitration of multiple conflicting goals.

Chapter 7 Multiple Goal Systems

7.1 Introduction

In the previous chapter, an adaptive end effector trajectory tracking controller, Configuration Control (or RMDAC), was reviewed and formulated into a global goal generation strategy. In this chapter additional global and local goals will be formulated and combinations of these goals demonstrated in multiagent manipulator control simulations. While multiple goals in the form of primary and auxiliary controllers are not new to manipulator control, multiagent control adopts a decentralized architecture that changes how these goals might be generated and raises interesting questions about how agents should be coordinated to pursue multiple tasks. In particular, how can the convergence of multiple, simultaneous global goal systems be guaranteed without centralized coordination or allocation of agents to tasks? This chapter will address such questions in the context of redundant multiagent manipulator control and present a distributed global goal generation protocol.
In particular, the impact of local goal systems on global stability and team coordination will be discussed and a general approach to multiple goal system design presented. The following specific local and global goal strategies will be presented in combination with trajectory tracking:
• global nonlinear obstacle avoidance.
• local nonlinear limit avoidance.
• local constant compliance centering.
• local constant compliance retraction.
• local variable compliance centering.
• combined (local constant compliance centering and global nonlinear obstacle avoidance).

It will be shown that, in applying both compliant and adaptive goal systems, emergent coordination replaces traditional coordination strategies. Furthermore, it will be shown that through the combination of local and global goal systems, additional global behaviour emerges. However, before venturing into multiple goal systems, the next section will explore the fundamental interaction of primary and auxiliary goal systems and, based on established arguments, propose a Null Space Self Organization Theorem.

7.2 Auxiliary Behaviour and the Jacobian Null Space

In this section, a simple argument will demonstrate the inherent robustness of end effector control and, by analogy, of any globally controlled multiagent system. The development demonstrates clearly that an underconstrained or redundant system under global control automatically inserts auxiliary behaviour into the null space of the global behaviour. Equally important, this argument suggests that adaptive global goal regulation is equivalent to explicit null space insertion.

Redundant manipulators are excellent vehicles to explore and demonstrate multiple global goal seeking. By explicitly inserting tasks into the null space, techniques such as RMRC and RMAC can guarantee the convergence of multiple tasks.
Similarly, JTC can be applied to augmented tasks by ensuring that the dimension of the auxiliary task, $M_{aux}$, does not exceed the dimension of the null space, or $M_{aux} \le N - M_{primary}$. While it is clearly a necessary condition that the number of constraints should not exceed the available degrees of freedom, it is not immediately clear why this simple rule is often sufficient. Surely, some additional care must be taken to ensure that, at the joint level, tasks do not conflict directly, or $G_{aux} \cap G_{primary} = \emptyset$ for all $G_{aux}$, in the manner of explicit null space insertion methods. An et al. [An, 1989] provide some insight into these issues.

An et al. intended to establish stability conditions for the convergence of kinematic learning controllers, those that estimate the end effector Jacobian through learning algorithms. The objective was to demonstrate that estimates of an inverse kinematic solution, $\hat{f}^{-1}(\mathbf{x})$, converged to an actual inverse, $f^{-1}(\mathbf{x})$. However, by examining their arguments in a different light and assuming that $\hat{f}^{-1}(\mathbf{x}) = f^{-1}(\mathbf{x})$, it becomes clear that end effector control automatically drives joint level perturbations into the Jacobian null space. This property is sufficiently important to propose a simple theorem:

Theorem 2 (Null Space Self Organization) Consider an $N$ degree of freedom redundant manipulator in configuration $\mathbf{q}^* \in \mathbb{R}^N$ and at task space position $\mathbf{x}^* \in \mathbb{R}^M$. If a task space control law is applied to the manipulator:

\[ \mathbf{q}_{k+1} = \mathbf{q}_k + (f^{-1}(\mathbf{x}^*) - f^{-1}(\mathbf{x}_k)) \]

where the forward kinematic solution, $f(\mathbf{q}) : \mathbb{R}^N \to \mathbb{R}^M$, is invertible, then a disturbance, $\delta\mathbf{q}_k$, is always reorganized into the Jacobian null space, $N(\mathbf{J})$:

\[ \delta\mathbf{q}_{k+1} = [\mathbf{I} - \mathbf{J}^\dagger(\mathbf{x}^*)\mathbf{J}(\mathbf{q}^*)]\, \delta\mathbf{q}_k \tag{7.217} \]

where $\mathbf{J}^\dagger$ is the pseudoinverse of the task space Jacobian, $\mathbf{J}$.

Proof At time $k$, the manipulator is in position $\mathbf{q}^*$ with the end effector at $\mathbf{x}^*$.
Now consider the effect of a small configuration space perturbation, $\delta\mathbf{q}_k$. The manipulator configuration becomes:

\[ \mathbf{q}_k = \mathbf{q}^* + \delta\mathbf{q}_k \tag{7.218} \]

This deviation propagates to the end effector:

\[ \mathbf{x}_k = f(\mathbf{q}_k) = \mathbf{x}^* + \delta\mathbf{x}_k = \mathbf{x}^* + \mathbf{J}(\mathbf{q}^*)\, \delta\mathbf{q}_k \]

Without correction, such deviations must corrupt the end effector trajectory. Now suppose an end effector position correction strategy is adopted to correct such deviations, by commanding a new configuration, $\mathbf{q}_{k+1}$, through the simple rule [An, 1989]:

\[ \mathbf{q}_{k+1} = \mathbf{q}_k + (f^{-1}(\mathbf{x}^*) - f^{-1}(\mathbf{x}_k)) \]

i.e. the difference between desired and current inverse solutions defines the joint error. Given:

\[ f^{-1}(\mathbf{x}_k) = f^{-1}(\mathbf{x}^*) + \mathbf{J}^\dagger(\mathbf{x}^*)\, \delta\mathbf{x}_k \]

where $\mathbf{J}^\dagger$ is the pseudoinverse, $\mathbf{q}_k$ and $\delta\mathbf{x}_k$ may be substituted:

\[ \begin{aligned} \mathbf{q}_{k+1} &= \mathbf{q}_k + (\mathbf{q}^* - (f^{-1}(\mathbf{x}^*) + \mathbf{J}^\dagger(\mathbf{x}^*)\, \delta\mathbf{x}_k)) \\ &= \mathbf{q}^* + \delta\mathbf{q}_k - \mathbf{J}^\dagger(\mathbf{x}^*)\mathbf{J}(\mathbf{q}^*)\, \delta\mathbf{q}_k \\ &= \mathbf{q}^* + (\mathbf{I} - \mathbf{J}^\dagger(\mathbf{x}^*)\mathbf{J}(\mathbf{q}^*))\, \delta\mathbf{q}_k \end{aligned} \]

Thus the correction to the manipulator configuration becomes:

\[ \delta\mathbf{q}_{k+1} = \mathbf{q}_{k+1} - \mathbf{q}^* = (\mathbf{I} - \mathbf{J}^\dagger(\mathbf{x}^*)\mathbf{J}(\mathbf{q}^*))\, \delta\mathbf{q}_k \]

i.e. perturbations not already in the null space will be corrected into the null space. As $k$ marches through time, $\delta\mathbf{q}_{k+1}$ will diminish if the absolute values of all the eigenvalues of $(\mathbf{I} - \mathbf{J}^\dagger(\mathbf{x}^*)\mathbf{J}(\mathbf{q}^*))$ are less than one [An, 1989]. Since $\mathbf{J}^\dagger$ is the pseudoinverse of $\mathbf{J}$, this is always true unless $\mathbf{q}^*$ lies on a kinematic singularity. Therefore, end effector control ensures that the manipulator reorganizes itself such that disturbances persist only in $N(\mathbf{J})$.

Remarks Now for a perfectly constrained manipulator, in which the dimensions of task and configuration spaces are identical, $\mathbf{J}^\dagger = \mathbf{J}^{-1}$ and $\mathbf{q}_{k+1} = \mathbf{q}^*$, and the new position command is identical to the original setpoint. However, if the manipulator is redundant then the correction command $\delta\mathbf{q}_{k+1}$ is the projection of $\delta\mathbf{q}_k$ onto $N(\mathbf{J})$ through the null space selection operator $(\mathbf{I} - \mathbf{J}^\dagger(\mathbf{x}^*)\mathbf{J}(\mathbf{q}^*))$. Thus perturbations initially in $R(\mathbf{J})$ migrate into $N(\mathbf{J})$ if the manipulator is redundant and the correction $\delta\mathbf{q}_{k+1}$ is exact.
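The projection at the heart of Theorem 2 can be checked numerically. The following is a minimal NumPy sketch, not part of the thesis or of the MMS: the Jacobian `J` and the perturbation `dq` are arbitrary illustrative values, and `null_space_projector` is a hypothetical helper name. It applies the operator $(\mathbf{I} - \mathbf{J}^\dagger\mathbf{J})$ of equation (7.217) once and compares the first order task space error before and after.

```python
import numpy as np

def null_space_projector(J):
    """Return P = I - pinv(J) @ J, the projector onto the null space of J."""
    n = J.shape[1]
    return np.eye(n) - np.linalg.pinv(J) @ J

# Assumed task Jacobian: M = 2 task dimensions, N = 4 joints (redundant).
J = np.array([[1.0, 0.5, 0.2, 0.1],
              [0.0, 1.0, 0.4, 0.3]])
P = null_space_projector(J)

dq = np.array([0.10, -0.05, 0.02, 0.08])  # arbitrary joint perturbation
dq_next = P @ dq                          # one application of (7.217)

task_error_before = J @ dq        # first order end effector deviation
task_error_after = J @ dq_next    # deviation left after the correction
```

For a square, nonsingular Jacobian the same helper returns the zero matrix, which matches the remark that a perfectly constrained manipulator is simply returned to its original setpoint.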
On reflection, this derivation is clearly reminiscent of the redundant manipulator control law in RMRC. In summary, the simple Null Space Self Organization theorem implies that perturbations in configuration space are driven into the null space by end effector control commands. Significantly, this property is not limited to manipulator kinematics but includes any multiagent system controlling some aggregate behaviour. Thus any multiagent system will exhibit the property that strict enforcement of a global goal ensures that local disturbances (or behaviours) are automatically driven into the null space. This argument can be extended to the dynamic case in the following corollary to Theorem 2:

Corollary 2.1 Consider an $N$ degree of freedom redundant manipulator with configuration forces $\boldsymbol{\tau}^* \in \mathbb{R}^N$ applying task space forces $\mathbf{f}^* \in \mathbb{R}^M$. If a task space control law is applied to the manipulator:

\[ \mathbf{u}_{k+1} = \mathbf{u}_k + (\mathbf{J}^T(\mathbf{q}^*)\mathbf{f}^* - \mathbf{J}^T(\mathbf{q}^*)\mathbf{f}_k) \tag{7.219} \]

where the Jacobian transpose $\mathbf{J}^T(\mathbf{q}) : \mathbb{R}^M \to \mathbb{R}^N$, then a disturbance, $\delta\boldsymbol{\tau}_k$, is always reorganized into $N(\mathbf{J})$:

\[ \delta\boldsymbol{\tau}_{k+1} = [\mathbf{I} - \mathbf{J}^\dagger(\mathbf{x}^*)\mathbf{J}(\mathbf{q}^*)]\, \delta\boldsymbol{\tau}_k \tag{7.220} \]

where $\mathbf{J}^\dagger$ is the pseudoinverse of the task space Jacobian, $\mathbf{J}$.

Proof Consider that actuators apply torque to each joint to provide a desired force at the end effector:

\[ \mathbf{u}^* - [\mathbf{D}(\mathbf{q}^*)\ddot{\mathbf{q}} + \mathbf{C}\dot{\mathbf{q}} + \mathbf{g}(\mathbf{q}^*)] = \mathbf{J}^T(\mathbf{q}^*)\mathbf{f}_d \tag{7.221} \]

With precise dynamic compensation in $\mathbf{f}_d = \mathbf{f}_{comp} + \mathbf{f}^*$:

\[ \mathbf{u}^* = \mathbf{J}^T(\mathbf{q}^*)\mathbf{f}^* \tag{7.222} \]

the application of $\mathbf{u}^*$ produces the force $\mathbf{f}^*$ at the end effector. Now suppose that at time $k$ the manipulator exerts joint forces $\mathbf{u}^*$ with the end effector forces at $\mathbf{f}^*$. Given a small perturbation $\delta\mathbf{u}_k$, the correction forces $\delta\mathbf{u}_{k+1}$ can be discovered through a derivation similar to Theorem 2. The task and configuration space forces at time $k$ are:

\[ \begin{aligned} \mathbf{u}_k &= \mathbf{u}^* + \delta\mathbf{u}_k \\ \mathbf{f}_k &= \mathbf{J}^{\dagger T}(\mathbf{x}^*)\mathbf{u}_k = \mathbf{f}^* + \delta\mathbf{f}_k = \mathbf{f}^* + \mathbf{J}^{\dagger T}(\mathbf{x}^*)\, \delta\mathbf{u}_k \end{aligned} \]

Again, without correction, torque perturbations corrupt the end effector trajectory.
The end effector force error is corrected at time $k+1$ through $\mathbf{u}_{k+1}$ with the rule:

\[ \mathbf{u}_{k+1} = \mathbf{u}_k + (\mathbf{J}^T(\mathbf{q}^*)\mathbf{f}^* - \mathbf{J}^T(\mathbf{q}^*)\mathbf{f}_k) = \mathbf{u}_k - \mathbf{J}^T(\mathbf{x}^*)\, \delta\mathbf{f}_k \tag{7.223} \]

In effect, the projection of the error between desired and current end effector forces determines the configuration space force error. Given $\mathbf{u}_k$ and $\delta\mathbf{f}_k$:

\[ \mathbf{u}_{k+1} = \mathbf{u}^* + (\mathbf{I} - \mathbf{J}^T(\mathbf{x}^*)\mathbf{J}^{\dagger T}(\mathbf{x}^*))\, \delta\mathbf{u}_k \tag{7.224} \]

Thus the applied correction torque becomes:

\[ \delta\mathbf{u}_{k+1} = (\mathbf{I} - \mathbf{J}^T(\mathbf{q}^*)\mathbf{J}^{\dagger T}(\mathbf{x}^*))\, \delta\mathbf{u}_k \]

and configuration space force corrections are projected into a redundant manipulator's Jacobian null space through the $(\mathbf{I} - \mathbf{J}^T(\mathbf{q}^*)\mathbf{J}^{\dagger T}(\mathbf{x}^*))$ operator.

As before, fully constrained manipulators (i.e. $\mathbf{J}^\dagger = \mathbf{J}^{-1}$) command the original joint force setpoint. Since joint forces balancing the end effector load reside, by definition, in $R(\mathbf{J}^T)$, forces that produce no end effector force must reside solely in $N(\mathbf{J})$. Having established a Null Space Self Organization Theorem, the combined performance of multiple goal systems can now be discussed.

7.3 Auxiliary Global Goal Systems

To this point, global goals have been generated by external, dedicated global goal processes (e.g. a user defined trajectory generator). However, it is not uncommon for agents themselves to both generate and pursue one or more global goal requests. Indeed many multiagent systems use agent based global goal broadcasts (e.g. [Mataric, 1994, Parker, 1992a]) as a foundation of interagent communication. This section will introduce and demonstrate a protocol that enables individual link agents to both assert and pursue multiple global goals within the agent team.

7.3.1 Obstacle Avoidance

Avoiding collisions with workspace obstacles is perhaps the most desirable auxiliary task in manipulation. The object is to avoid collisions in the manipulator's workspace while tracking a desired end effector trajectory.
Despite some similarity with path planning, obstacle avoidance schemes often do not rely on detailed world models to ensure that the manipulator will not collide with objects in the workspace. In this way, the auxiliary obstacle avoidance task supports the primary trajectory tracking global goal.

In RMPC position control, obstacle avoidance strategies require the geometric resolution of both manipulator and obstacle boundary constraints. RMRC and RMAC are less complex methods, inserting obstacle avoidance controllers either directly or indirectly into the null space of the end effector Jacobian through the Jacobian insertion operator, $(\mathbf{I} - \mathbf{J}^\dagger\mathbf{J})$, or through augmentation of the task space, $\mathbf{x}$, (and the Jacobian) with an avoidance function. Controllers include simple proportional [Slotine, 1988] and adaptive [Colbaugh, 1989] range controllers, as well as nonlinear and optimal model based controllers (e.g. [Nakamura, 1984, Sung, 1996]). Similar methods have been applied to Jacobian Transpose control by applying virtual avoidance forces to the manipulator. The next subsections will briefly review two relevant JTC techniques and present a multiagent hybrid of these methods.

The Operational Space Formulation (OSF)

In the operational space formulation, Khatib [Khatib, 1985] described obstacles of known geometry with an APF or artificial potential field model, $\Phi$. By composing 'virtual' repulsive forces derived from the negative gradient of $\Phi$ and applying them to "points subject to potential" or PSPs, the Jacobian transpose of the PSP can be used to propagate these forces to the manipulator and avoid the obstacle. The strategy was active only within a specific PSP-obstacle surface triggering range. Specifically, repulsive forces (again per unit mass), $\mathbf{f}_{(i,j)}$, were derived from the $j$th object's potential field model, $\Phi_j$, applied to the $i$th link's PSP:

\[ \mathbf{f}_{(i,j)} = -\nabla\Phi_j(\mathbf{p}_i) \tag{7.225} \]

a technique since emulated by many including Arkin [Arkin, 1987].
The potential field model Khatib used was based on an inverse law:

\[ \Phi(\rho) = \begin{cases} \ldots & \text{if } \rho < \rho_0 \\ 0 & \text{if } \rho > \rho_0 \end{cases} \tag{7.226} \]

where $\eta$ is a gain, $\rho$ describes the obstacle in link coordinates and $\rho_0$ is a threshold distance. Applying the gradient operator, the repulsive force becomes an inverse square law:

\[ \mathbf{f}(\rho) = \begin{cases} \ldots & \text{if } \rho < \rho_0 \\ 0 & \text{if } \rho > \rho_0 \end{cases} \tag{7.227} \]

As defined in [Khatib, 1985], $\rho$ is a function of obstacle geometry, requiring a recognition and world modelling system to identify and store obstacles. With the $j$th obstacle modelled by an APF, the repulsive force at the $i$th PSP is computed through simple superposition:

\[ \mathbf{f}_{i,o} = \sum_j \mathbf{f}_{(i,j)} = \sum_j -\nabla\Phi_j \tag{7.228} \]

Thus the combined control law for trajectory tracking and real time obstacle avoidance became:

\[ \boldsymbol{\tau} = \mathbf{J}^T\mathbf{f}_d + \sum_i \mathbf{J}_i^T \mathbf{M}(\mathbf{x})\, \mathbf{f}_{i,o} \tag{7.229} \]

where $\mathbf{f}_d$ is as in equation (E.414) and $\mathbf{J}_i$ is the Jacobian of the $i$th PSP.

Discussion

Despite the effectiveness of this technique, Khatib anticipated problems with OSF. Since $\mathbf{f}_d$ is a Cartesian PD controller and $\mathbf{f}_{i,o}$ are nonlinear forces, configurations are possible in which [Khatib, 1985]: "...local minima can occur in the resultant potential field. This can lead to a stable positioning of the robot before reaching its goal." Khatib recognized that a manipulator, trapped by obstacle repulsive fields, might not be able to follow the desired end effector trajectory. Khatib believed that local procedures might be able to extract a robot from this predicament, in effect resolving constraints through some decision making process.

Though powerful, APF models are not practical for multiagent obstacle avoidance. The reliance on a complete, global view of both the environment and the manipulator is computationally burdensome, difficult to decentralize, and, with current sensing, prohibitively expensive to implement in real time.
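To make the structure of the OSF control law (7.229) concrete, the sketch below superimposes a tracking torque and a single repulsive torque for a planar two link arm. It is an illustration under stated assumptions rather than Khatib's implementation: the potential is taken to be $\Phi = \eta/\rho$ inside the trigger range (consistent with the "inverse law" and "inverse square law" wording above, but the exact expression is an assumption), the mass scaling $\mathbf{M}(\mathbf{x})$ is omitted (unit mass), and all function names, the arm geometry and the numerical values are hypothetical.

```python
import numpy as np

def joint_positions(q, lengths):
    """Positions of each joint and the tip of a planar revolute chain (base at origin)."""
    pts, angle = [np.zeros(2)], 0.0
    for qi, li in zip(q, lengths):
        angle += qi
        pts.append(pts[-1] + li * np.array([np.cos(angle), np.sin(angle)]))
    return pts  # pts[j] is joint j; pts[-1] is the end effector

def point_jacobian(pts, p, link):
    """2 x N Jacobian of point p on the given link (0-based); later joints do not move p."""
    n = len(pts) - 1
    J = np.zeros((2, n))
    for j in range(link + 1):
        r = p - pts[j]
        J[:, j] = [-r[1], r[0]]  # planar z x r for a revolute joint
    return J

def repulsive_force(p, obstacle, eta, rho0):
    """-grad(eta / rho) inside the trigger range rho0, zero outside (assumed potential).

    Not guarded against rho == 0, i.e. a PSP exactly on the obstacle point.
    """
    d = p - obstacle
    rho = np.linalg.norm(d)
    if rho >= rho0:
        return np.zeros(2)
    return (eta / rho**2) * (d / rho)

# Illustrative two link arm with one PSP at the midpoint of the first link.
pts = joint_positions(q=[0.0, np.pi / 2], lengths=[1.0, 1.0])
psp, psp_link = np.array([0.5, 0.0]), 0
obstacle = np.array([0.5, -0.2])

f_d = np.array([1.0, 0.0])  # stand-in for the tracking force
f_o = repulsive_force(psp, obstacle, eta=1.0, rho0=0.5)

tau_track = point_jacobian(pts, pts[-1], link=1).T @ f_d
tau_avoid = point_jacobian(pts, psp, psp_link).T @ f_o
tau = tau_track + tau_avoid  # superposition in the manner of (7.229)
```

In this configuration the repulsive torque appears only at the first joint, because the second joint cannot move a point on the first link.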
An alternative to the global view is to combine local sensing with a simpler obstacle independent repulsion strategy, removing both the cost of sensing and recognition and the centralized world model. This strategy, adopted by Colbaugh in the implementation of an RMDAC based obstacle avoidance system, is briefly reviewed in the next section.

Obstacle Avoidance with Configuration Control

Colbaugh attacks the obstacle avoidance problem by augmenting the task space with an obstacle avoidance constraint and enforcing task space forces with a model reference decentralized adaptive controller. In redundant manipulator Configuration Control (for clarity, here referred to as RMDAC), the end effector position vector, $\mathbf{x}$, is augmented by additional kinematic constraints such as obstacle and joint limit boundaries:

\[ \mathbf{x} = \begin{bmatrix} \mathbf{x}_{ee} \\ \mathbf{x}_c \end{bmatrix} \tag{7.230} \]

Thus the end effector Jacobian, $\mathbf{J}_{ee}$, is also augmented by a constraint Jacobian, $\mathbf{J}_c$, through which constraint forces, $\mathbf{f}_c$, are applied to a manipulator's redundant degrees of freedom. Specifically:

\[ \boldsymbol{\tau} = \mathbf{J}_a^T\mathbf{f}_a = \begin{bmatrix} \mathbf{J}_{ee}^T & \mathbf{J}_c^T \end{bmatrix} \begin{bmatrix} \mathbf{f}_{d_{ee}}(\mathbf{x}_{ee}) \\ \mathbf{f}_{d_c}(\mathbf{x}_c) \end{bmatrix} \tag{7.231} \]

An RMDAC controller is then applied to the augmented state vector to produce an augmented force vector. As before, these forces are transformed into joint torques through the (augmented) Jacobian transpose. Note that the augmentation process and the summation process in OSF are mathematically identical.

In RMDAC obstacle avoidance, $\mathbf{f}_{d_c}(\mathbf{x}_c)$ was based on a computed minimum distance from the obstacle surface and range sensor data. In [Colbaugh, 1989], Colbaugh defines a critical point, $(\mathbf{x}_c)_{ij}$, as the point on link $i$ closest to the surface of obstacle $j$ in local link coordinates. If the position, $(\mathbf{x}_o)_j$, of the $j$th obstacle is known and a clearance (or triggering) radius $(r_o)_j$ is defined, then the magnitude of the minimum distance is:

\[ [d_c(\mathbf{q})]_{ij} = \| (\mathbf{x}_c)_{ij} - (\mathbf{x}_o)_j \|_2 \tag{7.232} \]

and an avoidance constraint, $g_{ij}(\mathbf{q})$, can be formulated:

\[ g_{ij}(\mathbf{q}) = [d_c(\mathbf{q})]_{ij} - (r_o)_j \ge 0 \tag{7.233} \]
where

\[ \begin{bmatrix} x_{ce} & \dot{x}_{ce} \end{bmatrix}^T = \begin{cases} \begin{bmatrix} 0 & 0 \end{bmatrix}^T & \text{if } g_{ij}(\mathbf{q}) \ge 0 \\ \begin{bmatrix} -g_{ij} & -\dot{g}_{ij} \end{bmatrix}^T & \text{if } g_{ij}(\mathbf{q}) < 0 \end{cases} \tag{7.234} \]

Now $\mathbf{f}_{d_c}(\mathbf{q})$ is generated by applying the feedback portion of an RMDAC controller to $x_{ce}$:

\[ \mathbf{f}_{d_c}(\mathbf{q}) = d_c(t) + K_{p_c}(t)\, x_{ce} + K_{v_c}(t)\, \dot{x}_{ce} \tag{7.235} \]

where $d_c(t)$, $K_{p_c}(t)$ and $K_{v_c}(t)$ are defined as in equation (6.206).

Discussion

Since there are no APF obstacle models, this technique is less compute intensive than OSF and enables direct sensing of $[d_c(\mathbf{q})]_{ij}$. Furthermore, the incorporation of $\mathbf{x}_c$ into the task state vector means that both $\mathbf{x}_{ee}$ and $\mathbf{x}_c$ are enforced by an adaptive controller. To be achieved, therefore, both tasks must occupy complementary regions of configuration space, $\mathbf{q}$, or:

\[ R(\mathbf{J}_{ee}^T) \cap R(\mathbf{J}_c^T) = \emptyset \tag{7.236} \]

This is merely a corollary of the fact that, given $r$ redundant degrees of freedom, a maximum of $r$ constraints may augment the task vector, or:

\[ \dim(\mathbf{x}_c) \le r \tag{7.237} \]

If both tasks were compliant and the above condition were not true, neither task would be achieved and an equilibrium would be established between the two tasks. Under adaptive control, however, there is no such compliance. Indeed, competition between adaptive controllers can lead to numerical instability. In other words, adaptive augmented state regulation ensures the convergence of mutually exclusive tasks, but cannot resolve conflicting tasks.

Obstacle Avoidance and Arbitration

Together, OSF and RMDAC obstacle avoidance methods provide two important lessons on arbitration.

Conflicts between strictly enforced goals can only be resolved through explicit coordination. If multiple goals are maintained by similar adaptive schemes, any conflict between goal systems (e.g. $M_{avoid} > N - M_{track}$ avoidance tasks) can lead to competitive adaptation and, ultimately, instability. Clearly, agents that use adaptive augmented state systems must use some form of explicit arbitration to either enable, disable, or modify goal systems during goal conflict.
Indeed, if the goal systems are global, a centralized process may be necessary to orchestrate a global arbitration strategy - an explicit coordination system.

Conflicts between compliant goals can be resolved through dynamic equilibrium. By using a compliant, albeit nonlinear, repulsion system and a compliant linear end effector force generator, OSF ensures collision avoidance and, when tasks conflict, stability. In short, OSF relies on environmental state, manipulator dynamics, and controller interaction to implement a goal coordination scheme - an emergent coordination system.

Next, a multiagent technique will be introduced that exploits Jacobian Transpose decentralization and a compromise between the rigid goal enforcement of RMDAC and the compliance of OSF to demonstrate decentralized obstacle avoidance.

7.3.2 Multiagent Obstacle Avoidance

Since both Khatib's and Colbaugh's obstacle avoidance methods rely on Jacobian transpose control, it is clear that these, too, may be decentralized just as the end effector trajectory tracking task was. Thus the structure of the global goal proxy is not specific to end effector trajectory tracking, but is applicable to any global goal system. Hence the variable $\mathbf{p}_{app}$ refers to a generic point of application for some desired force. For obstacle avoidance, the desired applied force is $\mathbf{f}_d = \mathbf{f}_{psh}$, a repulsive force away from the obstacle surface, while the point of application is the 'point subject to hazard' (psh),$^1$ $\mathbf{p}_{app} = \mathbf{p}_{psh}$. Together these form the obstacle avoidance global goal couplet:

\[ G_{psh} = \langle \mathbf{p}_{psh}, \mathbf{f}_{psh} \rangle \tag{7.238} \]

where $\mathbf{p}_{app} = \mathbf{p}_{psh}$ and $\mathbf{f}_d = \mathbf{f}_{psh}$.

The structure and computation of the agent global proxy (4.128) remain unchanged, regardless of the source of $G_{psh}$. However, for the $i$th link, avoidance forces can only be provided by superior links, $0 < j < i$, those constituting the forward solution of the psh. Specifically:

\[ \mathbf{p}_{psh} = f_{psh}(\mathbf{q}) \tag{7.239} \]

\[ \boldsymbol{\tau}_{psh} = \left[ \frac{\partial f_{psh}(\mathbf{q})}{\partial \mathbf{q}} \right]^T \mathbf{f}_{psh} \tag{7.240} \]

where $\mathbf{q}$ is the vector of generalized coordinates. Now if $\mathbf{p}_{psh}$ lies on link $i$ then the forward solution $f_{psh}(\mathbf{q})$ is a function of links $j : 0 < j < i$. This means that only 'superior' links, $j < i$, participate in the obstacle avoidance effort, making obstacle avoidance a group correlated goal.$^2$

$^1$ With nomenclature reminiscent of Khatib's PSPs, 'points subject to hazard' are not fixed a priori and, in fact, are virtually identical to Colbaugh's critical points.

$^2$ However, if both end effector and base are rigidly constrained, theoretically all agents can drive the psh away from the obstacle based on the same global goal expression.

Figure 7.38: A simple link agent controller model.

This section will describe the design of an agent behaviour controller that, when triggered by a sensed obstacle boundary, asserts or broadcasts a global obstacle avoidance goal to the agent team. This resolved motion obstacle avoidance behaviour controller (RMOA) is formed of two fundamental components. The first monitors the local region with range sensors and formulates a response to an obstacle that intrudes into a clearance lozenge surrounding the link. The second component polls the incoming goal stream for multiple global goals of the form (7.238) and develops a response based upon these requests.

Responding to Obstacles

Just as Colbaugh's obstacle avoidance system identified a critical point on each link, the sensor model in the multiagent obstacle avoidance behaviour controller determines a point subject to hazard or psh through a cylindrical sensor array. This array is aligned with the link axis and measures range to the nearest obstacle surface. Employing neither recognition nor surface mapping techniques, RMOA's sensor model simply returns a vector to the nearest obstructing surface, $\mathbf{x}_r$, expressed in the link's coordinate system.
The RMOA sensor can be modelled as a linear range sensor array or its equivalent, described parametrically in link coordinates:

\[ \mathbf{x}_{arr}(\alpha) = \mathbf{x}_1 + \alpha(\mathbf{x}_2 - \mathbf{x}_1), \qquad 0 \le \alpha \le 1 \tag{7.241} \]

where $\mathbf{x}_1$ and $\mathbf{x}_2$ are the proximal and distal coordinates of the link in link coordinates. With such an array, a visible surface can be interpreted as a function of the array length:

\[ \mathbf{x}_{surf} = f_{surf}(\mathbf{x}_{arr}) \tag{7.242} \]

Defining $\min(\mathbf{x}_{surf})$ as the value of $f_{surf}$ at which the following is true:

\[ \frac{\partial f_{surf}}{\partial \mathbf{x}_{arr}} = 0 \tag{7.243} \]

\[ \frac{\partial^2 f_{surf}}{\partial \mathbf{x}_{arr}^2} > 0 \tag{7.244} \]

a sensor array may be designed to report only the nearest hazard or $\min(\mathbf{x}_{surf})$:

\[ \mathbf{x}_{min} = \min(\mathbf{x}_{surf}) \tag{7.245} \]

\[ \mathbf{x}_{psh} = \mathbf{x}_r \cdot \hat{\mathbf{x}}_i \tag{7.246} \]

where $\hat{\mathbf{x}}_i$ is the basis vector of the $i$th frame in the x direction. Thus the minimum range from the psh to the surface, $\mathbf{x}_r$, is simply $\mathbf{x}_r = \mathbf{x}_{min} - \mathbf{x}_{psh}$.

With a linear array of range sensors distributed over each link, as in Colbaugh's implementation, each link may be enclosed by an artificial potential field, reminiscent of the obstacle APFs used in OSF and depicted in figure 7.38. The obstacle avoidance strategy enforces a clearance range, forming a nonlinear repulsion 'lozenge' about the link within which an APF, similar to equation (7.226), is activated. Within the lozenge, a repulsive force is determined based on the magnitude of the minimum distance, $\rho = |\mathbf{x}_r|$, from the link surface to the obstacle surface (see figure 7.39):

\[ \ldots \tag{7.247} \]

The force acts in the direction of the range, $\mathbf{x}_r / |\mathbf{x}_r|$. Triggering the behaviour at a clearance range of $c$, the repulsive force becomes:

\[ \mathbf{f}_r(\mathbf{x}_r) = \begin{cases} \ldots & \text{if } |\mathbf{x}_r| \le c \\ 0 & \text{if } |\mathbf{x}_r| > c \end{cases} \tag{7.248} \]

Through this RMDAC/nonlinear repulsion hybrid a margin of stability is gained during periods of overconstraint. The advantage the potential field approach has over the RMDAC obstacle avoidance goal is that the former permits some deflection and, therefore, some room for direct conflict between constraints.
The disadvantage is that these same conflicts may result in instability if the obstacle intrusion becomes large. Once computed, $\mathbf{f}_r$ and $\mathbf{x}_{psh}$ are transformed into global coordinates, $\mathbf{f}_{psh}$ and $\mathbf{p}_{psh}$ respectively, and asserted to the global goal process for retransmission to other agents.

Figure 7.39: Semilog plots of the potential field (top) and repulsive force (bottom) as a function of normalized range.

Global Goal Assertion Rebroadcast

As described in the previous chapter, the global goal process GP maintains all the global goal processes. As described above, RMOA also enables each agent to generate global goal assertions. However, agents lack the necessary information to broadcast such assertions to the agent team (such information constitutes a local manipulator model - a feature worth avoiding if possible). Ideally such broadcasts might occur over a common communications bus to all agents possessing the RMOA controller. To mimic such a bus system, the global goal process receives global goal assertions from RMOA controllers for rebroadcast to superior links with RMOA controllers. Thus RMOA controllers must transmit data (through a MeasurandVector) and receive data (through a SensorVector) from the goal process, both of which are members of the agent's IOStream. For a given receiving agent, the global goal process determines whether the agent is superior to (i.e. closer to the manipulator base in the kinematic chain than) the asserting agent and, if so, inserts the global goal couplet into the agent's receive message queue. Unfortunately, this message filtering process is clearly not a simple broadcast. However, a 'one way' or unidirectional bus structure from end effector to base simplifies the communication model, retaining a relatively model-free communications structure.
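The filtering step described above can be sketched in a few lines. This is a hypothetical illustration, not the MMS implementation: `GoalCouplet` and `GoalProcess` are invented names, links are numbered from 1 at the base to N at the tip, and the actual transport through the MeasurandVector, SensorVector and IOStream is replaced by plain Python lists.

```python
from collections import namedtuple

# A global goal couplet tagged with the index of the asserting link (assumed layout).
GoalCouplet = namedtuple("GoalCouplet", ["source", "p_psh", "f_psh"])

class GoalProcess:
    """Stand-in for the global goal process: rebroadcasts assertions to superior links."""

    def __init__(self, n_agents):
        self.n_agents = n_agents
        self.queues = {j: [] for j in range(1, n_agents + 1)}

    def assert_goal(self, couplet):
        """Queue the couplet for every link closer to the base than the asserting link."""
        for j in range(1, couplet.source):
            self.queues[j].append(couplet)

    def collect(self, j):
        """Hand agent j its pending couplets and empty its queue."""
        pending, self.queues[j] = self.queues[j], []
        return pending

# Link 9 of a 10 link manipulator asserts an avoidance goal (illustrative values).
gp = GoalProcess(n_agents=10)
gp.assert_goal(GoalCouplet(source=9, p_psh=(1.0, -0.5), f_psh=(0.0, 20.0)))
received = {j: len(gp.queues[j]) for j in gp.queues}
```

Under this one way scheme, agent j can never hold more than N - j couplets per cycle, since only the links distal to it can address it.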
Arbitrating Between Multiple Goals

Having completed the assertion phase of the obstacle avoidance task, RMOA retrieves the global goal message queue and applies the global goal proxy expression to every received RMOA couplet to establish response forces for each avoidance request. For the $j$th agent:

\[ u_{track} = \mathbf{J}_j^T(\mathbf{p}_n, \mathbf{x}_{j-1})\, \mathbf{f}_d \tag{7.249} \]

\[ u_{avoid} = \sum_{k=1}^{N-j} \mathbf{J}_j^T(\mathbf{p}_{psh_k}, \mathbf{x}_{j-1})\, \mathbf{f}_{psh_k} \tag{7.250} \]

where $\mathbf{J}_j^T(\cdot, \cdot)$ is the global goal proxy and $N$ is the number of agents. The response $u_{avoid}$ is passed to the arbitrator, $A$. Just as in previous obstacle avoidance methods, arbitration can be simplified to simple linear combination. Recalling equation (4.92), for the $j$th agent:

\[ u_{c_j} = \mathbf{k}_j \mathbf{u}_j = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} u_{track} \\ u_{avoid} \end{bmatrix} \tag{7.251} \]

where $\mathbf{k}_j$ is the arbitration vector. Significantly, the arbitration strategy of linear combination is not the product of some design process but a consequence of Newton's Second Law. Since $u_{avoid}$ is a triggered behaviour controller, the arbitration vector could be rewritten as:

\[ \mathbf{k}_j = \begin{bmatrix} k_{track} & k_{avoid} \end{bmatrix} \tag{7.252} \]

where

\[ k_{avoid} = \begin{cases} 1 & \text{if } |\mathbf{x}_r| \le c \\ 0 & \text{if } |\mathbf{x}_r| > c \end{cases} \tag{7.253} \]

Since this triggering occurs within $H_j^{avoid}$ and not within $A$, the goal system is self triggering.

Despite decentralization, goal assertion, and rebroadcast, the expression describing RMOA over the manipulator is identical to both equations (7.230) and (7.229), as would be that of any system based on Jacobian Transpose obstacle avoidance. The benefit of this system is that the computation is distributed, each agent can assert unique (even multiple) avoidance strategies based on different sensing systems (e.g. thermal, ultrasonic, etc.), and, perhaps most importantly, the failure of a single RMOA controller threatens only a single link - and not the entire manipulator. Furthermore, by recognizing that task conflict can be reconciled through compliant and adaptive controllers, a mechanism of decentralized arbitration has been identified.
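The per-agent computation in equations (7.249) to (7.251) can be sketched as follows. This is an illustrative reading, not the thesis code: it assumes a planar manipulator with revolute joints, for which the global goal proxy is taken to reduce to the moment of the goal force about the agent's proximal joint (the general proxy, equation (4.128), is not reproduced here). The names `proxy_torque` and `agent_command` and all numerical values are hypothetical.

```python
import numpy as np

def proxy_torque(p_app, x_prev, f):
    """Moment about the joint at x_prev of force f applied at p_app (planar assumption)."""
    r = np.asarray(p_app, dtype=float) - np.asarray(x_prev, dtype=float)
    return r[0] * f[1] - r[1] * f[0]

def agent_command(x_prev, track_goal, avoid_goals):
    """Linear combination u_c = k . [u_track, u_avoid] with k = [1, 1], as in (7.251)."""
    u_track = proxy_torque(track_goal[0], x_prev, track_goal[1])
    # An empty queue gives u_avoid = 0, so no separate trigger is needed here.
    u_avoid = sum(proxy_torque(p, x_prev, f) for p, f in avoid_goals)
    k = np.array([1.0, 1.0])
    return float(k @ np.array([u_track, u_avoid]))

x_prev = (0.0, 0.0)                    # proximal joint of this agent
track_goal = ((1.0, 0.0), (0.0, 2.0))  # (end effector position, desired force)
avoid_goals = [((0.5, 0.0), (0.0, -4.0)),  # couplets received from inferior links
               ((0.0, 1.0), (1.0, 0.0))]

u_free = agent_command(x_prev, track_goal, [])           # no obstacle asserted
u_busy = agent_command(x_prev, track_goal, avoid_goals)  # two avoidance couplets
```

Because the avoidance term vanishes when no couplet has been received, the triggering lives in the asserting agent's sensing rather than in the arbitrator, which is the sense in which the text calls the goal system self triggering.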
Though the process may seem complex in comparison to RMDAC or OSF, closer examination of these centralized schemes reveals similar (if not more daunting) complexity in collecting and polling N (possibly dissimilar) sensor arrays and computing the responses for N links in real time. In the multiagent structure discussed here, the global goal process transmits (at most) N - 1 global goal couplets to each agent and receives at most N - 1 global goal couplet assertions, each only 96 bytes long. Agents receive and transmit kinematic packets (144 bytes each) as well as IOStream and goal assertions (receive: 96(N - j) bytes, transmit: 96 bytes). It is hard to imagine a centralized data collection, analysis, and control system with lighter communication loads than decentralized RMOA.

7.3.3 Results

An experiment can now be constructed to explore the performance of this distributed, multiagent obstacle avoidance protocol. In the following test, an obstacle, a sphere 0.25 m in diameter, is placed at a location known to interfere with the manipulator's motion, though not obstructing the reference trajectory. The resulting performance, depicted in figure 7.40, demonstrates that the reference manipulator under multiagent control can successfully negotiate an interfering obstacle while tracking an end effector trajectory. During pure trajectory tracking, the manipulator exhibits typical jumbled, unconstrained 'free' motion in the Jacobian null space. However, once an obstacle intrudes into a link's clearance lozenge, links appear to 'rebound' from and 'slide' along a surface enveloping the obstacle. Despite these intrusions, the end effector exhibits good trajectory tracking performance as described in figure 7.41 and condensed into RMS error in table 7.11.

Strategy     $\delta_{rms}$    $\phi_{rms}$
Avoidance    3.481e-03         5.989e-03

Table 7.11: RMS error for a combined tracking and obstacle avoidance strategy.
Examination of figures 7.42 to 7.46 reveals the assertion and propagation of avoidance torques from source to base agents. As exemplified in figure 7.46, link 9 generates an avoidance torque output in response to an obstacle encroaching on its clearance lozenge. Link 9 then asserts an avoidance goal to the superior links (links 1 through 8). Thus link 9's avoidance torque waveforms are echoed and magnified as each superior agent formulates a local response to link 9's global goal assertion through its global goal proxy operator. Of course, each agent acts on global avoidance assertions and also asserts global avoidance goals of its own if an obstacle intrudes into the local lozenge. Therefore, the response of the jth link ($j < 9$) is often the combination of global goal assertions from links $k : j < k \le 9$. Not surprisingly, avoidance torque waveforms become increasingly complex as $j \to 0$.

Figure 7.40: The reference manipulator avoids a small stationary sphere while tracking the reference trajectory. The sphere is 0.250 m in diameter at $[1.00 \;\; {-0.750}]^T$. The RMOA controller uses a clearance of 0.75 m and a gain of $\eta = 100.0$.

Figure 7.41: End effector trajectory tracking performance of the reference manipulator engaging in multiagent trajectory tracking and obstacle avoidance.

Figure 7.42: Evolution of the Global Goals within links 1 (top) and 2 (bottom).

Figure 7.43: Evolution of the Global Goals within links 3 (top) and 4 (bottom).

Figure 7.44: Evolution of the Global Goals within links 5 (top) and 6 (bottom).

Figure 7.45: Evolution of the Global Goals within links 7 (top) and 8 (bottom).

Figure 7.46: Evolution of the Global Goals within links 9 (top) and 10 (bottom).

Of equal interest is the interaction between tracking and avoidance torques within each agent. In figure 7.44, both link 5 and link 6 exemplify the self arbitrating interaction between the adaptive tracking goal and the relatively compliant avoidance goal. At approximately 24.5 seconds, both links receive the avoidance request asserted by link 9 (footnote 3). Figure 7.41 reveals that this disturbance is propagated to the end effector. In compensating for this disturbance, the global goal regulator changes the required tracking forces. An examination of links 5 and 6 reveals that these force changes appear as inverted waveforms of the avoidance torque profile over the same interval (approx. 24.5 to 25.5 seconds). Similar controller interactions can be observed within all the agents of the multiagent manipulator team.

This behaviour suggests that multiple global goal systems can, in effect, become self arbitrating in a redundant system. The combination and magnitude of tracking and avoidance torques are determined neither by a global explicit coordination mechanism nor by a local agent explicit arbitration strategy, but through interaction of the system, the behaviour controllers, and the environment.
Rather than modelling the environment and planning an explicit avoidance strategy, JTC obstacle avoidance schemes, including RMOA, demonstrate that obstacle avoidance and end effector tracking behaviours can be designed independently and superposed to achieve collision free motion in a cluttered environment. Though the independent global goals of trajectory tracking and obstacle avoidance are applied to the system, the system's global behaviour becomes a hybrid of the two, apparently without explicit or implicit coordination of the link agents.

7.3.4 Discussion

Unlike those of compliant task space controllers, end effector trajectories enforced through adaptive control must be changed to avoid mid course obstacles. Compliant controllers can spontaneously avoid such obstacles by combining attractive and repulsive forces at the end effector. Adaptive controllers, however, overcome linear repulsive controllers and are unstable in combination with any other adaptive controller, so they must actively formulate trajectories to avoid unexpected mid course obstacles. One must agree with Khatib that, to avoid end effector collisions, the global goal generator must itself become an agent, adopting some obstacle avoidance strategy to construct safe end effector trajectories. Though such agent based obstacle avoidance strategies are numerous and would fit easily into the multiagent system discussed in this study, they are ultimately just another trajectory generator and will not be investigated here.

Not all goals are of the same priority, however. For tracking tasks, obstacle avoidance and trajectory tracking are both of primary importance, divergence from either constituting a form of failure.

Footnote 3: Occasionally, a link range plot indicates an obstacle intrusion without triggering a local avoidance response, implying $p_{psh_k} = x_{j-1}$. Overlapping the clearance lozenges at the joints ensures avoidance behaviour in these cases.
Nevertheless, strict performance guarantees are often pursued for these less critical tasks (such as torque optimization, manipulability, etc.) and, in RMAC for example, have led to explicit task prioritization methods based on hierarchies of null space selection operators and switching logic [Nakamura, 1984]. In the next section, it will be shown that through appropriate specification of auxiliary goal systems, such prioritization/selection schemes can be greatly simplified.

7.4 Local and Global Goals Combined

The last chapter demonstrated a global goal generation method and the last section the interaction of multiple global goals; this section investigates the interaction of local and global goals. Furthermore, this section will demonstrate that appropriate design of linear local and adaptive global behaviour controllers can result in self organizing, emergent behaviour of a multiagent team. Finally, simulation results will be presented that establish multiagent control as a new, computationally lightweight method of redundancy resolution.

7.4.1 Why Local Goals?

Fundamentally, local goals provide each agent with control over local conditions. For example, a manipulator's free motion in N(J) can produce both undesirable and unpredictable dynamics. However, under local control this motion can be harnessed, enabling velocity minimization, free configuration space maximization, and joint limit avoidance. Therefore, the purpose behind local behaviour control is to rein in the less desirable characteristics of redundancy while achieving locally desirable states or properties. An additional objective, identified in the emergent multiagent control hypothesis, is that local control can lead to globally desirable states or properties.
Before entering into a discussion of local goal design, a brief examination of a common local auxiliary controller, joint limit repulsion [Khatib, 1985], will demonstrate the benefits and difficulties of local goal design in a multi-goal environment.

7.4.2 Continuous Nonlinear Joint Limit Repulsion

A persistent hazard of task space control, collision with joint limits, has led to the development of joint limit repulsion controllers. A good example, demonstrated here, is Khatib's joint limit repulsion strategy [Khatib, 1985]. As a variant of his obstacle avoidance strategy, Khatib developed a force generator in which joint limit boundaries were enforced by the following nonlinear repulsive control law:

$u_{limit} = \begin{cases} -\eta \left( \frac{1}{\rho_{upper}} - \frac{1}{\rho_0} \right) \frac{1}{\rho_{upper}^2} & \text{if } \rho_{upper} \le \rho_0 \\ \eta \left( \frac{1}{\rho_{lower}} - \frac{1}{\rho_0} \right) \frac{1}{\rho_{lower}^2} & \text{if } \rho_{lower} \le \rho_0 \\ 0 & \text{otherwise} \end{cases}$    (7.254)

where

$\rho_{upper} = q_{upper} - q$    (7.255)

$\rho_{lower} = q - q_{lower}$    (7.256)

Applying this behaviour controller together with the trajectory tracking behaviour controller:

$u_{track} = J_j^T(p_n, x_{j-1})\, f_d$    (7.257)

to each link agent controlling the reference manipulator, or:

$u_{c_j} = k_{A_j}^T u_j = [1 \;\; 1] \begin{bmatrix} u_{track} \\ u_{limit} \end{bmatrix}$    (7.258)

a physically realizable robot configuration history, i.e. one within joint limits, can be observed. In the demonstration depicted in figure 7.48, the gain $\eta = 5.00$ is applied to the local joint limit repulsion goal. Clearly, the local repulsion goal prevents collision with the joint limit boundaries. Unfortunately, the nonlinearity of these local goals can produce collision-like effects, visible in figure 7.47, that often disturb the end effector trajectory. Indeed, some trial and error is required to settle upon a local gain sufficiently strong to prevent collision and yet maintain stability of the manipulator. Furthermore, the nonlinearity of the local goal exacerbates the problem by permitting virtually free motion in N(J), only to prevent joint space collisions precariously near the joint limit.
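The abrupt character of this repulsion can be seen in a small numerical sketch. The form below is an assumption based on Khatib's published repulsive potential (a gain $\eta$, a limit distance $\rho$, and an influence distance $\rho_0$); the function name, parameter names, and the numerical values are hypothetical and are not taken from the thesis simulator.

```python
def joint_limit_repulsion(q, q_lower, q_upper, rho_0, eta):
    """Khatib-style joint limit repulsion for one joint (assumed form).

    Returns zero while the joint is further than rho_0 from both limits,
    and a rapidly growing torque directed away from a limit once inside.
    """
    rho_lower = q - q_lower   # distance to the lower limit
    rho_upper = q_upper - q   # distance to the upper limit
    u = 0.0
    if 0.0 < rho_lower <= rho_0:
        u += eta * (1.0 / rho_lower - 1.0 / rho_0) / rho_lower ** 2
    if 0.0 < rho_upper <= rho_0:
        u -= eta * (1.0 / rho_upper - 1.0 / rho_0) / rho_upper ** 2
    return u


# Free motion in mid range, strong repulsion a quarter radian from a limit.
u_mid = joint_limit_repulsion(0.0, -1.0, 1.0, rho_0=0.5, eta=5.0)
u_low = joint_limit_repulsion(-0.75, -1.0, 1.0, rho_0=0.5, eta=5.0)
u_high = joint_limit_repulsion(0.75, -1.0, 1.0, rho_0=0.5, eta=5.0)
```

In this sketch the torque is exactly zero over most of the joint range and then rises steeply, which is consistent with the collision-like disturbances described above.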
Clearly, arbitrary local goals do not guarantee stability of the system even in the presence of a relatively robust global goal generator. Some reliable design process is needed. The following sections will attack the local goal design problem by

• adopting a simple linear local controller
• combining this with a linear global controller
• performing a linearized stability analysis.

Figure 7.47: End effector trajectory performance for the reference manipulator and payload under RMDAC end effector and joint limit avoidance ($\eta = 5.0$ N-m).

Figure 7.48: Both RMDAC end effector global and joint limit local goals are engaged.

7.5 Linear Local Goal Design

To separate the interaction and design of controllers from agent arbitrators, assume for the moment that simple linear combination is an adequate arbitrator for the jth agent. For example:

$u_{c_j} = k_{A_j}^T u_j = [1 \;\; 1] \begin{bmatrix} u_{global} \\ u_{local} \end{bmatrix}$    (7.259)

where $k_{A_j}$ is the arbitration vector. The local goal design problem then addresses how $u_{local}$ should be formed. Just as in global behaviour control, a number of local behaviour controller models are possible: linear or nonlinear, continuous or discrete. Unlike global behaviour control alone, however, local behaviour control interacts with both local and global goal systems. To simplify the problem, a good starting point is to clarify the interaction between local and global goal systems through the adoption of a simple linear continuous local controller. By reviewing the design of a simple PD local controller, the implications of this selection on the global performance of the system and on the distribution of local control effort throughout the system can be predicted, simulated, and discussed.
Assume a PD controller is applied to the kth agent based on the assumption that the link is an isolated linear system of the form in equation (2.49):

$J_{eff_k} \ddot{q}_{m_k} + (B_{eff_k} + K_k K_{b_k})\, \dot{q}_{m_k} = K_k u_k - r_k d_k$    (7.260)

where, as before:

$J_{eff_k} = J_{m_k} + r_k^2 d_{kk}$    (7.261)

$d_k = \sum_{j \ne k} d_{jk} \ddot{q}_j + \sum_{i,j} c_{ijk}\, \dot{q}_i \dot{q}_j$    (7.262)

Now for a simple PD controller with the gains $k_q$ and $k_w$, reiterating equation (3.58):

$u_{c_k} = k_q q_{e_k} + k_w \dot{q}_{e_k}$    (7.263)

where $q_{e_k} = q_{d_k} - q_{m_k}$. Substituting and rearranging:

$J_{eff_k} \ddot{q}_{m_k} + (B_{eff_k} + K_k K_{b_k} + K_k k_w)\, \dot{q}_{m_k} + K_k k_q q_{m_k} = K_k k_q q_{d_k} + K_k k_w \dot{q}_{d_k} - r_k d_k$    (7.264)

Applying the Laplace Transform with zero initial conditions:

$J_{eff_k} Q_{m_k}(s) s^2 + (B_{eff_k} + K_k K_{b_k} + K_k k_w)\, Q_{m_k}(s) s + K_k k_q Q_{m_k}(s) = K_k k_q Q_{d_k}(s) + K_k k_w Q_{d_k}(s) s - r_k D_k(s)$    (7.265)

Developing the expression:

$Q_{m_k}(s) = \frac{K_k (k_w s + k_q)}{\Delta_k(s)} Q_{d_k}(s) - \frac{r_k}{\Delta_k(s)} D_k(s)$    (7.266)

$\Delta_k(s) = J_{eff_k} s^2 + (B_{eff_k} + K_k K_{b_k} + K_k k_w)\, s + K_k k_q$    (7.267)

Now, this system will be stable if the roots of the characteristic equation, $\Delta_k(s)$, reside in the left half plane. The steady state error is the product of two possible inputs, $Q_{d_k}$ and $D_k(s)$. Applying a step to both $Q_{d_k}$ and $D_k(s)$, the final value theorem may be used to estimate the steady state error:

$e_{ss} = \lim_{s \to 0} s \left[ \frac{1}{s} - \frac{K_k (k_w s + k_q)}{s\, \Delta_k(s)} + \frac{r_k g_b}{s\, \Delta_k(s)} \right] = \frac{r_k g_b}{K_k k_q}$    (7.268)

Here the disturbance that survives into the steady state is whatever remains constant in the robot equation, usually gravitational forces. Therefore $g_b = |g_k|$ is the bound on the value of these forces. From this classical analysis one can conclude that the steady state error may be characterized by:

$e_{ss} \le \frac{r_k g_b}{K_k k_q}$    (7.269)

Thus as $k_q$ increases, steady state error diminishes.
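The two conclusions of this classical analysis can be checked numerically. The sketch below assumes the characteristic polynomial of equation (7.267) and a steady state bound of the form $r_k g_b / (K_k k_q)$; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import cmath


def characteristic_roots(j_eff, b_eff, k_motor, k_back, k_q, k_w):
    """Roots of J s^2 + (B + K*K_b + K*k_w) s + K*k_q."""
    a = j_eff
    b = b_eff + k_motor * k_back + k_motor * k_w
    c = k_motor * k_q
    disc = cmath.sqrt(b * b - 4.0 * a * c)
    return ((-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a))


def is_stable(roots):
    """Stable when every root lies strictly in the left half plane."""
    return all(root.real < 0.0 for root in roots)


def steady_state_error_bound(r_k, g_b, k_motor, k_q):
    """Assumed bound on the steady state error under a constant disturbance."""
    return r_k * g_b / (k_motor * k_q)


# Unit inertia, no friction or back EMF, gains k_q = 100 and k_w = 20.
roots = characteristic_roots(1.0, 0.0, 1.0, 0.0, 100.0, 20.0)
soft = steady_state_error_bound(1.0, 2.0, 1.0, 100.0)
stiff = steady_state_error_bound(1.0, 2.0, 1.0, 400.0)
```

With these illustrative numbers both roots sit at $s = -10$, and quadrupling the proportional gain cuts the error bound to a quarter.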
Using vector notation, we can describe the manipulator local controller behaviour as a typical PD manipulator controller, or:

$u_{local} = K_q q_e(t) + K_w \dot{q}_e(t)$    (7.270)

Given this classical PD joint controller construction, what are the implications of a similar local PD controller coexisting with a global goal system of the form discussed in chapter 6? The following sections will explore this topic through the combination of locally compliant and globally adaptive behaviour controllers.

7.6 Local Goal Strategies

In this section a number of local goal strategies will be described and combined with global goal strategies in simulation. Though based on the basic PD controller design above, each local goal adopts a specific compliance and/or setpoint combination to achieve unique global behaviour. Since each link is controlled by both a three degree of freedom task space controller and a single degree of freedom joint controller, the manipulator will be overconstrained with a total of n + 3 constraints. As discussed earlier, this overconstraint should not produce a PD equilibrium offset at the end effector, since the global RMDAC controller compensates for the local controllers in the steady state. However, those links that are fully constrained will, nevertheless, be able to transmit force and position requirements to other link agents by 'reflecting' control effort off the global goal regulator.

7.6.1 Fail Safe Locking and Robustness

The basic attraction of redundant manipulation is that a variety of configurations can achieve the same end effector task. Thus a disabled motor or controller need not bar the manipulation system from completing the task. However, this kinematic redundancy is usually not mirrored within traditional centralized controllers, in which computer or algorithmic failure can be catastrophic.
In RMPC or off-line numerical methods, joint jamming or locking requires centralized detection and recomputation of the desired joint space solution. While tolerant of such failures, RMRC, RMAC, and traditional JTC, like RMPC, are model driven processes within centralized architectures. Software or hardware failures of any centralized controller can lead to catastrophic failure. The solution, backup controllers ('hot standbys') and/or remote operation, greatly complicates the control system and substantially elevates costs.

By exploiting a distributed computational model, multiagent control is comparatively immune to systemwide computer failure. This architecture provides a simple solution to controller and/or motor failure with little or no computational penalty. Since end effector performance is not disturbed by frozen joints in a redundant manipulator under Jacobian transpose control, an agent can manage a joint failure through, for example, a 'fail-safe' behaviour that engages an emergency brake. This approach, combined with the decentralized computation model described in chapter 6, provides the necessary hardware robustness to survive relatively severe controller and/or motor failures. By basing an agent's fail-safe behaviour on triggers such as suspicious sensor data or poor motor performance, a factor of safety can be designed into the manipulation system.

Of course, no system is immune to catastrophic failure and multiagent manipulator control is no exception. Multiagent control's weakness lies in each agent's dependency on kinematic packet transmission (assuming no local task space sensing is available), a failing rectified only through redundant communications (footnote 4).

Footnote 4: Probably required by 'hot standbys' in RMAC anyway, redundant communications alone are, nevertheless, far less costly than redundant communications and redundant, centralized controllers.

Results
Figure 7.49: End effector trajectory error for the reference manipulator and payload under the failure of link 9.

Figure 7.50: The manipulator configurations at the knotpoints of the reference trajectory for the reference manipulator and payload under the failure of link 9. The 'fail safe' braking behaviour locks link 9 to link 8.

Table 7.12: Comparison of end effector RMS position and orientation error performance between reference and fail-safe behaviour of link 9.

Strategy                  Position RMS error    Orientation RMS error
Reference with Failure    3.285e-03             5.528e-03
Reference                 2.779e-03             2.850e-02

The performance of a local fail safe braking strategy can be simulated through the application of a stiff local PD controller at the failed joint and the removal of the global trajectory tracking behaviour from the joint's goal list. This has the effect of binding the failed link, j, to link j − 1:

$u_{brake} = k_q q_{e_k}$    (7.271)

where $k_q$ is large.

In the following simulation, each agent in the team pursues a global tracking goal while the disabled agent, link 9, employs a fail safe braking strategy. The multiagent system adopts the control laws:

$u_{c_j} = u_{track}, \quad j = \{1, \ldots, 8, 10\}$    (7.272)

$u_{c_9} = u_{brake}$    (7.273)

Figure 7.50 shows the configuration history of the reference manipulator with link 9's goal system disabled and its motion locked. Together, adaptive global goal enforcement and the decentralized architecture enable the manipulator to continue trajectory tracking without interruption or substantial increase in tracking error, as depicted in figure 7.49 and recorded in table 7.12. The increased RMS error can be attributed to larger inertias in the link 8/link 9 system that can affect the global behaviour's transient response.
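The goal switching described by equations (7.272) and (7.273) amounts to a per-agent selection between two behaviours. A minimal sketch follows, assuming a purely proportional brake about the joint angle at which the fault was detected; the function name, the set of failed links, and the brake gain are hypothetical.

```python
def agent_command(link, u_track, q_error, failed_links, k_brake=1.0e4):
    """Return the torque command for one link agent.

    A failed link drops the global tracking goal and holds its joint with
    a stiff proportional brake; every other link passes u_track through.
    q_error is the lock angle minus the current joint angle.
    """
    if link in failed_links:
        return k_brake * q_error
    return u_track


failed = {9}
healthy_command = agent_command(8, u_track=5.0, q_error=0.5, failed_links=failed)
braked_command = agent_command(9, u_track=5.0, q_error=0.5, failed_links=failed)
```

Because the decision uses only locally available information, no other agent needs to be told about the failure in this sketch.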
RMAC methods can exhibit similarly robust performance, but at the cost of greater computation (an $O(N^2)$ pseudoinversion and at least an $O(N)$ dynamic model) and centralized control.

7.6.2 Constant Compliance Centering

Earlier in this chapter a joint limit avoidance local goal was introduced, exhibiting effective, though poorly behaved, performance. Another strategy is to minimize the probability of a joint limit collision by maximizing each joint's available free maneuvering range. One strategy for maximizing available 'maneuvering room' in joint space is to enforce a joint centering strategy on the system. By specifying a time invariant setpoint midway between each joint's upper and lower bounds in the following manner:

$q_{mean_k} = \frac{q_{upper_k} + q_{lower_k}}{2}$    (7.274)

a simple joint centering strategy can be devised. For simplicity, assume all agents are homogeneous, with identical PD controllers of the form:

$u_{centering_k} = k_{q_k} q_{e_k} + k_{w_k} \dot{q}_{e_k}$    (7.275)

where $q_{e_k} = q_{mean_k} - q_k$, where

$k_{q_k} = C \quad \forall k$    (7.276)

and $(k_{q_k}, k_{w_k})$ are selected to be LHP stable and critically damped, or:

$k_{w_k} = 2\sqrt{k_{q_k}}$    (7.277)

Figure 7.51 depicts the final structure in which each agent has both local centering and global tracking behaviours.

Figure 7.51: The structure of a multiagent system in which each agent has both local centering and global trajectory tracking (and other) behaviours. (Block labels: Global Trajectory Tracking Goal (RMDAC); Other Agents; Local Behaviour Controller (Centering).)

Intuitively, centering seems to be a reasonable method of ensuring each joint tends to remain in the middle of its range, but does this strategy, in fact, globally maximize maneuvering room (or globally minimize displacement from the midrange) over the trajectory?

Joint Centering As Optimization

Joint centering can be shown to globally minimize joint displacement from the midrange over the end effector trajectory.
Kazerounian and Wang [Kazerounian, 1988] demonstrated that locally minimizing joint acceleration through RMAC (a least squares pseudoinverse solution for joint acceleration) globally minimizes the performance index:

$I = \int_{t_0}^{t_f} \dot{q}^T A(q)\, \dot{q}\; dt$    (7.278)

and, therefore, the joint velocity if the symmetric matrix $A(q) = I$. By augmenting this index with a potential energy term, a similar development will show that insertion of a proportional joint controller into N(J) globally minimizes joint displacement:

$I = \int_{t_0}^{t_f} \left( q^T K_q q + \dot{q}^T A \dot{q} \right) dt$    (7.279)

subject to the forward kinematic solution $G(q, t) = x(t) - f(q) = 0$. The Hamiltonian becomes:

$H(q, \dot{q}, t) = q^T K_q q + \dot{q}^T A \dot{q} + \lambda^T G(q)$    (7.280)

making an augmented performance index:

$I^* = \int_{t_0}^{t_f} H(q, \dot{q}, t)\; dt$    (7.281)

Kazerounian then applies the calculus of variations to $I^*$ to produce:

$\delta I^* = \int_{t_0}^{t_f} \left[ \frac{\partial H}{\partial q} - \frac{d}{dt} \frac{\partial H}{\partial \dot{q}} \right] \delta q\; dt + \left. \frac{\partial H}{\partial \dot{q}} \right|_{t_f} \delta q(t_f) - \left. \frac{\partial H}{\partial \dot{q}} \right|_{t_0} \delta q(t_0)$    (7.282)

For an arbitrary variation $\delta q$, the PI is minimized if:

$\frac{\partial H}{\partial q} - \frac{d}{dt} \frac{\partial H}{\partial \dot{q}} = 0$    (7.283)

The term:

$\frac{\partial H}{\partial q} = 2 K_q q + J^T \lambda + \frac{\partial (\dot{q}^T A)}{\partial q} \dot{q}$    (7.284)

Of course the term $\partial G / \partial q$ is, up to a sign that can be absorbed into the multiplier $\lambda$, simply $J$, the Jacobian of the forward solution. The term:

$\frac{d}{dt} \frac{\partial H}{\partial \dot{q}} = 2 A \ddot{q} + 2 \dot{A} \dot{q}$    (7.285)

Equation (7.283) then becomes:

$2 K_q q + J^T \lambda + \frac{\partial (\dot{q}^T A)}{\partial q} \dot{q} - (2 A \ddot{q} + 2 \dot{A} \dot{q}) = 0$    (7.286)

Solving for $\ddot{q}$ (and absorbing the factor of one half into $\lambda$):

$\ddot{q} = A^{-1} \left( K_q q + J^T \lambda + 0.5 \frac{\partial (\dot{q}^T A)}{\partial q} \dot{q} - \dot{A} \dot{q} \right)$    (7.287)

Twice differentiating the constraint, $G(q)$, produces the familiar equation:

$J \ddot{q} = \ddot{x} - \dot{J} \dot{q}$    (7.288)

into which $\ddot{q}$ may be substituted:

$J A^{-1} \left( K_q q + J^T \lambda + 0.5 \frac{\partial (\dot{q}^T A)}{\partial q} \dot{q} - \dot{A} \dot{q} \right) = \ddot{x} - \dot{J} \dot{q}$    (7.289)

$J A^{-1} J^T \lambda + J A^{-1} \left[ K_q q + \left( 0.5 \frac{\partial (\dot{q}^T A)}{\partial q} - \dot{A} \right) \dot{q} \right] = \ddot{x} - \dot{J} \dot{q}$    (7.290)

Now solving for $\lambda$:

$\lambda = (J A^{-1} J^T)^{-1} (\ddot{x} - \dot{J} \dot{q}) - (J A^{-1} J^T)^{-1} J A^{-1} \left[ K_q q + \left( 0.5 \frac{\partial (\dot{q}^T A)}{\partial q} - \dot{A} \right) \dot{q} \right]$    (7.291)

Backsubstituting into expression (7.287):

$\ddot{q} = A^{-1} \left[ K_q q + J^T (J A^{-1} J^T)^{-1} (\ddot{x} - \dot{J} \dot{q}) - J^T (J A^{-1} J^T)^{-1} J A^{-1} \left[ K_q q + \left( 0.5 \frac{\partial (\dot{q}^T A)}{\partial q} - \dot{A} \right) \dot{q} \right] + \left( 0.5 \frac{\partial (\dot{q}^T A)}{\partial q} - \dot{A} \right) \dot{q} \right]$    (7.292)

$\ddot{q} = J_A^{\dagger} (\ddot{x} - \dot{J} \dot{q}) + (I - J_A^{\dagger} J)\, A^{-1} \left[ K_q q + \left( 0.5 \frac{\partial (\dot{q}^T A)}{\partial q} - \dot{A} \right) \dot{q} \right]$    (7.293)

where:

$J_A^{\dagger} = A^{-1} J^T (J A^{-1} J^T)^{-1}$    (7.294)

is the weighted pseudoinverse of the Jacobian matrix. If $A = I$ and $K_q = 0$, then $J_A^{\dagger} = J^{\dagger}$, and (7.293) is equivalent to the minimum velocity solution. If the only joint value requirements are the satisfaction of the end effector forward kinematic solution constraint at $t_f$ and $t_0$, then the joint velocities must satisfy the boundary conditions [Kazerounian, 1988]:

$\dot{q} = J_A^{\dagger} \dot{x}$    (7.295)

at both $t_f$ and $t_0$.

However, both [Kazerounian, 1988] and [Colbaugh, 1989] show that by setting $A = D(q)$, free motion in the null space of $J_D$ minimizes the kinetic energy of the system. Substituting into equation (7.293):

$\ddot{q} = J_D^{\dagger} (\ddot{x} - \dot{J} \dot{q}) + (I - J_D^{\dagger} J)\, D^{-1} \left[ K_q q + \left( 0.5 \frac{\partial (\dot{q}^T D)}{\partial q} - \dot{D} \right) \dot{q} \right]$    (7.296)

and recognizing the relation:

$C(q, \dot{q}) = \dot{D} - 0.5 \frac{\partial (\dot{q}^T D)}{\partial q}$    (7.297)

equation (7.296) reduces to:

$\ddot{q} = J_D^{\dagger} (\ddot{x} - \dot{J} \dot{q}) + (I - J_D^{\dagger} J)\, D^{-1} \left[ K_q q - C \dot{q} \right]$    (7.298)

with the natural boundary conditions [Kazerounian, 1988] at $t_0$ and $t_f$:

$\dot{q} = J_D^{\dagger} \dot{x}$    (7.299)

If $K_q = 0$, equation (7.298) describes the accelerations within a redundant manipulator under end effector control alone. Observe that these accelerations are weighted (as one might expect) by the manipulator inertia matrix and that accelerations in the null space are governed by Coriolis and centrifugal forces.

By restructuring this equation as a force-torque expression (as in [Colbaugh, 1989]), it can be shown that insertion of a simple proportional controller into the Jacobian null space is equivalent to a minimum displacement strategy. Recall the structure of the robot equation in both task space and configuration space coordinates:

$\tau = D(q) \ddot{q} + C(q, \dot{q}) \dot{q} + g(q)$    (7.300)

$f = M(x) \ddot{x} + N(x, \dot{x}) \dot{x} + p(x)$    (7.301)

where $M(x)$, $N(x, \dot{x}) \dot{x}$, and $p(x)$ are related to $D(q)$, $C(q, \dot{q}) \dot{q}$, and $g(q)$ respectively in equations (E.421), (E.422), and (E.423).
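The defining properties of the weighted pseudoinverse in equation (7.294) can be checked on a toy case. The sketch below is an illustration under strong simplifying assumptions: a scalar task (a 1 x n Jacobian stored as a list) and a diagonal weighting matrix, so that $J A^{-1} J^T$ is a scalar. All names are hypothetical.

```python
def weighted_pinv_row(jac, a_diag):
    """Weighted pseudoinverse A^-1 J^T (J A^-1 J^T)^-1 of a row Jacobian.

    jac: the single row of J; a_diag: the diagonal of the weight matrix A.
    Returns the n entries of the resulting column vector.
    """
    a_inv_jt = [j / a for j, a in zip(jac, a_diag)]
    scale = sum(j * w for j, w in zip(jac, a_inv_jt))
    return [w / scale for w in a_inv_jt]


def null_projector(jac, a_diag):
    """Null space projector I - J_A^dagger J as a nested list."""
    pinv = weighted_pinv_row(jac, a_diag)
    n = len(jac)
    return [[(1.0 if i == k else 0.0) - pinv[i] * jac[k] for k in range(n)]
            for i in range(n)]


jac = [1.0, 2.0]
pinv = weighted_pinv_row(jac, [1.0, 4.0])
proj = null_projector(jac, [1.0, 4.0])

# J J_A^dagger should be 1: the pseudoinverse reproduces the task motion.
task_gain = sum(j * p for j, p in zip(jac, pinv))
# J (I - J_A^dagger J) should vanish: projected motion is invisible to the task.
leak = [sum(jac[i] * proj[i][k] for i in range(2)) for k in range(2)]
```

With the weight set to the identity the same function returns the ordinary pseudoinverse $J^T (J J^T)^{-1}$, matching the remark that $J_A^{\dagger} = J^{\dagger}$ when $A = I$.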
Now given a redundant manipulator with the inertia-weighted pseudoinverse:

$J_D^{\dagger} = D^{-1} J^T (J D^{-1} J^T)^{-1}$    (7.302)

the task space equation can be rewritten:

$f = (J D^{-1} J^T)^{-1} [\ddot{x} - \dot{J} \dot{q}] + J_D^{\dagger T} [C \dot{q} + g]$    (7.303)

Substituting equation (7.298) into the configuration space robot equation:

$\tau = D \left[ J_D^{\dagger} (\ddot{x} - \dot{J} \dot{q}) + (I - J_D^{\dagger} J)\, D^{-1} [K_q q - C \dot{q}] \right] + C \dot{q} + g(q)$    (7.304)

Expanding $J_D^{\dagger}$:

$\tau = D D^{-1} J^T (J D^{-1} J^T)^{-1} (\ddot{x} - \dot{J} \dot{q}) + D (I - J_D^{\dagger} J)\, D^{-1} [K_q q - C \dot{q}] + C \dot{q} + g(q)$    (7.305)

Simplifying:

$\tau = J^T (J D^{-1} J^T)^{-1} (\ddot{x} - \dot{J} \dot{q}) + (I - J^T J_D^{\dagger T}) [K_q q - C \dot{q}] + C \dot{q} + g(q)$    (7.306)

and substituting $f$ and rearranging:

$\tau = J^T f - J^T J_D^{\dagger T} [C \dot{q} + g] + C \dot{q} + g(q) + (I - J^T J_D^{\dagger T}) [K_q q - C \dot{q}]$    (7.307)

or simply:

$\tau = J^T f + (I - J^T J_D^{\dagger T}) [K_q q + g(q)]$    (7.308)

From this last equation, one can conclude that proportional control in the null space of $J_D$ globally minimizes displacement if $g(q) \approx 0$. The effects of the addition of derivative control are not clear using these arguments. A PI based on a damping energy term $\dot{q}^T K_w \dot{q}$ becomes $\dot{q}^T [K_w + D] \dot{q}$ and ultimately acts to bias the weighted pseudoinverse.

The above development has shown that a proportional centering strategy inserted into $N(J_D)$ globally minimizes displacement energy (and therefore displacement). However, in the current technique there is no explicit insertion process; rather, the 'natural' evolution of perturbations described in section 7.2 performs this coordination automatically. Unfortunately, since RMDAC requires finite time to ensure end effector tracking and, moreover, cannot guarantee convergence of parameters (footnote 5), the actual performance is probably a suboptimal displacement energy solution. Nevertheless, in accepting suboptimality, a decentralized, self coordinated system can be demonstrated.

Results

In the following simulations each agent employs a joint centering local goal in conjunction with the reference trajectory tracking global goal.
In effect, each PD controller models a spring-damper system located at each link's joint. The agent output:

$u_{c_j} = k_{A_j}^T u_j = [1 \;\; 1] \begin{bmatrix} u_{track} \\ u_{centering} \end{bmatrix}$    (7.309)

In the demonstration run, depicted in figures 7.52 and 7.53, local gains $k_q$ and $k_w$ are set to 100 and 20 respectively, with all controllers operating at 120 Hz. The results may be viewed from both a local and an aggregate perspective.

Locally, each agent's centering behaviour keeps each joint near its midrange and, as a consequence, minimizes clashes with joint limits. Lacking the nonlinear boundary controller, centering behaviours produce less disturbed end effector performance and greater predictability than the joint limit behaviour described earlier, though they do not guarantee that joint limits will not be exceeded.

The aggregate effect is twofold. First, the local strategy induces the predicted local minimization. Secondly, the local strategy enforces a manipulator 'shaping' policy much like a 'leaf spring'. Both are the product of complex interactions between local and global behaviour controllers. The local centering strategy seems to maximize the available maneuvering volume within the limits of the usable C-space and to exhibit a smooth curvature, an unplanned (though not unforeseeable) global behaviour.

Just as in end effector trajectory tracking, a root mean square (RMS) measure characterizes the optimality of the manipulator's centering behaviour over the reference trajectory and is defined as:

(7.310)

where $N_s$ is the total number of timesteps. Table 7.13 documents the erosion of RMS end effector tracking accuracy as well as the reduction of the RMS joint displacement error with increasing local goal stiffness.

Footnote 5: i.e. equations (E.421), (E.422), and (E.423) are not necessarily true.
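A centering behaviour of this kind is compact enough to sketch in a few lines. The sketch assumes the midrange setpoint of equation (7.274) and the critically damped gain choice $k_w = 2\sqrt{k_q}$ of equation (7.277), with a time invariant setpoint so that the rate error is simply the negative joint rate; the function names are hypothetical.

```python
import math


def midrange(q_lower, q_upper):
    """Time invariant centering setpoint midway between the joint limits."""
    return 0.5 * (q_lower + q_upper)


def critically_damped_gain(k_q):
    """Derivative gain paired with the proportional gain k_q."""
    return 2.0 * math.sqrt(k_q)


def centering_torque(q, q_dot, q_lower, q_upper, k_q):
    """PD torque pulling the joint toward its midrange."""
    q_err = midrange(q_lower, q_upper) - q
    q_err_dot = -q_dot  # the setpoint does not move
    return k_q * q_err + critically_damped_gain(k_q) * q_err_dot


# Gain pairs: proportional gain mapped to the matching derivative gain.
gain_pairs = [(k_q, round(critically_damped_gain(k_q), 2))
              for k_q in (30.0, 50.0, 100.0, 225.0)]
u_static = centering_torque(0.5, 0.0, -1.0, 1.0, k_q=100.0)
u_moving = centering_torque(0.5, 1.0, -1.0, 1.0, k_q=100.0)
```

The resulting pairs agree with the demonstration gains of 100 and 20 quoted above.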
Though slowing the global goal's convergence to an ideal tracking process, even relatively small local stiffnesses greatly improve the optimal displacement measure, $q_{rms}$. Dynamically, the aggregate behaviour mimics a spring-loaded or leaf spring mechanism, similar to configurations described in [Slotine, 1988].

Despite the simplicity of this strategy, examination of the torque histories (figures 7.54-7.58) reveals that this behaviour is the product of complex interaction between local and global constraints. As established earlier, end effector control drives disturbance forces into N(J). From the lone agent's perspective, the adaptive correction process allows local goals, $\delta u$, to be partially fulfilled while fulfilling the global objective. In terms of multiagent control, this process effectively reflects a local goal to the agent team. In this sense, Cartesian adaptive controllers provide a communication infrastructure for an emergent coordination strategy between arbitrary local goal systems. The torque histories in figures 7.54-7.58 clearly document this coordination through the cancellation of local link centering forces, but only insofar as they interfere with the global goal. This cancellation is indicated by mirrored waveforms between the link's centering and trajectory tracking torque plots. In the steady state ($\dot{q} = 0$), these forces balance exactly as predicted by the stability analysis.

In an analysis of the locally and globally compliant systems it will be shown in the next section that these systems are coupled and that the performance of the combined system is a nonlinear hybrid of both systems.
Table 7.13: Tabulation of local joint centering PD gains versus RMS task space position error and RMS centering error for a simple joint centering strategy.

Case   k_q      k_w     RMS position error   RMS orientation error   q_rms
01     0.00     0.00    3.059e-03            6.014e-03               7.848e+00
05     30.00    10.95   4.573e-03            8.034e-03               2.813e+00
04     50.00    14.14   5.023e-03            9.346e-03               2.738e+00
02     100.00   20.00   5.963e-03            1.424e-02               2.700e+00
03     225.00   30.00   8.183e-03            2.988e-02               2.675e+00

The same may be said of the locally compliant and globally adaptive multiagent system: isolated stability of local and global systems does not guarantee combined stability. To ensure stability of the RMDAC generator, the manipulator's parameters must not appear to change suddenly. For example, setting local gains above $k_q = 400$ and $k_w = 40$ produces instability at the reference control rate of 120 Hz. Stability can be recovered by either raising local control rates or using underdamped local derivative gains. Either tactic reduces the apparent change in manipulator parameters at the end effector.

As the end effector approaches the origin, the local deflections become greater (and the 'leaf spring' more 'compressed'), forcing RMDAC to become increasingly stiff and, often, less stable. The combined effect is a stability gradient imposed on the work space that depends on both the position of the end effector and the manipulator configuration. Thus centering/tracking goal combinations that are marginally stable at the start of the trajectory may become unstable near the origin. Indeed, many marginally stable cases explode numerically either in the first few seconds of the reference trajectory or approximately at the midway point (t = 20.0 sec).

While local stiffness is a source of instability, large local position gains are not necessary to institute a manipulator shaping policy in N(J). Rather, it is the existence of such local behaviours that dominates the manipulator shape, as indicated by the optimality measure $q_{rms}$ in table 7.13.

7.6.3 Constant Compliance Retraction

The 'leaf spring' configuration behaviours demonstrated thus far maximize the available maneuvering volume for each link, minimizing the curvature along the manipulator. In effect, as the end effector approaches the origin, the local centering behaviours become more compressed, requiring greater stiffness from the adaptive global goal generation system to maintain the trajectory. Since stability and global goal stiffness are coupled, this strategy imposes a stability gradient on the system. This approach also biases the manipulator position towards the edge of the work volume and, by coincidence, closer to kinematic singularities.

Figure 7.52: End effector trajectory for the reference manipulator and payload under both RMDAC end effector global and joint centering local goals.

Figure 7.53: The manipulator configurations at the knotpoints of the reference trajectory for the reference manipulator and payload under both RMDAC end effector global and joint centering local goals. Note the manipulator simultaneously adopts a 'leaf spring' configuration while tracking the reference trajectory, indicating that joint centering forces are acting in N(J).

Figure 7.54: Centering and tracking torques for links 1 (top) and 2 (bottom) of the reference manipulator tracking the reference trajectory.

Figure 7.55: Centering and tracking torques for links 3 (top) and 4 (bottom) of the reference manipulator tracking the reference trajectory.

Figure 7.56: Centering and tracking torques for links 5 (top) and 6 (bottom) of the reference manipulator tracking the reference trajectory.

Figure 7.57: Centering and tracking torques for links 7 (top) and 8 (bottom) of the reference manipulator tracking the reference trajectory.

Figure 7.58: Centering and tracking torques for links 9 (top) and 10 (bottom) of the reference manipulator tracking the reference trajectory.

Qualitatively, these effects can be reversed by applying a retraction behaviour to the entire manipulation system. Though one might be able to design a global goal to implement a global retraction generator, a simple local strategy can also produce a retraction behaviour. For a serial planar manipulator, retraction of the end effector to the origin may be easily effected by alternately applying maximum and minimum boundary setpoints along the length of the manipulator. Formally:

\[ q_{d_k} = \begin{cases} q_{high_k} & \text{if } k \text{ even} \\ q_{low_k} & \text{if } k \text{ odd} \end{cases} \quad \forall k \tag{7.311} \]

and, again:

\[ u_{retraction} = k_q q_{e_k} + k_w \dot{q}_{e_k} \tag{7.312} \]

where $q_{e_k} = q_{d_k} - q_k$. The agent's control effort is:

\[ u_{c_j} = k_{A_j} u_{A_j} = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} u_{track} \\ u_{retraction} \end{bmatrix} \tag{7.313} \]

Adopting the local centering behaviour gains and setpoints in table 7.14 in the multiagent control of the reference manipulator, one can observe that the global shaping behaviour is markedly distinct from previous centering strategies, though the controller dynamics are identical. This strategy is a good example of local behaviour leading to complex aggregate or global behaviour, and of the simplicity of deriving such behaviour.
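The alternating setpoint rule and local PD law of equations (7.311) to (7.313) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, links are assumed to be indexed k = 0 ... n-1 from the base (which reproduces the +1.570, -1.570, ... alternation of table 7.14 starting at agent 01), and every joint is assumed to start at midrange and at rest.

```python
def retraction_setpoints(n_links, q_high, q_low):
    """Alternate boundary setpoints along the chain, following equation (7.311).

    Assumption: links are indexed k = 0 .. n_links-1 from the base, so even k
    takes q_high and odd k takes q_low.
    """
    return [q_high if k % 2 == 0 else q_low for k in range(n_links)]


def pd_effort(q_d, q, q_dot, k_q, k_w, q_d_dot=0.0):
    """Local PD effort u = k_q*q_e + k_w*q_e_dot with q_e = q_d - q (7.312)."""
    q_e = q_d - q
    q_e_dot = q_d_dot - q_dot
    return k_q * q_e + k_w * q_e_dot


def agent_effort(u_track, u_retraction, k_a=(1.0, 1.0)):
    """Linear agent model u_c = k_A . u_A with unit arbitration gains (7.313)."""
    return k_a[0] * u_track + k_a[1] * u_retraction


# Gains and setpoints taken from table 7.14; the initial state is hypothetical.
setpoints = retraction_setpoints(10, q_high=1.570, q_low=-1.570)
efforts = [pd_effort(q_d, q=0.0, q_dot=0.0, k_q=100.0, k_w=20.0) for q_d in setpoints]
```

Each agent evaluates only its own setpoint and PD law; the coil-like retraction of the whole arm is not encoded anywhere in the sketch and would only appear when the efforts act on the manipulator dynamics.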
The alternate extreme setpoint strategy serves to contract the manipulator in a manner reminiscent of a coil spring. The results in table 7.15 and figure 7.60 document improved trajectory tracking and, in figure 7.59, a relatively compact manipulator volume. Trajectory tracking is improved in part due to end effector rotation that conveniently coincides with the desired orientation trajectory. Furthermore, as all the links rotate in the first instants of motion, the manipulator spontaneously leaves the region of kinematic singularity (where the Jacobian Transpose performs poorly), improving both the projection of the end effector goal on each actuator and the image of the robot's dynamics to the end effector goal generator.

Tertiary dynamics, also residing in N(J), differ between centering and retraction behaviours. Since PD local behaviour controllers resemble a spring damper system, some form of tertiary oscillatory dynamics should be expected. Not surprisingly, the centering behaviour, modelling a leaf spring, tends to exhibit transverse oscillations more readily than the retraction behaviour, which tends to exhibit longitudinal oscillations.

Figure 7.59: Both RMDAC end effector global and joint centering local goals with alternating setpoints are engaged.

Figure 7.60: End effector trajectory performance for the reference manipulator and payload under RMDAC end effector and joint centering 'retraction' local goals.
Table 7.14: Tabulation of joint centering PD gains and setpoint for the constant compliance retraction local goal strategy.

Agent   k_q       k_w       q_d          \dot{q}_d
01      1.0e+02   2.0e+01   +1.570e+00   0.00e+00
02      1.0e+02   2.0e+01   -1.570e+00   0.00e+00
03      1.0e+02   2.0e+01   +1.570e+00   0.00e+00
04      1.0e+02   2.0e+01   -1.570e+00   0.00e+00
05      1.0e+02   2.0e+01   +1.570e+00   0.00e+00
06      1.0e+02   2.0e+01   -1.570e+00   0.00e+00
07      1.0e+02   2.0e+01   +1.570e+00   0.00e+00
08      1.0e+02   2.0e+01   -1.570e+00   0.00e+00
09      1.0e+02   2.0e+01   +1.570e+00   0.00e+00
10      1.0e+02   2.0e+01   -1.570e+00   0.00e+00

Table 7.15: RMS error for the constant compliance retraction strategy.

Strategy     RMS position error   RMS orientation error
Retraction   2.244e-03            5.782e-03

7.6.4 Variable Compliance Centering

Another global shaping behaviour strategy is to vary the stiffness of each local PD goal generator. When agents are differentiated through a stiffness strategy, an agent with a relatively soft PD controller will accept more displacement than agents with stiffer controllers, giving some links 'preferential' treatment over others. For example, consider a compliance strategy based on the following PD controller stiffness selection:

\[ k_{q_j} > k_{q_{j-1}}, \quad j = 1 \ldots n \tag{7.314} \]

With this strategy, a 'variable compliance' multiagent system can be designed that becomes stiffer towards the end effector. The reverse of this strategy:

\[ k_{q_j} < k_{q_{j-1}}, \quad j = 1 \ldots n \tag{7.315} \]

allows a 'variable compliance' multiagent system to be designed that becomes more flexible towards the end effector. For consistency, both cases are designed to be LHP stable and perfectly damped, or:

\[ k_{w_j} = 2\sqrt{k_{q_j}} \tag{7.316} \]

and the agent output is again:

\[ u_{c_j} = k_{A_j} u_{A_j} = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} u_{track} \\ u_{centering} \end{bmatrix} \tag{7.317} \]

Simulating the reference manipulator with the increasing gains documented in table 7.16, the familiar global 'leaf spring' behaviour emerges. However, the manipulator exhibits progressively smaller deflections from joint midrange in figure 7.62 from base to end effector.
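The damping rule of equation (7.316) can be checked numerically against the gain pairs listed in table 7.16. The following is a minimal sketch, assuming unit effective inertia so that the local characteristic polynomial is s^2 + k_w s + k_q; the function name and the linear stiffness series (100 down to 10 in steps of 10, and its reverse) are illustrative.

```python
import math


def critically_damped_gains(k_q_values):
    """Pair each position gain k_q with k_w = 2*sqrt(k_q), per equation (7.316).

    Assumption: unit effective inertia, so s^2 + k_w*s + k_q has a repeated
    real pole at s = -sqrt(k_q).
    """
    return [(k_q, 2.0 * math.sqrt(k_q)) for k_q in k_q_values]


# One linear stiffness series and its reverse, as tabulated in table 7.16.
k_q_schedule = [100.0 - 10.0 * j for j in range(10)]  # 100, 90, ..., 10
gains = critically_damped_gains(k_q_schedule)
reversed_gains = list(reversed(gains))

for k_q, k_w in gains:
    print(f"k_q = {k_q:6.1f}   k_w = {k_w:7.3f}")
```

Because k_w grows only with the square root of k_q, a tenfold change in stiffness along the chain changes the derivative gain by a factor of about 3.16.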
Comparing figure 7.61 with figure 6.30, it is clear that the addition of this behaviour tends to exaggerate transient end effector tracking errors, though steady state response is ultimately unaffected. Similarly, simulating the reverse strategy in figure 7.64 shows that decreasing gains from base to end effector permits progressively greater deflections at each link along the 'leaf spring', again magnifying the end effector transient response of figure 6.30.

It is interesting to note the difference in transient response between the increasing and decreasing gains cases. Clearly, the increasing gain strategy disrupts the end effector global goal less than the reversed, decreasing strategy. Given the Jacobian relationship between end effector and joint velocities, this is not unexpected. Since the base gets 'preferential' treatment in the decreasing case (i.e. small deflections are desirable) with relatively large stiffnesses, restoration accelerations in the first link will be magnified at the end effector through equation (2.12). In the increasing gains case it is the end effector that retains relatively stiff gains and, as a consequence, generates smaller end effector disturbances. Despite the performance differences, a comparison of figures 7.61, 7.63 and 7.52 indicates that reducing the total stiffness of these variable series goal systems improves transient performance.

Remarks

To dynamicists, the observation that subtle structural changes can lead to large changes in the response of a nonlinear system is not surprising. To the multiagent designer, however, this lesson shows that seemingly minor changes within a multiagent team can have significant impact on the system's global behaviour.

7.6.5 Combinations: Trajectory Tracking, Obstacle Avoidance, and Centering

Trajectory tracking and obstacle avoidance guarantee collision free motion if the tracking goal is reachable and the end effector trajectory is obstacle free.
However, triggered obstacle avoidance tasks constrain the manipulator only while activated and leave the manipulator to free motion in N(J) otherwise. Maximizing the maneuvering volume and avoiding obstacles frees the manipulator from conflicts with both configuration and cartesian space boundaries.

Figure 7.61: End effector trajectory performance for the reference manipulator and payload under RMDAC end effector and joint centering local goals of linearly increasing stiffness.

Figure 7.62: Both RMDAC end effector global and joint centering local goals with increasing stiffness are engaged.

Figure 7.63: End effector trajectory performance for the reference manipulator and payload under RMDAC end effector and joint centering local goals of linearly decreasing stiffness.

Figure 7.64: Both RMDAC end effector global and joint centering local goals with decreasing stiffness are engaged.

Table 7.16: Tabulation of increasing and decreasing joint centering PD gains for the variable compliance local goal strategy.

        Increasing              Decreasing
Agent   k_q       k_w           k_q       k_w
01      1.0e+02   2.0e+01       1.0e+01   6.324e+00
02      9.0e+01   1.897e+01     2.0e+01   8.944e+00
03      8.0e+01   1.788e+01     3.0e+01   1.095e+01
04      7.0e+01   1.673e+01     4.0e+01   1.265e+01
05      6.0e+01   1.549e+01     5.0e+01   1.414e+01
06      5.0e+01   1.414e+01     6.0e+01   1.549e+01
07      4.0e+01   1.265e+01     7.0e+01   1.673e+01
08      3.0e+01   1.095e+01     8.0e+01   1.788e+01
09      2.0e+01   8.944e+00     9.0e+01   1.897e+01
10      1.0e+01   6.324e+00     1.0e+02   2.0e+01

Table 7.17: RMS error for "increasing" and "decreasing" variable centering gain strategies.

Strategy     RMS position error   RMS orientation error
Increasing   4.972e-03            1.082e-02
Decreasing   6.045e-03            1.238e-02
In the following demonstration, previewed in the introduction, three behaviours act simultaneously within each agent: an RMDAC trajectory tracking global goal, an RMOA obstacle avoidance global goal, and a PD joint centering local goal. The jth agent is of the form:

\[ u_{c_j} = k_{A_j} u_{A_j} = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} u_{track} \\ u_{avoid} \\ u_{centering} \end{bmatrix} \tag{7.318} \]

Given that $u_{avoid}$ is a triggered behaviour controller, the arbitration vector could be rewritten as:

\[ k_{A_j} = \begin{bmatrix} k_{track} & k_{avoid} & k_{centering} \end{bmatrix} \tag{7.319} \]

\[ k_{track} = 1 \tag{7.320} \]

\[ k_{avoid} = \begin{cases} 1 & \text{if } |x_r| < c \\ 0 & \text{if } |x_r| > c \end{cases} \tag{7.321} \]

\[ k_{centering} = 1 \tag{7.322} \]

where $|x_r|$ and $c$ are the surface and clearance ranges respectively.

Table 7.18: Comparison of RMS errors for tracking, obstacle avoidance ($\eta = 10.0$), and centering ($k_p = 100$, $k_d = 20$) combined strategies.

Behaviour Controllers            k_{A_j}         RMS position error   RMS orientation error   q_rms
Tracking                         [1.0]           2.779e-03            2.850e-02               7.848e+00
Tracking, Avoidance              [1.0 1.0]       2.537e-03            5.475e-03               9.904e+00
Tracking, Centering              [1.0 1.0]       5.963e-03            1.424e-02               2.700e+00
Tracking, Avoidance, Centering   [1.0 1.0 1.0]   4.679e-03            7.626e-03               2.239e+00
Tracking, Avoidance, Centering   [1.0 1.0 0.5]   3.999e-03            8.046e-03               2.292e+00

Selecting the avoidance gain $\eta = 10.0$ and uniform centering gains $k_q = 100.0$ and $k_w = 20.0$, a multiagent simulation can demonstrate the combined interaction between these multiple goal systems. The resulting end effector performance depicted in figure 7.65 demonstrates that, as in previous centering tasks, the adaptive controller's performance does not degrade significantly. Indeed, the combined strategy shows marked improvement over the tracking strategy alone. When triggered, the obstacle avoidance behaviour stabilizes local oscillations and, as a result, improves RMS trajectory tracking.
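The triggered arbitration described above, in which the avoidance gain switches between 1 and 0 on the range to the obstacle surface, can be sketched as follows. This is an illustrative sketch only: the function names and the numerical torques are hypothetical, and because the switching rule leaves the case |x_r| = c unspecified, the code assumes the behaviour is off at the boundary.

```python
def avoidance_gain(range_to_surface, clearance):
    """Switching arbitration gain: 1 inside the clearance range, 0 otherwise.

    Assumption: the boundary case |x_r| == c is treated as 'not triggered'.
    """
    return 1.0 if abs(range_to_surface) < clearance else 0.0


def agent_output(u_track, u_avoid, u_centering, range_to_surface, clearance):
    """Linear agent model u_c = k_A . u_A with a triggered avoidance term."""
    k_a = (1.0, avoidance_gain(range_to_surface, clearance), 1.0)
    u_a = (u_track, u_avoid, u_centering)
    return sum(k * u for k, u in zip(k_a, u_a))


# Hypothetical behaviour controller outputs for one link agent.
near = agent_output(12.0, -4.0, 1.5, range_to_surface=0.2, clearance=0.5)
far = agent_output(12.0, -4.0, 1.5, range_to_surface=0.9, clearance=0.5)
```

With the link inside the clearance range all three behaviours contribute; outside it the avoidance torque drops out and the agent reduces to the tracking and centering case.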
The individual agent torque contributions shown in figures 7.67 to 7.71 clearly demonstrate the interaction and arbitration between behaviours as a function of end effector trajectory, link displacement, and range to obstacle surface. Again, all local goal activity resides in N(J) through the strict enforcement of the trajectory tracking task.

As mentioned earlier, a similar mixture of tasks was demonstrated by [Slotine, 1988] based on an offline version of RMRC. Using a null space selection operator, joint centering and obstacle avoidance were demonstrated in conjunction with a trajectory tracking task. The control law used in [Slotine, 1988]:

(7.323)

where $\eta$ and $K$ are the number of joints and obstacles respectively. Clearly, Slotine used similar, centralized, controllers. In multiagent control, the adoption of the global goal proxy enables the distribution of these controllers over the manipulator, while the adaptive global goal engages self organization in the Jacobian null space.

Figure 7.65: End effector trajectory error for the reference manipulator and payload under RMDAC end effector global, RMOA obstacle avoidance, and joint centering local goals.

Figure 7.66: Reference manipulator configurations at reference trajectory knotpoints under RMDAC end effector and obstacle avoidance global goals ($\eta = 10.0$) and joint centering local goals ($k_q = 100$, $k_w = 20$). Note 'leaf spring' configuration and avoidance reside in N(J).

Figure 7.67: Range to surface, avoidance, centering, and tracking torques for links 1 (top) and 2 (bottom) of the reference manipulator tracking the reference trajectory. Panel titles: "Multiple Goal Evolution within: link_1" and "Multiple Goal Evolution within: link_2".
Figure 7.68: Range to surface, avoidance, centering, and tracking torques for links 3 (top) and 4 (bottom) of the reference manipulator tracking the reference trajectory. Panel titles: "Multiple Goal Evolution within: link_3" and "Multiple Goal Evolution within: link_4".

Figure 7.69: Range to surface, avoidance, centering, and tracking torques for links 5 (top) and 6 (bottom) of the reference manipulator tracking the reference trajectory. Panel titles: "Multiple Goal Evolution within: link_5" and "Multiple Goal Evolution within: link_6".

Figure 7.70: Range to surface, avoidance, centering, and tracking torques for links 7 (top) and 8 (bottom) of the reference manipulator tracking the reference trajectory. Panel titles: "Multiple Goal Evolution within: link_7" and "Multiple Goal Evolution within: link_8".

Figure 7.71: Range to surface, avoidance, centering, and tracking torques for links 9 (top) and 10 (bottom) of the reference manipulator tracking the reference trajectory. Panel titles: "Multiple Goal Evolution within: link_9" and "Multiple Goal Evolution within: link_10".

7.7 Arbitration and Compliance

By keeping the elements of the linear agent model's arbitration gain vector, $k_{A_j}$, at unity, arbitration has been simplified to an unmoderated equilibrium between goal systems. However, in the previous experiments it was observed that changing the gains within the local PD goal system significantly altered the performance of the system. Clearly arbitration and compliance are related. If this is so, what is the impact of changing the relative magnitudes of the arbitration vector elements?
To explore this further, consider the local centering goal system in isolation:

\[ u_{centering} = k_q q_{e_k} + k_w \dot{q}_{e_k} \tag{7.324} \]

Within the linear agent model, the output of the agent is defined as:

\[ u_{c_j} = k_{A_j} u_{A_j} = \begin{bmatrix} k_{track} & k_{centering} \end{bmatrix} \begin{bmatrix} u_{track} \\ u_{centering} \end{bmatrix} \tag{7.325} \]

Now suppose this expression is rewritten, defining new behaviour controllers, $u'_{track}$ and $u'_{centering}$, in which the arbitration gains have been incorporated:

\[ u_{c_j} = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} u'_{track} \\ u'_{centering} \end{bmatrix} \tag{7.326} \]

where, for example, the centering behaviour becomes:

\[ u'_{centering} = k'_q q_{e_k} + k'_w \dot{q}_{e_k} \tag{7.327} \]

with

\[ k'_q = k_{centering} k_q, \qquad k'_w = k_{centering} k_w \tag{7.328} \]

Reexamining the model of the PD controller with gains $k'_q$ and $k'_w$, the Laplace transform, $Q_{m_k}(s)$, becomes:

\[ Q_{m_k}(s) = \frac{\left(K_k k_{centering} k_q + K_k k_{centering} k_w s\right) Q_{d_k}(s) - \tau_k(s)}{\Delta_k(s)} \tag{7.329} \]

\[ \Delta_k(s) = J_{eff_k} s^2 + \left(B_{eff_k} + K_k K_{b_k} + K_k k_{centering} k_w\right) s + K_k k_{centering} k_q \tag{7.330} \]

If we assume, for argument, that $B_{eff_k}$ and $K_{b_k}$ are negligible and $K_k$ is unity, then the characteristic polynomial reduces to:

\[ \Delta_k(s) = J_{eff_k} s^2 + k_{centering} k_w s + k_{centering} k_q \tag{7.331} \]

It is apparent that arbitration gains other than unity can affect the performance properties of the simple PD controller. In figure 7.72 and table 7.18 this effect is demonstrated using behaviour controllers identical to those of the obstacle avoidance and centering example (i.e. $k_q = 100$, $k_w = 20$, $K_p = \mathrm{diag}(100)$, $\eta = 10.0$ and $K_v = \mathrm{diag}(20)$) and with arbitration gains set to $k_{centering} = 0.5$ and $k_{avoid} = k_{track} = 1.0$. Though trajectory tracking is only marginally affected, the centering performance index drops, as might be expected with a lower arbitration gain. Interestingly, applying a gain of $k_{centering} = 2.0$ renders this same system unstable, a confirmation that changing a lone arbitration gain can directly influence the stability of the entire system.

Now, suppose that the PD controller was divided into separate position and damping behaviour controllers.
The linear agent model then becomes:

\[ u_{c_j} = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} u_{track} \\ u_{position} \\ u_{damping} \end{bmatrix} \tag{7.332} \]

Then it becomes clear that the local position and velocity gains, $k_q$ and $k_w$, are effectively arbitration gains:

\[ u_{c_j} = \begin{bmatrix} 1 & k_q & k_w \end{bmatrix} \begin{bmatrix} u_{track} \\ q_{e_k} \\ \dot{q}_{e_k} \end{bmatrix} \tag{7.333} \]

One conclusion to draw from this argument is that weighted combination arbitration strategies (such as the linear model) affect both the performance of the combined tasks and the stable performance of individual tasks. Another is that compliance and arbitration are equivalent in the linear agent model.

With the recognition that arbitration and compliance are equivalent, adaptive goal generators (such as RMDAC) can be examined in a slightly different light. Using the RMDAC goal generator as an example and applying the arbitration gain $k_{track}$ to the global goal proxy:

\[ u_{track} = k_{track} J_j^T(p_n, x_{j-1}) f_d \tag{7.334} \]

Since $k_{track}$ is scalar, the gain may be moved arbitrarily into $f_d$, or:

\[ f_i(t) = k_{track}\left[d_i(t) + c_i(t) r_i(t) + b_i(t) \dot{r}_i(t) + a_i(t) \ddot{r}_i(t) + K_{p_i}(t) x_{e_i}(t) + K_{v_i}(t) \dot{x}_{e_i}(t)\right] \tag{7.335} \]

With further simplification, it can be shown that $k_{track}$ ultimately affects the integral gains. Recalling the generic integral gain structure of equation (6.206):

\[ k'_i(v_i(t), e(t), t) = k_{track}\left[k_i(0) + u_{1i} \int_0^t v_i(t) e_i(t)\,dt + u_{2i} v_i(t) e_i(t)\right] \tag{7.336} \]

Since $k_i(0)$ and $u_{2i}$ are usually ignored in the cartesian case, as mentioned earlier, only the integral is affected:

\[ k'_i(v_i(t), e(t), t) = k_{track} u_{1i} \int_0^t v_i(t) e_i(t)\,dt \tag{7.337} \]

In which case, $k_{track}$ modifies either $u_{1i}$, $v_i(t)$ or $e_i(t)$, or possibly some multiplicative combination.

Figure 7.72: End effector tracking performance overconstrained by RMOA and N Centering controllers ($K_q = \mathrm{diag}(100)$, $K_w = \mathrm{diag}(20)$) and the agent's arbitration vector $k_{A_j} = [k_{track}\ k_{avoid}\ k_{centering}] = [1.0\ 1.0\ 0.5]$.
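The equivalence between a scalar arbitration gain and scaled PD gains, and its effect on the reduced characteristic polynomial of equation (7.331), can be checked with a short calculation. The sketch below is illustrative and rests on assumptions: unit effective inertia (J_eff = 1), hypothetical error values, and a continuous-time second order model. It therefore shows only how the damping ratio and natural frequency shift with the centering gain; it does not reproduce the instability the text reports for a centering gain of 2.0, which arises in the full simulated system.

```python
import math


def pd(q_e, q_e_dot, k_q, k_w):
    """Local PD behaviour controller u = k_q*q_e + k_w*q_e_dot."""
    return k_q * q_e + k_w * q_e_dot


def scaled_pd_poles(k_centering, k_q, k_w, j_eff=1.0):
    """Damping ratio and natural frequency of J*s^2 + k_c*k_w*s + k_c*k_q."""
    omega_n = math.sqrt(k_centering * k_q / j_eff)
    zeta = k_centering * k_w / (2.0 * math.sqrt(j_eff * k_centering * k_q))
    return zeta, omega_n


k_q, k_w = 100.0, 20.0
q_e, q_e_dot = 0.3, -0.1  # hypothetical joint error and error rate
for k_c in (0.5, 1.0, 2.0):
    # Weighting the controller output equals a PD controller with scaled gains.
    assert math.isclose(k_c * pd(q_e, q_e_dot, k_q, k_w),
                        pd(q_e, q_e_dot, k_c * k_q, k_c * k_w))
    zeta, omega_n = scaled_pd_poles(k_c, k_q, k_w)
    print(f"k_centering={k_c}: zeta={zeta:.3f}, omega_n={omega_n:.3f} rad/s")
```

Under these assumptions the damping ratio scales with the square root of the arbitration gain, so a controller tuned to be critically damped at unity gain becomes underdamped at 0.5 and overdamped, with a higher bandwidth, at 2.0.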
So, even in the adaptive case, multiplicative arbitration gains will affect the performance of the individual controller as well as the combined performance of multiple goal systems.

Conversely, arbitration gains can be extracted from the existing goal generator. Just as in the PD case, the RMDAC global goal generator can be divided into six global goal generators and the Jacobian Transpose-integral coefficient products for each component elevated to an arbitration gain, or:

\[ u_{c_j} = k_{A_j} u_{A_j} \tag{7.338} \]

\[ k_{A_j} = \begin{bmatrix} J_j^T & J_j^T C(t) & J_j^T B(t) & J_j^T A(t) & J_j^T K_p(t) & J_j^T K_v(t) & k_q & k_w \end{bmatrix} \tag{7.339} \]

\[ u_{A_j} = \begin{bmatrix} d(t) & r(t) & \dot{r}(t) & \ddot{r}(t) & x_e(t) & \dot{x}_e(t) & q_e & \dot{q}_e \end{bmatrix}^T \tag{7.340} \]

Though the linear agent model could be expressed this way, doing so obscures the fact that the local and global generators have been designed as separate stable components. In other words, the behaviour controllers $u_{track}$ and $u_{centering}$ represent behaviour controller groups explicitly designed to work together to achieve a specific performance objective.

7.8 Summary

In this chapter, the design and interaction of combined behaviour systems has been explored in the context of multiagent manipulator control. These local and global behaviour controllers provide some important lessons on multiagent manipulator control in particular and multiagent system control in general. In particular, the conclusions drawn from this chapter include:

• multiagency can be made robust to failure through simple, local measures.
• multiagency accommodates global goal generation by any agent.
• two design elements ensure the emergent coordination of an agent team towards multiple goals:
  - null space self organization
  - arbitration through compliance
• global and local goal systems can produce emergent global behaviour (i.e. manipulator shaping policy) in addition to desired global behaviours.
Fail safe methods discussed earlier demonstrate that even in the absence of an agent's global goal proxy, local strategies can be devised (such as a braking maneuver) that permit arbitrary global goal seeking to continue. Kinematically, this is not surprising. Disabling one agent's capacity to seek global goals removes a single degree of freedom from the team. If the team is redundant in the global goal, global behaviour should be unaffected. Architecturally, centralized, model based schemes such as RMPC, RMRC, and RMAC are not well suited to these simple low-level solutions to link failure, principally due to their dependence on an explicit map between global and local goal spaces. Since these maps often require an accurate system model, precise fault descriptions are required to maintain acceptable global performance.

This chapter has shown that multiple goal systems are achievable for collections of independent link agents. Though investigated for some time, multiple tasks are usually incorporated into a centralized global process, disseminating primary and auxiliary tasks to each link according to centralized null space or task space augmentation models. Reexamined as a distributed, multiagent control problem, the previous chapter showed that centralized explicit manipulator control is not necessary for well posed global tasks. It has been shown in this chapter that any global goal conforming to Jacobian transpose methods may be generated and broadcast by any agent in a multiagent team. The addition of a simple PD local goal generator demonstrated not only that decentralized local and global goal systems can coexist in globally redundant systems but also that this local generator can generate useful, even optimal, performance.
Significantly, this study identified two distinct mechanisms that together provide an automatic goal arbitration mechanism for agents pursuing multiple goals: Null Space Self Organization and Arbitration through Compliance.

Null Space Self Organization

By adopting an adaptive cartesian controller, this study predicted and demonstrated that explicit identification of the null space and subsequent task insertion are unnecessary. As described in section 7.2, perfect regulation of the end effector position drives torque and displacement perturbations into the null space. This property has been used in JTC to implement mutually exclusive global goal systems. However, null space self organization has not previously been identified or exploited as a desirable byproduct of adaptive end effector task enforcement. Null space self organization ensures that local behaviours are automatically inserted into the Jacobian null space of an adaptive global goal system. Furthermore, this technique shows that compliance can be used as a self arbitration mechanism in multiagent teams.

Arbitration through Compliance

Experiments in this chapter demonstrate that arbitration can be distributed amongst behaviour controllers through the selection of an appropriate compliance strategy for each behaviour. In particular, if the 'highest priority' task is enforced through a rigid goal regulator such as an adaptive controller and 'lower priority' behaviour controllers are enforced through less rigid strategies such as nonlinear and linear control laws, then task conflicts are automatically resolved between controllers and require neither an explicit arbitration strategy within each agent nor an explicit coordination strategy between agents.
Based on the Emergent Coordination definition D4.23, agents within a redundant multiagent manipulator under rigid global control and compliant local control exhibit an emergent coordination strategy and, therefore, emergent behaviour. Taking the centering local behaviour as an example, rigid end effector tracking drives local centering behaviour into the null space of the end effector Jacobian. In so doing, this interaction permits partial though maximal satisfaction of the local goal without coordination by a centralized coordination process. In effect, the interaction between the two goal systems alone acts as a coordination policy that enables desirable local behaviour if possible - an example of self arbitrating or emergent control. Furthermore, if a team of similar agents produces a desirable, stable global behaviour without external coordination, then the team exhibits emergent global behaviour. For example, manipulator shaping policies arise from local compliant goal systems.

Chapter 8

Multiagent Control Compared

The application of multiagent control to manipulation was justified on the grounds that the improved robustness, architectural simplicity, lower cost, greater extensibility, and improved real time performance observed in other multiagent systems might be realized in manipulator control. This chapter will show that this manipulator architecture delivers these benefits and, furthermore, that these advantages are unique to multiagent control.

Multiagent manipulator control is not a single technique, but the careful reformulation of a number of fundamental techniques into a coherent control strategy. So while each component brings advantages in isolation, the combined performance is uniquely beneficial to redundant manipulator control.
The key contributing components to multiagent manipulator control are:

• Redundant Manipulator Control
• Adaptive Control
• Jacobian Transpose Control
• Decentralized Control

The relationship between these foundations and multiagent control appears in figure 8.73. Guided by the agent and multiagent structures developed in chapter 4, these components can be combined, delivering two distinct advantages over traditional, centralized manipulator model based systems: structure that is simple, robust, and extensible, and performance that is consistent, guaranteed, and independent of a manipulator's dynamic structure.

8.0.1 Multiagent vs. Centralized Control

Manipulator model based systems (i.e. those relying on knowledge of the manipulator's structure and parameters) achieve consistent guaranteed performance, but at the cost of a centralized inverse kinematic solution (e.g. a geometric inverse or Jacobian pseudoinverse). For example, in resolved motion acceleration control, a desired acceleration is computed from the pseudoinverse of a task space trajectory:

\[ \ddot{q}_d(t) = J^{\dagger}\left[\ddot{x}_d(t) - \dot{J}\dot{q}(t) + k_v \dot{x}_e + k_p x_e\right] \]

\[ u_{track} = D(q)\ddot{q}_d + C(q, \dot{q}) + g(q) \]

This centralization extends to auxiliary tasks that must be assigned to regions of configuration space through a task assignment mechanism. For example, again in RMAC:

\[ u = u_{track} + (I - J^{\dagger}J) u_{auxiliary} \tag{8.341} \]

where auxiliary tasks, $u_{auxiliary}$, are inserted into the Jacobian null space by the $(I - J^{\dagger}J)$ operator.

By combining Adaptive Task Space Control with a decentralized Jacobian Transpose Controller, an estimate of a tracking task model is maintained rather than a manipulator's structure and parameters. Alone this reduces computing costs from an at least $O(N^2)$ pseudoinversion and $O(N)$ dynamic model to simply $O(MN)$. Furthermore, by distributing the inverse solution amongst a set of link agents, computational costs are further reduced from $O(MN)$ to $O(M)$.
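For comparison with the decentralized scheme, the centralized null space insertion of equation (8.341) can be illustrated for the simplest possible case. The sketch below assumes a single-row Jacobian (one task dimension, M = 1), for which the pseudoinverse has the closed form J^T / (J J^T); the function name and all numerical values are hypothetical. A general implementation would need a full matrix pseudoinverse, which is the centralized cost the text refers to.

```python
def null_space_project(jacobian_row, u_aux):
    """Apply (I - J^+ J) to u_aux for a single-row Jacobian J (1 x N).

    For a nonzero row vector, J^+ = J^T / (J J^T), so
    (I - J^+ J) u = u - J^T (J u) / (J J^T).
    The pseudoinverse of a zero row is zero, leaving u unchanged.
    """
    jj_t = sum(j * j for j in jacobian_row)
    if jj_t == 0.0:
        return list(u_aux)
    j_u = sum(j * u for j, u in zip(jacobian_row, u_aux))
    return [u - j * j_u / jj_t for j, u in zip(jacobian_row, u_aux)]


# Hypothetical three joint example with one task dimension.
jacobian_row = [1.0, 2.0, 2.0]
u_track = [0.5, 0.0, -0.5]
u_auxiliary = [3.0, -1.0, 4.0]

projected = null_space_project(jacobian_row, u_auxiliary)
u = [t + p for t, p in zip(u_track, projected)]

# J applied to the projected auxiliary command vanishes.
residual = sum(j * p for j, p in zip(jacobian_row, projected))
```

The projection requires the whole Jacobian row and the whole auxiliary command vector in one place, in contrast to the multiagent scheme, where each link agent adds its local behaviour without any explicit projection step.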
In addition, adaptive task space control ensures disturbances migrate to the Jacobian null space. This means multiple tasks can coexist within each agent, providing robust control alternatives to each link without the necessity of a centralized task assignment protocol.

Finally, multiagent systems permit the removal, replacement, or addition of whole link agents at any time, on or off line, without the costly software changes commonly required for traditional centralized control architectures. This has important implications for task or mission planning, in which serial manipulators can be simply and easily "daisy chained" or "split" in real time, physically combining or dividing manipulators respectively.

Together these features of multiagent manipulator control provide performance comparable to the compute intensive, model based, centralized systems but with a substantially simpler structure, potentially lower implementation and maintenance cost, flexibility, and robustness to change or failure.

8.0.2 Multiagent Manipulator Control and the Multiagent Context

Though comparison of multiagent systems is difficult, the multiagent architecture presented here is unusual in that task assignment through explicit interagent bidding, negotiation, or competition, common to other systems, is unnecessary. Rather, 'negotiation' emerges through the dynamics of both adaptive and compliant goal systems in a manner similar to Reynolds' Boids. This is in contrast to Parker's Troops and Mataric's Nerd Herd, which employ explicit negotiation protocols reminiscent of Smith's Contract Net Protocol [Smith, 1980].

Figure 8.73: Convergence of techniques that form Multiagent Manipulator Control.

8.1 Demonstrating The Advantages

To clearly demonstrate the advantages that multiagent manipulator control offers over traditional centralized model based control, the performance and structure must be compared.
In the first experiment, traditional RMAC will be used to perform end effector trajectory tracking and joint centering tasks. This result should demonstrate the performance possible from a centralized model based end effector and task assignment system. In the second test, the task assignment system is removed and the joint centering tasks decentralized to each link. This result should demonstrate the structural sensitivity of RMAC to manipulator modelling errors and decentralized auxiliary controllers, both of which are unmodelled constraints within RMAC.

8.1.1 Performance

In the following experiment, the independent variable will be the required performance: simultaneous tracking of a desired end effector trajectory and joint displacement minimization. The dependent variable will be the structure required to achieve the desired behaviour, one the centralized model based RMAC, and the other the decentralized manipulator model-free Multiagent Manipulator Control system.

In resolved motion acceleration control the typical structures required to achieve these tasks are:

\[ \ddot{q}_d(t) = J^{\dagger}\left[\ddot{x}_d(t) - \dot{J}\dot{q}(t) + k_v \dot{x}_e + k_p x_e\right] \tag{8.342} \]

\[ u_{track} = D(q)\ddot{q}_d + C(q, \dot{q}) + g(q) \tag{8.343} \]

\[ u_{centering} = K_q q_e + K_w \dot{q}_e \tag{8.344} \]

\[ u = u_{track} + (I - J^{\dagger}J) u_{centering} \tag{8.345} \]

and are very similar to [Slotine, 1988]. Note the estimate of the manipulator's parameters and the frequent use of the Jacobian pseudoinverse, both centralized models of the manipulator's kinematics and dynamics.

Adaptive task space control and decentralization make multiagent control's structure substantially simpler than the RMAC equivalent:

\[ f(t) = d(t) + C(t) r(t) + B(t) \dot{r}(t) + A(t) \ddot{r}(t) + K_p(t) x_e(t) + K_v(t) \dot{x}_e(t) \tag{8.346} \]

\[ u_{track} = J_j^T(p_n, x_{j-1}) f_d \tag{8.347} \]

\[ u_{centering} = k_q q_{e_k} + k_w \dot{q}_{e_k} \tag{8.348} \]

\[ u_{c_j} = k_{A_j} u_{A_j} = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} u_{track} \\ u_{centering} \end{bmatrix} \tag{8.349} \]

This latter structure is the familiar link agent described in chapter 7 with both tracking and centering tasks.
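The per-agent Jacobian transpose row of equation (8.347) can be sketched for a planar chain of revolute joints. This is a minimal illustration under stated assumptions: the manipulator is planar, the global goal is a force and moment triple (fx, fy, mz), and each agent is given only the end effector position p_n and its own joint origin x_{j-1}. The function names and numerical values are hypothetical, and a finite difference Jacobian is included purely as an independent check.

```python
import math


def forward_positions(link_lengths, joint_angles):
    """Joint origins x_0 .. x_{n-1} and end effector position p_n (planar chain)."""
    x, y, theta = 0.0, 0.0, 0.0
    origins = []
    for length, q in zip(link_lengths, joint_angles):
        origins.append((x, y))
        theta += q
        x += length * math.cos(theta)
        y += length * math.sin(theta)
    return origins, (x, y)


def agent_tracking_torque(p_n, x_prev, f_d):
    """One agent's Jacobian transpose row applied to f_d = (fx, fy, mz)."""
    rx, ry = p_n[0] - x_prev[0], p_n[1] - x_prev[1]
    fx, fy, mz = f_d
    return rx * fy - ry * fx + mz


def end_effector_pose(link_lengths, joint_angles):
    """End effector pose (x, y, orientation) used by the centralized check."""
    _, (x, y) = forward_positions(link_lengths, joint_angles)
    return (x, y, sum(joint_angles))


def finite_difference_jt_f(link_lengths, joint_angles, f_d, h=1e-6):
    """Centralized check: J^T f_d with J estimated column by column."""
    base = end_effector_pose(link_lengths, joint_angles)
    torques = []
    for j in range(len(joint_angles)):
        perturbed = list(joint_angles)
        perturbed[j] += h
        pose = end_effector_pose(link_lengths, perturbed)
        column = [(p - b) / h for p, b in zip(pose, base)]
        torques.append(sum(c * f for c, f in zip(column, f_d)))
    return torques


link_lengths = [1.0, 1.0, 1.0, 1.0]
joint_angles = [0.3, -0.5, 0.8, 0.2]
f_d = (2.0, -1.0, 0.5)

origins, p_n = forward_positions(link_lengths, joint_angles)
distributed = [agent_tracking_torque(p_n, x_prev, f_d) for x_prev in origins]
centralized = finite_difference_jt_f(link_lengths, joint_angles, f_d)
```

Each entry of `distributed` uses only two points and the broadcast goal, so the work per agent does not grow with the number of links, whereas the centralized check rebuilds the whole Jacobian.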
Again, note that there is no manipulator parameter description, only a simple Newtonian dynamic model within the adaptive end effector goal generator. Furthermore, computation of the cartesian states needed for the Jacobian transpose row is distributed over N agents. At no point in the multiagent system is it necessary to maintain a complete centralized representation of the manipulator geometry or physical parameters.

Simulation

Though the tracking and displacement minimization performance of the two systems is comparable, the computational cost is clearly less in the multiagent system. Comparing figure 8.74 with figure 7.52 in chapter 7 and the results in table 8.19, it is clear that these two structures exhibit similar end effector tracking and displacement minimization performance. However, RMAC is substantially more compute intensive, with both a feedforward dynamic model and a pseudoinverse Jacobian computation. Since the pseudoinverse structures cannot be decentralized over N joint processes, RMAC must be centralized on a single powerful CPU.

8.1.2 Structure

As remarked above, the structure of the multiagent system is considerably simpler than that of centralized RMAC. Suppose, however, that RMAC were treated as a global goal generating process like RMDAC. How would this centralized system perform in the computationally simpler agent-like structure?

\[ \ddot{q}_d(t) = J^{\dagger}\left[\ddot{x}_d(t) - \dot{J}\dot{q}(t) + k_v \dot{x}_e + k_p x_e\right] \tag{8.350} \]

\[ u_{track} = D(q)\ddot{q}_d + C(q, \dot{q}) + g(q) \tag{8.351} \]

\[ u_{centering} = k_q q_{e_k} + k_w \dot{q}_{e_k} \tag{8.352} \]

\[ u_{c_j} = k_{A_j} u_{A_j} = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} u_{track} \\ u_{centering} \end{bmatrix} \tag{8.353} \]

Figure 8.74: End effector trajectory for the reference manipulator and payload under centralized RMAC end effector and joint centering task assigned to the null space.
Simulation

In this experiment, the independent variable will be the structure: a distributed multiagent environment and decentralized control structure. The dependent variable will be performance: simultaneous tracking of a desired end effector trajectory and joint displacement minimization. This experiment may be viewed in two ways. On the one hand, it examines the performance of RMAC within a structure identical to multiagent control. On the other hand, it examines the performance of RMAC given an imperfect manipulator model (i.e. one in which joint controllers are biased by a linear PD controller).

Structurally, these two systems are now comparable. The computational cost of the agent-like RMAC is less than in the original RMAC system and local behaviours may act without centralized coordination. With these changes, it is clear from figure 8.75 and table 8.19 that RMAC cannot match the performance of the multiagent system. Without centralized task assignment, the RMAC system exhibits considerably greater tracking error and somewhat poorer joint displacement minimization than RMAC with null space task assignment. Interestingly, the multiagent control system is an order of magnitude more precise in trajectory tracking but exhibits poorer joint minimization performance than both RMAC cases.

These results must be put in perspective. The performance of the RMAC system without task assignment must be a compromise between end effector tracking and joint centering. Given the large local gains, end effector tracking performance suffers substantially, while centering performs well (indeed, very near RMAC with task assignment). In this version of Multiagent control, the adaptive end effector task enforces the highest priority and, therefore, exhibits small end effector errors.
Unlike RMAC, RMDAC cannot compensate for manipulator dynamics in the Jacobian null space, producing greater motion in N(J) and larger RMS error over the joint displacement history. Ultimately, however, figure 8.75 shows that without task assignment RMAC fails to provide acceptable steady state end effector performance. Since RMAC is model based, arbitrary insertion of auxiliary tasks without altering the centralized model through a null space task insertion operator can only degrade end effector performance. In other words, model based systems rely on a complete description of all active constraints to achieve satisfactory control. To maintain this performance with added local behaviours requires that RMAC be altered through a task assignment mechanism. The ability to arbitrarily add local behaviours without alterations to a centralized process, combined with the lower computational costs of decentralized control, suggests that the multiagent system is more flexible, extensible and robust than the model based counterpart.

Table 8.19: RMAC and Multiagent Control RMS end effector and joint centering performance.

Strategy                        $x_{rms}$    $\phi_{rms}$   $q_{rms}$
RMAC with task assignment       1.257e-02    1.109e-02      1.908e+00
RMAC without task assignment    5.612e-02    7.756e-02      1.942e+00
Multiagent Control              5.963e-03    1.424e-02      2.700e+00

Clearly there are advantages to model based control if an accurate model is available. However, in a dynamic environment a static model is of little advantage and the costs associated with maintaining accurate dynamic models grow in proportion to the environment. This comparison also provides an avenue to understanding the stability properties of a multiagent system with both local and global goals.
In the next section stability will be discussed by first examining the localized stability properties of the combined 'compliant' goal system: RMAC (a task space PD controller) and 'compliant' joint centering local goals (joint space PD controllers). These arguments will clarify the properties of the 'rigid' global RMDAC (an adaptive task space controller) and 'compliant' joint centering goal system.

8.2 Stability

Qualitatively, some useful observations can be made about the combined performance of global and local goals by first considering the combination of compliant global and compliant local goals. If modelled as a pair of spring-damper systems, compliant local and global goals should exchange potential and kinetic energy, ultimately settling into an equilibrium offset from both local and global desired trajectories. Recall the case above of an adaptive global goal simulating a perfectly rigid (albeit moving) task space constraint. Acting in combination with local compliant constraints, the adaptive controller corrected any end effector trajectory errors by adaptively changing the force setpoint. Since force setpoints are propagated to the agent team through the global goal, local control efforts that perturb the end effector position are, in effect, "reflected" off the global goal to the entire agent team.

Though the RMDAC global goal generator has been designed from the outset to be asymptotically stable, a global stability proof for a combined compliant local/RMDAC global system is difficult. Nevertheless, a valuable starting point is to examine the stability of the combined locally and globally compliant system near equilibrium. With this simple analysis, the coupling mechanisms between the global and local
domains can be clearly observed and demonstrated. Furthermore, by extending these linear arguments the performance of locally compliant and globally adaptive systems can be clarified.

Figure 8.75: End effector trajectory for the reference manipulator and payload under centralized RMAC end effector and decentralized joint centering auxiliary tasks ($K_q = \mathrm{diag}(100.0)$, $K_w = \mathrm{diag}(20.0)$).

8.2.1 Combining Locally and Globally Compliant Goal Systems

A compliant global goal generator for task space manipulator control is best exemplified through a system such as Khatib's OSF, a cartesian computed torque controller. Task space computed torque exploits a task space feedforward model in the construction of the required force vector $f_d$:

$$f_d = \hat{M}(x)f_m^* + \hat{N}(x,\dot{x}) + \hat{p}(x) \tag{8.354}$$

where the hat superscript indicates parameter estimates. If $f_m^*$ is defined as the acceleration necessary to move a unit mass, identity matrix $I_m$, along a prescribed trajectory,

$$f_m^* = I_m\ddot{x}_d + K_p(x_d - x) + K_d(\dot{x}_d - \dot{x}) \tag{8.355}$$

Including the local controllers, the Robot Equation becomes:

$$D(q)\ddot{q} + C(q,\dot{q}) + g(q) = K_q[q_d - q] + K_w[\dot{q}_d - \dot{q}] + J^T\left[\hat{M}(x)f_m^* + \hat{N}(x,\dot{x}) + \hat{p}(x)\right] \tag{8.356}$$

where

$$M(x) = J^{-T}(q)D(q)J^{-1}(q) \tag{8.357}$$
$$N(x,\dot{x}) = J^{-T}C(q,\dot{q}) - M(x)\dot{J}(q)\dot{q} \tag{8.358}$$
$$p(x) = J^{-T}g(q) \tag{8.359}$$

Now suppose $\hat{M}(x) = M(x)$, $\hat{N}(x,\dot{x}) = N(x,\dot{x})$, and $\hat{p}(x) = p(x)$. The simplified equation becomes:

$$K_q q_e + K_w\dot{q}_e + J^T M(x)\left[\ddot{x}_e + K_p x_e + K_d\dot{x}_e\right] = 0 \tag{8.360}$$

where $x_e = x_d - x$, $q_e = q_d - q$ and, in general, $x_d \neq f(q_d)$. This combined error system provides insight into the performance of combined goal operations. In particular, equation (8.360) indicates that if the parameter estimates are exact, the cartesian system will asymptotically converge to $x_e = 0$ if and only if the first two terms vanish. This is true under two possible conditions:

1. $K_q q_e + K_w\dot{q}_e = 0$.
2.
$K_q q_e + K_w\dot{q}_e \subset N(J)$, the local goals reside in the null space of the Jacobian.

The first case is trivial, implying that either the setpoint error has vanished, position and velocity terms are equal and opposite, or the PD gains are precisely zero. The second possibility is that local and global systems reside in complementary regions of configuration space. If neither condition is true, then both systems reside in overlapping subspaces of R(J) and, though they may achieve a stable equilibrium, neither $q_e$ nor $x_e$ will converge to zero asymptotically.

Local Stability

Though equation (8.360) is a concise statement of the system's error dynamics, it does not completely show the coupling between global and local goal systems. By rewriting this equation entirely in task space a better understanding of these two systems can be established. Recalling that $M = J^{\dagger T}DJ^{\dagger}$, where $J^{\dagger}$ is the pseudoinverse:

$$JD^{-1}K_q q_e + JD^{-1}K_w\dot{q}_e + \ddot{x}_e + K_p x_e + K_d\dot{x}_e = 0 \tag{8.361}$$

Since the term $\dot{q}_e$ produces velocities at the end effector:

$$\dot{x}_{qe} = J(q)\dot{q}_e \tag{8.362}$$

where $\dot{x}_{qe}$ is the end effector velocity error induced by $\dot{q}_e$. Inverting this relation, the equation can be further rewritten:

$$JD^{-1}K_q q_e + JD^{-1}K_w J^{\dagger}\dot{x}_{qe} + \ddot{x}_e + K_p x_e + K_d\dot{x}_e = 0 \tag{8.363}$$

Now assuming the local displacement error $q_e$ is small, the forward solution may be linearized about $q$:

$$x_{qe} = J(q)q_e \tag{8.364}$$

A task space error expression can now be introduced: $x_{qe} = x_{qd} - x$, where $x_{qe}$ is the displacement error between the end effector position at $q$ and $q_d$. Rewriting (8.363):

$$JD^{-1}K_q J^{\dagger}x_{qe} + JD^{-1}K_w J^{\dagger}\dot{x}_{qe} + \ddot{x}_e + K_p x_e + K_d\dot{x}_e = 0 \tag{8.365}$$

Now, referring to figure 8.76, $x_{qe}$ and $x_e$ can be related through:

$$e = x_d - x_{qd} \tag{8.366}$$
$$x_{qe} = x_{qd} - x = x_e - e \tag{8.367}$$

Figure 8.76: About the end effector position $x(t)$, the desired global goal $x_d$ is related to the desired local goal $x_{qd}$ through the vectors $x_e$, $x_{qe}$ and $e$.

Substituting for $x_{qe}$ and rewriting equation (8.365):

$$JD^{-1}K_q J^{\dagger}(x_e - e) + JD^{-1}K_w J^{\dagger}(\dot{x}_e - \dot{e}) + \ddot{x}_e + K_p x_e + K_d\dot{x}_e = 0 \tag{8.368}$$

Applying the Laplace Transform:

$$X_e(s) = \Delta(s)^{-1}\left[JD^{-1}K_q J^{\dagger} + JD^{-1}K_w J^{\dagger}s\right]E(s) \tag{8.369}$$
$$\Delta(s) = s^2 I + \left[K_d + JD^{-1}K_w J^{\dagger}\right]s + \left[K_p + JD^{-1}K_q J^{\dagger}\right] \tag{8.370}$$

where the characteristic equation of the combined system is $\Delta(s)$. Through this simple linearized analysis, it is clear that local and global goal systems are closely coupled and that the performance of the combined system is a hybrid of both systems. In particular, task space damping and stiffness coefficients are clearly configuration dependent with the introduction of the local goal system.

If step inputs, $X_d$ and $Q_d$ (or simply $E(s)$), are applied to the cartesian and the local controllers respectively, the steady state error may be shown to be the equilibrium position between the two PD systems. Applying the final value theorem and recalling that $X_e(s)$ is the end effector position error vector:

$$x_{e_{ss}} = \lim_{s\to 0} sX_e(s) = \lim_{s\to 0} s\,\Delta(s)^{-1}\left[JD^{-1}K_q J^{\dagger} + JD^{-1}K_w J^{\dagger}s\right]\frac{E}{s} = \left[K_p + JD^{-1}K_q J^{\dagger}\right]^{-1}\left[JD^{-1}K_q J^{\dagger}E\right] \tag{8.371}$$

Thus the intuitive conclusion that local goals residing in $R(J^T)$ cause steady state end effector offsets is borne out. As the 'stiffness' of the local goals, $K_q$, increases, the cartesian steady state error, too, will increase. Clearly a system possessing both local and global compliant goals will experience steady state errors in both local and global domains if the local goals reside in $R(J^T)$.

Through further manipulation, the cartesian steady state error can be transformed into an expression of equilibrium between local and global systems. Rearranging equation (8.371) and substituting equation (8.367) produces:

$$K_p x_{e_{ss}} = JD^{-1}K_q J^{\dagger}(E - x_{e_{ss}}) \tag{8.372}$$
$$\phantom{K_p x_{e_{ss}}} = -JD^{-1}K_q J^{\dagger}x_{qe_{ss}} \tag{8.373}$$
Thus the steady state error is an equilibrium between the local and global systems:

$$J^T M K_p x_{e_{ss}} = -K_q q_{e_{ss}} \tag{8.374}$$

Of course, if the parameter estimates are not exact:

$$\delta_D(q)\ddot{q} + \delta_C(q,\dot{q}) + \delta_G(q) - K_q q_e - K_w\dot{q}_e = J^T\hat{M}(x)\left[\ddot{x}_e + K_p x_e + K_d\dot{x}_e\right] \tag{8.375}$$

the two linear systems will be driven by the nonlinear parameter errors defined as:

$$\delta_D(q) = D(q) - \hat{D}(q) \tag{8.376}$$
$$\delta_C(q,\dot{q}) = C(q,\dot{q}) - \hat{C}(q,\dot{q}) \tag{8.377}$$
$$\delta_G(q) = g(q) - \hat{g}(q) \tag{8.378}$$

Since gravitational forces are bounded [Lewis, 1993] and time invariant, parameter estimation errors reduce to errors in gravity compensation, $|\delta_G(q)| < \delta_0$, in the steady state. Such errors contribute to the steady state offset:

$$x_{e_{ss}} = \left[K_p + JD^{-1}K_q J^{\dagger}\right]^{-1}\left[JD^{-1}K_q J^{\dagger}E + JD^{-1}\delta_G(q)\right] \tag{8.379}$$

It is important to reiterate that steady state offsets will arise only from local behaviours in R(J). However, in section 7.2 it was shown that local behaviours migrate to N(J) if $J^{\dagger}J \neq I$, an intrinsic condition of redundant manipulation. By overconstraining the system with both local and global goals, the combined performance of global and local goals can be demonstrated. Recalling that RMAC and OSF are functionally identical compliant goal systems, the results of the structural comparison in figure 8.75 show that, as predicted, a manipulator driven by a compliant goal system experiences steady state position and orientation offsets throughout the end effector trajectory if overconstrained by a compliant local goal system.

While it is true that end effector control of underconstrained serial manipulators ensures local behaviours migrate to N(J), it is also true that homogeneous teams of agents might inadvertently overconstrain the system with local goal activity. If this is so, the team should not have to rely on an omniscient explicit coordination system to arbitrate between local and global systems.
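The steady state result of equation (8.371) can be checked numerically in the simplest possible setting. The sketch below is a scalar illustration only: it assumes a unit point mass with $J = 1$ and $D = 1$, a global PD goal at `x_d`, and a local PD goal at `x_q`, so that the predicted offset reduces to $k_q(x_d - x_q)/(K_p + k_q)$. The function names, gains, and integration settings are assumptions chosen for illustration.

```python
def compliant_steady_state_error(x_d, x_q, K_p, K_d, k_q, k_w,
                                 dt=1e-3, t_end=10.0):
    """Integrates a unit mass driven by a global PD goal (setpoint x_d) and a
    local PD goal (setpoint x_q); returns the final global error x_d - x."""
    x, v = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        u_global = K_p * (x_d - x) - K_d * v   # compliant global goal
        u_local = k_q * (x_q - x) - k_w * v    # compliant local goal
        v += (u_global + u_local) * dt         # unit mass: acceleration = force
        x += v * dt
    return x_d - x

def predicted_offset(x_d, x_q, K_p, k_q):
    """Scalar form of equation (8.371) with J = 1, D = 1 and E = x_d - x_q."""
    return k_q * (x_d - x_q) / (K_p + k_q)

soft = compliant_steady_state_error(1.0, 0.0, K_p=100.0, K_d=20.0, k_q=100.0, k_w=20.0)
stiff = compliant_steady_state_error(1.0, 0.0, K_p=100.0, K_d=20.0, k_q=400.0, k_w=20.0)
```

In this toy case the simulated offset should match the prediction, and raising the local stiffness `k_q` should increase the global error, in line with the remark that the cartesian steady state error grows with the stiffness of the local goals.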
Yet, both steady state analysis and simulation show that centralized compliant methods (such as RMAC or OSF) require centralized coordination (e.g. equation (8.341)) to achieve convergence for both local and global goal systems under overconstraint. Adaptive task space controllers (such as RMFC and RMDAC), by contrast, are designed to overcome end effector errors and, as a consequence of the Null Space Self Organization theorem, drive local behaviours into N(J) without explicit coordination.

8.2.2 Combining Globally Adaptive and Locally Compliant Goal Systems

RMDAC was designed to ensure asymptotically stable trajectory tracking without an explicit manipulator model by adopting a model reference adaptive control strategy in cartesian space. The adaptive controller essentially adjusts task space estimates of the robot's parameters to match the performance of an idealized linear tracking process. So, while RMDAC does establish a dynamic model, it is independent of a dynamic description of the robot. Including the local PD controllers and the RMDAC global controller, the Robot Equation becomes:

$$D(q)\ddot{q}(t) + C(q,\dot{q})(t) + g(q)(t) = K_q q_e(t) + K_w\dot{q}_e(t) + J^T f_d(t)$$
$$f_d(t) = d(t) + C(t)x_d(t) + B(t)\dot{x}_d(t) + A(t)\ddot{x}_d(t) + K_p(t)x_e(t) + K_v(t)\dot{x}_e(t) \tag{8.380}$$

Recall that RMDAC is the product of an MRAC Lyapunov design technique [Seraji, 1989a], converging not to a manipulator model but to an idealized trajectory tracking process. Therefore, in general:

$$A(t) \neq M(x), \qquad B(t) \neq N(x,\dot{x}), \qquad C(t) \neq g(x)$$

So while parameter convergence cannot, in general, be guaranteed, asymptotic convergence of the end effector with the desired trajectory is guaranteed under the assumption of slowly varying manipulator parameters. Therefore, so long as the slow variation assumption is maintained, local goals acting in $R(J^T)$ should appear as slow parameter changes and RMDAC should remain stable about the desired end effector trajectory.
Assuming this is true, what performance can be expected from the combination of local and global control?

First consider, again, the combined response of locally and globally compliant controllers. If N(J) = 0 (the robot is not redundant), the motion produced by an additional task in $R(J^T)$ inevitably disturbs the end effector trajectory by some offset (as in figure 8.75). Hence the necessity of explicitly inserting auxiliary tasks into N(J), as in equation (3.70). On the other hand, if N(J) ≠ 0 (the robot is redundant), the resulting motion is, ultimately, forced into N(J) by the compliant control forces, producing no end effector disturbance.

Now consider the combined response of locally compliant and globally adaptive controllers. Since any disturbance at the end effector trajectory is adaptively removed, any local task (or portion thereof) in $R(J^T)$ is removed if N(J) = 0. On the other hand, if N(J) ≠ 0 (the robot is redundant), a local task (or portion thereof) is, again, forced into N(J) by adaptive compensation. Furthermore, since compensations in global behaviour are transmitted to the entire agent team, adaptive global controllers can be interpreted as a natural goal assertion and broadcast mechanism. When local controllers act in R(J) and perturb the end effector, the compensation process 'reflects' or broadcasts this perturbation to the agent team. Thus even the end effector, fully constrained in task space by the end effector adaptive controller, can affect manipulator configuration simply by modulating global goal seeking with local activity.

8.3 Summary

Multiagent control offers comparable trajectory tracking performance to traditional systems at substantially lower computational cost, equal or better robustness to failure and parameter changes, and arbitrary extensibility thanks to a parallel agent computational model, carefully selected global and local controllers, and the innate properties of redundant manipulation.
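The contrast drawn in section 8.2 between compliant and adaptive global goals acting against a compliant local goal can be sketched in a scalar toy problem. The code below is not the RMDAC law of [Seraji, 1989a]: it assumes a unit point mass and stands in for adaptation with a single integral update of the auxiliary signal $d(t)$, using a hypothetical gain `gamma`. All names and numerical values are illustrative assumptions.

```python
def final_tracking_error(adaptive, x_d=1.0, x_q=0.0, K_p=100.0, K_v=20.0,
                         k_q=100.0, k_w=20.0, gamma=200.0,
                         dt=1e-3, t_end=20.0):
    """Unit mass under a global goal f = d + K_p*e + K_v*e_dot and a local PD
    goal centred on x_q. When `adaptive` is True the auxiliary signal d is
    adjusted by integrating the tracking error (a simplified stand-in for
    adaptation); otherwise d stays at zero and the global goal is a plain PD."""
    x, v, d = 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        e = x_d - x
        e_dot = -v                        # constant setpoint, so e_dot = -x_dot
        if adaptive:
            d += gamma * e * dt           # assumed integral adaptation of d(t)
        f_global = d + K_p * e + K_v * e_dot
        u_local = k_q * (x_q - x) - k_w * v
        v += (f_global + u_local) * dt    # unit mass: acceleration = force
        x += v * dt
    return x_d - x

compliant_error = final_tracking_error(adaptive=False)
adaptive_error = final_tracking_error(adaptive=True)
```

Under these assumptions the compliant pair settles at an offset between the two setpoints, while the adapting global goal absorbs the local effort as if it were a slowly varying disturbance and recovers the end effector setpoint. With a single degree of freedom there is no null space, so this corresponds to the N(J) = 0 case, in which the local task is simply overridden.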
Chapter 9

Conclusions, Contributions, and Recommendations

9.1 Conclusions

In summary, this thesis provided control analysis and simulation of an N d.o.f. redundant manipulator executing multiple tasks using N independent agent processes without a centralized manipulator parameter model or auxiliary task distribution mechanism. The control strategy combined an existing adaptive task space controller [Seraji, 1987b], a new decentralized Jacobian Transpose Control (JTC) [Monckton, 1995] method, and a carefully developed agent architecture to autonomously control each joint. In proving that task space adaptive control drives auxiliary tasks into the Jacobian null space, this thesis developed a decentralized manipulation system permitting link agents to act on multiple local and global goals without manipulator model based setpoint derivation, auxiliary task assignment to the Jacobian null space [Nakamura, 1984], or task space augmentation. Thus each link process became an agent within a multiagent manipulation system that exhibited self organizing task assignment, priority through compliance, robustness to failure, and computational parallelism, independent of team size, composition, or degree of redundancy.

9.1.1 Caveats

Though the feasibility and benefits of multiagent manipulator control were shown in both simulation and analysis, this controller has notable weaknesses that may compromise real world performance. Through Jacobian decentralization, multiagent manipulator control clearly depends on a distributed, reliable communications bus or shared memory structure. The failure or underperformance of this communications structure would lead to catastrophic failure of the control system as currently described. Experimentation was restricted by the simulator's limited ability to model realistic manipulator and obstacle characteristics.
For example, in adopting rigid link models, the effects of elastic deformation and collision were excluded and their impact on local and global goal systems remains unknown. Similarly, sensing and communications were idealized, ignoring sensor failure and/or noise along with unreliable communications (a significant feature of implemented systems such as Mataric's Nerd Herd), all of which have well known deleterious effects on the practicality and, often, stability of control systems.

Furthermore, the behaviours demonstrated here, while instructive, possess either limited or undetermined potential application. As implemented here, the obstacle avoidance simulation, for example, does not account for the manipulator itself (a common, if surprising, feature typical of such simulations). Though local PD behaviour controllers proved useful in revealing the interplay between goal systems, they have undetermined practical application.

Despite these caveats, this thesis provides fundamental contributions to the practice of multiagent control and manipulation that deserve review.

9.2 Contributions

Building on the considerable work of other investigators, this thesis contributes to both theory and practice of manipulation robotics and multiagent control. In particular:

Multiagent Manipulator Control. A manipulator was controlled through a set of autonomous processes acting on a common force trajectory goal. The first of its kind, a multiprocess simulator simulated an agent team able to control a manipulator's end effector trajectory without centralized geometric or integral inverse kinematic solutions or a centralized manipulator parameter model. This parameter-free decentralization permitted the design of multiagent serial manipulation teams with constant time response regardless of team size or geometry.
Multiple Goal Interaction. Traditionally manipulator redundancy is resolved through pseudoinverse or task space augmentation [Nakamura, 1984, Seraji, 1989b] methods that insert auxiliary tasks into the primary task's Jacobian null space, thus guaranteeing multiple objectives. Through simultaneous action of both local compliant and adaptive global goals this study identified a number of significant properties and techniques:

Null Space Self Organization. Drawing on a simple argument, a Null Space Self Organization Theorem was developed that proves rigid end effector control and explicit null space task assignment operators are functionally identical. This means that model based centralized task assignment operators (e.g. $(I - J^{\dagger}J)$) are unnecessary in redundant multitask systems under adaptive task space control.

Compliance as Priority. If rigid global and compliant local goals occupy configuration subspace $R(J^T)$, global goals dominate and local goal activity is limited to the Jacobian null space. By assigning task priority through 'stiffness' or rigidity of auxiliary goal expressions, multiple tasks executed in overlapping domains can be biased according to priority.

Emergent Coordination. Agents acting in the global goal's Jacobian range space were shown to disturb the global goal. In broadcasting corrections to these disturbances, global goal adaptation reflects local goal activity to the agent team. So while emergent global behaviour is not specified by a particular global goal expression, it nevertheless arises from interaction between global and local constraints.

Cartesian Decentralization. Traditional inverse kinematic solutions (an inverse aggregate of end effector behaviour) uniformly require a complete manipulator description to relate end effector position to joint displacements.
However, in this thesis, a simple, practical, distributed inversion strategy was developed out of the class of Jacobian Transpose Control. With cartesian sensing (or distributed Newton Euler kinematic computation), N processes can control N degree of freedom manipulators through the observation of a desired force trajectory goal.

Behaviours. A number of new redundant manipulator null space configuration strategies were demonstrated. Arising from simple local and global goal systems, complex organized manipulator behaviours were generated with little additional computational effort. In particular:

• Analysis and simulation demonstrated that multiagent control is distinct from traditional centralized control. By overconstraining a redundant manipulator with an RMAC end effector goal and N PD joint centering local goals, it was shown that without explicit configuration space assignment of the centering goals, the RMAC end effector goal was disturbed.
• Analysis and simulation demonstrated that overconstraining a redundant manipulator with an RMDAC end effector goal and N PD joint centering local controllers produced both near optimal joint displacement minimization, without explicit configuration space assignment of the local goals, and convergent end effector tracking performance.
• It was shown in simulation that by varying either setpoints or gains alone, local PD goal generators could produce qualitatively predictable, global manipulator shaping behaviours in conjunction with global end effector trajectory tracking without any centralized manipulator parameter model.
• In simulation it was shown that, given a redundant manipulator under an RMDAC end effector goal, N agents could each broadcast a nonlinear global obstacle avoidance goal to the agent team to successfully avoid obstacles in the work volume without disturbing the end effector trajectory.
Agent and Multiagent Theoretical Foundations. In placing agent and multiagent structures within a control theoretic context, a simple agent control architecture was shown to be composed of a set of task specific behaviour controllers and an arbitrator that combined the output of these controllers to act on a plant. The behaviour of the agent was defined as the response of the plant to the arbitrated control efforts. Similarly, a logical development of the multiagent model revealed basic structures and techniques that clarify the multiagent design problem:

Aggregate Relations. An aggregate relation was identified that maps local to global behaviour. In doing so, this thesis recasts the problem of multiagent control as one of aggregate inversion and agent coordination.

Feasibility. Samson's feasibility criteria (an application of the inverse function theorem) applied to a multiagent system's aggregate relation establish the existence of the inverse aggregate and classify these solutions into one-to-one, one-to-many, or insoluble.

Coordination. Explicit coordination inverts the aggregate using a one-to-one map between global and local behaviour, clearly a model based process. Implicit coordination inverts the aggregate through a one-to-many projection. In manipulator control, it was shown that manipulator parameter-free, implicit coordination requires an adaptive global goal system.
Emergent Coordination was shown to generate global behaviour as a byproduct of constraint resolution and, therefore, without a global goal description.

Linearized Stability Analysis. Applied to both aggregate and inverse aggregates, linearized stability analysis can be used to explore local stability criteria. In particular, it was shown that even with a nonlinear manipulation system, the stability (e.g. the steady state condition) of multiple goal systems could be determined locally through simple linearization.

In short, control theoretic expressions of agent and multiagent architectures represent an important first step in the practice of multiagent control design and clarify critical issues of structure, performance, and stability of multiagent systems.

9.3 Recommendations

Though performing comparably to traditional centralized control architectures without centralized computation or coordination, the study of multiagent manipulator control remains immature. In particular, time delay, sensor noise, and flexible links or actuators are fundamentally unexplored issues and may present significant barriers to the adoption of this architecture in some applications.

While this thesis improves the understanding of agent and multiagent systems, there remain many outstanding issues that deserve further investigation:

• Goal design methods. Though partially explored here, the goal design process deserves more attention. For example: adaptation about a desired global behaviour clearly imposes order on the multiagent team.
What other adaptive models exist for manipulation? How broadly applicable is this adaptive model to generic trajectory tracking tasks (e.g. mobile robotics)? How can local adaptive generators coexist with global adaptive systems? These are a few of many questions requiring attention in global and local goal design.
• Arbitrator design. It was shown here that arbitration influences team stability. The converse must also be true: stability can govern arbitration design. It is conceivable that arbitrator design could be refined through further exploration of stability.
• Team homogeneity and morphology. Though it is a useful research model, team homogeneity is less likely in practice. Goal systems are likely to differ between agents by accident or by design (e.g. sensor suites, computational power, failures, etc.). What are the implications of heterogeneous teams?

Particular to manipulation, however, a number of potentially critical issues remain unresolved and deserve additional attention, including:

• Time delay. The propagation of kinematic data along a shared memory bus structure invites time delay into end effector control. Though Seraji suggests that RMDAC will remain stable if control is faster than significant Jacobian rates of change (approx. $10^{-1}$ seconds), it is conceivable that slow sensing or high degrees of freedom may drive control rates into this region. What is the significance of slow or time delayed computation on end effector control?
• Sensor noise. Given appropriate filtering within each controller, sensor noise should not directly affect the practicality of multiagent control in general. However, since filtering can introduce time delay, noise clearly can affect the stability of the multiagent manipulator controller described here. Given a particular set of behaviour controllers, what is the significance of noise on the performance of a distributed, multiagent manipulation system?
• Multiple manipulators under multiagent control. This is perhaps the most intriguing issue. Broadcasting a single global goal to more than one multiagent team should lead to a competition between teams to provide the desired forces. Does global adaptation, as one might expect, resolve these conflicts automatically?
• Flexible links. Space based applications in which payloads, links, actuators, and bases are no longer rigid bodies greatly complicate manipulator control. Can local or distributed goal systems be developed to mitigate these effects? Can flexibility couple goal systems (e.g. obstacle avoidance)?

Through the application of control theory to agent and multiagent systems in general and manipulation in particular, this thesis strives to outline and demonstrate an orderly method for the design and analysis of these new controllers. Yet with so many unanswered questions, this can only be one of many small steps on the road to a coherent multiagent control theory.

Bibliography

[Albus, 1990] J.S. Albus and R. Quintero. Toward a reference model architecture for real time intelligent control systems (arctics). In ISRAM '90, pages 243-250, July 1990.
[Ambler, 1975] A.P. Ambler and R.J. Popplestone. Inferring the positions of bodies from specified spatial relationships. Artificial Intelligence, 6, 1975.
[Ambler, 1980] A.P. Ambler and R.J. Popplestone. An interpreter language for describing assemblies. Artificial Intelligence, 14, 1980.
[An, 1989] C.H. An, C.G. Atkeson, and J.M. Hollerbach. Model Based Control of a Robot Manipulator. The MIT Press, 1989.
[Anderson, 1988] R.J. Anderson and M.W. Spong. Hybrid impedance control of manipulators. Transactions of Robotics and Automation, 4(5), 1988.
[Andersson, 1989] R.L. Andersson. Understanding and applying a robot ping-pong player's expert controller. In Proceedings 1989 IEEE Int. Conf. on Robotics and Automation, pages 1284-1289, 1989.
[Arkin, 1987] R.C. Arkin. Motor schema based mobile robot navigation.
International Journal of Robotics Research, 8(4), August 1987.
[Arkin, 1991] R.C. Arkin. Reactive control as a substrate for telerobotic systems. IEEE Transactions on Robotics and Automation, 8(4), June 1991.
[Asada, 1986] H. Asada and J.J. Slotine. Robot Analysis and Control. John Wiley and Sons, 1986.
[Ballieul, 1985] J. Ballieul. Kinematic programming alternatives for redundant manipulators. In Proceedings 1985 IEEE Int. Conf. on Robotics and Automation, pages 722-728, 1985.
[Brooks, 1989a] R.A. Brooks. A Robot that Walks: Emergent Behaviours from a Carefully Evolved Network, chapter 24, pages 28-39. Artificial Intelligence at MIT. The MIT Press, 1989.
[Brooks, 1989b] R.A. Brooks. Robotic Science, chapter 11, The Whole Iguana, pages 432-456. The MIT Press, 1989.
[Brooks, 1989c] R.A. Brooks. A Robust Layered Control System for a Mobile Robot, chapter 24, pages 2-27. Artificial Intelligence at MIT. MIT Press, 1989.
[Brooks, 1991a] R.A. Brooks. AI Memo No. 1293: Intelligence without Reason. Massachusetts Institute of Technology, 1991.
[Brooks, 1991b] R.A. Brooks. Intelligence without representation. Artificial Intelligence, (47):139-159, 1991.
[Colbaugh, 1989] R. Colbaugh, H. Seraji, and K.L. Glass. Obstacle avoidance for redundant robots using configuration control. Journal of Robotic Systems, 6(6):722-744, 1989.
[Colombetti, 1996] M. Colombetti, M. Dorigo, and G. Borghi. Behaviour analysis and training - a methodology for behaviour engineering. IEEE Transactions on Systems Man and Cybernetics, 26(3):365-380, 1996.
[Connell, 1990] J.H. Connell. Minimalist Mobile Robotics. Academic Press, 1990.
[Cormen, 1990] T.H. Cormen, C.E. Leiserson, and R.L. Rivest. Introduction to Algorithms. McGraw-Hill, 1990.
[de Silva, 1989] C.W. de Silva. Controls, Sensors, and Actuators. Prentice-Hall, 1989.
[Denavit, 1955] J. Denavit and R.S. Hartenberg. A kinematic notation for lower pair mechanisms based on matrices. ASME Journal of Applied Mechanics, pages 215-221, June 1955.
[EL-Rayyes, 1990] L. EL-Rayyes, R.W. Toogood, I. Kermack, and D. McKay. Symbolic generation of robot dynamics. Software Documentation, December 1990.
[Ferrell, 1993] C. Ferrell. Robust Agent Control of an Autonomous Robot with Many Sensors and Actuators. PhD thesis, Dept. of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 1993.
[Fichter, 1996] E.F. Fichter. A Stewart platform-based manipulator: General theory and practical construction. International Journal of Robotics Research, 5(2):157-182, Summer 1996.
[Fisher, 1992] W.D. Fisher and S. Mujtaba. Hybrid position/force control: A correct formulation. International Journal of Robotics Research, 11(4):299-311, August 1992.
[Fu, 1987] K.S. Fu, R.C. Gonzalez, and C.S.G. Lee. Robotics Control, Sensing, Vision, and Intelligence. McGraw-Hill, 1987.
[Fukunaga, 1990] K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, 1990.
[Gat, 1994] E. Gat. Behavior control for robotic exploration of planetary surfaces. IEEE Transactions of Robotics and Automation, 10(4):490-503, August 1994.
[Glass, 1993] K. Glass. On-line collision avoidance for redundant manipulators. Proceedings 1993 IEEE Int. Conf. on Robotics and Automation, pages 36-43, 1993.
[Glass, 1995] K. Glass, R. Colbaugh, D. Lim, and H. Seraji. Real-time collision avoidance for redundant manipulators. IEEE Transactions of Robotics and Automation, 11(3):448-457, 1995.
[Goldstein, 1981] H. Goldstein. Classical Mechanics, Second Edition. Addison Wesley, 1981.
[Hartley, 1991] R. Hartley and F. Pipitone. Experiments with the subsumption architecture. In Proc. 1991 IEEE Int. Conf. on Robotics and Automation, pages 1652-1658, April 1991.
[Hogan, 1985a] N. Hogan. Impedance control: An approach to manipulation: Part i theory. Journal of Dynamic Systems, Measurement, and Control, 107:1-7, 1985.
[Hogan, 1985b] N. Hogan. Impedance control: An approach to manipulation: Part ii implementation.
Journal of Dynamic Systems, Measurement, and Control, 107:8-16, 1985.
[Hogan, 1985c] N. Hogan. Impedance control: An approach to manipulation: Part iii applications. Journal of Dynamic Systems, Measurement, and Control, 107:17-24, 1985.
[Hollerbach, 1987] J.M. Hollerbach and K.C. Suh. Redundancy resolution of manipulators through torque optimization. IEEE Transactions of Robotics and Automation, RA-3(4):308-316, August 1987.
[Jones, 1993] J.L. Jones and A.M. Flynn. Mobile Robots: Inspiration to Implementation. A.K. Peters Ltd., Wellesley MA, 1993.
[Kazerounian, 1988] K. Kazerounian and Z. Wang. Global versus local optimization in redundancy resolution of robotic manipulators. International Journal of Robotics Research, 7(5):3-12, October 1988.
[Khatib, 1985] O. Khatib. Real time obstacle avoidance for manipulators and mobile robots. In Proceedings 1985 IEEE Int. Conf. on Robotics and Automation, 1985.
[Khatib, 1987] O. Khatib. A unified approach for motion and force control of robot manipulators: The operational space formulation. IEEE Transactions of Robotics and Automation, RA-3(1):43-53, February 1987.
[Kosecka, 1994] J. Kosecka and R. Bajcsy. Discrete event systems for autonomous mobile agents. Robotics and Autonomous Systems, 12:187-198, 1994.
[Leahy Jr, 1994] M.B. Leahy Jr and G.N. Saridis. A behaviour based system for off road navigation. IEEE Transactions of Robotics and Automation, 10(6):776-782, December 1994.
[Lewis, 1993] F. Lewis, C.T. Abdallah, and D.M. Dawson. Control of Robot Manipulators. MacMillan Publishing Company, 1993.
[Lippman, 1991] S.B. Lippman. C++ Primer. Addison Wesley Company, 1991.
[Luh, 1980a] J.Y.S. Luh, M.W. Walker, and R.P. Paul. On line computational scheme for mechanical manipulators. Journal of Dynamic Systems, Measurement, and Control, 102:69-76, 1980.
[Luh, 1980b] J.Y.S. Luh, M.W. Walker, and R.P. Paul. Resolved motion acceleration control of mechanical manipulators.
IEEE Transactions on Automatic Control, AC-25(3), 1980.
[Lumia, 1994] R. Lumia. Using nasrem for real-time sensory interactive robot control. Robotica, 12:127-135, 1994.
[MacKenzie, 1995] D.C. MacKenzie, J.M. Cameron, and R.C. Arkin. Specification and execution of multiagent missions. In Proceedings 1995 IROS Conference, 1995.
[Mahadevan, 1992] S. Mahadevan and J.H. Connell. Automatic programming of behaviour based robots using reinforcement learning. Artificial Intelligence, 55:311-365, 1992.
[Mataric, 1992] M.J. Mataric. Minimizing complexity in controlling a mobile robot population. In Proceedings 1992 IEEE Int. Conf. on Robotics and Automation, May 1992.
[Mataric, 1994] M.J. Mataric. Interaction and Intelligent Behaviour. PhD thesis, Dept. of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, May 1994.
[Mel, 1990] B.W. Mel. Connectionist Robot Motion Planning. Academic Press, 1990.
[Minsky, 1990] M. Minsky. Excerpts from The Society of Mind, volume 1, chapter 10, pages 245-269. The MIT Press, 1990.
[Monckton, 1995] S.P. Monckton and D. Cherchas. Jacobian transpose control: A foundation for multiagent manipulator control. In 1995 IEEE Int. Conf. on Systems, Man, and Cybernetics, Oct. 22-25 1995.
[Nakamura, 1984] Y. Nakamura and H. Hanafusa. Task priority based redundancy control of robot manipulators. In The Second International Symposium on Robotics Research. The MIT Press, 1984.
[Nakamura, 1991] Y. Nakamura. Advanced Robotics Redundancy and Optimization. Addison Wesley Publishing Co., 1991.
[NeXT, 1993] NeXT. NEXTSTEP Release 3.2. NeXT Software Inc., 1993.
[Parker, 1992a] L.E. Parker. Adaptive selection for cooperative agent teams. In Proceedings of the Second International Conference on Simulation of Adaptive Behaviour, December 1992.
[Parker, 1992b] L.E. Parker. AI Memo No. 1357: Local Versus Global Control Laws for Cooperative Agent Teams. Massachusetts Institute of Technology, 1992.
[Parker, 1994] L.E.
Parker. An experiment in robotic cooperation. In Proceedings of the ASCE Specialty Conference on Robotics and Automation for Challenging Environments, February 1994.
[Paul, 1981] R.P. Paul. Robot Manipulators - Mathematics, Programming and Control. MIT Press, 1981.
[Pratt, 1995] J.E. Pratt. Virtual model of a biped walking robot. Master's thesis, Dept. of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, August 1995.
[Press, 1988] W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling. Numerical Recipes in C. Cambridge University Press, 1988.
[Raibert, 1981] M.H. Raibert and J.J. Craig. Hybrid position/force control of manipulators. Transactions of the ASME, 102:126-133, June 1981.
[Raibert, 1984] M.H. Raibert and H.B. Brown Jr. Experiments in balance with a 2d one legged hopping machine. Transactions of the ASME, Journal of Dynamic Systems and Control, 106:75-81, March 1984.
[Ramadge, 1989] P.J.G. Ramadge and W.M. Wonham. The control of discrete events. Proceedings of the IEEE, 77(1):81-97, January 1989.
[Reynolds, 1987] C.W. Reynolds. Flocks, herds, and schools: a distributed behavioural model. Computer Graphics, 21(4):25-34, July 1987.
[Rosenblatt, 1995a] J.K. Rosenblatt. DAMN: A distributed architecture for mobile navigation. In Proceedings of the 1995 AAAI Spring Symposium on Lessons Learned from Implemented Software Architectures for Physical Agents, H. Hexmoor and D. Kortenkamp (Eds.). AAAI Press, Menlo Park, CA., March 1995.
[Rosenblatt, 1995b] J.K. Rosenblatt and C. Thorpe. Combining multiple goals in a behavior-based architecture. In Proceedings of 1995 International Conference on Intelligent Robots and Systems (IROS), Pittsburgh, PA, August 1995.
[RTI, 1996] RTI. Control Shell Object Oriented Framework for Real Time System Software - User's Manual. Real Time Innovations, 1996.
[Samson, 1991] C. Samson, M. LeBorne, and B. Espiau.
Robot Control: The Task Function Approach. Oxford University Press, 1991.
[Seraji, 1987a] H. Seraji. Adaptive force and position control of manipulators. Journal of Robotic Systems, 4(4):551-578, June 1987.
[Seraji, 1987b] H. Seraji. Direct adaptive control of manipulators in cartesian space. Journal of Robotic Systems, 4(1):157-158, April 1987.
[Seraji, 1989a] H. Seraji. Configuration control of redundant manipulators: theory and implementation. IEEE Transactions of Robotics and Automation, 5(4):472-490, August 1989.
[Seraji, 1989b] H. Seraji. Decentralized adaptive control of manipulators: Theory, simulation and experimentation. IEEE Transactions of Robotics and Automation, 5(2):183-201, April 1989.
[Seraji, 1989c] H. Seraji. Simple method for model reference adaptive control. International Journal of Control, 49(1):367-371, 1989.
[Seraji, 1993] H. Seraji. An on-line approach to coordinated mobility and manipulation. In Proceedings 1993 IEEE Int. Conf. on Robotics and Automation, pages 28-35, 1993.
[Seraji, 1996a] H. Seraji. Sensor based collision avoidance: Theory and experiments. Email consultation, Mon, 3 Jun 1996 17:04:38, 1996.
[Seraji, 1996b] H. Seraji, R. Steele, and R. Ivlev. Sensor based collision avoidance: Theory and experiments. Journal of Robotic Systems, 13(9):571-586, 1996.
[Shoham, 1993] Yoav Shoham. Agent oriented programming. Artificial Intelligence, (60):51-92, October 1993.
[Simmons, 1991] R. Simmons and E. Krotkov. An integrated walking system for the ambler planetary rover. In 1991 IEEE International Conference on Robotics and Automation, Sacramento California, April 1991.
[Simmons, 1994] R. Simmons. Structured control for autonomous robots. IEEE Transactions on Robotics and Automation, 10(1), February 1994.
[Slotine, 1987] J.J.E. Slotine and W. Li. On the adaptive control of robot manipulators. International Journal of Robotics Research, 6(3):49-59, 1987.
[Slotine, 1988] H. Das, J.J.E. Slotine and T.B. Sheridan. Inverse kinematic algorithms for redundant systems.
IEEE Transactions on Robotics and Automation, pages 43-48, 1988.
[Smith, 1980] R.G. Smith. The contract net protocol: High-level communication and control in a distributed problem solver. IEEE Transactions on Communications, Vol. C29(12):1104-1103, 1980.
[Spong, 1989] M.W. Spong and M. Vidyasagar. Robot Dynamics and Control. John Wiley and Sons, 1989.
[Steels, 1991] L. Steels. Towards a theory of emergent functionality. In From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behaviour, September 1991.
[Steels, 1994] L. Steels. Mathematical analysis of behaviour systems. In Proceedings of Prearc Conference Lausanne, 1994.
[Stokic, 1984] D. Stokic and M. Vukobratovic. Practical stabilization of robotic systems by decentralized control. Automatica, 20(3):353-358, 1984.
[Sung, 1996] W.Y. Sung, K.D. Cho, and M.J. Chung. A constrained optimization approach to resolving manipulator redundancy. In Journal of Robotic Systems, pages 275-288, 1996.
[Thomas, 1988] F. Thomas and C. Torras. A group theoretic approach to the computation of symbolic parts relations. IEEE Journal of Robotics and Automation, 1988.
[Tu, 1994] D. Terzopoulos, X. Tu and R. Grzeszczuk. Artificial fishes with autonomous locomotion, perception, behavior, and learning in a simulated physical world. In Proceedings of Artificial Life IV Workshop, July 1994.
[Upstill, 1990] S. Upstill. The RenderMan Companion. Addison-Wesley, 1990.
[van de Panne, 1992] M. van de Panne and E. Fiume. A controller for the dynamic walk of a biped across variable terrain. In Proceedings of the 31st IEEE Conference on Decision and Control, 1992.
[Verschure, 1992] P.F.M.J. Verschure, B.J.A. Krose, and R. Pfeifer. Distributed adaptive control: The self organization of structured behaviour. Robotics and Autonomous Systems, 9:181-196, 1992.
[Vidyasagar, 1993] M. Vidyasagar.
Nonlinear Systems Analysis. Prentice Hall, 2nd edition, 1993.
[Walker, 1982] M.W. Walker and D.E. Orin. Efficient dynamic computer simulation of robotic mechanisms. Journal of Dynamic Systems, Measurement and Control, 104:205-211, September 1982.
[Walter, 1950a] W.G. Walter. An imitation of life. Scientific American, 182(5):42-45, May 1950.
[Walter, 1950b] W.G. Walter. A machine that learns. Scientific American, 185(2):60-63, August 1950.
[Weir, 1984] M. Weir. Goal Directed Behaviour. Gordon and Breach Science Publishers, 1984.
[Weiss, 1987] L.E. Weiss, A.C. Sanderson, and C.P. Neuman. Dynamic sensor based control of robots with visual feedback. IEEE Transactions of Robotics and Automation, RA-3(5), October 1987.
[Whitney, 1969] D.E. Whitney. Resolved motion rate control of manipulators and human prostheses. IEEE Transactions on Man, Machines and Systems, MMS-10(2), 1969.
[Winograd, 1972] T. Winograd. Understanding Natural Language. Academic Press, New York, 1972.
[Wu, 1982] C.H. Wu and R.P. Paul. Resolved motion force control of robot manipulator. IEEE Transactions on Systems, Man and Cybernetics, SMC-12(3), 1982.
[Yuan, 1988] J.S.C. Yuan. Closed loop manipulator control using quaternion feedback. IEEE Transactions of Robotics and Automation, 4(4):434-440, 1988.
[Zhang, 1995] Y. Zhang and A.K. Mackworth. Constraint nets: a semantic model for dynamic systems. Theoretical Computer Science, (138):211-239, 1995.

Appendix A
Quaternions

Quaternions are the $\mathbb{R}^3$ equivalent of complex numbers, having real and pure (imaginary) components. The quaternion is composed of a scalar magnitude and a vector. Often these are combined into a single vector representation to create a four parameter representation of orientation.

A.1 Definition

A quaternion may be expressed as a couple $\{\eta, \mathbf{q}\}$ where $\eta$ is an element of $\mathbb{R}$ while $\mathbf{q}$ is an element of $\mathbb{R}^3$. Suppose two coordinate systems exist separated by a rotation $\varphi$ about the unit vector $\hat{\mathbf{r}}$.
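The components¹ $\eta$ and $\mathbf{q}$ are described by

$$ \eta = \cos\frac{\varphi}{2} \tag{A.381} $$
$$ \mathbf{q} = \sin\frac{\varphi}{2}\,\hat{\mathbf{r}} \tag{A.382} $$
$$ \lambda = \{\eta, \mathbf{q}\} \tag{A.383} $$

where $\lambda$ is a notational convenience representing the quaternion $\{\eta, \mathbf{q}\}$. Extending this notation:

$$ \mathcal{R}(\lambda) = \eta \tag{A.384} $$
$$ \mathcal{P}(\lambda) = \mathbf{q} \tag{A.385} $$

where $\mathcal{R}()$ is the real quaternion component and $\mathcal{P}()$ is the pure quaternion component.

A.2 Properties

Uniqueness. Both $\{\eta, \mathbf{q}\}$ and $\{-\eta, -\mathbf{q}\}$ describe the same orientation. If the rotation $\varphi$ is limited to $-\pi < \varphi < \pi$ then $\eta$ is nonnegative and the quaternion is unique.

Normality. From the definition it is clear that:

$$ \eta^2 + \mathbf{q}^T\mathbf{q} = 1 \tag{A.386} $$

¹ Thus this becomes a unitary quaternion, commonly called Euler Parameters.

Addition.

$$ \{\eta_1, \mathbf{q}_1\} + \{\eta_2, \mathbf{q}_2\} = \{\eta_1 + \eta_2,\ \mathbf{q}_1 + \mathbf{q}_2\} \tag{A.387} $$

Conjugate.

$$ \bar{\lambda} = \mathcal{R}(\lambda) - \mathcal{P}(\lambda) \tag{A.388} $$

Inner Product.

$$ \langle \lambda, \mu \rangle = \tfrac{1}{2}\left(\lambda\bar{\mu} + \mu\bar{\lambda}\right) \tag{A.389} $$

Multiplication.

$$ \{\eta_1, \mathbf{q}_1\}\{\eta_2, \mathbf{q}_2\} = \{\eta_1\eta_2 - \mathbf{q}_1^T\mathbf{q}_2,\ \mathbf{q}_1 \times \mathbf{q}_2 + \eta_1\mathbf{q}_2 + \eta_2\mathbf{q}_1\} \tag{A.390} $$

Norm.

$$ \|\lambda\| = \sqrt{\lambda\bar{\lambda}} = \sqrt{\eta^2 + \|\mathbf{q}\|^2} \tag{A.391} $$

Inverse and Equalities.

$$ \lambda^{-1} = \frac{\bar{\lambda}}{\|\lambda\|^2} \tag{A.392} $$
$$ \|\lambda\mu\| = \|\lambda\|\,\|\mu\| \tag{A.393} $$

Propagation. If $\mathcal{F}_1$ rotates about $\mathcal{F}_0$ at an instantaneous angular velocity of $\boldsymbol{\omega}$, then the quaternion $\{\eta, \mathbf{q}\}$ evolves in time according to the following differential equation:

$$ \begin{bmatrix} \dot{\eta} \\ \dot{\mathbf{q}} \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 0 & -\boldsymbol{\omega}^T \\ \boldsymbol{\omega} & -\boldsymbol{\omega}\times \end{bmatrix} \begin{bmatrix} \eta \\ \mathbf{q} \end{bmatrix} \tag{A.394} $$

where $\boldsymbol{\omega}$ is expressed in the coordinates of $\mathcal{F}_1$ and $\boldsymbol{\omega}\times$ is the skew matrix of $\boldsymbol{\omega}$.

A.3 Relation to the Rotation Matrix

A rotation matrix may be generated from a quaternion through either the arbitrary rotation expression outlined earlier or through one of the two expressions below. In this instance the quaternion represents a rotation from frame 0 to frame 1.

$$ {}^0\mathbf{R}_1 = \cos\varphi\,\mathbf{I} + (1 - \cos\varphi)\,\hat{\mathbf{r}}\hat{\mathbf{r}}^T - \sin\varphi\,\hat{\mathbf{r}}\times \tag{A.395} $$

where $\hat{\mathbf{r}}\times$ is the skew matrix of $\hat{\mathbf{r}}$. Using the definition, the above expression may be rewritten

$$ {}^0\mathbf{R}_1 = (\eta^2 - \mathbf{q}^T\mathbf{q})\,\mathbf{I} + 2\mathbf{q}\mathbf{q}^T - 2\eta\,\mathbf{q}\times \tag{A.396} $$

where $\mathbf{q}\times$ is the skew matrix of $\mathbf{q}$.

Appendix B
Orientation Error

Orientation error for the purposes of cartesian control requires compact representation and, therefore, limitation of redundant terms. Two possible approaches have been developed in the literature.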
The differential motion vector is based upon homogeneous transformation algebra and assumes that the orientation error may be expressed through a small angle approximation [Paul, 1981]. Quaternion orientation formulations, adapted from spacecraft attitude control, have been used for cartesian robotic control [Yuan, 1988].

B.1 Differential Motion Vector or Euler Rotations

A simple form of cartesian error, discussed in [Fu, 1987] in reference to [Wu, 1982], is the differential motion vector derived in [Paul, 1981]. This technique, [Fu, 1987] (pg. 177-181), requires the development of a transformation between actual and desired end effector orientations. This transformation employs the small angle approximation of the expressions for rotation about each principal axis. The expression for cartesian error is the differential motion vector:

$$ \mathbf{x}_e = \begin{bmatrix} dx & dy & dz & \delta x & \delta y & \delta z \end{bmatrix}^T \tag{B.397} $$

where $\delta x$, $\delta y$, and $\delta z$ are incremental rotations about the x, y, and z axes respectively. This translates into the following homogeneous transformation:

$$ \mathbf{T}_n^d = \mathbf{T}_n^a \begin{bmatrix} 1 & -\delta z & \delta y & dx \\ \delta z & 1 & -\delta x & dy \\ -\delta y & \delta x & 1 & dz \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{B.398} $$

where $\mathbf{T}_n^d$ and $\mathbf{T}_n^a$ are the desired and actual end effector locations of an n degree of freedom manipulator respectively. With the above relation, [Paul, 1981] reduces $\mathbf{x}_e$ into the following form:

$$ \mathbf{x}_e = \begin{bmatrix} dx \\ dy \\ dz \\ \delta x \\ \delta y \\ \delta z \end{bmatrix} = \begin{bmatrix} \mathbf{n}_a \cdot (\mathbf{p}_d - \mathbf{p}_a) \\ \mathbf{o}_a \cdot (\mathbf{p}_d - \mathbf{p}_a) \\ \mathbf{a}_a \cdot (\mathbf{p}_d - \mathbf{p}_a) \\ \tfrac{1}{2}(\mathbf{a}_a \cdot \mathbf{o}_d - \mathbf{a}_d \cdot \mathbf{o}_a) \\ \tfrac{1}{2}(\mathbf{n}_a \cdot \mathbf{a}_d - \mathbf{n}_d \cdot \mathbf{a}_a) \\ \tfrac{1}{2}(\mathbf{o}_a \cdot \mathbf{n}_d - \mathbf{o}_d \cdot \mathbf{n}_a) \end{bmatrix} \tag{B.399} $$

While this is relatively conservative of space compared to the full 3 x 3 rotation matrix, the small angle approximation [Paul, 1981] might prove to be a problem over large sampling periods. [Yuan, 1988] points out that this scheme is indeed stable over large orientation errors but, after linearization over small intervals, is stable only over small intervals.

The vector $\mathbf{x}_e$ may be split into its position and orientation components, $\mathbf{e}_p$ and $\mathbf{e}_o$. The latter is called the expression for Euler Rotation.
According to [Luh, 1980b]:

$$ \mathbf{e}_o = \begin{bmatrix} \delta x \\ \delta y \\ \delta z \end{bmatrix} = \sin\varphi\,\hat{\mathbf{r}} \tag{B.400} $$

employing the half angle formula:

$$ \mathbf{e}_o = \sin\varphi\,\hat{\mathbf{r}} \tag{B.401} $$
$$ \phantom{\mathbf{e}_o} = 2\cos\frac{\varphi}{2}\sin\frac{\varphi}{2}\,\hat{\mathbf{r}} \tag{B.402} $$
$$ \phantom{\mathbf{e}_o} = 2\,\delta\eta\,\delta\mathbf{q} \tag{B.403} $$

the latter being the Quaternion expression for Euler Rotations. See Appendix A for a discussion of Quaternions.

B.2 Quaternion Orientation Error

[Yuan, 1988] introduces Euler Parameter (Unitary or Normalized Quaternion) techniques from spacecraft attitude control to manipulator control. The quaternion expression for orientation error between two coordinate frames $\mathcal{F}_1$ and $\mathcal{F}_2$ is

$$ \delta\eta = \eta_1\eta_2 + \mathbf{q}_1^T\mathbf{q}_2 \tag{B.404} $$
$$ \delta\mathbf{q} = \eta_1\mathbf{q}_2 - \eta_2\mathbf{q}_1 - \mathbf{q}_1 \times \mathbf{q}_2 \tag{B.405} $$

[Yuan, 1988] determines that two coordinate systems coincide iff $\delta\mathbf{q} = 0$. If $\mathcal{F}_1$ and $\mathcal{F}_2$ are the desired and actual hand coordinates relative to the base, then (B.405) is the orientation error. Now when two frames coincide:

$$ \eta_1 = \eta_2 \tag{B.406} $$
$$ \mathbf{q}_1 = \mathbf{q}_2 \tag{B.407} $$

or using the normality condition (A.386):

$$ \delta\eta = 1 \tag{B.408} $$
$$ \delta\mathbf{q} = 0 \tag{B.409} $$

Now at $\delta\mathbf{q} = 0$

$$ \eta_1\mathbf{q}_2 - \eta_2\mathbf{q}_1 = \mathbf{q}_1 \times \mathbf{q}_2 \tag{B.410} $$

Plainly $\eta_1\mathbf{q}_2 - \eta_2\mathbf{q}_1$, $\mathbf{q}_1$, and $\mathbf{q}_2$ share the same plane; $\mathbf{q}_1 \times \mathbf{q}_2$, therefore, is perpendicular to this plane. For (B.409) to be true, both sides of (B.410) must be zero, or:

$$ \eta_1\mathbf{q}_2 = \eta_2\mathbf{q}_1 \tag{B.411} $$

Then $\mathbf{q}_1$ is a multiple of $\mathbf{q}_2$. Since both quaternions are of unit length and both vector parts are aligned, the normality condition (A.386) may again be used:

$$ \delta\eta = \eta_1\eta_2 + \mathbf{q}_1^T\mathbf{q}_2 = \pm 1 \tag{B.412} $$

Therefore:

$$ \{\eta_1, \mathbf{q}_1\} = \{\pm\eta_2, \pm\mathbf{q}_2\} \tag{B.413} $$

both of which have the same orientation (see section A.2). Therefore (B.405) is the orientation error between two coordinate systems.

Appendix C
Scenario Specification Database

The Multiprocess Manipulator Simulator uses a flexible group of database files to specify a particular manipulation scenario. A scenario is composed of a manipulator configuration, a global goal generation system for the manipulator, the global goal (a trajectory specification) for the end effector, and an obstacle specification.
To implement a 'database' that is legible to both man and machine, a simple data formatting syntax for 'scenarios' has been adopted and will be discussed in this appendix. There are four very similar format types:
• Manipulator Description files
• Global Goal Description files
• Obstacle Description files
• End Effector Trajectory Description files
all contained within a single directory suffixed by ".scenario" (e.g. GainSeries_01.scenario).

Each format type relies on a common set of basic elements. In particular, the scoping operators define{} and end{} limit the scope of defined variables to particular objects. Current objects include link, controller, globalgoal, obstacle, and knotpoint definitions. Formal parameters may be assigned within a given define{}-end{} pair through the familiar variable assignment syntax: varname = parameter. Formal parameters are those permanently defined within a specific object. Informal parameters are those specified for a particular instance of an object (e.g. the radius of a ball obstacle) and have an 'escape' syntax: @ varname = parameter. Regardless of parameter formality, only four variable types are supported:
double: in exponential format.
vector: whitespace delimited doubles enclosed in angle brackets < >.
matrix: whitespace delimited vectors enclosed in square brackets [ ].
strings: whitespace delimited ASCII characters.

The following sections will detail the use of object instances within each scenario element.

C.1 Manipulator Description Files (MDF): theRobot.rbt

The Manipulator Description File documents all of the physical and local control parameters that constitute a link agent. The format is designed to be self documenting. Informal parameters can be added to the syntax without requiring a (time consuming) language redefinition. Link objects are used to define individual links in the manipulator.
The order of the link objects in the file represents the order of the links from base to end effector. There is no limit on the number of link objects. Formal parameters include mass, principal moments of inertia, and Denavit Hartenberg parameters. To implement the jth link's local controllers H , j , the controller object may be instantiated. The order of instantiation is of no consequence except where communication between local and global goals requires allocation of IOstreams. In this case the order of local and global controller instantiation should be identical. Local controllers have only one formal parameter, a name identifier (a string, e.g. "resolved_motion_force_control"), followed by a list of informal parameters. There is an arbitrary limit of ten local controllers that may be instantiated for a given link. Not surprisingly, failure to provide an MDF within a scenario is an error. The data file segment in figure C.77 demonstrates the manipulator description language employed in this simulator.

C.2 Global Goal Description Files (GDF): theGoals.dat

Global goal description files, as in figure C.78, are identical to the controller descriptions within the MDF described earlier and carry the same caveats for order of instantiation. If the global goal file is omitted, simulation proceeds using local goals alone.

C.3 Trajectory Description Files: theTrajectory.spt

Trajectory Description Files are simple lists of knotpoints in global coordinates. Each knotpoint is composed of a time interval over which the knotpoint is attained, a homogeneous transformation describing the knotpoint position¹, and a final velocity vector. The starting point is a function of the moving entity's starting position and is assumed to be unknown at startup (as in the case of a manipulator end effector).

¹ Homogeneous transformations are generally more man readable than quaternion-cartesian hybrids.
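The define{}/end{} syntax described in this appendix is simple enough to read with a short line-oriented parser. The following is a minimal sketch, not the simulator's own reader: the function names `parse_value` and `parse_scenario`, the dictionary layout, and the rule that a bare multi-token value (such as a knotpoint interval) is kept as a string are all assumptions made for illustration. It assumes one assignment per line, with matrices allowed to continue over several lines.

```python
import re


def parse_value(text):
    """Convert the right-hand side of an assignment into a Python value."""
    text = text.strip()
    if text.startswith("["):
        # matrix: whitespace delimited vectors enclosed in [ ]
        rows = re.findall(r"<([^<>]*)>", text)
        return [[float(x) for x in row.split()] for row in rows]
    if text.startswith("<"):
        # vector: whitespace delimited doubles enclosed in < >
        return [float(x) for x in text.strip("<>").split()]
    try:
        return float(text)  # double in exponential format
    except ValueError:
        return text  # anything else is kept as a string (assumption)


def parse_scenario(text):
    """Parse define{}/end{} scoped assignments into nested dictionaries."""
    root = {"kind": "file", "formal": {}, "informal": {}, "children": []}
    stack = [root]
    pending = ""  # holds a matrix that continues on the next line
    for raw in text.splitlines():
        line = (pending + " " + raw).strip()
        pending = ""
        if not line:
            continue
        if line.count("[") > line.count("]"):
            pending = line  # matrix not closed yet, keep accumulating
            continue
        scope = re.fullmatch(r"(define|end)\s*\{\s*(\w+)\s*\}", line)
        if scope:
            keyword, kind = scope.groups()
            if keyword == "define":
                obj = {"kind": kind, "formal": {}, "informal": {}, "children": []}
                stack[-1]["children"].append(obj)
                stack.append(obj)
            elif len(stack) == 1 or stack[-1]["kind"] != kind:
                raise ValueError(f"unexpected end{{{kind}}}")
            else:
                stack.pop()
            continue
        assign = re.fullmatch(r"(@?)\s*(\w+)\s*=\s*(.+)", line)
        if assign is None:
            raise ValueError(f"cannot parse line: {line!r}")
        informal, name, value = assign.groups()
        # '@' marks an informal (per-instance) parameter
        target = stack[-1]["informal" if informal else "formal"]
        target[name] = parse_value(value)
    if pending or len(stack) != 1:
        raise ValueError("unterminated matrix or define{} block")
    return root
```

Keeping formal and informal parameters in separate dictionaries mirrors the distinction drawn in the text, so a reader of the parsed result can still tell which values belong to the object type and which to a particular instance.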
    gravity = <+0.000e+00 +0.000e+00 +0.000e+00>
    define{link}
        name = link_1
        type = revolute
        theta = +0.000e+00
        offset = +0.000e+00
        alpha = +0.000e+00
        length = +0.750e+00
        mass = +1.000e-01
        saturation = +1.000e+06
        velocity = +0.000e+00
        inertia = <+5.000e-02 +4.938e-01 +4.938e-01>
        centroid = <-0.375e+00 +0.000e+00 +0.000e+00>
        define{controller}
            name = resolved_motion_force_control
            @ ControlRate = 1.200e+02
        end{controller}
    end{link}
    define{link}
    end{link}

Figure C.77: A robot Specification Database flat (ASCII) file. Note that the link list order is from manipulator base (frame 1) to end effector (frame N).

    define{globalgoal}
        name = resolved_motion_adaptive_control
        @ ControlRate = 1.200e+02
        @ Wp = <+1.800e+03 +1.800e+03 +1.800e+03 +5.000e+01 +5.000e+01 +5.000e+01>
        @ Wv = <+8.000e+02 +8.000e+02 +8.000e+02 +1.000e+01 +1.000e+01 +1.000e+01>
        @ f_zero = +0.000e+00
        @ f_coeff = <+0.000e+00 +6.000e+00 +0.000e+00>
        @ k0_zero = +0.000e+00
        @ k0_coeff = <+0.000e+00 +4.000e+00 +0.000e+00>
        @ k1_zero = +0.000e+00
        @ k1_coeff = <+0.000e+00 +4.000e+00 +0.000e+00>
        @ q0_zero = +0.000e+00
        @ q0_coeff = <+0.000e+00 +1.000e+00 +0.000e+00>
        @ q1_zero = +0.000e+00
        @ q1_coeff = <+0.000e+00 +1.000e+00 +0.000e+00>
        @ q2_zero = +0.000e+00
        @ q2_coeff = <+0.000e+00 +2.000e+00 +0.000e+00>
    end{globalgoal}
    define{globalgoal}
        name = resolved_motion_obstacle_avoidance
        @ ControlRate = 1.200e+02
    end{globalgoal}

Figure C.78: A Global Goal Database flat (ASCII) file.
    define{knotpoint}
        interval = 0.00e+00 10.00e+00
        position = [<0.00e+00 -1.00e+00 0.00e+00 1.00e+00>
                    <1.00e+00 0.00e+00 0.00e+00 2.00e+00>
                    <0.00e+00 0.00e+00 1.00e+00 0.00e+00>
                    <0.00e+00 0.00e+00 0.00e+00 1.00e+00>]
        velocity = <0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00>
    end{knotpoint}
    define{knotpoint}
        interval = 34.00e+00 40.00e+00
        position = [<0.00e+00 -1.00e+00 0.00e+00 1.00e+00>
                    <1.00e+00 0.00e+00 0.00e+00 -2.00e+00>
                    <0.00e+00 0.00e+00 1.00e+00 0.00e+00>
                    <0.00e+00 0.00e+00 0.00e+00 1.00e+00>]
        velocity = <0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00>
    end{knotpoint}

Figure C.79: A Trajectory Database flat (ASCII) file.

The default profile is cubic linear (a parabolic velocity profile). Step trajectories can also be specified by inserting @ type = step.trajectory into a knotpoint entry. A trajectory segment example appears in figure C.79. Failure to provide a trajectory file is an error if a global goal generation file is present; otherwise the trajectory file is ignored.

C.4 Obstacle Description Files: theObstacle.obs

Obstacle description files document a list of obstacles in the scenario. Each obstacle entry incorporates a brief shape description within an informal parameter list. This is followed by the object's current position and an obstacle trajectory (see figure C.80). Again there is no limit on the number of obstacles present. The inclusion of an obstacle file is optional.
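The default cubic profile between knotpoints mentioned in section C.3 can be illustrated for a single translational coordinate. The sketch below is an assumption about how such a segment might be computed, not the simulator's actual trajectory generator: it fits a cubic polynomial to the start and end positions and velocities over the segment duration (a cubic Hermite fit), which yields the parabolic velocity profile the text describes. The function name `cubic_segment` and its arguments are hypothetical, and interpolation of orientation is not covered.

```python
def cubic_segment(p0, v0, p1, v1, duration):
    """Return position(t) and velocity(t) for one cubic knotpoint segment.

    p0, v0: position and velocity at the start of the segment (t = 0)
    p1, v1: position and velocity at the knotpoint (t = duration)
    """
    if duration <= 0.0:
        raise ValueError("duration must be positive")
    delta = p1 - p0
    # Coefficients of p(t) = p0 + v0*t + a2*t^2 + a3*t^3, chosen so that
    # p(duration) = p1 and the velocity at t = duration equals v1.
    a2 = (3.0 * delta - (2.0 * v0 + v1) * duration) / duration**2
    a3 = (-2.0 * delta + (v0 + v1) * duration) / duration**3

    def position(t):
        return p0 + v0 * t + a2 * t**2 + a3 * t**3

    def velocity(t):
        # Derivative of the cubic: a parabola in t.
        return v0 + 2.0 * a2 * t + 3.0 * a3 * t**2

    return position, velocity
```

For a full knotpoint, the same fit would be applied independently to each translational component of the homogeneous transformation, with the final velocity taken from the knotpoint's velocity vector.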
    define{obstacle}
        name = ball
        @ radius = 0.1250e+00
        position = [<1.00e+00 0.00e+00 0.00e+00 1.00e+00>
                    <0.00e+00 1.00e+00 0.00e+00 -0.50e+00>
                    <0.00e+00 0.00e+00 1.00e+00 0.00e+00>
                    <0.00e+00 0.00e+00 0.00e+00 1.00e+00>]
        velocity = <0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00>
        define{knotpoint}
            interval = 0.00e+00 40.00e+00
            position = [<1.00e+00 0.00e+00 0.00e+00 1.00e+00>
                        <0.00e+00 1.00e+00 0.00e+00 -0.50e+00>
                        <0.00e+00 0.00e+00 1.00e+00 0.00e+00>
                        <0.00e+00 0.00e+00 0.00e+00 1.00e+00>]
            velocity = <0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00>
        end{knotpoint}
    end{obstacle}
    define{obstacle}
    end{obstacle}

Figure C.80: An Obstacle Database flat (ASCII) file.

Appendix D
RMDAC Gain Selection

Though stability of Seraji's Direct Adaptive Control [Seraji, 1989c] (here known as resolved motion decentralized adaptive control, RMDAC) may be assured under appropriate operating conditions, adequate performance remains a function of the free variables of the adaptive control algorithm. The selection of these values determines the controller's rate and quality of convergence to the reference model. The next section will describe the process used to determine the free variables to achieve satisfactory end effector trajectory tracking.

Unlike OSF or RMFC, in which gain selection is simplified through linear system stability rules, Configuration Control has a large number of integral gains and initial values to assign, including 7 initial values, 14 integral gains, and 12 weighting gains for each of n degrees of freedom, or 33n values. The recommended method of determining these values is through trial and error [Seraji, 1989c, Glass, 1993]. Initial values can be left at zero, and the auxiliary term, $d_i$, merely speeds convergence and is not crucial for stability. The only guideline is to ensure that gains remain positive.
This appendix describes a simple gain selection process that explores the significance of some of these values. The process is not exhaustive and is merely designed to find a stable region possessing good performance. This experiment employs the reference manipulator configuration, trajectory, and payload. The following sections will explain the experimental method.

The control rate is set at 120 Hz, double the standard 'real time' sampling rate and considerably slower than the fixed integration rate of 1000 Hz (an integration period of 0.001 seconds). In this study weights are initially assigned as if they are LHP stable position and velocity gains for a linear system¹ and all integral gains have been zeroed except $\alpha_{1i}$ and $\beta_{1i}$, set to 1.0. Recalling that each term in the RMDAC force expression is the product of an integration, at least one set of gains must have assigned values in order to track the trajectory. A velocity weighting experiment retains a constant $w_{pi}$ while successively doubling the magnitude of $w_{vi}$. By selecting the greatest stable $[w_{pi}, w_{vi}]$ pair, a series of integral tests can then be performed. In particular:

1. Feedback position and velocity gains $\alpha_{1i}$ and $\beta_{1i}$.
2. Performance as a function of feedforward gains $\nu_{1i}$, $\gamma_{1i}$ and $\lambda_{1i}$.
3. Performance as a function of auxiliary signal gains $d_i$.

¹ Though there is no foundation for this approach, it has been found to be a safe first step in selecting position and velocity gains.

Since these cartesian controllers are orthonormal, no coupling gains (the first term in the coefficient vectors) were required [Seraji, 1989c]. The integration gains (the second term in the coefficient vectors) again are somewhat arbitrary, though early trial and error suggested unit gains for the feedback terms and 2.0 for the feedforward mass gain. Offset gains were also excluded (the term $u_2$ in the adaptive integrals).

    No.   f     w_pi     w_vi    α_1i   β_1i   ν_1i   γ_1i   λ_1i
    01    0.0    100.0    20.0   1.0    1.0    0.0    0.0    0.0
    02    0.0    225.0    30.0   1.0    1.0    0.0    0.0    0.0
    03    0.0    400.0    40.0   1.0    1.0    0.0    0.0    0.0
    04    0.0    625.0    50.0   1.0    1.0    0.0    0.0    0.0
    05    0.0    900.0    60.0   1.0    1.0    0.0    0.0    0.0
    06    0.0   1600.0    80.0   1.0    1.0    0.0    0.0    0.0
    07    0.0    900.0   120.0   1.0    1.0    0.0    0.0    0.0
    08    0.0    900.0   240.0   1.0    1.0    0.0    0.0    0.0
    09    0.0    900.0   480.0   1.0    1.0    0.0    0.0    0.0
    10    0.0    900.0   480.0   2.0    1.0    0.0    0.0    0.0
    11    0.0    900.0   480.0   4.0    1.0    0.0    0.0    0.0
    12    0.0    900.0   480.0   6.0    1.0    0.0    0.0    0.0
    13    0.0    900.0   480.0   1.0    2.0    0.0    0.0    0.0
    14    0.0    900.0   480.0   1.0    4.0    0.0    0.0    0.0
    15    0.0    900.0   480.0   1.0    6.0    0.0    0.0    0.0
    16    0.0    900.0   480.0   2.0    2.0    0.0    0.0    2.0
    17    0.0    900.0   480.0   2.0    2.0    0.0    0.0    4.0
    18    0.0    900.0   480.0   2.0    2.0    0.0    0.0    6.0
    19    0.0    900.0   480.0   2.0    2.0    0.0    2.0    2.0
    20    0.0    900.0   480.0   2.0    2.0    0.0    4.0    2.0
    21    0.0    900.0   480.0   2.0    2.0    0.0    6.0    2.0
    22    0.0    900.0   480.0   2.0    2.0    2.0    2.0    2.0
    23    0.0    900.0   480.0   2.0    2.0    4.0    2.0    2.0
    24    0.0    900.0   480.0   2.0    2.0    6.0    2.0    2.0
    25    2.0    900.0   480.0   2.0    2.0    2.0    2.0    2.0
    26    4.0    900.0   480.0   2.0    2.0    2.0    2.0    2.0
    27    6.0    900.0   480.0   2.0    2.0    2.0    2.0    2.0

Table D.20: A tabulation of initial values (italic) and integral gains (bold) for a set of scenarios. Each integral can use up to three gains, $u_0$, $u_1$ and $u_2$. In this study $u_0$ is omitted while $u_2$ never varies; thus the listed gains represent the value of $u_1$ in each integral.
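The results that follow are reported in terms of maximum and RMS trajectory tracking errors. As an illustration only, these two figures could be computed from sampled desired and actual end effector positions as in the sketch below. The thesis does not state exactly how its error measures were formed, so the use of the Euclidean position error norm at each sample, and the function name `tracking_error_stats`, are assumptions.

```python
import math


def tracking_error_stats(desired, actual):
    """Return (maximum error, RMS error) over a sampled trajectory.

    desired, actual: equal-length sequences of position samples, each an
    (x, y, z) tuple. The error at each sample is taken to be the Euclidean
    distance between the desired and actual positions (an assumption).
    """
    if len(desired) != len(actual) or len(desired) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    norms = [math.dist(d, a) for d, a in zip(desired, actual)]
    max_error = max(norms)
    rms_error = math.sqrt(sum(e * e for e in norms) / len(norms))
    return max_error, rms_error
```

The maximum error is sensitive to start-up transients such as the initial overshoot, while the RMS figure reflects performance over the whole run, which is why the two can move in different directions between cases.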
The results are divided into four sections: weights, feedback gains, feedforward gains, and auxiliary gains. The results uniformly show that using all of the integral terms improves trajectory tracking.

As observed in tables D.21 and D.22, maximum and RMS errors diminish as the weights are increased. There is a limit to this tactic, however. As the weights are increased, it eventually becomes necessary to increase the local control rates, possibly because of the ever increasing magnitude of accelerations. Greater performance improvement may be afforded through the selection of larger velocity weights. The results here indicate that the velocity and position weights are of the same order of magnitude, unlike the position and velocity gains in a linear PD-like controller. It is interesting to observe that the maximum error is slightly higher in case 9 than in case 6. This is some early evidence that the control rate acts as a barrier to large gains. Apparently the initial error is larger in this latter case and the RMS error indicates poorer steady state performance, possibly hinting at an underdamped system. Convergence data from these runs, indicating that case 9 is slower to converge than case 6, confirm the perception that the system is less stable with the larger velocity weight.

As expected, tables D.23 and D.24 indicate that feedback gains must be present to produce reasonable tracking, and that as these gains increase tracking improves. Caution must be exercised with these and other gains, however, in that excessively large gains may result in integrator wind-up and poorer RMS performance.

Tables D.25, D.26 and D.27 indicate that increasing feedforward gains improves tracking.
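The maximum and RMS error figures discussed here can be computed along the following lines. This is a minimal sketch that assumes "RMS $L_2$ error" means the root mean square, over all samples, of the Euclidean norm of the tracking error vector. The text does not spell out this definition, and the function names are illustrative only.

```python
import math


def l2_norm(vector):
    """Euclidean (L2) norm of a sequence of components."""
    return math.sqrt(sum(c * c for c in vector))


def tracking_errors(desired, actual):
    """Per-sample L2 norm of the difference between desired and actual."""
    return [
        l2_norm([d - a for d, a in zip(des, act)])
        for des, act in zip(desired, actual)
    ]


def max_and_rms(errors):
    """Return (maximum error, RMS error) for a non-empty error history."""
    if not errors:
        raise ValueError("error history is empty")
    maximum = max(errors)
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    return maximum, rms


if __name__ == "__main__":
    desired = [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)]
    actual = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    print(max_and_rms(tracking_errors(desired, actual)))
```

Under this assumed definition, position and orientation errors would be processed separately, each yielding its own maximum and RMS value.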
While the basic trend is towards improved maximum and RMS performance, an inconsistent increase in both, demonstrated in series 24, indicates that it is possible to apply excessive feedforward position gains, leading to a decrease in maximum and RMS performance. It should be noted that convergence times were better for this run, however. This may indicate that as gains increase the system becomes underdamped, oscillating about the desired trajectory and converging rapidly.

Enabling the auxiliary gains improves performance as predicted by Seraji and shown in table D.28. It is interesting to note, however, that initially the maximum and RMS errors are elevated with the introduction of another integrating term. This could be attributed to additional integrator wind-up in the first seconds of execution. Ultimately, as more information is used to augment the auxiliary signal, error is substantially reduced.

Even with carefully selected gains, the reference trajectory presents some difficulties for the Jacobian Transpose Algorithm. The first segment's velocity and acceleration vectors have small magnitudes and are inclined only 15 degrees to the x-axis. Since the manipulator is initially homed along this axis, the algorithm produces small Cartesian forces and even smaller torques. The effect of integrator wind-up is compounded by the six integrations, all of which depend on Cartesian error. The overshoot serves to drive the system, making any instability more noticeable. Particularly in these low aspect segments, RMDAC showed considerable sensitivity to the feedforward term, presumably because this term would result in larger force requirements earlier in the trajectory. The auxiliary signal, too, provided a significant improvement in performance.

$W_p$     No.   Step Response             Reference Trajectory
100       01    2.610e-01   1.026e-01     2.248e-01   9.577e-02
225       02    2.302e-01   9.160e-02     1.839e-01   7.698e-02
400       03    2.103e-01   8.628e-02     1.489e-01   6.113e-02
625       04    1.970e-01   8.438e-02     1.315e-01   5.670e-02
900       05    1.868e-01   8.335e-02     1.167e-01   4.411e-02
1600      06    1.713e-01   8.671e-02     9.759e-02   3.594e-02

Table D.21: Tabulation of position and orientation RMS $L_2$ error with variation in error weights, $W_p$.

While earlier studies indicated that using initial values could improve performance, some of these values, particularly the feedback gains, would be difficult to estimate. Indeed, only $q_2$, the manipulator inertia parameter, represents an easily estimated initial value.

Without more rigorous methods for establishing integral gains, we are left with a trial and error procedure. The procedure shown here, while not exhaustive, represents a reasonable method for searching the integral gain space. This process has been used to establish a 'best gains' RMDAC gain file for the trajectory, the 'step' response for which appears in figure D.81.

D.1 Control Rates and RMDAC

Given that RMDAC is designed with the assumption that the manipulator parameters vary slowly through time, control rate seems to play a role in determining the stability of a system that operates at the edge or outside of this assumption. Over the course of the trial and error process, large integral gains (or large position and velocity weights) sensitized the controller to large trajectory errors, often leading to instability. Increased control rates were observed to counter large integral gains, in effect reducing integrator wind-up through smaller time steps.

$W_v$     No.   Step Response             Reference Trajectory
60.0      03    2.103e-01   8.628e-02     1.489e-01   6.113e-02
120.0     07    1.572e-01   8.449e-02     1.161e-01   4.733e-02
240.0     08    1.391e-01   8.971e-02     1.112e-01   3.876e-02
480.0     09    1.520e-01   1.028e-01     1.070e-01   4.362e-02

Table D.22: Tabulation of position and orientation RMS $L_2$ error as a function of variation of the weight $W_v$ given $w_{pi} = 900$.

$\alpha_{1i}$   No.   Step Response             Reference Trajectory
1.0             09    1.520e-01   1.028e-01     1.070e-01   4.362e-02
2.0             10    1.437e-01   1.006e-01     8.876e-02   3.614e-02
4.0             11    1.336e-01   9.827e-02     7.128e-02   2.950e-02
6.0             12    1.291e-01   9.600e-02     6.286e-02   2.668e-02

Table D.23: Tabulation of position and orientation RMS $L_2$ error as a function of the feedback integral gain $\alpha_{1i}$, given $\beta_{1i} = 1.0$.

$\beta_{1i}$    No.   Step Response             Reference Trajectory
1.0             09    1.520e-01   1.028e-01     1.070e-01   4.362e-02
2.0             13    1.579e-01   1.024e-01     1.045e-01   4.767e-02
4.0             14    1.629e-01   1.053e-01     1.038e-01   5.246e-02
6.0             15    1.633e-01   1.073e-01     1.012e-01   5.003e-02

Table D.24: Tabulation of position and orientation RMS $L_2$ error as a function of the feedback integral gain $\beta_{1i}$, given $\alpha_{1i} = 1.0$.

An No. Step Response vWl 1.496e-01 Re
Item Metadata
Title | Multiagent manipulator control |
Creator | Monckton, Simon Philip |
Date Issued | 1997 |
Extent | 12235516 bytes |
Genre | Thesis/Dissertation |
Type | Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-04-20 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0080898 |
URI | http://hdl.handle.net/2429/7391 |
Degree | Doctor of Philosophy - PhD |
Program | Mechanical Engineering |
Affiliation | Applied Science, Faculty of; Mechanical Engineering, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 1997-11 |
Campus | UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |