Genetic Algorithms, their Applications and Models in Nonlinear Systems Identification. Frank Lup Ki Wan B.A.Sc. University of British Columbia, 1983 A THESIS SUBMITTED IN PARTIAL FULHLLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in THE FACULTY OF GRADUATE STUDIES DEPARTMENT OF ELECTRICAL ENGINEERING We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA March 1991 © Frank Lup Ki Wan, 1991 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of The University of British Columbia Vancouver, Canada DE-6 (2/88) Abstract The Genetic Algorithm was used to estimate the hydraulic compliance of the hydraulic system on the UBC teleoperated heavy duty excavator. Using real recorded and simulation data from the excavator, the Genetic Algorithm has successfully identified the compliance of single link and multi-link hydraulic system of the excavator. A Parallel GA ( PGA ) was implemented with 16 T800 Transputers. It achieved a speedup factor of 12 over a traditional GA. With such a high speedup factor, real-time monitoring of hydraulic compliance and other hydraulic parameters is becoming possible. New mechanisms such as the distributed fitness function, the active error analysis were used to enhance the performance of a PGA. A PGA which incorporated these mechanisms actually outperformed a traditional GA in key areas such as variance of the estimated parameter and parameter tracking ability. Finally, a physical model that explains the fundamental properties of GAs was introduced. The physical model ( a hypercube ) not only provides an excellent explanation of GAs searching power, but also gives insight to GAs users ways to improve and to predict the performance of GAs in most applications. ii Table of Contents Abstract ii List of Figures vi Acknowledgements ix 1 Introduction 1 1.1 What are Genetic Algorithms ? 1 1.2 Scope of the thesis 4 2 Related background 7 2.1 Teleoperated heavy duty hydraulic machine 7 2.2 The need for an on-line diagnostic program 8 2.3 How the compliance of the hydraulic system changes 9 2.3.1 Hydraulic fluid ( C f l u id ) 10 2.3.2 Mechanical problems ( Cstnicture ) 13 2.3.3 How hydraulic compliance affects the stability of a P-D controller 16 2.3.4 Comments 20 2.4 Difficulties in identifying the compliance of the hydraulic system 20 3 Identification of hydraulic compliance with GA 23 3.1 Identify the compliance of a mobile hydraulic machine 23 3.1.1 Single link identification 24 3.1.2 Implementation details 27 3.1.3 Experimental Results 29 3.2 Identifying multiple link hydraulic compliance simultaneously 34 3.3 On-line identification of hydraulic compliance with GA 40 3.4 Comment and Discussion 42 iii 4 A Parallel Genetic Algorithm implementation for hydraulic compliance identification 44 4.1 Introduction 44 4.2 Speedup, efficiency, and parallel architectures for PGA 46 4.3 Comparison between PGA and tradition GA performance in identification of hydraulic compliance 54 4.4 Comments and discussion 56 5 PGA in system identification 60 5.1 Previous work on PGA ( algorithm B ) 60 5.2 Important requirements on PGA in system identification 61 5.2.1 The conventional PGA implementation ( algorithm B ) 61 5.2.2 Differences between function maximization and system identification 62 5.3 PGA performance Improvement in system identification 65 5.3.1 Distributed fitness function 65 5.3.2 Broadcasting with active error analysis 66 5.4 Testing the distributed fitness function and active error analysis on various PGA configurations 70 5.4.1 Experimental result 73 5.5 Comments and discussion 81 6 Physical insight into how Genetic Algorithms work 83 6.1 Genetic Algorithms search space 84 6.1.1 The binary unit basis 85 6.1.2 The Reproduction, Crossover and Mutation operators on the hypercube 88 6.2 The importance of encoding the parameter space properly 94 6.3 Comments and discussion 98 7 Conclusions 100 Appendix A The boom hydraulic circuit and its model A.103 iv Appendix B Dynamic equations for hydraulic system in configuration a, branch #1 . B.106 B.l Equations describing the hydraulic configuration a branch #1 B.107 B.2 Simplified orifices area calculation B.108 Appendix C Speedup and efficiency definitions C.109 Appendix D Overview of available computational resources D.110 Appendix EPGA implementation details E.112 Appendix F Hydraulic compliance identification program listing F.117 References F.l 18 V List of Figures 1.1 A typical Genetic Algorithm flow chart 3 2.1 Caterpillar 215B excavator 11 2.2 The physical model of a conventional hydraulic system on a CAT215B excavator 11 2.3 Effect of pressure on the bulk modulus of an air-oil mixture. The air is free 12 2.4 Rate of solution of air bubbles in hydraulic oil 12 2.5 Schematic of a heavy-duty hydraulic actuated boom 16 2.6 Steady State Flow QBOI versus Pressure Peoi and Spool Displacement . 16 2.7 General block diagram of a hydraulically actuated robot with P-D feedback control 17 2.8 Response of a P-D controlled hydraulic manipulator to step input . . . . 19 2.9 Response of a P-D controlled hydraulic manipulator with the decreased stiffness 19 2.10 Schematic diagram of the hydraulic main circuit 21 2.11 Different configurations of the hydraulic main circuit 21 2.12 The branch #1 of the main hydraulic valves in configuration ( a ) . . . . 22 3.1 The block diagram of the refitted hydraulic system for teleoperation control 24 3.2 Schematic of a heavy-duty hydraulic actuated boom 28 3.3 Data record 28 3.4 GA interface to the measurement data 29 3.5 Experiment #1 ( with spool data ) 31 3.6 Experiment #2 ( with servo-valve voltage data ) 32 3.7 Block diagram of the identification and simulation setup 33 vi Comparison between measured Peoi and calculated PBCM 34 Simulation of bucket and boom hydraulic circuit 37 Identified internal states from GA 38 Estimated CBUI and CBOI 39 Interface between the control system and diagnostic program 41 Different sets of population for different hydraulic configurations . . . . 42 Two Parallel Genetic Algorithms Architectures 45 Two levels of parallelisation 48 Two parallel architectures for PGA 51 Nonpipeline implementation flow chart 52 Pipeline implementation flow chart 52 Speedup factor of nonpipline implementation versus number of processors 53 Speedup factor of pipeline implementation versus number of processors . 53 Block diagram of GA and PGA for identifying hydraulic compliance . . 55 Experiment #1 ( with spool data ) 58 Experiment #2 ( with servo-voltage data ) 59 Block diagram of GA and PGA in system identification 63 A matrix of fitness error versus different window size 67 PGA configuration #1 with or without distributed fitness function . . . . 76 PGA configuration #2 under highest broadcast rate of 1 per generation with or without distributed fitness function 77 PGA configuration #2 performance with or without distributed fitness function 78 PGA configuration #2 with distributed fitness function and active error analysis 79 Compare PGA configuration #2 ( distributed fitness function and active error analysis ) with traditional GA 80 vii 6.1 Mapping an one dimensional signal into a hypercube 87 6.2 A four dimensional hypercube 88 6.3 The possible locations of the offspring as a function of the bit difference between parents 90 6.4 The combined effect of Reproduction, Crossover 94 6.5 An impulse-like fitness function 94 6.6 Block diagram of conventional GA and new combination GA 95 6.7 "Coarse" and "Fine" GA grid size . 96 A . l The boom hydraulic circuit. A. 103 A. 2 The mathematical model of the boom hydraulic circuit A. 103 B. l The branch #1 of the main hydraulic valves in configuration a . . . . B.l06 D. l Available hardware resources D. l 11 D. 2 The block diagram of the Transputer T800 D. 111 E. l Software block diagram of the master-slave scheme E. l 14 E.2 The Implementation of a PGA ( algorithm A and B ) E.115 E.3 Measurement Method used for Timing Process E.115 viii Acknowledgements 1 am grateful to my supervisors Dr Lawrence and Dr. Dumont for being so patient with me, to let me explore many different interesting areas, and to see me through writing my thesis. 1 like to thank K. Kristinsson for permitting me to use the core of his Genetic Algorithm program and his help on translating his program from Pascal to C. 1 am especially grateful to N. Sepehri for his help and the use of his detailed documentation on the hydraulic system of the UBC teleoperated excavator. 1 will always remember and grateful for the encouragement and support from Darrell Wong, Dan Chan, Real Frenette, Edward Lam and David Warkintin. Finally, 1 am indebted to my family. Their faithful support has made this work possible. ix Chapter 1: Introduction Chapter 1 Introduction 1.1 What are Genetic Algorithms ? Genetic Algorithms (GAs ) are search procedures based on the mechanics of natural genetics and natural selection. They extract the ideas of survival of the fittest from nature to form a robust search mechanism. GAs are a parallel, global search technique that can simultaneously search through many points in the parameter space. GAs differ from other search techniques by the use of concepts taken from natural genetics and evolution theory. 1. GAs use finite length codings of the parameters ( usually in binary ), instead of the parameters themselves (The binary string coding of parameters is necessary for the Crossover and Mutation process ). 2. GAs work with a population-by-population approach rather than point-by-point techniques. 3. GAs need only fitness values (dependent on application). There is no requirement for derivatives or other auxiliary knowledge. 4. GAs use probabilistic transition rules instead of deterministic ones. A simple GA consists of three operators: Reproduction, Crossover and Mutation. Reproduction is the survival of the fittest within a GA. Based on the survival of the fittest principle, it decides which strings are going to survive and which ones are going to disappear according to their normalized fitness values. Individuals with above average fitness will have more offsprings than those with below average fitness. This step directs the search towards the best individuals. Crossover is a randomized yet structured operator that takes valuable information from both strings ( parents ) and combines it to find a highly fit individual. To apply this operator, 2 individuals are mated at random and a point on the interval l < k < N - l ( N = length of the binary encoded strings ) is chosen randomly. Two new strings are then created by exchanging all characters between positions 1 and k inclusively. For example, given the following two binary encoded strings: 1 Chapter J: Introduction A0 = '0000000' B0 = '1111111' (1.1) and with crossover point at 4, the new strings will be: Ai = '1111000' Bx = '0000111' (1.2) The Reproduction and Crossover operators are responsible for most of the searching power of the Genetic Algorithm. The Mutation operator is used to introduce new information into the population at the bit level. It acts as an insurance policy against the loss of important genetic material at a particular position. Figure 1.1 shows a simplified block diagram of a genetic algorithm. A more rigorous explanation of how the GA work is in Holland's "theory of schemata"[ll]. The GA always converges toward the best from the algorithm's point of view. However there is no guarantee that it will converge to the optimum ( the true value ). Nevertheless it has many advantages over other "identification methods" Identification methods such as recursive least squares, instrumental variable method are based on the principle that a smooth search space exists. These recursive schemes are in essence local search techniques that search for zero gradient by going in a direction suggested by the local gradient. Thus they often fail in the search for a global maximum if the search space is not differentiable and linear in the parameter space. Statistical techniques will do better in multimodal functions ( which have multiple optima), but they require too much computation time for higher dimensional problems. GAs have been shown to behave well on multimodal functions. They work much more subtly than just a random search with preservation of the best Their ability to handle functions that are nonlinear and discontinuous makes them good candidates in function maximization and system identification. GAs do require massive computation power since they are population rather than single point based algorithms. However, GAs are inherently parallel. A l l individuals in a population evolve simultaneously without central coordination. Many parallel implementations of the GA [18][17][16][24][25] for function maximization have been demonstrated recently. With the advent of parallel processing, GAs are becoming a practical on-line identification method for many "difficult" problems. 2 Chapter 1: Introduction Initial ization: - Set the parameters search space - Randomly generate a population I Applying a fitness function Genetic Algorithm Operators: 1. Reproduction 2. Recombination 3. Mutation 1 populati uon 1 Output the best fit individual Figure 1.1: A typical Genetic Algorithm flow chart To demonstrate how a GA works, a second-order system of the following form is used: y(k) = ax x y(k - 1) + a2 x y(k - 2) 4- 60 X (u(k) + bi x u(k - 1)) (1.3) Where ai, a2, bo, b\ are the unknown parameters. Assuming the range of the unkown parameters are, - 5 < ai < 5; - 5 < a 2 < 5; - 5 < 60 < 5; - 5 < b{ < 5 (1.4) the unknown parameters are converted into binary numbers. The number of bits ( N ) for each parameter depends on the required resolution. N can be calculated with the following equation: resolution = or : N = log2 (upper bound - lower bound) 2N (upper bound — lower bound) resolution The binary numbers are concatenated to form an individual string as: (1.5) £(<*,) I B(a2) \ B(b0)\ Bih) (1.6) If a resolution of 0.01 is used for all the parameters, such an individual string length will be 40 bits long. . Chapter J: Introduction The choice of the population size depends strongly on the individual string length [8]. For string length of 40, a population size between 100 to 400 should be used[8]. A random number generator is initially used to generate all individuals in the population. For discrete parameter identification, a fitness function can be defined as: window size fitness =y^(Bias - (y(k — to) — y(k - w))) (1.7) u>=0 where: y(k) = measured output value of the system. y(k) = calculated output value based on the estimated parameters. Bias : a positive constant that ensures positive fitness for good estimated parameters. Window : used to minimize the estimated parameters' variance. Refering to Figure 1.1, after fitness values of all individuals in the population are calculated, the GA operators such as Reproduction, Crossover and Mutation will operate on the corresponding individual binary strings. The binary string format allows GA operators to create a new generation of binary strings. The search for a better individual ( parameters ) will continue as shown in Figure 1.1 1.2 Scope of the thesis Das and Goldberg[5] have shown that a GA can be used in estimating discrete-time parameters. They have demonstrated the feasibility of a GA in identifying parameters a second-order system and suggested future GA extensions to more complex problems. Kristinsson and Dumont [13] have made use of the unique properties of GAs to directly estimate poles and zeros of a system. By doing so, the design of an adaptive pole-placement controller for such system is simplified. They found that: 1. A GA convergence rate is similar to that of Recursive Least Square ( RLS ) in terms of number of input-output data points. 2. A GA gives unbiased estimates in the presence of colored noise. 4 Chapter 1: Introduction 3,. A GA could be used in cases where the system is not linear in the unknown parameters ( RLS cannot be applied ). Because of its unique advantages, GAs can be applied to areas where conventional identification methods perform poorly or fail. A good example of such an area is the hydraulic system in a typical heavy duty mobile machine, e.g. an excavator, where it is very difficult to apply conventional identification methods. For example, the compliances ( denned later ) of the hydraulic system are very important since they affect the controllability of the hydraulic system. In order to apply GAs in real-time system identification, it is necessary to run the GA on multiple processors. There are several ways to parallelize GAs. It is important to determine the achievable speedup factor, efficiency of a parallel implemented GAs ( PGAs ) that are used for system identification. Many papers have been published on using the PGA in function maximization. However, system identification requirements are quite different from that of function maximization. The performance of PGAs used for system identification can be further enhanced or optimized by various techniques. Thus it is possible that in certain applications, PGAs can outperform traditional GAs in key areas such as convergence rate of estimated parameter and variance of the estimated parameters. This thesis is organized into the following chapters: 1. Chapter 2 describes the operation of the teleoperated heavy duty hydraulic excavator and the related backgrounds of the excavator hydraulic system. It also illustrates mechanisms that can cause the compliance of the system to vary. 2. Chapter 3 shows how a GA is used to estimate single link and multiple links hydraulic compliances on the excavator. It also presents one possible way to estimate mult-link hydraulic compliance on-line with GAs. 3. Chapter 4 discusses ways that GAs can be parallelized. It suggests ways to improve the speedup factor and efficiency of the parallel architectures for PGAs. A PGA is implemented to estimate the hydraulic compliance in the excavator hydraulic system ( Chapter 3 ). The performance of such PGA and traditional GAs is compared. 5 Chapter J: Introduction 4. Chapter 5 discusses ways to improve PGAs performance in system identification. Various techniques will be used and their effectiveness on PGAs performance will be examined. 5. Chapter 6 offers physical insights on the searching power of GAs. 6. Chapter 7 will summarize all the above ideas and suggests future research. 6 Chapter 2: Related background Chapter 2 Related background 2.1 Teleoperated heavy duty hydraulic machine Figure 2.1 shows a typical manual joint mode control found on excavator machines today. The control consists of two two-degree-of-freedom joysticks with a one-to-one mapping between the joystick motions and the links. This method of control is very easy to implement but requires extensive training time for most operators to master. In the Telerobotics Lab at UBC Electrical Engineering, resolved motion control is used on the prototype teleoperated heavy duty hydraulic excavator. Resolved motion control, which is based on end point trajectory technique [23], provides a much more "natural" motion interface to the operators. There are many benefits of applying teleoperation and computer control to heavy duty hydraulic machines: 1. Computers can create a better human interface, such as resolved motion control, to the machine. While it may take one year for operators to be proficient in the manual joint mode control, most operators can usually master the resolved motion control in a much shorter period. Resolved motion can also increase the operator productivity. 2. Advanced control algorithms provide a more efficient and safer way to control the hydraulic actuators in the machine [23]. 3. Overload sensing and stability checking can be added to a teleoperated machine. Such features can prevent dangerous manoeuvres by the operators. The safety of the operators and the machines arc thus increased. Figure 2.1 shows the physical block diagram of the hydraulic system of a conventional manually operated CAT 215B excavator. The major components of the system are: 1. Pilot valves: manual control for the pilot pressure which determines the spools displacement in the main hydraulic valves. The displacement of the spools in the main valves determine the orifices areas of the main values. 7 Chapter 2: Related background 2. Main valves: It is the heart of the hydraulic system. It consists of 2 independent branches. Each branch is connected to a constant flow pump. Branch #1 contains the bucket and the boom valves and branch #2 contains the swing and the stick valves. A cross-over network consists of pressure operated valves which connects both branches together. The cross-over network allows either pump to provide surplus power from one branch to other branch when needed. The schematic of the main valves is in Appendix A. 3. Hydraulic actuators: The boom, the stick and the bucket are moved by linear hydraulic actuators. The swing is powered by a hydraulic motor. All the input and output lines of the hydraulic actuators and motor are connected to the main valves by flexible hydraulic hoses. For teleoperated control, the conventional pilot valves are replaced with a new set of servo-valves. The servo-valves will respond to the control voltage and build up the corresponding pressure to move the main hydraulic spool to the desired displacement. Pressure sensors are installed on the input and output lines of the hydraulic actuators ( swing motor, boom, stick, bucket ). Resolvers are installed on the mechanical joints to measure joints angles. The block diagram of the refitted hydraulic system is illustrated in Figure 3.1. 2.2 The need for an on-line diagnostic program Heavy duty hydraulic machines such as excavators are designed to operate in a rugged environ-ments. Periodic maintenance will help cut down the repair frequency of machines. Yet for hydraulic machines operating in a rugged environments, component breakdown is very common. Many ma-chine failures are not foreseeable and repairs are done on as-required basis. In forestry, major repairs are difficult for machines operating in remote areas. The downtime of such machines can be as high as 30% of the machine's life time. Some experienced operators can readily sense malfunction or performance degradation of the hydraulic machines. With the manufacturers' diagnostic procedures and the operators' experience, faulty or damaged components can be isolated and the necessary repairs made. The control of a teleoperated heavy duty hydraulic machine is similar to the "fly-by-wire" control on advanced aircrafts such as Airbus 300 series. The operator's joystick motion is translated into 8 Chapter 2: Related background commands by the computer and the corresponding hydraulic actuators motions are executed using some control algorithm. In a closed loop control system, the operator is isolated from the physical machine. Using the prototype teleoperated excavator as an example, if there was a leakage in the boom hydraulic actuator , both experienced and inexperienced operators could not observe any steady-state error in boom motion since the closed-loop control would have compensated for the leakage. If one of the hydraulic hoses was worn out or damaged, the operator may detect some degradation in the machine performance, but he would rely on the computer to perform the proper diagnosis. For reliable and safe operation of any teleoperated hydraulic machine, the onboard computer must periodically perform on-line self-checking, which involves identifying some important hydraulic parameters of the hydraulic circuit. With the help of a proper on-line diagnostic program, it may be possible to identify early signs of most common mechanical problems, such as worn out hydraulic hoses, dirt build up in the hydraulic valve, and leakage in hydraulic actuator, before they become major problems later. The on-line diagnostic program will reduce the downtime of the machine. 2.3 How the compliance of the hydraulic system changes The hydraulic compliance of a typical mobile hydraulic machine, i.e. an excavator, used in the primary industry can be defined as: c, system + C, (2.1) (2.2) where C. system = the compliance of the hydraulic system the compliance of hydraulic fluid C, •structure = the compliance of the flexible hose and the surrounding structure. 9 Chapter 2: Related background Vflujd = volume of the fluid " y^fluid = the bulk modulus of hydraulic fluid 2.3.1 Hydraulic fluid ( C f l u l d ) The compliance of the hydraulic fluid changes with time due to aeration. Air can be dissolved or entrained in hydraulic fluid. However, the bulk modulus of the hydraulic fluid, (3 is affected by undissolved free air in the fluid ( Figure 2.3, reprinted from [19]). Under high pressure, more air can be dissolved in hydraulic fluid. However as air bubbles are compressed and passed through the pump, they do not dissolve immediately. Figure 2.4 ( reprinted from [19] ) shows the rate of solution of air bubbles in hydraulic oil. Very often tiny bubbles of air accumulate on the surface of the hydraulic actuator or valve, causing the hydraulic compliance of the actuator to increase by an order of magnitude[19]. 10 Chapter 2: Related backgrounds Figure 2.1: Caterpillar 215B excavator Pump Pump #1 #2 Operator Manual Control ' Spool 1 I Pilot-Valve Spool PUot-Varve Spool Main Hydraulic Valves PUot-Valve Spool Pilot-Valve PU PO Bucket I I \ Stick 3 • Boom 1 H-\ hydraulic hose oD Swing Figure 2.2: The physical model of a conventional hydraulic system on a CAT215B excavator 11 Chapter 2: Related backgrounds Figure 2.3 Effect of pressure on the bulk modulus of an air-oil mixture. The air is free. Tim* (s) Figure 2. if Rate of solution of air bubbles in hydraulic oil Chapter 2: Related background 2.3.2 Mechanical problems ( Cstmctw-e ) There are many other conditions such as worn out hydraulic hose, leakage inside the hydraulic actuator and seal failure that could cause the compliance of the system to deviate. Leakage in actuator or seal failure In a manually operated hydraulic machine, if there is a significant leakage ( i.e. 5% to 10%) in the hydraulic actuator, the operator will notice some loss of power and unwanted drift motion in the associated link. In a teleoperated hydraulic machine, the closed-loop control would compensate for the loss of power and drift motion. Thus the closed-loop control could mask the symptoms of the leakage or improper sealing in the hydraulic actuator from the operator. If unattended, this minor problem may evolve into a major machine failure and the safety of the machine operator will be compromised. Referring to the boom hydraulic circuit in Figure 2.5, if there is no leakage in the actuator: Qi = dPi + AX (2.3) where : Qi = flow at the pump side d = the hydraulic compliance Pi = change in pump side pressure Ai = area of the pump side cylinder X = velocity of the cylinder rod Assume there is a leakage in the actuator, Qj flows from input chamber along the wall of the cylinder to the output chamber. Q'i = C'iP'i + AiX + Qi (2.5) 13 Chapter 2: Related background where : Qi = new flow at the pump side C'i = apparent hydraulic compliance Pi = new change in pump side pressure Ql = leakage in the boom actuator Because of the closed-loop control, the error in boom angular velocity is small in steady state and X remains unchanged. However the transient response of the system is quite different and the abnormality can be detected by monitoring the compliance. Combining the above two equations, Qi = C'iP'i - dPi / , x • • • • (2-7) < \ Ci - CAPi since P{ > P{ Thus the change in compliance will be at least proportional to the leakage: (C'i - d) = Ad > ^ (2.8) If the compliance of the hydraulic circuit is monitored periodically, then the abnormal deviation in the measured compliance would be detected and warning can be sent to the controller. Voltage-spool transfer function drift Drift in voltage-spool transfer function can be caused by a worn out spool or poor pressure regulation in the servo-valves. Since such problems affect the flow directly, the measured compliances are affected as well. Referring to the boom hydraulic circuit in Figure 2.5, the sensitivity of steady state flow with respect to pressure and spool displacement can be seen in Table 2.6 which is derived from the hydraulic equations for boom ( Appendix A ). For example, an error of 0.02 inch ( 8 % ) in spool displacement causes a 50% difference in flow QBCM-14 Chapter 2: Related background In the prototype teleoperated excavator designed at the UBC Robotic Lab, an advanced control algorithm based on a computed flow control is used [23]. This control algorithm consists of feedforward compensation and conventional proportional-derivative feedback. This feedforward compensation technique decouples the nonlinearities in the hydraulic and joint structure of the excavator. With most of the nonlinearities compensated by the feedforward term, the controller can use smaller proportional Kp and derivative Kv gain in the feedback to correct for minor unmodelled dynamics. The controller will be more stable and performs much better than a simple P-D feedback controller. The feedforward term involves calculating the desired spool displacement based on the line pressure readings and the desired joints velocities. Using the spool-voltage transfer function ( the inverse of voltage-spool transfer function, Appendix B), the control voltages for the servo-valves can be found. Thus any offset drift in the servo-valve voltage-spool transfer function will cause the feedforward term to be incorrect. From Table 2.6, an error as high as 50 percent in the feedforward term is possible. The controller performance will deteriorate significantly since the proportional and derivative feedback have to compensate for all 50% of the highly nonlinear flow error. Pump Figure 2.5: Schematic of a heavy-duty hydraulic actuated boom 15 PBOJ@ [ 2030 psi K13996 kPa ) PBOI @ [ 1830 psi ]( 12617 kPa ) PBO;@ t 1630 psi ]( 11239 kPa ) QBOJ [ in3/s ]( cm3/s ) Spool [ in ]( cm ) Spool [ in ]( cm ) Spool [ in ]( cm ) [ 60 ] ( 983.4 ) [ 0.2741 ] ( 0.696 ) [ 0.2688 ] ( 0.6827 ) [ 0.2627 ] ( 0.6673 ) [ 50 ] ( 819.5 ) [ 0.2650 ] ( 0.673 ) [ 0.2592 ] ( 0.6584 ) [ 0.2527 ](0.6419) [ 40 ] ( 655.6 ) [ 0.2561 ] ( 0.650 ) [ 0.2499 ] ( 0.6347 ) [ 0.2427 ] (0.6165 ) Figure 2.6: Steady State Flow QBCH versus Pressure Peoi and Spool Displacement 2.3.3 How hydraulic compliance affects the stability of a P-D controller The effect of hydraulic compliance on the system under P-D closed-loop control was illustrated in [14]. Figure 2.7 shows a block diagram of a hydraulically actuated robot with P - D feedback control. The linearized spool and hydraulic actuator model are combined with the proportional and derivative feedback control to form an overall transfer function: Pips) _ Kpag Qf(s) 5 n + a2s2 + (oi + OQKv)S + KPOQ •2KXLDM KPI 2D2M ( 2 ' 9 ) a o = —j--—; Chapter 2: Related background 2.3.4 Comments In a teleoperated hydraulic machine, the performance of the controller is affected by the hydraulic compliance of the system. Since many other common problems such as wom-out hoses, trapped air bubbles, leakage in the actuator, and defective seals all affect the compliance, monitoring the compliance will give the diagnostic program an early warning from problems in the hydraulic system. With additional information and help from some other techniques, it is possible for the diagnostic program to isolate the causes. Proper action can be taken before the teleoperated heavy duty hydraulic machine performance starts to degrade significantly or major machine failures happen. 2.4 Difficulties in identifying the compliance of the hydraulic system Difficulties The main hydraulic valve is the heart of the hydraulic system. The schematic diagram of the hydraulic main valves is shown in Figure 2.10. 1. The complexity of the circuit is increased by the presence of the cross-over network. Depending on the internal pressures P12, P13, P22, P23. the hydraulic circuit can have 5 possible configurations ( Figure 2.11 ). 2. The equations that govern the hydraulic system are nonlinear and the steady state solution can only be obtained by iteration method. Fig. 2.12 shows the configuration a, branch #1 and the set of equations that describe the hydraulic circuit Note that the internal states Pu, P12, Q11, Q12 have to be solved by iteration method. 3. The hydraulic compliance can only be identified during the transient stage of the hydraulic actuator's motion, i.e. when the line pressures PBUI and Prjoi are still rising. The number of data points available is small. The identification algorithm must have a fast parameter convergence rate. 1 9 Chapter 2: Related backgrounds L_!_l T a n k > aebu B u c k e t S p o o l Jon Junction M2 ^aebo ^ Q12 Pe T a n k B o o m A c t u a t o r ^ aibo hose p BOi | I | T a n k B o o m S p o o l ^aobo Oo hose l * B O BOo Figure 2.12: The branch #1 of the main hydraulic valves in configuration ( a ) 21 Chapter 3: Identification of hydraulic compliance with GA Chapter 3 Identification of hydraulic compliance with GA 3.1 Identify the compliance of a mobile hydraulic machine The GA was used to identify the hydraulic compliance of the refitted hydraulic system of the prototype teleoperated CAT 215B excavator. The block diagram of the hydraulic system is showed in Figure 3.1. Why the Genetic Algorithm ? The Genetic Algorithm ( GA ) is chosen to identify the compliances of the system because: 1. The number of unknown parameters is small. There are 4 independent links and thus only 4 compliances Cswi. CBCH. CSTI. and Cam that need to be identified. The size of the global population for GA depends strongly on the number of unknown parameters. A large global population will require extensive computation power. 2. The unique feature of GA is that it does not put strict requirements on the model. Its population based searching algorithm can easily handle discontinuities and nonlinearities in a model. The GA does not require the hydraulic equations to be arranged in any form. There is no need to rearrange the equations to find a closed form solution for the internal states Pu, P12, ... etc. . 3. Due to the cross-over network, the model of the hydraulic system is changing from time to time depending on the operator command and the external loading. It is very difficult to find an identification method, especially a recursive one that can handle such model changes. The GA is a nonrecursive algorithm. It stores the estimated parameters in its population, which is independent from changes in the model of the hydraulic system. 4. Previous work[5][12] showed that the GA parameter convergence rate is similar to ( or slightly faster than ) Recursive Least Square methods. Thus GA parameter convergence rate is sufficiently high for compliance identification. 5. Even though GA is computation intensive, it is a naturally parallelisable algorithm [17] [24]. The GA speed can be improved with parallel processing. 21 Chapter 3: Identification of hydraulic compliance with GA Operator Joys Intei lick face Vbu Servo-Valve Pump #1 Pump #2 3l Vst Servo-Valve Spool Vbo Servo-Valve Spool Vsw Servo-Valve Spool Main Hydraulic Valves hydraulic hose PI PO Bucket Stick Boom Control Algorithm: Feedforward compensation + P-D feedback control PI PO A i_ Swing Pressure sensor Resolver Figure 3.1: The block diagram of the refitted hydraulic system for teleoperation control Other method ( Recursive Least Square method ) Recursive Least Square ( RLS ) method can be applied to single link hydraulic compliance identification by linearizing the simple single link hydraulic equations. However.there are quite a few difficulties in applying RLS in multi-link example. First, because there is no closed form solution for Pii and P12 (internal states of the hydraulic system ), it is difficult to formulate the constraints into the RLS formula. Second, since the hydraulic equations are highly nonlinear, there is no guarantee that the algorithm RLS will converge at all. Although RLS is much less computational intensive (30 — 50 times less ) than the conventional GA, it can not be used in the hydraulic area because of the problem nature. 3.1.1 Single link identification The boom link was chosen to test GA performance in identifying the hydraulic compliance. Figure 3.2 shows a schematic of a heavy-duty hydraulically actuated boom of an excavator. The three orifices ( ai, ae, ao ) belong to the boom main valve and are controlled by the boom spool. The main valve is connected to the actuator through two flexible hoses ( compliances Ci and Co ). The side of the actuator which is connected to the pump with output flow Q is called the pump 23 Chapter 3: Identification of hydraulic compliance with GA side and the side which is connected to the tank is called the drain side. The pump pressure P is measured but it is too noisy to be useful. Line pressures Pi and Po are measured by pressure sensors and they are related to the flows as follows Pump Side: Qi = AX + Pid = ka{ ^P-Pi (3.1) Qe = kaeyfP^Te (3.2) Q = Qi + Qe (3.3) Drain Side: Q0 = A0X- P0C0 = kaeo y/P0 - Pe (3.4) where k is the discharge coefficient [20]. Pe is the tank pressure, Qe is the flow to the tank, and Q is the total pump flow ( a known constant ). The above equations are used to identify parameters Q and C 0 . For example to identify Q, the above equations are used as follows: 1- Given the measured values of Pi(m), where (m) denotes measured data, and the boom joint angle at time t, the first order derivative of the pressure Pi and boom angular velocity are determined by linear piece-wise least-square curve-fitting technique. ( X(m) is proportional to the boom angular velocity ) 2- A population consisting of binary encoded string Cj(j), (j=i,2,.. n ) with its range defined already ( 0.1 > Q > 0.0 ) is initially generated randomly. The above equations are used recursively as follows: Qi = AiX(m) + Pi(m)Ci(j) (3.5) 24 Chapter 3: Identification of hydraulic compliance with GA (3.6) Q< = kae\Jp- (3.7) Error = E(j) — Q - Qi (3.8) where Qi, Qe< P are the intermediate state information. Note that the measured pump pressure P is too noisy, the estimated one P is used in the calculation instead. The orifice areas are derived from the spool or the servo-valve voltage measurement ( Appendix A ). Absolute operators are used to prevent Error cancellation due to sign difference in the calculation 3- Step two is performed over a window size to minimize the estimated parameter Q variance and the effect of noise. The fitness value of a particular Q(j) is defined as: window window (3.9) fitness(CiU)) = £ {B-E\j))< £ B t t where : window = 10 B = 400 The value of B ( Bias ) is chosen so that if E2(j) is larger than Bias, the fitness value of such C;(j) will become negative. 4- Based on the fitness value, Q(j) with high fitness value will be allowed to have more offspring. Q(j) with small or negative fitness value will be discarded from the population. GA operators ( Crossover and mutation ) will then generate a new population of Q(j) j=i,2 n- Step one to three are then repeated until the transient period is over ( Pi(m) is steady ). Q(j) with the highest fitness value will be outputed or recorded. 25 Chapter 3: Identification of hydraulic compliance with GA 3.1.2 Implementation details The recorded data was obtained from Real Frenette and Nariman Serpehri's feedforward com-pensation, open loop experiment on the prototype teleoperated CAT215B excavator. Figure 3.3a-d showed the recorded data of boom actuator line pressures ( Psoi and PBOO ) » servo-valve voltage and spool measurement, and boom angle versus time for a boom command angular velocity of 4 deg/s . 26 Chapter 3: Identification of hydraulic compliance with GA Boom Angle Velocity command Time ( Sec ) PBOi. PBOo. Ppump - 0 Time (Sec) V o l t ( m V ) Spoo l f i nch x .001) 5 0 0 --500 Time (Sec) Time (Sec) Figure 3.3 Data record Chapter 3: Identification of hydraulic compliance with GA Preparation and Considerations The accuracy of the mathematical model: The mathematical model should be a good approx-imation of the physical process. Otherwise, it would be difficult for GA or any other identification method to work properly. In the hydraulic model, the first-order pipe model is only an approximation of the transmission line dynamic. The model ignores the turbulence and all other second-order effects. Filtering out noise in the measured data: Since GA is trying to identify hydraulic parameters during the transient period, it cannot perform much smoothing on the measured data ( even if it is noisy ) , because such an operation would eliminate the transient information. For example, the boom angle must be differentiated to obtain the boom angular velocity. If too many data points were used to find a smooth velocity profile, then the transient information of the boom angular velocity would be lost. The amount of smoothing on measured data should be kept to a minimum to preserve the transient information. Figure 3.4 shows how GA converts the measured data into the variables used in the equations. The first order derivative Psoi of the pressure and boom velocity are determined by a linear piece-wise least-squares curve-fitting technique with number of data points limited to 5 samples. The complete detail of the boom hydraulic system is listed in Appendix B. Derivat ive B o o m angle Derivat ive P r BOi B o o m angular velocity B o o m l ink geometry v BO Servo-valve voltage V- Spool transfer function spool Spool posit ive dynamics spool Spool position ( i f avai lable ) Spool-Orif ices transfer function aibo Figure 3.4: GA interface to the measurement data 3.1.3 Experimental Results The first attempt to identify Q failed. A negative value of C; was obtained from the measured data. After comparing the boom, the spool, voltage measurement with the results from simulation, 28 Chapter 3: Identification of hydraulic compliance with GA it was confirmed that both the spool measurement and the voltage-spool transfer function had the same offset error. That both the voltage-spool transfer function and spool measurement had the same offset error may be due to the fact that the voltage-spool transfer function was originally derived from the same measuring equipment An offset of 0.01 inch in spool measurement caused a steady state flow error of 20 percent. In the following experiments, both spool and voltage measurement were compensated by the calculated offset. Two experiments were performed. In the experiment #1, the spool measurement was used to calculate the orifice area ( ai, ae, ao ). In the experiment #2, the servo-valve voltage is used. If the servo-valve voltage is used ( Figure 3.4 ), then the spool dynamics will be involved in orifice area calculation. Even though the spool dynamics is not known precisely and may affect the accuracy of the calculated orifice areas, the servo-valve voltage is preferred because direct spool measurement may not always be available ( special equipment is needed ). The results of experiment #1 and #2 are in Figure 3.5 and 3.6 respectively. Line pressures PBOL PBOO and boom motion are showed in Figure 3.5 a-b. Identification was performed during the pressure rise where the compliance Csoi ( Q ) was effective. The estimated CBOI and Error ( defined earlier ) of both experiments are shown in Figure 3.5c-d and Figure 3.6c-d. In both experiments, during the period when GA was active, the Error was decreasing monotonically. The difference between estimated CBOI from both experiments was small ( 0.005 versus 0.0086 ). 29 Chapter 3: Identification of hydraulic compliance with GA 30 Chapter 3: Identification of hydraulic compliance with GA 2> •a or 2500 •^ 2000 g 1500 | 1000 .E 500 1 1 1 1 — i 1 1 1 1 £ — ' a t 1 r i i -LU 40 20 0 1 i » • 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 T i m e ( S e c ) Figure 3.:(5" Experiment #2 ( with servo-valve voltage data ) SI Chapter 3: Identification of hydraulic compliance with GA In order to test the accuracy of the estimated hydraulic compliance Ci, this estimated Ci value, with a different set of measured data ( voltage, boom angle ) are used in a simulation which would calculate the boom actuator input line pressure PBOI- The calculated PBCH would be compared with the measured PBOI - Figure 3.7 shows the block diagram of the excavator model, GA and the simulator. The line pressure PBOJ from the simulator and the measured Psoi are plotted in Figure 3.8a. The agreement between both measured and simulated Pfloi was excellent, an indication that the estimated CBOi ( 0.0086 ) was accurate. In order to see how CBOI affect PBOL two different values CBOJ ( 0.0043 and 0.0172 ) were used in the simulator and the estimated Psoi are plotted side by side with the measured PBOI in Figure 3.8b. There was a significant difference between these three sets of curves. The two experiments with the simulator confirmed the accuracy of the estimated CBOI-Testing the accuracy of the estimated hydraulic parameters Feedback Servo-Valve Voltage Actuator pressures Control Servo-valves. Actuators Joint angles'1 Gene t i c A l g o r i t h m In Ident i f i ca t ion Excavator Structure: Bucket, Stick, Boom of the Excavator Actuator pressure Joint angle (Boom angle) Hydraulic compliance Hydraulic System: Servo-valves, Actuators Excavator Structure: Bucket, Stick, Boom of the Excavator S i m u l a t o r Comparison Figure 3.7: Block diagram of the identification and simulation setup 32 Chapter 3: Identification of hydraulic compliance with GA 2000 1500 g. 1000 -500 2000 1500 I. 1000 500 0 Measured PBOi and estimated PBCH (Sboom = 4 deg/s fVcG C ^ s o.otTz. 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 Time (Second) Measured PBOi and estimated PBOi (5>boom = 4 deg/s 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 Time (Second) Figure 3.8: Comparison between measured PBCH and calculated Peoi 3.2 Identifying multiple link hydraulic compliance simultaneously To demonstrate that G A can identify compliance Caui and CBOI simultaneously, the hydraulic system configuration a, branch #1 is used for illustration ( Figure B . l Appendix B ). To test the performance of G A , a simulation program based on the simplified hydraulics model is used ( Appendix B ). In the simulation, both bucket and boom are activated simultaneously. A l l the states of the simulated hydraulic system are shown on Figure 3.9a and Figure 3.9b. Every individual in the population was made up of a binary coded string ( 20 bits long, 10 bits for each parameter ) corresponding to CBIH and CBOJ- An individual j was selected from the population pool and its genes CBUiO) and CBOIO) w e r e u s e d a s follows: 1. At sample data point k, the Peui{k) PBOi{k) w c r e obtained from measured PBUiOO, PBOi(k) by linear piece-wise least-square curve fitting technique. 2. Xsuik) Xso{k) were proportional to the bucket and boom joint angular velocity. 33 Chapter 3: Identification of hydraulic compliance with GA 3. CsuiCJ) and CBOIO) w e r e u s e a in Equation B2 ( Appendix B ) to solve for Qsuiik) Qsoiik) • Given the flow Qsui{k) QBOi(k) , the internal states Pn, Pi 2, Qn(k),Qn(k) were obtained from Equation B3 ( Appendix B ) . 4. Given that the flow in junction #1 and junction #2 were: Qt = QBW + Qn Junctional (3.10) Qn = Qn + QBCH Junction#2 With the estimated Qsui{k), QBOi(k), Qn(k), Qn(k) from CBIKCJ) and CBOIO)- the summation of flow errors in junction #1 and #2 were used as the fitness criterion for GA. E rror = (Junctional flow error) + (Junction#2 flow error) = {Qt - QBUi(k) - Qu(kj) | + | (Qn(k) - Qi2{k) - QBoi(k)) (3.11) 5. A window of size 10 was used to minimize the estimated parameter variance and noise effect. 6. The selection process here was similar to the single link version. If the selected individual j genes CsuiO) and CBOIG) w e r e far from the correct value, the Error attached to the string were large and the string was removed from the population. On the other hand, individuals with high fitness values will reproduce more offsprings. 7. For every sample k, GA calculated the internal pressures Pn, P12, and the flows Qn, Q12, QBUL Q_BOi for all the individuals j , (j=i,...,n) in the population and recorded the best individual and its associated internal states. 8. The GA operators ( Crossover and mutation ) operated on all the binary encoded strings and generate a new population. 9. Step 1 - 8 are repeated until the transient period is over (the line pressures PBUIOO and PBOIOO are steady ). window (3.12) t*=0 34 Chapter 3: Identification of hydraulic compliance with GA In this experiment, the population size was set to 120. The fitness window size was set to 10. To observe how well GA performed, one can compare GA estimated internal states and the simulated internal states of the hydraulic system ( Figure 3.10a and Figure 3.10b ). The agreement between both sets of curves is excellent. Figure 3.11 shows that estimated CBUI. CBOJ converge very rapidly to their true value. However, when the generation reaches 425, both estimated CBUJ. CBOI start to deviate. This phenomenon is caused by the lack of excitation in the input signal: both simulated PBUi(k) and PBOi(k) approach steady state values and Pnui(k) = 0, Prjoi(k) = 0 . 35 Chapter 3: Identification of hydraulic compliance with GA Figure 3.°, Simulation of bucket and boom hydraulic circuit 36 Chapter 3: Identification of hydraulic compliance with GA Pll .P12 . @C&ufO0073,CbOfO.011 5 0 100 150 200 250 300 350 400 Qll Q12 QBu;QBoi 5 0 W 150 200 250 300 350 400 Figure 3. |0 Identified internal states from GA 3 1 Chapter 3: Identification of hydraulic compliance with GA 0.07 0.06 0.05 0.04 0.03 0.02 0.01 C B u Y C B o t @Ceu BQ.0073, ©Cfepga.Ol 1 1 1 1 fa) • i 1 1 r i -i- 1 r - / . j A r-\ i • -JJJ, ( - - ^ . _ _ ^ . . „ . ^ J . . 1 1 1 i 50 100 150 200 250 300 Tr«ne ( 0.0025 s * ) 350 400 Figure 3.11 Estimated Cam and CBOJ 1 QeAerctflOf] = 0.| S AC 38 Chapter 3: Identification of hydraulic compliance with GA 3.3 On-line identification of hydraulic compliance with GA Interaction between the Control Algorithm and GA Depending on the operator commands and the loading of the system, the control algorithm demonstrated in [23] will respond and the hydraulic configuration will change accordingly. Com-munication between the control algorithm and the diagnostic program is essential since the on-line diagnostic program based on GA needs the configuration of the hydraulic circuit to track and identify the compliance. The on-line diagnostic program should be an integral part of the control system. The control system determines when and how often to perform a diagnostic check on the hydraulic system. It can also determine which configuration of the hydraulic system and which particular links should be checked by the diagnostic program. How GA handles multiple configurations of the hydraulic circuit There are 4 different compliances, CBUJ. CBOI, CSTI. Cswi that need to be monitored ( assume only the pump side of the hydraulic system is monitored ). Figure 3.12 shows the interface between GA and the control algorithm. The control algorithm supplies the configuration information to the diagnostic program. The diagnostic program will then run GA with the appropriate set of populations. Figure 3.13 shows different sets of population for 15 different hydraulic configurations: If only one link in any configuration is active, then GA will only use population set PI a, or Plb, or Pic, or Pld. For configuration with multiple active links, population sets P2, P3 and P4 will be used. For example, in configuration (a), branch #1 contains the bucket and the boom joints. Assuming that both bucket and boom joints in branch #1 are active, GA must identify the compliance CflUi and CBOI simultaneously. Population set P2a is used because an individual in population set P2a consists of Cf iUi and CBOJ encoded in a binary string, and thus the GA operators such as Crossover, mutation can operate on the encoded parameters CBUJO) and CBOJCJ) simultaneously. If only the bucket is active, then GA will use population set Pla. Note, not all possible 15 combinations of C B U i . C B O i . C s T i and Cswi have to be identified. Referring to Figure 2.11 in Appendix A, starting from hydraulic configuration (a), two genetic algorithms are used for the two different branches in the hydraulic circuit GA #1 operates on 39 Chapter 3: Identification of hydraulic compliance with GA Online identification of hydraulic compliance with GA H y d r a u l i c Sys tem Sensors M e a s u r e d D a t a C o n t r o l Sys tem C o n f i g u r a t i o n I n f o r m a t i o n H y d r a u l i c c i r cu i t C o n f i g u r a t i o n Branch Genetic Algorithm Population Diagnostic Program x (a) #1 GA #2 GA ( b ) E s t ima ted C o m p l i a n c e GA GA (c) GA GA (d) ,(e) GA Pla P2a P2e P3a P3d Plb P2b P2f P3b P3e P4 Pic P2c P3c P3f Pld P2d Figure 3.12: Interface between the control system and diagnostic program population P2a while GA #2 operates on population P2b. If the hydraulic configurations changes from a to b, GA #1 will continue to operate on population Pla, while GA #2 will operate on population P3. Finally, for hydraulic configuration d and e, both branches are connected by the cross-over network. If all the links are active, then all the compliances must be identified at the same time. A GA will operate on population P4 where an individual is a binary string that encodes CBUJ. CBOI. CSTI. and Cswi-G A stores the estimated parameters in a population and there are redundant copies of the estimated parameters in different sets of population. For example, CBUJ is in population sets PI, P2, P3, and P4. Values of CBUI identified under different population sets correspond to different hydraulic 40 Chapter 3: Identification of hydraulic compliance with GA Population individual binary encoded string P l a CBUi Plb Pic CsTi P l d Cswi P2a CBUJ. CBOI P2b CsTi. Cswi P2c Caui. Csn P2d CBUL Cswi P2e CBOi. CsTi P2f CBOL Cswi P3a CBUi, CBOJ. CSTI P3b CBUI. Csoi. Cswi P3c CBOi. CSTI . Cswi P3d CBUi. CSTI . Cswi P4 CBUL CBOi. CsTi. Cswi Figure 3.13: Different sets of population for different hydraulic configurations configurations. By combining all the estimated compliances Caui. CBOI. CSTJ. Cswi in all different population sets, one can devise: 1. A way to cross check the accuracy of the estimated parameter. 2. A way to identify and to locate the particular faulty condition more precisely. 3.4 Comment and Discussion Results in the single-link and multi-links experiment indicate: 1. GA can accurately identify the compliance of the boom hydraulic circuit. 2. GA can work with a highly nonlinear system. In the multi-links hydraulic system, not only is the system nonlinear but also there is no closed form solution for the internal states such as Pu and P12. GA is able to bypass all the difficulties with its nonrecursive, population based algorithm. 41 Chapter 3: Identification of hydraulic compliance with GA 3. GA has unique searching power. It successfully searched out the right combination of Cnui and CBOI from the parameter space in only a few iterations ( generations ). Such fast parameter convergence rate is very important for real-time identification. In this chapter, the method of applying GA to identify the compliance of the hydraulic system has been demonstrated. It has also been shown how GA can overcome the nonlinearities and on-line model changes in the hydraulic system. GA has been found to perform exceptionally well in identification of hydraulic parameters. GA limitations The GA is similar to other identification methods in that it requires a good model in order to produce accurate result. An inaccurate model of the system under identification may cause failure of the identification process, poor convergence rate, and multiple local minimums. In discrete parameter identification with the GA, a window size is used to minimize the variance of the parameter and the effect of noise. Because of the secondary effects in the hydraulic system, trying to fit parameters in a large window would result in poor accuracy. From trial and error in the experiments, 10 was found to be the best window size. With a small window size, the variance in the estimated parameter could be quite large due to the effect of noise. Fortunately, in the experiment, the measured Psoi. boom angle were relatively free of noise. 42 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification Chapter 4 A Parallel Genetic Algorithm implementation for hydraulic compliance identification 4.1 Introduction In the previous chapter, the flexibility and searching power of Genetic Algorithms ( GAs ) in identification of hydraulic parameters have been demonstrated. Das [5] and Kristinsson [12] also achieved good performance using GA in process identification and control. However, the GA differs from other algorithms in that it is based on a population, not on a single point, and it is very computation intensive. Fortunately, the GA is an inherently parallel algorithm. With the advent of parallel processing technology, it is becoming feasible to apply GA in real-time applications. Pettey [16] has suggested some possible parallel GA algorithms. The two most promising ones are: the synchronous master-slave GA ( algorithm A ) and the cooperating sequential GA ( algorithm B ) shown in Figure 4.1. Algorithm A is designed for problems which require long evaluation or long fitness calculation time. Algorithm A is organized in a master-slave configuration with n slaves. The master is responsible for distributing individuals ( from a global population ) to slaves. After receiving the fitness results from slaves, the master will perform Reproduction and Crossover on the old population to generate a new one. Algorithm A produces exactly the same sequence of generations as the traditional GA, and thus the same solution too. Algorithm A has some disadvantages: 1. Efficiency ( defined in Appendix C ) per processor decreases as the number of slaves increases since more slaves will be idling while the master performs Reproduction and Crossover. 2. Speedup ( defined in Appendix C ) is linear for only a small number of slaves. The cooperating sequential GA ( algorithm B ) is more generally known since it is the most widely used one. Algorithm B is a collection of many independent (local or nodal ) GAs running and communicating with one another simultaneously. Algorithm B gains speed by dividing the global population into many smaller ones for the local GAs. With a smaller population, all local GAs and thus this PGA will run much faster. Algorithm B retains the characteristic of GA by making 43 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification all local GAs exchange portions of their population with each other. If the communication time in the exchanging phase is much smaller than the processing time, linear speedup can be achieved. Algorithm B introduces some new parameters: frequency of exchange, portion of the population to be exchanged among NGA, and the local population size. The performance and some characteristics of algorithm B are quite different from the traditional GA. Researchers [17][24] have found that algorithm B is more robust and can consistently find a better solution than the traditional GA in function maximization. Master i < Slave Slave Slave Slave # 1 #2 #3 #4 The synchronous master-slave GA (algorithm A ) Processor #2 ( Local Population #2 ) Processor #3 ( Local Population #3 ) Processor #4 ( Local Population #4 ) Processor #5 ( Local Population #5 ) Processor #1 ( Local Population #1 ) t ^ ^ t Exchange ^ ^ ^ f ~\ Processor #0 Network f* * \ ) (Local Population #0 ) Processor #7 ( Local Population #7 ) Processor #6 ( Local Population #6 ) The cooperating sequential GA (algorithm B ) Figure 4.1: Two Parallel Genetic Algorithms Architectures In system identification, the fitness function requires a window to minimize the variance of the 44 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification estimated parameters and the effect of noise: window size fitness = ^ bias - ( y(k - w) — y(k - w) ) 2 (4.1) tu=0 Where y(k) : measured system output y(k) : estimated system output bias : a positive constant that sets a limit for GA reproduction operator ( only individuals with fitness > 0 can reproduce ). The window in system identification increases the fitness computation cost. From experience, the fitness calculation uses up to 80% - 95% of a GA program execution time, thus satisfying the requirement of algorithm A ( Algorithm A is designed for problems which require long evaluation or long fitness calculation time ). Because of the small global population size ( 100 to 400 ) [12][5] and the need to use sizable local populations in system identification ( explained in the next chapter ), the speedup factor from algorithm B is small. By combining both algorithms A and B in system identification, not only is the overall speedup factor increased, but the robustness of such PGA is improved as well. Objectives: A project was initiated with the overall goal to investigate the effect of combining algorithm A and B into a two level parallel implementation. We would like to determine a suitable parallel architecture and its efficiency for such PGA. Such PGA for identification of the hydraulic compliance ( single joint) will be implemented on the Inmos T800 Transputer and its performance is compared with that of the traditional GA. 4.2 Speedup, efficiency, and parallel architectures for PGA The following information from a traditional GA is important for determining the maximum combined speedup factor from algorithm A and algorithm B: 45 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification 1. The global population size M 2. The parallelizable process (total fitness calculation time for all individuals in the population ) t p t p = M x fitness calculation time = I x window size x Error calculation time (4.2) 3. The sequential process ( total GA operator time for all individuals in the population ) t s t s = M x GA operator time — M x (Reproduction + Recombination + Mutation) time (4.3) where for system identification, fitness is usually denned as: window size fitness(k) = £ (Bias - Error2(k - w)) (4.4) w=0 Define a as the ratio of execution time of parallelizable process vs the sequential process: tD window size x Error calculation time „ ex = — = (4.5) tg GA operator time then a is independent of the global population size. The above equations indicate that there are 2 intrinsic level of parallelisation in GAs. Dividing the global population M into many smaller local populations ( M i o c ) is level 1 parallelisation ( algorithm B ). Distributing the fitness calculation among slave processors is level 2 parallelisation ( algorithm A ). In level 1 parallelisation ( algorithm B ), the global population is divided into many smaller local populations for local or nodal GAs. The collection of nodal GAs retains the characteristic of traditional Genetic Algorithm by making all local GAs exchange portions of their population with each other. Usually the communication time in level 1 tci in the exchanging phase is much smaller than level 1 sequential process time tsi and can be neglected. Thus linear speedup over traditional GA is possible. The achievable speedup factor in level 1 is: Speedupi = (4.6) Mloc 46 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification local population # 1 Master 1 1 i < Slave #1 Slave #2 Slave #3 Exchange the best individuals Master 2 local population # 2 Level 1 parallelization (algorithm B) Slave #1 1 1 Slave #2 Slave #3 Level 2 parallelization ( algorithm A) Figure 4.2: Two levels of parallelisation Where Mioc is the local population size and depends on the types of applications. In level 2, the fitness calculation is distributed among slave processors. Defining: 1. t p i = the parallellizable process timing in Level 1. 2. t si = the sequential process timing in Level 1. 3. t P2 = the parallellizable process timing in Level 2. 4. ts2 = the sequential process timing in Level 2. then the following relationship will be true: tpi _ tp2 _ Mioc X window size x Error calculation time t si t s2 Mioc X GA operator time = a (4.7) Even though the ratio of t P2 to ts2 is equal to a, gain in speedup is possible only if the communication overhead tc2 is small compared to ts2- ts2 is smaller than tsi since . _ M,oc M (4.8) the communication overhead tc2 is much more critical in level 2 parallelisation. Interconnection network between master and slave processors must support high speed communication. The speedup 47 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification factor of level 2 with n numbers of processors will be ( see derivation in Appendix C ): c J t p 2 + t s 2 bpeeaup2 = £ t P 2 + t s 2 + n x t c 2 a + 1 r ^ tf t c2 << t s 2 a 4- 1 Speedup2 > —-— if n = integer(a) (4.10) Where n = number of slave processors The maximum combined speedup factor for both algorithm A and B ( Figure 4.2 ) will be: M a + 1 Combined speedup = speedupi x speedup^ = —— x —-— (4.11) The choice of a parallel architecture for PGA The choice of a parallel architecture for PGA in system identification depends on M , Mioc and a. Figure 4.3 shows two possible parallel architectures for PGA. The hypercube configuration is excellent for applications where M a + 1 » ^ r i (4.12) Mloc " 2 Thus only level 1 parallelisation is sufficient Since the communication overhead is small, speedup is linear for a large number of processors. For applications where JL « 2 + 1 (4.13) Mloc 2 It may be necessary to combine algorithm A and algorithm B to achieve higher speedup factor. A hybrid architecture composed of serial interconnection network for a small number of NGAs and high speed local buses for master-slaves connection can be used. The serial interconnection network handles the nodal GAs ( NGAs ) communication. The fitness calculations of NGA's local population 48 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification are distributed to the slave processes on the local bus. The high speed local bus keeps t<2 small to maximize gain in speedup factor. Efficiency is important in parallel processing. Figure 4.4 shows the software block diagram of a PGA with 2 levels of parallelisation. In this conventional implementation of PGA, the fitness calculation time t p and GA operators time U do not overlap. As more and more processors are added to reduce t p , the efficiency per processor decreases because more processors are idling during GA operators time. At speedup equal to ( a + 1) » where t p is equal to t s , the efficiency per processor is down to only 50%. Using pipeline techniques will not increase the maximum achievable speedup factor. However, for a given speedup factor, the number of processors needed is reduced because of improved efficiency ( 100 % ). Figure 4.5 shows the pipeline implementation and the associate timing diagram. By grouping two local GAs together and combining both GAs slave processors for fitness calculation, t p will be equal to t p i + t P 2 . Buffer 1 holds the temporary result from the fitness calculation corresponds to one of the local populations while a processor performs GA operations on the local population in buffer 2. Such arrangement allows t p and ts to overlap each other and forces all slave processors to work at 100 % efficiency. Figure 4.6 and Figure 4.7 show the speedup factors of nonpipeline and pipeline implementation of algorithm A and algorithm B versus number of processors for various a (8,16,24) ratio. For small number of processors, the different between the two speedup factors is quite small. However, pipeline implementation offers significant saving in number of processors for higher speedup factor. 49 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification NGA NGA link ^ local bus NGA NGA /("siave / £ slave X Q slave ^[ . link . Hypercube architecture: •1 level of parallelization (NGA) -Small local population (6 ) •Linear speedup •Easy to implement Hybrid architecture: •2 levels of parallelization: population & fitness calculation •Sizable local population (24) •Almost linear speedup Figure 4.3: Two parallel architectures for PGA 5 0 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification L o c a l GA# 1 Fitness Ca lcu la t ion G A Operators Exchange or Broadcast L o c a l G A # 2 1 Fitness Ca lcu la t ion G A Operators Exchange or Broadcast Loca lGA#3 L o c a l G A # 4 Z3 Fitness Calculat ion I G A Operators Exchange or Broadcast Fitness Ca lcu lat ion ± G A Operators Exchange or Broadcast Convent iona l Implementation Figure 4.4: Nonpipeline implementation flow chart Timing ts Exchange «P -1—Exchange L o c a l GA# 1 and#2 1 L o c a l G A # 3 and # 4 Pipel ine Implementation Figure 4.5: Pipeline implementation flow chart 51 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification I 5 0 45 4 0 35 3 0 25 2 0 15 10 5 0 Nonpipel ine speedup factor versus processors used 10 20 30 4 0 Number o f processors 50 F i g u r e 4.6 S p e e d u p f a c to r o f n o n p i p l i n e i m p l e m e n t a t i o n ve r sus n u m b e r o f p r o c e s s o r s Pipel ine speedup factor versus number o f processors used 5 0 45 4 0 35 ! 3 0 fitness function GA operators (local window size=10 ) System parameters Local GA n Local population Exchange number Exchange frequency System Model fitness function « G A operators C BOi (local window size=10) Local GA #1 A PGA (A Parallel Implementation of GA) Figure 4.8: Block diagram of GA and PGA for identifying hydraulic compliance the identification process of boom hydraulic compliance with actual spool measurement and indirectly with servo-valve voltage using PGA respectively. In this application, the parameter convergence rate and the estimated Ceoi from PGA are almost identical to that from the traditional GA. 54 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification GA and PGA in identifying single link hydraulic compliance : GA Level 1 Algorithm (B) Level 2 Algorithm ( A ) Combined Algorithm (A + B) Number of processors used 1 4 4 16 Global Population size 120 120 N/A 120 Local population size N/A 30 30 30 Number of generations calculated per sec. 9.8 N/A N/A 117.6 Speedup achieved 1 4 3 12 Efficiency ( Speedup / processors ) 1 1 0.74 0.74 Table 4.2 The measured speedup factor of various PGAs 4.4 Comments and discussion In system identification, it is possible to combine the synchronous master-slave GA ( algorithm A ) and the cooperating sequential GA ( algorithm B ) to increase the speedup factor and robustness of such PGA. Such PGA can be implemented on a hybrid parallel architecture with pipeline techniques. The efficiency and speedup factor of such implementation can approach ideal ( 100% efficiency and linear speedup ). In this application, the communication overhead of PGA is small. Even without direct connec-tions among master and slaves, the communication time is still much smaller than the sequential GA operators time. With an adequate communication network, maximum combined speedup factor of such PGA will be: Maximum combined speedup = Jf X ——— (4.14) Mioc 2 Both GA and PGA have accurately identified compliance of the boom hydraulic circuit. A typical transient interval in which compliance can be identified lasts 0.4 sec ond and contains 20 sample points. From Table E.2, the execution time for traditional GA for 20 sample points can be calculated: GA time = sample number x trial -j- (generations per second) (4.15) = 20 X 5 9.8 « 10 second 55 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification The execution time for PGA will be : PGA time = GA time -f- speedup factor (4.16) = 10 -fl2 » 0.8 second Since PGA-time is twice as long as the transient interval ( 0.8 second versus 0.4 second ), it is necessary to increase the speedup by at least a factor of 2 ffiofti. In the future, T800 Transputer can be replaced by the H9 processor from Inmos Inc. . The H9 processor is about 10 times faster than the T800 Transputer. Real-time identification or control of the hydraulic parameters with PGA will be possible. 56 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification 2500 ' £ , 2 0 0 0 % 1500 9 | 1000 .1 500 0 1 i i 1 1— i i i --} 0.2 0.4 0.6 0.8 1 12 1.4 1.6 1.8 2 Time (Sec) 100 \ • 8 0 6 0 w 4 0 U 2 0 0 Figure 4.9 Experiment #1 ( with spool data ) 57 Chapter 4: A Parallel Genetic Algorithm implementation for hydraulic compliance identification eo *> •a *> CO c < E •6 •£2000 g 1500 s I 1000 .1 500 0 I 1 1 1 1 1 — I ! I • - ^ ,—'* PB0C 1 — — i . ._• /—* "Si-to" 100 80 60 40 u 20 0 • Error 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 Time (Sec) Figure 4.10 Experiment #2 ( with servo-voltage data ) 5 $ Chapter 5: PGA in system identification Chapter 5 PGA in system identification 5.1 Previous work on PGA ( algorithm B ) Pettey, Leuze and Grefenstette[17][16][21] experimented with PGA in function maximization. They implemented a PGA on a hypercube processor network with a local population size of 50. Their experimental results agreed well with Goldberg's paper [9] on the optimum global population size. In [21], based on the assumption of uniformly distributed communication among NGAs, Pettey suggested that PGA maintains efficiency similar to that observed in a traditional sequential GA. Tanese [24] implemented a PGA on a medium-grained hypercube computer for function max-imization. He experimented with a number of important parameters such as exchange frequency, number of individuals exchanged, Mutation and Crossover rates, and local population size. He suggested that: 1. Exchange frequency and the number of individuals exchanged depend on the problem domain. Exchanging too frequently or too infrequently may degrade PGA performance. 2. Different Mutation and Crossover rates may be used among nodal GAs ( NGAs ). A NGA with a high Mutation rate would emphasize exploration and a NGA with low Mutation rate exploitation. 3. The local population size can be as small as 6 without affecting the measured parameter convergence rate. Previous works on PGA ( algorithm B ) can be summarized as follow: 1. Global and local population size of PGA depends on applications. 2. PGA is consistently able to find a better solution than the sequential GA[25]. 3. PGA may be more robust than sequential GA. [24] 4. PGA parameter convergence rate may be slower than the sequential GA running on the same population. A super-individual is seen only by a local population and it takes several exchanges before such a super-individual can be seen by the global population. 59 / Chapter 5: PGA in system identification Objectives: While many experiments have been done on function maximization with GA and PGA, there are very few works [5][13]on using PGA in system identification and control. It is in this area that we would explore the full potential of PGA in system identification. 5.2 Important requirements on PGA in system identification 5.2.1 The conventional PGA implementation ( algorithm B ) In a conventional PGA, each processor runs a genetic algorithm on its own local population, periodically selects good individuals from its local population, and sends copies of good individuals to one of its nearest neighbours. Under such an exchange scheme, the local GAs are best connected in a hypercube multiprocessor network. The hypercube network is showed in Figure 4.3. There are many implementations of PGA on the hypercube network [24][17]. By sequentially sending and receiving just one best individual from one of its nearest neighbours, each processor on the hypercube network can keep the ratio of local population size to incoming external member sufficiently high to maintain the local population diversity. In [24], a PGA was implemented on a hypercube network and the smallest local population size was 6. If the hypercube network is used, by dividing the global population ( 100 ) into many small local populations ( 6 ), PGA can achieve a speedup factor of 16 time over sequential GA. For a PGA with small local population ( 6 ), it is necessary not to broadcast any "best individual" from one local GA to the others. Such an individual may destroy the diversity of all the local populations ( a best individual may be a local minimum ). Sequential exchange of good individuals among local GAs is necessary. In the case of a super individual ( global minimum ), depending on the hypercube network, it will take several exchanges before the entire population will be aware of such super individual. Thus such an exchanging scheme will cause the total population to converge at a slower rate than sequential GA, since a "super individual" would be seen by subpopulations instead of the global population. 6 0 Chapter 5: PGA in system identification From experience in working with simple PGA exchange mechanisms, with the diversity of the population preserved, a slower convergence rate avoids premature convergence and leads to a global minimum. For conventional PGA, such a trade off between the robustness and the parameter convergence rate is unavoidable. Many papers have been published in the last few years on PGA performance in function maximization [17][24]. In function maximization, PGAs have been shown to be more robust than the sequential GA. Even though there have been many successful implementations of PGA for function maximization, the requirements and the nature of the problem in system identification are quite different from those in function maximization. 5.2.2 Differences between function maximization and system identification There are many differences between function maximization and system identification for PGA: 1. In function maximization, the environment is static; the function is time invariant. In system identification, the working environment is usually dynamic and in real-time. 2. In function maximization, it is possible to use different Mutation rates among nodal GAs ( NGAs ) so that some NGAs will place emphasis on exploration and others on exploitation [24]. However a higher Mutation Rate will increase the variance of the estimated parameter. In function maximization, where only the final value is needed, higher variance in parameters is not important But in many applications of system identification and control, where tracking the parameter changes is important for control and diagnosis, the variance of parameters must be minimized. Therefore only a low Mutation Rate should be used in system identification. 3. In function maximization, the global population size plays a significant role in determining the parameter convergence rate ( measured with respect to time ) [17]. In system identification, the parameter convergence also depends on data sample rate and excitation in the sampled signal. Thus the role the global population plays is secondary in system identification. 4. In real-time system identification and control, a high parameter convergence rate is one of the most important requirements. 61 Chapter 5: PGA in system identification Figure 5.1 shows the block diagram of GA and PGA for system identification. Several important areas in the figure are: the global and local population size, the communication network and the fitness function. System parameters Input/Output System Measurement global population System Model fitness function GA operators ( window size) GA ( Sequential Genetic Algorithm) Best fit system parameters InputyOutput System Measurement System Model Communication Network for exchanging best members among local population System parameters Local population System Model *. fitness function GA operators (local window size ) System parameters Local population System Model fitness function GA operators (local window size ) System parameters Local population fitness function GA operators ( local window size ) PGA (Parallel Implementation of GA) Figure 5.1: Block diagram of GA and PGA in system identification . Best fit system parameters Global population size In function maximization, according to Goldberg [7][8], the size of the global population should depend on the length of an individual string. Using Goldberg's criteria, very large global population 62 Chapter 5: PGA in system identification sizes, from 500 to 1000 are usually used in function maximization. R. Das [5] and K. Kristinsson [12] demonstrated that in system identification with a small number of parameters ( 4 - 6 ) with GA, good performance can be obtained from small global population size of 100. Such a result is encouraging since the computation cost is directly proportional to the global population size. In parallel implementation of Genetic Algorithm ( PGA ), it will be advantageous to keep the same small global population size of 100. Only if the performance of PGA with such a small global population size is unsatisfactorary, then more local GAs can be added to increase the global population. Local population size In a static environment, Tanese[24] demonstrated that a very small local population size of 6 could be used in his application. For system identification, it is necessary to use a larger local population. In a dynamic environment, diversity of small local populations can be destroyed easily. For example, when the parameters of the system under identification become constant, then many small local populations may become identical or very similar ( low Mutation rate is used ). Larger local population offers better protection. Based on the simulations and the above considerations, PGA's local population size for system identification was selected to be at least 20. Communication network In real-time system identification, the measured input and output of the system must be broad-casted to every local GA. The communication network should have very small and deterministic delays in broadcasting. Most PGAs mentioned in the literature were used in the area of function maximization and did not require much communication among local GAs. Depending on the speed of the processor, applications, and sampling rate, a high speed communication network among NGAs is needed in some real-time applications. 63 Chapter 5: PGA in system identification 5.3 PGA performance Improvement in system identification 5.3.1 Distributed fitness function In discrete time parameter identification, GA's fitness function requires a window to minimize the variance of the estimated parameters and the effect of noise: window size fitness = ^ bias — ( y(k — w) — y(k — w) )2 (5.1) w=Q Where y(k) = measured system output y(k) = estimated system output bias = a positive constant that set a limit for GA reproduction operator ( only individual with fitness > 0 can reproduce ). window size = used to minimize the estimated parameter variance. GA with a large window size will be more immune to noise, but will be slower in tracking changes in parameters. Conversely, GA with a small window size may perform poorly in noise, but is excellent in tracking changes. With multiple local GAs exchanging their best individuals, PGA may perform better if the local GAs use different window sizes. A local GA with large window size could estimate parameters with good immunity to noise, and a local GA with smaller window size may track changes in the system parameters better. In exchanging their best individuals, both local GAs may complement and compensate each other's advantages and disadvantages. Thus a PGA with such settings may improve its parameter convergence rate and maintain a good noise immunity at the same time . The robustness of the PGA, and to some extent, the PGA parameter convergence rate, can be improved by increasing the global population size. However, for a given sample rate, increasing the global population will require more computation power. With different window sizes on the local GAs, a local GA with smaller window size has a smaller computational load. Thus it is feasible to increase the population of such local GA to obtain the same computation load. With a smaller window size and a larger local population, the tracking performance of such local GA is further enhanced. 64 Chapter 5: PGA in system identification By varying the window size and local population accordingly to maintain the same computation load, the global population of PGA is increased and the robustness of PGA is also improved. The term distributed fitness function is used to describe the combination of different fitness function and local populations. In real-time system identification and control, the ability to track changes in the system parameters ( a fast parameter convergence rate ) is essential. 5.3.2 Broadcasting with active error analysis Instead of exchanging the best individual with only one of the neighbours, all local GAs will periodically broadcast their best individuals to one another. Broadcasting has the following advantages over sequential exchange: 1. Since all local GAs receive the best individuals from one another, it is equivalent to having every local GA sees the entire global population. The parameter convergence rate will be comparable to that of sequential GA. 2. Broadcasting allows the distributed fitness function's full potential to be exploited. Based on the Maximum Likelihood technique, extra information from the distributed fitness function can be analyzed and be used in a more active role in estimating the unknown system parameters. During the exchange phase, broadcasting ensures that every local GA receives copies of the best individuals from other local GAs with different window sizes. One generation later, every local GA would calculate the fitness value for its own population, which also includes the best individuals from other local GAs. Assuming there are 4 different local GAs with window sizes of 30, 24, 16, 10 respectively, a matrix of fitness of best parameters versus different window size can be extracted from the local GAs with no computation cost: 65 Chapter 5: PGA in system identification window size SWJ wi=30 Error(wi, A i ) Emor(wi, A2 ) Error(wi, A3 ) Error(wi, A4) Swi W2=24 Error(w2, Ai) Error(w2, A2) Error(w2, A3) Error(w2, A4) Sw 2 W3=16 Error(w3, Ai) Error(w3, A2) Error(w3, A3) Error(w3, A4) Sw 3 w4=10 Error(w4, Ai) Error(w4, A2) Error(w4, A3) Error(w4, A4) Sw 4 SA; SAj SA 2 SA 3 SA4 Figure 5.2: A matrix of fitness error versus different window size Where Given a local GA with window W{ : fitness(wi,Aj) = ] P (B - (y(k - n) - y(k - n)A. )) n=0 Wi Error(wi,Aj) = ^ (y(k - n) - y(k - n)Ai ) n=0 Aj = parameters from a local GA with window Wj = A f\ Error(wi,Aj) \ \ J ^V|Error(wj,Aj)|J = ^2 normalized Error for parameter Aj (derived from Wj) evaluated under all other window sizes : Wj (,=i..4) 4 Sw; _ ^ / j Error (w^Ajm ^ V l Error ( W j ,Aj ) | ; = ]P normalized Error for parameters Aj (,=1,2,3,4 window size) evaluated under window w. (5.2) (5.3) For example: Swi SAi Error A i ) Error Error (toj (wi,A\) Error (w2,A2) + Error (103, A3) + Err wi,Ai)\ + \Error (w2,A\)\ + {Error (u>3,Ai)| + {Error (w\, Ai)\ {Error (wi, A i ) | Error (wi, A2) u>2, A2) +Error ( w i , A3) Erro (103, A ) +Error (w\, A4) Erro (w\, A 4 ) (5.4) 66 Chapter 5: PGA in system identification Active error analysis: From the definition of Saj and SWJ, the following statements can be made: 1. Under a noiseless and transitional environment ( i.e. a system parameter is changing ) the local GA with smallest window would be first to track such change. During the transient time, Error(w4, A 4 ) would be smaller than Error(W4, A3 ), Error(w4, A2), Error(w4, Aj). From the definition for SWJ, SW4 will be larger than Swi, Sw2, and SW3 with high probability: Noiseless and transitional : Su>4 > Sw$ > Sw2 > Swi with high probability where : A4 = estimated parameter from GA with W4 (5.5) A*... A 2 . . . A\ = estimated parameter from GA with w\ Thus the magnitude of Swj indicates which window is most suitable for the present time. 2. In a noisy but steady-state condition, the summation of Error across all window sizes should indicate which estimated parameter ai has the best overall Error. Since the variance of the estimated parameter in window 30 is smallest, Errorfwi, Ai ) will be less than Error(w2, A 2 ) , Error(w3, A3), Emor(w4, A4). In this case, SAi should be less than SA2, S A 3 , and SA4. Noise and steady state : SAi < SA2 < SAz < SA4 with high probability (5.6) where : w\ > W2 > t f . i > w$ (window size) 3. The information in SAi and Sw; can be used in choosing the best parameter from the four different windows under noisy, noiseless and transitional environment. There are many ways to use the information in SWJ and SAi. One obvious way is to use the ratio Swj/SAi as a selection criteria. The ratio Sw;/SAj will compensate for some of the noise effect and account for the 61 Chapter 5: PGA in system identification transient condition. The highest ratio of SWJ/SAJ is used to select the best parameter from the window size i . Under most conditions : $ ID' —— (,•=!...4) is a good indication of which window is most appropriate For example : */ ~7TT > 7Ti T T i Tnr" then window w? is probably the most appropriate one (5.7) The computation cost for the active error analysis is insignificant. By combining the distributed fitness function with the active error analysis, it may be possible to improve PGA performance in parameter convergence rate, noise immunity such that they are equal or better than the sequential GA. The active error analysis will output the estimated parameters at the broadcasting frequency. Disadvantage of broadcasting: Broadcasting the best individuals among local GAs could destroy diversity of the local population. Several measures have to be taken to avoid the destruction of local diversity that could lead to premature convergence: 1. The local population size should not be smaller than 20 ( mentioned in previous sections ) to maintain the high ratio between local population size and incoming exchange individuals ( 6 : 1). 2. The broadcasting frequency should be kept low ( less than 1 per 10 generations ) The distributed fitness function will provide additional protection on the local population diver-sity. Although higher Mutation rates can also be used to protect the diversity of local population too, it does severely affect the variance of the estimated parameters. Also, since Mutation is a random process, it is difficult to determine the effectiveness of such mechanism. Thus a higher Mutation rates will not be used unless all other methods have failed. Chapter 5: PGA in system identification 5.4 Testing the distributed fitness function and active error analysis on various PGA configurations In GA and PGA, the initial population is usually created by a random number generator. Thus the initial parameter convergence rate depends on the population size and the random number generator seed. The dependency becomes more noticeable in PGA since each local GA has its own population size and random number generator seeds. For example, a good individual in the initial population will distort the measurement of the parameter convergence rate. Some of the distortion can be removed by averaging many runs of GA or PGA to give a better estimate of the parameter convergence rate. In discrete time parameter identification, the interest is not only on the initial parameter convergence rate, but also on how well PGA can track the changes in the system parameters. A second-order system of the following form ( the same one in [12] ): y(k) = ai * y(k - l ) + 02 * y(k - 2) + ba(u(k) + 6, * u(k - 1)) (5.8) Where ai=1.5, a2=-0.7, bo=1.0 bi=0.5. is used to test the performance of various PGA configurations. Because GAs do not require linearity in the parameters, they can directly identify the poles and zeros of the system. In pole-zero form, the plant can be written as: ^(?_1)»(0 = Biq-^uit -d) + C ^ e W (5.9) Where : A{q-1) = {l-(a1±jfo)q-1) ^ ( g - 1 ) = - (71 ± ^ i ) 9 _ 1 ) d = delay e(t) = noise For a stable minimum phase plant, the poles and zeros are inside the unit circle. Therefore the search space can be limited to the unit circle or a box enclosing the unit circle. With GA, the prior 69 Chapter 5: PGA in system identification lower bound upper bound # of bits resolution d 1 4 2 1 bo 0.0 2.0 7 0.016 «!./?!. 71. ^ 1 -1.0 1.0 7 0.016 Table 5.1 Search space for a stable minimum phase system PGA Nodal GA Exchange ( E ) Broadcast ( B ) Comment Configuration #1 4 E : 1 per every 3 generations Use sequential exchange Configuration #2 4 B : 1 per every 12 generations Use broadcast Table 5.2 Two different PGA configurations knowledge of the exact size of the search space is an important advantage of using pole-zero form instead of the parameter form. The second order system has six parameters that need to be identified: delay d, gain bQ and poles-zeros a i , fix, 71, #1 . The delay is coded as a two bit string to give 4 choices and the other parameters are coded as five 7-bit strings, making totally a 37 bit string. This results in a search space with total 237=1.371*10n combinations. The parameters have been concatenated as follows d I a i I Pi I 7i I 1^ I °o • Upper and lower bounds on the parameters are defined and the resolution of the coding is in Table 5.1 A PRBS ( Pseudo Random Bit Sequence ) signal[12] is used as an input for the system. The PRBS input has a period of 127 with the bit interval equal to four times the sampling interval ( PRBS signal is persistently excited signal ) . Unless stated otherwise, the genetic operator settings are : 1. The Crossover rate ( p c ) = 0.8 2. The Mutation rate ( p m ) = 0.01 Two PGA configurations are tested: the PGA configuration #1 and the PGA configuration #2 in Table 5.2. T o Chapter 5: PGA in system identification The experiment setup is listed in Table 5.3. In experiment #1, the effectiveness of the distributed fitness function with different local population sizes and fitness window sizes on a PGA using sequential exchange mechanism is tested. Experiment #2 is designed to test whether the distributed fitness function can effectively protect the local population diversity against broadcasting. The highest broadcasting frequency is used ( 1 per generation ), and the performance of different local GAs is compared. Experiment #3 and #4 are designed to test how the combination of distributed fitness function and active error analysis can help to improve the performance of PGA in system identification. Finally, the performance of PGA and GA is compared in experiment #5. Experiment Fitness Number of Local Fitness B=Broadcast Function local GAs population size window size E=Exchange #1 ( PGA ) traditional 4 24:24:24:24 30:30:30:30 E = 1 per every 3 distributed 24:30:40:50 30:24:16:10 generations #2 ( PGA ) traditional 4 24:24:24:24 30:30:30:30 B = 1 per every distributed 24:30:40:50 30:24:16:10 generation #3 ( PGA ) traditional 4 24:24:24:24 30:30:30:30 B = 1 per every distributed 24:30:40:50 30:24:16:10 12 generations #4 (PGA ) distributed + active error analysis 4 24:30:40:50 30:24:16:10 B = 1 per every 12 generations #5 ( GA ) traditional 1 100 30 N/A Table 5.3 Experiment setup In Table 5.3 , notation such as 24:30:40:50 under column heading "Local population size" implys that local GAs #1, #2, #3, #4 population size are 24, 30, 40, 50 respectively. To observe how well PGA tracks changes in the system parameters, the parameter ai of the second order system is altered as shown in Table 5.4. Chapter 5: PGA in system identification Case 0-199 generation 200-399 generation 400-600 generation 1 ai=-1.5 ai=-1.5+1.0*(gen-200) ai=-0.5 Table 5.4 How ai of the second order system is altered 5.4.1 Experimental result Initially the fitness value was assumed to be a good indication of how well GA can track the system parameters. However, further examination showed that sometimes a small fitness error only indicates good tracking on the output of the system, and is not necessarily an indication of the convergence of the estimated parameters. A good example would be the fitness error from a GA with a window size of 10. Even though the fitness error was small in the presence of noise, the variance of the estimated parameter can be very large. Thus fitness error alone would not indicate the performance of various PGA configurations. In all the experiments, various PGA configurations were used to identify the pole and zero of the second order system. The estimated real and imaginary part of the dominant second order system pole ( o i , from various PGA configurations were plotted. Experiment #1: Testing the performance of the PGA configuration #1 with the distributed fitness function PGA configuration #1 based on a simple 2 dimensional hypercube ( 4 processors ) was implemented to test whether the distributed fitness function could improve performance. The experimental results are shown in Figure 5.3a-d. The variance of the estimated parameter seemed to be reduced slightly ( compare to Figure 5.3a and Figure 5.3c ), but there is no other indication of any significant improvement. The passive exchange mechanism of the PGA configuration #1 could not utilize the full potential of the distributed fitness function. Experiment #2 : Testing the effectiveness of the distributed fitness function in protecting the local population diversity For the PGA configuration #2, the local population diversity was threatened by the broadcasting mechanism. The distributed fitness function with a different local population and window size, should give some protection to the local population diversity. In the experiment, the PGA configuration #2 72 Chapter 5: PGA in system identification with the same fitness and distributed fitness function were subjected to a broadcasting frequency of one per generation. Under such a high broadcasting frequency, without sufficient protection, the local population diversity would be destroyed and led to premature convergence. Figure 5.4a and Figure 5.4b show two local GAs from the PGA configuration #2 with the same fitness function performed. After 200 generations, the estimated poles from two different local GAs became identical and converged prematurely. The performance of the PGA configuration #2 with the distributed fitness function, is shown in Figure 5.4c and Figure 5.4d . After 400 generations, the estimated poles from the two local GAs were close but remained different. Even though the convergence rate of the estimated pole was much slower than normal, there was no indication of premature convergence. The distributed fitness function had successfully preserved the local population diversity. Experiment #3: Testing the performance of the PGA configuration #2 with the distributed fitness Figure 5.5 shows how the real and imaginary part of the pole changed with generation ( time ). Window size affected the GA parameter tracking ability. In a noiseless environment, the local GA with a window size of 10 could track the change in the pole better than one with a window size of 30. Given that the pole started to change at 200 generation, the local GAs with window size 10 and 30 started responding to such change at 250 generation and 300 generation respectively. The different delay in the response was caused by the local GA different window size. ( Figure 5.5c and Figure 5.5d ). In a noisy environment, the local GA with a window size 30 could filter out noise much better than the GA with a window size 10 ( Figure 5.5g and Figure 5.5h ). Without active error analysis, the estimated parameter would be taken from the local GA with a window size of 30. The large window size would minimize the variance of estimated parameter, and filtered out the transient information from other local GAs with smaller window sizes. As shown by Figure 5.5f and Figure 5.5g , there was no observable improvement on tracking the change in the system pole. This experiment illustrated that distributed fitness function's full potential would not be utilized if active error analysis was not applied on the extra information available. Experiment #4: Combining the distributed fitness function with active error analysis Figure 5.6a and Figure 5.6b show that the tracking performance of the PGA configuration #2 with 73 Chapter 5: PGA in system identification and without active error analysis in a noiseless environment. The active error analysis reduced the response delay from 100 generations to 50 generations. Active error analysis had used the transient information from the local GA of window size 10 to reduce the delay. In a noisy environment, the active error analysis would be biasd more towards the local GA with large window size. In the present of noise and under steady state condition, estimated parameters variance from the local GA with a small window size would be quite large and the local GA with larger window size gave better estimation of system parameters. Comparing Figure 5.6c and Figure 5.6d , one would notice that both set of curves were very similar, confirming that under noisy conditions, active error analysis would be biased towards the local GA with window size 30. However, the active error analysis did not reject transient information from other GA with smaller window sizes. It still managed to reduce the response delay from 100 to 50 generations. Experiment #5: Compare PGA and GA performance The performances of GA and the PGA configuration #2 with the distributed fitness function and active error analysis were compared. The results are shown in Figure 5.7a to Figure 5.7d . The PGA configuration #2 actually outperformed GA in tracking parameter changes and had smaller variance in estimated parameters under both noiseless and noisy environment. 74 600 : same fitness, noiseless. w=30. pop=24 exch=l/3 (0.) 200 300 4 0 0 5 0 0 generation (3 generation/sample) d i f f fitness, noiseless. w=30. pop=24 exch=I/3 600 200 300 4 0 0 generation (3 generation/sample) 5 0 0 600 same fitness, noise. w=30. pop=24 exch=l/3 100 200 300 400 500 generation (3 generation/sample) d i f f fitness, noise. w=30. pop=24 exch=l/3 600 200 300 400 generation (3 generation/sample) p7* 5.5 ( ExpenfWit * 1 ) 1*7 0.5 0 -0.5 3 0 0 generation P G A configuration 02 : same fitness, noiseless. w=30. PQP=24 B = l CPU 1 Co.") « ... j. <--; , ;v-yvr t 100 5 0 0 200 300 4 0 0 generation (3 generations/sample) P G A configuration 02 : same fitness, noiseless. w=30. PQP=24 B = l cpu 4 -0.5 200 300 4 0 0 generation (3 generations/sample) 600 C O . ) 600 0.5 -0.5 --1 P G A configuration #2: dif f fitness, noiseless. w=30. pop=24 B = l cpu 1 -P c CuT.Lt, • * CO 100 500 0.5 u>., f : i > Q |. ' * •• 1>-'. 1'. R 200 300 400 generation (3 generation/sample) P G A configuration 02: d i f f fitness, noiseless. w= l 0. pop=50 B=l cpu 4 600 1 100 200 300 4 0 0 generation (3 generation/sample) 500 600 7 6 4 0 0 5 0 0 300 generation PGA config. #2: same fitness, noiseless. w=30. PQP=24 200 300 400 generation (3 generations/sample) 500 600 600 PGA config. 02: diff fitness, noiseless. w=30. pop=24 2 0 0 300 4 0 0 generation (3 generations/sample) P G A conf ig . #2: d i f f fitness, noiseless. w=10. pop=50 2 0 0 300 4 0 0 generation (3 generations/sample) Frgur* 5 . F ( &f««MA{ #3) 600 600 "71-1 0 -0.2 1? -°A •g -0.6 Cm -0.8 -1 T h e real and imaeinery pan o f the transfer function ro l e Ce). -100 200 4 0 0 300 generation P G A confiE. 02: same fitness, noise. w=30. pop=24 5 0 0 100 200 300 4 0 0 generation (3 generation/sample) 500 600 600 ID P G A config. #2: d i f f fitness, noise. w=30. pop=24 200 300 4 0 0 5 0 0 generation (3 generation/sample) P G A conf ig. » 2 : d i f f fitness, noise. yy=10. pqp=50 600 generation (3 generations/sample) P^urc 5.5" ( B a t m e n ! *3) -0.2 G A : noiseless, w=30. pop=100 ^ -0.4 •5 -0.6 Co) •0.8 -1 4 Tsi 100 5 0 0 200 300 400 generation (3 generations/sample) P G A conf ig. ¥2 + active analysis: d i f f fitness, noiseless. w=30 pop=24 200 300 400 generation (3 generations/sample) 600 600 The real and 600 GA: noiseless, w=30, pop=100 -0.2 -0.4 -0.6 •0.8 -1 -J. 100 500 200 300 400 generation (3 generations/sample) PGA config. #2 + active analysis: diff fitness, noiseless. w=30 pop=24 600 200 300 400 generation (3 generations/sample) 600 generation (3 generation/sample) "71 Chapter 5: PGA in system identification 5.5 Comments and discussion There are many advangages in using PGA ( algorithm B ) to gain speed over the traditional GA. Most of the advangages are mentioned in the beginning of the Chapter. Almost all the PGAs implementation in the literature emulated the traditional GA by sequentially exchanging portions of the local populations among local or nodal GAs. Such exchange process is a passive mechanism since it may take several exchanges before a super individual be noticed by all the local populations. In system identification ( and control ), the passive nature of the exchange mechanism of PGA can affect the parameter tracking rates of the algorithm. To improve PGA performance in system identification, three new mechanisms are used: 1. The distributed fitness function. 2. Broadcasting best individual among local GAs. 3. The active error analysis. The distributed fitness function lets local GAs have different combination of window and local population size. It can significantly improve PGA parameter tracking rates and robustness. Broadcasting removes the passive exchange mechanism so that all local GAs can interact with one and other simultaneously. Active error analysis is a method to compare best results from all local GAs and to choose the most appropriate output. It is a reality that broadcasting may destroy the local population diversity. Fortunately, the distributed fitness function offers the local population effective protection against the disadvantages of broadcasting. The robustness and effectiveness of the distributed fitness function in protecting the local population diversity was tested. Experiments showed that PGA with distributed fitness function is much more robust than the conventional implmented PGA even when subjected to the highest broadcasting frequency of one per generation. Note that in the conventional PGA implementation, a sizable local population ( >= 24 ) and a GA operator such as ranking[12] ( to prevent a super-individual to reproduce too many and too quickly to dominate the local population ) cannot protect local population diversity against broadcasting. 80 Chapter 5: PGA in system identification Experiment #3 and #4 show that the combination of the distributed fitness function, the broad-casting mechanism and the active error analysis produces noticeable improvement in estimated pa-rameters tracking rate and estimated parameters variance. In experiment #5, PGA with these three new mechanisms actually outperformed the traditional GA in the key areas mentioned above. A PGA with these three new mechanisms is a promising tool for many real-time, nonlinear system identification problems. 81 Chapter 6: Physical insight into how Genetic Algorithms work Chapter 6 Physical insight into how Genetic Algorithms work The theory of schemata proposed by Holland Holland ( 1975 ) has proposed the theory of schemata to explain some theoretical properties of Genetic Algorithms. A schema is a template describing a subset of strings with similarities over certain string positions. For example, given the following binary strings: A = 11100011 B = 11011011 (6.1) C = 11011111 the template "ii****n" is the common schema h for string A, B and C ( " * " is the "don't care" state ). Holland introduced two important terms: the denning length of schema, 6(h) and the order of schema 0(h). 6(h) is the length between a schema outermost defining positions. For example 6(u****n) = 8 - 1 = 7 and £(10*11**1) = 8 - 1 = 7. The order of a schema is the number of defining positions for a string. For example 0(11****11) is equal to four. The defining length £(h) is a measure of the probability that a crossover may destroy such schemata. A schema with large 6Q\) has a higher probability of being destroyed by a Crossover process than a smaller one. A small 0(h) is better protected from Mutation than a larger 0(h). Thus schemata with short defining length and low order stand the biggest chance of surviving to the next generation. The combined effect of the three transition rules, Reproduction, Crossover, and Mutation can be estimated. The expected number of schemata h to survive into the next generation is the product of the 82 Chapter 6: Physical insight into how Genetic Algorithms work expected number from Reproduction and the survival probabilities of Crossover and Mutation [11]. m(h, t + 1) = m(h, t)*W> ( i _ ( i _ PmfW where : m(h, t), m(h, t + 1) = number of schema hatt, t + 1 F(hft) = the average fitness of strings with schema h at t F(t) = the average fitness of the population at t F(h,t) , „ , . p = the Reproduction factor (1 — pm)°^ = the probability schema h survives Mutation 6(h)\ 1 — pc-—- J = the probability schema h survives Crossover (6.2) ( Note that the m(h,t+l) is a conservative estimate of schema h at t+1 since it assumes no Crossover among strings with similar schema. Holland's schemata theory states that the algorithm is going to. converge towards the best, but there is no guarantee that it is converging to the optimum. While Holland's schemata theory is helpful for predicting the searching power of GAs, it does not address many problems encountered in using GAs. Such as: 1. how to encode the parameter space to help the GAs searching process 2. how to avoid premature convergence 3. how to choose a realistic population size 4. how to modify the simple GA so that it becomes a less stochastic and more determinstic algorithm ( for certain applications only ) We attempt to give a physical picture of how GAs work. With better understanding, it is easier to predict and even improve the performance of GAs in certain applications. 6.1 Genetic Algorithms search space To understand the searching power and efficiency of GAs, one must visualize the search space of GAs. In everyday life, most of us work with real or integer number system. Functions such as F( x ), F( x, y ), F( x, y, z ) are defined in a real number system as: 83 Chapter 6: Physical insight into how Genetic Algorithms work 1. F( x ) is a function defined in an one dimensional space spanned by the x unit basis 2. F( x, y ) is a function defined in a two dimensional space spanned by the x and y unit basis. 3. F( x, y, z ) is a function defined in a three dimensional space spanned by the x, y, and z unit basis. The real number and the integer systems are easy for humans to understand. However, the binary number system is the key to unlock and to understand the searching power of GAs. 6.1.1 The binary unit basis For a given resolution, it is possible to convert any real numbers x, y, z into binary notations B(x), B(y), B(z) of / bits long. Instead of associating B(x) as a decimal number, B(x) = aia2..at = x = c i X 2l~l + a 2 X 2'~2 + ... + at (6.3) B(x) can be treated as the summation of / binary unit basis. B(x) = a\ai...ai = a\V\ + a2V2 + — + a/Vj where : a i , a 2 ... a/ = 1 or 0 V! = 00...001 (6.4) V2 = 00...010 = 00...100 V, = 10...000 Where Vii V2r Vn, ... V} are the binary unit basis for any binary notation of / bits long, and o i , a 2 i ... o/ are the associated binary unit values. Since any real ( at a given resolution ) or integer number can be converted to an unique binary notation, the binary unit bases, Vit V2, VJj, ... V/ are the fundamental unit basis vectors that are similar to any real axis unit basis. A hypercube spanned by the binary unit bases Vi, V2, Ki, ... V} ( detailed description of the hypercube in later section ). Given a function F( x ) defined in 0 < 84 Chapter 6: Physical insight into how Genetic Algorithms work x < x m a x ( assuming x is an integer ), the corresponding function in the binary domain will be F(ai,a>2r" °', with / = Rounded Integer(log2(xmax))- F(aia2,...ait V\V2,...Vi) is a function defined in a / dimension space spanned by the binary unit basis V\t V2, V3, ... Vi Finding the maximum or minimum of a function ( optimization problem ) If a function F(x) is well behaved ( continuous and monotonic ) in the real number domain of interest, then many conventional gradient based methods can be used to locate the maximum or minimum of such a function. However if the function is poorly defined ( discontinuous, multimodal or highly nonlinear ), most conventional methods will perform poorly since such methods depend on a well defined function gradient. Figure 6.1 shows that a function F(x) can be linear or highly nonlinear, continuous or discontinu-ous in different regions of x . If one wants to locate the maximum or minimum of F(x), either all the values of F(x) over the interested domain are examined sequentially ( analytical form of F may not exist ) or a statistical method such as Monte Carlo is used to give rough estimation of the optimum location ( multiple simultaneous searches are possible but they are very difficult to coordinate ). However, if the real number ( at a given resolution ) domain x is mapped into the binary domain spanned by Vit V2, V3, ••• Vi, then a hypercube is formed. Figure 6.1 shows the mapping from a linear one dimensional variable x to a hypercube. The advantage of such binary mapping is that with a / dimensional space, there can be at least / possible coordinated search directions for the locations of F(x)'s maxima and minima. It turns out that GAs operators Reproduction, Crossover, Mutation are tools that designed for coordinated searches in a hypercube space. GAs search space and the hypercube In GAs, the parameter space is usually encoded in a binary string format. Letting / be the length of the binary string, the parameters search space of a GA will contain 21 ( x m a x ) different combinations. Instead of working on the real number system, GAs map the search space into a hypercube of dimension / ( refer to Chapter 1 "Introduction" ). Since a GA search space is mapped into a hypercube of dimension /, many interesting properties of the hypercube can be used to give some insights into how a GA works. Figure 6.2 shows a four 85 Chapter 6: Physical insight into how Genetic Algorithms work Hypercubesof /dimensions Figure 6.1: Mapping an one dimensional signal into a hypercube dimensional hypercube with its vertices and links. Some of the important properties of a hypercube of dimension / are: 1. The search space of a GA is within the hypercube. 2. The optimum solutions sought by the GA are vertices of the hypercube. 3. The number of bit difference between any two different vertices is 0 < the number of bit difference < I. 4. The minimum number of links between any two vertices on the hypercube is equal to the number of binary bit difference between them. For example, in Figure 6.2, going from vertex #12 ( 1100 ) to vertex #5 ( 0101 ) requires the spanning of at least 2 links. Thus there are 2 bits which are different between these two vertices. 5. Given any vertex on the hypercube of dimension /, the number of neighbours with n binary bit difference are shown in Table 6.1: 86 Chapter 6: Physical insight into how Genetic Algorithms work number of bit number of neighbours difference 0 0 1 / 2 /! / ( 2! ( 1-2 )! ) 3 /! / ( 3 ! (/-3)! ) . . . . . . / - 1 / / 1 Table 6.1 Number of neighbours vs number of bit difference Figure 6.2: A four dimensional hypercube 6.1.2 The Reproduction, Crossover and Mutation operators on the hypercube The Reproduction operator ensures that vertices (individuals ) on the hypercube with good fitness will have many identical clones while vertices with poor fitness will be eliminated. The Reproduction 87 Chapter 6: Physical insight into how Genetic Algorithms work Number of Location of the offsprings bit difference 1 ( no new location ) 2 On a plane ( 4 vertices ) which contains the original parents 3 On a cube ( 8 vertices ) which contains the original parents 4 On a hypercube of dimension 4 ( 16 vertices ) which contains the original parents n On a hypercube of dimension n which contains the original parents. Table 6.2 Locations of offspring c ube n dimensional hypercube plane 1 bit difference 2 bits difference between parents between parents 3 bits difference between parents n bits difference between parents offsprings maximum bit difference from parents = ( n 12) bits original parent (|||| a vertex on the hypercube possible offspring ( after Crossover ) Figure 6.3: The possible locations of the offspring as a function of the bit difference between parents The Crossover operator generates two new vertices ( offsprings ) that are in the proximity of the parents as shown in Figure 6.3. Thus the Crossover operator uses the bit difference information of the parents to generate offsprings such that the space spanned by the bit difference between the parents is searched. Chapter 6: Physical insight into how Genetic Algorithms work operator is responsible for identifying relevant regions ( many identical clones will be located in such region ) in which the Crossover operator can perform intensive multiple direction searchs. The Crossover operator is the most difficult operator to understand. It takes any two individual strings ( vertices ) and randomly selects a crossover point ( the crossover point is randomly selected in a simple GA ) interval 1 < k < / - l ( / = length of binary encoded strings ). Two new strings are then created by changing all bits between position k and / inclusively. For example: Before : B0 = &1&2&3— (6.5) After (k = 2) : B\ = 6 i 0 2 a , T . . . a / _ i a / The Crossover operator can be more easily visualized much better on the hypercube space. Let Ao = ai 02 a . i . . . a j - 1 a / (6.6) BQ = 61626.1. be two vertices on a hypercube with dimension equal to /, using the concept of bit difference between vertices on the hypercube, the following statements can be made: 1. The bit difference between An and Bo determines the location of the offsprings A i and B i on the hypercube ( Table 6.2 and Figure 6.3 ) . 2. If there is n bit difference between strings Ao and Bo, the bit difference between offsprings A i and B i must also be n bits. 3. Depending on the value of k ( the crossover point ), offspring A i and B i can have 0, 1,2, ... , integer ( n/2) bit difference from AQ, BQ respectively ( n is the bit difference between Ao and BQ . 8 9 Chapter 6: Physical insight into how Genetic Algorithms work The Crossover process can also be explained in term of the binary basis V\t V2t Vs, ... V}: 1. Given two vertices An and Bo on a hypercube of dimension / AQ = a\a2a%. ••<*/—IGJ BQ = Ol0203...5/-lO/ (6.7) where : d\ = complement of ai ( l ,0 ) 2. An and Bo can be rewritten as AQ = axVi + a2V2 + 0 3 V3 + — + a/-iV/_i + atVt BQ = aiVi + a2V2 + 63V3 + ... +'oj_iV/_i + a/V/ Let Aj and Bi be the products of Crossover process of Ao and Bo, then Ai = a{V\ + 02^2 + a3V3 + ... + o/_iVj_i + a,Vt Bi = aiVi + a2V2 + a3Vn + ... + ai_iV}_i + a,F, (6.9) ty»7ft A; = 3 (t/te crossover point) 3. Then (6.8) Ao - BQ = (01 - oi)Vi + ( 0 3 - o3)V3 + ... + (o/_i - a,_i)F/_i (6.10) A i - Bi = (ai - 6i)Vi + ( 0 3 - a3)V3 + ... + (6/_i - a/_i)V/_i = (ai - oi)Vi + (a 3 - a3)Vh + ... + (o/_i - ai_!)Vi_i (6.11) where : I4 = — Vi Since the binary basis are the same for Ao - B0 and Ax - BX (6.12) implies that the bit difference between A] and Bi must be the same as that of Ao and Bo. 4. Crossover is similar to exchanging binary basis. For example, the new offspring A i loses the binary vectors V 3 , ... V n ( from Crossover ) and Bi gains the binary vector V 3 , V i + i ( from the Crossover ). Thus the direction A i moves from Ao should be exactly opposite to the direction Bi moves from Bo within the smaller hypercube denned by the bit difference between the parents. 90 Chapter 6: Physical insight into how Genetic Algorithms work While choosing two parents is a random process ( in the conventional GAs ), Crossover is a structure searching process in the hypercube space ( Even though the Crossover point k is randomly chosen, the offsprings will still be inside the hypercube space spanned by the bit difference between the parents ). Since each binary basis signifies a move in a certain direction in the hypercube, exchanging binary basis in crossover is actually a way to use one, two, or all different binary basis between the two parents to find new vertices in the space spanned by all the different binary basis between the two parents. For example, given the following strings: Ao = 01020304030(5 (6.13) 1. If the crossover point k is 2, the offsprings A i will move 02030405 away from Ao and Bi will move 02030403 away from Bo. Thus no new offsprings are created. 2. If the crossover point k is 3, then A i will move 030403 from Ao and Bi will move 020304 from Bo. Thus 3 different bits are tested. The probability of applying the Mutation operator to any bit in a string is usually quite low. Mutation can be seen as a local search algorithm on the hypercube. Mutation operator allows any vertex ( an individual ) to move to any one of its closest neighbour randomly. This operator is a random process and is effective only when the solution is only a few bit differences from the estimated one. Visualizing the combined effect of Reproduction, Crossover, and Mutation process Figure 6.4 shows the combined effect of Reproduction, Crossover on a randomly generated initial population. In step #0, the randomly generated individuals fitness values are calculated. Individuals with poor fitness are eliminated and individuals with good fitness are rewarded with more duplicates ( step #1 ). The Reproduction operator seeks out the important space of the hypercube by letting vertices with good fitness to have more duplicates. 91 Chapter 6: Physical insight into how Genetic Algorithms work Although Crossover is a stochastic operator (the parents are randomly chosen and the crossover point k is randomly generated ), its outcome ( offspring ) is quite predictable. It is seeking a combination of bit difference from the two individual parents to generate offsprings that may be closer to the solutions space. Since there are more copies of individuals with good fitness, there could be several different mates for such individuals. Thus the hypercube space ( or bit patterns ) around such individuals is searched more intensively ( step #2 ). The combination of the GA operators Reproduction and Crossover gives GA the searching power. The offsprings of the 2nd generation ( step #3 ) fitness are calculated. The process of Reproduction, Crossover is repeated. Mutation operator is a local searching technique ( on the hypercube ). It is effective when the estimated solution is only one or two bits from the optimum one. The searching power of GA is demonstrated since space around good individuals are searched from multiple directions simultaneously with coordination. After only a few generations, one can see that more and more offspring will be much closer to the optimum solutions. Hz. Chapter 6: Physical insight into how Genetic Algorithms work Offsprings ( 2 nd generation ) Reproduction ( 2 nd generation ) Crossover ( 2 nd generation ) Figure 6.4: The combined effect of Reproduction, Crossover 6.2 The importance of encoding the parameter space properly The effectiveness of GAs in searching toward the optimum value depends on how the parameters are encoded. Improper encoding of the parameters for GAs can increase the difficulties of the search similar to that of finding a needle in a haystack. For example, an impulse-like fitness function ( Figure 6.5 ) would reduce GAs ( or any other methods ) to something no better than a blind search. Fitness(x) x Figure 6.5: An impulse-like fitness function 93 Chapter 6: Physical insight into how Genetic Algorithms work In one of the experiments of system identification using GAs, we tried to reduce the size of the global population and increase the parameters convergence rate by shortening the individual binary string length. Instead of using the conventional GAs, a new combination GA is devised. Figure 6.6 shows how the conventional GA is divided into "coarse" GA and "fine" GA. The "coarse" GA will try to give a rough estimation of the system parameter ( Figure 6.7, using a coarse grid space — less bit ) to the "fine" GA. The "fine" GA will use the rough estimation of system parameters from "coarse" GA as a bias point to set a search space with much finer grid size to achieve the proper resolution. Since "coarse" GA is designed to give only a rough estimation of the parameter, its parameter convergence rate should be much higher than that of the conventional GA. Table 6.3 shows the number of bits used by the conventional, the "coarse" and the "fine" GA. System parameters Input/Output System Measurement global population System Model fitness function GA operators ( window size ) GA (Sequential Genetic Algorithm) Best fit system parameters Input/Output System Measurement System parameters Local population System Model » fitness function GA operators (local window size ) "coarse" GA System Model fitness function Best estimated parameter ( Bias Point for the Fine grain GA) GA operators (local window size ) System parameters Local population . Bes t es t imated p a r a m e t e r s Tine" GA The new combination GA Figure 6.6: Block diagram of conventional GA and new combination GA 94 Chapter 6: Physical insight into how Genetic Algorithms work Coarse GA search grid FineGA search grid Simplified 2 dimensional parameter space Figure 6.7: "Coarse" and "Fine" GA grid size Parameters Conventional GA ( "Coarse"GA "Fine" GA bits used ) ( bits used ) ( bits used ) d (delay ) 2 2 2 a 7 4 -5 4 - 5 P 7 4 -5 4 - 5 7 7 4 -5 4 - 5 6 7 4 -5 4 -5 g 7 4 -5 4 - 5 Length of 37 22-27 22-27 individual binary string Table 6.3 Number of bits used for the system parameters Since the string length of the "coarse" and the "fine" GAs are less than the original one, then based on [8] theory, the population size of "coarse" and "fine" GA can be reduced significantly. Thus the execution time of the new combination GA could be much shorter than that of the conventional 95 Chapter 6: Physical insight into how Genetic Algorithms work GA. The system model, fitness function and setup in the experiment are identical to those described in Chaper 5. The experimental result of the new combination GA is shown in Table 6.4. "Coarse" GA Length of Results Population size individual binary string 24 22 failed to converge 27 good ( converge rapidly ) 30 22 failed to converge 27 failed to converge 50 22 failed to converge 27 slow convergence rate Table 6.4 The performance of "coarse" GA with various size of population Contrary to our expectations, the performance of the new combination GA was poor and unreliable. For individual string length of 22, the "coarse" GA failed to converge for all three different population sizes ( 4 bits are used to encode a, /?, 7, 6, g ). With string of 27 bits long ( 5 bits for each parameter ), "coarse" GA is still unreliable. In an attempt to reduce the string length for the "coarse" GA, the parameters were encoded incorrectly. By encoding the parameters into only four or five bits ( Only the most significant bits of the parameters are used ), all bits have equal importance. In the other word, any individual that does not match completely the solution bit by bit will have a poor fitness. From the hypercube model, the solution space for the "coarse" GA is reduced to a few locations on the hypercube. This is similar to having a impulse-like fitness function and a GA is no better than a random blind search. If seven bits are used for each parameter, out of the 37 bits in an individual string, 15 bits ( the lower order bits of the encoded parameters ) are very similar to the "don't care" bits. Such individuals have fitness values very similar to the optimum ones. With 15 "don't care" bits, more than two or three thousands of such individuals will locate at many sections of the hypercube ( see the neighbours of the hypercube in Table 6.1 ). For some applications ( system identification ) and 96 Chapter 6: Physical insight into how Genetic Algorithms work certain combination of / and D, ( / = the dimension of the hypercube and D = the number of "don't care" bit, the probability of finding a good solution ( not the optimum ) can increase significantly. 6.3 Comments and discussion While Holland's schemata theorem gives an excellent explanation to the searching power of GAs, it does not provide a good physical model of GA since the theorem is based on stochastic models. We have shown that by mapping the search space of GAs into a hypercube, a working physical model of GAs can be built. With a physical model, GA mechanisms can be analyzed with more precision. Such a physical model may also allow one to further fine tune the GA mechanisms for better performance in a specific area. The physical model gives a more precise explanation of GAs searching power, and it can help to resolve many other future technical issues such as the following: A more realistic estimation of GA's population size In the past, the schemata processing rate [11] ( based on Holland schemata theorem ) is used to estimate the optimum population size. The analysis [8] may be too conservative since it states that the population size of GAs should increase exponentially as string length is increased. Such analysis contradicts with concept of the bit difference geometry of the hypercube. How to encode the parameter space In the past, trial and error was used to find the optimum encoding for the parameters. The "coarse" and "fine" GA example illustrates how easily a mistake can be made in encoding. The hypercube physical model explained why "coarse" GA performed so unreliably. In certain applications, stuffing binary string with "don't care" bits can help GAs to find good solutions ( vertices ) on the hypercube easier. For example, given the following binary string: a 1020,104 . . . ai^ai-\ai (6.14) it can be converted to Chapter 6: Physical insight into how Genetic Algorithms work Oi * 02 * 03 * 04 ... aj_2 * * a,i (6.15) The "don't care" bits " * " increase the bit range spanned by the number of vertices with fitness that are close to optimum, and thus increase the chance that GAs can find good solutions. Although the global search space increases exponential with number of bits in the binary string ( 2l ), the dimension of the search space increases only linearly ( / ). Initial randomly generated population replaced by predefined populations A different initial GAs population pattern can be used instead of a randomly generated one. Some prior knowledge of the search space, such as minimum space resolution, can be used to improve convergence rate. Such a technique can provide a better coverage of the search space than a randomly generated one. Improving the Crossover operator Choosing two individuals for Crossover should not be totally random. The hypercube physical model of the search space states that 1. Crossover between parents of 1 bit difference produces no new offsprings 2. Crossover between parents with bits difference in the "don't care" region produces no "new" offsprings. If the above precautions are taken, the Crossover operator should become much less noisy ( stochastic ) and behaves more like a structure search. Modified GAs are powerful multidimensional, almost deterministic ( with some stochastic favors ) optimization algorithms. 98 11 Chapter 7: Conclusions Chapter 7 Conclusions The G A was used to estimate the compliance of the hydraulic system on the UBC teleoperated heavy duty excavator. Using real recorded data from the excavator, the G A successfully identified the compliance of the boom hydraulic circuit The accuracy of the estimated compliance from G A was cross checked by a simulator. We found that the simulator output PRCM agreed very well with measured PBCH ( Figure 3.8, within 10% ), indicating that the estimated compliance CBCH was accurate. G A was also used to estimate the compliances of multiple links simultaneously. The agreement between the simulation and estimated internal states of the hydraulic circuit ( shows in Figure 3.9 to Figure 3.11 ) indicated that both the estimated compliances and the estimated internal states of the hydraulic system from G A were accurate ( within 10% ). The searching ability and power of G A were well demonstrated in this example. Different algorithms and architectures were studied for a parallel implementation of G A . For the G A used in system identification, both population and fitness calculation could be parallelized. Such combination allows the use of the pipeline techniques, which improved the efficiency of the parallel system. A P G A was implemented with sixteen T800 Transputers. It achieved a speedup factor of 12 over a traditional G A . This P G A was used to identify the compliance of the boom hydraulic circuit ( Chapter 3 ). The estimated compliance from this P G A was almost identical to the one from traditional G A . With such a high speedup factor, real-time monitoring of hydraulic compliance and other hydraulic parameters is becoming possible. While many papers on " P G A in function maximization" can be found in the literature, there seems to be no published paper on " P G A in system identification". This may be because of the inherent disadvantages in P G A , such as slower parameter convergence rate and lack of guarantee that it will converge to an optimum. In Chapter 5, we addressed the different requirements between function maximization and system identification and introduced three new mechanisms: the distributed fitness function, the broadcasting 100 Chapter 7: Conclusions mechanism and the active error analysis that could be used to enhance the performance of PGA. Simulation results confirmed the effectiveness of these three mechanisms. A PGA which incorporates these three mechanisms can actually outperform GA in key areas such as variance of the estimated parameter and parameter tracking ability. There are many conventional methods available for linear system identification. If the system is nonlinear, the identification problem is much more complex and difficult The GA is not recommended for nonlinear system identification if other identification methods can be used. However there are many nonlinear systems where conventional identification methods perform poorly or fail, the GA may become the only alternative method available. Even through there is no guarantee that GA will converge to an optimum solution, from our experience with nonlinear systems identification using the GA, we find that the GA usually provides solutions that are close to the optimum ones. In Chapter 6, we derived a physical model that explain the fundamental properties of GAs. The physical model ( a hypercube ) not only provide an excellent explanation of GA searching power, but also provide insights to GAs user ways to improve the performance of GAs in most applications. The hypercube physical model also explains why the GA and PGA are so successful in locating solutions near the optimum one in nonlinear system identification. In a typical nonlinear system identification application, the parameters are encoded in a binary string as follow: a\/3\f\6 ...|A (7.1) aia2a^..anbib2b^...bm... Where n bit is used to encode a, m bit is used to encode /?, ... etc. Usually only half of the bit used for each parameter is significant ( the most significant bits of the parameters, eg. 0102030405). The insignificant or unimportant bits of the parameters can be classified as the "don't care" bits. Thus with so much "don't care" bits encoded in the binary string, the probability for the GA or PGA to find a solution in the hypercube space is quite high. Finally, The fundamental properties of GAs are as followed: 1. By converting parameter search space from a real number system into a binary number system, GAs increase the dimension of the search space. Higher dimension search space allows multiple 101 Chapter 7: Conclusions search with different directions, thus increasing the efficiency of function maximization process in the search space by at least / times the linear method. 2. The hypercube configuration is an excellent tool to illustrate the operation of GA operators: Reproduction, Crossover, and Mutation. By eliminating individuals with poor fitness and cloning individuals with good fitness, the Reproduction operator can quickly locate areas of interest on the hypercube. On the hypercube, Crossover can be seen as a directed structural search method that uses the difference between the two parents to produce offspring in the direction ( derived from the bit difference between parents ) toward the optimum. Mutation is just a simple random neighborhood search around any vertex. The physical model provides a new way to re-evaluate most of the important parameter settings of the GAs. From the last chapter, we point out that it is possible to derive: 1. A realistic population size that grows linearly instead of exponentially with respect to the binary string length ( / = dimension of the hypercube ) for certain applications. 2. The Crossover operator can be tuned so that it is behaves much more like a structural search and resembles much less as a stochastic operator. 3. Encoding the parameters properly so that premature convergence or blind search can be avoided. With all the technical issues resolved, the structured GA is a powerful multidimensional, deterministic ( with some stochastic flavor ) optimization algorithm. 102 Appendix A The boom hydraulic circuit and its model Voltage-Spool Transfer function Spool Dynamic Spool-Orifices Transfer function Hydraulic circuit and First order pipe model VBOOC) Desired SpoolQc) aibo(k) Spool(k) aebo(k) aobo(k) ( main valve orifices area ) Boom Angle(k) PBOi(k) PBOo(k) Figure A . 2 : The mathematical model of the boom hydraulic circuit A . 102 The Voltage-Spool Transfer function for the boom: Desired Spool(k) = 0.0892FJo + 0.2352Fjo + 0.2815Veo + 0.1297 The Spool dynamic: Discrete time model : Spool(k) = Desired Spool(k) x (l - e-a'T) + e^aoT) x Spool(k) The Spool-Orifices Transfer function: HminSpooip] = the dead band in the Spool = 0.188 inch. aebo-' if ( Spool <= HminSpool[l] ) aebo = (-7.85 in. X Spool) + 1.5235 in. else if ( Spool <= 0.376 in.) aebo - -0.25 in. x (Spool - 0.376 in.) else aebo = 0.0 in.; Uibc if ( Spool <= HminSpooirj] ) aibo = 0.0 ; else if ( Spool <= 0.376 ia) a.to — 1-247 in. X (Spool — HminSpool[l]) else a^o - (3.927 in. x (Spool - 0.376)) + 0.2346 in. A. 103 Oobo if ( Spool <= HminSpool[l] ) aobo=0-0 in.; else if (Spool < 0.248 in.) a0bo = (Spool - 0.248 tn.) X HminSpool[l] + 0.0113 in. (A.7) else if ( Spool < 0.376 in. ) a0bo = (Spool - 0.376 in.) x 1.09 in. + 0.1509 in. (A.8) else a0bo = (Spool - 0.376 in.) x 3.927 in. + 0.1509 in. (A.9) A.104 Appendix B Dynamic equations for hydraulic system in configuration a, branch #1 I I Tank > 'BUo ^aobu ^ hose hose BOo Main Hydraulic Valves Pump 1 Junction #1 ^aibu ^ | Q B , Ui BUi P12 > Bucket Actuator Bucket Spool aebu | Qn Junction #2 ^ ^ a e b o Q12 Pe Tank Boom Actuator ^ aibo ^ QBOI hose BOi | | Tank Boom Spool ^ a o b o Oo hose BOo Figure B. l : The branch #1 of the main hydraulic valves in configuration a B.105 B.l Equations describing the hydraulic configuration a branch #1 Measured variables : PBUI, PBOi, #BU, OBO, VBU, VBO (or SpoolBu and SpoolBo) Calculated from measured variables : PBUi, PBOi, XBU, XBO, the orifices areas (B.l) Estimated parameters : (pump side compliance) CBIH, CBOi Unknown internal states : QBUi, QBOi, Qn, Q12, Pu, P\i First order pipe model : CBOi ^ = QBOi - A-BOiX-BO (B.2) CBUi—^ = QBUi - AsuiXsu Where A B O J = the boom actuator pump side piston area, A B U I = the actuator pump side piston area, Equations for hydraulic flow : QBUi(t) = k aibu(t) y/Pn{t) - PBui(t) Quit) = k aebu(t) y/Pn(t) - Pu(t) (B.3) QBOi(t) = k aiboitWPnit) - PBoi{t) <3i2(0 = k aebo(tWPn(t) - Pe Flow Constraint: Qt = Qn + QBUi (B.4) Qn = Qn + QBOi B.106 B.2 Simplified orifices area calculation The simplified hydraulic model does not contain the Voltage-Spool and the Spool-Orifices transfer functions. It assumes a step input and the orifices areas are described by the following equations: The dynamics of the hydraulic valves : o.'6u(0 = aibui ( l - e x p - 0 0 ' ) aebo(t) K i m O - aebui)exp~aot + a e b u l a , - 6 0 i ( l - exp-aat) (a e 6o0 - O e 6 o i ) e x p - O o t + aebol (B-5) B.107 Appendix C Speedup and efficiency definitions Speedup is defined as: _ , GA execution time for 1 processor Speedup = — GA execution time for N processores where : GA execution time for 1 processor = GA fitness calculation time(tp) + GA operator time(ts) GA execution time for N processors — GA fitness calculation time(tp/N) + GA operator time(t8) tc = Communication overhead for N processor. Thus : Speedup = . iP^~ ts— ;s t s if ts >> tc % + ts + tc ft + «. (Cl) Efficiency is defined as: „ ,,. Speedup obtained from N processsors Speedup . _ Efficency = - = (C.2) C.108 Appendix D Overview of available computational resources Details of the available hardware resources are shown in Figure D . l . The host computer is a Sun 3/280 68020 based CPU equipped with a terminal board, an Ethernet interface, and a disk storage subsystem. The distributed system consists of the following modules: 1. The Bus Masters ( Parsytec-BBKV2 ): A transputer bus master is composed of an T800 transputer with 2 MBytes of dual-ported memory which is asynchronously accessible by the transputer and the VMEbus. The bus master also performs automatic conversion of data format, allowing compatible messages to be exchanged efficiendy with the host computer. 2. Clusters ( Parsytec-VMTM ): Each cluster module consists of four Inmos T800 transputers each with 1 MByte of local memory, nine external serial ports, a 32x32 crossbar switch, and four link adapters for interfacing a transputer connection link to the VMEbus. The external serial ports and the crossbar can be used to connect a transputer from a cluster module to another transputer on any other cluster module. The 4 link adapters on the VMEbus provide the facilities to support broadcasting of data from the host computer or the Transputer bus master. The Transputer T800 architecture: The Transputer is designed and built on Hoare's concept "Communication Sequential Processing" [10]. Al l processes on the Transputers communicate synchronously through soft ( internal channel ) or hard ( external transputer links ) channels and therefore all concurrent processes synchronize automatically. The Transputer has a built in scheduler for all its processes. Any process that is waiting for input will be de-scheduled until its data become available. Thus the CPU does not spend any time idling. With high speed data links of ( 20 Mbit/s ) and direct memory addressing to support the hard link features, there is very little interruption for the local CPU for communication among two Transputers. Such features make parallel processing on multiple Transputers feasible and efficient. The T800 Transputer hardware block diagram is shown on Figure D.2. D.109 Interconnection Network 32 bits Interconnection Network Figure D.| Available hardware resources Floating Point Unit System Services Timer 4 Kbyte On-chip RAM Externa! Memory Interlace Link Interface Link Interface Link Interface Link Interface 32 bits Figure The block diagram of the Transputer T800 D.IK? Appendix E PGA implementation details As discussed in Chapter 3 "Identification of hydraulic compliance with GA", to identify the single link hydraulic compliance, the global population size is set between 100-120. The ratio a is in the range between 8 to 16. The minimum local population size Mioc is set at 30 to protect the diversity of the local population ( using the criteria discussed in next Chapter ). To achieve as much speedup as possible, algorithm A and algorithm B are combined. The maximum speedup is: 120 8 + 1 Max. speedup = —— x —-— « 16 (E.l) oU 2 Level 1: global population ( algorithm B ) Given the minimum local population size of 30 and a global population of 120, there will be 4 local GAs. The 4 local GAs are connected in a ring network using the hard link of the Transputer. Level 2: The Fitness Calculation ( algorithm A ) With a local population size of 30, fitness calculation time tp2 and the sequential GA operators time ( Reproduction, Crossover, Mutation ) U2 accounts for 80% and 20% of the execution time respectively. Such high ratio of a ( tp2 / to ) fits the master- slaves scheme requirement. Each local GA unit will consist of one master and several slave processors. The master processor distributes individuals from the local population to slave processors. Slave processors calculate the fitness value and return the result to the master. For the master-slaves scheme to work efficiently, the total communication time tc2 between master and slaves must be much smaller than tS2- The T800 Transputer high speed serial link ( 20 Mbit/s ) should be able to cut down tc2 substantially. Due to the limited number of T800 transputers available, 3 slaves are used for each NGA. The total number of Transputers used in this PGA implementation is 16. With one master and 3 slaves, the data on the communication channel will be: 1. the length of individual string times the number of individuals ( 240 bytes ) E.111 2. the input and output measurements of the system ( 148 bytes ) 3. the fitness values of the individuals ( 240 bytes ) The total size of data is approximately 600 bytes. Assuming direct connection between the master and slaves, ( in direct connection, each slave is connected to the master by a dedicated serial link ), the communication overhead (task switching) on the Transputer should be small, and the master-slave communication time will then be approximately equal to the transmission time of data on the serial links. With transmission rate of 20 Mbit/s, the master- slaves communication time would be 0.4 ms. Since 0.4 ms is much smaller than the measured GA operators time ( 10.9 ms ), the fitness calculation can be successfully parallelized. Figure E . l shows the software block diagram of the master-slave scheme. Due to the limited number of Transputer serial channels, only 1 link on the master node is available for sending and receiving data to and from slave processors. Figure E.2 showed the hardware implementation of the PGA. The communication overhead tc2 would increase since direct connection is not possible and data must be re-routed among slave processors. To make sure that in the worst case master-slave communication time would still be small, the timing of processes are measured. The results of the measurement are summarized on Table E . l . It is important to note that the measured t& is significantly higher than the computed one ( 1.89 ms vs 0.4 ms ). However it is still much smaller than the GA operators time ( 1.89 ms vs 10.5 ms ). With 1 master and 3 slave processors, the measured speedup factor for level 2 parallelisation is 3.0, which agrees with the calculated speedup factor. The combined speedup factor for algorithm A and B is: Combined speedup = level 1 speedup x level 2 speedup Mloc l a + 1 The pipeline technique is not used in here because 1. The communication overhead tc2 will increase rapidly if direct connection between master-slaves is not possible ( pipeline implementation will double the number of slaves ) E.112 2. a is relatively small in this example and the difference between pipeline and nonpipeline implementation speedup factor will be small. GA, PGA with algorithm B, and PGA with combined algorithm A & B achieved speedup factors are listed in Table E.l for comparison. | Initialization""! Local GA node (with local population size of 24) 1 Sample data I Master Fitness Calculation! Slave 1 T Slave 2 Slave 3 Fitness Calculation T Fitness Calculation 1 Fitness Calculation Average all fitness value Reproduction ( Is there a V super-individual 7 J protect diversity yes Ranking Reproduction Recombination Mutation Keepbest X Recording X Nextgeneration GA Operators Exchange To other local GAs From other local GAs Figure E.1: Software block diagram of the master-slave scheme E.113 VGA2 NGA3 SUN3/60 User InUrface link-3 NGAO Unk-0 Iink-1 NGA1 llnk-2 Slavel Slave 2 Slave 3 •T800 Transputer H i g h s p e e d Data link Figure E.2: The Implementation of a PGA ( algorithm A and B ) Master ».Q Start t Tl Send data T2 Slave #1 Slave #2 T3 Fitness Calculation i ^ ^Fitness Calculation^ fitness Calculation' ^T4 T5 Receive data T6 V Operators I -6 End T7 Defination: Send & Receive time (Tsr) =T5 -T2 GA operator time (Tga) = T7-T6 Fitness function time (Tfit) =T4 -T3 Total execution time (Texe) = T7 - T l Communication Overhead ( Tcom ) = Tsr - Tfit Figure E.3: Measurement Method used for Timing Process E. l 14 Number of slave processors 0 1 2 3 Execution time ( Texe ) 97.0 ms 54.0 ms 39.7 ms 32.8 ms Send - Receive time ( Tsr ) 86.4 ms 43.3 ms 29.5 ms 22.5 ms Fitness calculation time ( Pom2 ) 86.4 ms 43.3 ms 29.0 ms 21.8 ms Communication Overhead ( Cfaxa ) 0.0 ms 0.6 ms 1.28 ms 1.89 ms GA operator time ( Stung ) 10.S ms 10.5 ms 10.5 ms 10.5 ms Table E . l Measurement of processes timing GA and PGA in identifying single GA Level 1 Level 2 Combined link Algorithm Algorithm Algorithm hydraulic compliance : (B) ( A ) (A + B) Number of processors used 1 4 4 16 Global Population size 120 120 N/A 120 Local population size N/A 30 30 30 Number of generations calculated per sec. 9.8 N/A N/A 117.6 Speedup achieved 1 4 3 12 Efficiency ( Speedup / processors ) 1 1 0.74 0.74 Table E.2 The measured speedup factor of various PGAs E.115 References [I] An and H. Chae. Model-based control of a robot manipulator. The MIT Press, Cambridge, Massachusettes, 1988. [2] B. Armstrong. On finding 'exciting* trajectories for identification experiments involving systems with non-linear dynamics. Proc. IEEE Int. Conf. Robotics and Automation, pages 1131-1139, 1987. [3] An C.H. Atkeson and C.G. Griffiths. Experiment evaluation of feedforward and computed torque control. Proc. IEEE Int. Conf. Robotics and Automation, pages 165-168, 1987. [4] An C.H. Atkeson and Hollerbach. Estimation of inertia parameters of rigid body links of manipulators. Proc. 24th IEEE Conf. Decision and Control, pages 990-995, 1987. [5] R. Das and D. E. Goldberg. Discrete-time parameter estimation with genetic algorithms. The Modelling and Simulation Conference Proceedings, 1988. [6] Edward D. Lazowska Derek L. Eager, John Zahorjan. Speedup versus efficiency in parallel systems. IEEE Transactions on Computers, 38(3):408^22, March 1988. [7] David Goldberg. Optimal initial population size for binary-coded genetic algorithms. TCGA Report No. 85001, 1985. [8] David Goldberg. Sizing population for serial and parallel genetic algorithms. The International Conference on Genetic Algorithms, 89, pages 70-79, 1989. [9] David Goldberg and Jon Richardson. Genetic algorithms with sharing for multimodal function optimization. Genetic algorithms and their applications: Proceedings of the Second International Conference on Genetic Algorithms, pages 41-49, 1987. [10] C.A.R. Hoare. Communicating sequential processes. Communications of the ACM, 8:666-677, August 1978. [II] J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, 1975. E.116 [12] K. Kjistinssoa Genetic algorithms in system identification and control. Master's thesis, Univ. of British Columbia, Vancouver, B.C., 1988. [13] K. Kristinsson and G.A. Dumont. Genetic algorithm in system identification. IEEE Interna-tional Symposium on Intelligent Control, 3, 1988. [ 14] N. Sepehri G. Dumont P.D. Lawrence and F. Sassani. Cascade control of hydraulically-actuated manipulators. Robotica, 8:207-216, 1990. [15] N . Sepehri P.D. Lawrence and F. Sassani. Gear backlash and stick-slip friction in a teleoperated heavy-duty hydraulic manipulator. Proc. Canadian Conference on Electrical and Computer Eng., pages 45-49, 89. [16] Crisila B. Pettey Michael R. Leuze and John J. Grefenstette. Genetic algorithms on a hypercube multiprocessor. Hypercube Multiprocessors, pages 333-341, 1987. [17] Crisila B. Pettey Michael R. Leuze and John J. Grefenstette. A parallel genetic algorithm. Proceedings of the Second International Conference on Genetic Algorithms, pages 155-161, 1987. [18] B. Manderick and P. Spiessens. Fine-grained parallel genetic algorithms. The International Conference on Genetic Algorithms, 89, pages 428-433, 1989. [19] D. McCloy and H.R. Martin. The Control of fluid power. John Wiley and Sons, New York, 1982. [20] H.E. MerriL Hydraulic Control System. John Wiley and Sons, New York, 1967. [21] Chrisila C. Pettey and Michael R. Leuze. A theoretical investigation of a parallel genetic algorithm. The International Conference on Genetic Algorithms, 89, pages 398-405, 1989. [22] M.B. Priestley. Non-Linear and Non-Stationary Time Series Analysis. Academic Press, New York, 1988. [23] Nariman Sepehri. Dynamic Simulation and Control of Teleoperated Heavy-Duty Hydraulic Manipulators. PhD thesis, Univ. of British Columbia, Vancouver, B.C., 1990. [24] Reiko Tanese. Parallel genetic algorithm for a hypercube. Genetic Algorithms and Their Applications: Proceedings of the Second International Conference on Genetic Algorithms, E.117 pages 177-183, 1987. Reiko Tanese. Distributed genetic algorithm. The International Conference on Genetic Algorithms, 89, pages 434-439, 1989. E.l 18