Noisy Optimal Control Strategies for Modelling Saccades by Garrett L. Jones B.A. Computer Science and Physics, State University of New York at Geneseo, 2009 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE STUDIES (Computer Science) The University Of British Columbia (Vancouver) December 2011 c Garrett L. Jones, 2011 Abstract Eye movements have for a while provided us a closer view into how the brain commands the body. Particularly interesting are saccades: fast and accurate eye movements that allow us to scan our visual surroundings. One observation is that motor commands issued by the brain are corrupted by a signal-dependent noise. Moreover, the variance of the signal scales linearly with the control signal squared. It is assumed that such uncertainty in the dynamics introduces a probability distribution of the eye that the brain accounts for during motion planning. We propose a framework for computing the optimal control law for arbitrary dynamical systems, subject to noise, and where the cost function depends on a statistical distribution of the eye’s position. A key contribution of this framework is estimating the endpoint distribution of the plant using Monte Carlo sampling, which is done efficiently using commodity graphics hardware in parallel. We then describe a modified form of gradient descent for computing the optimal control law for an objective function prone to stochastic effects. We compare our approach to other methods, such as downhill simplex and Covariance-Matrix-Adaptation, which are considered “gradient-free” approaches to optimization. We finally conclude with several examples that show the framework successfully controlling saccades for different plant models of the oculomotor system: this includes a 3D torque-based model of the eye, and a a nonlinear model of the muscle actuator that drives the eye. ii Preface This thesis was not done in solitary confinement. Credit goes to my supervisors, Dinesh Pai and Uri Ascher, who helped formulate the research problem in question, and provided guidance and insight into the design of the algorithms developed during the completion of this thesis. In particular, it was Dinesh’s idea to use Monte Carlo sampling on the GPU, and Uri’s idea as to how to effectively calculate the gradient of a error-prone cost function during gradient descent. Data on the forcelength and force-velocity curves for the extraocular muscles were obtained directly from Qi Wei. I was responsbile for developing the core algorithms and software relevant to the thesis at hand, as well as the examples presented. iii Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Minimizing the variance . . . . . . . . . . . . . . . . . . . . . . 4 1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.3 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Modeling Saccades . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Modeling the eye . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.1.1 A simple 1D plant . . . . . . . . . . . . . . . . . . . . . 9 2.1.2 A simple 2D plant . . . . . . . . . . . . . . . . . . . . . 11 A 3D plant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.2.1 Accounting for muscle mass . . . . . . . . . . . . . . . . 12 2.2.2 Listing’s Plane . . . . . . . . . . . . . . . . . . . . . . . 13 2.3 A nonlinear muscle model . . . . . . . . . . . . . . . . . . . . . 14 2.4 Control of eye movements . . . . . . . . . . . . . . . . . . . . . 16 2.4.1 16 2.1 2.2 Pulse-step control . . . . . . . . . . . . . . . . . . . . . . iv 2.4.2 3 The slide component . . . . . . . . . . . . . . . . . . . . 18 2.5 Optimal control and motion planning . . . . . . . . . . . . . . . . 19 2.6 Optimal control in the presence of noise . . . . . . . . . . . . . . 20 2.6.1 Open-loop control . . . . . . . . . . . . . . . . . . . . . 21 2.6.2 Feedback control . . . . . . . . . . . . . . . . . . . . . . 21 2.6.3 Tasked-based optimization . . . . . . . . . . . . . . . . . 22 Monte Carlo Simulation on the GPU . . . . . . . . . . . . . . . . . 23 3.1 Why Monte Carlo simulation? . . . . . . . . . . . . . . . . . . . 23 3.2 Monte Carlo sampling on the GPU . . . . . . . . . . . . . . . . . 25 3.2.1 Sampling the plant . . . . . . . . . . . . . . . . . . . . . 26 Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.3.1 Estimating mean and variance . . . . . . . . . . . . . . . 28 3.3.2 Performance on the GPU . . . . . . . . . . . . . . . . . . 29 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 An Optimal Control Framework for Noisy Saccades . . . . . . . . . 35 4.1 Introducing an open-loop framework for control . . . . . . . . . . 35 4.2 The framework . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 4.3 Gradient-free methods . . . . . . . . . . . . . . . . . . . . . . . 37 4.3.1 Downhill simplex . . . . . . . . . . . . . . . . . . . . . . 38 4.3.2 Covariance Matrix Adaptation . . . . . . . . . . . . . . . 38 Gradient-based methods . . . . . . . . . . . . . . . . . . . . . . 38 4.4.1 Steepest descent . . . . . . . . . . . . . . . . . . . . . . 39 4.4.2 Broyden-Fletcher-Goldfarb-Shannon method . . . . . . . 39 Testing the framework . . . . . . . . . . . . . . . . . . . . . . . 40 4.5.1 A 1D solution . . . . . . . . . . . . . . . . . . . . . . . . 40 4.6 Comparisons and results . . . . . . . . . . . . . . . . . . . . . . 42 4.7 Driving a 3D plant . . . . . . . . . . . . . . . . . . . . . . . . . 45 4.8 A nonlinear muscle model . . . . . . . . . . . . . . . . . . . . . 46 4.9 Implementation in CUDA . . . . . . . . . . . . . . . . . . . . . . 52 4.10 Summary and conclusions . . . . . . . . . . . . . . . . . . . . . 54 Conclusions 58 3.3 3.4 4 4.4 4.5 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v 5.1 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 vi List of Tables Table 4.1 Performance comparison between steepest descent, BFGS, CMA, and Simplex for a 15◦ saccade with F = 10 ms . . . . . . . . . vii 45 List of Figures Figure 1.1 Equipment for recording eye movements . . . . . . . . . . . 2 Figure 1.2 Measurement of saccades . . . . . . . . . . . . . . . . . . . . 3 Figure 2.1 Illustration of the ocular muscles . . . . . . . . . . . . . . . . 8 Figure 2.2 Activation and passive muscle forces . . . . . . . . . . . . . 15 Figure 2.3 Activation force plus the passive force . . . . . . . . . . . . . 15 Figure 2.4 Force-velocity profile of muscle . . . . . . . . . . . . . . . . 16 Figure 2.5 Motor neuron signal during a saccade . . . . . . . . . . . . . 17 Figure 3.1 Plot of trajectories from a linear time-invariant stochastic process 27 Figure 3.2 Plot of a noise free control signal . . . . . . . . . . . . . . . . 28 Figure 3.3 Residual error from Monte Carlo sampling . . . . . . . . . . 30 Figure 3.4 Performance plot on the GPU . . . . . . . . . . . . . . . . . 31 Figure 3.5 Histogram of endpoint positions for 3D eye . . . . . . . . . . 34 Figure 4.1 Diagram of our framework for computing optimal control laws 37 Figure 4.2 Relative error in calculating the cost function vs the number of particles traced . . . . . . . . . . . . . . . . . . . . . . . . . Figure 4.3 Plot of trajectories generated from different optimization methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Figure 4.4 43 44 Saccade trajectory generated for a 3D plant of the oculomotor system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 Figure 4.5 Diagram of 1D eye . . . . . . . . . . . . . . . . . . . . . . . 48 Figure 4.6 Endpoint distribution for 1D eye using a nonlinear muscle model 51 Figure 4.7 Trajectory for 5◦ saccade of 1D eye using a nonlinear muscle model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii 52 Figure 4.8 Control signal for 5◦ saccade of 1D eye using a nonlinear muscle model . . . . . . . . . . . . . . . . . . . . . . . . . . . . Figure 4.9 Trajectory for 15◦ 53 saccade of 1D eye using a nonlinear muscle model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 Figure 4.10 Control signal for a 15◦ saccade of 1D eye using a nonlinear muscle model . . . . . . . . . . . . . . . . . . . . . . . . . . 55 Figure 4.11 Sampling the cost function in CUDA . . . . . . . . . . . . . 56 Figure 5.1 61 Trajectories for 3D plant with artificial mass . . . . . . . . . . ix Acknowledgments There are many people before and since my arrival at UBC that I must thank. I am grateful for my advisors: Uri Ascher and Dinesh Pai, who let me into their worlds. They offered great advice, and were immensely patient and supportive as deadlines drew suddenly near. Dinesh, in particular, allowed me onboard the “Sensorimotor Train” my first summer in Vancouver, which is where I came to know and befriend so many wonderful people. Last but not least, I would also like to thank Jim Little for his help proof-reading this thesis and seeing it to the finish line. To my colleagues and friends who shared a special interest in the eye: Martin, Mahk, Qi, and Miriam. I had many great discussions with them on eyeballs and motor control. Qi was especially helpful in providing data instrumental to the completion of this thesis, and Miriam helped tremendously with operating the eye tracker. I’d also like to thank Sang Hoon, for giving me guidance on several occasions, as well as Martin and Shin who proof-read this thesis on its way to completion. And so that brings me to acknowledging all my labmates: old, current, and new, with whom I had an amazing time working and playing with. In no particular order, I’d like to thank my collegaues who’ve been there since what felt like the beginning: Shin and Dave, for all the late-nights we enjoyed together; Tim, for his advice on why scones are actually reheated; Bin, for not letting me drown; Josh, for finding us food, and Mahk, for feeding us; Sang Hoon, who was gracious enough to share his coke; Martin, who was always great for hugs. Outside the lab I’d like to hail my other friends and family, and all the times we had. I also want to reserve special thanks for my mother, who gave so much in raising me; she reminds me to believe that there’s so much more to life. x Chapter 1 Introduction The human eye rotates in a variety of ways in order to focus within its visual surroundings. When we search a map for our favourite coffee corner or shift our attention from one person’s face to another, we generally accomplish this without even thinking. Yet, these movements are done very quickly and typically without fail! These fast and accurate eye movements are known as saccades. Why do saccades need to be so fast? From a predator’s viewpoint, fast eye movements are handy for tracking particularly fast prey. Another feature of saccades, however, is that during movement vision is actually suppressed, which has been postulated as a mechanism for mitigating potential dizziness induced from motion blur [Leigh and Zee 1999]. This has the simultaneous disadvantage, however, of leaving the viewer momentarily blind. Thus it makes sense from an evolutionary perspective, that faster saccades would be favoured. Likewise, it is obvious why saccades, and eye movements in general, need to be accurate: how else could a predator track its prey, or vice versa? The idea of modelling phenomena by optimizing specific quantities during a saccade is not new: optimal control theory [Bryson and Ho 1979, Stengel 1986] is a common approach that has in fact already been applied to motion planning for biological systems [Clark and Stark 1975]. Such methods, however, must model noise and uncertainty as well [Harris 1998, Todorov 2005]. From a control perspective, saccades provide a closer view into the innerworkings of the brain. For example, Chen-Harris et al. [2008] demonstrate how the 1 Chin/forehead rest Camera/Light Source CRT Display PC Eye Tracker PC (a) (b) Figure 1.1: The above illustrates how saccades are recorded and measured. Figure 1.1a shows the experimental setup with a subject plugged in. An infrared camera, shown in Figure 1.1b, was used to track the subject’s eye position. 2 Trace plot for 10 deg 15 9 y-position (deg) 3 -3 -9 -15-15 -9 -3 3 x-position (deg) 15 9 (a) 10 x-component y-component position (deg) 8 6 4 2 0 20 10 20 10 20 30 40 50 60 40 50 60 300 velocity (deg/s) 250 200 150 100 50 00 30 time (ms) (b) Figure 1.2: Position and velocity data for saccades obtained from the author using the setup in Figure 1.1. 3 brain adapts to changes in visual stimuli using video recordings of eye movements such as that shown in Figures 1.1 and 1.2. The fact that all the relevant neural circuitry is nearby, as opposed to passing through the spinal cord, has made it feasible to learn more about the physiology of the brain by comparing eye measurements to actual neural recordings. When collecting experimental data on saccades is not enough, mathematical models may prove neccessary to explore ideas which are more difficult to test in the laboratory. Simulating saccades, and biological motion in general, requires two components: a physical plant and a controller. 1.1 Minimizing the variance Harris and Wolpert [1998] observe that the noise in the motor command is signaldependent, and they assume its variance scales linearly with respect to the control signal squared. The inclusion of uncertainty in the dynamics results in a probability distribution of possible eye positions at the end of the saccade. The fact that the noise is signal-dependent means that a hasty controller could have a detrimental effect on the saccade’s accuracy. Harris and Wolpert [1998] consequently proposed a cost function that penalizes this endpoint variance, while ensuring that the eye reaches its target. By formulating an optimization problem around the probability distribution of the eye’s final position and velocity, they are able to simulate eye trajectories that are strikingly similar to real-world examples (e.g. Figure 1.2b). Interestingly, they are able to reproduce saccade trajectories using a 2nd order linear plant, which was much simpler than previous models. The simplicity afforded by their model is both a blessing and a curse, however. A key limitation regarding their method is the requirement that the dynamics be linear and the noise function Gaussian. In optimal control theory, both the dynamics and cost function are important, dramatically affecting the resulting optimal trajectory. In the context of modeling saccades, this sadly makes it difficult to test the minimum variance principle on more realistic muscle models and eye geometries. We consequently need a method that allows us to compute control laws that minimize variance for more general dynamical systems. This creates two problems: estimating the endpoint distribution of the plant, and then calculating a control law 4 based on this distribution. 1.2 Contributions In this thesis we describe a framework we developed for efficiently computing optimal control laws where the cost function depends on a statistical distribution of the plant. The framework is able to generate controllers for arbitrary plants subject to uncertainty, allowing us to investigate control strategies for more general biomechanical systems. We use Monte Carlo sampling to calculate the endpoint plant distribution. For a given control sequence, we simulate many different instances of the plant, perturbing each trajectory by sampling a user-supplied noise function. The “embarrassingly parallel” nature of this method allows us to take advantage of commodity graphics hardware, achieving simulation results in relatively short time using GPUs. Even after exploiting the GPU, however, evaluation of the cost function remains the biggest bottleneck in terms of performance. Therefore it is critical that the number of times in which the cost function is computed be relatively small and manageable. With this in mind, gradient-based methods are very appealing. However, they are difficult to apply because of a stochastic error that arises when evaluating the cost function. By keeping the realization of the noise constant when computing the gradient, we show that gradient descent methods work well compared to much more expensive gradient-free methods, such as downhill simplex and Covariance-Matrix-Adaptation. Lastly, we give examples to demonstrate the utility of our framework, by showing how it can compute saccade trajectories for linear and nonlinear eye plants. In particular we show how our framework can be applied to a 3D model of the plant where the horizontal and vertical components are coupled. Additionally we also show an example where a nonlinear muscle model is used that would not be possible in previous methodologies. 5 1.3 Outline The thesis is organized as follows: in Chapter 2, we describe the oculomotor system in more detail. We also review previous work in developing controllers for the oculomotor system, but with more emphasis placed on optimal control in the context of biological systems. Chapters 3 and 4 detail the optimal control framework and the numerical algorithms developed during the course of this thesis. The first part focuses on how we sample the plant dynamics to obtain an endpoint distribution on the GPU. Chapter 4 meanwhile focuses on actually finding the optimal control law using iterative methods such as gradient descent and BFGS [Nocedal and Wright 1999]. In both chapters we illustrate that the techniques work on examples relevant to eye movements. Finally, in Chapter 5 we summarize the results of our thesis, followed by a discussion on future work. 6 Chapter 2 Modeling Saccades 2.1 Modeling the eye The eye has 3 pairs of antagonist/agonist muscles, shown in Figure 2.1, allowing it to undergo rotations in the yaw, pitch, and roll directions. In order to rotate the eye along a certain axis, each muscle receives an innervation signal from motor neurons located in a distinct region of the brain, known as the brainstem. When the motor neuron fires, muscle fibres in the extraocular muscle contract in response, forcing the eye to rotate [Kandel et al. 1991]. The muscles in the extraocular system are elastic and highly overdamped. When a muscle, such as the lateral rectus, is innervated, it contracts, moving the eye very quickly towards the temporal side of the head, at least until the “go” signal issued from the brainstem is replaced with a “stop” signal. Due to the viscous qualities of the muscle, braking with the antagonist muscle, such as the medial rectus in this case, is not necessary (though neural recordings have shown an increase in motor neuron activity in the antagonist muscle just as the saccade is about to end; see Van Gisbergen et al. [1981] for more details). Earlier studies simplified the eye plant in order to better understand the mechanisms controlling it. Robinson [1964] for instance, only considers one-dimensional movements, and models the plant as a first order linear system. Clark and Stark [1975] and Enderle and Wolfe [1987] develop more intricate models of the oculomotor system, and are able to reproduce saccade trajectories using a bang-bang 7 Sup eri or Ob liqu e Su pe rio r Re c tu s Lateral Rectus R rior e f In s u ect Inferior Oblique Figure 2.1: The above figure illustrates five of the six muscles responsible for rotating the eye. The medial rectus (not shown) is located opposite to the lateral rectus. controller. Yet they still leave the problem of modeling the plant in three-dimensions for future work. Another limitation of these models is that they are all torque-based, while assuming that the eye is perfectly symmetric. In reality, the eye has asymmetries in its shape and mass distribution. Furthermore, the insertion points, where the muscles attach to the eye globe, are not symmetric, resulting in non-orthogonal rotation axes. Some of the muscles are routed through what are known as muscle pulleys, which have previously been noted for their role in changing the rotational mechanics of the globe [Quaia and Optican 2003a]. A more realistic eye is simulated in Wei et al. [2010], where they generate accurate saccade trajectories using recorded innvervation signals for the control. They take into account the actual geometry of the eye and model the extraocular muscles using simulation primitives known as strands [Sueda et al. 2008; 2011]. This allows them to accurately model where the muscles attach to the globe as well as any interaction occuring between the muscles and orbital pulleys. 8 In optimal control, the control signal computed is ultimately determined by the underlying plant. Thus, having a more realistic model of the eye could alter the resulting trajectories. At the same time, however, a more complex plant could very well complicate how the control is computed, such as the minimum variance controller. We next describe the plant model primarily used by Harris and Wolpert [1998] and Chen-Harris et al. [2008]. 2.1.1 A simple 1D plant An approximation of the eye dynamics in 1D is mθ̈ (t) + bθ̇ (t) + kθ (t) = f (2.1) where t is time, θ (t) is the angular rotation of the eye, m is the eye mass, b is the dampening coefficient, k is the plant stiffness, and f is a force. Equation 2.1 is an ordinary differential equation. The homogenous version (i.e. f = 0) has solutions in the form of eλt , with the λ ’s being roots of the characteristic polynomial mλ 2 + bλ + k, (2.2) where the roots are b λ =− ± 2m r ( b 2 k ) − . 2m m (2.3) Denoting the two roots as λ1 and λ2 , we have λ1 + λ2 = − λ1 λ2 = 9 b m k . m (2.4) Rewriting Equation 2.1 in terms of λ1 and λ2 gives us 1 1 1 1 θ̈ − ( + )θ̇ + θ = f. λ1 λ2 λ1 λ2 mλ1 λ2 (2.5) We define the plant time constants τi = − λ1i , with i being either 1 or 2, resulting in τ1 τ2 θ̈ + (τ1 + τ2 )θ̇ + θ = f . k (2.6) We now define the control signal u(t) = kf , which is a scalar function that controls how much the eye moves. The dynamics in state space are then dx = Ax dt + Bu dt, (2.7) iT is a state variable whose components are the eye θ (t), θ̇ (t) position and velocity, and the matrices A and B are given by where x(t) = h " A= " B= 0 1 # − τ11τ2 − τ11 − τ12 # 0 , 1 (2.8) τ1 τ2 with τ1 = 150 ms, τ2 = 7 ms having been obtained from measured human data, as reported in Robinson et al. [1969]. To account for biological noise, we append Equation 2.7 with a noise term dw, resulting in dx = Ax dt + B(u dt + dw), (2.9) Here dw is a Markov process with zero mean and variance proportional to the control signal squared, c2 u(t)2 dt. In numerical examples, we usually set c = 0.02. The 10 value chosen has no physiological basis, but was hand-tuned to generate plausible looking saccade trajectories, as done in Chen-Harris et al. [2008]. Also note that the controllers in Harris and Wolpert [2006] and Tanaka et al. [2006] use a third-order linear plant model, which includes a third time constant, where the control signal u(t) applies a yank to the system instead of a force. The matrices A and B can be easily modified to account for this. 2.1.2 A simple 2D plant Chen-Harris et al. [2008] make a simple extension of Equation 2.9 to 2D, considering n=2 dx = Âxdt + B̂udt + ∑ B̂[i] dwi . (2.10) i=0 Here B̂[i] is the ith column of the matrix B̂, and dwi is the ith component of the noise process dw. Also, Â = I2 ⊗ A and B̂ = I2 ⊗ B, with In being an n × n identity matrix 2 and ⊗ the kronecker tensor product. The noise process dwi has variance c2 ui as well. The problem with this plant model is that the horizontal and vertical dynamics are completely decoupled. This is unrealistic since the eye is not perfectly symmetric when considering the mass distribution, the added muscle mass, and muscle locations. More crucially, the 2D plant does not account for innervation of the superior/inferior obliques (muscles responsible for generating torsion during eye movements). Technically there should be three control signals for applying a torque about each principal axis, which is not addressed in Chen-Harris et al. [2008]. 2.2 A 3D plant Raphan [1998] proposes a torque based model for the oculomotor system, using an Euler-axis representation for the orientation of the eye. A state variable q(t) is defined in R7 as 11 φ (t) q(t) = n(t) , ω(t) (2.11) where φ is the angle rotated about a 3D unit-vector n, and ω is the angular velocity in three dimensions. The dynamics for each state variable are ω T ndt dq = f (q, u, dw)dt = 1 2 ([ω] n + [n] [ω] n cot( φ2 ))dt M −1 (−Dωdt − Kφ ndt + Mn R( 21 φ , n)(udt + dw)) , (2.12) where M is a 3 × 3 inertia matrix, D is the damping coefficient, K is the spring constant, Mn relates motoneuron innervation to the tension applied along the extraoculor muscles. The function R( 12 φ , n) generates a rotation matrix using Euler’s angle-axis theorem, and implements the pulley’s effect on the muscle torque axes, which helps simplify the control [Quaia and Optican 1998]. The control signal is again denoted by u(t), but has three components instead of two. Raphan [1998] ignores the muscle mass and assumes that the plant is perfectly symmetric, and specify the coefficients for M = 5 × 10−7 kg m2 , D = 7.476 × 10−5 kg m2 /s/rad, and K = 952.4 s−2 /rad using time constants of 150 ms and 7 ms. The time constants were obtained from Robinson et al. [1969], which measures eye movements in humans. Mn was chosen to be 2.493 × 10−6 kg m2 s−2 /(spikes/s). 2.2.1 Accounting for muscle mass As noted by Quaia and Optican [2003a], the muscle insertion points are not symmetrically placed about the eye globe. Further adding to the asymmetry is that the muscle-mass is different per muscle. We can approximate the inertial contribution of each muscle by lumping a portion of the mass at each insertion point. This has 12 the effect of adding off-diagonal elements to the inertia matrix M. Consequently, calculating the angular acceleration ω̇ becomes: dω = M̂ −1 1 φ , n (udt + dw) + [ω] M̂ωdt (2.13) −Dωdt − Kφ ndt + Mn R 2 where M̂ is the new inertia matrix. The extra quadratic velocity term at the end accounts for Coriolis forces. The mass for each muscle was obtained from the authors in Wei et al. [2010], and was typically on the order of 10−4 kg. The radius of the globe was set to 12 mm. 2.2.2 Listing’s Plane The simplification awarded by the muscle-pulley model also fits more nicely with the observation that eye movements generally obey Listing’s Law [Leigh and Zee 1999]. Assuming that the head is fixed and that the eye is in its primary position, any rotation made to move to the eye to another orientation can be described by an axis of rotation. Listing’s Law states that these axes of rotations are constrained to lie on a plane in space, referred to as Listing’s Plane [Quaia and Optican 2003b]. How Listing’s Law is maintained is a mystery. Activating the rectii alone, and not the obliques, allows the eye to move to where it needs to: however, doing so in simulation is not enough to reproduce Listing’s Plane, which has been quantitatively measured. Activation of the superior/inferior obliques is required to satisfy the constraint, and Quaia and Optican [2003a] show the innervation patterns in the obliques necessary to reproduce Listing’s Plane. How the brain computes the oblique innervation signals is not clear; as a function of the rectii activation signals, the oblique signals appear very nonlinear. However, Quaia and Optican [2003a] further show that if the asymmetry in the eye geometry is taken into account, contrary to the symmetric model described in Raphan [1998], the oblique innervation patterns can be expressed as a linear superposition of the rectii control signals. If this is true, then only the activations for the horizontal and vertical rectii need to be calculated. The oblique innervation would just be a scalar multiple of the horizontal control signal plus a scalar multiple of the vertical control signal. 13 2.3 A nonlinear muscle model In addition to ignoring muscle mass, the aforementioned plant models also make simplifications to the actual muscle dynamics. They assume that the force generated by the muscle is affine with respect to the state variable and control. The Hill muscle model describes the muscle dynamics in more detail [Hill 1938]. In this model, the muscle length and velocity are taken into account. The force generated by the muscle is decomposed into a contractile element, a passive element, and a series-elastic element. This model has been used before for eye movements in Clark and Stark [1974], Robinson and Zuber [1981], Martin and Schovanec [1998], and Wei et al. [2010]. The force generated by by the contractile element, in particular, is nonlinear in terms of the muscle length, velocity, and activation signal, which we will also denote as u(t). Later on, we will use this model to drive the oculomotor system, and assume that the total force in the musculotendon system is given by F(l, v, u) = u ∗ Fl (l) ∗ Fv (v) + Fp (l), (2.14) where l is the muscle length, v is the contraction velocity. Fl is the force-length relationship observed when the muscle is activated or when u(t) > 0. Figure 2.2 shows the active and passive forces generated as the muscle stretches and contracts with respect to its resting length, and Figure 2.3 shows a similar relationship for the eye muscle. The force-velocity relationship Fv is another empirical observation (see Figure 2.4), and Fp is the passive force generated as the muscle stretches. We used data from the OrbitT M 1.8 simulator [Miller et al. 1995] to approximate the force-length curves of the contractile and passive elements. The force-velocity curve in Figure 2.4 uses the model in Robinson and Zuber [1981] for contracting muscles. When the muscle is lengthening (i.e. v > 0), the force is given by Fv (v) = 2 − (0.18 + 0.5v)/(v + 0.18) [Wei et al. 2010]. The series-elastic element represents the tendon, and in the oculomotor system this element is considerably more stiff compared to contractile and passive elements. A gross approximation to the musculotendon system, which we adopt for simplicity, is to ignore the contribution of the series-elastic element entirely. 14 force passive active 1 0.5 1.0 length 1.5 Figure 2.2: Plot of the normalized activation force, the normalized passive force, and the total force, as a function of the normalized muscle length. Illustration is inspired from data shown in Zajac [1989]. 2.0 total passive force-length force 1.5 1.0 0.5 0.00.8 0.9 1.0 1.1 length 1.2 1.3 1.4 Figure 2.3: Force-length relationship for the extraocular muscles. The range of muscles lengths we are interested is restricted for simulating saccades. 15 1.6 1.4 1.2 force 1.0 0.8 0.6 0.4 0.2 0.01.0 0.5 0.0 velocity 0.5 1.0 Figure 2.4: Force-velocity profile of the extraoculor muscle. The velocity and force are normalized with respect to the rest length and maximum force generated. 2.4 Control of eye movements Key regions in the brain for controlling the eye have been identified via neurological recordings in primates (e.g. Van Gisbergen et al. [1981]). Computational models of these regions responsible for triggering and executing saccades are not new: Arai et al. [1993], for instance, proposed a neural network model for simulating saccades beginning with a neural network representation of an area in the midbrain known as the Superior Colliculus. 2.4.1 Pulse-step control Figure 2.5 shows what the motor neuron command for a saccade looks like. The signal encodes both the eye velocity and position. At the start of the saccade there is a “pulse” of neuron activity that typically lasts most of the saccade duration and is later replaced with a smaller tonic signal, known as the “step” of innervation. Since the eye plant is highly overdamped, the intial burst in neural activity helps the eye overcome strong viscous forces and arrive at its target destination quickly. 16 firing rate time Figure 2.5: Shown above is an illustration of the pulse-(slide)-step innervation signal during a saccade. See Leigh and Zee [1999] or Van Gisbergen et al. [1981] for actual data. The “step” holds the eye in place once it reaches the target position [Leigh and Zee 1999]. The frequency expressed in the “pulse” command is typically the same for all saccades. The “step” command, however, changes depending on where the eye is fixating. How do we calculate the magnitude in the “step” signal? Consider a first-order one-dimensional plant θ̇ + θ = u, τ1 (2.15) where θ (t) is the position, τ1 is a plant time constant, and u(t) is the control signal. The control signal is restricted to a bang-bang controller, where it switches values at time T , the saccade duration, i.e. 17 ( u(t) = u pulse t ∈ [ 0, T ] ustep t ∈ ( T, T + F ] If we assume that θ (0) = 0, then θ (t) = Rt t 0 −t τ1 0e − τt during the movement, θ (t) = u pulse τ1 (1 − e 1 . (2.16) u(t 0 )dt 0 . Since u(t) is constant ), as well as during the fixation period t ∈ (T, T + F], we have θ (t) = τ1 ustep . Combining the two expressions for θ (t) with t = T yields the pulse-step ratio u pulse ustep − τT −1 1) . = (1 − e This relationship is necessary in order to drive the eye to the target location accurately and with minimal drift. If the pulse-step ratio is miscalculated then the eye is at risk of first undershooting and then drifting towards the target. Problems crop up in three dimensions, however. In 1D or 2D, the angular position can be integrated directly from the angular velocity, which is what the above calculation assumes. In 3D, this is not true, since rotations are not commutative. One way to solve this problem is to use a noncommutative controller, as suggested by Tweed et al. [1994]. They propose a quaternion-based framework for correctly steering the eyes using a pulse-step controller. An alternative approach to handling noncommutative rotations is to consider the underlying orbital mechanics. In Raphan [1998], the role of muscle pulleys is cited as a key factor in simplifying eye control. A half-angle rule allows for noncommutative operations such as rotations to be treated as commutative. Quaia and Optican [1998] show that with the muscle pulleys inserted, the error introduced into the pulse-step calculation is very small. This allows the brain to integrate the x and y components of the angular velocity vector directly and have a reasonable estimate of where the eye is rotated to, and consequently reduces a 3D control problem to 2D. 2.4.2 The slide component It turns out that there is also a “slide” component in the transition from the “pulse” signal to “step.” That the previous models described lack this component in the control is a flaw. Optican and Miles [1985] demonstrates the importance of having a slide component when simulating saccades under certain scenarios. It is not clear 18 how the “slide” component is neurally computed. In the next section, however, we review a controller from the literature that is able to produce the slide-component for a simple second-order plant. 2.5 Optimal control and motion planning Arguably a more fundamental approach to modeling the motor control system is to reformulate it as an optimization problem, and consider what the motor system in question is optimized for. In this case, optimal control theory can be applied. This is plausible, if we assume that the most efficient biomechanical systems survived years of evolution via natural selection. In that case, the underlying biomechanics evolved to behave optimally in a certain way, motivating the application of optimization to biomechanical systems. Monk and Harris [2009], for example, hypothesize that the photoreceptor density in the eye’s retina is optimally designed for saccades. More controversy, however, concerns whether or not the brain is implementing an optimal controller and how it can be done neurally. Our explanation in this thesis is based on the view that perhaps the brain evolved to control saccades optimally, but is not necessarily implementing optimal control theory to do so. When modelling saccades in an optimal control framework, it becomes necessary to have both an accurate model of the oculomotor system in addition to having a plausible cost function. Harris and Wolpert [1998] proposed an optimal controller that takes into account biological noise. They assume that the plant dynamics are contaminated by signal-dependent noise, where the variance is proportional to the magnitude of the driving control signal squared. Furthermore, they propose that the cost function minimizes the resulting variance of the plant at the end of the saccade movement. Their results produced trajectories qualitatively similar to actual saccades, and more interestingly the pulse-slide-step signal previously described is reproduced for “free.” Bang-bang controllers fail to capture this feature. A serious problem with their method, however, is the requirement that the dynamics be linear and the uncertainty be Gaussian. Before discussing how control laws for such systems are computed, we will first briefly review previous work in applying optimal control to saccades. 19 The application of optimal control to biological movements has long been investigated, especially concerning eye movements. Clark and Stark [1975] were the first to propose that the brain employs a bang-bang controller to drive saccades. The principal idea behind this strategy is that the brain moves the eye to its destination as fast as possible. Obtaining plausible trajectories became difficult, however, without developing a more complicated plant model, such as in Enderle and Wolfe [1987]. The bang-bang controller was subsequently criticized for not working on simpler plant models [Harris and Wolpert 1998]. Minimum jerk is another control strategy that has been used to replicate biological movements in Hogan [1984] and Flash and Hogan [1985]. The problem with the minimum jerk principle, and other similar strategies where the square of a particular derivative is minimized, however, is that no strong argument exists concerning why these quantities should be optimized. What is more important: the cost function or the plant dynamics? It seems that both are necessary, especially when reproducing measured trajectories via simulation. A good cost function should be simple (e.g. few parameters to tweak) and intuitive to understand. For instance, minimizing the amount of time spent moving makes more sense than optimizing the amount of jerk applied. Some also argue that a good cost function should work regardless of the plant system used [Harris and Wolpert 1998]. However, if we expect to reproduce trajectories observed in nature or the laboratory, a good plant model is generally needed to achieve this. Additionally, the path planned by an optimal controller may noticeably change depending on the plant model used, potentially giving rise to new hypotheses in motor planning. 2.6 Optimal control in the presence of noise Biological movements are inherently noisy, but whether the brain adapts to noise during motion planning is an open question. Assuming that the brain uses optimal control, then the unintended effects of biological noise should be taken into account. Here we review how optimal control laws for such noisy systems can be computed. 20 2.6.1 Open-loop control Control laws for stochastic dynamical systems have been computed before using open-loop and closed-loop methodologies, as demonstrated in Harris [1998] and Todorov [2005]. However, current approaches have certain limitations. Harris [1998], Harris and Wolpert [1998], Harris and Wolpert [2006] introduced the minimum variance idea to the sensorimotor community, solving trajectories for both saccades and arm movements analytically via the Euler-Lagrange equations. In addition to this, Harris and Wolpert [2006] show that penalizing both the endpoint variance and total saccade duration imposes a speed-accuracy tradeoff that forces the eye to slow down early in the saccade. As a result, they are able to capture certain phenomena observed in saccades, such as the main sequence which defines a relationship between peak eye velocity, eccentricity (the distance the eye has moved), and saccade duration [Bahill et al. 1975]. Tanaka et al. [2006] further show how the optimal duration can be computed analytically if the endpoint variance is treated as a constraint. However, a serious limitation of all these methods is that the assumptions they make preclude applying their methods to more general dynamical systems. 2.6.2 Feedback control Todorov [2005] introduces a framework for solving an optimal feedback control law when the dynamics are linear, the cost function is quadratic with respect to the state and control variables, and the noise is Gaussian and (optionally) dependent on the control (this is important, since signal-dependent noise is so prevalent in biological systems). Todorov and Li [2005] extend the same framework to handle more general dynamics, cost functions, and noise models. However, their method may take many iterations to converge, is suscept to local minima, and requires linearizing the dynamics to first order when evaluating the cost function. Some cost functions are not suitable in closed-loop control frameworks, e.g. the cost function J = Var [θ (T )], where Var [·] returns the variance. When the dynamics are nonlinear, or the noise to be non-Gaussian, it becomes very difficult to find an analytic expression for the endpoint variance of the plant. The only route left open is to randomly sample the noise distribution, and hopefully obtain a 21 reliable estimate of the plant statistics. So, while the methodologies in Todorov and Li [2005] and Harris and Wolpert [1998] are suitable for computing optimal control laws for noisy biological systems, both are unable to handle minimum variance cost functions that have no closed-form expression. 2.6.3 Tasked-based optimization Miyamoto et al. [2004] develop a more general optimal control framework for the arm, where the cost function relies on the plant statistics during and after movement. In their framework, a task region is specified, and the goal is to find an optimal trajectory that maximizes the probability of the plant landing in the given task region. The control is computed by reshaping a path through the plant’s state space, instead of optimizing the control signal directly as in Harris and Wolpert [2006]. At each step during the optimization, the control is computed using inverse dynamics. By sampling the noise acting on the plant, a distribution of states is obtained, which can be used to calculate the cost function. A special type of filter is used to reduce the number of samples effective for obtaining mean and variance estimates in state-space. The downhill simplex method is employed to perform a derivative-free optimization in obtaining the optimal state-space trajectory, and subsequent control signal, though they do not clarify why they do not use gradient-based methods. 22 Chapter 3 Monte Carlo Simulation on the GPU 3.1 Why Monte Carlo simulation? Not every movement is the same. Ask a person to repeatedly move their hand from point A to point B, and the resulting trajectories are bound to have different shapes, require more or less time, or end at different distances from the target. Such variability in our movements has been observed and measured [Fitts 1954, Harris and Wolpert 1998, Kowler and Blaser 1995]. The source of this variability is not well understood. It has been observed that biological noise is signal-dependent, where the variance of the motor control signal scales in proportion to the squared mean value of the control signal [Diedrichsen et al. 2010]. Incorporating this noise model into a controller helps explain Fitt’s law [Fitts 1954]. In the case where someone must move their hand from point A to point B in a very short amount of time, a large control signal is required to carry the hand to its destination quickly, resulting in a more inaccurate movement than if the subject had been afforded more time. Biological noise is often assumed to be Gaussian [Harris and Wolpert 2006]. When a control signal, perturbed by random white noise, is applied to a linear (constant coefficient) plant, the resulting final state will also have a Gaussian distribution. It is easy enough to calculate the distribution’s mean and covariance 23 from the dynamics and control signal. Assume the dynamics obey the equations described in Section 2.1.1 dx = Axdt + B(udt + dw), (3.1) where x(t) ∈ Rnd is a time-dependent state variable, u(t) ∈ Rnu is the control, A and B are constant matrices, dw(t) is a white noise variable with mean 0 and covariance c2 diag [u]2 . If x(0) = 0, then for a given control u = u(t) the mean at time T is given by Z T E [x(T )] = eA(T −t) Bu(t)dt, (3.2) 0 and the covariance is CoVar [x(T )] = c2 Z T T eA(T −t) B diag [u]2 BT eA(T −t) dt. (3.3) 0 For a general nonlinear plant, e.g. ẋ = f (x, u,t), obtaining an analytical solution for the endpoint distribution is nontrivial, when for instance, the noise is not Gaussian. We can instead approximate the plant distribution by sampling the noise in the control during a forward simulation of the plant dynamics. The plant distribution is represented by a set of particles, where each particle is an instance of the plant state and noise. For example, in the 1D linear case, a particle carries both a position and a velocity. As it marches forward in time, its trajectory is perturbed by a Gaussian white noise process. Here calculating the trajectory of a particle is also known as particle tracing. Thus, we propagate (approximately) the entire probability distribution of x(t) up to time T by tracing the path of each particle, whereupon direct calculation of the mean state and covariance follow. Let ρ(x(0) → x(t)) denote the probability of a particle transitioning from the plant state at time 0 to time t. The expectation of any appropriately defined function f at x(T ) is 24 Z E [ f (x(T ))] = f (x(T ))ρ(x(0) → x(T ))dx 1 N ≈ ∑ f (i x(T )), N i=1 (3.4) where i x denotes the ith particle and N is the number of particles traced. This approximation replaces a large multi-dimensional integral with a finite sum of randomly drawn samples, and is at the heart of all Monte Carlo methods [Doucet et al. 2001]. Calculating the mean, covariance, or in general any function that depends on a stochastic time-dependent variable involves a forward simulation of the plant, which we do for many noise realizations in order to calculate the mean and covariance of the propagated probability distribution. Deciding how many particles to trace is a question. If the dynamics are expensive to compute, then tracing too many particles becomes impractical. Otherwise, not enough particles can lead to a poor estimate of the probability distribution, where for example, simply averaging to calculate the mean position could lead to erroneous results. An advantage of this approach in addition to its generality, however, is that the problem becomes embarrassingly parallel since we can trace many trajectories at once without so much worrying about contention of resources (e.g. when two processes write to the same resource). Furthermore, the general idea of propagating a probability distribution is also well-known and used in the context of data assimilation, such as in meteorology [Reich 2011, Rogberg et al. 2008]. 3.2 Monte Carlo sampling on the GPU Here we implement a basic particle tracing method on a graphics processing unit (GPU), using Nvidia hardware and CUDA [Nvidia 2007]. The multiple cores on the GPU allow for a single-instruction-multiple-data approach to computing, and is well suited for the task of tracing many particles in parallel (see Kirk and Hwu [2010] for information on general purpose programming on the GPU). In CUDA, programs written for execution on the GPU are called device kernels. A device 25 kernel, when uploaded onto the GPU, is fed a user-specified number of threads for execution. A thread is the lowest level of granularity in terms of computation. A larger hierarchy sits on top, which specifies how threads can be bundled together in blocks in order to share resources more efficiently, and a similar structure exists for blocks (organized onto a grid). The CUDA runtime, which runs during program execution, schedules when blocks are run and determines how resources, such as memory or processors, are divided across threads. 3.2.1 Sampling the plant A basic outline of the program execution is to start by initializing N state trajectories i x(0) (i ∈ [1, N]) to an initial plant state x(0) (e.g. 0). Each particle is allocated its own thread of execution on the GPU, receiving as input a nominal control signal u(t), the number of time steps K, and the time step size h. The plant dynamics ẋ = f (x, u) are simulated directly on the GPU. At the kth time step the nominal control signal uk is sampled and then perturbed by a random noise process wk (which may also be user-specified). Figure 3.1 shows paths in position and velocity space that can result from noise. Given enough samples, we wish to have an estimate of the plant state which is useful for determining the optimal control signal later on. 3.3 Validation We start off by estimating the endpoint distribution of a 1D plant subject to Gaussian noise. Assume the dynamics are given by Eqn. 3.1. The noise process w(t) is a standard Gaussian process with mean 0 and variance dependent on the control, i.e. c2 u(t)2 . We can simulate the dynamics in discrete time. We use a forward Euler approximation of the dynamics to update our state variable x j+1 = (I + Ah)x j + Bu j h + Bdw j , (3.5) where h is the step size (we used a value of h =1e-3 for all numerical examples), j is a time index, and w j is a random variable sampled from a Gaussian distribution 26 velocity (deg/s) position (deg) 0.25 0.20 0.15 0.10 0.05 0.00 0.050 10 20 10 20 30 40 30 40 8 6 4 2 0 0 time (s) Figure 3.1: Example trajectories for a 1D plant. Noise is added directly to the acceleration signal. The dashed black lines in each figure is the noise-free trajectory. with mean 0 and variance c2 u2j h. In evaluating the control sample u j , a uniform cubic B-spline representation of nw −1 u(t) is used, i.e. let u(t) = ∑i=0 bi φi (t), where nw is the number of basis functions, bi is the control point, and φi (t) is the corresponding B-spline. Other basis functions can be used, which may be more desirable if we know the control function needs to be less smooth. Each thread receives the set of B-spline weights representing the control, as well as the number of time steps K, and time step size h. The dynamics are then evaluated in batch mode, with each thread sampling a Gaussian distribution. The final state of the plant, xK−1 , is written to an output buffer, which is then transferred back to the CPU. 27 1.2 1.0 1.0 0.8 control control 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 10 20 time (ms) 30 0.00 40 10 20 30 40 (a) avg = 10.340 sig = 0.995 avg = 10.357 sig = 0.989 0.12 0.12 0.10 0.08 0.08 0.06 0.06 0.04 0.04 0.02 0.02 8 9 10 70 80 11 position (deg) 12 13 avg = 15.000 sig = 1.130 avg = 15.015 sig = 1.131 0.14 0.10 7 60 (b) 0.14 0.00 6 50 time (ms) 0.0010 14 (c) 12 14 16 position (deg) 18 20 (d) Figure 3.2: The top row show the nominal trajectories used to generate the endpoint distributions pictured on the bottom. Feeding these control signals into the particle tracer generates the histogram of binned states shown in blue. The left column shows an estimate of the endpoint distribution using random spline weights for u(t). The right column, in particular, shows that the endpoint distribution can be accurately estimated using a pulse-step model of the control signal. 512 particles were traced in both examples. 3.3.1 Estimating mean and variance Given a noise-free control signal u(t), we can directly evaluate the mean state and covariance matrix at time T , and use the result to benchmark estimates derived from the Monte Carlo approach. 28 Figure 3.2c shows a histogram of binned states generated from path tracing, after generating N trajectories. The red curve is the distribution calculated from the linear dynamics using Equation 2.9 and nominal control sequence. The relative error for the Monte Carlo estimate of the mean position and variance using 512 sample points was 0.0131 and 0.0215, respectively. We can increase the particle count in the hopes of obtaining better estimates of the distribution. Figure 3.3 shows the relative endpoint error of the sampled MC distribution for different sized sample counts. Both the relative error with respect to the positional average and variance decrease quite slowly as the sample size is increased, i.e. a √1 N relationship follows. While it is well known that in its naive form, Monte Carlo sampling requires many samples to achieve satisfactory accuracy [Miyamoto et al. 2004], perhaps meriting an investigation of smarter sampling strategies, the availability of many cores on the GPU allows us to leverage sample sizes large enough to obtain an appropriate solution. 3.3.2 Performance on the GPU Timings were measured on a 256MB Nvidia 8600GT card. The card comes equipped with four multiprocessors, each allowing up to 768 threads. The card supports CUDA version 1.1. We expect better performance with a more intelligent memory scheme implemented. Each trajectory is allocated a thread, from which many paths are sampled given a control sequence. Figure 3.4a shows how performance scales as we increase the particle count. The timing remains almost constant up to 32 particles, and then proceeds into a linear relationship, i.e. increasing the number of particles by a factor of 2 doubles the amount of time spent tracing. This is indicative of the finite amount of cores available on the GPU used during experimentation. While showing that performance scales linearly as more threads are used, Figure 3.4a also shows how fast brute force sampling is, e.g. tracing 216 particles takes less than a millisecond of time. The amount of time required to lower the relative error is shown underneath in Figure 3.4b. Data points for sample sizes less than 8 (corresponding to roughly less than half a millisecond of computation time) were removed, since the corresponding error was very sporadic, which is in part due to 29 0.30 mean variance 0.25 relative error 0.20 0.15 0.10 0.05 0.00 4 2 25 26 27 28 29 210 211 212 213 214 215 216 217 N (a) mean variance 0.025 relative error 0.020 0.015 0.010 0.005 0.000 4 2 25 26 27 28 29 210 211 212 213 214 215 216 217 N (b) Figure 3.3: The top figure shows the relative error from calculating the positional mean and variance after tracing N particles. For both sets of data, a √1N was found using least-squares. This illustrates how Monte Carlo methods behave as the particle count is increased. The bottom figure shows the positional error in more detail from Figure 3.3a. 30 time (s) 22 21 20 2-1 2-2 2-3 2-4 2-5 2-6 2-7 2-8 2-9 2-10 2-11 2-12 4 2 26 28 210 212 N 214 216 218 220 222 (a) 0.035 mean variance 0.030 relative error 0.025 0.020 0.015 0.010 0.005 0.000 2-10 2-9 2-8 2-7 time (s) 2-6 2-5 2-4 (b) Figure 3.4: Top figure: a plot of time spent Monte Carlo sampling trajectories. The dashed line shows a linear relationship between time and particle count, and indicates when the GPU is oversaturated with work. The bottom figure shows how much effort, in terms of computation time, is required to lower the relative error. 31 the small particle sizes used to estimate the expectation terms. A √1 N relationship is shown between the amount of time spent tracing particles and the relative error, which is typical for Monte Carlo simulations (Weinzierl [2000]). Thus, decreasing the error by a factor of 10 requires a hundredfold increase in time spent tracing particles. The performance of the simulator can be improved in a couple of areas. Transferring data between the host and device machines is taxing on performance, but this is only an issue whenever the control signal needs to be initialized at the start of the kernel execution, or when the endpoint positions are relayed back from the device at the end of the stochastic simulation. The next chapter, which focuses on how to find the optimal control signal, will be more concerned with reducing this kind of message-passing overhead. Another bottleneck to performance is memory access on the device itself. In CUDA, there exists a hierarchical memory scheme, where each level performs differently in terms of speed and storage capacity. Register memory, which resides on the streaming multiprocessor unit (SMU), is allocated per thread, and can consequently be accessed very quickly. But since the total amount of register memory on the SMU is limited, requiring too much register memory per thread limits the total number of threads that can be run in parallel, thus reducing the amount of concurrency [Kirk and Hwu 2010]. An SMU is allocated a finite number of blocks for parallel execution. Within each block threads can be synchronized, whereafter it may be useful to have them communicate amongst each other. This is accomplished through the use of shared memory. In addition, since the shared memory is closer to the SMU, than say, global memory, it is also generally faster to access. A common strategy then is to copy data from global memory to shared memory as a pre-pass, and then operate on the data in a space where access is much quicker. However, performance issues can still crop up when accessing shared memory (such as bank conflicts). It goes without saying that optimizing CUDA programs is extremely tricky. See Kirk and Hwu [2010] for a more complete description of CUDA architecture. 32 3.4 Conclusions The value of having an analytic solution to compare against cannot be understated. By simulating a one-dimensional plant, where simple models for the dynamics and noise are used, we can validate our naive approach of tracing particles, which is important since being able to accurately sample the endpoint variance and mean state lays the foundation of our optimal control framework described in the next chapter. We also show how accuracy can be improved by simply throwing more particles at the problem, obtaining a more accurate depiction of the probability distribution. Doing so is cheap and fast: when enough processors are present, the amount of time spent tracing particles remains roughly constant. But even when the number of particles vastly outnumbers the number of available cores, the method works well in practice. Nonetheless, it would be worth investigating how to accurately calculate the expectation terms using a smaller number of particles. The rate at which the accuracy improves with respect to the number of particles is very slow. Furthermore throwing more particles at the problem may become a problem when not enough memory is available, or if the dynamics are more expensive to simulate. How resources are allocated and accessed on the GPU can also have a significant impact on the performance of any software, which is easy to forget when designing algorithms on paper. Sometimes just looking at the problem in big O notation is not an accurate indicator of performance in of itself, especially when caching issues are considered. Lastly, the endpoint distribution sampled is not restricted to linear plant models or Gaussian dynamics. Nonlinear, discontinuous plants could just as easily be plugged in, simulated, and have a corresponding endpoint distribution generated. To illustrate, Figure 3.5a shows the endpoint distribution for a 3D plant described in Chapter 2, and Figure 3.5b shows a histogram of endpoint positions for an eye model that uses the nonlinear muscle model from the very same chapter. Both models assume signal dependent noise, but we defer the details of their implementation to the next chapter, which describes the proposed control framework in full. 33 0.12 0.10 0.08 0.06 0.04 0.02 14.5 14.0 15.5 15.0 position (deg) (a) 0.20 0.15 0.10 0.05 0.00 12 14 16 position (deg) 18 20 (b) Figure 3.5: The top figure is a histogram of the horizontal position of a 3D eye, which is described in Section 2.2. The bottom figure is a similar endpoint distribution of a 1D eye using a nonlinear muscle actuator, which is described in Section 2.3. The next chapter describes how the horizontal position of the eye is obtained, and how the nonlinear muscle is simulated with the rest of the eye in more detail. 34 Chapter 4 An Optimal Control Framework for Noisy Saccades 4.1 Introducing an open-loop framework for control We consider the problem of finding a control u(t) that minimizes the cost function Z T +F J(u) = αVar [θ (t)] + (E [x(t)] − xT )T Λ (E [x(t)] − xT ) dt, (4.1) T which if we sample the integral discretely gives us the approximation KT +KF J(u) = h ∑ αVar [θk ] + (E [xk ] − xT )T Λ (E [xk ] − xT ) , (4.2) k=KT where h is the time step, T = KT h and F = KF h. In this example, a 1D linear plant is used again ( Section 2.1.1), thus in full dx = h iT (Ax + Bu)dt + Bdw, and x(t) = [ θ (t) θ̇ (t) ]T . The goal state xT = tx 0 gives the desired endpoint position of the eye. We also introduce penalty weights α, which is a scalar, and Λ, a 2-by-2 diagonal matrix that affects how closely the plant ends at the desired position and velocity. The cost is only incurred during the fixation period of the saccade, rather than the duration of the movement. This 35 relates to the fact that during a saccade, all visual input is suppressed. To reduce the dimensionality of u(t), we approximate it as a linear combination of uniformly u −1 spaced cubic B-splines, letting u(t) = ∑ni=0 bi φi (t), where nu is the number of splines, bi is the ith control weight, and φi (t) is a cubic B-spline basis function. Note that this implies a smoothness assumption on the control u(t), viz. u ∈ C2 . Furthermore, the control is open-loop; once a control law has been computed, it remains the same during movement, regardless of any changes to the plant or environment. The lack of a feedback term in the control restricts the class of problems that our framework handles in an obvious way. However, since we are primarily interested in studying saccades, where the role of sensory feedback is considered negligible [Leigh and Zee 1999], the open-loop assumption is reasonable enough. In the previous chapter, we discussed how brute-force sampling of the noise in the dynamics can be an effective way to approximate the expressions in Equation 4.2. In practice, this allows us to sample and estimate expectations for more general plant and noise models. Now that we have a practical way of computing the terms in Equation 4.2, the next step is to compute a control signal that minimizes the cost. 4.2 The framework An iterative scheme, illustrated in Figure 4.1 is used to calculate an optimal control signal u(t). Note that u(t) is the noise-free signal and is not yet perturbed by w(t). Starting with an initial guess for the spline weights defining u(t) obtained by solving the problem without noise, the simulator samples many paths using the dynamic and noise models specified by the user, where the output of all these paths is input to the cost function J(u). N particles are initialized and their states updated for KT + KF time steps. The updates are performed by first evaluating the current controller uk (t), sampling the noise model w(t), and then feeding the resulting signal into the dynamics calculation. Particles approximate the same uk (t) during the forward simulation shown in Figure 4.1. However, each particle may be unique since the realization of w(t) can differ among particles, resulting in different trajectories. The states of these particles are then recorded during the fixation time interval [T, T + F]. The cost function is evaluated using these endpoint positions, after 36 Sampling J(u) Update Figure 4.1: An iterative scheme is used to calculate the optimal control signal u? (t). The loop is first initialized with a control u0 (t). The cost function is then sampled according to the optimization method used (e.g. gradient descent or CMA). We check whether the candidate uk (t) appears to have converged to a minimum cost function. If convergence hasn’t been reached, the control is updated to a uk+1 (t) and the loop is repeated. which the current estimate of u(t) is updated so that the cost is smaller in the next iteration. We next address the question how such an update is obtained. 4.3 Gradient-free methods The optimization method used to update the spline weights for u(t) depends on whether gradient information is available. Withouth gradients, or if the gradient estimate is very poor or difficult to compute, gradient-free methods may be necessary [Kramer et al. 2011]. However, we will demonstrate how these methods can be extremely costly in terms of the total number of function calls. The TOPs framework in Miyamoto et al. [2004] uses the gradient-free downhill simplex method to compute optimal control laws for hand movements where the control command is subjected to signal-dependent noise. One reason for their choice of method could be that their cost function is similar to Equation 4.2, where its evaluation might introduce a stochastic error term. In a similar application, Hamilton and Wolpert 37 [2002] use simulated annealing to avoid local minima. 4.3.1 Downhill simplex The downhill simplex method [Nelder and Mead 1965] is a popular choice for performing gradient-free optimization. A simplex is created with n + 1 vertices, where n is the dimension of the problem search space. In each iteration, a vertex with the largest value is replaced with another vertex of lower value, reshaping the simplex. The simplex shape goes through a set of operations, such as expansion, contraction, and reflection, until a minimum has been reached. 4.3.2 Covariance Matrix Adaptation Covariance Matrix Adaptation (CMA) is a stochastic optimization method. It starts with a population of samples drawn from a distribution defined over the domain of the function. At each iteration, the distribution is reshaped and resampled towards an optimal region in the function space. The full extent of the algorithm can be read in Hansen and Kern [2004]. Tan et al. [2011] applied the method to character animation, to calculate the optimal swimming gait of underwater creatures. In practice, the method works really well, as the authors are able to achieve swimming gaits that look suitable for the types of characters they simulate. Another nice feature of CMA is that it requires very little parameter tuning. However, like the simplex method, we show that in our application, CMA requires many expensive function evaluations without being able to converge to an acceptable eye trajectory. 4.4 Gradient-based methods The essential problem in using a gradient-based method is that the cost function evaluated at each iteration is not without random error. Monte Carlo sampling the dynamics in order to evaluate the endpoint distribution can only achieve so much accuracy, as noted in the previous chapter. Furthermore, samples drawn from the noise distribution w(t) are expected to change each time a forward simulation is executed, where calculating the cost function twice using the same control signal could result in two different values. While reusing the same set of noise values for every simulation implies that the cost function will change only when u(t) changes, 38 this would introduce bias in the final solution. Simply finite differencing the cost function to estimate the gradient may result in very large errors. For example, suppose f (x) = g(x) + ξ is a scalar, where g(x) is the true differentiable function we wish to calculate, and ξ is a random variable. We approximate the gradient ∇ f (x) = df dx via finite differencing, e.g. ( f (x + ε) − f (x))/ε = (g(x + ε) − g(x))/ε + (ξ1 − ξ2 )/ε, where ξ1 and ξ2 are different realizations of ξ and kξ k 1. But, whereas the first term approaches ∇g(x) as ε → 0, the second blows up since ξ1 and ξ2 may well have different signs. Fortunately, the gradient of the cost function for any given realization of w depends on that particular realization and does not exhibit independent noise. Therefore, the random noise in calculating the particle’s trajectory is the same as for the gradient! Thus, for each weight bi in the control expression u(t) = ∑i bi φi (t), we evaluate J(u) with b̃i = bi + ε and b̃i = bi − ε using the same realization of noise w(t). Subtracting these results and dividing by 2ε yields an approximation for 4.4.1 dJ dbi . Steepest descent In steepest descent, the gradient of the function being minimized is selected as the search direction, so that in each iteration the optimizer moves orthogonal to the function’s contours. The missing key ingredient, however, is the step-length which decides how far to follow the steepest direction. If the step-length is too small, then the optimizer can take forever to converge, or if the step-length is too large, then there is an increased risk in over-shooting the function and never reaching the minimum. Line-search methods address the problem of selecting a proper steplength, of which more information can be found in Nocedal and Wright [1999]. 4.4.2 Broyden-Fletcher-Goldfarb-Shannon method Steepest descent by itself can require many iterations to converge, since only the gradient is used. Newton-like methods converge in fewer iterations under certain locality and smoothness assumptions, since the Hessian is used to determine the next best search direction. However, this becomes prohibitively expensive if the only way to obtain the Hessian is by sampling the cost function. The Broyden39 Fletcher-Goldfarb-Shannon (BFGS) method strikes a compromise between gradient descent methods and Newton’s method, by approximating the Hessian effect in critical directions using only gradients calculated from the previous and current iterations [Nocedal and Wright 1999]. 4.5 Testing the framework We benchmark our framework using the gradient-free optimization routines downhill simplex and CMA, in addition to gradient-based methods steepest descent and BFGS. All methods are implemented in Python. Gradient descent and BFGS were programmed by the author, using the numerical packages Numpy [Ascher et al. 1999] and Scipy [Jones et al. 2001]. The downhill simplex method is already included in the Scipy optimization package. CMA can be obtained freely online from Hansen and Kern [2004]. The gradients are calculated using central differencing, while keeping the noise samples ξit constant for each particle. For both gradient descent and BFGS, the magnitude of the gradient is used as a stopping criteria. When it falls beneath a certain threshold, e.g., τε , the algorithms terminate. In the first comparison, a one-dimensional linear plant is used (see Equation 2.9), where the noise model is Gaussian and has variance proportional to the control signal squared. The objective, defined via Equation 4.2, is to minimize the distance between the goal state and plant configuration as well as the variance during the fixation period [T, T + F]. 4.5.1 A 1D solution Given the discrete plant (2.1.1), we now approximate the differential equation by the forward Euler method, obtaining x j+1 = (I + Ah)x j + B(u j h + dw j ), (4.3) i θ j θ̇ j , and θ and θ̇ are the position and velocity state variables. The control signal u j is just a scalar, and matrices A and B lie in R2×2 and R2×1 , where x j = h respectively. The random process dw j has variance c2 u2j , where c is a constant. 40 The values for these coefficients are given in Chapter 2. Since the dynamics, noise model, and cost function are simple enough, the optimal control law can be computed without using an iterative method. The variance Var [x(t)] and expected state E [x(t)] can be computed directly using the discrete dynamics in Equation 4.3 in terms of the control parameters bi , u −1 knowing that u j = ∑ni=0 bi φi ( j h). Let Ã = (I + Ah) and B̃ = Bh. The expected state at time t = jh is given by j−1 E [x j ] = Ã j x0 + ∑ Ã j−1−i B̃ui i=0 j−1 = Ã j x0 + ! ∑ Ã j−1−i B̃φ (i h)T (4.4) b i=0 j = Ã x0 + G j b, where φ (t) = h φ0 (t), φ1 (t), . . . , φnu −1 (t) iT and similarly b = [ b0 , b1 , . . . , bnu −1 ]T . The variance is given by j−1 Var [x j ] = c2 h ∑ (Ã j−1−i B)(Ã j−1−i B)T u2i , (4.5) i=0 Letting [·]0,0 be the entry in the first row and first column of a matrix, the positional variance is given as j−1 Var [θ j ] = c2 h ∑ (Ã j−1−i B̃)(Ã j−1−i B̃)T 0,0 u2i i=0 = c2 hbT ! j−1 ∑ i=0 (Ã j−1−i B̃)(Ã j−1−i B̃)T 0,0 φ (i h)φ (i h)T b T = b H j b. Assuming x0 = 0, we can then write the cost function J(u) in terms of b as 41 (4.6) KT +KF J(u) = ∑ αbT Hi b + (Gi b − xT )T Λ (Gi b − xT ) i=KT T =b KT +KF ∑ ! αHi + GTi ΛGi b − 2b T i=KT Taking the gradient dJ db ∑ (4.7) ! GTi ΛxT + xTT ΛxT , i=KT and setting it to 0 yields KT +KF b= KT +KF ∑ !−1 αHi + GTi ΛGi i=KT KT +KF ∑ ! GTi ΛxT . (4.8) i=KT The above solution for b returns an optimal control law u? (t) which we use to compare against solutions obtained from the different optimization tested in our Monte Carlo framework. 4.6 Comparisons and results To verify that the cost function J(u) can be accurately determined, we calculate the cost for a given set of spline weights b using the stochastic framework described earlier and additionally by directly evaluating Equation 4.7. Figure 4.2 shows how the relative error in position, velocity, positional variance, and cost decrease as a 1 function of the particle count. The errors follow a distinct N − 2 relationship, illustrating that our framework is capable of determining these quantities accurately. We generated trajectories for a 15◦ saccade using the different methods already described. Quantitatively we can compare how close the final cost is to that obtained from Equation 4.8. The results were mixed for the different optimization routines. Figure 4.3 shows trajectories obtained using a fixation period of 10 ms, and Table 4.1 shows quantitative results in terms of the number of expensive function calls, the cost value achieved, and the relative error of the cost with respect to the true solution. The simplex method fails quite badly, producing an eye trajectory that is barely able to leave its primary position. CMA achieves the lowest cost value and relative 42 0.8 position velocity variance cost 0.7 relative error 0.6 0.5 0.4 0.3 0.2 0.1 0.0 1 2 22 23 24 25 26 N 27 28 29 210 211 (a) position velocity variance cost 0.25 relative error 0.20 0.15 0.10 0.05 0.00 1 2 22 23 24 25 26 N 27 28 29 210 211 (b) Figure 4.2: The expectation terms in Equation 4.2 are calculated by tracing N particles. The resulting cost is compared to the direct solution obtained from Equation 4.7. The initial spline weights in b were all set to 1. The bottom figure is a zoomed in view of the position, velocity, and positional variance errors. 43 solution steepest descent BFGS downhill simplex CMA control signal velocity (deg/s) position (deg) 16 14 12 10 8 6 4 2 0 20 1500 1000 500 0 500 0 12 10 8 6 4 2 0 2 4 0 5 10 15 20 25 30 35 5 10 15 20 25 30 35 5 10 15 20 time (ms) 25 30 35 Figure 4.3: The cost function in Equation 4.2 was minimized using steepest descent, BFGS, downhill simplex, and CMA for a 15◦ saccade. 10 spline weights were used to represent the control signal. error, however still large. Since the cost only reflects the endpoint error during fixation, it is not enough to accurately judge whether the optimization method generates an appropriate trajectory. In this case there are several issues in the trajectories generated by CMA during the saccade: the plant position is overshot during flight, the peak velocity is too large, and the control signal is too oscillatory. While steepest descent and BFGS produce trajectories that cost more during the fixation interval, the corresponding in-flight trajectories appear to qualitatively match the solution better compared to CMA. Even nicer is that the goal state is reached at about the same time as the solution and maintained for the rest of the fixation period. The discrepancy in the cost function is troubling, and may be an artifact of the gradient-based methods being incapable of distinguishing the cost 44 function from random error generated from Monte Carlo sampling. There is also a “knee” shape in the solution’s velocity and control signal profiles that the gradientbased methods seem to have smoothed. Performance-wise, the gradient-based methods take the cake, requiring far fewer iterations, and subsequent expensive function calls, over the gradient-free methods. The final endpoint cost value might not matter so much, as long as the in-flight trajectory looks reasonable enough, the controller is able to reach the goal state, and maintain it for some amount of time. The gradient descent methods work “well enough” in this regard, smoothing artifacts aside. That they are able to attain reasonable enough results making far fewer function calls is most appealing. From now on, let us only consider the gradient-based methods. Table 4.1: Performance comparison between steepest descent, BFGS, CMA, and Simplex for a 15◦ saccade with F = 10 ms Method steepest descent BFGS CMA Downhill Simplex 4.7 Iterations 141 13 1885 760 Function calls 2961 273 18850 2003 Cost 8.506e-5 8.227e-5 3.384e-5 6.104e-2 Error 4.390 4.213 1.144 3866 Driving a 3D plant We also tried a more complicated plant model of the eye in 3D, described in detail in Section 2.2. The cost function is optimized using only the gradient-based methods steepest descent and BFGS. Note that there is no exact solution to compare the estimated solutions against, as we did in the one-dimensional case, so the criteria used to compare one method versus another will be purely qualitative in that we consider if the plant is able to reach the goal state on time, and maintain the plant position during the fixation period. h i φ , n, ω , is a vector in R7 , comprised of an angle φ and axis n for the eye orientation, and a 3D angular velocity vector ω(t). The state of the eye q(t) = The control signal u(t) is extended to 3D to allow for innervation of the 3 pairs of muscles. In addition to being able to simulate a nonlinear plant, another advantage 45 our framework has over other methods is that it allows us to simulate the dynamics using integration schemes other than Forward Euler. During a forward simulation of the 3D plant, we integrate the ordinary differential equations corresponding to the plant’s dynamics using the classical Runge-Kutta 4th-order method [Ascher and Petzold 1998]. When calculating the cost function, we project the 3D coordinates h of the eyei onto a 2D coordinate system. This is done by rotating the view vector 0 0 1 by the rotation matrix corresponding to the axis-angle of the eye plant, projecting it onto xz and yz planes, and calculating the respective angular distance away from the z axis. The projected coordinates are calculated for each particle during the Monte Carlo simulation, and these are then used to calculate the average position and variance in Equation 4.7. The goal state xT is then naturally represented as a 4D vector, xT = [ tx 0 ty 0 ]T , encoding the horizontal and vertical components of the desired position and velocity. Note that the conversion from an axis-angle representation of the coordinates to a xy representation is a nonlinear transformation of the the coordinates. Evaluating the endpoint variance in this case actually requires a numerical solution, which this framework provides. Figure 4.4 shows a saccade generated to a target 15◦ away in the horizontal and vertical directions. The trajectories were computed using steepest descent and BFGS, giving qualitatively similar results. One noticeable problem is that the eye takes a little longer to reach the target position than the allotted flight time of T = 60 ms. However, once it does arrive there, the controller manages to maintain the position during the fixation period (F = 30 ms). In terms of performance, the tolerance was set to 1e − 2: steepest descent managed to converge to a solution in 29 iterations, while BFGS finished after roughly 73 iterations. When running both methods, the same noise realization of w was used. In terms of endpoint cost, BFGS achieves a cost smaller than steepest descent, but qualitatively, the trajectories are very similar. 4.8 A nonlinear muscle model The 3-element muscle model described in Chapter 2 was implemented as an actuator for a 1D model of the oculomotor system. As already mentioned, we ignore the 46 control signal x-velocity (deg/s) x-position (deg) 16 14 12 10 8 6 4 2 00 500 400 300 200 100 0 1000 0 50 100 150 200 2500 steepest descent BFGS 20 40 60 80 100 20 40 60 80 100 20 40 60 80 100 time (ms) (a) control signal y-velocity (deg/s) y-position (deg) 16 14 12 10 8 6 4 2 0 20 500 400 300 200 100 0 1000 0 50 100 150 200 2500 steepest descent BFGS 20 40 60 80 100 20 40 60 80 100 20 40 60 80 100 time (ms) (b) Figure 4.4: The controller was required to drive the eye to a goal state 15◦ in the horizontal and vertical direction by time T = 60 ms, and maintain its position during the fixation period F = 30 ms. The top figure is the projected horizontal position of the eye, and the bottom figure is the corresponding vertical movement. 47 Medial Rectus Lateral Rectus Figure 4.5: The medial and lateral recti are attached to the globe, and the muscle model in Chapter 2 is used to calculate the force generated in each muscle given the muscle length, contraction velocity, and activation signal. In our example, we simplify the muscle dynamics by expressing the muscle length in terms of the eye orientation θ . Here, positive rotations are counter-clockwise, which corresponds to the medial rectus contracting. series-element in the model and only focus on the contractile and passive elements, which are still nonlinear. We assume that sufficiently small changes in the eye orientation θ (t) are equivalent to changes in the muscle length l(t), where rdθ = dl, and r is the radius of the eye globe (12 mm). Both the medial and lateral recti are modelled as actuators. Referring to Figure 4.5, counter-clockwise rotations are positive, with θ (t) being measured from the eye’s primary position. The length of the medial rectus is then lMR (t) = lr − rθ (t), where l p is the initial muscle length when the eye is in its primary position (i.e., θ = 0). Note that the initial muscle length, l p , does not have to be the same as its rest length. A similar result follows for the lateral muscle length lLR . A second-order approximation to the dynamics, in terms of the eye orientation θ is 48 md θ̇ = − bθ̇ dt − kθ dt+ (4.9) fMR (l p − rθ , −rθ̇ , uMR , dwMR )− (4.10) fLR (rθ − l p , rθ̇ , uLR , dwLR ), (4.11) where u(·) (t) and dw(·) (t) are the corresponding control signal and noise distribution for each muscle. fMR (·) and fLR (·) are the active and passive forces generated by their respective muscles f(·) (l, v, u, dw) = Fl (l) ∗ Fv (v) ∗ (udt + dw), (4.12) as described in Chapter 2; the primary difference is the inclusion of the signaldependent noise term dw. Values used for the mass m, viscosity b, and stiffness k are the same used for the 3D plant; the noise coefficient c was set to 0.02. The initial muscle length was set to 40 mm. The control signals were clamped to be greater than or equal to zero, since the activation signal in the muscle model is assumed as such. An explicit 4th order Runge-Kutta method was used to step through the simulation using a time step size of 1e − 3. Moreover, u(t) was parameterized using 16 spline basis functions. Since u(t) is 2D, that gives 32 parameters to optimize over. Saccades were simulated for two different target locations: a target with eccentricty of 5◦ and another target that was 15◦ away from the primary position. The position and velocity profile of the 5◦ saccade is shown in Figure 4.7. The duration and fixation period were separately assigned to 20 ms. Just as in the earlier 1D and 3D examples, the figure illustrates that the controller is capable of steering the eye to the target location, and keeping it there during fixation. More interesting is the control signal, which resembles a “pulse-slide-step”. There are two muscles to pull the eye in this case, and Figure 4.8 shows the activation signal for each. The agonist muscle (i.e. medial rectus) starts off high, and then proceeds to a tonic low that keeps the eye in its place. The smooth transition between the two signifies the “slide” component, which is difficult to simulate in bang-bang 49 control. We also recognize, however, that the “slide” could just be an artifact of using smooth basis functions. Furthermore, the “pulse” component does not reach peak activity until after the saccade has started, which is most likely the result of the set of basis functions selected for u(t). There is also interesting activity in the antagonist muscle (i.e. lateral rectus) near the end of the saccade that looks as if the muscle is “braking”. Van Gisbergen et al. [1981] observed similarly in recordings of antagonist activity for saccades. Generating a 15◦ saccade was more difficult; gradient descent required more iterations, and the final solution obtained through BFGS was noticeably worse than the 5◦ example. Figures 4.9 and 4.10 show the saccade trajectory and control signals computed using gradient descent only, which required 160 iterations in order to converge. In general we found that gradient descent performed better in the optimization process than BFGS. This could be related to how the activation signal is clamped to be non-negative; the antagonist signal in Figure 4.8 illustrates this. Since BFGS assumes that the underlying function is smooth, the discontinuities in the activation signal’s derivative could very well throw it off course. The trajectory and control data for the 15◦ saccade shows reasonable looking position and velocity profiles. Interestingly, the “brake” signal appears again in the antagonist signal. During the fixation, however, the controller oscillates in keeping the eye steady. This behavior is noticeable in both control signals and the resulting velocity profile. Experimenting with downhill simplex and CMA did not result in better trajectories than those generated from gradient descent. We also observed that reducing the noise coefficient had little effect on convergence, instead only affecting the resulting eye trajectory more. This would imply that the inherent nonlinearities in the dynamics are causing more trouble for the optimization process than the uncertainty in the dynamics. Lastly, we would like to point out that the endpoint distributions estimated were done so for nonlinear plant models, which was not really considered in prior applications of the minimum variance principle to eye movements. Figure 4.6 shows endpoint distributions for 5◦ and 15◦ saccades. The latter distribution, in particular, is noticeably skewed. This shows how the plant model can affect the endpoint distribution, possibly resulting in distributions that are not Gaussian, which have 50 0.12 0.10 0.08 0.06 0.04 0.02 0 4 2 6 8 position (deg) 10 (a) 0.14 0.12 0.10 0.08 0.06 0.04 0.02 0.00 12 13 14 15 position (deg) 16 17 18 (b) Figure 4.6: Shown are endpoint distributions for saccades with eccentricities of 5 and 15◦ . The distribution for the 15◦ saccade, in particular, appears asymmetric. 51 velocity (deg/s) position (deg) 5 steepest descent BFGS 4 3 2 1 0 500 400 300 200 100 0 5 10 15 20 25 30 35 0 5 10 15 20 time (ms) 25 30 35 Figure 4.7: A 1D model of the eye using the nonlinear muscle model described in Chapter 2 was driven to a 5◦ target location in a 40 ms timeframe. The above figure shows the position and velocity profile of the eye during saccade flight and final fixation. often been assumed in the oculomotor literature. The cost function we use also assumes that the endpoint distribution is Gaussian, where the positional mean and variance is computed, but perhaps a different cost formulation is needed for more general endpoint distributions. We can also imagine a scenario where we compare the plant distribution against goal regions more arbitrary in nature. Instead of representing the target as a point, for instance, we could define more arbitrary shapes, such as a circle, for the eye to saccade to, instead of point targets. 4.9 Implementation in CUDA In the context of our parallel implementation, the biggest, most immediate threat to performance is not the number of particles being traced (though this could 52 medial control signal lateral control signal 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0 0.045 0.040 0.035 0.030 0.025 0.020 0.015 0.010 0.005 0 steepest descent BFGS 5 10 15 20 25 30 35 5 10 15 20 time (ms) 25 30 35 Figure 4.8: The 1D eye shown in Figure 4.5 has two muscle actuators: the medial and lateral rectii. The control signals for each muscle are shown, resulting in the saccade trajectory in Figure 4.7. change as the dynamics model becomes more and more complicated), but rather the amount of memory transferred back and forth between the CPU and GPU. The solution is to keep as much of the computation delegated to the GPU, only referring back to the CPU when needed. The framework we outline ends up fitting nicely within CUDA. Particles are individually represented as threads, which trace their trajectories while sampling the corresponding control and noise. The cost function J(u) is evaluated using the final positions and velocities, which can be implemented efficiently depending on the cost function. The mean and variance, for instance, can be calculated by applying a parallel reduction twice on the particle states. Computing the gradient for steepest descent and BFGS requires sampling the cost function 2 ∗ Nu times, if a central differencing scheme is used. A diagram 53 velocity (deg/s) position (deg) 16 14 12 10 8 6 4 2 00 700 600 500 400 300 200 100 0 1000 10 20 30 40 50 60 10 20 30 time (ms) 40 50 60 Figure 4.9: Positon and velocity trace for a 15◦ saccade. illustrating how sampling is done is shown in Figure 4.11. We allocate a set of blocks for each sample Ji , each simulating its own set of N particle trajectories. The associative cost for each set of blocks is computed, and the subsequent values stored in memory. Updating the control spline weights is then performed on the GPU. 4.10 Summary and conclusions We outlined a framework for computing an optimal control law where the cost function relies only on the distribution of the plant state during the fixation period. Qualitatively, the trajectories obtained from steepest descent and BFGS are quite comparable to the solution in Equation 4.8 (for the simplest model, where we have such an exact solution), and converge while requiring a relatively small number of function calls. 54 medial control signal 0.10 0.09 0.08 0.07 0.06 0.05 0.04 0.030 10 20 30 40 50 60 10 20 30 time (ms) 40 50 60 lateral control signal 0.025 0.020 0.015 0.010 0.005 0.0000 Figure 4.10: Agonist and antagonist control signals for a 15◦ saccade. The downhill simplex method did not perform up to our expectations in the presented framework, and so it would be interesting to investigate how other optimization methods fare in the TOPS framework, which is very similar in purpose and function to our own. We additionally looked at a test problem where the solution is known beforehand, to determine how well the different methods fare. Important to note is that the final endpoint cost obtained is not the best indicator of accuracy, as illustrated by the CMA method which generated small cost values but unusual saccade trajectories. Furthermore, the TOPS framework lacks such comparisons, so it would be also interesting to look at how well it performs with a similar problem for validation. We also exploit the GPU for performance purposes. As discussed earlier, it is fairly inexpensive to simulate many particles. Simulating hundreds, if not thousands of trajectories is perfectly doable; for example, the figures in 4.3 were calcu55 Sampling J(u) thread 1! thread 1! thread i! thread N! thread i! thread N! Block 1! Block Nu+1! Block j! Block Nu+j! thread 1! thread thread i! 1! thread N! i! thread thread N! Block Nu! Block 2Nu! Figure 4.11: Diagram illustrating how blocks and threads are allocated when sampling the cost function J(u) for steepest descent or BFGS. lated using two to three-thousand particles. Since we use the GPU, a major bottleneck in performance is transferring data between the host CPU and GPU. This can be alleviated by keeping as much computation on the GPU, sending only a small amount of data back to the CPU when checking whether the method has converged. Another issue with GPUs is that not all of them support 64 bit arithmetic. The device used to run our examples, for instance, only supports 32 bit floating point operations. This could be a problem when trying to accurately compute the variance of a set of numbers [Jones et al. 2001]. We have not yet encountered any serious issues, however, and eventually more and more cards will inevitably support double precision arithmetic. Overall, our framework provides a fairly fast and practical method for computing optimal control laws. Theoretically any dynamical system, noise model, and cost function can be plugged into the framework, and a control law should be 56 returned. In practice, however, it might not be entirely feasible to implement a dynamics simulator on the GPU where thousands of trajectories are simulated in parallel, because of memory constraints and the lack of more general purpose computing resources such as dynamic memory allocation. Additionally, the gradientbased methods might run into issues if the cost function is prone to too much noise. In the previous examples, tuning was required for the tolerance and step sizes used. A two-pass approach for applying gradient descent on a stochastic cost function is described in Bertsekas and Tsitsiklis [1996], where the average of iterate solutions before uk (t) is taken as the current solution. This averaging is performed once the iterate uk (t) is within the vicinity of the optimal u? (t). The purpose of averaging is to minimize variance, and it has been shown to achieve convergence properties on par with second order methods. The problem is identifying when the iterate is close enough to the solution. Furthermore, this technique really would only affect performance near the end of gradient descent, and not during, where convergence can be particularly slow. In our example, CMA produced the smallest cost values, but saccade trajectories that looked not so great. It would be interesting to see if a marriage between the gradient-based and gradient-free methods such as CMA is possible, where the best of both worlds could be combined: achieving good-looking saccade trajectories using a small number of iterations and, at the same time, striking a low cost value. All the optimization routines ultimately performed poorly in terms of achieving a small relative error with respect to the solution for the 1D linear test problem. Even though CMA achieves the smallest cost value, the value it strikes is still far from the true cost value. Nonetheless, we show that our framework can control a 3D plant of the oculomotor system, where the dynamics and cost function are nonlinear with respect to the plant state. We also showed an application of our framework to a model of a 1D eye, but where the dynamics of the muscle are taken into account. In this case, nonlinearities in the muscle with respect to the state variables and activation signal are considered, for which our framework is able to effectively calculate a control signal. At the same time, this example highlighted the need for methods better than gradient descent, that converge faster. The quasi-Newton method BFGS failed completely in some cases, and Newton methods would be quite prohibitive. 57 Chapter 5 Conclusions We described a framework for controlling noisy saccades. The dynamics, noise, and cost function are not restricted to the linear-quadratic-Gaussian models [Stengel 1986] ubiquitous in recent saccade control literature. In developing this framework, we addressed two problems: propagating the plant’s state distribution and iteratively finding the optimal control law quickly and efficiently. Estimating the plant distribution is easily done by simulating many trajectories at once, which we demonstrated on both linear and nonlinear plant models. Previous work on applying the minimum variance model to saccades were limited only to linear plant models. Leveraging the GPU during the forward simulation also allows us to calculate the state distribution quickly: demonstrating how the “brute force” solution can work really well in practice. The cost function calculated is not without error, since particle tracing is used to determine the plant distribution which is critical to cost. As a result we investigated several optimization methods that did or did not require gradient information. We discovered that the gradient-free methods produced poor saccade trajectories, even though one of them (i.e. CMA) achieved the smallest cost value. The gradient descent alternatives offered more in terms of fewer function calls (which are considered expensive to make) and better-looking saccade trajectories. A key ingredient in applying gradient descent to the error-prone cost function is that we reuse the same noise realization when estimating the gradient of the cost function. We also demonstrate the utility of our framework on several different models 58 of the eye: the canonical 1D plant, a 3D nonlinear plant, and finally an eye-plant model that uses a nonlinear muscle model. In all examples, the eye was correctly driven to its target destination, producing position and velocity profiles qualitatively similar to saccades observed in the real world. 5.1 Future work While we show that our framework works well for different models of the eye, it would be great to apply our method to even more computationally demanding models, such as that described in Wei et al. [2010]. Due to limited resources on the GPU, implementing more complicated simulations might be difficult. Fortunately, the GPU is not the only platform capable of parallel computation; it would be worth considering an MPI implementation of the framework, for instance. We realize that the number of particles needed to obtain high accuracy is typically large in Monte Carlo path tracing. An interesting line of future work would be to investigate smarter sampling strategies that yield better estimates of the probability distribution using fewer particles. Variance-reduction techniques, which are popular in rendering [Pharr and Humphreys 2004], might be useful for reducing the “error” in the cost function, though care is needed to ensure that bias is mitigated as much as possible in the final solution. When calculating the gradient of the cost function, we reuse samples drawn from the noise function: this could be a source of bias, and requires further examination. In developing the framework, we used a linear congruential number generator [Park and Miller 1988] to draw samples from the noise distribution. The method is quick and easy to implement, but is not very good compared to other methods in terms of the periodicity of the random numbers generated. Better random number generators, such as the Mersenne Twister [Matsumoto and Nishimura 1998], could be implemented, and might improve our framework’s estimate of the plant’s state distribution. There are also other biomechanical phenomena to model, such as arm movements or other types of eye movements. Miyamoto et al. [2004] use their TOPS framework to simulate hand trajectories, and we believe our framework could handle similar problems. Furtheremore, the human visual system capabilities extend 59 beyond saccades. An interesting application of this framework would be to apply it to pursuit, a smooth tracking motion our eyes make when following moving objects. Since pursuit is primarily feedback driven, and our framework only considers open-loop trajectories, one approach to this problem would be to simulate either a sequence of open-loop trajectories or perhaps choose a basis for the control signal that is defined over a domain other than time, such as the target position and velocity. In regards to saccades, we would also like to validate our method and the trajectories it simulates against actual measured data. A quantitative comparison against subject-specific eye trajectories have been lacking in previous applications of the minimum variance principle. For example, it would be interesting to verify the statistical properties of saccades. Ploner et al. [2004] note that saccade endpoint variance increases with target size. As future work, it would be interesting to measure the endpoint accuracy of saccades and determine whether current models of the plant and cost function are in agreement. In our experiments with the 3D plant, we observed how lumping the muscle mass onto the eye globe had little effect on the resulting saccade. In one experiment, we artificially increased the muscle mass contribution for strictly horizontal saccades and found no significant change in the vertical movement, as shown in Figure 5.1. Indeed, Robinson and Zuber [1981] reports that, in general, the moment of inertia J would have to be raised by a significant factor before seriously affecting eye movements. There are other asymmetries to consider other than the eye’s mass distribution, such as the insertion points of the muscles [Quaia and Optican 2003a]. Perhaps testing more geometrically intricate eye models would reveal why some saccades are straight and others curved. As we develop and test out more biomechanical models, the value of having a control framework for investigating what effects the underlying dynamics have on the control become apparent. As a tool for studying the brain, the framework we describe will hopefully allow us to quickly test out new and current hypotheses on the brain and its control mechanisms, a prospect we find exciting. 60 control signal x-velocity (deg/s) x-position (deg) 35 30 25 20 15 10 5 00 1500 1000 500 0 500 10000 600 400 200 0 200 400 600 8000 1.0 5.0 10.0 50.0 20 40 60 80 100 20 40 60 80 100 20 40 60 80 100 time (ms) control signal y-velocity (deg/s) y-position (deg) (a) 0.8 0.6 0.4 0.2 0.0 0.2 0.4 0.6 0.8 1.00 60 40 20 0 20 40 600 25 20 15 10 5 0 5 100 1.0 5.0 10.0 50.0 20 40 60 80 100 20 40 60 80 100 20 40 60 80 100 time (ms) (b) Figure 5.1: The above show the projected horizontal and vertical trajectories for a 3D plant after scaling the muscle mass by the amounts shown in the legend. 61 Bibliography K. Arai, E. Keller, and J. Edelman. A spatio-temporal neural network model of saccade generation. In Neural Networks, 1993., IEEE International Conference on, pages 70–74. IEEE, 1993. → pages 16 D. Ascher, P. F. Dubois, K. Hinsen, J. Hugunin, and T. Oliphant. Numerical Python. Lawrence Livermore National Laboratory, Livermore, CA, ucrl-ma-128569 edition, 1999. → pages 40 U. Ascher and L. Petzold. Computer methods for ordinary differential equations and differential-algebraic equations, volume 61. Society for Industrial Mathematics, 1998. → pages 46 A. Bahill, M. Clark, and L. Stark. The main sequence, a tool for studying human eye movements. Mathematical Biosciences, 24(3-4):191–204, 1975. → pages 21 D. Bertsekas and J. Tsitsiklis. Neuro-dynamic programming: an overview. In Decision and Control, 1995., Proceedings of the 34th IEEE Conference on, volume 1, pages 560–564. IEEE, 1996. → pages 57 A. Bryson and Y. Ho. Applied optimal control. American Institute of Aeronautics and Astronautics, 1979. → pages 1 H. Chen-Harris, W. M. Joiner, V. Ethier, D. S. Zee, and R. Shadmehr. Adaptive Control of Saccades via Internal Feedback. J. Neurosci., 28(11):2804–2813, 2008. doi:10.1523/JNEUROSCI.5300-07.2008. URL http://www.jneurosci.org/cgi/content/abstract/28/11/2804. → pages 1, 9, 11 M. Clark and L. Stark. Control of human eye movements: I. modelling of extraocular muscle. Mathematical Biosciences, 20(3-4):191–211, 1974. → pages 14 62 M. Clark and L. Stark. Time optimal behavior of human saccadic eye movement. Automatic Control, IEEE Transactions on, 20(3):345–348, 1975. → pages 1, 7, 20 J. Diedrichsen, R. Shadmehr, and R. Ivry. The coordination of movement: optimal feedback control and beyond. Trends in cognitive sciences, 14(1): 31–39, 2010. → pages 23 A. Doucet, N. de Freitas, and N. Gordon. An introduction to sequential monte carlo methods. Sequential Monte Carlo methods in practice, pages 3–14, 2001. → pages 25 J. Enderle and J. Wolfe. Time-optimal control of saccadic eye movements. Biomedical Engineering, IEEE Transactions on, (1):43–55, 1987. → pages 7, 20 P. Fitts. The information capacity of the human motor system in controlling the amplitude of movement. Journal of experimental psychology, 47(6):381, 1954. → pages 23 T. Flash and N. Hogan. The coordination of arm movements: an experimentally confirmed mathematical model. The journal of Neuroscience, 5(7):1688, 1985. → pages 20 A. Hamilton and D. Wolpert. Controlling the statistics of action: obstacle avoidance. Journal of Neurophysiology, 87(5):2434, 2002. → pages 37 N. Hansen and S. Kern. Evaluating the cma evolution strategy on multimodal test functions. In Parallel Problem Solving from Nature-PPSN VIII, pages 282–291. Springer, 2004. → pages 38, 40 C. Harris. On the optimal control of behaviour: a stochastic perspective. Journal of neuroscience methods, 83(1):73–88, 1998. → pages 1, 21 C. Harris and D. Wolpert. The main sequence of saccades optimizes speed-accuracy trade-off. Biological Cybernetics, 95:21–29, 2006. ISSN 0340-1200. URL http://dx.doi.org/10.1007/s00422-006-0064-x. 10.1007/s00422-006-0064-x. → pages 11, 21, 22, 23 C. M. Harris and D. M. Wolpert. Signal-dependent noise determines motor planning. Nature, 394:780–784, Aug 1998. doi:10.1038/29528. URL http://dx.doi.org/10.1038/29528. 10.1038/29528. → pages 4, 9, 19, 20, 21, 22, 23 63 A. Hill. The heat of shortening and the dynamic constants of muscle. Proceedings of the Royal Society of London. Series B, Biological Sciences, 126(843): 136–195, 1938. → pages 14 N. Hogan. An organizing principle for a class of voluntary movements. The Journal of Neuroscience, 4(11):2745, 1984. → pages 20 E. Jones, T. Oliphant, P. Peterson, et al. SciPy: Open source scientific tools for Python, 2001. URL http://www.scipy.org/. → pages 40, 56 E. Kandel, J. Schwartz, T. Jessell, S. Siegelbaum, and A. Hudspeth. Principles of neural science, volume 3. Elsevier New York, 1991. → pages 7 D. Kirk and W. Hwu. Programming massively parallel processors: A hands-on approach. Published February, 5, 2010. → pages 25, 32 E. Kowler and E. Blaser. The accuracy and precision of saccades to small and large targets. Vision Research, 35(12):1741–1754, 1995. → pages 23 O. Kramer, D. Ciaurri, and S. Koziel. Derivative-free optimization. Computational Optimization, Methods and Algorithms, pages 61–83, 2011. → pages 37 R. Leigh and D. Zee. The neurology of eye movements. Number 55. Oxford Univ Pr, 1999. → pages 1, 13, 17, 36 C. Martin and L. Schovanec. Muscle mechanics and dynamics of ocular motion. Journal of Mathematical Systems Estimation and Control, 8:233–236, 1998. → pages 14 M. Matsumoto and T. Nishimura. Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation (TOMACS), 8(1):3–30, 1998. → pages 59 J. M. Miller, D. S. Pavlovski, and I. Shamaeva. Orbit tm 1.8 gaze mechanics simulation. Eidactics, San Francisco, 1995, 1995. → pages 14 H. Miyamoto, E. Nakano, D. Wolpert, and M. Kawato. Tops (task optimization in the presence of signal-dependent noise) model. Systems and Computers in Japan, 35(11):48–58, 2004. → pages 22, 29, 37, 59 T. Monk and C. Harris. Using optimality to predict photoreceptor distribution in the retina. Advances in Neuro-Information Processing, pages 408–415, 2009. → pages 19 64 J. Nelder and R. Mead. A simplex method for function minimization. The computer journal, 7(4):308, 1965. → pages 38 J. Nocedal and S. Wright. Numerical optimization. Springer verlag, 1999. → pages 6, 39, 40 C. Nvidia. Compute unified device architecture programming guide. NVIDIA: Santa Clara, CA, 83:129, 2007. → pages 25 L. Optican and F. Miles. Visually induced adaptive changes in primate saccadic oculomotor control signals. Journal of neurophysiology, 54(4):940, 1985. → pages 18 S. Park and K. Miller. Random number generators: good ones are hard to find. Communications of the ACM, 31(10):1192–1201, 1988. → pages 59 M. Pharr and G. Humphreys. Physically based rendering: From theory to implementation. Morgan Kaufmann, 2004. → pages 59 C. Ploner, F. Ostendorf, and S. Dick. Target size modulates saccadic eye movements in humans. Behavioral neuroscience, 118(1):237, 2004. → pages 60 C. Quaia and L. Optican. Commutative saccadic generator is sufficient to control a 3-d ocular plant with pulleys. Journal of Neurophysiology, 79(6):3197, 1998. → pages 12, 18 C. Quaia and L. Optican. Dynamic eye plant models and the control of eye movements. Strabismus, 11(1):17–31, 2003a. ISSN 0927-3972. → pages 8, 12, 13, 60 C. Quaia and L. Optican. Three-dimensional rotations of the eye. Adlers physiology of the eye: clinical application. New York: Mosby, pages 818–29, 2003b. → pages 13 T. Raphan. Modeling Control of Eye Orientation in Three Dimensions. I.Role of Muscle Pulleys in Determining Saccadic Trajectory. J Neurophysiol, 79(5): 2653–2667, 1998. URL http://jn.physiology.org/cgi/content/abstract/79/5/2653. → pages 11, 12, 13, 18 S. Reich. A dynamical systems framework for intermittent data assimilation. BIT Numerical Mathematics, pages 1–15, 2011. → pages 25 D. Robinson. The mechanics of human saccadic eye movement. The Journal of Physiology, 174(2):245, 1964. → pages 7 65 D. Robinson and B. Zuber. Models of oculomotor behavior and control (pp. 21–42), 1981. → pages 14, 60 D. Robinson, D. O’meara, A. Scott, and C. Collins. Mechanical components of human eye movements. Journal of Applied Physiology, 26(5):548, 1969. → pages 10, 12 P. Rogberg, P. Read, and S. Lewis. Assessing atmospheric predictability on mars using numerical weather prediction and data assimilation. 2008. → pages 25 R. Stengel. Stochastic optimal control: theory and application. Wiley New York (NY) et al., 1986. → pages 1, 58 S. Sueda, A. Kaufman, and D. Pai. Musculotendon simulation for hand animation. In ACM SIGGRAPH 2008 papers, pages 1–8. ACM, 2008. → pages 8 S. Sueda, G. L. Jones, D. I. W. Levin, and D. K. Pai. Large-scale dynamic simulation of highly constrained strands. ACM Trans. Graph., 30(4):39:1–39:9, Aug 2011. → pages 8 J. Tan, Y. Gu, G. Turk, and C. K. Liu. Articulated swimming creatures. ACM Trans. Graph., 30:58:1–58:12, August 2011. ISSN 0730-0301. doi:http://doi.acm.org/10.1145/2010324.1964953. URL http://doi.acm.org/10.1145/2010324.1964953. → pages 38 H. Tanaka, J. W. Krakauer, and N. Qian. An Optimization Principle for Determining Movement Duration. J Neurophysiol, 95(6):3875–3886, 2006. doi:10.1152/jn.00751.2005. URL http://jn.physiology.org/cgi/content/abstract/95/6/3875. → pages 11, 21 E. Todorov. Stochastic optimal control and estimation methods adapted to the noise characteristics of the sensorimotor system. Neural Comput., 17(5): 1084–1108, 2005. ISSN 0899-7667. doi:http://dx.doi.org/10.1162/0899766053491887. → pages 1, 21 E. Todorov and W. Li. A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems. In American Control Conference, 2005. Proceedings of the 2005, pages 300–306. IEEE, 2005. ISBN 0780390989. → pages 21, 22 D. Tweed, H. Misslisch, and M. Fetter. Testing models of the oculomotor velocity-to-position transformation. Journal of neurophysiology, 72(3):1425, 1994. → pages 18 66 J. Van Gisbergen, D. Robinson, and S. Gielen. A quantitative analysis of generation of saccadic eye movements by burst neurons. Journal of Neurophysiology, 45(3):417, 1981. → pages 7, 16, 17, 50 Q. Wei, S. Sueda, and D. Pai. Physically-based modeling and simulation of extraocular muscles. Progress in biophysics and molecular biology, 2010. → pages 8, 13, 14, 59 S. Weinzierl. Introduction to monte carlo methods. Arxiv preprint hep-ph/0006269, 2000. → pages 32 F. Zajac. Muscle and tendon: properties, models, scaling, and application to biomechanics and motor control. Critical reviews in biomedical engineering, 17(4):359, 1989. → pages 15 67
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Noisy optimal control strategies for modelling saccades
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Noisy optimal control strategies for modelling saccades Jones, Garrett L. 2011
pdf
Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
Page Metadata
Item Metadata
Title | Noisy optimal control strategies for modelling saccades |
Creator |
Jones, Garrett L. |
Publisher | University of British Columbia |
Date Issued | 2011 |
Description | Eye movements have for a while provided us a closer view into how the brain commands the body. Particularly interesting are saccades: fast and accurate eye movements that allow us to scan our visual surroundings. One observation is that motor commands issued by the brain are corrupted by a signal-dependent noise. Moreover, the variance of the signal scales linearly with the control signal squared. It is assumed that such uncertainty in the dynamics introduces a probability distribution of the eye that the brain accounts for during motion planning. We propose a framework for computing the optimal control law for arbitrary dynamical systems, subject to noise, and where the cost function depends on a statistical distribution of the eye’s position. A key contribution of this framework is estimating the endpoint distribution of the plant using Monte Carlo sampling, which is done efﬁciently using commodity graphics hardware in parallel. We then describe a modiﬁed form of gradient descent for computing the optimal control law for an objective function prone to stochastic effects. We compare our approach to other methods, such as downhill simplex and Covariance-Matrix-Adaptation, which are considered “gradient-free” approaches to optimization. We ﬁnally conclude with several examples that show the framework successfully controlling saccades for different plant models of the oculomotor system: this includes a 3D torque-based model of the eye, and a a nonlinear model of the muscle actuator that drives the eye. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2011-12-23 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0052155 |
URI | http://hdl.handle.net/2429/39845 |
Degree |
Master of Science - MSc |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2012-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2012_spring_jones_garrett.pdf [ 3.11MB ]
- Metadata
- JSON: 24-1.0052155.json
- JSON-LD: 24-1.0052155-ld.json
- RDF/XML (Pretty): 24-1.0052155-rdf.xml
- RDF/JSON: 24-1.0052155-rdf.json
- Turtle: 24-1.0052155-turtle.txt
- N-Triples: 24-1.0052155-rdf-ntriples.txt
- Original Record: 24-1.0052155-source.json
- Full Text
- 24-1.0052155-fulltext.txt
- Citation
- 24-1.0052155.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
data-media="{[{embed.selectedMedia}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0052155/manifest