A Risk Assessment Infrastructure forPowered Wheelchair MotionCommands without Full SensorCoveragebyPouria TalebiFardB.A.Sc., The University of British Columbia, 2011A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFMASTER OF SCIENCEinThe Faculty of Graduate and Postdoctoral Studies(Computer Science)THE UNIVERSITY OF BRITISH COLUMBIA(Vancouver)March 2014c? Pouria TalebiFard 2014AbstractSmart powered wheelchairs offer the possibility of enhanced mobility to alarge and growing population?most notably older adults?and a key featureof such a chair is collision avoidance. Sensors are required to detect nearbyobstacles; however, complete sensor coverage of the immediate neighbour-hood is challenging for reasons including financial, computational, aesthetic,user identity and sensor reliability. It is also desirable to predict the futuremotion of the wheelchair based on potential input signals; however, directmodeling and control of commercial wheelchairs is not possible because ofproprietary internals and interfaces. In this thesis we design a dynamic ego-centric occupancy map which maintains information about local obstacleseven when they are outside the field of view of the sensor system, and weconstruct a neural network model of the mapping between joystick inputsand wheelchair motion. Using this map and model infrastructure, we canevaluate a variety of risk assessment metrics for collaborative control of asmart wheelchair. One such metric is demonstrated on a wheelchair with asingle RGB-D camera in a doorway traversal scenario where the near edgeof the doorframe is no longer visible to the camera as the chair makes itsturn.iiPrefaceA version of this thesis is submitted for publication, in which I was thelead investigator, responsible for all major areas of concept formation, datacollection and analysis, as well as manuscript composition. Junaed Sat-tar was involved in the latter stages of concept formation and contributedto manuscript edits. Ian M. Mitchell was the supervisory author on thisproject and was involved throughout the project in concept formation andmanuscript composition.This research was supported by CanWheel (the Canadian Institutesof Health Research (CIHR) Emerging Team in Wheeled Mobility for OlderAdults #AMG-100925), National Science and Engineering Council of Canada(NSERC) Discovery Grant #298211, and the Canadian Foundation for In-novation (CFI) Leaders Opportunity Fund / British Columbia KnowledgeDevelopment Fund Grant #13113.iiiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiAcknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . viii1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.2 Related Work on Smart Wheelchairs . . . . . . . . . . . . . . 41.3 Hardware and Software Platform . . . . . . . . . . . . . . . . 52 An Egocentric Local Map . . . . . . . . . . . . . . . . . . . . 72.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . 92.3 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . 102.4 Map Update Based on Odometry Motion Estimation . . . . 132.4.1 Movement update . . . . . . . . . . . . . . . . . . . . 132.5 Updating the Map based on the Observation . . . . . . . . . 152.6 Software Platform . . . . . . . . . . . . . . . . . . . . . . . . 202.7 Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . 212.8 Real World Results . . . . . . . . . . . . . . . . . . . . . . . 213 Non-Linear Control Function Estimation . . . . . . . . . . . 233.1 Data Collection and Preprocessing . . . . . . . . . . . . . . . 253.1.1 Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . 283.2 Neural Network Training . . . . . . . . . . . . . . . . . . . . 283.2.1 The Frequentist Approach . . . . . . . . . . . . . . . 29ivTable of Contents3.2.2 The Bayesian Approach . . . . . . . . . . . . . . . . . 313.3 Outer-Loop Optimization . . . . . . . . . . . . . . . . . . . . 343.3.1 Bayesian Bandit Optimization . . . . . . . . . . . . . 343.3.2 Optimization Results . . . . . . . . . . . . . . . . . . 343.4 Model Validation and Performance . . . . . . . . . . . . . . . 353.5 Implementation Details . . . . . . . . . . . . . . . . . . . . . 364 Risk Assessment . . . . . . . . . . . . . . . . . . . . . . . . . . 405 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . 436 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496.1 Suggestions for Future Research . . . . . . . . . . . . . . . . 49Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50vList of Tables3.1 List of optimized parameters of the network trained via SCG,their ranges and achieved optimal . . . . . . . . . . . . . . . . 353.2 List of optimized parameters of the network trained via HMC,their ranges and achieved optimal . . . . . . . . . . . . . . . . 35viList of Figures1.1 Illustration of limited sensor coverage. . . . . . . . . . . . . . 21.2 Photos of our PWC. . . . . . . . . . . . . . . . . . . . . . . . 52.1 An illustration of acquiring a map representation while navi-gating in a cluttered environment . . . . . . . . . . . . . . . . 72.2 Occupancy grid representation of the filtering algorithm pre-sented in [1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.3 Illustration of an egocentric obstacle map representation inpolar coordinates . . . . . . . . . . . . . . . . . . . . . . . . . 112.4 Overlay of the generated polar map with the environment . . 122.5 Performance of the algorithm to dynamic obstacles . . . . . . 192.6 Screen-shot of AMORsim simulator . . . . . . . . . . . . . . . 202.7 Illustration of updating the map using movement information 223.1 Time deviations for the encoders . . . . . . . . . . . . . . . . 253.2 Time deviations for the joystick commands . . . . . . . . . . 263.3 Run-time resampling of the inputs . . . . . . . . . . . . . . . 273.4 Predictive capability of the model trained via SCG . . . . . . 363.5 Predictive capability of the model trained via HMC algorithm 373.6 Relative-error distribution of the model trained via SCG . . . 373.7 Relative-error distribution of the model trained via the HMCalgorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383.8 Comparison of various strategies for optimizing model param-eters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395.1 Experimental setup and the four tested trajectories. . . . . . 445.2 Safe (top) and unsafe (bottom) trajectories in slow-speed mode. 455.3 Safe (top) and unsafe (bottom) trajectories in fast-speed mode. 465.4 Occupancy-grid map of the environment navigated during thelong-run trial. . . . . . . . . . . . . . . . . . . . . . . . . . . . 475.5 Time-series plot of risk assessment during the long-run trialsshown in Figure 5.4. . . . . . . . . . . . . . . . . . . . . . . . 48viiAcknowledgementsI offer my lasting gratitude to my supervisor, Ian Mitchell. I?m not convincedthat words would ever suffice but, Ian: to work with you has been a realpleasure, with heaps of fun and excitement. Your high scientific standardsand critical insights serves as an inspiration to me, and has provided mewith a strong research foundation on which to continue building. Thankyou ever so much for being so patient and encouraging.Moreover, I feel heavily indebted to my former supervisor, Meeko Oishi,for her continued support and the opportunities she provided me with.I am very grateful to Jim Little for his insightful comments on my thesisand motivating discussions. I have been privileged to get to know andcollaborate with Junaed Sattar. I learned a lot from him about researchand technical groundwork and I thank him for that. Needless to say thatthis work would not have been possible without Pooja Viswanathan?s thesiswork, which identified the problems that I set out to solve.I also want to thank Nando de Freitas for introducing me to the field ofmachine learning and getting me hooked on it.My time in graduate school was made enjoyable in large part due tofriends that became a part of my life and helped me with personal challenges.I would like to thank namely Farshid Samandari for his support and forbeing the voice of reason in my life. I am very grateful to Chloe? Filson andRobin Wilson for their friendship; and Guggy Grewal for impeding me fromembarrassing myself too often.Much gratitude is owed to my family for all their love and encouragement.For my parents, Sirous and Nahid, through whose sacrifices, my siblings andI have the right to freedom of belief and the right to higher education. Formy brother, Peyman, for being nothing short of the perfect role model. Formy sister, Sahba, for her unconditional love and care, and for instilling loveof science in me.viiiChapter 1IntroductionMobility impairment is becoming increasingly common as the populationages, and a lack of mobility can lead to a host of further social, mental andphysical impairments. Powered wheelchairs (PWCs) can greatly improvethe mobility of those who are unable to self-propel in manual wheelchairs;however, existing PWC technology is often inappropriate for users with, forexample, low vision, hemispatial neglect, spasticity, tremors, dementia, se-vere Alzheimers, or other cognitive impairment because they may not be ableto safely drive these large, heavy and powerful machines [2]. Smart PWCsseek to overcome this limitation and could benefit a sizeable population [3].Considerable progress has been made in fully autonomous PWC navigation,but there are populations for which full autonomy is undesirable; for ex-ample, older adults with cognitive impairments can become quite agitatedwhen the PWC seems to act on its own initiative. Consequently, recentefforts have focused on smart PWCs which can constantly yet seamlesslycollaborate with the user to accomplish his or her goals. In the context ofthis paper, the goal of the collaborative control is minimal intervention inthe user?s commanded motion while still avoiding collisions.Collision avoidance requires knowledge of the surrounding environment.Because the majority of PWCs are purchased through insurance or publichealth care plans with significant pressures to reduce expenses, a criticalfactor in the success of any commercial intelligent PWC will be the costof sensors. With manufacturer and retailer markups totalling several hun-dred percent on parts, few payers will be willing to cover the cost of sur-rounding a power wheelchair with a suite of laser rangefinders (LRFs). Forcost purposes, currently available small and cheap RGB-D cameras (suchas the KinectTM) are an appealing alternative to LRFs. However, camerashave much narrower fields of view (less than 60? for the Kinect) than LRFs(typically 120??180?), and camera systems generate much more data than2D LRFs; for example, a single Kinect nearly saturates a typical laptop?speripheral bus. Even if it were financially and/or computationally practi-cal to completely surround a PWC with LRFs and/or cameras, it is likelythat the resulting system?s halo of sensors would be unappealing to potential1Chapter 1. Introduction(a) Field of view while the adja-cent wall is still visible.(b) Field of view as PWC be-gins turning; adjacent wall is nolonger visible.Figure 1.1: Illustration of limited sensor coverage.users, both for aesthetic reasons [4]?it increases the visual size of the PWC,which is already a rather bulky assistive device?and due to the potentialfor stigmatization [5]?it implies that the user is unable to operate a stan-dard PWC. As a final challenge, LRFs and cameras are routinely occluded,blinded or otherwise temporarily disabled when operating in the field.Consequently, full sensor coverage in a smart PWC system is an essen-tially unattainable goal, and holes in coverage can lead to collisions. Asan example, a system with a front facing stereo vision camera was testedwith cognitively impaired older adults in [1]. The camera?s narrow field ofview led to a common failure mode during the trials: The participant woulddrive the PWC parallel to a wall (for example, down a corridor) until itwas close enough to a corner or doorway that the camera could no longersee the adjacent wall (see figure 1.1 for an illustration of this moment); ifthe participant then made too tight a turn the footrest or side of the PWCwould hit the corner or doorframe.In this work we describe a system which can construct a collision riskassessment for short-term trajectories of the PWC which may steer it intoregions not currently visible to the sensor system. We adapt a probabilis-tic mapping technique [6] to construct an egocentric local occupancy mapwhich surrounds the PWC and thereby provides information about (static)obstacles outside the sensor field of view.In addition to the local obstacle map, a key requirement for risk assess-ment is accurate modeling of future trajectories. Commercial PWCs includedrivetrains, power electronics, control systems and input devices from sev-2Chapter 1. Introductioneral different companies, most of whom consider the technical specificationsof and interfaces to their products to be trade secrets; thus, it is infeasibleto construct models from first principles or in most cases even to interactwith the control and drive system except through existing user interfaces.To work around this constraint we construct a neural net model mappingjoystick motions to PWC motion.Given a map and a method to simulate the trajectory that will be gen-erated by a particular choice of input signal, there are many approaches toassess risk. In a fully autonomous system, this process is relatively straight-forward: choose an input signal which has a low risk of collision. In contrast,in a collaboratively controlled system the user?s (relatively unpredictable)input must be taken into account; therefore, the method for assessing riskmust consider both the user?s capabilities and the ways in which the sys-tem is willing to intervene if the risk becomes too great. Our CanWheelteam is currently running Wizard of Oz trials with our target population(cognitively impaired older adults) to better understand what type of inter-vention will be most effective at avoiding collisions and acceptable to theusers [7]. For the purposes of the experiments in this paper we propose astraw man risk assessment to demonstrate the technique. We expect thatwhatever scheme we design based on our trials would have to be modifiedfor populations with different diagnoses, and therefore we focus our effortshere on mapping and modeling algorithms on top of which a variety of riskassessments can easily be constructed.With that goal in mind, the contribution of this thesis is a systemthrough which a measure of the short-term risk of collision with nearbyobjects while driving forward under collaborative control can be evaluatedusing only a single, low-cost, discreetly mounted RGB-D camera and with-out detailed knowledge of or access to the PWC control system. This taskis accomplished by maintaining an egocentric probabilistic occupancy map(to overcome the camera?s narrow field of view) and constructing a neuralnetwork model of PWC motion (to simulate the response of the PWC tothe collaboratively chosen inputs). This map and model infrastructure caneasily be modified to handle additional sensors and/or different PWCs.We test the effectiveness of the system in detecting higher risk maneuverswhen turning into a doorway from a corridor. We consider a naive riskassessment metric to demonstrate the difference in risk for a safe trajectoryversus an unsafe one.31.1. Outline1.1 OutlineThis thesis presents a system which can construct a collision risk assessmentfor short-term trajectories that may steer the PWC into regions not currentlyvisible to the sensor system.We dedicate the rest of this chapter to explore some of the more recentwork done on smart wheelchairs, and the hardware and software platformthat was employed for development and testing of our system.? Chapter 2 contains the formulation and implementation details forconstruction of an egocentric local occupancy map. Brief discussionof relevant mapping techniques from the literature and the generalmathematical formulation of the update rules for measurements andobservations are also presented in chapter 2. The chapter concludeswith the employed software package(s), visualization tools, and demon-stration of real world results.? Chapter 3 describes the employed model construction procedures forforward movement simulation. The chapter elucidates the data col-lection and necessary pre-processing steps to use the designed model.Moreover, the quality of the model after optimization of the modelparameters is presented.? Chapter 4 discusses a number of possible risk assessment schemes thatcould be utilized on the presented system. The chapter presents asimple measure that will be used to assess the system in chapter 5.? Chapter 5 presents the effectiveness of the system in a scenario thatreplicates a common source of collision in the trials performed by [1].The last chapter is dedicated to concluding remarks and suggestions forfuture research.1.2 Related Work on Smart WheelchairsSeveral prototype intelligent wheelchair systems that attempt to compensatefor lack of cognitive capacity of the user for safe locomotion and navigationhave been developed over the years. Since [1, 8, 9] provide a thorough reviewof smart wheelchair system prototypes along with their employed range ofsensors, here we focus on relatively recent work.41.3. Hardware and Software Platform(a) PWC with the sensormounted underneath the base.(b) Close-up view of the RGB-Dsensor highlighted with a whiterectangle.Figure 1.2: Photos of our PWC.Every prototype equips PWCs with a range of sensor systems includingbumper skirts [10], sonars [11?13], infrared sensors [13], LRFs [13?17], stereovision cameras [1], and omnidirectional stereo vision cameras [18].In all of these systems the mounting location for the sensor(s) is chosento maximize effectiveness, rather than unobtrustive or aesthetically pleasinglocations. Broad consumer and payer acceptance of smart PWC productswill eventually demand a minimum of sensors placed in less than ideal lo-cations, and our focus here is on mechanisms to evaluate risk of collisiondespite the degradation in sensor coverage that implies.1.3 Hardware and Software PlatformOur PWC is a QuickieR? Rhythm modified by AT Sciences to include theirDrive Safe collision avoidance system [19] and custom rotary encoders on theelectric motor driveshafts that we use for odometry measurement. The onlycomponent of the Drive Safe system that we use in the research describedhere is the CANBus to USB interface, which allows us to read the odometersand the joystick and write joystick-like commands to the PWC?s motor51.3. Hardware and Software Platformcontroller (we have removed the non-embedded components of the DriveSafe software, as well as the infrared, sonar and bump sensor pods whichsurrounded the PWC). Our sensor is now an Asus Xtion Pro Live RGB-D camera connected to the laptop through USB. The camera is mountedalmost invisibly below the battery enclosure, looking forward from underthe footrests. Figure 1.2 shows both the PWC and a close-up identifyingthe location of the camera.Our PWC is equipped with a Lenovo W530 laptop with IntelR? CoreTMi7-3720QM processor, 8GB main memory and a 120GB solid state drive.All systems are implemented in ROS [20] (Fuerte Turtle release) on top ofan Ubuntu version 12.04LTS host OS. Specific software environment andpackage requirements for each component of the system will be discussed intheir respective chapter under the heading implementation details.6Chapter 2An Egocentric Local MapPerhaps one of the most crucial elements in developing a safer and moreflexible mobile robot lies in the robot?s ability to interact coherently withits surrounding, usually unstructured, environment. Intuitively, the abilityto interact with the environment suggests the ability to sense, plan, andreact to the stimuli inherent to the environment. For that, the robot mustbe able to perceive its surroundings using sensory information and to utilizeit to plan and react accordingly. For mobile robots this first stage is oftenachieved with the help of maps similar to the ?map? representation in thecallout in figure 2.1.Figure 2.1: Acquiring a map representation while navigating in a clutteredenvironment. As illustrated in the circular ?map? representation, the darkregions correspond to the areas where obstacles reside, white regions areempty areas, and occupancy information in the dark gray areas is unknown.(i.e., areas beyond the obstacles where no sensor data is available.)72.1. MotivationIn this work, we concern ourselves with the problem of acquiring themap representation for the purposes of obstacle detection and collision riskassessment. We present a probabilistic representation of an egocentric ob-stacle map obtained by processing range sensor data and movement readingswith incremental updates using Bayesian estimation procedures.2.1 MotivationThe aim of this work is to improve upon the free space detector in [1]. Thefiltering algorithm in [1] for free space detection is a simplistic procedurewhich only represents obstacles in the perceptual field of the sensor. Eachcell in the obstacle map takes a value between 0 and 255, where 0 repre-sents the white empty cells, as shown in figure 2.2, and 255 are the darkcells where an obstacle is believed to reside. After every sensor reading,each cell is updated by either adding a constant if an obstacle resides inthe cell or subtracting a constant if the sensor reports the absence of anobstacle. This approach, although computationally cheap, responds slowlyto dynamic obstacles and does not take into account the movement of therobot. For instance, the algorithm will lag in updating the map as the robotis moving towards a stationary obstacle. Another disadvantage is that theobstacle will appear in the obstacle map for a brief time after the obstaclehas been removed from the perceptual field of the sensor. Moreover, sincethe algorithm does not make use of movement data, the algorithm will haveno recollection of preceding scenes. For instance, the narrow field of view ofthe sensor in [1] and not incorporating odometry information led to a com-mon failure mode during the trials: The participant would drive the PWCparallel to a wall (for example, down a corridor) until it was close enough toa corner or doorway that the camera could no longer see the adjacent wall(see figure 1.1 for an illustration of this moment); if the participant thenmade too tight a turn the footrest or side of the PWC would hit the corneror doorframe.The proposed algorithm addresses the task of detection with more com-plex map updates relative to [1]. Specifically, the algorithm takes into ac-count dynamic obstacles in the sensor?s field of view and incorporates robotmovement to update the obstacle representation.82.2. Related WorkFigure 2.2: Occupancy grid representation of the filtering algorithm pre-sented in [1]. White cells denote empty spaces, solid black cells correspondto occupied spaces, and gray regions represent the area outside the sensor?sfield of view. Figure taken from [1, figure 3.5 c].To overcome and address the shortcoming we identified from [1], weadopted an egocentric obstacle map representation in polar coordinateswhere the origin of the map coordinate system is always centered on thecamera, as illustrated in figure 2.3. This implies that the egocentric maprequires that the entire map is updated with every incoming odometry mea-surement, but there is no need to maintain a localization estimate withinthe map. While updating the map is as much as or slightly more expensivethan updating a typical localization estimate, a significant computationalbenefit accrues during the simulation of future trajectories for risk assess-ment: There is no need to deal with uncertainty in the initial position fromwhich these trajectories start. We choose the polar coordinate system be-cause updates from the camera (or any range sensor at the origin) are cheapto apply to the map, since each reading spans along a ray in the map (asshown in figure 2.3), and movement updates are no more expensive thanthey would be in a rectangular coordinate system.2.2 Related WorkRobust mapping techniques for obstacle representation and robot navigationin unstructured and unknown environments have taken different forms overthe years. From edge detection methods [21] to Vector Field Histograms [22]to occupancy grid maps [6], each adopt a distinct obstacle representation92.3. Problem Formulationthat is well suited to their more primary tasks?navigation and obstacleavoidance in most cases.Edge-detection based obstacle detection and avoidance techniques wereone of the popular methods of their time. For avoidance, the system wouldeither steer the robot around the boundaries of the obstacle (see [21]) ortake a panoramic scan and apply an edge-based global path planner to planthe subsequent steering direction(see [23]).Occupancy grid maps are the most prominent method for spatial repre-sentation for mobile robots. They have been used in many different scenar-ios as local and global map representations for path planning and naviga-tion [24, 25]. There is a large body of work [26?28] on building occupancygrid maps while estimating the location of the robot within the map.The Vector Field Histogram (VFH) [22] method constructs a 1D his-togram obstacle polar density plot to select the best steering direction forthe robot. VFH is a reactive approach intended for autonomous control andrelies on the presence of a localization system to navigate [29].A notorious technique used by mobile robots for mapping is simultaneouslocalization and mapping (SLAM), which couples the task of mapping andlocalization (within that map). The depency of map estimate and posemakes the problem high-dimensional and complex [6].2.3 Problem FormulationIn this section we establish the basic mathematical notation for our mappingalgorithm.As mentioned in section 2.1, we represent the map in a polar coordinatesystem; with angle ? ? [?pi,+pi], where ? = 0 is directly in front of thePWC, and radial distance ? ? 0. The map is stored on a curvilinear gridgenerated by a tensor product of samples in ? and ? (the white, gray andblack dots in figure 2.4). The ? samples are equally spaced over the interval[?pi,+pi]. The ? samples are drawn from an interval [?min, ?max] where ?minis chosen such that all points ? < ?min would be within the PWC and ?maxis chosen based on the maximum range of the RGB-D camera. Furthermore,the spacing of samples in the ? direction grows quadratically, to match boththe degradation of accuracy of the camera?s depth readings as the distanceto obstacles grows as well as the decreased importance of distant obstaclesin the short-term risk assessment process.The algorithm for maintaining the map follows a fairly standard occu-pancy grid mapping approach [6] with minor modifications to account for102.3. Problem FormulationFigure 2.3: Illustration of an egocentric obstacle map representation in polarcoordinates. The setup exemplifies a case where the sensor is mounted nearthe center of the axis of rotation. Note that only three of the range readings(rays) are displayed for illustration purposes.limited sensor coverage and the fact that the egocentric map can change dueto either obstacle or PWC movement. Our goal is to estimate the probabil-ity of a particular cell of the map being occupied at the current time giventhe previous maps, sensor readings, and movements. Formally, this wouldtake the form of computing the following distribution:p(mt(?, ?)|m0:t?1, z1:t, u0:t) (2.1)where mt(?, ?) is the outcome that the map cell at point (?, ?) is occupiedat time t, m0:t?1 are all previous maps (for all values of (?, ?)), z1:t areall past observations, and u0:t are all odometry measurements. Assumingindependence of the observations and movements allows us to factorize theprobability distribution into a term dependent only on movements and aterm dependent only on observations:p(mt(?, ?)|m0:t?1, z1:t, u0:t)= p(mt(?, ?)|m0:t?1, z1:t) ? p(mt(?, ?)|m0:t?1, u0:t)(2.2)112.3. Problem Formulation(a) Polar map representation beforereaching the corner and while the adja-cent wall is still in the perceptual field.(b) Polar map representation where theadjacent wall is not in the field of view.Figure 2.4: Screenshot of the polar map with actual boundaries (walls)and field of view superimposed on top. White dots correspond to emptyspaces, gray dots correspond to unknown cells and darker dots correspondto occupied cells. Dark blue dots are the raw range readings.122.4. Map Update Based on Odometry Motion Estimation2.4 Map Update Based on Odometry MotionEstimationWe follow the approach from [6, chapter 5] for the motion update, exceptthat instead of updating an estimate of the PWC?s localization we are up-dating the map. We make a Markov assumption that the new map mt(?, ?)depends only on the last map mt?1 and the last measured odometry ut.Regardless of the accuracy of the rotary information obtained from the en-coders, the data is still prone to error as it does not distinguish slippage(or drift) from actual movements; thus, to take into account inaccuracy inthe measured odometry we introduce a new random variable u?t which is theactual odometry (if there were no slippage)p(mt(?, ?)|m0:t?1, u0:t)= p(mt(?, ?)|mt?1, ut),=?u?tp(mt(?, ?)|mt?1, ut, u?t) ? p(u?t|ut) ? p(ut),=?u?tp(mt(?, ?)|mt?1, u?t) ? p(u?t|ut),(2.3)where the final step is accomplished by observing that the measured odome-try ut is irrelevant to the map update once the actual odometry u?t is known,and p(ut) is 1 for the measured ut and zero otherwise. It is impractical toconsider all possible u?t, so we draw samples from a proposal distributionsimilar to that in [6, table 5.6], with slight modifications as outlined in Al-gorithm 1, where ?? represents the amount of rotation in the time interval(t? 1, t], and ?x and ?y represent the two-dimensional planar movement inthe same time interval relative to an arbitrary external coordinate frame.2.4.1 Movement updateAs elucidated previously, the egocentric map representation requires thatthe robot?s location remains in the center of the map; thus to incorporatemovements through space, the map is shifted according to the odometryinformation. Given the sampled actual odometry u?t representing the change(?x, ?y, ??) in the pose, one can use the relationships in (2.4) to transformthe change in position from cartesian coordinates into polar coordinates,where (??, ??) represents the offset (in polar coordinates) of the origin of132.4. Map Update Based on Odometry Motion EstimationAlgorithm 1 sample motion model odometry(ut = {?x, ?y, ??}):1: ?rot = ??2: ?trans =??x2 + ?y23: ??rot = ?rot ? Irot, where Irot ? N (0, ?1?rot + ?2?trans)4: ??trans = ?trans ? Itrans, where Itrans ? N (0, ?3?trans + ?4?rot)Given that ?1, ?2, ?3, and ?4 are robot-specificerror parameters.5: ?x? = ?x+ ??trans ? cos(?rot) where ?x? denotes the hypothesized actual x6: ?y? = ?y + ??trans ? sin(?rot)7: ??? = ?? + ??rot8: return: u?t = (?x?, ?y?, ???)the egocentric map due to the motion u?t.?? =??x2 + ?y2,?? = arctan(?y/?x)(2.4)Given the relative motion in polar coordinates, rotations are trivial as theysimply correspond to shifting elements of the angular coordinate. Transla-tions follow the formulas in (2.5) where (??, ??) represents the correspondingoffset of a general point (?, ?) in the map.?? =??2 + ??2 ? 2? ?? cos(?? ??),?? = arctan(? sin(?)? ?? sin(??)? cos(?)? ?? cos(??)) (2.5)Thus the update for u?t is formally given byp(mt(?, ?)|mt?1, u?t) = p(mt?1(??, ??)|m0:t?2, z1:t?1, u0:t?1) (2.6)where p(mt?1(??, ??)|m0:t?2, z1:t?1, u0:t?1) is taken from the egocentric mapat time t? 1 sampled on the grid.Implementation DetailsThe algorithm was initially simulated in Matlab on a proof of conceptbasis and then implemented in Python and Robotic Operating System142.5. Updating the Map based on the Observation(ROS) for the robotic wheelchair. We discuss a few issues that arose duringimplementation in the remainder of this section.Interpolation of the right hand side of (2.6) is required as it is highlyunlikely for (??, ??) to be a grid node. In the Matlab implementation, linearinterpolation proved too dissipative, while piecewise cubic splines createdringing artifacts near steep discontinuities in the probability functions. Inour experience, the ?cubic? scheme [30] in Matlab worked well. In thecase of two-dimensional interpolation, the cubic scheme fits a bicubic surfaceusing the values of the (sixteen) closest points.For the Python implementation, we used the Clough-Tocher piecewisecubic interpolant [31] (as implemented in the scipy library). The Clough-Tocher piecewise cubic interpolant is a curvature-minimizing interpolant.In our experiments, among the options available for 2D interpolation, thepiecewise cubic provided a reasonable tradeoff between efficiency, accuracyand smoothness.Because the map?s grid is finite, it is also necessary to provide boundaryconditions to evaluate (??, ??) that lie outside the grid. Periodic boundaryconditions are the clear choice for the ? direction. In the ? direction weset the probability of obstacles to zero for ?? < ?min, under the assumptionthat the hole in the center of the map is entirely occupied by the PWCand hence cannot contain obstacles. For ?? > ?max we use a constant priorprobability of occupancy pocc (the same prior as is used in the observationmodel below).We repeat (2.6) for each node (?, ?) in the grid to compute p(mt|mt?1, u?t)for a single sample u?t. To compute the sum in (2.3) we sample a smallnumber (eg: j = 5) of u?t, and assume equal p(u?t|ut) for all of them (eg:1/j = 0.2).Algorithm 2 describes the steps involved in updating the map based onthe relative odometry information. In practice, it would be a waste of com-putational power to account for motions that are less than the resolution ofthe map; hence, the function is only executed when the cumulative trans-lations and/or rotations since the last update are above a preset threshold(derived from the user specified map resolution).2.5 Updating the Map based on the ObservationWe adapt a standard occupancy grid mapping algorithm based on a binaryBayes filter [6, sections 4.2 and 9.2] to an egocentric map. Distributionp(mt(?, ?)|m0:t?1, z1:t) in (2.2) can be computed recursively by applying152.5. Updating the Map based on the ObservationAlgorithm 2 odometry motion update(mt?1, ut = {?x, ?y, ??}):1: if ?translations > thresholdtran ? ?rotations > thresholdrot then2: maptmp ? initialize J copies of map mt?13: for each sample j = 1 : J do4: for all map grid nodes do5: [u?(j)t ] = sample motion model odometry(ut) (Algorithm 1)6: ??(j), ??(j)) ? (?x?(j), ?y?(j), ???(j)) (2.4)7: (??(j), ??(j))? compute the coordinates in the old system (2.5)8: map(j)tmp ? interpolate occupancy information based on (??, ??)9: end for10: end for11: map = 1J?Jj=1 map(j)tmp12: end ifBayes? rule. Assuming that each observation is conditionally independentfrom previous observations and maps given the current map, we can writep(mt(?, ?)|m0:t?1, z1:t) =p(mt(?, ?)|zt)p(zt)p(mt(?, ?)|m0:t?1, z1:t?1)p(mt(?, ?))p(zt|mt?1).(2.7)In the remainder of this derivation, let mt = mt(?, ?) for notational sim-plicity. The standard binary Bayes filter assumes static state, but we wishto allow for moving obstacles so we assume a dynamic dependence betweenthe current and past maps of the formp(mt|m0:t?1, z1:t?1)= p(mt|mt?1)p(mt?1|m0:t?2, z1:t?1),(2.8)where p(mt|mt?1) = ptrans is a constant giving the probability that an oc-cupied map location at the last time remains occupied at the current time.We then transform into odds form by dividing p(mt|m0:t?1, z1:t) through byits complement 1?p(mt|m0:t?1, z1:t), and take the logarithm for mathemat-ical convenience. This transform conveniently cancels a few terms in (2.7)and, after incorporating (2.8), yields a simple and mostly additive recursiveupdatelt = log(p(mt|zt)1? p(mt|zt))? log(p(mt)1? p(mt))+ log(p(mt|mt?1)1? p(mt|mt?1))+ lt?1,(2.9)162.5. Updating the Map based on the Observationwhere we define the current log-odds aslt = log(p(mt|m0:t?1, z1:t)1? p(mt|m0:t?1, z1:t))and set the initial log-odds to bel0 = log(p(m0)1? p(m0))= log(pocc1? pocc). (2.10)The terms on the right hand side of (2.9) are based on, respectively, knowl-edge gained from the current sensor reading p(mt|zt) (commonly referred toas the sensor model), the prior probability of occupancy p(mt) = pocc, theprior probability of occupancy remaining static p(mt|mt?1) = ptrans, and theprevious log-odds lt?1. The prior probability of occupancy pocc is set to avalue between 0.2 and 0.5, depending on obstacle density [32]. The ptrans isthe transition probability of occupancy from t? 1 to t. In our experiments,a value between 0.9 to 0.95 seemed to worked well. Note that the priors poccand ptrans are assumed to be constant (in both time and space).For the sensor reading term p(mt|zt) we assume that the measurementzt provides data in the form of the range to the nearest obstacle for a largenumber of angles (compared with the resolution of our grid in the ? coordi-nate) within the field of view. We assume independence between columns ofthe grid (all elements with the same ?)?not an ideal assumption, althoughthe oversampling of the sensor offsets the issue somewhat. To perform theupdate on a column of the map, consider a fixed ? = ??. In order to reducethe effect of noise in the sensor data, the range estimate for this column r(??)is the median among all sensor range estimates coming from angles whosenearest neighbor among the columns of the map is ??. The sensor readingterm is then given byp(mt(?, ??)|zt) ={N (r(??), ?(r(??))), for ? ? ??;pocc otherwise;(2.11)where N (r, ?) is a normal distribution with mean r and standard deviation?, ?(r(??)) is chosen according to experimental measurement of the sensor?snoise (an increasing function with range for the Kinect, as given in [33]),?? is the first sample of ? in the grid such that ?? > r(??), and ? > ?? isthe unobserved region beyond the sensed obstacle. In [33] the standarddeviation varies based on where in the 2D image frame the data is comingfrom and the depth measurement being reported. In our implementations172.5. Updating the Map based on the Observationwe do not take into account variations across the 2D image frame. However,we do incorporate the variation in standard deviations with respect to depth,which increases quadratically with distance.To account for moving obstacles, cells beyond the obstacles in the fieldof view are driven towards the unknown likelihood, which is l0 in (2.10).The update rule is a simple exponential decay; where the time constant ischosen based on the rate at which sensory information is coming in. Effectsof this update are shown in figure 2.5. The scenario depicted in figure 2.5is where a pedestrian moves across the field of view from left to right andpauses momentarily in the middle of the field of view. As a result of thesimple update (i.e., the exponential decay), cells that are situated beyondthe obstacles are forced towards to l0, as illustrated in 2.5a and 2.5b. Mean-while, the cells where the pedestrian occupied in 2.5a are driven towards lowlikelihood (or free) in 2.5b.182.5. Updating the Map based on the Observation(a) Pedestrian enters thefield of view on left andwalks across. Cells be-yond the pedestrian areinitially white but are be-ing driven towards gray(unknown).(b) Pedestrian pauses infront of the sensor in themiddle of the field of view.Cells beyond the pedes-trian are driven towardsunknown. The regionwhere the pedestrian wassituated in the previousframe is now marked free.(c) Pedestrian stays infront of the sensor in themiddle of the field of view.Cells beyond the pedes-trian are turning darkgray.Figure 2.5: Performance of the algorithm to dynamic obstacles. Occupancyinformation for the gray circles (cells) is unknown, white cells are free anddark (black) cells are occupied. Green points are the sensor readings. Thescenario elucidates the response of the algorithm to dynamic obstacles suchas pedestrians. In this scenario, the pedestrian enters the field of view fromleft and walks across with a momentarily pauses half way through. The cellsbehind the PWC are white because the the chair was moved slightly forwardafter initializing the map.Implementation DetailsMap grid parameters such as the radial and angular resolution can to beuser-defined. For instance, it is likely that a reading with a narrow standarddeviation ? will not be noticed if the abscissa representing the radial coordi-nate (?) do not intersect the normal distribution in (2.11) near its peak. Toovercome the scenario where the radial coordinate is coarsely represented,we use the first-order finite difference of the normal cumulative distribution(cdf) such thatp(mt(?, ??)|zt) ={C(r(??) + h(?))? C(r(??)), for ? ? ??;pocc otherwise;192.6. Software Platformwhere h is the width of abscissa in ? and C(r) is the normal cumulativedistribution function at r. Similar to (2.11) the range estimate for eachcolumn r(??) is the median among all sensor range estimates coming fromangles whose nearest neighbor among the columns of the map is ??.2.6 Software PlatformThe model developed in the preceding sections was initially implemented ina simulator and then ported over to the hardware platform introduced insection 1.3.The simulation environment we used to initially test the algorithm wasAutonomous MObile Robots simulator (AMORsim) [34], designed for de-velopment and testing of localization, mapping, path planning and obstacleavoidance algorithms in Matlab. Figure 2.6 is a screen-shot of the simula-tor. The range sensor in the simulator was modified so that it emulates thesame noise and limitations as the RGB-D sensor that was to be employedon the PWC.Figure 2.6: Screen-shot of the AMORsim simulator.After development and testing of the algorithm in the simulator environ-ment, the algorithm was ported over to Robot Operating System (ROS) [20]as a Python node. In order to extract range data from the RGB-D sensor,202.7. Visualizationwe used the pointcloud to laserscan package in ROS that converts 3DPoint Cloud into a 2D laser scan. The movement update and sensor updateroutines make use of ROS?s spinning procedure, where each routine runs ona separate thread but the map (which is stored in the form of a row-majorarray) is protected via semaphores. We used Python?s threading modulefor locking.2.7 VisualizationIn order to visualize map updates in AMORsim, we used Matlab?s surfplots, which proved very helpful in debugging.Visualization in ROS was not as seamless. Due to the polar coordinatesystem, we were not able to use any of the built-in visualization routinesfor mapping. Instead, the maps are represented as Point Clouds where cellswith higher likelihood of being occupied are darker in colour and empty cellsare lighter. By default, gray points represent unknown or not yet observedregions, as shown in figure 2.7.2.8 Real World ResultsIn this section we present the performance of the developed algorithm. Re-sponse of the algorithm to dynamic obstacles was presented in section 2.5.Results of the movement update are illustrated in figure 2.7. The setup con-sists of the RGB-D sensor facing in the forward direction and a SOKUIKIUTM-30LX scanning LRF mounted to the side such that there is some over-lap between the field of view of the two sensors. Note that the LRF readingsare used merely for error evaluation, not map updates.The scenario in figure 2.7 depicts the situation where the PWC is drivendown the corridor shown in figure 1.1. The PWC makes a 90? counter clock-wise in-place turn once at the doorway, pauses momentarily and makes a90? clockwise in-place turn back (as shown in figure 2.7b) and proceeds mov-ing down the corridor (see figure 2.7c).The readings of the SOKUIKI sensorare displayed in red on the map and confirm the accuracy of the movementupdate. Further evaluation of the scheme is undertaken in chapter 5212.8. Real World Results(a) Initial frame withoutany movement updates.Cells in the field of viewof the RGB-D sensor areupdated according to thesensor update scheme de-scribed in section 2.5(b) The PWC moves for-ward and makes a 90?counter clockwise in-placeturn once at the doorway.After a short pause thePWC makes a 90? clock-wise turn back. The ob-stacles that are no longerin the field of view ofthe RGB-D sensor haveshifted according to themovement information.(c) The PWC movesslightly forward again.The wall beside the open-ing on the left is now com-pletely outside the field ofview of the RGB-D sen-sor. The occupancy in-formation reported hereafter the movement up-date is consistent with thereadings of the SOKUIKIsensor overlaid on top ofthe map.Figure 2.7: Illustration of updating the map using odometry information.Green points are the RGB-D sensor readings and red points are the mea-surements from a SOKUIKI scanning range finder mounted facing sidewayswith some overlap with the RGB-D sensor?s field of view. Dark circles (cells)correspond to occupied cells, white cells correspond to empty cells and oc-cupancy information in the gray region is unknown.22Chapter 3Non-Linear Control FunctionEstimationHigh-level modeling tools are useful in the development and simulation oflocomotion strategies, wayfinding algorithms and devising risk-assessmentcriteria, to name a few. In our application, this corresponds to modeling thedynamics of motion of a commercial wheelchair. Access to a reliable modelthat can predict the behavior of the robot in the immediate future is vitalto the generation, optimization and control of safe movements.As discussed previously in chapter 1, reverse engineering the controlscheme and system identification is infeasible for commercially availablePWCs. The motion dynamics of PWCs can not easily be deduced usingnaive Newtonian mechanics. Anybody who has driven a PWC knows thatthey have both significant inertia and a small but nontrivial delay whenresponding to joystick commands.The motivation for this work is the ?black box? problem. The black boxcan be described as a system whose transfer function models the desiredinput-output relations. Applications of the black box system ranges fromcontrol systems [35] to recognizing hand-written digits [36] and predictingenergy usage in buildings [37]. Perhaps a tangible example is Suddarth?s etal. [38] work, who utilized neural networks to control the thrust of a lunarlander using only the lander?s altitude, fuel and velocity as inputs while itdescends to the moon?s surface.We present a multi-layer feedforward neural network architecture forpredicting the position of (the wheels of) a differential drive mobile robot inthe immediate future (i.e., position at t+ 1) using eight input signals:(i) current joystick position vt,(ii) change in joystick position since last sample ?vt = vt?vt?1 (to accountfor control system delay),(iii) wheel rotation since last sample ut (a proxy for PWC velocity), and23Chapter 3. Non-Linear Control Function Estimation(iv) change in wheel rotation since last sample ?ut = ut?ut?1 (a proxy forPWC acceleration),where each joystick input has two components (forward/backward for linearmotion and left/right for angular motion) and each wheel rotation inputhas two components (left and right wheel). The output of our model isestimates of the left and right wheel rotation at the next sample time ut+1 =f(vt, ?vt, ut, ?ut). Given all eight inputs at time zero (v0, ?v0, u0, ?u0) andthe joystick positions at future times vt for t = 1, 2, . . ., the model can thenbe used to predict the wheel rotations at future times by feeding back theoutputs and appropriate finite differences to the inputs at each step; forexample,u1 = f(v0, ?v0, u0, ?u0)u2 = f(v1, (v1 ? v0), u1, (u1 ? u0))...The process of modeling such behavior can be characterized as solvingfor a direct (or forward) kinematic function [39]. Earlier attempts to modelthe PWC?s response to joystick inputs with standard linear, piecewise linearand linear autoregressive-moving-average (ARMA) models did not producesufficiently accurate results, so we turned our attention to nonlinear models.The motivating factor for employing neural networks is due to theirability to approximate an arbitrary mapping of one vector space (inputs)to another vector space (outputs) [40]. The notion that neural networksare universal approximators was demonstrated more than two decades agoin [41?43]. In particular, the claim surrounds multi-layer preceptrons with asingle intermediate layer. The proposition suggests that with enough hiddenunits, a single layer neural network can approximate any function definedon a compact domain [44]; hence the use of a multi-layer feedforward neuralnetwork with one hidden layer in this work. While there are advantages inusing neural networks for such applications, it should be noted that there aredrawbacks that might not qualify them for all applications of this nature. Itis not trivial to retrieve or extract structured knowledge from the network(i.e. they are ?non-parametric?). Moreover, should the circumstances ofthe problem change, such as changes in the number of input parameters ortheir biasing, then the network needs to be completely re-trained.In the remainder of this chapter we describe the data collection andpre-processing necessary to use the model we designed, the model and itsconstruction procedure, and the quality of the results. It should be notedthat we have used this procedure to construct two different models of the243.1. Data Collection and PreprocessingPWC for two different drive system parameter settings; the quality of themodel was similar in both cases.3.1 Data Collection and PreprocessingThe empirical data used for training and validation consists of roughly60 minutes of joystick commands v and the resulting odometry measure-ments u. All measurements were timestamped by ROS when they arrivedfrom the CANBus to USB device driver. Roughly half of that driving timewas on tiled surfaces and half on carpets. After temporal resampling ofu and v (described below) and computing the ?u and ?v inputs, the datapoints were randomly permuted to remove any potential bias due to tempo-ral ordering.Figure 3.1: Time deviations between samples for the left encoder (top) andright encoder (bottom).In the discrete time modeling paradigm that we adopted, all inputs mustbe delivered simultaneously. Unfortunately, the PWC platform delivers en-coder and joystick data at different rates. Encoder data arrives at 100msintervals, with much less than 1% timing jitter (see figure 3.1). Note thatalthough two distinct bands appear in time deviations for both the left and253.1. Data Collection and PreprocessingFigure 3.2: Time deviations between samples for the joystick commands.The distribution is bimodal with a large spread. Note that the subplot to theleft is the overall view of all samples? time deviation. The top right subplotis the zoomed view of the time deviations of the top band (near 210ms) andthe bottom right subplot is the zoomed view of the samples near the bandat 105ms.right encoder, we treat samples to be in a single band at the median sincethe jitter in time is minimal. The time between joystick data samples iseither around 105ms or 210ms (roughly equal chance of either), again withless than 1% jitter from these two possibilities, as shown in figure 3.2. Sim-ilar to the encoder data, although there are multiple bands in the zoomedviews (see top right and bottom right subplots of 3.2), notice that the jitterin time is less than 1% regardless. For the discrete time model, we arbitrar-ily decided to synchronize the input sequence with a fixed 200ms samplinginterval.For the training data, linear integration and interpolation are used toresample u and v into a time series with fixed 200ms period. Run-timeresampling is performed whenever a joystick sample vt arrives. Linear in-terpolation between joystick samples is used to estimate the effective vt?1(joystick position 200ms earlier) and thereby approximate ?vt (as shownin figure 3.3). Quadratic interpolation through the previous three encodersamples is used to estimate ut?1 and (by extrapolation) ut, and thereby ?ut263.1. Data Collection and PreprocessingFigure 3.3: Run-time resampling of the joystick (top) and one of the en-coders (bottom). To estimate vt?1 linear interpolation is used. Quadraticinterpolation through the previous three encoder samples is used to estimateut?1 and (by extrapolation) ut.(see figure 3.3). More specifically, since deviations of the encoder data areminimal from the 10Hz frequency, we used an interpolant with fixed abscissathat will be inexpensive to evaluate. Namely, we used a Lagrange basis setof the formBj(x) =n?i=0,i 6=j(x? xi)xj ? xi(3.1)where the three bases Bj can be precomputed ahead of time and stored forrun-time evaluation. Note that using linear interpolation during resamplingcorresponds to assuming that the PWC?s acceleration or the joystick?s veloc-ity is constant over the interpolation interval, which is not an unreasonable273.2. Neural Network Trainingassumption over a 200ms time span.The next stage in preprocessing the data is normalization. Data nor-malization, or any linear scaling operation that corrects for large dynamicranges in one or more dimensions, is a prominent step in input-output setupfor neural networks. It reduces the effect of large variations dominating sub-tle trends in the data and reduces the likelihood of underflows and overflows.Hence, the input joystick commands were remapped between [?1, 1] sincethe minimum and maximum ranges are known. Encoder data was normal-ized to a zero mean and unit standard deviation to preserve outliers, suchthatx? =x? x??xwhere x? is the standardized data, x? is the mean of the original data of x and?x is the standard deviation of x. Based on our experience, normalizationspeeds up the training phase (i.e., faster convergence), and prevents inputswith large scales from biasing the solution.3.1.1 Data SetsOf the 60 minutes of collected data, 30 minutes labeled as the training setwas used for inner loop training (e.g.,: weight adjustments in the backprop-agation approach), 15 minutes labeled as the optimization set was used inthe outer loop training to optimize network parameters, and 15 minuteslabeled as the validation set was reserved for validation. Note that onlythe samples in the training and optimization sets were randomly permuted.Samples in the validation set are from a continuous interval and are kept intheir temporal ordering.3.2 Neural Network TrainingA multi-layer feedforward neural network is characterized by an input layer,an output layer and any number of intermediate layers, conventionally re-ferred to as hidden layer(s).Training of a neural network using the supervised learning paradigm canbe thought of as modeling a function f(x) that maps input X = {xn}Nn=1to target values Y = {yn}Nn=1, such that f(xn; ?) ? yi. Consider a neuralnetwork with one hidden layer, consisting of J hidden units and I inputunits; then the output of the kth neuron (out of K output units) would be283.2. Neural Network Trainingof the form:fk(x) =J?j=1?jkaj(x) + bkwhere ?jk is the weight of the connection between hidden unit j to outputunit k, and bk is the bias on the output layer. The aj(x) function is usually abounded, non-linear and differentiable function, referred to as the activationfunction. In this work we make use of the hyperbolic tangent function asthe activation function.aj(x) = tanh(I?i=1?ijxi + bj)where ?ij is the weight of the connection between input unit i and hiddenunit j, and bj is the bias on the hidden layer. Discussions surrounding thechoice of activation function are beyond the scope of this thesis and will notbe addressed; nevertheless, it is noteworthy to mention that using a linearactivation function would severely restrict the functionality of the hiddenunits and the resulting network would not differ from a network with directconnections from input layer to output layer.Regardless of the learning paradigm that the network is trained in, theobjective of the training step is to determine the weights and biases thatminimize a cost function.The frequentist and Bayesian approaches were tested for training theneural network. The conventional frequentist approach believes that thereis a single set of weights that map between the inputs and outputs, whereasthe Bayesian approach believes a probability distribution over the weightsis appropriate.3.2.1 The Frequentist ApproachThe frequentist learning approach is based on adjusting the weights of thenetwork according to a learning policy, given a training set of input-outputpairs. The procedure of the training step is a gradient-based optimizationprocess where the objective is to minimize some measure of error or costfunction. Conventionally, the cost function is defined as the mean squareerror (MSE) to quantify a measure of difference between the model?s output,f , and the target value,y. Total error, E, over all N training samples wouldbe of the form:E(?) = 1/2?i?k(fk(x)? yk)2 (3.2)293.2. Neural Network TrainingThe approach of using derivatives of the error function with respect to theweights was introduced by Rumelhart, Hinton, and Williams in [45]. In orderto improve the convergence speed of gradient descent, a scaled conjugategradient search strategy [46] was employed for adjusting the weights.Detailed derivation of the rules for steepest descent on the error surfacehave been thoroughly described in [45]; thus, it would be redundant to repeatthem here. Briefly, the update rule of the gradient descent method is:? ? ? ? ??E(?)where ? is the learning rate. Generally, ? is tuned to reduce the effect ofoverfitting and underfitting. Finding a suitable learning rate is however nottrivial. A common practice to identify an appropriate learning rate is crossvalidation [47], where ? is optimized based on the performance of the modelusing an ?optimization? set, different from that of the training set.Scaled Conjugate GradientsGradient descent using backpropagation does not necessarily produce thefastest convergence [48] since it uses one fixed step size. Conjugate gradientmethods [49] use conjugate directions instead of the local gradient informa-tion. Formally, each learning iteration, conjugate gradient methods use linesearch to find ? that minimizes E for fixed values of ? and ?E. M?ller pro-posed the scaled conjugate gradient (SCG) [46], which would scale the stepsize based on error reduction and quadratic approximation of the likelihoodto avoid performing a line-search at each learning iteration. Specifically, ?at iteration n is calculated as,?(n) = 2(?(n)? P TH(n)P + ?(n)?P?2?P?)2where H is the Hessian of the gradient and P is the conjugate system. SCGhas been one of the most widely used iterative methods for solving linearequations [50] and provides significant speed ups. In fact, it outperformsconjugate gradients and quasi-Newton algorithms for small gradient evalu-ations [51].PredictionLastly, the prediction step using the frequentist approach is simply eval-uation of fk(x(n+1); ?optimum), where ?optimum is a single set of optimumweights that minimize the cost function in (3.2).303.2. Neural Network TrainingNote that the network parameters?the number of hidden units, the co-efficient of the weight-decay prior ? and the number of epochs?were deter-mined through an outer Bayesian optimization loop, which will be discussedin section 3.3.3.2.2 The Bayesian ApproachThe Bayesian framework can be distinguished by its ability to 1) incorporateprior knowledge to propose a prior probability distribution for model param-eters, and 2) express future observations in terms of a probability distribu-tion over all unknown quantities. Suppose the likelihood of our feedforwardneural network is: p(yi|?, xi), where yi is the target of the network definedby fk(xi, ?). And suppose the prior distribution of the weights, ?, are p(?).Then, via Bayesian inference, the posterior distribution of the weights are:p(?|x1:n, y1:n) =p(y1:n|?, x1:n)p(?|x1:n)p(y1:n|x1:n)=p(y1:n|?, x1:n)p(?)?p(y1:n|x1:n, ?)p(?)d(?)(3.3)for n data points. Predicting the target value yn+1 given the new input(s)xn+1 would be of the form,p(yn+1|xn+1,(x1, y1), . . . , (xn, yn))=?p(yn+1|xn+1, ?)p(?|(x1, y1), . . . , (xn, yn))d?(3.4)The high-dimensional integral in (3.4) is not analytically tractable andis computationally complex to approximate for large number of parame-ters [52].In the interest of computational tractability, work in [53] and [54] pro-posed using Gaussian approximations for the posterior distribution of thenetwork parameters (weights and biases). Such approximation techniques,however, fail to provide satisfactory results in training complex models [44].Moreover, assuming the form of the posterior distribution may not be a validassumption in all applications.Standard Metropolis-type methods are a more flexible approach as theysample from the posterior distribution of the weights. Using Metropolisalgorithm [55] we can approximate (3.4) using S samples asp(yn+1|xn+1, (x1, y1), . . . , (xn, yn)) ?1SS?t=1p(yn+1|xn+1, ?t) (3.5)313.2. Neural Network Trainingwhere as the limit S increases the approximation converges to the true valueand p(?|(x1, y1), . . . , (xn, yn)) is the stationary distribution of the algorithm.This however is not feasible for complex networks and is generally veryslow [44]. Hamiltonian (or Hybrid) Monte Carlo (HMC) [56] method is aform of Metropolis algorithm. The advantage of using an adaptive techniquelike HMC is that instead of random proposal for moving particle in the stan-dard Metropolis, it uses Hamiltonian equations to propose new positions.In this work we make use of an adaptive technique similar to the workin [57]. We employ HMC for sampling from the weight?s posterior distribu-tion and Bayesian regression for the prediction step. Network parametersand HMC?s hyperparameters were determined through an outer Bayesianoptimization loop with Expected Improvement (EI) as the acquisition func-tion [58]. Results and discussions surrounding the outer loop Bayesian op-timization will be covered in section 3.3.Hamiltonian Monte CarloThe disadvantage of using the standard Metropolis algorithm to evaluatethe integral in (3.4) is the random walk. In high-dimensional weight space,generating proposal states by randomly perturbing all weights is problematicas it is very likely that the proposed direction of motion will be rejected (bythe Metropolis algorithm). In contrast, the HMC algorithm makes use ofthe gradient information (from backpropagation network) to drive changesin weights in directions that will have a high probability of being accepted.Formally, HMC drives proposals similar to a physical system. It modelsthe motion of the samples via Hamiltonian dynamics of motion:H(q,p) = U(q) +K(p)where U(q) and K(p) are the potential energy and kinetic energy terms,respectively. The potential energy term can be observed by:U(q) = ? log p(q)? log(z) (3.6)where p(q) is the distribution we want to sample from and z is any constant.Note that q is the ?position? vector (as in Metropolis algorithm) and p isthe ?momentum? vector. In case of our network, q corresponds to the setof network weights ?. Thus, (3.6) is simplyU(?) = ? log p(?|(x1, y1), . . . , (xn, yn))? log(z)Exploring in HMC takes the form of a deterministic move and a stochas-tic move. The former is to explore regions over with H is constant and the323.2. Neural Network Traininglatter is to explore regions with different values of H. The deterministicmoves follow Hamiltonian equations for motion where the derivatives of qand p are computed with respect to a ?time? variable ? such thatdqd?=?H?p= p anddpd?= ??H?q= ??U(q)In practice, however, Hamiltonian dynamics can merely be approximated bydiscretization in finite time steps. Leapfrog [59] is the most commonly usedmethod for this approximation [44]. In [56] stochastic movements are drawnfrom (3.7) though any Metropolis move that changes H is permissible.p(K(p)) = (2pi)D2 exp(?K(p)) (3.7)where K(p) = 1/2|p|2 and D is the dimension of ?.Given the dynamics of motion, HMC can be summarized as below for Snumber of samples.1. Randomly pick a direction ? for the trajectory to simulate; where? = +1 (forward trajectory) and ? = ?1 (backward trajectory) areequally likely.2. Leapfrog: given the current (pt,qt) take L steps with step size as:p(? +2) = p(?)? ?2?U(q(?))q(? + ) = q(?)? p(? +2)p(? + ) = p(? +2)? ?2?U(q(? + ))resulting in candidate (p?t+1, q?t+1)3. Accept or reject (p?t+1, q?t+1), as in Metropolis algorithm, by(p?t+1, q?t+1) ={(p?t+1, q?t+1), if U < H(p?t+1,qt)?H(pt,qt)(p?t+1, q?t+1) otherwise;for U sampled uniformly from [0, 1).The number of steps L is the number of iterations to arrive at the can-didate (p?t+1, q?t+1). Small values for L will impede the algorithm fromexploring and setting too large of a value for L will increase the number ofsamples that will be rejected and will reduce the number of stochastic moves.Aside from the network parameters, number of samples S, number of stepsL and step size are optimized by the outer-loop optimization scheme aspresented in section 3.3.2.333.3. Outer-Loop Optimization3.3 Outer-Loop OptimizationThe goal of an optimizer is to find the solution that minimizes some ob-jective function; however, this task can become computationally expensivedepending on the complexity of the function and number of dimensions [60].3.3.1 Bayesian Bandit OptimizationBayesian optimization is an elegant framework for optimizing cost that doesnot require gradient information about the function and is well suited forexpensive functions [57].Bayesian optimization can be characterized by its use of 1) a kernelmatrix for the Gaussian process prior and 2) an acquisition function for de-termining a policy of exploitation and exploration of the error surface. Forthe purposes of this project, we used the automatic relevance determination(ARD) Mate`rn 5/2 kernel, which is a widely used kernel for practical opti-mization problems [58]. The acquisition function can be described as thepolicy that decides the next point that should be evaluated, with a tradeoff between exploitation and exploration. The acquisition function that weemployed in this optimization process is the Expected Improvement, whichpicks the minimum between the next function evaluation and the currentbest function value.3.3.2 Optimization ResultsThe measure of performance that we took into account for optimization isthe MSE of the output neurons from the target values. MSE was computedon the optimization set, different from the training and validation sets.For the SCG method, we optimized the model based on the numberof hidden neurons, coefficient of weight decay prior (?), and the epochs(the number of iterations over the data set in order to train the network).Table 3.1 lists these parameters along with the ranges over which they wereset to optimize and the returned optimal values.For the Bayesian framework, we optimized over the number of hiddenneurons, number of samples to be drawn, number of steps (L), step size (),and coefficient of weight decay prior (?). Likewise, table 3.2 lists the pa-rameters, their ranges, and the returned optimal values for the probabilisticmodel.343.4. Model Validation and PerformanceParameters Range Optimal valueNumber of hidden units [4, 20] 19Coefficient of weight-decay prior [0, 1] 0.0125Number of epochs [1, 1000] 781Table 3.1: Parameters of the network trained via the scaled conjugate gra-dient approach, the ranges over which Bayesian optimization was performedalong with the achieved optimal values after 15 iterations.Parameters Range Optimal valueNumber of hidden units [4, 20] 17Coefficient of weight-decay prior [0, 1] 0.3375Number of samples [100, 1000] 1000Number of steps := L [10, 100] 91Step size := [0, 1] 0.0011Table 3.2: Parameters of the network trained via Hamiltonian Monte Carlo,their ranges over which Bayesian optimization was performed along with theachieved optimal values after 68 iterations.3.4 Model Validation and PerformanceThe models? performance metric is the MSE. The overall MSE over theoptimization data set (see section 3.1.1) for the network trained by SCGand HMC are 0.03695 and 0.05124, respectively, which are based on runningthe Bayesian optimization algorithm over 15 iterations on the frequentistmodel and 68 iterations on the probabilistic model. The performance ofthe models on the validation data set and run-time extrapolation schemediscussed in 3.1?are comparable to the performance of the models on theresampled data. The MSE over the validation data set for the networktrained by SCG and HMC are 0.03887 and 0.05029, respectively.A comparison of the predictive capability of the models with respectto the actual target values is plotted in figures 3.4 and 3.5 for 20 randomtime samples from the validation set. Notice that the predicted outputs infigure 3.4 contain no indication of uncertainty whereas predicted outputsin figure 3.4 are bounded by error bars. The error bars are the result ofBayesian approach to prediction that it does not use a single best set ofnetwork weights.353.5. Implementation DetailsFigure 3.4: Predictive capability of the model trained via SCG.Plots of the relative error of the models? output with respect to theactual target values are shown in figures 3.6 and 3.7. The resulting errorswere unbiased (centered at zero). The mean error magnitude is less than15% for both the deterministic and probabilistic model.The minimum MSE over the number of iterations for the Expected Im-provement and a random grid search procedure are illustrated in figure 3.8.Optimizing the parameters of the deterministic model (see: 3.8, top) viaEI as the acquisition function proved to outperform the randomized gridsearch. Likewise, EI outperformed Randomized search greatly in optimizingmodel parameters for the probabilistic model (see: 3.8, bottom).The performance of the probabilistic approach (via HMC) proved similarto the deterministic (via SCG) backpropagation (likely because the problemis not particularly high dimensional); thus, for the purposes of this work weused the simpler deterministic model.3.5 Implementation DetailsThe process for training the model via SCG and HMC takes about 4 hoursand 1.6 hours, respectively, and on an IntelR? CoreTMi5-660 CPU with 4GB363.5. Implementation DetailsFigure 3.5: Predictive capability of the model trained via HMC algorithm.Figure 3.6: Distribution of relative error of the model trained via SCG.373.5. Implementation DetailsFigure 3.7: Distribution of relative error of the model trained via the HMCalgorithm.main memory. We used NetLab?s Matlab toolbox to set up the neuralnetwork?s architecture and for the SCG and HMC procedure. The Bayesianoptimization step was done using the ?Spearmint? package from [58].383.5. Implementation DetailsFigure 3.8: Comparison of expected improvement and random grid searchas the acquisition function for optimizing the parameters of the deterministicmodel (top) and probabilistic model (bottom).39Chapter 4Risk AssessmentBecause the appropriate measure of and response to risk in a collaborativelycontrolled WC may be diagnosis and even user specific, our goal in this workis not to propose a particular approach to risk assessment, but rather toconstruct components from which a variety of assessments can be built. Inthis chapter we propose some ways in which such assessments can draw onthe information in the model and map described earlier.The fundamental tool for constructing a risk assessment in this frame-work is a simulation of the WC?s future motion x(?) through the local mapgiven a proposed future joystick signal v(?), where for convenience we willassume that both functions are discrete time with the same sampling periodas that used to construct the model in chapter 3 (eg: 200 ms). The WC hasnontrivial size, so at each sample of the simulation it will overlap with somesubset of the map?s grid cells, each of which has some likelihood of contain-ing an obstacle. The design of a risk assessment therefore has a number ofparameters to determine, including:(i) simulation horizon,(ii) number of times to evaluate risk over that horizon (could be as oftenas every 200 ms, although the WC will not typically move very muchover such a short period),(iii) how to account for the shape of the WC,(iv) number of future joystick signals to consider, and(v) what future joystick signals to consider.Depending on available interventions and how the risk assessment will beused in these interventions, the form of the risk assessment might vary froma separate assessment for each possible input, a time dependent assessment,or perhaps just a single overall value. It is fairly clear that the reportedrisk should not decrease if the WC overlaps with a grid cell which is morelikely to be occupied, if more of the WC overlaps with obstacles, if collision40Chapter 4. Risk Assessmentis reported for more input signals and/or if a collision will occur sooner;however, these monotonicity constraints still leave considerable flexibilityin choosing aggregation functions when more compact risk assessments aredesired. Some examples of risk assessments which result in a single number:? Greatest (log) likelihood of collision:risk = maxv0:tmax?1?Vmax0?t?tmaxmaxx??W(xt)lt(x?) (4.1)where V is a set of input signals, tmax is the time horizon, W(x(t))is the set of all points inside the WC when it is at point x(t), andlt(x?) is the log-odds of point x? (or its nearest neighbour in the grid)being occupied by an obstacle according to the map at time t (asconstructed in chapter 2). Typically only a subset of the time interval[0, tmax] would actually be tested, andW(x(t)) would be approximatedby a sampling of the WC?s shape.? User weighted, time discounted likelihood of collision:risk =?v(?)?Vtmax?t=0?t p(v(?)) maxx??W(x(t))exp(lt(x?))where V is a discrete distribution of input signals approximating theuser?s future behaviour, 0 < ? < 1 is a discount factor, and p(v(?)) isthe probability of input signal v(?).? Time to collision:risk = argmint?[0,tmax][?y?(y, t,W(x(t))) lt(y) dy > l?]where future state uncertainty is modelled by a spreading kernel func-tion ? rather than a sampling of input signals: only a single v(?) isused, but ?(y, t,W ) is a time dependent kernel function over y (eg:?y ?(y, t,W )dy = 1 for all t and any W ) which starts at time t = 0as an appropriately scaled indictator function for set W and gradu-ally spreads out as t grows. In this case the risk is the first time atwhich the likelihood of collision (as measured by the integral) exceedsa threshold (controlled by the choice of l?). In practice, the integralwould be approximated by quadrature or sampling.41Chapter 4. Risk AssessmentIn the next chapter we demonstrate our system using a version of the firstrisk assessment above, but there are many more possibilities. A Bayes? riskmodel [61], for example, have been used for robotic task execution under un-certainty. Domain-specific risk evaluation measures have also been appliedto ensure task safety in human-robot dialog [62]. Althoff et al. [63] proposeda probabilistic model for safety assessment in dynamic environments underan imperfect sensing model.42Chapter 5Experimental ResultsWe test the effectiveness of the described system in two scenarios. In thefirst, the system attempts to detect higher risk maneuvers when the userturns into a doorway from a corridor, a scenario meant to replicate a commonsource of collision observed in the trials described in [1] (as discussed insection 1). In the second, we demonstrate the effectiveness of the system asthe user drives an extended trajectory through a cluttered environment whilemaneuvering the PWC in all directions. Note that the system is passivelyevaluating the risk metric in all of these experiments; the user (the author)is always in full control of the motion.For the first scenario, the experimental setup uses a 914mm (36in) widesimulated doorway (a common width in North American public buildings).It should be noted that our PWC is 655mm wide and that health carefacilities typically mandate wider doors?for example, 1168mm (46in) inthe province of Ontario [64]?so our doorway poses a moderate degree ofcollision risk simply because of the relatively tight fit.As shown in Figure 5.1, we perform four experiments:(i) a safe slow turn,(ii) an unsafe (early) slow turn,(iii) a safe fast turn, and(iv) an unsafe (early) fast turn.In every experiment the PWC starts from rest approximately 2.3m downthe corridor, and the user attempts to drive it parallel to the wall untilthe turn is initiated. For the ?slow? trajectories, the speed of the PWC iscapped at 50% of the maximum speed (?0.4m/s) and the initial portion ofthe trajectory is ?30cm from the corridor wall. For the ?fast? trajectoriesthe speed is 70?90% of maximum (?0.7m/s) and the distance from thecorridor wall is ?64cm. A turning point was identified on the floor for eachof the ?unsafe? trajectories such that the PWC would impact about 7.5cmbefore the corner, although the PWC was stopped just before impact. For43Chapter 5. Experimental ResultsFigure 5.1: Experimental setup and the four tested trajectories.the ?safe? trajectories the turn was initiated approximately 15cm furtheralong the corridor, so the PWC passed no closer than about 7.5cm from thecorner, and was stopped after it had passed through the doorway. The rightside of figure 2.4 shows a typical map status just before turning begins; thecorner is not within the field of view of the camera and remains outside ofit throughout the turn.We report the greatest log-odds of collision risk metric (4.1). The horizonwas chosen as tmax = 1.4s. The set of future input signals V consideredincluded only one: that the joystick was held constant at its current position.The PWC shape W(x(t)) is represented by 17 samples (5 along each of thelong edges, 3 along the short edges and 1 in the center of a rectangularbounding box). The risk is recomputed every time a new joystick inputarrives.Each of the four experiments was repeated 10 times. Figures 5.2 and 5.3show the time evolution of the risk metric (4.1) during each run of eachexperiment (time shifted so that the peak of every run for a single exper-iment is aligned). The scaling of the vertical axis is essentially arbitraryand the values of the metric have been capped to ?10, so we are primarilyinterested in comparing the value of the metric between experiments rather44Chapter 5. Experimental Results?2 0 2 4 6 8 10 12 14 16 18?10?50510Safe trajectory ?? slowrisk measure?2 0 2 4 6 8 10?10?50510Unsafe trajectory ?? slowsimulated timerisk measureFigure 5.2: Safe (top) and unsafe (bottom) trajectories in slow-speed mode.than drawing conclusions from its specific value.All of the experimental runs follow the same basic pattern. When thesystem is first turned on at the start of each run, the risk metric hoversaround zero until the sensors have time to determine that there is free spacein front of the chair; we will consider this the ?baseline? risk of drivingthrough a region whose obstacles are unknown. At that point the metricdrops significantly as the chair moves safely along the corridor parallel tothe wall. When the chair begins its turn the metric rises quickly toward apeak, and then falls again when the chair is stopped at the end of the run.The distinction between the four experiments lies in the height of thepeak as the chair makes its turn. Consider first the ?slow? experimentsshown in figure 5.2. In the ?slow unsafe? case on the bottom, all runs showthe risk metric rising significantly above the baseline risk detected at thebeginning of the run, indicating that a collision is a significant threat. Incontrast, for the ?slow safe? case on the top all runs show a distinctly lowerpeak. The fact that the chair is making a blind turn into a narrow doorway45Chapter 5. Experimental Results0 2 4 6 8 10 12 14?10?50510Safe trajectory ?? fastrisk measure0 1 2 3 4 5 6 7?10?50510Unsafe trajectory ?? fastsimulated timerisk measureFigure 5.3: Safe (top) and unsafe (bottom) trajectories in fast-speed mode.explains why these peaks are still at or slightly above the baseline risk atthe beginning of the run: toward the end of the run, the chair is againpredicting motion through unknown portions of the map. Now consider the?fast? experiments shown in figure 5.3. In the ?fast unsafe? case, all runsshow high peaks similar to those in the ?slow unsafe? case, again indicatingthat the threat of collision is significant. In the ?fast safe? case the range ofpeaks displayed by different runs is much wider?everything from baselineto the top of the scale?however, we are not terribly concerned that the riskmay be overestimated in this case because most PWC users would considera 90? turn at nearly full speed around a blind corner to be a somewhatdubious maneuver even if they were certain of avoiding the corner itself.46Chapter 5. Experimental Results! !ABCD EF G HIJKLMNFigure 5.4: Occupancy-grid map of the environment navigated during thelong-run trial.In the second scenario the PWC is driven in a typical office environment,as depicted by the occupancy-grid map in Figure 5.4 (generated offline),along the path shown in colors and with arrows indicating wheelchair head-ing. The green lines indicate areas evaluated as not risky, yellow indicatesareas of moderate risk, and red indicates areas of high risk along the path.The time-series plot of risk assessment along the path is shown in Figure 5.5.47Chapter 5. Experimental Results0 20 40 60 80 100?10?50510sim u la ted?tim erisk?measureABC DEF G HI JKLM NFigure 5.5: Time-series plot of risk assessment during the long-run trialsshown in Figure 5.4.We have added corresponding annotations (alphabetically ordered in timealong the trajectory) to both figures to denote specific areas of interest. Ascan be seen, while travelling close to obstacles (such as at G and L) orthrough narrow doorways (such as between B and C), risk assessment ishigh. Of particular interest is the section of the map where the wheelchairperforms a ?backing up? maneuver between H and I with the wall directlybehind it. This wall was detected by the camera and incorporated into theegocentric map while the chair was facing toward it between F and G; how-ever, it is behind the chair and hence invisible to the camera while the chairis backing up between H and I. In spite of the obstacle being in the sensor?sblindspot, the risk assessment goes from low to high as the chair approachesthe wall while backing up after H, and then drops again when the chair stopsbacking at I in preparation for forward motion toward J. This reaction ofthe risk metric demonstrates the utility of our egocentric map in detectingpotential collisions with previously observed obstacles now lying outside thecamera?s narrow field of view.48Chapter 6ConclusionsIn this thesis we discuss the construction of a dynamic egocentric occupancymap that is capable of maintaining information about local obstacles evenwhen they are outside the field of view of the sensor. We construct a neuralnetwork model using the frequentist and Bayesian approach to map joystickinputs and wheelchair motion. We demonstrated that using the egocentricmap and model infrastructure, we can evaluate a variety of risk assessmentmetrics for collaborative control of a smart wheelchair. One such metric isdemonstrated on a wheelchair with a single RGB-D camera in a doorwaytraversal scenario where the near edge of the doorframe is no longer visibleto the camera as the chair makes its turn.6.1 Suggestions for Future ResearchWhile there are approaches by which one can improve the mapping tech-nique and modeling scheme, we dedicate this section to discussions of furtherimprovement towards a smart PWC.We demonstrated a naive risk assessment measure in which it only con-sidered greatest log-odds of collision. Some future works can be dedicatedto exploring (experimentally) the various risk assessment schemes similarto the schemes presented in chapter 4. While running experiments with(some) target population is a large undertaking, the process of designingthe experiments is no less and requires great attention.As explained in chapter 5, the only future input signal that was consid-ered was based on the assumption that the joystick will be held constant atits current position. This is indeed a naive assumption. One could however?learn? future joystick movements from data. This can be a user specificmodel that is based on the driving habits of the user or based on data frompopulation.Since generally joystick commands reside in polar coordinates one couldemploy the properties of the egocentric polar map representation presentedhere to first solve for the configuration space in the obstacle map and thenrestrict joystick movements accordingly.49Bibliography[1] P. Viswanathan, ?Navigation and obstacle avoidance help (NOAH) forelderly wheelchair users with cognitive impairment in long-term care,?Ph.D. dissertation, University of British Columbia, 2012.[2] L. Fehr, W. E. Langbein, and S. B. Skaar, ?Adequacy of powerwheelchair control interfaces for persons with severe disabilities: A clin-ical survey,? Development, vol. 37, no. 3, pp. 353?360, 2000.[3] R. C. Simpson, E. F. LoPresti, and R. A. Cooper, ?How many peo-ple would benefit from a smart wheelchair?? Journal of RehabilitationResearch and Development, vol. 45, no. 1, pp. 53?71, 2008.[4] G. Pullin, Design Meets Disability. MIT Press, 2009.[5] J. Miller Polgar, ?The myth of neutral technology,? in Design and Useof Assistive Technology, M. M. K. Oishi, I. M. Mitchell, and H. F. M.Van der Loos, Eds. Springer, 2010, pp. 17?23.[6] S. Thrun, W. Burgard, and D. Fox, Probabilistic robotics. MIT Press,2005.[7] P. Viswanathan, R. Wang, and A. Mihailidis, ?Wizard-of-Oz andmixed-methods studies to inform intelligent wheelchair design for olderadults with dementia,? in Association for the Advancement of AssistiveTechnology in Europe, 2013.[8] R. C. Simpson, ?Smart wheelchairs: A literature review.? Journal ofrehabilitation research and development, vol. 42, no. 4, p. 423, 2005.[9] B. M. Faria, L. P. Reis, N. Lau, J. C. Soares, and S. Vasconcelos,?Patient classification and automatic configuration of an intelligentwheelchair,? in Agents and Artificial Intelligence. Springer, 2013, pp.268?282.50Bibliography[10] R. H. Wang, S. M. Gorski, P. J. Holliday, and G. R. Fernie, ?Evaluationof a contact sensor skirt for an anti-collision power wheelchair for olderadult nursing home residents with dementia: Safety and mobility,? As-sistive Technology, vol. 23, no. 3, pp. 117?134, 2011.[11] A. C. Balcells and J. A. Gonz, ?TetraNauta: A wheelchair controllerfor users with very severe mobility restrictions,? in Proc. 3rd TIDECongress, 1998, pp. 336?341.[12] G. Bourhis, O. Horn, O. Habert, and A. Pruski, ?An autonomous ve-hicle for people with motor disabilities,? Robotics & Automation Mag-azine, IEEE, vol. 8, no. 1, pp. 20?28, 2001.[13] E. Prassler, J. Scholz, and P. Fiorini, ?A robotics wheelchair forcrowded public environment,? Robotics & Automation Magazine,IEEE, vol. 8, no. 1, pp. 38?45, 2001.[14] E. B. Vander Poorten, E. Demeester, E. Reekmans, J. Philips,A. Huntemann, and J. De Schutter, ?Powered wheelchair navigation as-sistance through kinematically correct environmental haptic feedback,?in IEEE Int. Conf. on Robotics and Automation (ICRA), 2012, pp.3706?3712.[15] Y. Wang and W. Chen, ?Hybrid map-based navigation for intelligentwheelchair,? in IEEE Int. Conf. on Robotics and Automation (ICRA),2011, pp. 637?642.[16] G. Peinado, C. Urdiales, J. Peula, M. Fdez-Carmona, R. Annicchiarico,F. Sandoval, and C. Caltagirone, ?Navigation skills based profiling forcollaborative wheelchair control,? in IEEE Int. Conf. on Robotics andAutomation (ICRA), 2011, pp. 2229?2234.[17] T. Carlson and Y. Demiris, ?Increasing robotic wheelchair safety withcollaborative control: Evidence from secondary task experiments,? inIEEE Int. Conf. on Robotics and Automation (ICRA), 2010, pp. 5582?5587.[18] Y. Satoh and K. Sakaue, ?An omnidirectional stereo vision-based smartwheelchair,? EURASIP Journal on Image and Video Processing, vol.2007, no. 1, p. 087646, 2007.[19] R. Simpson, E. LoPresti, S. Hayashi, I. Nourbakhsh, D. Miller et al.,?The smart wheelchair component system,? Journal of RehabilitationResearch and Development, vol. 41, no. 3B, pp. 429?442, 2004.51Bibliography[20] M. Quigley, B. Gerkey, K. Conley, J. Faust, T. Foote, J. Leibs,E. Berger, R. Wheeler, and A. Ng, ?ROS: an open-source robot operat-ing system,? in IEEE Intl. Conf. on Robotics and Automation (ICRA)Workshop on Open Source Robotics, Kobe, Japan, May 2009.[21] J. Borenstein and Y. Koren, ?Obstacle avoidance with ultrasonic sen-sors,? IEEE Journal of Robotics and Automation, vol. 4, no. 2, pp.213?218, 1988.[22] ??, ?The vector field histogram-fast obstacle avoidance for mobilerobots,? IEEE Transactions on Robotics and Automation, vol. 7, no. 3,pp. 278?288, 1991.[23] A. Elfes, ?Sonar-based real-world mapping and navigation,? IEEEJournal of Robotics and Automation, vol. 3, no. 3, pp. 249?265, 1987.[24] D. Kortenkamp, R. P. Bonasso, and R. Murphy, Eds., Artificial Intel-ligence and Mobile Robots: Case Studies of Successful Robot Systems,1st ed. AAAI Press, March 1998.[25] S. Thrun, ?Learning metric-topological maps for indoor mobile robotnavigation,? Artificial Intelligence, vol. 99, no. 1, pp. 21?71, 1998.[26] W. Burgard, D. Fox, H. Jans, C. Matenar, and S. Thrun, ?Sonar-basedmapping of large-scale mobile robot environments using EM,? in Proc.of the 16th International Conference on Machine Learning, 1999.[27] S. Thrun, W. Burgard, and D. Fox, ?A probabilistic approach toconcurrent mapping and localization for mobile robots,? AutonomousRobots, vol. 5, no. 3-4, pp. 253?271, 1998.[28] B. Yamauchi, ?A frontier-based approach for autonomous exploration,?in Computational Intelligence in Robotics and Automation, 1997.CIRA?97., Proceedings., 1997 IEEE International Symposium on, 1997,pp. 146?151.[29] B. Siciliano and O. Khatib, Springer handbook of robotics. Springer,2008.[30] R. Keys, ?Cubic convolution interpolation for digital image processing,?IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 29,no. 6, pp. 1153?1160, 1981.52Bibliography[31] J. W. Cooley and J. W. Tukey, ?An algorithm for the machine calcu-lation of complex fourier series,? Mathematics of Computation, vol. 19,no. 90, pp. 297?301, 1965.[32] S. Thrun, ?Learning occupancy grids with forward sensor models,? Au-tonomous Robots, vol. 15, pp. 111?127, 2002.[33] J.-H. Park, Y.-D. Shin, J.-H. Bae, and M.-H. Baeg, ?Spatial uncertaintymodel for visual features using a Kinect sensor,? Sensors, vol. 12, no. 7,pp. 8640?8662, 2012.[34] T. Petrinic, E. Ivanjko, and I. Petrovic, ?AMORsim - a mobile robotsimulator for matlab,? in Proceedings of 15th International Work-shop on Robotics in Alpe-Adria-Danube Region, Balatonfu?red, Hungary,2006.[35] S. Suddarth, S. Sutton, and A. Holden, ?A symbolic-neural method forsolving control problems,? in IEEE International Conference on NeuralNetworks, 1988, pp. 516?523 vol.1.[36] L. Cun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hub-bard, and L. D. Jackel, ?Handwritten digit recognition with a back-propagation network,? in Advances in Neural Information ProcessingSystems. Morgan Kaufmann, 1990, pp. 396?404.[37] D. J. MacKay, ?Bayesian non-linear modeling for the prediction compe-tition,? in Maximum Entropy and Bayesian Methods. Springer, 1996,pp. 221?234.[38] S. C. Suddarth and A. D. Holden, ?Symbolic-neural systems and theuse of hints for developing complex systems,? International Journal ofMan-Machine Studies, vol. 35, no. 3, pp. 291 ? 311, 1991.[39] M. Nagrath and I. Nagrath, Robotics and Control. Tata McGraw-Hill,2003.[40] C. Andrieu, N. D. Freitas, and A. Doucet, ?Robust full Bayesian learn-ing for neural networks,? 1999.[41] G. Cybenko, ?Approximation by superpositions of a sigmoidal func-tion,? Mathematics of Control, Signals and Systems, vol. 5, no. 4, pp.455?455, 1992.53Bibliography[42] T. Poggio and F. Girosi, ?Networks for approximation and learning,?Proceedings of the IEEE, vol. 78, no. 9, pp. 1481?1497, 1990.[43] K. Hornik, M. Stinchcombe, and H. White, ?Multilayer feedforwardnetworks are universal approximators,? Neural networks, vol. 2, no. 5,pp. 359?366, 1989.[44] R. M. Neal, ?Bayesian learning for neural networks,? Ph.D. disserta-tion, University of Toronto, 1995.[45] J. A. Anderson and E. Rosenfeld, Eds., Neurocomputing: Foundationsof Research. Cambridge, MA, USA: MIT Press, 1988.[46] M. F. M?ller, ?A scaled conjugate gradient algorithm for fast supervisedlearning,? Neural Networks, vol. 6, no. 4, pp. 525?533, 1993.[47] M. Stone, ?Cross-validatory choice and assessment of statistical predic-tions,? Journal of the Royal Statistical Society. Series B (Methodologi-cal), pp. 111?147, 1974.[48] Y. Zweiri, J. Whidborne, and L. Seneviratne, ?A three-term backprop-agation algorithm,? Neurocomputing, vol. 50, no. 0, pp. 305 ? 318, 2003.[49] M. R. Hestenes and E. Stiefel, ?Methods of conjugate gradients forsolving linear systems,? pp. 409?436, 1952.[50] J. R. Shewchuk, ?An introduction to the conjugate gradient methodwithout the agonizing pain,? Carnegie Mellon University, Tech. Rep.,1994.[51] I. Nabney, NETLAB: Algorithms for Pattern Recognition, ser. Advancesin Computer Vision and Pattern Recognition. Springer, 2002.[52] R. Neal, ?Bayesian training of backpropagation networks by the HybridMonte Carlo method,? University of Toronto, Tech. Rep., 1993.[53] D. J. MacKay, ?A practical Bayesian framework for backpropagationnetworks,? Neural computation, vol. 4, no. 3, pp. 448?472, 1992.[54] W. L. Buntine and A. S. Weigendt, ?Bayesian back-propagation,? Com-plex Systems, vol. 5, pp. 603?643, 1991.[55] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller,and E. Teller, ?Equation of State Calculations by Fast Computing Ma-chines,? The Journal of Chemical Physics, vol. 21, no. 6, pp. 1087?1092,1953.54Bibliography[56] S. Duane, A. Kennedy, B. J. Pendleton, and D. Roweth, ?Hybrid MonteCarlo,? Physics Letters B, vol. 195, no. 2, pp. 216 ? 222, 1987.[57] S. M. Ziyu Wang and N. de Freitas, ?Adaptive Hamiltonian and Rie-mann Manifold Monte Carlo samplers,? Technical Report, Tech. Rep.[58] J. Snoek, H. Larochelle, and R. P. Adams, ?Practical bayesian opti-mization of machine learning algorithms,? in Advances in Neural Infor-mation Processing Systems, 2012.[59] B. Leimkuhler and S. Reich, Simulating Hamiltonian Dynamics. Cam-bridge University Press, 2005.[60] C. Andrieu, N. de Freitas, A. Doucet, and M. Jordan, ?An introductionto MCMC for machine learning,? Machine Learning, vol. 50, no. 1-2,pp. 5?43, 2003.[61] F. Doshi, J. Pineau, and N. Roy, ?Reinforcement learning with limitedreinforcement: Using Bayes risk for active learning in POMDPs,? inProceedings of the 25th International Conference on Machine Learning.ACM New York, NY, USA, 2008, pp. 256?263.[62] J. Sattar and G. Dudek, ?Towards quantitative modeling of task con-firmations in human?robot dialog,? in Proceedings of the IEEE Inter-national Conference on Robotics and Automation, ICRA, Shanghai,China, May 2011, pp. 1957?1963.[63] D. Althoff, J. Kuffner, D. Wollherr, and M. Buss, ?Safety assessmentof robot trajectories for navigation in uncertain and dynamic environ-ments,? Autonomous Robots, vol. 32, no. 3, pp. 285?302, 2012.[64] Ontario Ministry of Health and Long-Term Care, ?Long-term care fa-cility design manual,? 1999.55
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- A risk assessment infrastructure for powered wheelchair...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
A risk assessment infrastructure for powered wheelchair motion commands without full sensor coverage TalebiFard, Pouria 2014
pdf
Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
Page Metadata
Item Metadata
Title | A risk assessment infrastructure for powered wheelchair motion commands without full sensor coverage |
Creator |
TalebiFard, Pouria |
Publisher | University of British Columbia |
Date Issued | 2014 |
Description | Smart powered wheelchairs offer the possibility of enhanced mobility to a large and growing population---most notably older adults---and a key feature of such a chair is collision avoidance. Sensors are required to detect nearby obstacles; however, complete sensor coverage of the immediate neighbourhood is challenging for reasons including financial, computational, aesthetic, user identity and sensor reliability. It is also desirable to predict the future motion of the wheelchair based on potential input signals; however, direct modeling and control of commercial wheelchairs is not possible because of proprietary internals and interfaces. In this thesis we design a dynamic egocentric occupancy map which maintains information about local obstacles even when they are outside the field of view of the sensor system, and we construct a neural network model of the mapping between joystick inputs and wheelchair motion. Using this map and model infrastructure, we can evaluate a variety of risk assessment metrics for collaborative control of a smart wheelchair. One such metric is demonstrated on a wheelchair with a single RGB-D camera in a doorway traversal scenario where the near edge of the doorframe is no longer visible to the camera as the chair makes its turn. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2014-03-26 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution 2.5 Canada |
DOI | 10.14288/1.0103390 |
URI | http://hdl.handle.net/2429/46276 |
Degree |
Master of Science - MSc |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2014-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by/2.5/ca/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2014_spring_talebifard_pouria.pdf [ 3.99MB ]
- Metadata
- JSON: 24-1.0103390.json
- JSON-LD: 24-1.0103390-ld.json
- RDF/XML (Pretty): 24-1.0103390-rdf.xml
- RDF/JSON: 24-1.0103390-rdf.json
- Turtle: 24-1.0103390-turtle.txt
- N-Triples: 24-1.0103390-rdf-ntriples.txt
- Original Record: 24-1.0103390-source.json
- Full Text
- 24-1.0103390-fulltext.txt
- Citation
- 24-1.0103390.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
data-media="{[{embed.selectedMedia}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0103390/manifest