UBC Theses and Dissertations

Usability evaluation of emerging virtual reality technologies in telerobotics Haghjoo, Fatemeh 2018

Usability Evaluation of Emerging Virtual Reality Technologies in Telerobotics

by

Fatemeh Haghjoo

B.Sc., University of Sheffield, 2013

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF APPLIED SCIENCE

in

The College of Graduate Studies
(Electrical Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA
(Okanagan)

March 2018

© Fatemeh Haghjoo, 2018

The following individuals certify that they have read, and recommend to the College of Graduate Studies for acceptance, a thesis/dissertation entitled:

Usability Evaluation of Emerging Virtual Reality Technologies in Telerobotics

submitted by Fatemeh Haghjoo in partial fulfilment of the requirements of the degree of Master of Applied Science

Professor Homayoun Najjaran, School of Engineering
Supervisor

Dr. Jahangir Hossain, School of Engineering
Supervisory Committee Member

Dr. Yang Cao, School of Engineering
Supervisory Committee Member

Professor Solomon Tesfamariam, School of Engineering
University Examiner

Abstract

Recent technological advancements in robotic systems have led to increasing exposure of humans to these systems. Therefore, the quality of Human-Robot Interactions (HRI), which affects the adoption of robots by human operators, is increasingly important [1]. This is particularly true in telerobotic applications, where users can only operate a robot remotely and a problematic interface can have major implications. Therefore, the objective of this study is to investigate whether using a head-mounted display and a hand-tracking device in the development of HRI systems can result in better interactions. Hence, usability testing was performed via user studies considering three factors: effectiveness, efficiency, and user satisfaction. Three different HRI modes were developed: manual, Virtual Reality (VR), and Mixed Reality (MR). The manual mode involved manual control of a robotic arm using a keyboard with visual feedback on a monitor.
The VR mode entailed controlling the robotic arm by hand gestures using Leap Motion and visual feedback from a VR environment provided through Oculus Rift. The MR mode was similar to the VR mode, but the visual feedback was augmented with views from two cameras provided to the user through Oculus Rift. These modes were used to perform a pick-and-place task. Effectiveness was evaluated by measuring the success rates of each mode and performing an error analysis; efficiency was examined by considering the time-of-completion; and user satisfaction was subjectively measured using a Likert-type questionnaire. Statistical analysis revealed that the VR mode was the most efficient and effective and also resulted in fewer human and system errors. Users’ gaming experience was also considered as a factor, but had no effect on the effectiveness or efficiency of any mode. Therefore, it is concluded that users require no prior experience, and consequently no training, to achieve effective and efficient HRI using this mode. Moreover, the questionnaire revealed that users were highly satisfied with both VR and MR modes, with no clear preference. To conclude, the application of Oculus Rift and Leap Motion results in more effective and efficient HRI, and is recommended for use in real-world applications similar to pick-and-place tasks, such as in the manufacturing industry.

Lay Summary

Due to improvements in robotic systems, humans’ exposure to robots is ever-increasing, making the quality of their interactions increasingly important. This is particularly true in telerobotic applications, where users can only operate a robot remotely. Consequently, a problematic interface in this case can have major negative implications for the human-robot interaction. Therefore, the main goal of this study is to investigate whether using recently developed Virtual Reality (VR) equipment, including VR goggles and hand-tracking devices, can improve the quality of human-robot interactions.
To this end, such equipment is used to design and develop three different modes of interaction to remotely control a robotic arm. These interaction modes are then evaluated by recruiting participants and obtaining their performance measurements in performing a pick-and-place task. Then, statistical analysis of the obtained data is used to measure the difference in efficiency, effectiveness, and user satisfaction of the interaction modes.

Preface

This research was conducted at the Advanced Control and Intelligent System (ACIS) laboratory at the School of Engineering within the University of British Columbia, Okanagan campus, under the supervision of Professor Homayoun Najjaran.

I, Fatemeh Haghjoo, was the lead investigator, responsible for conducting the research, designing the experiment, developing the designed experiment, and analyzing all data collected through testing. Lukas Stracovsky, a member of the ACIS laboratory, helped in this research as the programmer and contributed his ideas. He was also involved in testing the project with participants. Professor Homayoun Najjaran was also involved throughout the research as a mentor and in the concept formation and manuscript editing.

A version of the VR HRI model, explained in Chapter 2, was presented for the Robotics and Intelligent Systems portion of the UBC research during the first BCTech Summit in Vancouver on January 18th and 19th of 2016.

In addition, I was responsible for obtaining ethics approval to test the designed system on participants. This approval was received from UBC Okanagan Behavioural Research Ethics under the title of Telepresence, subject number H15-03245.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
1 Introduction
  1.1 Motivation
  1.2 Objectives
  1.3 Literature Review
    1.3.1 Human-Robot Interaction
      1.3.1.1 Level of Autonomy
      1.3.1.2 Nature of Communication
      1.3.1.3 Structure of the Team
      1.3.1.4 Adaptation, Learning, and Training of Humans and Robots
      1.3.1.5 Task Shaping
    1.3.2 Remote Human-Robot Interaction
      1.3.2.1 System Architecture
      1.3.2.2 Control Architecture
      1.3.2.3 Challenges of Telerobotics
      1.3.2.4 Application of VR and MR in Telerobotics
    1.3.3 Evaluation of Human-Robot Interaction
      1.3.3.1 Theoretical Framework
      1.3.3.2 Methodological Framework
  1.4 Statement of Contributions
  1.5 Thesis Organization
2 Methodology
  2.1 Experimental Design
    2.1.1 Participants
    2.1.2 Variables, Hypotheses, and Design of the Experiments
    2.1.3 Experimental Procedures
      2.1.3.1 Manual HRI Mode
      2.1.3.2 VR HRI Mode
      2.1.3.3 MR HRI Mode
    2.1.4 Statistical Data Analysis
      2.1.4.1 Chi-Square Test
      2.1.4.2 Fisher’s Exact Test
      2.1.4.3 McNemar’s Test
      2.1.4.4 ANOVA
      2.1.4.5 Post-Hoc Tukey HSD
      2.1.4.6 Box-Cox Normality Transformation
      2.1.4.7 Mann-Whitney U Test
      2.1.4.8 Sign Test
3 Experimental Setup
  3.1 System Architecture
  3.2 Interface Software
  3.3 Graphical User Interface
  3.4 Hardware Components
    3.4.1 Oculus Rift
    3.4.2 IP Cameras
    3.4.3 Leap Motion and Kinect
    3.4.4 Robotic Arm
    3.4.5 Raspberry Pi and Magnet
4 Results and Discussion
  4.1 Effectiveness
    4.1.1 Effect of Human-Robot Interaction Modes
    4.1.2 Effect of Gaming Experience
    4.1.3 Error Rates and Types
  4.2 Efficiency
    4.2.1 Effect of HRI Mode and Gaming Experience
  4.3 Satisfaction
    4.3.1 Effect of Gaming Experience
  4.4 Subjective Feedback
5 Conclusions and Future Work
  5.1 Summary
  5.2 Conclusions
  5.3 Limitations and Future Work
References
Appendices
  A Questionnaire
  B Anonymized Data
  C Statistical Data Analysis
    C.1 GLM for mixed model ANOVA
    C.2 Post hoc Tukey HSD
    C.3 Mann-Whitney U test
  D Minitab Outputs
    D.1 Outputs of McNemar’s Tests
    D.2 Outputs of Chi-square and Fisher’s Exact Tests
    D.3 Outputs of ANOVA and Related Tests

List of Tables

1.1 Overview of statistical methods. Adapted from [70]
2.1 Overview of statistical methods used for hypothesis testing.
2.2 An example 2×2 contingency table for the case of VR HRI mode to study the effect of gaming experience on effectiveness of the interaction mode.
2.3 General notation of a 2×2 contingency table for paired samples.
2.4 Transformations and their corresponding Box-Cox transformation parameter [87].
4.1 Error Types and Rates
4.2 Factors and their levels in the mixed-model ANOVA for efficiency
4.3 ANOVA results for the transformed TOC data, effect of HRI modes and gaming experience
B.1 Users’ anonymized demographic data and Time of Completion (TOC) for different HRI modes.
B.2 Anonymized responses to questions 3 to 10 of the questionnaire.
B.3 Anonymized responses to questions 11 to 18 of the questionnaire.
C.1 Formulas for mixed effect two-factor ANOVA. [81]

List of Figures

1.1 Level of autonomy with respect to HRI
1.2 Concept and main modules of a telerobotic system [36]
1.3 Different telerobotic control architectures [29]
1.4 Milgram’s virtuality continuum. Adapted from [47]
2.1 Participants’ demographic information.
2.2 What users saw when using the manual HRI mode. The screen on the right shows the scene from the front view, and the screen on the left shows the video feed from the top view.
2.3 What users saw when using the VR HRI mode.
2.4 What users saw when using the MR HRI mode.
3.1 Schematic drawing of components of different HRI modes and their communications
3.2 Experimental setup with the robotic arm
3.3 Schematic drawing of the connection system.
3.4 Side-by-side view of Unity environment on desktop
3.5 Workspace of the robotic arm.
4.1 Pie charts of the rate of successful and unsuccessful task completion for each HRI mode
4.2 Pie charts of the rates of successful and unsuccessful task completion for HRI mode and users’ gaming experience
4.3 Pie charts of the percentage of (upper) error types (bottom) human versus system errors categorized for each HRI mode
4.4 Post-hoc Tukey HSD test for HRI mode factor
4.5 Residual plots for the ANOVA analysis for factors of HRI mode and gaming experience
4.6 Main effect plots for the factors of HRI modes and gaming experience
4.7 Interaction plots for the factors of HRI modes and gaming experience
4.8 Histograms of the users’ chosen level of agreement with the first and second statements
4.9 Histograms of the users’ chosen level of agreement with the first and second statements, based on their gaming experience
4.10 Stacked bar charts of counts of user agreement levels with the provided statements

Acknowledgements

I offer my gratitude to my supervisor, Professor Homayoun Najjaran, for his mentorship and guidance during this journey. I would like to thank my committee members, Dr. Cao and Dr. Hossain, for their insights. I would like to thank all my friends at ACIS lab, especially Lukas and Mathew.

I would like to thank Brian and Dorean who made my stay in Kelowna enjoyable. I would like to thank all my friends in Kelowna for all the memorable moments, especially Majid, Sadegh, Armin, Moein, Roya, Hojat, Armaghan, Reza, Sadaf, Nima, Shelir, Soror, Zahra, Farhad, Atousa and Parya.

Finally, I also would like to thank my family, Zhinous, Mehdi, Azadeh, Mohammad, Maryam and Ali for their continuous encouragement throughout my years of study. None of my accomplishments would have been possible without them. I am blessed beyond belief to have each and every one of them in my life.

To my dear husband, Farzin,
for his understanding, patience, love and for making this journey beautiful and memorable.
To my beloved parents, Zhila and Jalinous,
for their unconditional love and support in every step of my life.

Chapter 1
Introduction

Human-Robot Interaction (HRI) has become increasingly popular in academic, laboratory, and industrial settings.
It blends robotics, control, communication, human psychology, computer science, and design to create an environment where humans and robots can effectively and efficiently work together. The new means of communication between robots and humans will introduce more adaptable and flexible interactions, which will lead to more profitable tasks that make the most of users’ understanding and skills. Regardless of the robot’s level of autonomy, there is often a lack of communication between the robot and its operator, resulting in less effective and more costly interactions, which makes it vital to address the challenges and shortfalls in complex HRI tasks. The aim of this research is to develop a unified treatment for remote HRI that can be used by non-expert users to enable intuitive and safer cooperation in a variety of applications. Moreover, other interests in this thesis are novel approaches and techniques of programming for fast teaching to reduce training time, increase efficiency, and increase user safety in a flexible production setting. Furthermore, making a cost-effective communication method using off-the-shelf equipment is another goal of this thesis.

This study assessed and compared the effectiveness, efficiency, and satisfaction of using Manual, Virtual Reality (VR), and Mixed Reality (MR) technologies by providing a 3D and a 2D environment to an operator. The operator used hand movements within the virtual environment to communicate with a real robotic arm. To observe the effectiveness, efficiency, and satisfaction of interacting with this robotic arm remotely, user performance was evaluated while users operated the robotic arm using either a manual controller or a combination of VR and Leap Motion, the latter with and without live video feedback from the real world.

The viability of different experimental designs was evaluated using subjective and objective analysis of users’ performance and satisfaction.
This thesis represents a starting point for wider dissemination of developments in this quickly growing field.

1.1 Motivation

Research into HRI involves the design and understanding of the communications between humans and robotic systems [2], with the main incentive of making the HRI easier and more intuitive. Successful HRI would result in an increase in effectiveness and efficiency of some demanding tasks while remaining robust and user friendly. Although several research efforts have been made regarding the benefits of proximate HRI applications in different scenarios ranging from hospitals and eldercare to self-driving cars and smart toys [3, 4], little has been done to improve the user-friendliness of HRI for remote robot operations [3].

Intuitive and user-friendly HRI for remote robot operations can have extensive applications, ranging from medical applications such as telemedicine [5] to industrial applications such as telemaintenance [6]. While the robustness of autonomous and intelligent robots remains questionable in unknown and dynamic environments, a remote HRI can improve users’ safety by removing the need for them to be present in hazardous environments. Moreover, by removing the barrier of physical presence, a remote HRI allows experts to take over an operation when the competency available on site is inadequate for the task at hand. In addition, it can empower people with special needs to perform tasks that would not be possible without the use of a robot. The main obstacle in the way of widespread adoption of these systems, in addition to their cost and accuracy, is lengthy training times, which also affect their operational costs. Furthermore, people with less technological skill are more skeptical about new technologies and often describe them as unhelpful rather than as difficult to learn.
This can result in a less effective and less efficient HRI, which underlines the importance of developing an intuitive, accurate, and user-friendly HRI for remote robot operation.

Recent technological advancements have made some state-of-the-art technologies such as Head Mounted Displays (HMD) and infrared hand tracking devices more attainable and readily available. The use of these off-the-shelf components can reduce the cost of remote HRI systems while increasing their user-friendliness. In fact, VR platforms provide users with a perception of being in the real environment, facilitating creation of a more intuitive interaction. In addition, hand tracking devices can improve the intuitiveness of remote HRI systems by removing the need for any peripheral input devices.

1.2 Objectives

The main objective of this thesis is to demonstrate the potential benefits of emerging VR technologies, such as Oculus Rift, in telerobotics. A potential outcome is a more efficient HRI that lets non-expert users carry out different teleoperation tasks in less time and at a lower cost. This main objective can be divided into the following sub-objectives:

• Design and develop different HRI modes that involve Oculus Rift and the Leap Motion hand tracking device;
• Design experiments to evaluate the usability of the developed HRI modes considering the three factors of effectiveness, efficiency, and user satisfaction;
• Develop a proper test-bed to evaluate the developed HRI modes; and
• Analyze the acquired data from the experiments to study the potential benefits of VR technologies by means of usability evaluation.

1.3 Literature Review

Robots have long been used in mass production industries. They have tremendously improved the production rate of these industries as they perform perfectly whenever they have structured planning and clearly defined solutions. However, they still face difficulties when it comes to unstructured planning, which might require a situational assessment [7].
A possible solution to this problem lies in the development of intuitive robotic systems that could combine the superior motor and sensory skills of a robot with the unmatched cognitive abilities of a human [8]. The development of such systems necessitates a thorough understanding of the way in which humans and robots may interact.

1.3.1 Human-Robot Interaction

The field of HRI focuses on understanding, designing, and evaluating robotic systems when they are used with or by humans [9]. The ideal goal of HRI is to create an effective way of communicating between the two parties, analogous to two persons who know each other well and can pick up subtle cues from one another (e.g., musicians playing a duet) [10, 11]. An effective HRI would improve robots’ behavior by taking advantage of human perception and cognitive abilities while providing humans with the precision and efficiency offered by robots, thus increasing the chance of achieving the desired goal of a task. According to Goodrich [9], the following attributes of HRI can affect the design of the methods that define the interactions between humans and robots:

• Level of autonomy
• Nature of information exchange
• Structure of the team
• Adaptation, learning, and training of humans and the robot
• Shape of the task

1.3.1.1 Level of Autonomy

HRI can be classified based on the robot’s Level of Autonomy (LOA), which indicates the extent to which a robot needs human supervision [12]. While some tasks might need full interaction with the user, others can be fully completed by automated robots that require little human supervision. Robots that can perform tasks without human supervision are classified as fully autonomous. The amount of time during which a robot can be left without human supervision, called neglect tolerance, serves as a measure of its level of autonomy [13]. Therefore, the longer the robot can perform tasks without human supervision, the higher its LOA.
Figure 1.1 illustrates a scale of LOA with respect to mixed-initiative interaction levels, where mixed-initiative refers to a “flexible interaction strategy where each agent is able to contribute to the task what it does best” [14]. On one extreme, the human has direct control over the robot and the robot does not initiate any assistance, while on the other extreme, the robot has the full cognitive ability to effectively collaborate with the human like another human would. Although an increase in LOA is not the ultimate goal of HRI, a suitable LOA can facilitate a productive interaction between a robot and a human in the completion of a certain task.

Figure 1.1: Level of autonomy with respect to HRI, adapted from [13]

1.3.1.2 Nature of Communication

In addition to LOA, the nature of communications in HRI can also affect the effectiveness of the interaction. In general, information exchange between humans and robots can be facilitated using three of the five senses: sight, hearing, and touch, which may appear in HRI in the following forms [9]:

• Visual communications, such as visual displays in graphical user interfaces [15], VR [16], and augmented reality interfaces [17]; and gestures, which can be static like a pose, or dynamic such as movements in predetermined patterns [18];
• Auditory communications, such as speech and natural language communications, which can also include the text-based responses that are usually used in mixed-initiative interactions [19]; and
• Tactile communications, that is, haptic feedback typically used to improve the operator’s perception in telemanipulation tasks [20].

In addition, information exchange in HRI can also be categorized based on its format into verbal, nonverbal, and multimodal approaches. Verbal information exchange is usually achieved by commands in the form of audio or text; nonverbal information exchange is based on gestures; and multimodal information exchange can be defined as a combination of verbal and nonverbal methods.
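The format-based categorization above can be read as a simple classification rule. The following Python sketch is purely illustrative: the `Channel` labels and the `exchange_format` function are hypothetical names introduced here, not part of the thesis or of [9]. It only encodes the statement that verbal exchange uses audio or text commands, nonverbal exchange uses gestures, and multimodal exchange combines the two.

```python
from enum import Enum


class Channel(Enum):
    """Hypothetical labels for how a command reaches the robot."""
    SPEECH = "speech"    # verbal: audio command
    TEXT = "text"        # verbal: typed command
    GESTURE = "gesture"  # nonverbal: static pose or dynamic movement


VERBAL = {Channel.SPEECH, Channel.TEXT}
NONVERBAL = {Channel.GESTURE}


def exchange_format(channels):
    """Classify a collection of channels as verbal, nonverbal, or multimodal."""
    used = set(channels)
    if not used:
        raise ValueError("at least one channel is required")
    has_verbal = bool(used & VERBAL)
    has_nonverbal = bool(used & NONVERBAL)
    if has_verbal and has_nonverbal:
        return "multimodal"
    return "verbal" if has_verbal else "nonverbal"
```

Under this sketch, a gesture-only interface such as hand tracking would be classified as nonverbal, whereas an interface that adds speech commands to gestures would be classified as multimodal.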
1.3.1.3 Structure of the Team

Another important factor affecting the HRI is the structure of the team within which humans and robots are supposed to interact. HRI is not exclusive to the interaction of one robot with one human; in fact, in most cases, several robots and humans are involved in performing a specific task. For example, robots used in search and rescue are typically operated by two or more people, each of whom has a certain role in the team: while one person is responsible for the navigation of the robot, others are responsible for the analysis of the data received from the robot for situational awareness [21]. Attempts have been made to increase the efficiency of HRI systems by empowering a single human to operate multiple robots, exploiting the neglect tolerance of such robots [12, 22]. This clearly shows how the structure of the team can affect a designer’s decisions when creating an effective HRI method.

1.3.1.4 Adaptation, Learning, and Training of Humans and Robots

Training and adaptation of robots and humans is another important aspect of HRI systems. On the one hand, certain robotic systems, such as museum tour guide robots [23], have been designed for use in a specific task for a short period of time, resulting in short-term interactions with a wide variety of humans. In such systems, the amount of training required for humans to achieve an effective interaction has to be minimized, requiring the HRI method to be as intuitive as possible. On the other hand, some robotic systems require an HRI method that facilitates careful training of human operators, as the risk involved in the operation of the robotic system is very high [9]. Robotic systems designed for telerobotic bomb disposal and search and rescue operations are examples of such systems, where the operators are usually experts in their fields [24]. In addition, HRI systems are not confined to the training of human operators, but can also involve training of robots.
One example of such training is Programming by Demonstration (PbD), where the operator can train the robot by demonstrating good and bad examples of solutions, thereby removing the tedious step of programming and reinforcement learning methods and increasing the application of robots in humans’ daily life [25]. Furthermore, proximate robotic systems that are designed for social interaction or therapeutic purposes are involved in long-term interactions with humans, which highlights another important issue of HRI systems: the adaptation of robots by humans [9]. For example, cultural backgrounds and users’ previous experiences have been shown to have an influence on people’s attitude towards robots [26], consequently affecting the effectiveness of the HRI approach. Therefore, a successful HRI design must also acknowledge the adaptation issue of such systems.

1.3.1.5 Task Shaping

According to Goodrich [9], task-shaping is another important aspect of HRI systems that highlights the influence of the introduction of a new technology on the execution of a task. In designing a successful HRI approach, it is imperative to consider how a new technology may change the way a task is done, and perhaps to modify the way the task is performed to increase the effectiveness of the interaction. For example, it has been shown that the introduction of Roomba, an autonomous vacuum cleaning robot, has changed the way operators clean the floor [27]: task-shaping has manifested itself as performing a pre-cleaning step to increase the robot’s effectiveness.

In addition to all the above aspects of HRI systems, the nature of interaction can also be altered by the distance between the user and the robot. Proximate interaction is when the user and robot are in the same environment, whereas remote interaction is when the user and robot are spatially separated.
Since remote interactions between humans and robots are the main focus of this thesis, the next section is devoted to providing more details about remote HRI.

1.3.2 Remote Human-Robot Interaction

Remote interaction of humans and robots is known as telerobotics. One of the forerunners of the field, Sheridan [28], defines telerobotics as the extension of human cognitive and physical capabilities to artificial sensors and actuators in remote locations through some communication means, facilitating the execution of complex tasks while avoiding hazardous situations for the human operator. Telerobotics generally refers to robotic systems with a human operator in control, which are also referred to as human-in-the-loop systems [29]. In such systems, the operator handles all cognitive decisions and planning, while the robot is only responsible for mechanical implementation of such decisions. Inclusion of human operators makes such systems highly attractive for complex tasks in unstructured environments.

Given developments in computing and communication technologies, telerobotics has become a rapidly expanding field, with new applications and uses being introduced and developed every day. The three main motivations for discovering telerobotic applications concern users’ safety, operational expenses, and operational scale [30]. The most common applications for telerobotics concern the safety of human users and avoid their presence in dangerous environments, where there may be hazards in the form of radiation, toxic contamination, potential explosions, collapsed buildings, or catastrophic events. One main application in this category is Search and Rescue (SAR) operations, in which remote robots detect, locate, and rescue humans who are trapped due to natural disasters [31, 32]. Secondly, sending humans to some sites might not be hazardous but can be costly, for example to manipulate objects in space exploration.
The best-known example of this application is the Mars Exploration Rover (MER) mission, which began in 2003 by sending two rovers, Spirit and Opportunity, to Mars to search for and characterize rocks and soils that hold clues to past water activity on the planet [33]. Thirdly, in some applications, such as micro material handling [34] or the assembly of microsystems [35], humans must go beyond their limitations to teleoperate in scaled environments [34].

1.3.2.1 System Architecture

Figure 1.2 depicts the concept and main modules of a telerobotic system. As can be seen, a telerobotic system is composed of two main parts that are connected through a communication channel: the operator site and the remote site [36]. On the operator site, the human operator uses the input devices of the human-system interface, which in their simplest form can be a joystick, to give commands to the telerobotic system. The communication channels transfer the commands over some barriers to the teleoperator, which is defined as a machine that enables the human operator to move about, sense, or manipulate objects in remote environments [37]. The input devices on the operator site are called masters, while the robots on the remote site are called slaves; together they form a master-slave system. On the remote site, sensors mounted on the teleoperator obtain data regarding the interaction between the teleoperator and the environment; these are typically acoustic, visual, force, and tactile sensors. Measurement data are transmitted back to the operator and displayed using different hardware in the system, resulting in a control loop that is closed via the human operator.
Systems that involve sensory feedback from the remote environment are called bilateral, and the feedback in this case is called haptic feedback.

The communication media in telerobotics can range from radio waves to the internet, and they can significantly contribute to the complexity of the system, as they may introduce signal loss, distortion, or time delays that further complicate the control of such systems. Such issues and proposed solutions will be discussed in more detail in the following sections.

As is evident from Figure 1.2, another major component of any telerobotic system is the barrier that the system tries to overcome [36]. In its simplest form, the barrier can be the distance between human operators and the teleoperator, which may require minutes or hours of travel. In this case, the telerobotic system tries to minimize the time and cost associated with task experts traveling to required sites. Another barrier is restricted accessibility, where the task expert needs to be protected in some way due to hazards in the remote environment. Operations that involve handling radioactive materials or working in clean rooms that need to be air locked are examples of such a barrier. Furthermore, barriers may result from the scale of the remote environment. For example, in the assembly of micro or miniature mechatronic components or in minimally invasive surgery, a telerobotic system may be helpful in providing the user with precise scaled manipulations along with scaled visual and haptic feedback.

Figure 1.2: Concept and main modules of a telerobotic system [36]

Control Architecture

As opposed to typical robotic systems where a robot executes a task without any consultation with a human operator, in telerobotic systems there is active communication from and to a human operator, thereby necessitating a different approach in their control architecture.
The control architecture of telerobotic systems lies on a spectrum spanning the following three main categories [29]:
• Supervisory control refers to a situation where the human operator gives high-level commands that are refined and executed by the teleoperator. A typical and famous example of such a control system is the NASA rovers on Mars [38], where the time delay of several minutes necessitates a control system in which the human operator defines the goal of a movement and the rover achieves that goal through local autonomy, using sensory feedback directly.
• Shared control refers to a situation where task execution is shared between direct control, local sensory feedback, and autonomy, or where user feedback is augmented from VR or other automatic aids. An example is a tele-grinding task, where the grinding force is best controlled locally by the system while the motion of the grinder over the workpiece is controlled by the human operator. In this case, the human operator is also relieved of the physical effort of pressing the grinder against the workpiece.
• Direct control refers to a situation where there is no autonomy or intelligence in the system and all the motions of the slave are controlled by the operator via a master interface, creating a master-slave system. An obvious example of such a system is the control of a vehicle using one or two joysticks, where the commands are proportional to the joystick displacements that are used to control the translation and orientation of the vehicle.

Figure 1.3 shows the three concepts of the control architecture for telerobotic systems. As can be seen, the degree of coupling between human and robot decreases when moving from direct control to supervisory control. The brown arrows represent the degree of coupling between human and robot, and the grey arrows show local control loops on the human or remote sites in each case.
Such control architectures correspond with the three rightmost levels on the LOA scale that was introduced earlier, in section 1.3.

Figure 1.3: Different telerobotic control architectures [29]

Challenges of Telerobotics

Telerobotics faces several major challenges, which can be summarized as follows [39]:
• Latency: Latency is a major issue in manual and real-time control. It can cause an operator to overreact to an observed error that no longer corresponds with the actual positional error of the robot, leading to unexpected movements of the robot that in turn require more corrective input from the operator. However, because of the latency, the operator’s commands and corrections will again not be aligned with the current error, thus causing the error to grow and leading to instability. One solution to such instabilities would be for an operator to wait longer than the delay period to observe the result before executing more tasks [39], which makes the system inefficient.
• Incomplete information from the remote location: The second challenge in telerobotics is the lack of feedback from the remote location [39]. For example, if a video camera cannot provide a user with views from different directions, the lack of visual feedback could decrease the user’s performance. Another example is a situation where visual feedback from cameras is obstructed by oil leaks in underwater pipeline inspection tasks.
• Unfriendly user interface: Another major challenge is unfriendly user interface settings. When an interface uses too many graphical components, such as sliders or graphics, or non-intuitive graphical representations, it may become too complicated for the operator to operate a robot with it, leading to performance loss [39].
• Operator cognitive fatigue: An extension of the unfriendly user interface challenge occurs when the operator has to focus on a screen, observing and controlling robots remotely for a long time.
This could cause cognitive fatigue in the operator [39] and thus further decrease their performance.

Application of VR and MR in Telerobotics

VR is a three-dimensional (3D) computer-generated simulation environment that enables users to interact with a graphical environment using special equipment [40]. One of the main features of VR is that the operator can manipulate the virtual world in real time. Applications of VR can be divided into the following general categories [41]:
• Training and simulation: Simulation of different scenarios in VR can be used for low-cost training. This is mostly used in military activities, where designing a real-world platform for training could be costly and hazardous.
• Education: VR can also be used for educational purposes when complex 2D data can be more easily explained in 3D environments. Furthermore, VR can provide students and teachers with the opportunity to visit artworks in different places around the world without leaving the classroom.
• Entertainment: Gaming is one of the most powerful markets in today’s world, and VR has the potential to move the nature of this industry to the next level.
• Teleoperation: Robots can operate in environments that are hostile to humans, from underwater sites to nuclear facilities. Having the ability to control and interact with robots remotely from a safe location can make a huge difference in operators’ safety. Application of VR in telerobotics can provide the operator with greater visual feedback, resulting in an increase in performance.

In general, VR can be divided into three types based on the immersion level [42]: fully immersive, such as Oculus Rift; semi-immersive, where a large projection screen is used; and non-immersive, where a desktop system is used to display a 3D world using a monitor.

In 1994, Paul Milgram introduced a virtuality continuum axis [43], shown in Figure 1.4. This axis is a continuous scale, with a completely real environment on one end and a completely virtual environment on the other.
The area between the two extremes is called MR [43]; applications such as Augmented Reality (AR) and Augmented Virtuality (AV) lie in this region. AR refers to a situation where VR is merged with interactive, real-time 3D physical information [44]. In other words, it enables users to see offline virtual environments and the real world at the same time by overlaying physical objects with a virtual world in real time using computers [45]. AR can be divided into three types: see-through AR, monitor-based AR, and spatial AR. In the first type, a headset or other device is usually placed on the user’s head to present augmented information, whereas in the second type, a monitor is used to display information [42]. In the third type, a projector is used to display augmented models without any device worn by the user [42]. AV refers to a situation where virtual environments are augmented by real-world data. The main focus of AV is immersion: users can navigate through a virtual world and interact with real or virtual objects [46].

Figure 1.4: Milgram’s virtuality continuum. Adapted from [47]

VR and Visual Feedback. In some applications that suffer from poor or nonexistent visual feedback, a limited field of view, or microscopic-scale data, tasks might take longer to execute remotely or may be impossible to complete at all [48]. One instance of using VR to improve visual feedback has been proposed by Oyama and his team [49]. When the visual feedback from a remote site was degraded by smoke, Oyama et al. used a VR-assisted teleoperation system to overcome the poor visual feedback. A stereo camera was installed on the remote robot to provide the user with visual feedback. A human operator remotely controlled a slave arm by controlling a kinematically equivalent master arm. A matching model of the slave robot arm and its environment was generated using 3D graphics. In this HRI system, the user was able to control both the remote and virtual slave arms by controlling the master arm.
In such a case, an accurate calibration between the virtual and the real remote scene is vital. A manual calibration approach was used, which was based on least-squares matching of manually chosen corresponding points in the real and virtual environments.

Another example of inefficient visual feedback can be found in mobile robot navigation under water. Muddy water, complex pipe structures, poor visibility around the work site, and cameras’ limited field of view can disorient even experienced users, which may cause accidents and loss of equipment. Lin and Kuo [50] introduced a VR-assisted navigation system for tethered underwater robots that matches the robot’s position and orientation data from navigational system sensors against a Computer Aided Design (CAD) model of the underwater structure being inspected. This virtual vision feedback provides users with a full perception of the structure’s spatial location. Furthermore, the results of this study show that using VR increases the efficiency of the operation and reduces the operator’s workload [48, 50].

In some applications, such as nano-manipulators, visual feedback is needed at a higher level of detail, as motion in the user environment is on a sub-molecular scale. Taylor et al. [51] developed an atomic-scale teleoperation system that uses an HMD and force-feedback manipulator arms for the operator interface. This allows the user to control the experiment at an atomic scale in real time. Taylor and his team also used a Scanning Tunnelling Microscope (STM) to obtain an elevation map of the structure. Then, computer-generated virtual images of the atomic surface were displayed to the operator.

VR and Time Delay. Human users rely on vision not only for perception, but also for action. Continuous and immediate visual feedback is essential for correct and fine-tuned movements [52]. Delay in visual feedback is unavoidable in many telerobotic systems; this can be due to transmission, processing, and rendering of visual data. Kim et al.
[53] found that latency resulted in an increase in the duration of task completion, and that users typically changed their strategy to a move-and-wait strategy when the delay was longer than 400 ms. The authors also found that an increase in the duration of task completion for telerobotic surgical maneuvers can result in reduced accuracy and increased fatigue [53].

Bejczy et al. [54] tried to solve the time delay issue by overlaying a graphical representation of the remote robot onto a video image of the real robot. The virtual robot and the real robot both receive inputs from the user. The virtual robot executes the user commands promptly and thus acts as a predictor of the real robot, which allows for safer and quicker teleoperation. Their results showed that using a virtual representation of the real robot as a guide can lead to a 50% reduction in task completion time.

1.3.3 Evaluation of Human-Robot Interaction

As improvements in robotic systems increase the amount of human exposure to these systems, issues related to HRI are gaining more attention in the scientific community. For example, a recent DARPA/NSF report [1] underlines the need for methods and metrics to evaluate the development of human-robot teams. To evaluate the HRI system developed in the present work, the approach suggested by Weiss et al. [55] is adapted; this is explained in more detail in the following section. In Weiss et al.’s approach, the evaluation framework is divided into two parts: a theoretical and a methodological framework. Similar to their approach, a multi-factorial model is proposed in the theoretical framework. This framework further provides a brief explanation of a variety of metrics found in the literature, and only the relevant factors along with their influencing metrics are explained in more detail. In the methodological framework, methods related to this thesis are extracted from the literature to evaluate the established metrics.
Theoretical Framework

Thrun suggested one of the first theoretical frameworks for evaluation of HRI [56]. His framework is based on the distinction of robots into three different categories: industrial, personal, and professional service robots. He described in detail the capabilities of each human-robot interface, the contexts of use, and the potential user groups, resulting in some open questions that need to be answered during the development of such robotic systems.

Adams [57] has tried to draw the attention of the HRI community to the vast existing research in the area of human factors, which studies interaction in complex man-machine systems such as air traffic control, cockpit design, and nuclear power plants, among others. Adams emphasizes that humans’ needs and requirements must be at the center of HRI developments, and identifies the following areas of human factors research as important for the development of efficient, effective, and usable HRI: user-centered design, human decision-making, workload, vigilance, situational awareness, and human errors.

Steinfeld et al. [58] divided the metrics that can be used to evaluate HRI into two classes: task-oriented and common metrics. Task-oriented metrics are highly dependent on the nature of the task at hand, while common metrics are those that can be used for comparison of different HRI systems. Furthermore, these authors categorized the task-oriented metrics into the following five categories: navigation, management, manipulation, perception, and social. Based on this classification, the common metrics can be further divided into the three categories of system performance, operator performance, and robot performance.

Recently, Weiss et al. [55] developed a framework for the evaluation of HRI in humanoid robots addressing Usability, Social acceptance, User experience, and Societal impact (USUS). Their framework is divided into two parts: theoretical and methodological.
In the theoretical framework, each of the evaluation factors discussed above is described and split into several sub-factors extracted from existing literature. The methodological part of their framework includes a mix of methods derived and borrowed from various disciplines such as HCI, psychology, and sociology.

A detailed explanation of the above factors can be found in the relevant literature [55, 57, 58]. Considering the scope and the context of the present work, a brief explanation of only the most relevant factors is provided in the following sections.

Usability. Usability is one of the factors suggested in the USUS framework [55]. It is defined by the International Organization for Standardization (ISO) [59] as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use”. This definition shows that usability is a concept involving several indicators rather than a single measurable term. Therefore, to analyze the usability of a product, a method, or a procedure, it needs to be broken down into different usability indicators [55].

The literature suggests different indicators to perform usability evaluations [55, 59, 60, 61]. One of the pioneers of usability engineering, Nielsen [60], proposed the following:
• Efficiency: The ISO standard [59] defines efficiency as “the resources expended in relation to the accuracy and completeness with which users achieve goals”. Thus, the sooner the user can finish a task accurately, the more efficient the product or system is. The time of completion can therefore be the main measure of efficiency. User expertise may significantly affect the amount of time that it takes to perform the specified tasks; hence, one needs to consider the level of expertise of users in efficiency evaluations [61].
Possible measures may include the percentage of users who have finished the task in a certain time [61].
• Effectiveness: The ISO standard [59] defines effectiveness as “the accuracy and completeness with which users achieve specified tasks”. Rubin and Chisnell [61] also define the effectiveness of a system as the degree to which it behaves in the way that users expect it to, and as the ease with which it can be used to meet those expectations. Thus, the effectiveness of a product can be measured by error rates. Possible measures may include quantitative error rates and the percentage of users who can successfully complete the desired task in their first trial [61].
• Learnability: This factor is a measure of how easily and quickly users can start working with the product and achieve the specified goals. The user should need minimal training to be able to perform the desired tasks. Depending on the nature of a product, learnability can be measured after a certain amount of training, or it may require no training at all. When training is needed, possible measures can include the amount of time that the user needs to be trained; in other cases, the time to finish the desired task effectively can be an implicit indication of learnability.
• Memorability: The instructions and the way to use a product need to be easily remembered so that users who have not used the product for some time can achieve the specified goals without having to learn to use it all over again. In some of the literature, learnability and memorability are used interchangeably [61].
• Error: The system should be designed in a way that minimizes user errors. The system should provide the users with the tools to recover from any errors that may occur. In general, it should have a low error rate and avoid any catastrophic errors.
As mentioned earlier, the error rate of a system is closely related to its effectiveness; therefore, measurements of the error rates of a system can also provide information about its effectiveness.
• Satisfaction: This factor measures users’ comfort, feelings, and perceptions about the system. These data are usually captured at the end of a testing period, either through verbal interviews with the users or written surveys [61]. However, according to Nielsen [60], satisfaction is more relevant to systems that relate to leisure, not labor, since a satisfying experience with a system does not necessarily mean that the system is easy to use or efficient. Nielsen indicates that possible metrics for this factor may include some physiological measurements of the users (e.g., heart rate or blood pressure). However, he also underlines that such measures might skew the results, since they may increase the nervousness of test users. Therefore, a simple measure such as asking users for their subjective opinion about the system might be a more accurate metric.

However, as mentioned earlier, the ISO standard [59] considers only the three attributes of effectiveness, efficiency, and satisfaction to be sufficient for usability evaluation. Satisfaction is described by the ISO standard as “freedom from discomfort, and positive attitudes towards the use of the product.”

Methodological Framework

Different methods have been suggested for evaluation of HCI, and these can be adapted for use in HRI as long as they take into account the dynamics and complexity of the robotic systems [62]. The choice of suitable evaluation methods can be affected by the type of data at hand, the amount of available resources (i.e., time and money), and the phase of the design (i.e., whether it is a formative or summative evaluation) [63]. In general, the evaluation methods in HCI are divided into two major categories: expert analysis and user studies [63].
The following sections provide a brief explanation of these methods, with more attention given to those used in the present work.

Expert Analysis. In expert analysis, evaluation experts are employed to evaluate the system and find any flaws in the design. Since users are not involved, and fewer evaluation sessions are consequently needed for this evaluation method, it can be executed more quickly and less expensively than user studies. However, it has been reported that this method may result in false positives, where experts come up with problems that do not actually affect the usability of the product [60]. In general, there are two main approaches to expert analysis: heuristic evaluation and cognitive walkthrough.
• Heuristic Evaluation: In this approach, experts judge whether the system is compliant with an established set of guidelines (i.e., heuristics). Heuristics describe a set of essential attributes that a system should possess to provide the user with the ability to perform a specified task in an effective, efficient, and satisfying manner. A set of 10 heuristics suggested by Nielsen [64] has been accepted in the HCI community as the standard heuristics. While these have also been directly applied for evaluation of HRI [65], Clarkson and Arkin [66] adapted them for this purpose and developed a new set of eight heuristics.
• Cognitive Walkthrough: This approach is a theoretically structured usability evaluation method that uses forms and guidelines to draw the attention of the evaluator to the goals and actions of the user during a task, and to whether the system supports such goals and actions or hinders them [67]. To implement this method, evaluators must take on the perspective of a potential user and work through the task, evaluating the system by filling in a set of forms. As the method relies heavily on the concepts of cognitive science, the evaluator’s familiarity with such terminology can significantly affect the outcome [68].
Furthermore, since no real users are involved in the evaluation process, the set of actions chosen by the evaluators might not represent that of real users [69]. In fact, the way users actually use a product can at times be surprising to designers and evaluators.

User Studies. In contrast to expert analysis, user studies put the potential users at the center of the evaluation study and try to evaluate the system by directly asking for user feedback. User studies are generally employed to provide empirical evidence to answer research questions or to validate or invalidate some hypothesis about the system [55]. Such studies are conducted by asking users to perform the desired tasks using the system while their behavior is recorded and measured by a researcher. In general, user studies can be either laboratory-based, taking place in well-equipped usability testing labs, or field-based, where the user behaviors are monitored in a real setting. Although field-based studies allow the usability of a product to be studied in a realistic usage context, they may suffer from naturally occurring disturbances, such as background noise or non-ideal lighting conditions, that may affect the acquired data and make the analysis more difficult.

According to Dix et al. [70], user study techniques can be divided into three categories: observational, query-based, and experimental evaluation methods. Observational techniques include methods such as “think aloud”, where users are asked to talk out loud to explain what they are thinking while being observed. This category also includes cooperative evaluation, where users and evaluators work together to evaluate the system. Although such techniques are easy to implement, the provided information is usually subjective and selective. On the other hand, query-based methods can provide direct answers from the users about the usability of a product. These methods can include interviewing a group of users using a set of questions.
Questionnaires are less flexible than interviews as they are set in advance, but they are capable of reaching a wider group and take less time to administer. Finally, experimental evaluation methods are used in situations where some attributes of the participants’ behavior are used to evaluate a hypothesis that has been set by a researcher. Some user study methods were briefly introduced in this section. However, being the methods of choice in the present work, questionnaires and experimental evaluation methods are explained in more detail in the following sections.

Questionnaires. Questionnaires, as one of the query-based methods, are the embodiment of the philosophy that the most suitable way to determine whether a system is capable of meeting users’ needs is by directly “asking the users” [70]. Although they are cheap and easy to administer, they are necessarily subjective and are based on the users’ perception. Nevertheless, for the same reason, they can reveal issues that are usually missed by designers. As users are not usually engaged in creating questionnaires, it is necessary that questionnaires are well designed in advance. The design of a questionnaire depends on the desired information and the way the responses will be analyzed. Based on these considerations, questionnaires may involve one or several of the following styles of questions [70]:
• General: These questions are usually intended to help establish the users’ backgrounds and their position within the user population. They may include questions about age, education level, occupation, and sex, among others. Furthermore, some questions may also be included about their previous experience with similar systems (e.g., whether or not they have previous experience with computers or gaming).
• Open-ended: Such questions ask the users to provide their unprompted opinion on a question. For example, the user could be asked to provide suggestions on how to improve an interface.
Such questions are only useful in obtaining general subjective information about a system and are difficult to analyze or compare. Although they are usually skipped by most time-conscious users, they are useful in providing information about areas that have not been considered by the designer.
• Scalar: These questions require the participants to judge a specific statement on a numeric scale, which is typically based on how much they agree or disagree with that statement. An example of such a question is as follows:
The interface is easy to interact with.
Disagree 1 2 3 4 5 Agree
The levels of the scale can vary depending on the intention of the questionnaire. For example, a scale of 1 to 3 only allows for the answers of agree, neutral, and disagree, which would tempt the users to choose neutral for most of the questions. In contrast, a fine granularity of the scale (e.g., 1 to 10) makes the analysis of the responses considerably more difficult. However, fine-grained scales still allow the scale to be reduced afterwards by grouping the answers at an appropriate level. Therefore, higher granularity of scales is desirable.
• Multiple-choice: In this case, a set of explicit responses is provided for the respondents to choose from. They may be asked to choose only one answer or as many as applicable. These can also be useful in determining users’ previous experiences using “yes” and “no” answers.
• Ranked: This type of question is especially useful in analyzing users’ preferences. Users are usually asked to order the items in a list based on different factors, such as the amount of usage or their preference.

Although all these question styles are useful for different cases, for the questionnaire to be effective, the burden on the respondents should be kept to a minimum by using closed questions such as scalar, ranked, or multiple-choice as much as possible.
Such questions are also easier to analyze and compare using simple statistical methods.

Experimental Evaluation. Controlled experiments are one of the most powerful techniques, as they can provide empirical evidence to support a claim or a hypothesis [70]. In all controlled experiments, the evaluator tries to test a chosen hypothesis by measuring certain attributes of participants’ behavior. To do so, different experimental conditions are considered that differ only in the values of a certain controlled variable. In this way, changes that are found in the measured behavior of the participants are attributed to the differing conditions. The design of such experiments can significantly affect the reliability of their findings. This is influenced by the following factors [70]:
• Participants: Participants of evaluation experiments need to be chosen in a way that matches the expected user population as closely as possible. For example, they should be chosen from a similar age group with a similar educational background to the final users. They should also have similar past experience, if the effect of past experience is not the subject of analysis. The number of participants also plays an important role in the reliability of a controlled experiment. Although it is usually decided based on pragmatic considerations, the number of participants needs to be large enough to be representative of the targeted population. Furthermore, as some statistical methods are sensitive to the sample size, the number of participants should be chosen in accordance with their requirements.
• Variables: Controlled experiments are designed to test the validity of a hypothesis by manipulating and measuring variables under controlled conditions. The variables that are manipulated during the experiment are called independent variables, and the variables that are measured to test the hypothesis are called dependent variables [70]. Independent variables can have different values, called levels of the variable.
Experiments can have more than one independent variable, each of which can also have more than one level. For example, if there are four independent variables in an experiment that can each have two levels, then sixteen (2^4) experimental conditions are required to examine all the possibilities. Dependent variables must be measurable in some way, and their value must, as much as possible, only be affected by the changes in independent variables and not by other factors. The most common dependent variables in evaluation experiments of HCI are the number of errors, the time taken to complete a task, quality of user performance, and user preferences [70].
• Hypotheses: The hypothesis of a controlled experiment is in fact its predicted outcome [70]. It claims that a change in independent variables will lead to a change in the values of the dependent variables. This is tested by rejecting the null hypothesis, which claims that the values of dependent variables for two levels of independent variables are not different. Some statistical methods, explained later, are usually used to reject the null hypothesis. If their result is statistically significant, this means that at the given level of significance, the obtained difference between the dependent variables for two levels of independent variables is unlikely to have occurred by chance. The null hypothesis can therefore be rejected [70].
• Experimental Design: The design of a controlled experiment plays an important role in the reliability of its outcome. The first step in the design of any controlled experiment is to determine the hypothesis that the experiment is trying to test. Doing so requires identification of the dependent and independent variables and predicting the effect of the change in the levels of independent variables on the values of dependent variables.
When designing an experiment, two different experimental methods can generally be used [70]:

– Between-Subjects: In this method, which is also called a randomized experiment, each participant is randomly assigned to one of two different groups: the experimental group and the control group. These two groups are similar in every condition except for the independent variables. The purpose is to show that the differences in the values of dependent variables are only the result of the alterations in the levels of independent variables. The number of control groups is determined based on the number of independent variables. This method is sensitive to proper selection of participants, as it can suffer from user bias that stems from the variation between participants.

– Repeated Measures: In this method, also called within-subjects, the participants are placed under each condition. Although this method has the advantage of removing the effect of variation between participants, it can suffer from learning effects, where the skills learned during one condition may influence their performance under other conditions. This can be overcome by altering the order in which the participants are assigned to each condition. For example, half of the participants can be assigned to condition A before B, and the other half to condition B and then A.

Given the advantages and disadvantages of each method, one needs to consider the effect of learning transfer and the availability of participants who can be representative of the user group when making decisions about the design of the experiment. Having carefully designed the experiment, one then also needs to choose the statistical methods to analyze the acquired information.
The next section provides a general overview of the available statistical methods that can be used for hypothesis testing, followed by a more detailed explanation of the methods employed for analysis of the data in the present work.

Overview of Statistical Measures The choice of statistical analysis depends on the type of data at hand and the questions that need to be answered [70]. The obtained data (i.e. variables) can be classified as discrete or continuous [70]. While discrete variables, such as the colors of a user interface, can only have a finite number of values or levels, a continuous variable, such as completion time or user height, can take any value. When dependent variables, which are subject to random experimental variations, follow a certain distribution, then certain statistical methods can be used to analyze them; these are called parametric tests. One such distribution is the normal distribution, where a histogram of the data results in the well-known bell-shaped graph. When the data do not follow a normal distribution, transformations such as the log transform can sometimes be used to bring the distribution closer to a normal one. On the other hand, some statistical tests make no assumption about the distribution of data and are mainly based on the ranks of data. Such statistical methods are called non-parametric. Since they are purely based on the ranks of data, they have the advantage of being less sensitive to outliers. Table 1.1 lists the standard parametric and non-parametric statistical methods depending on the type of dependent and independent variables. Explanation of all these methods falls outside of the scope of this work; the reader is referred to [71] for more detail.
However, a brief explanation of the methods used in this work is provided in the Methods sections.

| Independent Variable | Dependent Variable | Comments |
|---|---|---|
| Parametric | | |
| Two-Valued | Normal | Student’s t test on difference of means |
| Discrete | Normal | ANOVA (ANalysis Of VAriance) |
| Continuous | Normal | Linear (non-linear) regression, factor analysis |
| Non-Parametric | | |
| Two-Valued | Continuous | Mann-Whitney U test |
| Discrete | Continuous | Rank-sum versions of ANOVA |
| Continuous | Continuous | Spearman’s rank correlation |

Table 1.1: Overview of statistical methods. Adapted from [70]

1.4 Statement of Contributions

The contributions of this thesis can be summarized in the following four points:

• A new intuitive HRI method is designed and developed using new technologies such as Oculus Rift and Leap Motion to remotely control a robotic arm.

• The thesis empirically demonstrates the intuitiveness of using Oculus Rift and Leap Motion in interacting with and controlling robots remotely with users inexperienced in robotics and programming.

• User studies based on statistical analysis are used to evaluate and validate the effectiveness, efficiency, and user satisfaction of designed systems.

• The result of the usability test can be considered in future system designs and the future nature of HRI to increase effectiveness, efficiency, and user satisfaction.

1.5 Thesis Organization

This thesis comprises five chapters. Chapter 1 has introduced the work, including motivations and objectives of this thesis, a review of the relevant literature on HRI, and evaluation methods. Next, Chapter 2 provides a detailed explanation of the designed experiments. This covers information about the participants, variables, hypotheses, experimental procedures, and related statistical data analysis methods, along with justification for their use.
Chapter 3 then provides the details of the developed test-bed for the evaluation experiments, including the hardware components and the system architecture. This chapter also provides the details of the designed HRI user interface systems. Subsequently, the results of the evaluation studies are provided in Chapter 4, along with a discussion on efficiency, effectiveness, and user satisfaction regarding the developed HRI modes. Finally, Chapter 5 concludes the thesis and offers some suggestions for future work.

Chapter 2

Methodology

This chapter provides the details regarding the design of the evaluation experiments, including the participants, variables, hypotheses, experimental procedures, and the statistical methods used for data analysis, along with the justification for their use.

2.1 Experimental Design

As explained in the literature review section, different frameworks have been suggested for evaluation of telerobotic systems. For the theoretical framework in this work, standard usability testing was chosen, which considers only the factors of effectiveness, efficiency, and satisfaction. The error factor was also considered in the effectiveness analysis. Learnability and memorability were not explicitly included as they are considered to be implicitly covered in the effectiveness analysis [61]. In addition to the theoretical framework, the methodological framework to perform the evaluations also needs to be considered. Considering the generality of the designed task (pick-and-place), expert analysis methods were irrelevant, and user studies were therefore the most appropriate method.
Questionnaires and controlled experimental evaluations were used to perform these studies and evaluate the effectiveness, efficiency, and satisfaction regarding the three developed HRI modes: manual, VR, and MR, explained in Sections 2.1.3.1, 2.1.3.2, and 2.1.3.3, respectively.

For the questionnaires, a set of seven questions with a 10-point rating scale was used to subjectively evaluate the developed HRI modes. Participants were asked to respond to each question by choosing a number from a scale of 1 to 10, with 1 and 10 being the lowest and highest levels of agreement, respectively. This subjective analysis was used to measure the level of user satisfaction with the developed interaction modes. Furthermore, the data gathered from these questionnaires were also used to analyze users’ comfort with and general perception of the technologies used in the modes. The questionnaires also included some general questions to survey the participants’ demographics and past experiences. Finally, one open-ended question was included to ask for users’ general feedback and for any suggestions as to how to improve the telerobotic system. The questionnaire used in this study can be found in Appendix A.

Furthermore, a controlled experiment was carefully designed to objectively evaluate the effectiveness and efficiency of the developed interaction modes. Controlled experiments are powerful tools to perform hypothesis testing using statistical methods, which is done by measuring the changes in the dependent variables induced only by manipulation of the independent variables. As mentioned in the literature review section, the validity of such experiments is dependent on careful consideration of the following four factors [70]: participants, variables or measures, hypotheses, and the design of the experiments. All these factors of the experiments used in this work are explained in the following sections.
In addition, the experimental procedures and statistical methods used for hypothesis testing are also explained later in this chapter.

2.1.1 Participants

As mentioned in the literature review section, the participants chosen for an experimental evaluation study need to be representative of the target population. However, considering that the telerobotic system developed in this study was not designed for a specific task but only for a general evaluation of the HRI modes, this criterion was deemed to be irrelevant in this case. Twenty-six graduate and undergraduate students of the University of British Columbia, Okanagan Campus participated in this experiment. All were healthy without any physical disability or cognitive defects. None of the participants quit while the experiments were running.

Figure 2.1 shows pie charts of demographic information about the participants. While 38.5% (10 out of 26) of the participants reported having significant experience with playing video games, 61.5% (16 out of 26) reported having no such experience. Furthermore, 19.2% (5 out of 26) also reported having some past experience working with Oculus Rift, and 15.4% (4 out of 26) indicated that they had used Leap Motion in the past. However, considering that such devices are newly commercialized, these experiences were probably not significant enough to be considered in the analysis of this study. Finally, 30.8% (8 out of 26) of the participants were female, and only one person reported being left-handed. The necessary ethical approval for the study was obtained in January 2016 from the UBC Okanagan Behavioural Research Ethics Board.

Figure 2.1: Participants’ demographic information.

2.1.2 Variables, Hypotheses, and Design of the Experiments

As mentioned earlier, standard usability testing with only three factors - effectiveness, efficiency, and satisfaction - was chosen to evaluate the developed HRI modes.
First, the effectiveness of each HRI mode could be measured by the percentage of users who successfully finished the desired tasks. Error analysis considering the rates and nature of the errors that occurred during the experiments could also be used as part of the effectiveness analysis. Second, efficiency could be measured by the time it took for each user to successfully finish the desired tasks using each HRI mode. Finally, users’ satisfaction with the developed HRI modes could be measured directly by asking the users about their opinion on each HRI mode using a questionnaire. Furthermore, since people with gaming experience are known to outperform non-gamers in a number of cognitive skills [72], the effect of users’ gaming experience was also considered to see whether this experience had any effect on the users’ performance, which could in turn affect the effectiveness, efficiency, and consequently the users’ satisfaction regarding each HRI mode. This helped to ensure that any observed differences were due only to the factor of HRI mode.

This analysis necessitated the following null hypotheses to be tested with proper statistical methods:

• There is no relation between effectiveness and HRI mode. Here, the dependent variable is the success rate, which is the frequency of a two-valued (binary) variable (successful/unsuccessful) and therefore not a continuous variable. The independent variable is the HRI mode, which has three different levels: manual, VR, and MR interaction modes. Since each participant was asked to perform the task using each HRI mode, this experiment has a repeated-measures design.

• There is no relation between the success rates of each HRI mode and users’ gaming experience. In this hypothesis, similar to the previous one, the dependent variable is the success rate, which is again a two-valued, non-continuous variable. The independent variable is the users’ gaming experience, which is a two-valued (yes/no) categorical variable.
Since users could not simultaneously be considered to have and not have gaming experience for each HRI mode, this becomes a between-subjects design.

• There is no difference in the efficiency of each HRI mode. The dependent variable here is the Time of Completion (TOC), which is a continuous variable. The independent variable is the HRI mode, which has three levels. Furthermore, since each user participated in all the experiments, this is again a repeated-measures design.

• There is no efficiency difference between the users with and without gaming experience. In this hypothesis, the dependent variable is again the TOC in each HRI mode. The independent variable is the users’ gaming experience, which is again a two-valued categorical variable. Since each user could only be considered in either the experienced or the inexperienced group for each HRI mode, this is a between-subjects design.

• There is no difference between the users’ satisfaction with the VR and MR HRI modes. The dependent variable here is the frequency of each scale value in the questionnaire chosen by the users, which is a discrete variable. The independent variable is the HRI mode, which has two levels: VR and MR. Subjective data from the questionnaire is used for the analysis, and since each user was asked to provide their answers for all cases, this is a repeated-measures design.

• There is no difference in user satisfaction with each HRI mode between the users with and without gaming experience. The dependent variable in this hypothesis is again the frequency of each scale value in the questionnaire chosen by the users for each HRI mode. The independent variable is users’ gaming experience, which is a two-valued categorical variable. Since users could only be considered to be experienced or inexperienced, this is a between-subjects design.

2.1.3 Experimental Procedures

Three different interaction modes were developed for this work: manual, VR, and MR HRI modes.
These developed HRI modes were used to study the effect of using emerging VR technologies, such as Oculus Rift and Leap Motion, on the quality of HRI. To this end, users were asked to perform a simple pick-and-place task using each HRI mode. In each case, after giving their consent upon arrival, participants were asked to fill in a general questionnaire to assess their standard demographic information. Next, a brief description of each task was given to the participants, and they were then asked to control a robotic arm to pick up three small boxes from one table and place them into a larger box on another table; this was facilitated by an electromagnet. If successful, the time that it took the user to finish the desired task was recorded and later used for the efficiency analysis. In case the user failed to achieve the desired goal, the reason was also recorded. These data were later used to perform the effectiveness analysis of each HRI mode. Having finished the task, each user was given a questionnaire to answer. Their responses were used later to measure their satisfaction with each interaction mode. All anonymized raw performance data and questionnaire data are available in Appendix B. Details of each developed HRI mode are provided in the following sections.

2.1.3.1 Manual HRI Mode

In this mode, users had to control the robotic arm using arrow keys on a keyboard while looking at direct video feeds from two Internet Protocol (IP) cameras shown on a monitor. As can be seen in Figure 2.2, one camera provided the video feed from the top view, and the other camera showed the front view of the scene. In this HRI mode, the arrow keys were used to move the robotic arm in a horizontal plane, and the page-up and page-down keys were used to move the arm vertically. The magnet could also be turned on with the “+” key, and off with the “-” key.
Since no pre-programmed actions were used in this case, this mode was considered to be a direct control architecture, where the success of each task was largely dependent on the ability of the user to vertically align the robotic arm (magnet) with the boxes correctly.

Figure 2.2: What users saw when using the manual HRI mode. The screen on the right shows the scene from the front view, and the screen on the left shows the video feed from the top view.

2.1.3.2 VR HRI Mode

In this mode, users controlled the robotic arm through an entirely virtual environment, as shown in Figure 2.3. The input device for this test was a Leap Motion controller that was capable of detecting users’ hands and providing them with a live representation of their hands in the virtual environment. Users were also provided with virtual representations of the three small boxes differentiated by their colours. In this method, users could grab a box by making a fist gesture and then release it. Releasing the fist would initiate the pick-and-place action for the designated box by the robotic arm, which for users’ safety was entirely pre-programmed. Thus, in this case a supervisory control architecture was employed.

Figure 2.3: What users saw when using the VR HRI mode.

2.1.3.3 MR HRI Mode

Similar to the VR method, users also controlled the robotic arm through a virtual environment in the MR HRI mode. As can be seen in Figure 2.4, two live camera feeds coming from two IP cameras were projected on two walls of the virtual room. The first camera feed was projected on the far wall and showed a front view of the robot (the camera was placed in front of the robot), while the second camera feed was displayed on the floor of the virtual room and showed a top view of the robot and its surrounding environment. There were also three virtual boxes corresponding to the three real boxes. Similar to the VR mode, users could grab a box by reaching for it with their virtual hand and making a fist gesture.
Then, they could drag the box in the virtual environment while maintaining that fist to align the projections of the virtual boxes on the two walls with the images of the real boxes. In this way, users provided the robot with the coordinates of the box. Then, users could clap their hands to command the robotic arm to pick up the boxes and place them into the corresponding box, which was again entirely pre-programmed. Therefore, this case was considered to be a shared control architecture.

Figure 2.4: What users saw when using the MR HRI mode.

2.1.4 Statistical Data Analysis

A major part of conducting any quantitative research is identifying the appropriate techniques to analyze the collected data regarding specific questions, observations, or experiments. The variable type, whether it is continuous or categorical, and the number of dependent and independent variables are primary factors that determine the most suitable statistical test. Furthermore, each statistical method for hypothesis testing usually has a few assumptions whose violation can challenge the validity of the analysis.

This section briefly explains the statistical methods used to test the hypotheses formed in Section 2.1.2. An overview of these methods is given in Table 2.1. All analysis in this study was performed using Minitab software.

To analyze effectiveness, considering that the dependent variable is dichotomous (pass/fail) and there are three samples (one for each HRI mode), it was necessary to use a test that can handle three samples with a binary dependent variable. However, such tests typically require independence of observations, which is not the case in a repeated-measures design. Therefore, this analysis was performed in a pairwise fashion, comparing two levels of HRI mode at a time. Considering the nature of the data, McNemar’s test was the most suitable. When considering the effect of users’ gaming experience, however, the design of the experiment changed to between-subjects.
To be consistent with the previous analysis for HRI mode, this analysis was also performed in a pairwise fashion. Since both dependent and independent variables are categorical, Chi-square tests were the most suitable.

For the efficiency analysis, since the dependent variable is continuous, Analysis of Variance (ANOVA) was the most suitable choice. This is a powerful test to handle three or more samples together, and can also consider the interactions of factors in the model. Therefore, a two-way ANOVA (2 × 3) was used to examine the effect of HRI mode and gaming experience on efficiency, along with their interactions. In this case, HRI mode is a repeated-measures variable, and gaming experience is a between-subjects variable.

For the satisfaction analysis, since the dependent variable is the frequency of each categorical value of a Likert scale, the Mann-Whitney U test was a suitable choice. However, since this test requires independence of observations, which is not the case in a repeated-measures design, this test was only used to analyze the effect of users’ gaming experience on their satisfaction with each HRI mode.
To examine the effect of HRI mode on users’ satisfaction, a Sign test was used on the difference of the scores given for each HRI mode to see if the median of difference was equal to zero.

| Dependent Variable | Independent Variable (Levels) | Design of Experiment | Statistical Method |
|---|---|---|---|
| Effectiveness | | | |
| Success Rate | HRI Mode (2) | Two-sample repeated-measures | McNemar’s test |
| Success Rate | Gaming Experience (2) | Two-sample between-subjects | Chi-square and Fisher’s Exact test |
| Efficiency | | | |
| TOC | HRI Mode (3) & Gaming Experience (2) | Three-sample repeated-measures & between-subjects (nested in users factor) | Two-factor mixed-measures ANOVA and post hoc Tukey HSD test |
| Satisfaction | | | |
| Likert Data | HRI Mode (2) | Two-sample repeated-measures | Sign test |
| Likert Data | Gaming Experience (2) | Two-sample between-subjects | Mann-Whitney U test |

Table 2.1: Overview of statistical methods used for hypothesis testing.

2.1.4.1 Chi-Square Test

The Chi-square test for association is a statistical test to compare whether there is a relation between two categorical variables in a contingency table. As mentioned earlier, this test was used to study the effect of users’ gaming experience on the effectiveness of each HRI mode. An example of a 2 × 2 contingency table for the case of the VR HRI mode is shown in Table 2.2, where a, b, c, and d are the number of observations for each cell. The Chi-square statistic can be calculated based on the following equation [73]:

$$\chi^2_c = \sum \frac{(f_o - f_e)^2}{f_e} \qquad (2.1)$$

where $c$ is the degrees of freedom, $f_o$ is the frequency of the observed data, and $f_e$ is the frequency of the expected data. The degrees of freedom can be calculated as the number of rows minus one, multiplied by the number of columns minus one, without considering the rows and columns for the total values. The expected data $f_e$ for each cell can be calculated by multiplying the row totals by the column totals divided by the grand total (N). Using the Chi-square statistic values and the degrees of freedom, a Chi-square distribution table can be used to find the P-value.
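As a rough illustration of Equation 2.1, the following Python sketch computes the expected frequencies, the Chi-square statistic, and the P-value for a 2 × 2 table. It is not the Minitab procedure used in this study: the function name `chi_square_2x2` and the cell counts are hypothetical, no continuity correction is applied, and the closed-form P-value holds only for one degree of freedom.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Return (statistic, p_value) for the 2x2 table [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected frequency: row total times column total over grand total.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected

    # Degrees of freedom: (2 - 1) * (2 - 1) = 1. For one degree of freedom the
    # upper-tail Chi-square probability reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts (experienced/inexperienced by successful/unsuccessful);
# these are NOT data from this study.
stat, p = chi_square_2x2(10, 20, 20, 10)
print(f"chi-square = {stat:.3f}, p = {p:.4f}")
```

The sketch assumes every row and column total is non-zero; otherwise an expected frequency of zero would cause a division error.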
If the P-value is less than the chosen significance value, which is usually 0.05, then the null hypothesis can be rejected, indicating that there is a statistically significant relation between the two categories.

| Gaming experience (rows) / Outcome (columns) | Successful | Unsuccessful | Total |
|---|---|---|---|
| Experienced | a | b | a+b |
| Inexperienced | c | d | c+d |
| Total | a+c | b+d | N = a+b+c+d |

Table 2.2: An example 2 × 2 contingency table for the case of VR HRI mode to study the effect of gaming experience on effectiveness of the interaction mode.

One should note that the validity of a Chi-square analysis is dependent on satisfying the following assumptions [74]:

• Independence of Observations: A Chi-square test cannot be used on correlated data, and the sum of all cell frequencies should be equal to the number of participants in the experiment. Therefore, each participant can contribute to only one cell. This assumption is met when considering the effect of users’ gaming experience in the effectiveness analysis, since each user can only belong to one of the categories.

• Sample Size Assumption: Chi-square calculates an approximate P-value and will only work if the sample size is large enough, where the suggested minimum number of participants is 20 with no fewer than 5 participants in each cell. When the data set is small and more than 20% of cells have an expected value of less than 5, then Fisher’s exact test should also be considered.

2.1.4.2 Fisher’s Exact Test

Fisher’s exact test [75] is a non-parametric test which, similar to the Chi-square test, is used on a contingency table to examine the significance of association between the two kinds of classifications. This statistical test is an alternative to the Chi-square test when the expected count for any cell of the 2 × 2 contingency table of the effectiveness analysis considering the users’ gaming experience is less than 5.
Instead of using approximate distributions, Fisher showed that the probability of obtaining the set of values shown in a contingency table such as Table 2.2 can be calculated exactly as [76]:

$$p = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,N!} \qquad (2.2)$$

which can be used to test the null hypothesis stating that there is no difference in the effectiveness between the two groups of experienced and inexperienced users for each HRI mode.

The following assumptions must be satisfied for the analysis with this test to be valid [76]:

• This test requires one dependent variable and one independent variable that are both measured at the dichotomous level, where the outcome is split into two groups (successful / unsuccessful and experienced / inexperienced pairs).

• This test requires independence of observations, which means that each participant can only be in one of the independent variable levels, making it suitable for between-subjects designs. This is satisfied in this analysis.

2.1.4.3 McNemar’s Test

McNemar’s test [77] was used in this study to examine whether there was a statistically significant difference between the effectiveness of the three different HRI modes in a pairwise fashion. This test is a non-parametric test and is used to see if any statistically significant change in proportions of a dichotomous variable has occurred at two time points in the same population, making it suitable for repeated-measures analysis. Similar to the Chi-square test, McNemar’s test is also applied to a contingency table, in this case the one shown in Table 2.3.
The McNemar’s test statistic can be calculated using the following equation [78]:

$$\chi^2 = \frac{(|b - c| - 1)^2}{b + c} \qquad (2.3)$$

One should note that for McNemar’s test to be valid, the following assumptions must be satisfied [79]:

• This test requires one dichotomous dependent variable (successful/unsuccessful in this case) and one categorical independent variable (HRI mode) with two related groups (same users for all HRI modes).

• This test also requires the groups of the dependent variable to be mutually exclusive, which means that participants can only be in one of the two groups. This is true here since users could only be successful or unsuccessful in their tasks.

| HRI mode 2 (rows) / HRI mode 1 (columns) | Successful | Unsuccessful | Total |
|---|---|---|---|
| Successful | a | b | a+b |
| Unsuccessful | c | d | c+d |
| Total | a+c | b+d | N = a+b+c+d |

Table 2.3: General notation of a 2 × 2 contingency table for paired samples.

2.1.4.4 ANOVA

ANOVA was used in this study to examine the effect of HRI mode and users’ gaming experience on the efficiency of HRI. In its simplest form, ANOVA is a general procedure for isolating the sources of variability in a set of measurements [80]. ANOVA is capable of simultaneously testing for many different factors with “n” different levels, and is reasonably robust against moderate violations of the normality assumption. Furthermore, when considering different factors, it is able to account for interactions of those factors in their effect on the dependent variable. Such factors can include repeated-measures, between-subjects, or a combination of both factors, making it the best choice for the efficiency analysis, where users’ gaming experience is a between-subjects factor and HRI mode is a repeated-measures factor, making the analysis a mixed-model ANOVA.
Performing a mixed-model ANOVA where both between-subjects and repeated-measures factors are included necessitates the application of the General Linear Model (GLM), where the data can be represented as follows [81]:

$$\text{Data} = \text{Model} + \text{Error} \qquad (2.4)$$

In this case, the model is the researcher’s understanding of the data, or in other words the hypotheses about the data, and the error term accommodates other known or unknown factors that may influence the data. When a full model is used in the GLM approach, the error term will include influences from all possible factors that are not controlled in the experiment. However, when a reduced model is used, which is when a factor is intentionally left out of the model, the contribution of the omitted factor(s) will also be accommodated by the error term. Therefore, in general, comparisons between the error term of a full model and the reduced models can reveal the effect of different factors. The details of the application of GLM to mixed-model ANOVA can be found in Appendix C.1.

For a mixed-measures ANOVA to be valid, the following assumptions must be satisfied [82]:

• Normality of Data: The dependent variable should be approximately normally distributed. This can be examined by performing a normal probability test for the data at hand, which can be done using Minitab.

• Homogeneity of Variances: There should be homogeneity of variance for each combination of the groups of the two between-subjects and repeated-measures variables. To test for this assumption, Levene’s test [83] for homogeneity of variances can be used.

• Sphericity: Variance of the difference between all possible pairs of levels of the repeated-measures factor must be equal. This assumption can be tested using Bartlett’s test [84].

2.1.4.5 Post-Hoc Tukey HSD

Rejection of a null hypothesis using ANOVA only reveals the existence of at least one test group that has a significantly different mean than other groups in the study.
Therefore, post-hoc analysis is usually employed after null hypothesis rejection to reveal the pattern in the means of the test groups. One test that is commonly used for such analysis is the Tukey Honestly Significant Difference (HSD) test, also known as the Tukey-Kramer method [79]. This is a multiple comparison test that compares all possible pairs of means by calculating the HSD between two means using the studentized range distribution, denoted by q. The technical details of this test can be found in Appendix C.2.

2.1.4.6 Box-Cox Normality Transformation

Most statistical tests have assumptions regarding the distribution of the data that need to be analyzed. However, many real data sets do not meet these assumptions: the data might be skewed severely to the left or to the right. Although in such cases one has the option to work with non-parametric tests that do not assume normality of data sets, another option is to use an appropriate transformation of the data set, which can make the data approximately follow a normal distribution and qualify for parametric tests [85]. In 1964, Box & Cox [86] introduced a family of power transformations that can be used to bring the data at hand closer to a normal distribution:

$$y_t^{(\lambda)} = \begin{cases} \dfrac{y_t^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\ \log y_t, & \lambda = 0 \end{cases} \qquad (2.5)$$

where $y$ is the response variable and $\lambda$ is the transformation parameter, which can have the values shown in Table 2.4 for the corresponding transformations.

| Transformation | Box-Cox λ Parameter |
|---|---|
| Square Root Transformation | λ = 0.5 |
| Cube Root Transformation | λ = 0.33 |
| Fourth Root Transformation | λ = 0.25 |
| Natural Log Transformation | λ = 0 |
| Reciprocal Square Root Transformation | λ = −0.5 |
| Reciprocal (Inverse) Transformation | λ = −1 |

Table 2.4: Transformations and their corresponding Box-Cox transformation parameter [87].
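
The transformation in Equation 2.5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Minitab procedure used in this study: the function name `box_cox` and the sample completion times are hypothetical, and λ is supplied by hand rather than estimated from the data.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox power transformation (Equation 2.5) to positive data."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lambda = 0 case is defined as the natural logarithm.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


# Hypothetical right-skewed completion times in seconds; NOT data from this study.
times = [12.0, 15.0, 18.0, 25.0, 60.0]

log_times = box_cox(times, 0)          # natural log transformation
sqrt_times = box_cox(times, 0.5)       # square-root-type transformation
reciprocal_times = box_cox(times, -1)  # reciprocal-type transformation

print(log_times)
print(sqrt_times)
print(reciprocal_times)
```

Because the transformation is monotonically increasing for every λ, the ordering of the observations is preserved; only the spacing between them changes, which is what reduces the skew.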
Mann-Whitney U Test

As mentioned earlier, the Mann-Whitney U test was used in this study to analyze the Likert data of experienced and inexperienced users to examine the effect of gaming experience on user satisfaction with different HRI modes. The Mann-Whitney U test is a version of the independent samples t-test that can be performed on ranked (ordinal) data, with its null hypothesis stipulating that the two groups are from the same population and therefore have the same distribution. The Mann-Whitney U test compares each observation from sample 1 with each observation from sample 2. To this end, data must be ranked in ascending order and then individually compared. If the two samples are from the same population, then each data point in sample 1 will have an equal chance of being larger or smaller than each data point from sample 2; a departure from this balance can be used to reject the null hypothesis. The technical details of this test can be found in Appendix C.3.

The Mann-Whitney U test requires the data to meet the following assumptions [88]:

• The two samples must be randomly drawn from the target population. This assumption was met in the study of the effect of gaming experience on satisfaction, since users were chosen randomly.

• There must be independence of observations within groups and mutual independence between groups. This was also met since users could be either experienced or inexperienced, but not both.

• The dependent variables must be of ordinal, relative, or absolute scale type. This was the case in the satisfaction analysis, which used Likert-type data from the questionnaire.

Sign Test

A one-sample Sign test was used in this study to examine the effect of HRI mode on users’ satisfaction with the interaction. The Sign test is a simple test to determine whether the median of a population differs from a target value [89]. 
Performing a Sign test on the difference of the scores given by users for each HRI mode helped to identify whether there was a difference between user satisfaction for each mode. In this case the null hypothesis being tested stipulates that the difference of the scores has a zero median. To perform the Sign test, the given data are divided into two groups, those higher and those lower than the stipulated median, whose sizes are denoted as r+ and r−, respectively. The values that are equal to the median are ignored and the sample size is updated and denoted as n. Since under the null hypothesis both groups should have equal sizes, both r+ and r− follow a binomial distribution with p = 1/2 and N = n. Therefore, the binomial distribution can be used to find the probability of observing a value of r = max(r+, r−) and higher. This yields the p-value that can be used to reject the null hypothesis.

The following assumptions about the data at hand must be met for the Sign test to be valid [89]:

• The dependent variable should be continuous or ordinal, which was true for the Likert-type data used for this test.

• The independent variable should be a categorical variable of matched pairs. This assumption about the data was also met since the independent variable of this test was the HRI mode, which has two levels: the VR and MR modes.

• The paired observations for each user need to be independent. This was also met since the data from one user were not influenced by the data from other users, and their scores were independent.

Chapter 3

Experimental Setup

This chapter provides details about the development of the three designed HRI modes. This includes explanations of the system architecture, interface software, graphical user interface, and hardware components.

3.1 System Architecture

Figure 3.1 shows the architecture of all three designed HRI modes, where the green, yellow, and orange arrows show the connections between the equipment and interface software in the manual, VR, and MR HRI modes, respectively. 
All three HRI modes have a master-slave architecture and they are all connected through the interface software program developed in this study. In the manual control mode, the master side includes the user, keyboard, and a desktop computer. In the VR and MR modes, the master side includes the user, Leap Motion, and Oculus Rift. The slave side for all three modes includes a robotic arm, a Raspberry Pi, and a magnet. Two IP cameras were used for visual feedback in the manual and the MR HRI modes. Figure 3.2 shows the experimental setup with the robotic arm.

Figure 3.1: Schematic drawing of components of different HRI modes and their communications

Figure 3.2: Experimental setup with the robotic arm

3.2 Interface Software

All devices used in this experiment required communication with central interface software to receive commands and provide feedback. Figure 3.3 provides a schematic illustration of the implemented interface software used in this thesis. All communications, except the communication between the interface software and the user interface (Unity), were implemented through a local network using a Virtual Private Network (VPN) connection. The interface software was programmed using C# and installed on the same computer where the graphical user interface program was implemented. However, any server with a connection to the VPN could also have been used. As can be seen in Figure 3.3, the connection between the interface software and the user interface (Unity) was implemented using a TCP connection. The connections from the interface software to the Raspberry Pi, the robotic arm, and the IP cameras were established using a router and a VPN connection; this was achieved via Secure Shell (SSH), Telnet, and Hypertext Transfer Protocol (HTTP), respectively.

Figure 3.3: Schematic drawing of the connection system.

3.3 Graphical User Interface

To design the graphical user interface in the VR and MR HRI modes, Unity 5.3 [90] was used. 
Unity is a game engine that supports Leap Motion and VR goggles. Unity was programmed as a medium to use Leap Motion to interpret users’ gestures as an input and send related commands to the central interface software accordingly. In the MR HRI mode, Unity was also used to stream live videos from two IP cameras directly over an HTTP stream. The programming language used in Unity was C#. As mentioned earlier, the communication between Unity and the remote robotic arm in all three methods was done directly through the internet.

Unity was chosen for this project because it has a suitable development environment to build and customize 3D computer-generated objects, which can be used as a substitute for traditional information displays such as pictures, graphs, and texts. Unity provides the possibility of designing, customizing, and updating the virtual environment to build a prototype [91]. Furthermore, when designing the virtual environment, there is no need to build the executable file to see the changes made in Unity, as one can simply switch to the play mode and see the changes there. This can greatly facilitate the design and development of virtual environments. To connect Oculus Rift to Unity in this study, Oculus Runtime 1.3 was installed. When connected, in the Unity play mode, a game view appears both on the Oculus Rift and on the computer screen. This is a side-by-side view, as shown in Figure 3.4.

Figure 3.4: Side-by-side view of Unity environment on desktop

3.4 Hardware Components

This section briefly explains the hardware components of the system, namely Oculus Rift, Leap Motion, the robotic arm, Raspberry Pi, and the IP cameras.

3.4.1 Oculus Rift

Head-Mounted Displays (HMDs) are goggles that allow users to see through them; they can either augment reality or give users the feeling of being in a virtual world. 
The main idea is to create a 3D perspective from two two-dimensional images of the same scene. A slightly changed perspective gives the user the impression of being in a 3D world. Using goggles, this immersive experience lets users feel that they are actually in another environment and are able to interact with that environment. This feeling of immersion can open new fields and areas of applications for such technologies.

Oculus Rift was chosen as the HMD device for the VR and MR HRI modes. Oculus Rift is a binocular, fully closed HMD which completely fills the user’s Field of View (FOV). The combination of two digital screens in front of the user’s eyes provides the user with the illusion of depth, resulting in 3D virtual environments that are life-sized from the user’s perspective. This is done by streaming two near-square video feeds, one to each screen (one for each eye), that come from cameras (real or virtual) with a slightly different angle so that the user’s brain is tricked into thinking that the two 2D images are one 3D image [92]. One can experience this by alternately closing each eye, whereby the scene will slightly move horizontally. One should note that Oculus Rift inherently blocks users’ view of their real surrounding environment unless other devices such as cameras are used to relay the images or videos of this environment onto the displays of the device.

Oculus Rift has a tracking system that includes a positional camera that tracks the LEDs placed on the goggles. Each LED on the Oculus Rift goggles emits at a different frequency and intensity, allowing the tracking camera to detect individual LEDs and thereby facilitating detection of the user’s head position and orientation. In the development version of Oculus Rift goggles, the positional tracker lost information when the user tried to look behind herself; thus, it was not able to detect the user’s head position and orientation accurately for such situations. 
However, in the new consumer versions, LEDs have been added to the back of the headset to give the user a full 360-degree perspective. In addition, Oculus Rift’s internal tracking system includes a gyroscope, an accelerometer, and a magnetometer to facilitate more accurate head tracking in all three dimensions [92]. Some people report feeling dizzy after using Oculus Rift [93]. Therefore, this aspect had to be investigated for the developed HRI modes using the device.

3.4.2 IP Cameras

Two IP cameras were located in the remote environment in this study. IP cameras are able to send and receive data via the Internet using wired, wireless, or cellular networks. Thus, these cameras can provide the users with a view of the remote environment from anywhere using the Internet. Depending on the quality of the Internet service, the resolution, frame rates, and data rates could be affected in wireless transmissions. As these cameras are equipped with LEDs, they are also capable of capturing live video in low light conditions. To live stream the movements of the robotic arm in this study, one camera was mounted on a tripod in front of the robotic arm, and the other one captured the view from the ceiling. The live feeds of these two videos were used in the manual and MR HRI modes.

3.4.3 Leap Motion and Kinect

Two methodologies have been introduced to improve the intuitiveness of HRI [94]: haptic-based controls and gesture controls. As haptic-based control has been tested broadly in mobile platforms [94], the focus of this thesis is on gesture controls. Gestures are a recognized natural way of communication between humans [95]; therefore, the development of intuitive interaction modes can focus on hand gesture tracking.

Kinect [96], a 3D motion-sensing camera, was first selected as the input device for this project. Kinect is a marker-free system, where no physical contact of objects and measuring devices is required. 
Kinect can detect the skeletons of two people by identifying 20 joints on each user; thus, users’ joint movements can be tracked using these skeletal features. In this work, this data was used to calculate the three coordinates of the user’s right hand. These coordinates were then transformed to the robotic arm coordinates, and later transmitted to the robotic arm via an Internet socket connection. Next, the built-in inverse kinematics of the robotic arm were used to move the end-effector to the relevant position and orientation. The challenge of using Kinect for hand tracking was threefold. First, when a user moved her hand faster than Kinect’s capture rate of 30 Frames per Second (FPS) could follow, Kinect and consequently the robotic arm were not able to keep up, which caused the latency of the system to grow. Second, the developed hand detection algorithm needed to be calibrated for each user since calculation of each user’s hand position was dependent on the height, distance from the Kinect, and arm length of each user. Third, even a change in shirt color or lighting condition of the lab affected the Kinect’s ability to accurately detect the users’ skeleton.

Given these challenges, Leap Motion [97] was believed to be a better choice for hand tracking than Kinect. Kinect can detect users from meters away, whereas Leap Motion only detects hands at distances from 25 mm to 600 mm. The trade-off in choosing Leap Motion over Kinect was therefore to gain accurate detection at the cost of a limited detection area. Leap Motion is also much smaller than Kinect and less expensive. Leap Motion can detect the position, rotation, and orientation of 10 individual fingers and the palm of the hand in real time with great precision. 
Furthermore, the Leap Motion cameras are able to capture up to 200 FPS, compared to the 30 FPS captured by Kinect [98].

Leap Motion detects users’ two hands along with their fingers for each captured frame. The data captured in each frame are then used for positional tracking, where Leap Motion can describe the motion by comparing each frame with the previous one. To do so, Leap Motion compares the rotation, translation, and scale of the detected objects between two frames and estimates the relative motion of the hands and fingers between different frames, which can be applied for gesture tracking. Leap Motion is usually placed on a flat screen beneath the user’s hands. The area above Leap Motion, where it can detect the user’s hands, is called the hover zone, and it has an inverted pyramid shape centered at the location of the Leap Motion. Leap Motion’s FOV is about 150 degrees, and if the hands move outside this area, then the device will lose track of them. Leap Motion has two infrared cameras and three infrared LEDs. The LEDs produce infrared light; the reflection of objects in that light is captured by the infrared cameras and later interpreted by Leap Motion’s software [99]. As Leap Motion does not use a depth map, once data have been transferred to the computer, algorithms are applied to the raw sensor data to reconstruct a 3D representation of what the device sees.

Furthermore, Leap Motion’s API has some predefined gestures, such as circle, swipe and tap [99], that can be used to develop interaction modes. In general, gestures can be static or dynamic. Static gestures are mainly built based on the palm and fingers’ relative distance, such as the distance from the palm to the fingertips or the distance between two adjacent fingers. If the total value of the velocity magnitude of the fingers and palm is greater than a user-defined threshold, then the gesture is categorized as dynamic. 
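
The static/dynamic distinction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the velocity units, and the threshold value are hypothetical, the velocities are passed in as plain (x, y, z) tuples rather than read from the Leap Motion API, and the study's actual implementation was written in C# within Unity.

```python
import math


def speed(velocity):
    """Return the magnitude of a 3D velocity vector given as (x, y, z)."""
    x, y, z = velocity
    return math.sqrt(x * x + y * y + z * z)


def classify_gesture(palm_velocity, finger_velocities, threshold):
    """Label a frame as 'dynamic' or 'static'.

    The gesture is treated as dynamic when the summed velocity magnitude
    of the palm and all fingers exceeds the user-defined threshold.
    All names and units here are assumptions made for illustration.
    """
    total = speed(palm_velocity) + sum(speed(v) for v in finger_velocities)
    return "dynamic" if total > threshold else "static"


# Hypothetical frame: a resting hand, then a hand whose palm is moving
still_fingers = [(0.0, 0.0, 0.0)] * 5
resting = classify_gesture((0.0, 0.0, 0.0), still_fingers, threshold=50.0)
moving = classify_gesture((60.0, 80.0, 0.0), still_fingers, threshold=50.0)
```

In this sketch the threshold is left to the caller, mirroring the user-defined threshold mentioned in the text; a suitable value would have to be tuned for the device and task.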
To analyze a dynamic movement, the palm, fingers, and their relative movements (velocity, translation, and rotation) are examined. For example, to analyze a rotational movement, the normal vector of the palm is compared with that of the previous frame. In this thesis, the built-in Grab Strength function was used for the grabbing gesture. This function indicates how close a hand is to being a fist, where any fingers that are not curled will reduce the grab strength. Clapping was detected by measuring the relative position and motion of the palm vectors of both hands. If they came close together and then moved apart very quickly, this was considered to be a clapping gesture. However, one should note that as Leap Motion uses infrared light to gather hand information, when fingers are occluded by another part of the hand, quality of the tracking data is greatly reduced and Leap Motion’s predicted gesture might be inaccurate.

3.4.4 Robotic Arm

The 6 Degree of Freedom (DOF) Kawasaki robotic arm was used in this experiment. It has six revolute joints and a carrying capacity of 10 kilograms. The basic components of a robotic arm consist of actuators, a Programmable Logic Controller (PLC), and a programming teach pendant. The PLC only has a few buttons, such as turn on/off, emergency stop, error, and cycle buttons. As there is not much of a user interface on the PLC, the teach pendant is the primary interface of the controller. Manual manipulation of the joints can be done by using the teach function on the teach pendant. By using the teach function, the user can manipulate the speed, orientation, and position of each joint. 
Each position can be saved and the motion between each position can be specified. The remote button on the teach pendant allows the user to read and write controller parameters using either RS232 or an Ethernet connection.

To find the robotic arm’s reachable points, a program was run on the robotic arm. This program asked the robot to attempt to access different points in a simple exploratory pattern. The only way the robotic arm could provide an error regarding an unreachable point was by providing out-of-range errors. The program determined whether a point was reachable by attempting to reach it and seeing whether an out-of-range error was encountered. The program started with a single known reachable point, and then looped over neighboring points in a grid pattern to see if those points were also reachable. The program repeated the loop until no reachable points had unexplored neighbors. The robot’s 3D reachable points plot was generated and can be seen in Figure 3.5.

The Kawasaki E-series robot is programmable using AS language, which allows for control over many of the robot’s functions. The robot itself can store different AS programs which can be run when suitable. In this study, five different AS programs were written, each corresponding to an action the robot needed to accomplish. The first program, used in the manual control test, simply prompted the terminal for a point as input and then moved the robot to that point. The next three programs were used for the simple virtual test, with each program corresponding to the action of the robot picking and placing different-colored sponges. The last program, which was used in the MR test, prompted for a point, then instructed the robot to attempt to pick up a box at that point and place it in the drop-off location. The initiation and input/output of these programs were handled by the central interface software.

Figure 3.5: Workspace of the robotic arm.

3.4.5 Raspberry Pi and Magnet

Raspberry Pi is a credit-card-sized computer. 
It is slower than a modern laptop or a desktop, but it is still a complete Linux-based computer [100]. Raspberry Pi was used as a controller for the electromagnet mounted on the end-effector of the robotic arm, and was connected wirelessly via the router to the central interface software. Along the edge of a Raspberry Pi board, pins serve as a physical interface between the Pi and the outside world. These General-Purpose Input/Output pins are known as GPIO. The Raspberry Pi itself had a simple Python program that turns a GPIO pin controlling the magnet on or off depending on the user’s input. When the central interface software connected to the Raspberry Pi in this study, it executed the program and then fed its input to control the magnet.

Chapter 4

Results and Discussion

This chapter presents the experimental results along with their statistical analysis and their discussion. As explained in Chapter 3, standard usability testing was adopted for evaluation of the developed system. Only three factors were considered in this testing: effectiveness, efficiency, and satisfaction. First, the chapter presents and discusses the results on the effectiveness of the three systems, where the effectiveness is analyzed using the percentage of task completion. Chi-square, McNemar’s, and Fisher’s exact test analyses are presented in this section to determine whether any relation exists between the effectiveness of the three HRI modes and users’ gaming experience. Second, the results of the efficiency analysis of the systems are provided, where a two-way ANOVA is used to evaluate the effect of the HRI modes and users’ gaming experience. Third, the three systems are compared subjectively for user satisfaction, where a sign test is used to measure users’ preference of the VR and MR modes. A Mann-Whitney U test is also used to analyze the effect of users’ gaming experience on these results. The following sections then describe the study and hypothesis testing conducted on each of the three factors. 
All the statistical analyses were performed using Minitab software.

4.1 Effectiveness

The effectiveness of each system was measured by the percentage of users who were able to successfully complete each task. Success was determined by whether or not the user was able to pick up the colored boxes and place them inside the corresponding box. Since all users were subject to all HRI modes, this was a repeated-measures analysis, where the assumption of independence of variables was invalid for the Chi-square test. Therefore, McNemar’s tests were used to see if there were significant differences in the effectiveness between the HRI modes. Since McNemar’s test can only test for differences between two samples, three pairwise analyses were performed. According to Boot et al. [72], gamers can surpass people with no gaming experience in a number of cognitive skills. Hence, to investigate whether gaming experience had any impact on the effectiveness of the different HRI modes, users’ gaming experience was also considered. In this case, each user could only be assigned to one level of each factor, making it a between-subjects design. Therefore, a Chi-square test could be used for this analysis.

4.1.1 Effect of Human-Robot Interaction Modes

A McNemar’s test between the manual and VR HRI modes resulted in a P-value of 0.508, indicating that there was no significant difference between the success rates of these two HRI modes. Furthermore, a McNemar’s test between the manual and MR modes resulted in a P-value of 0.031; thus, considering a significance level of 0.05, there was a significant difference between the effectiveness of these two HRI modes. The McNemar’s test also returned a P-value of < 0.001 for the VR and MR HRI modes, which indicated a significant difference between the effectiveness of these two HRI modes. Details of the Minitab output for these analyses can be found in Appendix D.1. Figure 4.1 shows pie charts of the rate of successful and unsuccessful task completion for each HRI mode. 
As can be seen, out of 26 users, the highest success rate was 88.5% (23 out of 26 users) for the VR HRI mode, followed by the manual mode, where the success rate dropped to 76.9% (20 out of 26 users). The lowest success rate was measured for the MR system, where only 38.5% (10 out of 26 users) were able to successfully complete the desired tasks.

Figure 4.1: Pie charts of the rate of successful and unsuccessful task completion for each HRI mode

4.1.2 Effect of Gaming Experience

To statistically examine the interaction between users’ gaming experience and each HRI mode in their effect on success rates, Chi-square tests of association were also conducted. For the manual mode, this relation was not found to be significant, χ²(1, N = 26) = 0.439, p = .508. The Chi-square test had two cells with counts lower than 5. Furthermore, Fisher’s exact test also returned a P-value of 0.644. This relation was not found to be significant for the MR mode either, χ²(1, N = 26) = 0.016, p = .899. The Chi-square test had one cell with a count lower than 5; Fisher’s exact test also returned a P-value of 1. The same was true for the VR mode, χ²(1, N = 26) = 0.038, p = .846. The Chi-square test had one cell with a count lower than 5, and Fisher’s exact test returned a P-value of 1. Therefore, with a significance level of 0.05, it can be concluded that there was no significant interaction between gaming experience and HRI modes in their effects on completion rates. This is also evident from Figure 4.2, which shows pie charts of the rate of successful and unsuccessful task completion based on HRI mode and users’ gaming experience. As can be seen, for the manual mode the percentage of successful task completion was 81.3% (13 out of 16 users) for users without gaming experience, and 70.0% (7 out of 10 users) for users with gaming experience. In the MR mode, the successful task completion rate was 40% (4 out of 10 users) and 37.5% (6 out of 16 users) for users with and without gaming experience, respectively. 
For the VR mode, the successful task completion was 87.5% (14 out of 16 users) for non-gamers versus 90% (9 out of 10 users) for gamers. Minitab outputs from the Chi-square and Fisher’s tests can be found in Appendix D.2.

Figure 4.2: Pie charts of the rates of successful and unsuccessful task completion for HRI mode and users’ gaming experience

4.1.3 Error Rates and Types

In addition to the rate of successful task completion, error types and rates can also provide useful information about the effectiveness of the systems. In total, 25 errors occurred during the experiments. The error types can be divided into six categories, as shown in Table 4.1. Out of these 25 errors, 18 resulted from mistakes made by the operators; these are categorized as human errors. Of these 18 errors, in 17 cases users had problems with aligning the end-effector with the boxes, while one user had a problem with Leap Motion in general and could not use it effectively. The remaining seven errors are categorized as system errors, as they originated from some failures in the system. In two cases, the magnet did not turn on properly, resulting in users being unable to pick up the boxes. In two cases, the Leap Motion failed to detect the grabbing gesture (clapping of hands) and did not trigger the magnets to turn on; in one case it failed to detect the user’s hands altogether; and in the remaining two cases the system did not run properly without any known reasons.

Human/System | Error Type | Counts | Percentage
Human Error | End-Effector Misalignment | 17 | 68
Human Error | Leap Motion - General Use | 1 | 4
System Error | Magnet Failure | 2 | 8
System Error | Leap Motion - Grabbing Gesture | 2 | 8
System Error | Leap Motion - Hand Detection | 1 | 4
System Error | Unknown | 2 | 8

Table 4.1: Error Types and Rates

Figure 4.3 shows pie charts of errors for each HRI mode. The top row shows the frequency of different error types for each system, and the bottom row shows the ratio of human versus system error for each HRI mode. 
As can be seen, six errors occurred in the manual mode, all of which were human errors where users could not align the end-effector with the boxes. In the MR mode, 16 errors were observed, 11 of which were human errors, where users again could not align the end-effector with the boxes. Out of the five system errors, one resulted from the Leap Motion failing to detect the grabbing gesture, two resulted from the magnet not turning on, and the remaining two had no known cause (Table 4.1). Finally, only three errors occurred in the VR mode. Only one of them was a human error, where one user could not, in general, use the Leap Motion properly; the two system errors originated from Leap Motion, where in one case it failed to detect a user’s hands, and in another case it failed to detect the grabbing gesture. In general, the VR system had the fewest human errors, followed by the manual system. In contrast, the MR system had the highest number of human and system errors. All in all, it seems that the introduction of gesture tracking to the system through Leap Motion resulted in some system errors but reduced the number of human errors.

Figure 4.3: Pie charts of the percentage of (upper) error types (bottom) human versus system errors categorized for each HRI mode

In summary, considering the results of the Chi-square, McNemar’s, and Fisher’s exact tests, there were significant differences in the effectiveness of the VR and manual HRI modes compared to the MR mode, and users’ gaming experience had no relation to these differences. In general, considering the success rates, the VR system was the most effective HRI mode, followed by the manual system. This could be because in the VR mode, users did not need to create any mental models of the environment, which reduced the cognitive load of the task. On the other hand, the MR system had the lowest success rates. This might stem from the fact that users needed to process the additional visual data from cameras, which could increase the cognitive workload. 
This is evident from the number of human errors in aligning the end-effector with the boxes. Similarly, the number of human errors observed with the manual system could also indicate that users had difficulty understanding the position of the boxes using only visual feedback from the two cameras. This may also have increased the cognitive load since users had to create mental models of the environment while performing the desired tasks.

4.2 Efficiency

As mentioned in the methodology section, efficiency of a telerobotic system can be measured by the time it takes the users to successfully finish the desired tasks, which is denoted as TOC. To examine the efficiency of the different HRI modes, statistical analysis was carried out to see if there were any statistically significant differences between the measured TOCs for each system. Similar to the effectiveness analysis, users’ past experiences were also considered here. Since the dependent variable, measured TOC, is a continuous variable and the analysis was between more than two samples (three different HRI modes), ANOVA was the most suitable choice.

4.2.1 Effect of HRI Mode and Gaming Experience

A two-way mixed-model ANOVA was performed to compare the main effects of HRI modes and users’ gaming experience along with the interaction between these factors in their effects on TOC. The HRI mode factor included three levels (manual, VR, and MR), while the factor of gaming experience only had two levels. The gaming experience is a between-subjects factor since users can only belong to one of the two levels. 
Table 4.2 lists these factors along with their levels.

Factors | Types | Levels | Values
HRI Mode | Fixed | 3 | Manual, VR, and MR
Gaming Experience | Fixed | 2 | Yes and No
Users(Gaming Experience) | Random | 25 | User1(No), User10(No), User12(No), User13(No), User15(No), User16(No), User17(No), User18(No), User2(No), User22(No), User25(No), User26(No), User3(No), User7(No), User9(No), User11(Yes), User14(Yes), User19(Yes), User20(Yes), User21(Yes), User23(Yes), User24(Yes), User5(Yes), User6(Yes), User8(Yes)

Table 4.2: Factors and their levels in the mixed-model ANOVA for efficiency

Furthermore, a Box-Cox transformation with λ = 0, which is equivalent to taking the natural log of the data, was used to ensure normality in the distribution of the measured TOC data. The results of the ANOVA for the transformed data are listed in Table 4.3. Considering a significance level of 0.05, the main effect of the HRI mode yielded an F ratio of F(2,21) = 155.29, p < 0.001, indicating that there was a significant difference between TOC values of at least two of the HRI modes: manual (M = 122.16), VR (M = 24.79) and MR (M = 88.55). Figure 4.4 shows the post-hoc Tukey HSD for HRI modes with 95% confidence intervals. As can be seen, only one of the intervals passes through the zero line; therefore, the difference between the efficiency of MR and manual modes was not statistically significant. Furthermore, the main effect of the gaming experience resulted in an F ratio of F(1,21) = 1.09, p = 0.30, which indicates that there was no significant difference between the mean TOC of users with gaming experience (M = 79.45) and those with no gaming experience (M = 73.47). 
The interaction effect was also found not to be significant, F(2,21) = 0.33, p = 0.72.

Source | DF | Adj SS | Adj MS | F-Value | P-Value
HRI Mode | 2 | 23.16 | 11.58 | 155.29 | 0.000
Gaming Experience | 1 | 0.10 | 0.10 | 1.09 | 0.30
HRI mode * Gaming Experience | 2 | 0.05 | 0.02 | 0.33 | 0.72
Users(Gaming Experience) | 23 | 2.22 | 0.10 | 1.30 | 0.28
Error | 21 | 1.57 | 0.07 | |
Total | 49 | 32.23 | | |

Table 4.3: ANOVA results for the transformed TOC data, effect of HRI modes and gaming experience

Figure 4.4: Post-hoc Tukey HSD test for HRI mode factor

Figure 4.5 shows standardized residual plots of the ANOVA test. The upper-left figure shows the normal probability plot of the residuals, which seems to have a more or less normal distribution. This can also be seen in the lower-left figure, which shows the histograms of the standardized residuals. The upper-right figure shows the standardized residuals plotted against fitted values, which do not show any particular trends. Finally, the lower-right figure shows the standardized residuals versus observational order, which also shows no particular pattern. Furthermore, the Levene’s test for equality of the variance returned a p-value of 0.30 for the transformed data, indicating that with a significance level of 0.05 they had equal variances. All the normal probability plots of the transformed data also returned p-values of more than 0.21, indicating that the data had normal distribution. Furthermore, the Bartlett’s test for sphericity (equal variance for repeated-measures factor) returned a p-value of 0.055 for the transformed data. Although this p-value is close to the significance level of 0.05, the equality of variance for the HRI mode data can still be assumed.

Figure 4.5: Residual plots for the ANOVA analysis for factors of HRI mode and gaming experience

All in all, considering the results of the ANOVA, the main effect of HRI mode was found to be statistically significant, while the main effect of users’ gaming experience was not. Figure 4.6 shows the main effect plots of these factors. 
As can be seen, the manual mode had the highest mean TOC, making it the least efficient system, while the VR mode had the lowest mean TOC, making it the most efficient system. Surprisingly, the users who had gaming experience had a slightly higher mean TOC, but this was probably due to chance, since the main effect of gaming experience was not found to be significant. Furthermore, Figure 4.7 shows the interaction plots of these factors. The lines in this figure are not parallel, suggesting that gaming experience might have interacted with the HRI mode in its effect on TOC: the users who had gaming experience performed worse in the manual mode. However, the ANOVA showed that the interaction between these factors was not significant; therefore, this pattern might be due to chance. The details of the Minitab outputs for the ANOVA and its related tests can be found in Appendix D.3.

Figure 4.6: Main effect plots for the factors of HRI modes and gaming experience

Figure 4.7: Interaction plots for the factors of HRI modes and gaming experience

4.3 Satisfaction

As explained in the methodology chapter, user satisfaction was measured subjectively by asking users to express how much they agreed with the following two statements on a scale from 1 to 10, with 10 being the highest level of agreement:

1. I would like to use the VR system in future applications.
2. I would like to use the MR system in future applications.

Figure 4.8 shows histograms of the counts of the users’ chosen levels of agreement with these statements. The histogram on the left shows the distribution of the user agreement data for the first statement, and the histogram on the right for the second statement. As can be seen, both data sets have similar distributions with median values of 8. Therefore, it can be concluded that users were highly satisfied with both VR and MR systems in general (8 out of 10).
Furthermore, the two data sets had equal medians; this was also supported by a sign test on the differences between the users’ scores for the two HRI modes, which returned a p-value of 0.61. Hence, it can be concluded that users had no clear preference for either of the two systems.

Figure 4.8: Histograms of the users’ chosen level of agreement with the first and second statements

4.3.1 Effect of Gaming Experience

Since users with gaming experience may feel more comfortable with the developed systems, gaming experience was also considered when examining users’ satisfaction. Figure 4.9 shows histograms of the users’ level of agreement with the above statements, categorized by gaming experience. The left and right histograms represent the feedback for the first and second statements, respectively. In each case, histograms of the data from users with and without gaming experience are overlaid in such a way that the sum of the two results in the histograms shown in Figure 4.8. The histograms are also color coded by gaming experience, with red indicating gaming experience and blue the lack of gaming experience. Although some differences can be seen in these histograms, it is difficult to determine visually whether gaming experience had any effect on user satisfaction. A Mann-Whitney U test on the data regarding the first statement returned a p-value of 0.65; therefore, at a significance level of 0.05, it can be concluded that the difference between the medians of the data from the users with gaming experience (N = 10, M = 8.5) and without gaming experience (N = 16, M = 8.0) was not statistically significant.
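For readers unfamiliar with the Mann-Whitney U test, the statistic itself is a simple count over all pairs of observations from the two groups. The following is a minimal sketch under stated assumptions: the function name mann_whitney_u is hypothetical, the scores are made-up illustrative values rather than the study data, and no p-value is computed (the p-values reported in this chapter come from the statistical software used in the study).

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a exceeds b; tied pairs
    contribute one half. U_a + U_b always equals len(a) * len(b).
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical agreement scores on the 1-10 scale (not the study data)
gamers = [9, 8, 10, 7]
non_gamers = [8, 6, 9, 7, 8]

u_gamers, u_non_gamers = mann_whitney_u(gamers, non_gamers)
print(u_gamers, u_non_gamers)
```

The smaller of the two U values is the one usually compared against a critical value or converted to a p-value; values of U_a and U_b close to each other indicate little separation between the groups.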
A Mann-Whitney test was also performed on the data from the second statement; this yielded a p-value of 0.59, indicating again that the difference between the medians of the data from the users with gaming experience (N = 10, M = 9) and without gaming experience (N = 16, M = 8) was not statistically significant.

All in all, it can be concluded that users were highly satisfied with both systems (a median of 8 out of 10). Furthermore, considering the results of the U tests, there was no evidence that users with and without gaming experience viewed the systems differently, or that gaming experience affected user satisfaction.

Figure 4.9: Histograms of the users’ chosen level of agreement with the first and second statements, based on their gaming experience

4.4 Subjective Feedback

As explained in the methodology section, in addition to the statements in the previous section, users were also asked to provide their level of agreement with seven more statements on a scale from 1 to 10, with 10 being the highest level of agreement. Users’ responses were later reduced to a five-level Likert scale. Figure 4.10 shows stacked bar charts of the counts of levels of agreement chosen by the users along with the corresponding statements. The stacked bar charts are color coded based on the agreement levels.

The first statement aimed to measure users’ general perception of Oculus Rift and Leap Motion. As can be seen, out of 26 users, only 1 disagreed with the statement that these technologies are effective in performing desired tasks, while the rest agreed or strongly agreed with the statement. This shows that the users generally had a positive view of these technologies. However, this might have resulted in some bias in their answers to other questions.

Statement 2 aimed to measure users’ comfort level when using Oculus Rift and Leap Motion.
As can be seen, most users agreed or strongly agreed with the statement that they could comfortably control the robot using the VR and MR environments. Only two users disagreed with this statement. This shows that most of the users were comfortable using Leap Motion and Oculus Rift.

The third statement aimed to determine how safe users felt when using Oculus Rift and Leap Motion, since using Oculus Rift generally means that the user is visually disconnected from the real environment. As can be seen, only five users disagreed, and one user strongly disagreed, with the statement that they felt safe while using Oculus Rift and Leap Motion to communicate with the robot. This shows that most of the users felt safe when using Leap Motion and Oculus Rift, even though they were visually disconnected from their surroundings.

The fourth statement examined how users felt after using Oculus Rift, as VR goggles are known to cause visually induced motion sickness [101]. As shown in Figure 4.10, 15 users strongly disagreed, 7 disagreed, and 2 felt neutral about the statement that they felt sickness or pressure after using Oculus Rift along with Leap Motion. Therefore, it is reasonable to state that most users did not feel sick after using Oculus Rift.

Statements 5, 6, and 7 were intended to examine users’ preference for a particular system by measuring how intuitive they thought the systems were. Each statement comprised a pairwise comparison between two of the three systems. In statement 5, 22 users agreed with the statement that the VR environment is more intuitive than using a keyboard, 12 of whom strongly agreed. In statement 6, 21 users agreed or strongly agreed with the statement that the MR environment is more intuitive than using a keyboard. Finally, in statement 7, although 13 users agreed or strongly agreed with the statement that VR is more intuitive than MR, 9 users felt neutral about this statement.
Thus, although it can be concluded that most users preferred the VR and MR environments over the keyboard as a way of interacting with robots, the users did not have a clear preference for either the VR or MR system.

Figure 4.10: Stacked bar charts of counts of user agreement levels with the provided statements

Chapter 5

Conclusions and Future Work

5.1 Summary

The main objective of this thesis was to develop a telerobotic system using emerging VR technologies, such as the Oculus Rift head-mounted display and the Leap Motion hand-tracking device, and to evaluate their impact on the effectiveness, efficiency, and user satisfaction of the HRI modes built for that telerobotic system. It was hypothesized that the application of such technologies might result in more efficient and effective HRI modes that may yield higher user satisfaction. This might in turn result in an increase in the application of telerobotics in unstructured environments, where human operators’ cognitive skills can be used in the form of supervisory control.

To address this objective, three different HRI modes, referred to as manual, VR, and MR, were designed to remotely control a robotic arm performing a simple pick-and-place task. Twenty-six participants were recruited to perform the task using each HRI mode while their performance was recorded. In each case, users were asked to pick up a small box by controlling the robotic arm. In the manual HRI mode, users had to use a keyboard to control the robotic arm while looking at a monitor that displayed video streams from two IP cameras placed in the remote environment. This mode is considered to be a direct control mode. In the VR HRI mode, users were provided with an Oculus Rift and had to interact with a virtual environment using Leap Motion to control the robotic arm and perform the desired task. In this case, users were only involved as a supervisory controller.
On the other hand, in the MR HRI mode, in addition to Oculus Rift and a virtual environment, users were provided with live video streams from the two IP cameras located in the remote environment to complete the same pick-and-place task. In this HRI mode, a shared control architecture was employed.

To study the effectiveness of each HRI mode, the success ratio for each mode was calculated using the number of users that successfully finished the task. In addition, the error rates and their nature were also included in the analysis. For the efficiency analysis, the time that it took each user to successfully finish the desired task (TOC) using each HRI mode was used. Finally, user satisfaction with each HRI mode was analyzed using the data from a Likert-type questionnaire. Furthermore, users’ gaming experience was also considered in the analysis. For each analysis, an appropriate statistical method was used to test the relevant hypotheses, which were explained in detail in Chapter 2.

5.2 Conclusions

The statistical analysis of the collected data revealed a statistically significant difference between the effectiveness of the three HRI modes. The VR HRI mode was the most effective interaction mode with a success rate of 88.5%, followed by the manual HRI mode with 76.9%. In last place, the MR system had a success rate of only 38.5%. In total, 25 errors occurred, of which 18 were made by human operators and 7 were due to system errors. All in all, the VR mode had the lowest number of human and system errors. Therefore, when considering the error rates, the VR mode was still the most effective interaction mode. Furthermore, a statistically significant difference was found between the efficiency of the interaction modes: the VR mode was the most efficient system with the lowest TOC, although the difference between the MR and manual modes was not statistically significant. In addition, no interaction was found between the efficiency and effectiveness of the three HRI modes and the participants’ video gaming experience.
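The success rates quoted above are simple proportions of the 26 participants. As a minimal sketch, assuming the success counts of 23 (VR), 20 (manual), and 10 (MR) that are implied by these percentages and match the non-missing TOC entries in Table B.1, the calculation is as follows. The names PARTICIPANTS, successes, and success_rate are illustrative only.

```python
PARTICIPANTS = 26

# Success counts inferred from the reported percentages (an assumption,
# cross-checked against the non-missing TOC entries in Table B.1)
successes = {"VR": 23, "Manual": 20, "MR": 10}


def success_rate(successful, total):
    """Return the success rate as a percentage."""
    return 100.0 * successful / total


rates = {
    mode: round(success_rate(count, PARTICIPANTS), 1)
    for mode, count in successes.items()
}

for mode, rate in rates.items():
    print(f"{mode}: {rate}%")
```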
This is an indication that even inexperienced users can remotely operate robots effectively and efficiently using the VR HRI mode.

User satisfaction was measured subjectively using a 10-point Likert-scale questionnaire, where 10 indicated the highest level of agreement. According to the data, the users were highly satisfied with both the VR and MR HRI modes, with a median value of 8 out of 10. The data also showed that users had no clear preference between the VR and MR modes. Furthermore, users’ gaming experience had no effect on their likelihood of using such technologies in future applications. In other words, users’ gaming experience had no effect on their satisfaction with the HRI modes. Moreover, the results of the Likert questionnaire showed that the majority of participants had a positive view of the Oculus Rift and Leap Motion technologies, and they were comfortable controlling a robotic arm using this equipment. Most of the users also felt safe when using Oculus Rift, even though they were visually disconnected from their surroundings. Furthermore, most users did not feel sick after using Oculus Rift. Therefore, in general, it can be concluded that the application of VR technologies was successful in delivering a user-friendly and intuitive interaction mode for the users.

In conclusion, based on the factors of effectiveness, efficiency, user satisfaction, and error rates (both human errors and system errors), it can be recommended that in real-world scenarios where the nature of the task is similar to the pick-and-place task of this research, which is the case in manufacturing or telemaintenance situations, a VR HRI mode can be an effective and efficient way to control robots in a supervisory mode without a need for prior training of the operator. One should note that this study was conducted on a specific population, using a pick-and-place task that is common in robotics, and the results may or may not be applicable to other applications.
However, the evaluation process can be generalized to other applications in industry.

5.3 Limitations and Future Work

The evaluation in this work showed a positive impact of new VR technologies such as Oculus Rift and Leap Motion in developing more effective and efficient HRI modes for telerobotic applications. However, this research could be continued and improved in several areas. Several suggestions are provided below:

• This research focused on HRI modes that provided the user with visual feedback only. In many telerobotic systems, however, haptic feedback is also provided and is believed to increase users’ performance and satisfaction. Therefore, it would be useful to evaluate an HRI mode that combines haptic feedback with virtual environments.
• The virtual environment in this work was generated by the Unity game engine and had only limited collision physics. More complex environments, such as in surgical telerobotics, necessitate more sophisticated physics simulation, so that deformation of different parts of the body at the remote site can be simulated. This can be done by tracking markers that are placed in the target organ.
• Simultaneous Localization and Mapping (SLAM) algorithms can be used to create a virtual environment that closely imitates the remote site. This can be used in manipulation tasks using mobile robots: first, a map of the remote environment can be generated by the mobile robot, and then the generated virtual environment can be used in a telerobotic system that employs VR as an HRI mode to perform the desired tasks remotely.
• The task examined in this study was only a pick-and-place scenario. It would be beneficial to see whether the same results are observed in real industrial workplaces, and how the complexity of such tasks can affect the HRIs when VR is employed along with supervisory control.
The results could further be used to tailor VR HRI modes for specific tasks, especially in applications such as telemaintenance, which is a part of recently emerging connected industries that are collectively called Industry 4.0 [6].

References

[1] Jennifer L Burke et al. “Final Report for the DARPA/NSF Interdisciplinary Study on Human–Robot Interaction”. In: IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews 34.2 (2004), pp. 103–112.
[2] Robin R Murphy et al. “Human–robot interaction”. In: IEEE robotics & automation magazine 17.2 (2010), pp. 85–89.
[3] David Spencer. “Analysis and Synthesis of Effective Human-Robot Interaction at Varying Levels in Control Hierarchy”. PhD thesis. Clemson University, 2015.
[4] Juan Fasola and Maja J Mataric. “Using socially assistive human–robot interaction to motivate physical exercise for older adults”. In: Proceedings of the IEEE 100.8 (2012), pp. 2512–2526.
[5] Thomas B. Sheridan. “Human–Robot Interaction: Status and Challenges”. In: Human Factors 58.4 (2016), pp. 525–532. issn: 0018-7208. doi: 10.1177/0018720816644364.
[6] Doris Aschenbrenner. “Human Robot Interaction Concepts for Human Supervisory Control and Telemaintenance Applications in an Industry 4.0 Environment”. In: ().
[7] Roger Clarke. “Asimov’s laws of robotics: Implications for information technology. 2”. In: Computer 27.1 (1994), pp. 57–66.
[8] Julius J Grodski. “Applications of Augmented Reality for Human-Robot Communication”. In: Proceedings of the 1993 IEEE/RSJ International Conference on Intelligent Robots and Systems. Yokohama, Japan, 1993, pp. 1467–1472. isbn: 0780308239.
[9] Michael A Goodrich and Alan C Schultz. “Human-robot interaction: a survey”. In: Foundations and trends in human-computer interaction 1.3 (2007), pp. 203–275.
[10] Thomas B Sheridan. “Teleoperation, telerobotics and telepresence: A progress report”. In: Control Engineering Practice 3.2 (1995), pp. 
205–214.
[11] Terrence Fong, Charles Thorpe, and Charles Baur. “Collaboration, dialogue, human-robot interaction”. In: Robotics Research. Springer, 2003, pp. 255–266.
[12] Jacob W Crandall et al. “Validating human-robot interaction schemes in multitasking environments”. In: IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans 35.4 (2005), pp. 438–449.
[13] Batu Akan. “Human robot interaction solutions for intuitive industrial robot programming”. PhD thesis. Mälardalen University, 2012.
[14] JE Allen, Curry I Guinn, and E Horvitz. “Mixed-initiative interaction”. In: IEEE Intelligent Systems and their Applications 14.5 (1999), pp. 14–23.
[15] Kiju Lee, Matt Moses, and Gregory S Chirikjian. “Robotic self-replication in structured environments: Physical demonstrations and complexity measures”. In: The International Journal of Robotics Research 27.3-4 (2008), pp. 387–401.
[16] Laurent A Nguyen et al. “Virtual reality interfaces for visualization and control of remote vehicles”. In: Autonomous Robots 11.1 (2001), pp. 59–68.
[17] Scott A Green et al. “Human-robot collaboration: A literature review and augmented reality approach in design”. In: International Journal of Advanced Robotic Systems 5.1 (2008), p. 1.
[18] Md Hasanuzzaman et al. “Face and gesture recognition using subspace method for human-robot interaction”. In: Pacific-Rim Conference on Multimedia. Springer. 2004, pp. 369–376.
[19] Thomas Kollar et al. “Toward understanding natural language directions”. In: Human-Robot Interaction (HRI), 2010 5th ACM/IEEE International Conference on. IEEE. 2010, pp. 259–266.
[20] Allison M Okamura. “Haptic feedback in robot-assisted minimally invasive surgery”. In: Current opinion in urology 19.1 (2009), p. 102.
[21] Jean Scholtz et al. “Evaluation of human-robot interaction awareness in search and rescue”. In: Robotics and Automation, 2004. Proceedings. ICRA’04. 2004 IEEE International Conference on. Vol. 3. IEEE. 2004, pp. 2327–2332.
[22] Rajesh Elara Mohan et al. 
“Validating extended neglect tolerance model for human robot interactions in humanoid soccer robots”. In: Robotica 29.3 (2011), pp. 421–432.
[23] Sebastian Thrun et al. “MINERVA: A second-generation museum tour-guide robot”. In: Robotics and automation, 1999. Proceedings. 1999 IEEE international conference on. Vol. 3. IEEE. 1999.
[24] Carlos Beltran-Gonzalez et al. “Methods and techniques for intelligent navigation and manipulation for bomb disposal and rescue operations”. In: Safety, Security and Rescue Robotics, 2007. SSRR 2007. IEEE International Workshop on. IEEE. 2007, pp. 1–6.
[25] Aude Billard et al. “Robot programming by demonstration”. In: Springer handbook of robotics. Springer, 2008, pp. 1371–1394.
[26] Christoph Bartneck et al. “The influence of people’s culture and prior experiences with Aibo on their attitude towards robots”. In: Ai & Society 21.1-2 (2007), pp. 217–230.
[27] Jodi Forlizzi and Carl DiSalvo. “Service robots in the domestic environment: a study of the roomba vacuum in the home”. In: Proceedings of the 1st ACM SIGCHI/SIGART conference on Human-robot interaction. ACM. 2006, pp. 258–265.
[28] Thomas B Sheridan. “Telerobotics”. In: Automatica 25.4 (1989), pp. 487–507.
[29] Günter Niemeyer, Carsten Preusche, and Gerd Hirzinger. “Telerobotics”. In: Springer handbook of robotics. Springer, 2008, pp. 741–757.
[30] Andre Smit. “Development of a telerobotic test bench system for small-field-of-operation bilateral applications with 3D visual and haptic (kinaesthetic) feedback”. PhD thesis. Stellenbosch: Stellenbosch University, 2014.
[31] Geert De Cubber et al. “The EU-ICARUS project: developing assistive robotic tools for search and rescue operations”. In: Safety, Security, and Rescue Robotics (SSRR), 2013 IEEE international symposium on. IEEE. 2013, pp. 1–4.
[32] K.A.; Ramirez-Serrano Davies A. “A Reconfigurable USAR Robot Designed for Traversing Complex 3D Terrain”. 
In: 22nd Canadian Congress of Applied Mechanics 2009 (CANCAM 2009) Joint 1 (2009), 339 (1 Vol).
[33] Jim Taylor et al. “Mars Exploration Rover Telecommunications”. In: Deep Space Communications and Navigation Series 10 (2005), p. 156.
[34] Moussa Boukhnifer and Antoine Ferreira. “Loop Shaping Bilateral Controller for a Two-fingered Tele-micromanipulation System”. In: Control 15.5 (2007), pp. 891–905.
[35] Yantao Shen et al. “Internet-based remote assembly of micro-electro-mechanical systems (MEMS)”. In: Assembly Automation 24.3 (2004), pp. 289–296. issn: 0144-5154. doi: 10.1108/01445150410549782.
[36] M Buss and G Schmidt. “Control problems in multi-modal telepresence systems”. In: Advances in control. Springer, 1999, pp. 65–101.
[37] Jianhong Cui et al. “A review of teleoperation system control”. In: Proceedings of the Florida Conference on Recent Advances in Robotics. Florida Atlantic University Boca Raton, FL. 2003, pp. 1–12.
[38] J Wright et al. “Driving on the surface of Mars with the rover sequencing and visualization program”. In: (2005).
[39] Ida Bagus Kerthyayana Manuaba. “Evaluation of Gaming Environments for Mixed Reality Interfaces and Human Supervisory Control in Telerobotics”. PhD Thesis. Australian National University, 2014.
[40] Anna Felnhofer et al. “Is virtual reality emotionally arousing? Investigating five emotion inducing virtual park scenarios”. In: International journal of human-computer studies 82 (2015), pp. 48–56.
[41] Mikael Eriksson. “Reaching out to grasp in Virtual Reality (Sträck ut och ta tag i virtuell verklighet)”. In: ().
[42] TTN Do. “Development of a virtual pet game using oculus rift and leap motion technologies.” In: May 2016 (2016).
[43] Woodrow Barfield, ed. Fundamentals of wearable computers and augmented reality. CRC Press, 2015.
[44] Ronald T Azuma. “Visualizacao-Gr-LuisPattam-paperdeapoio-1”. In: (1997), pp. 355–385.
[45] Feng Zhou, H. B. L. Duh, and M. Billinghurst. 
“Trends in augmented reality tracking, interaction and display: A review of ten years of ISMAR”. In: 2008 7th IEEE/ACM International Symposium on Mixed and Augmented Reality. 2008, pp. 193–202. doi: 10.1109/ISMAR.2008.4637362.
[46] Stefaan Ternier et al. “AR Learn: Augmented reality meets augmented virtuality”. In: Journal of Universal Computer Science 18.15 (2012), pp. 2143–2164. issn: 0948695X. doi: 10.3217/jucs-018-15-2143. arXiv: arXiv:1011.1669v3.
[47] Paul Milgram et al. “Augmented reality: A class of displays on the reality-virtuality continuum”. In: Telemanipulator and telepresence technologies. Vol. 2351. International Society for Optics and Photonics. 1995, pp. 282–293.
[48] Grigore C. Burdea. “Invited review: the synergy between virtual reality and robotics”. In: IEEE Transactions on Robotics and Automation 15.3 (1999), pp. 400–410. issn: 1042296X. doi: 10.1109/70.768174.
[49] Eimei Oyama et al. “Experimental Study on Remote Manipulation Using Virtual Reality”. In: Presence 2.2 (1993), pp. 112–124.
[50] Qingping Lin and Chengi Kuo. “Virtual tele-operation of underwater robots”. In: Proceedings of International Conference on Robotics and Automation 2.April (1997), pp. 1022–1027. issn: 10504729. doi: 10.1109/ROBOT.1997.614269.
[51] R Taylor et al. “The nanomanipulator: a virtual-reality interface for a scanning tunneling microscope”. In: Proceedings of the 20th annual conference on Computer graphics and interactive technique (1993), pp. 127–134. issn: 0097-8930. doi: 10.1145/166117.166133.
[52] Marek Bures et al. “Advances in Ergonomics Modeling, Usability & Special Populations”. In: 486 (2017), pp. 221–230. issn: 02683768. doi: 10.1007/978-3-319-41685-4.
[53] T Kim et al. “The effect of delayed visual feedback on telerobotic surgery”. In: Surgical Endoscopy And Other Interventional Techniques 19.5 (2005), pp. 683–686. issn: 1432-2218. doi: 10.1007/s00464-004-8926-6.
[54] A.K. Bejczy, W.S. Kim, and S.C. Venema. 
“The phantom robot: predictive displays for teleoperation with time delay”. In: Proceedings., IEEE International Conference on Robotics and Automation (1990), pp. 546–551. doi: 10.1109/ROBOT.1990.126037.
[55] Astrid Weiss, Regina Bernhaupt, and Manfred Tscheligi. “The USUS evaluation framework for user-centered HRI”. In: New Frontiers in Human-Robot Interaction (2011), pp. 89–110. url:{\&}rep=rep1{\&}type=pdf.
[56] Sebastian Thrun. “Toward a Framework for Human-Robot Interaction”. In: Human-Computer Interaction 19.1 (2004), pp. 9–24. issn: 0737-0024. doi: 10.1207/s15327051hci1901&2_2. url: http://www.informaworld.com/openurl?genre=article&doi=10.1207/s15327051hci1901&2_2&magic=crossref%7C%7CD404A21C5BB053405B1A640AFFD44AE3.
[57] Julie A Adams. “Critical Considerations for Human-Robot Interface Development”. In: AAAI Fall Symposium: Human Robot Interaction Technical Report FS-02-03 (2002), pp. 1–8.
[58] Aaron Steinfeld et al. “Common metrics for human-robot interaction”. In: Proceeding of the 1st ACM SIGCHI/SIGART conference on Human-robot interaction - HRI ’06 2 (2006), p. 33. issn: 1595932941. doi: 10.1145/1121241.1121249.
[59] International Organization for Standardization. ISO 9241-11: Ergonomic Requirements for Office Work with Visual Display Terminals (VDTs): Part 11: Guidance on Usability. 1998.
[60] Jakob Nielsen. Usability engineering. Elsevier, 1994.
[61] J. Rubin and D. Chisnell. Handbook of usability testing [electronic resource]: How to plan, design, and conduct effective tests (2nd ed.) 2008, p. 386. isbn: 9780470185483. doi: 10.1007/s13398-014-0173-7.2. arXiv: arXiv:1011.1669v3.
[62] Holly A. Yanco, Jill L. Drury, and Jean Scholtz. “Beyond Usability Evaluation: Analysis of Human-Robot Interaction at a Major Robotics Competition”. In: Human-Computer Interaction 19.June 2004 (2004), pp. 1–8. doi: 10.1207/s15327051hci1901.
[63] Alan Dix. “Human-computer interaction”. 
In: Encyclopedia of database systems. Springer, 2009, pp. 1327–1331.
[64] Jakob Nielsen. “Enhancing the explanatory power of usability heuristics”. In: Proceedings of the SIGCHI conference on Human factors in computing systems celebrating interdependence - CHI ’94. 1994, pp. 152–158. isbn: 0897916506. doi: 10.1145/191666.191729.
[65] Jill Drury and L Riek. “Command and Control of Robot Teams”. In: Proc. of the . . . (2003).
[66] E. Clarkson and Ronald C. Arkin. “Applying heuristic evaluation to human-robot interaction systems”. In: FLAIRS Conference (2007), pp. 44–49.
[67] John Rieman, Marita Franzke, and David Redmiles. “Usability Evaluation with the Cognitive Walkthrough”. In: Conference companion on Human factors in computing systems (1995), pp. 387–388. doi: 10.1145/223355.223735.
[68] Clayton Lewis et al. “Testing a Walkthrough Methodology for Theory-Based Design of Walk-Up-and-Use Interfaces”. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 1990 January (1990), pp. 235–242. doi: 10.1145/97243.97279.
[69] C. Wharton et al. “Applying cognitive walkthroughs to more complex user interfaces: Experiences, issues, and recommendations”. In: Journal of Chemical Information and Modeling 53 (1992), pp. 1689–1699. issn: 1098-6596. doi: 10.1017/CBO9781107415324.004. arXiv: arXiv:1011.1669v3.
[70] Alan Dix et al. Human-Computer Interaction (3rd Edition). Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2003. isbn: 0130461091.
[71] Fabio Sani and John Todman. Experimental Design and Statistics for Psychology: a First Course. Blackwell Publishing, 2006. isbn: 9781405100236.
[72] Walter R Boot et al. “The effects of video game playing on attention, memory, and executive control”. In: Acta psychologica 129.3 (2008), pp. 387–398.
[73] Ronald J Tallarida and Rodney B Murray. “Chi-square test”. In: Manual of Pharmacologic Calculations. Springer, 1987, pp. 140–142.
[74] Mary L McHugh. “The chi-square test of independence”. 
In: Biochemia medica 23.2 (2013), pp. 143–149.
[75] Graham JG Upton. “Fisher’s exact test”. In: Journal of the Royal Statistical Society. Series A (Statistics in Society) (1992), pp. 395–402.
[76] Alan Agresti. “A survey of exact inference for contingency tables”. In: Statistical science (1992), pp. 131–153.
[77] Eric J Feuer and Larry G Kessler. “Test statistic and sample size for a two-sample McNemar test”. In: Biometrics (1989), pp. 629–636.
[78] Allen L Edwards. “Note on the ‘correction for continuity’ in testing the significance of the difference between correlated proportions”. In: Psychometrika 13.3 (1948), pp. 185–187.
[79] Omolola A Adedokun and Wilella D Burgess. “Analysis of paired dichotomous data: A gentle introduction to the McNemar test in SPSS”. In: Journal of Multi-Disciplinary Evaluation 8.17 (2011), pp. 125–131.
[80] Ellen R Girden. “Quantitative Applications in the Social Sciences: ANOVA”. In: Oaks, CA: SAGE Publications Ltd, 1992. isbn: 9780803942578. doi: 10.4135/9781412983419.
[81] Andrew Rutherford. “GLM Approaches to Factorial Mixed Measures Designs”. In: Anova and Ancova. John Wiley & Sons, Inc., 2013, pp. 199–214. isbn: 9781118491683. doi: 10.1002/9781118491683.ch8.
[82] HJ Keselman et al. “Testing the validity conditions of repeated measures F tests.” In: Psychological Bulletin 87.3 (1980), p. 479.
[83] Brian B Schultz. “Levene’s test for relative variation”. In: Systematic Zoology 34.4 (1985), pp. 449–456.
[84] Sigmund Tobias and James E Carlson. “Brief report: Bartlett’s test of sphericity and chance findings in factor analysis”. In: Multivariate Behavioral Research 4.3 (1969), pp. 375–377.
[85] Franklin A Graybill. Theory and application of the linear model. 04; QA279, G7. 1976.
[86] George EP Box and David R Cox. “An analysis of transformations”. In: Journal of the Royal Statistical Society. Series B (Methodological) (1964), pp. 211–252.
[87] Jason W Osborne. 
“Improving your data transformations: Applying the Box-Cox transformation”. In: Practical Assessment, Research & Evaluation 15.12 (2010), pp. 1–9. issn: 15317714.
[88] Nadim Nachar et al. “The Mann-Whitney U: A test for assessing whether two independent samples come from the same distribution”. In: Tutorials in Quantitative Methods for Psychology 4.1 (2008), pp. 13–20.
[89] Peter Sprent and Nigel C Smeeton. Applied nonparametric statistical methods. CRC Press, 2016.
[90] Unity Game Engine. Unity game engine-official site.
[91] Jeff Craighead, Jennifer Burke, and Robin Murphy. “Using the unity game engine to develop sarge: a case study”. In: Proceedings of the 2008 Simulation Workshop at the International Conference on Intelligent Robots and Systems (IROS 2008). 2008.
[92] Ishan Goradia, Jheel Doshi, and Lakshmi Kurup. “A review paper on oculus rift & project morpheus”. In: International Journal of Current Engineering and Technology 4.5 (2014), pp. 3196–3200.
[93] Shrutik Katchhi and Pritish Sachdeva. “A Review Paper on Oculus Rift”. In: International Journal of Current Engineering and Technology E-ISSN (2014), pp. 2277–4106.
[94] Filipe André Cachada Rodrigues. “Immersive Telerobotic Modular Framework using stereoscopic HMD’s”. In: (2015).
[95] Maria Karam. “PhD Thesis: A framework for research and design of gesture-based human-computer interactions”. PhD thesis. University of Southampton, 2006.
[96] Zhengyou Zhang. “Microsoft kinect sensor and its effect”. In: IEEE multimedia 19.2 (2012), pp. 4–10.
[97] Frank Weichert et al. “Analysis of the accuracy and robustness of the leap motion controller”. In: Sensors 13.5 (2013), pp. 6380–6393.
[98] Rui Miguel de Paiva Batista. “Navigating Virtual Reality Worlds with the Leap Motion Controller”. In: (2016).
[99] Can Eldem. “A technique for the measurement and possible rehabilitation of Visual Neglect using the Leap Sensor”. In: (2014), p. 90.
[100] Geeta Desai. “IoT approach for motion detection using raspbrry PI”. 
In: (2017).
[101] Lawrence J Hettinger and Gary E Riccio. “Visually induced motion sickness in virtual environments”. In: Presence: Teleoperators & Virtual Environments 1.3 (1992), pp. 306–310.

Appendix A
Questionnaire

1) What is your gender?
   □ Female   □ Male
2) Are left or right handed?
   □ Right-Handed   □ Left-Handed
3) Do you play video games?
   □ Yes   □ No
4) Have you used Oculus Rift before?
   □ Yes   □ No
5) Have you used Leap Motion before?
   □ Yes   □ No
6) How many hours you work with computer daily?
   1 2 3 4 5 6 7 8 9 10
7) Using the technology completing the task is effective.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
8) I feel I could comfortably control the robot arm in the second and third scenario.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
9) I feel safer using the Oculus Rift and Leap Motion in order to interact with the robot.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
10) I feel sick or pressure after using the Oculus Rift and Leap Motion.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
11) I would like to use the VR in future applications.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
12) I would like to use the MR in future applications.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
13) Using virtual environment is more intuitive than using keyboard.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
14) Using MR environment is more intuitive than using Manual environment.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
15) Using VR is more intuitive than using MR environment.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
16) I feel like using virtual environment is less time consuming than the keyboard arrows.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
17) I feel like using MR environment is less time consuming than the Manual environment.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
18) I feel no difference in using a teach pendant or virtual environment or MR environment in order to communicate with the robot.
   Disagree 1 2 3 4 5 6 7 8 9 10 Agree
19) Do you have any suggestion to improve this project?

Appendix B
Anonymized Data

User     Gender   Right/Left Hand   Manual TOC   VR TOC      MR TOC
User1    Female   Right Handed      -            15s         -
User2    Female   Right Handed      1min 50s     20s         -
User3    Male     Right Handed      1min 46s     40s         1min 30s
User4    Male     Right Handed      1min 24s     10s         30s
User5    Female   Right Handed      2min 47s     20s         -
User6    Male     Right Handed      1min 38s     -           -
User7    Male     Right Handed      2min 6s      30s         -
User8    Male     Right Handed      -            43s         -
User9    Male     Left Handed       2min 20s     13s         -
User10   Female   Right Handed      -            22s         56s
User11   Male     Right Handed      2min 23s     26s         1min 51s
User12   Female   Right Handed      1min 49s     -           -
User13   Male     Right Handed      2min 20s     40s         -
User14   Male     Right Handed      -            1min 10s    1min 20s
User15   Male     Right Handed      1min 30s     28s         2min 10s
User16   Male     Right Handed      1min 30s     21s         -
User17   Male     Right Handed      -            19s         1min 14s
User18   Male     Right Handed      2min 14s     -           -
User19   Male     Right Handed      -            24s         47s
User20   Female   Right Handed      3min 9s      28s         -
User21   Female   Right Handed      1min 55s     18s         -
User22   Female   Right Handed      2min 32s     21s         -
User23   Male     Right Handed      2min 49s     32s         -
User24   Male     Right Handed      1min 45s     39s         1min 56s
User25   Male     Right Handed      1min 54s     25s         1min 25s
User26   Male     Right Handed      2min 13s     31s         -

Table B.1: Users’ anonymized demographic data and Time of Completion (TOC) for different HRI modes.

User     Q3    Q4    Q5    Q6   Q7   Q8   Q9   Q10
User1    No    No    No     8    7    8    5    4
User2    No    No    No     4    6    4    4    3
User3    No    No    No     7    5    6    7    9
User4    No    No    No     6    7    8    9    1
User5    Yes   Yes   No     5   10   10    8    2
User6    Yes   Yes   No     9    5    6    2    2
User7    No    No    No     5    7    7    4    2
User8    Yes   Yes   Yes    2    7    8    7    2
User9    No    No    No    10    8    8   10    2
User10   No    No    No     6    9   10   10    4
User11   Yes   No    No     6    4    7    7    3
User12   No    Yes   Yes    2    8    6    4    6
User13   No    No    Yes    3    9    8    8    2
User14   Yes   No    No     8    7    7    7    2
User15   No    Yes   Yes   10    8    8    5    1
User16   No    No    No    10   10    9   10    2
User17   No    No    No     4    8    5    8    2
User18   No    No    No     5    8    8    9    6
User19   Yes   No    No     8    9   10    8    1
User20   Yes   No    No     5   10    8    9    2
User21   Yes   No    No     5    8    7    7    1
User22   No    No    No     4    7    7    5    3
User23   Yes   No    No     5    7    6    7    2
User24   Yes   No    No     8    7    7    9    3
User25   No    No    No     5    6    4    3    8
User26   No    No    No     4    7    6    3    3

Table B.2: Anonymized responses to questions 3 to 10 of the questionnaire.

User     Q11   Q12   Q13   Q14   Q15   Q16   Q17   Q18
User1      9     7     8     7     8    10    10     1
User2      4     5     4     4     6     8     3     3
User3      7     9     8     8     8     9     9     7
User4      8     9     8     8     6     6     5     7
User5     10    10    10    10     5    10    10     1
User6      9     9    10     8     3     9     9     1
User7      8     8     5     7     9     4     4     4
User8      6     7     8     7     5     9     8     2
User9     10     8    10     8     8     3     3     2
User10     5     9     9    10     5     9     9     3
User11     7     7     7     8     8    10     8     2
User12     8     6     7     4     3     8     8     4
User13     9     9     7     7     5     8     8     4
User14     7     8     9     9     9     8     9     2
User15     8     9     8     9    10    10    10     2
User16    10    10    10     5     9    10    10     1
User17     9    10     9    10     6     9     9     1
User18     8     8     9     9     2    10    10     1
User19     9     9     8     8     6     7     8     1
User20    10     8    10    10     9    10    10     3
User21     9     7     7     8     6    10     9     4
User22     8     6     5     4     8     8     4     4
User23     9     9     9     9     8     6     6     3
User24    10    10    10    10     7    10    10     2
User25     1     2     2     2     2     1     1     1
User26     8     8     9     9     9     9     9     3

Table B.3: Anonymized responses to questions 11 to 18 of the questionnaire.

Appendix C
Statistical Data Analysis

C.1 GLM for mixed model ANOVA

The GLM for the two-way ANOVA with a mixed measures design can be described by the following equation [81]:

\[ Y_{ijk} = \mu + \alpha_j + \beta_k + \pi_{i(j)} + (\alpha\beta)_{jk} + (\pi\beta)_{ik(j)} + \varepsilon_{ijk} \tag{C.1} \]

where $Y_{ijk}$ is the value of the dependent variable for the $i$th subject at the $j$th level of Factor A and the $k$th level of Factor B, $\mu$ is the grand mean of the experimental condition population means, $\alpha_j$ is the effect of the $j$th level of Factor A, $\beta_k$ is the effect of the $k$th level of Factor B, $\pi_{i(j)}$ represents the random effect of the $i$th subject within the $j$th level of Factor A, $(\alpha\beta)_{jk}$ is the effect of the interaction of the $j$th level of Factor A and the $k$th level of Factor B, $(\pi\beta)_{ik(j)}$ represents the interaction effect of the $k$th level of Factor B and the $i$th subject within the $j$th level of Factor A, and $\varepsilon_{ijk}$ is the random error associated with the $i$th subject in the $j$th level of Factor A and the $k$th level of Factor B. Here, Factor A is a between-subjects factor and Factor B is a within-subjects factor. The brackets around the subscript $j$ indicate that these effects involve the scores of subjects nested within the $p$ levels of Factor A, which means that separate groups of subjects are employed in each of the $p$ levels of Factor A.
Such variables and the related Sums of Squares (SS) can be calculated using the formulas provided in Table C.1, where $\bar{Y}_G$ is the grand mean, $\bar{Y}_j$ is the measured mean for the $j$th level of Factor A, $\bar{Y}_{ij}$ is the mean for the $i$th subject within the $j$th level of Factor A, $\bar{Y}_k$ is the mean of the measured dependent variable for the $k$th level of Factor B, and $\bar{Y}_{jk}$ is the mean for the combination of the $j$th level of Factor A and the $k$th level of Factor B. Using these estimated values, the predicted value for each subject (the model) can be calculated using the following equation:

\[ \hat{Y}_{ijk} = \mu + \alpha_j + \beta_k + \pi_{i(j)} + (\alpha\beta)_{jk} + (\pi\beta)_{ik(j)} \tag{C.2} \]

Then, using the predicted and measured values for each subject, the error term can be calculated as the difference between the two:

\[ \hat{\varepsilon}_{ijk} = Y_{ijk} - \hat{Y}_{ijk} \tag{C.3} \]

Effect                                    Formula
A ($\alpha_j$)                            $qN \sum_{j=1}^{p} (\bar{Y}_j - \bar{Y}_G)^2$
S(A) ($\pi_{i(j)}$) (error)               $q \sum_{i=1}^{N} \sum_{j=1}^{p} (\bar{Y}_{ij} - \bar{Y}_j)^2$
B ($\beta_k$)                             $pN \sum_{k=1}^{q} (\bar{Y}_k - \bar{Y}_G)^2$
A × B ($(\alpha\beta)_{jk}$)              $N \sum_{j=1}^{p} \sum_{k=1}^{q} (\bar{Y}_{jk} - \bar{Y}_j - \bar{Y}_k + \bar{Y}_G)^2$
S × B ($(\pi\beta)_{ik(j)}$) (error)      $\sum_{i=1}^{N} \sum_{j=1}^{p} \sum_{k=1}^{q} (Y_{ijk} - \bar{Y}_{ij} - \bar{Y}_{jk} + \bar{Y}_j)^2$

Table C.1: Formulas for mixed effect two-factor ANOVA. [81]

The mean square values can now be calculated by dividing the sum of squares for each factor shown in Table 2.2 by the related degrees of freedom, which are $p-1$ for Factor A, $q-1$ for Factor B, and $(p-1)(q-1)$ for the interaction A×B. The degrees of freedom for the error term S(A) (subjects nested in A) are $N-p$, the number of subjects minus the number of levels of Factor A, and for the error term S×B they are $(N-p)(q-1)$. The F ratio is then found by dividing the mean square of each factor by the mean square of the relevant error term: S(A) for the between-subjects factor A, and S×B for the within-subjects factor B and the interaction A×B.

Further, as mentioned earlier for the GLM approach to ANOVA, the effect of each factor can also be found by comparing the full model to the reduced models obtained by removing the factors of interest. Doing so increases the error term, and that increase can be used as a measure of the effect of each factor. When the model is implemented using regression methods, the difference between the residual sums of squares obtained from the full model and from the reduced models can be used as the measure of the contribution of each factor. Using these SS values and the relevant degrees of freedom calculated for each reduced model, an ANOVA table can be constructed to present the result of the two-way mixed measures ANOVA using the GLM approach.

C.2 Post hoc Tukey HSD

The post hoc Tukey HSD test is used after the null hypothesis is rejected by ANOVA to examine which pairs of groups have significantly different means. This is done using the Studentized range distribution, denoted $q$. In this method the difference between two means is significant if it is larger than the HSD calculated for the groups [79]:

\[ |M_a - M_{a'}| > \mathrm{HSD} \tag{C.4} \]

where

\[ \mathrm{HSD} = q_{A,\alpha} \sqrt{\frac{1}{2} MS_{S(A)} \left( \frac{1}{S_a} + \frac{1}{S_{a'}} \right)} \tag{C.5} \]

where $M_a$ and $M_{a'}$ are the means of the two groups $a$ and $a'$, $S_a$ and $S_{a'}$ are the numbers of observations in each group, $MS_{S(A)}$ is the mean square of error obtained from ANOVA, and $q_{A,\alpha}$ is the value of the $q$ distribution with range $A$ (the number of groups) and $N-A$ degrees of freedom at significance level $\alpha$, which can be obtained from a table of the Studentized range distribution.

C.3 Mann-Whitney U test

The Mann-Whitney U test compares each observation from sample one with each observation from sample two after the data are ranked in ascending order. If the two samples come from the same population, each observation of sample one is equally likely to be larger or smaller than each observation of sample two, which can be expressed in technical terms as [88]:

\[ H_0: p(x_i > y_i) = 1/2, \qquad H_1: p(x_i > y_i) \neq 1/2 \tag{C.6} \]

where $x_i$ is an observation from sample one and $y_i$ is an observation from sample two. The null hypothesis is rejected when one sample is significantly larger than the other sample.

The Mann-Whitney U test first requires the calculation of a U statistic for each group, defined as follows [88]:

\[ U_x = n_x n_y + \frac{n_x (n_x + 1)}{2} - R_x \tag{C.7} \]

\[ U_y = n_x n_y + \frac{n_y (n_y + 1)}{2} - R_y \tag{C.8} \]

where $R_x$ is the sum of the ranks assigned to the first group, $R_y$ is the sum of the ranks assigned to the second group, $n_x$ is the number of observations or participants in the first group, and $n_y$ is the number of observations or participants in the second group. Both U values can be thought of as the number of times observations in one sample precede or follow observations in the second sample when the data are placed in ascending order. In other words, when all the data from the two samples are ranked from the lowest to the highest value, each observation of the first sample receives one point for every observation of the other sample ranked above it. The same procedure is applied to the observations of the second sample. The U values are then obtained by adding the points for each group. If a value is repeated, the average of the corresponding ranks is assigned.

Following the calculation of the U statistics and the choice of an appropriate significance level ($\alpha$), the null hypothesis can be rejected if, by consulting the Mann and Whitney tables, the p-value corresponding to $\min(U_x, U_y)$ (the smaller of the two calculated U values) is smaller than the predetermined significance level. If the number of observations in each group is more than eight, the sampling distribution of U approaches a normal distribution, in which case the z-distribution can be used to test the hypothesis.
The z statistic can be calculated using the following equation [88]:

\[ |z| = \frac{\left| U - \frac{n_x n_y}{2} \right|}{\sigma_U} \tag{C.9} \]

where $U$ is either of the two U statistics (since $U_x + U_y = n_x n_y$, both give the same $|z|$) and $\sigma_U$ is the standard deviation of the U distribution, obtained from the following equation with $N = n_x + n_y$:

\[ \sigma_U = \sqrt{\frac{n_x n_y (N + 1)}{12}} \tag{C.10} \]

If the absolute value of the calculated z is larger than or equal to the tabulated z value, the null hypothesis is rejected.

Appendix D
Minitab Outputs

D.1 Outputs of McNemar’s Tests

Tabulated Statistics: Manual, Mixed Reality
Rows: Manual   Columns: Mixed Reality

             No       Yes      All
No            2         4        6
          33.33     66.67   100.00
          12.50     40.00    23.08
           7.69     15.38    23.08
         0.7756    1.2410

Yes          14         6       20
          70.00     30.00   100.00
          87.50     60.00    76.92
          53.85     23.08    76.92
         0.2327    0.3723

All          16        10       26
          61.54     38.46   100.00
         100.00    100.00   100.00
          61.54     38.46   100.00

Cell Contents: Count, % of Row, % of Column, % of Total, Contribution to Chi-square

Chi-Square Test
                   Chi-Square   DF   P-Value
Pearson                 2.622    1     0.105
Likelihood Ratio        2.574    1     0.109
2 cell(s) with expected counts less than 5.

McNemar’s Test
Estimated Difference   95% CI               P
-0.385                 (-0.707, -0.063)     0.031
Difference = p (Manual = No) - p (Mixed Reality = No)

Tabulated Statistics: Mixed Reality, Virtual Reality
Rows: Mixed Reality   Columns: Virtual Reality

             No       Yes      All
No            3        13       16
          18.75     81.25   100.00
         100.00     56.52    61.54
          11.54     50.00    61.54
         0.7212    0.0941

Yes           0        10       10
           0.00    100.00   100.00
           0.00     43.48    38.46
           0.00     38.46    38.46
         1.1538    0.1505

All           3        23       26
          11.54     88.46   100.00
         100.00    100.00   100.00
          11.54     88.46   100.00

Cell Contents: Count, % of Row, % of Column, % of Total, Contribution to Chi-square

Chi-Square Test
                   Chi-Square   DF   P-Value
Pearson                 2.120    1     0.145
Likelihood Ratio        3.154    1     0.076
2 cell(s) with expected counts less than 5.
McNemar’s Test
Estimated Difference   95% CI               P
0.5000                 (0.2693, 0.7307)     0.000
Difference = p (Mixed Reality = No) - p (Virtual Reality = No)

Tabulated Statistics: Manual, Virtual Reality
Rows: Manual   Columns: Virtual Reality

             No       Yes      All
No            0         6        6
           0.00    100.00   100.00
           0.00     26.09    23.08
           0.00     23.08    23.08
        0.69231   0.09030

Yes           3        17       20
          15.00     85.00   100.00
         100.00     73.91    76.92
          11.54     65.38    76.92
        0.20769   0.02709

All           3        23       26
          11.54     88.46   100.00
         100.00    100.00   100.00
          11.54     88.46   100.00

Cell Contents: Count, % of Row, % of Column, % of Total, Contribution to Chi-square

Chi-Square Test
                   Chi-Square   DF
Pearson                 1.017    1
Likelihood Ratio        1.688    1
1 cell(s) with expected counts less than 1. Chi-Square approximation probably invalid.
2 cell(s) with expected counts less than 5.

McNemar’s Test
Estimated Difference   95% CI               P
0.115                  (-0.145, 0.376)      0.508
Difference = p (Manual = No) - p (Virtual Reality = No)

D.2 Outputs of Chi-square and Fisher’s Exact Tests

Tabulated Statistics: Manual, Video Game Experience
Rows: Manual   Columns: Video Game Experience

             No       Yes      All
No            3         3        6
          50.00     50.00   100.00
          18.75     30.00    23.08
          11.54     11.54    23.08
        0.12981   0.20769

Yes          13         7       20
          65.00     35.00   100.00
          81.25     70.00    76.92
          50.00     26.92    76.92
        0.03894   0.06231

All          16        10       26
          61.54     38.46   100.00
         100.00    100.00   100.00
          61.54     38.46   100.00

Cell Contents: Count, % of Row, % of Column, % of Total, Contribution to Chi-square

Chi-Square Test
                   Chi-Square   DF   P-Value
Pearson                 0.439    1     0.508
Likelihood Ratio        0.431    1     0.512
2 cell(s) with expected counts less than 5.

Fisher’s Exact Test
P-Value
0.644269

Tabulated Statistics: Mixed Reality, Video Game Experience
Rows: Mixed Reality   Columns: Video Game Experience

             No       Yes      All
No           10         6       16
          62.50     37.50   100.00
          62.50     60.00    61.54
          38.46     23.08    61.54
       0.002404  0.003846

Yes           6         4       10
          60.00     40.00   100.00
          37.50     40.00    38.46
          23.08     15.38    38.46
       0.003846  0.006154

All          16        10       26
          61.54     38.46   100.00
         100.00    100.00   100.00
          61.54     38.46   100.00

Cell Contents: Count, % of Row, % of Column, % of Total, Contribution to Chi-square

Chi-Square Test
                   Chi-Square   DF   P-Value
Pearson                 0.016    1     0.899
Likelihood Ratio        0.016    1     0.899
1 cell(s) with expected counts less than 5.

Fisher’s Exact Test
P-Value
1

Tabulated Statistics: Virtual Reality, Video Game Experience
Rows: Virtual Reality   Columns: Video Game Experience

             No       Yes      All
No            2         1        3
          66.67     33.33   100.00
          12.50     10.00    11.54
           7.69      3.85    11.54
       0.012821  0.020513

Yes          14         9       23
          60.87     39.13   100.00
          87.50     90.00    88.46
          53.85     34.62    88.46
       0.001672  0.002676

All          16        10       26
          61.54     38.46   100.00
         100.00    100.00   100.00
          61.54     38.46   100.00

Cell Contents: Count, % of Row, % of Column, % of Total, Contribution to Chi-square

Chi-Square Test
                   Chi-Square   DF   P-Value
Pearson                 0.038    1     0.846
Likelihood Ratio        0.038    1     0.845
2 cell(s) with expected counts less than 5.

Fisher’s Exact Test
P-Value
1

D.3 Outputs of ANOVA and Related Tests

General Linear Model: TOC versus HRI Mode, Gaming ... erience, User

The following terms cannot be estimated and were removed: HRI Mode*User(Gaming Experience)

Method
Factor coding             (-1, 0, +1)
Box-Cox transformation    λ = 0
Rows unused               28

Factor Information
Factor                     Type     Levels   Values
HRI Mode                   Fixed         3   Manual, Mixed Reality, Virtual Reality
Gaming Experience          Fixed         2   No, Yes
User(Gaming Experience)    Random       25   User 1(No), User 10(No), User 12(No), User 13(No), User 15(No), User 16(No), User 17(No), User 18(No), User 2(No), User 22(No), User 25(No), User 26(No), User 3(No), User 7(No), User 9(No), User 11(Yes), User 14(Yes), User 19(Yes), User 20(Yes), User 21(Yes), User 23(Yes), User 24(Yes), User 5(Yes), User 6(Yes), User 8(Yes)

Analysis of Variance for Transformed Response
Source                         DF    Adj SS    Adj MS   F-Value   P-Value
HRI Mode                        2   23.1584   11.5792    155.29     0.000
Gaming Experience               1    0.0991    0.0991      1.09     0.303   x
HRI Mode*Gaming Experience      2    0.0488    0.0244      0.33     0.724
User(Gaming Experience)        23    2.2217    0.0966      1.30     0.277
Error                          21    1.5658    0.0746
Total                          49   32.2351
x Not an exact F-test.
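The F-values in the ANOVA table above follow the rule described in Appendix C.1: each adjusted mean square is divided by the mean square of its error term. The sketch below is only an arithmetic cross-check, not part of the original analysis. It assumes the error terms listed in the "Error Terms for Tests" and "Variance Components" tables later in this output, namely an error variance of 0.0745630 and a synthesized error mean square of 0.7262 (4) + 0.2738 (5) for Gaming Experience; the names `MS`, `MS_ERROR`, and `f_ratio` are illustrative.

```python
# Adjusted mean squares copied from the ANOVA table (transformed response).
MS = {
    "HRI Mode": 11.5792,
    "Gaming Experience": 0.0991,
    "HRI Mode*Gaming Experience": 0.0244,
    "User(Gaming Experience)": 0.0966,
}

# Error variance as printed in the Variance Components table.
MS_ERROR = 0.0745630

# Synthesized error mean square for Gaming Experience:
# 0.7262 * MS[User(Gaming Experience)] + 0.2738 * MS[Error].
MS_ERROR_GAMING = 0.7262 * MS["User(Gaming Experience)"] + 0.2738 * MS_ERROR


def f_ratio(term: str) -> float:
    """Return the F ratio of a term: its mean square over its error mean square."""
    denominator = MS_ERROR_GAMING if term == "Gaming Experience" else MS_ERROR
    return MS[term] / denominator


for term in MS:
    print(f"{term}: F = {f_ratio(term):.2f}")
```

Because the inputs are the rounded values printed by Minitab, the recomputed ratios can differ from the reported F-values in the last digit.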
Model Summary for Transformed Response
S          R-sq     R-sq(adj)   R-sq(pred)
0.273062   95.14%   88.67%      *

Coefficients for Transformed Response
Term                             Coef   SE Coef   T-Value   P-Value    VIF
Constant                       4.1666    0.0473     88.11     0.000
HRI Mode
  Manual                       0.6392    0.0680      9.40     0.000   2.53
  Mixed Reality                0.3169    0.0805      3.94     0.001   2.40
Gaming Experience
  No                          -0.0545    0.0473     -1.15     0.262   1.44
HRI Mode*Gaming Experience
  Manual No                   -0.0516    0.0680     -0.76     0.457   2.54
  Mixed Reality No             0.0270    0.0805      0.33     0.741   2.68
User(Gaming Experience)
  User 1(No)                   -0.472     0.267     -1.77     0.091      *
  User 10(No)                  -0.260     0.200     -1.30     0.207      *
  User 12(No)                  -0.008     0.268     -0.03     0.975      *
  User 13(No)                   0.375     0.189      1.99     0.060      *
  User 15(No)                   0.121     0.159      0.76     0.455      *
  User 16(No)                  -0.168     0.189     -0.89     0.383      *
  User 17(No)                  -0.194     0.200     -0.97     0.342      *
  User 18(No)                   0.198     0.268      0.74     0.468      *
  User 2(No)                   -0.092     0.189     -0.49     0.631      *
  User 22(No)                   0.094     0.189      0.50     0.623      *
  User 25(No)                   0.021     0.159      0.13     0.899      *
  User 26(No)                   0.222     0.189      1.18     0.252      *
  User 3(No)                    0.172     0.159      1.08     0.292      *
  User 7(No)                    0.179     0.189      0.95     0.354      *
  User 11(Yes)                  0.089     0.158      0.56     0.579      *
  User 14(Yes)                 -0.212     0.198     -1.07     0.296      *
  User 19(Yes)                 -0.362     0.198     -1.82     0.083      *
  User 20(Yes)                  0.211     0.187      1.13     0.273      *
  User 21(Yes)                 -0.258     0.187     -1.38     0.182      *
  User 23(Yes)                  0.222     0.187      1.18     0.250      *
  User 24(Yes)                  0.136     0.158      0.86     0.400      *
  User 5(Yes)                  -0.019     0.187     -0.10     0.919      *
  User 6(Yes)                  -0.327     0.270     -1.21     0.239      *

Regression Equation
ln(TOC) = 4.1666
  + 0.6392 HRI Mode_Manual
  + 0.3169 HRI Mode_Mixed Reality
  - 0.9562 HRI Mode_Virtual Reality
  - 0.0545 Gaming Experience_No
  + 0.0545 Gaming Experience_Yes
  - 0.0516 HRI Mode*Gaming Experience_Manual No
  + 0.0516 HRI Mode*Gaming Experience_Manual Yes
  + 0.0270 HRI Mode*Gaming Experience_Mixed Reality No
  - 0.0270 HRI Mode*Gaming Experience_Mixed Reality Yes
  + 0.0246 HRI Mode*Gaming Experience_Virtual Reality No
  - 0.0246 HRI Mode*Gaming Experience_Virtual Reality Yes
  - 0.472 User(Gaming Experience)_User 1(No)
  - 0.260 User(Gaming Experience)_User 10(No)
  - 0.008 User(Gaming Experience)_User 12(No)
  + 0.375 User(Gaming Experience)_User 13(No)
  + 0.121 User(Gaming Experience)_User 15(No)
  - 0.168 User(Gaming Experience)_User 16(No)
  - 0.194 User(Gaming Experience)_User 17(No)
  + 0.198 User(Gaming Experience)_User 18(No)
  - 0.092 User(Gaming Experience)_User 2(No)
  + 0.094 User(Gaming Experience)_User 22(No)
  + 0.021 User(Gaming Experience)_User 25(No)
  + 0.222 User(Gaming Experience)_User 26(No)
  + 0.172 User(Gaming Experience)_User 3(No)
  + 0.179 User(Gaming Experience)_User 7(No)
  - 0.187 User(Gaming Experience)_User 9(No)
  + 0.089 User(Gaming Experience)_User 11(Yes)
  - 0.212 User(Gaming Experience)_User 14(Yes)
  - 0.362 User(Gaming Experience)_User 19(Yes)
  + 0.211 User(Gaming Experience)_User 20(Yes)
  - 0.258 User(Gaming Experience)_User 21(Yes)
  + 0.222 User(Gaming Experience)_User 23(Yes)
  + 0.136 User(Gaming Experience)_User 24(Yes)
  - 0.019 User(Gaming Experience)_User 5(Yes)
  - 0.327 User(Gaming Experience)_User 6(Yes)
  + 0.521 User(Gaming Experience)_User 8(Yes)
Equation treats random terms as though they are fixed.
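The regression equation is on the natural-log scale, since the Method block reports a Box-Cox transformation with λ = 0. As a rough cross-check, and assuming the (-1, 0, +1) effect coding stated there, exponentiating the constant plus each HRI Mode coefficient should give a mode-level TOC in seconds close to the means listed in the Tukey grouping table later in this output. The names `CONSTANT`, `MODE_EFFECT`, and `back_transformed_mean` are illustrative and not part of the Minitab output.

```python
import math

# Coefficients copied from the regression equation (natural-log scale).
CONSTANT = 4.1666
MODE_EFFECT = {
    "Manual": 0.6392,
    "Mixed Reality": 0.3169,
    "Virtual Reality": -0.9562,
}


def back_transformed_mean(mode: str) -> float:
    """Return exp(constant + mode effect): a fitted TOC in seconds for one mode."""
    return math.exp(CONSTANT + MODE_EFFECT[mode])


for mode in MODE_EFFECT:
    print(f"{mode}: {back_transformed_mean(mode):.1f} s")
```

Under effect coding the three mode coefficients sum to approximately zero, so the constant acts as the overall mean of ln(TOC) and each coefficient is a deviation from it.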
Fits and Diagnostics for Unusual Observations
Original Response
Obs      TOC      Fit
  2    15.00    15.00
 10   109.00   109.00
 28   134.00   134.00
 49   105.00   155.69
 67    98.00    98.00
 74    43.00    43.00
 76   140.00    91.18
 77    13.00    19.96

Fits and Diagnostics for Unusual Observations
Transformed Response
Obs    TOC'     Fit    Resid   Std Resid
  2   2.708   2.708    0.000           *       X
 10   4.691   4.691    0.000           *       X
 28   4.898   4.898    0.000           *       X
 49   4.654   5.048   -0.394       -2.01   R
 67   4.585   4.585    0.000           *       X
 74   3.761   3.761    0.000           *       X
 76   4.942   4.513    0.429        2.34   R
 77   2.565   2.994   -0.429       -2.34   R
TOC' = transformed response
R  Large residual
X  Unusual X

Expected Mean Squares, using Adjusted SS
    Source                        Expected Mean Square for Each Term
1   HRI Mode                      (5) + Q[1, 3]
2   Gaming Experience             (5) + 1.3893 (4) + Q[2, 3]
3   HRI Mode*Gaming Experience    (5) + Q[3]
4   User(Gaming Experience)       (5) + 1.9130 (4)
5   Error                         (5)

Error Terms for Tests, using Adjusted SS
    Source                        Error DF   Error MS   Synthesis of Error MS
1   HRI Mode                         21.00     0.0746   (5)
2   Gaming Experience                35.08     0.0906   0.7262 (4) + 0.2738 (5)
3   HRI Mode*Gaming Experience       21.00     0.0746   (5)
4   User(Gaming Experience)          21.00     0.0746   (5)

Variance Components, using Adjusted SS
Source                      Variance    % of Total   StDev      % of Total
User(Gaming Experience)     0.0115176   13.38%       0.107320   36.58%
Error                       0.0745630   86.62%       0.273062   93.07%
Total                       0.0860806                0.293395

Comparisons for TOC

Tukey Pairwise Comparisons: HRI Mode
Grouping Information Using the Tukey Method and 95% Confidence
HRI Mode           N      Mean   Grouping
Manual            19   122.216   A
Mixed Reality      9    88.546   A
Virtual Reality   22    24.789       B
Means that do not share a letter are significantly different.

[Figure: Tukey Simultaneous 95% CIs]

Test for Equal Variances: TOC (LN) versus HRI Mode, ... ing Experience

Method
Null hypothesis           All variances are equal
Alternative hypothesis    At least one variance is different
Significance level        α = 0.05

95% Bonferroni Confidence Intervals for Standard Deviations
HRI Mode          Gaming Experience    N   StDev      CI
Manual            No                  13   0.191073   (0.126413, 0.36234)
Manual            Yes                  7   0.258936   (0.141940, 0.75809)
Mixed Reality     No                  12   0.426655   (0.260106, 0.89707)
Mixed Reality     Yes                  9   0.268771   (0.094213, 1.08473)
Virtual Reality   No                  14   0.400067   (0.241901, 0.81529)
Virtual Reality   Yes                  9   0.320599   (0.193265, 0.75238)
Individual confidence level = 99.1667%

Tests
Method                 Test Statistic   P-Value
Multiple comparisons   -                  0.079
Levene                 1.25               0.300

Test for Equal Variances: TOC (LN) versus HRI Mode

Method
Null hypothesis           All variances are equal
Alternative hypothesis    At least one variance is different
Significance level        α = 0.05
Bartlett’s method is used. This method is accurate for normal data only.

95% Bonferroni Confidence Intervals for Standard Deviations
HRI Mode           N   StDev      CI
Manual            20   0.226025   (0.162344, 0.361863)
Mixed Reality     21   0.367692   (0.265998, 0.580281)
Virtual Reality   23   0.386247   (0.283074, 0.594315)
Individual confidence level = 98.3333%

Tests
Method     Test Statistic   P-Value
Bartlett   5.82               0.055

[Figure: Test for Equal Variances: TOC (LN) vs HRI Mode]
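The McNemar p-values in Section D.1 can be cross-checked from the off-diagonal (discordant) counts of each contingency table. The sketch below assumes the exact two-sided binomial form of McNemar's test; the Minitab output does not state which form it uses, so agreement with the reported values is suggestive only. The function name `mcnemar_exact_p` and the dictionary `DISCORDANT` are illustrative.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so there is no evidence of a difference
    k = min(b, c)
    # Lower tail P(X <= k) for X ~ Binomial(n, 1/2), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Discordant counts (row No / column Yes, row Yes / column No) from Section D.1.
DISCORDANT = {
    "Manual vs Mixed Reality": (4, 14),
    "Mixed Reality vs Virtual Reality": (13, 0),
    "Manual vs Virtual Reality": (6, 3),
}

for pair, (b, c) in DISCORDANT.items():
    print(f"{pair}: p = {mcnemar_exact_p(b, c):.3f}")
```

Only the discordant cells enter the test, which is why the concordant counts (for example the 2 and 6 in the Manual versus Mixed Reality table) do not appear in the sketch.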

