Building Believable Robots

An exploration of how to make simple robots look, move, and feel right

by

Paul Bucci

B.A. Visual Arts, The University of British Columbia, 2012
B.Sc. Computer Science, The University of British Columbia, 2015

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Master of Science

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Computer Science)

The University of British Columbia
(Vancouver)

August 2017

© Paul Bucci, 2017

Abstract

Humans have an amazing ability to see a 'spark of life' in almost anything that moves. There is a natural urge to imbue objects with agency. It's easy to imagine a child pretending that a toy is alive. Adults do it, too, even when presented with evidence to the contrary. Leveraging this instinct is key to building believable robots, i.e., robots that act, look, and feel like they are social agents with personalities, motives, and emotions. Although it is relatively easy to initiate a feeling of agency, it is difficult to control, consistently produce, and maintain an emotional connection with a robot. Designing a believable interaction requires balancing form, function and context: you have to get the story right.

In this thesis, we discuss (1) strategies for designing the bodies and behaviours of simple robot pets; (2) how these robots can communicate emotion; and (3) how people develop narratives that imbue the robots with agency. For (1), we developed a series of four robot design systems to create and rapidly iterate on robot form factors, as well as tools for improvising and refining expressive robot behaviours. For (2), we ran three studies wherein participants rated robot behaviours in terms of arousal and valence under different display conditions. For (3), we ran a study wherein expert performers improvised emotional 'stories' with the robots; one of the studies in (2) also included soliciting narratives for the robot and its behaviours.

Lay Summary

Humans have an amazing ability to see a 'spark of life' in almost anything that moves. It's easy to imagine a child pretending that a toy is alive. Adults do it, too. Leveraging this instinct is key to building believable robots, i.e., robots that act, look, and feel like they are social beings with personalities, motives, and emotions. Although it is relatively easy to make a robot seem alive, it is difficult to control, consistently produce, and maintain an emotional connection with a robot. Designing a believable interaction requires balancing form, function and context: you have to get the story right. For this thesis, we explored how robots can communicate emotions through simple body movements. We found that people will perceive a wide range of emotions from even simple robots, create complex stories about the robot's inner life, and adapt their behaviour to match the robot.

Preface

This thesis is organized around two major published papers, with some current and unpublished work. For all work, I collaborated closely with members of the SPIN Lab, especially Laura Cang, Oliver Schneider, David Marino and my supervisor, Karon MacLean, and two visiting researchers, Jussi Rantala and Merel Jung. In conjunction with Laura and Oliver, I supervised a number of undergraduate researchers, all of whom contributed time and effort to the projects, including Sazi Valair, Lucia Tseng, Sophia Chen and Lotus Zhang. I am extremely grateful to each of them; however, I would attribute most of the intellectual contribution to myself, Laura, Oliver, and David.
In CuddleBits [16], I was the principal designer, architect, and builder of the CuddleBit form factors and design systems. Although other people helped with evaluation and assembly, this is my work. Further CuddleBit designs included a twisted string actuator introduced and developed by Soheil Kianzad.

The chapter on behaviour generation (Chapter 4) outlines a number of behaviour generation ideas and systems. MacaronBit is an extension of work by Oliver that was initially developed by myself, Oliver, Jussi, Merel, and Laura, and carried into maturity by myself and David. Voodle [51] is a system initially conceived of by Oliver and prototyped by Oliver and David. I architected and implemented Voodle's final versions that were used in our studies, with significant help from David in writing the final code. The Twiddler and Hand-Sketching prototypes were developed on my own. The work in Complexity stems from ideas developed by Laura and myself in CuddleBits, but was carried to maturity by myself, with significant help from Lotus in taking care of the very important minutiae.

The chapter on behaviour evaluation (Chapter 5) outlines a number of studies in which we evaluated the CuddleBits and Voodle, through both displayed and interactive behaviours. In CuddleBits, the initial study design was developed collaboratively between Laura, Oliver, Jussi, Merel, and myself, then brought to maturity by Laura and myself. In Voodle, I largely architected the study design, with help from David. Voodle analysis was performed collaboratively between myself, David, and Oliver.

As first author on CuddleBits, I principally drafted, organized, edited and wrote the paper. Laura and Karon contributed significantly to editing, with help from Jussi, Merel and Sazi.

On Voodle, I contributed heavily to initial drafting, framing, analysis and writing. David, Oliver, and Karon carried the paper through the first draft, then I edited and wrote heavily for the second draft. The final draft was largely an effort by Karon and David.

The chapter on current and future work (Chapter 6) includes ideas that are currently in development by myself, Laura, and Mario Cimet in an equal intellectual partnership. It also references unpublished work developed by Laura, Jussi, Karon, and myself, with significant inspiration from Jessica Tracy.

The chapter on related work (Chapter 2) is largely taken and amended from Voodle and CuddleBits. Sections on believability and complexity are my own.

Research was conducted under the following ethics certificates: H15-02611; H13-01620; H09-02860.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgments
Dedication
1 Introduction
  1.1 What Makes a Social Robot Believable?
  1.2 Affective Touch and Narrative Context
  1.3 CuddleBots and Bits
  1.4 Approach
  1.5 Contributions and Limitations
    1.5.1 CuddleBits: Simple Robots for Studying Affective Touch
    1.5.2 Voodle: Vocal Doodling for Improvising Robot Behaviours
    1.5.3 Complexity: Characterizing Pleasant/Unpleasant Robot Behaviours
    1.5.4 Limitations
  1.6 Thesis Organization
2 Related Work
  2.1 Companion Robots, from Complex to Simple
  2.2 Rendering Emotion Through Breathing
  2.3 Emotion Models
  2.4 Complexity and Physiological Behaviours
    2.4.1 Sample Entropy
    2.4.2 Multi-Scale Entropy
  2.5 Coupled Physical and Behaviour Design
    2.5.1 Sketching Physical Designs
    2.5.2 Creating Expressive Movement
  2.6 Voice Interaction
    2.6.1 Alignment
    2.6.2 Iconicity
  2.7 Believability
  2.8 Gestural and Emotional Touch Sensing
3 Sketching Robot Bodies
  3.1 Paper Prototyping for Robotics
  3.2 Design Systems
    3.2.1 Considerations
    3.2.2 CuddleBit Visual Design Systems
  3.3 The ComboBit
  3.4 Moving from 1-DOF to 2-DOF
  3.5 Conclusions
4 Generating Robot Behaviours
  4.1 Early tool attempts
  4.2 MacaronBit: A Keyframe Editor for Robot Motion
  4.3 Voodle: Vocal Doodling for Affective Robot Motion
    4.3.1 Pilot Study: Gathering Requirements
    4.3.2 Voodle Implementation
  4.4 Evaluating Voodle and MacaronBit
    4.4.1 Methods
    4.4.2 Results
  4.5 Conclusions
5 Evaluating Robot Behaviours
  5.1 CuddleBits Study 1: Robot Form Factor and Behaviour Display
    5.1.1 CuddleBits Study 1: Methods
    5.1.2 CuddleBits Study 1: Results
    5.1.3 CuddleBits Study 1: Takeaways
  5.2 CuddleBits Study 2: Evaluating Behaviours
    5.2.1 CuddleBits Study 2: Methods
    5.2.2 CuddleBits Study 2: Data Preprocessing
    5.2.3 CuddleBits Study 2: Data Verification
    5.2.4 CuddleBits Study 2: Analysis and Results
    5.2.5 CuddleBits Study 2: Takeaways
  5.3 Voodle: Co-design Study
    5.3.1 Voodle: Methods
    5.3.2 Voodle: Participants
    5.3.3 Voodle: Analysis
    5.3.4 Voodle: Discussion and Takeaways
    5.3.5 Voodle: Insights into Believability and Interactivity
  5.4 Complexity: Valence and Narrative Frame
    5.4.1 Complexity: Methods
    5.4.2 Complexity: Results
    5.4.3 Complexity: Discussion, Takeaways, and Future Work
  5.5 Conclusions for Complexity, Voodle, and CuddleBits Behaviour Evaluation
6 Conclusions and Ongoing Work
  6.1 Interactivity
  6.2 Developing a Therapeutic Robot
  6.3 Can We Display Valenced Behaviours?
  6.4 Lessons for Designing Emotional Robots
Bibliography
A Supporting Materials

List of Tables

Table 4.1 Pilot Study: Linguistic features that participants felt corresponded best with robot position in the imitation task. "+" and "-" indicate feature presence or absence. The comparative Study 1 went on to use pitch as a primary design element.

Table 4.2 Affect Grid quadrants of PANAS emotion words. † represents words used in CuddleBits Behaviour Generation Study; ‡ represents words used in CuddleBits Study 1 (see Evaluating); * represents words used in CuddleBits Study 2 (see Evaluating).

Table 5.1 Summary of Voodle Co-Design Themes. We refer to each theme by abbreviation and session number (e.g., "PL1").

Table 5.2 The correlation between summary statistics of valence ratings (columns) and complexity measures (rows). For example, the top left cell is the correlation of the mean valence rating per behaviour and the variance of the behaviour signal.

List of Figures

Figure 1.1 Examples of social robots. The Nao (left) is a humanoid robot, Google's self-driving car (centre) acts in human social spaces, and Universal Robotics' UR-10 is a collaborative pick-and-place robot.

Figure 2.1 Two dimensional models: left, Russell's circumplex with associated PANAS words; right, the discretized affect grid.

Figure 3.1 Top: the evolution of the RibBit and FlexiBit from low-fidelity paper (and plastic) prototypes to study-ready medium-fidelity prototypes. Bottom: a diagram of the actuation mechanisms of the RibBit and FlexiBit. The RibBit looks like a rib cage, if you imagine that there are hinges along its spine. The robot expands by the ribs moving outwards as figured above. The FlexiBit looks like an orange carefully peeled by slicing into equal sections, if the orange was then taken out and the slices were reattached at the bottom and top.

Figure 3.2 Screenshot from Sagmeister and Walsh's website. Notice how a wide number of products can be generated by creating a design system, i.e., an aesthetic, colour palette, and canonical shapes.
Figure 3.3 FlexiBit design system. Using two canonical shapes, the slice and the base, the FlexiBit's shape and size can be quickly and easily varied.

Figure 3.4 RibBit design system. Using Adobe Illustrator, it is easy to modify the texture, shape, size, and number of the ribs, making iteration quick and easy.

Figure 3.5 SpineBit design system. The whole robot is built off of configurations of a single slice (shown) which is defined by parameterized curves. By defining key slices and interpolating between them, a new robot shape can be produced.

Figure 3.6 ComboBit design system. The robot is built off of configurations of two slices (two configurations of the same 'rib' slice shown left; one configuration of the 'spine' slice shown right), each defined by parameterized curves. By defining key slices and interpolating between them, a new robot shape can be produced.

Figure 3.7 Example of two slices created by design tables. Notice that the defined curves are the same for both: the same number of circles and curves in the same relative placements, with the same relations (i.e., some point X is coincident with some circle K), but with different values for the same dimensions. The values of the dimensions are set in an Excel table, which is used by SolidWorks to produce different configurations of the same shapes and relations.

Figure 4.1 For behaviour generation, users had two design tools to help create behaviours: Voodle and MacaronBit. With Voodle, users could optionally sketch a robot motion using their voice. Their vocal input was imported into MacaronBit as raw position keyframes (shown as 'pos' above). Users could modify the waveform by manipulating other parameters, specifically randomness (shown as 'ran'), max and min position. MacaronBit includes standard keyframe editing functions.

Figure 4.2 Voodle (vocal doodling) uses vocal performance to create believable, affective robot behaviour. Features like tone and rhythm are translated to influence a robot's movement.

Figure 4.3 The 1-DOF CuddleBit robots used in the Voodle co-design study: (a) RibBit: A CuddleBit that looks like a set of ribs; (b) FlexiBit: A Bit whose stomach "breathes" via a servo; (c) FlappyBit: A Bit with an appendage that flaps up and down via a servo; (d) VibroBit: A Bit that vibrates via an eccentric mass motor; (e) BrightBit: A Bit whose eyes light up via an LED.

Figure 4.4 The Voodle system implementation, as it evolved during our studies. Additions for each stage are highlighted in yellow. In our final system, incoming vocal input is analyzed for amplitude and fundamental frequency. These signals are normalized between 0 and 1, then averaged, weighted by a "pitch/amp" bias parameter. Randomness is then inserted into the system, which we found increased a sense of agency. Output is smoothed either with a low-pass filter or PD control. Final output can be reversed to accommodate user narratives (i.e., robot is crunching with louder voice vs. robot is expanding) for several different CuddleBits.
Figure 5.1 Waveforms of Study 1 behaviours as designed by researchers. Each quadrant is represented by a PANAS affect word corresponding to the extremes along (valence, arousal) axes, i.e., Excited is high-arousal, positive-valence.

Figure 5.2 Mean behaviour ratings (+2 for Match; -2 for Not Match) for FlexiBit grouped by the researcher-designed behaviours (horizontal) and the emotion word against which participants rated behaviours (vertical). Researcher-designed behaviours correspond with (1) to (8) in Figure 5.1. RibBit scores were similar and omitted for space.

Figure 5.3 For each behaviour and viewing condition, a single vector was calculated by adding the vectors of the top three words that participants chose, weighted by confidence levels. Word vectors were determined at the beginning of the session, when participants rated each word in terms of arousal and valence.

Figure 5.4 Each plot shows a single behaviour's arousal (-1,1) and valence (-1,1) ratings. Live viewing condition is in red, video in blue. Green ellipses show confidence intervals at 5% and 97.5%. Green cross is mean, purple cross is median. Each plot corresponds to a single PANAS word, each row corresponds to an affect grid quadrant. Rows are ordered from the top: Depressed, Relaxed, Excited, Stressed.

Figure 5.5 Correlation results from behaviours that were designed for an emotion label but unrated by participants (marked unrated above) were calculated on all 72 designs from CuddleBits: Participant Generated Behaviours (see Generating); correlation results from participant ratings were calculated on the 16 behaviours from CuddleBits: Study 2 (marked by viewing condition). A strong positive correlation is shown between the position total variance and all arousal columns (unrated_a, video_a, live_a) – the higher the total variance, the higher the arousal.

Figure 5.6 Reported affect grids by participant and session. After being instructed about dimensions of arousal and valence, participants drew the robot's expressive range directly on affect grids. Participants indicated increased expressivity from sessions 1 to 2, differences between voice and Wheel control, and that each robot had a different range.

Figure 5.7 Vision for "Embedded Voodle": Voodle could be a natural low-cost method to emotionally color a motion path in more complex robots.

Figure A.1 RibBit assembly instructions, page 1.
Figure A.2 RibBit assembly instructions, page 2.
Figure A.3 RibBit assembly instructions, page 3.
Figure A.4 RibBit assembly instructions, page 4.
Figure A.5 RibBit assembly instructions, page 5.
Figure A.6 RibBit assembly instructions, page 6.
Figure A.7 RibBit assembly instructions, page 7.
Figure A.8 RibBit assembly instructions, page 8.
Figure A.9 RibBit assembly instructions, page 9.
Figure A.10 RibBit design system explainer, page 1.
Figure A.11 RibBit design system explainer, page 2.
Figure A.12 RibBit design system explainer, page 3.
Figure A.13 Lasercutting files for the RibBit.
Figure A.14 FlexiBit assembly instructions, page 1.
Figure A.15 FlexiBit assembly instructions, page 2.
Figure A.16 FlexiBit assembly instructions, page 3.
Figure A.17 FlexiBit assembly instructions, page 4.
Figure A.18 FlexiBit design system explainer, page 1.
Figure A.19 FlexiBit design system explainer, page 2.
Figure A.20 FlexiBit design system explainer, page 3.
Figure A.21 FlexiBit design system explainer, page 4.
Figure A.22 FlexiBit design system explainer, page 5.
Figure A.23 FlexiBit design files to be cut out.

Acknowledgments

No good work is done alone. I certainly couldn't have done anything interesting without support from my colleagues, friends, and loved ones. It is rare that we allow ourselves the opportunities to be effusive, so I will do my best to make the most of it.

To each of my colleagues at the SPIN Lab and MUX group, I thank you for your critique, enthusiasm, and support. I should mention a few by name:

Laura, for over two and a half years of productive collaboration and friendship. It's a rare thing to find someone who is able and willing to take one's ideas seriously. You've done that and much more. I am very grateful to have worked with someone so enthusiastic, intelligent, and thoughtful.

Oliver, for consistent mentorship and enthusiastic help whenever I needed it. I learned a lot from you about research, and I'd like to say I taught you a thing or two as well. I've enjoyed getting to know you, and I hope we can work together again.

David, for listening to me talk for hours, encouraging me, working with me, writing with me, tolerating me, and believing in my silly ideas. It's been a privilege and a joy to work with you and be your friend.

Jussi and Merel, for a very creative half a year and excellent intercontinental collaboration for the next couple of years.

Karon, for believing in me many times over, pushing me forwards, looking carefully and deeply at my work, and saying 'yes' more than you needed to. It's been a privilege to work with you and learn from you.

This work was also supported by contributions from many people in other departments. Jon Nakane (EngPhys), Nick Scott, Graham Entwistle and Blair Satterfield (SALA) deserve special mention.

I'm personally grateful to Tamara Munzner, Christine D'Onofrio, and Jessica Dawson, all of whom have given time and thought to helping me advance in my academic career. And to Eric Vatikiotis-Bateson, who believed in me and helped shape my perspective on science. RIP.

And, of course, I'm deeply grateful to my friends and family who have supported me along the way.

Ashley, who has been willing to listen to hours of my fool ideas for years now.

All of the Oldbyssey and Abby crews who have politely engaged me in conversation about this.

Micki, for love, and for putting me up while I finished this thing.

My mother, Dorothy, who has spent many car rides, late nights, and early mornings listening to me talk, ever since I was able to talk.

Dedication

To Micki, who has shown me more kindness, support and love than I have yet known.

Chapter 1: Introduction

"The most sophisticated people I know—inside they are all children." — Jim Henson

This work is pretty silly. I want you, the reader, to know that up front, and to know that I know that this is silly.
But I have taken this silly stuff pretty seriously, and have found that keeping things playful has helped me and my colleagues tackle some big and interesting problems in affective computing and social robotics. So, dear reader, prepare yourself for a monograph that looks seriously at crafts, toys, puppetry, and funny noises, and hopefully you'll agree that being a little silly can produce some interesting results.

A note on pronouns: "I" will refer to myself, i.e., the author of this document, Paul Bucci. "We" will refer to myself and my collaborators in the Sensory Perception and Interaction Research Group, a.k.a. SPIN Lab.

1.1 What Makes a Social Robot Believable?

Deciding what constitutes a robot can be difficult. If you go broad, any interactive system can seem like a robot. Do you include wearable electronics? How about an interactive light installation? Does the robot need to have a body? Can you call a character in a cartoon a robot? Does the robot need to have senses, i.e., vision, hearing, touch?

Here, I'll be discussing social robots, which are robots that need to act in a human social space. You can imagine robots that greet you at the door like the Nao (https://www.ald.softbankrobotics.com/en/cool-robots/nao), or that you collaborate with like many pick-and-place robots (the example pictured is from https://www.universal-robots.com/), or that drive you around like Google's self-driving car (https://waymo.com) (see Figure 1.1).

Figure 1.1: Examples of social robots. The Nao (left) is a humanoid robot, Google's self-driving car (centre) acts in human social spaces, and Universal Robotics' UR-10 is a collaborative pick-and-place robot.

For this work, we'll assume that, yes, a robot does need to have a body. That leaves on-screen cartoon characters out of the robot category. We'll further assume that a robot needs to be perceived as an autonomous agent, or an independent moving thing with a mind of its own. Interactive light installations, automatic doors, and wearable electronics are all out, too. Finally, we'll assume a robot needs to be able to sense and react to the world, so stuffed animals, cuckoo-clocks and wind-up toys are out.

I'll confess upfront to cheating a little. Although the robots described herein can and have been given both sensing and reactive capabilities, some of the studies we ran had robots that could not sense and could only act. But I would still be confident in calling them believable social robots—why?

As an exercise for the reader, try ordering the examples below in terms of robot-ness, then social-ness. (I am intentionally using the -ness suffix over "robotic" and "sociality" to avoid connotations of repetitive, automatic movements and the ability to socialize, respectively.) You can decide whether you agree with my ordering. The point is not to determine some universal order, but rather to try to tease apart what makes something feel like a social agent.

Try it.

• on-screen cartoon characters
• interactive light installations
• automatic doors
• wearable electronics
• stuffed animals
• cuckoo-clocks
• wind-up toys

Done? Both social-ness and robot-ness? My turn. I'll just do three.
I would order stuffed animals < automatic doors < wind-up toys in terms of robot-ness. But if I were to order in terms of social-ness, I would order automatic doors ...

[Table 4.1, continued] Example "tcheen" [tʃin]: rapid movements, e.g., the Bit moves very quickly between different positions. +/- Voiced consonants: a consonant is voiced if it's produced while the vocal folds are vibrating, e.g., "ga" [ga] (voiced), "ka" [ka] (unvoiced). Voiced consonants were associated with smooth motion, while unvoiced consonants were associated with less smooth motion.

Results

Phonetic Features: Table 4.1 reports typical phonetic features that we observed in the pilot study's imitation task. We transcribed vocalizations into the International Phonetic Alphabet (IPA), then organized them by distinctive phonological features [20]. The most compelling features, based on discriminability of motion and feasibility of implementation, were pitch, continuants, stridents, and voiced consonants.

Metaphors for Sound-to-Behaviour Mappings: Participants instituted a relationship between pitch, amplitude and height: the higher the robot's ribs, the higher the pitch and amplitude. There were exceptions to this pattern; for example, one participant saw the robot's downward movement as 'flexing,' and therefore used increased vocal pitch and amplitude to represent its downward movement. Table 4.1 reports contrasting relationships that we observed, with examples.

We saw occasional reversals in participants' mappings between the imitation task and the improvisational task. One possible cause is the Bit's actuation methods, i.e., computer control in imitation, and participant actuation in improvisation. The only direction in which to manually actuate the robot is downwards: its default state is an extended position, and the ribs are normally pulled inwards by a servo. Hence, increased physical effort translates to downward movement. So the mapping between pitch/amplitude and movement may be based on how the participant conceptualizes the "direction" yielded by the work.

Individualized language: Each participant seemed to have idiosyncratic sound patterning. For example, some participants used many voiced stops (e.g., "badum badum") in their utterances. Some participants consistently used multiple syllables with many consonants ("tschugga tschugga"); others consistently produced simple monosyllabic utterances ("mmmm").

4.3.2 Voodle Implementation

Based on piloting guidance, we created a full Voodle system, seen in Figures 4.4 (system design) and 4.2 (system in use).

We found that fundamental frequency and overall amplitude (easily detected in realtime) could capture a variety of relevant vocalizations, including pitch and +continuant features. To accommodate variety in metaphors (e.g., breathing vs. flexing) and individualized language, we included user-adjustable parameters: motion smoothing, gain, a pitch/amplitude weight (where the output is a linear combination of the two features: output = amp × amp_weight + pitch × pitch_weight), and a reverse toggle. Priorities for future phonetic features include distinguishing the additional features reported in Table 4.1.

Voodle was implemented in JavaScript: a NodeJS server connected with the RibBit using Johnny-Five and ReactJS [43, 66].

Input audio was analyzed in 1 s windows. Amplitude was determined by the maximum value in the window, deemed sufficient through piloting. The fundamental frequency was calculated using the AMDF algorithm [85], the best performer in informal piloting. Figure 4.4 shows the algorithm's evolution.
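To make this mapping concrete, here is a minimal, self-contained sketch of the pipeline just described: a one-second window of audio samples comes in, and a smoothed 1-DOF position in [0, 1] comes out. It is an illustration under stated assumptions rather than the actual Voodle source; the function and parameter names (voodleStep, ampWeight, smoothing, and so on), the fixed pitch range, and the one-pole low-pass smoother are all hypothetical choices.

```javascript
// Illustrative sketch only: map a 1 s window of vocal samples to a 1-DOF robot
// position. Not the real Voodle implementation; names and constants are assumed.

const F_MIN = 80;   // assumed lower bound of vocal pitch, Hz
const F_MAX = 500;  // assumed upper bound of vocal pitch, Hz

const clamp01 = (x) => Math.min(1, Math.max(0, x));

// Average Magnitude Difference Function (AMDF) pitch estimate: find the lag
// that minimizes the mean absolute difference between the signal and itself.
function estimatePitchAMDF(samples, sampleRate) {
  const minLag = Math.floor(sampleRate / F_MAX);
  const maxLag = Math.floor(sampleRate / F_MIN);
  let bestLag = minLag;
  let bestDiff = Infinity;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let diff = 0;
    for (let i = 0; i + lag < samples.length; i++) {
      diff += Math.abs(samples[i] - samples[i + lag]);
    }
    diff /= samples.length - lag;
    if (diff < bestDiff) { bestDiff = diff; bestLag = lag; }
  }
  return sampleRate / bestLag; // fundamental frequency in Hz
}

// One update: window of samples in, smoothed servo target (0..1) out.
// Assumes params.ampWeight + params.pitchWeight sum to 1.
function voodleStep(samples, params, state) {
  // Amplitude = maximum absolute sample value in the window.
  const amp = samples.reduce((m, s) => Math.max(m, Math.abs(s)), 0);
  const pitch = estimatePitchAMDF(samples, params.sampleRate);

  // Normalize both features into [0, 1].
  const ampN = clamp01(amp / params.maxAmp);
  const pitchN = clamp01((pitch - F_MIN) / (F_MAX - F_MIN));

  // Weighted combination controlled by the pitch/amp bias parameter.
  let target = params.ampWeight * ampN + params.pitchWeight * pitchN;

  // A little randomness, which participants read as agency.
  target += (Math.random() - 0.5) * params.randomness;

  // Smooth with a one-pole low-pass filter (PD control is the other option above).
  state.position += params.smoothing * (clamp01(target) - state.position);

  // Optionally reverse to fit the user's narrative (louder = crunch vs. expand).
  return params.reverse ? 1 - state.position : state.position;
}

// Example usage: call once per analysis window and send the result to the servo.
const state = { position: 0.5 };
const params = { sampleRate: 44100, maxAmp: 0.8, ampWeight: 0.5,
                 pitchWeight: 0.5, randomness: 0.05, smoothing: 0.2, reverse: false };
// const position = voodleStep(windowSamples, params, state);
```

The actual system additionally exposes a gain parameter and can smooth with PD control instead of a low-pass filter, as described above.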
Voodle is open-source, available at https://github.com/ubcspin/Voodle.

4.4 Evaluating Voodle and MacaronBit

To develop a set of behaviours to evaluate for emotion content, we ran a study wherein participants were asked to work with an expert animator to create behaviours given an emotion word.

4.4.1 Methods

We recruited ten participants to design five robot behaviours, each based on an emotion word from the PANAS scale [93]. Three self-identified as singers or actors.

To define the design tasks, participants were assigned one word per affect grid quadrant, chosen randomly without replacement from the five PANAS words for that quadrant; participants selected a fifth word. The words were presented in random order.

Table 4.2: Affect Grid quadrants of PANAS emotion words. † represents words used in the CuddleBits Behaviour Generation Study; ‡ represents words used in CuddleBits Study 1 (see Evaluating); * represents words used in CuddleBits Study 2 (see Evaluating).

Unpleasant Activated: stressed†*, upset‡*, scared‡*, guilty‡, hostile‡, nervous‡
Pleasant Deactivated: relaxed†‡*, calm‡*, at rest‡*, serene‡, at ease‡
Pleasant Activated: excited†‡*, attentive‡*, determined‡*, proud‡, enthusiastic‡
Unpleasant Deactivated: depressed†*, drowsy‡*, bored‡*, dull‡, sluggish‡, droopy‡

For each word, the participant was given the option to express the behaviour with Voodle, design it using a traditional keyframe editor, or switch between these as needed.

The keyframe editor, Macaron [78], allows users to specify Bit height (periodic movement amplitude) over time, as well as remix and transform their original animations through copy/pasting, scaling keyframes, inversion, and other functions. Participants could export their voodles as keyframe data for later refinement in Macaron.

During the study, an expert animator (a co-author) acted as a design assistant, introducing participants to the robot and the two tools. The animator assisted participants in creating compelling designs, offering technical support and guidance as needed, but did not create animations for them. Meanwhile, another researcher acted as an observer, taking notes on tool use and conducting a brief informal exit interview.

Participants could create as many designs for each emotion word as they wanted, using any tool at any time, until they were satisfied with the result; for example, they might make three designs for "excited" and choose their favourite.

4.4.2 Results

In most cases, Voodle was used first to sketch behaviours. When a participant had a clear idea of what the behaviour should look like, both sketching and refining were performed in MacaronBit.

A library of 72 behaviours labelled by emotion word was generated (participants designed multiple behaviours per word); analysis of these behaviours is presented in the CuddleBits Study 2 results (see Evaluating, Chapter 5).

Participants agreed that the robots came to life: "it shocked me how alive it felt," "it tries to behave like a living thing would."

Voodling was used by participants to express emotions: "the things [Voodle]'s listening for is different from the things Siri listens for... it's usually emotional meaning or mental state that's conveyed by [pitch, volume and quality]". While 7/10 participants used Voodle, those with performance experience used Voodle more. This may be individual preference: voodling is performance, and tended to be preferred by those comfortable with performing.

Participants generally chose to use Voodle to augment their keyframe-editor work, rather than as a stand-alone tool.
Only two participants (both performers) ever designed with Voodle alone, and each did so for only one behaviour design task.

Voodle was most appropriate for exploring and sketching ideas, not fine-tuned control. When users knew their goal, they moved straight to the keyframe editor: "it always seemed easier to go to [the keyframe] editor to do what I had in my head than trying to vocalize and create that through voice."

We found participants had trouble expressing static emotional states (e.g., distressed); these became clearer when contrasted with an opposing emotion. In our next study with Voodle, we changed the task to transitions between emotional states.

Supplementing these observations, we note that a concurrent study (whose focus was on developing and assessing these robots' expressive capacity, and not on input tools) also used these Voodle-generated animations along with others, and confirmed that they covered a large emotional space [16]. Specifically, independent judges consistently assessed Bit animations as well-distributed across the arousal dimension, and somewhat along valence.

We concluded that Voodle had value for sketching expressive robot behaviours, but needed further development.

4.5 Conclusions

In this section, we discussed the development and assessment of our two robot behaviour design tools, Voodle and MacaronBit. The former is used for sketching and the latter for refining robot behaviour designs. Both tools have so far been used for designing 1-DOF robot behaviours; there is evidence that extensions to multiple DOFs are possible, but the transition would not be simple. Like many 3D keyframe animation tools, MacaronBit may need a more complex timeline-based approach where multiple DOFs and complex movements are controlled through composing simpler movements (i.e., by joining, nesting, etc.). A vision for Voodle may be to set two keyframed multi-DOF robot positions, then to use voice to interpolate between them. For example, imagine a humanoid robot going from crouching to standing.

Chapter 5: Evaluating Robot Behaviours

To determine the ability of the CuddleBits to display a wide range of emotions, we ran two studies where participants rated breathing behaviours in terms of arousal and valence (CuddleBits Studies 1 and 2). We then ran a six-week co-design study with expert performers to further develop our improvisation tool and to study how behaviour design could work when conveying changing emotional behaviours, i.e., from stressed to relaxed (Voodle). Last, we ran a study exploring the relationship between complexity, valence, and narrative context, wherein participants rated robot behaviours for valence and created short stories that explained the robot behaviours (Complexity).

5.1 CuddleBits Study 1: Robot Form Factor and Behaviour Display

We evaluated the emotional expression capabilities of our two CuddleBit forms (FlexiBit and RibBit) on eight behaviours representing four emotional states. Specifically, we asked:

RQ 1. Can 1-DOF robot movements be perceived as communicating different valence and arousal states?
Hypothesis: Different levels of arousal will be interpreted more accurately than different levels of valence.

[Figure 5.1 shows eight amplitude-over-time waveforms, numbered (1) to (8), arranged on affect-grid axes with quadrants labelled Stressed, Excited, Depressed, and Relaxed.]

Figure 5.1: Waveforms of Study 1 behaviours as designed by researchers. Each quadrant is represented by a PANAS affect word corresponding to the extremes along (valence, arousal) axes, i.e., Excited is high-arousal, positive-valence.

RQ 2.
How is interpretation of emotional content influenced by robot materiality, e.g., a soft furry texture?
Hypothesis: FlexiBit's behaviour will be perceived as conveying more positive valence than RibBit's.

5.1.1 CuddleBits Study 1: Methods

Behaviour design: Team members created and agreed upon two breathing behaviours for each quadrant of the affect grid [72]: Depressed, Excited, Relaxed, or Stressed, for a total of 8 behaviours (represented as motion waveforms in Figure 5.1). Each emotion word typifies the extreme of its emotion quadrant (i.e., Stressed is high-arousal, negative-valence).

Participants: 20 participants, aged 20–40, with cultural backgrounds from North America, Europe, Southeast Asia, the Middle East and Africa, were compensated $5 for 30-minute sessions.

Procedure: Participants were given the task of rating each behaviour on a 5-point semantic differential (−2 Mismatch to +2 Match) for two different robots displaying four emotions: Depressed, Excited, Relaxed, or Stressed. For instance, for "FlexiBit feels stressed", a participant would play each behaviour and rate how well it matched the robot portraying stress. During playback and rating, participants kept one hand on the robot and moused with the other; motion was experienced largely haptically. Noise-cancelling headphones played pink noise to mask mechanical noises; instructions were communicated by microphone.

Ratings for each robot were performed separately. Robot block order was counterbalanced, with an enforced 2-minute rest. For each block, all four emotions were presented on the same screen so participants could compare globally. Behaviours (15 s clips) could be played at will during the block. The order of behaviours and emotions was randomised by participant. To reduce cognitive load, participants saw the same behaviour/emotion order for the second block. In total, each participant performed 64 ratings (8 behaviours × 4 emotions × 2 robots). Afterwards, a semi-structured interview was conducted.

5.1.2 CuddleBits Study 1: Results

We compared ratings of each pair of behaviours designed for the same emotion word with pairwise Wilcoxon signed-rank tests with Bonferroni correction (Figure 5.2). Ratings of the two designed behaviours for the same emotion quadrant were not significantly different (α = .050/8 = .006; all p's ≥ .059). Thus, we averaged ratings into four pairs by emotion target (e.g., (1) & (2) in Figure 5.1).

Effect of emotion quadrant on behaviour ratings (significant). Friedman's test on behaviour ratings showed significant differences between behaviours per emotion for both robots (all p's < .001). Post hoc analyses using Wilcoxon signed-rank tests were conducted with a Bonferroni correction (α = .050/6 = .008) to further analyse the effect of emotion condition on researcher-designed behaviours. There were significant differences between high- and low-arousal behaviours (Stressed-Depressed, Stressed-Relaxed, Excited-Depressed and Excited-Relaxed, all p's ≤ .002), but none between behaviours with the same arousal level and different valence content.

Effect of robot on behaviour ratings (not significant). Wilcoxon signed-rank tests with Bonferroni correction showed no statistically significant differences between ratings of emotions displayed on the two distinct robot forms (α = .050/16 = .003; all p's ≥ .026).

Duration (not significant).
A two-way (2 robots × 4 emotions) repeated measures ANOVA showed no significant differences in the time spent rating behaviours (all p's ≥ .079), suggesting each emotion rating was undertaken with similar care.

5.1.3 CuddleBits Study 1: Takeaways

Hypothesis 1: Different levels of arousal are easier to interpret than different levels of valence. – Supported.

In general, participants were able to perceive differences in behaviours designed to convey high or low arousal. Speed or frequency was mentioned most often for arousal variation: low arousal from low frequency and high arousal from high frequency. Participants found interpreting valence more difficult. Thus, behaviours on this 1-DOF display corroborate earlier findings for both dimensions [27, 65, 97].

We posit that the difficulties in determining valence may be due in part to the restrictive range of behaviours. All designs were based on the perception and imagination of three computer science researchers, which may not generalize broadly as effective emotional displays.

Improvement: Behaviours may have more range or more discernible valence when sourced from a more diverse group of designers. To increase emotional variance in Study 2, we recruited participants (N=10), the majority of whom were employed in creative roles, to create the behaviours with an expert designer. Participants were encouraged to puppet robot movements, act out desired movements, and interact with the robot until they were satisfied with the emotional displays.

Hypothesis 2: FlexiBit's behaviour will be perceived as conveying more positive valence than RibBit's. – Not supported.

In post-study interviews, participants reported the movement expressed by the two robot forms as sensorially but not necessarily emotionally different. FlexiBit felt nicer to touch, but its motion was less precise. RibBit's movements were interpreted as breathing or a heartbeat despite the exposed inner workings emphasizing the 'machine-ness' of the robot.

Unexpectedly, while participants specified preferences for FlexiBit's fur and RibBit's motor precision, pairwise comparisons of the same emotions revealed no significant difference between robots. Movement rather than materiality dominated how participants interpreted emotional expression; although visual access to form was restricted during movement, tactility might have modulated perception of, e.g., life-likeness.

Improvement: Whereas robot form factor had little to no influence on emotion recognition results, it did influence how participants perceived the robot. We selected characteristics to emphasize for a second round of robot prototyping, producing a new robot for Study 2. We focused on characteristics that participants referenced as salient or pleasing in interviews, such as fur, texture, and body firmness.

Starting from paper prototypes, we iterated on the RibBit form factor to increase haptic salience and to incorporate positive FlexiBit features. After exploring bumps on the ribs, spine configuration, fur textures, and rib count, we converged on a form that had fewer ribs, dense fur, and a prominent spine. This combined the favourite features of the RibBit (crisp motion and haptic feedback) with the FlexiBit's cuddliness.
With rapid prototyping methods, each paper/lo-fi sketch could be explored in less than an hour; a full new robot prototype took about two hours to modify design files, half an hour to laser cut, and about two hours to assemble.

5.2 CuddleBits Study 2: Evaluating Behaviours

In a second study, we validated our participant-created behaviour designs and explored the effect of presence on emotion evaluation. Here, we ask how consistently behaviours are rated in terms of valence and arousal under two viewing conditions: (1) the robot is present; (2) the robot is displayed via video.

Of the 72-item behaviour set generated by participants (see Generating, Chapter 4), CuddleBits Study 2 used a subset of 16: five researchers selected the most representative designs, converging on the top four per quadrant. Under two viewing conditions {live, video}, participants chose three words that best represented the displayed behaviours and rated their confidence in each chosen word, as well as one or more words that least represented the behaviour. Participants rated the words ahead of time in terms of arousal and valence. Ratings per participant and per viewing condition were combined into a single (valence, arousal) point (described below).

Figure 5.2: Mean behaviour ratings (+2 for Match; −2 for Not Match) for FlexiBit grouped by the researcher-designed behaviours (horizontal) and the emotion word against which participants rated behaviours (vertical). Researcher-designed behaviours correspond with (1) to (8) in Figure 5.1. RibBit scores were similar and omitted for space.

Through this, we explored the following:

RQ 1. Is there a difference in viewing conditions?
Hypothesis: Participants will rate behaviours similarly regardless of viewing condition.

RQ 2. Are behaviours consistently distinguishable?
Hypothesis: Each behaviour will be distinguishable.

RQ 3. Which behaviour design and waveform features correlate with rated dimensions of arousal and valence?
Conjecture: Features that are characteristic of variability will correlate with valence, while features that are characteristic of speed will correlate with arousal.

5.2.1 CuddleBits Study 2: Methods

Participants: We recruited 14 naïve participants (4 male), aged 22–35. Twelve participants were fully proficient in English; the remaining two had an advanced working knowledge. Of the 14 participants, 13 reported having at least some interaction with pets; 6 rarely interacted with robots, and 8 never interacted with robots. All were compensated $15 per session.

Procedure: Participants were seated, introduced to a fur-covered RibBit, and asked to touch the robot to reduce novelty effects. To calibrate emotion words, participants rated the valence and arousal of 12 words on a 9-point scale (Table 4.2). Participants then viewed the 16 robot behaviours in two counterbalanced viewing-condition blocks {live, video}.

In the live condition, participants could physically interact with the CuddleBit while playing each robot behaviour via MacaronBit. Noise-cancelling headphones played pink noise to mask robot noise. In the video condition, participants watched silent videos of the CuddleBit performing the same behaviours (side view, 640×360 px, 30 fps). In both conditions, behaviour order was randomized for each participant.

In each viewing condition, participants were asked to choose 3 emotion words that best represented the behaviour from the list of 12 emotion words they had calibrated previously, indicating their confidence in each word on a 5-point Likert scale.
They watched 16 behaviours and answered qualitative follow-up questions. After an optional 5-minute break, this process was repeated, with the condition blocks counterbalanced. Including a semi-structured interview, the session took approximately 60 minutes.

5.2.2 CuddleBits Study 2: Data Preprocessing

Before each session, participants calibrated the emotion words that they would be using by rating each in terms of arousal and valence. Using the calibrated list of emotion words, we constructed vectors (v = valence, a = arousal) for each word, where 1 ≤ v, a ≤ 9. For each behaviour and viewing condition, the best three words were weighted by their confidence values, added, and normalized. This produced a single (v, a) vector for each behaviour and viewing condition (Figure 5.3).

[Figure 5.3 illustrates the process with example word ratings such as Determined (8, 6), Excited (9, 9), and Nervous (1, 7): for a behaviour in the live condition, the combined vector is b_1,live = (c1·w1 + c2·w2 + c3·w3) / (|w1| + |w2| + |w3|), where the w_i are word vectors and the c_i are confidence weights, with levels {1..5} mapped to {0.0, 0.25, 0.5, 0.75, 1.0}.]

Figure 5.3: For each behaviour and viewing condition, a single vector was calculated by adding the vectors of the top three words that participants chose, weighted by confidence levels. Word vectors were determined at the beginning of the session, when participants rated each word in terms of arousal and valence.

5.2.3 CuddleBits Study 2: Data Verification

Before the following analysis, we ran a series of data verifications to ensure consistency in each participant's responses.

Due to the high subjectivity of the kinds of emotions people will associate with different words, the participant-calibrated emotion words were checked for consistency with the expected PANAS quadrants. For all participants, no more than two words disagreed with the PANAS quadrants; as such, we took the participant-rated words to be reasonably calibrated.

Similarly, for each behaviour per viewing condition, the best three rated words were checked both against themselves and against the selected least-representative word(s). Roughly 50 per cent agreed within a reasonable margin of error across either valence or arousal; 30 per cent agreed across both valence and arousal; 20 per cent either did not agree or were inconclusive.

To determine whether our confidence-value weighting scheme was valid, we performed both a visual inspection of word distribution and confusion matrices with design labels. With no weighting scheme, data was heavily biased towards (positive valence, high arousal) ratings, which did not agree with our qualitative results or a reasonable reading of our quantitative results. As such, a linear weighting scheme was determined to be the least biased, such that confidence ratings of {1,2,3,4,5} were mapped to {0.0, 0.25, 0.5, 0.75, 1.0}.

5.2.4 CuddleBits Study 2: Analysis and Results

We summarize our findings from CuddleBits Study 2. All significant results are reported at the p < .05 level of significance.

RQ1: Is there a difference in viewing conditions?
Hypothesis: Participants will rate behaviours similarly regardless of viewing condition. – Not supported.

Behaviour label × Viewing condition: We found a significant effect for viewing condition (Pillai = 0.563, F(2,415) = 6.87) and behaviour label (0.563, F(30,832) = 10.86). We did not find an interaction effect (p = .33).
Although there is evidence to suggest that participants do rate behaviours differently, since they also rate viewing conditions differently we should be careful about using video as a proxy for live robot behaviour display.

Behaviour label quadrant × Viewing condition: We found a significant effect for viewing condition (Pillai = 0.441, F(6,880) = 41.43), for designs collected by quadrant (e.g., Hostile and Upset are both high-arousal, negative-valence emotions) (Pillai = 0.030, F(2,439) = 8.705), and for the interaction effect.

Duration: Through two-way ANOVAs, we found a significant difference in duration between viewing conditions: participants took longer to rate each behaviour live (µ = 72.49 s, σ = 40.69 s) than via video (µ = 64.13 s, σ = 29.28 s), corroborated by live behaviours (µ = 2.36, σ = 1.51) being played more times than the corresponding videos (µ = 1.96, σ = 1.18). The extra time spent on live behaviours could be due to more information being conveyed, or to more interest as participants interpreted the motion and/or haptic expression.

RQ2: Are behaviours consistently distinguishable?
Hypothesis: Each behaviour will be consistently distinguishable. – Partially supported.

Behaviour label × Participant: As behaviours (Pillai = 0.917, F(30,448) = 12.649), participants' ratings (Pillai = 0.671, F(26,448) = 8.705), and the interaction are all significant, we determine that the behaviours are distinguishable by participant.

Through Figure 5.4, we examine rating consistency by behaviour and quadrant. Negative-valence, low-arousal (Depressed) behaviours have the largest dispersion in rating for both dimensions, suggesting that they are the most difficult for participants to classify. Low-arousal, positive-valence (Relaxed) behaviours are more consistently concentrated towards the relaxed quadrants.

Both high-arousal, negative-valence (Stressed) and high-arousal, positive-valence (Excited) behaviours are concentrated in the high-arousal half, yet highly dispersed across valence, suggesting valence is difficult to determine for certain high-arousal behaviours.

Overall, behaviours designed for a representative quadrant may not necessarily be interpreted as such. Determined, for example, was interpreted as negative-valence with high arousal, in contrast to the intended positive-valence, high-arousal design.

Finally, live behaviours (red in Figure 5.4) are more dispersed than video behaviours (blue). This illustrates a higher variation in how participants rated live than video behaviours.

[Figure 5.4 shows one scatter plot per PANAS word (Sluggish, Droopy, Bored, Drowsy; Calm, Relaxed, Serene, At ease; Attentive, Proud, Enthusiastic, Determined; Guilty, Hostile, Nervous, Scared), with valence and arousal axes from −1 to 1.]

Figure 5.4: Each plot shows a single behaviour's arousal (−1, 1) and valence (−1, 1) ratings. Live viewing condition is in red, video in blue. Green ellipses show confidence intervals at 5% and 97.5%. Green cross is mean, purple cross is median. Each plot corresponds to a single PANAS word, each row corresponds to an affect grid quadrant. Rows are ordered from the top: Depressed, Relaxed, Excited, Stressed.

RQ3: Which behaviour design and waveform features correlate with rated dimensions of arousal and valence?
Conjecture: Features that are characteristic of variability will correlate with valence, features that are characteristic of speed will correlate with arousal. – Partially supported.

Analysis using machine learning techniques was performed as a preliminary step to understand which features might be most relevant. Using the full set of designed behaviours from participants (see Generating) and their associated design labels, we trained a Random Forest classifier on statistical features calculated from design and output waveform attributes. Since each behaviour was output as a waveform, we could decompose the waveform using MacaronBit design parameters, and describe the behaviours using keyframe count and standard statistical features (min, max, mean, median, variance, total variance, area under the curve) on keyframe values. The same statistical features were calculated for the output waveform.

Each behaviour label was mapped to the original PANAS quadrant (called here the design quadrant). When running 20-fold cross-validation classifying on design quadrant, the Random Forest classifier achieved between 66% and 72% accuracy (66% with the full feature set, 72% with an optimal feature subset; chance = 25%). Top-performing features were position: keyframe count, range, total variance; random: max, min.

Note that the selected features are related to waveform complexity. If the random parameter was set high, then the waveform would have a high amount of variation. Similarly, if there were a high number of position keyframes, the waveform would have a lot of variation.

Feature Selection

A correlation matrix was constructed between arousal and valence for the 16 participant-rated behaviours (per viewing condition), and for the 72 participant- and researcher-generated unrated behaviours.

As seen in Figure 5.5, arousal has stronger correlation within the feature vector than valence. Features with strong positive correlation to arousal are those that also correspond with the widest, fastest, and most erratic motions, such as position keyframe count, position range, and random maximum.

Valence has much weaker correlation overall, and particularly low absolute correlation values in the participant-rated analysis. However, within the unrated behaviours, the top correlated features are also indicators of waveform complexity and are negatively correlated with valence, i.e., the more complex a behaviour is, the less pleasant it is deemed to be.

Participant Experience

Interviews with participants were audio-recorded, transcribed, and coded for themes and keywords by a single researcher using an affinity diagram and constant comparison.
Open-ended written responses from participants for both live and video viewing conditions were analyzed with the same qualitative coding techniques.

In contrast to video, participants emphasized the importance of haptic feedback (7/14), the ability to view the robot from multiple angles (4/14), and the increased engagement and accessibility that resulted from the live interaction (5/14); 9/14 touched the robot while playing behaviours. Of these, three reported wanting to touch the robot when they were having difficulty interpreting the behaviour.

"Feeling the movements rather than just watching them helped me get a better sense of an interpretation of what the emotion was." –P05 (corroborated by P09, P11)

Participants reported that some of the given emotion words were ambiguous (particularly Depressed, Attentive, Excited, and Stressed), due to a lack of context and visual cues.

"There were some emotions where it was pretty ambiguous. [There are] different connotations depending on exactly what the context is." –P03 (P02, P10)

Several participants (4/14) interpreted a combination of emotions while observing the robot behaviours, with emotions happening either simultaneously or sequentially.

"I think [the emotions] are happening in order. So sometimes [it's] excited, and then after the stimulus is gone, it becomes bored again." –P06 (P04, P07, P09)

Participants' descriptions of their process for labelling robot behaviours included relying on their experience with animal behaviours (4/14), their experience with human emotions (2/14), and interpretation of the "heartbeat" or "breathing" of the robot's movement (7/14). 5/14 participants mentioned that it was difficult to interpret emotions from the robot behaviours.

5.2.5 CuddleBits Study 2: Takeaways

Viewing conditions: Experiencing the robot live seems to have a different effect on behaviour interpretation than viewing it via video. Since ratings across viewing conditions were significantly different, it is inadvisable to use video as a proxy for robot behaviours. This difference is likely due to the ability to experience the robot haptically and visually from multiple angles.
Futher, the laboratory context mayhave diminished the emotional interpretation in both cases, as the context of thebehaviours were intentionally unspecified and therefore ambiguous.Distinguishability and consistency: Behaviours were distinguishable; some robotbehaviours were consistently rated within the same affective quadrants, some acrossa single dimension (usually arousal), and some not at all. In line with an intuitiveunderstanding of emotional behaviours, this suggests that it is possible to createbehaviours that are seen as meaningfully different (especially in terms of arousal),but that their interpretation is subject to some variance. If we take the distributionof behaviour ratings at face value, the ambiguity of interpretation may not be mea-surement noise, but a true representation of the behaviour’s emotional space. Thatis to say, the behaviour labeled as Sluggish in Figure 5.4 may be well-defined as aprobability space with a nearly-neutral mean.64Waveform features: The waveform features that define arousal seem clear: byincreasing the amplitude and frequency of a behaviour, it should be interpreted ashigher arousal. Valence is less clear, since low correlation values make a defini-tive claim difficult. However, we conjecture that the waveform features that cre-ate a more complex and varied behaviour (i.e., randomness, number of keyframesneeded to make a behaviour) create a lower-valence behaviour. More work isneeded to look into valence behaviours (see Complexity below).It is possible to create distinguishable emotional behaviours with the Cud-dleBits. Future work should (1) seek to establish the factors that produce consistentvalence interpretations; and (2) establish how those behaviours change within thecontext of an interactive ‘scene’.5.3 Voodle: Co-design StudyTo understand how Voodle would work in a behaviour design process, we per-formed a co-design study: performer-user input guided iteration on factors under-lying Voodle’s expressive capacity.Because using iconic input to generate affective robot motion is an unexploreddomain, we focused on rich qualitative data. Methods borrowed from groundedtheory [86] allowed us to shed light on key phenomena surrounding this interactionstyle, and to define the problem space through key thematic events as a basis forfurther quantitative study.5.3.1 Voodle: MethodsIdeal Voodle users are performance-inclined designers. We recruited three expertperformers to help us improve and understand Voodle.Over a six week period, each performer met us individually for three one-hour-long sessions, for a total of nine sessions conducted.After completing Session 1 with all three participants, we iterated on the systemfor Session 2; we repeated this process between Sessions 2 and 3.In each session, participants were guided through a series of emotion tasks,followed by an in-depth interview. Each emotion task was treated as a voice-actingscene, where the participant played the role of actor, and two researchers played the65roles of director (here, an assistant as for comparative Study 1) and observer. Asbefore, the director/assistant offered technical support and suggestions as needed,but did not actively design behaviours. An observer took notes.In each task, participants used Voodle to act out transitions between oppos-ing PANAS emotional states, e.g., distressed → relaxed, for (high-arousal, high-valence) → (low-arousal, low-valence). 
The full set of emotion tasks (a) crossedthe diagonals of the affect grid; and (b) crossed each axis: Distressed - Relaxed,Depressed - Excited, Relaxed - Depressed, Excited - Distressed, Relaxed - Excited,Depressed - Distressed.Participants performed as many as they could in the time allotted per session.Each session lasted an hour: 30 minutes dedicated to the main emotion task, 20minutes for an interview, and 10 minutes for setup and debriefing.An in-depth interview was framed with three think-aloud tasks, to motivatediscussion and draw out user thoughts on the experience of voodling. Participantswere asked to (1) rate and discuss Likert-style questions of 5 characteristics: per-ceived alignment, fidelity and quality of designed behaviours, and perceived degreeof precision and nuance; (2) sketch out a region on an affect grid to represent theexpressive range of the robot (Figure 5.6). (3) pile-sort [9] pictures of objects,including pets, the CuddleBit, and tools, to expose how they defined terms like‘social agent,’ and how the Bit fit within that spectrum.5.3.2 Voodle: ParticipantsParticipants were professional artists with performance experience, recruited throughthe researchers’ professional networks.P1 is a visual artist focused on performance and digital art. He was born inMexico and lived in Brazil for 4 years and Canada for 7 years. P1 is a native speakerof Spanish and English, with working knowledge of Portuguese and French.P2 is an audio recording engineer, undergraduate student (economics and statis-tics), and musician: he provides vocals in a band, and plays bass and piano. P2 isa native English speaker born in Canada; he is learning German and Spanish.P3 is an illustrator, vocalist, and freelance voice-over artist. She has a degreein interactive art and technology, and has taken classes on physical prototyping and66design. She is a native speaker of Mandarin and English, with working knowledgeof Japanese. P3 was born in Taiwan and immigrated to Canada when she was 8years old.5.3.3 Voodle: AnalysisWe conducted thematic analysis [73] informed by grounded theory methods [24]on observations, video, and interview data. We found four themes (Table 5.1): par-ticipants developed a personal language, voodling requires a narrative frame andbrings users into alignment with the robot, and parametric controls complementthe voice for input. Each session helped to further develop and enrich each theme,adding to the overall story. We refer to each theme by an abbreviation and sessionnumber: “PL1” is Personal Language, Session 1.67Table 5.1: Summary of Voodle Co-Design Themes. 
We refer to each theme by abbreviation and session number (e.g., "PL1").

Personal Language (PL) — The individualized words and utterances a participant developed with the robot.
  Session 1: Participants took a varying amount of time to "get" Voodle; each vocalized in different ways, arriving at a local maximum.
  Session 2: Participants built upon their constructed language, starting from their Session 1 language, but exploring more ideas.
  Session 3: Robots influenced choice of voice or MIDI input, but not vocalization language.

Narrative Frame (NF) — The story the user is telling themselves about who or what the robot is.
  Session 1: Participants needed to situate the robot by constructing a character in order to interact with it effectively, drawing on metaphors, concepts, and feelings that do not need to be explicitly described in words.
  Session 2: Participants used the narrative frame in different ways. Fur did not affect their ability to construct a narrative frame.
  Session 3: Robot form factor and orientation adjusted the stories that participants told.

Immersion (I) — The extent to which a participant could suspend their disbelief.
  Session 1: Participants adjusted their language depending on the robot's behaviour. By conversing with the robot, they found its behaviour more believable than the observers did.
  Session 2: Experience helped people be more in tune ("aligned") with the robot, as did voodling in comparison to using direct MIDI controls.
  Session 3: Too much or too little control reduces emotional connection; physically actuated displays connect more with users.

Controls (C) — How control of the system influenced how participants saw the interaction.
  Session 1: Laptop controls were difficult to use. A low-pass smoothing algorithm was not effective. Randomness contributed to lifelike behaviour.
  Session 2: Physical MIDI controls were easy to use when voodling, but lacked feedback. The robot needed an adjustable "zero" to maintain lifelike behaviour without input.
  Session 3: Suggestions included steady-state sine-wave breathing, and setting the zero position to 50% of the servo's maximum.

Voodle: Session 1

We introduced naïve participants to the initial Voodle prototype and allowed them to explore its capabilities and limitations by completing as many of the emotion tasks as time permitted (∼30 mins). We closed with a semi-structured interview; participant feedback informed the next iteration of Voodle.

Theme PL1: From "Eureka" to local maximum: Participants were initially instructed to use iconic vocalizations, with an example such as 'wubba-wubba'. Despite this, all participants chose to use symbolic speech early in Session 1.

For example, when asked to perform the emotion task relaxed → depressed, P2 started by saying "I'm having a nice relaxing day", with little visible success in getting the Bit to do what he wanted. Each participant transitioned into understanding how to use Voodle at different times. P3 quickly understood that symbolic speech wouldn't afford her sufficient expressivity, and transitioned to iconic input, while P2 kept reverting to symbolic speech as an expressive crutch.

It took P2 until the fifth emotion task (of seven) to have a breakthrough: "I kinda made it behave how I imagined my dog would behave". Using that metaphor, subsequent vocalizations attained better control. Unlike the other two participants, P1 switched to iconic vocalizations gradually.

Each participant eventually converged on his or her own idiosyncratic collection of sounds that they felt was most effective.
This differs from what might be aglobally optimal set of sounds to use: participants stayed in some local maximum.For example, P1 started using “tss” sounds and breathes into the microphone;while initially successful for percussive movements, they later proved limiting. P2used nasal sounds peppered with breathiness (“hmmm”). P3 eventually focused onmanipulating pitch with vowels. (“ooOOOO”), as well as employing nasals likeP2 (“mmm”), and some ingressive (breathing in) vocalizations. (“gasp!”).Theme NF1: Developing a story: Once the participant finds the robot’s ‘story’,emotional design tasks get easier. For example, P2’s shift came with his story ofthe robot being a dog. P2 refused to explicitly tell a story: “it wasn’t much of aconcrete story”. P1 said he created less a full story, “more a grand view of feelingsome emotions and from there on you could build a story, we were getting more the69traces of a story through the emotions”. This narrative potential was enabled bythe conceptual metaphor of Voodle as a dog [47].Theme I1: Mirroring the robot suspends disbelief:Participants formed a feedback loop with the robot: their vocalizations influ-enced the robot’s behaviour, which in turn encouraged participants to change theirvocalizations. P2’s dog-like “hmmm” vocalizations caused the robot to jitter, sur-prising P2 and prompting a switch to “ooo ooo ooo” sounds.When actively interacting with the robot, participants reported stronger emo-tional responses than the experimenters observed in the robot; as actors in thescene, participants were more connected than the director and observer. This couldbe due to their close alignment with the robot while acting – an experimenter mightsee a twitch as a quirk of the system, but the participant might see it as evocativeof emotional effort: “I did an ’aaa’ and at the end of the syllable it did a flutter...itwas just really nice, there were just things that I didn’t expect that expressed myemotion better than I thought it would” (P3).Theme C1: Screen distracted, algorithm was unresponsive: Using a laptop tocontrol algorithm parameters distracted participants from looking at the robot.All participants had some trouble modifying Voodle parameters; the directorneeded to take over parameter control (with the participant’s direction) as theyvocalized. Parameter manipulation was especially difficult in emotional transitiontasks where multiple parameters needed to be adjusted over time.In addition, participants reported that the smoothing algorithm, a simple low-pass filter, was unresponsive: “feels like there’s a compressor [audio filter restrict-ing signal range]...limiting the the amount of movement” (P2).Changes for Session 2 – We implemented four changes for Session 2: replacedthe web interface with a physical MIDI keyboard for parameter control; replacedthe low-pass filter with a PD (proportion/damping) controller to improve respon-siveness, with parameters named “speed” (P) and “springiness” (D); introduced anew “randomness” parameter to simulate the noise from the removed low-pass fil-ter; and added a new mode to aid in making comparisons, “Wheel”: users couldpress a button on the MIDI keyboard to disable voice input and directly control theposition of the robot using wheel control.70Voodle: Session 2 Format and ResultsIn the second session we juxtaposed voodling against a manual MIDI-wheel con-troller, based on a participant’s suggestion. 
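To make the Session 2 controller changes concrete before turning to the results, the sketch below shows one plausible form of the PD mapping from vocal features to robot position, using the "speed", "springiness", and "randomness" parameter names introduced above. It is a reconstruction under stated assumptions (normalized inputs, a spring-damper style update), not the released Voodle code.

```python
import random

class PDVoodleSketch:
    """Illustrative PD ("speed"/"springiness") mapping from vocal input to a
    normalized servo position in [0, 1]; not the actual Voodle implementation."""

    def __init__(self, speed=8.0, springiness=3.0, randomness=0.02, pitch_mix=0.5):
        self.speed = speed              # P term: how hard position is pulled toward the vocal target
        self.springiness = springiness  # D term: damping on the resulting velocity
        self.randomness = randomness    # amplitude of lifelike jitter (stands in for the removed filter noise)
        self.pitch_mix = pitch_mix      # proportional mix of pitch vs. amplitude
        self.position = 0.5             # start at mid-range, the adjustable "zero"
        self.velocity = 0.0

    def step(self, amplitude, pitch, dt=0.02):
        """amplitude and pitch are assumed pre-normalized to [0, 1] per audio frame."""
        target = self.pitch_mix * pitch + (1.0 - self.pitch_mix) * amplitude
        error = target - self.position
        # Spring-damper update: P pulls toward the vocal target, D resists fast swings.
        self.velocity += (self.speed * error - self.springiness * self.velocity) * dt
        self.position += self.velocity * dt
        # Bounded random jitter keeps the robot from looking dead between utterances.
        self.position += random.uniform(-1.0, 1.0) * self.randomness
        self.position = min(1.0, max(0.0, self.position))
        return self.position
```

With a high speed and low springiness such a controller snaps to each vocalization; lowering speed and raising springiness makes it lag and settle more smoothly, which is roughly the responsiveness trade-off the new parameters exposed.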
Participants first did as many emotiontasks as possible in ∼15mins, using voice for robot position control; then repeatedthese in Wheel mode. Included in this session were observations of how a user’srelationship with the Bit matured as they became more familiar with both the robotand Voodle.PL 2: Participants learn, differ in skill: Unprompted, participants began withsame language they used in Session 1, then developed their language with exper-imentation. P3 continued to use primarily pitch control, as she did in the firstsession. P1 continued his “tss” sounds and blowing directly into the microphone,essentially a binary rate control: the robot was either expanding quickly or con-tracting quickly.After some experimentation, P1 incorporated more pitch control, which af-forded better control. P2 and P3 indicated increased expressivity on their affectgrids in Session 2 (Figure 5.6), suggesting improvement of either ability or system.The participants began to diverge in their ability to create nuanced behaviours,suggesting talent or training influenced their capabilities.P1, with his breathing sounds, simply didn’t succeed in controlling the robot.P3 seemed to understand how to work with Voodle, creating subtle and expressivedesigns; she preferred vocal input, but also was adept with Wheel. P2 was betweenthe other two, making extensive use of Wheel control, and playing it like a piano.NF 2: Agency from motion:Randomness and lack of precise control imbued the robot with agency. P1claimed that, on the whole, randomness made the Bit feel more alive because itimplies self-agency. When turning up the randomness, P3 exclaimed, “oh hey hi,I woke it up”. She explained: “The randomness meter...was always the first thingI moved I think...because it added another layer of emotion to it.” This lack ofcontrol connected to the sense of life within the robot: “[the Bit] was modeledto look like a living creature and that makes me feel like it should probably notcompletely obey what I want it to do. There should be something unexpected”71P1Session 1 Session 2 Session 3P2P3UnpleasantHigh energyLow energyPleasantVoiceand WheelWheel moreprecise hereWheelVoice(P3 intentionally sized circlesto be bigger than Session 1)RibBitFlappyBitFlexiBitBrightBitVibroBitFigure 5.6: Reported affect grids by participant and session. After being in-structed about dimensions of arousal and valence, participants drew therobot’s expressive range directly on affect grids. Participants indicatedincreased expressivity from sessions 1 to 2, differences between voiceand Wheel control, and that each robot had a different range.(P3).Continuous motion can contribute to agency. All participants felt the robotshould not be motionless in its ‘off’ state; it needed a default, like breathing. P2further suggested that the robot’s ‘zero’ point be the middle of its range, to accom-modate both contraction and expansion metaphors.72I2: Voice converses, MIDI instructs: Participants were more aligned with therobot when vocalizing.For example, P3 expressed that manual wheel control allowed her to instructthe robot, whereas voice control allowed her to converse with the robot: “Voicefeels like it’s more conversing than by wheel, I think it’s because by wheel I havea better idea of what’s going to happen...which makes me experiment a little less”(P3). Non-voice MIDI control gave a stronger sense of controlling the robot, di-minishing agency: “[The wheel] felt more like playing an instrument” (P2). 
P2, theaudio engineer, preferred using the MIDI wheel, while P3 preferred voice. Both P1and P3 indicated that the Wheel had more expressive capabilities with low-arousal,negative emotions (Figure 5.6).C2: Visual parameter state: MIDI parameter control allowed participants to fo-cus attention on the robot.All participants continuously modified the Voodle parameters with the MIDIcontroller, compared to minimal modification with Session 1’s HTML controller.P2 suggested that sliders may be more effective than knobs, as they provide imme-diate visual feedback for range and current value. P3 also requested more visualfeedback for parameter status, e.g., bar graphs.Changes for Session 3 – We displayed parameter status on the laptop screen, andadded 4 new robot forms to explore how form and actuation modality influencevoodling (Figure 4.3).Voodle: Session 3 Format and ResultsThe final session explored the effect of form factor on control style. Each par-ticipant ran through a subset of our emotion tasks with each of the new robots(Figure 4.3), given the option to use either voice or wheel control. They were alsoallotted free time to play with the new robot forms. We administered a closingquestionnaire to capture their overall experience of the final version of Voodle.PL3: Consistent language across robots : Despite wide variation in each robot’sexpressive capability (Figure 5.6), participants continued to use their developedlanguages across robots. Examples include “tssss”, “ooo”, “aaa” (P1), “mmmm”73(P2), and “oooh”, “ahhh” (P3). While language remained consistent across robots,preferred control mechanism did not.P2 preferred vocal input only for FlappyBit as he engaged emotionally with it:he saw the flapper as a head. However, P2 used wheel control for the remainingBit forms. P1 always started vocalizing as an experimentation technique with newBit forms and then consistently moved to wheel input for fine-grained control. P3preferred voice for most robots, although she did indicate the RibBit respondedmore consistently to wheel input (unlike the other robots).NF3: Shape, orientation create lasting stories: Robots did not just have vary-ing expressive capability; they also inspired different stories. Participants reacteddifferently to each. For example, P3 saw VibroBit as a multi-dimensional, highly-controllable, lovable pet; P1 and P2 saw it as a unidimensional, completely uncon-trollable, unlovable object. Different robot features changed the narrative context.While P2 thought FlappyBit’s flapper was a head, giving it expressivity, P3 thoughtthe flapper was a cat’s tail. When FlappyBit was flipped over such that its flappercurled downwards, both P2 and P3 felt that it became only capable of expressinglow-valence emotions. However, form factor did not not completely change thestory: in all sessions, P2 felt the robot was a dog, no matter which robot he wasinteracting with.I3: Sweet spot of control; motion matters: P1 reported high control over Bright-Bit and low control over VibroBit, but rated both with a smaller expressive rangethan the other robots (Figure 5.6). This suggests a “sweet spot” of control whenconnecting emotionally with the robot: some control over behavior is good, but nottoo much.P1 felt more connected to FlappyBit or FlexiBit. That said, all participantsexpressed a lack of emotional connection with BrightBit. 
P3 thought that the lackof movement was the cause, while P2 did not feel like he conversed with BrightBit:“I kept visualizing it talking to me instead of me talking to it” (P2).Changes for Robot Iterations – Session 3 resulted in several implications for futureiterations on each robot: VibroBit had a limited expressive range; FlappyBit’s flap-per looked like a head, which was easy to connect with, but metaphors would varydepending on orientation; FlexiBit had an ambiguous shape; BrightBit seemed un-74emotional.Voodle: Likert and Pile-Sort ResultsThe Likert scale and pile sort tasks were primarily used as an elicitation deviceto stimulate discussion. Participant responses were consistent with other observa-tions; we highlight a few examples.The questionnaire measured quality of Bit movements match to participant’svocalizations/manual control; precision, nuance and fidelity of voice control; andalignment of Bit behaviour to the emotions participants felt as they performed.Emotional connection with the RibBit increased by session. RibBit and Curly-Bit performed much better than other CuddleBit forms on all metrics. Wheel andvoice control offered similar degrees of quality on average. P1 and P2 reported thatthey felt more in control with the wheel, though P3 said that it made the Bit appearas less of a creature.Participant perceptions of the CuddleBit as a social agent changed through re-peated sessions, albeit in different ways. In the pile sort, P2 first placed RibBitbetween cat and robot, but post-Session 2, moved to between human and cat. Incontrast, P1 first sorted the RibBit between a category containing anthropomorphicelements and home companionship possessions, but later agreed it could fit in allof his categories (except one for food) if it was wearing fur.5.3.4 Voodle: Discussion and TakeawaysOur initial goal was to create a dedicated design tool for affective robots. We ob-served something intangible and exciting about live vocal interaction. We deriveda more nuanced understanding of Voodle use, in that it seems to exist somewherebetween robot puppetry and a conversation with a social agent.In the following, we discuss insights into interaction and believability, and howVoodle can function as an interactive behaviour design tool within a performancecontext. We conclude with future directions, including insight into how Voodlemight be embedded as a component of a larger behaviour control system.755.3.5 Voodle: Insights into Believability and InteractivityThrough the co-design study, we found that believability was mediated by partici-pants conception of robot narrative context, and their level of control and personalways of using it.Creating a context:Behaviour designs and alignment improved dramatically once participants founda metaphor or story. Context was determined by confluence of form factor, robotability and participant-robot relationship. 
For example, P1 could neither decidewhat VibroBit represented nor control it well, hence saw it as a failure; while P3thought that it was cute and felt skillful when interacting with it.Balancing control with a “spark of life”: Voodling created lifelike behaviourswith a simple algorithm: deliberate randomness and noise produced a user-reactivesystem that still seemed to act of its own accord.Varying randomness and user control made Voodle more like a conversation,or like a design tool.Control increased alignment, like people sharing mannerisms in a conversa-tion [33]; but with too much or little, the system becomes mundane or frustrating,the magic gone. Voodle was a more emotionally immersive design experience thantraditional editors.Personalization:Users developed unique ways to use Voodle. Algorithm parameters could bevaried to facilitate a metaphor, output device, or simply preference. Users mod-ulated their vocal performance with these parameter settings much as guitaristsuse pedals to adjust tone, before or as they play. Importantly, we observed thatusers tended to use similar “personal language” with varied robots, suggesting anindividual stability across context.Voodle: Vision for Behaviour Design ProcessIt is likely that producing affective robots will soon be like producing an animatedfilm or video game. Indeed, steps towards this have begun (e.g., Cozmo [3, 36]).Here, it seemed that enabling artists and performers to directly interact with robots76during design did facilitate the believability of the resulting behaviors, in that thedesigners who became aligned with their robot model seemed to be more satisfiedwith their behaviors than more attached observers.Behavior design team: As reflected in the structure of the co-design study, a be-haviour design session may involve a scripted scenario, a director, a designer, anactor, and the robot itself. Working together to bring out the best performanceon the robot, an actor and director would read through a script as the designertakes notes on how to modify the robot’s body. Through an iterative design pro-cess [16, 40], both behaviours and robot form factors could be refined together(Theme NF3).The actor could also leverage Voodle’s support to improve alignment with therobot. Like a puppet, the actor would be simultaneously controlling and actingwith the robot. Although the interactive space in which the actor works will likelyhave to be multimodal (i.e., , including a physical controller such as the MIDIkeyboard), alignment through voice enables a deeper emotional connection withthe robot itself (Theme I2).Physically adjustable parameters: Voodle took a different approach from previ-ous non-speech interfaces (e.g., the Vocal Joystick [10, 37]), which had a defined,learned control space. As we discovered in our pilot, voodling relied on a nar-rative context: a metaphor for how vocalizations should produce motion. Thiscould change from moment to moment: amplitude might be associated with therobot expanding, but if the robot was conceptualized as “flexing”, amplitude corre-sponded to downwards movement. When adding parameters, we found physicallymanipulable controls were easier to control when voodling, but they require visualindicators of their range and status.One could imagine a kind of recording engineer in a behaviour design sessionwho adapts motion control parameters on the fly (Theme C2).How to ExtendThis work produced initial requirements for a Voodle system, which is open-sourceand online. 
It also produced implications for future iconic speech interfaces.

Extending the sound-symbolic lexicon: Here we considered proportionally-mixed pitch and amplitude. Our pilots (Table 4.1) have already revealed other promising vocal features, such as -continuants ("dum dum"), +stridents ("shh" or "ch"), and distinguishing voiced consonants ("b" is voiced, "p" is not). A detailed phonetic analysis will highlight additional features and inform ways to adjust parameters automatically for specific vocal features. Some parameter ranges should be individually calibrated, e.g., pitch.

While we identified examples of our performers' languages, many more iconic mappings (features to robot position) are possible. These features could further be dynamically mapped to multiple degrees of freedom.

Design techniques: While Voodle was built as a design tool, in context we found it was rarely used alone. Instead, Voodle could be part of an animation suite, letting users easily sketch naturalistic motion without a motion capture system. Input could be imported into an editing tool for refinement. This might be especially viable in mobile contexts, to sketch an animation on the go, e.g., in a chat program.

Iconic vocalizations have also been used to describe tactile sensations [77, 91], so Voodle may also be useful for end-user design of tactile feedback, to augment communication apps – a haptic version of SnapChat or Skype, with voice for haptic expression. We expect such uses will need to recognize additional linguistic features (like "sss" vs "rrr"), and Voodle must be made more accessible to end-users who are not performers.

Vision for "Embedded Voodle": Voodle has the potential to add life-like responsiveness to deployed interactive systems. Adding randomness to an ambient display increases perceived agency [8], but voodling could increase a sense that it is attending to the user, especially with directed speech (I2).

As a reactive system, voodling could be added to the conventionally planned motion of virtual agents or robots, from a robot pet that reacts to ambient speech, to the body language of an assistive robot arm (Figure 5.7). When a user explicitly tells the robot arm to "come here", she might modulate its movement with a soft "whoa" (slow down) or urgent "WHOA" (stop).

Figure 5.7: Vision for "Embedded Voodle": Voodle could be a natural low-cost method to emotionally color a motion path in more complex robots. (The figure contrasts a planned trajectory without Voodle against a voice-reactive trajectory with Voodle.)

5.4 Complexity: Valence and Narrative Frame

During CuddleBits Study 2, we found that certain measures of complexity seemed to be negatively correlated with valence. During all of Voodle, CuddleBits, and other investigations, we found that the 'story' people told about the robot—e.g., the robot is like my cat, or the robot is a combination of a dog and a squirrel—heavily influenced their perception of the valence of the robot's behaviours.

This study proposes to generate robot behaviours that vary in complexity and test participants' perception of displayed valence, with consideration for the story that people tell about each behaviour.

Note: in previous work, we have noticed that arousal and valence are not perfectly orthogonal. This means that arousal and valence are possibly dependent, such that an increase in arousal may imply an increase in valence. We have also seen that it is relatively easy to control the perceived arousal of the robot by increasing, for example, the range of the robot's breathing.
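As an illustration of that arousal control, a breathing trajectory can be parameterized so that a single arousal value scales both its rate and its depth. The mapping below is a minimal sketch with illustrative constants, not a calibrated model from these studies.

```python
import numpy as np

def breathing_waveform(arousal, duration_s=10.0, sample_rate_hz=50):
    """Map an arousal level in [0, 1] to a breathing-like servo trajectory in [0, 1].
    Higher arousal -> faster and deeper 'breaths'; constants are illustrative only."""
    t = np.arange(0.0, duration_s, 1.0 / sample_rate_hz)
    breaths_per_second = 0.2 + 0.8 * arousal   # roughly 12 to 60 breaths per minute
    depth = 0.2 + 0.6 * arousal                # fraction of the servo's range
    resting_position = 0.5                     # mid-range "zero" position
    return resting_position + 0.5 * depth * np.sin(2.0 * np.pi * breaths_per_second * t)
```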
Because perceived arousal is comparatively easy to control, we are targeting valenced behaviours only in this study, using a simple five-point scale from negatively valenced to positively valenced.

RQ 1. How does valence correlate with complexity? Hypothesis: valence will be negatively correlated with complexity, i.e., the more complex a behaviour is, the more negatively valenced it will be perceived.

RQ 2. How do participants rationalize the change in the robot's behaviours (i.e., how does the 'story' change the perception of the robot)?

5.4.1 Complexity: Methods

Participants: We recruited 10 naïve participants, all compensated $15 per session.

Procedure: Participants were seated, introduced to a fur-covered FlexiBit, and asked to touch the robot to reduce novelty effects. They were then shown a sample behaviour and walked through the rating task and the story task. They then completed the rating tasks while being directed by a pre-recorded voice, with an experimenter taking notes. Each task took roughly 2 minutes: the robot would breathe neutrally, a test behaviour would be displayed, the robot would return to neutral breathing, a pre-recorded voice would ask them to rate the behaviour (a replay option was available), and a pre-recorded voice would ask them to describe the robot's behaviour out loud in a few short sentences. There were twenty trials in total; the first two trials were always the same and were thrown out to reduce novelty effects, and the remaining 18 behaviours were presented randomly. A short follow-up interview ended the session; each session took roughly 45 minutes.

Fifty-four behaviours were designed by hand, then scored and ranked for complexity using three complexity measures: variance, peak count of a spectrogram as generated by a fast Fourier transform (FFT), and multiscale entropy (MSE); a computational sketch of these measures appears later in this section. Variance was calculated by the standard statistical method, i.e., E[(X − µ)²], and was included to compare against results from CuddleBits. FFT peak count was calculated by counting all maxima on a spectrogram within 2.6 standard deviations of the global maximum, and gives an estimate of the distribution of power across the possible frequencies of the signal. MSE was calculated as outlined in Related Work (2) and in Costa 2008 [26]; behaviours were then ranked by the slope and shape of the resulting MSE curve. Six behaviours were chosen per complexity measure to represent an even spread across rankings, i.e., one from each of the 0–16, 17–32, 33–48, 49–64, 65–80, and 81–100 percentile bands for each of {Var, FFT, MSE}. This ensured that a wide range of possible behaviours was shown to participants.

5.4.2 Complexity: Results

Here, we outline qualitative and quantitative results. Quantitative results outline the consistency of behaviour ratings in terms of valence; qualitative results discuss the impact of narrative frame on behaviour interpretation.

Complexity: Quantitative Consistency

Inter-rater reliability: This measures the extent to which different raters (participants) rated all behaviours similarly. For example, if all participants rated each behaviour the same, there would be perfect agreement. If half the participants rated all behaviours one way, and half the other way, there would be systematic disagreement. Since we are using ordinal scales, a reliability measure that accounts for the order of the scale items was used: we would want to assign closer agreement to a behaviour rated at both "-2" and "-1" than to one rated at both "-2" and "+2". Krippendorff's alpha was therefore used to determine the inter-rater reliability, producing α = 0.07.
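For reference, an ordinal Krippendorff's alpha of this kind can be computed with an off-the-shelf implementation; the sketch below uses the third-party Python krippendorff package and a made-up ratings matrix standing in for the study data.

```python
import numpy as np
import krippendorff  # third-party package: pip install krippendorff

# Rows are raters (participants), columns are units (behaviours).
# Values are ordinal valence ratings on the -2..+2 scale; np.nan marks missing ratings.
# These numbers are illustrative only, not the study data.
ratings = np.array([
    [-2, -1,  0,  1,  2,  1],
    [-1, -1,  0,  2,  2,  0],
    [-2,  0,  1,  1,  2,  1],
    [np.nan, -1, 0, 1, 1, 0],
])

# The ordinal level of measurement penalizes a (-2, +2) disagreement more than (-2, -1).
alpha = krippendorff.alpha(reliability_data=ratings,
                           level_of_measurement="ordinal")
print(f"Krippendorff's alpha: {alpha:.2f}")
```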
An α of 0.07 is typically interpreted as slight agreement, where α < 0 is systematic disagreement, α = 1 is perfect agreement, and α = 0 is no agreement. However, the observed agreement (p_a) is 0.80 and the agreement expected by chance (p_e) is 0.79, so α may be suppressed in this case (α = (p_a − p_e)/(1 − p_e)).

Per-behaviour consistency: This measures the extent to which each behaviour was rated similarly by all raters (participants). For example, if every participant rated a behaviour differently, we would say it was inconsistent. Weighting ratings for an ordinal scale [45], observed agreement per behaviour (p_i) ranged from 0.64 to 0.92, with 4 behaviours below 0.70, 6 behaviours between 0.70 and 0.85, and 8 behaviours above 0.85. As intuition, the variance of ratings and p_i are highly correlated; the lower the variance in ratings for a behaviour, the higher its p_i.

Correlation: Our goal was to explore how measures of complexity correlate with valence ratings of breathing behaviours (i.e., to see whether more complexity produced lower valence). Since our per-behaviour ratings were reasonably consistent, we take the mean, median, and mode of the valence ratings per behaviour and correlate them with our complexity measures. For the signal variance and peak count, we simply calculated the correlation between each summary statistic of the valence ratings and the complexity measure of each signal; because MSE outputs a series of values per signal (i.e., one per time scale over which MSE performs coarse-graining), we correlate each summary statistic of the valence ratings with the mean of the MSE values, the area under the MSE curve, and the slope of the regression line through the MSE values. Results are reported in Table 5.2.

Table 5.2: The correlation between summary statistics of valence ratings (columns) and complexity measures (rows). For example, the top left cell is the correlation of the mean valence rating per behaviour and the variance of the behaviour signal.

Valence/Complexity   mean    median   mode
Variance             -0.03   -0.15    -0.08
FFT peak count       -0.01   -0.12    -0.14
MSE rank             -0.45   -0.47    -0.37
MSE mean             -0.30   -0.08    -0.30
MSE AUC              -0.30   -0.27    -0.32
MSE slope            -0.30   -0.31    -0.43

Complexity: Narrative framing

At the end of each trial, participants were asked to describe the robot's behaviour in a few short sentences. Individual interpretations of the robot's character, motivations, and subjective state were highly varied, even within the same behaviour. Here, we present a series of themes developed by (1) comparing and contrasting responses per behaviour; (2) comparing and contrasting responses across behaviours; and (3) drawing insights from individual responses.

The qualitative results paint a much more complex picture than the quantitative results. Participants' understanding of the robot and its behaviours was highly individualized. Although there were many behaviours for which at least some participants converged on an interpretation, the majority of the responses revealed that the final reported valence measurement was fragile¹ to a complex emotional reasoning scheme. Part of that emotional reasoning scheme I refer to as the narrative frame, i.e., the set of assumptions about the robot's character, state of mind, backstory, and relationship to the environment through which a participant will interpret a behaviour.

¹Here, I use fragile to mean that the final outcome could vary widely across the range of possible
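Before turning to the qualitative themes, the sketch below shows one way the three complexity measures referenced in the Methods (variance, FFT peak count, and multiscale entropy) might be computed for a behaviour waveform; the peak-count threshold and MSE parameters are one reading of the descriptions above rather than the exact analysis code.

```python
import numpy as np
from scipy.signal import find_peaks

def signal_variance(x):
    """Standard variance E[(X - mu)^2] of the behaviour waveform."""
    return np.var(x)

def fft_peak_count(x, z=2.6):
    """Count spectral maxima whose magnitude lies within z standard deviations
    of the global maximum (one reading of the Methods description)."""
    mag = np.abs(np.fft.rfft(x - np.mean(x)))
    peaks, _ = find_peaks(mag)
    threshold = mag.max() - z * mag.std()
    return int(np.sum(mag[peaks] >= threshold))

def sample_entropy(x, m=2, r=None):
    """Plain sample entropy; r defaults to a fraction of the signal's standard deviation."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.15 * np.std(x)
    def match_count(length):
        templates = np.array([x[i:i + length] for i in range(len(x) - m)])
        count = 0
        for i in range(len(templates) - 1):
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)  # Chebyshev distance
            count += np.sum(dist <= r)
        return count
    b, a = match_count(m), match_count(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.inf

def multiscale_entropy(x, max_scale=10):
    """Coarse-grain the signal at increasing scales and take the sample entropy of each,
    following the MSE procedure described by Costa [26]."""
    mse = []
    for scale in range(1, max_scale + 1):
        n = len(x) // scale
        coarse = np.mean(np.reshape(x[:n * scale], (n, scale)), axis=1)
        mse.append(sample_entropy(coarse))
    return np.array(mse)
```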
Here I outline three themes that attemptto characterize the complexity of the reasoning behind the participants’ emotionalinterpretations:Participants ascribe complex mental states to the robot:The robot was understood to have thoughts and feelings that extend far beyondit feeling “good” or “bad”. Generally, participants gave a nuanced explanation foreach behaviour, where they attributed both human- and animal-like motivationsand emotions to the robot. For example,“It feels like it was trying really hard to do something but it doesn’tseem to be able to do it.” (P7, B11)“It feels calm, but also like he wants to start moving more, like wantsto play or something.” (P5, B10)Both quotes explain the current behaviour in terms of the robot wanting some-thing that it is not currently able to do, which suggests that the robot not only hasa current state of mind, but the ability to reason about the future. Similarly, partic-ipants describe the robot having hidden feelings that oppose the current action:“I feel like the robot is angry and wants to punch something like that,but it’s not like he’s lost his mind, he’s still under control.” (P9, B4)“...it kind of felt like it was sort of aggressive response, like someonetrying to hold in their anger, or whatever it might be. But perhapsaggressive is not the best word because it didn’t seem like it was somesort of...it seemed more like frustration, that’s a better word for it,frustration.” (P3, B5)responses depending on a few small changes to the emotional reasoning that was used. In contrastto uncertainty, fragility means that the participant could be very certain in what the final valencerating should be, but that conviction could change very easily given a small change in one of theirassumptions about the robot.83If the robot is frustrated and trying to hold it in, it would have to have a senseof how its actions impact the people with whom it is interacting, i.e., it must havesome kind of social understanding. This implies that participants not only act as ifthe robot has an internal mental state, but that the robot has its own internal modelsof their emotional states. This is not naively done; critically, participants are fullyaware of the robot’s status as a robot but act as if it has these complex mental statesregardless.Narrative frame heavily influenced valence ratings:Assumptions that participants made about the character, motivation, and situ-ation of the robot heavily influenced whether they saw the behaviour as positivelyor negatively valenced. For example, both of these quotes describe the same be-haviour:“The shaking so fast like the robot is talking to his friend like so happy,so makes me feel like it’s a positive emotion.” (P9, B5)“Breathing rate’s really fast, seems like it’s really anxious, or it’s run-ning away.” (P2, B5)In both cases, the story of the behaviour (whether it was running away, ortalking to a friend) flip the participant’s understanding of the behaviour’s valence.Interestingly, both frame the behaviour as if the robot was not reacting to the nar-rative of the interaction within the lab, but some other, unseen narrative.Even when all participants gave similar valence ratings, the emotional contentof the rating differed greatly. 
For example, participants who rated B7 as negativeframed the interaction as all of having “trouble breathing” (P2), “sobbing” (P3),“alarmed” (P4), “didn’t want to be kept here” (P5), “worried” (P6), “agitated”.Positive ratings included “restless, but in a good way” (P8), “working [a job]”(P7), “a cat playing with straw” (P1)2.Participants often framed the current behaviours relative to previous behaviours,often with short references to “this time” or “last time”. One participant even sawa throughline between multiple behaviours:2P9 rated the behaviour positively, but could not decide what the robot was doing.84“OK, I believe that the last two trials and this trials is a whole story.Like, the first one should be like the robot gets sick at the beginningphase, and the second one is ’the robot is so sick’, and this one, therobot is dying, but he’s still struggling so I feel like a little breaths in-between but almost like, in the most of time, he just doesn’t breathe atall. So maybe he’s dying. So that’s my guess.” (P9, B2)This was part of building up the robot’s story over time. One participant madeexplicit reference to the robot’s back story, i.e., where it came from:“It seemed like it was burrowing, burrowing into my arms I guess, butI imagine that that’s an instinctive behaviour it got from the wild.” (P4,B10)Valence was difficult to determine in incongruous or neutral states:The majority of responses indicated that it was easy for participants to find astory for the behaviour. However, if they could not determine a story, or if theactions seemed incongruous with the story they had already imagined, then it wasdifficult for them to rate the valence of the behaviour:“It didn’t seem like an animal behaviour, breathing was a bit weird.I don’t know—it doesn’t seem like an actual animal behaviour that Irecognize. The breathing was just...yeah, it didn’t seem very real tome at all, so I wasn’t able to place a behaviour for that.” (P10, B14,valence= 0)“I don’t know, I couldn’t tell what was going on. It seemed prettyrandom to me, and I couldn’t tell if it was an emotion.” (P1, B16,valence= 0)In both of the above cases, other participants were able to produce a strongnarrative for the behaviour:“The behaviour reminded me of a dog gnawing his teeth in his sleep,so this, like, grinding his teeth, and doing, yeah, just, weird things inhis sleep. So I gave it neutral.” (P8, B14, valence= 0)85“Compared to all of the other behaviours, this one, to me, definitelyhad more of a truly utter joy and satisfaction to it, I don’t know why,I can’t exactly put my finger on why I think that, just my natural gutinstinct is that, just kind of an air of warmth and happiness and satis-faction just the way that the pulse was.” (P3, B16, valence= 2)Often, even with a strong narrative, a neutral state made it difficult for partici-pants to rate in terms of valence:“I feel like the robot is aruging with somebody or something and hisemotion gets more and more...like, I don’t know how to describe it,but he gets more excited while he’s doing this, so I think maybe he’strying to convince somebody of something, but I don’t think it’s eitherpositive or negative, so I choose neutral.” (P1, B11, valance= 0)For example, this quote illustrates a strong (if not well-articulated) sense ofwhat the robot is doing, but not a strong sense of whether the robot is in a positiveor negative emotional state. 
The participant describes the robot as “arguing” and“excited” but cannot determine a valence rating.5.4.3 Complexity: Discussion, Takeaways, and Future WorkComplexity measures: Higher correlations between MSE measures and valenceratings relative to variance and FFT peak count suggest that MSE is a promisingapproach for quantifying valence. Future work should include an MSE analysis onthe behaviours from CuddleBits combined with other complexity measures such astime irreversibility.Generating behaviours: A generative approach where MSE is used as a utilityfunction might still be possible, but computationally efficient stochastic processesneed to explored to seed behaviour generation. Our attempt at generating via ge-netic algorithms using MSE as a utility function was short lived due to the lengthof time generation was taking with purely random recombination, even with hand-generated behaviours as seeds. One could imagine using a stocastic process such asBrownian motion or Perlin noise to generate a complex seed behaviour (or portion86thereof), then use known complexity-reducing functions to vary the complexity to-wards some MSE setpoint. However, further tests are still needed to determine howappropriate MSE measures would be for all low-valence behaviours.Generating on the fly: An interactive robot needs to generate behaviours on thefly. It may be possible to mix behaviours to mix valence states, e.g., through dy-namic time warping [23]. More work is needed to know whether a fully-interactivecontext would significantly change the interpretation of the behaviours.Qualitative AnalysisQualitative results present a complex and nuanced picture of robot behaviour in-terpretation. Although there were some behaviours that were unambiguously posi-tively or negatively valenced, the diversity of stories the participants told about therobots belie a fragile system of interacting narrative elements that could radicallyshift the understanding of a behaviour.A natural urge for a scientist is to argue that we need higher experimental con-trol to properly determine valence rating; however, there are are a number of prob-lems with this premise.First, the dimensions of control are ambiguous, since results indicate that wewould want to control the participant’s narrative frame. It is not clear whetherwe can, would want to, or be able to provide a story for the robot, as the storyis produced in conjunction with the participant’s own subjective experience. Thelaboratory setting provides its own set of expectations and narrative possibilities;the ambivalence of some interpretations might have been in response to a ratingtask that had low ecological validity. However, increasing the ecological validityby, e.g. moving out of a laboratory, would likely mean giving up control in manyother dimensions.Second, getting a clear rating of valence may take developing a relationshipbetween the robot and the participant. This would necessitate creating an interac-tive environment where the robot’s actions could vary directly in response to theparticipant’s actions. 
Although such a situation would make procuring ratings dif-ficult (when and how would participants rate a behaviour?), we may find that someconsistent narrative frame emerges naturally out of continued interaction.87Third, it may be that a single-sample, single-point rating task is too reductive.Asking a participant to choose a single point on a dimensional model that repre-sents an entire behaviour ignores any changes in emotion that the participant mayhave perceived during the behaviour. For example, imagine interacting with therobot for a five-minute period: there may be a lot of different emotions felt inthat time, all with different intensities. Further, a single point on any dimensionalmodel may not well represent an emotion state—it seems reasonable to be both“happy” and “sad” at the same time. For example, an emotion state may be betterrepresented by a probability distribution across a set of dimensions, as that wouldaccount for simultaneous conflicting emotions. It may be that an emotion stateshould not converge to a single point until a decision is forced.Future work should include tasks that account for these difficulties. Accountingfor the complexity of subjective measurements, the tasks should (a) attempt toproduce a narrative frame through interaction; (b) have some ecological validitythrough creating a believable setting; (c) take place over a long enough period oftime such that a relationship can develop between the robot and the participant; (d)use a rating scheme and/or emotion model that allows for simultaneous conflictingstates.5.5 Conclusions for Complexity, Voodle, and CuddleBitsBehaviour EvaluationDespite their simplicity, the CuddleBits are capable of evocative emotion displayand interaction. In the above studies, we have demonstrated that participants candistinguish emotional behaviours along axes of valence and arousal3, create com-plex emotional stories for the robots3,4, and conceive of the robot as a socialagent4,5. We have shown that it is possible to create emotional stories with therobot, and that interactors truly inter-act by becoming aligned with the robot4.We have further shown that the arousal and valence of a breathing-like behaviourcan be determined according to some statistical measures of the behaviour, andhave identified signal features that correlate with arousal and valence (i.e., com-3CuddleBits.4Voodle.5Complexity.88plexity measures such as MSE for valence, and frequency/amplitude measures forarousal)3,5. However, we have also presented evidence that simple dimensionalmeasures for determining an emotion state are not sufficient to capture the diversityand depth of participants’ subjective experience while interacting with the robot4.We draw the conclusion that, although arousal is relatively consistent, valence mea-sures are especially fragile to differences in narrative frame.89Chapter 6Conclusions and Ongoing WorkThis thesis has established the potential for simple, 1-DOF robots to engage incomplex emotional interactions. If the work were to be boiled down to the a singlecontribution, it would be that CuddleBits can be designed to be believable socialagents, despite their ‘painful simplicity’. The strong caveat is that, although thissense of agency is easy to produce, it’s difficult to control, due to the complexityof the human subjective experience. There are some opportunities for generality,i.e., arousal display is correlated with frequency and amplitude of the behaviourwaveform; valence display with complexity measures. 
However, valence displayneeds to account for the interactor creating a narrative frame with the robot, theproduction of which is tied to both the interactor’s experiences and the relationshipthey build while interacting with the robot. To reliably display valenced breathingbehaviours, a designer would need to account for the impact of the narrative frameby creating some interactive context through which the robot’s could be expressedand reliably interpreted.This chapter focuses on identifying current and future work that may supportvalenced interactions. Work discussed here is in various stages of development inclose collaboration with other SPIN researchers1.1especially Laura Cang and Mario Cimet.906.1 InteractivityIn Voodle, participants continuously interacted with the robot to create emotionalscenes. The robot seemed to have a much stronger ability to display emotions forthose participating directly in the scene (the actor and director) than those whowere emotionally removed from the scene (the observer). One interpretation ofthis finding could be that participants in the scene are actively suspending theirdisbelief, whereas the observer was not. Key to achieving this engaged emotionstate is continuous interaction with the robot.To date, the approach in the SPIN Lab has been to treat the robot’s emotiondisplay (presented here) and the robot’s emotion recognition (presented in Cang2016 [17]) as separable, even with cross-pollination between team members. Aholistic approach may be necessary to properly inspect emotion recognition forboth the robot and its interactor.Such an approach would contravene current attempts at single-sample emotionmeasurement, i.e., when we ask participants to summarize an emotional interactionwith a single rating. Emotions may only be well-defined within the context of acontinuous appraisal and reappraisal of the interaction at hand with regards to aperson’s subjective frame (called appraisal theory [76]). A relational definition ofemotions would require a different measurement schema than is currently used.One approach could be to avoid any direct estimation of emotion, and insteaddetermine the success of an emotional interaction through a goal-based approach.Imagine a study in which the goal was to find out how the robot likes to be touched.You might imagine the robot expressing dissatisfaction if it doesn’t like to be tick-led, but satisfaction if it likes to be stroked. The participant would have to reasonabout the robot’s emotional state and determine arousal and valence continuouslywith direct feedback. This kind of an approach sidesteps the problems with emo-tion modeling by placing successful interactions as the valuation metric.A similar idea would be to create robots with differentiable ‘personalities’.Given a long enough interaction period, one could determine, e.g., whether a robotprefers to be left alone, or prefers a particular proximity. A personality, in this case,would be equivalent to a mapping between some set of inputs and behaviours. Ifan interactor were able to compare and contrast the robots by their behaviours, it91would imply that some valence interpretation was consistent and successful.6.2 Developing a Therapeutic RobotCurrent work in the SPIN Lab2 is on designing an interactive talk therapy-like ses-sion centered on the robot. The premise is to tell the robot an emotional story whiletouching it. The goal is to determine the emotional state of the user through theirtouch interactions, and, if possible, intervene through robot motion. 
6.2 Developing a Therapeutic Robot

Current work in the SPIN Lab (see Cang 2016 [17]) is on designing an interactive, talk-therapy-like session centred on the robot. The premise is to tell the robot an emotional story while touching it. The goal is to determine the emotional state of the user through their touch interactions and, if possible, to intervene through robot motion. For example, you might imagine that the robot becomes sympathetically agitated while a therapy client tells an emotionally charged story; the act of calming the robot down then helps the client calm themselves down.

We assume that this vision requires (1) a system to recognize the interactor's emotion; (2) a validated set of behaviours that can be manipulated to become gradually more agitated; and (3) a mapping between the recognized emotion and the output behaviour. However, the results from CuddleBits, Complexity, and Voodle suggest that an emotion recognition system may not need to determine valence so much as arousal, since the interactor may interpret the robot's behaviour according to their own subjective frame. This could take the form of a simple mapping from some interactor input to some robot motion, as in Voodle, where vocal input was mapped to robot position. For example, the motion of an accelerometer could be mapped to displayed arousal. This approach leaves the valence interpretation to emerge from the context of the interaction.
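As a rough illustration of such an arousal-only mapping, the Python sketch below smooths accelerometer motion into a running arousal estimate and maps it onto the amplitude and frequency of a 1-DOF breathing behaviour. The smoothing constant, parameter ranges, and function names are assumptions made for illustration; this is not the Voodle or CuddleBit implementation.

import math

SMOOTHING = 0.1   # low-pass factor so the display doesn't twitch at every jolt
arousal = 0.0     # running arousal estimate in [0, 1]

def update_arousal(ax, ay, az):
    """Fold one accelerometer sample (in g) into the running arousal estimate."""
    global arousal
    # Deviation from 1 g of gravity, clipped to [0, 1], is a crude measure of motion.
    motion = min(abs(math.sqrt(ax * ax + ay * ay + az * az) - 1.0), 1.0)
    arousal = (1 - SMOOTHING) * arousal + SMOOTHING * motion
    return arousal

def breathing_parameters(a):
    """Map arousal in [0, 1] onto breathing amplitude (0-1) and frequency (Hz)."""
    amplitude = 0.2 + 0.8 * a   # calmer interactor -> shallower breaths
    frequency = 0.2 + 1.0 * a   # calmer interactor -> slower breaths
    return amplitude, frequency

# Example: a burst of shaking raises the displayed arousal; valence is left
# entirely to the interactive context.
for sample in [(0.0, 0.0, 1.0), (0.4, 0.1, 1.3), (0.6, 0.2, 0.5)]:
    a = update_arousal(*sample)
print(breathing_parameters(a))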
6.3 Can We Display Valenced Behaviours?

It would be tempting to conclude from this thesis that we cannot consistently display valenced behaviours on 1-DOF robots: that they are too dependent on the interactor's narrative frame, which is too hard to control. I would consider this viewpoint too pessimistic. We had success in displaying consistently valenced behaviours, and found correlations between signal features and valence ratings. This suggests that there are aspects of a signal that can inspire a particular interpretation of valence, but that they are easily overpowered and/or accentuated by narrative framing. Since some consistency was found (assuming no false positive), a better emotion model may also improve recognition and display relative to some specified narrative frame.

Building a believable robot must therefore take narrative into account. If the goal is an autonomous robotic system, then the social context, and the robot's situation within it, must be determinable through interaction. A vision for such a system is one where the interactor can continuously evaluate their actions in relation to the robot's (re)actions and build up a narrative frame over time. Results from Voodle and Complexity suggest that this is possible, since participants naturally try to create a coherent narrative for the robot. Once a behaviour is embedded in a consistent narrative context, it is likely to be interpreted more consistently in terms of valence.

To motivate this, consider how one learns to interact with animals. Some animal behaviours map onto human expressions, such as the shape and movement of a dog's eyebrows, or the height at which an animal holds its head. Many animal expressions have no human analogue, such as a dog's wagging tail or a cat's ears. Yet we are able to tell what these movements mean. For some movements, we determine valence by correlation. One can learn that a dog's wagging tail means 'happy' by observing the excitement in their eyes, noting how eagerly they approach you, or noticing that the wagging tail accompanies a friendly pat. By contrast, we learn that a cat's wagging tail means 'angry' by watching their fur rise, hearing their menacing growls, and noticing that the wagging tail accompanies an unfriendly scratch. For other movements, we determine valence by inference. If a cat leans into a stroke, one can be sure that they like it; likewise if they facilitate the stroke by shifting to make themselves easier to reach. In all cases, the context (a history with animals, a knowledge of what constitutes facilitation, repeated interactions) is what determines the valence of a particular action.

6.4 Lessons for Designing Emotional Robots

In this thesis, we have looked at the design process for simple, emotionally expressive robots, and at the generation and evaluation of their behaviours. The scope of behaviour evaluation was limited to breathing behaviours, chosen both as a design challenge and for their emotional salience. However, we expect that many of the lessons learned here will apply to emotional machines of many configurations and capabilities.

Our findings about narrative context reinforce the understanding that humans interact emotionally with machines. Creating contexts in which machines can express emotion requires designers to help the interactor see the machine as a social agent and build a narrative context around it. Although this may not be desirable for every object (maybe we don't want a social garbage can), the interactive objects that already act socially in our lives may be enhanced by emotional programming.

For example, our robot vacuum cleaners already work within the social space of our homes, and might be much more interesting and understandable if imbued with emotionally expressive behaviours. Imagine a robot vacuum cleaner that could seem happy about its work, frustrated at being unable to reach a spot in the room, or scared about its battery running out. Our work suggests that it may be possible to convey some of this emotional complexity with only 1-DOF motion, as long as the narrative context is accounted for in the behaviour design. Extrapolating from our findings, we might expect a happy robot to drive smoothly, a frustrated robot to drive erratically, and a scared robot to shake.

Why would we want an emotional robot vacuum cleaner? I would argue that we already have robot vacuum cleaners working in our human emotional space; they are just not very good at it. Well-thought-out emotional behaviours for a robot vacuum cleaner would be more than a cute feature; they would be critical to a good design. Humans act emotionally with their world; machines that communicate emotionally make their purpose, motivations, and status transparent. For a robot vacuum cleaner, this might mean an owner who understands the robot at a deeper level and finds it less frustrating to be around; for a collaborative robot working in industry, such emotional insight might save someone's life. The work presented here suggests that this kind of emotional interaction is possible with just simple motion, if we can get the 'story' right.
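To make the vacuum-cleaner example slightly more concrete, the Python sketch below renders those three emotions as 1-DOF drive-speed profiles that differ only in smoothness and jitter. All parameter values are my own illustrative guesses, not validated behaviours from our studies.

import math
import random

EMOTIONS = {
    # name:       (base speed, jitter size, jitter rate in Hz)
    "happy":      (0.6, 0.05, 0.5),   # smooth, steady driving
    "frustrated": (0.5, 0.40, 2.0),   # erratic surges and stalls
    "scared":     (0.2, 0.30, 6.0),   # low speed, rapid shaking
}

def speed_profile(emotion, duration_s=3.0, rate_hz=20.0):
    """Return drive-speed samples in [0, 1] expressing one emotion over time."""
    base, jitter, jitter_hz = EMOTIONS[emotion]
    samples = []
    for i in range(int(duration_s * rate_hz)):
        t = i / rate_hz
        # A noisy oscillation: small and slow for 'happy', large or fast otherwise.
        wobble = jitter * math.sin(2 * math.pi * jitter_hz * t + random.uniform(0, 1))
        samples.append(max(0.0, min(1.0, base + wobble)))
    return samples

for name in EMOTIONS:
    profile = speed_profile(name)
    print(name, round(min(profile), 2), round(max(profile), 2))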
Bibliography

[1] PhysioNet complexity and variability analysis. https://physionet.org/tutorials/cv/. Accessed: 2017-08-22. → pages 16
[2] J. Allen, L. Cang, M. Phan-Ba, A. Strang, and K. MacLean. Introducing the CuddleBot: A robot that responds to touch gestures. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction Extended Abstracts, pages 295–295. ACM, 2015. → pages 39
[3] Anki, 2017. https://anki.com/en-us/cozmo. → pages 76
[4] F. Arab, S. Paneels, M. Anastassova, S. Coeugnet, F. Le Morellec, A. Dommes, and A. Chevalier. Haptic patterns and older adults: To repeat or not to repeat? In 2015 IEEE World Haptics Conference (WHC), pages 248–253. IEEE, jun 2015. ISBN 978-1-4799-6624-0. doi:10.1109/WHC.2015.7177721. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=7177721. → pages 22
[5] J. N. Bailenson and N. Yee. Digital chameleons: automatic assimilation of nonverbal gestures in immersive virtual environments. Psychological science, 16(10):814–819, 2005. → pages 21
[6] R. Banse and K. R. Scherer. Acoustic profiles in vocal emotion expression. Journal of personality and social psychology, 70(3):614, 1996. → pages 22
[7] J. Bates et al. The role of emotion in believable agents. Communications of the ACM, 37(7):122–125, 1994. → pages 24, 40
[8] A. Beck, A. Hiolle, and L. Canamero. Using Perlin noise to generate emotional expressions in a robot. In Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci 2013), pages 1845–1850, 2013. → pages 20, 78
[9] H. R. Bernard. Research methods in anthropology: Qualitative and quantitative approaches. Rowman Altamira, 2011. → pages 66
[10] J. A. Bilmes, X. Li, J. Malkin, K. Kilanski, R. Wright, K. Kirchhoff, A. Subramanya, S. Harada, J. A. Landay, P. Dowden, and H. Chizeck. The Vocal Joystick: A Voice-based Human-computer Interface for Individuals with Motor Impairments. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT '05, pages 995–1002, Stroudsburg, PA, USA, 2005. Association for Computational Linguistics. doi:10.3115/1220575.1220700. URL http://dx.doi.org/10.3115/1220575.1220700. → pages 22, 77
[11] S. Bloch, M. Lemeignan, and N. Aguilera-T. Specific respiratory patterns distinguish among human basic emotions. Intl Journal of Psychophysiology, 11(2):141–154, 1991. → pages 13, 18
[12] F. A. Boiten, N. H. Frijda, and C. J. Wientjes. Emotions and respiratory patterns: review and critical analysis. International Journal of Psychophysiology, 17(2):103–128, 1994. → pages 13
[13] C. Breazeal and L. Aryananda. Recognition of Affective Communicative Intent in Robot-Directed Speech. Autonomous Robots, 12(1):83–104, 2002. ISSN 1573-7527. doi:10.1023/A:1013215010749. URL http://dx.doi.org/10.1023/A:1013215010749. → pages 22
[14] L. Brunet, C. Megard, S. Paneels, G. Changeon, J. Lozada, M. P. Daniel, and F. Darses. “Invitation to the voyage”: The design of tactile metaphors to fulfill occasional travelers' needs in transportation networks. In 2013 World Haptics Conference (WHC), pages 259–264. IEEE, apr 2013. ISBN 978-1-4799-0088-6. doi:10.1109/WHC.2013.6548418. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6548418. → pages 22
[15] P. Bucci, X. L. Cang, M. Chun, D. Marino, O. Schneider, H. Seifi, and K. MacLean. CuddleBits: an iterative prototyping platform for complex haptic display. In EuroHaptics '16 Demos, 2016. → pages 13
[16] P. Bucci, L. Cang, S. Valair, D. Marino, L. Tseng, M. Jung, J. Rantala, O. Schneider, and K. MacLean. Sketching CuddleBits: Coupled prototyping of body and behaviour for an affective robot pet. To appear in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI) 2017, 2017. → pages iv, 49, 77
[17] L. Cang. Towards an emotionally communicative robot: feature analysis for multimodal support of affective touch recognition. Master's thesis, University of British Columbia, 2016. → pages 26, 91, 92
[18] L. Cang, P. Bucci, and K. E. MacLean. CuddleBits: Friendly, low-cost furballs that respond to touch. In Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, pages 365–366. ACM, 2015. doi:10.1145/2818346.2823293. → pages 13
[19] X. L. Cang, P. Bucci, A. Strang, J. Allen, K. MacLean, and H. Liu. Different strokes and different folks: Economical dynamic surface sensing and affect-related touch recognition. In Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, pages 147–154. ACM, 2015. → pages 25
[20] N. Chomsky and M. Halle. The sound pattern of English. New York: Harper and Row, 1968. → pages 46
[21] H. Choset, K. M. Lynch, S. Hutchinson, G. Kantor, W. Burgard, L. E. Kavraki, and S. Thrun. Principles of Robot Motion: Theory, Algorithms, and Implementations. The MIT Press, 2005. → pages 20
[22] M. Chung, E. Rombokas, Q. An, Y. Matsuoka, and J. Bilmes. Continuous vocalization control of a full-scale assistive robot. In 2012 4th IEEE RAS & EMBS International Conference on Biomedical Robotics and Biomechatronics (BioRob), pages 1464–1469, jun 2012. doi:10.1109/BioRob.2012.6290664. → pages 22
[23] B. Clark, O. S. Schneider, K. E. MacLean, and H. Z. Tan. Predictable and distinguishable morphing of vibrotactile rhythm. → pages 87
[24] J. Corbin and A. Strauss. Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. Sage Publications, Inc., 3rd edition, 2008. → pages 67
[25] M. Costa, A. L. Goldberger, and C.-K. Peng. Multiscale entropy analysis of complex physiologic time series. Physical review letters, 89(6):068102, 2002. → pages 17
[26] M. D. Costa, C.-K. Peng, and A. L. Goldberger. Multiscale analysis of heart rate dynamics: entropy and time irreversibility measures. Cardiovascular Engineering, 8(2):88–93, 2008. → pages 17, 80
[27] J. Q. Dawson, O. S. Schneider, J. Ferstay, D. Toker, J. Link, S. Haddad, and K. MacLean. It's alive!: exploring the design space of a gesturing phone. In Proc of Graphics Interface 2013, pages 205–212. Canadian Information Processing Society, 2013. → pages 54
[28] C. M. de Melo, P. Kenny, and J. Gratch. Real-time expression of affect through respiration. Computer Animation and Virtual Worlds, 21(3-4):225–234, 2010. → pages 13, 18
[29] F. De Saussure, W. Baskin, and P. Meisel. Course in general linguistics. Columbia University Press, 2011. → pages 21
[30] L. Feldman Barrett and J. A. Russell. Independence and bipolarity in the structure of current affect. Journal of personality and social psychology, 74(4):967, 1998. → pages 15
[31] T. Fong, I. Nourbakhsh, and K. Dautenhahn. A survey of socially interactive robots. Robotics and Autonomous Systems, 42(3):143–166, 2003. → pages 12
[32] J. Forsslund, M. Yip, and E.-L. Sallnäs. WoodenHaptics: A starting kit for crafting force-reflecting spatial haptic devices. In Proceedings of the Ninth International Conference on Tangible, Embedded, and Embodied Interaction, pages 133–140. ACM, 2015. → pages 19
[33] S. Garrod and M. J. Pickering. Why is conversation so easy? Trends in cognitive sciences, 8(1):8–11, 2004. → pages 21, 76
[34] M. Goto, K. Kitayama, K. Itou, and T. Kobayashi. Speech Spotter: On-demand speech recognition in human-human conversation on the telephone or in face-to-face situations. In Proc. ICSLP '04, pages 1533–1536, 2004. → pages 22
[35] J. Gratch, A. Okhmatovskaia, F. Lamothe, S. Marsella, M. Morales, R. J. van der Werf, and L.-P. Morency. Virtual rapport. In International Workshop on Intelligent Virtual Agents, pages 14–27. Springer, 2006. → pages 21
[36] J. Gray, G. Hoffman, S. O. Adalgeirsson, M. Berlin, and C. Breazeal. Expressive, interactive robots: Tools, techniques, and insights based on collaborations. In Human Robot Interaction (HRI) 2010 Workshop: What do collaborations with the arts have to say about HRI, 2010. → pages 20, 76
[37] S. Harada, J. A. Landay, J. Malkin, X. Li, and J. A. Bilmes. The Vocal Joystick: Evaluation of Voice-based Cursor Control Techniques. In Proceedings of the 8th International ACM SIGACCESS Conference on Computers and Accessibility, Assets '06, pages 197–204, New York, NY, USA, 2006. ACM. ISBN 1-59593-290-9. doi:10.1145/1168987.1169021. URL http://doi.acm.org/10.1145/1168987.1169021. → pages 77
[38] G. Hoffman. On stage: robots as performers. In RSS 2011 Workshop on Human-Robot Interaction: Perspectives and Contributions to Robotics from the Human Sciences, Los Angeles, CA, volume 1, 2011. → pages 21
[39] G. Hoffman and W. Ju. Designing robots with movement in mind. Journal of Human-Robot Interaction, 3(1):89–122, 2014. → pages 20
[40] G. Hoffman and W. Ju. Designing robots with movement in mind. Journal of Human-Robot Interaction, 3(1):89–122, 2014. → pages 19, 77
[41] B. House, J. Malkin, and J. Bilmes. The VoiceBot: A Voice Controlled Robot Arm. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '09, pages 183–192, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-246-7. doi:10.1145/1518701.1518731. URL http://doi.acm.org/10.1145/1518701.1518731. → pages 22
[42] T. Igarashi and J. F. Hughes. Voice As Sound: Using Non-verbal Voice Input for Interactive Control. In Proceedings of the 14th Annual ACM Symposium on User Interface Software and Technology, UIST '01, pages 155–156, New York, NY, USA, 2001. ACM. ISBN 1-58113-438-X. doi:10.1145/502348.502372. URL http://doi.acm.org/10.1145/502348.502372. → pages 22
[43] Johnny-Five, 2017. http://johnny-five.io. → pages 47
[44] K. J. Kitto. Modelling and generating complex emergent behaviour. Flinders University, School of Chemistry, Physics and Earth Sciences, 2006. → pages 16
[45] K. Krippendorff. Computing Krippendorff's alpha reliability. Departmental Papers (ASC), page 43, 2007. → pages 81
[46] J. L. Lakin, V. E. Jefferis, C. M. Cheng, and T. L. Chartrand. The chameleon effect as social glue: Evidence for the evolutionary significance of nonconscious mimicry. Journal of nonverbal behavior, 27(3):145–162, 2003. → pages 21
[47] G. Lakoff and M. Johnson. Metaphors we live by. 2003. → pages 70
[48] Y.-K. Lim, E. Stolterman, and J. Tenenberg. The anatomy of prototypes: Prototypes as filters, prototypes as manifestations of design ideas. ACM TOCHI, 15(2):7, 2008. → pages 19
[49] R. Maatman, J. Gratch, and S. Marsella. Natural behavior of a listening agent. In International Workshop on Intelligent Virtual Agents, pages 25–36. Springer, 2005. → pages 21
[50] M. M. Marin and H. Leder. Examining complexity across domains: relating subjective and objective measures of affective environmental scenes, paintings and music. PloS One, 8(8):e72412, 2013. → pages 16
[51] D. Marino, P. Bucci, O. S. Schneider, and K. E. MacLean. Voodle: Vocal doodling to sketch affective robot motion. In Proceedings of the 2017 Conference on Designing Interactive Systems, pages 753–765. ACM, 2017. → pages iv
[52] L. Mathiassen, T. Seewaldt, and J. Stage. Prototyping and specifying: principles and practices of a mixed approach. Scandinavian Journal of Information Systems, 7(1):4, 1995. → pages 19
[53] D. H. McFarland. Respiratory markers of conversational interaction. Journal of Speech, Language, and Hearing Research, 44(1):128–143, 2001. → pages 21
[54] A. Moon, C. A. Parker, E. A. Croft, and H. M. Van der Loos. Did you see it hesitate? Empirically grounded design of hesitation trajectories for collaborative robots. In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1994–1999. IEEE, 2011. → pages 20
[55] C. Moussette. Sketching in Hardware and Building Interaction Design: tools, toolkits and an attitude for Interaction Designers. In Processing, 2010. URL https://public.me.com/intuitive. → pages 19
[56] C. Moussette. Simple haptics: Sketching perspectives for the design of haptic interactions. Umeå Universitet, 2012. → pages 19
[57] C. Moussette, S. Kuenen, and A. Israr. Designing haptics. In Proceedings of the Sixth International Conference on Tangible, Embedded and Embodied Interaction - TEI '12, page 351, New York, New York, USA, feb 2012. ACM Press. ISBN 9781450311748. doi:10.1145/2148131.2148215. URL http://dl.acm.org/citation.cfm?id=2148131.2148215. → pages 7
[58] O. Schneider and K. MacLean. Haptic jazz: Collaborative touch with the haptic instrument. In IEEE Haptics Symposium, 2014. → pages 20
[59] J. S. Pardo. On phonetic convergence during conversational interaction. The Journal of the Acoustical Society of America, 119(4):2382–2393, 2006. → pages 21
[60] J. Pearson, J. Hu, H. P. Branigan, M. J. Pickering, and C. I. Nass. Adaptive language behavior in HCI: how expectations and beliefs about a system affect users' word choice. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 1177–1180. ACM, 2006. → pages 21
[61] C. Peirce. The philosophical writings of Peirce, pages 98–119. New York: Dover, 1955. → pages 22
[62] M. Perlman and A. A. Cain. Iconicity in vocalization, comparisons with gesture, and implications for theories on the evolution of language. Gesture, 14(3):320–350, 2014. → pages 22
[63] P. Perniss and G. Vigliocco. The bridge of iconicity: from a world of experience to the experience of language. Phil. Trans. R. Soc. B, 369(1651):20130300, 2014. → pages 22
[64] P. Rainville, A. Bechara, N. Naqvi, and A. R. Damasio. Basic emotions are associated with distinct patterns of cardiorespiratory activity. International journal of psychophysiology, 61(1):5–18, 2006. → pages 13
[65] J. Rantala, K. Salminen, R. Raisamo, and V. Surakka. Touch gestures in communicating emotional intention via vibrotactile stimulation. Intl J Human-Computer Studies, 71(6):679–690, 2013. → pages 54
[66] React, 2017. https://facebook.github.io/react. → pages 47
[67] T. Ribeiro and A. Paiva. The illusion of robotic life: principles and practices of animation for robots. In Proceedings of the Seventh Annual ACM/IEEE International Conference on Human-Robot Interaction, pages 383–390. ACM, 2012. → pages 20
[68] J. S. Richman and J. R. Moorman. Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6):H2039–H2049, 2000. → pages 17
[69] R. Rose, M. Scheutz, and P. Schermerhorn. Towards a conceptual and methodological framework for determining robot believability. Interaction Studies, 11(2):314–335, 2010. → pages 23
[70] R. Rummer, J. Schweppe, R. Schlegelmilch, and M. Grice. Mood is linked to vowel type: The role of articulatory movements. Emotion, 14(2):246, 2014. → pages 22
[71] J. A. Russell. A circumplex model of affect. Journal of Personality and Social Psychology, 39(6):1161, 1980. → pages 15
[72] J. A. Russell, A. Weiss, and G. A. Mendelsohn. Affect grid: a single-item scale of pleasure and arousal. Journal of personality and social psychology, 57(3):493, 1989. → pages 15, 52
[73] G. W. Ryan and H. R. Bernard. Techniques to Identify Themes. Field Methods, 15(1):85–109, feb 2003. doi:10.1177/1525822X02239569. URL http://fmx.sagepub.com/cgi/doi/10.1177/1525822X02239569. → pages 67
[74] D. Sakamoto, T. Komatsu, and T. Igarashi. Voice Augmented Manipulation: Using Paralinguistic Information to Manipulate Mobile Devices. In Proceedings of the 15th International Conference on Human-computer Interaction with Mobile Devices and Services, MobileHCI '13, pages 69–78, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2273-7. doi:10.1145/2493190.2493244. URL http://doi.acm.org/10.1145/2493190.2493244. → pages 22
[75] J. Saldien, K. Goris, S. Yilmazyildiz, W. Verhelst, and D. Lefeber. On the design of the huggable robot Probo. Journal of Physical Agents, 2(2):3–12, 2008. → pages 20
[76] K. R. Scherer. Appraisal theory. 1999. → pages 91
[77] O. S. Schneider and K. E. MacLean. Improvising design with a haptic instrument. In 2014 IEEE Haptics Symposium (HAPTICS), pages 327–332. IEEE, 2014. → pages 22, 78
[78] O. S. Schneider and K. E. MacLean. Studying Design Process and Example Use with Macaron, a Web-based Vibrotactile Effect Editor. In HAPTICS '16: Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, 2016. → pages 48
[79] O. S. Schneider and K. E. MacLean. Studying design process and example use with Macaron, a web-based vibrotactile effect editor. In 2016 IEEE Haptics Symposium (HAPTICS), pages 52–58. IEEE, 2016. → pages 8, 20, 39
[80] O. S. Schneider, H. Seifi, S. Kashani, M. Chun, and K. E. MacLean. HapTurk: Crowdsourcing Affective Ratings for Vibrotactile Icons. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI) '16, pages 3248–3260, New York, New York, USA, may 2016. ACM Press. ISBN 9781450333627. doi:10.1145/2858036.2858279. URL http://dl.acm.org/citation.cfm?id=2858036.2858279. → pages 43
[81] Y. S. Sefidgar, K. E. MacLean, S. Yohanan, H. M. Van der Loos, E. A. Croft, and E. J. Garland. Design and evaluation of a touch-centered calming interaction with a social robot. IEEE Transactions on Affective Computing, 7(2):108–121, 2016. → pages 13
[82] H. Seifi, K. Zhang, and K. E. MacLean. VibViz: Organizing, visualizing and navigating vibration libraries. In World Haptics Conference (WHC), 2015 IEEE, pages 254–259. IEEE, 2015. → pages 43
[83] M. Shaver and K. MacLean. The Twiddler: A haptic teaching tool for low-cost communication and mechanical design. Master's thesis, 2003. → pages 39
[84] H. Shintel, H. C. Nusbaum, and A. Okrent. Analog acoustic expression in speech communication. Journal of Memory and Language, 55(2):167–177, 2006. → pages 22
[85] J. Six, O. Cornelis, and M. Leman. TarsosDSP, a Real-Time Audio Processing Framework in Java. In Proceedings of the 53rd AES Conference (AES 53rd), 2014. → pages 47
[86] A. Strauss and J. Corbin. Basics of qualitative research: Techniques and procedures for developing grounded theory. Sage Publications, Inc, 1998. → pages 65
[87] L. Takayama, D. Dooley, and W. Ju. Expressing thought: improving robot readability with animation principles. In Proceedings of the 6th International Conference on Human-Robot Interaction, pages 69–76. ACM, 2011. → pages 20
[88] K. Wada and T. Shibata. Living with seal robots – its sociopsychological and physiological influences on the elderly at a care house. IEEE Transactions on Robotics, 23(5):972–980, 2007. → pages 12
[89] K. Wada, T. Shibata, T. Asada, and T. Musha. Robot therapy for prevention of dementia at home. Journal of Robotics and Mechatronics, 19(6):691, 2007. → pages 12
[90] R. M. Warner. Coordinated cycles in behavior and physiology during face-to-face social interactions. Sage Publications, Inc, 1996. → pages 21
[91] J. Watanabe and M. Sakamoto. Comparison between onomatopoeias and adjectives for evaluating tactile sensations. In Proceedings of the 6th International Conference of Soft Computing and Intelligent Systems and the 13th International Symposium on Advanced Intelligent Systems (SCIS-ISIS 2012), pages 2346–2348, 2012. → pages 22, 78
[92] D. Watson, L. A. Clark, and A. Tellegen. Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54(6):1063–1070, 1988. doi:10.1037/0022-3514.54.6.1063. URL http://doi.apa.org/getdoi.cfm?doi=10.1037/0022-3514.54.6.1063. → pages 15
[93] D. Watson, L. A. Clark, and A. Tellegen. Development and validation of brief measures of positive and negative affect: the PANAS scales. Journal of personality and social psychology, 54(6):1063, 1988. → pages 47
[94] A. C. Weidman, C. M. Steckler, and J. L. Tracy. The jingle and jangle of emotion assessment: Imprecise measurement, casual scale usage, and conceptual fuzziness in emotion research. 2016. → pages 14
[95] S. Yohanan and K. MacLean. A tool to study affective touch. In CHI '09 Extended Abstracts, pages 4153–4158. ACM, 2009. → pages 12
[96] S. Yohanan and K. E. MacLean. Design and assessment of the haptic creature's affect display. In Proceedings of the 6th International Conference on Human-Robot Interaction, pages 473–480. ACM, 2011. → pages 12
[97] S. Yohanan and K. E. MacLean. The role of affective touch in human-robot interaction: Human intent and expectations in touching the haptic creature. Intl J Social Robotics, 4(2):163–180, 2012. → pages 54

Appendix A: Supporting Materials

Figure A.1: RibBit assembly instructions, page 1.
Figure A.2: RibBit assembly instructions, page 2.
Figure A.3: RibBit assembly instructions, page 3.
Figure A.4: RibBit assembly instructions, page 4.
Figure A.5: RibBit assembly instructions, page 5.
Figure A.6: RibBit assembly instructions, page 6.
Figure A.7: RibBit assembly instructions, page 7.
Figure A.8: RibBit assembly instructions, page 8.
Figure A.9: RibBit assembly instructions, page 9.
Figure A.10: RibBit design system explainer, page 1.
Figure A.11: RibBit design system explainer, page 2.
Figure A.12: RibBit design system explainer, page 3.
Figure A.13: Lasercutting files for the RibBit.
Figure A.14: FlexiBit assembly instructions, page 1.
Figure A.15: FlexiBit assembly instructions, page 2.
Figure A.16: FlexiBit assembly instructions, page 3.
Figure A.17: FlexiBit assembly instructions, page 4.
Figure A.18: FlexiBit design system explainer, page 1.
Figure A.19: FlexiBit design system explainer, page 2.
Figure A.20: FlexiBit design system explainer, page 3.
Figure A.21: FlexiBit design system explainer, page 4.
Figure A.22: FlexiBit design system explainer, page 5.
Figure A.23: FlexiBit design files to be cut out.