A vision system for a surgical instrument-passing robot Chan, Kenneth Ling-Man 1985

A VISION SYSTEM FOR A SURGICAL INSTRUMENT-PASSING ROBOT

by

KENNETH LING-MAN CHAN

B.A.Sc., The University of British Columbia, 1983

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in THE FACULTY OF GRADUATE STUDIES (Department of Electrical Engineering)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

August 1985

© Kenneth Ling-Man Chan, 1985

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Electrical Engineering
The University of British Columbia
1956 Main Mall
Vancouver, Canada
V6T 1Y3

Date: 30 August, 1985

ABSTRACT

To help control the high cost of health care delivery, a robotic system is proposed for passing surgical instruments in an operating room. The system consists of a vision system, a robotic arm, a speech recognition and synthesis unit, and a microcomputer. A complete vision system has been developed using standard and new techniques to recognize arthroscopic surgical instruments.

Results of the vision system software evaluation gave an overall recognition accuracy of over 99%. Also, error conditions were analysed and found to be consistent with the results of a clinical survey on the proposed instrument-passing robot.
As well, a payback and cost benefit analysis using estimated system costs and potential labour savings showed that the instrument-passing robot is economically feasible.

Based on the results of this thesis, it was concluded that the instrument-passing robot would be beneficial for reducing the high cost of health care.

Table of Contents

ABSTRACT
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENT

1.0 INTRODUCTION
    1.1. Statement of Problem
    1.2. Objectives

2.0 SYSTEM OVERVIEW
    2.1. Block Diagram of System
    2.2. Description of Components
        2.2.1 Computer
        2.2.2 Vision System
        2.2.3 Speech Recognition and Synthesis System
        2.2.4 Robot Arm
    2.3. Gripper - End Effector
    2.4. Surgical Instrument Set
    2.5. Structured Lighting and Instrument Tray
    2.6. System Integration
        2.6.1 Operation Sequence
        2.6.2 Role of Surgeon and Nurse in O.R.
        2.7.1 Payback Period and Cost Effective Analysis

3.0 VISION SYSTEM
    3.1. Imaging Components Review
        3.1.1 High-Resolution Imaging Systems
        3.1.2 Vision System Design
    3.2. Selection and Evaluation of Camera
        3.2.1 Camera Review
        3.2.2 Camera Selection
        3.2.3 Camera Evaluation
    3.3. Selection and Evaluation of Image Digitizer
        3.3.1 Image Digitizer Review
        3.3.2 Digitizer Selection
        3.3.3 Digitizer Evaluation

4.0 RECOGNITION ALGORITHM
    4.1. General Recognition Problem
    4.2. Description of Instruments and Layout
    4.3. Survey of Existing Algorithms
        4.3.1 Overview
        4.3.2 Criteria for Recognition Algorithm
        4.3.3 Feature Extraction Algorithms
        4.3.4 Global Features
        4.3.5 Boundary Oriented Features
        4.3.6 Classification Algorithms
    4.4. Development of Recognition Algorithms
        4.4.1 Low-Resolution Algorithm
        4.4.2 High-Resolution Algorithm
        4.4.3 Clinical Requirements

5.0 EVALUATION OF RECOGNITION ALGORITHM
        5.1.1 Methods of Error Estimation
        5.1.2 Test and Training Data
    5.2. Results of Testing Each Algorithm
    5.3. Failures of Each Algorithm
    5.4. Improvements of Recognition Algorithms
    5.5. Implementation and Optimization
        5.5.1 Practical Problems
        5.5.2 Optimization
    5.6. Final Results of the Recognition Algorithm

6.0 CONCLUSIONS AND RECOMMENDATIONS

References
Appendix A Questionnaire on Clinical Requirements
Appendix B Payback Period Calculation
Appendix C Program Pseudo-code

List of Tables

Table 2.1 Cost of IBM PC.
Table 2.2 List of arthroscopy instruments.
Table 2.3 Proposed system cost and scrub nurse salary.
Table 3.1 Listing of commercial cameras considered for vision system.
Table 3.2 Listing of commercial image digitizers considered for vision system.
Table 4.1 Typical structural matching results.
Table 5.1 List of test data set.
Table 5.2 Results of tests on Acufex algorithm.
Table 5.3 Confusion matrix of all arthroscopic instruments.
Table 5.4 Types of errors in structural match.
Table 5.5 Time comparison results.
Table 5.6 Arthroscopic instrument usage probability.
Table 5.7 Overall recognition accuracy.

List of Figures

Figure 1.1 Cartilage knives (top) and Acufex instruments (bottom)
Figure 1.2 Obturators (top) and low resolution instruments (bottom)
Figure 1.3 Tips of cartilage knives (top) and 3 Acufex instruments: LGF, LPS, RPS (bottom)
Figure 1.4 Tips of Acufex instruments: L20H, R20H, L60H, R60H (top); LSS, RSS, LBP, RBP (bottom)
Figure 2.1 Typical robotic system configuration.
Figure 2.2 Common geometries for robot arms.
Figure 2.3 Drawing of backlight unit
Figure 2.4 Pre-operation initialization flowchart
Figure 2.5 Vision system flowchart
Figure 2.6 Speech control flowchart for robot arm.
Figure 3.1 Vision systems using one camera.
Figure 3.2 Vision systems using 2 cameras.
Figure 3.3 Proposed vision system for instrument-passing robot
Figure 3.4 Resolution test pattern with converging lines
Figure 3.5 RCA P200 test pattern.
Figure 3.6 Test set-up for RCA P200 test
Figure 3.7 MTF of Javelin and Dage.
Figure 3.8 Digitizer evaluation set-up.
Figure 3.9 Image digitized by IP-512.
Figure 3.10 Image digitized by Oculus 100.
Figure 4.1 Pattern recognition system block diagram
Figure 4.2 Layout of proposed instrument tray.
Figure 4.3 Typical low resolution image.
Figure 4.4 Distribution of low resolution features.
Figure 4.5 Definition of Pose 1 and Pose 2.
Figure 4.6 Determination of handle width.
Figure 4.7 Distribution of lengths of Acufex instruments and cartilage knives.
Figure 4.8 Distribution of handle widths of Acufex instruments and cartilage knives.
Figure 4.9 Determination of pose for Acufex instruments and cartilage knives.
Figure 4.10 Typical high resolution image.
Figure 4.11 Distribution of widths and angles for obturators.
Figure 4.12 Distribution of width of cartilage knives: (top) all knives; (bottom) FCK, HP, SCK, MCK.
Figure 4.13 Curvature of Acufex tip: large grasping forceps.
Figure 4.14 Curvature of Acufex tip: left plain scissors.
Figure 4.15 Curvature of Acufex tip: right plain scissors.
Figure 4.16 Curvature of Acufex tip: left 20 degree hooked scissors.
Figure 4.17 Curvature of Acufex tip: right 20 degree hooked scissors.
Figure 4.18 Curvature of Acufex tip: left 60 degree hooked scissors.
Figure 4.19 Curvature of Acufex tip: right 60 degree hooked scissors.
Figure 4.20 Curvature of Acufex tip: left serrated scissors.
Figure 4.21 Curvature of Acufex tip: right serrated scissors.
Figure 4.22 Curvature of Acufex tip: left basket punch.
Figure 4.23 Curvature of Acufex tip: right basket punch.
Figure 4.24 Noisy high resolution image with intrusions.
Figure 4.25 High resolution structural features.
Figure 4.26 High resolution BMD features
Figure 5.1 Overall recognition flowchart
Figure 5.2 Histogram of goodness of structural match.
Figure 5.3 Noisy image with histogram of intensity.
Figure 5.4 Binary digitized noisy image: (top) threshold = 55; (bottom) threshold = 65.
Figure 5.5 Binary digitized noisy image: (top) threshold = 45; (bottom) threshold = 35.
Figure 5.6 Noise test of fluids in low resolution.
Figure 5.7 Noise test of fluids in high resolution.
Figure 5.8 Sample instrument tray.
Figure 5.9 Boundary test in low resolution.
Figure 5.10 Boundary test in high resolution.
Figure 5.11 UBC VAX set-up.

ACKNOWLEDGEMENT

I would like to thank Dr. Lawrence and Dr. McEwen for their helpful suggestions throughout the course of this thesis. I would especially like to thank Mr. Neil Cox for much timely advice on completing a thesis, without which this thesis would not have been completed.

I would also like to thank the following people and groups for their advice and assistance: Messrs. G. Auchinleck, R. Bohl, S. Chan, J. Clark, J. Ens, B. Fung, H. Garudadri, W. Jager, R. McNeil, C. Osborne, K. Yip, the technical and clerical staff of the Biomedical Engineering Department at Vancouver General Hospital, and the surgical nursing and equipment cleaning staff at the Surgical Day Care Centre.

Finally, I would like to give a special thanks to my sister, Mabel, for her expert typing and proof-reading of this thesis, and to my family for their kind support throughout this thesis.

I am very grateful to the B.C. Science Council for their support of my work in the form of GREAT awards, and to Andronic Devices Ltd. for supplying some of the equipment for this project.

CHAPTER 1

INTRODUCTION

1.1. Statement of Problem

The high cost of health care delivery has led to significant reduction in staff and, possibly, reduction in services performed in some hospitals. A large portion, as much as 80%, of a hospital's operating budget is devoted to labour costs. Hence, it is essential for hospitals to control the ever-increasing labour cost in order to maintain the quality of health care provided (McEwen, 1984).

One method commonly employed in manufacturing to reduce labour costs and increase productivity is automation, particularly robotics. A robot, as defined by the Robotics Industry Association, is a "reprogrammable multifunctional manipulator designed to move materials, parts, tools or specialized devices through variable programmed motions for the performance of a variety of tasks." Thus, robots are extremely useful for performing repetitive, well-defined tasks, or for working in hazardous or hostile environments.

Robotic devices have been used extensively in rehabilitation engineering to improve the quality of life of disabled people and to reduce their dependence on health care workers (Leifer, 1981). Robotics have not yet been employed extensively in acute care hospitals; however, more widespread usage can be expected in the future not only to control the costs of health care, but also to improve the quality of such care. For example, a few robotic devices are being used in clinical chemistry laboratories to handle hazardous samples or reagents and to reduce the exposure of laboratory workers to infectious material (McEwen, 1984).

As another example, in the operating room, a robot has been used in stereotactic surgery (Kwok, 1985).
The robot is used to align the trajectory of a probe to be inserted into the patient's skull; the surgeon then manually directs the probe towards a specified target within the brain while making use of computer tomography (CT) images. The use of the robot allows the probe to move in a more precisely controlled path than a surgeon would have been able to achieve.

In this thesis, a robotic system with visual sensing capability is proposed to perform non-time-critical duties in the operating room (OR). At present, two nurses, a circulating nurse and a scrub nurse, are employed in the OR for each surgical case. The circulating nurse, working in the non-sterile areas of the OR, sets up the surgical equipment prior to surgery, fetches and retrieves material for the other clinical personnel during surgery, and removes the surgical equipment after the conclusion of surgery. The scrub nurse, working in the sterile areas, sets up the sterile equipment prior to surgery, passes the sterile instruments to the surgeon, and removes the surgical equipment after the conclusion of the surgery. Since the circulating nurse is highly mobile during surgery, a robot would not be able to emulate her activities. On the other hand, a scrub nurse spends a considerable amount of her time passing instruments to the surgeon. Hence, these repetitive instrument-passing duties can be performed by an instrument-passing robot, resulting in labour savings. Other advantages are the reduced risk of infection and the consistency of robot workers.

Any robotic system for such an application should be able to pass the instruments to the surgeon without compromising the patient's treatment, sterility in the surgical field, or the safety of the OR personnel. In particular, a vision system is to be developed for recognizing the surgical instruments.
The vision system should be able to identify each instrument and determine its location and orientation.

The feasibility of the instrument-passing robot depends on the complexity of the surgical procedure. For cardiac surgery, where many surgical instruments are involved, it does not yet seem plausible for a robot to sort through all of these instruments with the required speed. On the other hand, in arthroscopic surgery, only 36 instruments are used and the procedure is less time-critical. Hence, the instrument-passing robot can be used in arthroscopic surgical procedures.

1.2. Objectives

One difficult problem with implementing the instrument-passing robot is the lack of a commercially available vision system that is capable of recognizing the surgical instruments. A survey of the commercial vision systems at the 1984 "Robots 8" exhibition in Detroit revealed that most of the commercial vision systems were not suitable for this application. Of the systems examined, the closest to being suitable were the EYE (Analog Devices, Norwood, MA) and the IRRIS-100 (LNK Corp., Silver Spring, MD). However, these systems were either too expensive, over $50,000, or did not have sufficient resolution. Hence, a custom vision system capable of recognizing the surgical instruments will be developed in this thesis.

In any computer vision system, the image pattern recognition problem usually requires extensive development. From Figures 1.1 and 1.2, showing the complete arthroscopic instrument set, it can be seen that some of the instruments are quite similar except for the tips. Figures 1.3 and 1.4 give a close-up view of the tips. The subtle differences in the instruments preclude the use of most of the simple statistical image pattern recognition techniques and require detailed examination of local properties of the instruments.
The objectives of this thesis are:

1. Primary Objective:
   a. to develop and evaluate a real-time, clinically acceptable vision system for recognizing arthroscopic instruments.

2. Secondary Objectives:
   a. to investigate the technical problems associated with using such a vision system as part of an instrument-passing robot;
   b. to make recommendations for system components for an instrument-passing robot;
   c. to examine the economic aspects of the instrument-passing robot.

The contributions of this thesis are in developing computer algorithms capable of discriminating the arthroscopic instruments in real-time on a microcomputer, outlining the problems associated with implementing an operating room instrument-passing robot, and examining the economic aspects of implementing a robot in the operating room.

Figure 1.1 Cartilage knives (top) and Acufex instruments (bottom)

Figure 1.2 Obturators (top) and low resolution instruments (bottom)

Figure 1.3 Tips of cartilage knives (top) and 3 Acufex instruments: LGF, LPS, RPS (bottom)

Figure 1.4 Tips of Acufex instruments: L20H, R20H, L60H, R60H (top); LSS, RSS, LBP, RBP (bottom)

CHAPTER 2

SYSTEM OVERVIEW

The primary function of the instrument-passing robot is to replace those functions of a scrub nurse which involve handling the surgical instruments between the sterile instrument area and the surgeon. Although the duties of a human scrub nurse involve considerably more than instrument handling, the present technology limitations in robotics prevent the inclusion of other mobile and decision-making tasks of a scrub nurse. Nevertheless, if a robot is capable of successfully replacing a human operator in a limited capacity, an overall cost saving will result.

The requirements of an instrument-passing robot are considerably more complicated than those of an industrial robot. In addition to having sufficient reach, speed and accuracy, the surgical robot must also be safe, quiet, sterilizable (if necessary), and unobtrusive. This chapter presents the layout of the overall instrument-passing robot.

2.1. Block Diagram of System

The basic components required to perform the instrument-passing duties of a scrub nurse are: a computer for overall control and data processing, a speech recognition and synthesis unit for receiving and acknowledging commands from the surgeon, a vision system for recognizing the surgical instruments, and a robot arm to perform the physical "pick and place" operations. A typical system configuration is shown in Figure 2.1.

Figure 2.1 Typical robotic system configuration.

To keep the cost low and to limit the number of system components, the computer serves both as the system controller and as the processor for the vision system. The vision system consists of one or more cameras and the appropriate lighting. The robot arm consists of a mechanical arm, with 5 or 6 degrees of freedom, and an associated controller. The speech recognition and synthesis units may be one or two units and should be controlled by the computer.

2.2. Description of Components

2.2.1 Computer

The computer for the instrument-passing robot will have the tasks of controlling the robot, responding to the speech recognition and synthesis units, and processing the vision system data. The computer should be fast enough to process the vision system data quickly and should have sufficient memory for image data, vision system program and control program storage. The computer should also have good input/output capabilities for communicating with the robot and the speech recognition/synthesis unit, as well as good software for program development.

The computer selection for the instrument-passing robot is the IBM Personal Computer (PC) (IBM Corp., Armonk, NY), a popular microcomputer which is available in several models: the PC or PC/XT microcomputer and the PC/AT supermicrocomputer. The PC and PC/XT use the Intel 8088 8/16-bit CPU, which runs at 4.77 MHz and is capable of addressing 1 Mb of memory, while the PC/AT uses the Intel 80286 16-bit CPU, which runs at 6 MHz and is capable of addressing 16 Mb of physical memory. Both PC models have expansion slots for communicating with external devices. Other advantages of the IBM PC are extensive graphics capabilities and extensive software availability.

The costs of the IBM PC and PC/AT are listed in Table 2.1. The basic PC and the advanced PC/AT cost approximately $3,500 and $5,000 respectively. Although the PC/AT is slightly more costly than the PC, it can execute programs 4 to 6 times faster than the PC - an execution speed which is very desirable for the vision system.

2.2.2 Vision System

A vision system is used with the instrument-passing robot to identify the different instruments, and to locate the position and orientation of the instruments. The vision system also allows the instruments to be placed in any suitable position within the visual field, keeps track of which instruments are present, and can recalculate the positions of instruments in case they become displaced. Additional safety measures can be incorporated into the vision system to inactivate the robot arm in the event of someone reaching into the visual field to retrieve an instrument or to perform other unforeseen actions.
The advantage of a vision system is the unconstrained handling of the instruments. Without a vision system, the instruments must be palletized, and the exact positions and orientations of the instruments must be known; otherwise, the robot arm would not be able to grasp the instruments. As well, a robotic system without visual sensing cannot easily keep track of the instruments if more than one instrument is removed from the field. Hence, the additional cost and complexity of using a vision system can be justified by the improved flexibility, reliability and safety.

To decrease the complexity of the system, two constraints are applied to the vision system: the images are digitized to binary (1 bit) levels, and the instruments are non-touching and non-overlapping. Good binary images can be generated by using appropriately structured lighting. To prevent the instruments from touching and overlapping, a compartmentalized tray was developed to hold the instruments. The structured lighting and the instrument tray will be discussed in more detail in a later section.

Table 2.1 Cost of IBM PC.

IBM PC
  computer with 2 disk drives, 256Kb     $ 2843
  serial and parallel ports              $  200
  monochrome monitor                     $  200
  monochrome display card                $  150
                                         ------
                                         $ 3393

IBM PC/AT
  computer with 1 disk drive, 256Kb      $ 4414
  serial and parallel ports              $  200
  monochrome monitor                     $  200
  monochrome display card                $  150
                                         ------
                                         $ 4964

A major concern with the vision system is the preservation of a sterile field around the surgical instruments. Even though the vision system will not be in direct contact with the sterile instruments, there may be concern that any action may deposit dirt onto the sterile field. The appropriate solution, after consultation with surgical nursing personnel, is to clean the overhead lighting or camera equipment, to prevent dirt from entering the sterile area, prior to the beginning of the surgical procedure.

2.2.3 Speech Recognition and Synthesis System

The speech recognition and synthesis system is used by the computer to communicate with the surgeon. The speech recognition system receives the commands from the surgeon, and the speech synthesis system echoes the received commands for verification purposes. It is important to have a high-performance speech recognition system for fast and accurate response to the surgeon's request. A speech recognition system which cannot recognize or mis-recognizes commands will prolong the duration of the surgical procedure and annoy the OR staff. The only criterion for the speech synthesis system is that it produces clear, easily recognizable speech sounds.

The different types of speech recognition systems may be classified as: discrete or connected speech recognizers; and speaker-dependent or speaker-independent recognizers. The majority of the speech recognition systems available are for discrete, speaker-dependent speech. There are connected speech recognition systems available, but they are usually very expensive, e.g. the NEC DP-200, which costs over $10,000. Although some systems claim to be speaker-independent, the vocabulary for recognition may be limited only to a few distinct sounds; the general topic of speaker-independent speech recognition is still very much in the research stage (Flanagan, 1982).

Of the speech recognition systems available, the NEC SR100 seems to have the best performance for the price range, $2,000. It has specifications of over 99% recognition accuracy, even in the presence of background noise. The recognition algorithm employed by the SR100 uses the powerful dynamic programming technique to align the input speech signals to match the reference speech signals. Tests of the SR100 were performed at the Electrical Engineering Department of UBC (Frenette, 1985). The best recognition accuracy obtained was 98.5%, for subjects with little training and a noiseless background. Another factor which can affect the recognition accuracy is the vocabulary of the application.

For speech synthesis, the NEC AR100 can be used as a companion to the SR100 since both units have similar input/output protocols. The AR100 uses ADPCM techniques to store the speech patterns and costs $2,000.

2.2.4 Robot Arm

A robot arm is used to perform the instrument-passing duties of a scrub nurse. The arm should have a sufficient payload to handle all the surgical instruments with ease, have sufficient reach and accuracy to access all the surgical instruments, and have sufficient speed to perform the passing, or 'pick and place', operations quickly and reliably. Other requirements of the robot arm are that it should have a convenient power source for the operating room, proper geometry for the type of movements required, cleanable or sterilizable external components, and low cost.

The most important factor in selecting the robot arm for this application is perhaps cost. At present, there exist industrial robot arms that can satisfy all of the above requirements; however, these industrial robot arms cost between $50,000 and $100,000. The lower-cost robot arms are generally smaller, less accurate and have a lighter payload. Of all the common geometries shown in Figure 2.2, the revolute robot seems to be the most appropriate, since it can move over or around obstacles easily in a smooth path.
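Section 2.2.3 above describes the SR100 as aligning input speech against stored reference patterns with dynamic programming. A minimal dynamic time warping (DTW) sketch of that idea, using toy scalar "features" in place of real spectral frames; the command words and all names here are hypothetical, and this is not the SR100's actual algorithm:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW: minimum total cost of a monotone alignment of a and b."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Recognize a spoken command by the template with the smallest warped distance.
templates = {"scalpel": [1, 3, 5, 3, 1], "forceps": [5, 5, 1, 1, 5]}
utterance = [1, 3, 3, 5, 3, 1]   # a time-stretched rendition of "scalpel"
best = min(templates, key=lambda w: dtw_distance(utterance, templates[w]))
print(best)  # scalpel
```

The warping is what makes template matching tolerant of a surgeon speaking a command faster or slower than the stored reference.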
The convenient power source for the operating room is electric, although a 50 psi pneumatic line is available in most operating rooms. The payload of the robot arm is not a major concern in this application, since all the instruments weigh less than 150g and most robot arms have a payload greater than 500g.

At present, there are at least two educational-commercial robot arms with specifications that are suited to the instrument-passing robot. They are the $12,000 Robotic Systems International RT3 and the $14,000 Mitsubishi RM501. Both robot arms use electric power and have the revolute geometry. The RT3 has 6 degrees of freedom (DOF) and a 60 cm reach, while the RM501 has 5 DOF and a 38 cm reach. Although the RT3 has an extra DOF and a longer reach over the RM501, it is less stable, less accurate, and has a higher drift. Experiments performed by Andronic Devices Ltd. (Andronic), a company performing research into medical and surgical robotics at the Vancouver General Hospital (VGH) and UBC Health Science Center Hospital (HSCH), have shown that the reach of the RM501 can be extended slightly by adding a fixed link between the gripper and the robot wrist without decreasing the performance of the robot arm.

Since the robot arm will be working near the sterile field, sterility is an important consideration. Proper 'draping' of the robot includes covering all surfaces that are adjacent to sterile areas and covering the actual arm, since it passes over sterile areas. Sterile 'draping' techniques for the RM501 have been developed by Andronic at VGH and successfully tested in an operating room at the HSCH.

Figure 2.2 Common geometries for robot arms.

2.3. Gripper - End Effector

The gripper, or end effector, is an important part of the instrument-passing robot.
The gripper must be able to firmly grip the instrument, while stationary and in motion, and release the instrument on the command of the surgeon, either via sensors on the gripper or electrical signals from the computer and the speech recognition uniL In  addition, the gripper should be detachable from the robot arm and be steriliz.able,  and it should be safe to touch without compromising sterility,  such as puncturing  the  surgical glove.  An off  the  after  important  instrument  the  design consideration for to the  the gripper  surgeon. Ideally, the  gripper  surgeon has obtained a firm grasp; it  instrument  since it  would  cause a  delay  is the  should release  is important  as the  exchange in handing  instrument  to  the  instrument  avoid dropping  the  is replaced, as well as  damaging the instrument.  Two successful designs have been tested by Andronic at V G H . One design uses an electrically while  the  driven  other  gripper  with a membrane  design is pneumatically  driven  switch  to  activate  the  release  with a micro switch release  action  activation  (Fengler, 1984).  2.4.  Surgical Instrument Set  The  arthroscopic instruments  used to  develop  the  vision system are  listed  in  Table 2.2. They are part of the standard arthroscopy set and the Acufex set used for arthroscopic  procedure  at  the  Surgical  Day  Care  Centre  (SDCC)  at  VGH.  Other  instruments that are used in arthroscopy but not included are the arthroscope and the Dyonics  shaver  normally  attached  Since they  (Dyonics  Surgical  Ltd.,  Andover,  MA).  These  two  instruments  to a long cord and cannot be easily handled by the  are  robot arm.  are not passed back and forth between the scrub nurse and the surgeon.  19  they are usually placed within reach of the surgeon at the beginning of the surgical procedure.  
The 32 different instruments to be recognized by the knives,  knife  holder,  scissors,  needle  holder,  towel  clips,  vision system consist of trocar  sleeves,  pyramidal  trocars, blunt obturators, cartilage knives, hook probe, and Acufex scissors, punches and graspers. The Acufex scissors and punches contain pairs that differ only in the rotation direction of the cutting edge, clockwise (CW) sleeves, pyramidal  trocars and blunt  and counter-clockwise (CCW). The trocar  obturators  are  available  in  2 sizes, 5.0 mm and  3.8, mm.  In patient's trocar  diagnostic arthroscopy, the knee, after  sleeve,  which  pyramidal  procedures.  The  scissors  disposable  drape  used  instruments used  of  trocar, are on  the and  used the  cartilage blunt  at  incision in  the  knives, arthroscope and one set of  the  obturator  the  patient's  used to  end  knee.  of In  make  are the  initial  used  in  the  middle  portion  resurfacing is required,  the  of  many  extensively  procedure  surgical  to  cut  arthroscopies,  are increasingly being used to resurface the patient's  extensively  amount  only  scalpel is  in away  the  is often  the  Acufex  knee and hence are  arthroscopic procedures. If  Dyonics shaver  most  a  used instead of  large the  Acufex instruments.  2.5. Structured Lighting and Instrument Tray Structured high-quality the  lighting  binary  instruments  techniques  are  used  with  images. The structured lighting  to increase the  contrast between  the  vision  system  involves generating the  to  produce  backlighting  background and the  for  instruments.  The backlighting technique is the most common for generating binary images, although other methods exist (Schroeder, 1984). In  most backlit binary vision systems, the work  environment is usually darkened for best image contrast; however, since the lighting in  Table 2.2 List of arthroscopy instruments.  
Name                      Symbol   Width cm   Length cm   Group
Large Towel Clip          LTC      7.9        13.5        Low Res
Small Towel Clip          STC      6.2         8.8        Low Res
Trocar Sleeve 5.0mm       TS50     5.5        18.5        Low Res
Trocar Sleeve 3.8mm       TS38     5.5        19.3        Low Res
Drainage Cannula          TS28     2.8        10.7        Low Res
Cannula Trocar            PT28     1.0        11.7        Low Res
Needle Holder             NH       7.5        15.5        Low Res
Dissecting Scissors       SCI      5.5        17.0        Low Res
Tweezers                  TW       1.0        15.0        Low Res
Scalpel                   KF       1.3        14.2        Low Res
Long Knife                LK       2.0        20.3        Low Res
Long Knife Handle         KH       2.5        23.0        Low Res
Blunt Obturator 3.8mm     BLOB3    1.8        20.8        Obturator
Pyramidal Trocar 3.8mm    PYTR3    2.5        20.8        Obturator
Blunt Obturator 5.0mm     BLOB5    2.7        19.5        Obturator
Pyramidal Trocar 5.0mm    PYTR5    2.8        20.5        Obturator
Meniscotome               FCK      1.9        24.5        Cart Knife
Hook Probe                HP       1.9        23.5        Cart Knife
Small Meniscotome         SCK      2.0        23.5        Cart Knife
Medium Meniscotome        MCK      2.3        24.2        Cart Knife
Large Meniscotome         LCK      2.0        23.7        Cart Knife
Left Basket Punch         LBP      2.3        22.0        Reg Acufex
Left Serrated Scissors    LSS      2.5        22.5        Reg Acufex
Left 20 Deg Hooked Sci    L20H     2.7        21.8        Reg Acufex
Left 60 Deg Hooked Sci    L60H     2.5        21.5        Reg Acufex
Right Basket Punch        RBP      2.3        22.0        Reg Acufex
Right Serrated Scissors   RSS      2.5        22.5        Reg Acufex
Right 20 Deg Hooked Sci   R20H     2.7        21.8        Reg Acufex
Right 60 Deg Hooked Sci   R60H     2.5        21.5        Reg Acufex
Left Plain Scissors       LPS      3.3        24.5        Long Acufex
Left Grasping Forceps     LGF      3.7        25.5        Long Acufex
Right Plain Scissors      RPS      3.3        24.5        Long Acufex

the operating room could not be dimmed, the backlighting must be of sufficient intensity to provide adequate contrast. Also, it is important to have uniform backlight intensity for binary images.

For the purpose of obtaining test images for the vision system, a simple backlight unit was built (see Figure 2.3). Four 24-inch fluorescent tubes were used to provide the lighting. Several sheets of frosted mylar were used to diffuse the light to obtain even intensity, with the amount of diffusion controlled by the number of mylar sheets used.

To ensure that the instruments do not touch or overlap, as required by the vision system, an instrument tray with compartments to separate the instruments is required. The instrument tray should be transparent to allow the backlighting to pass through and should be made of a sterilizable material which does not lose its clarity with use.

Each compartment should be slightly larger than the length and width of the instrument and should be designed to be piecewise linear. Whenever possible, the same compartment should be suitable for several different instruments of the same type, allowing for different arrangements of the instruments.

2.6. System Integration

2.6.1 Operation Sequence

In designing robotic systems, it is helpful to know the operational sequence required for the process to help in defining the specifications of components. This section discusses several proposed operational sequences for the instrument-passing robot. Although implementation of the operational sequences was considered beyond the scope of the thesis, an understanding of their operations would be useful in defining

Figure 2.3 Drawing of backlight unit (front and side views).

performance specifications for the instrument-passing robot.

There are two main operational sequences involved in the instrument-passing robot: initialization and control. The initialization sequence is performed during set-up to ensure that the proper programs and data files are present and that the instruments match the data file specified.
The control sequence then takes over and operates the vision system, the speech recognition and synthesis units, and the robot arm.

In initialization, the vision system checks the compartments for the correct instruments. After checking all the instruments, the vision system displays a list of instruments according to the compartments and waits for verification by the circulating nurse. If the vision system is unable to recognize an instrument or has misclassified an instrument, the circulating nurse enters the correct instrument from the keyboard; therefore, when initialization is done, the vision system should have a correct map of the instruments in the visual field.

If time permits, the initialization program will request a voice training with the surgeon, which requires 2 to 3 minutes; otherwise, the data file of the voice of the particular surgeon is retrieved and used. A flow chart of the initialization sequence is given in Figure 2.4.

In the main control sequence, the vision system continually updates the position and orientation information on the instruments. To ensure that nothing is altered while the vision system programs are being executed, an interrupt-driven boundary-testing subroutine is executed. If non-background pixels are detected at the image boundary, the vision system generates an alarm until the boundary is clear again; otherwise, the vision system programs are continued.

Figure 2.4 Pre-operation initialization flowchart.

In the event that the vision program changes the classification of an instrument without any intrusion by the robot or an alarm due to the image boundary test, then the new classification is rejected and the old classification is retained.
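The boundary test and the classification-retention rule described above can be sketched as follows. This is an illustrative sketch only: the function and variable names are not from the thesis implementation, and the image is represented as a simple list-of-lists binary image in which 0 is background and 1 is a non-background (instrument or intruding-object) pixel.

```python
# Sketch of the image-boundary test and the classification-retention rule.
# All names here are illustrative, not from the thesis software.

def boundary_clear(image):
    """Return True if no non-background pixels touch the image boundary."""
    top, bottom = image[0], image[-1]
    if any(top) or any(bottom):
        return False
    # Check the left-most and right-most pixel of every row.
    return not any(row[0] or row[-1] for row in image)

def update_classification(old_label, new_label, robot_intruded, boundary_alarm):
    """Keep the old classification unless the change can be explained by
    robot activity or a boundary alarm (i.e., something entered the tray)."""
    if new_label != old_label and not (robot_intruded or boundary_alarm):
        return old_label   # reject the change as an artifact
    return new_label

# A 4x4 frame whose only instrument pixels lie in the interior:
frame = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
print(boundary_clear(frame))                             # True
print(update_classification("HP", "SCK", False, False))  # HP
```

The retention rule simply refuses any label change that occurred while the scene was undisturbed, which is the behaviour described in the paragraph above.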
This permits the programs to continue to operate normally even in the presence of artifacts. Assuming that the initial map of the instruments is correct at initialization, then the system will function correctly if the instruments which changed classification are not requested, since the correct classification is retained. If, however, an instrument which changed classification has been requested and returned, then the vision system may be unable to classify, or may misclassify, the instrument; as a result, the vision system would request assistance from the circulating nurse. Alternatively, since the vision system can keep a record of which instruments are currently being used, the returned instrument may be classified at a reduced confidence level, but only if the result is consistent with the accounting of all instruments. The vision program flow chart is shown in Figure 2.5.

Operation of the robot arm is only required after an input from the speech recognition unit or a signal for instrument retrieval. The speech recognition unit, on receiving a command, interrupts the processing and stores the command in a command list. The command processor then interprets and executes the commands in a sequential manner; special commands requiring immediate action may be stored in another list for urgent commands and executed quickly. When all commands have been executed, the control is returned to the vision programs. The proposed speech control flow charts for the robot arm are shown in Figure 2.6.

2.6.2 Role of Surgeon and Nurse in O.R.

In order to effectively implement the instrument-passing robot in an operating room, the interactions between the robot arm and the OR staff must be clearly defined to avoid any accidents. To find out the necessary interactions between the robot and the OR staff, a questionnaire (see Appendix 1) was circulated to experienced
Figure 2.5 Vision system flowchart.

Figure 2.6 Speech control flowchart for robot arm.

surgical nursing staff to inquire of the type of assistance available at different stages of arthroscopic procedures.

The questionnaire results indicate that the circulating nurse and scrub nurse would be available to set up the robotic system prior to the surgical procedure. The circulating nurse, when she is not required elsewhere, is available to assist the robotic system during the procedure and to shut down the robotic system at the end of the procedure. However, the surgeon is generally not available to assist the robotic system. Therefore, the circulating nurses should be trained to respond to any alarms generated by the robotic system.

2.7. Safety

Safety is one of the most important criteria in the successful implementation of a robot. Any robot should be programmed to avoid causing any harm to any human being through its actions and inactions (Thring, 1983). Safety is especially important in an instrument-passing robot since it is designed to work closely with the surgeon and to handle some potentially hazardous instruments, such as scalpels. Special attention must be given to how the robot arm picks up an instrument, what path is selected in transporting the instrument, and which end of the instrument is presented to the surgeon during handoff.

The work area of the robot and the work area of the surgeon should be structured such that they do not overlap.
Additional safety devices should be placed around the robot arm such that anyone entering the work area of the robot would inactivate the robot arm and raise an audible alarm. However, these alarms should not impede the movement of the surgical staff in the operating room, or interfere with the operation of the other clinical equipment. Hence, a careful study of the equipment layout and clinical staff work areas must be done before the instrument-passing robot may be introduced into the operating room.

2.7.1 Payback Period and Cost-Effectiveness Analysis

In order to assess the cost-effectiveness of an instrument-passing robot, it is necessary to know the cost of the total robot system, the salary of a scrub nurse, and the time savings involved. The total cost of the system, as outlined in the previous sections, and the salary of a scrub nurse are given in Table 2.3.

In industrial robotics, it is generally accepted that feasible applications for robotics should have a 1-3 year payback period, i.e. savings arising from the introduction of a robot should equal the capital cost of the robot within that period. A similar criterion might be employed in health care.

To determine the labour savings involved, consider a typical diagnostic arthroscopic procedure of 20-minute duration. Initial set-up requires 5 minutes, and clean-up requires approximately 10 minutes. Thus, the total time that a scrub nurse spends in the OR is approximately 35 minutes. By introducing an instrument-passing robot, only during the 20 minutes of the procedure itself would the robot be useful; hence the cost savings is calculated as 57% (see Appendix B). Using the system cost and the scrub nurse's salary, the payback time is approximately 1.6 years, if the robotic system could completely replace all of the functions of a scrub nurse.
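The labour-savings arithmetic above can be checked with a short calculation. The figures are those quoted in the text and in Table 2.3; the 52-week work year is an assumption introduced here for illustration.

```python
# Rough check of the 57% savings figure and the 1.6-year payback period.
# The 52-week work year is an assumption for illustration.

procedure_min = 20            # robot is useful only during the procedure
setup_min, cleanup_min = 5, 10
total_min = procedure_min + setup_min + cleanup_min

savings_fraction = procedure_min / total_min      # ~0.57, i.e. 57%

system_cost = 29_000                              # total from Table 2.3
hourly_salary = 16.34                             # average, incl. benefits
annual_salary = hourly_salary * 37.5 * 52         # 37.5-hour work week

payback_years = system_cost / (savings_fraction * annual_salary)
print(f"savings: {savings_fraction:.0%}, payback: {payback_years:.1f} years")
```

With these numbers the calculation reproduces the quoted values of 57% and roughly 1.6 years.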
Since this may not be true, the payback period should be pro-rated accordingly.

Table 2.3 Proposed system cost and scrub nurse salary.

System Cost for Instrument-Passing Robot
  Computer: IBM PC/AT                                        $  5,000
  Robot: Mitsubishi RM501                                    $ 14,000
  Speech Recognition and Synthesis System: NEC SR100/AR100   $  4,000
  Vision System
    Camera: Javelin JE2062                                   $  4,000
    Digitizer: Oculus 100                                    $  1,000
  Miscellaneous                                              $  1,000
  Total System Cost                                          $ 29,000

Labour Cost
  Salary of a Scrub Nurse (at VGH, August 1985)
    pay rate per hour                        $12.84 - $14.85
    including 18% benefits                   $15.15 - $17.52
    average total salary including benefits  $16.34
  Work week: 37.5 hours per week

CHAPTER 3

VISION SYSTEM

3.1. Imaging Components

3.1.1 Review of High-Resolution Imaging Systems

Before embarking on a discussion of high-resolution imaging systems, it is appropriate to first explain what is meant by "resolution." The resolution of an imaging system refers to the system's ability to separate the fine details of an image and is usually described by the system modulation transfer function (MTF) - the spatial frequency equivalent of the time domain frequency transfer function. It is commonly measured using a standard test chart consisting of converging black and white lines. The "limiting resolution" is defined as the resolution at which the response, as measured by the test chart, has fallen to 3% of the maximum, which corresponds to the minimum contrast resolvable by the human eye. The unit for resolution is number of lines per picture height.

It is unclear how to relate system specifications to the resolution requirements. The "limiting resolution" may not be suitable since a computer may
not be able to detect an object if the response is only 3% of the maximum. Nevertheless, the resolution figures may be used to compare vision systems.

The limitation in resolution for a particular imaging device is usually due to the image sensor. There are two general types of image sensors: camera tubes and solid state sensors. The term "camera tube" commonly refers to a vacuum tube device with a photoconductive target which can convert an image into electrical signals, an example being the vidicon. The resolution limitations of camera tubes have been discussed in detail by Cope et al. (Cope et al., 1971). The resolution of a camera tube generally depends on the construction of the tube and the photosensitive target used. Solid state image sensors consist of discrete photosensitive sites, and the resolution is determined by the number of these discrete sites. A review of image sensors has recently been published (Flory, 1985).

At present, the highest image resolution achievable is approximately 10,000 lines, using a return-beam vidicon (RBV) (Cantella, 1971). Alternatively, a high-resolution system marketed by Eikonix Inc. can scan at variable resolution up to 4096 x 5200 pixels. It uses a solid state linear imaging sensor to sense one line of the image; the sensor is then mechanically scanned perpendicular to the line image to produce a full two-dimensional image. However, the high-resolution systems noted above are generally slow and cannot be used for real-time applications (i.e. 30 frames/s).

Real-time high-resolution imaging research has intensified recently in anticipation of the introduction of the "High Definition Television" (HDTV) system. One result of this research is the development of the saticon high-resolution camera tube (Isozaki, 1978; Isozaki et al., 1981). The saticon tube uses a selenium-tellurium-arsenic photoconductor target, which has very good resolving-power characteristics.
The limiting resolution of the saticon is reported to be over 1600 lines for a 1-inch diameter target. In comparison, the limiting resolutions of 1-inch antimony trisulfide vidicons and 1-inch lead-oxide plumbicons are approximately 1100 and 900 lines, respectively. For camera tube sensors, the limiting resolution generally increases with the diameter of the tube.

The resolution of solid state image sensors depends on the number of discrete photosensitive sites. Currently, the highest density sensor available is an 800 x 800 CCD area array imager, at a cost of $20,000 (Frank, 1985). Recent improvements in solid state device fabrication techniques will enable even larger array areas to be built; several research laboratories have reported development of 1024 x 1024 pixel resolution devices (Smith, 1985; Frank, 1985).

At present, commercial high-resolution imaging technology is limited to 512 x 512 pixel resolution, since high-quality, inexpensive cameras and digitizers at this resolution are readily available. The cost of 1024 x 1024 pixel imaging systems is quite high, at $20,000 or over. However, with the anticipated introduction of HDTV in the future, high-resolution cameras, capable of 1024 x 1024 pixel resolution, will soon be available at reasonably low cost, using either saticon or solid state sensors. As well, high-speed digitizing and computing equipment is becoming available to make 1024 x 1024 imaging economically feasible in the near future.

3.1.2 Vision System Design

There are many different approaches to assembling a robotic vision system, depending on the number of cameras used and the resolution requirement. Most commercial systems use only one camera to cover the entire field of view (FOV).
The resolution of such a camera system would depend on the resolution of the camera. Special accessories, such as variable zoom lenses and pan-and-tilt units, allow the one-camera system to extend beyond the resolution limit of the camera. The single and multi-camera approaches are described in this section, and a special configuration is proposed for the instrument-passing robot.

Several one-camera system configurations are shown in Figure 3.1. The standard fixed-zoom camera system is shown in Figure 3.1(a). This system is simple but has the disadvantage that the resolution is limited by the resolution of the camera. The standard configuration can be enhanced by adding variable zoom and/or pan and tilt capabilities, Figures 3.1(b) and (c). Variable zoom allows the camera to zoom in on a
more  cost If  of either  errors can be eliminated. However, the an arm or a motorized  track, and  the camera is mounted on the robot  arm, the  payload of the arm is decreased as well.  Two-camera or multi- camera systems use the basic concepts of the one-camera system;  various approaches using two cameras  two-camera  are shown in Figure  approach is the dual-resolution approach in Figure 3.2(a). A fixed camera  is used to scan the entire F O V at low-resolution robot  3.2. A common  arm can examine  details  at high-resolution.  while a camera The primary  mounted on the  disadvantages of this  system are : decreased payload of the arm, and slightly more complicated software to deal with the dual resolution of the two cameras. Alternately,  two cameras can work  side  one camera.  image  by side  to  alignment  problem.  double  the resolution  achievable  at the common boundary  with  only  of the two cameras could be a  However, difficult  35  (b) one camera, zoom lens  (d) one camera, on robot arm  Figure 3.1 Vision systems using one camera.  one camera f i x e d , one camera on arm  two cameras, s i d e "by s i d e  Figure 3.2 Vision systems using 2 cameras.  37  For  the  vision system of  the  instrument-passing  robot,  the  camera  resolution  would be limited to 512 x 512 pixels since 1024 x 1024 pixels resolution systems were beyond  the  budget  images, i.e. view,  did  of  this project.  using 512 not  x  produce  512  As well,  pixels  sufficient  tests on  images of  details  on  simulated  one quarter  several  of  of  the  1024 the  x  1024  desired  instruments  pixel  field  for  of  reliable  recognition. From the photographs of the instruments. Figures 1.1  to 1.4, it is evident  that  Acufex  a  high-resolution  Acufex  image  graspers, and cartilage  will  be  required  knives since the  to  classify  the  only significant  scissors  differences are at  and the  tip. 
For this reason, a dual-resolution approach was chosen for the vision system.

The configuration used for this project is similar to Figure 3.2(a). Instead of mounting the high-resolution camera on the robot arm, the high-resolution camera is fixed over a special region. The tips of the instruments requiring high-resolution vision are placed in the special region in the FOV of the high-resolution camera; the other instruments are placed in the remaining area in the FOV of the low-resolution camera, as shown in Figure 3.3.

The advantage of this approach is that the payload of the arm is unaffected, such that a low-cost arm with a light but sufficient payload could be used. Also, the system software would only be slightly more complicated than in a one-camera system, since the high-resolution camera is fixed in space and cannot zoom. The limitation of this dual-resolution approach is that the details requiring high-resolution imaging must be able to fit within the FOV of the high-resolution camera. If the FOV of the high-resolution camera is insufficient, then the camera must be mounted on the robot arm to extend the area covered by the high-resolution camera.

3.2. Selection and Evaluation of Camera

3.2.1 Camera Review

Two cameras are required for the dual-resolution approach for this project.
low  Also,  tray of arthroscopic surgical instruments;  image  the  The desirable characteristics for  retention  cost of  the  the  and  lag,  low  noise,  cameras should be  stable  within  the  given budget of $10,000 for the imaging system  As  mentioned  in  the  previous section, the  two  types of  image  camera tubes and solid state image sensors. Tube-type cameras, for models, generally  for  tube-type  higher-priced  shading,  the  cameras are models.  variation  in output  a  linear  or  is impossible in  susceptible to stability  Another  photoconductive surface of the applying  the higher-priced  have superior resolution to the solid state cameras. They may also  have the option of selectable scan rates, which However,  sensors are  problem  and distortion problems, even  associated  signal due to the  solid state cameras.  with  tube-type  non-uniform  bias  to  the  output  signal  is  response across the  camera tube. Shading can usually be  parabolic  cameras  compensated by  to  counteract  the  non-uniformity. The other characteristics such as image retention, lag, and noise depend on  the  type  of  photoconductive  surface of  the  tube.  The  common photoconductive  surfaces for camera tubes are : the antimony trisulfide vidicon, newvicon, saticon, and plumbicon.  For  high  background  light  applications,  the  newvicon  and  saticon  photoconductors are the most suitable as they are less susceptible to burn-in problems due to prolonged exposure to high-intensity light sources.  
There are many advantages to solid state cameras, such as having low image retention and lag, and no shading, as well as being distortion-free, sensitive, stable, and light in weight. The solid state fabrication process ensures that there is no noticeable distortion, since the sensor pattern is etched into silicon, and ensures that there is no shading due to non-uniformity of response between pixels. Stability of solid state cameras is much better since solid state circuits use only low voltage, and solid state cameras do not have heating problems like camera tubes. The major disadvantage of solid state cameras currently is the resolution. Most solid state cameras costing about $2,000 have resolutions of less than 300 lines, which is much worse than the 600 lines or better resolution of comparably priced tube cameras.

Figure 3.3 Proposed vision system for instrument-passing robot.

3.2.2 Camera Selection

Table 3.1 lists various tube-type and solid state cameras, as well as several special industrial cameras. As there are far too many cameras available to be considered here, only the ones which are potentially useful in this project are presented.

Of the cameras listed in Table 3.1, the best resolution available in a real-time camera is the Sierra Scientific 2601 (Sierra Scientific, Mountain View, CA), which uses a 1.5-inch plumbicon sensor and is capable of digitizing to 1024 x 1024 pixel resolution; however, the cost of $26,190 precludes the use of 1024 x 1024 pixel resolution in this project. Two other cameras, the Cohu 8000 (Cohu Inc., San Diego, CA) and the Ikegami ITC-82 (Ikegami Electronics, Maywood, NJ), have variable line rates which can scan over 1000 lines and limiting resolutions of 1000 lines or more. The cost of these cameras is slightly less than $10,000.
Although the specified resolution is over 1000 lines, a demonstration of the Cohu 8000 showed a resolution of only 650 lines at a scan rate of 1125 lines. Therefore, it does not appear possible to purchase a 1024 x 1024 pixel resolution imaging system for under $10,000 at the present time.

For 512 x 512 pixel resolution systems, the most suitable tube-type camera is the Dage 68 (Dage-MTI Inc., Michigan City, IN). This camera has 800 lines

Table 3.1 Listing of commercial cameras considered for vision system.

Model      Manufacturer  Cost    Resolution (H x V)   BW    Signal          Sensor     SNR  Line Rt.  Frame Rt.  Distortion
DAV-16     Sierra Sci.   11,000  512 x 512            5.0   RS170,330,CCIR  1 inch     50   559       30         1.0% max
DAV2P-16   Sierra Sci.   15,300  512 x 512            5.0   RS170,330,CCIR  1 inch     60   559       30         1.0% max
2601       Sierra Sci.   26,190  1024 x 1024          5.0   RS170,330,CCIR  1.5 inch   66   1118      7.5        1.0%
8000       Cohu          9,500   1100 x 775           32.0  RS170,330,CCIR  1 inch     30   1125      30         1.5%
ITC-82     Ikegami       12,000  1000 x 715           30.0  RS170,330       1 inch     40   1023      30         1.0%
66         Dage          3,564   800                  10.0  RS330           1 inch     N/S  525       30         0.5%
68         Dage          5,056   800                  18.0  RS330           1 inch     52   525       30         0.5%

SOLID STATE CAMERAS
CDR 460    Videologic    2,250   280(384) x 350(491)  N/S   RS170           CCD        46   525       30         nil
AVT-01     Sony          unk     280(384) x 350(491)  N/S   RS170           CCD        43   525       30         nil
JE2062     Javelin       2,000   450(384) x 450(485)  N/S   RS170           MOS        43   525       30         nil
4TN2500A1  GE            4,050   (248) x (244)        N/S   RS170           CID        43   525       30         nil
4TN2200A1  GE            1,340   (128) x (128)        N/S   N/A             CID        48   N/A       N/A        nil
MC9256     Reticon       3,848   (256) x (256)        N/S   N/A             Ph. diode  60   N/A       N/A        nil

SPECIAL CAMERAS
850        Eikonix       unk     (4096) x (5200)*     N/S   N/A             CCD        N/S  N/A       N/A        1 pix max
EC 78/99   Eikonix       27,000  (1728) x (2048)*     N/S   N/A             Ph. diode  N/S  N/A       N/A        1 pix max
610        Datacopy      10,800  (1728) x (2846)*     N/S   N/A             CCD        N/S  N/A       N/A        1 pix max
300        Datacopy      10,800  (1720) x (2592)*     N/S   N/A             CCD        N/S  N/A       N/A        1 pix max

Notes: unk = unknown, N/A = not applicable, N/S = not specified, * = mechanically scanned in one direction, (n) = number of elements.

resolution and has excellent distortion specifications of approximately 0.5% geometric distortion and 0.25% linearity distortion. However, at approximately $5000, the Dage 68 is too expensive to be used in this project. The Dage 66, which has slightly higher distortion specifications than the Dage 68, costs $3564.

Of the solid state cameras listed in Table 3.1, the Javelin JE2062 (Javelin Elec. Inc., Torrance, CA) has the best resolution, at approximately 450 lines. This camera uses a 384 x 485 MOS image sensor and incorporates some signal processing techniques to improve the resolution (Flory, 1985). This camera was tested for resolution against a Dage model 65, which is very similar to the Dage model 66. Both cameras showed a limiting resolution of approximately 400 lines. The Javelin JE2062 has all the advantages of a solid state sensor, as well as a resolution comparable to that of a high-quality tube-type camera such as the Dage 65; accordingly, two of them were purchased, at $2000 each, for this project.

3.2.3 Camera Evaluation

The major factor in selecting the Javelin JE2062 instead of other solid state cameras is the higher specified resolution; therefore, resolution will be the major test in evaluating the camera. Other tests of the camera will be actual operational tests in the digitization of images.

There are two common methods to test the resolution of a camera. As mentioned before, one method is to use a wedge pattern of converging black and white lines (see Figure 3.4). With the test pattern properly adjusted in the FOV of the camera, the "limiting resolution" is obtained by observing the point at which the black and white lines in the wedge pattern can no longer be distinguished. Normally, two sets of the wedge pattern, oriented horizontally and vertically, are used to determine the horizontal and vertical resolution of the camera. This test method is
As  converging black and  With the test pattern properly adjusted in the F O V of  resolution" the  is obtained  wedge  pattern  by  observing the  cannot  be  point  distinguished  when any  the  more.  Normally, two sets of the wedge pattern, oriented horizontally and vertically, are used to determine the horizontal and vertical resolution of the camera. This test method is  43 simple to perform but gives only the approximate limiting resolution of the camera. A more detailed resolution test method involves using the RCA P200 test chart (Neuhauser,  1979)  (see Figure 3.5).  The P200 test chart consists of blocks of slanted  parallel lines at different resolution; the angle of the parallel lines depends on the line resolution of the particular block. The advantage of the P200 test chart is that by decreasing the slant of the parallel lines with increasing resolution, the frequency of black/white transitions during a horizontal line scan can be kept constant at 1.45  MHz. This results in amplitude  measurements that are independent of the frequency response  of the camera's video  amplifier; also, this allows the use of a low bandwidth video amplifier which results in significantly less noise  in the  video signal. Using an oscilloscope,  the camera's  response, measured horizontally in a scan line, for each block is recorded. A plot of the response versus the line resolution yields the MTF of the camera, which is much more useful in comparing different cameras. The  test setup required  for the  P200 test chart is shown in Figure  3.6.  Amplitude response measurements were taken for line resolutions up to 500 TV lines (TVL). The MTF of the Javelin JE2062 is shown in Figure 3.7. A Dage 65 vidicon camera was also tested and the MTF was plotted with the Javelin JE2062. Comparing the MTFs of both cameras, it can be seen that the Dage 65 has a higher response than the JE2062 up to approximately 300 TVL. 
Beyond 300 TVL, the responses of the two cameras are approximately the same. Thus, the Javelin JE2062 has a lower response than the Dage 65 - a mid-to-high priced vidicon camera.

Although the Javelin JE2062 did not perform as well in the resolution test, it had advantages over the Dage 65 in stability and burn resistance. In actual digitization sessions performed at different times, the Dage 65 had heat problems

Figure 3.4 Resolution test pattern with converging lines.

Figure 3.5 RCA P200 test pattern.

Figure 3.6 Test set-up for RCA P200 test.

Figure 3.7 MTF of Javelin and Dage.

which caused spurious signals to appear on the screen and affected the aspect ratio; also, after prolonged exposure to the backlighting unit, shadows of the previous image could be seen on the current image as a result of sensor burn. Even though other photoconductors are available for camera tubes which are more resistant to sensor burn, exposure to bright light sources can still decrease the life of the tube; on the other hand, the Javelin JE2062 solid state camera is not plagued by heat or sensor burn problems.

In a subjective test, images taken with the Javelin JE2062 contained the same information when compared to images taken with the Dage 65. In addition, features extracted from images taken with the two cameras did not show any significant differences. Hence, the Javelin JE2062 solid state camera is suitable for this application.

3.3. Selection and Evaluation of Image Digitizer

3.3.1 Image Digitizer Review

An image digitizer converts an input video signal into a digital signal by passing the video signal through an analog to digital (A/D) converter and storing the digital output in memory.
Image digitizers vary according to the image resolution, the A/D converter resolution, the speed of digitization, and other minor differences.

The common resolutions for digitizers are 512 x 512 pixels and 256 x 256 pixels. Although 512 lines, or 256 in the latter case, are processed by the digitizer, only 480 lines, or 240, correspond to the number of active lines in most cameras, and therefore contain actual image information.

The A/D converter in the digitizer determines the number of gray levels in the digitized image. Common gray scale digitizers have resolutions of 4 to 8 bits. For a binary digitizer, a voltage comparator is used instead of an A/D converter, and the binary voltage threshold can usually be set to one of 256 levels.

The speed of digitization usually depends on the speed of the A/D converter used. Real-time image digitizers with fast A/D converters can digitize a complete frame of the video signal in 1/30 sec; slow scan image digitizers may take several frames to digitize one image. The advantage of real-time digitizers is the reduced noise due to motion artifact, since the integration time of the image is just one frame, 1/30 sec.

Other important considerations for an image digitizer are how to access the image data stored in the digitizer's memory and the availability of look-up tables. The simplest method to access the image data is to have the image, or portions of the image, mapped into the memory addressing space of the computer such that accessing a pixel is done by a simple memory read or write. Other access methods use the I/O ports and may involve forming a pixel pipeline at the digitizer board to speed up the data transfer process. Look-up tables (LUT) are extremely useful in performing gray-scale image transformations, such as binarization, reverse video, and histogram equalization.
LUTs may be located at the video input stage, which causes the transformed image to be stored, or at the video output stage, which leaves the digitized image intact while displaying the transformed image.

3.3.2 Digitizer Selection

A digitizer capable of producing binary images is required for the imaging system. The digitizer should have 512 x 512 pixels resolution, have at least two camera inputs, and should capture the images in real-time. Table 3.2 lists most of the image digitizers available for the IBM PC. As with the camera selection, the total cost of the imaging system should not exceed $10,000.

Table 3.2 Listing of commercial image digitizers considered for vision system.

Model          Manufac.      Price  Img Res  Dig Res  Disp Res  Fr Rate  In Sig  In LUT  Out LUT  Computer   Img Access
PCVision-1     Img. Tech.    4043   512x512  6        8         30       RS170   0       4        PC,XT,AT   Memory
PCVision-2     Img. Tech.    4043   512x512  8        8         30       RS170   opt.    4        PC,XT,AT   Memory
IVG-128        Datacube      4050   384x485  6 or 8   8         30       RS170   2       6        PC,XT      I/O
Silicon Video  Epix          3368   752x480  8        8         30       RS170   0       opt.     PC,XT      Memory
Oculus 200     Coreco        2850   512x512  7        8         30       RS170   2       0        PC,XT      I/O
Oculus 100     Coreco        985    512x512  1        N/A       30       RS170   N/A     N/A      PC,XT,AT   I/O
Oculus 150     Coreco        unk    512x512  1        1         30       RS170   N/A     N/A      PC,XT,AT   I/O
DT2801         Data Trans.   unk    256x256  6        8         30       RS170   8       4        PC,XT      Memory
PC-Eye         Chorus        668    640x512  6        N/A       10 max   RS170   N/S     N/S      PC,XT      N/S

notes: unk = unknown, N/A = not applicable, N/S = not specified, opt = optional

Both gray-scale and binary digitizers are capable of producing binary images, except gray-scale digitizers use 8 bits per pixel while binary digitizers use 1 bit per pixel.
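The storage implication of that difference can be made concrete with a little arithmetic (an illustration added here, not a calculation from the thesis):

```python
# Memory needed to store one digitized frame at gray-scale (8 bits/pixel)
# versus binary (1 bit/pixel) resolution.

def frame_bytes(width, height, bits_per_pixel):
    """Bytes needed for one frame, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8

print(frame_bytes(512, 512, 8))  # 262144 bytes (256 KB) gray-scale
print(frame_bytes(512, 512, 1))  # 32768 bytes (32 KB) binary
```

The eight-fold saving matters on a machine with the IBM PC's limited address space.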
There is no need to use 8 bits per pixel unless extensive mathematical computations are performed on each pixel; hence, the less expensive binary digitizers are more appropriate.

The Coreco Oculus 100 and Oculus 150 (Coreco Inc., Longueuil, P.Q.) are binary digitizers with very similar features. Both digitizers have 512 x 512 resolution, and the binary threshold can be set at one of 256 levels. Either is suitable for the dual resolution approach as up to 4 cameras can be used simultaneously. Additionally, the Oculus 150 digitizer has a video output for displaying binary images and some hardwired OR, AND, XOR and NOT logic for comparing images, which are not available with the Oculus 100. The output-for-display feature is particularly useful for viewing images since the graphic capability of the IBM PC is limited. However, as the Oculus 150 was not yet available at the time of selection, the Oculus 100 was chosen. The cost of the Oculus 100 is $985, which is well within the overall budget for the imaging system.

3.3.3 Digitizer Evaluation

The Oculus 100 image digitizer was evaluated on its ability to produce a clean, usable image for subsequent feature extraction. As well, it was compared to a $9000 Imaging Technology IP-512 (Imaging Technology Inc., Woburn, MA) image digitizer at the Electrical Engineering Department at UBC.

In the evaluation setup, the digitizers were tested on a low-resolution image only, since this is the worst case with the smallest details; the set-up is shown in Figure 3.8. The IP-512 was first used to obtain a binary low-resolution image, see Figure 3.9. The Oculus 100 was then tested and compared to the IP-512 digitized image. It should be noted that the comparison is not exact, due to limitations in the software for the Oculus 100 and the hardware of the IBM PC. The IP-512 can display and print a full 512 x 512 pixels image, using programs developed by Mr. J.
Clark and K. Chan, whereas the Oculus 100 can only display and print a 512 x 200 pixels image of one field, using vendor supplied programs.

In evaluating the Oculus 100, it was discovered that the digitized field of the Oculus 100 is slightly smaller than that of the IP-512, by approximately 15% in width. This causes the aspect ratio of the image to change as well, to approximately 1.03.

After adjusting the digitized field to cover the entire width of the tray, 512 x 200 pixels images were displayed on the IBM PC's screen. Significant noise problems were seen in the digitized images and could not be eliminated by adjusting the binary threshold or the aperture of the lens. These problems were probably due to the video amplification or conditioning circuitry in the Oculus 100, since the IP-512 did not show excessive noise. The noise problems were finally overcome by increasing the backlight diffusion in certain parts of the tray. This indicates that perhaps the DC or gain level may be the cause of the noise problem. The final 512 x 200 pixels Oculus 100 image is shown in Figure 3.10.

From Figure 3.10, it can be seen that the Oculus 100 can produce a sufficiently clean image for feature extraction. However, the object edges in the image were quite coarse as compared to the edges in the IP-512 image; the coarse edges may be partially due to the 200 line resolution printed. The tests performed in this evaluation show that the Oculus 100 is capable of producing clean images for feature extraction, but it has drawbacks in its video circuitry and its smaller digitized field.

Figure 3.8 Digitizer evaluation set-up (Javelin JE2062 camera viewing the backlit instruments; IP-512 hosted in a DEC PDP 11/23, Oculus 100 hosted in an IBM PC).

Figure 3.9 Image digitized by IP-512.

Figure 3.10 Image digitized by Oculus 100.

CHAPTER 4

Recognition Algorithm

4.1.
General Recognition Problem

The task of recognizing objects, or more generally patterns, in a visual field comes under the category of image pattern recognition. The recognition task can be divided into several subtasks as shown in Figure 4.1.

Figure 4.1 Pattern recognition system block diagram (sensor or transducer, preprocessor (optional), feature extractor, classifier, descriptive results).

In general, an image pattern recognition system obtains an image as input and produces descriptive results about the image. The subtasks involved are: sensing, where the image is obtained from the sensor and converted to a format suitable for processing; preprocessing, where an input image is modified to enhance certain characteristics of the image; feature extraction, where qualitative and quantitative measurements are obtained from the enhanced image; and classification, where the image features are used to obtain the results of the recognition system.

The sensor may produce a 2 dimensional (2D) or 3 dimensional (3D) representation of the input scene. Two dimensional sensors are more common since the system complexity, both in hardware and software, is much less than for 3D systems, and most objects are relatively flat such that a 2D planar representation is sufficient. Only when depth information becomes significant are 3D vision systems used.

Image preprocessing usually performs one or more transformations on the image to facilitate the feature extraction task. For 2D images, the transformations may be geometric, such as rotations and translations, and/or scalar, such as intensity modification. Generally, scalar transformations are extremely easy to perform using hardware look-up tables and are often very useful in highlighting desired features.
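The look-up-table mechanism can be sketched in software. The following is a minimal illustration (assuming 8-bit gray levels; it is not the digitizer's actual hardware table) of how a 256-entry LUT implements binarization and reverse video with one table access per pixel:

```python
# Minimal sketch of 256-entry look-up tables for scalar transforms.

def binarize_lut(threshold):
    """LUT mapping gray levels at or above the threshold to 1, the rest to 0."""
    return [1 if g >= threshold else 0 for g in range(256)]

def reverse_video_lut():
    """LUT producing the photographic negative of an 8-bit input."""
    return [255 - g for g in range(256)]

def apply_lut(pixels, lut):
    """Transform a run of pixels with a single table look-up each."""
    return [lut[p] for p in pixels]

row = [12, 200, 90, 255]
print(apply_lut(row, binarize_lut(128)))    # [0, 1, 0, 1]
print(apply_lut(row, reverse_video_lut()))  # [243, 55, 165, 0]
```

In hardware the same tables sit in fast RAM between the A/D converter and the frame memory (or the display), which is why these transforms cost nothing at digitization time.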
The most common scalar transform is binary thresholding, which was used in most first generation computer vision systems. Although some new systems use gray scale image pattern recognition techniques, binary vision systems are still very much in use. Some of the differences between binary and gray scale vision have been discussed by Kelley (Kelley, 1983). The main advantage of gray scale vision is the additional intensity information in the image, which permits the use of powerful pattern recognition algorithms to solve complicated scenes, such as touching or overlapping parts. However, if the objects to be recognized can be adequately represented in binary, and if suitable lighting conditions exist, such as backlighting in this application, then binary vision is preferable since it is much faster, less complicated and less expensive than gray scale vision.

Image feature extraction and classification are subjects of intense research in both pattern recognition and robotic vision. Feature extraction is perhaps the most difficult problem in image pattern recognition. Once the features have been selected, the classification scheme is usually limited to a few choices. However, the following quote from Levine's survey (Levine, 1969) summarizes the underlying difficulties of feature extraction: "No general theory exists to allow us to choose what features are most
features Also,  as  which  are  developed  stated  above,  the  for  only  specific patterns method  of  evaluating  specific classes a  feature  is  of by  experimentation; hence the feature selection process may be very time consuming.  Important reliability, satisfy  criteria  consistency and  as many, if  criteria  in  not  feature  selection are  resolution of all, of  the  the  system constraints, noise sensitivity,  images.  criteria.  Features  should be  System constraints and  noise sensitivity  features to a small set  There is, in  general, no way of determining the resolution necessary for a particular  feature other  than  may sometimes limit the choice of the  selected to  by  experimentation;  studies  have  been  performed  to  determine  the  resolution for recognizing military targets (RCA, 1974; Hafemeister et al., 1985)  minimum but the  results only pertain to human observers.  4.2. Description of Instruments and Layout In the  20  the dual resolution approach, the 36 instruments are divided into 2 groups, high-resolution  instruments  and  the  16  low-resolution  instruments.  The  high-resolution instruments are the instruments which are in the high-resolution image as well as the low-resolution image; the low-resolution instruments are the instruments which  can be  seen in  the  low-resolution  image  these instruments will be discussed in this section.  only.  The  layout  and grouping of  59  The area on the tray with the 20 high-resolution  area.  pattern with their details used to within  the  field  instruments  of  the  view  instruments  instruments, the  individual  (FOV) are  of  this  area  graspers, scissors,  and  instrument the  further  obturators and the  obturators refer  in  are  arranged  in  a semi-circular  tips pointing towards the centre. 
The area on the tray with the 20 high-resolution instruments is designated the high-resolution area. The instruments of this area are arranged in a semi-circular pattern with their tips pointing towards the centre. The tips, which contain the fine details used to separate the individual instruments, are positioned such that they are within the field of view (FOV) of the high-resolution camera. The high-resolution instruments are further divided into three groups: the Acufex instruments, the obturators and the cartilage knives. The Acufex instruments refer to the graspers, scissors, and punches manufactured by Acufex Microsurgical Inc.; the obturators refer to the obturators and the trocars; the cartilage knives refer to the 4 cartilage knives and the hooked probe.

In arranging the high-resolution instruments, the compartments are designed and positioned so as to achieve the highest resolution possible, and to minimize the possible confusion between groups while retaining the flexibility that any instrument of the same group can fit in any of the compartments for that group. In doing so, the Acufex instruments are further subdivided into 3 long and 8 regular instruments. Three extra long compartments are reserved for the long Acufex instruments so as to minimize the area used by the other compartments.

There are no special grouping requirements for the low-resolution instruments. The only constraints for the low-resolution area are that it should be as small as possible, to achieve the highest resolution for the low-resolution image, and that the instruments be arranged so that they are all within reach of a 5-degree-of-freedom robotic arm. The dimensions of the tray should also be similar to the aspect ratio of the camera to make full use of the FOV. The final layout of the instruments is shown in Figure 4.2.

Figure 4.2 Layout of proposed instrument tray.

4.3. Survey of Existing Algorithms

4.3.1 Overview

Algorithms for image pattern recognition generally consist of 2 parts, a feature extractor and a classifier. There are many ways to extract features from an input pattern, and the choice of the features greatly influences the type of classifier used.
As stated in a previous section, there is no general theory for feature selection, and appropriate selection of features can simplify the recognition problem considerably.

In binary vision, the common features for pattern recognition are geometric and topologic properties such as area, length, width, perimeter, compactness, number of holes, hole area, minimum and maximum radii, and invariant moments. Another useful technique is matched filtering. These features obtain information using the whole input pattern and are referred to as global features.

For input patterns which the global features have difficulty discriminating, the boundary is often a source for generating more detailed features; hence, these new features are called boundary-oriented features. For most binary vision applications, the boundary is sufficient to completely describe the objects to be recognized. Thus, boundary-oriented feature techniques are very powerful for binary-image pattern recognition.

The features generated by the feature extractor are used by the classifier to make a decision. There are generally two approaches to classification: the decision-theoretic method and the structural or syntactic method. Decision-theoretic classifiers apply some form of discriminant function to the input features and are extremely easy to use. However, the input features to a decision-theoretic classifier must be organized in the form of a feature vector.

In some cases, due to the type of feature selected, it may not be possible to organize the features into a feature vector. For such features, the structural or syntactic classifiers may be more suitable. These classifiers use powerful graph-matching techniques to find the amount of similarity between the input pattern and the references.
The disadvantages of the structural or syntactic classifiers are that they are fairly complicated and are somewhat difficult to use.

Most of the techniques mentioned have been employed in commercial vision systems. In the early vision systems, which were mostly binary, global features were used with decision-theoretic classifiers to recognize industrial objects (Gleason & Wilson, 1981; Carlisle et al., 1981). More recently, with the advances in computer hardware and software, many systems have been developed which use boundary-oriented features and a structural classifier. These systems have been successful in recognizing touching and overlapping objects (Rummel and Beutel, 1984).

In the following sections, the criteria for selecting pattern recognition algorithms will be given. More details will be presented on different global and boundary oriented features, and on different classifiers.

4.3.2 Criteria for Recognition Algorithm

The first step in designing a vision system is to examine the conditions of the working environment. The application constraints must be well understood before algorithm development can begin. This section outlines the conditions of the application environment and the constraints of the system, and discusses the desirable features for the recognition algorithm.

One of the most important considerations in designing a recognition algorithm is the sensitivity to noise. In the operating room environment, the noise sources which may cause difficulties for the vision system are the overhead lighting, small tissue fragments on the instruments, and blood or other opaque fluid on the instrument tray. Other sources of noise unrelated to the operating-room environment are noise due to
For overhead lighting, the  noises are  the middle of the instrument boundary.  Therefore,  the  usually bright  reflections along the  boundary  or  in  Similarly, tissue fragments may lead to distortions in the  features  should not  depend on  the  absolute  shape  of  the  left  on  the  boundary or use the interior points in an input object  For  the  problem  of  opaque  material,  such  as  blood,  being  instrument tray, there may not exist a software solution since the blood may hide an essentia] portion of the instrument from the camera, making recognition imposssible.  4.3.3  Feature Extraction Algorithms  This section presents the details of various feature extraction algorithms, divided into subsections of global features and boundary-oriented  features. The advantages and  disadvantages of each feature will be discussed in the context of this project impossible  to  discuss  the  myriad  of  feature-extraction  developed over the years, only those that can potentially be presented. Complete  discussions on the  algorithms  that  As it is  have  been  be useful in this project will  topic of feature  extraction  may  be  found  elsewhere, e.g. (Levine, 1969; Rosenfeld, 1981; Gonzalez and Safabachsh, 1982).  4.3.4  Global Features  Global features for binary vision systems are generally  fairly  easy to compute.  The most common of the global features are geometric and topologic properties. The geometric  features  are  area,  perimeter,  maximum/minimum  radii,  compactness,  elongatedness ... etc. The topological properties are the number of holes, branches...etc in  the  object  Other  global feature which are  used often  are  centroidal  or  invariant  64 moments, and matched filtering.  The geometric features can be computed easily by first tracing the boundary of the  object.  boundary  The  tracing  which  may  process generates  be  represented  by  a  series  the  of  direction  Freeman  chain  changes  code  along  (Freeman,  the  1961).  
Starting from a given pixel and tracing the boundary in either the clockwise (CW) or counter-clockwise (CCW) to  the  direction, the next pixel may be in any one of 8 neighbours  given pixel. The  direction  vectors  for  the  8 neighbours  to  a pixel  the  area,  'P' are  shown below, 3 4 5 2 P 6 1 0 7 Given  the  chain  centroid,  and  methods  have  information  code of  centroidal been  from  an  object.  moments  proposed  a raster  Freeman  can  for  be  calculated  obtaining  scan, or more  has shown  area  that  easily  and  (Freeman,  perimeter  1961).  faster,  accurately, based on the  advantage  translation invariant  of  area,  perimeter  Area and perimeter  and  centroidal  moments  are also rotation invariant  invariance for the centroidal moments, a series of mathematical performed  to  these features reflections  obtain  invariant  is that they  causing  Nevertheless, in  a  (Wong  and Hall,  is  of  planned  the  binary  environment  object these  based  on  1984)  that  they  are  To achieve rotation  transformations may be  1978). The  disadvantage  are sensitive to noise which corrupts the  portions well  moments  Other  different sampling  grids used (Grant and Reid, 1981; Agrawala and Kulkarni, 1977; Capson,  The  perimeter,  to  become  features  have  object  of  such as  the  background.  been  successful in  recognizing many industrial objects.  From  the  area  compactness and the  and  perimeter,  eccentricity  or  other  geometric  properties,  such  elongation can be computed (Ballard  as  the  and Brown,  65 1982)  . The compactness C is defined as  C  =  -£  ...(4.1)  where P is the perimeter and A is the area of the object. The eccentricity is defined as the ratio of the principal axes of inertia. These features, although simplistic, provide additional information on the object to be recognized.  Topological properties such as the number of holes, branches in the object are often useful features. 
Associated parameters such as the hole area, perimeter, moments and position with respect to the object centroid are also very valuable as features. A method to obtain the number and type ' O f  straight-forward  branches in the object is  thinning. Thinning shrinks the body of an object symmetrically by sequentially deleting pixels from the object boundary, leaving only a thin single pixel wide skeleton. objects  The  problem  differing  only  with thinning in  a  scalar  is that it  factor  is not  can produce  connected line  a unique  representation;  the  skeleton.  same  As  two well,  thinning require connectivity to be preserved, which may not be possible in the noisy low-resolution images.  Matched  filtering,  or  template  matching,  is  a  straightforward  method  of  comparing an unknown image to a reference image. However, in its simplest form, it is not scale or rotation templates prior  invariant.  Although transformations can be used to align the  to matching, it would be computationally intensive unless implemented  in hardware. Another problem with matched filtering is that it is only good if gross differences are present and is unreliable for small details.  4.3.5  Boundary Oriented Features  The first step to extract features from the boundary of an object is to obtain a representation  of the  boundary. One method, the  Freeman chain code, has already  66 been  presented  (Persoon  in  the  last  and Fu, 1977),  section.  Walsh  Other  descriptors  representations (Sarvarayudu  numbers (Bribiesca and Guzman, 1980) and  are  Fourier  and Sethi,  curve (Ballard  descriptors  1983),  shape  and Brown, 1982). A  survey of the above and other boundary oriented techniques may be found in Pavlidis' paper (Pavlidis, 1980).  
To calculate the Fourier descriptors, the boundary pixels are rewritten as a complex function as follows:

    b(t) = b_x(t) + j b_y(t)    ...(4.2)

where b(t) is the boundary contour and b_x(t) and b_y(t) are the pixel x, y positions along the contour. The Fourier descriptors T_n are given by

    T_n = (1/L) ∫₀^L b(t) e^(-j(2πn/L)t) dt    ...(4.3)

where L is the length of the contour. Alternatively, the Fourier descriptors may be computed from the ψ-s curve representation, to be discussed later. The advantages of Fourier descriptors are that the theory and computation of Fourier transforms are well developed, that they are information preserving, and that they are invariant to rotation and translation. The disadvantages of Fourier descriptors are that they require a large number of coefficients for a good boundary representation, they have difficulties discriminating opposite symmetries (Pavlidis, 1977), and they are sensitive to noises which corrupt portions of the boundary.

Walsh descriptors are similar to Fourier descriptors except that the Walsh transform is used instead of the Fourier transform. The advantage of Walsh descriptors is that they require less computation than Fourier descriptors. However, the Walsh descriptors have the same disadvantages as the Fourier descriptors, in addition to being sensitive to the starting point where boundary tracing begins.

The shape number of an object is a sequence of numbers representing the different types of lines or corners which make up the object boundary. The algorithm begins by finding the principal axes of the object and assigns a rectangular grid bounding the object. The resolution of the grid is allowed to vary until the number of boundary pixels is equal to a number specified by the user - this number is the order of the shape number. The numbers in the shape number describing the boundary are assigned as follows: a convex corner is 1, a straight line is 2, and a concave corner is 3.
After the object boundary has been labelled, the sequence of numbers is circularly shifted until the sequence represents the smallest number - the shape number. This method has a sound theory and seems reasonably simple. However, it requires the ability to assign the sampling direction and the sampling resolution, which makes it impractical.

The ψ-s curve is a one dimensional continuous curvature representation of the boundary. ψ is defined as the tangent angle to the boundary at the point s, the arc length along the boundary. In the discrete case where ψ is the angle between adjacent points, the ψ-s curve becomes the angular equivalent of the chain code. For angles measured several pixels apart along s, the ψ-s curve provides a smoother representation of the curvature than the chain code.

Having presented various methods to represent object boundaries, the next step is to extract features from them. For the Fourier and Walsh descriptors and the shape number, further processing is unnecessary as they already form a complete feature set. The chain code and the ψ-s curve, however, contain local features which can be extracted. Duda and Hart (Duda and Hart, 1977) have suggested that points of high curvature along a boundary can be used to describe the boundary. This is echoed by Pavlidis (Pavlidis, 1977), who stated that curvature maxima and corners are important in shape perception, as discovered in the early theories of vision. Also, following the work of Attneave, Bennett and MacDonald (Bennett and MacDonald, 1975) stated that "the
Several methods have  been proposed to calculate or approximate  the curvature  K(s), which is defined as  ...(4.4)  where s is the arc length along the boundary and 9 is the tangent Let  a forward  forward  vector  be a vector  from  the current  pixels  angle along s.  to the pixel  n places  along the arc length, and the backward pixel as n places back. Geisler used  the angle difference between  a forward vector and a backward vector to the pixel of  interest, with each vector being at least 10 pixel distances in length (Geisler, 1982) .  Dessimoz defined curvature K(s,) as  K(S ) k  = f*k ~ <W ' « k i - W 3 s  =  7  ...(4.5)  +  which is consistent with the previous definition of K(s) (Dessimoz, 1978). However, his angle measurements seemed somewhat inconsistent and led to a different angle change from the mathematical  definition.  Rosenfeld and Weszka proposed a method of angle detection by computing the k-cosine,  the angle  between  the forward  and backward  vector,  for several  vector  lengths (Rosenfeld and Weszka, 1975). The k-cosine is defined as  ...(4.6)  where S, and t,  are the forward and backward vectors of length k. The angle of the  69  particular point is chosen to be the minimum of the angles calculated.  So far, the curvature measurements Hung  and  between  pixel  Kasvand to  characters  (Hung  calculates the  or angle techniques presented have been based on angle  and  difference  points.  select the Kasvand,  An  ad  significant 1983).  between  hoc  technique  corners from  Hung's  method  have  been  developed by  a  binary  line  uses  the  chain  in Chinese code,  and  successive chain codes and the sum of any non-zero  pairs of these differences. Based on a set of seven rules, corner pixels are labelled as critical points. The advantage  of this method  is speed, since it  does not require  any  intensive calculations.  4.3.6  Classification Algorithms  The features  choice  chosen.  structural.  
of  Two  classifiers common  Decision-theoretic  based on some partitions qualitative  in  a  pattern  techniques  for  recognition classifiers  classifiers use quantitative  in  the  system are  on  the  decision-theoretic  features  feature space. Structural,  depends  and  or  make  and  a decision  syntactic, classifiers use  and quantitative features and make a decision based on a hierachial process.  This section discusses the advantages and disadvantages of various methods using these two techniques.  There  are  decision-theoretic discriminant  elsewhere  (Fu,  decision-theoretic  perform  The  functions, minimum aproach  an  approaches  classifiers.  parametric  given  many  is 1982;  the  appropriate  distance  Duda  feature  unsupervised training.  partitioning  nonparametxic  Bayes  classifiers are  to  classifier  Classifier. and  for  the  The primary  and  These  Hart,  that they set  approaches  fast,  reference  are  nearest  feature linear  The  space  or  neighbour  classifiers are  1977).  are  the  polynomial classifier; a  discussed  primary  disadvantage  it  may  in  detail  advantages  simple to use and train. objects,  in  be  In  of fact,  possible  to  of this classification approach  70  is that it lacks flexibility  in using features since the input feature set must be in the  form of a feature vector.  Structural  classifiers, on the other  hand, are extremely  features. Most structural classifiers implement to assign a class to the input object  some form  flexible  in dealing with  of graph or string matching  As there is no restriction on the data type at  the nodes of the graph, any characterizable properties of the object may be used as features.  Brief  discussions  of  various  graph  matching  techniques  may  be  found  elsewhere (Ballard and Brown, 1982). The matching process in a structural classifier can be controlled by the program designer, who can specify the degree of match desired for  the  match  application. 
Thus, it is possible to obtain a result based on only a partial match between the references and the input object. This powerful feature of the structural classifier makes it suitable for classifying overlapping or touching objects, where a complete match is not usually possible. The drawbacks of a structural classifier are complicated programming, slow execution and, possibly, complicated training procedures.

4.4. Development of Recognition Algorithms

The recognition of arthroscopic instruments has been divided into two subproblems, that of recognizing a subset of the instruments at a lower resolution, over the entire instrument tray, and that of recognizing the remaining instruments at a higher resolution. The algorithms developed for the low-resolution image discriminate the low-resolution instruments and also obtain any gross features for the high-resolution instruments to assist the algorithms for the high-resolution images. The algorithms for the high-resolution image concentrate on the fine details of the instruments under the high-resolution view.

This section presents the algorithms developed for the low-resolution and high-resolution images. The results of a survey leading to the formulation of a set of clinical performance requirements are also discussed.

4.4.1 Low-Resolution Algorithm

The low-resolution image is a 512 x 476 pixel image of all 36 instruments to be recognized. The overall field of view (FOV) of the camera is 68 cm x 52.6 cm, which is slightly larger than the 68 cm x 50 cm tray.
To achieve the maximum resolution in this large FOV, the width of the tray is aligned with the picture width; the picture height is allowed to extend beyond the tray area by a small amount. A typical low-resolution image with all 36 instruments is shown in Figure 4.3.

Figure 4.3 Typical low resolution image.

In digitizing the low-resolution image, the binary threshold was initially set to a low value, which caused a large portion of the image to be filled with noise. As the threshold was increased, the noise decreased until it could only be seen in the corners of the image. If the threshold was increased further, the noise would completely disappear from the screen; however, the increased threshold would also cause portions of the instruments to disappear, creating gaps in the instruments. The threshold level chosen for the low-resolution image was the level at which the noise was relatively removed from the instrument compartments. For consistency in the features, the binary threshold of the test and training data was determined from the first image and kept constant throughout the digitizing session.

After the binary digitization process, each image was subjected to feature extraction. At this stage, all background noise had been eliminated from the instrument compartments; however, numerous small gaps could be seen in the image. These small gaps occurred at places where the instruments were extremely narrow, such as the tips of the large and small towel clips, and the tip of the drainage cannula. Gaps were also found along the finger holes of the small towel clip. These gaps were due to the backlighting reflecting off the finger holes and onto the camera, for the instruments near the centre of the tray; this effect could be observed in the thickness of the finger holes, which increases with the distance from the tray centre.
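The threshold selection described above can be sketched as follows. This is only an illustrative reading of the procedure: the image is assumed to be a list of intensity rows, object pixels are assumed to come out above the threshold, and the "empty" regions stand in for compartment areas that should contain no object pixels.

```python
def binarize(image, threshold):
    """Binary digitization: a pixel becomes object (1) when its intensity
    exceeds the threshold (assumed polarity for this sketch)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]

def noise_count(binary, empty_regions):
    """Object pixels inside regions known to be empty are noise."""
    return sum(binary[r][c] for (r, c) in empty_regions)

def choose_threshold(image, empty_regions, lo=0, hi=255):
    """Return the lowest threshold at which the empty regions are free of
    noise, so that gaps in the instruments are kept to a minimum."""
    for t in range(lo, hi + 1):
        if noise_count(binarize(image, t), empty_regions) == 0:
            return t
    return hi

# Tiny example: intensities 10 and 15 sit in regions that should be empty.
image = [[10, 200], [15, 180]]
print(choose_threshold(image, [(0, 0), (1, 0)]))  # -> 15
```

In the thesis the level was chosen manually from the first image of a session; the loop above simply automates the same raise-until-clean criterion.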
From the low-resolution image, Figure 4.3, it can be seen that there is very little information present for discriminating the individual high-resolution instruments. Therefore, it was considered more appropriate to recognize these instruments under high-resolution. The remaining low-resolution instruments were found to exhibit various gross differences which could be characterized by simple global features, provided that such features are insensitive to the small gaps that are present in some instruments.

Of the global features reviewed in a previous section, the methods that are insensitive to small gaps are length, width and area. Since the presence of small gaps changes the topology of the objects, topological properties cannot be used as features. As well, perimeter cannot be used, as errors would occur around the gaps. Maximum/minimum radii and centroidal moments would not be effective in this case since the gaps may cause portions of the instrument to become disconnected, creating difficulties when calculating the centroid.

To increase execution speed, only the length and width were chosen as features for the low-resolution instruments. Although these features are not rotation invariant, this does not affect this application as the instruments are limited in their rotation by the compartment boundaries. The length and width are obtained by scanning each compartment line by line, for background-to-object transitions, or vice versa. The first and last scanned object lines, and the maximum width, are recorded in the scanning process. From the first and last scanned lines, the extremities of the instrument are located to obtain the length of the instrument. The width is simply given by the maximum width of the scanned lines. The speed of the
algorithm can be increased by scanning every few lines instead of every line. By backtracking and forward tracking at the first and last scanned lines, respectively, a close approximation to the length and width can be obtained. If the area is to be calculated, the scan line frequency must be increased before a good approximation to the area can be obtained. For the low-resolution images, the scanning frequency was set to be every four lines.

A graph of the distribution of the instruments in the length and width space is given in Figure 4.4. The separation of different instruments in the graph indicates that the length and width are sufficient to discriminate the low-resolution instruments. For a classifier, the BMDP discriminant function is used. This is a linear discriminant function which is extremely easy to use. Training is done by entering training data into the BMDP stepwise discriminant analysis program, which generates the coefficients. As this test is intended for the low-resolution instruments, it will be referred to as the low-resolution test.

Since the high-resolution image is restricted to the tips of the high-resolution instruments, this suggested that the coarse features for the high-resolution instruments should be extracted in the low-resolution image. There are three coarse features which were found to be useful in discriminating the instruments: length, width of the handle, and pose of the instrument. The pose of a high-resolution instrument (either "pose 1" or "pose 2") defines whether the instrument is lying on one side or the other. Figure 4.5 illustrates the difference between pose 1 and pose 2. As will be shown in the next section, knowledge of the pose greatly simplified the recognition of Acufex instruments in high-resolution.
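The line-scanning feature extraction described in this subsection might be sketched as below; the 0/1 pixel grid and the function name are illustrative, and the backtracking refinement of the length is omitted.

```python
def length_width_scan(binary, step=4):
    """Scan a compartment every `step` lines, recording the first and last
    object lines and the maximum object width seen on any scanned line."""
    first_line = last_line = None
    max_width = 0
    for y in range(0, len(binary), step):
        cols = [x for x, p in enumerate(binary[y]) if p == 1]
        if not cols:
            continue
        if first_line is None:
            first_line = y
        last_line = y
        max_width = max(max_width, cols[-1] - cols[0] + 1)
    if first_line is None:
        return None  # empty compartment
    return last_line - first_line + 1, max_width  # (length, width)

# A 16 x 10 grid holding a 12-line instrument that bulges at line 8.
grid = [[0] * 10 for _ in range(16)]
for y in range(2, 14):
    for x in range(3, 6):
        grid[y][x] = 1
for x in range(2, 8):
    grid[8][x] = 1
print(length_width_scan(grid))  # -> (9, 6)
```

With step=4 the length is coarse (here 9 rather than the true 12); this is exactly what the backtracking and forward tracking at the first and last scanned lines would refine.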
The width of the handle was found to be useful in separating the cartilage knives and the Acufex instruments, while the length of the instruments was useful in separating the long and the regular Acufex instruments.

[Figure 4.4: distribution of the low-resolution instruments in the length-width feature space.]

The algorithms to obtain the length, handle width and pose of high-resolution instruments were called the Acufex length test, the handle width test and the Acufex pose test, respectively. The data to calculate the three features are generated by scanning the high-resolution compartments every 4 lines, as in the low-resolution test. During each scan, the positions of the edges of the instruments are recorded.

From the scanning data of a compartment, the length is calculated using the method of the low-resolution test. To calculate the handle width, the angles along the two edges of the handle are calculated. Using simple trigonometry, the handle width is approximated, as shown in Figure 4.6. The distributions of the length and the width are shown in Figures 4.7 and 4.8. Since these 2 tests have only one feature value to separate 2 classes, a simple threshold classifier suffices. To obtain a suitable threshold, the training data for each test are entered in a single-variable BMDP stepwise discriminant analysis. The threshold is then assigned as the value for which the classification probabilities are equal.

To calculate the pose of the instrument, the angle of the long narrow tip of the Acufex instruments or cartilage knives is measured. This angle is compared to the two edge angles calculated previously for the handle width test, as shown in Figure 4.9.
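The decision rule of Figure 4.9 reduces to a comparison of two angle differences. A minimal sketch (angles in degrees; the variable names are mine):

```python
def determine_pose(tip_angle, edge_angle_1, edge_angle_2):
    """Pose rule of Figure 4.9: if the tip angle is farther from edge
    angle 1 than from edge angle 2, the instrument lies in pose 1,
    otherwise in pose 2."""
    if abs(tip_angle - edge_angle_1) > abs(tip_angle - edge_angle_2):
        return 1
    return 2

print(determine_pose(30.0, 80.0, 40.0))  # -> 1
print(determine_pose(30.0, 35.0, 70.0))  # -> 2
```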
A minimum distance classifier is used to determine pose 1 and pose 2, based on the difference between the tip angle and the edge angles.

It should be noted that the Acufex length, Acufex pose, and handle width tests are only applied to the Acufex instruments and the cartilage knives. The obturators are not tested since two of the tests, handle width and pose, are not meaningful, and the best test to separate them is performed in high-resolution.

Figure 4.5 Definition of Pose 1 and Pose 2.

Figure 4.6 Determination of handle width.

[Figure 4.7: distribution of lengths of Acufex instruments and cartilage knives (long Acufex vs. regular Acufex).]

[Figure 4.8: distribution of handle widths of Acufex instruments and cartilage knives.]

Figure 4.9 Determination of pose for Acufex instruments and cartilage knives: if |θT - θ1| > |θT - θ2| then the pose is 1, else the pose is 2.

4.4.2 High Resolution Algorithm

The high-resolution image is a 512 x 476 pixel image covering the tips of the Acufex instruments, the obturators and the cartilage knives. The tray area covered by the camera is a 13.6 cm x 10.6 cm region, oriented 90 degrees to the low-resolution image, located at the bottom of the image and displaced slightly to the left of centre.
The change in the orientation of the camera is to take advantage of the aspect ratio of the high-resolution image to achieve the maximum possible resolution. A typical high-resolution image is shown in Figure 4.10.

Figure 4.10 Typical high resolution image.

In the high-resolution image, only the tips of all 20 high-resolution instruments are within view. Thus, intuitively, it did not appear feasible to apply global feature extraction techniques to these instruments. For the obturators, the structural differences between them are a pyramidal or blunt tip with a 3.8 mm or 5.0 mm diameter. From these clues, a simple obturator test was designed to measure the width of the obturator shaft and the angle at the tip. The width was measured with a line scanning procedure for every fourth line, as in the low-resolution test. To measure the angle at the tip, the boundary of the obturator tip was traced and recorded as a chain code. The angles along the boundary were then calculated as the angle between the forward vector, 5 pixels forward, and the backward vector, 5 pixels back. The minimum of these angles usually denotes the tip angle. The distribution of the data in the width-angle space is shown in Figure 4.11. The classifier used was the BMDP discriminant function.

[Figure 4.11: distribution of the obturator test data (BLOB and PYTR obturators) in the width-angle space.]

The remaining Acufex instruments and the cartilage knives showed curvature differences at the tip, which made them ideal for boundary-oriented curvature techniques. The cartilage knives also showed some variation in the width near the tip. Using the same method as the handle width test on the low-resolution image, the width of the cartilage knives was measured and graphed in Figure 4.12. The results showed sufficient separation between the different classes that no additional feature was
BL0B5  BLOB"?  o.  B  •  tm a  I  90  PYTR5  70 60  • •  PYTR3 a  D  0 a  a  a  92.  8  I  t  40  1  8  10  12  14  16  ,  18  OBTURATOR WIDTH oo  85  tried. Again, for a single variable feature, a simple multi-threshold classifier was used. The  different thresholds were calculated from  the  classification function generated by  the BMDP stepwise discriminant analysis program.  For  the  11  Acufex  scheme which would allow  instruments, as there  did  not  appear  to  be  any clever  simple features to be used, a boundary-oriented approach  was taken. The boundary of each instrument was traced and recorded as a chain code and the  angles for  the  boundary  were calculated as in  the  obturator  test.  A peak  detector was then applied to the boundary angles to extract the significant corners. The results  for  shows  a  all  11  instruments  high-resolution  are  image  illustrated  in  a  in  Figures 4.13  "noisy"  environment  to 4.23  .  generated  Figure by  4.24  overhead  lighting. This image shows the types of degradation that could occur to the objects "intrusions"  appearing  in  the  object  silhouettes.  To  overcome  this  type  of  noise  problems, a structural technique is proposed.  Drawing  on the  work  of Pavlidis in  matching  island contours using syntactic  pattern recognition techniques (Pavlidis, 1979), the tips on the Acufex instruments can be  modeled  represent  a  as  a  cyclic  sequence  of  sharp  corners.  sequence around a contour  In  with  Pavlidis' features  algorithm, such  the  corners  as size, type  and  orientation. Size is described by small, medium, large and huge; type is described by sharp protrusion or  intrusion, convex or concave corner  and convex or concave • arc;  orientation is described by directions such as East North-East North then sequentially  matches the  unknown corners to the  etc. 
unknown corners to the reference corners according to the similarities of the features; a strength is used to measure the amount of similarity of each matched feature pair. After the different references have been matched, the unknown is classified as the reference that has the highest total strength.

Figure 4.12 Distribution of width of cartilage knives: (top) all knives; (bottom) FCK, HP, SCK, MCK.

[Figures 4.13 to 4.23: boundary angle versus arc length waveforms for the 11 Acufex instruments (grasping forceps, plain scissors, 20 deg and 60 deg hooked scissors, serrated scissors, basket punch, etc.), with the first peak (f), the last peak (l) and the detected peaks marked.]

For the Acufex instruments, the tips are always traced in the counter-clockwise direction, and knowing that the boundary waveforms are not cyclic, there are distinctive
first and last corners in the boundary. Instead of qualitative features, absolute quantities such as distances and angles are used as features. Denoting the absolute angle of the instrument shaft as the reference shaft angle θsa, all other corners between the first and last corners are characterized by the distances df and dl to the first and last corners, and by the angles θf and θl between the reference shaft angle and the angles to the first and last corners, as shown in Figure 4.25.

Figure 4.24 Noisy high resolution image with intrusions.

The classifier for this structural model is a sequential matching algorithm. The matching algorithm starts with the corners in the reference models and tries to match them to the object corners. Since the corners must appear in a specific order, if one reference corner is matched, the algorithm then begins matching the next reference corner at the next object corner, never allowing overlapping of matched corners. For each reference model tried, a score is given as:

Score = (number of matches) / (number of corners in reference)   ...(4.7)

The classifier assigns the instrument to the class with the highest score. If the matching algorithm is unable to match any corner to the references, then the result is a no-match condition.

The choice of first and last corners is crucial in this algorithm; if, in the presence of noise, the wrong first or last corner was chosen, then this method would not work. To eliminate this drawback, an improvement was implemented whereby, if the algorithm could not match any of the references to the object, it would automatically choose a new first or last corner and restart the classification process.
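The sequential matcher and the score of equation (4.7) can be sketched as follows. The per-feature tolerances are invented for illustration, and the (angle, df, θf, dl, θl) feature ordering is my reading of the garbled Table 4.1; the sample numbers are the L60H values from that table.

```python
def corners_match(ref, obs, tol=(15.0, 6.0, 15.0, 6.0, 15.0)):
    """A reference corner matches an observed corner when every feature
    (angle, d_f, theta_f, d_l, theta_l) agrees within a tolerance.
    The tolerance values here are illustrative, not the thesis values."""
    return all(abs(r - o) <= t for r, o, t in zip(ref, obs, tol))

def sequential_match_score(reference, observed):
    """Order-preserving matching: once a reference corner is matched, later
    reference corners may only match later observed corners (no overlap).
    Returns equation (4.7): matches / number of corners in the reference."""
    matches, i = 0, 0
    for ref in reference:
        for j in range(i, len(observed)):
            if corners_match(ref, observed[j]):
                matches += 1
                i = j + 1
                break
    return matches / len(reference)

# L60H reference corners and the unknown's corners, after Table 4.1.
l60h = [(-74.8, 18.6, 68.2, 10.9, 116.3), (98.5, 19.7, 83.7, 7.1, 89.7)]
unknown = [(-77.0, 19.0, 69.0, 10.0, 119.0), (98.0, 19.0, 84.0, 8.0, 91.0)]
print(sequential_match_score(l60h, unknown))  # -> 1.0
```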
In actual tests of the algorithms, this extension was able to recover when the first or last corner was incorrect. In the case of an actual no-match condition, this extension did not produce a false match.

Figure 4.25 High resolution structural features.

To obtain the reference models for the structural classifier, first all significant corners for each instrument in the training data are generated. For each instrument, the corners from the different data are then compared; the corners that correspond to the same corner in the image are grouped together. For each corner, the mean and standard deviation for each distance and angle feature are calculated. If the standard deviation is large, then the particular corner may be discarded, or the mean and standard deviation may be recalculated without the outliers. The mean values of the features of the significant corners are used to construct the reference models. In most cases, only the significant corners that are detected in a large proportion of the training data are used in the reference model.

In an attempt to improve the speed and accuracy, six features associated with the first and last corners were defined. They were:
1. the angle between the shaft angle θsa and the angle from the first to the last corner;
2. the distance between the first and last corners;
3. the angle θf of the first corner;
4. the angle θl at the last corner;
5. the difference between the angle formed by the first corner and a pixel 10 positions before, and the angle formed by the last corner and a pixel 10 positions past (C10);
6. the same test as above but with the pixel 20 positions away (C20).
An illustration of the features is given in Figure 4.26. These features were used with the BMDP stepwise discriminant analysis to generate a classification function.
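The reference-model construction just described might be sketched as below. The grouping of corresponding corners is assumed already done; the minimum detection fraction and the standard-deviation cut-off are invented illustrative values, not the thesis's criteria.

```python
import statistics

def build_reference_model(corner_sets, min_fraction=0.8, max_sigma=20.0):
    """corner_sets holds one list per training image, each containing the
    feature tuples of that image's significant corners, grouped so that
    index k refers to the same physical corner in every image (missing
    corners recorded as None).  Corners seen in too few images, or with
    too large a standard deviation in any feature, are discarded; the
    surviving corners' feature means form the reference model."""
    n_images = len(corner_sets)
    n_corners = max(len(s) for s in corner_sets)
    model = []
    for k in range(n_corners):
        samples = [s[k] for s in corner_sets if k < len(s) and s[k] is not None]
        if len(samples) < min_fraction * n_images:
            continue  # corner not detected consistently enough
        means = [statistics.mean(f) for f in zip(*samples)]
        sigmas = [statistics.pstdev(f) for f in zip(*samples)]
        if max(sigmas) <= max_sigma:
            model.append(tuple(means))
    return model

# Three training images; the second corner appears in only one of them.
cs = [[(-75.0, 19.0), (100.0, 5.0)], [(-77.0, 19.0), None], [(-76.0, 19.0), None]]
print(build_reference_model(cs))  # -> [(-76.0, 19.0)]
```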
Depending on the highest scores from the classification function results, between 3 to 5 of the highest scores are considered as "possible matches" for the input. This limits the set to be used with the structural matching algorithm and should speed up the algorithm.

In addition to the BMDP feature set, the pose and length information from the low-resolution image also help to limit the set to be matched. Given that the same instrument can look different in pose 1 from pose 2, they are treated as two different instruments, with different BMDP coefficients and reference models. Therefore, knowledge of the pose of the Acufex instrument reduces the "possible matches" by one-half. Similarly, knowing the instrument length, the sequential matching algorithm will ignore the "possible matches" which are not consistent with the instrument lengths.

To summarize the algorithm for the Acufex instruments, referred to as the Acufex test, the algorithm consists of 2 stages. In the first stage, using the results of the instrument pose, the algorithm performs a BMDP classification and identifies 3 to 5 instruments as "possible matches" for the second stage. The second stage, after eliminating the "possible matches" that are inconsistent according to the instrument length, performs a sequential matching algorithm on the remaining "possible matches". The instrument with the highest score will have the object assigned to its class; if the sequential matching cannot find a match, then the result is a no-match condition.

A sample corner-matching sequence of the L60H scissors is shown in Table 4.1. The BMD classification function gave 3 possible matches: L60H, R20H and LPS. The length and pose of the unknown were 190 and pose 2, respectively.
From the length of the instrument, it was determined that the unknown cannot be LPS; hence, it was unnecessary to match the unknown to the LPS reference model. After extracting the peaks in the unknown, the first and last corners were found to be corners 2 and 5, leaving only two corners to be matched. In matching against R20H, only one corner could be matched out of the 3 corners in the R20H reference, giving a goodness of match of 0.333. In matching against L60H, both corners were matched to the two reference corners, giving a goodness of match of 1.000. Therefore, the structural classifier assigned the unknown to class L60H. Since the structural classifier result agreed with the BMD classifier result, the clinically acceptable match was also L60H.

Table 4.1 Typical structural matching results.

Unknown Instrument = Left 60 Deg. Hooked Scissors (L60H)

BMD Discriminant Function Scores:
L60H 714.55; R20H 682.89; LPS 681.20; LSS 678.57; RPS 677.39

Significant Corners Detected and Features Calculated (ANG, FD, FA, LD, LA):
1. -143
2. 59 (first corner)
3. -77 19 69 10 119
4. 98 19 84 8 91
5. 76 (last corner)

Structural Matching:
Against R20H (reference corner angles 150.8, 120.3, 91.5): one corner matched; match score = 0.333
Against L60H (reference corners -74.8 18.6 68.2 10.9 116.3 and 98.5 19.7 83.7 7.1 89.7): both corners matched; match score = 1.000

Figure 4.26 High resolution BMD features.

4.4.3 Clinical Requirements

In designing equipment for use in the health care field, special considerations must be given to safety. The equipment must not fail in a manner which may bring harm to a patient. As well, the mode of operation must be acceptable to the clinical users, such as nurses and surgeons.
This section addresses these problems and outlines some basic criteria for a clinically acceptable implementation of an image pattern recognition system.

In order to obtain pertinent data for this application, a questionnaire was drafted to solicit comments on the clinical issues related to the vision system. The questionnaire attempted to find the acceptable level of performance for a vision system, the type of assistance available for the vision system in the operating room, and other questions relevant to the instrument-passing robot. In determining the acceptable level of performance, it was desired to know what type of error and what error rate would not be tolerated. It was also considered to be useful, for establishing system performance requirements, to know (a) what assistance would be available to monitor and, possibly, to correct system alarms, and (b) the tolerable frequency of assistance. A copy of the questionnaire is shown in Appendix A.

Responses to the questionnaires were taken from 3 nurses, each with over 9 years of experience at Vancouver General Hospital and UBC Health Sciences Centre Hospital. The maximum number of errors rated as tolerable varied from 0 to 2 errors per procedure, which is approximately the same level of performance as human scrub nurses. This number, however, depends on the patience of the surgeon. It was also learned from the questionnaire that the circulating nurse in the operating room is available to assist the vision system, by taking minor corrective actions, if she is not engaged in other activities. The nurses indicated that the frequency of alarms from the vision system should not exceed once every 15 minutes. They stated that, in general, the surgeon is not available to assist in the operation of the pattern recognition system as the surgeon should not lose visual contact with the surgical site.
Complete results of the questionnaire survey are given in Appendix A.

The primary implications of the questionnaire results with respect to the recognition algorithm design are that there should be as few errors as possible, and no more than one error per procedure. Since it is possible for the vision system to reduce the number of errors by responding with a no-match condition when the recognition confidence is low, errors during the procedure can be solved by a request for intervention by the circulating nurse, either immediately or when the unmatched instrument has been called for, since assistance is available on the order of once every 30 minutes. These results were used as performance criteria in the development of the pattern recognition algorithms to be used in the operating room.

CHAPTER 5

EVALUATION OF RECOGNITION ALGORITHM

The usual question that is asked of any pattern recognition system is: "How good is the system?" In this chapter, the algorithms developed are evaluated based on standard error-estimation methods. The failures of the algorithms are analysed and possible improvements to the algorithms are presented. Practical problems associated with implementation, such as noise and computation time, are presented. Finally, the error estimates for each algorithm are combined, with other considerations, to provide a performance estimate for the overall system.

5.1. Test Methods

5.1.1 Methods of Error Estimation

Various methods exist to estimate the probability of error, or misclassification, in a pattern recognition system. These methods are described and reviewed in detail elsewhere, e.g. (Toussaint, 1974; Toussaint and Sharpe, 1975; Glick, 1978; Kanal, 1974).
The purpose of these error estimators is to provide an accurate error estimate given a small data set. The differences between the methods are the manner in which the data set is used, the bias of the resulting estimate, and the variance of the estimate.

Resubstitution Method

One method of error estimation is the resubstitution (R) method. The pertinent procedures for the R method are as follows:
1. train the classifier on all of the available data set N;
2. test the classifier on the entire data set N.
This method makes efficient use of the data set in that the whole set is used to train the classifier. However, the results of the R method are "overly optimistic" since the classifier has encountered all the test data during training (Toussaint, 1974).

Cross-validation (π) Method

Alternatively, several methods that are known generically as "cross-validation" may be used for error estimation. The most general of these cross-validation methods is the rotation (π) method. The procedure for the π method is:
1. divide the data set N into P mutually exclusive sets;
2. N/P is an integer and P/N <= 0.5;
3. for i = 1 . . . P: (a) train on sets (j), j = 1 . . . P, j ≠ i; (b) test on set (i);
4. average the results of the P training and testing sessions.
The efficiency of using the available data set in the π method depends on the ratio N/P; the efficiency increases as N/P decreases toward 2. The error estimate of the π method has been shown to be a pessimistic estimate, e.g. (Toussaint, 1974).

There are two special cases of the π method, when P=N/2 and P=1. For P=N/2, the π method is also known as the "double cross-validation" method. The data usage is least efficient in this case since only half of the data set is used for training, and the resulting error estimate is the most pessimistic. For P=1, the π method is also known as the "leave one out" (U) method. Of all the π methods, this method makes the most efficient use of the data set and has the least (pessimistic) bias in the resulting error estimate.
Of all the » methods, this method makes the most efficient use of the data set and has the least (pessimistic) bias in the resulting error estimate.  109  Although  the  U  method  has the  least bias in  shown that this estimate has a larger variance than of success rate (Glick,  1978)."  its  error  estimate,  Glick has  "any other well known estimator  Another disadvantage of the  U  method  is the  large  number of training sessions required. For a data size of N, the U method requires N trailing sessions while the R, double cross-validation, and it  methods . require  1,2 and  P training sessions, respectively. Toussaint's method An  alternative  for  obtaining an unbiased estimate  was proposed by Toussaint  (Toussaint and Sharpe, 1975). Since the R method provides an optimistic error estimate while  it  the  method  provides a  pessimistic estimate,  an  unbiased estimate  may  be  obtained using the following,  P = e  W(N,P,X) (P/N)  p  where  W  parameters  i  is a weighting N  N/P I P [TT]. + e I  =  function which  and P in the  it  [1 -  1  depends on the  ...(5.1)  dimensionality  X  and the  method. Experiments by Toussaint, with W  have shown that the estimate from equation (5.1)  In  W(N,P,X)] P [R] e  =  1/2,  approaches that of U method.  designing an evaluation protocol for the pattern recognition algorithms, it is  important to provide a least biased but reliable estimate of the error. The U method is not suitable since it has the largest variance of all the methods. Therefore, the it method was chosen to provide the error estimate. Another advantage of the IT method is that its estimates are pessimistic, which implies that the final estimates would be a worst case (lower bound) performance estimate. Similarly, the R method may be used to obtain a best case (upper bound) performance estimate.  
Ideally, the data set size N and the number of set partitions P should be large to minimize the bias in the result. However, there is a cost in user time and computing time associated with obtaining, training and testing a large data set. Therefore, two versions of the π method error estimator are used: P=4 and P=2. The P=2 π method will be used on the algorithms which are expected to perform well while the less biased P=4 π method will be used on the more complicated algorithms.

5.1.2 Test and Training Data

Besides choosing the appropriate error estimation method to give a reliable estimate of performance, it is also important to have appropriate samples in the data set to represent the various conditions which may arise in the practical implementation of the pattern recognition system.

An ideal data set may contain a large number of noise-free images for training and testing, and a reasonable number of images representing all possible noise sources in the system. As well, images of all possible error conditions should be included in the ideal data set.

As with the protocol for performance evaluation, there are similar costs associated with obtaining and testing of the data set which may make it necessary to use a less than ideal data set. The data set for this evaluation consists of noise-free and noisy images of instruments within the group compartment. Images of error conditions for wrong instrument groups in compartments are not included in the data set. A listing of the data set is given in Table 5.1.

The following are the keyword descriptors for different test images.

1.  Noisy images : images taken with backlighting and direct overhead lighting.

2.  Noise-free images : images taken with backlighting and no overhead lighting.

3.  Low-resolution images : images of entire tray area.

4.  
High-resolution images : images of the lower central region of tray containing the tips of the Acufex, obturators and cartilage knives.

Table 5.1 List of test data set

Low Resolution images:

4  pose 1 noise free images (LT101. . .104)
4  pose 2 noise free images (LT201. . .204)
4  mixed pose noise free images (LTM01. . .04)
1  pose 1 noisy image (LN01)
1  pose 2 noisy image (LN02)
1  mixed pose noisy image (LN03)

High Resolution images:

16  pose 1 random comp. noise free images (HT101. . .HT116)
16  pose 2 random comp. noise free images (HT201. . .HT216)
16  pose 1 fixed comp. noise free images (HT120. . .HT135)
16  pose 2 fixed comp. noise free images (HT220. . .HT235)
4   pose 1 random comp. noisy images (HN101. . .HN104)
4   pose 2 random comp. noisy images (HN201. . .HN204)
4   pose 1 fixed comp. noisy images (HN105. . .HN108)
4   pose 2 fixed comp. noisy images (HN205. . .HN208)
6   pose 1 noise free images (HT140. . .HT145)
6   pose 2 noise free images (HT240. . .HT245)
2   pose 1 noisy images (. . .HN110)
2   pose 2 noisy images

Conditions:

Position - the positions of all instruments were moved slightly from image to image to randomize the image data.

Pose - the pose of some low-resolution images were changed to randomize the image data. The changes include right side up & upside down, front to end and end to front. . .etc.

Threshold - the binary threshold was kept constant for noise free images but allowed to vary for noisy images.

5.  Pose-1 images : low or high-resolution images with the Acufex & cartilage knives lying in pose 1 (as defined in chapter 4).

6.  Pose-2 images : low or high-resolution images with the Acufex & cartilage knives lying in pose 2 (as defined in chapter 4).

7.  Mixed-pose images : low or high-resolution images with the Acufex & cartilage knives alternately lying in pose 1 and 2.

8.  Prespecified-compartment images : high-resolution images with Acufex instruments in any prespecified compartments.

9.  
Random-compartment images : high-resolution images with Acufex instruments in any suitably sized Acufex compartment.

In obtaining the image data, certain conditions were defined for the position and pose of the instruments, and the threshold level for the binary images. To create a degree of randomness in the images, the positions of the instruments were changed from image to image. The variations in position included moving the instruments to the extremes of the compartment and to the extreme amount of rotation permitted by the boundaries of the compartment. Since an actual tray with compartments was not available, the effect of instruments touching the compartment boundaries was not tested here. This problem will be considered in a later section with tests on a small sample tray.

The poses of all instruments, except the Acufex and the cartilage knives, were changed in some of the images to include this effect in the data testing. The types of pose changes involved were right-side-up and upside-down, and front-to-end and end-to-front. The poses of the Acufex instruments and cartilage knives were not allowed to change freely since controlled data were required for proper evaluation of the Acufex pose test.

For noise-free images used for training, the binary threshold level was kept constant throughout the digitization session. The constant threshold is essential to obtain accurate data for the features. For noisy images used for testing only, the threshold was readjusted to compensate for the noise and was allowed to vary slightly for different noisy images.

5.2. Results of Testing Each Algorithm

All of the algorithms which were developed, and which were subsequently subjected to performance-estimation tests, are outlined in Figure 5.1. For each algorithm developed, the π test will be used to estimate its performance.
Since the π test gives pessimistic results, the resulting performance estimate is a lower bound for the actual performance of the system. In addition, noisy images were tested with each algorithm trained on noise-free images to observe the effects of lighting noise on that algorithm.

Low-resolution Test

The "low-resolution test" classifies the 12 different low-resolution instruments using their length and width as features. Classification is done by a classification function generated by the BMD Stepwise Discriminant Analysis (SDA). A π test, with P=2, was used to evaluate the performance on 12 noise free images. The result of the π test was 100% correct classification. Using the same 12 noise-free images to train the algorithm, it was then tested on 3 noisy images. The result from the noisy images was 100% correct.

Obturator Test

The "obturator test" classifies the obturators and the trocars in the high-resolution image using the shaft width and the minimum angle as features.

Figure 5.1 Outline of the recognition algorithms: the feature extractors (length, width, pose, tip curvature, minimum angle) and classifiers (discriminant functions, thresholds, structural matching) for each instrument group.
It  "handle  width test"  shows that the  classifies the 4 cartilage  knives and the  hooked probe by the width of their tip. Classification is done by comparing the width with  thresholds  noise-free Testing  calculated  by  images to evaluate  the  algorithm  on 4  BMD  SDA.  the algorithm. noisy  A  TT  test,  with  P=2,  was  used on  12  The result was 100% correct classification.  images, with training  done  on the  12  noise-free  images, also gave 100% correct classification.  Acufex Test The other  "Acufex test"  tests performed  classifies the  11 Acufex instruments. It  on the low-resolution  uses results from 3  image, the handle width test, the Acufex  pose test and the Acufex length test The handle width test separates the Acufex and the cartilage  knives by the width of the handle. Classification is done by a threshold  calculated using the B M D SDA. A images. The result was 100% the  TT test, the  algorithm  it  test, with P=2,  was done using 12  correct classification. With the same training  was tested  on 3 noisy  images; the  result  noise-free data as in  was also  100%  correct classification.  Acufex Pose Test The  "Acufex  comparing the  pose  test"  edge angles at  determines  the  handle  the  pose  with the  tip  of  the  Acufex  instruments  by  angles. This test is empirical  116  and does not require were obtained  any training.  Using 12 noise-free  images, the following  results  : 131(99.2%) correct classifications; 1 no match condition; and 0 errors.  Using 3 noisy images, the result was 100% correct classification. This test is also used to find the pose of the cartilage knives. With the same 12 noise-free images and 3 noisy images, the result was 100% correct classification.  Acufex Length Test  The  "Acufex length test" classifies the long and regular Acufex instruments by  their  length. The classifier uses a threshold calculated by the B M D SDA. 
A π test with P=2 on 12 noise free images gave the following results : 130 (98.5%) correct classifications; 0 no match conditions; and 2 errors. The algorithm was then tested on 3 noisy images, using the training data from the 12 noise-free images, and the result was 100% correct classification.

Finally, using the results of the 3 previous tests, the Acufex test classifies the 11 Acufex instruments using a combination of a BMD SDA-generated classification function and a structural matching algorithm. Two results are available from each Acufex test: a best match which tries to maximize the number of correct classifications, or a clinically acceptable match which tries to minimize the number of misclassifications.

Two performance estimation methods were used for the Acufex test: the π test with P=4 and the resubstitution (R) test. Using P=4 instead of P=2 in the π test results in 4 times as many training sessions but would give a less pessimistic (lower-bound) performance estimate. Since the Acufex test had the poorest performance of all tests, the R test was done to give an optimistic (upper-bound) performance estimate. The data set for both estimation methods consisted of 32 noise-free images. In addition, using the training data from the R test, 4 noisy prespecified compartment images and 4 noise-free random compartment images were tested on the Acufex test to observe the effects of lighting noise and random compartments independently. The results of all the tests applied to the Acufex test are listed in Table 5.2.

Table 5.2 Results of tests on Acufex algorithm.

                                   misclass.   no match   corr. class.
Resubstitution    max corr            2.8%       2.3%        94.9%
                  clin. accept        0.3%       9.1%        90.6%
π test            max corr            4.5%       2.8%        92.6%
                  clin. accept        0.9%      13.9%        85.2%
Noisy images      max corr            5.7%       4.5%        89.8%
                  clin. accept        0.0%      19.3%        80.7%
Free compartment  max corr            6.8%       8.0%        85.2%
                  clin. accept        3.4%      21.6%        75.0%
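The difference between the two matching criteria can be illustrated with a small hedged sketch. The scoring scheme and the confidence threshold below are invented for illustration (the thesis actually combines a BMD discriminant function with structural matching): a best match always returns the highest-scoring instrument, while a clinically acceptable match returns it only when the score is high enough, reporting a no match otherwise.

```python
def best_match(scores):
    """Best match: maximize correct classifications by always
    returning the highest-scoring candidate instrument."""
    return max(scores, key=scores.get)

def clinically_acceptable_match(scores, threshold):
    """Clinically acceptable match: minimize misclassifications by
    returning a candidate only when its score clears `threshold`;
    otherwise report no match (None) and defer to the nurse."""
    candidate = max(scores, key=scores.get)
    return candidate if scores[candidate] >= threshold else None
```

Raising the threshold trades misclassifications for no-match conditions, which is the pattern visible in Table 5.2: the clinically acceptable rows show fewer misclassifications but more no matches than the best-match rows.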
A confusion matrix containing the results of the low resolution test, the obturator test, the cartilage knife test and the Acufex test is given in Table 5.3.

5.3. Failures of Each Algorithm

From the results of tests of each algorithm presented in the last section, it is apparent that the algorithm that performed worst was the Acufex test. The other algorithms with less than perfect performance are the pose test and the Acufex length test. In this section, the results of each algorithm will be examined for their significance and an analysis will be given for the algorithms with less than 100% correct classification.

Table 5.3 Confusion matrix of all arthroscopic instruments.

NOTES :
1. Confusion is zero due to physical incompatibility of instruments and compartments
2. Confusion is assumed to be zero with program error checking
3. Handle width test - confusion is tested to be zero in these areas
4. Only results of Acufex test given in this sub-matrix

Comparing the results and the feature-space plots in Chapter 4 for the algorithms with 100% correct classification (the obturator test, the handle width test and the low-resolution test), it is evident that the features are well separated, and the algorithms performed as expected.
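As a hedged illustration of how a BMD SDA-style classification function separates well-clustered feature groups, each class k can be given a linear function f_k(x) = w_k·x + c_k and the class with the largest value wins. The coefficient values below are invented for illustration, not the thesis's BMD output.

```python
def classify(x, class_functions):
    """Evaluate one linear classification function per class and return
    the class whose function value is largest.  `class_functions` maps
    a class name to (weights, constant); coefficients are illustrative."""
    def value(weights, constant):
        return sum(w * xi for w, xi in zip(weights, x)) + constant
    return max(class_functions, key=lambda k: value(*class_functions[k]))
```

For example, with (length, width) features and a decision boundary placed between the regular (184-203 pixel) and long (213-228 pixel) Acufex length ranges discussed below, the longer instrument wins the "long" function.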
For the cartilage knife test, the algorithm was able to separate two of the groups which were clustered closely in the feature space. However, the results were based on only 12 samples, and more samples may be required to properly test the efficacy of the algorithm in separating the two groups.

In the Acufex length test, the data seemed to be well separated except for two samples of the regular Acufex instruments which clustered near the long Acufex instruments (see Figure 4.12). The length of the regular Acufex instruments ranged from 184 to 203 pixels while the length of the long Acufex instruments ranged from 213 to 228 pixels. In the misclassified cases, 2 of the regular Acufex instruments had lengths of 209 and 211 pixels. In both cases, the instruments, LSS and LBP, were in the same compartment immediately on the right of the obturators. The average lengths of the two instruments over 12 training images were 197.8 and 196.2 pixels. It is not known why the lengths were significantly longer in those two cases. However, these two errors should not cause a misclassification in future if error checks are implemented to ensure that long and regular Acufex instruments are in their correct compartment.

In the Acufex pose test, the results (99.2% correct classification, no error and 1 no match condition) show that this test is acceptably reliable. In the case of the no-match condition, the edge angles on both sides of the handle were found to be the same, such that no decision was made. This error was probably due to the instrument handle being partially closed, for some unknown reason. The angle differences in all other handle measurements ranged from 2.83 to 8.67 degrees. If the
If  the  120  wrong instrument  is in the  compartment, or the  handle is partially  closed such that  the angle difference is less than 2.5 degrees, the algorithm reports an error; similarly, an  error  is  reported  if  the  angle  difference  is  greater  than  10.0  degrees. These  error-checking measures should prevent confusion with some of the other instruments.  Of  the algorithms developed, the Acufex test is the most complicated, and has  the most difficult job of differentiating thus,  it  is not  algorithm  can  surprising that be  it  among a number of similar Acufex instruments;  has the  described as fundamental  poorest performance. errors  The  errors  and selection errors.  in  this  Fundamental  errors are errors that the algorithm cannot handle and usually results in no matches or  errors in the best match case. A no match can be caused either (1)  by the B M D  features being incorrect such that the correct instrument is not selected for matching, or  (2)  by some of the  features  for  the  structural  structural matching process being  sufficiently different to prevent a match. In most of the no matches tested, the latter was the major cause. Errors in the best match case were primarily caused by similar instruments matching with the structural features of the other instruments. These errors occurred  most  frequently  with  the  pairs  of  LPS/RPS  and  LBP/RBP  where  the  structrual differences are very minor. The errors due to structural similarities may be difficult  to  avoid  since  even  human  subjects  have  trouble  instruments when examining an image of the instruments. It algorithm was able to recover from the or  these  similar  should be noted that the  error of detecting a wrong first  last corner in most of the tests.  Selection  errors  are  acceptable match to differ an  fundamental  separating  error  in  the  BMD  considered to  be  differences  which  cause the  clinically  from the best match. 
The selection errors may be due to classification  function  output  or  in  the  structural  pattern  recognition algorithm. A listing of the causes of these errors is given in Table 5.4. can  It  be seen that a significant portion of the no matches in the clinically acceptable  121  results is due to incorrect B M D still  significant  correct B M D  portion  of  the  results with correct structural  no  matches  is  due  to  the  match. A  inverse  of  smaller but  the  above  -  results with incorrect structural match. Again, a number of the incorrect  structural matches are due to structurally similar instrument pairs, as mentioned before.  Table 5.4 Types of errors in structural match.  Resubstition  Test :  no B a t c h  a n d BHD c o r r e c t  no n a t c h and BMD i n c o r r e c t no match - BMD and s t r u c t , match wrong BMD wrong and s t r u c t , match c o r r e c t BMD c o r r e c t and s t r u c t , match i n c o r r e c t it  Test : no match and BMD c o r r e c t no match and BMD i n c o r r e c t BMD wrong and s t r u c t , match c o r r e c t BMD c o r r e c t and s t r u c t , match wrong e r r o r - BMD and s t r u c t , match wrong no match - BMD and s t r u c t , match wrong  An  6  3 1 14 7  6 4 26 12 3 1  analysis of the performance of the structural matching algorithm was done  by examining the goodness of match (the percentage of corners matched) in the cases of the R  and it  methods. The results are shown in Figure 5.2. It  can be seen that  the performance of the structural matching algorithm did not degrade significantly from the resubstitution test to the tr test Hence, the decrease in performance in the it test may be due to the B M D features.  5.4. Improvements of Recognition Algorithms  122  "PI s  to m  I u m  ci *  <  D Ul X  Li_  o  (0 6  if) if)  13  u  z Q O O o  o< 2  in Q:  UI  z O U  5° 6  CM  PJ  O  K>  CO CM  <0 CM  CM  CM  CM  Figure 5.2 Histogram of goodness of structural  match.  
In discussing the failures of the algorithms, it was noted that two of the groups in the cartilage knife test were clustered closely in the feature space and that the separation of the two groups may not be reliable even though test results had indicated a 100% estimated recognition accuracy. In order to improve the recognition of the cartilage knife test, the distinctive tips of the cartilage knives may be used as features for discrimination. Since the knife tips consist of several significant corners, the algorithms for the Acufex test may be used.

Although the cartilage knives, with the exception of the hook probe, have the same general shape at the tip, there appear to be sufficient differences in the size and angle of the tip for use with the Acufex test algorithms. In fact, it may be sufficient to apply only the BMD features to improve the recognition reliability. The structural matching algorithm can also be used, but it will increase the processing time considerably.

Another algorithm that may be improved is the structural matching algorithm. The drawback of using quantitative features for each significant corner, instead of Pavlidis's qualitative features, is that the matching algorithm may be sensitive to large variations in the features. To reduce this sensitivity, the structural matching process should be examined in detail to determine the conditions which cause correct matches to fail. One possible improvement may be to assign a weighting or scoring scheme to the feature matching procedure such that one bad feature does not produce an immediate rejection in the match.

5.5. Implementation and Optimization

Besides the recognition algorithms, other practical problems exist in the vision system which may result in degraded performance of the system.
This section discusses these practical problems as well as methods to optimize the system towards implementation in the operating room.

5.5.1 Practical Problems

Some of the practical problems in the implementation of the vision system are noise due to lighting, noise due to blood, tissue fragments, or other material from the surgical site, noise due to boundary effects, inconsistencies due to manufacturing variations, and disturbances due to geometric effects.

There are two sources of noise due to lighting: room light and backlighting. Room lighting had been investigated in the noisy images where overhead lights were left on during digitization of the images. The resulting images showed cavities in the silhouettes of the instruments due to reflections. The amount and location of the reflections will generally vary according to the direction of the overhead light source. The overhead lighting noise can be eliminated in most cases by adjusting the binary threshold of the images. Figure 5.3 shows a gray-level image and its intensity histogram. A suitable threshold for the image is 55. If the threshold is too high, the image would be filled with noise, as shown in Figure 5.4. If the threshold is too low, the silhouettes of the instruments would begin to thin out, see Figure 5.5. At present, the threshold is selected manually to obtain images that are free of noise cavities. Automatic threshold selection techniques exist (Weszka, 1978) but evaluation and implementation of such techniques was beyond the scope of the thesis.

The effect of noise due to non-uniform backlighting has not been tested since it has not presented any major problems. Nevertheless, backlighting is an important part of binary vision since inconsistencies in backlighting would seriously affect the binary threshold selection and create noise in the image. This type of noise was encountered in the evaluation of the Oculus 100 digitizer.
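For reference, one standard automatic technique of the kind surveyed by Weszka is threshold selection by maximizing between-class variance (Otsu's method). This sketch is not the thesis's procedure (thresholds there were chosen manually from the histogram), and the function names are illustrative.

```python
def otsu_threshold(hist):
    """Pick the gray level that maximizes between-class variance for a
    256-bin intensity histogram (Otsu's method)."""
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0        # pixel count below or at the candidate threshold
    sum0 = 0.0    # intensity sum of those pixels
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (total_sum - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(pixels, t):
    """Dark silhouette pixels (intensity <= t) map to 1, background to 0,
    matching the backlit binary images used in this chapter."""
    return [[1 if p <= t else 0 for p in row] for row in pixels]
```

On a strongly bimodal histogram such as the one in Figure 5.3, a method of this kind lands in the valley between the silhouette and backlight peaks, which is what the manual selection aims for.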
Figure 5.3 Noisy image with histogram of intensity.

Figure 5.4 Binary digitized noisy image: (top) threshold = 55; (bottom) threshold = 65.

The type and level of noise due to tissue or other material remaining on the instruments following use in the surgical site have been tested using actual operating room samples. Samples of two types of fluids were collected and tested on a small plexiglass tray. The fluids were: (1) the residue from instruments that had been inserted into the knee, and (2) a purple-coloured skin preparation solution, tincture of chlorhexidine, used at the start of the surgical procedure. There were no blood or small tissue fragments found in the fluid from the knee, i.e. it was primarily saline solution. The tray of fluids was then dried and examined in low resolution and high resolution.

In low resolution, the fluid tray was placed near the centre of the instrument tray with the trocar sleeves placed across the residue of the fluid, as shown in Figure 5.6. Both types of fluids were found to be invisible in the low-resolution image and the shape of the trocar sleeves was not distorted in any way.

In high resolution, the fluid tray was again placed near the centre of the FOV with the tips of several instruments lying across the fluid residue as shown in Figure 5.7. The saline from the knee was invisible while the skin-preparation solution caused numerous spots to appear along the edge of a compartment near the instruments. The problem may not be a major concern since the skin preparation solution is not used with the high-resolution instruments, but this should be verified in actual clinical trials.

Figure 5.6 Noise test of fluids in low resolution.

Figure 5.7 Noise test of fluids in high resolution.

The effects of the boundaries of the instrument compartments were tested using the small plexiglass tray mentioned above (see Figure 5.8). In low resolution, see Figure 5.9, the entire tray was invisible and did not cause any distortion of the instruments placed in the tray. In high resolution, however, the boundaries created a dark band along certain portions, while others were invisible, see Figure 5.10. Two probable causes for the dark band are uneven lighting and extreme
In  low  cause any  instruments  placed in the  tray. In  high resolution, however,  the  dark  along  certain  portions,  while  others,  probable  causes  There  band are  two  for  the  being dark  invisible band,  in  uneven  resolution, see  distortion  of  the  boundaries created a see  lighting  Figure and  5.10.  extreme  Figure 5.6 Noise test of fluids in low resolution.  Figure 5.7 Noise test of fluids in high resolution.  131 curvature. Referring to Figure 5.9, the boundaries were visible on the right side of the image which had a slightly lower backlight intensity, due to the proximity to the edge of the backlight projector; on the other hand, the boundaries were invisible on the left-hand side where the backlighting is slightly stronger. The effects of the tray curvature can be seen in the divider which separates the two compartments in the plexiglass tray. In Figure 5.10, the upper part of the divider with the lesser curvature was invisible while the lower part with the higher curvature was visible. Differences in instruments due to manufacturing  variations are difficult to  predict; hence, it would be difficult to test the algorithm's sensitivity to such variations. The least sensitive algorithm is the obturator algorithm since it measures two very distinct shapes. The sensitivity of the low-resolution and cartilage knife algorithms can be tested easily by extracting the features and comparing with the other instruments in the feature space. For the Acufex test, manufacturing variations could lead to instruments that look entirely different and it is not certain that these instruments could be recognized by the structural matching algorithm. Nevertheless, if the instruments satisfy the basic criterion for the structural matching algorithm, which is the presence of distinct corners, then there is a very good chance these instrument would be recognized. 
5.5.2 Optimization

The algorithms presented in this thesis have been developed and tested on the UBC Electrical Engineering VAX-11/750 superminicomputer (see Figure 5.11). Since it would not be economical to incorporate such an expensive computer into the vision system, an IBM PC microcomputer, as described in chapter 2, was chosen for the system. Some of the algorithms on the VAX, written in VAX FORTRAN, were transferred to an IBM PC and modified to compile using the MS-Fortran compiler to obtain a time performance estimate of the algorithms using the IBM PC.

Figure 5.8 Sample instrument tray.

Figure 5.9 Boundary test in low resolution.

Figure 5.10 Boundary test in high resolution.

Figure 5.11 UBC VAX set-up (VAX-11/750 with FPS-100 array processor, RAMTEK 9300 image processor, IP-512 frame grabber, PDP-11 computer, video cameras, monitors and terminals).

In modifying the programs to be compatible with MS-Fortran, an assembly-language function was written to increase the image pixel access speed and decrease the image memory requirements. The binary images on the VAX-11/750 use 2 bytes per pixel while the IBM PC uses 1 bit per pixel.

The Acufex and the low-resolution algorithms, representative of the boundary tracking and scanning types of processing used in the algorithms developed, were tested on an IBM PC/XT microcomputer and an IBM PC/AT supermicrocomputer. As well, reference tests of two simple operations, multiplication and memory access, were conducted. The results of the comparative speed tests for different computers are listed in Table 5.5.

Table 5.5 Time comparison results.
                                      VAX      PC/AT     PC/XT
Real*8 multiplications
  10,000                              0.12      1.32     14.74  sec.
  50,000                              0.60      5.49     73.48

Image read - 10 full scans
of 512 x 512 pixels                  48.32    132.17    410.63  sec.

Acufex test
  test on single image                2.65     18.92     53.79  sec.
  range for 5 images
    max                               2.65     18.92     53.79
    min                               1.82     13.04     38.34
  average time                        2.23     15.51     45.17

Low resolution test
  test on single image                0.76      2.15      6.66  sec.
  range for 5 images
    max                               0.81      2.20      6.88
    min                               0.50      2.15      6.66
  average                             0.67      2.16      6.80

In numerical calculations such as REAL*8 multiplications, the VAX-11/750 was over 10 times faster than the IBM PC/AT and over 100 times faster than the IBM PC/XT. In memory reference operations such as reading the image pixels, the VAX-11/750, using 512 KB of image memory, was 2.7 times faster than the IBM PC/AT and 8.5 times faster than the IBM PC/XT.

The Acufex algorithm required both memory references for boundary tracing and numerical calculations for structural matching. Tests of the Acufex algorithm showed that the VAX-11/750 was approximately 7 times faster than the IBM PC/AT and 20 times faster than the IBM PC/XT. The average CPU times were 2.23 s, 15.51 s, and 45.17 s for the VAX, PC/AT and PC/XT respectively. The low-resolution test required mostly image pixel references, and test results showed that the VAX-11/750 was approximately 3.0 times faster than the IBM PC/AT and 10 times faster than the IBM PC/XT. The average CPU times were 0.67 s, 2.16 s and 6.8 s for the VAX, PC/AT and PC/XT respectively.

Although the VAX-11/750 is much faster than the IBM PC, the speed of the IBM PC/AT may be sufficient for this project. The complicated Acufex algorithm required, on average, 15 seconds to execute while the line scan type low-resolution algorithm required only 2 seconds using an IBM PC/AT. Hence, an estimate of the time required for the overall vision system would be approximately 20 to 25 seconds.

5.6.
Final Results of the Recognition Algorithm

Evaluation of the algorithms yielded recognition accuracies of 100% for the low-resolution test, the obturator test and the cartilage knife test. The upper- and lower-bound accuracy estimates for the Acufex instruments were 94.9% and 92.6% for the best-case match, and 90.6% and 85.2% for the clinically acceptable match.

To obtain an overall classification accuracy for the vision system, the individual recognition accuracies were combined with the probability of usage for each group. From observing four videotaped arthroscopic cases performed by one surgeon, with case durations ranging from 12 to 18 min., the approximate instrument usage by group is shown in Table 5.6. Weighting the accuracies according to Table 5.6 yields an overall recognition accuracy of 99.1% correct, 0.69% no match and 0.16% misclassification. Table 5.7 shows the recognition accuracies for different weighting percentages of the Acufex instruments. Even at 40% Acufex usage, the correct classification is estimated to be 93.3%.

To obtain an estimate of the frequency of the assistance required from the nursing personnel, it was assumed that an Acufex request occurred every 2 min. With that assumption, there would be 7.5 Acufex requests in a 15 minute period. The worst case recognition results for the 15-min. period are 6.38 correct classifications, 1.04 no-match conditions, and 0.06 misclassifications. The above results indicate that there will be approximately 1 instrument which could not be recognized within the 15-min. period and hence would require the assistance of the surgical nursing personnel. Since this result is within the design criteria given in the clinical requirements, the estimated accuracy of the vision system is considered to be suitable for clinical use.

Table 5.6 Arthroscopic instrument usage probability.

Cartilage Knives                23.7 %
Obturator                       44.1 %
Low Resolution Instruments      27.1 %
Acufex Instruments               5.1 %
A c u f e x Usage 5.1 % 10.0 % 20.0 % 30.0 % 40.0 %  r  P^ i" r  23.7 44.1 27.1 5.1  % % % %  rec^rdtior accuracy.  correct 99.1 % 98.3 % 96.7 % 95.0 % 93.3 %  no match 0.7 % 1.4 % 2.7 % 4.1 % 5.4 %  misclass 0.2 % 0.3 % 0.6 % 0.9 % 1.2 %  CHAPTER 6 CONCLUSIONS AND RECOMMENDATIONS  The development and evaluation of a vision system for recognizing arthroscopic instruments has been presented in this thesis. Also, a robotic system which uses the vision system for performing surgical instrument passing duties in an operating room has been discussed. A two camera, dual-resolution approach was employed by the vision system to recognize the arthroscopic instruments. A camera, digitized to 512 x 512 pixels, was used to view all the instruments at a low resolution while another camera, also digitized to 512 x 512 pixels, was used to focus on a small area containing the fine details necessary to discriminate the instruments which are extremely similar. Four algorithms were developed to recognized the four different groups in the set of 36 arthroscopic instruments. For three of the algorithms; the low-resolution algorithm, the obturator algorithm and the cartilage-knife algorithm, simple boundary scanning techniques were adequate to obtain features such as length, width and minimum angle of the instruments. These simple features were able to discriminate the three groups of instruments, with an estimated recognition accuracy of 100 percent for each algorithm. A combination of statistical and structural features were found to be necessary to recognize the fourth group of Acufex instruments. The statistical features consisted of 6 angles and distances taken from the first and last significant corners found after boundary tracing; the statistical features were then matched using a BMD-generated discriminant function. The best scores resulting from the discriminant function were 140  141  subsequently matched using the structural-matching algorithm. 
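The significant corners referred to here are found by measuring the interior angle at each boundary point between neighbours 5 pixels ahead and 5 pixels behind (the scheme given in the Appendix C pseudocode), then keeping the sharp local minima of the angle profile. A minimal Python sketch of that step; the 120-degree peak threshold is an assumed illustrative value, not a figure from the thesis:

```python
import math

def k_curvature_angles(boundary, k=5):
    # Interior angle (degrees) at each point of a closed boundary, measured
    # between the points k pixels behind and k pixels ahead -- the
    # "5 pixels forward and 5 pixels backward" scheme in the pseudocode.
    n = len(boundary)
    angles = []
    for i in range(n):
        xb, yb = boundary[(i - k) % n]
        x, y = boundary[i]
        xf, yf = boundary[(i + k) % n]
        v1 = (xb - x, yb - y)
        v2 = (xf - x, yf - y)
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0.0:
            angles.append(180.0)  # degenerate case; treat as a straight run
            continue
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_a)))))
    return angles

def significant_corners(angles, threshold=120.0):
    # A significant corner is a local minimum of the angle profile sharper
    # than the (assumed) threshold; straight edges sit near 180 degrees.
    n = len(angles)
    return [i for i in range(n)
            if angles[i] < threshold
            and angles[i] <= angles[(i - 1) % n]
            and angles[i] <= angles[(i + 1) % n]]
```

On a digitized square, for example, the four 90-degree corners stand out as local minima while points on the straight edges measure close to 180 degrees.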
The significant corners between the first and the last corner formed the structural features and were sequentially matched to the reference corners. The Acufex algorithm provided two possible matches: one which maximized the number of correct matches, and another which minimized the number of incorrect matches. The latter matching method was used to generate clinically acceptable recognition results. Using a resubstitution test to obtain an upper bound and a TT test to obtain a lower bound on the recognition accuracy, the results were 94.9% and 92.6% for the maximally correct matching method, and 90.6% and 85.2% for the clinically acceptable matching method.

An overall estimate of the recognition accuracy, incorporating results of all four recognition algorithms and weighted according to assumptions concerning instrument usage, gave a recognition accuracy of 99.1% correct classification, 0.69% no-match condition, and 0.16% misclassification. As well, simple calculations showed that the frequency of requests for assistance is below the acceptable limit defined by operating-room personnel in a clinical questionnaire. Based on the above results, it can be concluded that the primary objective of the thesis, the development and evaluation of a clinically acceptable vision system for recognition of arthroscopic surgical instruments, has been satisfied.

When practical vision-system tests were performed involving overhead lighting, a sample plexiglass tray, and contamination or degradation of the surgical instruments, it was found that overhead lighting produced bright reflections on the instruments, but the reflections were compensated for by increasing the binary threshold to give continuous boundaries. Tests using the sample plexiglass tray showed that the tray was invisible at low resolution. At high resolution, however, portions of the compartment boundaries with high curvature created dark lines in the image.
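The overall figures quoted above (99.1% correct, 0.69% no match, 0.16% misclassification) follow from weighting each group's outcome rates by the Table 5.6 usage probabilities. A short sketch of that arithmetic; since the three simpler groups tested at 100%, the per-outcome Acufex rates below are illustrative values back-solved from the reported totals, not figures stated in the thesis:

```python
def overall_rates(usage, group_rates):
    # Weight per-group (correct, no_match, misclass) rates by usage probability.
    totals = [0.0, 0.0, 0.0]
    for group, weight in usage.items():
        for j, rate in enumerate(group_rates[group]):
            totals[j] += weight * rate
    return totals

# Usage probabilities from Table 5.6.
usage = {"cartilage_knives": 0.237, "obturator": 0.441,
         "low_resolution": 0.271, "acufex": 0.051}

perfect = (1.0, 0.0, 0.0)  # the three simple groups classified at 100%
group_rates = {"cartilage_knives": perfect, "obturator": perfect,
               "low_resolution": perfect,
               # Illustrative Acufex rates (correct, no match, misclass),
               # back-solved from the reported overall totals.
               "acufex": (0.834, 0.135, 0.031)}

correct, no_match, misclass = overall_rates(usage, group_rates)
# correct ~ 0.991, no_match ~ 0.0069, misclass ~ 0.0016
```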
Samples of fluid from the surgical site were collected in the tray and examined under high and low resolution. At low resolution, the saline and the dark skin-preparation solution were transparent. At high resolution, the saline was transparent, but the dark skin-preparation solution generated some dark spots in the image. This may not be a problem, since the solution is not used with the instruments in the high-resolution field.

Computer time-performance comparisons were done for the VAX-11/750, IBM PC/AT and IBM PC/XT using the vision programs. The VAX-11/750 was 3 to 7 times faster than the IBM PC/AT and 10 to 20 times faster than the IBM PC/XT. The average execution times for the vision programs tested were 15.51s and 2.16s using the IBM PC/AT. Based on these performance results, the vision system developed in this thesis was capable of recognizing the arthroscopic surgical instruments with a clinically acceptable recognition accuracy. It was predicted that the algorithms developed would execute in real time, with an anticipated cycle time of 20-25 seconds for the overall vision system.

A complete system outline was given for an instrument-passing robot for use with the vision system. Recommendations for individual system components were given, and evaluations were done for the two primary vision-system components: the solid-state camera and the binary digitizer. Although the solid-state camera did not perform as well as a high-quality vidicon camera in the resolution test, it did produce comparable high- and low-resolution images that were used in the evaluation of the vision algorithms.

A very simple payback analysis was done based on the total system cost and the average salary of a scrub nurse. The payback time was calculated to be 1.6 years, assuming a 20-minute labour saving in 35 minutes of nursing time.
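The Appendix B arithmetic behind the 1.6-year figure can be written out directly; note that Appendix B rounds the 20/35 labour fraction to 57% before computing the weekly saving, which gives essentially the same result:

```python
def payback_years(system_cost, hourly_wage, saved_min, case_min,
                  hours_per_week=37.5, weeks_per_year=52):
    # Payback period = system cost / annual labour savings (Appendix B).
    fraction = saved_min / case_min              # share of nursing time replaced
    weekly_saving = hourly_wage * fraction * hours_per_week
    return system_cost / (weekly_saving * weeks_per_year)

years = payback_years(29_000, 16.34, saved_min=20, case_min=35)
# ~1.59 years; Appendix B's rounded 57% fraction gives 1.597, i.e. 1.6 years
```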
This payback period is within the guidelines for the introduction of industrial robotics; hence, assuming similar guidelines for the health care industry, the instrument-passing robot appears to be economically feasible for use in hospitals.

Several of the major problems associated with implementing a real-time vision system for clinical use have been discussed. Equipment recommendations for the instrument-passing robot and a simple payback analysis were presented. Based on these results, it can be concluded that the three secondary objectives of this thesis have been adequately met.

Recommendations for future work

The results of this thesis indicate that it is technically feasible to implement a surgical instrument-passing robot for arthroscopies. However, much work remains to be done before actual clinical implementation is possible. An extensive study should be done on operating-room (OR) activities during such surgical procedures to ensure that the presence of a robot and a vision system would not hinder the effectiveness of the operating-room staff. Issues concerning safety should also be investigated thoroughly, especially the transfer of instruments between the robot and the surgeon.

One area of improvement in the vision system is the structural-matching algorithm. The possibility of weighting the individual features of each structural feature should be studied to improve the matching accuracy. At present, the structural feature selection process is performed manually; however, the selection can be simplified or partially automated by a computer program which looks for similar features in reference inputs. Such a program would be desirable if the vision system is to be used commercially. As well, the structural-matching algorithm should be tested on the group of cartilage knives to reinforce the decision from the single width feature.
Finally, sample instrument trays using different moulding methods should be tested to find a pattern that can be used in the high-resolution field.

In the evaluation of the Oculus 100 digitizer, it was found that the digitizer did not produce images with sharpness comparable to that of the IP-512 gray-level digitizer. However, a more extensive software and hardware evaluation should be done to determine its actual performance.

By improving the reliability of the vision system and the overall robotic system, it may be possible to introduce a cost-effective instrument-passing robot into an operating room in the near future to reduce the cost of health care delivery.

References

Agrawala, A. & Kulkarni, A. (1977). "A Sequential Approach to the Extraction of Shape Features," Comp. Graph. & Im. Proc., Vol. 6, No. 6, pp. 538-557.

Ballard, D. & Brown, C. (1982). Computer Vision (Prentice-Hall, Englewood Cliffs, NJ), pp. 255-256.

Bennett, J. & MacDonald, J. (1975). "On the Measurement of Curvature in a Quantized Environment," IEEE Trans. Comp., Vol. C-24, No. 8, pp. 803-820.

Bribiesca, E. & Guzman, A. (1980). "How to Describe Pure Form and How to Measure Differences in Shapes Using Shape Numbers," Patt. Recog., Vol. 12, pp. 101-112.

Cantella, M. (1971). "The High-Resolution Return-Beam Vidicon with Electrical Input," Photoelectric Imaging Devices, Vol. 2, L. Biberman and S. Nudelman, Eds. (Plenum Press, NY), pp. 439-451.

Capson, D. (1984). "An Improved Algorithm for the Sequential Extraction of Boundaries from a Raster Scan," Comp. Vision, Graph. & Im. Proc., Vol. 28, No. 1, pp. 109-125.

Carlisle, B. et al. (1981). "The PUMA/VS 100 Robot Vision System," Proc. of the 1st Int'l Conf. on Robot Vision and Sensory Controls, pp. 128-140.

Cope, A. et al. (1971). "The Television Tube as a System Component," Photoelectric Imaging Devices, Vol. 2, L. Biberman and S. Nudelman, Eds. (Plenum Press, NY), pp. 15-51.

Dessimoz, J. (1978).
"Visual Identification and Location in a Multi-object Environment by Contour Tracking and Curvature Description," Proc. 8th Int'l Symp. on Ind. Robots, pp. 764-777.

Duda, R. & Hart, P. (1977). Pattern Classification and Scene Analysis (John Wiley & Sons, Toronto, Ont.), Chpt. 7, pp. 164-168.

Fengler, J. and Spadinger, I. (1984). "Development of a Robot Gripper for Handling Surgical Instruments," APSC 459 report, Dept. of Physics, Univ. of B.C. (unpublished).

Flanagan, J. (1982). "Talking with Computers: Synthesis and Recognition of Speech by Machine," IEEE Trans. Biomed. Eng., Vol. BME-29, No. 4, pp. 223-232.

Flory, R. (1985). "Image Acquisition Technology," Proc. IEEE, Vol. 73, No. 4, pp. 613-637.

Frank, S. (1985). "CCD Imager and Camera Project Data Sheet," Texas Instruments Inc. sales brochure.

Freeman, H. (1961). "Techniques for the Digital Computer Analysis of Chain Encoded Arbitrary Plane Curves," Proc. Nat. Elec. Conf., Vol. 17, pp. 421-433.

Fu, K. (1982). Applications of Pattern Recognition, K. Fu, Ed. (CRC Press, Boca Raton, FL), Chpt. 1, pp. 2-13.

Geisler, W. (1982). "A Vision System for Shape and Position Recognition of Industrial Parts," Proc. 2nd Int'l Conf. on Robot Vision and Sensory Controls, pp. 253-262.

Gleason, G. & Wilson, D. (1981). "A Vision Controlled Industrial Robot System," IEEE Ind. Appl. Soc. Conf. Rec., pp. 381-388.

Glick, N. (1978). "Additive Estimators for Probabilities of Correct Classification," Patt. Recog., Vol. 10, pp. 211-222.

Gonzalez, R. & Safabakhsh, R. (1982). "Computer Vision Techniques for Industrial Applications and Robot Control," Computer, Vol. 15, No. 12, pp. 17-32.

Grant, G. & Reid, A. (1981). "An Efficient Algorithm for Boundary Tracing and Feature Extraction," Comp. Graph. & Im. Proc., Vol. 17, No. 3, pp. 225-237.

Hafemeister, D. et al. (1985). "The Verification of Compliance with Arms-Control Agreements," Sci. Am., Vol. 252, No. 3, pp. 38-45.

Helsingius, P. & Zoeller, S. (1985).
"IVS-100 Software: A Comprehensive Machine-Vision Library Facilitates Application Programming," Analog Dialogue, Vol. 18, No. 3, pp. 8-9.

Hung, S. & Kasvand, T. (1983). "Critical Points on a Perfectly 8- or 6-connected Thin Binary Line," Patt. Recog., Vol. 16, No. 3, pp. 297-306.

Isozaki, Y. (1978). "The 2-In Return Beam Saticon: A High-Resolution Camera Tube," SMPTE Journal, Vol. 87, No. 8, pp. 489-493.

Isozaki, Y. et al. (1981). "1-Inch Saticon for High-Definition Colour Television Cameras," IEEE Trans. on Elec. Dev., Vol. ED-28, No. 12, pp. 1500-1507.

Kanal, L. (1974). "Patterns in Pattern Recognition: 1968-1974," IEEE Trans. on Info. Theory, Vol. IT-20, No. 6, pp. 697-722.

Kelley, R. (1983). "Binary and Gray Scale Robot Vision," Robot and Robot Sensing Systems, Proc. of the SPIE, Vol. 442, D. Casasent and E. Hall, Eds., pp. 27-37.

Kwok, Y.S. et al. (1985). "A New Computerized Tomographic-Aided Robotic Stereotaxis System," Robotics Age, Vol. 7, No. 6, pp. 17-22.

Levine, M. (1969). "Feature Extraction: A Survey," Proc. of the IEEE, Vol. 57, No. 8, pp. 1391-1407.

Leifer, L. (1981). "Rehabilitative Robots," Robotics Age, Vol. 3, No. 3, pp. 4-15.

McEwen, J. (1984). "Medical and Surgical Robotics," Proc. 10th Can. Med. and Biol. Eng. Conf., pp. 11-12.

Neuhauser, R. (1979). "Measuring Camera-Tube Resolution with the RCA P200 Test Chart," Application Note ST-6812, RCA Solid State Division, pp. 1-6.

Pavlidis, T. (1980). "Algorithms for Shape Analysis of Contours and Waveforms," IEEE Trans. Patt. Anal. & Mach. Intell., Vol. PAMI-2, No. 4, pp. 301-312.

Pavlidis, T. (1977). Structural Pattern Recognition (Springer-Verlag, New York, NY), Chpt. 7, pp. 164-168.

Pavlidis, T. (1979). "The Use of a Syntactic Shape Analyzer for Contour Matching," IEEE Trans. on Patt. Anal. & Mach. Intell., Vol. PAMI-1, No. 3, pp. 307-310.

Persoon, E. and Fu, K. (1977). "Shape Discrimination Using Fourier Descriptors," IEEE Trans. Sys. Man. Cyber., Vol. SMC-7, No. 3, pp. 170-179.

RCA Corp.
(1974). Electro-Optics Handbook, RCA Corp., Harrison, NJ, pp. 121-124.

Rosenfeld, A. (1981). "Image Pattern Recognition," Proc. of the IEEE, Vol. 69, No. 5, pp. 596-605.

Rosenfeld, A. and Weszka, J. (1975). "An Improved Method of Angle Detection on Digital Curves," IEEE Trans. Comp., Vol. C-24, No. 9, pp. 940-941.

Rummel, P. and Beutel, W. (1984). "Workpiece Recognition and Inspection by a Model-Based Scene Analysis System," Patt. Recog., Vol. 17, No. 1, pp. 141-148.

Sarvarayudu, G. and Sethi, I. (1983). "Walsh Descriptors for Polygonal Curves," Patt. Recog., Vol. 16, No. 3, pp. 327-336.

Schroeder, H. (1984). "Practical Illumination Concept and Technique for Machine Vision Applications," Proc. Robots 8, pp. 14/27-14/43.

Thring, M. (1983). Robots and Telechirs (Ellis Horwood Ltd., West Sussex, England).

Toussaint, G. (1974). "Bibliography on Estimation of Misclassification," IEEE Trans. on Info. Theory, Vol. IT-20, No. 4, pp. 472-479.

Toussaint, G. & Sharpe, P. (1975). "An Efficient Method for Estimating the Probability of Misclassification Applied to a Problem in Medical Diagnosis," Comput. Biol. Med., Vol. 4, pp. 269-278.

Weszka, J. (1978). "A Survey of Threshold Selection Techniques," Comp. Graph. & Im. Proc., Vol. 7, No. 2, pp. 259-265.

Wong, R. and Hall, E. (1978). "Scene Matching with Invariant Moments," Comp. Graph. & Im. Proc., Vol. 8, No. 1, pp. 6-24.

Appendix A
Questionnaire on Clinical Requirements

Biomedical Engineering Department Computer Vision Survey

Your initials (optional):

What is your experience as a circulating nurse (years):

What is your experience as a scrub nurse (years):

Does your surgical nursing experience cover a wide variety of procedures (Y/N):

If the response is 'No' above, what specific procedures are you experienced in:
1. On the average, what percentage of the scrub nurse's activities in the OR is spent where her primary task is to pass surgical instruments to the surgeon (percent)?
   Response: 75% (for a typical arthroscopy - 1 hour)

2. If an instrument-passing robot is to be used in the OR, would assistance from the surgical nursing personnel be available:

   during setup:
     to drape robot (Y/N): Yes (but uneconomical to have scrub nurse present)
     to unpack and lay out instruments (Y/N): Yes
     to attach sterile components (Y/N):
     to select instruments from a menu on a computer display (Y/N): Yes (by circulating nurse)

   during surgery, to perform minor activities:
     to press a few selected keys once in a while (Y/N): Probably (if necessary)
     to remove loose tissue fragments from instruments or instrument tray if requested by computer system (Y/N): Yes
     to take other remedial action requested by computer system (Y/N): Yes
     other assistance:

   post surgery:
     to initiate shut down of computer system (Y/N): Routine
     to remove instruments (Y/N): Yes
     to undrape robot (Y/N): Yes

3. If minor problems arose during surgery, would it, in your opinion, be possible for the surgeon to take minor corrective actions requested by the computer system (Y/N)?
   Response: Generally NO! Do not want surgeon to lose visual contact with surgical site.
Questions on expected level of performance from robot:

1. Approx. what percentage of time does the scrub nurse pass the wrong instrument to the surgeon, including incorrect requests from the surgeon (percent)?
   Response: 1 or 2 per case (less than 1%)

   What percentage of the above is due to the surgeon's error?
   Response: (less than 1%)

2. In view of the above, what is an acceptable percentage of time for a robot to pass the wrong instrument to the surgeon, including incorrect requests from the surgeon:
     without nursing assistance (%):
     with nursing assistance (%):
   Response: 1 or 2 per case

3. What, in your opinion, is the maximum allowable number of occurrences in which the robot passes an incorrect instrument to the surgeon before the surgeon becomes irritated (0 or number)?
   Response: 1 or 2 per case (not 3 or 4), depending on surgeon

   Is this rate of error acceptable to most surgeons (please explain)?
   Response: No (but 1 or 2 errors may be tolerated in arthroscopy) (average of 20 - 50 passes per arthroscopy)

   Would greater errors be tolerated? If yes, under what circumstances?

4. During a procedure, is there usually a circulating nurse available with sufficient time and skill to respond to an alarm by entering a few key strokes to correct any mistakes?
   Response: Yes, circulating nurse only.

5. If the computer becomes confused and requires assistance in verifying one or two instruments from the circulating nurse during surgery, how often would this be acceptable (never, or no more than once every ? min.)?
   Response: Once every 20-30 minutes.

   If an acceptable level has been given above, at what level would it become intolerable or unacceptable?
   Response: Once every 5 to 10 minutes.
6. Would it take more, the same, or less time or skill for a scrub nurse:

   to lay out the instruments according to a computer-generated map?
   Response: More time initially, but may be less as nurse gains experience.

   to randomly place the instruments in suitably sized compartments?
   Response: Less time.

   to position preloaded trays?

7. In the event that nursing assistance is requested, would it require more, the same, or less effort if the instruments have been placed in an order specified by the nursing personnel?
   Response: Same

   a computer-generated map?
   Response: Same

Some general questions on arthroscopy instrument usage:

1. What is your experience in arthroscopy procedures (approx.): 0 to 10, 11 to 30, or over 30 procedures?

2. Does the surgeon normally use all or a percentage of the instruments (range)?
   Response: 20%

3. Approximately how many instruments does the surgeon use at one time (range)?
   Response: 2 - 3 instruments

4. Approx. how many instruments does the surgeon keep near him/her during the procedure (range)?

5. Approx. what percentage of time can the scrub nurse anticipate the needs of the surgeon (percent range)?
   Response: 75% with TV, 55% without TV

6. Are there any instrument(s) (or set of instruments) that are difficult to identify (Y/N, give names if possible)?
   Response: Acufex, 3.8 mm or 5.0 mm trocars

   Are the difficulties due to unfamiliarity with instruments, similarity with other instruments, or other reasons?
   Response: Unfamiliarity and similarity
7. Assuming that the scrub nurse could not anticipate the surgeon's needs, what is the average time from the surgeon's request to placing the requested instrument in the surgeon's hand:

   for easily identifiable instruments (seconds)?
   Response: 1 s

   for the more difficult instruments (seconds)?
   Response: 20 to 30 s if assembly of parts is required; a visual and operational check is required prior to handing off the instrument.

8. Does the scrub nurse do anything differently to help identify difficult instruments?
   Response: During setup, partition or strategically place instruments on the stand.

9. What is the maximum delay in passing the instrument without jeopardizing the patient's treatment or irritating the surgeon (seconds)?
   Response: Not significant in terms of patient's treatment; 2-20 s depending on surgeon.

10. How often does the surgeon ask for multiple instruments (percent)?
    Response: Usually asks scrub nurse to give or have ready.

11. Do you know of any time performance studies or related work for instrument passing between scrub nurse and the surgeon (journal and date, or name of text)?
    Response: Nil.

Any other comments or suggestions?
   Response: In arthroscopies at SDCC, 25% of cases may have a resident in addition to surgeon (up to 80% for other areas at VGH).

Appendix B
Payback Period Calculation

robotic system cost is $29,000
present salary of scrub nurse is $16.34 per hr.
assume 20 minutes of a 35-minute case time may be replaced by the robot.
labour savings = 20 / 35 = 57%

for a 37.5-hour work week, the savings per week is

    savings per week = $16.34 x 57% x 37.5 = $349.27

    payback period = system cost / savings per year
                   = $29,000 / ($349.27 x 52)
                   = 1.597 years = 1.6 years

Appendix C
Program Pseudo-code

Cartilage Knife Test

    program Knife
      read image data
      read reference data
      for each knife compartment
      begin
        get start position and scan direction
        scan every 4th line until end
        record edge locations
        obtain edge locations for
          1. 7th line from end
          2. 12th line from end
        use edge locations to calculate width
        classify knife
      end
      stop

Obturator Test

    program Obturator
      read image data
      read reference data
      for each obturator compartment
      begin
        get start position and scan direction
        scan every 4th line until end
        record edge locations
        obtain edge locations for
          1. 7th line from end
          2. 12th line from end
        use edge locations to calculate width
        obtain edge location for
          1. 8th line from end
        trace around boundary starting at 8th line
        calculate angles between 5 pixels forward and 5 pixels backward
        find minimum angle
        classify obturator
      end
      stop

Acufex Pose and Length Tests

    program Acufex Pose
      read image
      for each compartment
        scan line and record all edge locations
        repeat every 4th line until tip
      end
      if tip is less than 3 pixels wide, then tip found
      obtain edge locations for
        1. 7th line from 1st scanned line
        2. 12th line from 1st scanned line
      calculate handle angles on either side
      obtain edge locations for
        1. 7th tip location
        2. 12th tip location
      calculate angle at tip
      calculate pose from angles
      obtain first and last scanned lines
      calculate length of Acufex instrument
      end
      stop

Low Resolution Test

    program LRTEST
      read image
      read references
      for each compartment
      begin
        scan line and record all edge locations
        repeat every 4th line until end
        calculate length
        calculate width
        classify instrument
      end
      stop

Acufex Width Test

    program Acufex Width
      read image data
      for each Acufex compartment
      begin
        get start position and scan direction
        scan every 4th line until 30 lines scanned
        calculate width using trigonometric approx.
      end
      stop

Acufex Instruments Test

    program Acufex Test
      read image data
      read reference data
      for each Acufex compartment
      begin
        get start position and scan directions
        trace around boundary
        calculate angles between 5 pixels forward and 5 pixels backward
        detect peaks in angles
        calculate BMD features
        test BMD discriminant function
        check length of Acufex instrument
        for each possible match
        begin
          match reference features to object features
            until all reference features are matched
            or all object features have been tried
          calculate score
        end
        if no match, then
          move first or last peak
          try match again starting with BMD features
        end
      end
      stop
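A minimal Python rendering of the sequential matching loop in the Acufex test above. Corners are reduced here to single angle values and the tolerance is an assumed illustrative parameter; the thesis matches tuples of angles and distances, and scores both a maximally-correct and a minimally-incorrect variant:

```python
def sequential_match(reference, observed, tol):
    # Greedy in-order matching: walk the observed corners once, consuming a
    # corner when it matches the current reference feature within tolerance
    # ("until all reference features are matched or all object features
    # have been tried").
    matched = 0
    j = 0
    for ref in reference:
        while j < len(observed):
            obs = observed[j]
            j += 1
            if abs(obs - ref) <= tol:
                matched += 1
                break
    return matched

def match_score(reference, observed, tol):
    # Fraction of reference corners accounted for by the observed boundary.
    return sequential_match(reference, observed, tol) / len(reference)
```

For example, reference angles [90, 150, 60] against observed [88, 152, 120, 61] match fully at a tolerance of 5 degrees, while an observed list [88, 40] accounts for only the first reference corner.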

