Open Collections

UBC Theses and Dissertations

Design of a self-paced brain computer interface system using features extracted from three neurological.. 2008

Full Text

DESIGN OF A SELF-PACED BRAIN COMPUTER INTERFACE SYSTEM USING FEATURES EXTRACTED FROM THREE NEUROLOGICAL PHENOMENA

by

Mehrdad Fatourechi

B.Sc., University of Tehran, 1998
M.Sc., University of Tehran, 2001

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

The Faculty of Graduate Studies

(Electrical and Computer Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA

January 2008

© Mehrdad Fatourechi, 2008

ABSTRACT

Self-paced brain computer interface (SBCI) systems allow individuals with motor disabilities to use their brain signals to control devices whenever they wish. These systems must identify the user’s “intentional control (IC)” commands and must remain inactive during all periods in which the user does not intend control (called “no control (NC)” periods). This dissertation addresses three issues related to the design of SBCI systems: 1) their presently high false positive (FP) rates, 2) the presence of artifacts and 3) the identification of a suitable evaluation metric.

To improve the performance of SBCI systems, the following are proposed: 1) a method for the automatic user-customization of a 2-state SBCI system; 2) a two-stage feature reduction method for selecting wavelet coefficients extracted from movement-related potentials (MRPs); 3) an SBCI system that classifies features extracted from three neurological phenomena: MRPs, changes in the power of Mu rhythms and changes in the power of Beta rhythms; 4) a novel method that effectively combines the methods developed in 2) and 3); and 5) a generalization of the system developed in 3) from detecting a right index finger flexion to detecting a right hand extension.

Results of these studies using actual movements show an average true positive (TP) rate of 56.2% at an FP rate of 0.14% for the finger flexion study and an average TP rate of 33.4% at an FP rate of 0.12% for the hand extension study.
These FP rates are significantly lower than those achieved in other SBCI systems, where FP rates vary between 1% and 10%. We also conduct a comprehensive survey of the BCI literature and demonstrate that many BCI papers do not properly deal with artifacts. We show that the proposed BCI achieves a good performance of TP = 51.8% and FP = 0.4% in the presence of eye movement artifacts. Further tests of the proposed system in a pseudo-online environment show an average TP rate of 48.8% at an FP rate of 0.8%. Finally, we propose a framework for choosing a suitable evaluation metric for SBCI systems. This framework shows that the Kappa coefficient is more suitable than other metrics for evaluating performance during the model selection procedure.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
Co-authorship Statement

Chapter 1 Introduction and background
1.1 Introduction and motivation
1.1.1 High false positive rates (FPR)
1.1.2 Presence of artifacts
1.1.3 Evaluation metrics
1.2 Functional model of a brain computer interface system
1.3 Background
1.3.1 Signal recording
1.3.2 Choice of neurological phenomenon
1.3.3 Timing of BCI control
1.4 Design of self-paced BCI systems
1.5 Use of multiple neurological phenomena in BCI systems
1.5.1 Simultaneous application of MRPs and changes in the power of Mu/Beta rhythms
1.5.2 Using multiple neurological phenomena in BCI systems
1.6 Artifacts in BCI systems
1.6.1 Artifact avoidance
1.6.2 Artifact rejection
1.6.3 Artifact removal
1.7 Evaluating the performance of SBCI systems
1.8 Thesis contributions
1.8.1 Reducing high false positive rates
1.8.2 Addressing artifacts in SBCI systems
1.8.3 Finding a suitable evaluation metric for SBCI systems
1.9 Organization of the thesis
1.10 References

Chapter 2 Automatic user customization for improving the performance of a self-paced brain computer interface system
2.1 Introduction
2.2 Background
2.3 Problem statement
2.4 Methods
2.5 Experimental results
2.6 Discussion and conclusions
2.7 Acknowledgements
2.8 References

Chapter 3 Application of a hybrid wavelet feature selection method in the design of a self-paced brain computer interface system
3.1 Background
3.2 Data collection
3.3 Method
3.4 Results
3.5 Discussion and conclusions
3.6 Acknowledgements
3.7 References

Chapter 4 A self-paced brain computer interface system that uses movement related potentials and changes in the power of brain rhythms
4.1 Introduction
4.2 Background
4.2.1 Neurological phenomenon background
4.2.2 Multiple neurological phenomena in BCI systems
4.3 Data collection
4.4 Methods
4.4.1 Feature extraction
4.4.2 Feature classifier
4.4.3 Feature selection
4.4.4 Performance evaluation
4.5 Results
4.6 Discussion
4.6.1 Observations on the BCI designs based on a single neurological phenomenon
4.6.2 Observations on Study 1
4.6.3 Observations on Study 2
4.6.4 Statistical analysis
4.7 Acknowledgements
4.8 References

Chapter 5 A self-paced brain computer interface system with a low false positive rate
5.1 Introduction
5.2 Methods
5.2.1 Feature extraction
5.2.2 Feature classification
5.2.3 Hybrid genetic algorithm (HGA)
5.3 Experimental results
5.3.1 Data collection and evaluation
5.3.2 Results
5.4 Discussion and conclusions
5.5 Acknowledgements
5.6 References

Chapter 6 EMG and EOG artifacts in brain computer interface systems: a survey
6.1 Introduction
6.2 Current neurological phenomena and associated artifacts
6.2.1 Current neurological phenomena
6.2.2 Artifacts in BCI systems
6.3 Methods of handling artifacts
6.3.1 Artifact avoidance
6.3.2 Artifact rejection
6.3.3 Artifact removal
6.4 Literature survey
6.4.1 EOG artifacts
6.4.2 EMG artifacts
6.5 Discussion and conclusions
6.6 References

Chapter 7 Performance of a self-paced brain computer interface on data contaminated with eye blinks and on data recorded in subsequent sessions
7.1 Introduction
7.2 Methods
7.2.1 Self-paced brain computer interface design
7.2.2 Data collection
7.2.3 Evaluation
7.3 Results
7.3.1 Analysis of SBCI performance on artifact-contaminated data
7.3.2 Test on data recorded in subsequent sessions
7.3.3 The effect of adding a debounce component
7.4 Discussion
7.5 Acknowledgements
7.6 References

Chapter 8 Selection of a suitable evaluation metric for a self-paced brain computer interface system
8.1 Introduction
8.2 Problem statement
8.3 A framework for comparing evaluation metrics
8.3.1 Suitability of an evaluation metric
8.3.2 Guidelines for comparing two evaluation metrics
8.3.3 Degree of consistency (DoC)
8.3.4 The Degree of discriminancy (DoD)
8.3.5 Comparison of two evaluation metrics
8.3.6 Using sub-sampling grids for calculating the comparison measures
8.4 Selected evaluation metrics in SBCIs
8.4.1 Overall accuracy (OA)
8.4.2 Information transfer rate (mutual information)
8.4.3 Kappa
8.4.4 HF-difference
8.4.5 TPR/FPR ratio
8.4.6 ROC curve and related metrics
8.5 Simulations
8.5.1 Application
8.5.2 Results
8.6 Discussion and conclusions
8.7 References

Chapter 9 New studies on the design of a 2-state self-paced brain computer interface system with a low false activation rate
9.1 Introduction
9.2 Experimental paradigm
9.2.1 Data recording
9.2.2 Artifact monitoring
9.3 System design methods
9.3.1 Generating the IC and NC data
9.3.2 Feature extraction
9.3.3 Feature classification
9.3.4 Multiple classifier system
9.3.5 Calculating the TPs and FPs
9.3.6 Metric selection for model evaluation
9.3.7 Model selection
9.3.8 Evaluation
9.3.9 Using ROC curves for summarizing the performance on test sets
9.4 Results
9.4.1 Choosing the evaluation metric for model selection
9.4.2 Performance of the system
9.5 Discussion and future work
9.5.1 Discussion
9.5.2 Future works
9.6 Acknowledgements
9.7 References

Chapter 10 Summary and conclusions
10.1 Summary
10.1.1 Chapter 2: Improving the performance of LF-ASD by automatic user-customization
10.1.2 Chapter 3: Using DWT to extract features
10.1.3 Chapter 4: Using three neurological phenomena as the source of control
10.1.4 Chapter 5: Design of an automated SBCI system with low FP rates
10.1.5 Chapter 6: Analysis of the effect of artifacts in BCI systems
10.1.6 Chapter 7: Analysis of the performance of the proposed SBCI on artifact-contaminated data
10.1.7 Chapter 8: A framework for evaluating the performance of SBCI systems
10.1.8 Chapter 9: Applying the proposed SBCI with hand extension data
10.2 Summary of contributions
10.2.1 Reducing high false positive rates
10.2.2 Addressing artifacts in SBCI systems
10.2.3 Finding a suitable evaluation metric for SBCI systems
10.3 Future research directions
10.4 References

Appendix A - UBC Research Ethics Board Certificate
Appendix B - Theoretical analysis of the proposed SBCI
B.1. Formulating the problem
B.2. Constraints
B.3. Objective functions
B.4. Results
B.5. References

LIST OF TABLES

Table 1-1. Comparison of the TPR and FPR rates achieved in different SBCI studies.
Table 2-1. The confusion matrix for a 2-state self-paced BCI system.
Table 2-2. Comparison of the fitness value of the initial and final populations (tested on the validation sets).
Table 2-3. TP rates of the LF-ASD and the ALF-ASD (FP=2%).
Table 2-4. Delay parameter values used in the design of the LF-ASD based on the ensemble averages of the MRP patterns in the training data set. Note that βi and βj are set to zero and that the same delay parameter values are used for the rest of the bipolar channels. The table is reproduced from [15].
Table 3-1. The confusion matrix for a 2-state self-paced BCI system.
Table 3-2. Comparison of the average TP rates, average FP rates, average TPR/FPR and the average number of features.
Table 3-3. The average number of selected features per channel after applying the hybrid feature selection algorithm.
Table 4-1. The average TP and FP rates (%) for Study 1 (the numbers in parentheses show the standard deviation).
Table 4-2. The average TP and FP rates (%) for each neurological phenomenon in Study 2 (the numbers in parentheses show the standard deviation).
Table 4-3. The average TP and FP rates (%) for User AB1 in Study 2 (the numbers in parentheses show the standard deviation).
Table 4-4. The average TP and FP rates (%) for User AB2 in Study 2 (the numbers in parentheses show the standard deviation).
Table 4-5. The average TP and FP rates (%) for User AB3 in Study 2 (the numbers in parentheses show the standard deviation).
Table 4-6. The average TP and FP rates (%) for User AB4 in Study 2 (the numbers in parentheses show the standard deviation).
Table 5-1. The time schedule of recording the data.
Table 5-2. The performance results for the proposed SBCI system.
Table 5-3. Comparison of the performance results.
Table 6-1. Methods of handling artifacts in BCI literature.
Table 6-2. Methods of automatic EOG rejection in BCI studies.
Table 6-3. Methods of automatic EMG rejection in BCI studies.
Table 6-4. Methods of automatic EOG removal in BCI studies.
Table 6-5. Methods of automatic EMG removal in BCI studies.
Table 7-1. The time schedule of recording the data. For each participant, Day 1 is the first day that the participant attended the experiments. The remaining days are numbered with respect to Day 1 of that particular participant.
Table 7-2. Comparison of the average test results on artifact-contaminated and non-contaminated data. The averages are calculated over 5 outer validation sets. The numbers in the parentheses indicate standard deviations.
Table 7-3. Comparison of the average results using data recorded in the first five sessions with those using data recorded in subsequent sessions. The averages are calculated over 5 outer validation sets. The numbers in the parentheses indicate standard deviations.
Table 8-1. DoS results for the evaluation metrics studied in this paper. Res1 stands for the finest resolution, Mean stands for the average of 10 resolution values and Res10 stands for the coarsest resolution.
Table 8-2. The DoC results for the evaluation metrics studied in this paper (the first three comparisons). Res1, Mean and Res10 stand for the finest resolution, the average of 10 resolution values and the coarsest resolution, respectively.
Table 8-3. The DoC results for the evaluation metrics studied in this paper (the last three comparisons). Res1, Mean and Res10 stand for the finest resolution, the average of 10 resolution values and the coarsest resolution, respectively.
Table 8-4. The DoD results for the evaluation metrics studied in this paper (the first three comparisons). Res1, Mean and Res10 stand for the finest resolution, the average of 10 resolution values and the coarsest resolution, respectively.
Table 8-5. The DoD results for the evaluation metrics studied in this paper (the last three comparisons). Res1, Mean and Res10 stand for the finest resolution, the average of 10 resolution values and the coarsest resolution, respectively.
Table 8-6. The effect of weights on average values of DoC.
Table 8-7. The effect of weights on average values of DoD.
Table 9-1. Comparison of the TP rates of monopolar and bipolar montages for different false activation rates. The numbers in parentheses show the standard deviations.
Table 9-2. Comparison of the TPR and FAR rates achieved in different SBCI studies.

LIST OF FIGURES

Figure 1-1. A BCI system allows users to control a device using their brain signals only.
Figure 1-2. A typical SBCI system that identifies an IC command related to the execution of right finger flexion.
Figure 1-3. High false positive rates can significantly impact the performance of an SBCI system, even if the TP rates are high. (a) Brain states of a user; (b) The output of the SBCI system.
Figure 1-4. Functional model of a BCI system. Note the control display is optional.
Figure 1-5. Two examples of neurological phenomena. (a) Changes in the power of Beta rhythms over time; (b) A movement-related potential. Vertical line shows the time of activation of the movement. Note that these shapes are generated by averaging over many epochs.
Figure 1-6. Synchronized vs. self-paced control. (a) In a synchronized BCI system, control can be done only in certain intervals specified by the system; (b) In a self-paced BCI system, the control is done at the user’s own pace.
Figure 1-7. An example of how artifacts can affect the performance of an SBCI system. (a) The brain state of the user; (b) The periods when artifacts have occurred; (c) The output of the SBCI system (note: FP: false positive, TN: true negative, FN: false negative and TP: true positive).
Figure 1-8. Types of evaluation metrics used in synchronized and self-paced BCI systems.
Figure 1-9. The overall schematic of the SBCI system developed and studied in Chapters 4, 5, 7, and 9.
Figure 1-10. Outline of the thesis.
Figure 2-1. Components of the LF-ASD system (from [32]).
Figure 2-2. Points selected by the feature generator when applied to a sample bipolar EEG signal.
Figure 2-3. The fitness of the best chromosomes as a function of the generation number for two representative individuals. (a) AB2; (b) SCI4.
Figure 3-1. The overall structure of the proposed hybrid method for extracting MRP features.
Figure 3-2. Spatial distribution of the average number of selected features for AB1.
Figure 3-3. Spatial distribution of the average number of selected features for AB2.
Figure 3-4. Spatial distribution of the average number of selected features for AB3.
Figure 3-5. Spatial distribution of the average number of selected features for AB4.
Figure 3-6. Comparison of the fitness of the best chromosome vs. other subset of features.
Figure 4-1. Functional model of a BCI system (adapted from [1]).
Figure 4-2. Synchronized vs. SBCI systems. (a) In a synchronized BCI system, control is only possible during System Ready periods; (b) In an SBCI system, the system continuously accepts the input signals.
Figure 4-3. NC periods are generated by shifting a window over NC datasets.
Figure 4-4. The overall structure of the SBCI system implemented in Study 1.
Figure 4-5. The overall structure of the two-stage MCS implemented in Study 2.
Figure 4-6. The process of generating templates.
Figure 4-7. The spatial distribution of the selected features for individuals in Study 2. (a) User AB1; (b) User AB2; (c) User AB3; and (d) User AB4.
Figure 5-1. The overall structure of the improved SBCI incorporating three neurological phenomena.
Figure 5-2. An example of how features are extracted using the proposed cross-covariance method.
Figure 5-3. (a) The structure of a chromosome; (b) Representation of the parameter values for each SVM in a chromosome.
Figure 5-4. Method of calculating the TP rate; (a) EEG Signal; (b) Output of the finger switch; (c) Output of the SBCI.
Figure 6-1. The functional model of a BCI system depicting its principal functional components.
Figure 6-2. The number of papers published on different methods of handling EOG artifacts in BCI studies.
Figure 6-3. The number of papers published on different methods of handling EMG artifacts in BCI studies.
Figure 7-1. The overall structure of the improved SBCI.
Figure 7-2. An example of extracting the maximum of the cross-correlogram using the proposed cross-covariance method.
Figure 7-3. Method of calculating the TP rate; (a) EEG Signal; (b) Output of the hand switch; (c) Output of the SBCI.
Figure 7-4. The SBCI output during periods when finger movements were executed for (a) Participant AB1; (b) Participant AB2; and (c) The output of the SBCI during NC sessions when movements did not occur for Participant AB1.
Figure 7-5. The operation of a debounce component.
Figure 7-6. The TP rate, FP rate and the TPR/FPR ratio as a function of the length of the debounce window for (a) Participant AB1; (b) Participant AB2; (c) Participant AB3; (d) Participant AB4; (e) Averages of all four participants.
Figure 7-7. 
The output of the SBCI during periods when finger movements were executed for participant AB4; (b) the output of the SBCI during NC sessions when movements did not occur for participant AB4. ............................................................................................229 Figure 8-1. a) An example of a confusion matrix for a balanced dataset; b) An example of a confusion matrix for an imbalanced dataset; c) A second example of a confusion matrix for an imbalanced dataset. ..........................................................................................................236 Figure 8-2. A sample fitness landscape for a classification problem with two classes. ...................................................................................................242 Figure 8-3. An example of dividing the (TPR,FPR) domain into regions. Different movements on the (TPR, FPR) space may be associated with different weights. Note that the numbers on each axis denote (%). ............................................................................................244 Figure 8-4. Two examples of more complex break-down of the (TPR, FPR) domain with more complex partition and weighting schemes. ........................................................................................................245 Figure 8-5. An example of using grids on the (TPR, FPR) domain. .....................253   xiv Figure 8-6. Consistency between different resolutions in reaching the same conclusion for the cases studied here. The chart attributes to DoD results. ..............................................................................................271 Figure 9-1. The overall structure of the SBCI (from [17]). The dashed lines show the parts of the system whose values are determined by the hybrid genetic algorithm (HGA). ..........................................................284 Figure 9-2. An example of how features are extracted using the proposed cross-covariance method. 
...........................................................................286 Figure 9-3. Method of calculating the TP rate; (a) EEG Signal; (b) Output of the finger switch; (c) Output of the SBCI. .............................................289 Figure 9-4. An example of dividing the (TPR,FPR) domain into regions. Different movements on the (TPR, FPR) space may be associated with different weights. Note that the numbers on each axis denote (%). ............................................................................................291 Figure 9-5. ROC plots for (a) AB1; (b)AB2;(c)AB3;(d) AB4;(e)AB5. .....................298 Figure 9-6. Average MRPs for Channel C1 over two sessions. (a) participant AB1; (b)participant AB2 ; (c) participant AB3; (d) participant AB4;. ...........................................................................................305     xv LIST OF ABBREVIATIONS 1-NN One nearest neighbor AB Able-bodied AEP Auditory evoked potentials ANC Activity of neural cells AR Auto-regressive AUC Area under the receiver operating characteristic curve BCI Brain computer interface BI Brain Interface BSS Blind source separation CAR Common average reference CBR Changes in the brain rhythms CPBR Changes in the power of Beta rhythms CPMR Changes in the power of Mu rhythms CT Cognitive task DoC Degree of consistency DoD Degree of discriminancy DoS Degree of suitability DROP Desired region of operation DWT Discrete wavelet transform ECG Electrocardiography ECoG Electro-corticogram EEG Electroencephalogram EMG Electromyogram ENT Energy normalization transform EOG Electrooculgram ERD Event-related desynchronization ERP Event-related potential ERS Event-related synchronization FAR False activation rate FDR False discovery rate FIR Finite impulse response FP False positive FPR False positive rate GA Genetic algorithm HGA Hybrid genetic algorithm IC Intentional control ICA Independent component analysis ITR Information transfer rate k-NN k-nearest neighbor 
LF-ASD Low-frequency asynchronous switch design
MCS Multiple classifier system
MEG Magnetoencephalography
MI Mutual information
MN Multiple neurological phenomena
MRA Movement-related activity
MRP Movement-related potential
NC No control
NN Neural networks
OA Overall accuracy
OPM Outlier processing method
PCA Principal component analysis
RMS Root mean square
ROC Receiver operating characteristic
SBCI Self-paced brain computer interface
SCI Spinal cord injury
SNR Signal-to-noise ratio
SSEP Somatosensory evoked potential
SSVEP Steady-state visual evoked potential
STD Standard deviation
SVM Support vector machine
SWT Stationary wavelet transform
TEM Time of expected attempted movement
TP True positive
TPR True positive rate
VEP Visual evoked potential

ACKNOWLEDGEMENTS

This thesis is the result of nearly five years of research. I would like to express my sincere gratitude to all those who have supported me in completing my thesis.

First, I would like to thank my research supervisors, Prof. Rabab K. Ward and Dr. Gary E. Birch, for giving me the opportunity to work in their research group. I am greatly indebted to them for their guidance, support and encouragement throughout the course of my studies. I would also like to thank my committee members, Dr. Dave Michelson, Dr. Jane Wang, and Dr. Tim Salcudean, for investing their time to read this thesis and give me valuable feedback on it.

Next, I extend my thanks to the researchers in the brain interface laboratory of the Neil Squire Society and the Image and Signal Processing lab of UBC for their support and help. In particular, I would like to thank Dr. Lino Coria, Dr. Steven G. Mason, Dr. Jaimie Borisoff, Gordon Handford, Borna Noureddin, Xin Yi Yong, Angela Chuang, Qiang Tang, and Zicong Mai for their valuable technical comments and support on my work.
Most importantly, I would like to express my deepest gratitude to my wife, Nona, my parents, my sister, my family and my friends for their love, help and endless support. If it were not for their sincere encouragement, I would not have made it as far as I did.

This work was supported in part by NSERC under Grant 90278-06 and CIHR under Grant MOP-72711.

DEDICATION

This thesis is dedicated to my lovely wife, Nona,
Who has offered me unconditional love and support and has stood by me all along. She believed in me, when I did not and for that, I shall always remain grateful.

And it is also dedicated to my parents, Nasrin and Hassan,
Who have raised me to be the person I am today. They have been with me in every step. I hope I have made them proud.

And it is also dedicated to Shahnaz and Hossein,
Who have always supported me and encouraged me.

CO-AUTHORSHIP STATEMENT

1- Fatourechi, M., Bashashati, A., Birch, G.E. and Ward, R.K. “Automatic User Customization for Improving the Performance of an Asynchronous Brain Interface System”, Journal of Medical & Biological Engineering and Computing, Vol.44, No.12, Dec 2006, pp.1093-1104.
MF developed the method, analyzed the data, interpreted the results, wrote the manuscript and acted as the corresponding author. AB contributed to the study design, helped in shaping the manuscript, interpreting the results and evaluating the manuscript. RKW and GEB supervised the development of the work, helped in writing and evaluating the manuscript.

2- Fatourechi, M., Birch, G. E., and Ward, R. K., "Application of a Hybrid Wavelet Feature Selection Method in the Design of a Self-paced Brain Interface System", Journal of NeuroEngineering and Rehabilitation, Vol.4, No.1, Apr 2007.
MF developed the method, analyzed the data, interpreted the results, wrote the manuscript and acted as the corresponding author. RKW and GEB supervised the development of the work, helped in writing and evaluating the manuscript.

3- Fatourechi, M., Birch, G. E., and Ward, R. K., “A Self-paced Brain Interface System that Uses Movement Related Potentials and Changes in the Power of Brain Rhythms", Journal of Computational Neuroscience, Vol.23, No.1, Aug 2007, pp.21-37.
MF developed the method, analyzed the data, interpreted the results, wrote the manuscript and acted as the corresponding author. RKW and GEB supervised the development of the work, helped in writing and evaluating the manuscript.

4- Fatourechi, M., Birch, G. E., Ward, R. K., “A Self-paced Brain Interface System with a Low False Positive Rate”, Journal of Neural Engineering, vol.5, 2008, pp.9-23.
MF developed the method, analyzed the data, interpreted the results, wrote the manuscript and acted as the corresponding author. RKW and GEB supervised the development of the work, helped in writing and evaluating the manuscript.

5- Fatourechi, M., Bashashati, A., Ward, R. K., and Birch, G. E., "EOG and EMG Artifacts in Brain Interface Systems: a Survey", Clinical Neurophysiology, Vol.118, No.3, Mar 2007, pp.480-494 (Invited Paper).
MF reviewed the papers, created the tables, interpreted the results, wrote the manuscript and acted as the corresponding author. AB helped in writing the manuscript, interpreting the results and evaluating the manuscript. RKW and GEB supervised the development of the work, helped in writing and evaluating the manuscript.

6- Fatourechi, M., Birch, G.E., Ward, R.K., “Performance of a Self-paced Brain Computer Interface on Data Contaminated with Eye Blinks and on Data Recorded in Subsequent Sessions”, submitted.
MF developed the method, analyzed the data, interpreted the results, wrote the manuscript and acted as the corresponding author. RKW and GEB supervised the development of the work, helped in writing and evaluating the manuscript.

7- Fatourechi, M., Mason, S. G., Ward, R.K., and Birch, G.
E., “A New Framework for Comparing Metrics Used in Pattern Classification Problems with Large Test Samples”, submitted.
MF developed the method, analyzed the data, interpreted the results, wrote the manuscript and acted as the corresponding author. SGM contributed to the development of the initial concept, interpreting the results and evaluating the manuscript. RKW and GEB supervised the development of the work, helped in writing and evaluating the manuscript.

8- Fatourechi, M., Ward, R.K., and Birch, G. E., “New studies on the design of a 2-state self-paced brain computer interface system with a low false positive rate”, submitted.
MF developed the method, analyzed the data, interpreted the results, wrote the manuscript and acted as the corresponding author. RKW and GEB supervised the development of the work, helped in writing and evaluating the manuscript.

CHAPTER 1 INTRODUCTION AND BACKGROUND

1.1 Introduction and motivation

Many physiological disorders such as Amyotrophic Lateral Sclerosis (ALS) or injuries such as high-level spinal cord injury can disrupt the communication path between the brain and the body. People with severe motor disabilities may lose all voluntary muscle control, including eye movements. These people are forced to accept a reduced quality of life, resulting in dependence on caretakers and escalating social costs [1]. Most existing assistive technology devices cannot be used by these patients because the devices depend on motor activity from specific parts of the body. Alternative control paradigms for these individuals are thus desirable. Over the last two decades, brain-computer interface (BCI) technology has emerged as a new frontier in assistive technology, since it could provide an alternative communication channel between a user’s brain and the outside world [2] (see Figure 1-1 for a high-level block diagram of a BCI system).
Other terms that are also used in the literature for referring to a BCI system include: brain interface (BI), direct brain interface (DBI), and brain machine interface (BMI). A successful BCI design would enable people to control objects in their environment (such as a light switch in their room or television, wheelchairs, neural prostheses and computers) by thought only. This could be accomplished by measuring specific features of the user’s brain activity that relate to his/her intent to perform the control. This specific type of brain activity is termed a “neurological phenomenon”. As an example, when a particular movement such as right index finger flexion is performed, specific neurological phenomena that correspond to that movement are generated. The corresponding neurological phenomena are then translated into signals that are eventually used to control devices [3].

Figure 1-1. A BCI system allows users to control a device using their brain signals only.

Currently, two different approaches are pursued in the design of BCI systems: synchronized and self-paced [4]. In the synchronized approach, which is the traditional approach to the design of BCI systems, the user can only perform the control in certain time intervals that are specified by the system. While synchronized BCI systems can achieve high classification accuracy (>90%), their application is limited, because the user cannot perform the control at all times. Moreover, many of these systems assume that the user will exert an intentional control (IC) command during specified control periods. In other words, they do not consider periods during which the user does not wish to exert control (called no control, NC, periods). As a result, they may become unstable during NC periods [3]. To address these shortcomings of synchronized BCI systems, the concept of self-paced BCI (SBCI) has been proposed.
An SBCI system is constantly available to the user, since it is designed to distinguish IC patterns from NC periods. Figure 1-2 shows a typical example of a 2-state SBCI system that should recognize IC patterns generated as the result of right finger flexion from the NC states. The output of this SBCI system is “IC” when the system detects an IC command and is “NC” at all other times.

Figure 1-2. A typical SBCI system that identifies an IC command related to the execution of right finger flexion

The main aim of this thesis is to study the following three issues pertaining to SBCI systems:

1.1.1 High false positive rates (FPR)

The performance of SBCI systems is usually summarized by two measures: 1) the correct detection rate of IC commands (denoted as the true positive or TP rate), and 2) the amount of false activations during NC periods (the false positive or FP rate). At present, the performance of SBCI technology is not high enough for use in a practical setting. While these systems can achieve an arguably good detection rate of TP>50%, their FP rates remain too high for practical applications (e.g., a false positive occurs every few seconds [5, 6]). For example, it has been argued that for an SBCI system that makes a decision every 1/16th of a second, FP rates higher than 2% can cause excessive user frustration, since the SBCI system generates a false positive every 6.25 seconds on average [5]. As another example, consider the self-paced control of lighting in a room using an SBCI system. The system has two states: I (turn on/turn off the light) and N (no control). Figure 1-3 (a) and Figure 1-3 (b) show the brain states of a user and the output of the SBCI system, respectively. As seen, the system generated an FP at the beginning of monitoring the brain signals. The user then attempted to compensate for this error by issuing an intentional control command.
After a short period, the system generated a second false positive, and the user had to compensate for it again. Clearly, during this period, the user only managed to compensate for the errors generated by the system. This process becomes frustrating when errors happen frequently, especially if the TP rate is not very high. Based on these arguments, it is clear that the ultimate value of this new technology will largely depend on the degree to which its performance can be improved, e.g., false positives should occur no more than once per minute.

Figure 1-3. High false positive rates can significantly impact the performance of an SBCI system, even if the TP rates are high. (a) Brain states of a user; (b) The output of the SBCI system.

1.1.2 Presence of artifacts

A second factor that limits the application of SBCI systems is the presence of artifacts. Artifacts are unwanted signals that can degrade the performance of the system. If artifacts occur at the same time as the initiation of an IC command by the user, they may change the shape of a neurological phenomenon and decrease the TP rate. If they occur during the NC periods, they can generate false positives and increase the FP rate. Since some artifacts, such as physiological artifacts (e.g., eye blinks), occur frequently, methods should be developed to handle them effectively. Unfortunately, most BCI systems do not handle artifacts at all (or at least not efficiently). This is a serious drawback in online applications of BCI systems in general and SBCI systems in particular.

1.1.3 Evaluation metrics

Yet another important factor in the design of SBCI systems is the availability of a suitable “evaluation metric”. In synchronized BCI systems, the overall classification accuracy (OA) and the information transfer rate (ITR) are metrics that are widely used. They are also accepted by the research community as reliable measures for comparing the performances of different synchronized BCI systems.
This is not the case for SBCIs. It is very difficult to compare the performances of different SBCI systems. A wide variety of metrics such as OA [7], HF-difference [8], the mutual information (and ITR) [9], Kappa [10], the area under the receiver operating characteristic (ROC) curve [10], the TP rate at a fixed FP rate [5] and others have been proposed in the literature. However, no consensus yet exists amongst self-paced BCI researchers regarding which metric is more suitable for summarizing the performance and how a suitable evaluation metric should be chosen for a particular self-paced BCI system [4].

Note that the neurological phenomena generated as a result of attempted movements by able-bodied individuals are similar to those generated by individuals with motor disabilities, as discussed later in this section. For this reason, in this thesis, data collected from able-bodied individuals are used for the analysis.

Before we address the existing work in the literature related to the above topics, we first provide some background information about the operation of BCI systems. This is done in the next two sections.

1.2 Functional model of a brain computer interface system

Figure 1-4 shows a traditional BCI system in which a person controls a device in an operating environment (e.g., a powered wheelchair in a house) through a series of functional components [11]. In this context, the user’s brain activity is used to generate IC commands that operate the BCI system. The user monitors the state of the device to determine the result of his/her control efforts.

[Figure 1-4 diagram. Block labels: User, electrodes, amp, BCI Transducer, Artifact Processor, Feature Generator (Signal Enhancement, Feature Extraction, Feature Selection), Feature Translator (Feature Classification, Post-processing), Control Interface, Device Controller, Device, Control Display, state feedback.]

Figure 1-4. Functional model of a BCI system. Note the control display is optional.
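The chain of components in Figure 1-4 can be pictured as a sequence of small processing stages. The following is only a toy sketch under stated assumptions, not the design used in this thesis: the amplitude limit, the mean-power feature, the threshold and the `TOGGLE_LIGHT` command are all hypothetical choices made for illustration, and every function name is invented here.

```python
def artifact_processor(epoch, limit=100.0):
    """Mark an epoch as contaminated (None) if any sample exceeds an
    amplitude limit. The limit and the rule itself are assumptions."""
    return None if any(abs(x) > limit for x in epoch) else epoch


def feature_generator(epoch):
    """Toy feature vector: the mean signal power of the epoch."""
    return [sum(x * x for x in epoch) / len(epoch)]


def feature_translator(features, threshold=4.0):
    """Feature classification into a logical signal: 1 = IC, 0 = NC."""
    return 1 if features[0] > threshold else 0


def control_interface(logical_signal):
    """Map the logical signal to a semantic command for a light switch."""
    return "TOGGLE_LIGHT" if logical_signal == 1 else None


def run_transducer(epochs):
    """Pass each epoch through the chain; contaminated epochs stay in NC."""
    commands = []
    for epoch in epochs:
        clean = artifact_processor(epoch)
        if clean is None:
            commands.append(None)
            continue
        features = feature_generator(clean)
        commands.append(control_interface(feature_translator(features)))
    return commands


# Three hypothetical epochs: quiet, active, and artifact-contaminated.
epochs = [[1.0, -1.0, 0.5], [3.0, -3.0, 2.5], [250.0, 1.0, 0.0]]
print(run_transducer(epochs))  # [None, 'TOGGLE_LIGHT', None]
```

Keeping each stage as a separate function mirrors the functional model: any single block (for example, the classifier) can be swapped without touching the others.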
The building components of a BCI system (shown in Figure 1-4) have the following tasks. The electrodes placed on the head of the user record the brain signal (e.g., electroencephalography (EEG) signals from the scalp, electrocorticography (ECoG) signals from the brain or neuronal activity recorded using microelectrodes implanted in the brain). The ‘artifact processor’ block deals with artifacts in the EEG signals after the signals have been amplified. This block can either remove artifacts from the EEG signals or can simply mark some EEG epochs as artifact-contaminated. The ‘feature generator’ block transforms the resultant signals into feature values that relate to the underlying neurological phenomena employed by the user for control. For example, if the user is using the power of his/her Mu (8-12 Hz) rhythm for the purpose of control, the feature generator could continually generate features relating to the power-spectral estimates of the user’s Mu rhythms. The feature generator generally consists of three components: the ‘signal enhancement’, the ‘feature extraction’, and the ‘feature selection’ components, as shown in Figure 1-4. In some BCI designs, ‘signal enhancement’ or some form of ‘pre-processing’ is performed to increase the signal-to-noise ratio of the brain signal(s) prior to extracting the features. To reduce the dimensionality of the problem, it is desirable to reduce the number of features and/or the number of EEG channels. ‘Feature selection’ could be performed after or at the feature extraction stage to reduce the number of features and/or EEG channels used. Ideally, the features that are meaningful or useful in the classification stage are identified and chosen, while others are omitted. The ‘feature translator’ block translates the features into logical control signals, e.g., 0 and 1 where 0 denotes NC and 1 denotes IC.
The translation algorithm uses linear classification methods (e.g., linear discriminant analysis) or nonlinear ones (e.g., neural networks). As shown in Figure 1-4, a feature translator may consist of two components: ‘feature classification’ and ‘post-processing’. The main aim of the feature classification component is to classify the features into logical control signals. Post-processing methods such as a moving average may be used after feature classification to reduce the number of activations of the system. The control interface translates the logical control signals from the feature translator into semantic control signals that are appropriate for the particular type of device used. Finally, the device controller translates the semantic control signals into physical control signals that are used by the device. For more detail refer to [3]. In the next section, we provide a brief review of some of the work done in the literature.

1.3 Background

Since the introduction of the concept of BCI control in the early 1970s (e.g., [12]), many BCI systems have been developed. Despite these efforts, many design issues remain under debate. In this section, we briefly review these design issues.

1.3.1 Signal recording

Activity in a normal human brain can generate various responses including electrical, magnetic, and metabolic responses. These signals can be detected by appropriate sensors and can be used for controlling a BCI system. For example, brain activity can produce magnetic fields that can be recorded using magnetoencephalography (MEG). Brain activity can also have metabolic consequences in terms of changes in the blood flow and metabolism. Imaging methods such as functional magnetic resonance imaging (fMRI) can image these activities. At present, because of the cost and physical dimensions, methods that measure the electrical activities of the brain are more favored [1].
There are various ways to record the electrical activities of the brain. Non-invasive BCI approaches mostly use EEG signals as the source of information. EEG signals are recorded by means of electrodes placed on the scalp. Invasive approaches, on the other hand, use electrocorticography (ECoG) signals recorded from the surface of the brain or action potentials of single neurons in the cerebral cortices, recorded using implanted microelectrodes. EEG signals have good temporal resolution, but their spatial resolution is poor compared to that of other recording technologies [1]. A recent study showed that only 12% of published BCI studies use implanted electrodes, 5% use microelectrode arrays, and more than 80% use EEG signals [3]. The main reason is that EEG recording equipment is commercially produced and its cost is lower than that of other brain signal recording technologies. Also, since no surgery is necessary for placing electrodes, more individuals are willing to participate in such BCI experiments.

1.3.2 Choice of neurological phenomenon

Neurological phenomena are specific features of the brain activity that appear in the brain signals and can be used to control a BCI system. They are time-locked to a physical stimulus or to the cognitive responses of the brain. Neurological phenomena are characterized by their voltage amplitude, their latency (which is related to the internal or external stimuli) and their spatiotemporal distribution. Their amplitude is usually much smaller than that of the background EEG signal. The more common neurological phenomena in BCI systems are:

Changes in the brain rhythms (CBR) such as the Mu and Beta rhythms related to a movement: Mu ([8-12] Hz) and Beta ([18-30] Hz) rhythms are frequency bands in the brain signals that are known to be suitable neurological phenomena for controlling BCI systems. The reason is that they are closely associated with those cortical areas directly connected to normal motor channels of the brain.
Voluntary movement results in a circumscribed desynchronization in the Mu and Beta bands that is localized close to the sensorimotor areas [13, 14]. This desynchronization, termed “event-related desynchronization (ERD)”, starts about two seconds prior to the onset of movement [15]. The enhanced rhythmic activity following the movement is called “event-related synchronization (ERS)”. The post-movement Beta ERS is found in the first second after the termination of a voluntary movement, when the Mu rhythm might still display a desynchronized pattern [15]. The Beta ERS is a relatively robust phenomenon and is found in nearly all users after a finger, hand, arm or foot movement (see Figure 1-5 (a) for an example of the Beta ERS) [16]. Many research groups have developed BCI systems using the features extracted from the Mu and Beta rhythms. However, the works of two research groups are more prominent. Wolpaw and McFarland and their associates at the Wadsworth Center have focused on developing a CBR-based synchronized BCI system. Their proposed BCI system allows users to control the amplitude of Mu and Beta rhythms. This amplitude is then used to move a cursor on the computer screen [17-20]. Users of this system usually need training that may take up to a few weeks, but eventually they can achieve high accuracies (e.g., above 90%) [21]. The other research group, the Graz BCI, uses the ERD and the ERS of the Mu/Beta rhythms in the design of synchronized BCI systems [22-26]. Similar to the first group, after a few sessions of training, the users of the Graz BCI can also achieve high accuracies.

Figure 1-5. Two examples of neurological phenomena. (a) Changes in the power of Beta rhythms over time; (b) A movement-related potential. Vertical line shows the time of activation of the movement. Note that these shapes are generated by averaging over many epochs.
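The caption of Figure 1-5 notes that such shapes only emerge after averaging over many epochs, because the amplitude of a neurological phenomenon is much smaller than the background EEG. A minimal sketch of that idea is given below, using simulated data only: the ramp-shaped template, the noise level and the number of epochs are arbitrary assumptions, and expressing ERD/ERS as a percentage change of band power relative to a reference interval is a commonly used convention assumed here, not a definition taken from this chapter.

```python
import random


def average_epochs(epochs):
    """Sample-wise mean across epochs aligned to movement onset."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def relative_power_change(power, reference_power):
    """Percentage band-power change relative to a reference interval.

    Negative values indicate desynchronization (ERD), positive values
    synchronization (ERS). This convention is an assumption.
    """
    return 100.0 * (power - reference_power) / reference_power


# Hypothetical slow potential: a ramp reaching -10 (arbitrary units).
template = [-10.0 * i / 49 for i in range(50)]

rng = random.Random(0)
# 200 simulated epochs: the same template buried in much larger noise.
epochs = [[v + rng.gauss(0.0, 20.0) for v in template] for _ in range(200)]

averaged = average_epochs(epochs)
single_error = max(abs(a - b) for a, b in zip(epochs[0], template))
averaged_error = max(abs(a - b) for a, b in zip(averaged, template))

print(f"worst-case deviation, single epoch: {single_error:.1f}")
print(f"worst-case deviation, 200-epoch average: {averaged_error:.1f}")

# Band power dropping from 10 to 6 units corresponds to a 40% ERD.
print(relative_power_change(6.0, 10.0))  # -40.0
```

Averaging N independent noisy epochs reduces the noise standard deviation by roughly a factor of the square root of N, which is why the template is barely visible in a single epoch but clear in the average. A self-paced system cannot rely on such averaging at run time, since it has to decide from ongoing, single-trial data.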
Movement related potentials (MRPs): Averaging the EEG data with respect to movement onset results in the generation of slow potentials called “movement-related potentials” (MRPs) [27]. MRPs start about 1 to 1.5 seconds before the onset of a particular movement and have a bilateral distribution (see Figure 1-5 (b) for an example of an MRP) [27-31]. High-resolution EEG studies have modeled the main sources of MRPs as arising in the supplementary motor area and the primary sensorimotor cortex [32, 33]. MRPs have been used as the neurological phenomenon in several BCI studies. These studies include the work that has been carried out by Mason and Birch’s research group [5, 7, 34-36], Muller and Blankertz et al. [37, 38], as well as Yom-Tov and Inbar [6, 39, 40].

Other movement related activities (OMRAs): We categorize the movement-related activities that do not belong to any of the preceding categories as OMRAs. OMRAs are usually not restricted to a particular frequency band or scalp location and usually cover different frequency ranges. They may be a combination of specific and non-specific neurological phenomena. Levine and Huggins’ research group is amongst the prominent research groups that have used OMRAs related to different movements to design an ECoG-based BCI system. They recorded ECoG activity from patients with 16-126 subdural electrodes prior to epilepsy surgery. They have used topographically focused potentials associated with different movements to develop various 2-state self-paced BCI designs [8, 41, 42].

Slow cortical potentials (SCPs): SCPs are slow, usually non-movement-related, potential changes generated by the user. They reflect changes in the cortical polarization of the EEG, lasting from 300 ms up to several seconds [2, 43]. Birbaumer and his colleagues have developed a BCI system called “Thought Translation Device (TTD)” that uses an SCP as the source of control [44-47].
They have shown that patients with severe motor disabilities, such as late-stage ALS, can learn to control their SCPs and thus use the TTD to communicate with the outside world.

Cognitive tasks (CTs): Changes in the brain signals as a result of non-movement mental tasks (e.g., mental counting, solving a multiplication problem) are usually categorized as CTs [48]. The works of Millan et al. [49] and Anderson et al. [50] are among the prominent BCI research efforts carried out using cognitive tasks. Millan et al.’s work involves using mental tasks to control a mobile robot, while Anderson et al. have focused on the design of a multi-class BCI system that detects different cognitive tasks such as 3D object recognition, mental counting, etc. [51, 52].

P300: When infrequent or particularly significant auditory, visual or somatosensory stimuli are interspersed with frequent or routine stimuli, they evoke a positive peak at about 300 ms after the stimulus is received. This peak is called the P300 [48, 53]. Using this so-called “oddball” response, Donchin and his colleagues have built a successful P300-based BCI system [54, 55]. More recently, a number of studies have shown that P300-based control can be used as an alternative communication channel for people with spinal cord injury and ALS [56, 57]. Also, for individuals with visual impairments, solutions based on auditory or tactile stimuli have been proposed [58, 59].

Visual evoked potentials (VEP): VEPs are small changes in the brain signal, generated in response to a visual stimulus such as flashing lights. Their characteristics depend on the type of the visual stimulus [48]. Many BCI systems use VEPs as the source of control, including the works of Vidal [60], Sutter [61] and Middendorf [62].
Steady-state visual evoked potentials (SSVEP): If a visual stimulus is presented repetitively at a rate of 5-6 Hz or greater, a continuous oscillatory electrical response is elicited in the visual pathways. Such a response is termed an SSVEP. The distinction between VEP and SSVEP depends on the repetition rate of the stimulation [63]. The works carried out by Gao’s research group are notable in this area [63-66].

Multiple neurological phenomena (MNs): BCI systems based on multiple neurological phenomena use a combination of two or more of the above neurological phenomena for the purpose of control. We will review this category of BCI systems in more detail later in this chapter.

Activity of neural cells (ANC): Some BCI research groups have used microelectrode arrays to record the activity of single neurons in the motor cortex for the purpose of BCI control [67-73]. These BCI systems are usually based on reconstructing a movement from recorded spike trains. Experiments with monkeys have shown a relatively good ability to control movement in multiple directions with these systems [74]. Recently, there have been reports of a patient learning to use his neuronal activity to move a computer cursor in several directions using ANCs [73]. These encouraging results provide hope for BCI control with multiple options and high accuracy. The downside is the invasive nature of the microelectrode implants, which may result in infection and side effects in the brain.

The above neurological phenomena can be categorized into two groups based on the origin of the phenomenon in the brain. Those neurological phenomena generated as the result of cognitive responses of the brain are called endogenous. The ones evoked by an external stimulus are called exogenous. BCI systems that use exogenous neurological phenomena usually do not need any user training [54].
The downside of using these systems is that they require a constant commitment of one of the sensory pathways to an external stimulus [75]. Furthermore, not all users may tolerate repetitive sensory stimulation. On the other hand, endogenous-based BCI systems rely on the generation of a phenomenon that is more natural and is thus expected to cause the users less fatigue. This may be the reason why more than 80% of BCI studies use endogenous neurological phenomena to control BCI systems [3]. To generate a suitable neurological phenomenon, endogenous-based BCI systems usually need user training. This training may take a long time, sometimes even up to a few months. The use of complex signal processing schemes for detecting weak neurological phenomena can greatly reduce or even eliminate the training process [76]. Another advantage of employing endogenous neurological phenomena is that it is possible to select and use a combination of some of them to improve the performance of the system.

1.3.3 Timing of BCI control

So far most BCI researchers have focused their attention on “synchronized” control applications. In synchronized applications, a user can initiate a command only during periods specified by the system (see Figure 1-6 (a)). In these systems, the users are required to generate an intentional control (IC) command during the periods allowed by the BCI system. In the example shown in Figure 1-6 (a), the user should generate one of the IC1 or IC2 commands during the control period (the control period is shown as a ‘box’). In contrast, in a self-paced BCI system, a user does not need to be constantly engaged in initiating the control command. In these systems, the users only consciously control their state when they desire to control the device (see Figure 1-6 (b)) [35]. In the example shown in Figure 1-6 (b), the user is in the no control (NC) state at all times, except for those periods when he/she initiates an IC command.
In the latter case, the system will be in an IC state. During NC periods, the user can be idle, thinking about a problem or performing any action other than attempting to control the device. The property of SBCI systems that allows them to support the presence of NC periods is called “NC support”. Whenever a BCI system involves control actions with periods of inaction, it needs to have NC support.

Figure 1-6. Synchronized vs. self-paced control. (a) In a synchronized BCI system, control can be exercised only in certain intervals specified by the system; (b) In a self-paced BCI system, control is exercised at the user’s own pace.

Synchronized BCI systems usually require the user to initiate an IC command during the control periods. In other words, during the control periods, the users are expected to be engaged with controlling the device. For this reason, these systems usually do not support “NC” periods. In some cases, the output of the system might even become unstable if an IC command is not issued.

In the next four sections, we briefly describe the literature related to this thesis. First, in Section 1.4, the previous work pertaining to the design of SBCI systems is discussed. In Section 1.5, we address the use of more than one neurological phenomenon as the control source. Then in Section 1.6, we address the previous work on handling artifacts in the design of SBCI systems. Finally in Section 1.7, we discuss the previous research related to evaluating the performance of SBCI systems.

1.4 Design of self-paced BCI systems

Self-paced BCI systems provide the user with more freedom and more control flexibility. From the signal processing point of view, however, they are much more challenging to design than synchronized BCI systems. The main reason is that there are many types of NC states (e.g., idle, different mental tasks, etc.).
As a result, once it is turned on, an SBCI system should be able to handle many different types of NC signals. For this reason, only a few BCI groups have pursued the design of self-paced BCI systems [7, 36, 40, 77, 78].

The concept of self-paced control started in the early 1990s with the development of the outlier processing method (OPM), which aimed at detecting movement-related potentials (MRPs) in the EEG signals [36]. The results from this work were promising, as true positive (TP) rates greater than 90% were achieved on a thumb movement task. However, its poor performance over NC epochs (FP rates ranging from 10% to 30%) restricted its use as a BCI system.

To overcome the vulnerabilities of OPM, another SBCI system called the low frequency-asynchronous switch design (LF-ASD) was proposed in 2000 by Mason and Birch [35]. Like OPM, the LF-ASD is designed to detect MRPs in the EEG signals. It uses features extracted from the 0.1-4 Hz band in six bipolar EEG channels recorded from F1-FC1, Fz-FCz, F2-FC2, FC1-C1, FCz-Cz and FC2-C2 on the scalp, sampled at 128 Hz. A detector that was a simplified version of the discrete wavelet transform was applied as the feature extractor and a 1-nearest neighbor (1-NN) classifier was used as the feature classifier. In an analysis of the EEG signals of five individuals, the features related to MRP (or IC) periods showed a definite difference from those in NC periods [35].

During the past few years, several changes have been applied to the structure of the LF-ASD to improve its performance. These changes include the addition of an energy normalization transform [79], the addition of a debounce window as a post-processing component to decrease the FP rate [5], user-customization of the feature extractor’s parameter values [80], and the addition of knowledge of the past paths of features [81]. Despite these improvements, the performance of the LF-ASD is still not suitable for many practical applications.
The most recent design of the LF-ASD achieves an average TP rate of 54.0% at an FP rate of 1% [82]. Since the LF-ASD generates an output every 1/16th of a second, this translates into, on average, one false positive every 6.25 seconds, while only about half of the IC commands are detected. For most practical applications, such a high FP rate may result in excessive user frustration.

Another SBCI design, which improves upon the feature extractor of the LF-ASD, is proposed by Yom-tov et al. [6]. The proposed method combines the LF-ASD feature extractor with a matched filter, resulting in a hybrid detector. This method also results in a high FP rate. For FP rates < 2%, the TP rates are lower than 30%. This system generates an output every 1/25th of a second, so an FP rate of 2% translates into one false positive every two seconds. As a result, the high number of FPs limits the application of the proposed design.

While the above studies are based on features extracted from EEG signals, researchers from the University of Michigan have focused on extracting features from ECoG signals [8, 41, 77, 83-85]. To detect IC commands, their designs use either the cross-correlation with a template [83] or the energy of the wavelet packet transform [8]. In these studies, a threshold-based classifier is used for classifying the features. While these systems usually achieve TP > 50%, their performance on NC epochs is not very clear. First, none of these studies has determined the number of NC epochs. Moreover, to quantify the false positives, a new metric called the false discovery rate (FDR, i.e., the percentage of total activations of the switch that were false) was used [86]. Since the number and the length of NC epochs are not determined in these studies, it is impossible to calculate the FP rate for these systems. In a recent study by this group, the reported FDRs were in the range of 0% to 82%, with 24 out of the 31 reported FDRs being higher than 10% [8].
However, since the numbers of IC and NC epochs were not specifically determined, no comment can be made on the performance of these systems over NC data.

Table 1-1 compares the TPRs and FPRs achieved in selected SBCI studies. Please keep in mind that although a direct comparison is not possible, this table roughly shows the performances of some of the existing SBCI systems. The rows of this table show the different SBCI studies. The columns show the rate at which the system generates an output, the TPR, and the FPR, respectively. As shown, with the exception of the first study, which uses ECoG signals, these studies have low TPRs for FPR<2%. Please note that, given the rates at which these SBCI systems generate outputs, these FP rates translate into one false positive every few seconds on average.

Table 1-1. Comparison of the TPRs and FPRs achieved in different SBCI studies.

Paper / Study               Output frequency (Hz)   TPR (%)      FPR (%)
Graimann et al. [8]         100                     Up to 100    ?
Mason and Birch [35]
    LF-ASD                  16                      <20          2
    OPM                                             <10
    Mu-ASD*                                         <10
Yom-tov and Inbar [6]       25                      30           2
Townsend et al. [87]        ?                       <20          2
Bashashati et al. [82]      16                      54.0         1

* Mu-ASD is a self-paced BCI system that uses Mu rhythms as the neurological phenomenon.

A review of self-paced endogenous BCI studies shows that with the exception of one paper [8] (which will be discussed in more detail in Section 1.5.2), all the proposed designs have relied on a single neurological phenomenon. In the next section, we bring evidence from the literature that supports the advantage of using the following three neurological phenomena (instead of only one) in a self-paced BCI system: MRPs, changes in the power of the Mu rhythms and changes in the power of the Beta rhythms.
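The conversion used in Section 1.4 from a per-output FP rate to the average time between false positives can be sketched as follows. This is an illustrative calculation only; it assumes that the user remains in the NC state throughout and that the FP rate is expressed per system output. The function name is hypothetical.

```python
def seconds_between_false_positives(fp_rate_percent, outputs_per_second):
    """Average time between false positives during continuous NC operation.

    fp_rate_percent: percentage of NC outputs that are falsely activated.
    outputs_per_second: number of decisions the system produces per second.
    """
    false_positives_per_second = (fp_rate_percent / 100.0) * outputs_per_second
    if false_positives_per_second <= 0:
        return float("inf")  # no false positives are expected at all
    return 1.0 / false_positives_per_second


# Values quoted in the text for the LF-ASD [82] and for Yom-tov et al. [6].
lf_asd_interval = seconds_between_false_positives(1.0, 16)   # about 6.25 s
hybrid_interval = seconds_between_false_positives(2.0, 25)   # about 2 s
```

The calculation shows why an FP rate that looks small in percentage terms can still be impractical: the higher the output rate of the system, the more often a given FP rate produces a false activation.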
1.5 Use of multiple neurological phenomena in BCI systems

1.5.1 Simultaneous application of MRPs and changes in the power of Mu/Beta rhythms

A number of papers provide some evidence that MRPs and changes in the power of brain rhythms [usually characterized as the event-related desynchronization (ERD) and event-related synchronization (ERS)] provide complementary information for exploring the cognitive functions of the brain. In [88], the analysis of subdural EEG recordings from primary sensorimotor areas in epileptic patients showed that the amplitude of the ERD of the Alpha rhythm recorded from subdural areas was not always correlated with the corresponding MRPs. It is suggested in the same paper that these neurological phenomena represent different aspects of cortical motor processes. In [89], it is reported that the ERD of the Alpha rhythm is not always detected in cortical sites generating MRPs. In [31], through a high-resolution EEG study, it is shown that MRPs and the ERD of the Mu rhythm provide complementary information on human brain responses accompanying the preparation and execution of a finger movement. Further evidence from the analysis of EEG signals [90, 91] and magnetoencephalography (MEG) [92-94] strengthens these findings.

There is also some evidence regarding the differences between the Mu and the Beta rhythms. Several papers show that the reactivities of the Mu and Beta rhythms related to the movement onset are different [95, 96]. Both the Mu and Beta rhythms desynchronize before the occurrence of a voluntary self-paced movement. However, after the movement, the ERD of the Mu rhythm is usually followed by a slow return to baseline (and sometimes by a slight synchronization), while the Beta rhythms synchronize rapidly after the movement onset [96]. This evidence from the literature shows that MRPs, Mu and Beta rhythms provide complementary information that can be used for improving the performance of BCI systems.
In the next sub-section, we review the simultaneous use of these phenomena in the BCI literature.

1.5.2 Using multiple neurological phenomena in BCI systems

The main advantage of using more than one neurological phenomenon at the same time is that more information is available for the BCI system to detect an IC command related to a particular movement. The downside is that as the size of the input data increases, the complexity of the pattern recognition algorithm increases as well. Although most BCI researchers use a single neurological phenomenon as the source of control, there have been reports of using multiple neurological phenomena in BCI systems [8, 76, 92, 97-99].

In [92], the authors analyzed different combinations of 1) features extracted from an early component of the MRP called the Bereitschaftspotential (BP), 2) features extracted from the ERD of neurological phenomena above 4 Hz (through autoregressive modeling) and 3) common spatial patterns (CSP) features related to the ERD of Mu rhythms. The BCI system had to discriminate between left and right index finger movements. A linear discriminant analysis (LDA) classifier was used for classification. Different combination schemes were explored. The study showed that a certain combination of classifiers could result in a lower error rate than the case where a single classifier is used. The results of combining the ERD of the Mu rhythm and the BP were not reported, although the authors mention that those results were slightly worse than the results obtained when all three neurological phenomena were used in the design of the BCI system.

In [97], the authors applied a combination of microstate analysis and common spatial subspace decomposition to extract features belonging to three different frequency bands: Theta + Delta, Mu and Beta. MRPs were not treated as a separate neurological phenomenon.
Instead, features were extracted from the frequency band covering both the Delta and Theta rhythms. These features were then used to discriminate between left and right hand movements. Using data of three participants, the proposed method achieved an average accuracy higher than 80%.

In [100], the authors used the BP and the ERD of the brain rhythms in the 10 to 33 Hz frequency band (including both the Mu and Beta rhythms) to classify left vs. right finger movement. The features extracted from all neurological phenomena and from all EEG channels were then combined, the dimension of the feature vector was reduced and the final vector was classified using a perceptron neural network. The results showed a classification accuracy of 84% on the test set.

In [101], the authors used features extracted from the BP and the ERD of the Mu rhythms for classifying the left and right index finger movements. It was shown that combining features decreases the classification error for four out of the five subjects whose data were studied.

The above studies all pertain to synchronized BCI systems. Only one SBCI system that uses multiple neurological phenomena has been reported so far [8]. In this study, the authors combined a number of neurological phenomena in order to design an ECoG-based SBCI system. Using a wavelet packet transform, ECoG signals were divided into 18 different frequency bands covering the range from 0 to 100 Hz. This range covered a wide range of neurological phenomena including Mu, Beta and Gamma rhythms, as well as other movement-related activities (OMRAs). Then for each band, wavelet-filtered signals were reconstructed. The wavelet-filtered signals were then squared to obtain power values, and a genetic algorithm was applied to reduce the dimension of the feature space to one. Using a thresholding classifier, the test samples were classified as movement or no movement.
As mentioned earlier in Section 1.4, the reported false discovery rates of this study were in the range of 0% to 82%, with 24 out of the 31 reported FDRs being higher than 10%. This study, however, did not consider MRPs as one of the neurological phenomena. Instead it solely focused on detecting the power of different frequency bands. Furthermore, it extracted features from ECoG signals instead of EEG signals. As noted earlier in Section 1.3.1, recording ECoG signals requires surgery and is invasive. For this reason, this method of recording brain signals may not be fully accepted by the research community until the health-related issues are fully investigated.

As we will discuss in Section 1.9, in this thesis we will design a new SBCI system that simultaneously uses information extracted from MRPs as well as changes in the power of the Mu and the Beta rhythms. To the best of our knowledge, this is the first time in the BCI literature that such a study is carried out in the context of self-paced BCI systems.

1.6 Artifacts in BCI systems

Artifacts are undesirable potentials that contaminate brain signals and are mostly of non-cerebral origin. Unfortunately, they can modify the shape of a neurological phenomenon that drives a BCI system. They can also mistakenly result in an unintentional control of the device [102]. Therefore, there is a need to avoid, reject or remove artifacts from the recordings of brain signals. In an SBCI system, artifacts can impact the performance of the system in two ways: 1) by changing the shape of the neurological phenomenon during an IC period, they cause a decrease in the TP rate; 2) by mimicking the shape/properties of the neurological phenomenon during the NC periods, they result in an increase in the FP rate.

Figure 1-7. An example of how artifacts can affect the performance of an SBCI system.
(a) The brain state of the user; (b) The periods when artifacts have occurred; (c) The output of the SBCI system (note: FP: false positive, TN: true negative, FN: false negative and TP: true positive).

Figure 1-7 shows how this can happen. Figure 1-7 (a) shows the brain states of a user during a specific time frame. As seen, the user is mostly in an NC state, but at two time instants the user initiates an IC command. Figure 1-7 (b) shows the periods of EEG signals that are contaminated with artifacts. The term “ART” denotes “artifact-contaminated periods” and “NO” refers to the periods not contaminated with artifacts. The second artifact period coincides with the first IC command. Figure 1-7 (c) shows the output of the SBCI system. The occurrence of the first artifact results in a false positive. The second artifact results in masking the first IC command (a false negative or FN).

Artifacts originate from non-physiological as well as physiological sources. Non-physiological artifacts originate from outside the human body (such as 50/60 Hz power-line noise or changes in electrode impedances), and are usually avoided by proper filtering, shielding, etc. Physiological artifacts arise from a variety of bodily activities. Electrocardiography (ECG) artifacts are caused by heart beats and may introduce a rhythmic activity into the EEG signal. Respiration can also cause artifacts by introducing a rhythmic activity that is synchronized with the body’s respiratory movements. Skin responses such as sweating may alter the impedance of electrodes and cause artifacts in the EEG signals [103]. The two physiological artifacts that have been most examined in BCI studies, however, are ocular (electrooculography or EOG) and muscle (electromyography or EMG) artifacts. EOG artifacts are generally high-amplitude patterns in the brain signal caused by blinking of the eyes, or low-frequency patterns caused by movements (such as rolling) of the eyes [104].
EOG activity has a wide frequency range, being maximal at frequencies below 4 Hz, and is most prominent over the anterior head regions [105]. EMG activity (movement of the head, body, jaw or tongue) can cause large disturbances in the brain signal. EMG activity has a wide frequency range, being maximal at frequencies higher than 30 Hz [104, 105]. Difficult tasks may cause an increase in EMG activity related to the movement of facial muscles [106, 107].

Some studies have shown that EOG and EMG activities may generate artifacts that affect the neurological phenomena used in a BCI system [108, 109]. For example, [109] demonstrated that brain rhythms are contaminated with EMG artifacts during the early training sessions of their proposed BCI system that used Mu and Beta rhythms as sources of control.

Physiological artifacts such as EOG and EMG artifacts are much more challenging to handle than non-physiological ones. Moreover, controlling them during the signal acquisition stage is not easy. There are different ways of handling artifacts in BCI systems. In this section, we briefly examine the reported methods for handling EOG and EMG artifacts, as these are among the most important sources of contamination in BCI systems.

1.6.1 Artifact avoidance

The first step in handling artifacts is to avoid their occurrence by issuing proper instructions to users. For example, users are instructed to avoid blinking or moving their bodies during the experiments. Instructing individuals to avoid generating artifacts during data collection has the advantage of being the least computationally demanding among the artifact handling methods, since it is assumed that no artifact is present in the signal (or that the presence of artifacts is minimal). However, it has several drawbacks. First, since many physiological signals, such as the heart beats, are involuntary, artifacts will always be present in brain signals.
Even in the case of EOG and EMG activities, it is not easy to control eye and other movement activities during the process of data recording. Second, the occurrence of ocular and muscle activity during the online operation of any BCI system is not avoidable. Third, collecting a sufficient amount of data without artifacts may be difficult, especially in cases where a user has a neurological disability [110]. Finally, avoiding artifacts may introduce an additional cognitive task for the individual. For example, it has been shown that refraining from eye blinking results in changes in the amplitude of some evoked potentials [111, 112].

1.6.2 Artifact rejection

Artifact rejection refers to the process of rejecting the trials affected by artifacts. It is perhaps the simplest way of dealing with brain signals contaminated with artifacts. It has some important advantages over the “artifact avoidance” approach. For example, it would be easier for individuals to participate in the experiments and perform the required tasks, especially those individuals with motor disabilities. Also, the “secondary” cognitive task, resulting from an individual trying to avoid generating a particular artifact, will not be present in the EEG signal. “Artifact rejection” is usually done by visually inspecting the EEG or the artifact signals, or by using an automatic detection method [113].

Manual rejection

Manual rejection of epochs contaminated with artifacts is a common practice in the BCI field. Trials are visually checked by an expert, and those that are contaminated with artifacts are removed from the analysis. Similar to “artifact avoidance”, manual rejection also has the advantage of not being computationally demanding, as it is assumed that a human expert has identified all the artifact-contaminated epochs and removed them from the analysis. On the other hand, there are many disadvantages in using “manual rejection”.
First, “manual rejection” comes at the cost of intensive human labor, especially if the study involves a large number of individuals or a large amount of recorded data. Second, the process of selecting the artifact-free trials may become subjective. It has been argued that because of the selection bias, the sample trials that are artifact-free may not be representative of the entire population of the trials [113]. Third, in the case of offline analysis, the rejection of artifact-contaminated trials may lead to a substantial loss of data. This may become a huge drawback, especially in the case of individuals with motor disabilities, where offline data recording is not as convenient as it is for able-bodied individuals.

Automatic rejection

In “automatic rejection”, the BCI system automatically discards the epochs of brain signals that are contaminated with particular artifacts. This procedure is commonly carried out in offline investigations. Automatic rejection of epochs can be done in the following two ways:

Rejection using the EOG (EMG) signal: When one of the characteristics of the EOG (EMG) signal in an epoch exceeds a pre-determined threshold, the epoch is considered as artifact-contaminated and is automatically rejected.

Rejection using the EEG signal: This rejection methodology is similar to the above, except that the EEG signal is used instead of the EOG (EMG) signal. This approach has the advantage of being independent of the EOG (EMG) signal, and is useful if the EOG (EMG) signal is not recorded during data collection.

An advantage of the “automatic rejection” approach over that of “manual rejection” is that it is less labor intensive. However, automatic rejection still suffers from loss of valuable data [114, 115]. In the case of EOG artifacts, the automatic rejection approach also does not allow the rejection of contaminated trials when the EOG amplitude is small [116, 117].
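The threshold-based automatic rejection described above can be sketched as follows. This is a minimal illustration, not a method taken from any of the cited studies. It assumes that the monitored characteristic is the peak-to-peak amplitude of a reference signal; the function names, the threshold value and the toy data are hypothetical.

```python
def peak_to_peak(samples):
    """Peak-to-peak amplitude of one epoch (a list of numbers)."""
    return max(samples) - min(samples)


def reject_contaminated_epochs(eeg_epochs, reference_epochs, threshold):
    """Discard EEG epochs whose reference epoch exceeds the threshold.

    reference_epochs holds the signal that is checked for artifacts: the
    EOG (EMG) channel, or the EEG epochs themselves when no EOG (EMG)
    signal was recorded. Returns (kept_epochs, rejected_indices).
    """
    kept, rejected = [], []
    for index, (eeg, reference) in enumerate(zip(eeg_epochs, reference_epochs)):
        if peak_to_peak(reference) > threshold:
            rejected.append(index)
        else:
            kept.append(eeg)
    return kept, rejected


# Toy data: the second EOG epoch contains a large deflection (e.g., a blink).
eeg = [[1, 2, 1], [3, 4, 3], [5, 6, 5]]
eog = [[0, 10, 0], [0, 150, -20], [0, 60, 0]]
kept, rejected = reject_contaminated_epochs(eeg, eog, threshold=100)
```

Only the second epoch is rejected here. The third epoch also contains EOG activity, but its amplitude stays below the threshold, which illustrates the limitation noted above for small-amplitude EOG artifacts.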
Two issues need to be addressed for BCI systems that reject artifacts. The first issue is that, because of the vast number of artifacts that exist in BCI systems (eye blinking, eye movements, movements of different parts of the body, breathing, etc.), not all the artifact-contaminated trials can be rejected. Usually only the epochs with a strong presence of artifacts are excluded from the analysis. Therefore, the so-called “clean” data are unfortunately not completely free of artifacts. The second issue is that the rejection of artifact-contaminated data during an offline analysis may generate “cleaner” data, but for online real-time applications of a BCI system this may pose a huge drawback. In online applications, artifacts are unavoidable. If artifacts are rejected during the offline analysis, the same rejection mechanism can be used to reject them during the online analysis. The only problem is that during the specific time periods when artifact-contaminated signals are rejected, the system is unavailable and cannot be used for controlling the device.

1.6.3 Artifact removal

Artifact removal is the process of identifying and removing artifacts from brain signals. An artifact-removal method should be able to remove the artifacts as well as keep the related neurological phenomenon intact. Common methods for removing artifacts from EEG signals are linear filtering [118, 119], linear combination and regression [116], blind source separation [120], principal component analysis [121], wavelet transform [122], nonlinear adaptive filtering [123] and source dipole analysis (SDA) [124].

A survey of all BCI studies published before January 2006 shows that most BCI papers do not report whether or not they have considered EMG and/or EOG artifacts in their analysis. This is an important issue, since offline analysis methods that do not account for physiological artifacts may face problems when tested during an online study.
As a result, it is important that BCI researchers pay more attention to this issue and report the method that they have employed for handling artifacts.

A number of BCI studies state that EMG activity will not be present in the EEG signal when the EEG signal is analyzed before a movement has occurred [125]. This argument may not be valid for BCI systems, because peripheral changes such as EMG tension can affect the EEG signal, even though the amount by which the EEG signal is affected remains unclear [126]. It is pointed out in [126] that even when individuals are very restricted, they still preserve motor control over some muscle groups. Although the activities of several muscle groups are monitored in BCI studies, there remain some muscles whose activities are not recorded.

The BCI systems that employ “manual rejection” of EOG and EMG artifacts should also consider the fact that “manual rejection” is only a preliminary step in the design of a BCI system. “Manual rejection” can only be used for offline analysis. In order for a particular BCI system to work in an online fashion, a scheme for handling artifacts should be incorporated. Requesting the individuals to avoid artifacts should only be considered as a temporary solution. In a practical application, EMG and EOG artifacts do happen, so methods of handling these artifacts during an online experiment should be investigated.

One solution for handling artifacts, which is not explored well in BCI studies, is to design a BCI that is robust in the presence of artifacts. If such a BCI is designed, then the need for a separate method of handling artifacts will be minimized. Another solution that has not been explored well in the BCI literature is the use of more than one neurological phenomenon, which may increase the robustness to the occurrence of artifacts [76].
Since EOG artifacts mostly affect the low-frequency components of the EEG signals, BCI systems that use low-frequency ERPs, such as MRP and SCP, are mostly affected by EOG artifacts. EMG artifacts, on the other hand, mainly affect the high-frequency components of the EEG signals; hence BCI systems that use high-frequency ERPs, such as Mu and Beta rhythms, are mostly affected by EMG artifacts. Thus, it can be concluded that a BCI system that uses multiple neurological phenomena whose frequencies span both the low and the high frequency bands may become more robust to the presence of artifacts.

1.7 Evaluating the performance of SBCI systems

Model selection is the process of finding or adjusting the model parameters for any classification problem. For BCI systems, model selection is a crucial part of the design. This process may include selecting the features, the type of the feature extractor, the classifier, the EEG channels, the neurological phenomenon, the frequency band of interest, the values of the classifier’s parameters and the preprocessing and post-processing components. As an example, to find the optimal set of features for a certain BCI, different sets of features are considered. For every set, the performance of the system is calculated and the resulting performances are compared. The set of features that yields the best performance is then selected. The performance of this best model can then be compared with those achieved by similar BCI systems (i.e., systems with the same experimental as well as evaluation protocols). Therefore, the performance of an SBCI must be evaluated in the following two cases: 1) during the model selection procedure and 2) when comparing the performance with other systems.

The performance of a BCI with discrete states is usually summarized by a confusion matrix. The (i,j) entry of this matrix represents the number of samples from class i that are classified as belonging to class j.
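The definition of the confusion matrix can be illustrated with a short sketch. The function name, the class labels and the six example samples below are hypothetical and are chosen only to show how the (i,j) entries are counted; they are not data from this thesis.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Return m, where m[i][j] counts samples of classes[i] classified as classes[j]."""
    index = {c: k for k, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        m[index[t]][index[p]] += 1
    return m


# Hypothetical labels for a 2-state system: NC = no control, IC = intentional control.
true_labels = ["NC", "NC", "NC", "NC", "IC", "IC"]
predicted = ["NC", "NC", "IC", "NC", "IC", "NC"]

cm = confusion_matrix(true_labels, predicted, classes=["NC", "IC"])
print(cm)  # [[3, 1], [1, 1]]
```

Row 0 holds the true NC samples (3 correct, 1 false positive) and row 1 the true IC samples (1 missed, 1 true positive).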
A confusion matrix provides valuable information regarding how well each class is classified by the BCI system. It is, however, not usually straightforward to compare different confusion matrices. Evaluation metrics are thus needed to summarize a confusion matrix into a single value. For classification problems with balanced datasets such as synchronized BCI systems (where $\mathrm{prob}(\mathrm{Class}_1) \approx \mathrm{prob}(\mathrm{Class}_2) \approx \dots \approx \mathrm{prob}(\mathrm{Class}_N)$ for an N-class problem), the overall classification accuracy (OA) is the most common evaluation metric presently used to summarize the performance [10]. The use of OA for problems with highly imbalanced classes (e.g., $\mathrm{prob}(\mathrm{Class}_1) \gg \mathrm{prob}(\mathrm{Class}_2)$ for a two-class problem) is not satisfactory [127].

The choice of the evaluation metric is of great importance and is application-dependent. A poorly defined evaluation metric may guide the model selection procedure to a far-from-optimal model, or it can lead to erroneous conclusions when comparing the performances of two SBCI systems. As a result, all the effort spent in the design of a sophisticated SBCI may be lost, simply because of the poor choice of the evaluation metric.

Recently, the choice of OA as the default evaluation metric has been questioned, even in classification applications with balanced datasets. Specifically, it was shown that for many applications, the area under the receiver operating characteristic curve (AUC) can summarize the performance better than OA [128]. Although OA is not suitable for classification problems with imbalanced classes, the choice of an alternative evaluation metric is not obvious. Several attempts have been made to define more suitable evaluation metrics for these problems. Examples of such evaluation metrics include weighted overall accuracy (WOA) [129], the use of receiver operating characteristic (ROC) curves and related measures such as the area under the ROC curve (AUC) [130], and the Kappa coefficient [131].
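A small numerical sketch shows why OA is unsatisfactory for highly imbalanced classes. The counts are invented for illustration (990 NC samples and 10 IC samples), and the "always NC" detector is a hypothetical degenerate system, not one studied in this thesis.

```python
def overall_accuracy(cm):
    """Fraction of all samples on the diagonal of confusion matrix cm."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


# Rows: true NC, true IC. Columns: predicted NC, predicted IC.
# A detector that never activates labels every sample as NC.
always_nc = [[990, 0],
             [10, 0]]

oa = overall_accuracy(always_nc)
tpr = always_nc[1][1] / sum(always_nc[1])
print(oa, tpr)  # 0.99 0.0
```

The detector reaches an OA of 99% while detecting no IC command at all, which is the kind of misleading summary that an alternative metric has to avoid.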
In the SBCI literature, some of the evaluation metrics used include overall accuracy [7], HF-difference [8], mutual information (information transfer rate) [9], Kappa [10], AUC [10], the true positive rate (TPR) at a fixed false positive rate (FPR) [5] and TPR/FPR [132]. Figure 1-8 shows the proposed evaluation metrics for synchronized and self-paced BCI systems. As seen, the number of proposed evaluation metrics is significantly higher for self-paced BCI systems than for synchronized BCI systems.

Figure 1-8. Types of evaluation metrics used in synchronized and self-paced BCI systems.

OA is the fraction of test samples correctly classified by an SBCI system. It has been frequently used in evaluating many synchronized BCI systems [133-135]. Its use in SBCIs, however, has so far been limited [7]. This is because, for an SBCI system, OA assigns a huge weight to the more frequent class (NC) and a very small weight to the less frequent class (IC). This may lead to misleading conclusions about the performance of the system.

The information transfer rate (ITR) has been specifically proposed for evaluating the performance of synchronized BCI systems [136]. This metric is based on the similarities between an SBCI and a communication channel, and uses Shannon’s communication theory. The rationale is that ITR measures the amount of information transferred between two reference points. The output Y of an SBCI is the interpretation (information) of the current state of the brain, and Y conveys this information to the downstream components. It is thus argued in [136] that the amount of information in Y is a useful tool for comparing the results obtained from different synchronized BCI designs. It is also argued that ITR by itself is not a suitable single evaluation metric for an SBCI system. This is because of the unique nature of this metric, which has more than one maximum (see [137] for a detailed discussion).
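As a rough illustration of an information-theoretic metric, the sketch below computes the bits conveyed per selection under one widely used definition for synchronized BCIs (commonly attributed to Wolpaw and colleagues). It assumes N equiprobable classes, the same accuracy P for every class, and errors spread evenly over the remaining N-1 classes. This may differ from the exact definition used in [136], and the function names are hypothetical.

```python
import math


def bits_per_trial(n_classes, accuracy):
    """Bits per selection: log2(N) + P*log2(P) + (1-P)*log2((1-P)/(N-1)).

    Assumes n_classes >= 2 equiprobable classes and evenly spread errors.
    """
    n, p = n_classes, accuracy
    bits = math.log2(n)
    if p > 0:          # the term P*log2(P) tends to 0 as P -> 0
        bits += p * math.log2(p)
    if p < 1:          # the error term vanishes when P = 1
        bits += (1 - p) * math.log2((1 - p) / (n - 1))
    return bits


def bits_per_minute(n_classes, accuracy, trials_per_minute):
    """Scale the per-trial information by the selection rate."""
    return bits_per_trial(n_classes, accuracy) * trials_per_minute


print(bits_per_trial(4, 1.0))  # 2.0
```

The definition relies on a fixed number of selections per unit time, which is natural for synchronized systems but has no direct counterpart in the long NC periods of a self-paced system.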
Cohen’s Kappa coefficient is a measure of agreement between two estimators [138]. Since it considers chance agreements, it is regarded as a more robust measure than OA [10].

The HF-difference is a newly proposed metric that summarizes the confusion matrix [85]. It is defined as the difference between the TP rate and the percentage of total activations that are incorrect (the false discovery rate (FDR) [86]). The advantage of using the HF-difference is that it is sensitive to the ratio of FPs to the total number of detections. The downside is that it does not consider the length of NC periods.

TPR/FPR is another evaluation metric that was recently proposed for 2-class SBCI systems [132, 139]. This metric gives more weight to cases with low FPRs. As a result, during the model tuning process, any model with a high FPR is assigned a low fitness, even though TPR might have a high value. The downside is that for FPR=0, the metric cannot differentiate between confusion matrices with different TPRs.

The receiver operating characteristic (ROC) curve is a popular tool for evaluating systems with imbalanced classes. The ROC curve depicts the relationship between TPR and FPR. Popular methods that use the ROC curve for measuring the performance employ one of the following two criteria: 1) the area under the ROC curve (AUC), which is used as the fitness of the system [10]; 2) defining a critical FPR value ($\mathrm{FPR}_{\mathrm{Critical}}$), and then using the value of the TP rate at $\mathrm{FPR}_{\mathrm{Critical}}$ as the fitness [5]. The advantage of using the ROC curve over the previous metrics is that a whole range of solutions (in terms of a tradeoff between TPR and FPR) is provided. One problem with using the ROC curve is that when it is plotted over the whole range of TPR and FPR, most SBCI systems produce a curve that is similar to a perfect ROC curve [4].
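The single-point metrics described above can all be computed from one 2-class confusion matrix. The following sketch uses their standard definitions; the function name and the example counts are hypothetical, and returning infinity for TPR/FPR when FPR=0 is an assumption made here to expose the limitation mentioned above.

```python
def sbci_metrics(tn, fp, fn, tp):
    """Summarize a 2-class (NC vs. IC) confusion matrix with several metrics."""
    n = tn + fp + fn + tp
    tpr = tp / (tp + fn)                      # true positive rate
    fpr = fp / (fp + tn)                      # false positive rate
    detections = tp + fp
    fdr = fp / detections if detections else 0.0   # false discovery rate
    oa = (tp + tn) / n                        # overall accuracy
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (oa - p_chance) / (1 - p_chance) if p_chance < 1 else 0.0
    ratio = tpr / fpr if fpr > 0 else float("inf")
    return {"OA": oa, "Kappa": kappa, "HF": tpr - fdr, "TPR/FPR": ratio}


# Hypothetical counts: 10000 NC samples and 100 IC samples.
m = sbci_metrics(tn=9900, fp=100, fn=40, tp=60)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these counts OA is above 98% even though more than half of the activations are false, whereas Kappa (about 0.45) and the HF-difference (slightly negative) reflect the weak detection. Two matrices with FPR=0 and different TPRs both map to infinity under TPR/FPR.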
The other problem with using the ROC curve (and perhaps the more important one) is that it is computationally more demanding than other evaluation metrics. Several points need to be evaluated until a partial ROC curve that is accurate enough for estimating the AUC is drawn. Similarly, several points need to be calculated in order to obtain the value of TPR at $\mathrm{FPR}_{\mathrm{Critical}}$. Even if the ROC curve is estimated using the more computationally efficient algorithm described in [128], it remains much more time-consuming than the metrics described above, as these only need the value of a single point to assess the performance. When ROC-based metrics are used to evaluate the performance and select a model from thousands of confusion matrices during a model selection procedure, the computational burden becomes problematic. For these reasons, evaluation metrics that summarize the performance based on a single evaluation of a confusion matrix are more desirable during the model selection procedure.

Each of these metrics has strengths and weaknesses [10]; however, the published SBCI studies do not usually discuss why a particular evaluation metric is chosen for evaluating the performance. This leads to the conclusion that finding suitable evaluation metrics is an important and needed study for SBCI systems. This need has been emphasized in a recently published technical report on evaluating SBCI systems [4].

1.8 Thesis contributions

As discussed above, this thesis addresses three issues of importance to the design of SBCI systems: 1) decreasing the false positive rates in SBCI systems, 2) handling artifacts in SBCI systems, and 3) evaluating the performance of SBCI systems.

To address the high FP rates and the presence of artifacts, in Chapters 4, 5, 7 and 9 we propose and evaluate a new 2-state SBCI system that can distinguish an IC command related to a specific movement pattern from the NC state in EEG signals.
In the design of this system, the main focus is to improve the performance over those of previous EEG-based SBCIs. To achieve this goal, we propose and investigate the simultaneous detection of three neurological phenomena to recognize IC commands. These three neurological phenomena are movement-related potentials (MRPs), changes in the power of Mu rhythms (CPMR) and changes in the power of Beta rhythms (CPBR). These neurological phenomena are known to be time-locked to the onset of movement, so we postulated that detecting all of them at the same time improves the system’s performance. This is the first time in the BCI literature that the analysis of these neurological phenomena at the same time is proposed for detecting IC commands. A systematic approach for feature extraction, selection and classification for each neurological phenomenon is presented. We also propose a 2-stage multiple classifier system (MCS) to efficiently combine the information extracted from these neurological phenomena.

The performance of the proposed system is compared with those of other state-of-the-art EEG-based SBCI systems. It is shown that the proposed method results in a superior performance. A theoretical analysis of the performance of the proposed SBCI is presented, and it is shown that under certain conditions the proposed methodology can theoretically approach perfect classification accuracy.

Since the proposed SBCI relies on detecting more than one neurological phenomenon at the same time, it is expected that its performance is robust in the presence of most artifacts. This is because artifacts are usually more prominent over a certain frequency band and do not affect other frequency bands as much. In Chapter 7, we show that the proposed SBCI has a good performance over periods contaminated with artifacts.

Finally, in Chapter 8 a framework for comparing and selecting evaluation metrics for SBCI systems is proposed.
It is shown that this framework can be successfully applied to select, from a number of available metrics, the evaluation metric that is most suitable for evaluating SBCI systems. The findings of this chapter are applied in Chapter 9 to select the most suitable evaluation metric for evaluating the performance.

The thesis provides a detailed description of our methods and results. The main contributions of this thesis fall into three categories as follows:

1.8.1 Reducing high false positive rates

1) Introducing the idea of using features from MRPs, CPMR and CPBR at the same time to detect the possible presence of IC commands.
2) Developing a new SBCI system that efficiently extracts and classifies features from the above neurological phenomena. A new two-stage multiple classifier system (MCS) is proposed. At the first stage, an MCS is separately designed for each neurological phenomenon. At the second stage, another MCS combines the outputs of the MCSs in the first stage.
3) Studying the performance of the 2-stage MCS using linear programming theory.
4) Investigating the performance of the proposed SBCI system on two datasets: one dataset related to the right finger flexion and the other dataset related to the right hand extension. It will be shown that the proposed SBCI system achieves error rates that are significantly lower than those of other state-of-the-art EEG-based SBCI systems.
5) Comparing the use of monopolar and bipolar EEG electrodes for detecting right hand extension movements and demonstrating that bipolar electrodes provide superior results.
6) Studying the effect of automatic user-customization on the performance of a state-of-the-art self-paced BCI system previously developed in the brain interface lab of the Neil Squire Society. It will be demonstrated that automatic user customization significantly improves the performance compared to manual customization by an expert.
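The two-stage structure described in item 2) can be sketched as follows. This is only a structural illustration under stated assumptions: the binary classifier outputs are invented, majority voting at the first stage and the AND/OR/majority rules at the second stage are hypothetical choices, and the combination methods actually used are those developed in the later chapters.

```python
def majority_vote(votes):
    """Return 1 (IC) if more than half of the binary votes are 1, else 0 (NC)."""
    return 1 if sum(votes) * 2 > len(votes) else 0


def two_stage_mcs(votes_by_phenomenon, stage2_rule=all):
    """Stage 1: fuse the classifiers of each phenomenon separately.
    Stage 2: fuse the per-phenomenon decisions with stage2_rule."""
    stage1 = {name: majority_vote(v) for name, v in votes_by_phenomenon.items()}
    decision = 1 if stage2_rule(list(stage1.values())) else 0
    return decision, stage1


# Hypothetical outputs of three classifiers per neurological phenomenon.
votes = {"MRP": [1, 1, 0], "CPMR": [1, 0, 1], "CPBR": [0, 0, 1]}

print(two_stage_mcs(votes, stage2_rule=all))            # (0, {'MRP': 1, 'CPMR': 1, 'CPBR': 0})
print(two_stage_mcs(votes, stage2_rule=any)[0])         # 1
print(two_stage_mcs(votes, stage2_rule=majority_vote)[0])  # 1
```

A strict rule such as AND at the second stage activates only when all three phenomena agree, which tends to trade TP rate for a lower FP rate; looser rules shift the tradeoff the other way.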
1.8.2 Addressing artifacts in SBCI systems

1) Presenting a detailed review of the methods that handle artifacts in BCI systems. Surprisingly, this review shows that most BCI systems do not address the presence of artifacts properly.
2) Investigating the performance of the proposed SBCI system over periods contaminated with eye-blink artifacts. It will be shown that the system has a reasonably good performance over periods contaminated with large eye movement artifacts.
3) Investigating the performance of the proposed SBCI system using the data from a session recorded a few days after the data used for training the SBCI system. Again, it will be demonstrated that the proposed SBCI system achieves a good performance for three out of four participants whose data are studied.
4) Proposing an artifact monitoring system that detects large eye movement artifacts as well as EMG artifacts related to frontalis muscles at the same time.

1.8.3 Finding a suitable evaluation metric for SBCI systems

1) Proposing a framework for comparing the evaluation metrics during the model selection process in SBCI systems.
2) Applying the proposed framework to a particular SBCI system and finding the most suitable evaluation metric.
3) Demonstrating that the Kappa coefficient is the most suitable evaluation metric for the proposed SBCI system.

1.9 Organization of the thesis

The organization of this thesis is as follows. We first study the Low Frequency - Asynchronous Switch Design (the LF-ASD) in Chapter 2. The LF-ASD is a state-of-the-art EEG-based SBCI system that is used as the basis for performance comparison in some of the following chapters. It is thus reasonable that at the first stage of the research, the structure of the LF-ASD as well as its performance are examined in detail. The parameter values of the feature generator of the LF-ASD have usually been determined by the designer based on trial and error.
This process is suboptimal, subjective and time-consuming for the researchers. In Chapter 2, we propose the use of a genetic algorithm (GA) to automatically tune the parameter values of the feature generator of one of the designs of the LF-ASD. The purpose of this study is 2-fold: 1) to automate the tuning process of the feature generator and 2) to improve the performance of the LF-ASD. Specifically, we are interested in finding an upper limit for the performance of this design of the LF-ASD. This could be a starting point for the next stage of the research, as it would decide whether the current feature generator should be kept or replaced by a more powerful one. In Chapter 2, we show that only moderate improvements in the performance of the LF-ASD occur after automatically tuning the parameters of the feature generator. For this reason, in the subsequent chapters, the use of the wavelet transform for extracting the features is explored.

In Chapter 3, we study the use of the discrete wavelet transform (DWT) for extracting MRP features. The reason behind choosing the DWT is 2-fold: 1) the LF-ASD uses a transform for feature extraction that is a simplified version of the wavelet transform. We thus postulate that a more sophisticated version of this detector (i.e., the DWT) should be able to better extract features related to MRPs; and 2) the DWT provides both time and frequency information, so it can provide more information than the traditional frequency-based approaches and can improve the performance [11]. The evidence from the BCI literature also supports this hypothesis [140, 141]. We compare two different variations of this design. The first is based on MRP features extracted from monopolar EEG channels and the other is based on MRP features extracted from bipolar EEG channels. We argue that the system based on the bipolar MRP features yields a superior performance.
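To make the idea of DWT-based feature extraction concrete, the sketch below implements a multi-level DWT with the Haar wavelet in plain Python. The Haar wavelet is chosen here only because it is the simplest case; it is an assumption for illustration and not necessarily the wavelet used in Chapter 3. The eight-sample signal is likewise invented.

```python
import math


def haar_dwt_level(signal):
    """One level of the orthonormal Haar DWT; len(signal) must be even."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Return (final approximation, [detail level 1, detail level 2, ...])."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_level(approx)
        details.append(detail)
    return approx, details


# Hypothetical 8-sample epoch; len(signal) must be divisible by 2**levels.
epoch = [4, 6, 10, 12, 8, 6, 5, 5]
approx, details = haar_dwt(epoch, levels=2)
print([round(a, 6) for a in approx])      # slow trend (low-frequency content)
print([round(d, 6) for d in details[1]])  # level-2 detail coefficients
```

Each coefficient is tied to both a scale (level) and a position in time, which is the time-frequency localization referred to above. Low-frequency phenomena such as MRPs would be expected to concentrate in the approximation and the coarser detail levels, so a feature selection step can keep only a few coefficients.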
Parallel to the research carried out in Chapter 3, we carried out a second study, reported in Chapter 4, which focused on a simple design of an SBCI system that is based on features extracted from three neurological phenomena: MRP, CPMR and CPBR. This study is carried out as a proof of concept to show that the combination of the three neurological phenomena discussed earlier in this chapter would improve the performance of the system. For this reason, we only applied simple feature extraction and classification methods (matched filtering and the K-nearest neighbor classifier). A 2-stage multiple classifier system (MCS) is proposed to “fuse” the classification results attributed to each neurological phenomenon. Figure 1-9 shows the overall structure of the system studied in Chapter 4.

Figure 1-9. The overall schematic of the SBCI system developed and studied in Chapters 4, 5, 7, and 9.

In Chapter 5, we use the findings from Chapters 3 and 4 and propose an improved design. This design uses a new feature extraction method (a combination of the stationary wavelet transform (SWT) followed by matched filtering) and proposes the use of a hybrid genetic algorithm (HGA) to simultaneously select the features, the parameter values of the classifiers and the combination method for the 2-stage MCS. We demonstrate that the new design achieves much lower FP rates than previous EEG-based SBCI systems, while it maintains a modest TP rate. These promising results facilitate the practical application of the proposed SBCI system.

In Chapters 6 and 7 we focus on artifacts in BCI systems in general and in the proposed SBCI system in particular. Artifacts are unwanted potentials that can change the shape of the neurological phenomena and thus decrease the system’s performance. As a result, handling artifacts is an important part of the design of BCI systems. In Chapter 6, the handling of artifacts in the BCI literature is reviewed.
The results of this review study show that the BCI literature does not properly report artifact handling. In other words, BCI researchers often do not report whether or not they have considered the presence of artifacts. A large number of studies reject the artifact-contaminated periods either manually or automatically. We argue that a proper solution is either to efficiently remove artifacts or to design a BCI system whose performance is robust to the presence of artifacts.

In Chapter 7 we further analyze the performance of the proposed SBCI. We first consider the performance over periods contaminated with eye-blink artifacts. Next, we test the performance of the system on data collected in a session recorded after the sessions used for training and testing the performance of the system. In both cases, we demonstrate that our proposed system maintains its good performance in the presence of artifacts. These results also demonstrate that during online testing, the system does not need to reject periods marked with artifacts. This, in turn, greatly increases the periods during which the system is available for control.

In Chapter 8, we address the critical issue of selecting a suitable evaluation metric in the design of SBCI systems. We revise and improve a framework that was proposed earlier for comparing the classification accuracy and the AUC metrics [128]. Our revised model can be used to compare various metrics as well as to study new metrics. It can also be used for selecting the metric(s) most suitable for evaluating a certain classification system. We also analyze the application of the proposed framework to the field of SBCI systems. In particular, we consider four evaluation metrics: overall classification accuracy (OA), TPR/FPR, the Kappa coefficient, and HF-difference, and compare their performances during the model selection procedure for a particular SBCI system.
We demonstrate that some evaluation metrics, such as Kappa and HF-difference, are more suitable and some, such as OA and TPR/FPR, are less suitable evaluation metrics for SBCI systems.

In Chapters 2 to 8, the type of movement that was considered for the generation of IC commands was the index finger flexion. In order for the system to be generalized to more control options, its performance on new mental tasks (related to other types of movements) should also be investigated. It is desirable that the same system performs well on other types of movements. In Chapter 9, we examine the performance of the SBCI system proposed in Chapter 5 on a new dataset. In this dataset, IC commands are generated by hand extension movements. NC data are also recorded in a more engaging environment than those used in previous studies for training the system. It is demonstrated that our proposed design maintains a superior performance compared to other EEG-based SBCI designs in the literature. In addition, electromyography (EMG) signals from frontalis muscles are recorded to rule out the activation of such muscles during the movement executions. Furthermore, we use the framework developed in Chapter 8 to select the most suitable evaluation metric for the system. We conclude that Cohen’s Kappa coefficient is the most suitable evaluation metric for the model selection procedure of the proposed SBCI system. Finally, we compare the performance of monopolar and bipolar EEG electrode montages. This study shows that the bipolar montage generates more suitable features and thus a superior performance compared to the monopolar montage.

In Chapter 10, we summarize the contributions of this thesis to the field of SBCI systems. We also present some of the potential research subjects that can immediately follow this research. Figure 1-10 shows a summary of the organization of this thesis. In Appendix
A, we provide a copy of the approval from the Behavioral Research Ethics Board (BREB) of the University of British Columbia to conduct this study. In Appendix B, we use linear programming theory to show how the proposed multiple classifier system could achieve perfect classification accuracy under certain conditions.

Figure 1-10. Outline of the thesis. [Flowchart: Chapter 1 (Introduction); Design of a 2-state SBCI system with a low FP rate (Chapters 2, 3, 4, 5 and 9); Analysis of the effect of artifacts in SBCI systems (Chapters 6 and 7); Finding a suitable evaluation metric for SBCI systems (Chapters 8 and 9); Chapter 10 (Conclusions and directions for future work).]

1.10 References

[1] T. Vaughan, W. J. Heetderks, L. J. Trejo, W. Z. Rymer, M. Wienrich, M. M. Moore, A. Kubler, B. H. Dobkin, N. Birbaumer, E. Donchin, E. W. Wolpaw and J. R. Wolpaw, "Brain-computer interface technology: a review of the second international meeting", IEEE Trans. Neural Syst. Rehab. Eng., vol. 11, no. 2, pp. 94-109, 2003.
[2] J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller and T. M. Vaughan, "Brain-computer interfaces for communication and control", Clin. Neurophysiol., vol. 113, no. 6, pp. 767-791, Jun. 2002.
[3] S. G. Mason, A. Bashashati, M. Fatourechi, K. F. Navarro and G. E.
Birch, "A comprehensive survey of brain interface technology designs", Ann. Biomed. Eng., vol. 35, no. 2, pp. 137-169, Feb. 2007.
[4] S. G. Mason, J. Kronegg, J. Huggins, M. Fatourechi and A. Schloegl, "Evaluating the performance of self-paced BCI technology", Technical Report, available online: http://www.bci-info.tugraz.at/Research_Info/documents/articles/self_paced_tech_report-2006-05-19.pdf, 2006.
[5] J. F. Borisoff, S. G. Mason, A. Bashashati and G. E. Birch, "Brain-computer interface design for asynchronous control applications: improvements to the LF-ASD asynchronous brain switch", IEEE Trans. Biomed. Eng., vol. 51, no. 6, pp. 985-992, Jun. 2004.
[6] E. Yom-Tov and G. F. Inbar, "Detection of Movement-Related Potentials from the Electro-Encephalogram for possible use in a Brain-Computer Interface", Medical and Biological Engineering and Computing, vol. 41, no. 1, pp. 85-93, Jan. 2003.
[7] G. E. Birch, Z. Bozorgzadeh and S. G. Mason, "Initial on-line evaluations of the LF-ASD brain-computer interface with able-bodied and spinal-cord subjects using imagined voluntary motor potentials", IEEE Trans. Neural Syst. Rehabil. Eng., vol. 10, no. 4, pp. 219-224, Dec. 2002.
[8] B. Graimann, J. E. Huggins, S. P. Levine and G. Pfurtscheller, "Toward a direct brain interface based on human subdural recordings and wavelet-packet analysis", IEEE Trans. Biomed. Eng., vol. 51, no. 6, pp. 954-962, Jun. 2004.
[9] J. Kronegg, S. Voloshynovskiy and P. Pun, "Analysis of bit rate definitions for brain-computer interfaces," in the Proc. Int. Conf. on Human-Computer Interaction (HCI'05), Las Vegas, Nevada, 2005.
[10] A. Schlögl, J. Kronegg, J. Huggins and S. G. Mason, "Evaluation criteria in BCI research," in Towards Brain-Computer Interfacing (G. Dornhege, J. R. Millan, T. Hinterberger, D. McFarland and K. R. Muller, Eds.), MIT Press, 2007.
[11] A. Bashashati, M. Fatourechi, R. K. Ward and G. E.
Birch, "A survey of signal processing algorithms in brain-computer interfaces based on electrical brain signals", J. Neural Eng., vol. 4, no. 2, pp. R32-57, Jun. 2007.
[12] J. J. Vidal, "Toward direct brain-computer communication", Annu. Rev. Biophys. Bioeng., vol. 2, pp. 157-180, 1973.
[13] G. Pfurtscheller and A. Aranibar, "Event-related cortical desynchronization detected by power measurements of scalp EEG", Electroencephalogr. Clin. Neurophysiol., vol. 42, no. 6, pp. 817-826, Jun. 1977.
[14] L. Leocani, C. Toro, P. Manganotti, P. Zhuang and M. Hallett, "Event-related coherence and event-related desynchronization/synchronization in the 10 Hz and 20 Hz EEG during self-paced movements", Electroencephalogr. Clin. Neurophysiol., vol. 104, no. 3, pp. 199-206, May 1997.
[15] G. Pfurtscheller and F. H. Lopes da Silva, "Event-related EEG/MEG synchronization and desynchronization: basic principles", Clin. Neurophysiol., vol. 110, no. 11, pp. 1842-1857, Nov. 1999.
[16] G. Pfurtscheller, K. Pichler-Zalaudek, B. Ortmayr, J. Diez and F. Reisecker, "Postmovement beta synchronization in patients with Parkinson's disease", J. Clin. Neurophysiol., vol. 15, no. 3, pp. 243-250, May 1998.
[17] D. J. McFarland, W. A. Sarnacki, T. M. Vaughan and J. R. Wolpaw, "Brain-computer interface (BCI) operation: signal and noise during early training sessions", Clin. Neurophysiol., vol. 116, no. 1, pp. 56-62, Jan. 2005.
[18] J. R. Wolpaw and D. J. McFarland, "Control of a two-dimensional movement signal by a noninvasive brain-computer interface in humans", in Proc. Natl. Acad. Sci. U. S. A., vol. 101, no. 51, pp. 17849-17854, Dec. 21, 2004.
[19] J. R. Wolpaw and D. J. McFarland, "Multichannel EEG-based brain-computer communication", Electroencephalogr. Clin. Neurophysiol., vol. 90, no. 6, pp. 444-449, Jun. 1994.
[20] D. J. McFarland, L. M. McCane and J. R. Wolpaw, "EEG-based communication and control: short-term role of feedback", IEEE Trans. Rehabil. Eng., vol. 6, no. 1, pp. 7-11, Mar. 1998.
[21] L. A.
Miner, D. McFarland and J. R. Wolpaw, "Answering Questions with an Electroencephalogram-Based Brain Computer Interface", Arch. Phys. Med. Rehabil., vol. 79, pp. 1029-1033, 1998.
[22] G. R. Muller-Putz, R. Scherer, G. Pfurtscheller and R. Rupp, "EEG-based neuroprosthesis control: a step towards clinical practice", Neurosci. Lett., vol. 382, no. 1-2, pp. 169-174, Jul. 2005.
[23] G. Pfurtscheller, G. R. Müller-Putz, J. Pfurtscheller and R. Rupp, "EEG-Based Asynchronous BCI Controls Functional Electrical Stimulation in a Tetraplegic Patient", EURASIP Journal on Applied Signal Processing, vol. 2005, no. 19, pp. 3152-3155, 2005.
[24] G. Pfurtscheller, B. Graimann, J. E. Huggins and S. P. Levine, "Brain-computer communication based on the dynamics of brain oscillations", Suppl. Clin. Neurophysiol., vol. 57, pp. 583-591, 2004.
[25] G. Pfurtscheller, C. Neuper, C. Guger, W. Harkam, H. Ramoser, A. Schlogl, B. Obermaier and M. Pregenzer, "Current trends in Graz Brain-Computer Interface (BCI) Research", IEEE Trans. Rehab. Eng., vol. 8, no. 2, pp. 216-219, Jun. 2000.
[26] G. Pfurtscheller and C. Neuper, "Motor imagery and direct brain-computer communication", Proc. IEEE, vol. 89, no. 7, pp. 1123-1134, 2001.
[27] L. Deecke, B. Grozinger and H. H. Kornhuber, "Voluntary finger movement in man: cerebral potentials and theory", Biol. Cybern., vol. 23, no. 2, pp. 99-119, Jul. 14, 1976.
[28] H. Shibasaki, G. Barrett, E. Halliday and A. M. Halliday, "Components of the movement-related cortical potential and their scalp topography", Electroencephalogr. Clin. Neurophysiol., vol. 49, no. 3-4, pp. 213-226, Aug. 1980.
[29] I. M. Tarkka and M. Hallett, "Cortical topography of premotor and motor potentials preceding self-paced, voluntary movement of dominant and non-dominant hands", Electroencephalogr. Clin. Neurophysiol., vol. 75, no. 2, pp. 36-43, Feb. 1990.
[30] M. Hallett, "Movement-related cortical potentials", Electromyogr. Clin. Neurophysiol., vol. 34, no. 1, pp. 5-13, Jan.-Feb. 1994.
[31] C.
Babiloni, F. Carducci, F. Cincotti, P. M. Rossini, C. Neuper, G. Pfurtscheller and F. Babiloni, "Human movement-related potentials vs desynchronization of EEG alpha rhythm: a high-resolution EEG study", Neuroimage, vol. 10, no. 6, pp. 658-665, Dec. 1999.
[32] A. Urbano, C. Babiloni, P. Onorati and F. Babiloni, "Human cortical activity related to unilateral movements. A high resolution EEG study", Neuroreport, vol. 8, no. 1, pp. 203-206, Dec. 20, 1996.
[33] A. Urbano, C. Babiloni, P. Onorati, F. Carducci, A. Ambrosini, L. Fattorini and F. Babiloni, "Responses of human primary sensorimotor and supplementary motor areas to internally triggered unilateral and simultaneous bilateral one-digit movements. A high-resolution EEG study", Eur. J. Neurosci., vol. 10, no. 2, pp. 765-770, Feb. 1998.
[34] Z. Bozorgzadeh, G. E. Birch and S. G. Mason, "The LF-ASD brain computer interface: On-line identification of imagined finger flexions in the spontaneous EEG of able-bodied subjects," in Proc. IEEE ICASSP’00, vol. 6, pp. 2385-2388, 2000.
[35] S. G. Mason and G. E. Birch, "A brain-controlled switch for asynchronous control applications", IEEE Trans. Biomed. Eng., vol. 47, no. 10, pp. 1297-1307, Oct. 2000.
[36] G. E. Birch, P. D. Lawrence and R. D. Hare, "Single Trial Processing of Event Related Potentials Using Outlier Information", IEEE Trans. Biomed. Eng., vol. 40, no. 1, pp. 59-73, 1993.
[37] B. Blankertz, G. Dornhege, C. Schäfer, R. Krepki, J. Kolmorgen, K. R. Müller, V. Kunzmann, F. Losch and G. Curio, "Boosting bit rates and error detection for the classification of fast-paced motor commands based on single-trial EEG analysis," in IEEE Trans. Neural Sys. Rehab. Eng., vol. 11, no. 2, 2003.
[38] G. Dornhege, B. Blankertz and G. Curio, "Speeding up classification of multi-channel brain-computer interfaces: Common spatial patterns for slow cortical potentials," in Proc. 1st IEEE EMBS Int. Conf. on Neural Engineering, pp. 595-598, 2003.
[39] E. Yom-Tov and G. F.
Inbar, "Feature Selection for the Classification of Movements From Single Movement-Related Potentials", IEEE Trans. Neural Syst. Rehab. Eng., vol. 10, no.3, pp. 170-177, Sep.2002. [40] E. Yom-Tov and G. F. Inbar, "Selection of relevant features for classification of movements from single movement-related potentials using a genetic algorithm," in the Proc. 23rd IEEE/EMBS Int. Conf.,vol.2,pp. 1364-1366 , 2001. [41] S. P. Levine, J. E. Huggins, S. L. Bement, R. K. Kushwaha, L. A. Schuh, E. A. Passaro, M. M. Rohde and D. A. Ross, "Identification of Electrocorticogram Patterns as the Basis for a Direct Brain Interface", J Clinical Neurophysiol, vol. 16, no.5, pp. 439-447, Sep.1999. [42] J. E. Huggins, S. P. Levine, R. Kushwaha, S. L. Bement, L. A. Schuh and D. A. Ross, "Identification of cortical signal patterns related to human tongue protrusion," in pp. 670-672. 1995.   43 [43] N. Neumann, A. Kubler, J. Kaiser, T. Hinterberger and N. Birbaumer, "Conscious perception of brain states: mental strategies for brain-computer communication", Neuropsychologia, vol. 41, no.8, pp. 1028-1036, 2003. [44] T. Hinterberger, B. Wilhelm, J. Mellinger, B. Kotchoubey and N. Birbaumer, "A device for the detection of cognitive brain functions in completely paralyzed or unresponsive patients", IEEE Trans. Biomed. Eng., vol. 52, no.2, pp. 211-220, Feb.2005. [45] N. Birbaumer, "The thought-translation-device (TTD): Taming cognition for action", Brain Cogn., vol. 54, no.2, pp. 130-130, Mar.2004. [46] T. Hinterberger, S. Schmidt, N. Neumann, J. Mellinger, B. Blankertz, G. Curio and N. Birbaumer, "Brain-computer communication and slow cortical potentials", IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 1011-1018, Jun.2004. [47] N. Birbaumer, A. Kubler, N. Ghanayim, T. Hinterberger, J. Perelmouter, J. Kaiser, I. Iversen, B. Kotchoubey, N. Neumann and H. Flor, "The thought translation device (TTD) for completely paralyzed patients", IEEE Trans. Rehabil. Eng., vol. 8, no.2, pp. 190-193, Jun.2000. 
[48] A. Kubler, B. Kotchoubey, J. Kaiser, J. R. Wolpaw and N. Birbaumer, "Brain-Computer Communication: Unlocking the Locked In", Psych Bulletin, vol. 127, no.3, pp. 358-375, May.2001. [49] J. d. R. Millan, J. Mourino, M. G. Marciani, F. Babiloni, F. Topani, I. Canale, J. Heikkonen and K. Kaski, "Adaptive brain interfaces for physically-disabled people," in Proc. IEEE EMBS Conf, vol.4,, pp. 2008-2011, 1998. [50] C. W. Anderson, S. V. Devulapalli and E. A. Stolz, "Signal Classification with Different Signal Representations", Neural Networks for Signal Processing, pp. 475-483, 1995. [51] D. Garrett, D. A. Peterson, C. W. Anderson and M. H. Thaut, "Comparison of linear, nonlinear, and feature selection methods for EEG signal classification", IEEE Trans. Neural Syst. Rehab. Eng., vol. 11, no.2, pp. 141-144, Jun. 2003. [52] C. W. Anderson, E. A. Stolz and S. Shamsunder, "Multivariate autoregressive models for classification of spontaneous electroencephalographic signals during mental tasks", IEEE Trans. Biomed. Eng., vol. 45, no.3, pp. 277-286, Mar.1998. [53] B. Z. Allison and J. A. Pineda, "ERPs evoked by different matrix sizes: Implications for a brain computer interface (BCI) system", IEEE Trans. Neural Syst. Rehab. Eng., vol. 11, no.2, pp. 110-113, Jun.2003. [54] E. Donchin, K. M. Spencer and R. Wijesinghe, "The mental prosthesis: assessing the speed of a P300-based brain-computer interface", IEEE Trans. Rehabil. Eng., vol. 8, no.2, pp. 174- 179, Jun.2000. [55] L. A. Farwell and E. Donchin, "Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials", Electroencephalogr. Clin. Neurophysiol., vol. 70, no.6, pp. 510-523, Dec.1988. [56] E. W. Sellers, A. Kubler and E. Donchin, "Brain-computer interface research at the University of South Florida Cognitive Psychophysiology Laboratory: the P300 Speller", IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no.2, pp. 221-224, Jun.2006. [57] F. Piccione, F. Giorgi, P. Tonin, K. 
Priftis, S. Giove, S. Silvoni, G. Palmas and F. Beverina, "P300-based brain computer interface: reliability and performance in healthy and paralysed participants", Clin. Neurophysiol., vol. 117, no.3, pp. 531-537, Mar.2006.   44 [58] A. A. Glover, M. C. Onofrj, M. F. Ghilardi and I. Bodis-Wollner, "P300-like potentials in the normal monkey using classical conditioning and an auditory 'oddball' paradigm", Electroencephalogr. Clin. Neurophysiol., vol. 65, no.3, pp. 231-235, May.1986. [59] B. Roder, F. Rosler, E. Hennighausen and F. Nacker, "Event-related potentials during auditory and somatosensory discrimination in sighted and blind human subjects", Brain Res. Cogn. Brain Res., vol. 4, no.2, pp. 77-93, Sep.1996. [60] J. J. Vidal, "Real-Time Detection of Brain Events in EEG,"  Proc IEEE, vol. 65, pp. 633- 641, 1977. [61] E. E. Sutter, "The brain response interface: communication through visually-induced electrical brain responses", J Micro Comp App, vol. 15, pp. 31-45, 1992. [62] M. Middendorf, G. McMillan, G. Calhoun and K. S. Jones, "Brain-Computer Interfaces Based on the Steady-State Visual-Evoked Response", IEEE Trans. Rehab. Eng., vol. 8, no.2, pp. 211-214, Jun. 2000. [63] X. Gao, D. Xu, M. Cheng and S. Gao, "A BCI-based environmental controller for the motion-disabled", IEEE Trans. Neural Syst. Rehab. Eng., vol. 11, no.2, pp. 137-140, Jun. 2003. [64] Y. Wang, R. Wang, X. Gao and S. Gao, "Brain-computer interface based on the high- frequency steady-state visual evoked potential," in the Proc. 1st  Int. Conf. in Neural Interface and Control, pp. 37-39, 2005. [65] Cheng Ming, Gao Xiaorong, Gao Shangkai and Wang Boliang, "Stimulation frequency extraction in SSVEP-based brain-computer interface," in in the Proc. 1st  Int. Conf. Neural Interface and Control, ,pp. 64-67. 2005, [66] Yijun Wang, Zhiguang Zhang, Xiaorong Gao and Shangkai Gao, "Lead selection for SSVEP-based brain-computer interface," in the Proc. 26th IEEE/EMBS Int. Conf.,vol.2,pp. 4507-4510 , 2004. 
[67] J. K. Chapin, K. A. Moxon, R. S. Markowitz and M. A. Nicolelis, "Real-time control of a robot arm using simultaneously recorded neurons in the motor cortex", Nat. Neurosci., vol. 2, no.7, pp. 664-670, Jul.1999. [68] M. A. L. Nicolelis and J. K. Chapin, "Controlling robots with the mind", Sci. Am., vol. 287, no.4, pp. 46-53, Oct.2002. [69] J. T. Francis and J. K. Chapin, "Force field apparatus for investigating movement control in small animals", IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 963-965, Jun.2004. [70] S. Darmanjian, Sung Phil Kim, M. C. Nechyba, S. Morrison, J. Principe, J. Wessberg and M. A. L. Nicolelis, "Bimodal brain-machine interface for motor control of robotic prosthetic," in the Proc. IEEE Int. Conf. Intelligent Robots and Systems, vol.4,pp. 3612-3617 , 2003. [71] J. P. Donoghue, A. Nurmikko, G. Friehs and M. Black, "Development of neuromotor prostheses for humans", Suppl. Clin. Neurophysiol., vol. 57, pp. 592-606, 2004. [72] F. Wood, M. J. Black, C. Vargas-Irwin, M. Fellows and J. P. Donoghue, "On the variability of manual spike sorting", IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 912-918, Jun.2004. [73] L. R. Hochberg, M. D. Serruya, G. M. Friehs, J. A. Mukand, M. Saleh, A. H. Caplan, A. Branner, D. Chen, R. D. Penn and J. P. Donoghue, "Neuronal ensemble control of prosthetic devices by a human with tetraplegia", Nature, vol. 442, no.7099, pp. 164-171, Jul 13.2006.   45 [74] M. D. Serruya, N. G. Hatsopoulos, L. Paninski, M. R. Fellows and J. P. Donoghue, "Instant neural control of a movement signal", Nature, vol. 416, no.6877, pp. 141-142, Mar.2002. [75] T. M. Vaughan, J. R. Wolpaw and E. Donchin, "EEG-Based Communication: Prospects and Problems", IEEE Trans. Rehab. Eng., vol. 4, no.4, pp. 425-430, 1996. [76] G. Dornhege, B. Blankertz, G. Curio and K. R. Muller, "Boosting bit rates in noninvasive EEG single-trial classifications by feature combination and multiclass paradigms", IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 
993-1002, Jun.2004. [77] B. Graimann, J. E. Huggins, A. Schlogl, S. P. Levine and G. Pfurtscheller, "Detection of movement-related desynchronization patterns in ongoing single-channel electrocorticogram", IEEE Trans. Neural Syst. Rehabil. Eng., vol. 11, no.3, pp. 276-281, Sep.2003. [78] R. Scherer, G. R. Muller, C. Neuper, B. Graimann and G. Pfurtscheller, "An asynchronously controlled EEG-based virtual keyboard: improvement of the spelling rate", IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 979-984, Jun.2004. [79] Z. Yu, S. G. Mason and G. E. Birch, "Enhancing the performance of the LF-ASD brain- computer interface," in in Proc. of the 2nd  Joint EMBS/BMES Conference, vol.3,pp. 2443- 2444, 2002. [80] A. Bashashati, M. Fatourechi, R. K. Ward and G. E. Birch, "User customization of the feature generator of an asynchronous brain interface", Ann. Biomed. Eng., vol. 34, no.6, pp. 1051-1060, Jun.2006. [81] A. Bashashati, R. K. Ward and G. E. Birch, "A new design of the asynchronous brain computer interface using the knowledge of the path of features," in Proc.2nd  IEEE EMBS Int. Conf. on Neural Engineering,pp. 101-104. 2005.  [82] A. Bashashati, R. K. Ward and G. E. Birch, "Towards development of a 3-state self-paced brain computer interface", Computational Intelligence and Neuroscience, Vol.2007, pp.1-8, Oct. 2007. [83] J. E. Huggins, S. P. Levine, J. A. Fessler, W. M. Sowers, G. Pfurtscheller, B. Graimann, A. Schloegl, D. N. Minecan, R. K. Kushwaha, S. L. BeMent, O. Sagher and L. A. Schuh, "Electrocorticogram as the basis for a direct brain interface: Opportunities for improved detection accuracy," in Proc. 1st IEEE EMBS Int. Conf. on Neural Engineering,pp. 587-590. 2003, [84] S. P. Levine, J. E. Huggins, S. L. Bement, R. K. Kushwaha, L. A. Schuh, M. M. Rohde, E. A. Passaro, D. A. Ross, K. V. Elisevich and B. J. Smith, "A Direct Brain Interface Based on Event-Related Potentials", IEEE Trans. Rehab. Eng., vol. 8, no.2, pp. 180-185, Jun.2000. [85] J. E. Huggins, S. P. 
Levine, S. L. Bement, R. K. Kushwaha, L. A. Schuh, E. A. Passaro, M. M. Rohde, D. A. Ross, K. V. Elisevich and B. J. Smith, "Detection of Event-Related Potentials for Development of a Direct Brain Interface", J Clinical Neurophysiol, vol. 16, no.5, pp. 448-455, Sep.1999. [86] Y. Benjamini and Y. Hochberg, "Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing", Journal of the Royal Statistical Society.Series B (Methodological), vol. 57, no.1, pp. 289-300, 1995. [87] G. Townsend, B. Graimann and G. Pfurtscheller, "Continuous EEG classification during motor imagery--simulation of an asynchronous BCI", IEEE Trans. Neural Syst. Rehabil. Eng., vol. 12, no.2, pp. 258-265, Jun.2004.   46  [88] C. Toro, G. Deuschl, R. Thatcher, S. Sato, C. Kufta and M. Hallett, "Event-related desynchronization and movement-related cortical potentials on the ECoG and EEG", Electroencephalogr. Clin. Neurophysiol., vol. 93, no.5, pp. 380-389, Oct.1994. [89] S. Arroyo, R. P. Lesser, B. Gordon, S. Uematsu, D. Jackson and R. Webber, "Functional significance of the mu rhythm of human cortex: an electrophysiologic study with subdural electrodes", Electroencephalogr. Clin. Neurophysiol., vol. 87, no.3, pp. 76-87, Sep.1993. [90] G. Pfurtscheller and A. Aranibar, "Evaluation of event-related desynchronization (ERD) preceding and following voluntary self-paced movement", Electroencephalogr. Clin. Neurophysiol., vol. 46, no.2, pp. 138-146, Feb.1979. [91] L. Defebvre, J. L. Bourriez, K. Dujardin, P. Derambure, A. Destee and J. D. Guieu, "Spatiotemporal study of Bereitschaftspotential and event-related desynchronization during voluntary movement in Parkinson's disease", Brain Topogr., vol. 6, no.3, pp. 237-244, Spring.1994. [92] K. R. Muller, G. Curio, B. Blankertz and G. Dornhege, "Combining features for BCI," in the Proc. Advances in Neural Inf. Proc. Systems (NIPS 02), vol.15,2003. [93] L. Narici, V. Pizzella, G. L. Romani, G. Torrioli, R. Traversa and P. M. 
Rossini, "Evoked alpha- and mu-rhythm in humans: a neuromagnetic study", Brain Res., vol. 520, no.1-2, pp. 222-231, Jun 18.1990. [94] B. Feige, R. Kristeva-Feige, S. Rossi, V. Pizzella and P. M. Rossini, "Neuromagnetic study of movement-related changes in rhythmic brain activity", Brain Res., vol. 734, no.1-2, pp. 252-260, Sep 23.1996. [95] G. Pfurtscheller, "Central beta rhythm during sensorimotor activities in man", Electroencephalogr. Clin. Neurophysiol., vol. 51, no.3, pp. 253-264, Mar.1981. [96] W. Szurhaj, P. Derambure, E. Labyt, F. Cassim, J. L. Bourriez, J. Isnard, J. D. Guieu and F. Mauguiere, "Basic mechanisms of central rhythms reactivity to preparation and execution of a voluntary movement: a stereoelectroencephalographic study", Clin. Neurophysiol., vol. 114, no.1, pp. 107-119, Jan.2003. [97] H. S. Liu, X. Gao, F. Yang and S. Gao, "Imagined hand movement identification based on spatio-temporal pattern recognition of EEG," in Proc. of the 1st  Joint EMBS/BMES Conference, pp. 599-602. 2003,  [98] B. D. Mensh, J. Werfel and H. S. Seung, "BCI Competition 2003--Data set Ia: combining gamma-band power with slow cortical potentials to improve single-trial classification of electroencephalographic signals", IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 1052-1056, Jun. 2004. [99] T. Hinterberger and G. Baier, "Parametric orchestral sonification of EEG in real time", Multimedia, IEEE, vol. 12, no.2, pp. 70-79, 2005. [100] Y. Wang, Z. Zhang, Y. Li, X. Gao, S. Gao and F. Yang, "BCI Competition 2003--Data set IV: an algorithm based on CSSD and FDA for classifying single-trial EEG", IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 1081-1086, Jun. 2004. [101] M. Krauledat, G. Dornhege, B. Blankertz, F. Losch, G. Curio and K. -. Muller, "Improving speed and accuracy of brain-computer interfaces using readiness potential features," in the Proc. 26th IEEE/EMBS Int. Conf.,vol.2,pp. 4511-4515 , 2004. [102] T. M. Vaughan, W. J. Heetderks, L. J. Trejo, W. Z. Rymer, M. 
Weinrich, M. M. Moore, A. Kubler, B. H. Dobkin, N. Birbaumer, E. Donchin, E. W. Wolpaw and J. R. Wolpaw, "Brain-   47 computer interface technology: a review of the Second International Meeting", IEEE Trans. Neural Syst. Rehabil. Eng., vol. 11, no.2, pp. 94-109, Jun.2003. [103] J. S. Barlow, "Artifact processing (rejection and minimization) in EEG data processing", Handbook of Electroencephalography and Clinical Neurophysiology (Revised Series Ed.), Amsterdam: Elsevier, vol.2., pp.15–62, 1986. [104] P. Anderer, S. Roberts, A. Schlogl, G. Gruber, G. Klosch, W. Herrmann, P. Rappelsberger, O. Filz, M. J. Barbanoj, G. Dorffner and B. Saletu, "Artifact processing in computerized analysis of sleep EEG - a review", Neuropsychobiology, vol. 40, no.3, pp. 150-157, Sep.1999. [105] D. J. McFarland, L. M. McCane, S. V. David and J. R. Wolpaw, "Spatial filter selection for EEG-based communication", Electroencephalogr. Clin. Neurophysiol., vol. 103, no.3, pp. 386-394, Sep.1997. [106] W. Waterink and A. van Boxtel, "Facial and jaw-elevator EMG activity in relation to changes in performance level during a sustained information processing task", Biol. Psychol., vol. 37, no.3, pp. 183-198, Jul.1994. [107] B. H. Cohen, R. J. Davidson, J. A. Senulis, C. D. Saron and D. R. Weisman, "Muscle tension patterns during auditory attention", Biol. Psychol., vol. 33, no.2-3, pp. 133-156, Jul.1992. [108] I. I. Goncharova, D. J. McFarland, T. M. Vaughan and J. R. Wolpaw, "EMG contamination of EEG: spectral and topographical characteristics", Clin. Neurophysiol., vol. 114, no.9, pp. 1580-1593, Sep.2003. [109] D. J. McFarland, W. A. Sarnacki, T. M. Vaughan and J. R. Wolpaw, "Brain-computer interface (BCI) operation: signal and noise during early training sessions", Clin. Neurophysiol., vol. 116, no.1, pp. 56-62, Jan.2005. [110] R. N. Vigario, "Extraction of ocular artefacts from EEG using independent component analysis", Electroencephalography and Clinical Neurophysiology, vol. 103, no.3, pp. 
395- 404, Sep. 1997. [111] R. Verleger, "The instruction to refrain from blinking affects auditory P3 and N1 amplitudes", Electroencephalogr. Clin. Neurophysiol., vol. 78, no.3, pp. 240-251, Mar.1991. [112] C. J. Ochoa and J. Polich, "P300 and blink instructions", Clin. Neurophysiol., vol. 111, no.1, pp. 93-98, Jan.2000. [113] G. Gratton, "Dealing with artifacts: The EOG contamination of the event-reJated brain potential", Behavior Research Methods, Instruments, & Computers, vol. 30, no.1, pp. 44-53, 1998. [114] H. Ramoser, J. Muller-Gerking and G. Pfurtscheller, "Optimal Spatial Filtering of Single Trial EEG During Imagined Hand Movement", IEEE Trans. Rehab. Eng., vol. 8, no.4, pp. 441-446, Dec.2000. [115] J. Millan, M. Franze, J. Mourino, F. Cincotti and F. Babiloni, "Relevant EEG features for the classification of spontaneous motor-related tasks", Biol. Cybern., vol. 86, no.2, pp. 89-95, Feb.2002. [116] R. J. Croft and R. J. Barry, "Removal of ocular artifact from the EEG: a review", Neurophysiol. Clin., vol. 30, no.1, pp. 5-19, Feb.2000. [117] V. Rowland, "Cortical steady potential (direct current potential) in reinforcement and learning", Progress in Physiological Psychology, vol. 2, pp. 1–77, 1968.   48 [118] J. S. Barlow, "EMG artifact minimization during clinical EEG recordings by special analog filtering", Electroencephalogr. Clin. Neurophysiol., vol. 58, no.2, pp. 161-174, Aug.1984. [119] J. R. Ives and D. L. Schomer, "A 6-pole filter for improving the readability of muscle contaminated EEGs", Electroencephalogr. Clin. Neurophysiol., vol. 69, no.5, pp. 486-490, May.1988. [120] S. Choi, A. Cichocki, H. M. Park and S. Y. Lee, "Blind Source Separation and Independent Component Analysis: A Review",  Neural Information Processing-Letters and Review, vol. 6, no.1, pp. 1–57, 2005. [121] T. D. Lagerlund, F. W. Sharbrough and N. E. 
Busacker, "Spatial filtering of multichannel electroencephalographic recordings  through principal component analysis by singular value decomposition", J. Clin. Neurophysiol., vol. 14, no.1, pp. 73-82, Jan.1997. [122] M. Browne and T. R. Cutmore, "Low-probability event-detection and separation via statistical wavelet  thresholding: an application to psychophysiological denoising", Clin. Neurophysiol., vol. 113, no.9, pp. 1403-1411, Sep.2002. [123] P. He, G. Wilson and C. Russell, "Removal of ocular artifacts from electro-encephalogram by adaptive filtering", Med. Biol. Eng. Comput., vol. 42, no.3, pp. 407-412, May.2004. [124] P. Berg and M. Scherg, "A multiple source approach to the correction of eye artifacts", Electroencephalogr. Clin. Neurophysiol., vol. 90, no.3, pp. 229-241, Mar.1994. [125] D. Burke, S. Kelly, P. de Chazal and R. Reilly, "A simultaneous filtering and feature extraction strategy for direct brain interfacing," in Proc. of the 2nd  Joint EMBS/BMES Conference,vol.1,pp. 279-280 , 2002. [126] A. Kuebler, B. Kotchoubey, H. P. Salzmann, N. Ghanayim, J. Perelmouter, V. Homberg and N. Birbaumer, "Self-regulation of slow cortical potentials in completely paralyzed human patients", Neurosci. Lett., vol. 252, no.3, pp. 171-174, Aug.1998. [127] F. Provost and T. Fawcett, "Robust Classification for Imprecise Environments", Mach. Learning, vol. 42, no.3, pp. 203-231, 2001. [128] J. Huang and C. X. Ling, "Using AUC and accuracy in evaluating learning algorithms", IEEE Trans. Knowled. Data Eng., vol. 17, no.3, pp. 299-310, 2005. [129] J. Zhu and T. Yao, "An evaluation of statistical spam filtering techniques", ACM Transactions on Asian Language Information Processing (TALIP), vol. 3, no.4, pp. 243-269, 2004. [130] A. P. Bradley, "Use of the area under the ROC curve in the evaluation of machine learning algorithms", Pattern Recognit, vol. 30, no.7, pp. 1145-1159, 1997. [131] N. T. Choplin and D. C. 
Lundy, "The sensitivity and specificity of scanning laser polarimetry in the detection of glaucoma in a clinical setting", Ophthalmology, vol. 108, no.5, pp. 899-904, May.2001. [132] M. Fatourechi, G. E. Birch and R. K. Ward, "A self-paced brain interface system that uses movement related potentials and changes in the power of brain rhythms", J. Comput. Neurosci., vol.23, no.1, pp.21-37, Aug. 2007. [133] N. Yamawaki, C. Wilke, Z. Liu and B. He, "An enhanced time-frequency-spatial approach for motor imagery classification", IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no.2, pp. 250-254, Jun.2006.   49 [134] A. Buttfield, P. W. Ferrez and R. Millan Jdel, "Towards a robust BCI: error potentials and online learning", IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no.2, pp. 164-168, Jun.2006. [135] G. R. Muller-Putz, R. Scherer, C. Neuper and G. Pfurtscheller, "Steady-state somatosensory evoked potentials: suitable brain signals for brain-computer interfaces?", IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no.1, pp. 30-37, Mar.2006. [136] J. R. Wolpaw, D. McFarland and G. Pfurtscheller, "EEG-based Communication: Improved Accuracy by Reponse Verification", IEEE Trans. Rehab. Eng., vol. 6, no.3, pp. 326-333, 1998. [137] M. Fatourechi, S. G. Mason, G. E. Birch and R. K. Ward, "Is information transfer rate a suitable performance measure for self-paced brain interface systems?" in Proc. IEEE Int. Symp. Signal Processing and Information Technology, pp. 212-216, 2006. [138] J. Cohen, "A coefficient of agreement for nominal scales", Educational and Psychological Measurement, vol. 20, no.1, pp. 37-46, 1960. [139] M. Fatourechi, G. E. Birch and R. K. Ward, "Applying a hybrid genetic algorithm in the design of a self-paced brain interface with a low false positive rate," in Proc. IEEE ICASSP’07,vol.4,pp. IV-1157; IV-1160, Apr. 2007. [140] V. 
CHAPTER 2 AUTOMATIC USER CUSTOMIZATION FOR IMPROVING THE PERFORMANCE OF A SELF-PACED BRAIN COMPUTER INTERFACE SYSTEM¹

2.1 Introduction

A self-paced brain computer interface (BCI) system allows individuals with severe motor disabilities to control objects in their environment using only their brain signals and at any time, i.e., at their own pace [1-11]. The output of a self-paced BCI system should be activated only when the user intends control, and should remain inactive at all other times. Implementing such a BCI system is much more difficult than implementing a traditional synchronized BCI system, in which the user can control a device only during periods specified by the system [12].

BCI systems use specific features of a neurological phenomenon in the brain activity for the purpose of control. Various neurological phenomena can be used, including neural firing rates, changes in the Mu and Beta rhythms, movement-related potentials (MRPs), slow cortical potentials (SCPs) and the P300. For a complete list of neurological phenomena used in BCI systems and pertinent references, see [13]. In designing a feature extractor for a BCI system, an important factor is the variability of the chosen neurological phenomenon; i.e., its characteristics may change from one user to another. For example, it has been shown that the Mu and Beta frequency bands [14] and the shape of an MRP [15] may vary from one user to another.
As a result, if the feature extractor does not extract user-specific features, the performance of the BCI system may degrade [16], or the system may even detect an incorrect pattern [15]. A successful BCI system must therefore select features that correctly characterize the underlying neurological phenomenon of the specific user. We call this process user customization of the feature extractor.

¹ A version of this chapter has been published. Fatourechi, M., Bashashati, A., Birch, G.E. and Ward, R.K. “Automatic User Customization for Improving the Performance of an Asynchronous Brain Interface System”, Journal of Medical & Biological Engineering and Computing, Vol.44, No.12, Dec 2006, pp.1093-1104.

Traditionally, BCI systems have not employed user customization for extracting features. Recent studies, however, have shown that user customization of the feature extractor leads to improved performance for most users [5, 15-19]. User customization can be achieved either “manually” or “automatically”.

In manual user customization, the neurological phenomenon of interest is visually inspected by a human expert (usually by inspecting the ensemble average of many single trials); the expert then determines the parameter values of the feature extractor [15]. This customization process has two main advantages: it is relatively fast and it is not computationally demanding. Thus, when the total number of users and EEG channels is small and the signal-to-noise ratio (SNR) of the neurological phenomenon is sufficiently high, the manual approach can be used for customizing the parameter values. When the number of users grows, however, this process becomes increasingly time-consuming and exhausting. The problem becomes more challenging when features are extracted from a large number of EEG channels, since many EEG channels (and not only one or two) need to be visually inspected.
If the SNR of the neurological phenomenon is low, visual estimation becomes subjective and inaccuracies are introduced in the estimates of the parameter values. Furthermore, if preprocessing that changes the shape of the neurological phenomenon of interest is employed, this change should be taken into account in the design of the feature extractor. For these reasons, an automatic user customization algorithm is desirable.

In this Chapter, we employ automatic user customization of the feature extractor of a self-paced BCI system called the Low Frequency-Asynchronous Switch Design (LF-ASD) [1]. The LF-ASD detects an Intentional Control (IC) command in the EEG signal. The IC command corresponds to an MRP pattern generated by the flexion of the right index finger. When users are not in an IC state, they are said to be in a no-control (NC) state. In an NC state, a user may be idle or may perform some action other than trying to control the BCI system. We chose the LF-ASD for this study because of our intimate knowledge of this BCI system. In addition, the LF-ASD has been used as the basis for potential design improvements by other researchers [6], and it is one of the few BCI systems that have been successfully tested online [20].

Because the shape of MRP patterns differs from one user to another, determining the specific design parameter values for each individual is expected to improve the performance of a BCI system. In [15], the parameter values of the feature extractor of the LF-ASD are estimated by a human expert (see Section 2.3 for details), and it is shown that such user customization improves the performance of the LF-ASD. There are some limitations, however, in the application of that method. First, as mentioned above, the process can become very time-consuming, especially for a relatively large number of EEG channels, which is the case here (the LF-ASD uses six bipolar EEG channels).
Second, the LF-ASD incorporates a pre-processing component that changes the shapes of MRPs. Third, the SNR of the MRPs is usually very low; this makes estimating the parameter values from the ensemble averages unreliable for some individuals [15].

In this study, we propose the use of a genetic algorithm (GA) to automatically estimate the shape of MRPs for each user and thus user-customize the parameter values of the LF-ASD. A GA is a heuristic search method that provides a framework for effectively sampling large search spaces [21]. GAs are modeled on the genetic processes of biological organisms, which evolve over many generations according to the principles of natural selection and survival of the fittest. By mimicking this process, GAs are able to evolve solutions to real-world problems. They have been shown to be effective in optimization problems that involve a high-dimensional feature space, especially when the optimization problem cannot be solved by analytical tools [21, 22]. Since we plan to automatically estimate the shape of the MRP pattern for each EEG channel, and are therefore dealing with a high-dimensional parameter space, we employ GAs.

The use of a GA for automatic user customization of the LF-ASD was also motivated by the results of our earlier work in [23]. There, we used a GA to automatically customize the parameter values of the post-processing component of the LF-ASD for two individuals. The improvements in performance for these two individuals demonstrated the effectiveness of employing a GA.

This study demonstrates that automatic user customization of the LF-ASD results in statistically significant improvements in performance over the BCI system whose design parameter values are user-customized by a human expert [15]. This finding further supports existing evidence that automatic user customization leads to performance improvement in BCI systems.
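To make the search idea concrete, the following is a minimal GA sketch in Python. It is not the configuration used in this study: the operators (tournament selection, one-point crossover, uniform mutation, elitism), the population settings, the function name `run_ga` and the toy fitness function are all assumptions chosen for brevity. In a BCI setting the fitness would presumably score a candidate parameter vector by the classification performance it yields on training data.

```python
import random


def run_ga(fitness, n_params, low, high, pop_size=30, generations=40,
           crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Maximise `fitness` over integer vectors with entries in [low, high]."""
    rng = random.Random(seed)
    pop = [[rng.randint(low, high) for _ in range(n_params)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in pop]

        def tournament():
            # Pick two individuals at random and keep the fitter one.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        next_pop = [best[:]]  # elitism: the best individual always survives
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_params - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Uniform mutation: occasionally redraw a gene from its range.
            child = [rng.randint(low, high) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
    return best


# Toy usage: search for a hidden 30-element integer vector (hypothetical target).
target = [17] * 30
toy_fitness = lambda ind: -sum((g - t) ** 2 for g, t in zip(ind, target))
best = run_ga(toy_fitness, n_params=30, low=0, high=63)
```

Because the best individual is copied unchanged into each new generation, the fitness of the returned vector can never be worse than that of the best member of the initial population.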
2.2 Background

This section briefly reviews MRPs and the overall structure of the LF-ASD. MRPs are low-frequency potentials that start about 1-1.5 seconds before a movement. They have a bilateral distribution and present maximum amplitude at the vertex [24-26]. An MRP is a robust phenomenon observed in the brain signal. It has been shown that the shapes of MRPs resulting from the actual execution of a movement are similar to those resulting from an attempt to perform a movement [1]. In some BCI systems, MRPs have thus been chosen as the neurological phenomenon from which the presence of an IC command is extracted [1, 27-29].

An MRP consists of different components, such as the Bereitschaftspotential, the motor potential (MP), the post-movement positive potential (PMPP), etc. [30]. Different BCI systems focus on detecting different components. For example, in [31] the Bereitschaftspotential is detected, whereas in [1] the whole MRP is detected. There are various methods for detecting the components of MRPs, such as using autoregressive parameters [27], the wavelet transform [28] and the Fourier transform [29]. One solution is to use a simple feature extractor that detects the peaks of MRPs, since the peaks are found to be robust across individuals [1]. The LF-ASD system and its variations have used this idea for detecting MRP patterns [1, 15, 32].

The block diagram of the LF-ASD [15, 32] is shown in Figure 2-1. This design uses features extracted from six bipolar EEG channels located over the sensorimotor cortex. After amplification and low-pass filtering with a linear-phase FIR filter with a 4 Hz cut-off frequency, all six EEG channels are normalized with an energy normalization transform (ENT) [33]. The ENT (see Figure 2-1) normalizes the input energy and has been shown to result in better class separation by increasing the difference between the means of the IC and NC features [32-34].
The output of the ENT is calculated using

$$e(n) = \frac{x(n)}{\sqrt{\sum_{s=-(W_N-1)/2}^{(W_N-1)/2} x(n-s)^2}} \tag{2-1}$$

where x(n) is the input EEG channel, W_N is the width of the “sliding” window used to normalize x(n), and e(n) is the normalized EEG channel (the output of the ENT). The only design parameter of the ENT is W_N, the “normalization parameter”. Its value was originally determined through an exhaustive search on the data collected from one individual. This value was then used for all other individuals [33].

[Figure 2-1. Components of the LF-ASD system (from [32]). Block labels: electrode array, amp, ENT, LF-ASD feature generator, KLT, 1-NN classifier, codebook generation mechanism, moving average, debounce, Feature Translator.]

A specific feature generator is then applied to detect the presence of an MRP pattern in the single-trial bipolar EEG signals [1]. Figure 2-2 shows the points used to calculate the features from a sample EEG signal at a particular point in time (t = n'). As shown in Figure 2-2, each of the elemental features E_i(n') and E_j(n') is defined as the difference in e(n) at two points in time, described by (2-2) and (2-3).

[Equations (2-2) and (2-3), which define E_i(n') and E_j(n') in terms of e(n) and the delay parameters, are not legible in this copy.]

[Figure 2-2. Points selected by the feature generator when applied to a sample bipolar EEG signal. Axes: e(n) versus time; annotations: α_i, β_i, α_j, β_j, E_i(n'), E_j(n'), t = n'.]

Here e(n) is the ENT-normalized EEG signal. Throughout this Chapter, the parameters (α_i, α_j, β_i, β_j) are referred to as the “delay parameters” and are used to estimate the shape of a bipolar MRP.
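The energy normalization step described above can be sketched in Python as follows. This is a minimal sketch, assuming that the ENT divides each sample by the square root of the signal energy in a centred window of W_N samples. The function name `energy_normalize` and the handling of the signal edges (the window is simply truncated) are assumptions, not details taken from [33].

```python
import math


def energy_normalize(x, window):
    """Divide each sample by the root of the energy in a centred window.

    `window` plays the role of W_N and must be a positive odd integer.
    Near the edges of the signal the window is truncated (an assumption).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = (window - 1) // 2
    out = []
    for n in range(len(x)):
        lo, hi = max(0, n - half), min(len(x), n + half + 1)
        energy = sum(v * v for v in x[lo:hi])
        # Guard against an all-zero window to avoid division by zero.
        out.append(x[n] / math.sqrt(energy) if energy > 0 else 0.0)
    return out


normalized = energy_normalize([3.0, 4.0, 0.0], window=3)
```

One consequence of this form is that the output does not change when the whole input is multiplied by a positive constant, which is the sense in which the input energy is normalized.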
To emphasize the samples for which two large elemental features appear concurrently, compound features are defined by pairing the elemental features (E_i, E_j), as shown below:

$$g_{ij}(n') = \begin{cases} E_i(n')\,E_j(n') & \text{if } E_i(n'),\,E_j(n') > 0 \\ 0 & \text{otherwise} \end{cases} \tag{2-4}$$

For robustness, the compound features are maximized over a window as follows:

$$G_{ij}(n') = \max\{\, g_{ij}(n'-8),\ g_{ij}(n'-7),\ \ldots,\ g_{ij}(n'-1),\ g_{ij}(n') \,\} \tag{2-5}$$

Since there are six bipolar EEG channels, this procedure is repeated for each of these channels. The compound features of the six EEG signals then form a 6-dimensional feature vector. The Karhunen-Loève Transform (KLT) is used to reduce the 6-dimensional feature space produced by the feature generator to a 2-dimensional space [35]. A 1-nearest neighbor (1-NN) classifier is used as the feature classifier. The codebook generation mechanism for the classifier is explained elsewhere [1]. Finally, a moving average and a debounce algorithm are employed to improve the classification accuracy of the system by reducing the number of false activations (for details, see [1, 32]). After training, the LF-ASD classifies the input patterns as one of two classes: NC or IC.

2.3 Problem statement

In designing the LF-ASD, the parameter values of the ENT and the feature generator must be estimated. The aim is to determine these estimates so that it is possible to detect MRP patterns in a single trial. The ENT has one parameter to be determined. This is the window size, W_N. Its value should be estimated for each of the six EEG channels. The feature generator has four delay parameters (αi, αj, βi, βj) for each of the six EEG channels, resulting in a total of 24 delay parameters whose values should be estimated. This means that, to detect the presence of an MRP pattern, the values of 30 parameters should be determined.
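The compound feature, its windowed maximum and the parameter count above can be sketched as follows. This is an illustration under stated assumptions rather than the LF-ASD code: the compound feature is taken to be non-zero only when both elemental features are positive, the maximum is taken over the current sample and the eight preceding ones, and the names `compound`, `windowed_max` and `N_DESIGN_PARAMS` are hypothetical.

```python
def compound(e_i, e_j):
    """Product of two elemental features, kept only when both are positive."""
    return e_i * e_j if e_i > 0 and e_j > 0 else 0.0

def windowed_max(g, n, span=8):
    """Maximum of g over samples n-span .. n (window truncated at the start)."""
    return max(g[max(0, n - span):n + 1])

# Design-parameter count from the problem statement:
# one normalization window plus four delay parameters per channel.
N_CHANNELS = 6
PARAMS_PER_CHANNEL = 1 + 4
N_DESIGN_PARAMS = N_CHANNELS * PARAMS_PER_CHANNEL

g = [0.0, 5.0, 1.0] + [0.0] * 7 + [2.0]
```

The windowed maximum makes a detection persist for a few samples, which is one plausible reading of why it is described as adding robustness.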
For the rest of this Chapter, we refer to these 30 parameters as the "design parameters". These parameters were originally estimated by a human expert from the ensemble average of MRP patterns for one individual and were then used for all subsequent individuals [1, 32]. As the MRP pattern related to a specific movement may differ from one individual to another, using the same design parameter values for all individuals may lead to erroneous results. Therefore, the design parameter values should be estimated for each individual. The same argument applies to any BCI system that uses a user-dependent pattern for its IC state.

When determining the design parameter values, two points should be considered. First, these values cannot be determined using an exhaustive search approach: without an efficient automatic method, it is prohibitively time consuming to determine all parameter values simultaneously by exhaustive search. Second, an improper choice of design parameter values may lead the BCI system to detect an incorrect pattern in the EEG signal. This, in turn, degrades the performance of the system, since the detected pattern would not correspond to an MRP pattern and may instead have resulted from a particular artifact.

To improve the performance of the LF-ASD, the (αi, αj) delay parameter values were user customized by a human expert in [15]. The βi's and βj's were set to 0, equal αi and αj values were used for each of the six EEG channels, and the size of the normalization window was fixed for all individuals and for all EEG channels. For each individual, the MRP pattern associated with the flexion of the right index finger was determined using the ensemble average of the MRP pattern. The delay parameter values were then estimated by visually inspecting the user's ensemble average of the MRP patterns.
The rationale behind using the ensemble average was that it enhanced the SNR, and that the resulting waveform better showed the desired pattern that the LF-ASD aimed at detecting. As for the normalization parameters, since no analytical method for estimating these values existed, the values found earlier in [33] by trial and error were used. The data of eight individuals were analyzed [15]. Improvements from 2.0% to 6.8% were reported for four individuals, but the results for the rest of the individuals did not improve [15]. Although implementing the above customization approach seems straightforward, there were some problems associated with it. First, estimating the delay parameter values from the ensemble averages was not trivial. The number of available trials had a significant effect on the quality of the generated ensemble averages and ultimately on the estimated values of the delay parameters. For some individuals, there were a number of closely located peaks in the ensemble averages that made the estimation of the delay parameter values very difficult. As a result, several points had to be tested before the desired delay parameter values were estimated [15]. Also, the values of the normalization parameters and the delay parameters were not estimated simultaneously. Since the ENT was applied first, the delay parameter values were estimated subsequently. Thus, for each value of the normalization parameter of the ENT, the delay parameter values had to be estimated. Since no analytical method currently exists for estimating the normalization parameter values, the resulting estimates of the delay parameter values may not be reliable. Finally, the amount of improvement in the performance of the system found in [15] over that of the non-user customized system [32], was not as high as expected. 
This is probably due to the fact that estimating the delay parameter values based on the ensemble averages does not guarantee optimal performance in single-trial analysis. To address these limitations, in the next section we propose the use of a GA to automatically user customize the design parameter values.

2.4 Methods

In applying GAs to select the parameter values, each parameter of interest is first coded in the form of a randomly generated binary string. Each bit in this binary string is called a gene. The concatenation of all the binary strings forms a "chromosome", and the set of "chromosomes" forms a "population". Each chromosome is then evaluated and assigned a fitness value. For example, the fitness value can be the classification accuracy of the BCI system for a particular set of parameter values. The chromosomes are then combined using operators such as "selection", "crossover" and "mutation" in order to generate new chromosomes. The "selection" operator selects a proportion of the existing population to breed a new generation. The selected chromosomes are usually the ones with higher fitness compared to other chromosomes in the population. After selection of the "fitter" chromosomes, a pair of "parent" chromosomes is selected for generating the "child" chromosomes. A child chromosome is a new solution that typically shares many of the characteristics of its "parents". The "crossover" operator ensures that this is the case by copying some of the genes of each parent to the child. The "mutation" operator is used to maintain genetic diversity from one generation of a population to the next. This process is repeated until a new population of chromosomes is generated. It is expected that the population evolves gradually and that fitness improves over generations. This process is continued until some criterion for stopping the GA is met [21]. The GA we apply for user customization has the following characteristics.
Each chromosome consists of a concatenated binary version of 31 parameter values. These parameters comprise the 30 design parameters previously stated and the "scale factor" parameter, which determines the operating point of the BCI system on the receiver operating characteristic (ROC) curve. The ROC curve shows the relationship between the true positive (TP) and the false positive (FP) results for each parameter configuration (for more details on the scale factor and on plotting the ROC curve for the LF-ASD, see [32]). The function of the scale factor is explained below.

The width of the normalization window was chosen to be from 0 to 1.5 seconds. The initial values of the delay parameters were visually estimated from the ensemble averages of the MRP patterns in the training data with the ENT removed from the system. For simplicity, the same initial delay values were chosen for all channels. The ranges for the delay parameter values were then chosen as follows (all numbers refer to sample numbers):

Range of αi: [αi-est - 32 to αi-est + 96]
Range of αj: [αj-est - 96 to αj-est + 32]
Range of βi: [-32 to +32]
Range of βj: [-32 to +32]                                   (2-6)

where αi-est and αj-est are the approximate values of the delay parameters estimated from the ensemble averages. These parameter ranges were chosen to cover the range over which the peaks associated with the pattern shown in Figure 2-2 are expected to occur. Their values thus give an estimate of the shape of the MRP pattern.

The range of the scale factor (which determines the operating point on the ROC curve) was chosen to be from 0.1 to 4. Our experience has shown that this selection covers the range of operating points on the ROC curve of the LF-ASD that lie at low FP rates [32]. Following an initial estimate of the delay parameter values, a suitable fitness function for the GA was chosen as follows.
A confusion matrix, shown in Table 2-1, was used to summarize the classification performance of a 2-state self-paced BCI system. In Table 2-1, the FP rate is the percentage of NC states misclassified as IC states, the true negative (TN) rate is the percentage of NC states correctly classified, the TP rate is the percentage of IC states correctly classified, and the false negative (FN) rate is the percentage of IC states misclassified as NC states. A suitable fitness function for a self-paced BCI should be able to effectively summarize the confusion matrix. For a two-state self-paced BCI system such as the LF-ASD, we have

FN(%) = 100(%) - TP(%)  and  TN(%) = 100(%) - FP(%)          (2-7)

Based on (2-7), the fitness function needs to contain only the TP and FP rates. One choice of a good fitness function is one that maximizes the TP rate for a reasonably low fixed FP rate. This choice is based on our previous results, where it was found that an FP rate above 2% caused excessive frustration and distraction in users of a self-paced BCI system [32]. Thus, it is important to keep the FP rates below 2%.

Table 2-1. The confusion matrix for a 2-state self-paced BCI system.

                     Predicted class
Actual class         IC        NC
IC                   TP        FN
NC                   FP        TN

Our earlier attempts at calculating a suitable performance measure based on the confusion matrix were based on reporting the TP rate at a fixed FP rate (which was set at 2%; see [32] for details). In order to achieve this, various points on the ROC curve were analyzed by varying the scale factor until a desired point, with an FP rate of 2%, was found. Such an approach is undesirable for calculating the fitness function because of the huge computational load involved. Currently, each evaluation of the fitness function, including training the classifier and evaluating the system on the validation set, takes about two minutes on a PC with a Pentium IV 2.8 GHz CPU and 512 MB of RAM.
Since finding a specific point on the ROC curve requires several such evaluations, and this process has to be repeated for all chromosomes in the population, the computational load increases dramatically. To be more specific, if the time needed for each evaluation of the fitness function is denoted by T_Evaluations, N_Chromosome evaluations are needed to find a specific point on the ROC curve, and the GA needs to evaluate N_Evaluations chromosomes during its operation, the running time of the GA can be calculated as follows:

$$T_{GA} = N_{Evaluations} \times N_{Chromosome} \times T_{Evaluations} \tag{2-8}$$

Since T_Evaluations ≈ 2 minutes and N_Evaluations is in the order of thousands (e.g., 5000), it is evident that even for a small N_Chromosome, T_GA will become very large. For the same reason, using the area under the ROC curve is not practical at this stage, since several points on the ROC curve would have to be estimated for a single evaluation of the fitness function.

Our final configuration incorporated the FP rate as a constraint in the fitness function. We defined the fitness function as follows:

$$fitness(Chromosome) = \begin{cases} TP, & \text{if } FP \le 2\% \\ 0.1 \times TP, & \text{if } FP > 2\% \end{cases} \tag{2-9}$$

where the TP and the FP rates are expressed in %. In (2-9), the TP rates remain intact only for FP values that do not exceed 2%. For FP>2%, we attenuated the fitness of these chromosomes dramatically in order to prevent the less fit chromosomes from becoming active members of the population. Although such chromosomes had high TP rates, they also had high FP rates, and were considered "unfit" from a practical point of view. The scale factor was added to the structure of the chromosome because of the expectation that the algorithm would be able to find the value of the scale factor that yields the highest TP rate when FP ≤ 2%. In [23], we showed that this was indeed the case. The GA was able to find the scale factor value yielding the highest TP rate for FP ≤ 2%.
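The constrained fitness in (2-9) and the running-time estimate in (2-8) can be made concrete with a short sketch. It assumes that the penalty in (2-9) multiplies the TP rate by 0.1 once the FP rate exceeds 2%, and that one fitness evaluation takes two minutes, as stated above. The function names and default arguments are hypothetical and are not taken from the thesis.

```python
def fitness(tp, fp, fp_limit=2.0, penalty=0.1):
    """TP rate (in %) when FP <= fp_limit, otherwise a heavily attenuated TP rate."""
    return tp if fp <= fp_limit else penalty * tp

def ga_runtime_minutes(n_evaluations, n_chromosome, t_evaluation=2.0):
    """Running time from (2-8): evaluations x ROC points per chromosome x minutes each."""
    return n_evaluations * n_chromosome * t_evaluation

single_point = ga_runtime_minutes(5000, 1)   # one fitness evaluation per chromosome
five_points = ga_runtime_minutes(5000, 5)    # five ROC points per chromosome
```

With 5000 evaluations, a single evaluation per chromosome already amounts to 10,000 minutes (roughly a week), and searching five ROC points per chromosome would multiply that by five. This is the reason given above for folding the FP constraint into the fitness function instead of searching the ROC curve.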
The remaining operators of the GA were chosen as follows. Tournament-based selection (tournament size = 3) was used as the selection operator. Uniform crossover (p=0.9) and uniform mutation (p=0.01) operators were used. The sizes of the initial population and of the population in the next generations were chosen as 200 and 100, respectively. We used random initialization for initializing the GA. The number of evaluations was set to 5000 and this criterion was used for the termination of the algorithm. If the improvement in the best solution was found to be less than 1% for more than 10 consecutive generations before the total number of evaluations was reached, the algorithm was terminated. Because of the computational load involved, we did not tune the GA parameter values such as the mutation and crossover rates.

2.5 Experimental results

In this section, the performance of the proposed algorithm is evaluated using the data collected from eight individuals. Off-line data were collected from users positioned 150 cm in front of a computer monitor. The EEG signals were recorded from six bipolar electrode pairs positioned over the users' supplementary motor area and primary motor cortex at F1-FC1, Fz-FCz, F2-FC2, FC1-C1, FCz-Cz, and FC2-C2 in accordance with the International 10-20 System. Features extracted from these channels had been shown to provide more discriminant information for the separation of IC and NC features [1]. Electrooculography (EOG) activity was measured as the potential difference between two electrodes, placed at the corner of and below the right eye. The ocular artifacts were automatically rejected when the difference between the EOG electrodes exceeded ±25 µV. All signals were sampled at 128 Hz and referenced to the ear electrodes (see [1, 36] for details). Data from four individuals with a high-level spinal cord injury (location of injury between C4-5 and C6-7 on the spinal cord) and four able-bodied individuals were used in this study.
The individuals with spinal cord injury were coded as SCI (spinal cord injury) individuals and the able-bodied individuals were coded as AB individuals. None of the individuals with spinal cord injury had residual sensation or motor function in their hands. The users' descriptions are shown in Table 2-3.

The data were collected from the users as they performed a guided task. At each interval, a white circle of 2 cm diameter was displayed on the user's monitor for ¼ second, prompting the user to attempt a movement. In response to this cue, the user had to attempt to flex his or her right index finger one second after the cue appeared. The 1-second delay was used to avoid visual evoked potential (VEP) effects from the cue, and the users were trained to estimate it. The time 1 second after the cue is referred to as the "time of the expected attempted movement (TEM)". Note that this is the time when the user is expected to attempt to perform the movement, and that the actual time may vary from one user to another and from trial to trial. This task resulted in an attempted movement in individuals with spinal cord injury, i.e., no physical finger movement, and an actual finger flexion in able-bodied individuals (see [36] for more details).

For each user, an average of 80 trials was collected every day over a period of 6 days. The data in the EEG signals were divided into segments, each seven seconds long. A 7-second window was wide enough to contain an MRP pattern as well as NC periods. A training set, a validation set and a test set were then randomly generated for each user from these 7-second windows. The training set was used to train the classifier. The validation set was used to select the optimal values of the design parameters using the proposed GA. The parameter values yielding the least error on the validation set were then selected. The performance of the system was evaluated using the test set.
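The random partition into training, validation and test sets can be sketched as follows. This is only an illustration: it assumes five equal, non-overlapping folds used in a 1:2:2 ratio, which is consistent with the epoch counts reported in Table 2-3 (for example 128/256/256 for AB1), and it drops any epochs left over when the total is not divisible by five. The function name `split_epochs` and the fixed seed are hypothetical.

```python
import random

def split_epochs(epochs, seed=0):
    """Shuffle epochs and split them 1:2:2 into train, validation and test sets."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(epochs)
    rng.shuffle(shuffled)
    fold = len(shuffled) // 5          # size of one of the five equal folds
    train = shuffled[:fold]
    validation = shuffled[fold:3 * fold]
    test = shuffled[3 * fold:5 * fold]
    return train, validation, test

train, validation, test = split_epochs(range(640))
```

Keeping the test folds out of both classifier training and GA-based parameter selection is what allows the test-set results to be read as an estimate of performance on unseen epochs.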
For each user, the epochs were randomly divided into five non-overlapping sets of equal size. The data in the first set were used for training, the data in sets two and three were used for estimating the parameters and the data in sets four and five were used for testing the performance of the selected model. The number of epochs in the training, validation and test sets for each user is reported in the fifth column of Table 2-3.

The features in (2-4) were generated by moving the feature generator over epochs, each 7 seconds long. Since the EEG signal is filtered to frequencies below 4 Hz, the feature generator was shifted by 0.0625 seconds (8 samples), resulting in a total of 112 features in a 7-second epoch. To determine whether or not an IC command was detected by the system, we defined a window around the TEM. The length of this window was 1.5 seconds (from 0.5 seconds before the TEM to 1 second after the TEM). If an MRP pattern was detected at any time within any such window, the output of the BCI system was considered activated. This method is similar to those used by other researchers [3, 6, 7]. False positives were assessed in the periods before the system cue appeared and after the user was expected to perform the movement.

In [15], a 5-fold stratified cross-validation process was used to assess the performance of the LF-ASD. The trials in the training sets, validation sets and test sets were chosen randomly. The performance over different validation sets varied very little. Thus, to save on computational time, we did not perform cross-validation over the different validation sets in this study, saving about 20% of the time needed for a 5-fold stratified cross-validation.

Figure 2-3 shows the fitness of the best chromosome in each generation as a function of the generation number for two representative individuals (AB2 and SCI4). Figure 2-3 clearly shows the evolution of the fitness of the best chromosome as the GA explores the search space.
Note that in the early stages of the GA, the fitness of the best chromosome improves quickly. As the population evolves, the rate of improvement drops. This is because in the early stages of the GA, the value of the scale factor is not properly chosen. The design parameter values are also far from optimal. As the population evolves, the GA is able to find the scale factor value that yields the highest TP rate for FP=2%. This in turn results in a significant improvement in the fitness. As the generation number increases, the scale factor value is more properly set. The rate of improvement thus drops.

Table 2-2 summarizes the performance of the GA. In this table, the average fitness of the population, the fitness of the best chromosome and the fitness of the worst chromosome are reported for both the initial and the final populations. As Table 2-2 shows, the average fitness of the initial population is very low. This result may be due to the following reasons: (1) The parameter values are randomly selected and are far from optimal. (2) The scale factor value is not properly set. Many chromosomes in the population are thus assigned a fitness value equal to zero since their FP rates are above the threshold of FP=2% (see (2-9)).

Figure 2-3. The fitness of the best chromosomes as a function of the generation number for two representative individuals. a) AB2; b) SCI4. (Both panels plot fitness against generation number, from 0 to 50.)

As the population evolves through generations, the GA is able to find the optimal value of the scale factor that yields the highest TP rates for FP=2%. Moreover, the choice of optimal parameter values leads to the generation of chromosomes with high fitness, resulting in an increased average fitness of the population.
Since the GA found suitable scale factor values for the chromosomes, the fitness of the weakest chromosome in the population is also dramatically increased.

Table 2-2. Comparison of the fitness value of the initial and final populations (tested on the validation sets).

           Initial population              Final population
User       Worst    Mean     Best          Worst    Mean     Best
AB1        0        13.45    63.75         76.79    76.90    77.65
AB2        0        13.74    66.19         81.31    81.50    82.99
AB3        0        6.33     54.55         78.03    79.22    80.65
AB4        0        13.66    65.42         82.93    83.82    86.51
SCI1       0        16.00    64.68         75.88    77.46    78.34
SCI2       0        8.93     63.69         79.21    81.74    83.19
SCI3       0        15.77    64.46         70.76    72.42    73.33
SCI4       0        11.94    51.14         73.12    75.81    76.24
Average    0        12.48    61.73         77.25    78.61    79.86

The performance of the proposed "Automatically User Customized LF-ASD" system, or ALF-ASD, on the test sets is shown in Table 2-3. We compared the performance of the ALF-ASD with that of the latest design of the LF-ASD, whose parameter values were tuned by a human expert [15]. The estimates of the delay parameter values in [15] are shown in Table 2-4. We tested both designs on 10 different randomly chosen datasets. The TP results were then averaged over the 10 sets for a fixed FP rate of 2%.

Table 2-3 shows the results of running both algorithms on the data of all individuals. The numbers in parentheses show the standard deviations. The last column shows the difference in TP rate for each user as well as the significance levels of the results, found by applying a two-sample t-test. Before carrying out the t-test, Levene's test for equality of variances was used to determine whether equal or unequal variances should be assumed in the t-test [37]. The results of Levene's test showed the homogeneity of the variances. As Table 2-3 shows, the average TP rate was increased to 67.78% from the 61.13% achieved using the method described in [15].
Such an improvement was statistically significant for 5 users (p<0.01) and non-significant for the remaining three (p>0.05). The average improvement in the TP rate for individuals with spinal cord injury was more than that of able-bodied individuals. To be more specific, the average TP rate for individuals with spinal cord injury was increased to 64.90% in the current study from the 55.08% achieved using the customization by a human expert (an increase of 9.82%). As for able-bodied users, the average TP rate was increased to 70.76% in the current study from the 67.17% achieved using the customization by a human expert (an increase of 3.58%).

Interestingly, the standard deviations of the TP rate also dropped from those achieved using the customization by a human expert. For individuals with spinal cord injury, the standard deviation of the TP rate decreased to 4.58% from 12.32%, while for able-bodied users, the standard deviation fell to 2.76% from 3.39%. Overall, the standard deviation of the TP rate was reduced to 4.65% compared to the 10.57% achieved using the customization by a human expert. These findings indicate that as we remove the inaccuracies introduced by having a human expert estimate the design parameter values, the performance levels of different individuals become closer to one another. In other words, these results indicate that if the parameter values of the feature generator are correctly determined, the inter-subject variability in terms of performance will decrease.

Table 2-3. TP rates of the LF-ASD and the ALF-ASD (FP=2%).
                    Disability                     Number of epochs                                    LF-ASD        ALF-ASD      Difference in
User                description    Age   Gender    Train          Validation     Test                  (%)           (%)          the TP (%)
AB1                 N/A            56    M         128            256            256                   65.5 (3.6)    70.0 (1.9)   4.5 (p<0.01)
AB2                 N/A            43    M         103            206            206                   72.2 (2.3)    72.6 (3.2)   0.5 (p>0.05)
AB3                 N/A            31    F         133            266            266                   66.2 (1.4)    67.5 (3.2)   1.2 (p>0.05)
AB4                 N/A            45    M         97             194            194                   64.7 (3.4)    72.9 (3.8)   8.2 (p<0.005)
Average (AB users)  N/A            -     -         115.2 (17.9)   230.5 (35.8)   230.5 (35.8)          67.2 (3.4)    70.8 (2.8)   3.6 (p=0.07)
SCI1                C4/5 (17 y²)   53    M         128            256            256                   63.5 (2.0)    64.6 (3.5)   1.1 (p>0.05)
SCI2                C4/5 (23 y)    56    M         103            206            206                   66.0 (4.4)    70.6 (2.6)   4.5 (p<0.005)
SCI3                C5/6 (4 y)     33    M         91             182            182                   39.1 (5.1)    59.3 (4.1)   20.2 (p<0.0001)
SCI4                C4/5 (5 y)     35    M         85             170            170                   51.7 (5.7)    65.0 (5.3)   13.3 (p<0.0001)
Average (SCI)       -              -     -         101.7 (19.0)   203.5 (38.1)   203.5 (38.1)          55.1 (12.3)   64.9 (4.6)   9.8 (p=0.09)
Overall average     -              -     -         108.5          217            217                   61.1 (10.6)   67.8 (4.6)   6.7 (p=0.06)

² Indicates number of years since injury.

Table 2-4. Delay parameter values used in the design of the LF-ASD based on the ensemble averages of the MRP patterns in the training data set. Note that βi and βj are set to zero and that the same delay parameter values are used for the rest of the bipolar channels. The table is reproduced from [15].

User    αi     αj
AB1     95     87
AB2     83     114
AB3     37     21
AB4     128    43
SCI1    112    99
SCI2    95     53
SCI3    39     64
SCI4    89     69

2.6 Discussion and conclusions

An important issue in the design of many BCI systems is the correct detection of the IC pattern (if present) for each user. Since the shape of a neurological phenomenon varies to some extent from one individual to another, it is necessary to consider this variation in the design of BCI systems. As a result, adjusting the parameter values of the feature extractor (user customization of the feature generator of the BCI system) is necessary for each user.
If such user customization is done visually by a human expert, the results may be subjectively biased and unreliable; the customization process also becomes time consuming and exhausting. An automatic method therefore needs to be developed to perform user customization without the intervention of a human expert.

In this Chapter, the effect of automatic user customization of the design parameter values of a self-paced BCI system was analyzed. More specifically, we proposed an automatic method for estimating the shape of an MRP used to drive the output of a self-paced BCI. Since MRPs have been used as the neurological phenomenon in a number of BCI systems, an automatic algorithm to estimate their shape can be used as an effective feature extraction method in those systems.

A GA was implemented to user customize a self-paced BCI called the LF-ASD. The LF-ASD is one of the few self-paced BCI systems that have been successfully tested online [20] and has been used by other researchers as well [6]. In the design of the LF-ASD, estimates of the delay parameter values obtained from the ensemble averages may be far from optimal because of the noisy nature of the EEG signals, the presence of artifacts and the psychological factors of each user. In addition, no analytical method currently exists for estimating the normalization parameter values. Until recently, these have been estimated in an ad-hoc manner through an exhaustive search of possible values. Automatic customization resolves this problem, since it estimates the parameter values depending on their associated cost functions.

We showed that by using a GA, the performance of the LF-ASD is improved to a great extent over the case where the design parameter values were estimated by a human expert [15]. This finding provides additional evidence that automatic user customization boosts the performance of a BCI system.
Moreover, the designer is relieved of the cumbersome task of choosing the parameter values of the feature extractor for each user.

One of the interesting findings of this Chapter is that the highest improvements were achieved in the performance of individuals with spinal cord injury when the delay parameter values were automatically customized. When the customization was done by a human expert, the highest improvements were achieved for able-bodied users [15], whereas the performance of individuals with spinal cord injury did not improve much. On the other hand, the results presented in Table 2-3 show that when the automatic user customization is used, the highest improvements are achieved for individuals with spinal cord injury. The average improvement in the TP rate was 3.58% for able-bodied users and 9.82% for individuals with spinal cord injury (resulting in an overall improvement of 6.68% (p=0.06)). This is probably due to the fact that individuals with spinal cord injury did not perform an actual movement, thus their MRP patterns were not as strong as those of able-bodied users. This resulted in noisier ensemble average MRP templates for these individuals, for whom visual estimation of the delay parameter values was not straightforward. The proposed automatic user customization method, however, deals with the optimization of the performance over single epochs and thus was able to find more suitable delay parameter values. Because of the low number of users, these findings cannot be generalized. They do, however, provide some preliminary evidence that automatic user customization is necessary for achieving acceptable BCI performance for individuals with spinal cord injury.

We also found that for every individual, the values of the delay parameters found by the GA differed from one channel to another. There are two reasons for this result. First, the spatial distribution of the measured EEG signals was taken into consideration.
Since the spatial distribution differs from one channel to another, it is expected that the delay parameter values should also differ. The other reason is the presence of the ENT. The value chosen for each normalization parameter changes the shape of the resultant EEG signals to some extent. Thus, for every value of the normalization window, a new set of delay parameter values should be estimated to correctly detect the presence of a bipolar MRP pattern in the EEG signal. The design parameter values found by the GA also differed from one user to another, providing further evidence that user customization is necessary to achieve acceptable performance values.

Comparison of the average results on the validation sets (Table 2-2) with those on the test sets (Table 2-3) shows a drop of 12.05% in the performance. This drop in the performance indicates that the use of more sophisticated classifiers may be beneficial. For example, a support vector machine (SVM) can be used as a classifier, since it not only minimizes the empirical risk (the training error) but also minimizes the confidence error (the test error) [38].

Future work includes finding better cost functions. Such a study has not been well explored in self-paced BCI systems. Finding better cost functions that can summarize the confusion matrix more effectively is especially desirable in optimization problems. Future work should also include online testing of the ALF-ASD. Specifically, we shall investigate the performance of the ALF-ASD over time. Since the literature indicates that the shapes of MRPs may change from one day to another, a method that locally tunes the parameter values of the feature generator ahead of each session should be developed.

2.7 Acknowledgements

This work was supported in part by the NSERC under Grant 90278-06 and the CIHR under Grant MOP-72711. The authors also would like to thank Mr. Craig Wilson for his valuable comments on this Chapter.

2.8 References

[1] S. G. Mason and G. E.
Birch, "A brain-controlled switch for asynchronous control applications”,  IEEE Trans. Biomed. Eng., vol. 47, no.10, pp. 1297-1307, Oct. 2000. [2] G. E. Birch, P. D. Lawrence and R. D. Hare, "Single-trial processing of event-related potentials using outlier information”,  IEEE Trans. Biomed. Eng., vol. 40, no.1, pp. 59-73, Jan. 1993. [3] S. P. Levine, J. E. Huggins, S. L. BeMent, R. K. Kushwaha, L. A. Schuh, M. M. Rohde, E. A. Passaro, D. A. Ross, K. V. Elisevich and B. J. Smith, "A direct brain interface based on event-related potentials”,  IEEE Trans. Rehabil. Eng., vol. 8, no.2, pp. 180-185, Jun. 2000. [4] R. Millan Jdel and J. Mourino, "Asynchronous BCI and local neural classifiers: an overview of the Adaptive Brain Interface project”,  IEEE Trans. Neural Syst. Rehabil. Eng., vol. 11, no.2, pp. 159-161, Jun. 2003. [5] R. Scherer, G. R. Muller, C. Neuper, B. Graimann and G. Pfurtscheller, "An asynchronously controlled EEG-based virtual keyboard: improvement of the spelling rate”,  IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 979-984, Jun. 2004. [6] E. Yom-Tov and G. F. Inbar, "Detection of movement-related potentials from the electro- encephalogram for possible use in a brain-computer interface”,  Med. Biol. Eng. Comput., vol. 41, no.1, pp. 85-93, Jan. 2003. [7] G. Townsend, B. Graimann and G. Pfurtscheller, "Continuous EEG classification during motor imagery--simulation of an asynchronous BCI”,  IEEE Trans. Neural Syst. Rehabil. Eng., vol. 12, no.2, pp. 258-265, Jun. 2004. [8] L. R. Hochberg, M. D. Serruya, G. M. Friehs, J. A. Mukand, M. Saleh, A. H. Caplan, A. Branner, D. Chen, R. D. Penn and J. P. Donoghue, "Neuronal ensemble control of prosthetic devices by a human with tetraplegia”,  Nature, vol. 442, pp. 164-171, Jul 13. 2006. [9] J. F. Borisoff, S. G. Mason and G. E. Birch, "Brain interface research for asynchronous control applications”,  IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no.2, pp. 160-164, Jun. 2006. [10] J. T. Francis and J. K. 
Chapin, "Neural ensemble activity from multiple brain regions predicts kinematic and dynamic variables in a multiple force field reaching task”,  IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no.2, pp. 172-174, Jun. 2006. [11] G. Pfurtscheller, G. R. Muller-Putz, A. Schlogl, B. Graimann, R. Scherer, R. Leeb, C. Brunner, C. Keinrath, F. Lee, G. Townsend, C. Vidaurre and C. Neuper, "15 years of BCI research at Graz University of Technology: current projects”,  IEEE Trans. Neural Syst. Rehabil. Eng., vol. 14, no.2, pp. 205-210, Jun. 2006. [12] S. G. Mason and G. E. Birch, "Temporal control paradigms for direct brain interfaces - rethinking the definition of asynchronous and synchronous”, in Proc. HCI International Conference, Las Vegas, USA, 2005. [13] S. G. Mason, A. Bashashati, M. Fatourechi, K. F. Navarro and G. E. Birch, "A Comprehensive Survey of Brain Interface Technology Designs”,  Annals of Biomedical Engineering, vol. 35, no. 2, pp. 137-69, Feb 2007. [14] M. Pregenzer and G. Pfurtscheller, "Frequency component selection for an EEG-based brain to computer interface”,  IEEE Trans. Rehabil. Eng., vol. 7, no.4, pp. 413-419, Dec. 1999.   73 [15] A. Bashashati, M. Fatourechi, R. K. Ward and G. E. Birch, "User customization of the feature generator of an asynchronous brain interface”,  Ann. Biomed. Eng., vol. 34, no.6, pp. 1051-1060, Jun. 2006. [16] M. Pregenzer and G. Pfurtscheller, "Frequency component selection for an EEG-based brain to computer interface”,  IEEE Trans. Rehabil. Eng., vol. 7, no.4, pp. 413-419, Dec. 1999. [17] G. Blanchard and B. Blankertz, "BCI Competition 2003--Data set IIa: spatial patterns of self- controlled brain rhythm modulations”,  IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 1062- 1066, Jun. 2004. [18] Wenjie Xu, Cuntain Guan, Chng Eng Siong, S. Ranganatha, M. Thulasidas and Jiankand Wu, "High accuracy classification of EEG signal”, in Proc. 17th Int. Conf. Pattern Recognition (ICPR 2004), vol.2,  pp. 391-394, 2004. [19] T. N. 
Lal, M. Schroder, T. Hinterberger, J. Weston, M. Bogdan, N. Birbaumer and B. Scholkopf, "Support vector channel selection in BCI”,  IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 1003-1010, Jun. 2004. [20] S. G. Mason, R. Bohringer, J. F. Borisoff and G. E. Birch, "Real-time control of a video game with a direct brain--computer interface”,  J. Clin. Neurophysiol., vol. 21, no.6,  pp. 404- 408, Nov-Dec. 2004. [21] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Reading, MA: Addison-Wesley Publishing Company, 1989. [22] T. Back, D. B. Fogel and T. Michalewicz, Evolutionary Computation. Bristol and Philadelphia: Institute of Physics Publishing, 2000. [23] M. Fatourechi, A. Bashashati, R. K. Ward and G. E. Birch, "A hybrid genetic algorithm approach for improving the performance of the LF-ASD brain computer interface”, in the Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, (ICASSP '05), vol. 5, pp. 345-348, 2005. [24] C. Babiloni, F. Carducci, F. Cincotti, P. M. Rossini, C. Neuper, G. Pfurtscheller and F. Babiloni, "Human movement-related potentials vs desynchronization of EEG alpha rhythm: a high-resolution EEG study”,  Neuroimage, vol. 10, no.6, pp. 658-665, Dec. 1999. [25] L. Deecke, B. Grozinger and H. H. Kornhuber, "Voluntary finger movement in man: cerebral potentials and theory”,  Biol. Cybern., vol. 23, no.2, pp. 99-119, Jul 14. 1976. [26] M. Hallett, "Movement-related cortical potentials”,  Electromyogr. Clin. Neurophysiol., vol. 34, no.1, pp. 5-13, Jan-Feb. 1994. [27] D. P. Burke, S. P. Kelly, P. de Chazal, R. B. Reilly and C. Finucane, "A parametric feature extraction and classification strategy for brain-computer interfacing”,  IEEE Trans. Neural Syst. Rehabil. Eng., vol. 13, no.1, pp. 12-17, Mar. 2005. [28] E. L. Glassman, "A wavelet-like filter based on neuron action potentials for analysis of human scalp electroencephalographs”,  IEEE Trans. Biomed. Eng., vol. 52, no.11, pp. 1851- 1862, Nov. 2005. [29] M. 
Krauledat, G. Dornhege, B. Blankertz, F. Losch, G. Curio and K. -. Muller, "Improving speed and accuracy of brain-computer interfaces using readiness potential features”, in Proc. EMBC Int. Conf., vol.6, pp.4511-4515, 2005. [30] R. Q. Cui and L. Deecke, "High resolution DC-EEG analysis of the Bereitschaftspotential and post  movement onset potentials accompanying uni- or bilateral voluntary  finger movements”,  Brain Topogr., vol. 11, no.3,  pp. 233-249, Spring. 1999.   74 [31] B. Blankertz, C. Schäfer, G. Dornhege and G. Curio, "Single trial detection of EEG error potentials: A tool for increasing BCI transmission rates”, in Proc. Int. Conf. Artificial Neural Networks (ICANN’02),  pp. 1137-1143, 2002. [32] J. F. Borisoff, S. G. Mason, A. Bashashati and G. E. Birch, "Brain-computer interface design for asynchronous control applications: improvements to the LF-ASD asynchronous brain switch”,  IEEE Trans. Biomed. Eng., vol. 51, no.6, pp. 985-992, Jun. 2004. [33] Z. Yu, S. G. Mason and G. E. Birch, "Enhancing the performance of the LF-ASD brain- computer interface”, in Proce.  2nd Joint IEEE-EMBS/BMES Conference, Houston, TX, USA, vol. 3,  pp.2443-2444, Oct. 2002. [34] Z. Yu, S. G. Mason and G. E. Birch, "Impact of an energy normalization transform on the performance of the LF-ASD brain computer interface”, in the Proc. Advances in Neural Information Processing Systems (NIPS’03), 16,  pp. 725-732, 2003. [35] N. S. Jayant and P. Noll, Digital Coding of Waveforms. Prentice Hall, 1984, [36] G. E. Birch, Z. Bozorgzadeh and S. G. Mason, "Initial on-line evaluations of the LF-ASD brain-computer interface with able-bodied and spinal-cord subjects using imagined voluntary motor potentials”,  IEEE Trans. Neural Syst. Rehabil. Eng., vol. 10, no.4, pp. 219-224, Dec. 2002. [37] K. A. Brownlee, "Statistical theory and methodology in science and engineering”,  A Wiley Publication in Applied Statistics, New York: Wiley, 1965, 2nd Ed., 1965. [38] H. Yoon, K. Yang and C. 
Shahabi, "Feature subset selection and feature ranking for multivariate time series", IEEE Trans. Knowledge and Data Eng., vol. 17, no.9, pp. 1186-1198, 2005.

CHAPTER 3 APPLICATION OF A HYBRID WAVELET FEATURE SELECTION METHOD IN THE DESIGN OF A SELF-PACED BRAIN COMPUTER INTERFACE SYSTEM³

³ A version of this chapter has been published. Fatourechi, M., Birch, G. E., and Ward, R. K., "Application of a Hybrid Wavelet Feature Selection Method in the Design of a Self-paced Brain Interface System", Journal of NeuroEngineering and Rehabilitation, Vol.4, No.1, Apr 2007.

3.1 Background
A successful brain computer interface (BCI) system enables individuals with severe motor disabilities to control objects in their environment (such as a light switch, a neural prosthesis or a computer) by using only their brain signals. Such a system measures specific features of a person's brain signal that relate to his or her intent to affect control, and then translates them into control signals that are used to control a device [1, 2].

Brain computer interface systems are implemented in two ways: system-paced (synchronized) or self-paced (asynchronous). In system-paced BCI systems, a user can initiate a command only during certain periods specified by the system. In a self-paced BCI system, users can affect the output of the BCI system whenever they want, by intentionally changing their brain state. The state in which a user is intentionally attempting to control a device is called an intentional control (IC) state. At other times, users are said to be in a no-control (NC) state, where they may be idle, thinking about a problem, or performing some action other than trying to control the device [3, 4]. To operate in this paradigm, BCI systems should be designed to respond only when the user is in an IC state and to remain inactive when the user is in an NC state. So far, only a few BCI systems (e.g. [3, 5-10]) have been specifically designed and tested for self-paced
control applications. But as recognized in [2], self-paced BCI systems deserve more attention.

The discrete wavelet transform (DWT) can be used as a powerful feature extraction tool to extract time-frequency features similar in shape to a particular wavelet function. It therefore has an advantage over feature extraction methods that operate in only one domain, such as the Fourier transform or autoregressive modeling. The DWT has been extensively applied in the analysis of event-related potentials (ERPs) because of its ability to effectively explore both the time and frequency information of these signals [11, 12]. It has also been successfully used to generate wavelet features in BCI systems. In [13], the DWT was employed in the design of a synchronized BCI system that used wavelet coefficients extracted from slow cortical potentials (SCPs) as well as other ERPs. This system performed better than other designs that used EEG time series and a mixed filtering method. In [14], the energies of various frequency bands decomposed by a wavelet packet transform (18 frequency bands in total) were used as features in detecting different movement patterns in a self-paced BCI system. These features were linearly combined to generate a single feature, with the coefficients of the linear mapping determined by a genetic algorithm (GA). In [15], a custom-made wavelet function was employed in two different studies: the detection of P300 in a single EEG channel, and the detection of the Bereitschaftspotential from two EEG channels. In [16], a weighted linear combination of all available wavelet coefficients (15 in total) extracted from a single EEG channel was used to detect P300 patterns. To estimate the weight of each feature in the linear combination, a neural network was employed. Finally, in [17], investigators applied the DWT to extract the 0-4 Hz component of the EEG signal in a P300-based BCI system.
Based on the above encouraging results, in this study we explore applying the DWT to extract movement-related potential (MRP) features for driving a self-paced BCI system. Although the above BCI studies provide promising evidence that the DWT can be employed to extract features in BCI systems, two main issues still need to be addressed.

First, studies that used discrete wavelet coefficients as features (rather than wavelet-filtered EEG signals) used only one or two EEG channels. In these cases, the resulting dimensionality of the feature space does not pose a serious problem, since it is not very large. Having a BCI system that uses data recorded from only one or two electrodes seems very appealing, since the setup is fast and uses less hardware/software infrastructure. Most of the above-mentioned papers, however, reported relatively high classification error when only one or two EEG channels were used. For example, in [16], the reported error rates were nearly 40%. In [17], where wavelet-filtered EEG signals were used, the system did not perform well either (30% misclassification). For the only self-paced BCI system that has applied wavelet coefficients so far [14], false discovery rates (the percentage of hits that were not true positives) were as high as 67%. However, the authors did not indicate the number of NC epochs used in their study, so critical commentary on the performance of their BCI system cannot be made. The invasiveness of the recording technology of the BCI system in [14] is also an important issue that needs to be considered. The above observations strongly motivate the use of additional EEG electrodes in BCI systems. With signals recorded from multiple channels, we can explore spatial information, which is expected to yield improvements in classification performance.

Another issue that must be addressed when using the DWT to extract features in BCI systems is the feature selection procedure.
That is, how many features should be selected and how should they be selected? In [13], all of the 64 wavelet features used for classification were extracted from only one EEG channel. In [15], because of the computational limitations affecting the classifier, only a number of top wavelet features (ranked by the amount of discriminability) were selected. Neither of these approaches is guaranteed to yield the best results, since the feature selection process used was not necessarily optimal. Using all features does not necessarily provide the best results, because some of the less discriminant features may degrade the classifier's performance [18]. On the other hand, using only a few features that have the highest rank (and filtering out the rest) does not necessarily lead to the optimal classification performance either, since there is no guarantee that the top-ranked features work best together [19].

Based on the related literature review, we postulate that the information extracted from multiple-electrode signals is necessary for achieving acceptable performance. This in turn leads us to the problem of high feature space dimensionality, since the feature space dimension is directly affected by the number of electrodes used as well as by the number of features per EEG signal. Since not all the wavelet coefficients provide discriminatory information between the output classes, we postulate that features that better discriminate between the output classes need to be selected to obtain better classification performance. A mechanism for selecting the most discriminating features is thus needed.

Wrapper methods, such as GAs, use the classifier's performance to evaluate a particular feature vector. They provide a good solution for finding the features that work well together by choosing the ones that lead to better classifier performance [20]. The downside of using wrapper methods is time inefficiency.
As the dimension of the search space increases, it becomes harder for a wrapper method to find a suitable subset of features that leads to high performance. In order to benefit from the advantages of both filter and wrapper methods, we decided to employ a hybrid approach. Features carrying the least discriminative information about the output classes were filtered out first. Then a wrapper method was applied to the reduced feature space to find the features that work well together, i.e., the combination that leads to the best classification performance.

We used mutual information (MI) in the filtering stage. Mutual information is a powerful tool for ranking features based on the amount of discriminative information each carries [21]. We then applied a GA in a wrapper approach to select the features that lead to the best classification performance. Genetic algorithms are heuristic methods that can effectively sample large search spaces [22]. They are based on the principles of evolutionary biology: a population of candidate solutions evolves over many generations. By mimicking this process, GAs are able to evolve solutions to real-world problems. They have been shown to be useful tools in automatically customizing many practical systems [22, 23]. We used a support vector machine (SVM) to classify the selected features into one of two classes: no control (NC) or intentional control (IC).

The results of this study show that applying the proposed approach to the offline data collected from four able-bodied individuals yields low false positive (FP) rates at a reasonably high true positive (TP) rate. We also examine the spatial distribution of the selected features. We show that this distribution varies considerably from one individual to another. This finding shows the importance of user customization of BCI systems.

3.2 Data collection
People with severe motor disabilities cannot physically execute certain movements such as a finger flexion, but they are usually able to attempt it.
Several studies have shown that recordings of brain signals obtained from attempted and real movements for able-bodied individuals bear many similarities [14, 24-29]. Based on these studies, both attempted and executed movements have been shown to activate similar cortical areas and to generate similar movement patterns. This evidence enables us to base our analysis on the data of able-bodied individuals, who actually execute a particular movement. It is then possible to detect the occurrence of the control command by analyzing signals such as the electromyography (EMG) signal or the output of an actual switch. Such signals can be used to label the brain signals and to evaluate the performance of a BCI. The data analysis of individuals with motor disabilities was thus left to future studies.

The data of four (three male and one female) able-bodied individuals were used in this study. All individuals were right-handed and between 31 and 56 years old. They had all signed consent forms prior to participation in the experiment. Individuals were positioned 150 cm in front of a computer monitor. The EEG signals were recorded from 13 monopolar electrodes positioned over the individuals' supplementary motor area and primary motor cortex (according to the International 10-20 System at F1, Fz, F2, FC3, FC1, FCz, FC2, FC4, C3, C1, Cz, C2 and C4 locations). Electrooculography (EOG) activity was measured as the potential difference between two electrodes, placed at the corner of and below the right eye. An ocular artifact was considered present when the difference between the EOG electrodes exceeded ±25 µV. All signals were sampled at 128 Hz and referenced to ear electrodes (see [30] for details of the data recording). The recorded signals were then saved on the computer and converted to bipolar EEG signals by calculating the difference between the adjacent EEG channels.
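The bipolar conversion and the ocular artifact check described above can be sketched as follows. This is a minimal illustration, not the recording software used in the study: the function names, the array layout (channels by samples, in µV) and the example channel pair are assumptions made for the sketch.

```python
import numpy as np

# Monopolar electrode names in the order listed in the text (assumed row order of the data array).
MONOPOLAR = ["F1", "Fz", "F2", "FC3", "FC1", "FCz", "FC2", "FC4",
             "C3", "C1", "Cz", "C2", "C4"]


def to_bipolar(eeg, pairs, names=MONOPOLAR):
    """Return one bipolar signal (first minus second electrode) per pair.

    eeg   : array of shape (n_channels, n_samples), monopolar signals
    pairs : list of (electrode_a, electrode_b) name tuples
    """
    index = {name: row for row, name in enumerate(names)}
    return np.stack([eeg[index[a]] - eeg[index[b]] for a, b in pairs])


def has_ocular_artifact(eog_difference_uv, threshold_uv=25.0):
    """True if the EOG potential difference exceeds +/- threshold_uv anywhere."""
    return bool(np.any(np.abs(eog_difference_uv) > threshold_uv))
```

An epoch would then be kept only if `has_ocular_artifact` returns `False` for the EOG samples that fall inside it.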
This procedure was used since it has been shown that bipolar electrodes generate more discriminating MRP features than monopolar electrodes do [3]. This conversion generated the following 18 bipolar EEG channels: F1-FC1, F1-Fz, F2-Fz, F2-FC2, FC3-FC1, FC3-C3, FC1-FCz, FC1-C1, FCz-FC2, C1-Cz, C2-C4, FC2-FC4, FC4-C4, FC2-C2, FCz-Cz, C3-C1, Cz-C2 and Fz-FCz.

Data were collected from individuals as they performed the following guided task. At each interval, a white, 2 cm diameter circle was displayed on the individual's monitor for ¼ second, prompting the individual to attempt a movement. In response to this cue, the user had to perform a right index finger flexion one second after the cue appeared. The 1-second delay was used to avoid visual evoked potential (VEP) effects caused by the cue (see [31] for more details). For each individual, an average of 80 IC epochs were collected every day over a period of 5 days.

An IC epoch consisted of data collected over an interval containing the movement onset (measured as the finger switch activation), provided that no artifact was detected in that interval. The interval started tstart seconds before movement onset and ended tfinish seconds after it. There were limitations in choosing the total length (tstart + tfinish). If this length is too long, more artifacts may be present in an IC epoch, and the number of training epochs that pass the ocular artifact rejection criterion is reduced. If it is too short, potential features are poorly explored. Since a simple finger flexion MRP usually starts about 1.5 seconds before the movement and returns to the normal baseline around 1 second after the movement [32], data obtained from 1.5 seconds before to 1.0 second after the movement onset were analyzed (i.e., tstart = 1.5 seconds and tfinish = 1.0 second). NC epochs were selected as follows.
A window of width (tstart + tfinish) seconds was considered (tstart = 1.5 seconds and tfinish = 1.0 second). To extract NC epochs, the window was shifted over each EEG signal recorded during NC sessions by a step of 16 samples (0.125 s). Wavelet coefficients were extracted for each epoch that did not contain artifacts.

3.3 Method
The overall structure of the proposed scheme is shown in Figure 3-1. EEG signals were checked for the presence of EOG artifacts. The contaminated epochs were rejected, as explained in Section 3.2.

Figure 3-1. The overall structure of the proposed hybrid method for extracting MRP features.

The continuous wavelet transform (CWT) is defined as the convolution of the signal x(t) with the wavelet functions $\psi_{a,b}(t)$, where $\psi_{a,b}(t)$ is the dilated and shifted version of the wavelet function $\psi(t)$ and is defined as follows:

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right) \tag{3-1}$$

where a and b are the scale and translation parameters, respectively. The CWT maps a signal of one independent variable t into a function of two independent variables a, b. This procedure is redundant and not efficient for algorithmic implementations. Therefore, it is more practical to define the wavelet transform at a discrete scale a and a discrete time b by choosing the set of parameters (such a transform is called a discrete wavelet transform, or DWT), such that

$$a_j = 2^{j}, \qquad b_{j,k} = 2^{j}\,k \qquad (j, k \text{ are integers}) \tag{3-2}$$

The contracted versions of the wavelet function will match the high-frequency components of the original signal and the dilated versions will match the low-frequency oscillations. Then by correlating the original signal with the wavelet functions of different sizes, the details of the signal at different scales are obtained.
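To make the dyadic sampling in (3-2) concrete, the following sketch lists the scales and translations for an epoch of a given length, together with the nominal frequency band of the detail coefficients at each level. It is only an illustration under stated assumptions: the function names are hypothetical, the sign convention is the one written in (3-2), and the band limits are the idealized octave bands of a dyadic decomposition rather than the exact pass-bands of any particular wavelet filter.

```python
def dyadic_grid(n_levels, n_samples):
    """Map each level j to (a_j, [b_jk, ...]) with a_j = 2**j and b_jk = k * 2**j.

    Translations are limited to positions inside an epoch of n_samples samples.
    """
    grid = {}
    for j in range(1, n_levels + 1):
        scale = 2 ** j
        grid[j] = (scale, list(range(0, n_samples, scale)))
    return grid


def detail_band_hz(level, sampling_rate_hz):
    """Idealized band (low, high) in Hz covered by the level-j detail coefficients."""
    return sampling_rate_hz / 2 ** (level + 1), sampling_rate_hz / 2 ** level
```

For a 2.5-second epoch sampled at 128 Hz (320 samples), `dyadic_grid(5, 320)` gives 160 translations at the finest level and 10 at the coarsest, which shows how quickly the number of coefficients per level shrinks as the scale grows.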
The resulting correlation features can be arranged in a hierarchical scheme called multi-resolution decomposition [33], which separates the signal into "details" at different frequency bands and a coarser representation of the signal called an "approximation".

In this study, the rbio3.3 wavelet from the B-spline family was chosen as the wavelet function because it has some similarities with the shape of the classic bipolar MRP pattern. Using a 5-level decomposition method resulted in wavelet coefficients corresponding to the following frequency bands (the sampling frequency was 128 Hz): [32-64], [16-32], [8-16], [4-8], [2-4], and [0-2] Hz. Based on the previous findings in [3], which showed that MRP features are mostly located in the frequency range below 4 Hz, only the lowest frequency bands (i.e., 0-2 Hz and 2-4 Hz) were considered for further analysis of MRPs.

Even with this reduction, the resulting feature space dimension (Nfeatures) remained very high. It is the product of the number of electrodes (Nelectrodes) and the number of wavelet features per EEG signal (Nwavelet), that is,

Nfeatures = Nelectrodes × Nwavelet

Thus, a feature selection procedure had to be used that could select the features that lead to optimal classification performance. This procedure should specify the selected EEG channels as well as the features selected per channel. We devised a hybrid feature selection algorithm to meet these requirements. Mutual information (MI) was employed in the filtering stage and a GA was then used to select the optimal set of features. Although MI has been used elsewhere to filter out the less informative features [21, 34], it is not usually successful at finding features that lead to optimal classification performance.
This is because when there are more than three feature dimensions, the calculation of MI is computationally demanding, and impossible for large feature spaces (since the calculation of MI requires the joint probability of features in a high dimension) [21, 34]. Thus, MI was only used in our algorithm to discard the least informative features based on the amount of information that each feature carries regarding the output classes. The MI between the input feature vector X and the output classes Y was calculated as follows:

$$I(X;Y) = H(Y) - H(Y|X) \tag{3-3}$$

where

$$H(Y) = -\sum_{j=1}^{M} P(y_j)\,\log_2 P(y_j) \tag{3-4}$$

$$H(Y|X) = -\sum_{i=1}^{N} P(x_i) \sum_{j=1}^{M} P(y_j|x_i)\,\log_2 P(y_j|x_i) \tag{3-5}$$

$$P(y_j) = \sum_{i=1}^{N} P(x_i)\,P(y_j|x_i) \tag{3-6}$$

In these formulae, I represents the mutual information between X and Y, where X = {x_i} (i = 1, 2, 3, ..., N) and Y = {y_j} (j = 1, 2, 3, ..., M), N is the number of input states and M is the number of output states (M=N=2, since the input and output can only take two values: IC and NC), P(x_i) is the probability of occurrence of an input state x_i, P(y_j) is the probability of the output class y_j when the input is unknown, and P(y_j|x_i) is the probability of the output class y_j when the input state x_i is known.

For each individual, the wavelet coefficient (feature) values corresponding to all the training set data were calculated. Then, using histograms with 10 bins each, the probability function of each feature was estimated and its mutual information with each of the output classes was calculated. The values of MI were calculated for all Nfeatures features and then ranked in descending order. The top L features were then selected. In this study, we arbitrarily chose L=50 to avoid having a feature space with a very high dimension.
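As a rough illustration of the filtering stage, the sketch below estimates the MI of (3-3) to (3-6) for one feature at a time from a 10-bin histogram and keeps the top-ranked features. It is a minimal sketch, not the thesis implementation: the function names are hypothetical, equal-width bins over the observed range are assumed, and the input states x_i are taken to be the histogram bins of a single feature.

```python
import numpy as np


def mutual_information(feature, labels, n_bins=10):
    """Estimate I(X;Y) = H(Y) - H(Y|X) in bits for one binned feature."""
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels)
    edges = np.histogram_bin_edges(feature, bins=n_bins)
    bins = np.digitize(feature, edges[1:-1])        # bin index 0 .. n_bins-1
    classes = np.unique(labels)
    joint = np.zeros((n_bins, classes.size))        # joint counts of (x_i, y_j)
    for j, c in enumerate(classes):
        joint[:, j] = np.bincount(bins[labels == c], minlength=n_bins)
    joint /= joint.sum()
    p_x = joint.sum(axis=1)                         # P(x_i)
    p_y = joint.sum(axis=0)                         # P(y_j), as in (3-6)
    h_y = -np.sum(p_y[p_y > 0] * np.log2(p_y[p_y > 0]))            # (3-4)
    h_y_given_x = 0.0
    for i in range(n_bins):                                         # (3-5)
        if p_x[i] > 0:
            p_cond = joint[i] / p_x[i]              # P(y_j | x_i)
            nz = p_cond > 0
            h_y_given_x -= p_x[i] * np.sum(p_cond[nz] * np.log2(p_cond[nz]))
    return h_y - h_y_given_x                                        # (3-3)


def top_features(features, labels, top_l=50, n_bins=10):
    """Rank the columns of `features` by MI (descending) and keep the first top_l."""
    scores = np.array([mutual_information(features[:, k], labels, n_bins)
                       for k in range(features.shape[1])])
    order = np.argsort(scores)[::-1]
    return order[:top_l], scores
```

Because each feature is scored on its own, this ranking says nothing about how features behave in combination, which is the gap the wrapper stage is meant to close.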
After reducing the dimension of the feature space, a GA was used to select a subset of m features from the top L features. To represent each possible combination of features, a binary chromosome of length L was defined. Bit i of the binary chromosome specified whether or not feature i was selected by the GA. A value of "1" indicated the presence of feature i and a value of "0" indicated its absence in a chromosome.

An important decision in the design of a GA is the definition of a proper fitness function. In the proposed design, a suitable fitness function should consider at least three objectives: maximizing the TP rate, minimizing the FP rate and minimizing the number of features selected by the hybrid feature selection procedure. The classification performance of a 2-state, self-paced BCI system is usually determined by a confusion matrix, as shown in Table 3-1. In Table 3-1, the FP rate is the percentage of instances for which an NC epoch is misclassified as an IC epoch, the true negative (TN) rate is the percentage of NC epochs being correctly classified, the true positive (TP) rate is the percentage of IC epochs being correctly classified and the false negative (FN) rate is the percentage of misclassifying an IC epoch as an NC epoch. The fitness function should summarize this confusion matrix. For a 2-state self-paced BCI system, we have

$$FN(\%) = 100(\%) - TP(\%) \tag{3-7}$$

and

$$TN(\%) = 100(\%) - FP(\%) \tag{3-8}$$

Table 3-1. The confusion matrix for a 2-state self-paced BCI system.

| Actual Class | Predicted Class: IC | Predicted Class: NC |
|---|---|---|
| IC | TP | FN |
| NC | FP | TN |

Based on (3-7) and (3-8), only TP rates (TPR) and FP rates (FPR) need to be included in the fitness function. One example of a fitness function is a function that maximizes the TPR/FPR ratio.
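The chromosome encoding and the confusion-matrix rates described above can be sketched in a few lines. This is a hedged illustration only: the function names and the string labels "IC" and "NC" are assumptions introduced for the sketch.

```python
def selected_features(chromosome):
    """Indices of the features whose bit is set to 1 in a binary chromosome."""
    return [i for i, bit in enumerate(chromosome) if bit == 1]


def confusion_rates(actual, predicted):
    """Return the TP, FN, FP and TN rates (in percent) of Table 3-1.

    actual, predicted : sequences of "IC" / "NC" labels of equal length,
    containing at least one IC and one NC epoch.
    """
    n_ic = sum(a == "IC" for a in actual)
    n_nc = sum(a == "NC" for a in actual)
    tp = sum(a == "IC" and p == "IC" for a, p in zip(actual, predicted))
    fp = sum(a == "NC" and p == "IC" for a, p in zip(actual, predicted))
    tpr = 100.0 * tp / n_ic
    fpr = 100.0 * fp / n_nc
    # FN and TN follow from (3-7) and (3-8), so TPR and FPR carry all the information.
    return {"TP": tpr, "FN": 100.0 - tpr, "FP": fpr, "TN": 100.0 - fpr}
```

Since the rates are normalized per class, they do not depend on how many more NC epochs than IC epochs are present in the evaluated set.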
In this paper, the following objective function was used:

$$f(Z) = \begin{cases} 0, & TPR(Z) < 20\% \\[4pt] \dfrac{TPR(Z)}{FPR(Z)}, & TPR(Z) \geq 20\% \end{cases} \tag{3-9}$$

where Z is a chromosome and f is the fitness function. This fitness function gives a higher fitness level to chromosomes that generate a higher TPR/FPR ratio. We also postulated that TP rates below 20% were too low for the successful operation of a self-paced BCI system (since they correspond to detection of less than one out of every five IC states, which may lead to user frustration, even though the FP rates might be very low). Such chromosomes were considered "unfit" and were assigned a "0" fitness value.

Next, a lexicographic approach was applied for multi-objective optimization of the GA population [23]. Very briefly, in this approach, the objectives were ranked according to the priorities assigned to them prior to optimization. The objective with the highest priority was used first for comparing the members of the population. In our case, the average of the TPR/FPR ratio over the validation sets was first selected as the objective function with the highest priority. The chromosomes were then ranked in a single-objective fashion. Any ties were resolved by comparing the relevant chromosomes again with respect to objectives that were assigned lower priority. The other three objectives were chosen as (1) the average of the FP rate over the validation sets, (2) the average of the TP rate over the validation sets, and (3) the number of features, resulting in four objectives per chromosome in the GA population. The 2nd and 3rd objectives were ordered such that for two chromosomes with the same TPR/FPR ratio, the one with the lower FP rate was considered to be the fitter chromosome.

The remaining operators of the GA were tournament-based selection (tournament size = 3), uniform crossover and uniform mutation. The sizes of the initial population and the population in the next generations were chosen as 100 and 50, respectively.
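The fitness function (3-9) and the lexicographic ranking can be sketched as follows. This is an illustrative sketch rather than the GA code used in the study: the dictionary keys and function names are hypothetical, and returning infinity when the FP rate is exactly zero is an assumption, since the text does not say how that case was handled.

```python
def fitness(tpr, fpr, min_tpr=20.0):
    """Fitness of (3-9): 0 below the TP-rate floor, otherwise TPR / FPR (rates in percent)."""
    if tpr < min_tpr:
        return 0.0
    if fpr == 0.0:
        return float("inf")   # assumption: a zero FP rate is treated as the best possible ratio
    return tpr / fpr


def lexicographic_key(chromosome):
    """Sort key with priorities: ratio (max), FP rate (min), TP rate (max), feature count (min)."""
    return (-fitness(chromosome["tpr"], chromosome["fpr"]),
            chromosome["fpr"],
            -chromosome["tpr"],
            chromosome["n_features"])


def rank_population(population):
    """Return the chromosomes ordered from fittest to least fit."""
    return sorted(population, key=lexicographic_key)
```

Tuple comparison in Python is itself lexicographic, so lower-priority objectives are consulted only when all higher-priority ones tie, which mirrors the tie-breaking rule described in the text.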
We used random initialization to initialize the GA. Elitism was used to keep the best performing chromosome of each population in the subsequent populations. The number of evaluations was set to 2000. If the improvement in the TPR/FPR ratio of the best solution was found to be less than 1% for more than 10 consecutive generations, the algorithm was terminated. Because of the computational load, tuning the GA parameter values (such as the mutation and crossover rates) was not performed.

A support vector machine (SVM) that uses kernel-based learning was chosen to classify each chromosome in the GA population. In kernel-based learning, all of the beneficial properties of linear classification methods, such as simplicity, are maintained, but the overall classification is nonlinear in the input space, since the feature and input spaces are nonlinearly related [35]. Another reason for selecting an SVM as a classifier is that SVMs not only minimize the empirical risk (training error), they also minimize the confidence error (test error) [36]. We used the LIBSVM software [37], which has also been used in other BCI papers [38, 39].

The evaluation process was as follows. For each individual, IC and NC epochs were randomized and divided into training, validation and test sets. The training set was used to train the classifier, and the validation set was used to select the best set of features. The configuration yielding the best results on the validation set in the multi-objective sense mentioned above was selected, and the performance of the system calculated on the test set was reported. We used a five-fold nested cross-validation for evaluating the performance of the system. For each outer cross-validation set, 20% of the data were used for testing and the rest were used for training and model selection (selection of the optimal subset of features). In order to select the models, the datasets were further divided into five folds.
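The index bookkeeping of such a five-fold nested cross-validation can be sketched as below. This is only a sketch under assumptions: the function name and the fixed random seed are hypothetical, epochs are identified by integer indices, and no stratification by class is performed because the text does not describe any.

```python
import numpy as np


def nested_cv_splits(n_epochs, k_outer=5, k_inner=5, seed=0):
    """Yield (outer, inner, train_idx, validation_idx, test_idx) index sets.

    The outer loop holds out one fold for testing; the remaining epochs are
    split again into k_inner folds, one of which is used for model selection.
    """
    rng = np.random.default_rng(seed)
    outer_folds = np.array_split(rng.permutation(n_epochs), k_outer)
    for i, test_idx in enumerate(outer_folds):
        rest = np.concatenate([f for j, f in enumerate(outer_folds) if j != i])
        inner_folds = np.array_split(rng.permutation(rest), k_inner)
        for m, validation_idx in enumerate(inner_folds):
            train_idx = np.concatenate(
                [f for j, f in enumerate(inner_folds) if j != m])
            yield i, m, train_idx, validation_idx, test_idx
```

Keeping the test fold outside the inner loop means that the epochs used to report performance never influence which feature subset is selected.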
For each fold, 80% of the data were used for training the classifier and 20% were used for model selection.

To deal with the problem of unbalanced training sets (there were at least 20 times more NC epochs than IC epochs), the size of the NC training feature set was reduced to be the same as the size of the training IC feature set. This was done by randomly selecting epochs from the NC training set.

3.4 Results

In this section, we present the offline analysis of the data of the four individuals described in Section 3.2. We performed a search on the classifier’s parameters during the model selection. Our findings showed that a 5th degree polynomial kernel function performed better than the other kernel functions studied (linear, polynomial of degree 3, 4, 6 and 7, and RBF).

Since a five-fold nested cross-validation was used for the performance evaluation, the results were averaged over five runs of the outer validation sets. Columns 1 to 5 of Table 3-2 show the individual identification number, the average TP rate on the test sets, the average FP rate on the test sets, the average TPR/FPR ratio and the average number of features selected by the hybrid feature selection process. The latest performance results of another state-of-the-art self-paced BCI system (the LF-ASD) [40], applied to the data of individuals AB1 to AB4, are presented in columns 6 to 9 of Table 3-2. The numbers in parentheses are the standard deviations. As Table 3-2 shows, our proposed design achieved low FP rates for three of the four individuals (individuals AB1, AB2 and AB4) at relatively high TP rates. For individual AB3, the TPR results on the test sets were low (although the FP rates remained less than 4%).

Table 3-2. Comparison of the average TP rates, average FP rates, average TPR/FPR ratios and the average number of features.
Individual   Test set (current study)            Number of features   Test set ([40])               Number of features
ID           TPR           FPR        TPR/FPR    (current study)      TPR          FPR    TPR/FPR   ([40])
AB1          68.0 (4.8)    1.0 (0.3)  68.0       30.6 (1.1)           67.8 (1.4)   2.0    33.9      6
AB2          73.3 (2.6)    1.4 (0.4)  52.4       29.2 (3.3)           74.0 (1.7)   2.0    37.0      6
AB3          33.1 (14.0)   3.9 (1.0)  8.5        23.4 (2.4)           64.0 (1.3)   2.0    32.0      6
AB4          56.1 (4.9)    1.4 (0.7)  40.0       27.0 (2.8)           73.1 (1.8)   2.0    36.6      6
Average      57.4          1.9        30.2       27.5                 69.7         2.0    34.9      6

Next, the spatial distributions of the selected features were examined. The average number of selected features per channel is shown in Table 3-3. The numbers in parentheses show the standard deviation over five runs of outer cross-validation.

Table 3-3. The average number of selected features per channel after applying the hybrid feature selection algorithm.

Channel    AB1          AB2          AB3          AB4
F1-FC1     3.6 (1.1)    3.0 (1.2)    1.8 (0.8)    3.0 (0.7)
F1-Fz      0.0 (0.0)    0.0 (0.0)    0.0 (0.0)    3.4 (0.5)
F2-Fz      0.0 (0.0)    1.6 (0.9)    0.4 (0.5)    0.0 (0.0)
F2-FC2     0.2 (0.4)    2.0 (0.7)    0.8 (0.8)    0.4 (0.5)
FC3-FC1    1.0 (0.0)    1.0 (0.0)    1.6 (0.9)    0.0 (0.0)
FC3-C3     1.0 (0.71)   3.0 (0.0)    2.4 (1.14)   1.6 (0.5)
FC1-FCz    0.0 (0.0)    1.0 (0.0)    0.6 (0.5)    1.2 (0.84)
FC1-C1     4.6 (0.5)    2.8 (0.4)    0.0 (0.0)    1.2 (0.4)
FCz-FC2    0.0 (0.0)    2.2 (0.4)    0.6 (0.5)    0.0 (0.0)
C1-Cz      1.6 (0.5)    0.4 (0.5)    3.6 (1.1)    1.2 (0.4)
C2-C4      0.6 (0.5)    2.2 (0.4)    4.4 (0.9)    2.6 (0.9)
FC2-FC4    4.2 (0.4)    1.6 (0.9)    2.2 (1.1)    3.4 (1.1)
FC4-C4     3.2 (0.45)   2.0 (1.0)    1.8 (0.8)    4.4 (0.5)
FC2-C2     2.0 (0.0)    2.2 (0.4)    0.6 (0.5)    2.2 (0.4)
FCz-Cz     1.6 (0.9)    0.6 (0.5)    0.2 (0.4)    0.8 (0.4)
C3-C1      1.0 (0.7)    2.0 (0.0)    2.0 (0.0)    0.0 (0.0)
Cz-C2      3.8 (0.4)    0.0 (0.0)    0.0 (0.0)    0.6 (0.5)
Fz-FCz     2.2 (1.3)    1.6 (0.5)    0.4 (0.5)    1.0 (0.7)

Figure 3-2 to Figure 3-5 show the number of selected features per channel for all individuals after applying the hybrid selection method (averaged over the number of cross-validation sets).
The low standard deviation obtained for all cases shows the robustness of the proposed method over different runs of the algorithm.

3.5 Discussion and conclusions

Discrete wavelet transform (DWT) is a useful feature extraction tool since it explores the time as well as the frequency information of the signal. Although DWT has been employed to some degree of success in a number of synchronized BCI systems, there remain some limitations in its application to self-paced BCI systems (in terms of the large size of the feature space). Brain computer interface systems that use DWT features have mostly employed only one or two channels (perhaps due to the large dimensionality of the feature space or to limitations imposed by the experimental protocol). To simultaneously explore the wavelet coefficients (features) of BCIs with more channels (so as to explore the spatial information) and to avoid the problems associated with the resultant large feature space, a two-stage (hybrid) feature selection algorithm is proposed. The first stage uses mutual information (MI) to discard the least informative features. In the second stage, a genetic algorithm (GA) selects those remaining features that lead to better system performance in the sense of meeting multiple objectives.

In our study, the features selected per channel varied considerably from one individual to another, as shown in Figure 3-2 to Figure 3-5. For example, for individual AB1, more features were selected from channels FC1-C1, F1-FC1, Fz-FCz, FC4-C4, FC2-FC4 and Cz-C2, while for individual AB4, more features were selected from channels FC4-C4, FC2-FC4, F1-Fz, C2-C4, F1-FC1, and FC2-C2. These results support the hypothesis that proper channel selection for every individual is necessary to obtain superior performance.

Figure 3-2. Spatial distribution of the average number of selected features for AB1.
Figure 3-3. Spatial distribution of the average number of selected features for AB2.
Figure 3-4. Spatial distribution of the average number of selected features for AB3.
Figure 3-5. Spatial distribution of the average number of selected features for AB4.

Another finding from Figure 3-2 to Figure 3-5 is that the relevant features for each individual were unique. These findings are in contrast to an earlier study done by our group that empirically determined six pairs of electrodes for all individuals (channels F1-FC1, F2-FC2, FC1-C1, FC2-C2, FCz-Cz, and Fz-FCz) [3]. Our findings in this regard are not surprising. The evidence from the literature supports the hypothesis that there is a significant amount of inter-subject variability in terms of generating MRP patterns [41]. The literature also shows that the selected features are not necessarily located in the standard frequency bands or on specific scalp locations, and that the set of selected features differs from individual to individual [42]. These studies support the notion that a customized BCI system should be designed for each individual.

Table 3-3 shows that for each individual, a number of bipolar channels were not selected by the feature selection process (such as channel F1-Fz for individuals AB1, AB2 and AB3, and channel FC3-FC1 for individual AB4). These results indicate that these channels can be eliminated from the analysis in future studies. Moreover, Table 3-3 and Figure 3-2 to Figure 3-5 show that the degree of contribution to the classification performance varies from one channel to another. These results indicate that a channel elimination methodology could be incorporated into the proposed method to further decrease the number of channels used for the operation of the system. Such an approach would rank the channels according to the number of selected features. It would then repeatedly eliminate the channel with the lowest contribution to fitness until the performance drops below a certain threshold (recursive elimination of channels).
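One way to picture the recursive elimination idea is the short Python sketch below. It is only an illustration under stated assumptions: `evaluate` is a hypothetical callback that returns a fitness value for a list of channels, the number of selected features per channel is used as a stand-in for each channel's contribution to fitness, and the threshold is arbitrary.

```python
def eliminate_channels(feature_counts, evaluate, min_fitness):
    """Drop the lowest-ranked channel repeatedly while fitness stays acceptable.

    feature_counts: dict mapping channel name -> average number of selected
                    features (used here as a proxy for contribution)
    evaluate:       hypothetical callable(list of channels) -> fitness value
    min_fitness:    stop before fitness would fall below this threshold
    """
    # Rank channels from most to least selected features.
    kept = sorted(feature_counts, key=feature_counts.get, reverse=True)
    while len(kept) > 1:
        candidate = kept[:-1]            # remove the lowest-ranked channel
        if evaluate(candidate) < min_fitness:
            break                        # removal would hurt too much; stop
        kept = candidate
    return kept
```

In practice `evaluate` would retrain and validate the classifier on the reduced channel set, so each elimination step costs one full model-selection run.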
Systematic elimination of channels can lead to a faster setup of the system as well as decreased computational time. This could be part of future research aimed at moving towards a more practical system.

It should be mentioned that it is difficult to directly compare the results of our study with other BCI studies. This is because the user population (whether or not individuals are able-bodied), the experimental protocols, the evaluation protocol and the neurological phenomenon differ from one study to another. In addition, the degree of training individuals receive before participating in a BCI experiment varies among studies.

We can, however, compare our current results with the latest design of a state-of-the-art self-paced BCI system called the low frequency–asynchronous switch design (the LF-ASD) [40]. Both studies use the same individuals, the same experimental protocol, the same EEG data and a similar evaluation protocol. The LF-ASD (originally reported in [3] and later modified as reported in [40]) uses a feature extractor with a shape similar to a wavelet function, and extracts features from six bipolar EEG channels. The Karhunen-Loève Transform (KLT) is used to reduce the 6-dimensional feature space produced by the feature generator to a 2-dimensional space. A 1-NN classifier is used as the feature classifier. A moving average and a debounce algorithm are employed to improve the performance of the system by reducing the number of false activations. The parameter values of the system were estimated by an expert (for details, see [3, 30, 40]).

The latest performance results of the LF-ASD [40], applied to the data of individuals AB1 to AB4, are presented in columns 6 to 9 of Table 3-2. As can be seen from the table, our proposed system has resulted in an increased TPR/FPR ratio for all individuals (with the exception of individual AB3).
Specifically, the TPR/FPR ratio increased from 33.9 to 67.7 for individual AB1 (relative improvement of 99.5%), from 37.0 to 52.4 for individual AB2 (relative improvement of 41.6%), and from 36.5 to 39.8 for individual AB4 (relative improvement of 8.9%). These results show that our proposed approach improved the performance of most individuals compared with the latest design of the LF-ASD. The degree of improvement in the TPR/FPR ratio, however, is not statistically significant (p > 0.05), so tests on the data of more individuals are needed to further substantiate this improvement. Note that the improved performance was achieved at the expense of using more features (please see columns 5 and 9 in Table 3-2).

The relatively poor results obtained for individual AB3 may be partly related to our choice of wavelet function. Note that the wavelet function chosen for this study was based on the similarities between the chosen wavelet function and a typical bipolar MRP ensemble average pattern. However, there is substantial inter-subject variability in the shape of MRPs, especially in single trials [40]. It is expected that by analyzing a more diverse family of wavelet functions, a different wavelet function might be chosen for each individual that would produce superior results.

As mentioned in Section 3.3, we designated the number of features chosen by the MI to be L=50. Fewer features would have sped up the process of feature selection at the second stage, but might have resulted in a lower fitness value. To test this possibility, we compared the fitness of the best subset of features (see Table 3-2) with that of all features for individual AB1 (see Figure 3-6). In this figure, the thick line shows the fitness of the best configuration (calculated from Table 3-2). The thin line shows the fitness of the classifier as a function of the number of top features.
We began by training and testing the classifier using only the feature with the highest MI score, and then calculated the fitness. Then we added features one at a time (according to their MI scores) and trained and tested the classifier using the new set of features. This process was repeated until we reached L=50. Although the fitness of the classifier increased as more features were added, it stayed well below the optimal value achieved by the GA. These results indicate that a lower L (especially when only a limited number of top features are used for training the classifier) does not necessarily lead to better performance.

A useful area to explore is the automation of the classifier. Currently, the feature selection procedure is automated but the selection of other parameters, such as those of the classifier, is carried out through cross-validation. Incorporating these parameters into the automation process would relieve the designer from the tiresome process of selecting the classifier’s parameter values, while potentially yielding better classification results. Expanding the current results to continuous signals and ultimately online testing are also worthwhile topics for future work.

These results should be considered as preliminary results in the development of a self-paced brain computer interface system with a low FP rate. Our future work will also include testing of the proposed system on a larger pool of individuals to further investigate its usability.

Figure 3-6. Comparison of the fitness of the best chromosome vs. other subsets of features.

3.6 Acknowledgements

This work was supported in part by NSERC under Grant 90278-06 and CIHR under Grant MOP-72711. This research has been enabled by the use of WestGrid computing resources, which are funded in part by the Canada Foundation for Innovation, Alberta Innovation and Science, BC Advanced Education, and the participating research institutions. The authors would like to thank Mr.
Craig Wilson for his valuable comments on this paper.

3.7 References

[1] T. M. Vaughan, W. J. Heetderks, L. J. Trejo, W. Z. Rymer, M. Weinrich, M. M. Moore, A. Kubler, B. H. Dobkin, N. Birbaumer, E. Donchin, E. W. Wolpaw and J. R. Wolpaw, "Brain-computer interface technology: a review of the Second International Meeting," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 11, no. 2, pp. 94-109, Jun. 2003.
[2] J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller and T. M. Vaughan, "Brain-computer interfaces for communication and control," Clin. Neurophysiol., vol. 113, no. 6, pp. 767-791, Jun. 2002.
[3] S. G. Mason and G. E. Birch, "A brain-controlled switch for asynchronous control applications," IEEE Trans. Biomed. Eng., vol. 47, no. 10, pp. 1297-1307, Oct. 2000.
[4] S. G. Mason and G. E. Birch, "Temporal control paradigms for direct brain interfaces - rethinking the definition of asynchronous and synchronous," in Proc. HCI Int. Conf., 2005, Las Vegas, USA.
[5] G. E. Birch, P. D. Lawrence and R. D. Hare, "Single-trial processing of event-related potentials using outlier information," IEEE Trans. Biomed. Eng., vol. 40, no. 1, pp. 59-73, Jan. 1993.
[6] S. P. Levine, J. E. Huggins, S. L. BeMent, R. K. Kushwaha, L. A. Schuh, M. M. Rohde, E. A. Passaro, D. A. Ross, K. V. Elisevich and B. J. Smith, "A direct brain interface based on event-related potentials," IEEE Trans. Rehabil. Eng., vol. 8, no. 2, pp. 180-185, Jun. 2000.
[7] J. del R. Millan and J. Mourino, "Asynchronous BCI and local neural classifiers: an overview of the Adaptive Brain Interface project," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 11, no. 2, pp. 159-161, Jun. 2003.
[8] R. Scherer, G. R. Muller, C. Neuper, B. Graimann and G. Pfurtscheller, "An asynchronously controlled EEG-based virtual keyboard: improvement of the spelling rate," IEEE Trans. Biomed. Eng., vol. 51, no. 6, pp. 979-984, Jun. 2004.
[9] E. Yom-Tov and G. F.
Inbar, "Detection of movement-related potentials from the electroencephalogram for possible use in a brain-computer interface," Med. Biol. Eng. Comput., vol. 41, no. 1, pp. 85-93, Jan. 2003.
[10] G. Townsend, B. Graimann and G. Pfurtscheller, "Continuous EEG classification during motor imagery--simulation of an asynchronous BCI," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 12, no. 2, pp. 258-265, Jun. 2004.
[11] T. Demiralp, J. Yordanova, V. Kolev, A. Ademoglu, M. Devrim and V. J. Samar, "Time-frequency analysis of single-sweep event-related potentials by means of fast wavelet transform," Brain Lang., vol. 66, no. 1, pp. 129-145, Jan. 1999.
[12] V. J. Samar, A. Bopardikar, R. Rao and K. Swartz, "Wavelet analysis of neuroelectric waveforms: a conceptual tutorial," Brain Lang., vol. 66, no. 1, pp. 7-60, Jan. 1999.
[13] T. Hinterberger, A. Kubler, J. Kaiser, N. Neumann and N. Birbaumer, "A brain-computer interface (BCI) for the locked-in: comparison of different EEG classifications for the thought translation device," Electroencephalogr. Clin. Neurophysiol., vol. 114, no. 3, pp. 416-425, Mar. 2003.
[14] B. Graimann, J. E. Huggins, S. P. Levine and G. Pfurtscheller, "Toward a direct brain interface based on human subdural recordings and wavelet-packet analysis," IEEE Trans. Biomed. Eng., vol. 51, no. 6, pp. 954-962, Jun. 2004.
[15] E. L. Glassman, "A wavelet-like filter based on neuron action potentials for analysis of human scalp electroencephalographs," IEEE Trans. Biomed. Eng., vol. 52, no. 11, pp. 1851-1862, Nov. 2005.
[16] S. Fukuda, D. Tatsumi, H. Tsujimoto and S. Inokuchi, "Studies of input speed of word inputting system using event-related potential," in Proc. 20th Annual Int. Conf. IEEE Engineering in Medicine and Biology Society, vol. 3, pp. 1458-1460, 1998.
[17] B. H. Jansen, A. Allam, P. Kota, K. Lachance, A. Osho and K. Sundaresan, "An exploratory study of factors affecting single trial P300 detection," IEEE Trans. Biomed. Eng., vol.
51, no. 6, pp. 975-978, Jun. 2004.
[18] C. H. Ding, "Unsupervised feature selection via two-way ordering in gene expression analysis," Bioinformatics, vol. 19, no. 10, pp. 1259-1266, Jul. 2003.
[19] L. Talavera, "An evaluation of filter and wrapper methods for feature selection in categorical clustering," vol. 3646, 2005, pp. 440-451.
[20] R. Kohavi and G. H. John, "Wrappers for feature subset selection," Artif. Intell., vol. 97, pp. 273-324, 1997.
[21] R. Battiti, "Using mutual information for selecting features in supervised neural net learning," IEEE Trans. Neural Networks, vol. 5, no. 4, pp. 537-550, 1994.
[22] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley Publishing Company, Reading, MA, 1989.
[23] T. Back, D. B. Fogel and T. Michalewicz, Evolutionary Computation. Institute of Physics Publishing, Bristol and Philadelphia, 2000.
[24] R. Beisteiner, P. Hollinger, G. Lindinger, W. Lang and A. Berthoz, "Mental representations of movements. Brain potentials associated with imagination of hand movements," Electroencephalogr. Clin. Neurophysiol., vol. 96, no. 2, pp. 183-193, Mar. 1995.
[25] G. E. Chatrian, M. C. Petersen and J. A. Lazarte, "The blocking of the rolandic wicket rhythm and some central changes related to movement," Electroencephalogr. Clin. Neurophysiol. Suppl., vol. 11, no. 3, pp. 497-510, Aug. 1959.
[26] G. Pfurtscheller and C. Neuper, "Motor imagery activates primary sensorimotor area in humans," Neurosci. Lett., vol. 239, pp. 65-68, Dec. 1997.
[27] G. Pfurtscheller, C. Neuper, D. Flotzinger and M. Pregenzer, "EEG-based discrimination between imagination of right and left hand movement," Electroencephalogr. Clin. Neurophysiol., vol. 103, no. 6, pp. 642-651, Dec. 1997.
[28] C. A. Porro, M. P. Francescato, V. Cettolo, M. E. Diamond, P. Baraldi, C. Zuiani, M. Bazzocchi and P. E.
di Prampero, "Primary motor and sensory cortex activation during motor performance and motor imagery: a functional magnetic resonance imaging study," J. Neurosci., vol. 16, no. 23, pp. 7688-7698, Dec. 1996.
[29] R. Cunnington, R. Iansek, J. L. Bradshaw and J. G. Phillips, "Movement-related potentials associated with movement preparation and motor imagery," Exp. Brain Res., vol. 111, no. 3, pp. 429-436, Oct. 1996.
[30] J. F. Borisoff, S. G. Mason, A. Bashashati and G. E. Birch, "Brain-computer interface design for asynchronous control applications: improvements to the LF-ASD asynchronous brain switch," IEEE Trans. Biomed. Eng., vol. 51, no. 6, pp. 985-992, Jun. 2004.
[31] G. E. Birch, Z. Bozorgzadeh and S. G. Mason, "Initial on-line evaluations of the LF-ASD brain-computer interface with able-bodied and spinal-cord subjects using imagined voluntary motor potentials," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 10, no. 4, pp. 219-224, Dec. 2002.
[32] C. Babiloni, F. Carducci, F. Cincotti, P. M. Rossini, C. Neuper, G. Pfurtscheller and F. Babiloni, "Human movement-related potentials vs desynchronization of EEG alpha rhythm: a high-resolution EEG study," Neuroimage, vol. 10, no. 6, pp. 658-665, Dec. 1999.
[33] S. G. Mallat, "Multifrequency channel decompositions of images and wavelet models," IEEE Trans. Acoustics, Speech and Signal Processing, vol. 37, no. 12, pp. 2091-2106, 1989.
[34] N. Kwak and Chong-Ho Choi, "Input feature selection for classification problems," IEEE Trans. Neural Networks, vol. 13, no. 1, pp. 143-159, 2002.
[35] K. R. Muller, C. W. Anderson and G. E. Birch, "Linear and Nonlinear Methods for Brain-Computer Interfaces," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 11, no. 2, pp. 165-169, Jun. 2003.
[36] H. Yoon, K. Yang and C. Shahabi, "Feature subset selection and feature ranking for multivariate time series," IEEE Trans. Knowledge Data Eng., vol. 17, no. 9, pp. 1186-1198, 2005.
[37] C. Chang and C.
Lin, LIBSVM: A Library for Support Vector Machines. 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[38] M. Kaper, P. Meinicke, U. Grossekathoefer, T. Lingner and H. Ritter, "BCI Competition 2003--Data set IIb: support vector machines for the P300 speller paradigm," IEEE Trans. Biomed. Eng., vol. 51, no. 6, pp. 1073-1076, Jun. 2004.
[39] M. Kaper and H. Ritter, "Generalizing to new subjects in brain-computer interfacing," in Proc. 26th Annual Int. Conf. of Engineering in Medicine and Biology Society (EMBC’04), vol. 6, pp. 4363-4366, 2004.
[40] A. Bashashati, M. Fatourechi, R. K. Ward and G. E. Birch, "User customization of the feature generator of an asynchronous brain interface," Ann. Biomed. Eng., vol. 34, no. 6, pp. 1051-1060, Jun. 2006.
[41] V. G. Evidente, J. N. Caviness, B. Jamieson, A. Weaver and N. Joshi, "Intersubject variability and intrasubject reproducibility of the bereitschaftspotential," Mov. Disord., vol. 14, no. 2, pp. 313-319, Mar. 1999.
[42] J. Millan, M. Franze, J. Mourino, F. Cincotti and F. Babiloni, "Relevant EEG features for the classification of spontaneous motor-related tasks," Biol. Cybern., vol. 86, no. 2, pp. 89-95, Feb. 2002.

CHAPTER 4 A SELF-PACED BRAIN COMPUTER INTERFACE SYSTEM THAT USES MOVEMENT RELATED POTENTIALS AND CHANGES IN THE POWER OF BRAIN RHYTHMS⁴

4.1 Introduction

In a brain computer interface (BCI) system, specific features of a person’s brain signal relating to his/her intent are used to generate a control command that controls/actuates a device (see Figure 4-1 for a functional model of a BCI system).

Figure 4-1. Functional model of a BCI system (adapted from [1]).

BCI designs are implemented in two ways: synchronized (system-paced) or asynchronous (self-paced).
In synchronized BCI systems, a user can initiate a command only during certain periods specified by the system. It is assumed that a user only intends some control action during these specified times. Figure 4-2(a) shows a synchronized BCI system that can detect two intentional control (IC) commands (IC1 and IC2). In a self-paced BCI (SBCI) system, users can affect the BCI transducer’s output whenever they want, by intentionally changing their brain state. The state in which a user is intentionally attempting to control a BCI transducer is called an IC state. At other times, users are said to be in a no-control (NC) state, where they may be idle or performing some action other than trying to control the BCI transducer [2, 3]. To operate in this paradigm, BCI transducers are designed to respond only when a user is in an IC state and to remain inactive when a user is in an NC state. Figure 4-2(b) shows the output of an SBCI system.

⁴ A version of this chapter has been published. Fatourechi, M., Birch, G. E., and Ward, R. K., “A Self-paced Brain Interface System that Uses Movement Related Potentials and Changes in the Power of Brain Rhythms”, Journal of Computational Neuroscience, Vol. 23, No. 1, pp. 21-37, Aug 2007.

Figure 4-2. Synchronized vs. SBCI systems. (a) In a synchronized BCI system control is only possible during System Ready periods; (b) In an SBCI system, the system continuously accepts the input signals.

So far, only a few BCI transducers (e.g., [2, 4-9]) have been specifically designed and tested for self-paced control applications. But as recognized in [10], SBCI systems deserve more attention. In this chapter, we focus on the issue of improving the performance of an SBCI system.

A two-state SBCI system should be able to discriminate an IC command from an NC state. The performance of this system is usually evaluated through two metrics: a true positive (TP) rate and a false positive (FP) rate.
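As a rough illustration of these two metrics, the Python sketch below takes the TP rate as the share of IC epochs labelled IC and the FP rate as the share of NC epochs labelled IC, both in percent. The function name and the "IC"/"NC" string labels are assumptions made for this example and are not part of the system described in this chapter.

```python
def tp_fp_rates(true_labels, predicted_labels):
    """Return (TP rate, FP rate) in percent for epochs labelled 'IC' or 'NC'."""
    pairs = list(zip(true_labels, predicted_labels))
    ic_total = sum(1 for t, _ in pairs if t == "IC")
    nc_total = sum(1 for t, _ in pairs if t == "NC")
    # IC epochs correctly detected as IC.
    tp = sum(1 for t, p in pairs if t == "IC" and p == "IC")
    # NC epochs wrongly flagged as IC (false activations).
    fp = sum(1 for t, p in pairs if t == "NC" and p == "IC")
    tpr = 100.0 * tp / ic_total if ic_total else 0.0
    fpr = 100.0 * fp / nc_total if nc_total else 0.0
    return tpr, fpr
```

Because NC epochs usually far outnumber IC epochs in self-paced operation, even a small FP rate can correspond to many false activations in absolute terms.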
An FP rate is the percentage of NC periods incorrectly classified as IC periods, and a TP rate is the percentage of IC periods correctly classified. The FP rates of current SBCI systems are still very high for practical applications. The main reason is the very noisy nature of the brain’s electrical signals, which makes correct detection of patterns associated with control commands very difficult. Nevertheless, it is crucial to keep the FP rate as low as possible in order to prevent user frustration.

We propose the use of multiple neurological phenomena as sources of control to improve the performance of an SBCI system (in terms of increasing the TPR/FPR ratio). The proposed SBCI system uses features extracted from three neurological phenomena: movement-related potentials (MRP), changes in the power of Mu rhythms (CPMR) and changes in the power of Beta rhythms (CPBR). The main rationale behind using these three neurological phenomena is that they are time-locked to movement onset [11, 12]. Thus, when a movement happens, it is expected that all three will be present. As a result, we postulate that a detector that considers the simultaneous occurrence of these phenomena should be more robust to the presence of transient non-control related changes in the brain signals (which may affect the performance of an SBCI system) than those that just look at one of the above-mentioned neurological phenomena.

Increasing the number of neurological phenomena considered from one to three has the disadvantage of increasing the dimension of the feature space. When the number of training samples is limited, the training data are likely to fall in a very small fraction of the sample space. This will limit the generalization property of the classifier [13]. To address this issue, we developed a new algorithm to reduce the dimensionality of the feature space.
The proposed algorithm uses a two-stage multiple-classifier system (MCS) to classify the brain signals. We use the spatial information to develop the first stage of the MCS and the information from the neurological phenomena to develop the second stage of the MCS. The use of an MCS allows us to design a strong classifier by using an ensemble of “weak” classifiers. The evidence from the literature shows that in many cases this approach yields superior results over those of the best individual classifier [14, 15] and in some cases over those of a single powerful classifier [16-18]. For a BCI system, where the number of training IC patterns is usually limited, using an MCS provides us with a great opportunity to explore more features.

The proposed system showed superior performance to that of a simpler approach whereby all the features were combined into a single feature vector and then classified into IC or NC classes. Its performance was also found to be superior to that of an MCS that was based on the spatial information of a single neurological phenomenon.

To maximize performance of the proposed design, we employ a user-customized approach. It is well known that the spatiotemporal characteristics of a particular neurological phenomenon change from one individual to another [19-21]. As a result, customizing the BCI system for each user is very important in achieving consistently good results in all users. The improvements in the performance for users after employing user-customization (e.g., [7, 22-25]) further emphasize the importance of such customization. In order to reduce the effect of inter-subject variability of neurological phenomena in the proposed BCI design, a genetic algorithm was implemented to select the set of EEG channels that resulted in the best classification accuracy for each MCS. The results showed that the configuration of EEG channels leading to optimal performance varies from one user to another.
In the next section, we discuss the neurological phenomena under consideration and provide evidence from the literature that their combination would be useful in the design of an SBCI system.

4.2 Background

4.2.1 Neurological phenomenon background

It is known that an internally paced movement generates the following responses in the EEG signal: a movement-related potential (MRP), an event-related desynchronization (ERD) and an event-related synchronization (ERS). The MRP on the one hand and the ERD and ERS on the other are different responses of neural structures in the brain [26].

Averaging EEG data with respect to movement onset results in the generation of typical slow potentials, “movement-related potentials” (MRPs), from background oscillatory electrical activity [27]. MRPs start about 1 to 1.5 seconds before the onset of a particular movement and have bilateral distribution [12, 27-30]. High-resolution EEG studies have modeled the main sources of MRPs arising in the supplementary motor area and the primary sensorimotor cortex [31, 32].

Voluntary movement results in a circumscribed desynchronization in the Mu and Beta bands, localized close to the sensorimotor areas [33, 34]. This desynchronization, termed “event-related desynchronization” (ERD), starts about 2 seconds prior to the onset of movement [26]. The enhanced rhythmic activity following the movement is called “event-related synchronization” (ERS). The post-movement Beta ERS is found in the first second after the termination of a voluntary movement, when the Mu rhythm still displays a desynchronized pattern [26]. The Beta ERS is a relatively robust phenomenon and is found in nearly all individuals after a finger, hand, arm or foot movement [35].

A number of papers provide some evidence that MRPs and changes in the power of brain rhythms (usually characterized as ERD and ERS) provide complementary information for exploration of the cognitive functions of the brain.
It is suggested that MRPs can be considered as a series of transient post-synaptic responses of main pyramidal neurons triggered as a result of a specific event [26]. The same paper also states that the ERD and ERS phenomena can be viewed as being generated by changes in one or more parameters that control oscillations in neural networks. In [36], analysis of subdural EEG recordings from the primary sensorimotor cortex in epileptic patients showed that the amplitude of the ERD of the Alpha rhythm recorded from subdural areas was not always correlated with corresponding MRPs. It is suggested in the same paper that these neurological phenomena represent different aspects of cortical motor processes. According to [37], the ERD of the Alpha rhythm is not always detected in cortical sites generating MRPs. In [12], through a high-resolution EEG study, it is shown that MRPs and the ERD of the Alpha rhythm provide complementary information on human brain responses accompanying the preparation and execution of a finger movement. Further evidence from the analysis of EEG signals [38, 39] and magnetoencephalography (MEG) signals [18, 40, 41] strengthens these findings.

There is also some evidence regarding the differences between Mu and Beta rhythms. Several papers show that the reactivities of the Mu and Beta rhythms related to the movement onset are different [20, 42]. Both the Mu and Beta rhythms desynchronize before a voluntary self-paced movement. However, after the movement, the ERD of the Mu rhythm is followed by a slow return to baseline (and sometimes by a slight synchronization), while the Beta rhythms synchronize rapidly after the movement onset [20].

4.2.2 Multiple neurological phenomena in BCI systems

Although most BCI researchers use a single neurological phenomenon as the source of control, there have been reports of using multiple neurological phenomena in BCI systems [11, 18, 43-46].
In [18], the authors analyzed combinations of features extracted from an early component of the MRP called the Bereitschaftspotential (BP), features extracted from the ERD of neurological phenomena above 4 Hz (through AR modeling), and common spatial patterns (CSP) features related to the ERD of the Mu rhythms. The BCI system had to discriminate between left and right index finger movements. A linear discriminant analysis (LDA) classifier was used for classification. Different combination schemes were explored. The study showed that a certain combination of classifiers could result in a lower error rate than a single classifier. The results of combining the ERD of the Mu rhythm and the BP were not reported, although the authors mention that those results were slightly worse than the results obtained when all three neurological phenomena were used in the design of the BCI system.

In [43], the authors applied a combination of microstate analysis and common spatial subspace decomposition to extract features belonging to three different frequency bands: Theta + Delta, Mu and Beta. The MRPs were not regarded as a separate neurological phenomenon. Instead, the features were extracted from the frequency band covering both the Delta and Theta rhythms. These features were then used to discriminate between left and right hand movements.

In [47], the authors used the BP and the ERD of the brain rhythms from 10 to 33 Hz (including both the Mu and Beta rhythms) to classify left vs. right finger movement. The features extracted from all neurological phenomena and all channels were then combined, the dimension of the feature vector was reduced and the final vector was classified using a perceptron neural network. The results showed a classification accuracy of 84% on the test set, but the contribution of each neurological phenomenon is not exactly known.
In [48], the authors used features extracted from the BP and the ERD of the Mu rhythms for classifying left and right index finger movements.

The above studies all pertain to synchronized BCI systems. To the best of our knowledge, only one SBCI system that uses multiple neurological phenomena has been reported so far [44]. In [44], the authors studied combining a number of neurological phenomena in order to design an ECoG-based SBCI system. Using a wavelet packet, the ECoG signal was divided into 18 different frequency bands covering the range from 0 to 100 Hz. This range covered a wide range of neurological phenomena including the Mu, Beta and Gamma rhythms, as well as other movement-related activities. Then, for each band, wavelet-filtered signals were reconstructed. The wavelet-filtered signals were squared to obtain power values, and a genetic algorithm was applied to reduce the dimension of the feature space to one. Using a thresholding classifier, the test samples were classified as movement or no movement.

Aside from the different signal processing approach, there are the following neurological phenomena-related differences between our proposed approach and that proposed in [44]:
1) While [44] focuses on the power of signals in a wide range of frequency bands, including the Theta, Beta, Mu and Gamma rhythms among others, our approach focuses on three specific neurological phenomena: MRPs, and changes in the power of the Mu and Beta rhythms.
2) In [44], the power in the Delta rhythms was used as one of the features. In this chapter, we intend to detect the shape of the MRP pattern in the ongoing EEG signal.
3) In [44], the contribution of each neurological phenomenon is not evident. Only the most significant features (those with the largest weight) were highlighted. In our study, we specifically show which neurological phenomena are present in the design of the proposed user-customized SBCI system.
4) In [44], only the powers of signals at different frequency bands were used as features, while in this chapter, we are interested in detecting the time course of three distinct neurological phenomena.

4.3 Data collection

People with severe motor disabilities cannot physically execute a movement such as a finger flexion, but they are usually able to attempt a movement execution (by thinking that they are executing it). Several studies have shown that the EEG recordings obtained from attempted and real movements for able-bodied individuals bear many similarities [49-52]. These studies demonstrated that attempted and executed movements both result in the activation of similar cortical areas and the generation of similar patterns. This evidence enables us to take the initial steps towards the development of an SBCI system using data recorded from able-bodied individuals. A similar rationale can be found in the design of other SBCI systems in which the data of able-bodied individuals were employed [53, 54]. By using the data of able-bodied individuals, it is possible to detect the occurrence (if any) of a control command by analyzing signals such as the electromyography (EMG) signal or the output of an actual switch. These signals can be used for labeling the brain signals and for evaluating the system’s performance. The analysis of data from individuals with motor disabilities was left to future studies.

The data of four able-bodied individuals (three males and one female) were used in this study. All individuals were right-handed and between 31 and 56 years old. They had all signed consent forms prior to participation in the experiment.

The individuals were positioned 150 cm in front of a computer monitor. The EEG signals were recorded from 13 monopolar electrodes positioned according to the International 10-20 System at the F1, Fz, F2, FC3, FC1, FCz, FC2, FC4, C3, C1, Cz, C2 and C4 locations. The cutoff frequency of the amplifier was set at 30 Hz.
Electrooculography (EOG) activity was measured as the potential difference between two electrodes, placed at the corner of and below the right eye. An ocular artifact was considered to be present when the difference between the EOG electrodes exceeded ±25 µV. All signals were sampled at 128 Hz and referenced to the ear electrodes (see [55] for details of the data recording). The recorded signals were then saved on a computer for further analysis.

The individuals performed a guided task. At each interval, a white circle of 2 cm diameter was displayed on the user’s monitor for ¼ second, prompting the user to attempt a movement. In response to this cue, the user had to perform a right index finger flexion one second after the cue appeared. The 1-second delay was used to avoid visual evoked potential (VEP) effects caused by the cue. This is the time that the user is expected to attempt the movement, but this time may vary from one use