UBC Theses and Dissertations


Scalp and intracranial EEG quantitative analysis : robust detection and prediction of epileptic seizures Hussein, Ramy 2019



Scalp and Intracranial EEG Quantitative Analysis: Robust Detection and Prediction of Epileptic Seizures

by

Ramy Hussein
M.Sc., Alexandria University, 2013
B.Sc., Alexandria University, 2011

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
in
The Faculty of Graduate and Postdoctoral Studies
(Electrical and Computer Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

July 2019

© Ramy Hussein, 2019

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled:

Scalp and intracranial EEG quantitative analysis: robust detection and prediction of epileptic seizures

submitted by Ramy Hussein in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical and Computer Engineering

Examining Committee:

Rabab K. Ward, Electrical and Computer Engineering
Co-supervisor

Zhen Jane Wang, Electrical and Computer Engineering
Co-supervisor

Supervisory Committee Member

Leonid Sigal, Computer Science
University Examiner

Purang Abolmaesumi, Electrical and Computer Engineering
University Examiner

Additional Supervisory Committee Members:

Panos Nasiopoulos, Electrical and Computer Engineering
Supervisory Committee Member

Martin J. McKeown, Clinical Director of Pacific Parkinson's Research Center
Supervisory Committee Member

Abstract

Epilepsy is a common neurological disorder that affects over 90 million people globally, 30-40% of whom do not respond to medication. Electroencephalogram (EEG) is the prime tool that has been widely used for the diagnosis and management of epilepsy. As the visual inspection of long-term EEG is tedious, expensive, and time-consuming, research in the EEG-based methods to automatically detect and predict epileptic seizures has been very active.
This thesis studies how to leverage the temporal, spectral, and spatial information in the EEG data to accurately detect and also predict seizures.

To automatically detect epileptic seizures, we first introduce a computationally-efficient method that detects a seizure within a very short time of its onset. It relies on a computationally-simple feature extraction method based on LASSO regression and extracts the prominent EEG seizure-associated features in a time-efficient manner, achieving high seizure detection accuracy with very short detection latency.

Subsequently, we propose two novel methods for robust detection of epileptic seizures where the main question addressed is: Can we identify the seizure pattern(s) hidden in the contaminated EEG data? We first present a novel feature learning algorithm based on ℓ1-penalized robust regression. This algorithm extracts the most distinguishable EEG spectral features pertinent to epileptic seizures. We then propose a deep learning method that achieves more robust performance under real-life conditions. This method uses long short-term memory recurrent networks to exploit the temporal dependencies in the EEG data and accurately recognize the seizure patterns. Both methods are shown to maintain robust performance in the presence of common signal contaminants and ambient noise.

The thesis then addresses the seizure prediction problem using intracranial EEG (iEEG) data. A novel architecture of multi-scale convolutional neural networks is proposed to learn the discriminative pre-seizure iEEG features that could potentially help predict impending seizures. Experiments on clinical data show that this method achieves high seizure prediction sensitivity and maintains reliable performance against inter- and intra-patient variations.

Lay Summary

Epileptic seizures are bursts of electrical activity in the brain.
These bursts are identifiable in the electroencephalogram (EEG) signals, but their visual detection is labor-intensive, time-consuming, and impractical. Also, seizure EEG data are extremely noisy and vary in time for one person, and from one person to another. This dissertation develops automatic detection methods that effectively filter out the noise and account for the variability in EEG signals, potentially making them suitable for the clinical diagnosis of epileptic seizures. It also proposes an accurate seizure prediction method that could warn patients and caregivers of upcoming seizures. This allows more tailored therapies and hence reduces the side effects of medications. Brain stimulation therapies and closed-loop intervention systems could also be used to prevent upcoming seizure attacks.

Preface

The body of research in this dissertation (Chapters 3 - 7) is based on several collaborative papers that have either been previously published or are currently under review. Below is the list of these scientific papers:

• J1: Ramy Hussein, Mohamed Elgendi, Z. Jane Wang, and Rabab Ward, “Robust Detection of Epileptic Seizures Based on L1-Penalized Robust Regression of EEG Signals”, Expert Systems with Applications, Volume: 104, pp. 153–167, 2018.
• J2: Ramy Hussein, Hamid Palangi, Rabab Ward, and Z. Jane Wang, “Optimized Deep Neural Network Architecture for Robust Detection of Epileptic Seizures using EEG Signals”, Clinical Neurophysiology, Volume: 130, Issue: 1, pp. 25–37, 2019.
• J3: Ramy Hussein, Mohamed Osama Ahmed, Z. Jane Wang, Rabab Ward, Mark Schmidt, and Levin Kuhlmann, “Human Intracranial EEG Quantitative Analysis and Automatic Feature Learning for Epileptic Seizure Prediction”, IEEE Transactions on Biomedical Engineering, Accepted, May 2019.
• C1: Ramy Hussein, Rabab Ward, and Z. Jane Wang, “Energy-Efficient EEG Monitoring System for Wireless Epileptic Seizure Detection”, IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 294–299, 2016.
• C2: Ramy Hussein, Z. Jane Wang, and Rabab Ward, “L1-Regularization Based EEG Feature Learning for Detecting Epileptic Seizure”, IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 1171–1175, 2016.
• C3: Ramy Hussein, Mohamed Elgendi, Rabab Ward, and Z. Jane Wang, “High Performance EEG Feature Extraction for Fast Epileptic Seizure Detection”, IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 953–957, 2017.
• C4: Ramy Hussein, Hamid Palangi, Z. Jane Wang, and Rabab Ward, “Robust Detection of Epileptic Seizures using Deep Neural Networks”, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2546–2550, 2018.
• C5: Ramy Hussein, Z. Jane Wang, and Rabab Ward, “Multi-scale Deep Convolutional Neural Network for Epileptic Seizure Prediction”, IEEE Global Conference on Signal and Information Processing (GlobalSIP), Submitted, 2019.

The authors’ roles and contributions are as follows:

• C1: Ramy Hussein (the primary author and the main contributor of the paper) formulated the problem, designed the proposed methods, implemented them, and analyzed the results. Prof. Ward and Prof. Wang provided extensive technical feedback throughout formulating the method and helped edit the paper.
• J1, C2: Ramy Hussein (the primary author and the main contributor of the papers) derived all the formulations, implemented the proposed methods, and evaluated their effectiveness on real clinical data. Prof. Ward and Prof. Wang provided extensive technical and editorial feedback throughout writing these papers.
• C3: Ramy Hussein (the primary author and the main contributor of the paper) conceptualized the paper, formulated the proposed solution, analyzed the results, and wrote the whole paper. Dr. Elgendi and Prof. Ward provided extensive technical discussion during the implementation of the proposed method. Prof. Wang helped with performing the evaluations and writing the paper.
• J2, C4: Ramy Hussein (the primary author and the main contributor of the papers) developed the ideas of this research independently, formulated the proposed solutions, performed the evaluations, and wrote the papers. Dr. Palangi helped with conducting some of the experiments. Prof. Ward and Prof. Wang helped with writing the papers and provided valuable feedback and ideas for data visualization.
• J3, C5: Ramy Hussein (the primary author and the main contributor of the papers) developed and implemented the proposed neural network architectures, examined them on clinical datasets, analyzed the results, and wrote the papers. Dr. Ahmed and Prof. Schmidt helped with conducting the quantitative analysis of the data and optimizing the structure of the proposed architectures. Dr. Ahmed also helped to run some of the experiments and fine-tune the neural network hyper-parameters. Prof. Kuhlmann provided the data together with a detailed description of its characteristics and properties. Prof. Ward and Prof. Wang provided extensive technical and editorial feedback throughout writing these papers.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Glossary
Acknowledgements
Dedication
1 Introduction
  1.1 Epilepsy and Epileptic Seizures
  1.2 Electroencephalogram (EEG)
    1.2.1 Surface EEG
    1.2.2 Intracranial EEG
  1.3 Motivation
  1.4 Thesis Contributions
    1.4.1 Developing an Energy-Efficient Wireless EEG Monitoring System for Epileptic Seizure Detection
    1.4.2 Designing a Computationally-Efficient Feature Extraction Method for Early Detection of Epileptic Seizures
    1.4.3 Developing Robust Detection Methods of Epileptic Seizures
    1.4.4 Developing an Accurate Prediction Method of Epileptic Seizures
  1.5 Thesis Outline
2 Literature Review
  2.1 Epileptic Seizure Detection
    2.1.1 Time Domain-based Methods
    2.1.2 Frequency Domain-based Methods
    2.1.3 Time-Frequency Domain-based Methods
    2.1.4 Empirical Mode Decomposition (EMD)-based Methods
    2.1.5 Rational Functions-based Methods
  2.2 Epileptic Seizure Prediction
3 Energy Efficient EEG Monitoring System for Wireless Epileptic Seizure Detection
  3.1 Introduction
  3.2 Research Background
  3.3 EEG Feature Extraction and Classification
    3.3.1 EEG Data and Subjects
    3.3.2 EEG Feature Extraction
    3.3.3 EEG Classification
  3.4 Proposed Energy-Efficient Method
    3.4.1 Missing at Random – Sensor Side
    3.4.2 Expectation Maximization – Server Side
  3.5 EEG Data Transmission in a Wireless Seizure Detection System
    3.5.1 Transmission of Entire Raw EEG Data
    3.5.2 Transmission of Compressed EEG Data
    3.5.3 Transmission of EEG Features
  3.6 Power Consumption Evaluation and Seizure Detection Performance
    3.6.1 Raw EEG Data Streaming
    3.6.2 Compressed EEG Data Streaming
    3.6.3 EEG Features Streaming
  3.7 Limitations and Recommendations
  3.8 Summary and Conclusion
4 Computationally Efficient EEG Feature Learning for Early Detection of Epileptic Seizures
  4.1 Introduction
  4.2 Materials
    4.2.1 EEG Data
    4.2.2 EEG Spectrum
  4.3 Methodology
    4.3.1 EEG Feature Learning: LASSO Regression
    4.3.2 EEG Feature Classification: Random Forest
  4.4 Results and Discussion
    4.4.1 Seizure Detection under Ideal Conditions
    4.4.2 Seizure Detection in the Presence of Body Artifacts
  4.5 Summary and Conclusion
5 Robust Detection of Epileptic Seizures Based on L1-Penalized Robust Regression of EEG Signals
  5.1 Introduction
  5.2 Materials
    5.2.1 EEG Data and Subjects
    5.2.2 EEG Frequency Spectrum
  5.3 Literature Review
    5.3.1 Two-class EEG Classification Problem
    5.3.2 Three-class EEG Classification Problem
    5.3.3 Five-class EEG Classification Problem
  5.4 Methodology
    5.4.1 Robust EEG Feature Learning
    5.4.2 EEG Classification
  5.5 Results and Discussion
    5.5.1 Seizure Detection under Ideal Conditions
    5.5.2 Seizure Detection under Real-life Conditions
  5.6 Summary and Conclusion
6 Optimized Deep Neural Network Architecture for Robust Detection of Epileptic Seizures using EEG Signals
  6.1 Introduction
  6.2 EEG Dataset
    6.2.1 Description of EEG Data
    6.2.2 Common EEG Artifacts
  6.3 Related Work
    6.3.1 Two-class EEG Classification Problem
    6.3.2 Three-class EEG Classification Problem
    6.3.3 Five-class EEG Classification Problem
  6.4 Methodology
    6.4.1 High-level Picture
    6.4.2 Proposed Method
  6.5 Results and Discussion
    6.5.1 Seizure Detection under Ideal Conditions
    6.5.2 Seizure Detection under Real-life Conditions
  6.6 Limitations and Recommendations
  6.7 Summary and Conclusion
  6.8 Data and Codes Availability
7 Human Intracranial EEG Quantitative Analysis and Automatic Feature Learning for Epileptic Seizure Prediction
  7.1 Introduction
  7.2 Human iEEG Quantitative Analysis
    7.2.1 Subjects and Data
    7.2.2 Are Interictal and Preictal iEEGs Statistically Different?
    7.2.3 Can PCA be Efficient for iEEG Data Reduction?
    7.2.4 Can We Exclude some EEG Channels/Sensors?
    7.2.5 Are Neighbor iEEG Sensors more Correlated than Distant Ones?
  7.3 Methodology
    7.3.1 iEEG Preprocessing
    7.3.2 Automatic Feature Learning of iEEG
    7.3.3 Performance Evaluation
  7.4 Results and Discussion
    7.4.1 Seizure Prediction Results
    7.4.2 Possible Reasons for Limited Performance
  7.5 Summary and Conclusion
8 Conclusion and Future Work
  8.1 Conclusion
    8.1.1 An Energy-Efficient EEG Tele-monitoring Scheme
    8.1.2 A Computationally-Efficient EEG Feature Learning for Fast Detection of Seizure Onsets
    8.1.3 An L1-Penalized Robust Regression for Reliable Detection of Epileptic Seizures
    8.1.4 A Deep Learning Approach for Robust Detection of Epileptic Seizures
    8.1.5 A Convolutional Neural Network Architecture for Accurate Prediction of Epileptic Seizures
  8.2 Future Work
    8.2.1 Automated Video Processing for Real-time Detection of Epileptic Seizures
    8.2.2 Multi-modal and Non-invasive Framework for Accurate Prediction of Epileptic Seizures
    8.2.3 Epileptic Seizure Prediction using Non-invasive Canine EEG Signals
Bibliography
Appendices
A Chapter 5 Supplementary Material
  A.1 Convergence Rate of Gradient Descent
  A.2 Convergence Rate of Block Coordinate Descent
B Chapter 6 Supplementary Material
  B.1 Influence of EEG Segmentation on the Seizure Detection Accuracy
  B.2 Architecture of Vanilla RNN and LSTM
138xiList of Tables3.1 Seizure detection performance and total power consumption of raw data (RD) model . 313.2 Seizure detection performance and total power consumption of compressed data (CD)model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323.3 Seizure detection performance and total power consumption of EEG features (EEGF)model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334.1 Seizure detection results of the proposed and state-of-the-art methods; NR = Not Reported. 415.1 Seizure detection results of the proposed and state-of-the-art methods: Two-class prob-lem (A-E) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575.2 Seizure detection results of the proposed and state-of-the-art methods: Two-class prob-lem (ABCD-E) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585.3 Detailed seizure detection accuracy of the proposed method for the imbalanced classifi-cation problem ABCD-E: Hold-out (50.00-50.00%) . . . . . . . . . . . . . . . . . . . 595.4 Detailed seizure detection accuracy of the proposed method for the imbalanced classifi-cation problem ABCD-E: 10-folds cross-validation . . . . . . . . . . . . . . . . . . . 595.5 Seizure detection results of the proposed and state-of-the-art methods: Three-class prob-lem (A-C-E) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605.6 Seizure detection results of the proposed and state-of-the-art methods: Five-class prob-lem (A-B-C-D-E) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616.1 Seizure detection results of the proposed and baseline methods: Two-class problem (A-E). 796.2 Seizure detection results of the proposed and baseline methods: Two-class problem(ABCD-E). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
806.3 Seizure detection results of the proposed and baseline methods: Three-class problem(A-C-E). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 816.4 Seizure detection results of the proposed and baseline methods: Five-class problem (A-B-C-D-E). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 817.1 Description of the seizure prediction intracranial EEG dataset. . . . . . . . . . . . . . 927.2 Seizure prediction sensitivity scores of the algorithms under study. . . . . . . . . . . . 1087.3 AUC scores for the proposed method and Kaggle top finishing contestants on the test set. 1097.4 Per-subject AUC scores of the proposed method. . . . . . . . . . . . . . . . . . . . . 109xiiList of Figures1.1 Electrode locations of international 10-20 system for EEG recording . . . . . . . . . . 41.2 An example of multi-channel surface EEG signals recorded from an epileptic patient. . 51.3 Subdural grids placed on the brain surface for iEEG recording from an epileptic patient. 53.1 Frequency spectra of noisy and clean EEG signals. . . . . . . . . . . . . . . . . . . . 243.2 Missing value patterns of 100 EEG epochs. . . . . . . . . . . . . . . . . . . . . . . . 263.3 Proposed energy-efficient EEG monitoring methods for wireless epileptic seizure detec-tion. (A) MAR is applied to the entire raw EEG data before transmission, (B) MAR isapplied to the compressed EEG data before transmission, and (C) MAR is applied to thedistinctive EEG rhythms before transmission. EM is used at the server side to retrievethe missing values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273.4 Recovery of raw EEG data using EM: (a), (b) and (c) correspond to the original, missingand recovered raw EEG data, respectively. . . . . . . . . . . . . . . . . . . . . . . . . 
314.1 Time-series EEG plots: (a), (b), and (c) clean EEG examples of normal, interictal, andictal EEG activities, respectively; (d), (e), and (f) noisy EEG examples of normal, inter-ictal, and ictal EEG activities, respectively. . . . . . . . . . . . . . . . . . . . . . . . . 364.2 Frequency spectrum plots: (a), (b), and (c) clean EEG spectrum of normal, interictal,and ictal activities, respectively; (d), (e), and (f) noisy EEG spectrum of normal, inter-ictal, and ictal activities, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . 374.3 Original and extracted frequency components of clean EEG spectrum. . . . . . . . . . 414.4 Original and extracted frequency components of noisy EEG spectrum. . . . . . . . . . 425.1 Time-series EEG signals and their spectra: (a)-(e) Examples of time-series EEG signalsfrom the five different sets of the Bonn University EEG dataset; (f)-(j) correspondingfrequency spectrum plots of the EEG example signals in (a)-(e). . . . . . . . . . . . . 465.2 Time-series EEG plots: (a) clean ictal signal; (b), (c), and (d) ictal signals corruptedwith muscle artifacts, eye-blinking, and white noise, respectively. . . . . . . . . . . . . 475.3 Classification accuracy against signal-to-noise ratio for the two-class problem A-E. . . 625.4 Classification accuracy against signal-to-noise ratio for the two-class problem ABCD-E. 635.5 Classification accuracy against signal-to-noise ratio for the three-class problem A-C-E. 645.6 Classification accuracy against signal-to-noise ratio for the five-class problem A-B-C-D-E. 65xiii6.1 Samples of EEG signals from each of the five sets of the Bonn University EEG database. 696.2 Clean and noisy EEG signals and their corresponding spectra: (a) clean EEG examplefrom set A; (b-d) noisy EEG examples contaminated by muscle activities, eye move-ment, and white noise, respectively; (e-h) corresponding frequency spectra of (a-d),respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . 706.3 Schematic diagram of the overall seizure detection approach: Each EEG EEG chan-nel signal of d data-points is segmented into M segments - each segment includesL data-points; LSTM stands for Long-Short-Term Memory; y is the output of LSTMlayer; h1represents a fully connected (dense) layer unit; v is the fully connected layeroutput;P1, P2, P3, · · · , PK are the probabilities produced by softmax for the K-classes;Out stands for the output of the softmax layer (predicted label). . . . . . . . . . . . . 756.4 Classification accuracy vs. SNR plots for the two-class EEG classification problem (A-E). 836.5 Classification accuracy vs. SNR plots for the two-class EEG classification problem(ABCD-E). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 846.6 Classification accuracy vs. SNR plots for the three-class EEG classification problem(A-C-E). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 856.7 Classification accuracy vs. SNR plots for the five-class EEG classification problem (A-B-C-D-E). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 867.1 Examples of one-hour preictal iEEG clips collected by a 16-channel device with a 5-minute offset before seizures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 917.2 Boxplots of Patient 1’s interictal and preictal iEEG data. . . . . . . . . . . . . . . . . 937.3 Boxplots of Patient 2’s interictal and preictal iEEG data. . . . . . . . . . . . . . . . . 947.4 Explained variance by different principle components of interictal iEEG data for Patient 1. 957.5 Explained variance by different principle components of preictal iEEG data for Patient 1. 957.6 Heatmap of pairwise correlation values of Patient 1’s interictal iEEG’s 16 sensors. . . . 967.7 Heatmap of pairwise correlation values of Patient 1’s preictal iEEG’s 16 sensors. . . . 
977.8 CT scan of the NeuroVista seizure advisory system implanted in a patient [50]. . . . . 987.9 Hierarchically clustered preictal iEEG sensors with dendrograms and clusters in Patient 1. 987.10 Hierarchically clustered preictal iEEG sensors with dendrograms and clusters in Patient 2. 997.11 Frequency spectra of interictal (blue) and preictal (red) iEEG signals collected by the 16implanted electrodes (channels) of the seizure advisory system in Patient 1. Ch1-Ch16stand for Channels 1 to 16. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1027.12 Time-series iEEG signals and their corresponding spectra: (a) and (b) original iEEGclip and its 200Hz frequency spectrum; (c) and (d) downsampled-by-4 iEEG clip andits 50Hz frequency spectrum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1037.13 3D Spectrogram of a 1-minute preictal iEEG segment. . . . . . . . . . . . . . . . . . 103xiv7.14 Schematic pipeline of the proposed iEEG data pre-processing approach for epilepticseizure prediction: SEG1, SEG2, · · · , SEG10 are corresponding to the 1st, 2nd, and10th iEEG segments of each iEEG channel signal; NCh is the total number of iEEGchannels (NCh=16), L is the number of segments per iEEG clip (L=10), and d is thenumber of data-points in each iEEG segment (d=6,000). . . . . . . . . . . . . . . . . 1047.15 Schematic diagram of the proposed neural network architecture for epileptic seizureprediction: Each iEEG instance is of NCh ⇥ 129 ⇥ 26 (NCh=16); Max Pooling standsfor Maximum Pooling; FC layer stands for Fully Connected layer; P1 and P2 are theprobabilities produced by the sigmoid function got class 1 and 2, respectively. . . . . . 1067.16 Examples of corrupted iEEG clips for Patient 1. . . . . . . . . . . . . . . . . . . . . . 1107.17 Data mismatch in Patient 1’s preictal iEEG sensor data. . . . . . . . . . . . . . . . . . 1117.18 Scatter plot to identify outliers in preictal iEEG data of Patient 1: Preictal S1 readingsvs. 
Preictal S2 readings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112B.1 Classification accuracy against EEG segments’ length. . . . . . . . . . . . . . . . . . 138B.2 Detailed schematic of the Standard Recurrent Network (SRN) unit (left) and a Long-Short-Term Memory (LSTM) block (right) [79]. . . . . . . . . . . . . . . . . . . . . . 139xvGlossaryADC Analog-to-Digital ConverterANN Artificial Neural NetworkAUC Area Under the CurveBCD Block Coordinate DescentBLDA Bayesian Linear Discriminant AnalysisBPNN Back-Propagation Neural NetworkBPF Band-Pass FilterCD Compressed DataCV Cross-ValidationCVANN Complex-Valued Artificial Neural NetworksCT Computerized TomographyCNN Convolutional Neural NetworkCR Compression RateCS Compressive SensingCSV Comma-Separated ValuesCWT Continuous Wavelet TransformDb4 Daubechies of order 4DT Decision TreeDSTFT Discrete Short-Time Fourier TransformDFNN Dynamic Fuzzy Neural NetworkDWT Discrete Wavelet TransformEEG ElectroencephalogramEEGF EEG featuresECoG ElectrocorticographyxviECOC Error-Correction Output CodesELM Extreme Learning MachineEM Expectation MaximizationEMD Empirical mode decompositionEMG ElectromyographyFC Fully ConnectedFE Feature ExtractionFFT Fast Fourier TransformfMRI functional Magnetic Resonance ImagingFP False PositiveFPR False Positive RateFSC Fuzzy Sugeno ClassifierGFFNN Generalized Feed-Forward Neural NetworkGMM Gaussian Mixture ModelHF High FrequencyHOS Higher Order SpectraICA Independent Component AnalysisiEEG Intracranial ElectroencephalogramIMFs Intrinsic Mode FunctionsIQR Interquartile RangeL1PRR L1-Penalized Robust RegressionLASSO Least Absolute Shrinkage and Selection OperatorLDA Linear Discriminant AnalysisLF Low FrequencyLGBP Local Gabor Binary PatternLNDP Local Neighbor Descriptive PatternLSTM Long Short-Term MemoryLSSVM Least Square Support Vector MachineMAR Missing at RandomxviiME Mixture of ExpertsMLP Multi-Layer PerceptronMLPNN Multi-Layer Perceptron Neural NetworkMM Matrix 
Multiplication
MRI Magnetic Resonance Imaging
MSE Mean Square Error
NCOV Normalized Coefficient of Variation
PCA Principal Component Analysis
PE Permutation Entropy
PET Positron Emission Tomography
PNN Probabilistic Neural Network
PQ Piecewise Quadratic
PSD Power Spectrum Density
PSO Particle Swarm Optimization
RBFNN Radial Basis Function Neural Network
RCD Randomized Coordinate Descent
RD Raw Data
rDSTFT rational Discrete Short-Time Fourier Transform
RF Random Forest
ROC Receiver Operating Characteristic
RMS Root Mean Square
RNN Recurrent Neural Network
RNG Random Number Generator
RUSBoost Random Under Sampling Boosting
RVM Relevance Vector Machine
SAS Seizure Advisory System
STFT Short-Time Fourier Transform
SPECT Single-Photon Emission Computerized Tomography
SNR Signal-to-Noise Ratio
SRN Standard Recurrent Network
SSAE Stacked Sparse Autoencoder
SUDEP Sudden Unexpected Death in Epilepsy
SVM Support Vector Machine
TP True Positive
WPE Weighted Permutation Entropy
WPT Wavelet Packet Transform
WNNs Wavelet Neural Networks
WSNs Wireless Sensor Networks
WT Wavelet Transform

Acknowledgements

I would first and foremost like to thank my supervisors, Prof. Rabab Ward and Prof. Z. Jane Wang. I am greatly indebted to them for their unconditional support and inspiration throughout my PhD research.

I would like to thank Prof. Rabab Ward for her enormous patience, intellectual guidance, and invaluable advice that helped me a lot during my PhD studies. Her support, encouragement and mentorship are at the root of this work, and it has been an absolute pleasure learning from her.

I would also like to express my deepest appreciation to my co-supervisor, Prof. Z. Jane Wang, for her dedication and insightful suggestions and discussions. This thesis could not have been completed without her support and guidance.

To my supervisory committee, Prof. Panos Nasiopoulos and Prof. Martin McKeown: thank you for your time, efforts and helpful feedback.
Thank you to the Natural Sciences and Engineering Research Council of Canada for its financial support through the Vanier CGS scholarship. Thank you to all of my co-authors and labmates: I have thoroughly enjoyed learning from you and alongside you. To all the support staff at UBC: thank you for always making the logistics and paperwork an easy process.

To my friends Mohamed, Shady, Hazem, Belal, and Radwa: thank you for your unwavering support and for providing me with a distraction when needed. And finally, to my parents and my beloved son, for whom there are no words: this accomplishment is as much yours as it is mine.

To my parents, ...
To Aaser, my son and my only sunshine, ...
for their immeasurable spiritual support and love.

Chapter 1
Introduction

Epilepsy is a chronic neurological disorder that affects around 1% of the world’s population [182]. The defining characteristic of epilepsy is recurrent seizures that strike without warning. Symptoms may range from brief suspension of awareness to violent convulsions and sometimes loss of consciousness [67]. Early detection and prediction of epileptic seizures would facilitate the monitoring and treatment of epilepsy and favorably impact the quality of life and safety for affected patients.
Studies of epilepsy often rely on the electroencephalogram (EEG) as a measure of the brain’s electrical activities associated with seizures [184].

The aim of this thesis is to (i) develop an energy-efficient EEG monitoring system for epileptic seizure detection, (ii) develop early seizure detection algorithms that can identify seizure attacks within seconds of onset, (iii) design seizure detection methods that are robust to environmental noise and body artifacts and that maintain reliable seizure detection performance in real-life situations, (iv) develop a reliable seizure prediction algorithm that can forecast impending seizures ahead of the seizure onset, and (v) quantitatively analyze the human intracranial EEG (iEEG) data and provide some insights into the dynamic correlation between human brain regions during the preictal (before seizures) and interictal (between seizures) brain states.

1.1 Epilepsy and Epileptic Seizures

Epilepsy is the second most common neurological disorder after migraine. Unlike other brain disorders, such as stroke or Alzheimer’s disease, which tend to develop later in life, epilepsy affects people of all ages, with the majority between 15 and 64 years old [49, 90]. The hallmark of epilepsy is recurrent, unprovoked seizures stemming from abnormally excessive neuronal activity in the brain. A person is diagnosed with epilepsy if there has been at least one unprovoked seizure (with the likelihood of further seizures) that occurred without any known cause or medical condition [65, 212]. Provoked seizures, on the contrary, are due to a temporary event such as fever, concussion, brain infection, alcohol withdrawal, or extremely low blood sugar [66, 139].

Epilepsy has many possible causes, but for almost half of the epileptic patients, the cause is not identifiable. For the other half, the epilepsy is traced to causes such as genetic influence, brain tumors, stroke, head trauma, infectious diseases, prenatal injury, and developmental disorders [60].
Seizures are generally classified into two major categories, focal (partial) seizures and generalized seizures [20]:

• Focal (partial) seizures: Seizures that originate in just one part of the brain are classified as focal (partial) seizures. Around 60% of people with epilepsy are diagnosed with focal seizures [32].
• Generalized seizures: Seizures that involve all areas of the brain are called generalized seizures. These seizures may cause falls, massive muscle contractions, or sometimes loss of consciousness.

It is also worth noting that not all seizures can be classified as focal or generalized. Some patients experience seizures that begin as focal and then spread over the entire brain. Other patients may suffer from both kinds of seizures.

Epileptic seizures can also be classified into two categories based on the way they terminate: self-limiting seizures and status epilepticus seizures [61]. Unlike self-limiting seizures, which last for a few minutes, status epilepticus seizures last between five and thirty minutes [12, 200]. Status epilepticus seizures can be either of the generalized tonic-clonic type (with accompanying contraction and extension of the arms and legs) or of types that do not involve contractions, as in complex focal seizures and generalized absence seizures [12]. Also, status epilepticus seizures are life-threatening (particularly if the treatment is delayed) and require immediate medical care. People with status epilepticus have an increased risk of permanent brain damage and death. Indeed, between 10% and 30% of patients who have status epilepticus die within 30 days [12].

Doctors generally begin by treating epilepsy with medications. If medications do not treat the condition, doctors may suggest brain surgery. Some neuroscientists, however, prefer to try the potential alternative of brain stimulation therapies for treating epilepsy [196]. Anti-epileptic medications are effective in controlling seizures in most epileptic patients.
Around 70% of people with epilepsy respond to medications [68]. Some of these can be cured and become seizure-free by taking adequate anti-epileptic drugs, while others may experience less frequent and less intense seizure attacks. The right medication and dosage are determined by doctors on the basis of seizure type, frequency of seizure attacks, seizure intensity, symptoms, and many other factors. Anti-epileptic medications, however, may result in undesirable side effects, some of which are severe and irremediable [159].

According to surveys, ~30% of epileptic patients are drug-resistant and continue to have seizures despite treatment [68]. When anti-epileptic drugs fail to provide satisfactory results, brain surgery may be an alternative option. Doctors usually propose brain surgery when medical tests show that seizures originate in a well-defined small region of the brain and when removing this particular region does not affect vital functions such as vision, hearing, speech, and motor functions. With epilepsy surgery, doctors first use magnetic resonance imaging (MRI) and other measures to determine where seizures happen in the brain. They then surgically remove the region of the brain where seizures occur. Around 70% of epileptic patients who have brain surgery either become seizure-free or experience less frequent and less intense seizures [187]. Surgery for epilepsy, however, involves serious risks and may cause severe complications [195].

Apart from anti-epileptic medications and neurosurgery, brain therapy offers an alternative for drug-resistant patients who are not candidates for or who have failed resective surgery.
Brain therapies have been found less impressive in terms of seizure freedom; however, they can significantly reduce the frequency and strength of seizure attacks [62].

1.2 Electroencephalogram (EEG)

To diagnose epilepsy, patients may have to go through several medical tests such as a blood test, EEG, high-density EEG, computerized tomography (CT) scan, MRI, functional MRI (fMRI), positron emission tomography (PET), and single-photon emission computerized tomography (SPECT). Amongst all medical tests, EEG is the most commonly used tool in the diagnosis, monitoring, and management of epileptic seizures [184]. EEG is also useful for diagnosing other neurological disorders such as tumor [136], brain damage [23], stroke [113], brain dysfunction [40], attention-deficit [78], and also sleep disorders [15]. EEG is utilized to measure the electrical activities in the brain. Brain neurons communicate with each other through electrical impulses, and EEG can help identify potential problems associated with these activities.

EEG signals can be categorized into two main types: surface (scalp) EEG and intracranial (invasive) EEG (iEEG). From now on, the term “EEG” will refer to the surface EEG and “iEEG” will refer to the intracranial EEG. More details about both categories are provided below.

1.2.1 Surface EEG

Surface EEGs record the brain electrical activity using non-invasive electrodes placed along the scalp [184]. EEG measures voltage fluctuations resulting from electrical current within the brain cells [155]. Diagnostic applications often focus either on the EEG semiology (as obtained in the time domain) or on the spectral characteristics in the EEG frequency domain. The former investigates time-series data to look for typical distinguishable patterns linked to an event (e.g., seizure onset, eye blinking, and finger movement).
The latter examines the spectrum of the EEG data to find whether there are variations in any of the prime EEG frequency rhythms (e.g., Delta, Theta, Alpha, Beta, and Gamma).

A routine EEG acquisition typically lasts around 25 minutes, and the EEG signals are recorded by scalp electrodes using a conductive gel that decreases the impedance and improves the quality of the acquired data. Despite its limited spatial resolution, the EEG continues to be a valuable tool in the diagnosis and monitoring of neurological disorders, especially epileptic seizures. It also provides high temporal resolution for brain activities and provides some insights into how the human brain behaves before and between recurrent seizure attacks. This is not possible with many medical imaging techniques such as CT, PET or MRI. EEG is also used to localize the areas of the brain from which a seizure originates for workup of possible seizure surgery.

For most clinical and research applications, the locations at which to place the electrodes on the scalp are specified according to the international 10-20 system [199]. Commonly 19 recording electrodes (plus ground and system reference) are used. Figure 1.1 depicts the locations of scalp electrodes of the international 10-20 system in the context of an EEG test. This internationally recognized method allows EEG electrode placement to be standardized and ensures that inter-electrode spacing is equal and the electrodes cover all brain regions.

Figure 1.1: Electrode locations of the international 10-20 system for EEG recording.

As shown in Figure 1.1, each electrode location is labeled by a letter and a number. The letter identifies the brain region (lobe) from where EEG is recorded. The letters “Fp”, “F”, “T”, “P”, and “O” stand for the Pre-frontal, Frontal, Temporal, Parietal, and Occipital lobes, respectively, while the letter “C” denotes the center of the scalp [58].
The electrodes with even numbers (2,4,6,8) are located on the right side of the head, whereas those with odd numbers are placed on the left side. The electrodes referred to by “z” (zero) rather than numbers are placed on the midline sagittal plane of the skull. These “z” electrodes are commonly used as a ‘reference’ in most of the EEG montages meant to diagnose neurological disorders (e.g., epileptic seizures, sleep disorder, and brain death). The rest of the electrodes, however, are recorded, interpreted, and analyzed for accurate diagnosis of these diseases. Figure 1.2 shows an example of multi-channel surface EEG signals recorded by the international 10-20 system from an epileptic patient. The “10” and “20” refer to the fact that the actual distances between adjacent electrodes are either 10% or 20% of the total front–back or right–left distance of the skull.

Figure 1.2: An example of multi-channel surface EEG signals recorded from an epileptic patient.

1.2.2 Intracranial EEG

When an epileptic patient is being considered for brain surgery, it is a prerequisite to identify the location of the seizure focus (source) with an accuracy better than that provided by surface EEGs. This is because the skull and scalp weaken the electrical brain activities recorded by surface EEG. In this case, doctors tend to implant electrodes directly on the exposed surface of the brain. The signals recorded by these implanted electrodes are referred to as intracranial/invasive EEG (iEEG), subdural EEG (sEEG), or electrocorticography (ECoG), all terms for the same thing.

Figure 1.3: Subdural grids placed on the brain surface for iEEG recording from an epileptic patient.

Unlike EEG, iEEG records the electrical activity directly from the cerebral cortex of the brain. This helps with recording a wide range of EEG frequencies and capturing low voltage components of brain activities [130]. Thus, iEEG can be used to determine the precise location and boundary of the epileptogenic zone inside the brain [13].
Figure 1.3 shows an example of invasive strips placed on the brain surface for iEEG recording from an epileptic patient.

1.3 Motivation

As EEG is the prime signal that measures brain activity, it has been widely used for the diagnosis, detection, and prediction of epileptic seizures. However, the visual inspection of long-term EEG recordings is labor-intensive, costly, and time-consuming. Also, facilitating the immediate intervention of family members, caregivers, and physicians necessitates detecting seizure patterns at the onset [171]; however, this may not be feasible if the seizure onset is not recognized early enough. This motivates the development of automatic, computationally-efficient, and real-time EEG-based seizure detection systems. This would smooth the way for early detection of seizure onsets and lessen the potential risks threatening epileptic patients during and after seizure attacks.

Furthermore, most EEG-based seizure detection systems adopt feature extraction techniques that use hand-crafted EEG features computed in the time domain [138], frequency domain [167], time-frequency domain [202], and sometimes from a combination of multiple domains [140]. In practice, these hand-crafted features face two main challenges. First, they are very sensitive to variations in seizure patterns (as EEG is a non-stationary signal and the seizure semiology varies across different patients and over time for the same patient). Second, the EEG recordings are highly susceptible to body artifacts (e.g., muscle activities and eye blinking) as well as environmental noise. These artifacts distort the EEG data and negatively impact the performance of seizure detection systems [5].
This motivates researchers to develop robust EEG-based methods that maintain reliable seizure detection performance under ideal and real-life conditions.

Automatic, early, and robust EEG-based seizure detection systems can favorably impact the quality of life of epileptic patients as follows:

• In the real world: epileptic patients would wear EEG headsets for real-time monitoring of their brain activities in order to detect and monitor seizures. If a seizure onset is detected early, family members and/or caregivers will be notified so they can help the patient during and after the seizure attack. For example, they can clear hard or sharp objects away from the person, place the person on his side to keep his airway clear, time the seizure length and call for medical help if it lasts too long or the person has repeated seizures, and call for an ambulance if the person has a breathing problem.
• In ambulatory and clinical settings: EEG-based seizure detection devices are also used to determine the laterality and approximate origin of a patient’s seizures. This helps doctors to localize the brain regions that could be artificially stimulated in order to lessen the intensity and frequency of seizure attacks.
• In brain research institutions: continuous recording of brain waves from epileptic patients helps researchers analyze and better understand the brain activities in four different states: Preictal (prior to a seizure), Ictal (during a seizure), Postictal (after a seizure), and Interictal (between recurrent seizures).

Furthermore, around 30% of epileptic patients have drug-resistant epilepsy [68]. They continue to experience seizures despite treatment, and their quality of life is significantly lowered by the anxiety associated with the unpredictable nature of seizures and the consequences therefrom. This motivated researchers to develop seizure prediction systems [52].
The ability to predict seizures ahead of time with high accuracy would bring several benefits: (i) individualized epilepsy treatment would become possible, allowing more tailored therapies; (ii) patients would be warned of an upcoming seizure so they can take precautions and avoid probable injuries; (iii) brain stimulation therapy could be used to lessen the intensity of seizure attacks; and (iv) closed-loop seizure intervention systems could be used to prevent seizures in patients with drug-resistant epilepsy.

This thesis aims to develop fast, low-complexity EEG-based methods for early detection of epileptic seizure onsets. Since EEG interpretation is tedious, costly, and time-consuming, such computationally-simple seizure detection systems would facilitate immediate intervention by caregivers and doctors and would help patients avoid potential complications. This work also aims at developing robust EEG-based algorithms for automatic and reliable detection of epileptic seizures. Since the scalp EEG is highly susceptible to body artifacts and environmental noise, which distort the EEG signal and bury discriminative EEG patterns, such robust algorithms would be helpful in identifying seizure patterns hidden in the corrupted EEG data.
The other objective of this thesis is to develop an efficient and reliable epileptic seizure prediction method, mainly for people with drug-resistant epilepsy. This necessitates a detailed quantitative analysis of the human invasive EEG data preceding seizure attacks, which in turn gives some insights into the dynamic correlation between the different brain regions during preictal and interictal brain states.

1.4 Thesis Contributions

The major contributions of this thesis are as follows:

1.4.1 Developing an Energy-Efficient Wireless EEG Monitoring System for Epileptic Seizure Detection

Wireless EEG monitoring systems have been widely used for remote seizure detection applications. These systems capture, process and transmit the EEG data wirelessly to the server side, where the seizure patterns can be identified. However, the EEG sensor units of these systems face the challenge of short battery lifetime. Up to 70% of the total power consumption is devoted to wireless EEG data transmission alone. To extend the battery lifetime of such systems, previous studies attempted to compress the EEG data at the sensor side, and hence minimize the transmission power. However, the EEG signals reconstructed at the server side are distorted, which seriously degrades the performance of seizure detection algorithms. In an attempt to address these limitations, we introduce a novel EEG data reduction algorithm that saves ~60% of the total power consumption at the sensor side while yielding accurate data reconstruction results at the server side.

• At the sensor side, the dimensionality of the data is reduced by intentionally deleting some data-points randomly. This method is computationally simple and effectively reduces the power consumption needed for both data encoding and data transmission.
• At the server side, the expectation maximization algorithm is adopted to estimate the missing (randomly deleted) data-points.
The performance of the proposed scheme is compared with that of the state-of-the-art methods and is shown to achieve ~60% less power consumption without compromising the seizure detection accuracy.

1.4.2 Designing a Computationally-Efficient Feature Extraction Method for Early Detection of Epileptic Seizures

Early detection of epileptic seizures can favorably impact the quality of life of patients. Unlike seizure detection systems that take over a minute to detect a seizure, early detection has the potential to recognize seizure attacks within seconds of onset. This would allow therapeutic or warning devices to be triggered prior to the onset of disabling clinical symptoms. The family members and caregivers receive an immediate alert through alarms or phone calls so they can help the patient during the seizure attack. Closed-loop therapies (e.g. neurostimulation) could also be used to diminish the disabling symptoms of an epileptic seizure, especially for subjects with drug-resistant epilepsy. Our contributions in the area of early detection of epileptic seizures are as follows:

• We propose a computationally-efficient method for EEG feature extraction. This method uses the least absolute shrinkage and selection operator (LASSO) regression to effectively identify the distinguishable EEG spectral features that are pertinent to epileptic seizures. This helps to improve the interpretability of seizure detection models and makes them more generalizable.
• We deploy a computationally-fast optimization algorithm, known as “randomized coordinate descent”, for estimating LASSO regression coefficients.
The indices of these coefficients are then used as indicators for the distinctive EEG features associated with epileptic seizure activities.

1.4.3 Developing Robust Detection Methods of Epileptic Seizures

Several feature extraction techniques have been developed for EEG-based seizure detection systems. Most of them use hand-crafted features extracted in the time domain, frequency domain, time-frequency domain, and sometimes in multiple domains. However, these domain-based methods encounter two major challenges. First, the non-stationary nature of EEG makes it difficult to define a single seizure pattern, making hand-crafted features less practical in clinical applications. Second, EEG data acquisition systems are very susceptible to a diverse range of body artifacts, such as muscle activities and eye-blinking, as well as environmental noise. All these sources of contamination can alter the genuine EEG features and seriously impact the performance of seizure detection systems.

To address these challenges, we present the following contributions:

• Developing a robust seizure detection method that accurately recognizes epileptic seizures under real-life conditions as well as ideal conditions. A feature extraction method, based on L1-Penalized Robust Regression (L1PRR), is proposed to identify the seizure-associated features in clean and contaminated EEG data. On a clinical EEG dataset, the proposed L1PRR-based method is shown to outperform several existing state-of-the-art methods, and also achieve robust seizure detection performance in the presence of common signal contaminants and ambient noise.
• Designing a robust deep neural network architecture that achieves more reliable seizure detection performance in the presence of ambient noise and artifacts. The proposed architecture uses a recurrent neural network (RNN) with long short-term memory (LSTM) cells to effectively exploit the temporal dependencies in time-series EEG signals.
Comparisons with previously published work indicate that the proposed method achieves the best seizure detection performance under ideal conditions (i.e., EEGs are free of noise and artifacts). The proposed method is likewise demonstrated to maintain a robust performance in the presence of common EEG artifacts and environmental noise, making it more suitable for clinical diagnosis.

1.4.4 Developing an Accurate Prediction Method of Epileptic Seizures

One major shortcoming of previous seizure prediction studies is the high false prediction rate, which limits their application in practice. We aim at developing a seizure prediction system with enhanced performance when compared to those reported in the literature. The system would warn patients and caregivers about when the next seizure will strike. This would be of great benefit to people with epilepsy, especially those who are drug-resistant. Indeed, around 30% of epileptic patients continue to have seizures despite treatment. The ability to predict seizures with high accuracy would make individualized epilepsy treatment possible. Patients also would be warned of an impending seizure so they can take their precautions and avoid any possible injuries.
Closed-loop systems could also be used to terminate upcoming seizures. Our contributions to the area of seizure prediction are as follows:

• Providing an extensive quantitative analysis of the human invasive EEG data directly preceding seizures (preictal iEEG) and also the data between recurrent seizures (interictal iEEG). Such analysis gives some insights into how the human brain regions are correlated during the preictal and interictal brain states.
• Introducing an efficient pre-processing framework for transforming time-series iEEG data into image-like formats that can better leverage the power of convolutional neural networks.
• Proposing a multi-scale convolutional neural network architecture that can automatically learn distinguishable iEEG features from different feature maps simultaneously.
• Developing a seizure prediction algorithm that works reliably for several patients with drug-resistant epilepsy. Our proposed algorithm outperforms the state-of-the-art algorithms for this problem.

1.5 Thesis Outline

This chapter summarizes the foundational material on epilepsy, epileptic seizures, and EEG and illustrates the research problems as well as the major contributions of this thesis. The remaining chapters are organized as follows:

• In Chapter 2, a review of previously developed EEG-based seizure detection and prediction methods is presented.
• In Chapter 3, the problem of energy-efficient EEG tele-monitoring systems is addressed. We introduce an efficient data reduction method that considerably reduces the total power consumption of EEG sensor nodes and yields accurate data reconstruction results. The performance of the proposed method is compared with that of the state-of-the-art methods and is shown to achieve ~60% less power consumption without compromising the seizure detection accuracy.
• In Chapter 4, the problem of early detection of epileptic seizures is addressed.
We propose and show how to extract the most representative seizure-associated EEG features in a computationally-efficient manner, using the LASSO regression algorithm. The seizure detection accuracy and latency (delay) of the proposed method are significantly better than those of the state-of-the-art methods.
• In Chapter 5, the problem of robust detection of epileptic seizures is addressed. In practice, EEG data gets contaminated by different sources of artifacts (e.g., muscle activities and eye-blinking) as well as environmental noise. These artifacts induce serious distortion in the EEG shape and negatively affect the performance of seizure detection systems. We develop a robust EEG feature extraction method that provides the most informative features from both clean and corrupted EEG data. The proposed method is shown to achieve robust performance in the presence of noise and common EEG artifacts.
• In Chapter 6, the problem of robust seizure detection is further addressed. We propose a deep learning method that achieves more robust performance under real-life conditions. The proposed method relies on recurrent neural networks to effectively exploit the temporal dependencies in clean and corrupted time-series EEG signals. We show how our method can achieve more robust seizure detection performance in the presence of common signal contaminants and ambient noise.
• In Chapter 7, the problem of seizure prediction using intracranial EEG data is addressed. We develop a rigorous seizure prediction algorithm that maintains reliable performance against inter- and intra-patient variations. We introduce a multi-scale convolutional neural network architecture for learning different EEG representations simultaneously. Seizure prediction results show that our algorithm outperforms the algorithms proposed in previous studies for people with drug-resistant epilepsy.
• In Chapter 8, we conclude this thesis and provide a summary of our contributions.
We also discuss the limitations of the current study and possible directions for future work.

Chapter 2
Literature Review

Chapter 1 presented the background information on EEG, epilepsy and epileptic seizures and explained the thesis objectives and contributions. In this chapter, the related work in the field of automated EEG-based epileptic seizure detection and prediction is reviewed.

2.1 Epileptic Seizure Detection

EEG-based seizure detection systems aim at effectively analyzing the EEG signals in order to accurately determine whether the epileptic patient is experiencing a seizure or not. A typical EEG seizure onset detection scheme comprises the following steps: (i) EEG pre-processing (e.g., de-noising and artifact removal), (ii) EEG feature extraction and selection (eliciting a set of EEG features that characterizes the seizure patterns well), and (iii) EEG classification (identifying whether the extracted features correspond to seizure or normal EEG activity). Several seizure detection methods have been presented in the literature; the majority of them focus on developing efficient EEG feature extraction and selection techniques. Only a few studies pay particular attention to building customized EEG classification models. In this section, we present a review of the EEG feature extraction and selection methods that have been developed in the time domain, frequency domain, time-frequency domain, and other domains, for the purpose of automated seizure detection.

2.1.1 Time Domain-based Methods

Gotman is one of the pioneers in the field of automatic seizure detection using EEG signals. In [76], Gotman et al. proposed the use of both EEG and iEEG signals to recognize the epileptic semiology in data recorded from 20 epileptic patients. A digital filter was first adopted to suppress the 60Hz EEG artifacts. The EEG signals were then decomposed into half-waves whose durations and amplitudes were used to characterize the normal and epileptic EEG patterns.
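The half-wave idea can be illustrated with a minimal sketch. It assumes a half-wave is simply the stretch of signal between two successive local extrema, and it is not the implementation used in [76] (which may, for example, merge small fluctuations first). The function name `half_wave_features` and the synthetic test signal are hypothetical, introduced only for illustration.

```python
import numpy as np

def half_wave_features(signal, fs):
    """Split a 1-D signal into half-waves between successive local extrema.

    Returns the duration (seconds) and absolute amplitude of each half-wave.
    Simplifying assumption: flat plateaus (zero slope) are not treated as extrema.
    """
    x = np.asarray(signal, dtype=float)
    slope = np.sign(np.diff(x))
    # A local extremum sits where the slope changes sign.
    turning = np.where(slope[:-1] * slope[1:] < 0)[0] + 1
    # Include the first and last samples so every sample belongs to a half-wave.
    points = np.concatenate(([0], turning, [x.size - 1]))
    durations = np.diff(points) / fs          # seconds
    amplitudes = np.abs(np.diff(x[points]))   # same units as the signal
    return durations, amplitudes

# Synthetic example: a 2 Hz sine sampled at 200 Hz for one second.
fs = 200
t = np.arange(fs) / fs
durations, amplitudes = half_wave_features(np.sin(2 * np.pi * 2 * t), fs)
```

For this synthetic sine, the interior half-waves each last 0.25 s and span a peak-to-trough amplitude of 2, which is the kind of duration and amplitude pair a detector could compare between normal and epileptic EEG.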
The study in [143] applied autoregressive analysis to the EEG dataset of the Epilepsy Center at Bonn University for the purpose of producing an informative representation of EEG signals. Thereafter, a multi-layer perceptron (MLP) classifier was utilized to identify the seizure activity, and classification accuracy in the range of 91.00-96.00% was obtained.

Acharya et al. integrated a set of EEG-based entropy features, namely approximate entropy, sample entropy, and phase entropy, into a single feature vector representing the corresponding EEG class [6]. The effectiveness of this set of entropy features was examined using several classifiers. The fuzzy Sugeno classifier (FSC) produced the best seizure detection performance, with 99.40% sensitivity, 100.0% specificity, and 98.10% classification accuracy. Likewise, Kollialil et al. used entropy features in addition to EEG signal energy as representative attributes of EEG activities. These features were used in conjunction with the support vector machine (SVM) classifier, and classification accuracy of 99.66% was obtained [123]. Comparable classification accuracy of 98.67% was achieved in [156] by using the error-correction output codes (ECOC) classification technique.

In an attempt to lessen the computational complexity of seizure detection systems, the authors of [122] utilized the computationally-efficient least square SVM (LSSVM) classifier along with the traditional entropy features. Limited classification accuracy of 82.22% was achieved. In [151], the gray level co-occurrence matrix was used to extract the intrinsic features from the EEG signals for seizure activity recognition. Using the artificial neural network (ANN) classifier, this method achieved a classification accuracy of 90%. Further, the authors of [29] developed a seizure detection method based on EEG statistical features and LSSVM classifier.
The proposed method showed an average detection accuracy of 97.19% and a short detection latency of 0.065 seconds.

In [168], a feature extraction method was developed based on the mean and minimum values of EEG signal energy in successive 1-second EEG epochs. The extracted features were examined on the CHB-MIT EEG dataset using a linear classifier with 60% of the data used for training and 40% for testing. An average seizure detection accuracy, sensitivity, and specificity of 99.81%, 100%, and 99.81% were attained, respectively. It was observed that the mean and minimum EEG signal energies of seizure epochs had larger values than those of non-seizure epochs. The study in [14] presented a patient-specific technique for channel selection based on the histogram of multi-channel EEG signals. In the pre-processing stage, the EEG signals were partitioned into non-overlapping segments, each 10 seconds long. The histogram was then estimated for each EEG segment individually and used for channel selection. The histograms of the selected EEG channels were used to determine whether the EEG segment had ictal or non-ictal activity. This method was also tested on the CHB-MIT database using leave-one-out cross-validation, and a seizure detection performance of 97.14% sensitivity and 98.58% specificity was achieved.

Mursalin et al. also used a combination of time-domain features such as mean, median, minimum, maximum, standard deviation, skewness, kurtosis, first quartile, third quartile, interquartile range, and Hurst exponent [145]. The classification of these features was then carried out by an ensemble of Random Forest (RF) classifiers. The results on the Bonn University EEG dataset showed the effectiveness of the selected EEG features in detecting epileptic seizure activities in all patients. In [53], Dalton et al. developed a single channel-based wireless sensor network that could identify seizure patterns in the long-term EEG recordings of the CHB-MIT database.
They constructed the features of the mean, standard deviation, zero-crossing rate, entropy, root mean square (RMS), and auto-correlation from time domain signals. The accelerometer and gyroscope signals were also used for physical activity monitoring for the purpose of detecting motor seizures. The RMS was found to be the most distinguishable feature that characterized seizure and non-seizure activities. Results showed an average seizure detection sensitivity and specificity of 91% and 84%, respectively.

In brief, time domain-based EEG feature extraction methods are computationally simple and are good candidates for real-time detection of epileptic seizures. They are, however, very sensitive to inter- and intra-patient variations and are also prone to body artifacts and environmental noise, making them a good choice for patient-specific problems only.

2.1.2 Frequency Domain-based Methods

Unlike time domain methods, the frequency domain methods provide insights into the characteristics of the frequency rhythms of EEG signals. EEG features extracted in the frequency domain are generally found to be more informative and robust than those computed from the signals in the time domain.

In [34], Bhople et al. used the fast Fourier transform (FFT) to transfer the time-series EEG signals to the frequency domain. A set of FFT-based EEG features was then extracted and tested using the multi-layer perceptron neural network (MLPNN) and the generalized feed-forward neural network (GFFNN) classifiers. An average seizure recognition accuracy of 100% was achieved. In [16], the long-term EEG signals were first divided into shorter EEG segments of 1-second duration each, and FFT was then applied to these short segments. The magnitude of the EEG segments’ spectra was computed in the frequency range of 1–47 Hz and then used together with the EEG correlation coefficients and eigenvalues to form the feature vector.
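The segment-wise spectral magnitudes described for [16] can be sketched as follows. This is an illustrative sketch rather than the implementation used in [16]: the function names are hypothetical, the sampling rate is assumed to be an integer number of samples per second, the correlation and eigenvalue features are omitted, and a direct discrete Fourier transform is used only to keep the example dependency-free (in practice an FFT routine such as `numpy.fft.rfft` would normally replace the inner sum).

```python
import cmath
import math


def band_magnitudes(segment, fs, f_lo=1.0, f_hi=47.0):
    """Magnitude spectrum of one segment, restricted to [f_lo, f_hi] Hz."""
    n = len(segment)
    magnitudes = []
    for k in range(n // 2 + 1):
        freq = k * fs / n  # frequency of DFT bin k in Hz
        if f_lo <= freq <= f_hi:
            # Direct DFT of bin k (an FFT would compute all bins at once).
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(segment))
            magnitudes.append(abs(coeff))
    return magnitudes


def spectral_features(signal, fs):
    """Split a single-channel signal into non-overlapping 1-second segments
    and return one list of 1-47 Hz magnitudes per segment."""
    seg_len = int(fs)  # samples in one second (assumes integer fs)
    return [band_magnitudes(signal[start:start + seg_len], fs)
            for start in range(0, len(signal) - seg_len + 1, seg_len)]
```

With 1-second segments the frequency resolution is 1 Hz, so each segment contributes 47 magnitude values (one per integer frequency from 1 to 47 Hz) for every channel.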
The effectiveness of these features was examined using a random forest classifier with 3000 trees.

The study in [150] used a combination of time domain and frequency domain features to form a more representative feature vector, which was then fed into an MLPNN for EEG classification. The epilepsy recognition rates achieved by the aforementioned method were 97.17-97.46% for sensitivity, 98.59-98.74% for specificity, and 97.19-97.50% for classification accuracy. Moreover, a feature extraction method based on the Hilbert transform was presented in [45]. It was employed together with the SVM to detect epileptic EEG activities, and a classification accuracy of 97% was achieved.

In [119], Khamis et al. also developed a single-channel, patient-specific EEG-based seizure detection method. The EEG spectral features, “frequency moment signatures”, were used to differentiate between seizure and non-seizure EEG activities. The proposed method also encompassed data filtering for the EEG data recorded by the different channels placed on the right and left brain hemispheres. The power spectral density (PSD) of the signals on both hemispheres was computed and the signature was calculated by subtracting the normalized central moments from the mean PSD values. These signatures were then used to distinguish between a seizure pattern and any other transient or normal activity. A template matching algorithm named “Powell’s direction set” was used for EEG training and classification. An average seizure detection sensitivity of 91% and a false positive rate of 0.02/h were attained.

To conclude, frequency domain methods are adequate choices for long-term EEG data with a large number of samples. They, however, lack the time domain features needed for visual interpretation of EEG recordings. Features extracted from both domains were found to produce very promising seizure detection results. This motivated researchers to develop time-frequency domain-based EEG feature extraction methods.
Such methods can effectively elicit time and frequency domain attributes simultaneously, achieving more reliable seizure detection performance.

2.1.3 Time-Frequency Domain-based Methods

Time-frequency analysis tools such as the short-time Fourier transform and the wavelet transform are found to be very useful for extracting distinguishable features from signals that have a non-stationary nature, like EEG signals. The wavelet transform, however, adapts its time and frequency resolution to the scale being analyzed, whereas the short-time Fourier transform uses a fixed window. It decomposes the signal into approximation and detail sub-bands, and then features are extracted from these sub-bands. The key challenge in wavelet transform-based EEG feature extraction is to find the optimal mother wavelet, the number of decomposition levels, and the type of features that could be extracted from these wavelet sub-bands to characterize seizure EEG activities.

In [74], the wavelet transform is used to decompose EEG signals into their main spectral rhythms. Then, three statistical features are extracted from these rhythms and fed into a back-propagation neural network (BPNN)-based classifier, which achieved 96.70% classification accuracy. In a similar study, Bao et al. adopted the probabilistic neural network (PNN) to recognize the seizure activities; a comparable classification accuracy of 96.70% was achieved [27]. Recently, Gotman et al. made a further significant contribution that used a new frequency range of the iEEG spectrum to improve the detection accuracy of epileptic seizures [24]. The wavelet transform was employed to decompose the iEEG signals into five approximation and detail sub-bands. Then, three types of representative features were extracted from the high frequency (HF) wavelet sub-bands (i.e., 80-500 Hz).
These features were the relative energy, number of peaks and wavelet entropy, and were shown to have considerable potential in detecting seizure onset patterns; a sensitivity of 72% and a false detection rate of 0.7 per hour were achieved.

The authors of [205] also used wavelet decomposition to obtain approximation and detail sub-bands of EEG signals. Subsequently, the statistical features characterizing the behavior of EEG were extracted and tested using an MLPNN classifier. The results show a sensitivity, specificity and classification accuracy of 96%, 94%, and 94.83%, respectively. In [129], wavelet entropy was used together with an artificial neural network classifier and this resulted in an average classification accuracy of 94.50%. The sample entropy was also used in [185] as a representative EEG feature to detect epileptic seizures. It was fed into an extreme learning machine (ELM) classification model, which produced a sensitivity, specificity, and classification accuracy of 97.26%, 98.77%, and 95.67%, respectively.

Furthermore, the authors of [134] proposed an efficient wavelet-based seizure detection method. The EEG data were first analyzed using the wavelet transform, and then the effective features were extracted and tested using an SVM classifier. In [73], a piecewise quadratic (PQ) classifier that used a set of time, frequency, and time-frequency features for epilepsy detection was developed. This classifier resulted in remarkable detection rates of 98.60%, 99.33%, and 98.70% for sensitivity, specificity, and classification accuracy, respectively.

The study in [166] presented a hybrid seizure detection method that used both wavelet and Hilbert transforms. The EEG signals were decomposed for only one decomposition level and the following features were extracted from the wavelet and Hilbert coefficients: mean, maximum, minimum, standard deviation, and average power.
The mother wavelet of Daubechies of order 4 (Db4) was used for EEG analysis as it yielded the highest correlation with the EEG signals under study. Experiments on the Bonn University EEG dataset showed that the features extracted from the Hilbert transform coefficients together with a K-nearest neighbor (KNN) classifier achieved better seizure detection results than those extracted from wavelet coefficients. Average sensitivity and specificity of 100% each were obtained. Similarly, the seizure detection method presented in [219] used the same set of features along with a wavelet neural network (WNN) classifier. The Bonn University database was also used to assess the performance of the proposed method, and seizure detection sensitivity and specificity of up to 98% were achieved.

In [156], Niknazar et al. decomposed the EEG recordings into five levels using the Db4 mother wavelet. The obtained wavelet sub-bands have frequency ranges similar to those of the alpha, beta, delta, theta, and gamma EEG rhythms. A set of statistical features was then computed from the EEG rhythms and tested using an error-correcting output coding (ECOC) classifier; an average classification accuracy of 98.67% was achieved. In [47], Chen et al. also used the Daubechies wavelet to decompose the EEG signals into six levels. Only the sub-bands corresponding to levels 3, 4, 5, and 6 were selected for further processing. The magnitude of these sub-bands was used to discriminate between seizure and non-seizure EEG patterns. By using a KNN classifier, a seizure detection rate of 100% was attained.

In [221], the EEG lacunarity and fluctuation index were used as representative attributes of the seizure EEG activities and the Bayesian linear discriminant analysis (BLDA) classifier was used to assess their performance. The lacunarity was a measure of gappiness in EEG patterns and the fluctuation index evaluated the intensity of the fluctuations in these patterns.
The seizure EEG patterns were found to have larger lacunarity and fluctuation index than non-seizure activities. EEG segments were first decomposed into five wavelet sub-bands, from which only three sub-bands were selected for the EEG feature extraction. Lacunarity, fluctuation index, and other descriptive statistical features were computed from these three wavelet sub-bands. The end-to-end scheme was examined on the Freiburg EEG dataset, and a sensitivity of 96.25% and a false detection rate of 0.13/h were achieved. In a similar study, Abbasi et al. decomposed the EEG signals into five wavelet sub-bands but chose only the first four sub-bands for further feature engineering [3]. The conventional statistical features of maximum, minimum, mean, and standard deviation were computed for each sub-band and then combined into a single feature vector. Using the MLPNN classifier, average accuracy, sensitivity, and specificity of 98.33%, 100.00%, and 97.10% were attained, respectively.

A feature extraction approach based on five-level wavelet decomposition was also presented in [162]. All wavelet sub-bands were used to compute the features of energy, standard deviation, and entropy, and the SVM classifier was adopted for EEG classification. The energy values of the wavelet sub-bands were found to be the most distinguishable features, achieving the highest seizure detection rate of 91.20%. In [120], Khan et al. used the same methodology as [162], but with a different set of features such as relative energy and normalized coefficient of variation (NCOV). Average seizure detection rates of 83.60% accuracy, 100% sensitivity, 91.80% specificity, and 86.70% precision were achieved.

2.1.4 Empirical Mode Decomposition (EMD)-based Methods

Empirical mode decomposition is also a time-frequency analysis tool, but it differs from the short-time Fourier and wavelet transforms, which rely on fixed, predefined bases.
EMD is adaptive in nature and does not require any prior fixed basis for analyzing nonlinear and non-stationary time-series signals such as the EEG [97]. EMD is a nonlinear signal decomposition algorithm which transforms time-series signals into a set of components named “intrinsic mode functions (IMFs)” that maintain the inherent characteristics of the original signals. Many seizure detection methods have used IMFs to distinguish between seizure and non-seizure EEG activities. Several features can be extracted from the IMFs, and even the IMFs themselves can be used as candidate features for epileptic seizure detection.

The Hilbert transform (HT) is commonly used together with EMD, as this transform can effectively find the instantaneous frequency of the signal at a particular instant of time. Such a combination of EMD and HT results in a time-frequency distribution of the signal energy, which enables the identification of the prominent features in time and frequency domains jointly. It is worth highlighting that EMD-based seizure detection methods with an adequate number of decomposition levels were found to surpass wavelet-based methods in terms of seizure recognition accuracy and reliability. For example, Eftekhar et al. took the first initiative in applying EMD to both EEG and ECG signals for the purpose of seizure detection [59]. Their results were comparable to those of the existing time-frequency methods. This motivated researchers to put more effort into investigating how to optimize EMD so as to obtain better seizure detection results.

In [192], the mean value of the absolute values of the EEG IMFs was used as a representative feature for identifying seizure patterns. The effectiveness of such a feature was examined by the MLP classifier on the Freiburg EEG dataset. Results showed 90.69% classification accuracy with only four empirical modes. In [158], Orosco et al. used the energies of the EEG IMFs as discriminative features to distinguish between seizure and non-seizure EEG patterns.
The performance of this method was also tested on the Freiburg dataset and the obtained results were inferior to those of [192]. Average seizure detection sensitivity and specificity of 56.41% and 75.86% were achieved, respectively. The study in [80] used EMD components to compute a bigger set of features including amplitude, skewness, kurtosis, and Shannon’s entropy. These features were used as an input to the linear Bayes classifier and a superior seizure detection accuracy of 98% was achieved.

The study in [30] presented an unsupervised clustering algorithm to differentiate between seizure and non-seizure activities. The Euclidean, Bhattacharyya, and Kolmogorov distance features were computed from the EEG IMFs and then provided as an input to the proposed clustering technique. The proposed detection paradigm was examined on the CHB-MIT EEG database and a remarkable seizure detection accuracy of 98.84% was achieved. In [198], a combination of an EMD-based feature extraction technique and an ANN classifier was used to identify the class label of each EEG epoch in the Bonn EEG dataset. A high seizure recognition accuracy of 96% was achieved. In [26], Bajaj et al. also introduced an EMD-based algorithm for the detection of focal temporal lobe epilepsy. The proposed method first obtained the IMFs by applying the EMD to the EEG signals, then computed the Hilbert transform of the produced IMFs, and finally estimated the instantaneous area from the IMFs’ Hilbert transforms. This algorithm was tested on the Freiburg database and satisfactory seizure onset detection results of 90% sensitivity, 89.31% specificity, and 24.25% false detection rate were achieved.

To conclude, EMD is an adaptive time-frequency analysis technique that can give deep insights into the inherent properties of nonlinear and non-stationary time-series signals. IMFs produced by EMD can be used as distinguishable features for identifying seizure activities in surface and invasive EEG recordings.
Statistical features could also be computed from these IMFs and used for EEG classification. More investigations are still needed to optimize the performance of EMD for efficient and robust detection of epileptic seizures.

2.1.5 Rational Functions-based Methods

The rational transform is also a time-frequency analysis tool that is based on rational functions. Unlike the STFT and wavelet transform, which use fixed basis functions, the rational transform uses basis functions that are adaptive in nature [91]. The particle swarm optimization (PSO) algorithm is often used to identify the optimal rational basis functions for a given dataset [179]. The study in [172] presented a new time-frequency transform called “rational discrete short-time Fourier transform (rDSTFT)”, which relies on rational functions. The developed rDSTFT-based seizure detection method produced a sparse representation of the EEG signals at hand. For each small EEG epoch of 1.5-second duration, the rDSTFT was applied and only the first 32 coefficients were selected for further processing. The stochastic hyperbolic PSO algorithm was adopted to find the optimal number of rDSTFT coefficients for EEG epochs. To separate the seizure from seizure-free activities, the following features were computed for the selected coefficients: the absolute mean value, absolute median value, absolute minimum value, absolute maximum value, and the absolute standard deviation. The effectiveness of the proposed rDSTFT-based feature extraction method was evaluated on the Bonn EEG dataset. Experimental results showed that, for the same number of non-zero coefficients, the proposed method achieves higher seizure detection accuracy than the classical DSTFT and the wavelet transform. A classification accuracy of 99.80% was achieved for classifying ictal and normal EEG patterns.
More importantly, the inverse rDSTFT was found to achieve a lower mean square error (MSE) than the conventional inverse DSTFT. This makes rDSTFT a good candidate for data compression applications.

Despite the encouraging seizure detection results reported in [172], the application of the proposed method was limited to single-channel EEG paradigms only. Thus, the same research team improved their algorithm by using multi-dimensional PSO so that rational functions can be used in multi-channel EEG-based seizure detection systems [174]. They proposed a novel feature extraction technique that relied on a sparse rational decomposition and the local Gabor binary pattern (LGBP). Several classifiers were used to assess the effectiveness of the proposed feature extraction scheme on the multi-channel EEG dataset of CHB-MIT. Experimental results showed an average seizure detection sensitivity, specificity, and false positive rate of 70.40%, 99.10%, and 0.35/h, respectively.

It is worth mentioning that rational functions hold a lot of promise in learning the optimal compact representations of time-series EEG signals. The rational transform coefficients could either be used directly as representative features or used to compute a set of descriptive statistical attributes that could be used to distinguish between the different classes of EEG activities (e.g., seizure and non-seizure). Further investigations are needed to find the rational function bases that suit both surface and invasive EEG signals, as well as the number of non-zero coefficients that represents the original data well and maintains the best seizure detection performance.

2.2 Epileptic Seizure Prediction

Currently, anti-epileptic drugs are given to epileptic patients in dosages sufficient to potentially obviate seizures. These drugs may result in undesirable side effects such as tiredness, stomach discomfort, dizziness, or blurred vision [142].
Even when seizures are well controlled, the patient’s quality of life is significantly lowered by the anxiety associated with the unpredictable nature of seizures and the consequences therefrom. Also, around 30% of people with epilepsy unfortunately continue to have seizures despite taking the drugs. Moreover, many patients continue to experience spontaneous seizures even after the surgical removal of brain tissues causing epilepsy [67]. These factors have motivated researchers to develop epileptic seizure prediction systems.

Recently, several studies have reported promising results in accurately predicting epileptic seizures for people with refractory epilepsy. In [52], the authors developed an individualized method of feature extraction for epileptic seizure prediction. This method was applied to pre-seizure recordings of iEEG signals collected from four patients. An average prediction accuracy of 62.50% was achieved, with a false positive (FP) rate of 0.27/hour. Similarly, Park et al. presented a patient-specific prediction method using the iEEG spectral power features and SVM classification [163]. This approach achieved a high sensitivity of 97.50% and a low FP rate of 0.27/hour. In [111], Janet et al. used a combination of linear and non-linear iEEG features in order to discriminate between pre-seizure and non-seizure activities. They then employed a logistic regression-based classification model and obtained a prediction accuracy of 85.70%. In [213], a patient-specific seizure prediction algorithm that uses multi-channel EEG signals was proposed. The EEG signals were first partitioned into 15-second blocks, and the eigenspectra of space–delay correlation and covariance matrices were then computed and used to differentiate between preictal and interictal EEG patterns. The SVM was used to assess the effectiveness of the selected features and the overall algorithm was examined on 19 patients in the Freiburg EEG dataset.
Results showed that 71 out of 83 seizures (about 85.5%) were predicted.

In [50], Cook et al. designed an embedded seizure advisory system (SAS), where all devices were surgically implanted in 15 patients. This system achieved a satisfactory seizure prediction performance, with sensitivities ranging from 65.00% to 100%. The proposed algorithm, however, did not meet the performance criteria in three patients with refractory focal epilepsy. The SAS system was also removed from one of the patients before sufficient data were recorded. The clinical effectiveness of the proposed algorithm was re-evaluated four months after the device implantation and no significant changes in the prediction performance were noticed. In [201], a convolutional neural network (CNN)-based patient-specific seizure prediction algorithm was developed and tested on different EEG and iEEG datasets. The short-time Fourier transform was first applied to EEG windows of 30-second duration so that the CNN can extract the most distinguishable features from both time and frequency domains. Results showed a seizure prediction sensitivity of 81.4%, 81.2%, and 82.3% as well as a false positive rate of 0.06/h, 0.16/h, and 0.22/h for the Freiburg iEEG dataset, CHB-MIT EEG dataset, and American Epilepsy Society iEEG dataset, respectively. Furthermore, the study presented in [121] used iEEG data of 10 subjects to develop an accurate, fully automated, and adjustable patient-specific seizure prediction method. This method used deep learning technology to characterize both preictal and interictal EEG patterns and distinguish between them. Average seizure prediction sensitivity and time in warning of 69% and 27% were achieved.

Great efforts have also been made to develop high-performance seizure prediction systems based on canine iEEG data. For example, Howbert et al. used a set of iEEG spectral features and the logistic regression classifier for seizure forecasting in three dogs [96].
The results had a sensitivity in the range of 48.20%-91.60% for seizure prediction. Moreover, a promising seizure prediction method was developed by Krieger and Pollak in [209], which detected a clear peak about 2.5 hours prior to seizure attacks. In [22], an efficient seizure forecasting method that relies on a brain connectivity network for EEG channel selection was developed. The seizure onset zones were first identified in bilateral multi-channel iEEG recordings. The EEG electrodes that were recognized as seizure activity sources and sinks were then selected for the extraction of features from continuous EEG recordings in three dogs with focal epilepsy. Using a probabilistic SVM classifier, an average seizure prediction sensitivity of 84.82% was achieved.

Furthermore, the study in [124] proposed a deep learning-based seizure prediction algorithm that used a convolutional neural network for EEG feature extraction. This algorithm was tested on iEEG data collected from five dogs and it achieved results comparable to those of the commonly used methods; however, it is better suited to practical and clinical use. In [180], an SVM-based seizure prediction method was applied to iEEG recordings from five dogs and one patient. The iEEG signals were first divided into smaller non-overlapping windows of 20-second duration. The frequency spectrum of each 20-second EEG window was obtained, and the power of each prime EEG rhythm was computed and used as a feature for EEG classification. The feature vector also included the cross-correlation between the 16 iEEG channels. Empirical results showed the robustness of the proposed method in identifying preictal iEEG patterns. A sensitivity and false-positive rate in the ranges of 90-100% and 0-0.3/day were achieved, respectively.
The authors also suggested that patient-specific systems are more practical for reliable seizure prediction.

Chapter 3
Energy Efficient EEG Monitoring System for Wireless Epileptic Seizure Detection

The previous chapter reviewed the state-of-the-art in automated EEG-based detection and prediction of epileptic seizures. This chapter addresses one of the major challenges faced by wireless seizure detection systems: the limited battery lifetime of EEG sensor units. A wireless EEG device enables ambulatory EEG monitoring outside clinical settings while patients are freely moving around, performing their daily activities. This device, however, consumes a considerable amount of power to acquire, encode, and transmit the EEG data to the server side (where seizure detection is carried out). At present, there are three main EEG monitoring paradigms, which are based on (i) transmitting the entire raw EEG data, (ii) transmitting the compressed EEG data, and (iii) transmitting the EEG features.

In this chapter, we propose to modify all of the above-mentioned paradigms to reduce the total power consumption at the sensor side while maintaining high seizure detection accuracy at the server side. First, we modify the EEG sensor units to include a new module named “missing at random (MAR)”. This module is responsible for deleting some EEG data-points randomly, hence reducing the amount of data that has to be transmitted and, in turn, the energy required for data transmission at the sensor side. Second, we amend the data server to have a new module named “expectation maximization (EM)”, which is used to accurately estimate the missing (intentionally deleted) EEG data-points. Experimental results show that by adding the MAR and EM modules to existing EEG monitoring paradigms, a saving of ~60% in power consumption is achieved. The battery lifetime of these paradigms is doubled, while a superior seizure detection accuracy of 95-99% is achieved.
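To make the two modules concrete, the following sketch pairs a random-deletion step with an EM-based recovery step. It is a simplified illustration under explicit assumptions, not the formulation used in this thesis: the model is assumed to be a bivariate Gaussian over two correlated channels, the data are given as (x, y) sample pairs, and the names `delete_at_random` and `em_impute` are hypothetical.

```python
import random


def delete_at_random(pairs, drop_prob, seed=0):
    """Sensor side: drop each value independently with probability drop_prob."""
    rng = random.Random(seed)
    return [tuple(None if rng.random() < drop_prob else v for v in pair)
            for pair in pairs]


def em_impute(pairs, n_iter=50):
    """Server side: fill None entries, assuming a bivariate Gaussian model."""
    xs = [x for x, _ in pairs if x is not None]
    ys = [y for _, y in pairs if y is not None]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs) / len(xs)
    syy = sum((y - my) ** 2 for y in ys) / len(ys)
    sxy = 0.0  # initial guess: the two channels are uncorrelated
    n = len(pairs)

    def expect(x, y):
        """Return (x_hat, y_hat, var_x, var_y, cov_xy) for one sample, where
        the last three terms are the conditional (co)variances of the
        missing entries under the current parameters."""
        if x is None and y is None:
            return mx, my, sxx, syy, sxy
        var_x = var_y = 0.0
        if x is None:
            x = mx + sxy / syy * (y - my)
            var_x = sxx - sxy ** 2 / syy
        elif y is None:
            y = my + sxy / sxx * (x - mx)
            var_y = syy - sxy ** 2 / sxx
        return x, y, var_x, var_y, 0.0

    for _ in range(n_iter):
        stats = [expect(x, y) for x, y in pairs]  # E-step
        new_mx = sum(s[0] for s in stats) / n     # M-step starts here
        new_my = sum(s[1] for s in stats) / n
        sxx = sum((s[0] - new_mx) ** 2 + s[2] for s in stats) / n
        syy = sum((s[1] - new_my) ** 2 + s[3] for s in stats) / n
        sxy = sum((s[0] - new_mx) * (s[1] - new_my) + s[4]
                  for s in stats) / n
        mx, my = new_mx, new_my

    # Observed values pass through unchanged; missing ones get their
    # conditional mean under the fitted model.
    return [expect(x, y)[:2] for x, y in pairs]
```

In this sketch the sensor would call `delete_at_random` before transmission and the server would call `em_impute` on the received pairs. The design choice worth noting is that the deletion step needs only a random number per sample, while all model fitting happens at the server, which mirrors the division of labor described above.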
Parts of the work presented in this chapter have been published in the proceedings of the IEEE International Conference on Machine Learning and Applications (ICMLA) in 2016 [109].

3.1 Introduction

Wireless EEG monitoring systems have been used for remote seizure detection applications. These systems capture, process and transmit the EEG data wirelessly to the server side, where the data is analyzed to detect epileptic seizures [43]. Several wireless EEG-based seizure detection methods have been developed and shown to achieve promising seizure detection performance [37, 102, 135, 175, 216]. The success of such methods holds promise for better epilepsy control. For example, these systems may be used to alert family members and/or caregivers when the seizure begins so they can take care of the patient during and after the seizure attack. They may help clear the area around the patient of sharp or hard objects to prevent injury, turn the patient on his side to ensure that his airway remains clear during the seizure, and time the seizure and call for medical help if any complications arise [188].

In order for wireless seizure detection systems to have clinical value in today’s epilepsy monitoring and management, it is crucial to develop a reliable method of processing and transmitting ambulatory EEG signals. With recent advances in wireless and electronics technologies, ergonomic, lightweight, and comfortable designs of wireless EEG headbands have become an increasingly viable alternative to the traditional wired devices for EEG monitoring. A wireless EEG sensor unit is a miniaturized, battery-powered device that captures, processes, and transmits EEG signals wirelessly to the server (receiver) side, where the data is stored or further data analysis is carried out. A major limitation of such wireless EEG devices is their limited battery lifetime.
For a typical EEG montage with 24 EEG electrodes, a sampling rate of 400 Hz, and a 16-bit analog-to-digital converter, a data rate of about 150 kbps is generated (24 × 400 × 16 = 153,600 bits per second). Given such a high data rate, the traditional way of transmitting the entire raw EEG data to the server side is no longer feasible. This is because wireless transmission is extremely power-hungry. As shown in [215], wireless EEG data transmission accounts for ~70% of the total power consumption of wireless EEG devices.

To lessen the power consumption in wireless EEG data transmission, the size of the data that needs to be transmitted should be significantly reduced. A possible technique is to apply data compression to the raw EEG signals before their transmission. This can be done by deploying any of the compressive sensing or wavelet transform compression techniques [42, 54]. Dynamic EEG channel selection, as an alternative data reduction method, has also been used for wireless seizure detection applications [64, 181]. Another recent data reduction approach is to perform on-board (on-sensor) feature extraction and only transmit the EEG features that are pertinent to epileptic seizures [48, 102]. However, the above-mentioned data reduction methods encounter two main challenges. First, some of these methods are computationally expensive, and the power they consume to process the EEG signals at the sensor side is comparable to that saved in wireless transmission. Second, and most importantly, the reduction of transmitted data may also result in a non-trivial loss of EEG content, which negatively affects the seizure detection performance.

To address these challenges, this chapter introduces a computationally simple, energy-efficient, and pseudo-lossless EEG data reduction method for wireless seizure detection applications. This method achieves less power consumption than state-of-the-art methods while achieving superior seizure detection performance.
At the sensor side, the size of the data is reduced by intentionally deleting some data-points at random. This process is computationally simple and can effectively reduce the power consumption in data transmission, and hence extend the battery life. At the server side, a machine learning algorithm, named expectation maximization, is employed to recover the missing (randomly deleted) data-points. The proposed method can be applied to three different EEG monitoring frameworks, which are (i) transmission of raw EEG data, (ii) transmission of compressed EEG data, and (iii) transmission of EEG features. We show that, for each framework, the proposed method can considerably reduce the total power consumption of the wireless EEG devices while maintaining high seizure detection performance. A reduction of ~60% in power consumption was achieved while obtaining a seizure detection accuracy of 95-99%.

3.2 Research Background

Only a few studies have been conducted on energy-efficient EEG monitoring systems for wireless seizure detection applications [153]. In [153], Nia et al. developed an energy-efficient scheme for a personal health monitoring system that uses a set of different biomedical sensors such as accelerometer, blood pressure, heart rate, and EEG sensors. The proposed scheme incorporated compressive sensing, sample aggregation, and anomaly-driven transmission to reduce the power consumption of wireless EEG transmission. Experimental results demonstrated that the proposed scheme can achieve considerable energy savings by simply accumulating the sensor data before transmitting them to the server side.

In [181], Shih et al. proposed an EEG channel (sensor) selection method for energy-efficient ambulatory seizure monitoring applications. They used a machine learning algorithm to automatically identify the EEG channels that include information relevant to seizure detection.
Using a clinical EEG dataset taken from 16 patients, the proposed algorithm was examined and its performance was compared to an earlier study. Results demonstrated that the proposed algorithm can effectively reduce the number of EEG channels from 18 to 6 while maintaining comparable seizure detection accuracy (97%). The average detection latency, however, increased from 7.8 seconds to 11.2 seconds. In a similar study [64], Faul et al. presented a dynamic EEG channel selection method to reduce the overall power consumption in wireless seizure detection systems without compromising the detection accuracy. Different combinations of EEG channels were tested and the combination that achieved the best seizure detection rate was selected for further analysis. Experimental results showed that the proposed channel selection method achieves power savings of up to 47% without affecting the seizure detection performance.

In [48], Chiang et al. introduced energy-efficient data reduction methods for reducing the transmitted data in a wireless EEG seizure detection system. They studied two data reduction methods, compressive sensing and on-board feature extraction. The achieved performance was assessed in terms of seizure detection accuracy and total power consumption (the trade-offs between the detection accuracy and power consumption were also discussed). The results demonstrated that transmitting only the EEG features that are pertinent to seizure patterns can significantly reduce the power consumed by wireless transmission. This helped extend the battery lifetime of the sensor node by a factor of 14 while maintaining the same seizure detection performance as the traditional method (with a seizure detection sensitivity of 95%).

In [102], Hussein et al. developed an on-board data reduction approach that extracts low-complexity, high-level, application-based EEG features at the sensor side.
Specifically, the EEG spectrum was segmented into five frequency sub-bands; numerous combinations of these sub-bands were selected as feature vectors and the EEG classification was carried out using k-nearest neighbor. Simulations revealed that the alpha and delta EEG rhythms constructed the most representative feature vector needed for accurate detection of epileptic seizures. A satisfactory classification accuracy of 92.47% was obtained. Moreover, the proposed approach was found to outperform conventional data streaming and compression methods in terms of total power consumption and seizure detection performance.

3.3 EEG Feature Extraction and Classification

3.3.1 EEG Data and Subjects

The proposed energy-efficient method was tested on the EEG dataset provided by Bonn University [19]. In this study, we address the three-class classification problem between the following EEG categories: Normal EEG recorded from five healthy subjects, Interictal EEG recorded from five patients diagnosed with epilepsy during seizure-free intervals, and Ictal EEG recorded from five patients experiencing active seizures. Each EEG class has 100 single-channel EEG signals, each of 23.6-sec duration. All the EEG signals have been filtered, amplified, sampled at 173.6 Hz, and digitized using a 12-bit analog-to-digital converter (ADC).

3.3.2 EEG Feature Extraction

Feature extraction is a dimensionality reduction process that removes the redundancy in the raw data in order to facilitate the subsequent analysis and classification processes and, in some cases, to lead to better human interpretation. Efficient feature extraction methods should thus be developed to eliminate data ambiguity and decrease the computational cost of the classification problem. In this chapter, we use a conventional frequency-domain feature extraction method to examine the effectiveness of the proposed energy-efficient methods.
The deployed feature extraction method has low computational complexity and results in distinguishable EEG features that properly characterize the original data.

The captured EEG signals are first transformed to the frequency domain using the fast Fourier transform (FFT), and then typical features are computed from the EEG rhythms Delta ($\delta$), Theta ($\theta$), Alpha ($\alpha$), Beta1 ($\beta_1$), Beta2 ($\beta_2$), and Gamma ($\gamma$). The frequency ranges of these rhythms are 0.5-4, 4-8, 8-13, 13-22, 22-35, and >35 Hz, respectively. The mean, median, minimum, maximum, standard deviation, skewness, kurtosis, and average power are computed from each EEG rhythm and then combined to form the feature vector.

Figure 3.1 shows the frequency spectra of noisy and clean EEG signals. The noisy signal is corrupted with synthetic muscle artifacts and its signal-to-noise ratio (SNR) is 0 dB. The figure clearly shows that the spectrum of the noise-free EEG signal is localized in the low-frequency range of 0-50 Hz, while the spectrum of the noisy EEG signal is spread over the entire frequency range. This motivated us to use the EEG features extracted from the EEG rhythms below 50 Hz. Our experiments have shown that features extracted from only the $\delta$, $\theta$, and $\beta_1$ rhythms yield a feature vector that is representative enough for accurate detection of epileptic seizures.

[Figure 3.1: Frequency spectra of noisy and clean EEG signals. Axes: Frequency (Hz) versus Magnitude (V, scale ×10^4); curves: noisy EEG (SNR = 0 dB) and clean EEG.]

3.3.3 EEG Classification

As clarified in Section 3.3.1, we address the classification problem between normal (non-seizure) EEG, interictal (between seizures) EEG, and ictal (during a seizure) EEG signals. Given that each EEG class has 100 signals, a total of 300 EEG signals were used for training and testing the proposed seizure detection method.
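The band-wise feature computation described in Section 3.3.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: a naive DFT stands in for the FFT, the statistics are taken over the magnitude spectrum inside each band, kurtosis is the plain (non-excess) fourth standardized moment, and all function names are hypothetical.

```python
import cmath
import math
import statistics

FS = 173.6  # sampling rate of the recordings (Hz), as stated in Section 3.3.1
# The three rhythms retained in this chapter (Hz): delta, theta, beta1.
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "beta1": (13.0, 22.0)}


def band_magnitudes(signal, lo, hi, fs=FS):
    """Magnitude spectrum restricted to lo <= f < hi (naive DFT, illustrative)."""
    n = len(signal)
    mags = []
    for k in range(1, n // 2):
        freq = k * fs / n
        if lo <= freq < hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            mags.append(abs(coeff))
    return mags


def describe(values):
    """Mean, median, min, max, std, skewness, kurtosis, average power."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std > 0:
        skew = sum(((v - mean) / std) ** 3 for v in values) / len(values)
        kurt = sum(((v - mean) / std) ** 4 for v in values) / len(values)
    else:  # degenerate band: all magnitudes equal
        skew = kurt = 0.0
    power = sum(v * v for v in values) / len(values)
    return [mean, statistics.median(values), min(values), max(values),
            std, skew, kurt, power]


def feature_vector(signal, fs=FS):
    """Concatenate the 8 statistics of each of the 3 bands (24 values)."""
    features = []
    for lo, hi in BANDS.values():
        features.extend(describe(band_magnitudes(signal, lo, hi, fs)))
    return features


# Usage: a synthetic 6 Hz tone should put its spectral peak in the theta band.
epoch = [math.sin(2 * math.pi * 6.0 * t / FS) for t in range(256)]
features = feature_vector(epoch)
```

Eight statistics over three rhythms give the 24-element feature vector used by the classifier. Index 3, 11, and 19 of `features` hold the per-band maxima for delta, theta, and beta1 in this layout.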
For every EEG signal, feature extraction was carried out and the selected features were concatenated to construct a 24-element feature vector.

To determine whether a newly observed feature vector is representative of normal, interictal, or ictal activity, a multi-class classification model was needed. We examined the performance of several classifiers and found that random forest (RF) achieves the best seizure detection performance. The RF integrates a set of tree predictors, each of which has its own weight and is considered an individual classifier [39]. The overall classification output is computed from the outputs of all trees; in principle, the class is identified by the vote of the majority of trees. In this study, an RF classifier with 10 trees was used in the classification of all feature types.

The classification performance of the proposed wireless seizure detection systems was evaluated on a per-subject basis using leave-one-out cross-validation [118], which works as follows: in each round, the feature vectors from all but one of the subjects were used as the training set to train the classifier. The feature vectors from the withheld subject were then used to test the classifier. This process was repeated until the data of every subject had been withheld once.
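The per-subject leave-one-out procedure described above can be sketched as an index generator. This is a hedged illustration: the function name is hypothetical, and the even split of 20 feature vectors per subject is an assumption made only to build a small usage example.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx) once for every distinct subject."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical layout: 15 subjects with 20 feature vectors each (300 in total).
subject_ids = [s for s in range(15) for _ in range(20)]
folds = list(leave_one_subject_out(subject_ids))
```

Each fold trains on the vectors of 14 subjects and tests on the remaining one, so no subject contributes to both sets within a round.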
This test reflects the ability of the seizure detection method to generalize from the training set and to classify new, unobserved data.

The seizure detection performance was assessed in terms of:
• Sensitivity, which measures the proportion of actual positives that are correctly identified as such (e.g., the percentage of seizure epochs that are correctly classified as seizure epochs by the classifier).
• Specificity, which measures the proportion of actual negatives that are correctly identified as such (e.g., the percentage of non-seizure epochs that are correctly classified as non-seizure epochs by the classifier).
• Classification Accuracy, which measures the number of correct predictions divided by the total number of predictions, multiplied by 100 to express it as a percentage.

3.4 Proposed Energy-Efficient Method

This section demonstrates how to encode the data at the sensor side so that the total power consumption is reduced by up to ~60%. It also demonstrates how to decode the data at the server side in order to retrieve the original missing data.

3.4.1 Missing at Random – Sensor Side

To increase the lifetime of the sensors of a wireless EEG monitoring system, their power consumption should be significantly reduced. Thus, we propose to modify the existing EEG sensor units to include a new module named "missing at random (MAR)" [133]. This module deletes some EEG data-points at random and thereby reduces the size of the EEG data that needs to be transmitted to the data server (where the data is recovered and seizure detection is carried out). This data reduction method helps lessen the power consumption of wireless transmission with no increase in the processing power consumed by the EEG sensor nodes.

Figure 3.2 depicts an example of the resulting patterns of missing values for raw EEG data, where the missing data-points (variables) occur at random locations. The observed and missing data-points are indicated by white and red colors, respectively.
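The random deletion performed by the MAR module can be sketched as follows. This is a minimal illustration, assuming that a deleted data-point is simply marked with `None`; the function name is hypothetical, and how the positions of the deleted points are communicated to the server is not specified in the text.

```python
import random


def missing_at_random(epoch, fraction, rng):
    """Copy `epoch`, replacing round(fraction * len(epoch)) random entries by None."""
    n_missing = round(fraction * len(epoch))
    dropped = set(rng.sample(range(len(epoch)), n_missing))
    return [None if i in dropped else x for i, x in enumerate(epoch)]


# Usage: a 50-sample epoch with 30% of its data-points deleted.
rng = random.Random(0)
epoch = [float(i) for i in range(50)]
incomplete = missing_at_random(epoch, 0.30, rng)
kept = [x for x in incomplete if x is not None]  # only these would be transmitted
```

With 50 data-points and a 30% missing rate, 15 points are deleted and 35 remain, which matches the example pattern of Figure 3.2.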
The vertical axis of Figure 3.2 corresponds to 100 EEG epochs, while the horizontal axis corresponds to the data-points in each EEG epoch. For convenience, only 0.3 seconds (50 EEG time instances) are plotted. In this figure, each EEG epoch is missing 30% of its data-points, i.e., 15 out of 50 EEG data-points are missing.

[Figure 3.2: Missing value patterns of 100 EEG epochs.]

3.4.2 Expectation Maximization – Server Side

To classify the data at the server side, the missing values should be estimated first. Many statistical methods have been employed to interpolate incomplete data. Expectation maximization (EM) is one of the most promising algorithms that can effectively estimate missing entries [36]. Given a statistical model, EM is utilized to estimate the missing data $Z$ given the observed data $X$. This can be formulated as finding the model parameter $\xi$ such that the conditional probability $P(X|\xi)$ is maximized. $P(X|\xi)$ can be represented in terms of the missing values $z_i$ as follows:

\[ P(X|\xi) = \sum_{z_i} P(X, z_i|\xi) \tag{3.1} \]

where $\xi$ denotes the parameters of the probabilistic model we try to find.

EM finds the maximum likelihood estimate of the above-mentioned marginal likelihood by iteratively applying the following two steps:

1. Expectation step (E-step): Estimate $Q(\xi|\xi_t)$, the conditional expectation of the log-likelihood based on the current estimate of the model parameters $\xi_t$:

\[ Q(\xi|\xi_t) = E_{z|X,\xi_t}\left[\log\left(\sum_{z_i} P(X, z_i|\xi)\right)\right] \tag{3.2} \]

2. Maximization step (M-step): Calculate the parameter that maximizes this expected log-likelihood with respect to $\xi$:

\[ \xi_{t+1} = \arg\max_{\xi} Q(\xi|\xi_t) \tag{3.3} \]

[Figure 3.3: Proposed energy-efficient EEG monitoring methods for wireless epileptic seizure detection. (A) MAR is applied to the entire raw EEG data before transmission, (B) MAR is applied to the compressed EEG data before transmission, and (C) MAR is applied to the distinctive EEG rhythms before transmission. EM is used at the server side to retrieve the missing values.]

The EM method implemented in the IBM SPSS software is used to recover the missing EEG data-points. The performance of the EM algorithm is examined at different proportions of missing data, from 10% to 50%.

3.5 EEG Data Transmission in a Wireless Seizure Detection System

The scientific literature reports three main frameworks of wireless EEG monitoring systems. The first framework captures the raw EEG data and sends it as is to the server side. The second framework uses a compression method to reduce the data size and then transmits the compressed data to the server side (where data reconstruction and analysis are performed). The third framework applies feature extraction on the sensor side and then transmits only the selected features to the server side.

In this study, we modify the three above-mentioned frameworks to reduce their overall power consumption at the sensor side while maintaining high seizure detection accuracy at the server side. Figure 3.3 depicts the developed frameworks. The sensor side is amended to include the MAR module, which deletes some EEG data-points at random.
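The E-step and M-step of Section 3.4.2 can be illustrated on a deliberately small case. The sketch below is not the SPSS routine used in this work: it assumes a bivariate Gaussian model in which one variable `x` is fully observed and a correlated variable `y` has randomly deleted entries (marked `None`). The function name and the toy data are hypothetical.

```python
def em_impute(x, y, n_iter=200):
    """Fill the None entries of y by EM under a bivariate Gaussian model of (x, y)."""
    n = len(x)
    seen = [v for v in y if v is not None]
    # x is complete, so its mean and variance are fixed by the data.
    mu_x = sum(x) / n
    s_xx = sum((v - mu_x) ** 2 for v in x) / n
    # Initialise the y-related parameters from the observed entries only.
    mu_y = sum(seen) / len(seen)
    s_yy = sum((v - mu_y) ** 2 for v in seen) / len(seen)
    s_xy = 0.0

    def expectations():
        """E-step: expected y and y^2 for every row under the current parameters."""
        slope = s_xy / s_xx
        resid_var = s_yy - slope * s_xy
        ey, ey2 = [], []
        for xi, yi in zip(x, y):
            if yi is None:
                m = mu_y + slope * (xi - mu_x)
                ey.append(m)
                ey2.append(m * m + resid_var)
            else:
                ey.append(yi)
                ey2.append(yi * yi)
        return ey, ey2

    for _ in range(n_iter):
        ey, ey2 = expectations()
        # M-step: re-estimate the parameters from the expected statistics.
        mu_y = sum(ey) / n
        s_yy = sum(ey2) / n - mu_y ** 2
        s_xy = sum(a * b for a, b in zip(x, ey)) / n - mu_x * mu_y
    return expectations()[0]


# Toy usage: y depends linearly on x and 30% of y has been deleted.
x = [float(i) for i in range(20)]
truth = [2.0 * v + 1.0 for v in x]
y = list(truth)
for i in (2, 5, 8, 11, 14, 17):
    y[i] = None
recovered = em_impute(x, y)
```

In this noise-free toy case the iterations converge to the exact linear relation, so the deleted entries are recovered almost perfectly. Real EEG data is noisy and higher-dimensional, so the recovery there is approximate.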
On the server side, the EM algorithm is adopted to estimate the values of the missing EEG data-points. To evaluate the effectiveness of the proposed energy-efficient scheme, the detection accuracy and total power consumption are evaluated for the three main frameworks.

3.5.1 Transmission of Entire Raw EEG Data

The traditional framework of wireless EEG monitoring systems is to transmit the entire raw data (RD) to the server side. The main drawback of this system lies in the massive power consumed by wireless transmission. The top branch of Figure 3.3 depicts how we modified this framework (by adding a MAR module at the sensor side and an EM module at the receiver side) to maintain high seizure detection performance with much less power consumption. The total power consumption of the RD framework, denoted by $P^{RD}_{tot}$, is computed as:

\[ P^{RD}_{tot} = P^{RD}_{E} + P^{RD}_{T} \tag{3.4} \]

where $P^{RD}_{E}$ denotes the power consumed by encoding (acquiring the data and amplifying it) and $P^{RD}_{T}$ is the power required for data transmission.

For a multi-channel EEG monitoring system, $P^{RD}_{E}$ is given by:

\[ P^{RD}_{E} = C\,(P_{Amp} + P_{ADC}) \tag{3.5} \]

where $C$ is the total number of EEG channels (electrodes), and $P_{Amp}$ and $P_{ADC}$ correspond to the power consumption of the amplifier and the analog-to-digital converter, respectively.

Also, the power required for data transmission, $P^{RD}_{T}$, is given by:

\[ P^{RD}_{T} = C\,(f_s R J) \tag{3.6} \]

where $f_s$ is the sampling rate, $R$ is the ADC resolution, and $J$ is the transmission energy per bit.

After modifying the RD framework to include the MAR module, the power needed for transmission is scaled by a factor of $L/N$ to be:

\[ P^{RD}_{T} = C\left(\frac{L}{N} f_s R J\right) \tag{3.7} \]

where $L$ and $N$ are the lengths of the missing (incomplete, post-MAR) data and the original data, respectively.

From equations (3.5) and (3.7), the total power consumption of the RD framework is then given by:

\[ P^{RD}_{tot} = C\left(P_{Amp} + P_{ADC} + \frac{L}{N} f_s R J\right) \tag{3.8} \]

3.5.2 Transmission of Compressed EEG Data

EEG compression techniques are utilized to reduce the data size and hence the transmission power. Compressive sensing (CS) is one of the most effective techniques that have
been used for EEG data compression [56]. The compression rate (CR) is computed as N:M, where $N$ is the length of the original data and $M$ is the length of the compressed data produced by CS. In this study, a CR of 5:1 is used so that the transmission power is reduced to one fifth of its original value. At the server side, a CS reconstruction method is used to recover the original signal. The middle branch of Figure 3.3 shows the modified CS-based EEG encoding framework after adding the MAR and EM modules. The overall power consumption of such a compressed data (CD)-based EEG monitoring framework, denoted by $P^{CD}_{tot}$, is given as:

\[ P^{CD}_{tot} = P^{CD}_{E} + P^{CD}_{T} \tag{3.9} \]

where $P^{CD}_{E}$ and $P^{CD}_{T}$ correspond to the encoding and transmission power of the CD-based framework.

Since compressing the data using compressive sensing necessitates deploying the random number generator (RNG) and matrix multiplication (MM) modules, the encoding power $P^{CD}_{E}$ is computed as:

\[ P^{CD}_{E} = C\,(P_{Amp} + P_{ADC}) + P_{RNG} + P_{MM} \tag{3.10} \]

where $P_{RNG}$ and $P_{MM}$ denote the power consumption of the RNG and MM blocks, respectively.

The transmission power is derived as:

\[ P^{CD}_{T} = C\left(\frac{1}{CR}\,\frac{L}{M}\, f_s R J\right) \tag{3.11} \]

where $CR$ denotes the compression rate and equals $N/M$.

By taking the sum of equations (3.10) and (3.11), the total power consumption of the modified CD-based EEG monitoring framework is obtained as:

\[ P^{CD}_{tot} = C\left(P_{Amp} + P_{ADC} + \frac{1}{CR}\,\frac{L}{M}\, f_s R J\right) + P_{RNG} + P_{MM} \tag{3.12} \]

3.5.3 Transmission of EEG Features

Recently, the studies in [48] and [102] have presented novel EEG encoding schemes. They applied feature extraction (FE) to the EEG data at the sensor side. Once the EEG data is captured, the features relevant to seizures are extracted and sent to the server side (where EEG classification is performed). The proposed schemes yield considerable power savings in both data encoding and transmission. They, however, showed limited seizure detection performance.
In this study, we modify these on-sensor processing schemes to incorporate the MAR and EM modules, with the ultimate objective of saving more power and achieving better seizure detection accuracy. MAR is applied to the captured EEG features to further reduce the transmitted data size and, consequently, the transmission power. At the server level, EM is used to estimate the missing EEG features. The bottom branch of Figure 3.3 describes the modified on-sensor feature extraction framework.

The feature extraction method described in Section 3.3.2 is implemented at the sensor level. The frequency spectrum of the EEG data is first obtained, and the EEG rhythms $\delta$, $\theta$, and $\beta_1$ are then transmitted to the server side. There, the features that are pertinent to seizures are computed from these rhythms and used as inputs to the RF classifier.

Thus, the encoding power consumption of the EEG features (EEGF)-based encoding framework is expressed as [102]:

\[ P^{EEGF}_{E} = C\left(P_{Amp} + P_{ADC} + \frac{3N}{2}\, S \log_2(N)\right) \tag{3.13} \]

where $\frac{3N}{2}\log_2(N)$ is the computational complexity incurred by the FFT and $S$ is the net power required to perform one FFT instruction.

The transmission power, denoted by $P^{EEGF}_{T}$, is computed as:

\[ P^{EEGF}_{T} = C\left(\frac{F}{N}\,\frac{L}{F}\, f_s R J\right) \tag{3.14} \]

where $F$ represents the number of frequency components in the EEG rhythms $\delta$, $\theta$, and $\beta_1$ (i.e., 390).

Thus, the overall system power consumption of the EEGF-based EEG monitoring framework is represented as:

\[ P^{EEGF}_{tot} = C\left(P_{Amp} + P_{ADC} + \frac{3N}{2}\, S \log_2(N) + \frac{L}{N} f_s R J\right) \tag{3.15} \]

3.6 Power Consumption Evaluation and Seizure Detection Performance

To evaluate the effectiveness of the proposed energy-efficient method, the total system power consumption is computed for each framework individually. The standard performance metrics of sensitivity, specificity, and classification accuracy are also evaluated at the server side.
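The three total-power models of equations (3.8), (3.12), and (3.15) can be written as small functions to see how the terms combine. This is a hedged sketch: the function and argument names are hypothetical, the inputs in the usage lines are illustrative, and the absolute values returned depend on simulation settings that are not fully stated in the text, so the sketch is not meant to reproduce any tabulated figure.

```python
import math


def p_total_rd(C, p_amp, p_adc, fs, R, J, L, N):
    """Eq. (3.8): raw-data framework with L of N data-points transmitted."""
    return C * (p_amp + p_adc + (L / N) * fs * R * J)


def p_total_cd(C, p_amp, p_adc, fs, R, J, L, M, cr, p_rng, p_mm):
    """Eq. (3.12): compressed-data framework with compression rate cr = N/M."""
    return C * (p_amp + p_adc + (1.0 / cr) * (L / M) * fs * R * J) + p_rng + p_mm


def p_total_eegf(C, p_amp, p_adc, fs, R, J, L, N, S):
    """Eq. (3.15): feature framework; (3N/2) log2(N) FFT operations of cost S."""
    return C * (p_amp + p_adc + 1.5 * N * S * math.log2(N) + (L / N) * fs * R * J)


# Illustrative inputs in SI units (watts, hertz, joules per bit).
base = dict(C=24, p_amp=2.9e-6, p_adc=0.2e-6, fs=173.6, R=12, J=50e-9)
rd_full = p_total_rd(L=100, N=100, **base)  # no deletion
rd_mar30 = p_total_rd(L=70, N=100, **base)  # 30% of the data-points deleted
cd_full = p_total_cd(L=20, M=20, cr=5.0, p_rng=3e-6, p_mm=352e-6, **base)
```

Only the transmission term scales with the retained fraction, so deleting 30% of the data-points removes 30% of the transmission power while the acquisition and encoding terms stay fixed.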
MATLAB is used to quantify the power consumption values at the sensor side, and the WEKA software is used to assess the seizure detection performance at the server side.

3.6.1 Raw EEG Data Streaming

For the raw EEG streaming framework, the wireless EEG sensor unit is in charge of data acquisition and transmission to the server side, where the diagnosis and detection processes are conducted. As shown in the top branch of Figure 3.3, we modify the traditional model of raw EEG streaming by integrating the MAR and EM modules. The MAR module is adopted to intentionally delete some data-points and hence lessen the size of the EEG data that needs to be transmitted to the data server. The EM module is employed at the server side to recover the missing values. Figure 3.4(a) shows a raw EEG signal of 23.6-sec duration. The impact of the MAR module is depicted in Figure 3.4(b), where the EEG signal is missing 30% of its samples. Figure 3.4(c) demonstrates the efficacy of the EM method in recovering the original EEG signal. The root mean square error (RMSE) between the original and recovered EEG signals is found to be 0.012, which verifies that both signals are similar.

To demonstrate the power saving capability of the modified raw EEG streaming framework, the overall system power consumption is evaluated and compared to those of the state-of-the-art methods. The values of the system parameters $C$, $P_{Amp}$, $P_{ADC}$, $R$, and $J$ are 24, 2.9 µW, 0.2 µW, 12 bit/sample, and 50 nJ/bit, respectively [144, 176]. The standard seizure detection performance metrics are also evaluated and listed together with the power consumption values in Table 3.1. The first two rows display the sensitivity, specificity, classification accuracy, and total power consumption achieved by the state-of-the-art methods presented in [48] and [102].
Despite the high seizure detection accuracy achieved by Chiang's method [48], the EEG sensor unit is found to consume a considerable total power of 32.50 mW. Hussein's method achieves a comparable power consumption with lower seizure detection performance [102].

[Figure 3.4: Recovery of raw EEG data using EM: (a), (b) and (c) correspond to the original, missing and recovered raw EEG data, respectively. Axes: Time (sec) versus Amplitude (uV).]

Table 3.1: Seizure detection performance and total power consumption of the raw data (RD) model

Method                      Sensitivity (%)   Specificity (%)   Accuracy (%)   $P^{RD}_{tot}$ (mW)
Chiang et al., 2014 [48]    94.91             99.83             97.37          32.50
Hussein et al., 2015 [102]  95.14             90.06             93.54          32.50
Proposed, MAR(00%)          99.50             99.50             99.50          32.50
Proposed, MAR(10%)          97.25             99.78             99.18          22.20
Proposed, MAR(20%)          96.84             98.88             98.26          19.80
Proposed, MAR(30%)          94.95             99.76             97.54          17.40
Proposed, MAR(40%)          91.18             98.00             96.80          15.00
Proposed, MAR(50%)          89.66             96.78             95.15          12.70

Our proposed method, on the other hand, achieves a favorable trade-off between the power consumption of the wireless EEG sensor nodes and the seizure detection performance at the server side. It is found to consume less power while ensuring a comparable seizure detection accuracy. The MAR module can reduce the size of the raw EEG data by different percentages. The larger the percentage of deleted data-points, the more power our method saves. MAR(XX%) indicates that XX% of the data is missing. The detection accuracy and power consumption are evaluated at different missing percentages, starting with 00% (no missingness) and ending at 50%.
It is clearly observed that the proposed method can significantly reduce the total power consumption without compromising the epileptic seizure detection performance. We recommend the use of the proposed method with 30% missingness. It achieves a seizure detection accuracy comparable to Chiang's method [48], but with ~50% savings in the total power consumption.

3.6.2 Compressed EEG Data Streaming

Given the proposed CS-based EEG monitoring system shown in Figure 3.3(B) and the power consumption models derived in equations (3.10), (3.11) and (3.12), the total power consumption is estimated and reported in Table 3.2. The power consumption values of the random number generator ($P_{RNG}$) and matrix multiplication ($P_{MM}$) modules are 3 µW and 352 µW, respectively [93]. The overall system detection performance and power consumption are evaluated at different missing percentages in the compressed data. The numerical results reported in Table 3.2 verify the effectiveness of the proposed energy-efficient seizure detection method over the state-of-the-art methods. A significant decrease in the data transmission cost is achieved without seriously affecting the detection accuracy.
For instance, the MAR(30%) row of Table 3.2 demonstrates that the proposed method with 30% missing data can achieve a detection accuracy comparable to Chiang's method [48] while reducing the power consumption from 7.46 mW to 4.30 mW.

Table 3.2: Seizure detection performance and total power consumption of the compressed data (CD) model

Method                      Sensitivity (%)   Specificity (%)   Accuracy (%)   $P^{CD}_{tot}$ (mW)
Chiang et al., 2014 [48]    91.82             99.40             95.61          7.46
Hussein et al., 2015 [102]  81.37             87.55             85.95          7.46
Proposed, MAR(00%)          95.32             99.00             97.46          7.46
Proposed, MAR(10%)          92.75             98.27             96.08          6.41
Proposed, MAR(20%)          92.14             97.66             94.92          5.35
Proposed, MAR(30%)          92.14             97.34             94.92          4.30
Proposed, MAR(40%)          89.20             89.20             89.19          3.26
Proposed, MAR(50%)          84.90             84.90             84.88          2.21

3.6.3 EEG Features Streaming

Our proposed energy-efficient EEG monitoring framework shown in Figure 3.3(C) gives superior results over the state-of-the-art on-sensor systems. The system parameter $S$ takes the value of 66 pW [103]. Before integrating the MAR module (i.e., 00% MAR), the obtained classification accuracy is 99.38% and the overall power consumption is 2.30 mW. The use of the MAR module further reduces the number of frequency components that must be transmitted to the server side, and hence yields a further reduction in total power consumption. The quantitative results listed in Table 3.3 show how the detection accuracy decreases as the missing percentage increases.
It can be seen that, with 40% missingness, the proposed method achieves a better seizure detection accuracy than Chiang's method [48] while consuming only ~25% of its power.

Table 3.3: Seizure detection performance and total power consumption of the EEG features (EEGF) model

Method                      Sensitivity (%)   Specificity (%)   Accuracy (%)   $P^{EEGF}_{tot}$ (mW)
Chiang et al., 2014 [48]    94.91             99.83             97.37          2.30
Hussein et al., 2015 [102]  94.82             90.06             92.47          2.30
Proposed, MAR(00%)          100.00            98.75             99.38          2.30
Proposed, MAR(10%)          99.40             99.40             99.33          1.84
Proposed, MAR(20%)          99.27             98.24             98.62          1.32
Proposed, MAR(30%)          96.88             98.97             98.26          0.90
Proposed, MAR(40%)          95.98             98.56             97.54          0.58
Proposed, MAR(50%)          93.55             97.43             96.90          0.25

3.7 Limitations and Recommendations

The proposed energy-efficient wireless seizure detection systems incorporate the expectation maximization algorithm for data recovery at the server side. This algorithm works efficiently when only a small percentage of the data is missing and the dimensionality of the data is not too large [85]. For data with high dimensionality and/or a high missing percentage, the EM algorithm runs very slowly. This is because the expectation step is computationally expensive and the algorithm converges extremely slowly as the procedure approaches a local maximum [141]. We also noticed that the reconstruction accuracy of the EM algorithm is considerably reduced when more than 50% of the data is missing. The resultant high reconstruction error has a serious impact on the extracted EEG features, which negatively affects seizure detection accuracy. We therefore recommend using the proposed energy-efficient systems for offline EEG monitoring applications only. We also advise keeping the missing data percentage below 50% to ensure a small reconstruction error and fast convergence.

3.8 Summary and Conclusion

In this chapter, an energy-efficient wireless epileptic seizure detection system was proposed.
The missing at random algorithm was used at the sensor side to reduce the energy cost of EEG data transmission. At the server side, we utilized the expectation maximization method to estimate the missing values and accurately recover the missing data. Further, we proposed an efficient feature extraction method to select distinguishable EEG features for epileptic seizure detection. The EEG rhythms $\delta$, $\theta$, and $\beta_1$ are selected to compute a set of descriptive statistical features that form the input to the random forest classifier. The superiority of the proposed scheme over the existing baseline methods is verified. It achieves 42-75% lower power consumption while maintaining a seizure detection accuracy comparable to that of state-of-the-art methods.

Chapter 4
Computationally Efficient EEG Feature Learning for Early Detection of Epileptic Seizures

In the previous chapter, an energy-efficient EEG monitoring system for wireless epileptic seizure detection was proposed and assessed using a clinical EEG dataset from 15 subjects. The proposed system, however, is computationally expensive and does not meet our needs for real-time and immediate detection of epileptic seizures. This chapter introduces an early seizure detection method that can efficiently identify the seizure onset a few seconds after it begins. The proposed method uses the least absolute shrinkage and selection operator (LASSO) regression to recognize the seizure-associated EEG features in a time-efficient fashion. These features are then selected and fed into a random forest (RF) classifier for epileptic seizure recognition. Compared to the state-of-the-art methods, the proposed method achieves superior seizure detection performance with a much shorter detection latency of 0.52 seconds.
The work presented in this chapter has been published in part in the proceedings of the IEEE Global Conference on Signal and Information Processing (GlobalSIP) in 2016 [108] and 2017 [101].

4.1 Introduction

Epilepsy is the second most common neurological disorder, affecting around 90 million people worldwide [182]. Early detection of epileptic seizures has the advantage of warning patients of impending seizures so they can administer the appropriate medications on time. Studies of epilepsy often rely on the electroencephalogram (EEG), which indicates the brain's electrical activities associated with seizures [184]. EEG-based seizure detection systems aim at effectively analyzing the captured EEG data to accurately recognize epileptic EEG patterns.

Numerous methods for epileptic seizure detection have been presented in the literature [29, 45, 48, 95, 134, 150, 185, 205, 208]. In [205], the wavelet transform was used to analyze the EEG data and obtain the main spectral EEG rhythms. Then, a set of statistical features that characterize the behavior of the EEG were extracted and tested using a multilayer perceptron neural network (MLPNN). The results showed sensitivity, specificity, and classification accuracy of 96%, 94%, and 94.83%, respectively. In [185], a feature extraction method based on the sample entropy was used together with the extreme learning machine (ELM) classifier and resulted in a sensitivity, specificity, and classification accuracy of 97.26%, 98.77%, and 95.67%, respectively. In addition, the authors of [150] used a combination of time-domain and frequency-domain EEG features, forming a more representative feature vector; this was then fed into an MLPNN for EEG classification.
The epilepsy recognition rates achieved by this method were 97.46% for sensitivity, 98.74% for specificity, and 97.50% for classification accuracy.

In an attempt to decrease the computational complexity of seizure detection systems, the authors of [134] utilized a low-complexity support vector machine (SVM) classifier together with a group of wavelet features. This method achieved an average classification accuracy of 95.33%. A comparable classification accuracy of 95.61% was achieved in [48] by using a novel EEG feature extraction method, in which the features were extracted on the sensor side and seizure detection was performed on the server side. Furthermore, a feature extraction method based on the Hilbert transform was employed in conjunction with an SVM classifier to identify epileptic EEG activities [45], achieving a classification accuracy of 97.00%. Also, independent component analysis (ICA) was used to determine the independent seizure-associated features [95]. It was used together with the SVM classifier to achieve detection sensitivity, specificity, and classification accuracy of 96.00%, 94.00%, and 95.00%, respectively. In [29], a seizure detection approach based on statistical EEG features and a least-square SVM (LSSVM) classifier showed an average accuracy of 97.19% with a short computation time of 0.06 seconds. In [208], the energy features of the EEG-based harmonic wavelet packet transform and fractal dimensions were computed and tested using a relevance vector machine (RVM) classifier to obtain a high classification accuracy of 99.80%.

In an effort to further improve the performance of automatic seizure detection systems and decrease their computational complexity, we apply a computationally efficient feature learning model. This model extracts the most representative seizure-associated features in a time-efficient manner, using the least absolute shrinkage and selection operator (LASSO) regression method.
These learned features are shown to be more discriminative and to achieve higher seizure detection accuracy than other hand-crafted features. Meanwhile, a computationally fast algorithm, known as randomized coordinate descent (RCD) [69], is used for estimating the LASSO regression coefficients, which are then used as features for seizure detection. The random forest classifier is used to examine the effectiveness of the extracted features. The results on a benchmark EEG dataset show that our approach surpasses existing methods in terms of sensitivity, specificity, classification accuracy, and seizure detection latency. Finally, we show that the proposed method achieves satisfactory seizure detection performance in the presence of common signal contaminants (e.g., eye-blinking and muscle artifacts).

4.2 Materials

4.2.1 EEG Data

The proposed method was tested on the EEG dataset provided by Bonn University [19]. In this study, we address the classification problem between three different EEG sets, as follows: Normal EEG recorded from five healthy volunteers, Interictal EEG recorded from five epileptic patients during seizure-free intervals, and Ictal EEG recorded from five epileptic patients with active seizures. Each set includes 100 single-channel EEG signals, each of 23.6 seconds duration. All the EEG sets are denoised, amplified, sampled at 173.6 Hz, and digitized using a 12-bit analog-to-digital converter.
Figures 4.1(a), (b), and (c) show examples of noise-free EEG signals for normal, interictal, and ictal activities, respectively.

Figure 4.1: Time-series EEG plots: (a), (b), and (c) clean EEG examples of normal, interictal, and ictal EEG activities, respectively; (d), (e), and (f) noisy EEG examples of normal, interictal, and ictal EEG activities, respectively. [Axes: Time (Sec) versus Amplitude (µV).]

In practice, the measured EEG data are often corrupted with different types of noise. Such noise alters the EEG waveform shapes and severely affects seizure detection accuracy. In this study, we address the classification problem of (i) clean EEG data and (ii) noisy EEG data corrupted with eye-blinking and muscle artifacts. As demonstrated in [55], eye blinks are modeled as random noise band-pass filtered between 1 and 3 Hz, while the muscle artifacts are modeled as random noise band-pass filtered between 20 and 60 Hz. Figures 4.1(d), (e), and (f) show the noisy versions of the same normal, interictal, and ictal EEG recordings shown in Figures 4.1(a), (b), and (c), respectively.
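The artifact model described above can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the generator used in [55] or in this study: the band-pass filtering step is swapped for a sum of random-phase sinusoids confined to the stated band, the clean trace is a hypothetical 10 Hz sinusoid, and the function names `band_limited_noise`, `mean_power`, and `scale_to_snr` are assumptions introduced here. The artifact is then rescaled so that the ratio of signal power to artifact power matches a target SNR in dB.

```python
import math
import random


def band_limited_noise(n_samples, fs, f_lo, f_hi, rng, n_tones=32):
    """Stand-in for band-pass filtered random noise: a sum of sinusoids with
    random frequencies in [f_lo, f_hi], random phases, and Gaussian amplitudes."""
    tones = [
        (rng.uniform(f_lo, f_hi), rng.uniform(0.0, 2.0 * math.pi), rng.gauss(0.0, 1.0))
        for _ in range(n_tones)
    ]
    return [
        sum(amp * math.sin(2.0 * math.pi * freq * t / fs + phase)
            for freq, phase, amp in tones)
        for t in range(n_samples)
    ]


def mean_power(x):
    """Average power (mean of the squared samples)."""
    return sum(v * v for v in x) / len(x)


def scale_to_snr(signal, artifact, snr_db):
    """Scale `artifact` so that 10*log10(P_signal / P_artifact) equals snr_db."""
    gain = math.sqrt(mean_power(signal) / (mean_power(artifact) * 10.0 ** (snr_db / 10.0)))
    return [gain * v for v in artifact]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
fs = 173.6              # sampling rate of the recordings (Hz)
n = 1024                # shortened segment, only for this illustration

# Hypothetical stand-in for a clean EEG trace: a 10 Hz sinusoid of amplitude 50.
clean = [50.0 * math.sin(2.0 * math.pi * 10.0 * t / fs) for t in range(n)]

eye = band_limited_noise(n, fs, 1.0, 3.0, rng)       # eye-blink band (1-3 Hz)
muscle = band_limited_noise(n, fs, 20.0, 60.0, rng)  # muscle band (20-60 Hz)
artifact = scale_to_snr(clean, [e + m for e, m in zip(eye, muscle)], snr_db=0.0)
noisy = [c + a for c, a in zip(clean, artifact)]

snr_db = 10.0 * math.log10(mean_power(clean) / mean_power(artifact))
print(f"achieved SNR: {snr_db:.2f} dB")
```

Scaling the combined artifact, rather than each component separately, is a design choice of this sketch; the source does not specify how the two artifact types are balanced against each other.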
The amplitudes of the eye-blinking and muscle artifacts are adjusted to produce noisy signals with a 0 dB signal-to-noise ratio (SNR).

Figure 4.2: Frequency spectrum plots: (a), (b), and (c) clean EEG spectrum of normal, interictal, and ictal activities, respectively; (d), (e), and (f) noisy EEG spectrum of normal, interictal, and ictal activities, respectively. [Axes: Frequency (Hz) versus Magnitude (µV).]

4.2.2 EEG Spectrum

In this study, we apply the proposed feature learning method described in Section 4.3.1 to the EEG frequency spectrum. Only the distinctive frequency components are selected as representative features for the classification of an epileptic seizure. Figure 4.2 shows the frequency spectrum of clean and noisy EEG signals. The noisy signals were corrupted with synthetic eye-blinking and muscle artifacts, and their SNR was set to 0 dB. Figures 4.2(a), (b), and (c) depict the frequency spectrum of clean EEG signals for normal, interictal, and ictal EEG activities, respectively. Figures 4.2(d), (e), and (f) display the frequency spectrum of the noisy EEG signals for normal, interictal, and ictal EEG activities, respectively. The figure clearly shows how the eye-blinking and muscle artifacts noticeably change the shape of the EEG spectrum.

4.3 Methodology

This section describes how the LASSO regression method was applied to the EEG spectrum so that only a few features were selected.
It also describes how the extracted features were then used to train and test the RF classifier.

4.3.1 EEG Feature Learning: LASSO Regression

We present a computationally efficient feature learning method for early detection of epileptic seizures. First, the spectrum of the captured EEG signals is obtained by performing a fast Fourier transform (FFT). LASSO regression is then adopted as a feature extraction method applied to the EEG spectrum [70]. Finally, the randomized coordinate descent algorithm is used to solve the LASSO problem and select the EEG spectral features pertinent to seizures [69, 152].

In this work, we address the classification problem between the three EEG sets of Normal, Interictal, and Ictal. Each set includes 100 EEG signals, each of 4,096 time samples [19]. Accordingly, the total number of EEG signals, defined as $n$, is 300. The FFT is used to obtain the spectrum of all EEG signals. Default FFT settings were used (i.e., the FFT size was set to the length of the signal [4,096] rounded up to the next power of 2). In our case, the EEG spectrum is found to include 2,049 frequency components (the one-sided spectrum of a 4,096-point FFT, i.e., 4,096/2 + 1). Then, we define a matrix $A$ ($n \times d$) that includes the spectra of the $n$ EEG signals $[a_1; a_2; \ldots; a_n]$. The number of columns in $A$, defined as $d$, denotes the number of frequency components in each $a_i$. We also define the column vector $b$ as an $n$-length vector of class labels $b_i$, which correspond to the EEG samples $a_i$ in $A$. The labels 1, 2, and 3 were given to the EEG classes of normal, interictal, and ictal, respectively. The mathematical description of $A$ and $b$ is:

\[
A = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}
  = \begin{pmatrix}
      a_{11} & a_{12} & a_{13} & \cdots & a_{1d} \\
      a_{21} & a_{22} & a_{23} & \cdots & a_{2d} \\
      \vdots & \vdots & \vdots & \ddots & \vdots \\
      a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nd}
    \end{pmatrix},
\qquad
b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}
\]

The feature learning problem can then be formalized as finding the sparse vector $\beta^\star$ that yields the minimum least-squares error $\|A\beta - b\|_2^2$. The solution can be found by solving the following LASSO regression problem [70]:

\[
\beta^\star = \arg\min_{\beta \in \mathbb{R}^d} \|A\beta - b\|_2^2 + \lambda \|\beta\|_1 \tag{4.1}
\]

where $\lambda$ is the LASSO tuning parameter that controls the sparsity level in $\beta^\star$.

A wealth of methods that can solve LASSO problems have been reported in the literature [69]. Amongst these methods, the RCD algorithm is found to converge substantially faster than the others [152]. RCD is favourable for minimizing convex functions that take the form:

\[
\arg\min_{\beta \in \mathbb{R}^d} g(\beta) + \sum_{j=1}^{d} h_j(\beta_j) \tag{4.2}
\]

where $g$ is a smooth function, $h$ is a separable function, and $d$ is the total number of regression coefficients.

The LASSO problem shown in (4.1) can be rewritten in the same form as (4.2):

\[
\arg\min_{\beta \in \mathbb{R}^d} \sum_{i=1}^{n} \left(\beta^T a_i - b_i\right)^2 + \lambda \sum_{j=1}^{d} |\beta_j| \tag{4.3}
\]

Thereafter, the RCD algorithm is applied to (4.3) to find the sparse vector $\beta^\star$. The full algorithm description is given in Algorithm 1.

Algorithm 1: Randomized Coordinate Descent [152].
1   Input: $A$, $b$
2   Output: Sparse vector $\beta^\star$
3   Initialization: $\beta_0 \leftarrow 0$; $k \leftarrow 0$; $\lambda = e^{7}$;
4   Initialization: Lipschitz constant: $L = \sum A^2$;
5   repeat
6       Pick a random coordinate $j_k$ from $\{1, 2, \ldots, d\}$;
7       Set $\alpha_k \leftarrow 1/L(j_k)$;
8       Set $\nabla_{j_k} f(\beta_k) \leftarrow \sum_{i=1}^{n} a_i^T \left(\beta^T a_i - b_i\right)$;
9       Design a vector $e_{j_k}$ of zeros with ones at the locations of block $j$;
10      Set $\beta_{k+1} \leftarrow \beta_k - \alpha_k \nabla_{j_k} f(\beta_k) \, e_{j_k}$;
11      $k \leftarrow k + 1$;
12  until $\|A\beta - b\|_1 \le \epsilon$

After obtaining the sparse vector $\beta^\star$ using Algorithm 1, the EEG frequency components that correspond to the non-zero elements of $\beta^\star$ are selected as representative features for EEG classification.

4.3.2 EEG Feature Classification: Random Forest

In this work, we examined the performance of several classification models and found that the RF classifier achieves the best performance in terms of classification accuracy and computation time. Leave-one-out cross-validation was used for training and testing the classifier.

The RF classifier integrates a set of independent decision tree classifiers [39]. A decision tree with $M$ leaves splits the feature space into $M$ regions $R_m$, $1 \le m \le M$.
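To make the idea of a tree partitioning the feature space concrete, the following sketch treats each tree as a lookup that returns the constant attached to whichever region contains the input, and combines several such trees by majority vote. It is a toy illustration under stated assumptions, not the classifier trained in this study: there is a single hypothetical feature, the region boundaries are invented, and the names `tree_predict` and `forest_predict` are introduced here. The labels 1, 2, and 3 follow the class coding of Section 4.3.1.

```python
from collections import Counter

INF = float("inf")


def tree_predict(x, regions):
    """Return the constant of the region that contains x.

    `regions` is a list of (lower, upper, label) triples that partition the
    real line; x belongs to a region when lower <= x < upper.
    """
    for lower, upper, label in regions:
        if lower <= x < upper:
            return label
    raise ValueError("x is outside every region")


def forest_predict(x, trees):
    """Majority vote over the predictions of all trees."""
    votes = Counter(tree_predict(x, regions) for regions in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical trees, each with M = 3 leaves (regions) on one feature.
trees = [
    [(-INF, 2.0, 1), (2.0, 5.0, 2), (5.0, INF, 3)],
    [(-INF, 2.5, 1), (2.5, 6.0, 2), (6.0, INF, 3)],
    [(-INF, 1.5, 1), (1.5, 4.5, 2), (4.5, INF, 3)],
]

for x in (1.0, 2.2, 5.5):
    print(x, forest_predict(x, trees))
```

With an odd number of trees and three classes a three-way tie is still possible; this sketch resolves it by whichever label was counted first, which is an arbitrary choice.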
For each tree, the prediction function $f(x)$ is defined as:

\[
f(x) = \sum_{m=1}^{M} c_m \, \Pi(x, R_m) \tag{4.4}
\]

where $M$ is the number of regions in the feature space, $R_m$ is the region corresponding to $m$, $c_m$ is the constant corresponding to $m$, and $\Pi$ is the indicator function defined as:

\[
\Pi(x, R_m) =
\begin{cases}
1, & \text{if } x \in R_m \\
0, & \text{otherwise}
\end{cases} \tag{4.5}
\]

The final classification decision is made by the majority vote of all trees. In this study, we used an RF classifier with 10 trees.

4.4 Results and Discussion

To evaluate the performance of our feature learning method, we compare our seizure detector with the state-of-the-art detectors that use the same benchmark dataset. The classification performance was measured using the standard metrics of sensitivity, specificity, and classification accuracy, as well as the latency. The latency is defined as the time delay between the seizure onset and its detection by the EEG detector.

4.4.1 Seizure Detection under Ideal Conditions

The proposed method is first examined under ideal conditions, where the EEG recordings are assumed to be free of noise. The spectrum of the clean EEG signals is first obtained and fed into the LASSO regression-based feature learning method with the specific goal of efficient feature extraction. The RCD algorithm is used to solve the LASSO problem and extract the most robust EEG spectral components in a time-efficient manner. The front part of Figure 4.3 displays the spectrum of a clean EEG signal including all 2,049 frequency components, while the rear part of Figure 4.3 shows the frequency components extracted using LASSO. It is worth highlighting that only 126 components out of 2,049 are selected as representative features for EEG classification.

Furthermore, we compare the performance of our seizure detection method with those of relevant state-of-the-art methods that also use the same benchmark EEG dataset. The performance metrics of the proposed and reference seizure detection methods are summarized in Table 4.1.
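For orientation, sensitivity, specificity, and accuracy are conventionally derived from confusion-matrix counts. The sketch below shows those standard definitions under an assumption that is not taken from the thesis: the three-class labels are collapsed so that ictal (label 3) is the positive class and normal and interictal (labels 1 and 2) are both negative. The function name `detection_metrics` and the example label vectors are hypothetical.

```python
def detection_metrics(y_true, y_pred, positive=3):
    """Sensitivity, specificity, and accuracy for seizure vs. non-seizure.

    Labels equal to `positive` count as seizure; all others as non-seizure.
    Assumes both groups occur in y_true, otherwise a ratio is undefined.
    """
    pairs = [(t == positive, p == positive) for t, p in zip(y_true, y_pred)]
    tp = sum(1 for t, p in pairs if t and p)          # seizures detected
    fn = sum(1 for t, p in pairs if t and not p)      # seizures missed
    tn = sum(1 for t, p in pairs if not t and not p)  # non-seizures kept
    fp = sum(1 for t, p in pairs if not t and p)      # false alarms
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical labels: 1 = normal, 2 = interictal, 3 = ictal.
y_true = [3, 3, 3, 3, 1, 1, 2, 2, 2, 2]
y_pred = [3, 3, 3, 1, 1, 1, 2, 3, 2, 2]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

Under this collapse, confusing a normal segment with an interictal one does not lower any of the three scores, so a per-class evaluation would need the full three-by-three confusion matrix instead.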
The proposed method achieves a seizure detection sensitivity of 100%, which is higher than those of the baseline methods reported in Table 4.1. It also produces a seizure detection specificity of 100%, which matches Vidyaratne et al.’s result [208] and is superior to those of the other reference methods. Further, amongst all the existing seizure detection methods, the proposed scheme yields the highest classification accuracy of 100%. More importantly, the proposed LASSO-based seizure detection method is shown to be computationally efficient, achieving a short seizure detection latency of 0.52 seconds. This makes our method more suitable for real-time seizure detection applications.

Figure 4.3: Original and extracted frequency components of clean EEG spectrum. [Axes: Frequency (Hz) versus Magnitude (µV); series: Original Frequency Components, Selected Frequency Components.]

Table 4.1: Seizure detection results of the proposed and state-of-the-art methods; NR = Not Reported.

Methods                         Classifier  Sensitivity (%)  Specificity (%)  Accuracy (%)  Latency (Sec)
Ubeyli et al. (2009) [205]      MLPNN       96.00            94.00            94.83         NR
Song et al. (2010) [185]        ELM         97.26            98.77            95.67         NR
Naghsh et al. (2011) [150]      MLPNN       97.46            98.74            97.50         NR
Liu et al. (2012) [134]         SVM         94.46            95.26            95.33         NR
Chiang et al. (2014) [48]       SVM         91.82            99.40            95.61         NR
Chaurasiya et al. (2015) [45]   SVM         98.00            96.00            97.00         NR
Hosseini et al. (2016) [95]     SVM         96.00            94.00            95.00         NR
Behara et al. (2016) [29]       LSSVM       96.96            99.66            97.19         0.06
Vidyaratne et al. (2017) [208]  RVM         99.00            100.00           99.80         1.34
Proposed Method (2017)          RF          100.00           100.00           100.00        0.52

4.4.2 Seizure Detection in the Presence of Body Artifacts

We further examine the effectiveness of the proposed seizure detection method in classifying noisy EEG data. In a previous work, we introduced an EEG feature learning method capable of operating on noisy EEG signals [108].
This method, however, assumed that the noise encountered during EEG data acquisition has a Gaussian distribution, which is not always the case in practical situations. In this work, we test the seizure detection performance of our LASSO-based method on noisy EEG data corrupted with common body artifacts such as eye-blinking and muscle movements.

Figure 4.4 shows how the proposed method adaptively selects a different set of frequency components from the noisy EEG spectrum. The front part of Figure 4.4 displays the frequency spectrum of a noisy EEG signal contaminated by eye-blinking and muscle artifacts such that its SNR is 0 dB. The rear part of Figure 4.4 displays the frequency components selected from the spectrum of the contaminated data. It can be seen that some of the noisy frequency components associated with muscle artifacts (20-60 Hz) are chosen as representative features for EEG classification. This degrades seizure detection performance: the sensitivity, specificity, and classification accuracy drop to 95.95%, 93.30%, and 94.76%, respectively, which limits the application of this method in practice. Therefore, further research is needed to improve the reliability of epileptic seizure detection methods under real-life conditions, i.e., when the EEG data is corrupted by physiologic and extra-physiologic artifacts.

Figure 4.4: Original and extracted frequency components of noisy EEG spectrum. [Axes: Frequency (Hz) versus Magnitude (µV); series: Original Frequency Components, Selected Frequency Components.]

4.5 Summary and Conclusion

For early detection of epileptic seizures, a computationally efficient model that adaptively learns distinguishable spectral EEG features is proposed. The feature learning problem is formalized as a LASSO regression model to select the EEG frequency components pertinent to epileptic seizures.
The LASSO coefficients are found by adopting the computationally fast optimization algorithm of block coordinate descent. The selected EEG frequency components are fed as inputs to a Random Forest classifier to recognize the EEG activities associated with epileptic seizures. The results demonstrate that our proposed method surpasses state-of-the-art seizure detection methods, achieving 100% classification accuracy. The proposed model is also shown to be computationally efficient, achieving a very short seizure detection latency of 0.52 seconds.

The seizure detection performance of our proposed method is also examined on noisy EEG data contaminated by the physiologic artifacts of eye-blinking and muscle movements. Noticeable performance degradation results from the selection of some inadequate EEG features from the noisy EEG spectrum. Therefore, it is necessary to develop more reliable seizure detection methods capable of identifying seizure patterns in contaminated EEG data. Chapter 5 introduces a robust seizure detection method that maintains high detection performance in the presence of physiologic and extra-physiologic EEG artifacts.

Chapter 5

Robust Detection of Epileptic Seizures Based on L1-Penalized Robust Regression of EEG Signals

In Chapter 4, a computationally efficient seizure detection method based on LASSO regression of EEG spectra was proposed. The results showed that our method could be used to identify EEG seizure onsets in a timely and efficient manner, achieving a short detection latency of 0.52 seconds. However, the LASSO-based method failed to maintain a high seizure detection accuracy in the presence of inevitable EEG artifacts such as eye-blinking and muscle movements.

In this chapter, a reliable seizure detection method based on L1-penalized robust regression (L1PRR) is proposed.
In fact, most of the existing EEG-based seizure detection methods, including our LASSO-based method, cannot maintain a robust performance under real-life conditions (i.e., when the EEG is corrupted with ambient noise and body artifacts). This motivated us to develop a robust EEG feature learning method that extracts the most prominent and reliable EEG features pertinent to epileptic seizures. Results on a public benchmark EEG dataset show the superiority of our L1PRR-based seizure detection method compared to previous work, achieving a detection accuracy of 100%. The proposed method is also shown to maintain robust performance in the presence of white noise and EEG artifacts, mainly those arising from muscle activities and eye-blinking. A version of this chapter has been published in the journal Expert Systems with Applications [100].

5.1 Introduction

Studies of epilepsy often rely on the electroencephalogram (EEG) as it can indicate the brain’s electrical activities associated with seizures [184]. Much research and development in the area of automated seizure detection relying on EEG signals has been carried out, and several EEG-based seizure detection systems have been reported in the literature. Feature extraction and feature classification are the two essential modules in building an automatic seizure detection system. Around ninety percent of prior work has focused on developing effective feature extraction techniques that can discover the most discriminative EEG features for seizure diagnosis. Most of these techniques use hand-crafted features extracted from the time domain [137, 138], frequency domain [2, 51, 167], wavelet domain [4, 81, 202], and sometimes from multiple domain representations of the EEG [140]. However, domain-based schemes experience two major difficulties.
First, the non-stationary nature of EEG makes it difficult to have a single seizure pattern, making the hand-crafted features less practical in clinical settings [115]. Second, in practice, the EEG data is prone to different sources of artifacts such as muscle activities and eye-blinking as well as white noise. Artifacts and noise interfere with EEG signals, producing serious distortions that negatively affect the seizure detection performance [5].

To overcome these challenges, we develop a robust seizure detection method that can accurately recognize epileptic seizures under ideal conditions as well as real-life conditions. A feature extraction method, based on L1-penalized robust regression, is proposed to provide the most informative seizure-associated EEG features. The proposed scheme is applied to the EEG frequency spectra, and the extracted spectral features are used as inputs to a random forest (RF) classifier for training and classification. We used the benchmark clinical dataset provided by Bonn University [19] to compare the seizure detection performance of our proposed method to those of the state-of-the-art techniques. We first examine how efficiently our proposed method performs under ideal conditions, i.e., when the EEG data is completely free of artifacts and noise. Results demonstrate that our approach outperforms those in the literature, yielding the highest seizure detection rates of 100% sensitivity, 100% specificity, and 100% classification accuracy. The performance of our approach is also tested in the presence of muscle activities and eye-blinking artifacts as well as white noise. Our approach is shown to be robust against all these interferences.
It maintains high seizure detection performance even at high noise and artifact levels, making it more practical for real-life and clinical settings.

5.2 Materials

5.2.1 EEG Data and Subjects

We conduct our seizure detection experiments on the benchmark clinical EEG dataset provided by Bonn University [19]. This is a popular and well-known dataset for epileptic seizure detection. It consists of data from five different sets denoted A, B, C, D, and E; each set includes 100 single-channel EEG signals of 23.6 seconds duration each. Sets A and B contain surface EEG signals recorded from five healthy volunteers using the standard 10-20 system for EEG electrode placement [89]. Subjects were awake and relaxed with eyes open (Set A) and eyes closed (Set B). Sets C and D consist of intracranial EEG signals taken from five epileptic patients during seizure-free intervals. EEG signals in set C are recorded using electrodes implanted in the brain epileptogenic zone, while those in set D are recorded from the hippocampal formation of the opposite hemisphere of the brain. Set E includes EEG signals recorded from five epileptic patients while experiencing active seizures.

All the EEG signals had been sampled at 173.6 Hz and digitized using a 12-bit analog-to-digital converter. All EEG signals were recorded with the same 128-channel amplifier system, using a common reference channel. The EEG data provided in the Bonn dataset did not have artifacts. The original EEG segments containing artifacts had been deleted, and those containing minor artifacts had been denoised using a band-pass filter with cut-off frequencies of 0.53 Hz and 40 Hz.
Figures 5.1(a)-(e) show examples of time-series EEG signals from the five sets available in the EEG dataset under study (sets A-E).

Figure 5.1: Time-series EEG signals and their spectra: (a)-(e) Examples of time-series EEG signals from the five different sets of the Bonn University EEG dataset (Sets A-E); (f)-(j) corresponding frequency spectrum plots of the EEG example signals in (a)-(e). [Axes: Time (Sec) versus Amplitude for (a)-(e); Frequency (Hz) versus Magnitude for (f)-(j).]

In practice, several sources of noise contaminate the EEG signals, alter the appearance of seizure manifestations, and severely affect the detection accuracy of epileptic seizures. The authors of [55] reviewed the most common EEG artifacts and developed adequate models to mimic their behavior. In this chapter, we use these models to generate synthetic artifacts, and we focus on the three most critical and inevitable sources of artifacts, which are:

1. Muscle Activities: Skeletal muscles produce electrical activities that interfere with the brain data during data acquisition, causing muscular contamination of the EEG signals [148]. The authors of [55] developed a model that emulates muscle artifacts by generating random noise and then filtering it using a 20-60 Hz band-pass filter. The generated noise is then multiplied by a typical muscle scalp map to adjust its magnitude.

2. Eye Blinks: EEG also picks up other electrical activities such as eye movement and blinking.
These artifacts yield a significant distortion to the EEG and thus make the detection of epileptic seizures more troublesome. Besides, the low signal-to-noise ratio (SNR) of the EEG signal makes the recognition of seizure patterns more challenging [87]. In [55], the eye movement artifacts were modeled as random noise passed through a 1-3 Hz band-pass filter.

3. White Noise: The other sources of noise, such as power line interference and environmental noise, can be represented as additive white noise with a Gaussian distribution [55].

Figure 5.2(a) shows an arbitrary noise-free EEG signal corresponding to an epileptic seizure activity (Ictal EEG), while Figures 5.2(b), (c), and (d) show the noisy versions of the same signal corrupted with muscle artifacts, eye-blinking, and white noise, respectively. The amplitude of each of these artifacts can be adjusted to produce noisy EEG signals with different SNRs. The SNR of the noisy signals shown in Figure 5.2 is set to 0 dB.

Figure 5.2: Time-series EEG plots: (a) clean ictal signal; (b), (c), and (d) ictal signals corrupted with muscle artifacts, eye-blinking, and white noise, respectively. [Axes: Time (Sec) versus Amplitude.]

5.2.2 EEG Frequency Spectrum

We use the EEG frequency spectrum as the input to our robust feature extraction method. Only the distinctive frequency components are selected as representative features for the diagnosis of epileptic seizures. Figures 5.1(f)-(j) show the frequency spectra of clean EEG example signals from the five EEG sets of the Bonn University dataset. The baseline frequency-domain seizure detection systems often follow one of these three approaches: (1) the whole set of EEG frequency components is used as an input to the classification model.
This method, however, is computationally complex and is more appropriate for offline diagnosis; (2) the EEG data is decomposed into its five main frequency rhythms: delta, theta, alpha, beta, and gamma. Then, only two or three rhythms are selected as representative characteristics for EEG classification. The drawback of this approach is that it ignores some seizure-relevant information in the non-selected EEG rhythms; and (3) some hand-crafted statistical features are computed from all five EEG frequency rhythms and combined into a single feature vector. These features, however, are over-sensitive to inter-patient and intra-patient variations as well as ambient noise and body artifacts. Unlike existing frequency-domain feature extraction techniques, our method adaptively extracts the discriminatory EEG frequency components that best characterize EEG seizure patterns, and hence achieves robust seizure detection performance.

5.3 Literature Review

In this section, we briefly review the previous work in the context of epileptic seizure detection using EEG signals.

5.3.1 Two-class EEG Classification Problem

Classification between Seizure and Normal EEG Activities

In this subsection, we cover prior work on the two-class EEG classification problem that deals with differentiating between sets A and E, which correspond to Normal and Ictal EEG signals, respectively. A wealth of methods have been reported in the literature. In [1], Aarabi et al. initially extracted a set of features from time-domain, frequency-domain, and wavelet-domain signals as well as autoregressive coefficients and cepstral features. Then, a feature selection method based on linear correlation was adopted to rank the extracted features and select those that best represent seizure activities. The back-propagation neural network (BPNN) was used afterwards for EEG classification, and the results show an overall detection rate of 93.00%. Similar detection performance was achieved by Subasi et al.
in [189], where a dynamic fuzzy neural network (DFNN) classifier was adopted together with the wavelet transform to produce an average seizure detection rate of 93%. In [190], Subasi et al. improved the detection accuracy to 94.50% by using a mixture of experts (ME) classification model along with the same wavelet sub-bands used in [189]. This improvement demonstrates the potential of the ME neural network structure in achieving detection rates that are higher than those of the regular neural network structures [190].

Furthermore, the study in [167] applied spectral analysis to produce a good representation of EEG signals. After that, a decision tree (DT) classifier was employed for epilepsy detection, and a classification accuracy of 98.68% was obtained. In [44], cross-correlation was initially used as a feature extraction method, and the extracted features were then fed into a support vector machine (SVM) for EEG classification. This hybrid system obtained an average classification accuracy of 95.96%. Moreover, the authors of [217] introduced a new seizure detection system based on nonlinear features (e.g., approximate entropy and Hurst exponent) and the extreme learning machine (ELM) classifier. A reasonable seizure detection accuracy of 96.50% was achieved. In [120], a wavelet-based seizure detection system was designed. The EEG data were first analyzed using the wavelet transform, and then the statistical features were extracted from the wavelet coefficients in the frequency range of 0-32 Hz. The effectiveness of the selected features was tested using a simple linear discriminant analysis (LDA) classifier, which provided 91.80% classification accuracy. In [154], the permutation entropy (PE) was used as a feature for the detection of epileptic seizures. It was found that the EEG signals taken from epileptic patients are characterized by lower PE than those of healthy subjects.
Using the SVM classifier, this method achieved a classification accuracy of 93.80%.

Moreover, the authors of [221] introduced a three-stage seizure detection system. They first used the wavelet transform to decompose the EEG data into its five sub-bands. Then, the statistical lacunarity features and the fluctuation index were extracted from the third, fourth, and fifth wavelet sub-bands, and sent into the Bayesian linear discriminant analysis (BLDA) for EEG classification. This system achieved an average classification accuracy of 96.67%. In a similar study [128], Kumar et al. adopted the wavelet transform to decompose the EEG data into its main frequency rhythms: delta, theta, alpha, beta, and gamma. Then, only the theta, alpha, and beta rhythms were used to compute the energy, variance, and zero-crossing rate statistical features as well as a set of nonlinear features. Results show that the extracted features together with SVM can yield a high seizure detection accuracy of 97.50%. In [186], Song et al. tested the effectiveness of the weighted permutation entropy (WPE) as a representative EEG feature for normal and ictal EEG signals. Using WPE as an input to the SVM classifier, an average classification accuracy of 97.25% was obtained. In [41], Bugeja et al. examined the performance of SVM and ELM in differentiating between healthy and epileptic subjects based on the spectral features obtained from the wavelet domain. They found that the ELM can yield the highest seizure sensitivity of 99.48%. In [17], a robust learning scheme, named random undersampling boosting (RUSBoost), was adopted to differentiate between seizure and non-seizure EEG activities. It was found that RUSBoost achieved a seizure detection accuracy of 97.87% with a detection delay of 2.7 s. Recently, the authors of [164] used the tunable-Q wavelet transform to characterize the non-stationary behavior of EEG signals. An average classification accuracy of 97.75% was achieved using SVM with 10-fold cross-validation.
As well, the SVM was used in [183] to test four statistical features extracted by multifractal detrended fluctuation analysis. A promising average classification accuracy of 99.60% was obtained.

Classification between Seizure and All Non-seizure EEG Activities

This subsection addresses the special case of the two-class seizure detection problem that discriminates between the seizure activities (set E) and all non-seizure activities (sets A, B, C, and D). Only a few researchers have addressed this kind of imbalanced binary classification problem [71, 84, 110, 114, 146, 165, 169, 210]. In [84], the approximate entropy derived from the wavelet sub-bands was used in combination with an artificial neural network (ANN) to identify whether the EEG pattern belongs to seizure or non-seizure activities. This method achieved a considerably high classification accuracy of 98.27%. In [169], the genetic algorithm was adopted to adaptively extract the features that closely represent the different EEG brain states. The selected EEG features were used afterwards in conjunction with a k-nearest neighbor (kNN) classifier, which achieved an average classification accuracy of 98.40%. Likewise, Kaleem et al. used a kNN classifier together with EEG features derived from the components of empirical mode decomposition [114]. As a result, a classification accuracy of 98.20% was obtained.

More recently, the authors of [71] formed a feature vector of the spectral entropies and energy features of the EEG rhythms obtained by Hilbert marginal spectrum analysis. This vector was then fed into the SVM classifier for seizure detection, and an average classification accuracy of 98.80% was achieved. In [165], the conventional statistical features of maximum, minimum, mean, median, and standard deviation were fed into a complex-valued artificial neural network (CVANN) for EEG classification.
Experimental results show that the CVANN can accurately identify both seizure and non-seizure activities, with an average classification accuracy of 99.33%. Moreover, Jaiswal et al. [110] developed two novel feature extraction techniques that were deployed jointly with the ANN classifier to achieve 98.72% classification accuracy. In [210], the feature extraction process was conducted in different domains, including the time, frequency, and time-frequency domains. Principal component analysis was used afterwards to reduce the number of features characterizing epileptic activities. The selected features were then presented as an input to the SVM classifier, which achieved an overall detection accuracy of 99.25%. Moreover, the authors of [146] presented a novel seizure detection system based on an improved correlation-based feature selection method combined with a random forest classifier. The results showed that the proposed method achieves better performance than the existing correlation-based methods and comparable performance to other state-of-the-art methods. The tunable-Q wavelet transform was also used in [33] to compute the entropy of the EEG signal in different frequency bands. Using SVM, a classification accuracy of 99.00% was achieved in classifying seizure versus non-seizure EEG signals. As well, there has been a lot of work using the random forest classifier with different EEG features and different EEG datasets [35, 57].

5.3.2 Three-class EEG Classification Problem

This subsection covers the classification problem between three different EEG classes: normal EEG, interictal EEG, and ictal EEG, which correspond to sets A, C, and E, respectively. In recent years (2005-2017), several epileptic seizure detection methods have been proposed for the three-class EEG classification problem. In [83], Güler et al. introduced the use of recurrent neural networks (RNNs) for epileptic seizure diagnosis using EEG signals.
It was found that RNNs produce higher accuracy rates than those of the feedforward neural network, achieving an average classification accuracy of 96.79%. In [202], a time-frequency analysis approach was applied to the EEG signals, where a combination of temporal and spectral features was selected and fed into an artificial neural network (ANN) for EEG classification. This approach provided an average seizure detection accuracy of 97.94%. In [75], the wavelet transform was used to decompose the EEG signals into their central spectral rhythms. Then, three statistical features were extracted from these rhythms and used as an input to a BPNN classifier, which achieved 96.60% classification accuracy. The wavelet decomposition was also used in [204] and [205] to decompose the EEG signals into approximation and detail sub-bands. Subsequently, the statistical features characterizing the behavior of EEG were extracted and tested using the mixture of experts (ME) [204] and the multilayer perceptron neural network (MLPNN) [205]. These classifiers achieved average classification accuracies of 93.17% and 94.83%, respectively.

Furthermore, the sample entropy was used in [185] as a method of feature extraction to detect EEG patterns pertinent to epileptic seizures. The sample entropy was used as the only input feature for an extreme learning machine (ELM) classifier, which produced an average classification accuracy of 95.67%. Also, Naghsh-Nilchi et al. used a combination of EEG time-domain features and spectral features to form a more representative feature vector, which was then fed into an MLPNN for EEG classification [150]. The epilepsy recognition rates achieved by this method were 97.46% for sensitivity, 98.74% for specificity, and 97.50% for classification accuracy. In [9], Acharya et al.
used the parameters of recurrence quantification analysis as representative features for EEG classification. These features were then fed into SVM and fuzzy classifiers, which achieved an average seizure detection accuracy of 94.70%. One year later, Acharya et al. integrated a set of EEG-based entropy features, namely approximate entropy, sample entropy, and phase entropy [6]. The effectiveness of these features was examined using seven different classifiers. Among all of them, the fuzzy Sugeno classifier (FSC) produced the highest classification accuracy of 98.10%. Also, in [8], Acharya et al. decomposed the EEG signals into wavelet coefficients using wavelet packet decomposition. Principal component analysis was then deployed to extract the eigenvalues from the produced wavelet coefficients. Using a Gaussian mixture model (GMM) classifier, an average classification accuracy of 99.00% was achieved. A comparable classification accuracy of 98.67% was achieved in [156] by using the error-correction output codes (ECOC) classification technique.

More recently, Chiang et al. proposed an energy-efficient EEG seizure detection system [48]. They applied the feature extraction process at the sensor side and sent only the extracted features to the server side, where the epilepsy detection was conducted. This method increased the battery lifetime by 14 times; however, it achieved a comparatively low classification accuracy of 95.61%. Moreover, the authors of [73] developed a piecewise quadratic (PQ) classifier that uses a set of time, frequency, and time-frequency features for epilepsy detection. This classifier together with the time-frequency domain features resulted in a detection rate of 98.70%. Likewise, Samiee et al. used the time-frequency representations of EEG signals in conjunction with the MLPNN classifier; an average classification accuracy of 99.10% was obtained [173].
Besides, independent component analysis (ICA) was used to find the independent features associated with seizure patterns [95]. The results demonstrated that ICA, along with the SVM classifier, can achieve a classification accuracy of 95.00%. Further, the authors of [29] developed a seizure detection scheme based on some statistical features and an LSSVM classifier. The proposed scheme showed an average accuracy of 97.19% with a short computation time of 0.065 seconds. In [11], an automated EEG classification method was developed based on the features extracted using CWT, higher order spectra (HOS), and textures together with an SVM classification model, which yielded an average classification accuracy of 96.00%.

5.3.3 Five-class EEG Classification Problem

The third and more challenging type of seizure detection problem is the five-class problem, which addresses the differentiation between the five classes of EEG data sets: A, B, C, D, and E. Achieving high detection accuracy for each set individually holds great potential for vital clinical applications. For example, the classification between sets C and D, which correspond to the same brain state (interictal), can help with seizure localization. This is because sets C and D are taken from the same patients but from two different brain regions. This motivation has triggered interest in the study of five-class problems and their impact on real clinical applications [82, 131, 147, 178, 203, 206].

In [82], the wavelet coefficients and Lyapunov exponents were computed as characteristic EEG features, and the multi-class PNN classifier was trained and tested on these features, achieving a high classification accuracy of 98.05%. Besides, the authors of [203] proposed an efficient seizure detection model based on the EEG features extracted by eigenvector methods in combination with a multi-class SVM. This resulted in a high detection accuracy of 99.30%.
A similar seizure detection approach was introduced in [206], and a comparable seizure detection accuracy of 99.20% was achieved.

In an attempt to lessen the computational complexity of seizure detection systems, the authors of [147] utilized the multi-class SVM classifier along with the features of Lyapunov exponents and wavelet coefficients. The resultant classification accuracy, however, dropped to 96.00%. As well, the authors of [178] developed a seizure detection system based on wavelet entropy as representative EEG features and SVM as a classification model. This system boosted seizure detection accuracy to 99.97%. Finally, in [131], the feature extraction method of optimum allocation was first adopted to select the most discriminative EEG features. Then, the MSVM was used to recognize the different categories of EEG classes, and an average classification accuracy of 99.99% was obtained.

5.4 Methodology

This section demonstrates how the L1-penalized robust regression model is applied to the EEG spectrum to extract the most discriminative EEG features pertinent to epileptic seizures. It also shows how this regression problem can be solved using a computationally simple approach named block coordinate descent (BCD). Finally, it describes how the extracted features are used for training and testing a random forest (RF) classifier.

5.4.1 Robust EEG Feature Learning

In this section, we describe how the feature extraction problem can be formulated as a robust regression model and how solving this problem can help identify the EEG features that best characterize seizure patterns.

Problem Formulation

In a previous study, we introduced the use of the least absolute shrinkage and selection operator (LASSO) technique as a feature extraction method for reliable seizure detection [108]. LASSO, however, uses the “least squares” to form the loss function, which performs well under ideal conditions and also in the presence of low noise levels [70].
But when the EEG data is contaminated by severe noise levels, LASSO fails to maintain its high performance in EEG feature extraction. In this work, we present a more robust feature extraction method that can address noisy EEG data corrupted with severe levels of physical artifacts and white noise. The proposed scheme is built on a robust regression model that adopts the absolute value to form the loss function [18].

In this study, we addressed the two-class, three-class, and five-class EEG classification problems. As described in Section 5.2, the EEG dataset comprises five different EEG sets corresponding to five different classes. The total number of EEG signals (observations) included in each EEG set is 100. Hence, the total number of observations for any of the above-mentioned classification problems is defined as $n = 100 \times k$, where $k$ is the number of classes. The spectra of the EEG signals are first obtained by the fast Fourier transform (FFT). Given the EEG signal length $L = 4096$, the FFT size is set to $L$ rounded up to the next power of 2. In our case, the spectrum of each EEG signal is found to have $d = 2048$ frequency components, i.e., half the FFT size, which is consistent with keeping one side of the symmetric spectrum of a real-valued signal.

Next, we define a matrix $X$ ($n \times d$) that includes the spectra of all EEG signals $[x_1; x_2; \ldots; x_n]$. The number of rows in $X$ denotes the total number of EEG signals ($n$) and the number of columns denotes the number of frequency components ($d$) in each EEG spectrum $x_i$. We also define the column vector $y$ as an $n$-length vector including the class labels $y_i$ that correspond to the $x_i$ in $X$. The labels 1, 2, 3, 4, and 5 are assigned to the EEG sets A, B, C, D, and E, respectively. The mathematical description of $X$ and $y$ can be shown as:

$$
X =
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix}
x_{11} & x_{12} & x_{13} & \cdots & x_{1d} \\
x_{21} & x_{22} & x_{23} & \cdots & x_{2d} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & x_{n3} & \cdots & x_{nd}
\end{pmatrix},
\qquad
y =
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
$$

The feature extraction problem can then be formulated as finding the sparse vector $w^*$ that achieves the minimum absolute error $\|Xw - y\|_1$.
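The construction of $X$ and $y$ described above can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: the function names (`next_pow2`, `fft`, `spectrum_row`, `build_dataset`) are hypothetical, the toy signals are short sinusoids rather than EEG, and keeping only the first half of the magnitude spectrum is an assumption made so that a 4096-point FFT yields 2048 components.

```python
import cmath
import math


def next_pow2(length):
    """Smallest power of two that is >= length."""
    return 1 << (length - 1).bit_length()


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return list(samples)
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spectrum_row(signal):
    """Zero-pad to the next power of two and keep one half of |FFT| (assumption)."""
    size = next_pow2(len(signal))
    padded = list(signal) + [0.0] * (size - len(signal))
    return [abs(c) for c in fft(padded)[: size // 2]]


def build_dataset(signals_by_class):
    """Stack spectra row-wise into X and assign class labels 1..k in y."""
    X, y = [], []
    for label, signals in enumerate(signals_by_class, start=1):
        for signal in signals:
            X.append(spectrum_row(signal))
            y.append(label)
    return X, y


# Toy data (not EEG): two classes, two 8-sample sinusoids per class.
toy = [
    [[math.sin(2 * math.pi * 1 * t / 8) for t in range(8)]] * 2,
    [[math.sin(2 * math.pi * 3 * t / 8) for t in range(8)]] * 2,
]
X, y = build_dataset(toy)
```

With 100 signals per set, the same loop would give $n = 100 \times k$ rows, one label per row.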
The solution can be found by solving the following L1-penalized robust regression (minimization) problem:

$$
w^* = \operatorname*{argmin}_{w \in \mathbb{R}^d} \; \|Xw - y\|_1 + \lambda \|w\|_1 \tag{5.1}
$$

where $\lambda$ is the L1 regularization parameter.

Since $\|a\|_1 = \sum_{i=1}^{n} |a_i|$, the optimization problem in equation (5.1) is expressed as:

$$
w^* = \operatorname*{argmin}_{w \in \mathbb{R}^d} \; \sum_{i=1}^{n} |w^T x_i - y_i| + \lambda \sum_{j=1}^{d} |w_j| \tag{5.2}
$$

This regression model achieves significant robustness but is hard to work with in practice. The main reason is that the absolute value function is non-differentiable: its gradient is not defined at zero. Thus, we substitute the absolute value by a smooth approximation named the “Huber loss” [98], which is continuously differentiable and can be minimized using any of the gradient descent methods. Hence, the optimization problem in equation (5.2) can be described as:

$$
w^* \approx \operatorname*{argmin}_{w \in \mathbb{R}^d} \; \sum_{i=1}^{n} h_\mu\left(w^T x_i - y_i\right) + \lambda \sum_{j=1}^{d} |w_j| \tag{5.3}
$$

where $h_\mu$ is the Huber loss function, which is defined as [220]:

$$
h_\mu(r_i) =
\begin{cases}
\frac{1}{2} r_i^2 & \text{for } |r_i| \le \mu \\
\mu |r_i| - \frac{\mu^2}{2} & \text{for } |r_i| > \mu
\end{cases}
\tag{5.4}
$$

This function is quadratic for residuals $r$ whose magnitude is at most the threshold $\mu$, and linear for residuals whose magnitude is larger than $\mu$. In practice, $\mu = 1.345$ achieves results that are within some small tolerance of the results produced by the absolute value in equation (5.2) [98].

Given the Huber loss approximation in equation (5.4), the objective in equation (5.3) can be expanded to:

$$
f(w) = \frac{1}{2} \sum_{i=1}^{n} \left(w^T x_i - y_i\right)^2 \mathbb{1}\left[|w^T x_i - y_i| \le \mu\right]
+ \mu \sum_{i=1}^{n} \left(|w^T x_i - y_i| - \frac{1}{2}\mu\right) \mathbb{1}\left[|w^T x_i - y_i| > \mu\right]
+ \lambda \sum_{j=1}^{d} |w_j| \tag{5.5}
$$

where $\mathbb{1}[\cdot]$ equals 1 when the condition holds and 0 otherwise. Unlike the absolute value loss, the Huber part of this approximation is amenable to standard unconstrained optimization methods since it is differentiable everywhere. The gradient of this smooth part, denoted $\nabla f(w)$, is:

$$
\nabla f(w) = \sum_{i=1}^{n} x_i \left(w^T x_i - y_i\right) \mathbb{1}\left[|w^T x_i - y_i| \le \mu\right]
+ \mu \sum_{i=1}^{n} x_i \cdot \operatorname{sgn}\left(w^T x_i - y_i\right) \mathbb{1}\left[|w^T x_i - y_i| > \mu\right] \tag{5.6}
$$

where sgn is the signum function that extracts the sign of a real number.

Problem Solution

As shown in the optimization problem in (5.5), L1-regularization produces sparsity in the weight vector $w$, which can be used to prune irrelevant features in $X$.
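The Huber loss of equation (5.4) and the gradient of the Huber part of the objective can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the data are plain Python lists, and the gradient is written directly from the chain rule on equation (5.4), so the quadratic branch contributes the residual itself and the linear branch contributes $\pm\mu$.

```python
def huber(r, mu=1.345):
    """Huber loss of Eq. (5.4) for a single residual r."""
    if abs(r) <= mu:
        return 0.5 * r * r
    return mu * abs(r) - 0.5 * mu * mu


def huber_grad(r, mu=1.345):
    """Derivative of the Huber loss: r inside [-mu, mu], clipped to +/-mu outside."""
    return max(-mu, min(mu, r))


def residuals(w, X, y):
    """Residuals w^T x_i - y_i for every row x_i of X."""
    return [sum(wj * xj for wj, xj in zip(w, x)) - yi for x, yi in zip(X, y)]


def smooth_objective(w, X, y, mu=1.345):
    """Sum of Huber losses over all residuals (the L1 penalty is excluded)."""
    return sum(huber(r, mu) for r in residuals(w, X, y))


def smooth_gradient(w, X, y, mu=1.345):
    """Gradient of smooth_objective with respect to w."""
    grad = [0.0] * len(w)
    for x, r in zip(X, residuals(w, X, y)):
        g = huber_grad(r, mu)
        for j, xj in enumerate(x):
            grad[j] += g * xj
    return grad
```

Because the two branches of `huber` agree in value and slope at $|r| = \mu$, a gradient-based solver sees no jump when a residual crosses the threshold.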
One way to solve the problem in (5.5) is with proximal-gradient methods [38], which are not very effective for small values of $\lambda$ (many iterations are needed to reach an accurate solution). Details of the convergence rate of proximal gradient descent are presented in Appendix A.1. To improve the performance, another strategy is to notice that our optimization problem in (5.5) can take the form:

$$
\operatorname*{argmin}_{w \in \mathbb{R}^d} \; f(Xw - y) + \sum_{j=1}^{d} g_j(w_j) \tag{5.7}
$$

for a smooth function $f$ and a set of non-smooth functions $g_j$. This allows us to apply the block coordinate descent (BCD) method [69], which uses a random selection of the coordinate to update. The full description of the BCD method is given in Algorithm 2.

Algorithm 2: Block Coordinate Descent [152].
 1  Input: $X$, $y$
 2  Output: Sparse vector $w^*$
 3  Initialization: $w^0 \leftarrow 0$; $k \leftarrow 0$; ε = e−7;
 4  Initialization: Lipschitz constant: $L = \sum X^2$;
 5  repeat
 6      Pick a random coordinate $j_k \in \{1, 2, \ldots, d\}$;
 7      Set $\alpha_k \leftarrow 1/L(j_k)$;
 8      Compute $\nabla_{j_k} f(w^k)$ using Eqn (5.6)
 9      Design a vector $e_{j_k}$ of zeros with ones at the locations of block $j$
10      Set $w^{k+1} \leftarrow w^k - \alpha_k \left(\nabla_{j_k} f(w^k) \odot e_{j_k}\right)$;
11      $k \leftarrow k + 1$;
12  until $\|Xw - y\|_1 \le \epsilon$

BCD is a computationally fast algorithm with an iteration cost of $O(n)$, which is $d$ times cheaper than the proximal-gradient iteration cost of $O(nd)$. The convergence rate of the BCD method is presented in detail in Appendix A.2. The sparse vector $w$ produced by the BCD algorithm is of great benefit for effective EEG feature extraction and selection.
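A randomized coordinate update of this kind can be sketched as below. The sketch is not a transcription of Algorithm 2: it adds a soft-thresholding (proximal) step to handle the L1 penalty, keeps a running residual so that each update costs $O(n)$, and stops after a fixed iteration budget. Those three choices, together with all function and parameter names, are assumptions made for illustration.

```python
import random


def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def huber_grad(r, mu):
    """Derivative of the Huber loss: r inside [-mu, mu], clipped outside."""
    return max(-mu, min(mu, r))


def coordinate_descent(X, y, lam, mu=1.345, iters=5000, seed=0):
    """Randomized proximal coordinate descent for Huber loss + lam * ||w||_1."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(d)]
    # Per-coordinate Lipschitz constants: column-wise sums of squares.
    lipschitz = [sum(v * v for v in col) for col in cols]
    w = [0.0] * d
    resid = [-yi for yi in y]  # Xw - y at w = 0
    for _ in range(iters):
        j = rng.randrange(d)
        if lipschitz[j] == 0.0:
            continue  # an all-zero feature can never become active
        grad_j = sum(cols[j][i] * huber_grad(resid[i], mu) for i in range(n))
        new_wj = soft_threshold(w[j] - grad_j / lipschitz[j], lam / lipschitz[j])
        delta = new_wj - w[j]
        if delta != 0.0:
            for i in range(n):  # O(n) residual update
                resid[i] += delta * cols[j][i]
            w[j] = new_wj
    return w


# Toy problem: the second feature explains almost nothing, so the L1 term
# shrinks its weight to exactly zero while the first stays close to 0.9.
w = coordinate_descent([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.05], lam=0.1)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
```

The indices in `selected` play the role of the retained frequency components: only the columns of $X$ with non-zero weights would be passed on to the classifier.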
It successfully excludes the irrelevant features (EEG frequency components corresponding to the zero elements of $w$) and preserves the features that have a crucial role in the EEG classification process (EEG frequency components corresponding to the non-zero elements of $w$).

5.4.2 EEG Classification

In this study, we tested the performance of several classifiers (e.g., logistic regression, support vector machine, Naive Bayes, K-means, K-nearest neighbor, and gradient boosting algorithms) and found that the random forest classifier achieves the best performance in terms of detection rate and computational cost. The rationale behind the superior performance of RF lies in leveraging an ensemble of trees and letting them vote for the most popular class [39]. Combining a diverse set of learners (individual models) can help improve the stability and predictive power of the model, and thus produce better classification results.

As explained in Section 4.3.2, the RF classifier integrates a set of independent decision tree classifiers and makes the final classification decision from the majority vote of all trees [39]. In this study, we used an RF classifier with 10 trees. The hold-out mode was first used for analyzing the RF detection performance, where the dataset was divided into a training set and a testing set. The leave-one-out and k-fold cross-validation strategies were also used for training and testing the classifier.

5.5 Results and Discussion

To evaluate the effectiveness of the proposed robust regression-based seizure detection approach, we compare its performance to those of the state-of-the-art detectors that use the same benchmark dataset. The detection performance was evaluated using the standard metrics of sensitivity (Sens), specificity (Spec), and classification accuracy (Acc).

5.5.1 Seizure Detection under Ideal Conditions

The proposed method is first examined under ideal conditions, where the EEG recordings are assumed to be free of noise.
The clean EEG signals are first converted into the frequency domain with the specific goal of extracting the EEG frequency components that best characterize normal and seizure patterns.

Two-class Classification Results: A-E

The first kind of two-class problem is to discriminate between the normal and seizure EEG epochs, which correspond to healthy subjects and epileptic patients with active seizures, respectively. The performance metrics of the proposed and relevant seizure detection methods are summarized in Table 5.1. To have a fair comparison, we examined the proposed method using different training and testing modes. For example, compared to Polat's method [167] that used 10-fold cross-validation, our method achieves higher detection rates of 100% sensitivity, 99% specificity, and 99.50% classification accuracy. We also tested our approach using the hold-out mode with different training/testing percentages. For instance, we compare the detection performance of the proposed method with Zhou's method [221] using the same hold-out percentage of 95.00-05.00%. Our approach attained 100% sensitivity, 100% specificity, and 100% classification accuracy, while those of [221] were 96.25%, 96.70%, and 96.67%, respectively. Further, using the hold-out mode of 33.33-66.67%, our method outperforms the method of Kumar [128], which produced a sensitivity of 98%, a specificity of 96%, and an average classification accuracy of 97.50%, while our method yielded a sensitivity of 100%, a specificity of 100%, and an average classification accuracy of 100%. Moreover, amongst the hold-out results in Table 5.1, the proposed approach achieves the best classification accuracy of 100%, which is 2.13 percentage points above the highest hold-out accuracy reported in the literature [17].

Further, after carefully inspecting Table 5.1, we found that the sensitivity values are quite low for most of the two-class seizure detectors reported in the literature.
The highest sensitivity of 99.48% was achieved by Bugeja et al. using the multilevel wavelet transform as a feature extraction method and the extreme learning machine as a classification model [41]. Nevertheless, our seizure detection approach achieved a higher sensitivity of 100%. Moreover, our approach produced a seizure specificity of 100%, which is comparable to those of [120] and [186], and superior to those of the other baseline methods. Our approach, however, can work on the raw EEG data and does not require any data pre-processing, unlike those of [120] and [186].

Table 5.1: Seizure detection results of the proposed and state-of-the-art methods: Two-class problem (A-E)

| Method | Classifier | Training/Testing | Sens (%) | Spec (%) | Acc (%) |
|---|---|---|---|---|---|
| Kumar et al. (2014) [128] | SVM | Hold-out (33.33-66.67%) | 98.00 | 96.00 | 97.50 |
| Proposed Method (2017) | RF | Hold-out (33.33-66.67%) | 100.0 | 100.0 | 100.0 |
| Yuan et al. (2011) [217] | ELM | Hold-out (50.00-50.00%) | 92.50 | 96.00 | 96.50 |
| Amin et al. (2016) [17] | RUSBoost | Hold-out (50.00-50.00%) | – | – | 97.87 |
| Proposed Method (2017) | RF | Hold-out (50.00-50.00%) | 100.0 | 100.0 | 100.0 |
| Subasi et al. (2006) [189] | DFNN | Hold-out (60.00-40.00%) | 93.10 | 92.80 | 93.00 |
| Nicolaou et al. (2012) [154] | SVM | Hold-out (60.00-40.00%) | 94.38 | 93.23 | 93.80 |
| Proposed Method (2017) | RF | Hold-out (60.00-40.00%) | 100.0 | 100.0 | 100.0 |
| Subasi et al. (2007) [190] | ME | Hold-out (62.50-37.50%) | 95.00 | 94.00 | 94.50 |
| Chandaka et al. (2009) [44] | SVM | Hold-out (62.50-37.50%) | 92.00 | 100.0 | 95.96 |
| Proposed Method (2017) | RF | Hold-out (62.50-37.50%) | 100.0 | 100.0 | 100.0 |
| Khan et al. (2012) [120] | LDA | Hold-out (80.00-20.00%) | 83.60 | 100.0 | 91.80 |
| Proposed Method (2017) | RF | Hold-out (80.00-20.00%) | 100.0 | 100.0 | 100.0 |
| Zhou et al. (2013) [221] | BLDA | Hold-out (95.00-05.00%) | 96.25 | 96.70 | 96.67 |
| Proposed Method (2017) | RF | Hold-out (95.00-05.00%) | 100.0 | 100.0 | 100.0 |
| Song et al. (2016) [186] | SVM | – | 94.50 | 100.0 | 97.25 |
| Aarabi et al. (2006) [1] | BPNN | Leave-one-out CV | 91.00 | 95.00 | 93.00 |
| Bugeja et al. (2016) [41] | ELM | Leave-one-out CV | 99.48 | 77.16 | – |
| Proposed Method (2017) | RF | Leave-one-out CV | 99.67 | 99.67 | 99.67 |
| Polat et al. (2007) [167] | DT | 10-fold cross-validation | 98.87 | 98.50 | 98.68 |
| Proposed Method (2017) | RF | 10-fold cross-validation | 100.0 | 99.00 | 99.50 |

Two-class Classification Results: ABCD-E

In the second evaluation, we address the imbalanced two-class classification problem of ABCD-E. The first class includes all non-seizure activities (i.e., sets A, B, C, and D) and the second class includes the seizure activities (i.e., set E). Given that each EEG set includes 100 signals, we have a two-class (binary) classification problem with 500 instances (signals). A total of 400 instances are labeled as Class-1 (non-seizure), and the remaining 100 instances are labeled as Class-2 (seizure). This is an imbalanced dataset, and the ratio of Class-1 to Class-2 instances is 400:100, or more concisely 4:1. In this situation, the predictive models developed using conventional machine learning algorithms could be biased and inaccurate. Our approach, instead, can efficiently address this kind of classification problem and outperforms the methods reported in the literature.

We adopt two strategies that form integral parts of the proposed detection system's structure. The first tactic is to penalize the proposed feature extraction method. Penalized detection imposes an additional cost on the model for making classification mistakes on the minority class during training. These penalties can bias the model to pay more attention to the minority class. In our problem, we tried a variety of penalty schemes and found that the L1-penalty works best for the balanced and imbalanced classification problems under study. Equation (5.1) shows the proposed feature extraction method penalized with the ℓ1-norm. The second tactic is to use any of the decision tree algorithms for EEG classification. Decision trees often perform well on imbalanced datasets.
The splitting rules that look at the class variable used in the creation of the trees can force both classes to be addressed. As mentioned in Section 5.4.2, we adopted Random Forest, one of the most efficient tree-based algorithms, to help improve the detection performance on imbalanced datasets as well as balanced datasets.

Table 5.2: Seizure detection results of the proposed and state-of-the-art methods: Two-class problem (ABCD-E)

| Method | Classifier | Training/Testing | Sens (%) | Spec (%) | Acc (%) |
|---|---|---|---|---|---|
| Guo et al. (2010) [84] | ANN | Hold-out (50.00-50.00%) | 95.50 | 99.00 | 98.27 |
| Proposed Method (2017) | RF | Hold-out (50.00-50.00%) | 100.0 | 100.0 | 100.0 |
| Peker et al. (2016) [165] | CVANN | Hold-out (60.00-40.00%) | 100.0 | 98.01 | 99.33 |
| Proposed Method (2017) | RF | Hold-out (60.00-40.00%) | 100.0 | 100.0 | 100.0 |
| Rivero et al. (2011) [169] | KNN | Variable | – | – | 98.40 |
| Kaleem et al. (2013) [114] | KNN | 10-fold cross-validation | – | – | 98.20 |
| Fu et al. (2015) [71] | SVM | 10-fold cross-validation | – | – | 98.80 |
| Jaiswal et al. (2017) [110] | ANN | 10-fold cross-validation | 98.30 | 98.82 | 98.72 |
| Wang et al. (2017) [210] | SVM | 10-fold cross-validation | 97.98 | 99.56 | 99.25 |
| Mursalin et al. (2017) [146] | RF | 10-fold cross-validation | 97.40 | 97.50 | 97.40 |
| Proposed Method (2017) | RF | 10-fold cross-validation | 99.60 | 99.60 | 99.60 |

In addition to the three performance metrics of sensitivity, specificity, and classification accuracy, we also assess the precision and F-measure metrics, which can provide a more truthful picture of performance when working with imbalanced classes. The performance metrics used in the proposed and baseline methods are reported in Table 5.2, and the detailed detection results obtained by the proposed method are recorded in Tables 5.3 and 5.4. For the sake of fair comparison, we tested our seizure detection approach using different training and testing scenarios. For the hold-out scenario of 50-50%, our results improve on the 95.50% sensitivity, 99% specificity, and 98.27% classification accuracy presented in [84], achieving 100% for each of the aforesaid performance metrics.
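To make the relationship between these metrics concrete, the short sketch below derives them from the four confusion-matrix counts of a binary problem. The function name and the counts are hypothetical: the counts simply assume a 4:1 split of 400 non-seizure and 100 seizure signals in which 98 seizures are detected and no false alarms occur, and they are not taken from the thesis.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard metrics for the class treated as positive (nonzero denominators assumed)."""
    sensitivity = tp / (tp + fn)          # also called recall or TP rate
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts: seizure is the positive (minority) class.
seizure = binary_metrics(tp=98, fn=2, fp=0, tn=400)

# Same confusion matrix viewed with non-seizure as the positive class.
non_seizure = binary_metrics(tp=400, fn=0, fp=2, tn=98)
```

Under these assumed counts the overall accuracy is 498/500, while the seizure-class sensitivity is 98/100, which shows how accuracy alone can hide errors on the minority class.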
Although Peker et al. boosted the seizure detection rates using artificial neural networks, achieving 100% sensitivity, 98.01% specificity, and 99.33% classification accuracy [165], our method produced a comparable sensitivity of 100% and higher specificity and classification accuracy of 100% each. Moreover, in comparison with the methodology in [210] that used 10-fold cross-validation, our method attained a comparable specificity of 99.60% and superior sensitivity and classification accuracy of 99.60% each.

Furthermore, Tables 5.3 and 5.4 show the detailed accuracy of the proposed method for the training-testing scenarios of hold-out (50-50%) and 10-fold cross-validation, respectively. The performance metrics of true positive (TP) rate, false positive (FP) rate, precision, recall, F-measure, and ROC area are computed for the seizure and non-seizure classes. Table 5.3 illustrates that the proposed method can achieve seizure precision and F-measure equal to 1.0 each, which means that all the EEG instances belonging to the seizure minority class are correctly classified. Table 5.4 also shows that, with 10-fold cross-validation, the proposed method maintains a high seizure TP rate of 0.980, precision of 1.0, F-measure of 0.990, and ROC area of 0.990.

Table 5.3: Detailed seizure detection accuracy of the proposed method for the imbalanced classification problem ABCD-E: Hold-out (50.00-50.00%)

| Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area |
|---|---|---|---|---|---|---|
| Seizure | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Non-Seizure | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Weighted Avg. | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 |

Table 5.4: Detailed seizure detection accuracy of the proposed method for the imbalanced classification problem ABCD-E: 10-fold cross-validation

| Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area |
|---|---|---|---|---|---|---|
| Seizure | 0.980 | 0.000 | 1.000 | 0.980 | 0.990 | 0.990 |
| Non-Seizure | 1.000 | 0.020 | 0.995 | 1.000 | 0.998 | 0.990 |
| Weighted Avg. | 0.996 | 0.016 | 0.996 | 0.996 | 0.996 | 0.990 |

Three-class Classification Results

We also address the potential of the proposed approach to distinguish between three different classes of EEG signals: normal, interictal, and ictal EEGs. The classification performance of the proposed seizure detection method is compared to those of the state-of-the-art methods that used the same benchmark epileptic EEG dataset.

Table 5.5 comprises the performance metrics obtained by the proposed and reference methods. The proposed method is examined using different training and testing scenarios, including hold-out with different training/testing percentages and cross-validation with a different number of folds. First, for the common scenario of hold-out (50.00-50.00%), the results obtained by our method improve on the seizure detection accuracies presented in [83], [202], [204], [205], and [173], achieving the top sensitivity of 99.67%, specificity of 99.67%, and classification accuracy of 99.67%. Moreover, for another scenario of hold-out (70.00-30.00%) and compared to the seizure detection results obtained by Niknazar et al. [156], our method attains the superior performance of 100% for each of the sensitivity, specificity, and classification accuracy. Similar results are also obtained by our method with the training/testing mode of hold-out (80.00-20.00%). Even for leave-one-out cross-validation (CV), our method outperforms both [48] and [95], achieving the highest detection rate of 99.67% sensitivity, 99.67% specificity, and 99.67% classification accuracy. Also, for 3-fold cross-validation and compared to the seizure detection results presented in [6], our method produces a comparable seizure detection sensitivity, specificity, and classification accuracy of 99.33% each.
The results in [6], however, have been obtained with more sophisticated signal analysis techniques used with the computationally expensive fuzzy Sugeno classifier.

Finally, it is clearly shown that the proposed method outperforms all other baseline methods that use 10-fold cross-validation, achieving higher sensitivity, specificity, and classification accuracy. This is mainly attributed to the L1PRR, which extracts the most informative features for EEG classification. The detection results reported in Table 5.5 demonstrate the potential of L1PRR to efficiently learn the representative EEG features that best describe the behavior of normal, interictal, and ictal EEG activities. It is worth highlighting that the proposed approach yields a seizure sensitivity of 100%, which is superior to any of the baseline methods. Further, the proposed method produces a seizure specificity of 100%, which is similar to the recent results obtained by Acharya et al. [6], and is better than those of the other reference methods. More interestingly, amongst other methods, the proposed approach achieves an outstanding classification accuracy of 100%.

Table 5.5: Seizure detection results of the proposed and state-of-the-art methods: Three-class problem (A-C-E)

| Method | Classifier | Training/Testing | Sens (%) | Spec (%) | Acc (%) |
|---|---|---|---|---|---|
| Güler et al. (2005) [83] | RNN | Hold-out (50.00-50.00%) | 95.50 | 97.38 | 96.79 |
| Tzallas et al. (2007) [202] | ANN | Hold-out (50.00-50.00%) | 95.73 | 97.86 | 97.94 |
| Übeyli et al. (2008) [204] | ME | Hold-out (50.00-50.00%) | 92.75 | 94.00 | 93.17 |
| Übeyli et al. (2009) [205] | MLPNN | Hold-out (50.00-50.00%) | 96.00 | 94.00 | 94.83 |
| Samiee et al. (2015) [173] | MLPNN | Hold-out (50.00-50.00%) | 99.20 | 98.90 | 99.10 |
| Proposed Method (2017) | RF | Hold-out (50.00-50.00%) | 99.67 | 99.67 | 99.67 |
| Niknazar et al. (2013) [156] | ECOC | Hold-out (70.00-30.00%) | 98.55 | 99.33 | 98.67 |
| Proposed Method (2017) | RF | Hold-out (70.00-30.00%) | 100.0 | 100.0 | 100.0 |
| Dastidar et al. (2008) [75] | RBFNN | Hold-out (80.00-20.00%) | – | – | 96.60 |
| Proposed Method (2017) | RF | Hold-out (80.00-20.00%) | 100.0 | 100.0 | 100.0 |
| Chiang et al. (2014) [48] | SVM | Leave-one-out CV | 91.82 | 99.40 | 95.61 |
| Hosseini et al. (2016) [95] | SVM | Leave-one-out CV | 96.00 | 94.00 | 95.00 |
| Proposed Method (2017) | RF | Leave-one-out CV | 99.67 | 99.67 | 99.67 |
| Nilchi et al. (2010) [150] | MLPNN | 3-fold cross-validation | 97.46 | 98.74 | 97.50 |
| Acharya et al. (2012) [6] | FSC | 3-fold cross-validation | 99.40 | 100.0 | 98.10 |
| Proposed Method (2017) | RF | 3-fold cross-validation | 99.33 | 99.33 | 99.33 |
| Gajic et al. (2015) [73] | PQ | 5-fold cross-validation | 98.60 | 99.33 | 98.70 |
| Proposed Method (2017) | RF | 5-fold cross-validation | 99.67 | 99.67 | 99.67 |
| Song et al. (2010) [185] | ELM | 10-fold cross-validation | 97.26 | 98.77 | 95.67 |
| Acharya et al. (2011) [9] | Fuzzy | 10-fold cross-validation | 97.70 | 94.70 | 94.40 |
| Acharya et al. (2012) [8] | GMM | 10-fold cross-validation | 99.00 | 99.00 | 99.00 |
| Behara et al. (2016) [29] | LSSVM | 10-fold cross-validation | 96.96 | 99.66 | 97.19 |
| Proposed Method (2017) | RF | 10-fold cross-validation | 99.67 | 99.67 | 99.67 |

Five-class Classification Results

We also address the classification problem between the five different EEG sets A, B, C, D, and E. Accurately solving this problem is more challenging but advantageous for many vital applications. It concerns the discrimination between EEG activities belonging to the same data class (e.g., sets C and D, which are both interictal), aiming to support more beneficial clinical practices. For instance, the classification between the EEG sets C and D plays a crucial role in seizure localization, as they are taken from different brain regions. Only a few researchers have paid attention to the importance of the five-class classification problem. They, however, succeeded in achieving satisfactory seizure detection results, as shown in Table 5.6.

We compare the performance of the proposed approach to those of the state-of-the-art methods that have been developed in the last decade. The performance metrics of all methods are reported in Table 5.6.
Comparing our results with those in the literature, we find that Siuly et al. developed a multiclass seizure detection method that produces detection results comparable to those reported in our study, achieving 99.99% sensitivity, 99.99% specificity, and 99.99% classification accuracy [131]. However, their method applies three pre-processing techniques, which are computationally intensive and might hinder real-time applications. Our approach, on the other hand, relaxes the need for data pre-processing and works directly on the raw EEG data, achieving a comparable detection rate of 99.67% sensitivity, 99.67% specificity, and 99.67% classification accuracy. Moreover, comparing our method with those that use the hold-out (70.00-30.00%) mode, we find that the proposed method outperforms all others in terms of sensitivity, specificity, and classification accuracy, achieving 100.0% for each.

Table 5.6: Seizure detection results of the proposed and state-of-the-art methods: Five-class problem (A-B-C-D-E)

| Method | Classifier | Training/Testing | Sens (%) | Spec (%) | Acc (%) |
|---|---|---|---|---|---|
| Güler et al. (2007) [82] | PNN | Hold-out (50.00-50.00%) | 98.05 | 99.50 | 98.05 |
| Übeyli et al. (2009) [206] | SVM | Hold-out (50.00-50.00%) | 99.20 | 99.79 | 99.20 |
| Shen et al. (2013) [178] | SVM | Hold-out (50.00-50.00%) | 98.37 | 100.0 | 99.97 |
| Siuly et al. (2014) [131] | MSVM | Hold-out (50.00-50.00%) | 99.99 | 99.99 | 99.99 |
| Proposed Method (2017) | RF | Hold-out (50.00-50.00%) | 99.67 | 99.67 | 99.67 |
| Murugavel et al. (2011) [147] | MSVM | – | – | – | 96.00 |
| Übeyli et al. (2008) [203] | SVM | Hold-out (70.00-30.00%) | 99.30 | 99.82 | 99.30 |
| Proposed Method (2017) | RF | Hold-out (70.00-30.00%) | 100.0 | 100.0 | 100.0 |

5.5.2 Seizure Detection under Real-life Conditions

We further examine the robustness of the proposed seizure detection method against common EEG artifacts. In previous work, we developed a robust EEG feature learning method capable of operating on such noisy signals [108].
This method, however, assumed that the noise encountered during EEG acquisition had a Gaussian distribution, which is not always the case in practice. In this work, we introduce a practical seizure detection approach that can handle noisy EEG data corrupted with real physiologic artifacts (e.g., muscle activities and eye-blinking) as well as white noise. MATLAB was used to implement the artifact models described in Section 5.2.1 and to add the resulting artifacts to the EEG data.

Two-class Classification Results: A-E

We first investigate the ability of the proposed approach to recognize whether noise-corrupted EEG data correspond to a healthy person (set A) or to an epileptic patient with an active seizure (set E). As shown in Figure 5.3, our method is examined for different noise levels and types. The common EEG artifacts of muscle activities and eye-blinking, in addition to white noise, were considered, and their magnitudes were adjusted to produce noisy EEG signals of different SNRs. Figure 5.3 shows the seizure detection results obtained by our method against eye-blinking, muscle activities, and white noise over a wide range of SNRs (-20 dB to 20 dB).

Several interesting observations can be made here. First, the proposed method can effectively learn the most discriminative and robust EEG features associated with seizures, even when the EEG data are completely immersed in noise. For example, Figure 5.3 demonstrates the robustness of the proposed method in the presence of the inevitable physiologic artifacts as well as the ambient noise.
Interestingly, for noisy EEG with eye-blinking artifacts, the proposed scheme maintains high classification accuracies at all SNR levels, with the lowest classification accuracy of 92.00% obtained at SNR = -20 dB. The proposed method also maintains high seizure detection performance in the presence of muscle artifacts and white noise. As depicted in Figure 5.3, the lowest classification accuracies obtained at SNR = -20 dB in the presence of muscle artifacts and white noise are 91.80% and 90.25%, respectively. These are quite high seizure detection accuracies for severely corrupted EEG data.

[Figure 5.3: Classification accuracy against signal-to-noise ratio for the two-class problem A-E. Axes: SNR (dB), from -20 to 20, versus classification accuracy (%). Curves: eye blinks, muscle activities, white noise.]

Two-class Classification Results: ABCD-E

As for the two-class problem ABCD-E, the proposed approach was also examined on noisy data contaminated with eye-blinking, muscle artifacts, and white noise. Because this dataset has an imbalanced class distribution, a negligible decay in the proposed method's performance was observed. It is worth pointing out that, for such an imbalanced classification problem, the proposed method maintains high classification accuracies even at extremely low SNRs. Figure 5.4 illustrates the detection results obtained by our method in the presence of each noise type. The lowest classification accuracy of 86.12% was obtained when the EEG data were entirely immersed in white noise (SNR = -20 dB). For noisy EEG data with SNR > 0 dB, the proposed method attains classification accuracies higher than 97%.
Better performance was achieved in the presence of eye-blinking and muscle artifacts.

[Figure 5.4: Classification accuracy against signal-to-noise ratio for the two-class problem ABCD-E. Axes: SNR (dB), from -20 to 20, versus classification accuracy (%). Curves: eye blinks, muscle activities, white noise.]

Three-class Classification Results

Figure 5.5 shows the performance of the proposed method against the three common EEG contaminants at different SNR levels. It can be observed that the proposed method maintains its superior performance when applied to noise-corrupted data of low SNRs. The main reason is that robust regression can efficiently learn the most discriminative and robust EEG features associated with seizures, even in abnormal conditions. The performance of the proposed method starts to decline when applied to noisy EEG data with SNRs below 0 dB, particularly when the data are polluted with white noise. Better performance is achieved for the case of muscle artifacts, since muscle activities interfere with the EEG signals only within the limited frequency band of 20-60 Hz. The best performance is achieved for the case of eye-blinking artifacts. Figure 5.5 verifies the robustness of the proposed approach against eye-blinking artifacts, even at extremely low SNRs. It can accurately identify the seizure activities submerged in noise, with classification accuracies higher than 90.50%.

Five-class Classification Results

We also study the potential of the proposed seizure detection approach to handle the five-class classification problem under abnormal conditions, that is, when the EEG signals are mixed with different levels of eye-blinking, muscle activities, and white noise. Figure 5.6 depicts the detection performance of the proposed method at various SNRs. Notably, for this challenging classification problem, the proposed approach sustains high classification accuracies for noisy EEG corrupted with eye-blinking artifacts.
Inferior detection accuracies were obtained for the case of muscle artifacts, where the classification accuracy decreased to 69.60% at an SNR of -20 dB. The main reason is that muscle activities occupy a wide range of the EEG frequency spectrum, producing severe distortion in the EEG waveform shapes. Moreover, the performance of the proposed method deteriorates further in the case of severe white noise, where classification accuracies as low as 50% were obtained. In better conditions (SNR > 0 dB), the proposed approach achieved classification accuracies higher than 85.20%.

[Figure 5.5: Classification accuracy against signal-to-noise ratio for the three-class problem A-C-E. Axes: SNR (dB), from -20 to 20, versus classification accuracy (%). Curves: eye blinks, muscle activities, white noise.]

5.6 Summary and Conclusion

This chapter presents a robust method for the automatic detection of epileptic seizures using EEG signals. This method relies on a robust feature extraction scheme that we developed based on L1-penalized robust regression (L1PRR). L1PRR can adaptively learn the most informative EEG features pertinent to seizures under ideal and real-life conditions. The performance of the proposed method is first examined under ideal conditions, i.e., when the EEG data are free of noise. Experimental results on a popular clinical dataset demonstrate the superiority of our proposed scheme over existing seizure detection methods. For ideal conditions, this scheme achieves the highest seizure detection rates: 100% sensitivity, 100% specificity, and 100% classification accuracy for each of the two-class, three-class, and five-class EEG classification problems. We then examined whether the proposed method is applicable in practice. We tested the seizure detection performance of our method on noisy EEG data corrupted with white noise as well as muscle activity and eye-blinking artifacts.
For each contaminant type, the overall classification accuracy is evaluated against a wide range of signal-to-noise ratios. The experimental results verify the robustness of the proposed method in the presence of the inescapable artifacts and noise described above, as it maintains a seizure detection performance higher than 90.00% even at high noise levels.

[Figure 5.6: Classification accuracy against signal-to-noise ratio for the five-class problem A-B-C-D-E. Axes: SNR (dB), from -20 to 20, versus classification accuracy (%). Curves: eye blinks, muscle activities, white noise.]

The advantages of the seizure detection method presented in this chapter can be summarized as follows: (1) it achieves superior detection results compared to the state-of-the-art methods; (2) it maintains high seizure detection accuracy in the presence of white noise and the EEG artifacts arising from muscle activities and eye movement (to the best of our knowledge, no prior work addresses seizure detection under these conditions); and (3) it is computationally simple, as it uses the block coordinate descent (BCD) algorithm to identify the most representative and distinguishable EEG features in a time-efficient fashion. As depicted in Appendix A.2, BCD converges in only a few iterations. Nevertheless, the performance of our seizure detection method deteriorates when applied to EEG data corrupted with severe noise levels, particularly when the SNR of the noisy data is below 0 dB.
In the following chapter, we investigate the performance of deep recurrent neural networks (RNNs) on such noisy data and show that RNNs have great potential to robustly detect seizure patterns in severely contaminated EEG data.

Chapter 6
Optimized Deep Neural Network Architecture for Robust Detection of Epileptic Seizures using EEG Signals

In the previous chapter, a dependable seizure detection method based on L1-penalized robust regression was proposed and assessed under ideal and abnormal conditions. This chapter introduces a deep recurrent neural network architecture that further improves the robustness of epileptic seizure detection systems, especially in severely noisy environments. Automatic detection of epileptic seizures based on deep learning methods received much attention over the last year. However, the potential of these methods for robust seizure detection has not yet been fully exploited in terms of the optimal design of the neural network architecture and the detection power of the time-series brain data.

In this work, a deep neural network architecture is introduced to learn the temporal dependencies in electroencephalogram (EEG) data for robust detection of epileptic seizures. A long short-term memory (LSTM) network is first used to learn the high-level representations of different EEG patterns. Then, a fully connected (FC) layer is adopted to extract the most robust EEG features relevant to epileptic seizures. Finally, these features are supplied to a softmax function for training and classification. The results on a well-known benchmark clinical dataset demonstrate the superiority of the proposed approach over existing state-of-the-art methods. Furthermore, our approach is shown to be robust under noisy and real-life conditions. Compared to current methods that are quite sensitive to noise, the proposed method maintains its high detection performance in the presence of common EEG artifacts (muscle activities and eye-blinking) as well as white noise.
The work presented in this chapter has been published in part in the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) [106] and in the Journal of Clinical Neurophysiology [107].

6.1 Introduction

As mentioned earlier, epilepsy is a chronic neurological disorder of the brain that affects people of all ages, and is typically the second most common neurological disease after migraine [170]. The characterizing feature of epilepsy is repetitive seizures that strike abruptly. Symptoms may range from a brief suspension of awareness to violent convulsions and, occasionally, loss of consciousness [10]. The electroencephalogram (EEG) is the prime signal that has been widely used for the diagnosis of epilepsy. As the visual inspection of EEG is labor-intensive and time-consuming, research in the EEG-based automatic detection of epileptic seizures has been very active.

Most feature extraction techniques that have been developed for automatic seizure detection systems use hand-crafted features extracted in some domain. However, these domain-based methods encounter three main challenges. First, they are sensitive (not robust enough) to acute variations in seizure patterns. This is because EEG data are non-stationary and their statistical features change across different subjects and over time for the same subject. Second, EEG data acquisition systems are very susceptible to a diverse range of artifacts such as muscle activities, eye-blinks, and white environmental noise. All these contaminants can alter the genuine EEG features and hence seriously impact the performance of seizure detection systems [5]. Finally, most existing seizure detection systems have been trained on small-scale EEG datasets collected from a few specific patients, making them less practical in clinical applications.

As mentioned in Chapter 2, deep learning techniques have been developed to address these limitations. Thodoroff et al.
took the lead in deploying deep neural networks for automatic detection of epileptic seizures [197]. They used a combination of convolutional and recurrent neural networks to extract the spatial and temporal EEG representations of seizures. This model achieved an average sensitivity of 85% and a false positive rate of 0.8/hour. Moreover, Lin et al. proposed a deep learning framework based on a stacked sparse autoencoder (SSAE) to extract the high-level representations of EEG signals. This framework achieved an average classification accuracy of 96% [132]. More recently, Acharya et al. used a convolutional neural network (CNN) for analyzing EEG signals and detecting epileptic seizures [7]. They implemented a 13-layer deep CNN model that attained an average accuracy, sensitivity, and specificity of 88.67%, 90%, and 95%, respectively. In [218], Yuan and his team proposed to feed the time-frequency representations of EEG signals as an input to the SSAE. This strategy helped extract both the spectral and temporal information pertinent to epileptic seizures, and thus achieved an average classification accuracy of 93.82%.

Despite the encouraging seizure detection results obtained using such deep learning methods, several improvements can still be made. First, most of these models are built on CNNs, which look for the same patterns across EEG signals collected from different patients. The signature of the epileptic seizure, however, varies across different patients and over time for the same patient. The second issue is the assumption that very deep and complex neural network structures would be powerful enough to capture all the useful information necessary for detecting epileptic seizures. However, increasing the size of the network (especially with a limited amount of data) introduces more parameters that need to be learned, and hence increases the chances of overfitting.
Third and most important, none of the existing deep learning methods addresses the detection of epileptic seizures in real-life situations, that is, when the EEG measurements are contaminated by artifacts (e.g., muscle activities and eye movement) and ambient noise. These may alter the genuine EEG features and negatively affect the performance of seizure detection systems.

To address the challenges mentioned above, we propose a robust deep neural network architecture that uses a recurrent neural network (RNN) with long short-term memory (LSTM) cells to effectively exploit the temporal dependencies in time-series EEG signals. A fully connected layer is used on top of the LSTM layer to learn the most robust and discriminative EEG attributes associated with epileptic seizures. The learned features are then sent into a softmax layer for EEG training and classification. Results on a well-known benchmark EEG dataset demonstrate the superiority and robustness of the proposed deep neural network architecture for seizure detection. Comparisons with methods based on classical machine learning and deep learning indicate that the proposed method achieves the best seizure recognition performance under ideal conditions (i.e., when the EEG measurements are entirely free of noise). The proposed strategy is likewise shown to maintain a robust performance in the presence of common EEG artifacts and environmental noise, making it more suitable for clinical diagnosis.

6.2 EEG Dataset

6.2.1 Description of EEG Data

As in Chapter 5, our seizure recognition experiments are conducted using the widely used and publicly available EEG database produced by Bonn University [19]. In summary, this database has five sets named A, B, C, D, and E. Each of these sets contains 100 time-series EEG signals, each 23.6 seconds long. Sets A and B include surface EEG signals collected from five healthy participants using the standardized 10-20 system for scalp EEG electrode placement [94].
Set A was recorded from the five participants when they were awake and rested with their eyes open, while set B was recorded when their eyes were closed. Sets C, D, and E, on the other hand, include electrocorticography (ECoG) signals collected from the cerebral cortex of five epileptic patients. The main difference between set E and sets C and D is that set E was taken from patients while they were experiencing active seizures, whereas sets C and D were recorded during seizure-free intervals. The electrodes of set D and set C were implanted within the epileptogenic zone of the brain and the hippocampal formation of the opposite cerebral hemisphere, respectively. Since set E only included seizure activities, ECoG segments were taken from all recording positions exhibiting ictal activities.

The acquired EEG signals are first amplified using a 128-channel amplifier system and then digitized via a 12-bit analog-to-digital converter at a sampling rate of 173.61 Hz. The digital EEG signals are then stored electronically and filtered using a band-pass filter (BPF) with typical cut-off frequencies of 0.53 Hz and 40.00 Hz. Exemplary EEG signals are shown in Figure 6.1.

6.2.2 Common EEG Artifacts

EEG recordings are usually corrupted by several types of artifacts. These artifacts negatively affect the manifestation of seizure patterns and degrade the performance of seizure detection systems. In [55], the common types of EEG artifacts were investigated, and mathematical models that emulate similar behaviors were developed.
As in Chapter 5, we used these models to study the three most important and inevitable sources of artifacts, which are:

• Muscle Activities: As depicted in [55], muscle activities can be modeled as random noise filtered by a band-pass filter with cut-off frequencies of 20 Hz and 60 Hz.
• Eye Blinks: Eye blinks and movement can be modeled as a random noise signal filtered by a band-pass filter with cut-off frequencies of 1 Hz and 3 Hz [55].
• White Noise: The extra-physiologic artifacts and environmental noise are characterized as white noise with Gaussian distribution [55].

[Figure 6.1: Samples of EEG signals from each of the five sets (A to E) of the Bonn University EEG database. Each panel plots amplitude against time (s).]

Figure 6.2(a) depicts an arbitrary clean EEG signal from set A, while Figures 6.2(b), (c), and (d) show the corrupted variants of the same signal after adding synthetic muscle activities, eye movement, and white noise, respectively. Figures 6.2(e), (f), (g), and (h) show their corresponding frequency spectra. The amplitudes of the muscle activity and eye movement artifacts, as well as the white noise, are adjustable to generate noise-corrupted EEG signals with different SNRs.

The noisy EEG signals depicted in Figures 6.2(b-d) have an SNR of 0 dB, i.e., the noise signal has the same power as the EEG signal.
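The SNR adjustment described above can be sketched in a few lines. This is an illustrative Python example, not the MATLAB code used in the thesis. It assumes SNR is defined as 10·log10(P_signal / P_noise), and, to stay within the standard library, it approximates band-pass-filtered noise by a sum of random-phase sinusoids drawn from the target band rather than by an actual filter. All function names are hypothetical.

```python
import math
import random

FS = 173.61  # sampling rate in Hz, as quoted in Section 6.2.1


def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def white_noise(n, rng=random):
    """Gaussian white noise (model of environmental noise)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def band_limited_noise(n, f_lo, f_hi, fs=FS, n_tones=64, rng=random):
    """Stand-in for band-pass-filtered noise: random-phase tones in [f_lo, f_hi] Hz.

    Assumption: this replaces the band-pass filter of the artifact model with a
    simpler synthesis; use 20-60 Hz for muscle activity and 1-3 Hz for eye blinks.
    """
    freqs = [rng.uniform(f_lo, f_hi) for _ in range(n_tones)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_tones)]
    return [
        sum(math.sin(2.0 * math.pi * f * t / fs + p) for f, p in zip(freqs, phases))
        for t in range(n)
    ]


def add_at_snr(signal, noise, snr_db):
    """Scale `noise` so that 10*log10(P_signal / P_noise) equals snr_db, then add it."""
    gain = math.sqrt(power(signal) / (power(noise) * 10.0 ** (snr_db / 10.0)))
    return [s + gain * v for s, v in zip(signal, noise)]


def measured_snr_db(signal, noisy):
    """SNR in dB recovered from a clean signal and its noisy version."""
    residual = [y - s for s, y in zip(signal, noisy)]
    return 10.0 * math.log10(power(signal) / power(residual))
```

At 0 dB the scaled noise has the same power as the signal, matching the description of Figures 6.2(b-d); at -20 dB the noise power is 100 times the signal power.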
MATLAB was used to generate the synthetic muscle and eye movement artifacts as well as the white noise, and to add them to the noise-free EEG recordings.

[Figure 6.2: Clean and noisy EEG signals and their corresponding spectra: (a) clean EEG example from set A; (b-d) noisy EEG examples contaminated by muscle activities, eye movement, and white noise, respectively; (e-h) corresponding frequency spectra of (a-d), respectively.]

6.3 Related Work

The published work related to EEG-based epileptic seizure detection can be sorted into three main classification problems, summarized below. It is worth highlighting that none of the studies below takes into consideration the existence of artifacts and their negative influence on seizure detection accuracy.

6.3.1 Two-class EEG Classification Problem

Classification between Seizure and Normal EEG Activities

Most of the two-class seizure recognition problems focus on the classification between normal EEG signals taken from healthy subjects (set A) and seizure EEG patterns taken from epileptic patients while experiencing active seizures (set E). In [1], Aarabi et al. developed an automated seizure detection system using a combination of characteristic EEG features extracted from the time, frequency, and time-frequency domains. All these features, together with the EEG cepstral features, were fed into a back-propagation neural network (BPNN) classifier with two hidden layers, resulting in a seizure recognition accuracy of 93.00%. In [190], Subasi et al.
used the wavelet transform to derive the EEG frequency bands and then fed all the spectral components into the mixture of experts (ME) classifier; an average classification accuracy of 94.50% was achieved. In [167], Polat et al. achieved a higher classification accuracy of 98.68% using a decision tree (DT) classifier.

Furthermore, Chandaka et al. used the EEG cross-correlation coefficients to compute three statistical features, which were presented as a feature vector to a support vector machine (SVM) for EEG classification [44]. This model yielded a modest seizure detection accuracy of 95.96%. Yuan et al. obtained comparable detection accuracies using the extreme learning machine (ELM) classifier and a set of non-linear features such as approximate entropy and the Hurst exponent [217]. The wavelet transform was also used in [120] to decompose the EEG segments into five approximation and detail sub-bands. Then, the wavelet coefficients located in the low-frequency range of 0-32 Hz were used to compute the EEG features of energy and normalized coefficients. The linear discriminant analysis (LDA) classifier was used to demonstrate the potential of the extracted features in detecting seizure onsets, with a classification accuracy of 91.80%. In addition, the authors of [154] used the permutation entropy as a representative EEG attribute for automatic recognition of epileptic seizures. An SVM was used to distinguish between healthy and ictal EEG epochs; a 93.80% classification accuracy was achieved.

Given the advantages of the wavelet transform outlined in the previous paragraph, it was also used in [128] to decompose the EEG signals into five distinct frequency rhythms, namely delta, theta, alpha, beta, and gamma. A group of statistical and non-linear features was subsequently extracted from these rhythms and fed into an SVM classifier to achieve a classification accuracy of 97.50%.
The authors in [186] also used the SVM together with the permutation entropy features to obtain a classification accuracy of 97.25%.

Classification between Seizure and All Non-seizure EEG Activities

A special case of the two-class problem is to differentiate between the seizure activities (set E) and any of the non-seizure activities (sets A, B, C, or D). The primary goal of this kind of problem is to accurately identify whether or not the patient is experiencing an active seizure. This can help patients, caregivers, and healthcare providers administer the appropriate medication on time. In recent years, many researchers have addressed this particular problem using a variety of techniques [71, 84, 110, 165, 210]. In [84], Guo et al. used wavelet-based approximate entropy features together with an artificial neural network (ANN) model to recognize the seizure episodes with 98.27% classification accuracy. In 2015, the authors of [71] used the empirical mode decomposition approach to extract more robust features such as the spectral entropies and energies of EEG frequency bands. They also used the SVM classifier to improve the seizure detection accuracy to 98.80%. In [165], the wavelet transform was also used to decompose the EEG data into different rhythms, and then five statistical features were computed from each individual rhythm. These features were concatenated and supplied to the complex-valued ANN (CVANN) classifier for seizure diagnosis. Accordingly, an average classification accuracy of 99.33% was achieved.

Further, in [110], a novel, computationally simple feature extraction technique named local neighbor descriptive pattern (LNDP) was tested with different classification models including SVM, ANN, and DT. Experimental results demonstrate that the best detection performance is obtained using LNDP jointly with the ANN classifier, where the highest classification accuracy of 98.72% is obtained.
To further enhance the seizure detection rate, a set of time-domain, frequency-domain, and time-frequency-domain features was used together with the SVM classifier to achieve the best classification rate of 99.25% [210].

6.3.2 Three-class EEG Classification Problem

This category of seizure detection problems addresses the classification of three different EEG classes: normal EEG taken from healthy subjects, inter-ictal EEG taken from epileptic patients during seizure-free intervals, and ictal EEG recorded from epileptic patients while experiencing active seizures. Numerous relevant methods have been presented in the literature [6, 8, 29, 73, 75, 83, 95, 156, 173, 185, 202, 205]. For example, the authors of [83] investigated the use of the RNN as a classification model for epilepsy diagnosis. A satisfactory classification accuracy of 96.79% was achieved. In [202], Tzallas et al. reached a higher detection accuracy of 97.94% by using the ANN classifier together with the energy features of EEG frequency bands. Moreover, the work in [75] introduced a novel classifier named radial basis function neural network (RBFNN), which was integrated with the wavelet features to achieve a seizure diagnostic accuracy of 96.60%.

Furthermore, Übeyli et al. adopted the wavelet transform to decompose the EEG signals into their main spectral rhythms [205]. The statistical features representing the characteristics of the different EEG activities were then extracted and examined using a multilayer perceptron neural network (MLPNN) classifier. The results achieved were 96.00%, 94.00%, and 94.83% for the sensitivity, specificity, and classification accuracy, respectively. In [185], a sample entropy-based feature extraction method was used together with an ELM classifier and achieved 97.26% sensitivity, 98.77% specificity, and 95.67% classification accuracy.

In an effort to alleviate the computational complexity burden of real-time seizure detection methods, Acharya et al.
relaxed the need for EEG pre-processing and worked directly on the raw data [6, 8]. In [6], a set of robust EEG features including sample entropy, approximate entropy, and phase entropy was computed from the recorded EEG signals and then fed into a fuzzy Sugeno classifier (FSC) for EEG classification. This approach boosted the seizure detection accuracy to 98.10%. The authors in [8] used the wavelet packet transform (WPT) to decompose the EEG segments into eight approximation and detail wavelet sub-bands. The wavelet coefficients of these bands were then used to infer the distinctive eigenvalues, which served as input to the Gaussian mixture model (GMM) classifier, which in turn achieved an outstanding 99.00% classification accuracy. A similar classification accuracy of 98.67% was achieved in [156] by using a feature extraction approach based on recurrence quantification analysis integrated with a two-stage classifier named error-correcting output code (ECOC).

Further, the authors of [73] built a piecewise quadratic (PQ) classification model for detecting epileptic EEG episodes. They integrated this classifier with a set of temporal, spectral, and non-linear features and reached up to 98.70% classification accuracy. In [173], a feature extraction method based on the discrete short-time Fourier transform was adopted together with an MLPNN classifier to differentiate between healthy and seizure EEG epochs. As a result, the highest detection accuracy of 99.10% was achieved. Also, the independent component analysis (ICA) method was employed to determine the discriminatory features pertinent to epileptic seizures [95]. The extracted features together with the SVM classifier were used to achieve 96.00% sensitivity, 94.00% specificity, and 95.00% classification accuracy.
In [29], a seizure detection paradigm based on a statistical feature extraction method and a least-squares SVM (LSSVM) classifier achieved a 97.19% classification accuracy with a computation time of 0.065 seconds.

6.3.3 Five-class EEG Classification Problem

This section addresses the classification of a data sample whose label is one of five classes (A, B, C, D, and E). This kind of classification problem is more complicated and harder to solve than the two-class and three-class problems. The main reason is that it attempts to differentiate between similar pathological EEG patterns corresponding to the same class (e.g., the classification between set C and set D, which are both inter-ictal EEGs). But since the EEG sets C and D are recorded from different epileptogenic brain zones [19], their correct classification holds great potential for localizing the seizure foci inside the brain, making it quite advantageous for such vital applications. Here, we highlight the most recent work that handles this kind of problem.

In [82], one of the most efficient multi-class EEG classification methods for epileptic seizure detection was introduced. The most representative characteristics were extracted from the EEG wavelet coefficients and Lyapunov exponents. The probabilistic neural network (PNN) was then used for EEG classification, where it achieved a notable seizure detection rate of 98.05%. In addition, Übeyli et al. developed an eigenvector-based method for EEG feature extraction, which in turn achieved a 99.30% classification accuracy using SVM [203]. In [206], the same authors used simple statistical features instead, and a close classification accuracy of 99.20% was obtained.
In [178], SVM was also used together with the adaptive feature extraction method of wavelet approximate entropy, and an outstanding classification accuracy of 99.97% was achieved.

6.4 Methodology

Deep learning has been successfully applied to several research problems such as face recognition [193], image classification [125], compressive sensing [161], information retrieval [160] and speech recognition [77]. In this study, we propose the use of deep recurrent neural networks, particularly the long short-term memory model [92], for automatic detection of epileptic seizures.

6.4.1 High-level Picture

Figure 6.3 depicts the whole process of the proposed seizure detection system. Each time-series EEG signal is first divided into smaller non-overlapping segments. These segments are then fed into the LSTM layer, which is used for learning the high-level representations of the EEG signals. Next, the output of the LSTM layer, y, is presented as input to the time-distributed fully connected layer h to find the most robust EEG features pertinent to epileptic seizures. Finally, a softmax layer is used to create the label predictions. The full pipeline of the proposed approach is described in the following subsections.

6.4.2 Proposed Method

EEG Segmentation and Data Reshape

Biomedical data such as EEGs are usually non-stationary signals, i.e., their statistical characteristics change over time [25]. The purpose of EEG segmentation is to divide a signal into several pseudo-stationary epochs (segments), as these are expected to have similar temporal and spectral features [88]. This EEG segmentation is often applied as a pre-processing stage for non-stationary signal analysis.

The other important factor behind EEG segmentation, particularly in this study, is the need for a large number of labeled data samples. In general, it is hard to obtain sufficient well-labeled data for training deep neural networks in real-life applications.
The data segmentation, however, can help obtain more training samples, and hence improve the performance of the deep learning architecture under study. In addition, EEG segmentation helps in finding the dependencies between consecutive EEG data-points in each EEG channel signal.

The EEG dataset under study includes 500 EEG signals, each of 23.6 seconds duration. Given the sampling frequency of 173.6 Hz, the total number of data points in each EEG signal, d, is equal to 4096. All the EEG signals are divided into non-overlapping segments of a specific length (L). The most natural selection for L is L = 1, i.e., having a predictive model with LSTM predicting sample 2 from sample 1, sample 3 from sample 2, and so on. This choice will, however, result in a computationally slow process. To reduce the computational complexity, for each L data-point EEG segment we create a vector of size L × 1 and do all multiplications and additions in parallel for these L data-point vectors. In our experiments, we tested a wide range of EEG segment lengths, and we found that increasing this length lessens the computational cost of training the proposed neural network architecture, but at the cost of decreasing the detection accuracy [99].

The influence of the EEG segment length on the seizure detection accuracy is presented in detail in B.1. We found that the EEG segment length L = 2 achieves the best trade-off between the computational complexity and the seizure detection score. Accordingly, and as shown in Figure 6.3, each one-dimensional EEG signal of size d (here d = 4096) is reshaped into a two-dimensional slice of size (M × L), where M is the number of time-steps and L is the EEG segment length (L = 2 in our experiments, so that M = 2048).
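The reshaping step described above can be illustrated with a short sketch. It uses plain Python lists rather than whatever array library the thesis code relies on, and the helper name `reshape_signal` is hypothetical; only d = 4096 and L = 2 are taken from the text.

```python
def reshape_signal(signal, seg_len):
    """Split a 1-D signal of d samples into M = d // seg_len
    non-overlapping segments of seg_len samples each (an M x L slice)."""
    d = len(signal)
    if d % seg_len != 0:
        raise ValueError("signal length must be a multiple of seg_len")
    return [signal[i:i + seg_len] for i in range(0, d, seg_len)]


# Toy stand-in for one EEG signal: d = 4096 samples (values are arbitrary).
d, L = 4096, 2
signal = [float(i) for i in range(d)]

eeg_slice = reshape_signal(signal, L)
M = len(eeg_slice)            # number of time-steps
print(M, len(eeg_slice[0]))   # 2048 2
```

Applying the same helper to every signal and stacking the results gives the three-dimensional input tensor.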
Given N EEG signals, the input data will take the shape N × M × L.

Figure 6.3: Schematic diagram of the overall seizure detection approach. The raw EEG data (N × d) is reshaped into an EEG tensor (N × M × L): each EEG channel signal of d data-points (d = 4096) is segmented into M segments, each of L data-points (M × L = 2048 × 2). LSTM stands for Long-Short-Term Memory; y is the output of the LSTM layer (80 cells, output size 2048 × 80); h represents a fully connected (dense) layer unit; v is the fully connected layer output (50 units, output size 2048 × 50), which is passed through average pooling; P1, P2, P3, ..., PK are the probabilities produced by softmax for the K classes; Out stands for the output of the softmax layer (predicted label).

EEG Deep Feature Learning

To learn the expressive seizure characteristics from EEG data, deep learning was deployed to extract the discriminative EEG features closely related to epileptic seizures. We design our deep neural network to include three layers, with a softmax layer on top of them. The EEG data samples were first passed through an LSTM layer of 80 cells. The motivation for this was to learn the short- and long-term dependencies between the EEG segments in each signal and between the different EEG signals across the same class. Remembering information for long periods of time is practically the default behavior of LSTMs, making them a strong candidate for processing long-term EEG signals.
B.2 provides more details on how the LSTM architecture differs from that of the vanilla RNN, and how this difference helps LSTMs handle the shortcomings of vanilla RNNs.

As illustrated in Figure 6.3, the fully connected (dense) layer was adopted as the second layer to translate the information learned by the LSTM layer into meaningful seizure-associated features. Also, since we address a long-term sequence labeling problem, we deployed the time-distributed FC layer (not the ordinary FC layer) so that the cost function is calculated over all EEG time-steps and not on the last time-step only. A time-distributed FC layer of 50 units was used in this model.

The final structural step was to pass the output of the FC layer through a one-dimensional average pooling layer. The motivation for this was that all the EEG segments should contribute equally to the label prediction. The output of the average pooling layer is then presented as input to the softmax layer for EEG classification. The proposed deep learning model was trained and tested using two common scenarios: (1) The hold-out scenario: the EEG dataset was split into two sets; the first set was used for training and the second set for classification¹. Several hold-out percentages were used in this study. (2) The cross-validation scenario: 3-fold, 5-fold, and 10-fold cross-validation were also used to train and test the proposed deep neural network model.

EEG Feature Classification

As shown in Figure 6.3, we add a softmax layer at the top of our model to generate label predictions. Softmax is the most common function used to represent a probability distribution in the machine learning literature [112]. From an optimization perspective, it has some subtle properties concerning differentiability.
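To make the layer stack described above more concrete (LSTM, time-distributed FC, average pooling, softmax), the following rough sketch traces the per-signal tensor shapes and counts trainable parameters. It assumes a standard LSTM parameterization without peephole connections, as is common in Keras-style layers; that assumption, the choice K = 5, and the helper names are illustrative and not taken from the thesis code.

```python
def lstm_params(input_dim, units):
    # Four blocks (cell input, input gate, forget gate, output gate), each
    # with input weights, recurrent weights, and a bias vector.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # Weight matrix plus bias; a time-distributed dense layer shares
    # these weights across all time-steps.
    return input_dim * units + units


M, L = 2048, 2                 # time-steps and segment length from the text
lstm_units, fc_units = 80, 50  # layer sizes from the text
K = 5                          # number of classes (2, 3, or 5 in this chapter)

shapes = [
    ("input slice", (M, L)),
    ("LSTM output y", (M, lstm_units)),
    ("time-distributed FC output v", (M, fc_units)),
    ("average pooling over time", (fc_units,)),
    ("softmax probabilities", (K,)),
]
for name, shape in shapes:
    print(f"{name:30s} {shape}")

total = (lstm_params(L, lstm_units)
         + dense_params(lstm_units, fc_units)
         + dense_params(fc_units, K))
print("trainable parameters:", total)
```

The intermediate shapes 2048 × 80 and 2048 × 50 match the sizes annotated in Figure 6.3. Under these assumptions the model is small, and most of its parameters sit in the LSTM layer.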
From a machine learning perspective, a deep neural network with a softmax layer on top can represent any K-class probability function over the feature space.

In our EEG classification problem, the class labels are assumed to be $y^{(i)} \in \{1, \cdots, K\}$, where $K$ is the total number of classes. We are given a training set $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \cdots, (x^{(N)}, y^{(N)})\}$ of $N$ labeled samples, where $x^{(i)} \in \mathbb{R}^{d}$. For each test sample $x$, the softmax hypothesis evaluates the probability $P(y = k \mid x(t), x(t-1), x(t-2), \cdots, x(t-M))$ for each class label $k = 1, \cdots, K$, where $t$ indexes the EEG segment and $M$ is the total number of segments. These $K$ probability values sum to 1, and the highest probability belongs to the predicted class. Thus, the softmax function, denoted by $h_\theta(x)$, is defined as follows:

\[
h_\theta(x) =
\begin{pmatrix}
P(y = 1 \mid x; \theta) \\
P(y = 2 \mid x; \theta) \\
\vdots \\
P(y = K \mid x; \theta)
\end{pmatrix}
= \frac{1}{\sum_{j=1}^{K} \exp(\theta_j^{T} x)}
\begin{pmatrix}
\exp(\theta_1^{T} x) \\
\exp(\theta_2^{T} x) \\
\vdots \\
\exp(\theta_K^{T} x)
\end{pmatrix}
\]

where $\theta_1, \theta_2, \cdots, \theta_K$ are the softmax model parameters.

The cost function that has been widely used with softmax is the "cross entropy" $J(\theta)$ [112]:

\begin{align}
J(\theta) &= -\left[ \sum_{i=1}^{N} \sum_{k=1}^{K} 1\{y^{(i)} = k\} \log P(y^{(i)} = k \mid x^{(i)}; \theta) \right] \tag{6.1} \\
&= -\left[ \sum_{i=1}^{N} \sum_{k=1}^{K} 1\{y^{(i)} = k\} \log \frac{\exp(\theta_k^{T} x^{(i)})}{\sum_{j=1}^{K} \exp(\theta_j^{T} x^{(i)})} \right] \tag{6.2}
\end{align}

where $1\{\cdot\}$ is the "indicator function", which equals 1 if the statement is true and 0 if the statement is false.

Then, an iterative optimization method such as stochastic gradient descent [38] is used to minimize the cost function and maximize the probability of the true class label.

Referring to the LSTM architecture provided in B.2, the pseudo-code of the proposed LSTM-based seizure detection method is presented in Algorithm 3.

¹ Our experiments on the EEG feature learning using LSTM were conducted with the open-source software of Keras using TensorFlow backend [99].

Network Configuration

Our LSTM network was trained by optimizing the "categorical cross-entropy" cost function with the "adam" parameter update and a learning rate of $1 \times 10^{-3}$.
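The softmax hypothesis and the cross-entropy cost of Eq. (6.1) can be checked numerically with a small sketch. The scores below are made-up stand-ins for the products θ_k^T x, class labels are 0-based here (the text uses 1 to K), and the function names are illustrative only.

```python
import math


def softmax(scores):
    """Turn K raw scores (theta_k^T x) into K probabilities that sum to 1."""
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(prob_rows, labels):
    """J = -sum_i log P(y_i | x_i); the indicator keeps only the true class."""
    return -sum(math.log(p[y]) for p, y in zip(prob_rows, labels))


# Hypothetical scores for two samples and K = 3 classes.
scores = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
labels = [0, 1]                          # 0-based true class indices

probs = [softmax(s) for s in scores]
predicted = [p.index(max(p)) for p in probs]   # class with highest probability
loss = cross_entropy(probs, labels)
print(predicted, round(loss, 4))
```

The cost shrinks toward zero as the probability assigned to each true class approaches 1, which is why minimizing J(θ) maximizes the probability of the true label.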
The total numbers of LSTM cells and FC units were set to 80 and 50, respectively. The "return_sequences" option was set to "True" so that all EEG segments are considered in the feature extraction process. The batch size was set to 64, and the network parameters converged after around 2000 iterations with 40 epochs. The data were augmented by adding eye movement and muscle activity artifacts as well as Gaussian white noise, and various noise levels were considered in our experiments. Our implementation was written in Python using Keras with TensorFlow backend and took two hours of training on an NVIDIA K40 GPU machine. Although the training of our end-to-end neural network model takes up to two hours, testing the trained model on new data takes less than a second. This fast testing performance makes our model well suited to the real-time processing of EEG signals in real-life and clinical applications.

Algorithm 3: Epileptic Seizure Detection using Long-Short-Term Memory (ESD-LSTM).

Input: d-dimensional EEG/iEEG signal $x$; trained LSTM model
Output: Predicted EEG class label $\tilde{y} \in \{1, \cdots, K\}$
Initialization: $d \leftarrow 4096$; $M \leftarrow 2048$
Initialization: $K \leftarrow$ number of EEG classes; $K$ = 2, 3, and 5 for two-class, three-class, and five-class seizure detection problems.
procedure ESD-LSTM($x$, $K$, LSTM)
    Pick an EEG segment length $L \in \{2^0, 2^1, 2^2, 2^3, \cdots, d\}$
    Partition the EEG/iEEG signal into $M$ segments, each of length $L$
    while $t \le M$ do
        $t \leftarrow t + 1$
        $z_t = g(W_z x_t + R_z y_{t-1} + b_z)$    (LSTM input)
        $i_t = \sigma(W_i x_t + R_i y_{t-1} + P_i \odot c_{t-1} + b_i)$    (input gate)
        $f_t = \sigma(W_f x_t + R_f y_{t-1} + P_f \odot c_{t-1} + b_f)$    (forget gate)
        $c_t = z_t \odot i_t + c_{t-1} \odot f_t$    (cell)
        $o_t = \sigma(W_o x_t + R_o y_{t-1} + P_o \odot c_t + b_o)$    (output gate)
        $y_t = h(c_t) \odot o_t$    (LSTM output)
        $v_t = h_t(y_t)$    (FC layer)
    end
    $E = \mathrm{AP}(v_t, v_{t-1}, v_{t-2}, \cdots, v_{t-M})$    (average pooling)
    Compute $P_k = \{P_1, \cdots, P_K\} \leftarrow \mathrm{softmax}(E)$
    Find $\mathrm{Idx} \leftarrow \mathrm{Support}(\max(P_k))$    (index of highest probability)
    $\tilde{y} = \mathrm{Idx}$    (predicted class label)
end procedure

Table 6.1: Seizure detection results of the proposed and baseline methods: Two-class problem (A-E).

| Method | Year | Classifier | Training/Testing | Sens (%) | Spec (%) | Acc (%) |
|---|---|---|---|---|---|---|
| Aarabi et al. [1] | 2006 | BPNN | Hold-out (50.00-50.00%) | 91.00 | 95.00 | 93.00 |
| Subasi et al. [190] | 2007 | ME | Hold-out (62.50-37.50%) | 95.00 | 94.00 | 94.50 |
| Chandaka et al. [44] | 2009 | SVM | Hold-out (62.50-37.50%) | 92.00 | 100.00 | 95.96 |
| Yuan et al. [217] | 2011 | ELM | Hold-out (50.00-50.00%) | 92.50 | 96.00 | 96.50 |
| Khan et al. [120] | 2012 | LDA | Hold-out (80.00-20.00%) | 83.60 | 100.00 | 91.80 |
| Nicolaou et al. [154] | 2012 | SVM | Hold-out (60.00-40.00%) | 94.38 | 93.23 | 93.80 |
| Kumar et al. [128] | 2014 | SVM | Hold-out (66.67-33.33%) | 98.00 | 96.00 | 97.50 |
| Song et al. [186] | 2016 | SVM | – | 94.50 | 100.00 | 97.25 |
| Proposed Method | 2017 | ESD-LSTM | Hold-out (66.67-33.33%) | 100.00 | 100.00 | 100.00 |
| Polat et al. [167] | 2007 | DT | 10-fold cross-validation | 98.87 | 98.50 | 98.68 |
| Proposed Method | 2017 | ESD-LSTM | 10-fold cross-validation | 100.00 | 100.00 | 100.00 |

6.5 Results and Discussion

In this section, we test the performance of our seizure detection approach under ideal and imperfect conditions. The results are also compared to those of the baseline seizure detection methods that use the same clinical dataset. The detection performance was assessed using the metrics of sensitivity (Sens), specificity (Spec), and classification accuracy (Acc).

6.5.1 Seizure Detection under Ideal Conditions

We first apply the proposed deep learning approach to clean EEG signals that are free of artifacts and noise. After data segmentation and reshaping, the EEGs are fed into our deep neural network model with the ultimate objective of effective EEG feature learning and classification.

Two-class Classification Results: A-E

The first category of the two-class seizure detection problems is to distinguish between the normal EEGs collected from healthy volunteers and seizure EEGs recorded from epileptic patients experiencing active seizures.
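For reference, the three metrics named at the start of this section (Sens, Spec, Acc) can be computed from true and predicted labels as sketched below. The chapter does not spell out the formulas, so the standard confusion-matrix definitions are assumed; the function name and the toy labels are hypothetical and are not thesis data.

```python
def detection_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) in percent for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # seizures found
    tn = sum(t != positive and p != positive for t, p in pairs)  # non-seizures kept
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarms
    fn = sum(t == positive and p != positive for t, p in pairs)  # missed seizures
    sens = 100.0 * tp / (tp + fn)
    spec = 100.0 * tn / (tn + fp)
    acc = 100.0 * (tp + tn) / len(pairs)
    return sens, spec, acc


# Toy labels: 1 = seizure, 0 = non-seizure (made-up values).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec, acc = detection_metrics(y_true, y_pred)
print(f"Sens={sens:.2f}%  Spec={spec:.2f}%  Acc={acc:.2f}%")
```

Reporting sensitivity and specificity alongside accuracy matters most when the classes are unbalanced, since accuracy alone can hide missed seizures.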
Table 6.1 reports the results obtained by the proposed and baseline seizure detection strategies. As shown in Table 6.1, most of the existing seizure detection methods achieve relatively low sensitivity values in the range of 91.00-94.50%. The top baseline sensitivity of 98.87% was achieved by Polat et al. using a combination of a wavelet-based feature extractor and a decision tree classifier [167]. Our seizure detection approach, however, achieved a sensitivity of 100.00%.

Further, our approach achieved a seizure specificity of 100.00%, which is similar to those of [120] and [186], and superior to those of the other state-of-the-art methods. Also, our approach can deal with the raw EEG data and does not entail any major pre-processing (e.g., denoising) such as that required by the methods of [120] and [186]. Among all the current seizure detection strategies, our approach yields the highest classification accuracy of 100.00%, which is 1.32 percentage points higher than the top classification accuracy reported in the literature [167].

Table 6.2: Seizure detection results of the proposed and baseline methods: Two-class problem (ABCD-E).

| Method | Year | Classifier | Training/Testing | Sens (%) | Spec (%) | Acc (%) |
|---|---|---|---|---|---|---|
| Guo et al. [84] | 2010 | ANN | Hold-out (50.00-50.00%) | 95.50 | 99.00 | 98.27 |
| Peker et al. [165] | 2016 | CVANN | Hold-out (60.00-40.00%) | 100.00 | 98.01 | 99.33 |
| Proposed Method | 2017 | ESD-LSTM | Hold-out (80.00-20.00%) | 100.00 | 100.00 | 100.00 |
| Fu et al. [71] | 2015 | SVM | 10-fold cross-validation | – | – | 98.80 |
| Jaiswal et al. [110] | 2017 | ANN | 10-fold cross-validation | 98.30 | 98.82 | 98.72 |
| Wang et al. [210] | 2017 | SVM | 10-fold cross-validation | 97.98 | 99.56 | 99.25 |
| Proposed Method | 2017 | ESD-LSTM | 10-fold cross-validation | 100.00 | 100.00 | 100.00 |

Two-class Classification Results: ABCD-E

In the second assessment, we handle the classification problem between any non-seizure activities (sets A, B, C, or D) and seizure activities (set E). Considering the fact that each EEG set comprises 100 signals, this classification problem has an imbalanced class distribution.
The reason is that the number of EEG samples belonging to the seizure class is substantially lower than that of the non-seizure class (100 versus 400 signals). In this case, seizure detection systems developed using traditional machine learning algorithms could be inaccurate and biased against the minority class. The proposed approach, instead, can adequately tackle this sort of classification problem and outperform the results reported in the literature. Again, the performance is evaluated in terms of sensitivity, specificity, and classification accuracy values. The seizure detection results achieved by the proposed and baseline methods are reported in Table 6.2. They verify the superiority of the proposed approach over the state-of-the-art methods, achieving the top performance of 100.00% for each of the sensitivity, specificity, and classification accuracy.

Three-class Classification Results

We also investigate the effectiveness of the proposed method in differentiating between three distinct classes of EEG activities: normal, inter-ictal, and ictal. The EEG classification results achieved by the proposed method are compared to those of the baseline methods presented in [83]-[29]. For a fair comparison, the performance of all these methods is tested on the same clinical EEG dataset [19]. Table 6.3 includes the performance metrics achieved by the proposed and existing seizure detection strategies. It is clear that our proposed approach yields superior sensitivity, specificity, and classification accuracy. The main reason is the use of the LSTM, which captures the correlation between the EEG signals taken from different subjects and the dependencies among EEG segments of the same subject. The results in Table 6.3 demonstrate the high potential of deep neural networks in effectively learning the discriminative EEG features that best characterize normal, inter-ictal and ictal EEG activities.
It hasnot escaped our notice that, for the three-class seizure detection problem, our proposed deep learningmethod achieves 100% sensitivity, which is better than those obtained by all the reference methods.Besides, our approach yields 100% seizure specificity, which is comparable to the specificity valuesachieved by Acharya et al. in [6], and superior to those of the baseline methods. All the more strikingly,among all existing seizure detection strategies, our approach achieves a notable 100.00% classification80Table 6.3: Seizure detection results of the proposed and baseline methods: Three-class problem (A-C-E).Method Year Classifier Training/Testing Sens (%) Spec (%) Acc (%)Gu¨ler et al. [83] 2005 RNN Hold-out (50.00-50.00%) 95.50 97.38 96.79Tzallas et al. [202] 2007 ANN Hold-out (50.00-50.00%) 95.73 97.86 97.94Dastidar et al. [75] 2008 RBFNN Hold-out (80.00-20.00%) – – 96.60U¨beyli et al. [205] 2009 MLPNN Hold-out (50.00-50.00%) 96.00 94.00 94.83Niknazar et al. [156] 2013 ECOC Hold-out (70.00-30.00%) 98.55 99.33 98.67Samiee et al. [173] 2015 MLPNN Hold-out (50.00-50.00%) 99.20 98.90 99.10Proposed Method 2017 ESD-LSTM Hold-out (50.00-50.00%) 100.00 100.00 100.00Hosseini et al. [95] 2016 SVM Leave-one-out CV 96.00 94.00 95.00Proposed Method 2017 ESD-LSTM Leave-one-out CV 100.00 100.00 100.00Acharya et al. [6] 2012 FSC 3-folds cross-validation 99.40 100.00 98.10Proposed Method 2017 ESD-LSTM 3-folds cross-validation 100.00 100.00 100.00Gajic et al. [73] 2015 PQ 5-folds cross-validation 98.60 99.33 98.70Proposed Method 2017 ESD-LSTM 5-folds cross-validation 100.00 100.00 100.00Song et al. [185] 2010 ELM 10-folds cross-validation 97.26 98.77 95.67Acharya et al. [8] 2012 GMM 10-folds cross-validation 99.00 99.00 99.00Behara et al. 
[29] 2016 LSSVM 10-folds cross-validation 96.96 99.66 97.19Proposed Method 2017 ESD-LSTM 10-folds cross-validation 100.00 100.00 100.00Table 6.4: Seizure detection results of the proposed and baseline methods: Five-class problem (A-B-C-D-E).Method Year Classifier Training/Testing Sens (%) Spec (%) Acc (%)Gu¨ler et al. [82] 2007 PNN Hold-out (50.00-50.00%) 98.05 99.50 98.05U¨beyli et al. [203] 2008 SVM Hold-out (70.00-30.00%) 99.30 99.82 99.30U¨beyli et al. [206] 2009 SVM Hold-out (50.00-50.00%) 99.20 99.79 99.20Shen et al. [178] 2013 SVM Hold-out (50.00-50.00%) 98.37 100.00 99.97Proposed Method 2017 ESD-LSTM Hold-out (50.00-50.00%) 100.00 100.00 100.00accuracy.Five-class Classification ResultsWe also study the potential of the proposed deep learning model to address the five-class EEG clas-sification problem between the EEG sets of A, B, C, D, and E. This is a more challenging problemcompared to the two-class and three-class classification problems but has an advantage in numerousclinical applications. It addresses the differentiation between EEG epochs belonging to the same class(e.g., sets C and D, which are both inter-ictal), seeking for providing more useful clinical practices. Forexample, the discrimination between EEG set C and set D plays a vital role in seizure localization, astheir data were captured from the opposite brain hemispheres. In fact, only a few scholars have focusedon the significance of the five-class EEG classification problem [82]-[178]. Nevertheless, they haveaccomplished adequate EEG classification results, as shown in Table 6.4.To assess the efficacy of our proposed method for addressing the five-class EEG classification prob-lem, we compare its performance to the baseline methods of the last decade. Table 6.4 includes theperformance measures attained by our method as well as reference methods. The proposed method is81shown to outperform the literature performance achieving superior sensitivity, specificity, and classifi-cation accuracy. 
Comparing our EEG classification results to those of the baseline methods, we find that the multiclass seizure detection method of Shen et al. produces results similar to those achieved by our approach. It achieves 98.37% sensitivity, 100.00% specificity, and 99.97% classification accuracy [178]. This method, however, requires three computationally intensive pre-processing steps that may hamper its real-time applications. In contrast, our method does not require any such pre-processing stages and deals with the raw data as is, yielding an EEG classification performance of 100.00%.

6.5.2 Seizure Detection under Real-life Conditions

We also study the robustness of the proposed seizure detection approach in the presence of common artifacts such as muscle activities and eye movement, as well as environmental noise. In [108], we presented a reliable EEG feature learning algorithm that can deal with noise-corrupted EEG signals. That algorithm presumed that only Gaussian noise had been experienced during EEG data acquisition, i.e., EEG artifacts were excluded, which is not the case in practical situations. This study, however, shows that our deep learning model can deal with noisy EEG data contaminated by physical artifacts (muscle activities and eye movement) in addition to Gaussian white noise.

Two-class Classification Results: A-E

We initially examine the capability of the proposed method to recognize whether the noisy EEGs belong to a healthy subject (set A) or an epileptic patient who experiences active seizures (set E). As depicted in Figure 6.4, the performance of the proposed approach is tested at different levels of artifacts and noise. The models described in 6.2.2 have been used to generate the synthetic artifacts of muscle activities and eye movement as well as synthetic white noise. The magnitude of these artifacts and noise is adjusted to produce noise-corrupted EEG signals of different SNRs.
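The magnitude adjustment mentioned above can be sketched as follows: the interference is rescaled so that the signal-to-noise power ratio hits a target value in dB. This is a minimal illustration under stated assumptions. It uses only Gaussian white noise and a sinusoid as a stand-in for a clean EEG signal, it does not reproduce the artifact models of Section 6.2.2, and the names `power` and `mix_at_snr` are hypothetical.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so that 10*log10(P_signal / P_noise) equals snr_db,
    then add it to `signal`. Returns (noisy signal, scaled noise)."""
    target_noise_power = power(signal) / (10 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    return [s + n for s, n in zip(signal, scaled)], scaled


random.seed(0)
fs = 173.6                                   # sampling frequency from the text
clean = [math.sin(2 * math.pi * 10 * t / fs) for t in range(4096)]  # toy signal
white = [random.gauss(0.0, 1.0) for _ in range(4096)]

results = {}
for snr_db in (-20, 0, 20):
    noisy, scaled = mix_at_snr(clean, white, snr_db)
    results[snr_db] = 10 * math.log10(power(clean) / power(scaled))
    print(snr_db, round(results[snr_db], 6))
```

At an SNR of -20 dB the noise power is 100 times the signal power, which gives a sense of how severe the lowest tested noise level is.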
Figure 6.4 depicts the seizure classification accuracy achieved by our deep learning approach in the presence of muscle activities, eye movement, and white noise at a variety of SNR values between −20 dB and 20 dB.

Several observations can be made here. First, the proposed approach can efficiently find the most representative and robust EEG features pertinent to seizures, even when the EEG measurements are entirely submerged in noise. For instance, Figure 6.4 demonstrates the robustness of the proposed approach in the presence of the two common artifacts and noise. For the EEGs corrupted by eye movement artifacts, our method preserves a classification accuracy of 100.00% at all SNR levels. The same applies to the EEGs contaminated by muscle activities and white noise, except when SNR = −20 dB. The main reason is that, for SNR = −20 dB, the EEG data were completely buried in noise and their original waveform shapes were distorted. The proposed approach, however, maintains an adequate seizure detection performance, achieving classification accuracies of 99.75% and 99.25% for the cases of muscle activities and white noise, respectively.

Figure 6.4: Classification accuracy (%) vs. SNR (dB) plots for the two-class EEG classification problem (A-E), with curves for eye blinks, muscle activities, and white noise.

Two-class Classification Results: ABCD-E

As for the two-class problem of set E versus sets A, B, C, and D (ABCD-E), the proposed method was also tested on noisy EEG segments polluted by muscle artifacts, eye movement, and white noise. Since the class distribution of this binary classification problem is imbalanced, the performance of the proposed method experiences a slight decay. It is worth highlighting that, for such a hard problem, the proposed approach is shown to preserve adequate seizure detection accuracy even at excessive noise levels.
Figure 6.5 displays the seizure detection performance achieved by the proposed method in the presence of the two sorts of artifacts as well as the white noise. The minimum classification accuracy of 96.70% was obtained when the EEG signals were wholly buried in white noise (SNR = −20 dB). For the noise-corrupted EEG data of SNR above 0 dB, our approach yields a classification accuracy higher than 99.00%.

Figure 6.5: Classification accuracy (%) vs. SNR (dB) plots for the two-class EEG classification problem (ABCD-E), with curves for eye blinks, muscle activities, and white noise.

Three-class Classification Results

Figure 6.6 demonstrates the overall performance of the proposed approach in the presence of muscle activities, eye movement, and white noise at different SNR levels. It is clearly shown that the proposed approach preserves a high performance when applied to noisy EEG segments of SNR above 0 dB. The leading cause is that LSTM networks can effectively extract the most faithful and robust EEG representations pertinent to epileptic seizures, even under noisy conditions. The performance of the proposed method begins to decay when tested on noisy EEG samples of SNRs less than 0 dB, especially when these samples are contaminated by Gaussian white noise. Better classification results are achieved when the EEG data is corrupted by muscle artifacts. The main reason behind this improvement is that muscle activities interfere with EEG data within the tight frequency band of 20-60 Hz [55]. The best results are attained when applying our method to EEG data mixed with eye movement artifacts. As shown in Figure 6.6, the proposed method is robust against high and severe levels of eye artifacts, achieving classification accuracies above 95.20%.
By and large, the proposed method can recognize EEG seizure activities contaminated with artifacts and noise with adequate classification accuracies.

Figure 6.6: Classification accuracy (%) vs. SNR (dB) plots for the three-class EEG classification problem (A-C-E), with curves for eye blinks, muscle activities, and white noise.

Five-class Classification Results

We also investigate the performance of our seizure detection method on the five-class EEG classification problem in real-life situations, i.e., when the EEG data is polluted with muscle activities, eye movement, and white noise of different intensities. Figure 6.7 demonstrates the EEG classification results achieved by our deep learning approach at different SNRs. Even though the five-class EEG classification is a difficult problem, particularly in the presence of noise and artifacts, the proposed method is shown to maintain high seizure detection results at low SNR values. For instance, it yields a classification accuracy larger than 94.00% when applied to noisy EEG signals mixed with eye movement artifacts. Inferior recognition accuracy is achieved in the presence of muscle activity artifacts; the diagnostic accuracy drops to 70.90% at SNR = −20 dB. The root reason is that muscle artifacts reside in a broad band of the EEG spectrum, causing considerable distortion to the EEG waveform shapes. A much larger decay in the proposed method's performance is experienced in the presence of high levels of white noise. The classification accuracy goes down to 53.50% for white noise-corrupted EEG data of −20 dB. However, in realistic situations (SNR > 0 dB), our method yields a notable performance with EEG classification accuracies larger than 90.00%.

6.6 Limitations and Recommendations

The EEG dataset used in this study was collected from only five healthy volunteers and five epileptic patients.
This makes us more cautious in interpreting the quantitative results reported in this work. To generalize our research results, further experiments on larger datasets are needed.

Figure 6.7: Classification accuracy (%) vs. SNR (dB) plots for the five-class EEG classification problem (A-B-C-D-E), with curves for eye blinks, muscle activities, and white noise.

Our recommendations are:

• Finding a larger EEG dataset that consists of many more patients. This will enable our deep neural network model to learn the various patterns of epileptic seizures across different patients, and hence improve its generalizability. It is well known that the performance of deep neural networks improves as the training data increases.

• Incorporating long-term EEG signals in our seizure detection tests. The ultimate objective, in this case, is to identify the pre-seizure EEG activities and notify the epileptic patients of upcoming seizures.

• Since our method was developed to address single-channel EEG data, the neural network structure needs some modifications to accommodate multi-channel EEG systems. One suggestion is to add a convolutional neural network that can extract the spatial correlations between the EEG epochs collected from different sensor locations on the scalp.

6.7 Summary and Conclusion

This chapter introduces a deep learning approach for the automatic detection of epileptic seizures using EEG signals. Compared to the baseline methods, this approach can learn the high-level EEG representations and can adequately distinguish between normal and seizure EEG activities. Another advantage of this approach lies in its robustness against common EEG artifacts (e.g., muscle activities and eye movement) and white noise. The proposed approach has been examined on the Bonn EEG dataset and compared to several state-of-the-art methods.
The experimental results demonstrate the effectiveness and superiority of the proposed method for detecting epileptic seizures. It achieves robust seizure detection performance under ideal and imperfect conditions. The proposed method, however, addresses single-channel EEG data only. We plan to modify the deep neural network architecture to accommodate multi-channel EEG systems as well.

6.8 Data and Codes Availability

The pre-processed data in Matlab and comma-separated values (CSV) formats are publicly available on the first author's GitHub repository [99]. The Python code for the proposed deep neural network structure is also available on the same repository.

Chapter 7

Human Intracranial EEG Quantitative Analysis and Automatic Feature Learning for Epileptic Seizure Prediction

The aim of this study is to develop an efficient and reliable epileptic seizure prediction system using intracranial EEG (iEEG) data, especially for people with drug-resistant epilepsy. The prediction procedure should yield accurate results fast enough to alert patients of impending seizures. We quantitatively analyze the human iEEG data to obtain insights into how the human brain behaves before and between epileptic seizures. We then introduce an efficient pre-processing method for reducing the data size and converting the time-series iEEG data into an image-like format that can be used as input to convolutional neural networks (CNNs). Further, we propose a seizure prediction algorithm that uses co-operative multi-scale CNNs for automatic feature learning of iEEG data.

Experimental results showed that: 1) iEEG channels contain complementary information, and excluding individual channels is not advisable because the spatial information they carry is needed for accurate prediction of epileptic seizures. 2) The traditional PCA is not a reliable method for iEEG data reduction in seizure prediction.
3) Hand-crafted iEEG features may not be suitable for reliable seizure prediction, as the iEEG data vary between patients and over time for the same patient. 4) Our algorithm outperforms existing methods, achieving an average sensitivity of 87.85% and an AUC score of 0.84. Such accurate seizure prediction algorithms can warn patients about the next seizure so that they can avoid dangerous activities. Medications could then be administered to abort the impending seizure and minimize the risk of injury. Closed-loop seizure intervention systems could also help to prevent seizures in patients with drug-resistant epilepsy. Parts of the work presented in this chapter have been accepted for publication in the IEEE Transactions on Biomedical Engineering [105] and the 2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) [104].

7.1 Introduction

Epilepsy is a neurological disorder that affects around 90 million people worldwide [170]. It is characterized by recurrent seizures that strike without warning. Symptoms may range from a brief suspension of awareness to violent convulsions and sometimes loss of consciousness [67]. Currently, anti-epileptic drugs are given to epileptic patients in sufficiently high dosages. These drugs can cause undesirable side effects such as tiredness, stomach discomfort, dizziness, or blurred vision. Unfortunately, 20-40% of people with epilepsy continue to have seizures despite treatment [68]. Further, the patient’s quality of life is significantly degraded by the anxiety associated with the unpredictable nature of seizures and their consequences. This motivated researchers to develop seizure prediction systems [52]. The ability to predict seizures with high accuracy would make individualized epilepsy treatment possible (e.g., tailored therapies with fewer side effects).
With warnings of impending seizures, patients can take precautions and avoid probable injuries. This vision inspired the proposed research.

Epilepsy surgery may improve the quality of life of patients with drug-resistant epilepsy. Brain surgery may reduce the number of seizures and hence limit the risk of permanent brain damage. It may, however, involve serious risks such as stroke, paralysis, speech issues, memory problems, loss of vision, loss of motor skills, and sometimes more seizures [195]. Besides epilepsy surgery, people with drug-resistant epilepsy can benefit from methods that predict seizures far enough in advance. Although epileptic seizures seem unpredictable and often occur without warning, recent investigations have demonstrated that seizures do not strike at random [21, 72]. Intracranial electroencephalography (iEEG) is the prime tool used for forecasting epileptic seizures. The iEEG data recorded preceding seizures are analyzed to identify pattern(s) that indicate upcoming seizures.

Over the past decade, researchers have made several attempts to develop robust iEEG-based seizure prediction methods. The prediction performance, however, was affected by the lack of long-term preictal (before seizures) and interictal (between seizures, i.e., baseline) iEEG recordings [126, 142]. In 2013, Cook et al. designed a seizure advisory system that can predict the probability of seizure occurrence minutes or hours in advance [50]. Fifteen adults with refractory epilepsy were each implanted with 16 electrodes. This 16-channel seizure prediction device allows uninterrupted recording of intracranial EEG over a long period of time (6-36 months). The seizure prediction algorithm, however, yielded prediction performance that varied widely across the 15 patients.
For example, the highest seizure prediction sensitivity of 100% was reached for two patients, while a low sensitivity of 17-45% was reported for three patients. Improving prediction performance for these three patients is important to ensure that seizure prediction is possible for different patients with drug-resistant epilepsy.

Recently, a large dataset of long-term iEEG recordings was collected from the aforementioned three patients over 374-559 days. A subset of this dataset was made publicly available and used in the Melbourne University AES/MathWorks/NIH Seizure Prediction Competition organized in November 2016 on Kaggle.com¹. Kuhlmann et al. describe the human iEEG dataset used in this contest and the results achieved by the top eight seizure prediction algorithms [127]. These algorithms adopted a diverse range of iEEG feature extraction, selection, and classification techniques, and achieved superior performance compared to those in [50]. For example, the winning solution was based on an ensemble of 11 different machine learning models, each designed to be subject-specific. A detailed description of the 11 models, including the extracted iEEG features and the classification models used, is provided in [28]. To assess the predictive power of these models, the area under the ROC curve was obtained for each model individually. The best reported prediction score for the contest test set was 0.85 [28, 127].

¹ www.kaggle.com/c/melbourne-university-seizure-prediction

In this work, we propose a convolutional neural network (CNN)-based algorithm for reliable prediction of epileptic seizures.
The major contributions of this study are as follows: (1) We provide an extensive quantitative analysis of the human iEEG data preceding and between epileptic seizures; (2) We introduce an efficient pre-processing strategy for transforming time-series iEEG data into an image-like format that can better leverage the power of CNNs; (3) We propose a multi-scale CNN architecture that can learn different representations of iEEG data efficiently; and (4) We demonstrate that the proposed seizure prediction algorithm works reliably for different patients with refractory focal epilepsy, and outperforms the state-of-the-art algorithms for this problem. To the best of our knowledge, this is the first attempt to automatically learn both local and highly abstracted iEEG features simultaneously. This is in contrast to previous studies that rely on extracting hand-crafted features.

7.2 Human iEEG Quantitative Analysis

In this section, we first describe the human intracranial EEG data used in our study and then present some hidden insights into the data, in order to better understand the unpredictable nature of epileptic seizures that we have been attempting to quantify.

7.2.1 Subjects and Data

The iEEG data used are from the 2016 Kaggle seizure prediction competition and are described in [127]. The data were recorded continuously from humans suffering from refractory (drug-resistant) focal epilepsy using the NeuroVista Seizure Advisory System (described in [50]). Sixteen electrodes (4×4 contact strips) were implanted in every patient, directed to the presumed seizure focus, and connected to a telemetry unit embedded in the subclavicular area. A rechargeable battery powered the embedded device. Data were sampled at 400Hz, digitized using a 16-bit analog-to-digital converter, wirelessly transmitted to an external hand-held advisory device, and continuously saved on a removable flash drive.

Three drug-resistant patients who also had the lowest seizure prediction performance in [50] were chosen for our study.
The selection of these patients was motivated by the intention to demonstrate that accurate seizure prediction is also feasible for patients with drug-resistant epilepsy. The large number of seizures recorded per patient (~380) gives us the chance to explore the common signature within the iEEG preceding seizures of every patient. The three patients were female and all had resective surgery before the trial. The first, second, and third patients were 22, 51, and 53 years old at the time of the trial, and were diagnosed with epilepsy at the ages of 16, 10, and 15, respectively. The first patient suffered from parietotemporal focal epilepsy and was receiving the antiepileptic drugs carbamazepine, lamotrigine, and phenytoin. The second patient was diagnosed with occipitoparietal focal epilepsy and was taking carbamazepine. The third patient had seizures of frontotemporal origin and was taking perampanel, phenytoin, and lacosamide.

Neuroscientists have found that the temporal dynamics of the brain activity of epileptic patients can often be categorized into four different states: preictal (prior to seizure), ictal (seizure), postictal (after seizure), and interictal (between seizures). The key challenge in seizure prediction is to differentiate between the preictal and interictal brain states in humans with epilepsy. Only lead seizures, defined as seizures occurring at least 4 hours after a previous seizure, were used for every patient. The captured iEEG data were labeled and separated into training and testing sets (for training and testing the proposed seizure prediction method). To mimic real-life clinical settings, the testing data were recorded well after the training data.

Figure 7.1: Examples of one-hour preictal iEEG clips collected by a 16-channel device with a 5-minute offset before seizures.
To avoid the signal non-stationarity in the period immediately following sensor implantation, both sets of data (training and testing) were obtained from data recorded between 1 and 7 months after implantation.

As shown in Figure 7.1, 65-min preictal data clips were extracted preceding the lead seizures. They are divided into six 10-minute clips followed by a 5-minute offset segment before each seizure. Every two consecutive segments were also separated by a 10-second gap. In a similar manner, interictal clips were segmented from 61-minute recordings that started at an arbitrary time with a minimum gap of 3 hours before and 4 hours after any seizure. They were also split into six 10-minute clips with 10-second spacing. A summary of the data for every patient and the number of iEEG clips in the training and testing data sets are shown in Table 7.1.

Table 7.1: Description of the seizure prediction intracranial EEG dataset.

Patient  Age      Gender  Recording        Seizures  Training clips        Testing clips
         (years)          duration (days)            Preictal  Interictal  Preictal  Interictal
1        22       Female  559              390       256       570         16        46
2        51       Female  393              204       222       1836        18        279
3        53       Female  374              545       255       1908        18        188

7.2.2 Are Interictal and Preictal iEEGs Statistically Different?

Most seizure prediction problems focus on the discrimination between the interictal (baseline) and preictal brain states. The question that arises here is: Are the interictal and preictal iEEGs statistically different? If yes, can we use their statistical features to differentiate between them? In an effort to address this question, we use descriptive statistics to describe whether these data are similar or different. The boxplot is a descriptive statistics tool that presents the data distribution using the five-number summary: minimum, first quartile, median, third quartile, and maximum.
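As a minimal sketch of what a boxplot summarizes, the five-number summary and two spread measures derived from it (overall range and interquartile range) can be computed as follows. The function names and the toy readings are hypothetical and are not taken from the thesis code or the dataset; the "inclusive" quartile convention is one of several in common use and is an assumption here.

```python
import statistics

def five_number_summary(samples):
    """Return (minimum, Q1, median, Q3, maximum) for a sequence of numbers."""
    ordered = sorted(samples)
    # method="inclusive" treats the samples as the whole population, so the
    # quartiles always lie between the observed minimum and maximum.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

def spread_measures(samples):
    """Return the overall range and the interquartile range (IQR)."""
    minimum, q1, _, q3, maximum = five_number_summary(samples)
    return maximum - minimum, q3 - q1

# Toy readings for one hypothetical sensor (illustrative values only).
toy_sensor = [-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0]
summary = five_number_summary(toy_sensor)
data_range, iqr = spread_measures(toy_sensor)
```

Applying the same two functions to the interictal and preictal readings of each sensor would give the quantities that a comparative boxplot displays side by side.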
To interpret a boxplot, we focus on three measures:

• Median: The median is indicated by the horizontal line in the center of the box.
• Range: The range describes the spread of the data, represented by the vertical distance between the minimum and maximum values.
• Interquartile Range (IQR): The IQR shows the likely range of variation and is represented by the length of the box spanning the first quartile to the third quartile.

Figure 7.2 shows a comparative boxplot for Patient 1’s interictal and preictal iEEG recordings. It can be seen that, for all 16 sensors, the median values of the interictal and the preictal iEEG are both around zero. Moreover, for most of the sensor readings (~11 sensors), the overall range and interquartile range of the preictal data are much greater than those of the interictal data. For the rest of the sensor readings, however, the overall range and interquartile range of the preictal data are smaller than those of the interictal data. Interestingly, both data sets display potential outliers at both ends.

Although the interictal and preictal data seem to have different dispersion (which implies that they are statistically different), we believe that such statistical features can only be used for patient-specific seizure prediction systems. Figure 7.3, for instance, shows that the distributions of interictal and preictal iEEG sensor readings for Patient 2 are quite different from those of Patient 1. Also, within the same iEEG data class (e.g., preictal), the overall range and IQR of the iEEG sensor readings of Patient 2 are smaller than those of Patient 1. Based on our observations for different patients, there is no typical trend in either interictal or preictal data across different epileptic patients.
The iEEG data of each patient have their own characteristics, and their statistical features are meaningful only for building a seizure prediction system for that particular patient.

Figure 7.2: Boxplots of Patient 1’s interictal and preictal iEEG data.

7.2.3 Can PCA be Efficient for iEEG Data Reduction?

Principal component analysis (PCA) is a prime tool for data reduction, as it projects higher-dimensional data onto a lower-dimensional space. PCA has been widely used to reduce the dimensionality of scalp (surface) EEG data, particularly for seizure onset detection. It also helps reduce the dimension of multi-channel EEG signals, and hence speeds up the subsequent machine learning steps (e.g., feature extraction and classification), minimizing the computation time of the overall EEG-based diagnosis system for neurological disorders. PCA also helps remove the redundancy (correlation) in multi-variate EEG data, while maintaining most of the variance (information) in the observed variables. A useful measure is the “explained variance”, which can be calculated from the eigenvalues to measure how much information is represented by each of the principal components.

The work presented in [210] demonstrates how PCA can be effectively used to choose the optimal feature subset from the original EEG feature set, and therefore improve the epileptic seizure detection performance. PCA has been shown to be a valuable data reduction tool that preserves a significant amount of the EEG data variance and achieves stable seizure detection results. The question that arises here is: Can we use PCA to efficiently reduce the iEEG data dimensionality for epileptic seizure prediction? The iEEG dataset under study has 16 feature columns (16 sensors), a training set of 5,047 instances, and a test set of 565 instances (Table 7.1 shows the details of the dataset used). To answer the above question, we applied PCA to the interictal and preictal iEEG data under study.
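The "explained variance" measure mentioned above can be sketched in a few lines. This is an illustration only: the function names, the toy eigenvalues, and the `threshold` parameter (the fraction of variance one wishes to retain) are hypothetical, and the eigenvalues are assumed to come from the covariance matrix of the sensor readings.

```python
def explained_variance_ratio(eigenvalues):
    """Fraction of the total variance carried by each principal component."""
    ordered = sorted(eigenvalues, reverse=True)  # largest component first
    total = sum(ordered)
    return [value / total for value in ordered]

def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative ratio reaches threshold."""
    cumulative = 0.0
    for count, ratio in enumerate(explained_variance_ratio(eigenvalues), start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return count
    return len(eigenvalues)

# Hypothetical eigenvalues of a 4-variable covariance matrix.
toy_eigenvalues = [1.0, 4.0, 2.0, 3.0]
ratios = explained_variance_ratio(toy_eigenvalues)
needed = components_needed(toy_eigenvalues, threshold=0.85)

# If 16 sensors carried equal variance, no component could be dropped.
needed_flat = components_needed([1.0] * 16, threshold=0.95)
```

The second call illustrates the limiting case in which PCA offers no reduction: when the variance is spread evenly over all components, every one of them is needed to reach a high retention threshold.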
We then tested whether the 16-sensor readings can be mapped onto a smaller number of principal components as follows: we chose the minimum number of principal components such that 95% of the variance was retained. Figures 7.4 and 7.5 show the variance explained by the individual principal components (colored in blue) and the cumulative explained variance (colored in red) for both the interictal and preictal iEEG data clips for Patient 1. Unexpectedly, Figure 7.4 shows that, for interictal iEEG, the first principal component accounts for a small amount of the data variance (~16%), and 95% of the variance is contained in 14 principal components. Only the 15th and 16th principal components can be dropped without losing too much information. Similarly, Figure 7.5 shows the variance explained by the 16 principal components of the preictal iEEG data. It provides useful insights into the distribution of variance across the principal components and suggests that at least 14 principal components are needed to preserve most of the data variance. This suggests that PCA is not a reliable method for iEEG data reduction for our problem.

Figure 7.3: Boxplots of Patient 2’s interictal and preictal iEEG data.

7.2.4 Can We Exclude some EEG Channels/Sensors?

Since PCA failed to reduce the dimensionality of the human iEEG data while retaining most of its variance, we next consider a strategy that researchers have used to reduce the amount of data and the computation time: turning some of the sensors off. The excluded channels are usually the less relevant or redundant ones. There is a trade-off between selecting fewer EEG channels and retaining as much spatial information as possible.
The work presented in [177] investigates the epileptic seizure detection performance for different channel configurations. As expected, using all EEG channels from a 10-20 EEG configuration achieves the best seizure detection performance. Choosing a moderately smaller number of EEG channels (e.g., 16 and 8) reduces the amount of data to 36-72% of its original size, and results in a minor degradation in the seizure detection performance [177].

Figure 7.4: Explained variance by different principal components of interictal iEEG data for Patient 1.

Figure 7.5: Explained variance by different principal components of preictal iEEG data for Patient 1.

Figure 7.6: Heatmap of pairwise correlation values of Patient 1’s interictal iEEG’s 16 sensors.

Whilst many researchers have successfully used channel selection for surface EEG data reduction applied to seizure detection, there is no clear consensus regarding turning off any or some intracranial EEG channels in the seizure prediction task. The correlation heatmap gives us a way to gain some insight into which sensors (channels) are correlated and whether we can exclude certain sensors. Figures 7.6 and 7.7 depict the correlation heatmaps of iEEG sensor data during the interictal and preictal brain states for Patient 1. Positive correlations are displayed with increasingly red hues and negative correlations with increasingly blue shades. Faded red and blue colors denote low positive and negative correlations, respectively. The correlation coefficients are also included in the heatmaps, displaying the dependency relationships between the 16 iEEG sensors.

In [86], Hall et al. proposed a correlation-based feature selection method and concluded that: “a representative feature (sensor) subset is one that comprises features (sensors) highly correlated with the target (iEEG class), yet uncorrelated with each other”.
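The pairwise correlation values displayed in such heatmaps can be computed as in the minimal sketch below. It assumes the Pearson correlation coefficient, which the text does not name explicitly; the function names and the three toy channels are hypothetical and purely illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(channels):
    """Pairwise correlations; channels is a list of per-sensor sample lists."""
    return [[pearson(a, b) for b in channels] for a in channels]

def off_diagonal_range(matrix):
    """Smallest and largest correlation between two different sensors."""
    size = len(matrix)
    values = [matrix[i][j] for i in range(size) for j in range(size) if i != j]
    return min(values), max(values)

toy_channels = [
    [1.0, 2.0, 3.0, 4.0],  # sensor A
    [2.0, 4.0, 6.0, 8.0],  # sensor B: scaled copy of A (fully redundant)
    [4.0, 3.0, 2.0, 1.0],  # sensor C: mirror image of A
]
matrix = correlation_matrix(toy_channels)
lowest, highest = off_diagonal_range(matrix)
```

In this toy case sensor B duplicates sensor A, so one of the two could be dropped without loss; the off-diagonal range gives a quick check of whether any such near-duplicate pair exists among real sensors.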
Following the same rule, we looked for iEEG sensors that are highly correlated during the interictal and preictal brain states. If a sensor’s predictive ability is covered by another, then it can safely be removed. As shown in Figures 7.6 and 7.7, the iEEG sensors are not heavily correlated in either the interictal or the preictal brain state. The cross-correlation values range between 0.49 and 0.61 for the interictal iEEG sensor readings, and between 0.82 and 0.83 for the preictal iEEG sensor readings. Therefore, we advise against excluding any of the iEEG channels, in order to retain the spatial information needed for accurate prediction of epileptic seizures.

Figure 7.7: Heatmap of pairwise correlation values of Patient 1’s preictal iEEG’s 16 sensors.

As illustrated in the previous two sections, the PCA and channel selection methods were found to be ineffective at reducing the dimensions of the large seizure prediction dataset at hand. Therefore, we plan to test the effectiveness of a relatively new dimensionality reduction method, the “autoencoder”. Autoencoders are a class of neural networks that attempt to compress the information in large input data into a lower-dimensional space. The autoencoder, like other neural networks, is trained using a gradient descent optimizer that iterates until the cost function converges. The question then arises: “Does the compact representation generated by the autoencoder preserve all the information from the original data?” This question will be addressed in future studies.

7.2.5 Are Neighbor iEEG Sensors more Correlated than Distant Ones?

Figure 7.8 shows an example of a computerized tomography (CT) scan of the head of one patient, and reveals the locations of the 16 iEEG electrodes on the cortical surface of the brain. The questions here are: Are adjacent iEEG sensors more correlated than distant ones? If yes, are they positively or negatively correlated?
In an attempt to answer these questions, we first identified the approximate location of the iEEG electrodes on the cerebral cortex of the epileptic patients under study. Unlike most seizure prediction systems, which deploy iEEG electrodes implanted across both brain hemispheres, the electrodes of the seizure advisory system used in the studied dataset were placed on one side, over the small brain region presumed to include the epileptogenic zone. In patients with bilateral temporal lobe epilepsy, electrodes were positioned over the brain hemisphere that generates most seizures.

Figure 7.8: CT scan of the NeuroVista seizure advisory system implanted in a patient [50].

Figure 7.9: Hierarchically clustered preictal iEEG sensors with dendrograms and clusters in Patient 1.

Figure 7.10: Hierarchically clustered preictal iEEG sensors with dendrograms and clusters in Patient 2.

Figures 7.9 and 7.10 display the clustered heatmaps of preictal iEEG sensor data for Patients 1 and 2, respectively. As shown in Figure 7.9, the iEEG sensor data of Patient 1 are grouped into three main clusters, each including a set of sensors that have similar correlations. The first cluster comprises the sensors S3, S4, S7, S12, and S15 (S3 and S4 are neighbors), while the second cluster includes S2, S6, S11, and S14, and the third cluster includes S1, S5, S8, S9, S10, S13, and S16 (S9 and S10 are neighbors). As can be seen, adjacent sensors do not necessarily have similar correlation profiles. These clustering methods may allow researchers to determine which particular sensors act as seizure activity sources and which act as sinks. However, categorizing iEEG sensors into ictal activity generators and sinks based on their correlation profiles is not yet well established.
In Figure 7.10 (for Patient 2), the first cluster includes S6, S7, and S8 (all neighbors), the second cluster includes S1, S2, S3, S4, S9, S10, S11, and S12 (where S1, S2, S3, and S4 are neighbors, and S9, S10, S11, and S12 are also neighbors), and the third cluster comprises S5, S13, S14, S15, and S16 (S5 is not a neighbor of the others).

Although both Patients 1 and 2 were diagnosed with focal epilepsy, it is important to note that they have different preictal iEEG data clusters. Similarly, Patient 3 has a totally dissimilar iEEG clustered heatmap (omitted for lack of space). The main reason why patients have different clustered heatmaps of preictal iEEGs is that they have different epileptogenic zones. Patient 1, for example, was diagnosed with parietal-temporal lobe epilepsy, while Patient 2 was diagnosed with occipito-parietal lobe epilepsy. In brief, the correlations between iEEG sensors vary depending on which brain region is affected and whether the seizure is focal (partial) or generalized.

To summarize, the conducted quantitative analysis of iEEG data suggests that iEEG data analysis for seizure prediction can be quite different from regular EEG data analysis for other biomedical classification/prediction problems, and that it is desirable to design an iEEG-based seizure prediction algorithm that (i) uses all iEEG channels for proficient feature extraction, (ii) avoids using PCA for iEEG dimensionality reduction, and (iii) avoids using domain-based hand-crafted features for iEEG classification. Such observations help explain why epileptic seizure prediction using iEEG data remains a challenging topic and why traditional machine learning algorithms using hand-crafted features have not yielded satisfactory performance in the literature. Therefore, the quantitative analysis in Section 7.2 motivates us to explore a deep learning-based approach for seizure prediction using iEEG data.
The proposed approach in Section 7.3 has the desired properties described above.

7.3 Methodology

Convolutional neural networks have achieved remarkable results in different research areas such as face recognition [193], object detection [63], image classification [125], and speech recognition [77]. In this study, we propose the use of CNNs for epileptic seizure prediction.

We rely on CNNs to extract different representations of the iEEG data to use for iEEG classification. The proposed neural network architecture is unique in that it allows us to recover both the local features (through smaller convolutions) and the highly abstracted features (with larger convolutions). Pre-processing and preparing the data for the feature extraction and classification processes are explained below.

7.3.1 iEEG Preprocessing

The first step in building our pipeline is to preprocess the time-series iEEG data by 1) reducing the data dimensionality, 2) dividing the data into smaller segments, and 3) encoding the data into an image-like format.

Data Reduction

A key limitation in the seizure prediction problem is the large data size. The data used had a total of 5,612 iEEG clips (4,827 interictal and 785 preictal); each clip is 10 minutes long, and given the high sampling rate of 400Hz, each iEEG clip had 240,000 data-points (10min×60sec×400Hz). The total size of the three patients’ iEEG data was around 120GB. Reducing the data size can significantly reduce the required memory and computational resources.

As discussed in Section 7.2, PCA is not suitable for dimensionality reduction of human iEEG data. We use a simple alternative for data reduction by resampling the iEEG clips at 100Hz rather than 400Hz. Figure 7.11 shows an example of the frequency spectra of interictal and preictal iEEGs recorded by the 16-channel seizure advisory system in Patient 1.
It can be seen that, for both interictal and preictal iEEGs, a significant amount of iEEG information is concentrated in the low-frequency band (<50Hz), whereas there is not much relevant information in the high-frequency band (>50Hz). Following the Nyquist criterion, we thus resample the 10-minute iEEG clips at 100Hz (2×50Hz) instead of 400Hz. This reduces the dimensionality of the data by a factor of 4. The resulting number of data-points in an iEEG clip is thus 60,000 (10min×60sec×100Hz).

Figure 7.12 demonstrates how the down-sampling process affects iEEG signals in both the time and frequency domains. Figure 7.12(a) shows an example of a time-series preictal iEEG signal and Figure 7.12(b) depicts its spectrum along the entire frequency range of 0-200Hz. Figure 7.12(c) shows the down-sampled (by a factor of 4) version of the iEEG signal in Figure 7.12(a). The frequency spectrum of the down-sampled signal is shown in Figure 7.12(d). It can be seen in the figure that the power spectrum of the signal is concentrated in the frequency band of 0-50Hz. It is important to note that, in some cases, this down-sampling process may hurt the prediction performance, because some of the iEEG samples have energy beyond 50Hz.

iEEG Segmentation

Human surface and intracranial EEGs are usually non-stationary signals, i.e., their statistical characteristics change over time [25]. The main intuition behind iEEG segmentation is to divide a non-stationary iEEG signal into several pseudo-stationary segments, as those are expected to have comparable features [88]. Another advantage of segmentation is that it produces the large number of labeled iEEG samples needed for training deep CNNs to improve their performance. In this work, each down-sampled 10-minute iEEG clip is further split into 10 non-overlapping 1-minute iEEG segments; each has 6,000 data-points.
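The sample counts and the segmentation step described above can be checked with a short sketch. The function names are hypothetical, the placeholder list merely stands in for one channel of a down-sampled clip, and the resampling itself (which would normally include an anti-aliasing filter) is deliberately not shown.

```python
def samples_per_clip(minutes, sampling_rate_hz):
    """Number of data-points in a clip of the given duration and sampling rate."""
    return minutes * 60 * sampling_rate_hz

def split_into_segments(clip, n_segments):
    """Split a sequence into n equal, contiguous, non-overlapping segments."""
    if len(clip) % n_segments != 0:
        raise ValueError("clip length must be divisible by n_segments")
    size = len(clip) // n_segments
    return [clip[i * size:(i + 1) * size] for i in range(n_segments)]

original_len = samples_per_clip(10, 400)     # 10 minutes at 400 Hz
downsampled_len = samples_per_clip(10, 100)  # 10 minutes at 100 Hz

# Placeholder values standing in for one channel of a down-sampled clip.
clip = list(range(downsampled_len))
segments = split_into_segments(clip, 10)
segment_lengths = {len(segment) for segment in segments}
```

Because the segments are contiguous and non-overlapping, concatenating them in order reproduces the original clip, so no data-points are lost or duplicated by this step.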
This results in a tenfold increase in the total number of training (interictal and preictal) iEEG samples.

Mapping Time-series iEEG Data into Images

Since CNNs have achieved great success in computer vision tasks, we are motivated to convert time-series iEEG data into an image-like format. We propose the use of the short-time Fourier transform (STFT) to obtain a two-dimensional representation of each iEEG segment. The 1-minute iEEG segments (6,000 data-points each) are first partitioned into shorter chunks of equal length, and then the Fourier transform is computed for each individual chunk. This reveals the changing power spectra as a function of time and frequency. Figure 7.13 shows the three-dimensional spectrogram of a 1-minute preictal iEEG segment. It can be seen that the iEEG signal power decays along the frequency axis, starting with 30dB at 0-5Hz and till 30dB at 40-50Hz.

Figure 7.11: Frequency spectra of interictal (blue) and preictal (red) iEEG signals collected by the 16 implanted electrodes (channels) of the seizure advisory system in Patient 1. Ch1-Ch16 stand for Channels 1 to 16.

After iEEG downsampling and segmentation, the resulting dimension of the data is 10N×16×6,000, where N is the total number of 10-minute iEEG training clips, 10 is the number of iEEG segments per clip, 16 is the number of iEEG channels, and 6,000 is the length of each iEEG segment. For each iEEG segment, we compute the STFT for the 16 channels separately.
The resulting dimension of the data is thus 10N×16×129×26, where 129 and 26 are the numbers of frequencies and time chunks, respectively.

Figure 7.12: Time-series iEEG signals and their corresponding spectra: (a) and (b) original iEEG clip and its 200Hz frequency spectrum; (c) and (d) downsampled-by-4 iEEG clip and its 50Hz frequency spectrum. [Panels: Original iEEG, Spectrum of Original iEEG, Downsampled-by-4 iEEG, Spectrum of Downsampled-by-4 iEEG; axes: time (seconds) or frequency (Hz) against amplitude.]

Figure 7.13: 3D Spectrogram of a 1-minute preictal iEEG segment.

Figure 7.14 illustrates all the transformations applied to the raw iEEG data before they are fed to the seizure prediction algorithm. The final output is a set of STFTs of 1-minute duration and 50Hz bandwidth.
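The mapping from a 6,000-point segment to a 129×26 image can be illustrated with the NumPy sketch below. The text does not state the window length, overlap, or window function; the values used here (a 256-sample Hann window with a 32-sample overlap) are assumptions chosen only because they reproduce the stated 129×26 shape, and `stft_magnitude` is a hypothetical helper rather than the thesis implementation.

```python
import numpy as np

def stft_magnitude(signal, window_length=256, overlap=32):
    """Magnitude STFT: rows are frequency bins, columns are time chunks."""
    step = window_length - overlap
    n_chunks = (len(signal) - overlap) // step  # only complete chunks are kept
    window = np.hanning(window_length)          # assumed window function
    chunks = np.stack([signal[i * step:i * step + window_length] * window
                       for i in range(n_chunks)])
    # rfft keeps the non-negative frequencies: window_length // 2 + 1 bins.
    return np.abs(np.fft.rfft(chunks, axis=1)).T

fs = 100                                     # sampling rate after down-sampling (Hz)
t = np.arange(60 * fs) / fs                  # one minute, 6,000 data-points
toy_segment = np.sin(2 * np.pi * 10.0 * t)   # 10 Hz test tone, not real iEEG
image = stft_magnitude(toy_segment)

freqs = np.fft.rfftfreq(256, d=1 / fs)       # bin centres from 0 to 50 Hz
peak_hz = freqs[image.mean(axis=1).argmax()]
```

With these assumed settings the frequency axis spans 0-50Hz in 129 bins, and the test tone shows up as a ridge near 10Hz in every time chunk. Repeating the transform for each of the 16 channels of a segment gives one 16×129×26 sample.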
To standardize the training data across all 16 channels, the mean and standard deviation were computed for each STFT image in every channel; the same statistics were then used to standardize the test set prior to classification.

Figure 7.14: Schematic pipeline of the proposed iEEG data pre-processing approach for epileptic seizure prediction: SEG1, SEG2, ..., SEG10 correspond to the 1st, 2nd, ..., 10th iEEG segments of each iEEG channel signal; NCh is the total number of iEEG channels (NCh=16), L is the number of segments per iEEG clip (L=10), and d is the number of data-points in each iEEG segment (d=6,000). [Diagram stages: iEEG signals (Ch 1 to Ch 16, T = 10min, D = 60,000); iEEG segmentation into 10 segments of 1min each (d = 6,000); data reshape into an iEEG tensor (NCh × d × L) with iEEG slices (d × L = 6,000 × 10); iEEG spectrogram (NCh × 129 × 26).]

7.3.2 Automatic Feature Learning of iEEG

As iEEG is time-series data, we first attempted to use LSTM recurrent neural networks for iEEG feature extraction and classification. The iEEG data samples, however, have thousands of time steps, and LSTMs can be challenging to use with such long input sequences. These so-called long sequence classification tasks require special handling when using recurrent neural networks. We first attempted to use the long iEEG sequence data as-is, but this resulted in vanishing gradients and hence an unlearnable model. The other strategy was to truncate these long sequences by deleting some time steps from their beginning or end. This helped solve the vanishing gradient problem, but at the cost of losing data that is crucial for accurate seizure prediction. We also used truncated back-propagation through time to force the LSTM to back-propagate over only a subset of the last time steps. However, the long-term dependencies in the EEG data were then lost.
These limitations of LSTM and all its variants motivated us to investigate the potential of recent CNN architectures for seizure prediction.

Figure 7.15 depicts the detailed architecture of the proposed CNN-based seizure prediction method. As explained in the previous subsection, the input data takes the shape of 10N × 16 × 129 × 26, where 10N is the total number of iEEG samples. Each sample comprises 16 STFT images, one for each iEEG channel. These images are first fed into a combination of CNNs of the same filter size (1 × 1), with their output supplied to other CNNs of bigger filter sizes (3 × 3 and 5 × 5). Additionally, since pooling operations have been shown to play a key part in successful convolutional networks, we add a parallel path in which maximum pooling and convolution operations are applied. Ultimately, the outputs of all parallel paths are concatenated into a single feature vector forming the input of the next few layers. The concatenated output is flattened and presented as an input to two fully connected (FC) layers. Finally, a sigmoid function is used to compute the label probabilities and predictions [112].

As for the proposed architecture, we were inspired by the successful CNN architectures in computer vision tasks. The goal here is to learn different feature maps and then combine them so that the following FC layers can learn discriminative features from different scales simultaneously. The proposed CNN model was trained using 50,470 iEEG spectrograms (43,140 interictal and 7,330 preictal), and its seizure prediction performance was tested using 5,830 iEEG spectrograms (5,310 interictal and 520 preictal).² To make a single prediction for each 10-minute iEEG clip, we take the maximum probability over the past consecutive ten 1-minute predictions.
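The clip-level decision rule above (one score per 10-minute clip from ten 1-minute predictions) can be sketched as below. The function name and the `how` switch are hypothetical and serve only to illustrate the rule; the segment probabilities are assumed to be ordered so that each run of ten consecutive values belongs to the same clip.

```python
import numpy as np


def clip_probabilities(segment_probs, segments_per_clip=10, how="max"):
    """Aggregate per-segment preictal probabilities into per-clip scores.

    segment_probs: flat sequence whose length is a multiple of
    segments_per_clip, ordered clip by clip.
    """
    probs = np.asarray(segment_probs, dtype=float)
    if probs.size % segments_per_clip != 0:
        raise ValueError("length must be a multiple of segments_per_clip")
    per_clip = probs.reshape(-1, segments_per_clip)
    if how == "max":
        return per_clip.max(axis=1)   # a single confident segment flags the clip
    if how == "mean":
        return per_clip.mean(axis=1)  # averaging dilutes isolated high scores
    raise ValueError("how must be 'max' or 'mean'")
```

With the maximum, a clip is scored by its most preictal-looking minute, whereas the mean lets nine low-probability minutes pull the score down.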
The arithmetic mean was also computed for every ten consecutive predictions; however, it yielded inferior prediction results to those of the maximum.

From the network configuration perspective, our CNN model was trained by optimizing the "binary cross-entropy" cost function with the "Adam" parameter update and a learning rate of 0.001. The 1 × 1, 3 × 3, and 5 × 5 convolutional networks have 64 units each. The number of units for the two FC layers was set to 124 and 64, respectively. Our implementation was developed in Python using Keras with a TensorFlow backend, and the training took eight hours on an NVIDIA K40 GPU machine.

² Our experiments on the human iEEG-based seizure prediction using multi-scale CNNs were conducted with the open-source software of Keras using TensorFlow backend [99].

Figure 7.15: Schematic diagram of the proposed neural network architecture for epileptic seizure prediction: each iEEG instance is of NCh × 129 × 26 (NCh=16); Max Pooling stands for Maximum Pooling; FC layer stands for Fully Connected layer; P1 and P2 are the probabilities produced by the sigmoid function for classes 1 and 2, respectively. (Diagram blocks: iEEG Spectrogram (NCh × 129 × 26); 1x1 Convolutions (four blocks); 3x3 Convolutions; 5x5 Convolutions; 3x3 Max Pooling; Features Concatenation; Flatten Layer; FC Layer (two blocks); Sigmoid Function; outputs P1 and P2.)

7.3.3 Performance Evaluation.

The metrics used to assess the prediction performance of the proposed method are sensitivity (SEN) and area under the receiver operating characteristic (ROC) curve (AUC). To examine the generalizability of our seizure prediction algorithm over different subjects, we first evaluate the performance metrics for the three patients individually and then report the average performance.

7.4 Results and Discussion

In this section, we first assess the seizure prediction performance of the proposed algorithm and compare it to the state-of-the-art methods when tested on the same benchmark iEEG dataset [50].
We also provide four intuitive reasons why current seizure prediction algorithms achieve limited performance.

7.4.1 Seizure Prediction Results

In this section, we test our seizure prediction algorithm and compare the results with those of four recent studies (reported in [50], [116], [117], and [121]) that were tested on the same human iEEG dataset. In [50], Cook et al. implanted the first-in-man seizure advisory system (SAS) in patients with refractory epilepsy. After SAS implantation, a seizure prediction algorithm was developed to distinguish time intervals of low, medium, and high likelihood of upcoming seizure attacks. Satisfactory seizure prediction results were achieved for most of the patients, proving that seizure prediction was possible. For example, the average seizure prediction sensitivity for all patients was around 61.20%, while the three patients under study had inferior seizure prediction sensitivities with an average of 33.67%. The main reason behind such low seizure prediction performance was the temporal drift (variation) observed in the adopted time-dependent iEEG features (this study did not report what kind of iEEG features were extracted and what classifier was used to evaluate their effectiveness) [50].

In [116], Karoly et al. developed a detection method to identify probable seizure activities. They used the preictal spike rate as an indicator that the brain is approaching a seizure. However, the spike rate was found not to be a consistent biomarker across patients, as it increased before seizures for 9 of the 15 patients and decreased for the rest. The classification of iEEG clips was performed manually, and the results showed an average seizure prediction sensitivity of 66.54% for all patients and 43.34% for the three patients under study. In [117], the same authors proposed a circadian seizure forecasting method to achieve robust prediction performance across all patients.
With the help of the adopted circadian information, a significant improvement in the predictive power of seizure prediction methods was observed for most of the patients. The logistic regression classifier was used to assess the prediction performance of the proposed algorithm. Average seizure prediction sensitivities of 62.10% and 52.67% were achieved for all patients and for the three patients under study, respectively. In [121], Kiral-Kornek et al. used deep learning to build an accurate seizure warning system that is patient-specific and can be adjusted to meet patients' needs. The deployed algorithm manifested notable seizure prediction results for all patients, yielding an average sensitivity of 69.00% for the 15 patients and 77.36% for the three patients whose data is studied in this work.

Table 7.2: Seizure prediction sensitivity scores (%) of the algorithms under study.

Subject     Cook et al.   Karoly et al.   Karoly et al.   Kiral-Kornek et al.   Proposed Method   Proposed Method
            [50] (2013)   [116] (2016)    [117] (2017)    [121] (2018)          (DS-by-4)         (DS-by-2)
Patient 1   45.00         56.00           55.00           71.10                 82.92             79.65
Patient 2   17.00         18.00           45.00           83.10                 88.52             91.86
Patient 3   39.00         56.00           58.00           77.90                 91.84             92.05
Average     33.67         43.34           52.67           77.36                 87.76             87.85

Patients 1, 2, and 3 in the Kaggle competition dataset are the same as Patients 3, 9, and 11 in [50], [116], [117], and [121]. DS = downsampled. DS-by-4 and DS-by-2 indicate the seizure prediction results using the downsampled-by-4 and downsampled-by-2 iEEG data, respectively.

It is worth noting that the seizure prediction algorithms presented in [50], [116], [117], and [121] could not achieve adequate prediction sensitivity for Patients 3, 9, and 11. These are the same as Patients 1, 2, and 3 in our study. The potential reasons for such poor performance are given in the following subsection. In our work, a multi-scale CNN architecture was specifically built to accurately recognize pre-seizure patterns in the preictal iEEG data taken from those three patients.
Table 7.2 summarizes the seizure prediction results achieved by the proposed and the state-of-the-art methods. It can be noticed that, for the three patients, our algorithm yields superior seizure prediction sensitivity rates. It is worth mentioning that the seizure prediction performance is affected by the number of training iEEG samples for individual patients. For instance, Patient 3 has the largest number of training samples (see Table 7.1), which in turn helps our CNN algorithm achieve a notably high seizure prediction sensitivity of 91.84%. In contrast, a lower prediction sensitivity of 82.92% was achieved for Patient 1, who had the smallest number of training samples.

As explained in Section 7.3, all the iEEG signals are downsampled by a decimation factor of 4 for the purpose of speeding up the deployed deep learning algorithm. A question that may arise here is: does the downsampled-by-4 data closely represent the original data? Resampling the data at 100 Hz means that all iEEG signal content above 50 Hz is discarded. After careful examination of the frequency spectra of a wide range of interictal and preictal iEEG signals, we have found the following: (i) for most of the iEEG signals, the signal power dwells in the low-frequency band of 0-50 Hz (as shown in Figure 7.11), and (ii) for a few iEEG signals, a non-trivial amount of the signal power resides in higher frequency bands (>50 Hz), implying that downsampling the data to 100 Hz entails a considerable loss of information.

In an attempt to retain as much iEEG information as possible, the original iEEG signals are downsampled by a factor of 2 (i.e., fs=200 Hz). The proposed seizure prediction algorithm is then tested on the data after downsampling it by a factor of 2. The achieved seizure prediction sensitivities are reported in Table 7.2.
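One way to quantify how much signal content a given downsampling factor would discard is to measure the share of spectral power above the new Nyquist frequency (50 Hz for 100 Hz sampling, 100 Hz for 200 Hz sampling). The sketch below is illustrative only: the function name is hypothetical, a plain periodogram is assumed as the power estimate, and the 400 Hz original rate is inferred from the stated factors (4 giving 100 Hz, 2 giving 200 Hz).

```python
import numpy as np


def power_fraction_above(x, fs, cutoff_hz):
    """Fraction of total spectral power of x lying above cutoff_hz.

    x: 1-D signal sampled at fs Hz. Uses a simple periodogram
    (squared magnitude of the real FFT) as the power estimate.
    """
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return power[freqs > cutoff_hz].sum() / power.sum()


# Synthetic check: equal-amplitude tones at 10 Hz and 80 Hz, sampled at 400 Hz.
fs = 400
t = np.arange(10 * fs) / fs
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 80 * t)
lost_by_4 = power_fraction_above(x, fs, 50)    # content lost at fs = 100 Hz
lost_by_2 = power_fraction_above(x, fs, 100)   # content lost at fs = 200 Hz
```

For this synthetic signal, about half of the power sits above 50 Hz and essentially none above 100 Hz, which mirrors case (ii) in the text: such a signal would be badly served by downsampling by 4 but preserved by downsampling by 2.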
The results demonstrate that downsampling the data to 200 Hz helps preserve the iEEG signal information below 100 Hz, and hence improves the overall seizure prediction sensitivity to 87.85%.

Table 7.3: AUC scores for the proposed method and Kaggle top finishing contestants on the test set.

Scores of Top 10 Kaggle Solutions (2018) [127]   Proposed Method   Proposed Method
min-max                                          (DS-by-4)         (DS-by-2)
0.82-0.85                                        0.79              0.84

The first column shows the minimum and maximum AUC scores of the 10 top-scoring solutions of a post-competition study [127]. The second and third columns show the AUC score of our multi-scale CNN model on the iEEG data downsampled by 4 and 2, respectively.

Table 7.4: Per-subject AUC scores of the proposed method.

Subject     Proposed Method   Proposed Method
            (DS-by-4)         (DS-by-2)
Patient 1   0.68              0.69
Patient 2   0.82              0.89
Patient 3   0.87              0.94
Average     0.79              0.84

In addition, we compare our results to those of the winning solutions in the 2016 Kaggle seizure prediction challenge. The AUC was the metric used for ranking various solutions. The winning team deployed 11 different classification models and more than 3000 hand-crafted iEEG features, and achieved an average AUC of 0.85 (see Table 7.3) [28, 127]. It is, however, impractical to use such a computationally-intensive process for real-time applications. Our algorithm obtains a comparable but slightly lower seizure prediction AUC score of 0.84 on the iEEG data downsampled by 2. It is, however, much faster in obtaining the results and is thus more suitable for use in ambulatory and clinical applications. For faster iEEG processing (feature extraction and classification), the proposed algorithm is also tested on the downsampled-by-4 iEEG data, and an average AUC score of 0.787 is achieved.

Table 7.4 shows the AUC scores of our algorithm for individual patients when tested on iEEG data downsampled by 4 and 2.
It can be seen that, for the downsampled-by-4 data, AUC scores of 0.82 and 0.87 are achieved for Patients 2 and 3, respectively, while Patient 1 has a lower AUC score of 0.68. Using the downsampled-by-2 data helped improve the AUC scores for all patients. For example, a remarkable AUC score of 0.94 was achieved for Patient 3 (as this patient had the largest number of interictal and preictal iEEG samples). Patient 2 had a slightly lower AUC score of 0.89, and Patient 1 had the lowest AUC score of 0.69. The average AUC score produced by our CNN algorithm for all patients is 0.84.

7.4.2 Possible Reasons for Limited Performance

Seizure prediction algorithms examined on the 2016 Kaggle/Melbourne University iEEG dataset provide limited performance. The top contestant, for example, achieved the highest AUC of 0.85 [28, 127]. Herein, we list some potential reasons why existing machine learning and deep learning algorithms could not achieve any better performance.

Figure 7.16: Examples of corrupted iEEG clips for Patient 1. (Three panels; axes: Amplitude (V) against Time (seconds).)

iEEG Data Corruption

After a thorough inspection and visualization of the iEEG data under study, we realized that many of the 10-minute iEEG clips contain "data drop-out". This occurs when the implanted device fails to communicate with the storage device, for any of several possible reasons. This data drop-out corresponds to iEEG signal values of zero across all channels at a given time sample. A handful of 10-minute clips comprise 100% data drop-out and cannot be classified. Other clips, however, are partially corrupted and contain different percentages of data drop-out. Figure 7.16 displays various examples of iEEG signals comprising data drop-outs at different time samples.
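Following the definition above (zero values across all channels at a given time sample), the share of drop-out in a clip can be estimated with a few lines of NumPy. This is a minimal sketch; the function name and the (channels, samples) layout are assumptions made for illustration.

```python
import numpy as np


def dropout_fraction(clip):
    """Fraction of time samples at which every channel reads exactly zero.

    clip: array of shape (n_channels, n_samples).
    """
    dropped = np.all(clip == 0, axis=0)  # True where all channels are zero
    return float(dropped.mean())
```

A value of 1.0 would correspond to the fully corrupted clips that cannot be classified, while intermediate values identify partially corrupted clips.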
It is worth noting that if, for any reason, a dataset includes corrupted instances, standard machine learning algorithms can break down.

iEEG Data Mismatch

The terms "data mismatch", "concept drift", and "covariate shift" have been used to refer to the situation where data characteristics (distribution) change over time [222]. Often, the distribution of a particular data class (e.g., interictal or preictal) is assumed not to change over time, implying that the distribution of the historical data is the same as the distribution of the new data. This holds true for many machine learning problems, but not for all. In some cases, the characteristics of the data vary over time, and hence the predictive models trained on historical data are no longer valid for making predictions on new unseen data. After careful screening of the characteristics of the iEEG data under study, we realized that, for each individual patient, the distribution of either interictal or preictal iEEG data differs between the training set and the testing set. A potential reason for such a data shift is that the testing data was recorded a long time after the training data. During this time, the patient may have been positively or negatively influenced by the medications. Figure 7.17 depicts how the preictal iEEG data distributions for Patient 1 have changed over time. The top plot shows how the density of preictal iEEG signals recorded by Channel 1 differs between the training and testing sets. Similarly, the bottom plot shows the variations in preictal iEEG density for Channel 2.

Figure 7.17: Data mismatch in Patient 1's preictal iEEG sensor data.

Outliers in iEEG Data

Since iEEGs are recorded with electrodes attached directly to the surface of the brain, they are deemed to be free of body movement artifacts (e.g., muscle activities and eye blinking), and their signal-to-noise ratios are expected to be much higher than those of surface EEGs [157].
However, there exist some inevitable sources of artifacts (e.g., static magnetic field and environmental noise) that interfere with the iEEG signals, producing undesired outliers. The scatter plot in Figure 7.18 shows an example of "data outliers" which were detected in the S1 and S2 readings for some preictal iEEG clips taken from Patient 1. Other scatter plots (omitted for lack of space) demonstrate that, for the same iEEG clips, the remaining 14 sensor readings also include similar outliers. The presence of outliers often has a negative influence on the extracted features and, consequently, on the predictive power of the fitted models. If the outliers are removed from the fitting process, the resulting fit is more likely to maintain stable performance when it is tested on new unseen data.

Figure 7.18: Scatter plot to identify outliers in preictal iEEG data of Patient 1: Preictal S1 readings vs. Preictal S2 readings.

Imbalanced Class Distribution

The seizure prediction dataset under study has 5,047 iEEG samples. A total of 4,314 belong to Class-1 (interictal iEEG) and the remaining 733 samples belong to Class-2 (preictal iEEG). This represents a binary classification problem with imbalanced class distribution, where the ratio of Class-1 to Class-2 samples is 4,314:733, i.e., 5.885:1. With such an imbalanced dataset, the classification rules that predict the minority class tend to be weaker than those that predict the majority class; therefore, test samples belonging to the minority class are misclassified more often than those belonging to the majority class [191]. Thus, one of the possible reasons for limited seizure prediction performance is that the deployed machine learning algorithms do not take into consideration the interictal and preictal class distributions.
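The imbalance ratio quoted above can be turned directly into per-class weights, so that each minority-class sample counts roughly as much as six majority-class samples during training. The sketch below is only an illustration: the function name and the label mapping (0 for interictal, 1 for preictal) are assumptions, and the dictionary layout simply mirrors the form accepted by the `class_weight` argument of Keras `Model.fit`.

```python
def class_weights(n_interictal, n_preictal):
    """Weights that up-weight the minority (preictal) class by the class ratio.

    Returns a dict keyed by class label (assumed: 0 = interictal, 1 = preictal).
    """
    ratio = n_interictal / n_preictal
    return {0: 1.0, 1: ratio}


weights = class_weights(4314, 733)   # counts taken from the text
```

For the counts given in the text, the preictal weight comes out at about 5.885, matching the stated 5.885:1 ratio.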
They pay less attention to the minority class of preictal iEEGs, which is of crucial importance for forecasting impending seizures.

There exist many approaches that can tackle classification problems with imbalanced data. One effective procedure is to generate synthetic samples from the under-represented class. In this study, we first used the synthetic minority over-sampling technique (SMOTE) [46] to generate more samples from the minority class (preictal iEEG). However, a serious degradation in the seizure prediction performance was noticed (as the trained CNN model was overfitting). As an alternative solution, we employed class weights in training the proposed CNN model in order to force the model to treat every sample of preictal iEEG as approximately 6 samples of interictal iEEG. This helped achieve the remarkable seizure prediction performance reported in Tables 7.2, 7.3, and 7.4.

7.5 Summary and Conclusion

Big data is crucial for reliable seizure prediction. The Kaggle/Melbourne University Seizure Prediction Competition provides a big dataset. It includes long-term human intracranial EEG data recorded continuously from three patients with drug-resistant epilepsy for 373-559 days. Leveraging this dataset properly can lead to substantial improvements in the field of epileptic seizure prediction.

We have therefore carried out an extensive analysis of the iEEG data, from which we have provided, for the first time, a detailed quantitative analysis of the human iEEG data during the preictal and interictal brain states. This analysis reveals why hand-crafted iEEG features that have been successfully used in seizure detection are not the best candidates for robust prediction of epileptic seizures. We then proposed a convolutional neural network model for accurate prediction of epileptic seizures. The data was first divided into smaller segments on which we applied STFT.
This resulted in a much larger number of interictal and preictal iEEG samples, which is important in deep learning for achieving better seizure prediction performance. The resulting time-series segments were converted into image-like formats to enable their use as inputs to our CNN model. The CNN model we proposed has an efficient multi-scale CNN architecture that automatically learns distinctive iEEG features from different spatial scales simultaneously. Unlike previous seizure prediction algorithms that used hand-crafted features, the proposed algorithm can yield reliable performance across different patients, achieving a superior average prediction sensitivity of 87.85%.

Though training the proposed CNN model takes 6-8 hours, the trained model takes less than a second to test new unseen data. Compared to the Kaggle leaderboard winning solution that uses over 3,000 custom features, the proposed algorithm achieves comparable performance in a much more computationally-efficient fashion.

Chapter 8

Conclusion and Future Work

8.1 Conclusion

This thesis focuses on robust methods for early detection and prediction of epileptic seizures. It approaches four important problems: energy-efficient EEG telemonitoring systems, early detection of epileptic seizure onsets, robust recognition of seizure patterns under noisy and abnormal conditions, and reliable prediction of epileptic seizures. It explains how solving these problems can favorably impact the quality of life of epileptic patients. It shows the effectiveness and robustness of the proposed methods through extensive experiments on real EEG data taken from people with different types of epileptic seizures.
A summary of the thesis contributions is given below:

8.1.1 An Energy-Efficient EEG Tele-monitoring Scheme

The recent advances in wireless sensor networks (WSNs) have motivated healthcare researchers to develop miniaturized wireless devices that are capable of measuring body physiological signals and monitoring patients' behavior. Epilepsy management is one of the important applications that could benefit from deploying WSNs [149]. By using portable wireless EEG devices, real-time EEG recording and ambulatory seizure monitoring outside clinics would be possible. At present, there exist many ergonomic, lightweight, and comfortable designs of wireless EEG headbands that are easy to wear and do not need any preparation. A wireless EEG headband measures, processes, and transmits the EEG data to the server side, where the data is stored and analyzed. One major limitation in using such wireless EEG devices is the limited battery lifetime at the sensor side. This motivated researchers to develop energy-efficient EEG telemonitoring systems.

Different solutions to extend the battery lifetime of the EEG sensor units have been presented in the literature. Most of them apply data compression techniques (e.g., compressive sensing and discrete wavelet transform) to the raw data and then transmit the compressed data to the server side, where the data is recovered and analyzed. Two major drawbacks of such compression-based seizure detection paradigms are that they are computationally expensive and that the reconstructed data at the server side gets distorted.

A novel method to reduce the overall power consumption of the wireless EEG-based epileptic seizure detection systems was presented. At the sensor side, we managed to reduce the data dimensionality by intentionally deleting some data-points randomly. This method is computationally simple and can significantly reduce the power consumption needed for data transmission.
At the server side,Expectation Maximization, an efficient machine learning algorithm, was deployed to accurately substi-114tute the missing (randomly deleted) data-points. The proposed method can be applied to three differentEEG monitoring frameworks, which are based on (i) Transmission of the entire raw EEG data, (ii)Transmission of compressed EEG data, and (iii) Transmission of EEG features. It was found that thatthe proposed method can considerably reduce the power consumption of the wireless EEG devices whilemaintaining high seizure detection performance at the server side. A reduction of ⇠60% in power con-sumption was achieved while obtaining a seizure detection sensitivity over 95.98%. It was also foundthat extracting the EEG features at the sensor side and transmitting the features to the server side (whereEEG classification is carried out) is the best scenario for wireless EEG telemonitoring systems. It yieldsthe least power consumption while achieving satisfactory seizure detection performance.8.1.2 A Computationally-Efficient EEG Feature Learning for Fast Detection of SeizureOnsetsThe early detection of seizure onsets is of great benefit to epileptic patients suffering from recurrentseizure attacks. Family members or personal care providers would be notified so they can help epilepticpatients avoid potential risks or complications during and after seizure attacks. For instance, they keepthe patient away from any hard or sharp object to avert possible injuries, make sure that the patient isbreathing properly, record the duration of the seizure and ask for medical help if it lasts too long.In this work, the aim was to address the problem of how to identify the seizure onset at the verybeginning of seizure attacks. A novel computationally-fast EEG-based method for early detection ofepileptic seizures was proposed. 
This method incorporates a low-complexity feature learning algorithm that extracts the prominent EEG seizure-associated features in a computationally-efficient fashion. Unlike the baseline methods that compute engineered and hand-crafted features, the proposed algorithm uses the least absolute shrinkage and selection operator (LASSO) to automatically recognize the EEG spectral features that well represent epileptic seizure activities. These learned features were shown to be more informative and to achieve higher seizure onset detection accuracy than other hand-crafted features.

Meanwhile, a computationally-simple algorithm, known as coordinate descent [69], is adopted to estimate the LASSO regression coefficients, which are then used as indicators of which features are more relevant to seizures. A Random Forest classifier was used to test the potency of those selected features. The results on a benchmark dataset showed that our approach surpasses existing methods in terms of sensitivity, specificity, classification accuracy, and, most importantly, the detection latency.

8.1.3 An L1-Penalized Robust Regression for Reliable Detection of Epileptic Seizures

The majority of existing seizure detection methods use hand-crafted EEG features extracted from the time domain, frequency domain, time-frequency domain, and even from multiple domains. These domain-based schemes have two main drawbacks. First, the non-stationary nature of EEG data makes it difficult to have a single seizure pattern, making the hand-crafted features less practical in clinical settings. Second, the EEG data in practice is prone to different sources of artifacts (e.g., muscle activities and eye blinking) and also white noise, which alter the genuine EEG features and negatively affect the seizure detection performance.
The question that arises here is: Can we extract more robust EEG features that maintain reliable seizure detection performance in real-life situations?

In an effort to address the above question, a novel EEG-based method for robust detection of epileptic seizures was presented. The proposed method incorporates a feature learning approach based on L1-penalized robust regression (L1PRR). This approach is applied to the EEG spectra to extract the most distinguishable EEG features pertinent to epileptic seizures. The extracted features are then used as an input to the Random Forest classifier for EEG training and classification. Experimental results on a popular clinical dataset demonstrate the superiority of the proposed scheme over the baseline seizure detection methods. Further, the proposed L1PRR-based method is proven to be robust against the inevitable sources of artifacts and ambient noise, making it more practical and workable in real-life and clinical settings.

8.1.4 A Deep Learning Approach for Robust Detection of Epileptic Seizures

To further improve the robustness of seizure detection systems, a new recurrent neural network (RNN) architecture that reliably detects epileptic seizure patterns in noise-free and noise-corrupted EEG data was developed. The proposed RNN architecture uses long short-term memory (LSTM) to learn the low- and high-level representations of EEG patterns, where the main questions addressed are: (a) How to exploit the temporal dependency in time-series EEG data? (b) Does the proposed deep learning approach maintain robust seizure detection performance under real-life conditions? (c) Can we use deep learning for real-time seizure detection?

Through experiments on a real-world clinical dataset, it was shown that the proposed deep learning approach can accurately recognize different seizure patterns recorded from several patients and can effectively differentiate between the seizure and non-seizure brain activities.
More importantly, the proposed approach was found to maintain robust performance in detecting seizure semiologies hidden in noisy and corrupted EEG recordings, making it a strong candidate for diagnostic applications in clinical and ambulatory settings.

8.1.5 A Convolutional Neural Network Architecture for Accurate Prediction of Epileptic Seizures

Predicting epileptic seizures in advance can favorably impact the quality of life of people with epilepsy. The patients would be warned of upcoming seizures so they could avoid risky activities such as hiking, driving, or swimming. Anti-epileptic medications could also be taken to prevent impending seizures or at least reduce the duration and intensity of seizure attacks, which in turn minimizes the risk of injury. What is more, brain stimulation therapies or closed-loop seizure intervention systems could be used to abort seizures in people with refractory (medication-resistant) epilepsy.

In this work, a detailed quantitative analysis of the human intracranial EEG (iEEG) data was first conducted. This gives some insights into the dynamic correlation between the brain regions before and between seizure attacks. By performing this analysis on iEEG data taken from patients with drug-resistant epilepsy, it was shown that: 1) iEEG channels contain diverse information, and excluding any of them might hurt the spatial information required for accurate seizure prediction. 2) The traditional data reduction method of Principal Component Analysis fails to reduce the iEEG data dimensionality while preserving the data variance needed for efficient feature extraction. 3) The adjacent iEEG sensors are not necessarily more correlated than the distant ones. The correlations amongst iEEG sensors vary depending on which regions of the brain are affected and whether the seizure is focal (partial) or generalized.

Also, a rigorous seizure prediction algorithm that maintains reliable performance against inter- and intra-patient variations was developed.
A novel architecture of multi-scale Convolutional Neural Networks was proposed to learn different representations of iEEG data simultaneously. The learned feature maps are concatenated together and then fed into a fully connected neural network for identifying the EEG patterns that indicate upcoming seizures. Experiments on a clinical iEEG dataset recorded from patients with drug-resistant epilepsy showed the superiority and reliability of the proposed seizure prediction algorithm. It outperforms the previously published methods, achieving an average seizure prediction sensitivity of 87.85% and an area under the curve of 0.84.

8.2 Future Work

There is still a lot that can be done in the areas of epileptic seizure detection and prediction discussed in this thesis. Below is a list of reasonably "feasible" future work problems.

8.2.1 Automated Video Processing for Real-time Detection of Epileptic Seizures

Automatic seizure detection devices are important for documenting recurrent seizures, avoiding seizure-related injuries and social embarrassments, administering subject-specific anti-epileptic medications, and, most importantly, preventing sudden unexpected death in epilepsy (SUDEP). EEG is the most common diagnostic tool used to monitor epileptic seizures. Some patients, however, find it uncomfortable to wear the EEG headbands for 24 hours or more. Also, video EEG monitoring is usually used in hospital and clinical settings to record what a patient is experiencing while the EEG records the accompanying brainwaves. This helps determine whether the experienced seizure is actually epileptic and identify the seizure type given the recorded symptoms.

We propose to develop a video-based method for real-time analysis of body movements (especially convulsive motions) and automatic recognition of epileptic seizures. To handle this problem, we introduce an efficient computer vision approach for video-based seizure detection, which is composed of two stages: human body detection and seizure recognition.
First, we use a multi-scale body detector to effectively localize the patient’s body in videos. Next, an unsupervised subspace learning algorithm is adopted to recognize body movements associated with seizures. Initial experiments on a small-scale video dataset demonstrate that the proposed method can efficiently detect and associate body movements (stiffening and shaking) and hence recognize epileptic seizures.

8.2.2 Multi-modal and Non-invasive Framework for Accurate Prediction of Epileptic Seizures

Most of the existing seizure prediction systems rely on invasive EEG tests. Indeed, invasive EEG is more reliable than scalp EEG for brain activity recording, and it helps to achieve better seizure prediction performance. Ultimately, the complications that patients experience as a result of implanting foreign bodies inside the brain (depth or strip electrodes) have to be outweighed by the potential benefits of the invasive EEG workup. Several studies, however, reported infections, neurological deficits, and intracranial hematomas related to invasive EEG monitoring [194, 207]. Complications of any type were documented in 23% of patients [211, 214]. The controversial question here is: is it necessary to undergo brain surgery (to implant EEG electrodes inside the brain) in order to improve human capacity?

We plan to develop an automated seizure prediction method that is patient-friendly and more convenient than invasive EEG recording. The proposed method is based on combining multiple modalities such as scalp EEG, accelerometer sensors, electromyography (EMG), electrocardiography (ECG), electrodermal screening (EDS), and pulse oximetry. Such multi-modal systems have the potential to improve seizure prediction sensitivity and lower false positive warnings by taking advantage of the strengths of each modality.
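
One simple way to combine several modalities is late fusion, where each modality produces its own pre-seizure (preictal) probability and the outputs are averaged. The sketch below is only an illustration of that idea, not the method planned in this thesis: the function name, the modality keys, and the optional weights are all hypothetical, and a modality that is unavailable (reported as None or NaN) is simply left out of the average.

```python
import math

import numpy as np


def fuse_preictal_probabilities(probs, weights=None):
    """Weighted late fusion of per-modality preictal probabilities.

    probs   : dict mapping a (hypothetical) modality name to a probability
              in [0, 1], or to None / NaN when that modality is unavailable.
    weights : optional dict mapping modality name to a non-negative weight;
              equal weights are assumed when omitted.
    """
    # Keep only the modalities that actually produced a usable output.
    usable = [
        name for name, p in probs.items()
        if p is not None and not math.isnan(p)
    ]
    if not usable:
        raise ValueError("no modality produced a usable output")

    w = np.array([1.0 if weights is None else weights[name] for name in usable])
    p = np.array([probs[name] for name in usable])
    if w.sum() <= 0:
        raise ValueError("weights of the usable modalities must sum to a positive value")

    # Weighted average over the usable modalities only.
    return float(np.dot(w, p) / w.sum())


if __name__ == "__main__":
    # Illustrative values only; the EMG channel is treated as missing.
    outputs = {"scalp_eeg": 0.8, "ecg": 0.6, "emg": None}
    print(fuse_preictal_probabilities(outputs))
```

Averaging probabilities is only one of several fusion strategies; feature-level fusion or a learned combiner are alternatives, and the choice would have to be validated on recorded multi-modal data.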
The other merit of multi-modal seizure prediction methods is that when one modality fails to recognize the pre-seizure state, the others may still work.

8.2.3 Epileptic Seizure Prediction using Non-invasive Canine EEG Signals

There have been reports of dog breeds with an uncanny ability to predict epileptic seizures before they occur. A research study found that some dog breeds already have this innate skill and only a few breeds can be trained to develop it. Once these dogs recognize an upcoming seizure, they start barking (to warn patients, family members, and caregivers of the next seizure) and then move in a way that prevents injury to the patient.

More research is needed in this field of “seizure predicting dogs”. We believe that the surface EEG recordings of these dogs might include a distinguishable pattern during the preictal (pre-seizure) brain state. Identifying this pattern far enough before the seizure onset may help predict upcoming seizure attacks with high accuracy.

Bibliography

[1] A Aarabi, F Wallois, and R Grebe. Automated neonatal seizure detection: a multistage classification system through feature selection based on relevance and redundancy analysis. Clinical Neurophysiology, 117(2):328–340, 2006.
[2] Ardalan Aarabi, Reza Fazel-Rezai, and Yahya Aghakhani. A fuzzy rule-based system for epileptic seizure detection in intracranial EEG. Clinical Neurophysiology, 120(9):1648–1657, 2009.
[3] Rezvan Abbasi and Mansour Esmaeilpour. Selecting statistical characteristics of brain signals to detect epileptic seizures using discrete wavelet transform and perceptron neural network. IJIMAI, 4(5):33–38, 2017.
[4] Berdakh Abibullaev, Hee Don Seo, and Min Soo Kim. Epileptic spike detection using continuous wavelet transforms and artificial neural networks. International Journal of Wavelets, Multiresolution and Information Processing, 8(01):33–48, 2010.
[5] Khalid Abualsaud, Massudi Mahmuddin, Mohammad Saleh, and Amr Mohamed.
Ensemble classifier for epileptic seizure detection for imperfect EEG data. The Scientific World Journal, 2015:1–15, 2015.
[6] U Rajendra Acharya, Filippo Molinari, S Vinitha Sree, Subhagata Chattopadhyay, Kwan-Hoong Ng, and Jasjit S Suri. Automated diagnosis of epileptic EEG using entropies. Biomedical Signal Processing and Control, 7(4):401–408, 2012.
[7] U Rajendra Acharya, Shu Lih Oh, Yuki Hagiwara, Jen Hong Tan, and Hojjat Adeli. Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Computers in Biology and Medicine, 100:270–278, 2018.
[8] U Rajendra Acharya, S Vinitha Sree, Ang Peng Chuan Alvin, and Jasjit S Suri. Use of principal component analysis for automatic classification of epileptic EEG activities in wavelet framework. Expert Systems with Applications, 39(10):9072–9078, 2012.
[9] U Rajendra Acharya, S Vinitha Sree, Subhagata Chattopadhyay, Wenwei Yu, and Peng Chuan Alvin Ang. Application of recurrence quantification analysis for the automated identification of epileptic EEG signals. International Journal of Neural Systems, 21(03):199–211, 2011.
[10] U Rajendra Acharya, S Vinitha Sree, G Swapna, Roshan Joy Martis, and Jasjit S Suri. Automated EEG analysis of epilepsy: a review. Knowledge-Based Systems, 45:147–165, 2013.
[11] U Rajendra Acharya, Ratna Yanti, Jia Wei Zheng, M Muthu Rama Krishnan, Jen Hong Tan, Roshan Joy Martis, and Choo Min Lim. Automated diagnosis of epilepsy using CWT, HOS and texture parameters. International Journal of Neural Systems, 23(03):1350009, 2013.
[12] Fawaz Al-Mufti and Jan Claassen. Neurocritical care: status epilepticus review. Critical Care Clinics, 30(4):751–764, 2014.
[13] Gonzalo Alarcon, CD Binnie, RDC Elwes, and CE Polkey. Power spectrum and intracranial EEG patterns at seizure onset in partial epilepsy.
Electroencephalography and Clinical Neurophysiology, 94(5):326–337, 1995.
[14] Turky N Alotaiby, Saleh A Alshebeili, Fathi E Abd El-Samie, Abdulmajeed Alabdulrazak, and Eman Alkhnaian. Channel selection and seizure detection using a statistical approach. In 2016 5th International Conference on Electronic Devices, Systems and Applications (ICEDSA), pages 1–4. IEEE, 2016.
[15] Bruce M Altevogt, Harvey R Colten, et al. Sleep disorders and sleep deprivation: an unmet public health problem. National Academies Press, 2006.
[16] Narendra Kumar Ambulkar and SN Sharma. Detection of epileptic seizure in EEG signals using window width optimized S-transform and artificial neural networks. In 2015 IEEE Bombay Section Symposium (IBSS), pages 1–6. IEEE, 2015.
[17] Saadullah Amin and Awais Mehmood Kamboh. A robust approach towards epileptic seizure detection. In Machine Learning for Signal Processing (MLSP), 2016 IEEE 26th International Workshop on, pages 1–6. IEEE, 2016.
[18] Robert Andersen. Modern methods for robust regression. Number 152. Sage, 2008.
[19] Ralph G Andrzejak, Klaus Lehnertz, Florian Mormann, Christoph Rieke, Peter David, and Christian E Elger. Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Physical Review E, 64(6):061907, 2001.
[20] DK Angeles. Proposal for revised clinical and electroencephalographic classification of epileptic seizures. Epilepsia, 22(4):489–501, 1981.
[21] Elie Bou Assi, Dang K Nguyen, Sandy Rihana, and Mohamad Sawan. Towards accurate prediction of epileptic seizures: A review. Biomedical Signal Processing and Control, 34:144–157, 2017.
[22] Elie Bou Assi, Dang K Nguyen, Sandy Rihana, and Mohamad Sawan. A functional-genetic scheme for seizure forecasting in canine epilepsy. IEEE Transactions on Biomedical Engineering, 65(6):1339–1348, 2018.
[23] Roland N Auer. Hypoglycemic brain damage.
Forensic Science International, 146(2-3):105–110, 2004.
[24] Lacoma Ayoubian, H Lacoma, and J Gotman. Automatic seizure detection in SEEG using high frequency activities in wavelet domain. Medical Engineering & Physics, 35(3):319–328, 2013.
[25] Hamed Azami, Karim Mohammadi, and Hamid Hassanpour. An improved signal segmentation method using genetic algorithm. International Journal of Computer Applications, 29(8):5–9, 2011.
[26] Varun Bajaj and Ram Bilas Pachori. Epileptic seizure detection based on the instantaneous area of analytic intrinsic mode functions of EEG signals. Biomedical Engineering Letters, 3(1):17–21, 2013.
[27] Forrest Sheng Bao, Donald Yu-Chun Lie, and Yuanlin Zhang. A new approach to automated epileptic diagnosis using EEG and probabilistic neural network. In 2008 20th IEEE International Conference on Tools with Artificial Intelligence, volume 2, pages 482–486. IEEE, 2008.
[28] Alexandre Barachant, Andriy Temko, Feng Li, and Gilberto Titericz Junior. Code and documentation for the winning solution to the Melbourne University AES/MathWorks/NIH seizure prediction challenge on Kaggle. December 2016.
[29] Durga Siva Teja Behara, Anirudh Kumar, Piyush Swami, Bijaya K Panigrahi, and Tapan K Gandhi. Detection of epileptic seizure patterns in EEG through fragmented feature extraction. In 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), pages 2539–2542. IEEE, 2016.
[30] Sabrina Belhadj, Abdelouahab Attia, Ahmed Bachir Adnane, Zoubir Ahmed-Foitih, and Abdelmalik Ahmed Taleb. Whole brain epileptic seizure detection using unsupervised classification. In 2016 8th International Conference on Modelling, Identification and Control (ICMIC), pages 977–982. IEEE, 2016.
[31] Yoshua Bengio, Patrice Simard, Paolo Frasconi, et al. Learning long-term dependencies with gradient descent is difficult.
IEEE Transactions on Neural Networks, 5(2):157–166, 1994.
[32] Gregory K Bergey, Martha J Morrell, Eli M Mizrahi, Alica Goldman, David King-Stephens, Dileep Nair, Shraddha Srinivasan, Barbara Jobst, Robert E Gross, Donald C Shields, et al. Long-term treatment with responsive brain stimulation in adults with refractory partial seizures. Neurology, 84(8):810–817, 2015.
[33] Abhijit Bhattacharyya, Ram Bilas Pachori, Abhay Upadhyay, and U Rajendra Acharya. Tunable-Q wavelet transform based multiscale entropy measure for automated classification of epileptic EEG signals. Applied Sciences, 7(4):385, 2017.
[34] Ashwini D Bhople and PA Tijare. Fast Fourier transform based classification of epileptic seizure using artificial neural network. International Journal of Advanced Research in Computer Science and Software Engineering, 2(4), 2012.
[35] Javad Birjandtalab, Maziyar Baran Pouyan, Diana Cogan, Mehrdad Nourani, and Jay Harvey. Automated seizure detection using limited-channel EEG and non-linear dimension reduction. Computers in Biology and Medicine, 82:49–58, 2017.
[36] Sean Borman. The expectation maximization algorithm-a short tutorial. Submitted for Publication, 41, 2004.
[37] Golshan Taheri Borujeny, Mehran Yazdi, Alireza Keshavarz-Haddad, and Arash Rafie Borujeny. Detection of epileptic seizure using wireless sensor networks. Journal of Medical Signals and Sensors, 3(2):63, 2013.
[38] Léon Bottou. Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT’2010, pages 177–186. Springer, 2010.
[39] Leo Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
[40] Samantha J Broyd, Charmaine Demanuele, Stefan Debener, Suzannah K Helps, Christopher J James, and Edmund JS Sonuga-Barke. Default-mode brain dysfunction in mental disorders: a systematic review. Neuroscience & Biobehavioral Reviews, 33(3):279–296, 2009.
[41] Sylvia Bugeja, Lalit Garg, and Eliazar E Audu.
A novel method of EEG data acquisition, feature extraction and feature space creation for early detection of epileptic seizures. In Engineering in Medicine and Biology Society (EMBC), 2016 IEEE 38th Annual International Conference of the, pages 837–840. IEEE, 2016.
[42] Alexander J Casson and Esther Rodriguez-Villegas. Toward online data reduction for portable electroencephalography systems in epilepsy. IEEE Transactions on Biomedical Engineering, 56(12):2816–2825, 2009.
[43] Alexander J Casson, David C Yates, Shelagh JM Smith, John S Duncan, and Esther Rodriguez-Villegas. Wearable electroencephalography. IEEE Engineering in Medicine and Biology Magazine, 29(3):44–56, 2010.
[44] Suryannarayana Chandaka, Amitava Chatterjee, and Sugata Munshi. Cross-correlation aided support vector machine classifier for classification of EEG signals. Expert Systems with Applications, 36(2):1329–1336, 2009.
[45] Rahul Kumar Chaurasiya, Khushbu Jain, Shalini Goutam, et al. Notice of retraction epileptic seizure detection using HHT and SVM. In 2015 International Conference on Electrical, Electronics, Signals, Communication and Optimization (EESCO), pages 1–6. IEEE, 2015.
[46] Nitesh V Chawla, Kevin W Bowyer, Lawrence O Hall, and W Philip Kegelmeyer. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16:321–357, 2002.
[47] Guangyi Chen, Wenfang Xie, Tien D Bui, and Adam Krzyżak. Automatic epileptic seizure detection in EEG using nonsubsampled wavelet–Fourier features. Journal of Medical and Biological Engineering, 37(1):123–131, 2017.
[48] Joyce Chiang and Rabab Ward. Energy-efficient data reduction techniques for wireless seizure detection systems. Sensors, 14(2):2036–2051, 2014.
[49] J Cloyd, W Hauser, A Towne, R Ramsay, R Mattson, F Gilliam, and T Walczak. Epidemiological and medical aspects of epilepsy in the elderly.
Epilepsy Research, 68:39–48, 2006.
[50] Mark J Cook, Terence J O’Brien, Samuel F Berkovic, Michael Murphy, Andrew Morokoff, Gavin Fabinyi, Wendyl D’Souza, Raju Yerra, John Archer, Lucas Litewka, et al. Prediction of seizure likelihood with a long-term, implanted seizure advisory system in patients with drug-resistant epilepsy: a first-in-man study. The Lancet Neurology, 12(6):563–571, 2013.
[51] A Garcés Correa, E Laciar, HD Patiño, and ME Valentinuzzi. Artifact removal from EEG signals using adaptive filters in cascade. In Journal of Physics: Conference Series, volume 90, page 012081. IOP Publishing, 2007.
[52] Maryann D’Alessandro, Rosana Esteller, George Vachtsevanos, Arthur Hinson, Javier Echauz, and Brian Litt. Epileptic seizure prediction using hybrid feature selection over multiple intracranial EEG electrode contacts: a report of four patients. IEEE Transactions on Biomedical Engineering, 50(5):603–615, 2003.
[53] Anthony Dalton, Shyamal Patel, Atanu Roy Chowdhury, Matt Welsh, Trudy Pang, Steven Schachter, Gearóid ÓLaighin, and Paolo Bonato. Development of a body sensor network to detect motor patterns of epileptic seizures. IEEE Transactions on Biomedical Engineering, 59(11):3204–3211, 2012.
[54] Hoda Daou and Fabrice Labeau. Dynamic dictionary for combined EEG compression and seizure detection. IEEE Journal of Biomedical and Health Informatics, 18(1):247–256, 2014.
[55] Arnaud Delorme, Terrence Sejnowski, and Scott Makeig. Enhanced detection of artifacts in EEG data using higher-order statistics and independent component analysis. Neuroimage, 34(4):1443–1449, 2007.
[56] David L Donoho et al. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289–1306, 2006.
[57] Cristian Donos, Matthias Dümpelmann, and Andreas Schulze-Bonhage. Early seizure detection algorithm based on intracranial EEG and random forest classification. International Journal of Neural Systems, 25(05):1550023, 2015.
[58] John S Ebersole, Aatif M Husain, and Douglas R Nordli.
Current practice of clinical electroencephalography. Lippincott Williams & Wilkins, 2014.
[59] Amir Eftekhar, Farah Vohra, Chris Toumazou, Emmanuel M Drakakis, and Kim Parker. Hilbert-Huang transform: preliminary studies in epilepsy and cardiac arrhythmias. In 2008 IEEE Biomedical Circuits and Systems Conference, pages 373–376. IEEE, 2008.
[60] Jerome Engel. Seizures and epilepsy, volume 83. Oxford University Press, 2013.
[61] Jerome Engel Jr. Report of the ILAE classification core group. Epilepsia, 47(9):1558–1568, 2006.
[62] Dario J Englot, Harjus Birk, and Edward F Chang. Seizure outcomes in nonresective epilepsy surgery: an update. Neurosurgical Review, 40(2):181–194, 2017.
[63] Dumitru Erhan, Christian Szegedy, Alexander Toshev, and Dragomir Anguelov. Scalable object detection using deep neural networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2014.
[64] Stephen Faul and William Marnane. Dynamic, location-based channel selection for power consumption reduction in EEG analysis. Computer Methods and Programs in Biomedicine, 108(3):1206–1215, 2012.
[65] Fred F Ferri. Ferri’s Clinical Advisor 2019: 5 Books in 1. Elsevier Health Sciences, 2018.
[66] Robert S Fisher, Carlos Acevedo, Alexis Arzimanoglou, Alicia Bogacz, J Helen Cross, Christian E Elger, Jerome Engel Jr, Lars Forsgren, Jacqueline A French, Mike Glynn, et al. ILAE official report: a practical clinical definition of epilepsy. Epilepsia, 55(4):475–482, 2014.
[67] Robert S Fisher, Walter Van Emde Boas, Warren Blume, Christian Elger, Pierre Genton, Phillip Lee, and Jerome Engel Jr. Epileptic seizures and epilepsy: definitions proposed by the International League Against Epilepsy (ILAE) and the International Bureau for Epilepsy (IBE). Epilepsia, 46(4):470–472, 2005.
[68] Jacqueline A French. Refractory epilepsy: clinical overview. Epilepsia, 48:3–7, 2007.
[69] Jerome Friedman, Trevor Hastie, and Rob Tibshirani. Regularization paths for generalized linear models via coordinate descent.
Journal of Statistical Software, 33(1):1, 2010.
[70] Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning, volume 1. Springer series in statistics New York, 2001.
[71] Kai Fu, Jianfeng Qu, Yi Chai, and Tao Zou. Hilbert marginal spectrum analysis for automatic seizure detection in EEG signals. Biomedical Signal Processing and Control, 18:179–185, 2015.
[72] Kais Gadhoumi, Jean-Marc Lina, Florian Mormann, and Jean Gotman. Seizure prediction for therapeutic devices: A review. Journal of Neuroscience Methods, 260:270–282, 2016.
[73] Dragoljub Gajic, Zeljko Djurovic, Jovan Gligorijevic, Stefano Di Gennaro, and Ivana Savic-Gajic. Detection of epileptiform activity in EEG signals based on time-frequency and non-linear analysis. Frontiers in Computational Neuroscience, 9:38, 2015.
[74] Samanwoy Ghosh-Dastidar, Hojjat Adeli, and Nahid Dadmehr. Mixed-band wavelet-chaos-neural network methodology for epilepsy and epileptic seizure detection. IEEE Transactions on Biomedical Engineering, 54(9):1545–1551, 2007.
[75] Samanwoy Ghosh-Dastidar, Hojjat Adeli, and Nahid Dadmehr. Principal component analysis-enhanced cosine radial basis function neural network for robust epilepsy and seizure detection. IEEE Transactions on Biomedical Engineering, 55(2):512–518, 2008.
[76] Jean Gotman. Automatic recognition of epileptic seizures in the EEG. Electroencephalography and Clinical Neurophysiology, 54(5):530–540, 1982.
[77] Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. Speech recognition with deep recurrent neural networks. In Proceedings of the ICASSP’2013, pages 6645–6649. IEEE, 2013.
[78] Michelle Green, Michael Wong, David Atkins, Joanna Taylor, and Marcia Feinleib. Diagnosis of attention-deficit/hyperactivity disorder. 1999.
[79] Klaus Greff, Rupesh K Srivastava, Jan Koutník, Bas R Steunebrink, and Jürgen Schmidhuber. LSTM: A search space odyssey.
IEEE Transactions on Neural Networks and Learning Systems, 28(10):2222–2232, 2017.
[80] Cristian Guarnizo and Edilson Delgado. EEG single-channel seizure recognition using empirical mode decomposition and normalized mutual information. In IEEE 10th International Conference on Signal Processing Proceedings, pages 1–4. IEEE, 2010.
[81] Inan Güler and Elif Derya Übeyli. Adaptive neuro-fuzzy inference system for classification of EEG signals using wavelet coefficients. Journal of Neuroscience Methods, 148(2):113–121, 2005.
[82] Inan Guler and Elif Derya Ubeyli. Multiclass support vector machines for EEG signals classification. IEEE Transactions on Information Technology in Biomedicine, 11(2):117–126, 2007.
[83] Nihal Fatma Güler, Elif Derya Übeyli, and Inan Güler. Recurrent neural networks employing Lyapunov exponents for EEG signals classification. Expert Systems with Applications, 29(3):506–514, 2005.
[84] Ling Guo, Daniel Rivero, and Alejandro Pazos. Epileptic seizure detection using multiwavelet transform based approximate entropy and artificial neural networks. Journal of Neuroscience Methods, 193(1):156–163, 2010.
[85] Maya R Gupta, Yihua Chen, et al. Theory and use of the EM algorithm. Foundations and Trends in Signal Processing, 4(3):223–296, 2011.
[86] Mark Andrew Hall. Correlation-based feature selection for machine learning. 1999.
[87] H. Hallez, A. Vergult, R. Phlypo, P. Van Hese, W. De Clercq, Y. D’Asseler, R. Van de Walle, B. Vanrumste, W. Van Paesschen, S. Van Huffel, and I. Lemahieu. Muscle and eye movement artifact removal prior to EEG source localization. In Proceedings of the IEEE Eng Med Biol Soc, pages 1002–1005, Aug 2006.
[88] H Hassanpour and M Shahiri. Adaptive segmentation using wavelet transform. In Proceedings of ICEE’2007, pages 1–5. IEEE, 2007.
[89] Uwe Herwig, Peyman Satrapi, and Carlos Schönfeldt-Lecuona. Using the international 10-20 EEG system for positioning of transcranial magnetic stimulation.
Brain Topography, 16(2):95–99, 2003.
[90] Dale C Hesdorffer, W Allen Hauser, John F Annegers, and Gregory Cascino. Major depression is a risk factor for seizures in older adults. Annals of Neurology: Official Journal of the American Neurological Association and the Child Neurology Society, 47(2):246–249, 2000.
[91] Peter SC Heuberger, Paul MJ van den Hof, and Bo Wahlberg. Modelling and identification with rational orthogonal basis functions. Springer Science & Business Media, 2005.
[92] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[93] Jeremy Holleman, Brian Otis, Seth Bridges, Ania Mitros, and Chris Diorio. A 2.92 µW hardware random number generator. In 2006 Proceedings of the 32nd European Solid-State Circuits Conference, pages 134–137. IEEE, 2006.
[94] Richard W Homan, John Herman, and Phillip Purdy. Cerebral location of international 10–20 system electrode placement. Electroencephalography and Clinical Neurophysiology, 66(4):376–382, 1987.
[95] Mohammad-Parsa Hosseini, Abolfazl Hajisami, and Dario Pompili. Real-time epileptic seizure detection from EEG signals via random subspace ensemble learning. In 2016 IEEE International Conference on Autonomic Computing (ICAC), pages 209–218. IEEE, 2016.
[96] J Jeffry Howbert, Edward E Patterson, S Matt Stead, Ben Brinkmann, Vincent Vasoli, Daniel Crepeau, Charles H Vite, Beverly Sturges, Vanessa Ruedebusch, Jaideep Mavoori, et al. Forecasting seizures in dogs with naturally occurring epilepsy. PLoS ONE, 9(1):e81920, 2014.
[97] Norden E Huang, Zheng Shen, Steven R Long, Manli C Wu, Hsing H Shih, Quanan Zheng, Nai-Chyuan Yen, Chi Chao Tung, and Henry H Liu. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, 454(1971):903–995, 1998.
[98] Peter J Huber. Robust estimation of a location parameter.
The Annals of Mathematical Statistics, pages 73–101, 1964.
[99] Ramy Hussein. Epileptic seizure detection and prediction. GitHub repository, https://github.com/ramyh, 2016.
[100] Ramy Hussein, Mohamed Elgendi, Z. Jane Wang, and Rabab K Ward. Robust detection of epileptic seizures based on L1-penalized robust regression of EEG signals. Expert Systems with Applications, 104:153–167, 2018.
[101] Ramy Hussein, Mohamed Elgendi, Rabab Ward, and Amr Mohamed. High performance EEG feature extraction for fast epileptic seizure detection. In 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pages 953–957. IEEE, 2017.
[102] Ramy Hussein, Amr Mohamed, and Masoud Alghoniemy. Energy-efficient on-board processing technique for wireless epileptic seizure detection systems. In 2015 International Conference on Computing, Networking and Communications (ICNC), pages 1116–1121. IEEE, 2015.
[103] Ramy Hussein, Amr Mohamed, Masoud Alghoniemy, and Alaa Awad. Design and analysis of an adaptive compressive sensing architecture for epileptic seizure detection. In 2013 4th Annual International Conference on Energy Aware Computing Systems and Applications (ICEAC), pages 141–146. IEEE, 2013.
[104] Ramy Hussein, Mohamed Osama Ahmed, Z. Jane Wang, and Rabab Ward. Multi-scale deep convolutional neural network for epileptic seizure prediction. In 2019 International Work-Conference on Artificial Neural Networks (IWANN). IEEE, 2019.
[105] Ramy Hussein, Mohamed Osama Ahmed, Z. Jane Wang, Rabab Ward, Mark Schmidt, and Levin Kuhlmann. Human intracranial EEG quantitative analysis and automatic feature learning for epileptic seizure prediction. IEEE Transactions on Biomedical Engineering, 2019.
[106] Ramy Hussein, Hamid Palangi, Z. Jane Wang, and Rabab Ward. Robust detection of epileptic seizures using deep neural networks. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2546–2550. IEEE, 2018.
[107] Ramy Hussein, Hamid Palangi, Rabab K Ward, and Z.
Jane Wang. Optimized deep neural network architecture for robust detection of epileptic seizures using EEG signals. Clinical Neurophysiology, 130(1):25–37, 2019.
[108] Ramy Hussein, Z. Jane Wang, and Rabab Ward. L1-regularization based EEG feature learning for detecting epileptic seizure. In 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pages 1171–1175. IEEE, 2016.
[109] Ramy Hussein, Rabab Ward, Z. Jane Wang, and Amr Mohamed. Energy efficient EEG monitoring system for wireless epileptic seizure detection. In 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), pages 294–299. IEEE, 2016.
[110] Abeg Kumar Jaiswal and Haider Banka. Local pattern transformation based feature extraction techniques for classification of epileptic EEG signals. Biomedical Signal Processing and Control, 34:81–92, 2017.
[111] A Janet, A Bearman, and C Dong. Predicting seizure onset in epileptic patients using intracranial EEG recordings. Technical Report, 2012.
[112] Eric Jang, Shixiang Gu, and Ben Poole. Categorical reparameterization with Gumbel-softmax. arXiv preprint arXiv:1611.01144, 2016.
[113] Kenneth G Jordan. Emergency EEG and continuous EEG monitoring in acute ischemic stroke. Journal of Clinical Neurophysiology, 21(5):341–352, 2004.
[114] M Kaleem, A Guergachi, and S Krishnan. EEG seizure detection and epilepsy diagnosis using a novel variation of empirical mode decomposition. In Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE, pages 4314–4317. IEEE, 2013.
[115] Alexander Ya Kaplan, Andrew A Fingelkurts, Alexander A Fingelkurts, Sergei V Borisov, and Boris S Darkhovsky. Nonstationary nature of the brain activity as revealed by EEG/MEG: methodological, practical and conceptual challenges.
Signal Processing, 85(11):2190–2212, 2005.
[116] Philippa J Karoly, Dean R Freestone, Ray Boston, David B Grayden, David Himes, Kent Leyde, Udaya Seneviratne, Samuel Berkovic, Terence O’Brien, and Mark J Cook. Interictal spikes and epileptic seizures: their relationship and underlying rhythmicity. Brain, 139(4):1066–1078, 2016.
[117] Philippa J Karoly, Hoameng Ung, David B Grayden, Levin Kuhlmann, Kent Leyde, Mark J Cook, and Dean R Freestone. The circadian profile of epilepsy improves seizure forecasting. Brain, 140(8):2169–2182, 2017.
[118] Michael Kearns and Dana Ron. Algorithmic stability and sanity-check bounds for leave-one-out cross-validation. Neural Computation, 11(6):1427–1453, 1999.
[119] Heba Khamis, Armin Mohamed, and Steve Simpson. Frequency-moment signatures: a method for automated seizure detection from scalp EEG. Clinical Neurophysiology: Official Journal of the International Federation of Clinical Neurophysiology, 124(12):2317–27, Dec 2013.
[120] Yusuf Uzzaman Khan, Nidal Rafiuddin, and Omar Farooq. Automated seizure detection in scalp EEG using multiple wavelet scales. In Signal Processing, Computing and Control (ISPCC), 2012 IEEE International Conference on, pages 1–5. IEEE, 2012.
[121] Isabell Kiral-Kornek, Subhrajit Roy, Ewan Nurse, Benjamin Mashford, Philippa Karoly, Thomas Carroll, Daniel Payne, Susmita Saha, Steven Baldassano, Terence O’Brien, et al. Epileptic seizure prediction using big data and deep learning: toward a mobile system. EBioMedicine, 27:103–111, 2018.
[122] Maheshkumar H Kolekar and Deba Prasad Dash. A nonlinear feature based epileptic seizure detection using least square support vector machine classifier. In TENCON 2015-2015 IEEE Region 10 Conference, pages 1–6. IEEE, 2015.
[123] Eldho S Kollialil, Gopika Gopan, A Harsha, and Liza Annie Joseph. Single feature-based non-convulsive epileptic seizure detection using multi-class SVM.
In 2013 International Conference on Emerging Trends in Communication, Control, Signal Processing and Computing Applications (C2SPCA), pages 1–6. IEEE, 2013.
[124] Iryna Korshunova, Pieter-Jan Kindermans, Jonas Degrave, Thibault Verhoeven, Benjamin H Brinkmann, and Joni Dambre. Towards improved design and evaluation of epileptic seizure predictors. IEEE Transactions on Biomedical Engineering, 65(3):502–510, 2018.
[125] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In Proceedings of the NIPS’2012, pages 1097–1105, 2012.
[126] Levin Kuhlmann, Dean Freestone, Alan Lai, Anthony N Burkitt, Karen Fuller, David B Grayden, Linda Seiderer, Simon Vogrin, Iven MY Mareels, and Mark J Cook. Patient-specific bivariate-synchrony-based seizure prediction for short prediction horizons. Epilepsy Research, 91(2-3):214–231, 2010.
[127] Levin Kuhlmann, Philippa Karoly, Dean R Freestone, Benjamin H Brinkmann, Andriy Temko, Alexandre Barachant, Feng Li, Gilberto Titericz Jr, Brian W Lang, Daniel Lavery, et al. Epilepsyecosystem.org: crowd-sourcing reproducible seizure prediction with long-term human intracranial EEG. Brain, 141(9):2619–2630, 2018.
[128] Abhishek Kumar and Maheshkumar H Kolekar. Machine learning approach for epileptic seizure detection using wavelet analysis of EEG signals. In Medical Imaging, m-Health and Emerging Communication Systems (MedCom), 2014 International Conference on, pages 412–416. IEEE, 2014.
[129] S Pravin Kumar, N Sriraam, PG Benakop, and BC Jinaga. Entropies based detection of epileptic seizures with artificial neural network classifiers. Expert Systems with Applications, 37(4):3284–3291, 2010.
[130] J Ph Lachaux, D Rudrauf, and P Kahane. Intracranial EEG and human brain mapping. Journal of Physiology-Paris, 97(4-6):613–628, 2003.
[131] Yan Li et al. A novel statistical algorithm for multiclass EEG signal classification.
Engineering Applications of Artificial Intelligence, 34:154–167, 2014.
[132] Qin Lin, Shu-qun Ye, Xiu-mei Huang, Si-you Li, Mei-zhen Zhang, Yun Xue, and Wen-Sheng Chen. Classification of epileptic EEG signals with stacked sparse autoencoder based on deep learning. In Proceedings of ICIC’2016, pages 802–810. Springer, 2016.
[133] Roderick JA Little and Donald B Rubin. Statistical analysis with missing data, volume 333. John Wiley & Sons, 2014.
[134] Yinxia Liu, Weidong Zhou, Qi Yuan, and Shuangshuang Chen. Automatic seizure detection using wavelet transform and SVM in long-term intracranial EEG. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 20(6):749–755, 2012.
[135] Nandor Ludvig, Geza Medveczky, Ruben Kuzniecky, Gabor Illes, and Orrin Devinsky. System and device for seizure detection, February 8 2011. US Patent 7,885,706.
[136] Laura M Lynam, Mark K Lyons, Joseph F Drazkowski, Joseph I Sirven, Katherine H Noe, Richard S Zimmerman, and James A Wilkens. Frequency of seizures in patients with newly diagnosed brain tumors: a retrospective review. Clinical Neurology and Neurosurgery, 109(7):634–638, 2007.
[137] Ralph Meier, Heike Dittrich, Andreas Schulze-Bonhage, and Ad Aertsen. Detecting epileptic seizures in long-term human EEG: a new approach to automatic online and real-time detection and classification of polymorphic seizure patterns. Clinical Neurophysiology, 25(3):119–131, 2008.
[138] Georgiy R Minasyan, John B Chatten, Martha Jane Chatten, and Richard N Harner. Patient-specific early seizure detection from scalp EEG. Clinical Neurophysiology, 27(3):163, 2010.
[139] Karl E Misulis and E Lee Murray. Essentials of Hospital Neurology. Oxford University Press, 2017.
[140] Joyeeta Mitra, John R Glover, Periklis Y Ktonas, Arun Thitai Kumar, Amit Mukherjee, Nicolaos B Karayiannis, James D Frost Jr, Richard A Hrachovy, and Eli M Mizrahi. A multi-stage system for the automated detection of epileptic seizures in neonatal EEG.
Journal of Clinical Neurophysiology: Official Publication of the American Electroencephalographic Society, 26(4):218, 2009.
[141] Todd K Moon. The expectation-maximization algorithm. IEEE Signal Processing Magazine, 13(6):47–60, 1996.
[142] Florian Mormann, Ralph G Andrzejak, Christian E Elger, and Klaus Lehnertz. Seizure prediction: the long and winding road. Brain, 130(2):314–333, 2006.
[143] SR Mousavi, M Niknazar, and B Vosoughi Vahdat. Epileptic seizure detection using ar model on EEG signals. In 2008 Cairo International Biomedical Engineering Conference, pages 1–4. IEEE, 2008.
[144] Boris Murmann. A/d converter trends: Power dissipation, scaling and digitally assisted architectures. In 2008 IEEE Custom Integrated Circuits Conference, pages 105–112. IEEE, 2008.
[145] Md Mursalin, Yuan Zhang, Yuehui Chen, and Nitesh V Chawla. Automated epileptic seizure detection using improved correlation-based feature selection with random forest classifier. Neurocomputing, 241:204–214, 2017.
[146] Md Mursalin, Yuan Zhang, Yuehui Chen, and Nitesh V Chawla. Automated epileptic seizure detection using improved correlation-based feature selection with random forest classifier. Neurocomputing, 241:204–214, 2017.
[147] AS Muthanantha Murugavel, S Ramakrishnan, K Balasamy, and T Gopalakrishnan. Lyapunov features based EEG signal classification by multi-class svm. In Information and Communication Technologies (WICT), 2011 World Congress on, pages 197–201. IEEE, 2011.
[148] Suresh Muthukumaraswamy. High-frequency brain activity and muscle artifacts in MEG/EEG: a review and recommendations. Frontiers in Human Neuroscience, 7:138, 2013.
[149] Mark H Myers, Madeline Threatt, Karsten M Solies, Brent M McFerrin, Lindsey B Hopf, J Douglas Birdwell, and Karl A Sillay. Ambulatory seizure monitoring: From concept to prototype device. Annals of Neurosciences, 23(2):100–111, 2016.
[150] Ahmad R Naghsh-Nilchi and Mostafa Aghashahi.
Epilepsy seizure detection using eigen-system spectral estimation and multiple layer perceptron neural network. Biomedical Signal Processing and Control, 5(2):147–157, 2010.
[151] B Suguna Nanthini and B Santhi. Seizure detection using svm classifier on EEG signal. Journal of Applied Sciences, 14(14):1658–1661, 2014.
[152] Yu Nesterov. Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM Journal on Optimization, 22(2):341–362, 2012.
[153] Arsalan Mohsen Nia, Mehran Mozaffari-Kermani, Susmita Sur-Kolay, Anand Raghunathan, and Niraj K Jha. Energy-efficient long-term continuous personal health monitoring. IEEE Transactions on Multi-Scale Computing Systems, 1(2):85–98, 2015.
[154] Nicoletta Nicolaou and Julius Georgiou. Detection of epileptic electroencephalogram based on permutation entropy and support vector machines. Expert Systems with Applications, 39(1):202–209, 2012.
[155] Ernst Niedermeyer and FH Lopes da Silva. Electroencephalography: basic principles, clinical applications, and related fields. Lippincott Williams & Wilkins, 2005.
[156] Mohammad Niknazar, SR Mousavi, B Vosoughi Vahdat, and M Sayyah. A new framework based on recurrence quantification analysis for epileptic seizure detection. IEEE Journal of Biomedical and Health Informatics, 17(3):572–578, 2013.
[157] Paul L. Nunez and Ramesh Srinivasan. Scale and frequency chauvinism in brain dynamics: too much emphasis on gamma band oscillations. Brain Structure and Function, 215(2):67–71, Dec 2010.
[158] Lorena Orosco, Eric Laciar, Agustina Garces Garces Correa, Abel Torres, and Juan P Graffigna. An epileptic seizures detection algorithm based on the empirical mode decomposition of EEG. In 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pages 2651–2654. IEEE, 2009.
[159] Pavel Ortinski and Kimford J Meador. Cognitive side effects of antiepileptic drugs.
Epilepsy & Behavior, 5:60–65, 2004.
[160] Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, and Rabab Ward. Deep sentence embedding using long short-term memory networks: Analysis and application to information retrieval. IEEE/ACM Transactions on Audio Speech and Language Processing, 24(4):694–707, 2016.
[161] Hamid Palangi, Rabab K Ward, and Li Deng. Distributed compressive sensing: A deep learning approach. IEEE Transactions on Signal Processing, 64(17):4504–4518, 2016.
[162] R Panda, PS Khobragade, PD Jambhule, SN Jengthe, PR Pal, and TK Gandhi. Classification of EEG signal using wavelet transform and support vector machine for epileptic seizure diction. In 2010 International Conference on Systems in Medicine and Biology, pages 405–408. IEEE, 2010.
[163] Yun Park, Lan Luo, Keshab K Parhi, and Theoden Netoff. Seizure prediction with spectral power of EEG using cost-sensitive support vector machines. Epilepsia, 52(10):1761–1770, 2011.
[164] Shivnarayan Patidar and Trilochan Panigrahi. Detection of epileptic seizure using kraskov entropy applied on tunable-q wavelet transform of EEG signals. Biomedical Signal Processing and Control, 34:74–80, 2017.
[165] Musa Peker, Baha Sen, and Dursun Delen. A novel method for automated diagnosis of epilepsy using complex-valued classifiers. IEEE Journal of Biomedical and Health Informatics, 20(1):108–118, 2016.
[166] Hasan Polat and Mehmet Sirac Ozerdem. Epileptic seizure detection from EEG signals by using wavelet and hilbert transform. In 2016 XII International Conference on Perspective Technologies and Methods in MEMS Design (MEMSTECH), pages 66–69. IEEE, 2016.
[167] Kemal Polat and Salih Güneş. Classification of epileptiform EEG using a hybrid system based on decision tree classifier and fast fourier transform. Applied Mathematics and Computation, 187(2):1017–1026, 2007.
[168] Muhammed Shanir PP, Yusuf U Khan, and Omar Farooq.
Time domain analysis of EEG for automatic seizure detection.
[169] Daniel Rivero, Enrique Fernandez-Blanco, Julian Dorado, and Alejandro Pazos. A new signal classification technique by means of genetic algorithms and knn. In Evolutionary Computation (CEC), 2011 IEEE Congress on, pages 581–586. IEEE, 2011.
[170] Greg Rogers. Epilepsy: the facts. Primary Health Care Research & Development, 11(4):413, 2010.
[171] ME Saab and Jean Gotman. A system to detect the onset of epileptic seizures in scalp EEG. Clinical Neurophysiology, 116(2):427–442, 2005.
[172] Kaveh Samiee, Peter Kovacs, and Moncef Gabbouj. Epileptic seizure classification of EEG time-series using rational discrete short-time fourier transform. IEEE Transactions on Biomedical Engineering, 62(2):541–552, 2015.
[173] Kaveh Samiee, Peter Kovacs, and Moncef Gabbouj. Epileptic seizure classification of EEG time-series using rational discrete short-time fourier transform. IEEE Transactions on Biomedical Engineering, 62(2):541–552, 2015.
[174] Kaveh Samiee, Péter Kovács, and Moncef Gabbouj. Epileptic seizure detection in long-term EEG records using sparse rational decomposition and local gabor binary patterns feature extraction. Knowledge-Based Systems, 118:228–240, 2017.
[175] Mohamad Sawan, Muhammad T Salam, Jérôme Le Lan, Amal Kassab, Sébastien Gélinas, Phetsamone Vannasing, Frédéric Lesage, Maryse Lassonde, and Dang K Nguyen. Wireless recording systems: from noninvasive EEG-NIRS to invasive EEG devices. IEEE Transactions on Biomedical Circuits and Systems, 7(2):186–195, 2013.
[176] Freescale Semiconductor. Xs110 uwb solution for media-rich wireless applications, 2004.
[177] Vinit Shah, Meysam Golmohammadi, Saeedeh Ziyabari, Eva Von Weltin, Iyad Obeid, and Joseph Picone. Optimizing channel selection for seizure detection. In Signal Processing in Medicine and Biology Symposium (SPMB), 2017 IEEE, pages 1–5.
IEEE, 2017.
[178] Chia-Ping Shen, Chih-Chuan Chen, Sheau-Ling Hsieh, Wei-Hsin Chen, Jia-Ming Chen, Chih-Min Chen, Feipei Lai, and Ming-Jang Chiu. High-performance seizure detection system using a wavelet-approximate entropy-fsvm cascade with clinical validation. Clinical EEG and Neuroscience, 44(4):247–256, 2013.
[179] Yuhui Shi. Particle swarm optimization. IEEE Connections, 2(1):8–13, 2004.
[180] Han-Tai Shiao, Vladimir Cherkassky, Jieun Lee, Brandon Veber, Edward E Patterson, Benjamin H Brinkmann, and Gregory A Worrell. Svm-based system for prediction of epileptic seizures from iEEG signal. IEEE Transactions on Biomedical Engineering, 64(5):1011–1022, 2017.
[181] Eugene I Shih, Ali H Shoeb, and John V Guttag. Sensor selection for energy-efficient ambulatory medical monitoring. In Proceedings of the 7th international conference on Mobile systems, applications, and services, pages 347–358. ACM, 2009.
[182] Simon Shorvon, Emilio Perucca, and Jerome Engel Jr. The treatment of epilepsy. John Wiley & Sons, 2015.
[183] Debdeep Sikdar, Rinku Roy, and Manjunatha Mahadevappa. Epilepsy and seizure characterisation by multifractal analysis of EEG subbands. Biomedical Signal Processing and Control, 41:264–270, 2018.
[184] SJM Smith. EEG in the diagnosis, classification, and management of patients with epilepsy. Journal of Neurology, Neurosurgery & Psychiatry, 76(suppl 2):ii2–ii7, 2005.
[185] Yuedong Song and Pietro Liò. A new approach for epileptic seizure detection: sample entropy based feature extraction and extreme learning machine. Journal of Biomedical Science and Engineering, 3(06):556, 2010.
[186] Zhenxi Song, Jiang Wang, Lihui Cai, Bin Deng, and Yingmei Qin. Epileptic seizure detection of electroencephalogram based on weighted-permutation entropy. In Intelligent Control and Automation (WCICA), 2016 12th World Congress on, pages 2819–2823. IEEE, 2016.
[187] Susan Spencer and Linda Huh. Outcomes of epilepsy surgery in adults and children.
The Lancet Neurology, 7(6):525–537, 2008.
[188] William C Stacey and Brian Litt. Technology insight: neuroengineering and epilepsy—designing devices for seizure control. Nature Reviews Neurology, 4(4):190, 2008.
[189] Abdulhamit Subasi. Automatic detection of epileptic seizure using dynamic fuzzy neural networks. Expert Systems with Applications, 31(2):320–328, 2006.
[190] Abdulhamit Subasi. EEG signal classification using wavelet feature extraction and a mixture of expert model. Expert Systems with Applications, 32(4):1084–1093, 2007.
[191] Yanmin Sun, Mohamed S Kamel, and Yang Wang. Boosting for learning multiple classes with imbalanced class distribution. In Sixth International Conference on Data Mining (ICDM’06), pages 592–602. IEEE, 2006.
[192] Azadeh Kamali Tafreshi, Ali M Nasrabadi, and Amir H Omidvarnia. Epileptic seizure detection using empirical mode decomposition. In 2008 IEEE International Symposium on Signal Processing and Information Technology, pages 238–242. IEEE, 2008.
[193] Yaniv Taigman, Ming Yang, Marc’Aurelio Ranzato, and Lior Wolf. Deepface: Closing the gap to human-level performance in face verification. In Proceedings of the IEEE CVPR’2014, pages 1701–1708, 2014.
[194] Taner Tanriverdi, Abdulrazag Ajlan, Nicole Poulin, and Andre Olivier. Morbidity in epilepsy surgery: an experience based on 2449 epilepsy surgery procedures from a single institution. Journal of Neurosurgery, 110(6):1111–1123, 2009.
[195] José F Téllez-Zenteno, Lizbeth Hernández Ronquillo, Farzad Moien-Afshari, and Samuel Wiebe. Surgical outcomes in lesional and non-lesional epilepsy: a systematic review and meta-analysis. Epilepsy Research, 89(2-3):310–318, 2010.
[196] William H Theodore and Robert S Fisher. Brain stimulation for epilepsy. The Lancet Neurology, 3(2):111–118, 2004.
[197] Pierre Thodoroff, Joelle Pineau, and Andrew Lim. Learning robust features using deep learning for automatic seizure detection.
In Proceedings of MLHC’2016, pages 178–190, 2016.
[198] DA Torse, V Desai, and R Khanai. EEG signal classification into seizure and non-seizure class using empirical mode decomposition and artificial neural network. IJIR, 3(1):2454–1362, 2017.
[199] Vernon L Towle, José Bolaños, Diane Suarez, Kim Tan, Robert Grzeszczuk, David N Levin, Raif Cakmur, Samuel A Frank, and Jean-Paul Spire. The spatial location of EEG electrodes: locating the best-fitting sphere relative to cortical anatomy. Electroencephalography and Clinical Neurophysiology, 86(1):1–6, 1993.
[200] Eugen Trinka, Julia Höfler, and Alexander Zerbs. Causes of status epilepticus. Epilepsia, 53:127–138, 2012.
[201] Nhan Duy Truong, Anh Duy Nguyen, Levin Kuhlmann, Mohammad Reza Bonyadi, Jiawei Yang, Samuel Ippolito, and Omid Kavehei. Convolutional neural networks for seizure prediction using intracranial and scalp electroencephalogram. Neural Networks, 105:104–111, 2018.
[202] Alexandros T Tzallas, Markos G Tsipouras, and Dimitrios I Fotiadis. Automatic seizure detection based on time-frequency analysis and artificial neural networks. Computational Intelligence and Neuroscience, 2007, 2007.
[203] Elif Derya Übeyli. Analysis of EEG signals by combining eigenvector methods and multiclass support vector machines. Computers in Biology and Medicine, 38(1):14–22, 2008.
[204] Elif Derya Übeyli. Wavelet/mixture of experts network structure for EEG signals classification. Expert Systems with Applications, 34(3):1954–1962, 2008.
[205] Elif Derya Übeyli. Combined neural network model employing wavelet coefficients for EEG signals classification. Digital Signal Processing, 19(2):297–308, 2009.
[206] Elif Derya Übeyli. Decision support systems for time-varying biomedical signals: EEG signals classification. Expert Systems with Applications, 36(2):2275–2284, 2009.
[207] Jamie J Van Gompel, Gregory A Worrell, Michael L Bell, Todd A Patrick, Gregory D Cascino, Corey Raffel, W Richard Marsh, and Fredric B Meyer.
Intracranial electroencephalography with subdural grid electrodes: techniques, complications, and outcomes. Neurosurgery, 63(3):498–506, 2008.
[208] Lasitha S Vidyaratne and Khan M Iftekharuddin. Real-time epileptic seizure detection using EEG. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 25(11):2146–2156, 2017.
[209] Xin Wan. Early detection of seizure with a sequential analysis approach. 2015.
[210] Lina Wang, Weining Xue, Yang Li, Meilin Luo, Jie Huang, Weigang Cui, and Chao Huang. Automatic epileptic seizure detection in EEG signals using multi-domain feature extraction and nonlinear analysis. Entropy, 19(6):222, 2017.
[211] Jörg Wellmer, Ferdinand von der Groeben, Ute Klarmann, Christian Weber, Christian E Elger, Horst Urbach, Hans Clusmann, and Marec von Lehe. Risks and benefits of invasive epilepsy surgery workup with implanted subdural and depth electrodes. Epilepsia, 53(8):1322–1332, 2012.
[212] Jessica A Wilden and Aaron A Cohen-Gadol. Evaluation of first nonfebrile seizures. American Family Physician, 86(4), 2012.
[213] James R Williamson, Daniel W Bliss, David W Browne, and Jaishree T Narayanan. Seizure prediction using EEG spatiotemporal correlation structure. Epilepsy & Behavior, 25(2):230–238, 2012.
[214] Chong H Wong, Julie Birkett, Karen Byth, Mark Dexter, Ernest Somerville, Deepak Gill, Ray Chaseling, Michael Fearnside, and Andrew Bleasel. Risk factors for complications during intracranial electrode recording in presurgical evaluation of drug resistant partial epilepsy. Acta Neurochirurgica, 151(1):37, 2009.
[215] Refet Firat Yazicioglu, Tom Torfs, Patrick Merken, Julien Penders, Vladimir Leonov, Robert Puers, Bert Gyselinckx, and Chris Van Hoof. Ultra-low-power biopotential interfaces and their applications in wearable and implantable systems. Microelectronics Journal, 40(9):1313–1321, 2009.
[216] Chung-Ping Young, Sheng-Fu Liang, Da-Wei Chang, Yi-Cheng Liao, Fu-Zen Shaw, and Chao-Hsien Hsieh.
A portable wireless online closed-loop seizure controller in freely moving rats. IEEE Transactions on Instrumentation and Measurement, 60(2):513–521, 2011.
[217] Qi Yuan, Weidong Zhou, Shufang Li, and Dongmei Cai. Epileptic EEG classification based on extreme learning machine and nonlinear features. Epilepsy Research, 96(1-2):29–38, 2011.
[218] Ye Yuan, Guangxu Xun, Kebin Jia, and Aidong Zhang. A multi-view deep learning method for epileptic seizure detection using short-time fourier transform. In Proceedings of ACM-BCB’2017, pages 213–222. ACM, 2017.
[219] Zarita Zainuddin, Lai Kee Huong, and Ong Pauline. On the use of wavelet neural networks in the task of epileptic seizure detection from electroencephalography signals. Procedia Computer Science, 11:149–159, 2012.
[220] Tong Zhang. Solving large scale linear prediction problems using stochastic gradient descent algorithms. In Proceedings of the twenty-first international conference on Machine learning, page 116. ACM, 2004.
[221] Weidong Zhou, Yinxia Liu, Qi Yuan, and Xueli Li. Epileptic seizure detection using lacunarity and bayesian linear discriminant analysis in intracranial EEG. IEEE Transactions on Biomedical Engineering, 60(12):3375–3381, 2013.
[222] Indrė Žliobaitė, Mykola Pechenizkiy, and Joao Gama. An overview of concept drift applications. In Big Data Analysis: New Algorithms for a New Society, pages 91–114. Springer, 2016.

Appendix A

Chapter 5 Supplementary Material

A.1 Convergence Rate of Gradient Descent

In this appendix, we derive the convergence rate of the proximal gradient descent method, which has been widely used for solving twice-differentiable optimization problems.

Assuming the function $f(w)$ is $\mu$-strongly convex with Lipschitz-continuous gradient, we say that $f(w)$ is $\mu$-strongly invex if, for all $w$ and some $\mu > 0$, it holds that
\[
\frac{1}{2}\,\|\nabla f(w)\|^2 \;\ge\; \mu\,\bigl(f(w) - f^*\bigr) \tag{A.1}
\]
where $f^*$ is the function value achieved by any optimal solution $w^*$.
For differentiable functions, invex functions are the set of functions where all stationary points are global minimizers (i.e., all convex functions are invex, but there are invex functions that are not convex).

Assume that $f(w)$ has an $L$-Lipschitz continuous gradient, has at least one solution $w^*$, and is $\mu$-strongly invex. Consider gradient descent with a constant step size of $\alpha_k = 1/L$:
\[
w^{k+1} = w^k - \frac{1}{L}\,\nabla f(w^k) \tag{A.2}
\]
From Taylor's expansion, for some $z$ on the line segment between $w^k$ and $w^{k+1}$:
\[
f(w^{k+1}) = f(w^k) + \nabla f(w^k)^T \bigl(w^{k+1} - w^k\bigr) + \frac{1}{2}\bigl(w^{k+1} - w^k\bigr)^T \nabla^2 f(z)\,\bigl(w^{k+1} - w^k\bigr) \tag{A.3}
\]
Hence, since the $L$-Lipschitz gradient implies $\nabla^2 f(z) \preceq L I$,
\[
f(w^{k+1}) \le f(w^k) + \nabla f(w^k)^T \bigl(w^{k+1} - w^k\bigr) + \frac{L}{2}\,\|w^{k+1} - w^k\|^2 \tag{A.4}
\]
Substituting (A.2) into (A.4), we have
\[
f(w^{k+1}) \le f(w^k) - \frac{1}{2L}\,\|\nabla f(w^k)\|^2 \tag{A.5}
\]
Subtracting $f^*$ from both sides and using strong invexity (A.1), we get
\begin{align*}
f(w^{k+1}) - f^* &\le f(w^k) - f^* - \frac{1}{2L}\,\|\nabla f(w^k)\|^2 \\
&\le f(w^k) - f^* - \frac{2\mu}{2L}\,\bigl(f(w^k) - f^*\bigr) \\
&= \Bigl(1 - \frac{\mu}{L}\Bigr)\bigl(f(w^k) - f^*\bigr) \tag{A.6}
\end{align*}
Applying this recursively gives
\[
f(w^k) - f^* \le \Bigl(1 - \frac{\mu}{L}\Bigr)^k \bigl(f(w^0) - f^*\bigr) \tag{A.7}
\]
Hence, gradient descent converges linearly. The iteration cost of proximal-gradient descent is O(nd), where n is the total number of observations (EEG signals in our case) and d is the total number of features (EEG frequency components in our case).

A.2 Convergence Rate of Block Coordinate Descent

In this appendix, we present the convergence rate of the block coordinate descent (BCD) algorithm, which is computationally well suited to solving the proposed L1-penalized robust regression problem. We prove that the iteration cost of BCD is O(n), so that its iterations are d times faster than the proximal-gradient iteration cost of O(nd).

Consider the optimization problem
\[
\operatorname*{argmin}_{w \in \mathbb{R}^d} f(w)
\]
where $f(w)$ is twice-differentiable. We partition the variable $w$ into $c$ disjoint "blocks"
\[
w = \bigl[\,w_1, w_2, w_3, \cdots, w_c\,\bigr]^T,
\]
each of size $d/c$. Assume that $f$ is strongly convex and blockwise strongly smooth,
\[
\nabla^2 f(w) \succeq \mu I, \qquad \nabla^2_{jj} f(w) \preceq L I,
\]
for all $w$ and all blocks $j$.
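As a small illustration of the partition and the two Hessian assumptions just stated, the sketch below builds a toy quadratic f(w) = 0.5 wᵀAw − bᵀw, whose Hessian is A everywhere, splits the coordinates into c equal blocks, and reads off µ and the blockwise L from eigenvalues. The matrix `A`, the sizes `d` and `c`, and the helper names are assumptions made for illustration only; they are not taken from the thesis.

```python
import numpy as np

def make_blocks(d, c):
    """Split indices 0..d-1 into c disjoint blocks of size d/c (d must be divisible by c)."""
    if d % c != 0:
        raise ValueError("d must be divisible by c")
    return np.split(np.arange(d), c)

def strong_convexity_and_block_smoothness(A, blocks):
    """For f(w) = 0.5 w^T A w - b^T w the Hessian equals A for every w, so
    mu is the smallest eigenvalue of A, and the blockwise L is the largest
    eigenvalue found among the diagonal blocks A[j, j]."""
    mu = np.linalg.eigvalsh(A).min()
    L = max(np.linalg.eigvalsh(A[np.ix_(b, b)]).max() for b in blocks)
    return mu, L

# Toy problem (hypothetical sizes): d = 12 variables in c = 4 blocks of 3.
rng = np.random.default_rng(0)
d, c = 12, 4
G = rng.standard_normal((d, d))
A = G @ G.T + np.eye(d)          # symmetric positive definite Hessian
blocks = make_blocks(d, c)
mu, L = strong_convexity_and_block_smoothness(A, blocks)
```

For this quadratic the blockwise L is never larger than the largest eigenvalue of the full Hessian, because each diagonal block is a principal submatrix of A.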
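Consider a block coordinate descent algorithm that uses the iteration
\[
w^{k+1} = w^k - \frac{1}{L}\,\bigl(\nabla f(w^k) \circ e_{j_k}\bigr),
\]
where $j_k$ is the block chosen on iteration $k$, $e_j$ is a vector of zeros with ones at the locations of block $j$, and $\circ$ denotes element-wise multiplication of two vectors. (This is like coordinate descent, except that we update $d/c$ variables instead of just one.)

Assume that we pick a block uniformly at random on each iteration, $p(j_k = j) = 1/c$. The update then satisfies, for some $z$ between $w^k$ and $w^{k+1}$:
\begin{align*}
f(w^{k+1}) &= f(w^k) + \nabla_{j_k} f(w^k)^T \bigl(w^{k+1} - w^k\bigr)_{j_k} + \frac{1}{2}\bigl(w^{k+1} - w^k\bigr)_{j_k}^T \nabla^2_{j_k j_k} f(z)\,\bigl(w^{k+1} - w^k\bigr)_{j_k} \\
&\le f(w^k) + \nabla_{j_k} f(w^k)^T \bigl(w^{k+1} - w^k\bigr)_{j_k} + \frac{L}{2}\,\bigl\|\bigl(w^{k+1} - w^k\bigr)_{j_k}\bigr\|^2 \\
&= f(w^k) - \frac{1}{L}\,\|\nabla_{j_k} f(w^k)\|^2 + \frac{L}{2}\,\Bigl\|\frac{1}{L}\nabla_{j_k} f(w^k)\Bigr\|^2 \\
&= f(w^k) - \frac{1}{2L}\,\|\nabla_{j_k} f(w^k)\|^2 \tag{A.8}
\end{align*}
This holds for any block $j_k$, but using random selection we have
\begin{align*}
\mathbb{E}\bigl[f(w^{k+1})\bigr] &\le f(w^k) - \frac{1}{2L}\sum_{j} \frac{1}{c}\,\|\nabla_j f(w^k)\|^2 \\
&= f(w^k) - \frac{1}{2Lc}\,\|\nabla f(w^k)\|^2 \tag{A.9}
\end{align*}
Since strong convexity implies (A.1), we have
\[
\mathbb{E}\bigl[f(w^{k+1}) - f(w^*)\bigr] \le \Bigl(1 - \frac{\mu}{Lc}\Bigr)\bigl[f(w^k) - f(w^*)\bigr] \tag{A.10}
\]
Applying this recursively gives
\[
\mathbb{E}\bigl[f(w^{k})\bigr] - f(w^*) \le \Bigl(1 - \frac{\mu}{Lc}\Bigr)^{k}\bigl[f(w^0) - f(w^*)\bigr] \tag{A.11}
\]

Appendix B

Chapter 6 Supplementary Material

B.1 Influence of EEG Segmentation on the Seizure Detection Accuracy

In this appendix, we show the impact of the EEG segment length on the detection accuracy of epileptic seizures. Figure B.1 depicts how the seizure detection accuracy decays with longer segment lengths. It also shows that L = 1 and L = 2 are the only EEG segment lengths that achieve the highest seizure detection accuracy of 100.00%. Since a segment length of 2 yields a lower computational complexity than a length of 1, we adopted it in all our seizure detection experiments.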
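A minimal sketch of this kind of fixed-length segmentation is given below, assuming non-overlapping segments and a 4096-sample channel (the channel length used in these experiments). The function name `segment_signal` and the synthetic `channel` array are hypothetical stand-ins, not the thesis code.

```python
import numpy as np

def segment_signal(x, seg_len):
    """Cut a 1-D signal into non-overlapping segments of seg_len samples.
    Trailing samples that do not fill a whole segment are dropped."""
    x = np.asarray(x)
    n_segments = x.shape[0] // seg_len
    return x[: n_segments * seg_len].reshape(n_segments, seg_len)

# Stand-in for one EEG channel with 4096 samples (synthetic values).
channel = np.arange(4096, dtype=float)
segments = segment_signal(channel, seg_len=2)   # one row per segment
```

Each row of `segments` is one segment, so the number of rows is the sequence length seen by a downstream sequence model.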
In this regard, each EEG segment contains only two of the 4096 data points, producing 2048 segments for each EEG channel signal.

[Figure B.1: Classification accuracy against EEG segments’ length. Horizontal axis: Log2(L); vertical axis: Classification Accuracy (%).]

B.2 Architecture of Vanilla RNN and LSTM

In this appendix, we describe the architecture of the vanilla RNN (the standard recurrent network (SRN)) and the long short-term memory (LSTM) network, both of which have been widely used for solving sequence labeling problems [79]. The SRN's architecture, shown in the left part of Figure B.2, represents the basic structure of RNNs. Although RNNs show an outstanding capacity for modeling nonlinear time-series problems, SRNs usually suffer from vanishing or exploding gradients during back-propagation and are thus incapable of learning from long sequences [31].

To address this limitation of RNNs, more powerful recurrent architectures, such as LSTM, have been developed. LSTM has been found to work well on time series with both short- and long-term dependencies. We use the LSTM architecture illustrated in the right part of Figure B.2 for our seizure detection problem. This LSTM block has three gates (input, forget, output), a block input, a single cell (the Constant Error Carousel), an output activation function, and peephole connections [79]. The output of the block is recurrently connected back to the block input and all of the gates.

Let $x_t$ and $y_t$ be the LSTM input and output vectors at time $t$.
Then, with $x_t \in \mathbb{R}^{M}$ and $y_t \in \mathbb{R}^{B}$, an LSTM layer has the following weights:
• Input weights: $W_z, W_i, W_f, W_o \in \mathbb{R}^{B \times M}$
• Recurrent weights: $R_z, R_i, R_f, R_o \in \mathbb{R}^{B \times B}$
• Peephole weights: $P_i, P_f, P_o \in \mathbb{R}^{B}$
• Bias weights: $b_z, b_i, b_f, b_o \in \mathbb{R}^{B}$

[Figure B.2: Detailed schematic of the Standard Recurrent Network (SRN) unit (left) and a Long-Short-Term Memory (LSTM) block (right) [79].]

Considering Figure B.2, the vector formulas for the forward pass of a basic LSTM layer can be written as [79]:
\begin{align*}
\bar{z}_t &= W_z x_t + R_z y_{t-1} + b_z \tag{B.1} \\
z_t &= g(\bar{z}_t) && \text{block input} \tag{B.2} \\
\bar{i}_t &= W_i x_t + R_i y_{t-1} + P_i \odot c_{t-1} + b_i \tag{B.3} \\
i_t &= \sigma(\bar{i}_t) && \text{input gate} \tag{B.4} \\
\bar{f}_t &= W_f x_t + R_f y_{t-1} + P_f \odot c_{t-1} + b_f \tag{B.5} \\
f_t &= \sigma(\bar{f}_t) && \text{forget gate} \tag{B.6} \\
c_t &= z_t \odot i_t + c_{t-1} \odot f_t && \text{cell} \tag{B.7} \\
\bar{o}_t &= W_o x_t + R_o y_{t-1} + P_o \odot c_t + b_o \tag{B.8} \\
o_t &= \sigma(\bar{o}_t) && \text{output gate} \tag{B.9} \\
y_t &= h(c_t) \odot o_t && \text{block output} \tag{B.10}
\end{align*}
where $\sigma$, $g$, and $h$ are point-wise activation functions. The logistic sigmoid $\sigma(\cdot)$ is used as the gate activation function, and the hyperbolic tangent $g(\cdot) = h(\cdot) = \tanh(\cdot)$ is used as the input and output activation function of an LSTM unit. $\odot$ denotes the point-wise multiplication of two vectors [79].
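The forward pass in (B.1)–(B.10) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the parameter dictionary, the random initialisation, the zero initial state, and the sizes B = 4 and M = 2 are all hypothetical choices made for the example.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid, used as the gate activation."""
    return 1.0 / (1.0 + np.exp(-a))

def init_lstm_params(B, M, rng):
    """Small random weights with the shapes listed above (hypothetical initialisation)."""
    p = {}
    for gate in "zifo":
        p["W" + gate] = 0.1 * rng.standard_normal((B, M))   # input weights
        p["R" + gate] = 0.1 * rng.standard_normal((B, B))   # recurrent weights
        p["b" + gate] = np.zeros(B)                         # biases
    for gate in "ifo":
        p["P" + gate] = 0.1 * rng.standard_normal(B)        # peephole weights
    return p

def lstm_step(x_t, y_prev, c_prev, p):
    """One time step of the peephole LSTM, following (B.1)-(B.10)."""
    z = np.tanh(p["Wz"] @ x_t + p["Rz"] @ y_prev + p["bz"])                     # (B.1)-(B.2)
    i = sigmoid(p["Wi"] @ x_t + p["Ri"] @ y_prev + p["Pi"] * c_prev + p["bi"])  # (B.3)-(B.4)
    f = sigmoid(p["Wf"] @ x_t + p["Rf"] @ y_prev + p["Pf"] * c_prev + p["bf"])  # (B.5)-(B.6)
    c = z * i + c_prev * f                                                      # (B.7)
    o = sigmoid(p["Wo"] @ x_t + p["Ro"] @ y_prev + p["Po"] * c + p["bo"])       # (B.8)-(B.9)
    y = np.tanh(c) * o                                                          # (B.10)
    return y, c

def lstm_forward(xs, p):
    """Run the layer over a sequence xs of shape (T, M), starting from a zero state (assumption)."""
    B = p["bz"].shape[0]
    y, c = np.zeros(B), np.zeros(B)
    ys = []
    for x_t in xs:
        y, c = lstm_step(x_t, y, c, p)
        ys.append(y)
    return np.stack(ys)

# Illustrative sizes only: B = 4 blocks, M = 2 inputs, T = 5 time steps.
rng = np.random.default_rng(0)
params = init_lstm_params(B=4, M=2, rng=rng)
xs = rng.standard_normal((5, 2))
outputs = lstm_forward(xs, params)   # one B-dimensional output per time step
```

Note that the output gate in (B.8) peeks at the updated cell state $c_t$, whereas the input and forget gates use $c_{t-1}$; the code keeps that ordering by computing `c` before `o`.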
