UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Energy-efficient compressed sensing frameworks for the compression of electroencephalogram signals Fauvel, Simon 2013

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


24-ubc_2013_fall_fauvel_simon.pdf [ 3.69MB ]
JSON: 24-1.0103290.json
JSON-LD: 24-1.0103290-ld.json
RDF/XML (Pretty): 24-1.0103290-rdf.xml
RDF/JSON: 24-1.0103290-rdf.json
Turtle: 24-1.0103290-turtle.txt
N-Triples: 24-1.0103290-rdf-ntriples.txt
Original Record: 24-1.0103290-source.json
Full Text

Full Text

Energy-Efficient Compressed SensingFrameworks for the Compression ofElectroencephalogram SignalsbySimon FauvelB.Eng., McGill University, 2010A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFMASTER OF APPLIED SCIENCEinThe Faculty of Graduate and Postdoctoral Studies(Electrical and Computer Engineering)THE UNIVERSITY OF BRITISH COLUMBIA(Vancouver)October 2013c? Simon Fauvel 2013AbstractThe use of wireless body sensor networks (WBSNs) is gaining popularity inmonitoring and communicating information about a person?s health. In suchapplications, the amount of data transmitted by the sensor node should beminimized. This is because the energy available in these battery-poweredsensors is limited. In this thesis, we study the wireless transmission ofelectroencephalogram (EEG) signals. We propose novel, energy-efficientcompressed sensing (CS) frameworks that take advantage of the inherentstructure present in EEG signals (both temporal and spatial correlations)to efficiently compress these signals at the sensor node in WBSNs.We first present a simple CS-based framework that is adapted to the EEGWBSN setting. We optimize the sparsifying dictionary and demonstratethat using a fixed sparse binary sensing matrix offers similar performancesto optimal matrices while requiring far fewer computations.We then add an energy-efficient Independent Component Analysis (ICA)preprocessing block to the simple CS framework to exploit the spatial cor-relations among EEG channels. We show that the proposed framework pro-vides significant energy savings as compared to the state-of-the-art method.As well, for a fixed compression ratio, our system achieves similar normal-ized mean square error performance as the state-of-the-art method, whichiiis better than that achieved by the simple CS framework.We further improve on the energy performance of the framework byreplacing the ICA preprocessing block by a simpler, correlations-based in-terchannel redundancy module and by using entropy coding. On the energyfront, our proposed CS framework is up to 8 times more energy-efficient thanthe typical wavelet compression method. We also show that our methodachieves a better reconstruction quality than the state-of-the art BSBLmethod. We further demonstrate that our method is robust to measure-ment noise and to packet loss, and that it is applicable to a wide range ofEEG signal types.We finally apply our CS framework to compress EEG signals in thecontext of a brain computer interface application and evaluate its impacton the performance of the system. We show that interesting energy savingscan be realized at the cost of a mild decrease in classification accuracy.iiiPrefaceThis thesis presents the research conducted by Simon Fauvel, in collabora-tion with Dr. Rabab K. Ward. I hereby declare that I am the first authorof this thesis. Chapters 3 to 5 are based on work that has been published orsubmitted for publication. Below are the publications related to this thesis.C1 S. Fauvel, A. Agarwal, and R. Ward. Compressed Sensing and Energy-Aware Independent Component Analysis for Compression of EEG Sig-nals. In 38th International Conference on Acoustics, Speech, and Sig-nal Processing (ICASSP ?13), 2013.J1 S. Fauvel and R. Ward. An Energy-Efficient Compressed Sensing Frame-work for the Compression of Electroencephalogram Signals. 2013 (sub-mitted).For both publications, I carried out the literature review, developed theframeworks, implemented them in software, carried out the simulations,analyzed the results, and wrote the manuscripts. In C1, A. Agarwal helpedwith the Matlab implementation of the proposed ICA algorithm.Dr. Ward helped in formulating the research problem, supervised thedirection of the research, and provided significant editorial comments andimportant suggestions for the organization of each manuscript.ivTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . vList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ixList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xList of Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiiAcknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Telemedicine and Wireless Body Sensor Networks . . . . . . 11.2 Electroencephalogram Signals . . . . . . . . . . . . . . . . . 31.3 EEG-Based Wireless Body Sensor Network . . . . . . . . . . 41.4 Challenges and Constraints of EEG-Based Wireless Body Sen-sor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.5 Aim of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . 71.6 Existing EEG Compression Algorithms . . . . . . . . . . . . 8v1.7 Compressed Sensing . . . . . . . . . . . . . . . . . . . . . . . 91.8 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101.9 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . 101.10 Contributions of this Research . . . . . . . . . . . . . . . . . 121.11 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131.12 Organization of the Thesis . . . . . . . . . . . . . . . . . . . 142 Compressed Sensing . . . . . . . . . . . . . . . . . . . . . . . . 152.1 Signal Sparsity . . . . . . . . . . . . . . . . . . . . . . . . . . 152.2 Signal Acquisition and Reconstruction . . . . . . . . . . . . . 162.3 Incoherent Measurements . . . . . . . . . . . . . . . . . . . . 182.4 Extension to Compressible Signals . . . . . . . . . . . . . . . 192.5 Analysis Prior Formulation . . . . . . . . . . . . . . . . . . . 202.6 A Cautionary Note on the Analysis and Synthesis Operators 223 A Simple Compressed Sensing Framework for the Compres-sion of EEG Signals . . . . . . . . . . . . . . . . . . . . . . . . 243.1 Problem Description . . . . . . . . . . . . . . . . . . . . . . . 243.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253.2.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . 263.2.2 Compression . . . . . . . . . . . . . . . . . . . . . . . 263.2.3 Wireless Transmission . . . . . . . . . . . . . . . . . . 283.2.4 Reconstruction . . . . . . . . . . . . . . . . . . . . . . 283.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303.3.1 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . 303.3.2 Performance Metrics . . . . . . . . . . . . . . . . . . 31vi3.3.3 Choice of the CS Measurement Matrix . . . . . . . . 313.3.4 Reconstruction Performance . . . . . . . . . . . . . . 343.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354 Compressed Sensing and Energy-Efficient Independent Com-ponent Analysis Preprocessing for the Compression of EEGSignals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374.1 Problem Description . . . . . . . . . . . . . . . . . . . . . . . 374.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384.2.1 Energy-Efficient ICA . . . . . . . . . . . . . . . . . . 394.3 State-of-the-Art System . . . . . . . . . . . . . . . . . . . . . 414.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414.4.1 Data Used . . . . . . . . . . . . . . . . . . . . . . . . 414.4.2 Reconstruction Performance . . . . . . . . . . . . . . 414.4.3 Energy Analysis . . . . . . . . . . . . . . . . . . . . . 434.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465 An Energy-Efficient, Complete Compressed Sensing Frame-work for the Compression of EEG Signals . . . . . . . . . . 485.1 Problem Description . . . . . . . . . . . . . . . . . . . . . . . 485.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495.2.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . 505.2.2 Compression . . . . . . . . . . . . . . . . . . . . . . . 505.2.3 Encoding . . . . . . . . . . . . . . . . . . . . . . . . . 535.2.4 Wireless Transmission . . . . . . . . . . . . . . . . . . 555.2.5 Decoding . . . . . . . . . . . . . . . . . . . . . . . . . 55vii5.2.6 Reconstruction . . . . . . . . . . . . . . . . . . . . . . 565.3 State-of-the-Art Systems . . . . . . . . . . . . . . . . . . . . 565.3.1 JPEG2000-Based Compression . . . . . . . . . . . . . 575.3.2 BSBL CS Compression . . . . . . . . . . . . . . . . . 595.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605.4.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . 605.4.2 Performance Metrics . . . . . . . . . . . . . . . . . . 615.4.3 Energy . . . . . . . . . . . . . . . . . . . . . . . . . . 625.4.4 Reconstruction Performance . . . . . . . . . . . . . . 655.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 696 A Compressed Sensing Framework for BCI Systems in WB-SNs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 716.1 Problem Description . . . . . . . . . . . . . . . . . . . . . . . 716.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 746.3.1 Data Used . . . . . . . . . . . . . . . . . . . . . . . . 746.3.2 Compression and Classification Performance . . . . . 756.3.3 Energy Analysis . . . . . . . . . . . . . . . . . . . . . 766.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 777 Summary and Conclusions . . . . . . . . . . . . . . . . . . . . 797.1 Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 797.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 80Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82viiiList of Tables1.1 EEG-Based WBSN Commercial Products . . . . . . . . . . . 51.2 EEG Lossy Compression Algorithms . . . . . . . . . . . . . . 93.1 Reconstruction Performance of the Simple Framework underDifferent Compression Ratios . . . . . . . . . . . . . . . . . . 354.1 FLOPS Comparison for Measurement Matrices . . . . . . . . 454.2 FLOPS Comparison Between xFICA and SOBI . . . . . . . . 465.1 Energy Consumption and Reconstruction Accuracy for allFrameworks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646.1 Classification Performance for BCI Competition IV, dataset# 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756.2 Classification Performance for BCI Competition IV, dataset# 2a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 766.3 Energy Consumption Under Different Compression Ratios . . 77ixList of Figures1.1 General block diagram for the EEG WBSN system . . . . . . 53.1 Block diagrams of (a) the sensor node and (b) the server nodefor the simple compressed sensing framework . . . . . . . . . 253.2 NMSE vs d for different compression ratios . . . . . . . . . . 323.3 NMSE vs CR for different measurement matrices . . . . . . . 333.4 Original and reconstructed EEG signals (one channel) at dif-ferent compression ratios . . . . . . . . . . . . . . . . . . . . . 364.1 Block diagrams of (a) the sensor node and (b) the server nodefor the compressed sensing framework with ICA preprocessing 394.2 NMSE as a function of the compression ratio for regular CS,state-of-the-art from [39], and the proposed method. . . . . . 434.3 NMSE for the 100 epochs (in decreasing order of magnitude)when the compression ratio is 3:1. . . . . . . . . . . . . . . . 445.1 Block diagrams of (a) the sensor node and (b) the server nodefor the proposed CS-based framework . . . . . . . . . . . . . 495.2 The interchannel redundancy removal algorithm. . . . . . . . 525.3 pdf of the raw CS measurements and of the difference signals 54x5.4 Block diagrams of (a) the sensor node and (b) the server nodefor the JPEG2000-based framework . . . . . . . . . . . . . . . 585.5 Block diagrams of (a) the sensor node and (b) the server nodefor the BSBL framework . . . . . . . . . . . . . . . . . . . . . 595.6 Reconstruction performance under Gaussian noise for a fixedCR (2.5:1), with varying SNR . . . . . . . . . . . . . . . . . . 675.7 Reconstruction performance under packet loss for a fixed CR(2.5:1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 686.1 Block diagrams of (a) the sensor node and (b) the server nodefor the proposed CS-based BCI framework . . . . . . . . . . . 73xiList of AcronymsBCI Brain Computer InterfaceBSBL Block-Sparse Bayesian LearningBSBL-BO Block-Sparse Bayesian Learning - Bounded OptimizationCDF Cohen-Daubechies-FeauveauCR Compression RatioCS Compressed SensingCSP Common Spatial PatternsDCT Discrete Cosine TransformEEG ElectroencephalogramFICA FastICAFLOPS Floating-Point OperationsGDP Gross Domestic ProductIC Independent ComponentICA Independent Component AnalysisIID Independent and Identically DistributedIOC Iteration of ConvergenceLDA Linear Discriminant AnalysisMMV Multiple Measurements VectorsNMSE Normalized Mean Square ErrornD n-DimensionalPDF Probability Density FunctionRIP Restricted Isometry PropertySNR Signal-to-Noise RatioSOBI Second-Order Blind IdentificationSPIHT Set Partitioning in Hierarchical TreesWBSN Wireless Body Sensor NetworkxFICA Cross-Product-Based FastICAxiiAcknowledgementsI would first like to thank my supervisor, Dr. Rabab Ward, for her continuedguidance and support throughout my time at UBC, and for her patience asI was learning the ropes and figuring out this new environment. This wholejourney was far from being smooth on many levels, and Dr. Ward?s supportwas essential in bringing it to fruition. This research and thesis would nothave been possible without her.Thank you to my labmates for contributing to making my time at UBCpleasant. A special thanks to Tanaya and Mani, two wonderful colleagues,but more importantly two amazing friends. I will always remember andcherish the great times we had together.Thank you to my friends back home, who I wish I could see more of-ten. Thank you to my EWB family for providing constant inspiration andmotivation, for helping me grow as a person, and for helping me maintainbalance in my life (sort of).And last but not least, a big thanks to my family for their never endinglove and support. Given what we had to go through in the last few months,it?s clear that I would not be here without you. Thank you for your continuedsupport in my different ventures, even the more foolish ones. Knowing thatyou are always behind me is essential and is what has allowed me to perseverexiiiwhen motivation was low and when I was questioning myself.This research was supported by the Qatar National Research Fund (QNRFNo. NPRP 09-310-1-058), the Natural Sciences and Engineering ResearchCouncil of Canada (NSERC) through an Alexander Graham Bell CanadaGraduate Scholarship (CGS-M), and the Fonds de recherche Nature et tech-nologies (FQRNT) through a masters research scholarship (B1).xivChapter 1Introduction1.1 Telemedicine and Wireless Body SensorNetworksHealthcare consumes a large part of the gross domestic product (GDP) ofmost countries, and the trend is going upward. In the last two decades,health expenditures as a percentage of GDP have steadily increased from8.8% to 10.1% worldwide. The trend is more obvious in high income coun-tries, where the same metric has gone from 9.5% to 12.0% during that period[6].Most countries are also facing important increases in elderly population(defined as the total population above 65 years old) and in chronic diseasepatients. Between 1960 and 2012, the percentage of elderly population hasgone from 5.1% to 7.8%, and the trend is predicted to accelerate in thedecades to come [7, 45]. Chronic diseases are on the rise everywhere aroundthe world and take up a significant portion of healthcare budgets [46]. Theirimpact can also be felt on the global economy (mainly due to productivitylosses). In Canada, 58% of all annual healthcare spending went to treatchronic disease patients in 2010. It is estimated that Canada lost about 1901billion dollars in 2010 due to chronic diseases ($68 billion in direct healthcosts, and $122 billion in indirect costs) [44]. The financial impact of theseincreases is predicted to accelerate in the years and decades to come.On top of the financial burden created by the increasing number of el-derly and chronic disease patients, traditional healthcare is suffering fromscalability issues. By definition, chronic disease patients require close moni-toring of their condition over time, which puts significant pressure on healthinfrastructure. Traditional healthcare cannot provide the scalability re-quired, as it relies on a physical, one-to-one relationship between the care-giver and the patient.Cost-effective solutions are thus needed to mitigate these issues. Onepossibility is to enable patients to participate in their own treatment bygiving them the technological tools necessary to monitor and remotely com-municate their situation to caregivers. With recent advances in signal pro-cessing and very-low-power wireless communications, wireless body sensornetworks (WBSNs) have gained popularity as a potential solution. The useof various sensors located on the patient?s body allows WBSNs to measureand communicate different physiological signals (e.g. heart and brain activ-ity) [40].Such a setup has the potential to be cost-effective both for the patientsand for the healthcare system as a whole. By giving the patients morecontrol over their treatment, the burden on medical staff and infrastructuresis reduced. The patients can also carry out most of their activities normallyand avoid trips to a physical health infrastructure. Moreover, WBSNs givepatients greater autonomy and ownership when it comes to taking care of2their own health, which can lead to a better quality of life.1.2 Electroencephalogram SignalsAn important component of WBSNs is the study of the electrical brain activ-ity. This is achieved via recording and analyzing the electroencephalogram(EEG) signals, using a collection of non-invasive wireless sensors located ona patient?s scalp. On top of being non-invasive, EEG signals provide hightemporal resolution, a desirable characteristic in many situations. This hightemporal resolution comes at the expense of a reduced spatial resolution(determined by the number of sensors used).EEG signals can be used to detect different medical conditions, suchas epileptic seizures [41]. The detection of seizures through the use of aWBSN offers significant advantages. Because it is a relatively rare occur-rence, seizure detection requires constant monitoring for an extended periodof time, which is resource-intensive when carried in a health institution. Us-ing an EEG WBSN can circumvent this by providing the patients a way todo the monitoring themselves and then consult a physician once the relevantdata has been gathered.Another important application of EEG signals in WBSNs is the use of abrain computer interface (BCI) that can detect the EEG patterns associatedwith a mental task performed by a patient [13]. The patient could usea mental task (such as attempting to move a finger or some arithmetictask) to operate a wheel chair, switch a light off or communicate with acaregiver. As the signals associated with the mental task are embedded in3the patient?s EEG, if their presence is detected in the EEG signals, thenthis information could be used to control the BCI. As will be seen later, oneof the main components required in the successful use of BCIs in WBSNsis the development of advanced compression techniques that preserve therelevant information (or features) in the EEG signals [9].Other common uses of EEG signals include sleep pattern studies, andthe diagnosis and treatment of strokes, infections (e.g. encephalitis, cerebralabscess), cerebral tumors and diseases (e.g. Alzheimer?s) (see e.g. [21, 32,33, 59]).In the above applications, it is preferable to have a system that is mini-mally obtrusive and allows the patient to move and walk freely, hence whyWBSNs can be valuable.1.3 EEG-Based Wireless Body Sensor NetworkThe setup for an EEG-based WBSN is as follows. EEG sensors are placedon a person?s head, usually following an international convention (e.g. theinternational 10-20 system). An EEG sensor is also referred to as an EEGchannel. The number of sensors (electrodes) depends on the application -some systems require few sensors while others require a few hundreds. Everysensor is wired to a single central microprocessor unit that would normallyhave three main components: a buffer (to store the EEG data stream comingfrom the different EEG channels; this buffer can be seen as memory), themicroprocessor itself (to carry out any computations needed before trans-mission), and a low-power radio (to wirelessly transmit the data). The com-4S1Data Buffer Microprocessor Radio TransmitterS2S3SC...Radio Receiver Computing ResourceServer NodeMicroprocessor UnitSensor NodeEEG SensorsWireless TransmissionFigure 1.1: General block diagram for the EEG WBSN systemTable 1.1: EEG-Based WBSN Commercial ProductsDevice Release Nb of EEG BatteryReferenceName Date Channels LifeEmotiv EPOC December 2009 14 12h [20]Neurosky Mindwave March 2011 1 8h [43]Interaxon Muse December 2013 4 5h [31]Imec Not announced yet 8 22h [30]bination of the EEG sensors and the microprocessor unit is what we refer toas the sensor node. This sensor node is battery-powered. The sensor nodewirelessly transmits the EEG data to the server node placed in an adjacentroom or somewhere in the vicinity of the sensor node. The server node iscomprised of two main blocks: a low-power radio receiver (to receive thetransmitted EEG data) and a computing resource (to carry out any post-transmission computations, storage, and any other desired operations). Weassume there is no constraint on the energy supply or the computationalpower available from this server node. The complete system setup is shownin Fig. 1.1.EEG-based WBSN products have started to appear on the market in thelast decade or so. The best available commercial products are summarizedin Table 1.1.5However, these products suffer from two major drawbacks. First, theirbattery life is limited. With a battery life of at most 22 hours for the leastenergy-hungry device, extended monitoring is not possible unless the patientwere to change the batteries every day. This is clearly inconvenient. Sec-ond, the number of EEG channels is still small. With more EEG channels,greater spatial resolution could be achieved, which could improve currentapplications or even unlock new ones. Of course, the number of channelsis intrinsically linked to the battery situation - using more channels woulddecrease the operating life time of the battery because of the additional sens-ing and wireless transmission required. There is also a limit to the numberof data packets that the radio can transmit per unit of time, which mayrestrict the maximum number of channels that can be used.Therefore, solutions that can address these two drawbacks are desirableand could lead to better, more practical commercial products available tocustomers and patients.1.4 Challenges and Constraints of EEG-BasedWireless Body Sensor NetworksThe energy available in the battery-powered sensor node in WBSNs is lim-ited. This energy is needed for acquiring and digitizing the EEG samples,for carrying out the computations at the sensor node, and for wirelesslytransmitting the data. Under current sensors technology, there is little thatcan be done to minimize the sensing energy - that is, the raw data mustall be acquired and digitized. For computations carried out in the sensor6node, energy savings should therefore be realized by using algorithms thatare energy-efficient, i.e. have low computational complexity. To minimizethe amount of data to be transmitted by the sensor node, the acquired sig-nals should be compressed before their transmission. A higher compressionratio will minimize the energy required for transmission. In other words,reducing the amount of computations and the number of bits transmitted istherefore crucial if one is to minimize the overall energy consumption of thesystem. There is thus a need to find compression solutions for EEG signalsthat do not require much energy.Traditionally, measurements are collected by the sensors at the Nyquistrate. Then, compression algorithms are directly applied to the acquireddata, at the sensor node, prior to wirelessly transmitting the data to theserver node. The computational demand (and thus energy consumption) oftraditional compression algorithms is high at the sensor node. This is highlyundesirable for WBSNs.1.5 Aim of the ThesisThe aim of this thesis is to find a compression framework suitable for anEEG-based WBSN in the context of telemedicine. Based on the discussionpresented in previous sections, our goal is to find a solution that is energy-efficient so that the system has a longer battery life.71.6 Existing EEG Compression AlgorithmsThere exists an important body of work related to the compression of EEGsignals. Broadly speaking, compression algorithms can be split into twocategories: lossless and lossy. Lossless algorithms give the ability to exactlyrecover the original signal, at the expense of greater computational complex-ity and lower compression ratios. Lossy algorithms, on the other hand, donot allow for perfect recovery of the original signals. However, these algo-rithms provide higher compression ratios and tend to be simpler. Given theconstraints mentioned in the previous section, in this work we focus on lossyalgorithms, as lossless algorithms are too complex in the WBSN setting. Wealso require high compression ratios, which is only possible through the useof lossy algorithms.For the sake of completeness, let us still briefly go over the body of liter-ature regarding lossless algorithms. To achieve losslessness, the algorithmsusually employ a lossy algorithm but also model the error so that perfectrecovery can be achieved. Since the 1990?s, many lossless EEG compressionalgorithms have been proposed, with varying complexity and compressionperformance (see for instance [4, 17, 38, 52?56, 61]).In many practical cases, exact recovery is not necessary, and a smallreconstruction error is tolerable (the magnitude of this error depends on theparticular application). Lossy algorithms are an interesting solution in theseinstances. Table 1.2 gives an overview of the main lossy EEG compressionalgorithms in the literature.8Table 1.2: EEG Lossy Compression AlgorithmsReference Year Brief Description[16] 2004 Thresholding of Daubechies-8 wavelet packets coef-ficients, followed by uniform quantization and runlength encoding[26] 2009 Classified signature and envelope vector sets ap-proach[27] 2010 Thresholding of CDF 9/7 wavelet coefficients, fol-lowed by uniform quantization and arithmetic en-coding[28] 2011 Thresholding of CDF 9/7 wavelet coefficients, fol-lowed by uniform quantization and set partitioningin hierarchical trees (SPIHT) encoding[10] 2011 Thresholding of nearly-perfect reconstructioncosine-modulated filter banks coefficients, followedby dynamic uniform quantization and run lengthencoding[18] 2011 Clustering of EEG channels, followed by regularand differential embedded zero-tree wavelet encod-ing1.7 Compressed SensingRecent research has demonstrated the use of compressed sensing (CS) asan alternative compression scheme for physiological signals in the context ofWBSNs [36]. CS is a novel paradigm which allows sampling of the signals ata sub-Nyquist rate. After acquiring the raw data, CS obtains a much smallernumber of samples by taking linear projections of the raw data. This is asimple operation which can be done at a low energy cost. The reconstructionof the data is computationally complex and is done at the server node [15].In a nutshell, if a signal can be sparsely represented by some dictio-9nary (i.e., if it can be well represented by a small number of transformcoefficients), then a small number of random linear projections (roughlyproportional to the information rate of the signal) is sufficient to recover thesignal exactly. To reconstruct the signal, we use greedy algorithms or anoptimization-based approach. The CS theory also extends to compressiblesignals (where the signal has many very small coefficients in some dictio-nary), although in this case the reconstruction is not exact [14]. Chapter 2contains more details about CS and its underlying theory.1.8 MotivationCS is an interesting paradigm in the case of WBSN since it requires verysimple computations at the sensor node (non-adaptive random projections),i.e. the node where the signals are acquired and compressed. The recon-struction, which is computationally intensive, is shifted to the server node.In WBSNs, we generally place no limitation on the energy consumption(and computational power) of the server, whereas the sensor node is heavilyconstrained both in terms of energy available and computational power. Assuch, CS fully exploits the strengths of WBSNs.1.9 Literature ReviewThe first study that applied CS to EEG compression used the multiplemeasurements vectors (MMV) approach to jointly compress and reconstructthe signals of the different EEG channels (i.e. all channels are reconstructedsimultaneously) [5]. The obtained results were good (high compression ratio10for reasonable reconstruction error) but the experimental setup was suchthat the EEG signals used were coming from repeated trials (asking thepatient to repeat the same task many times and recording one EEG channeleach time). This setup increases the coherence in the signals (asking someoneto carry out the same task is bound to result in EEG signals that are highlycoherent). This setting is of limited interest in telemedicine applications,since in these applications the patient is usually not prompted to act in acertain way or to repeat the same task multiple times.Other works have looked at CS approaches to compress EEG signals. In[50], the authors developed a scheme using Slepian functions. However, itis necessary to know the Slepian coefficients prior to compression, which isundesirable from a computational point of view. Another work used a finiterate of innovation approach to compressively sample EEG signals [48]. Themajor drawback of this method is the fact that the innovation rate must beknown, a task which can be difficult and impractical.For telemedicine applications, the first study that addressed the poten-tial of CS in EEG signal compression is found in [1] and [2]. This workfocused on surveying existing sparsifying dictionaries and reconstruction al-gorithms, and testing different combinations of these elements to determinewhich yielded the best results. The conclusion was that the applicability ofsingle-channel CS for EEG signals depends on the final application and thetolerable reconstruction error. This work therefore did not look into novelCS frameworks for WBSNs.More recently, Independent Component Analysis (ICA) was appplied asa preprocessing step before using CS for the compression of EEG signals11of newborn babies [39]. The compression results obtained were superior toother state-of-the-art methods that do not employ ICA preprocessing. Thissystem however consumes much energy at the sensor node and would notbe suitable for telemedicine applications. This is because the ICA algorithmused is computationally intensive, and such an operation must be carried atthe sensor node.1.10 Contributions of this ResearchThe above studies have resulted in some important questions: (i) Whatenergy savings can be realized through the use of CS for EEG WBSNapplications? (ii) Is it possible to exploit both the temporal correlations(intra-correlations) and the spatial correlations (inter-correlations betweenchannels) to increase the compression performance of CS? (iii) How does CScompare with other state-of-the-art compression algorithms for EEG com-pression in WBSNs, for different types of EEG signals? (iv) What is theimpact of CS compression on the performance of a practical application thatuses EEG signals?The main contributions of this thesis are:? To propose novel CS frameworks that fully take advantage of the in-herent structure present in EEG signals (both temporal and spatialcorrelations) to improve the compression performance.? To compare CS frameworks with other state-of-the-art compressionframeworks for EEG compression in WBSNs. It is also the first study12where different types of EEG signals representing a variety of appli-cations are used to test the performance of the proposed and existingframeworks, thus providing a more robust answer to the usefulnessand validity of the systems.? To apply a CS framework to compress signals in the context of aBCI application and to evaluate its impact on the performance of thesystem.1.11 NotationBefore getting started, let us first briefly introduce the notation used through-out this thesis:? Regular letters (either lowercase or uppercase) represent scalar num-bers and variables;? Bold, lowercase letters represent column vectors;? Bold, uppercase letters represent matrices;? ? ? ?p corresponds to the `p norm of a vector, computed as follows:?x?p =N?i=0|xi|pwhere x is a vector of length N .? (?)T corresponds to the transpose operation.131.12 Organization of the ThesisThis thesis is organized as follows. Chapter 2 gives an overview of the the-ory underlying CS. Chapters 3-5 each present a different version of a CSframework for EEG compression in the context of WBSNs. Chapter 3 in-troduces a simple CS framework that provides the basis we will build uponand improve in the next 2 chapters. Chapter 4 improves the reconstruc-tion performance of the simple framework by using an energy-efficient ICApreprocessing algorithm to exploit the interchannel correlations. Chapter 5proposes the first complete, practical framework for an EEG-based WBSNsystem. Chapter 6 applies the framework of Chapter 5 to a simple BCIsystem to test the classification performance of such a system when EEGsignals are compressed. Finally, Chapter 7 concludes this thesis by summa-rizing the contributions of this research and by offering possible extensionsof our work and paths to explore.14Chapter 2Compressed SensingThis chapter briefly discusses the key theoretical concepts behind com-pressed sensing (CS). There is a very large body of literature around thetheory of compressed sensing, and it is not our aim to cover it entirely.Given that this thesis is focused on applying compressed sensing to a prac-tical problem, we focus on the basics that will allow the reader to quicklygrasp how CS works. This chapter is organized as follows. Section 2.1 dis-cusses what is meant by the sparsity of a signal. Section 2.2 explains theacquisition and reconstruction framework for CS. Section 2.3 discusses thenecessary conditions on the measurement matrix so that reconstruction ispossible, and Section 2.4 extends the CS theory to compressible signals. Fi-nally, Section 2.5 introduces an alternative formulation for the CS problemand Section 2.6 clears up a few points about the two CS formulations.2.1 Signal SparsityCS exploits the fact that most signals have a sparse representation in somedictionary. Denoting this dictionary by the matrix ?N?K = [?1,?2, . . . ,?K ],we can write an original one-dimensional signal f of length N as15f = ?c =K?i=1ci?i. (2.1)When the K ? 1 vector c has a large number of zero (or small, insignifi-cant) coefficients, f can be obtained from c using few dictionary vectors ?i.The number of nonzero elements of c is called the sparsity of f . If thereare S such elements, it is said that c is the S-sparse representation of f indictionary ?. ? is also called the synthesis operator.2.2 Signal Acquisition and ReconstructionThe compressed sensing theory implies that instead of acquiring the N sam-ples of the signal f and then compressing it, it is possible to only acquireM samples, where M is slightly larger than S but is still much smaller thanN . As mentioned above, S is the smallest number of vector elements inthe dictionary ? that represent the signal f , so that all information in f iscaptured.This sampling paradigm - referred to as ?analog CS? - is the ultimategoal of CS. However, it cannot yet be attained by present day samplingtechnologies. At present, to represent f using M samples, all the N samplesare collected, discretized and then subsampled. The subsampling is carriedout using M linear projections of f . That is, CS subsamples f using anM?N measurement matrix1 ?, with M  N , thus creating a measurementvector y of length M :1also referred to as the sensing matrix16yM?1 = ?f = ??c. (2.2)This system of equations is largely underdetermined. That is, given ay vector, there is an infinite number of solutions for f (or equivalently, c).However, since the signal we wish to recover (f) is sparse, the correct solu-tion is often the sparsest solution. This corresponds to solving the followingl0 optimization problem:minc?c?0 subject to y = ??c. (2.3)where the `0 pseudonorm ? ? ?0 is the number of non-zero elements in agiven vector. Unfortunately, this problem is NP hard, and as such it is nottractable. Indeed, solving this problem requires an exhaustive search overall subsets of columns of ?, which is a combinatorial problem [14].Fortunately, an `1 optimization problem, a more practical problem dueto its convexity, was shown to be equivalent under some conditions. It canbe rewritten as follows:minc?c?1 subject to y = ??c. (2.4)Equation 2.4 is called the synthesis prior formulation for CS. This prob-lem can be recast as a linear programming one, for which many practicalsolvers exist. It has also been shown that perfect recovery can be achieved,even when a small number of measurements (i.e. M  N) is used [19].Perfect recovery is possible if the restricted isometry property (RIP) of ??17is satisfied. The RIPp is defined as follows:(1? ?S)?x?2p ? ???x?2p ? (1 + ?S)?x?2p (2.5)where ?S < 1, x is any S-sparse vector and p is the chosen norm (typically1 or 2). When p = 2, there is an intuitive geometric explanation of the RIP.Indeed, the RIP means that S-sparse vectors are not in the null space of?? and that S-sparse linear combinations of columns of ?? behave closeto an orthonormal system [15].2.3 Incoherent MeasurementsThe minimum acceptable value for M that allows the perfect reconstructionof the signal f is not only linked to the degree of sparsity S of f in dictionary? but also to ?, the degree of coherence between ? and ?. This coherencemeasures the largest correlation between any element of ? and ? and ismeasured by?(?,?) =?N ? max1?l,j?N|??l,?j?|. (2.6)The number of measurements M is given byM ? C ? ?2(?,?) ? S ? logN (2.7)where C is a positive constant [15]. Thus, the smaller the coherence, thesmaller the value of M can be. As such, it is important to select ? and ?so that they are maximally incoherent.18Ideally, one should not need to know the sparsifying dictionary ? in orderto pick a measurement matrix ?. Fortunately, some measurement matricescan be shown to be maximally incoherent with any sparsifying dictionarywith overwhelming probability. Random matrices have this property. In-deed, a matrix ? generated by independent and identically distributed (IID)Gaussian random variables or by IID Bernoulli random variables would dis-play this property [14]. This means that a random measurement matrix thatis properly constructed can allow perfect reconstruction without having anyknowledge about the original signal.2.4 Extension to Compressible SignalsCS can further be extended to compressible signals (signals that are notpurely sparse in a given dictionary but whose coefficients c decay with apower law when arranged in descending order). This setting is more realisticin practice, as real signals are rarely purely sparse but are often compressible.This is indeed the case for EEG signals.Suppose that we are interested in recovering the P largest coefficientsof the compressible signal f (where P  N). That is, we want to recoverthe indices as well as the magnitudes of the P largest values of c. Supposethat the number of collected measurements (M) is equal to P , i.e. onlyP  N random projections of the signal are collected. If the indices (i.e.the locations) of these P largest coefficients of the vector c are known, thenoptimal performance could be achieved, i.e. it is possible to exactly recoverthe magnitude of each of these P largest coefficients. This means that it is19possible to reconstruct c (or equivalently f) with an accuracy correspondingto its P largest coefficients. Now, it can be shown that CS can asymptoticallyobtain the same accuracy as this optimal solution, as long as the numberof random projections is increased by a factor of O(log(NP))[14]. Thisresult is spectacular, as it means that compressed sensing?s non-adaptivesampling scheme performs as well as a purely adaptive scheme that knowsthe locations of the P largest coefficients. In other words, CS is able to findthe P largest coefficients without any knowledge of the signal. The only costis a mild oversampling factor.2.5 Analysis Prior FormulationIn section 2.2, we introduced the synthesis prior formulation for CS recon-struction:minc?c?1 subject to y = ??c.where f = ?c. In this formulation, the objective was to solve for thesparse coefficients c. This formulation is the most popular in the CS fieldand has been studied extensively.There is an alternative formulation, where the objective is to solve for fsuch that ?f is sparse. This formulation is referred to as the analysis priorformulation:minf??f?1 subject to y = ?f . (2.8)where the dictionary ? is called the analysis operator. This formulationhas only started to be studied recently and its surrounding body of literature20is smaller than that of its synthesis counterpart.The analysis and synthesis prior formulations are equivalent only whenthe sparsifying dictionaries are orthogonal (i.e. ? = ??1). When the dictio-naries are tight frames, then both the synthesis and analysis problems canbe solved, although they do not yield the same result. Two frames are saidto be tight if they satisfy a generalized version of Parseval?s theorem:K?i=1(g ??i)2 = A?g?22for all vectors g of length N , with 0 < A <?. In this case, ? = ?T .In our work, we use the synthesis prior formulation. This is primarilybecause the underlying theory is better known, many practical algorithmscan solve this problem and previous work has focused on finding good dic-tionaries ? for EEG signals. On the other hand, there are few algorithmsthat can solve the analysis problem (Eq. 2.8) and no previous work has at-tempted to find a good analysis dictionary ? for EEG signals. We note thatthe analysis prior formulation may yield better results than the synthesisprior formulation.212.6 A Cautionary Note on the Analysis andSynthesis OperatorsIn sections 2.2 and 2.5, we introduced the notion of synthesis and analysisoperators, which were used in solving the CS problem:Synthesis: f = ?cAnalysis: c? = ?f (2.9)where f is the original signal, c and c? are the coefficients in the transformdomain, ? is the synthesis operator and ? is the analysis operator. Knowl-edge of the synthesis and analysis operators allow us to solve the synthesisand analysis prior problems, respectively.When ? and ? are tight frames (i.e. ? = ?T ), Eq. 2.9 becomes:Synthesis: f = ?cAnalysis: c = ?Tf (2.10)That is, the knowledge of either ? or ? is enough to solve both thesynthesis and the analysis problems. However, if we do not have tight frames,we must explicitly know ? and ? to solve the respective problem.In most practical instances, both the analysis and synthesis operators areknown. When only the analysis operator ? is known, the synthesis operatorcan be easily found if ? is orthogonal or a tight frame. The case for whenonly the synthesis operator is known is analogous.22If only the analysis operator ? is known and it is not orthogonal ora tight frame, then we cannot solve the synthesis prior problem since thesynthesis operator ? is not known. However, if we only know the synthesisoperator ?, then the synthesis problem can be solved provided the specificalgorithm used in solving it does not require the knowledge of ?.23Chapter 3A Simple CompressedSensing Framework for theCompression of EEG Signals3.1 Problem DescriptionAs mentioned in the introduction, compressed sensing (CS) is an interestingparadigm in the case of EEG-based WBSNs since the computational loadat the sensor node is very light - the only operation that is carried out is totake non-adaptive random projections of the EEG signals. As such, CS hasthe potential to offer an energy-efficient compression scheme in this context.While a few studies have looked into CS for EEG-based systems (referto Section 1.9 in the introduction), most of them haven?t considered it froma telemedicine perspective. Similarly, there has been little focus to optimizethe different CS building blocks for an EEG telemedicine application.In this chapter, we propose a simple CS framework that is suitable fora low-energy, power-efficient telemedicine application. To do so, we adaptthe typical CS scheme to reduce the load at the sensor node and use a24L inear Measurement Matrix  ( ? )C om pressionE poch ingP reprocessing W irel ess T ra nsm issionContinuous E E G  Stream F Y(a)CS Reconstruction A lgorith mReconstructionFrec?(b)Figure 3.1: Block diagrams of (a) the sensor node and (b) the server nodefor the simple compressed sensing frameworkstate-of-the-art reconstruction strategy.This chapter is organized as follows. Section 3.2 presents the differentbuilding blocks of the framework. Section 3.3 shows the performance of thesystem through different experiments. Section 3.4 discusses the obtainedresults.3.2 MethodsBelow we present the different blocks that make up the system. We willdiscuss the preprocessing, the compression, the wireless transmission andthe reconstruction. A block diagram of the proposed system is shown inFig. PreprocessingThe EEG data is formed of C signals, where C represents the number ofchannels (sensors). The signal from each channel is first divided into non-overlapping line segments (epochs) of length N . In our experiments, Ncorresponded to 512 samples for each channel. Note that our frameworkoperates on data from one epoch at a time. For C channels of EEG data,after dividing the data into epochs a total of C sequences of N = 512 datapoints each are obtained for each segment of time. These are representedby f1,f2, . . . ,fC . This forms a matrix F . Each column of F contains oneof the channels: FN?C = [f1|f2| . . . |fC ].3.2.2 CompressionTo compress the EEG signals contained in one epoch, we take their linearrandom projections. As mentioned in Section 2.3, the chosen measurementmatrix ? must be maximally incoherent with the sparsifying dictionary.With overwhelming probability, random matrices satisfy this requirementirrespective of the dictionary used. The most often used matrices are ran-dom matrices with IID entries formed by sampling a Gaussian distribu-tion (N (0, 1/N)) or a Bernouilli distribution (with P (?i,j = +1/?N) =1/2, P (?i,j = ?1/?N) = 1/2). In fact, it can be proven that these twotypes of matrices (Gaussian and Bernouilli) are optimal. Unfortunately,these matrices are not suitable for WBSN applications. This is because gen-erating a Gaussian random matrix requires a Gaussian random generator atthe sensor node (which cannot be efficiently implemented on simple sensor26hardware). Moreover, using such a matrix would lead to C large matrix-vector multiplications for each epoch (one for each of the C channels). Asthese multiplication operations are energy-intensive, they should be avoidedin WBSN applications. They are also time consuming, and would thereforeprevent the system from operating in near realtime mode.The use of a full Bernouilli matrix would reduce the challenges mentionedabove (it is easier to generate its random entries, and it also has simplermultiplication operations) but this unfortunately would still require a highnumber of computations.Instead, we use what is known as sparse binary sensing. This was firstproposed in [23] and has since been applied to WBSNs ([36, 63]). Thesematrices only contain d nonzero entries in each column, and the value ofthese nonzero entries is 1. The optimal value of d (that is, the smallestvalue for which reconstruction would be stable) is obtained experimentallyand is much smaller than M . While the theoretical guarantees are notas strong as those for full random matrices, it has been shown that thesematrices perform well in many practical applications. There are significantadvantages to using such a matrix in WBSNs. The most important one isthat the matrix multiplication operation is very simple: in fact, it consistsof (Nd?M) simple additions for each channel.We propose the use of the same sparse binary sensing matrix for eachchannel (sensor) so that we can more easily generate and/or store the matrixin memory at the sensor node. This setup also preserves the interchannelcorrelations, a feature which will be exploited in later chapters. We willtest 2 different settings: the first uses a fixed sensing matrix (stored in the27sensor node memory) for all epochs, and the second generates a new matrixfor each epoch.We apply the M ? N measurement matrix ? to each EEG channel inorder to obtain the compressed measurements: YM?C = [y1|y2| . . . |yC ] =[?f1|?f2| . . . |?fC ]3.2.3 Wireless TransmissionThe compressed measurements Y are then wirelessly transmitted from thesensor node to the server node. Note that we make no attempt to modelthe wireless channel and instead treat it as a black box.3.2.4 ReconstructionThe final step is to reconstruct the original EEG signals from the compressedmeasurements Y . The vast majority of reconstruction algorithms use adictionary in which the signals are sparse (or at least compressible).Sparsifying Dictionary (?)As discussed previously, one of the main elements of compressed sensingis the selection of a proper sparsifying dictionary ?. Different sparsifyingdictionaries were tested in [2] but there was no aim at optimizing the chosendictionaries to obtain the best performance possible (that is, to obtain thedictionary in which EEG signals are the most sparse). Previous work hasshown that EEG signals are sparse in the Gabor domain [5].The Gabor dictionary is a redundant dictionary which provides opti-mal joint time-frequency resolution [22]. Gabor functions are sinusoidally-28modulated Gaussian functions. The atoms in this dictionary are given bygi(n;n0, f0, s) = K(n0, f0, s) ? exp(?(n? n0)22s2)? sin(2pi ? f0(n? n0))where n0 and f0 are the centers of the Gabor atom, s is the spread ofthe atom, and K(n0, f0, s) is a normalizing constant.We now require a discretization over the n0, f0 and s parameters - thatis, we need to determine how to increment these parameters. For s, thechosen discretization is a dyadic scale (base 2). The time increment n0is proportional to the spread, and the frequency increment f0 is inverselyproportional to the spread. The size of the dictionary depends on the lengthof the EEG epoch considered in the time domain. To obtain the frequencystep and the time step, we rely on the following equations:?f0 =?8pi?sN?n0 = sN ??2?piwhere N is the epoch length and ? = 0.5ln(0.5(B+1/B)). B is the baseused (2 in our case).These equations are based on the distance metric for Gabor atoms pro-posed in [8]. We can show that selecting n0 and f0 in this manner providesa dictionary with optimal distance between the atoms.29Reconstruction AlgorithmThere exists a multitude of reconstruction algorithms for reconstructing theoriginal signals. It is possible to use convex optimization to solve (2.4). It isalso possible to use a greedy algorithm, which looks at finding a suboptimalsolution to (2.3) by iteratively selecting the locally optimal solution, in thehope of getting closer to the globally optimal solution. Such algorithms con-verge to an acceptable solution faster than convex optimization algorithms.This is however at the expense of requiring slightly more measurements.In our framework, we use a convex optimization algorithm, the BasisPursuit Denoise algorithm implemented in the SPGL1 package [60]. Basedon our tests, this algorithm required fewer measurements (i.e. a smallervalue for M) than greedy algorithms to achieve an equivalent reconstructionaccuracy; this results in a higher compression as a smaller M can be used.After performing the reconstruction on each channel data (i.e. on eachcolumn of Y ), we obtain the N ?C matrix of reconstructed signals: Frec =[f1|f2| . . . |fC ].3.3 Results3.3.1 DatasetIn order to assess the performance of the simple framework, we used the datafrom Dataset 1 of the BCI Competition IV [12]. This dataset was recordedfrom healthy subjects or generated artificially and contains the data of 7patients. The recording was made using 59 EEG channels per subject at30an initial sampling rate of 1000Hz. We resampled the data to 128Hz soas to provide a realistic sampling frequency in the context of telemedicine.Non-overlapping windows of length N = 512 were used in our experiments.The experiments were carried using 100 randomly extracted windows.3.3.2 Performance MetricsTo quantify the compression performance, we used the compression ratio(CR), defined as follows:CR =NM(3.1)To test the reconstruction quality, we used the normalized mean squareerror (NMSE), defined as follows:NMSE(x,y) =?x? y?2?x? ?x?2(3.2)where x is the original vector, y is the reconstructed vector and ?x is themean of x.The NMSE measures the distance between 2 vectors. Of course, thelower the NMSE, the better the reconstruction. Note that in our formula,we normalize by the de-meaned original signal so that differences in meansbetween datasets do not bias the results.3.3.3 Choice of the CS Measurement MatrixAs mentioned in section 3.2.2, we employ sparse binary sensing where themeasurement matrix ? contains d nonzero entries in each column. There312 4 6 8 10 12 1400. of nonzero elements (d) in measurement matrixNMSE  CR = 8:1CR = 6:1CR = 5:1CR = 4:1CR = 3:1CR = 2:1Figure 3.2: NMSE vs d for different compression ratiosare no theoretical guidelines for choosing the optimal value of d; we thusdetermine it experimentally. To do so, we vary the value of d and carry outthe reconstruction for various compression ratios. The results are shown inFig. 3.2.As seen from this figure, the NMSE saturates relatively fast, which isa desirable property. Indeed, once the number of nonzero entries in eachcolumn reaches 8, increasing its value does not yield better reconstruction.As such, we select d = 8 in our implementation. Of course, in terms ofenergy consumption, it is desirable to have the lowest value for d, since thisresults in fewer random linear projections (and thus fewer computations)at the sensor node. Looking at Fig. 3.2, we can see that we could use asfew as 4 nonzero entries per column for the measurement matrix. However,because improvements (albeit small) are still possible for d > 4, we decided322:14:16:18: RatioNMSE  Gaussian MatrixBernouilli MatrixSparse Matrix (d = 8)Fixed Sparse Matrix (d = 8)Figure 3.3: NMSE vs CR for different measurement matricesto use d = 8 instead of 4 for our experiments.We then set out to verify the performance of this sub-optimal mea-surement matrix by comparing the reconstruction performance with thatobtained using an optimal random matrix (Gaussian or Bernouilli). Westudy the reconstruction error for different compression ratios using 4 differ-ent matrices: (1) an optimal Gaussian random matrix, in which each entryis formed by sampling an IID Gaussian random variable, (2) an optimalBernouilli random matrix, in which each entry is formed by sampling anIID Bernouilli random variable, (3) a sparse binary sensing matrix (withd = 8) that is different for every epoch analyzed, and (4) a fixed sparse bi-nary sensing matrix with (d = 8) i.e. the same matrix is used for all epochs.The results are shown in Fig. 3.3.33As seen from Fig. 3.3, although the theoretical guarantees for the sub-optimal sparse binary sensing matrices are weaker than for the optimalGaussian and Bernouilli matrices, in practice their performance is statisti-cally almost the same. We can thus safely assume that using a sub-optimalmeasurement matrix yields near-optimal reconstruction results. Of equalinterest is that the fixed sparse binary sensing matrix does not result indegradation in the reconstruction quality. Because the proofs rely on thestochasticity of the matrices used, it is not possible to prove this resulttheoretically. The use of fixed sparse binary measurement matrices has sig-nificant advantages in the context of WBSNs since we can generate such amatrix offline and then store it in memory at the sensor node. The 2 otheralternatives are 1) generating a new sparse binary sensing matrix at thesensor node for each epoch, or 2) generating the matrix at the server nodeand then wirelessly transmitting the positions of the nonzero entries to thesensor node. Both alternatives are more energy-hungry and are thus lessdesirable.3.3.4 Reconstruction PerformanceWe then dive deeper into the reconstruction performance of the simpleframework when using a fixed binary sparse sensing matrix with d = 8.The reconstruction performance for different compression ratios is shown inTable 3.1. Fig. 3.4 shows an example of the original and reconstructed EEGsignals at different compression ratios so that one can empirically visualizethe impact of compression on the EEG signals.34Table 3.1: Reconstruction Performance of the Simple Framework under Dif-ferent Compression RatiosCR Mean NMSE ? STD8:1 0.386 ? 0.1876:1 0.290 ? 0.1625:1 0.239 ? 0.1454:1 0.182 ? 0.1173.5:1 0.152 ? 0.1013:1 0.119 ? 0.0832.5:1 0.090 ? 0.0652:1 0.056 ? 0.0423.4 DiscussionIn this chapter, we adapted the standard, simple CS framework to reducethe load at the sensor node. This was done by using a suitable measurementmatrix that requires a small amount of computations. We showed that usingsuch a matrix did not have an impact on the performance of the system ascompared to using theoretically optimal measurement matrices. We alsodeveloped a sparsifying dictionary specific to EEG signals.While this simple framework provides an interesting solution to the EEG-based WBSN application, it suffers from one major drawback: it does notexploit the spatial correlations (i.e. the correlations between EEG chan-nels). It is quite likely that exploiting such correlations could improve thereconstruction performance of the CS framework. This will be investigatedin the next 2 chapters.350 1 2 3 40100200 CR = 8, NMSE = 0.3500 1 2 3 40100200 CR = 6, NMSE = 0.3490 1 2 3 40100200 CR = 5, NMSE = 0.2840 1 2 3 40100200 CR = 4, NMSE = 0.1760 1 2 3 40100200 CR = 3.5, NMSE = 0.1600 1 2 3 40100200 CR = 3, NMSE = 0.0830 1 2 3 40100200 CR = 2.5, NMSE = 0.074Time (s)Amplitude (?V)0 1 2 3 40100200 CR = 2, NMSE = 0.046Time (s)Amplitude (?V)  OriginalReconstructedFigure 3.4: Original and reconstructed EEG signals (one channel) at differ-ent compression ratios36Chapter 4Compressed Sensing andEnergy-Efficient IndependentComponent AnalysisPreprocessing for theCompression of EEG Signals4.1 Problem DescriptionIn this chapter, we improve the framework presented in the previous chap-ter by adding a preprocessing step that exploits the interchannel correla-tions. This is achieved through the use of Independent Component Analysis(ICA). Therefore, instead of compressing the original EEG signals, the inde-pendent components obtained after applying ICA to the original signals arecompressed and wirelessly transmitted. This chapter discusses the develop-ment of an efficient ICA algorithm that requires low computational energy.By using the CS scheme presented in the previous chapter along with the37proposed ICA algorithm, it is shown that the proposed framework achievessimilar EEG compression results as those achieved by the state-of-the-artmethod proposed in [39] but is more energy-efficient at the sensor node.This chapter is organized as follows. Section 4.2 presents the differentbuilding blocks of the proposed framework while Section 4.3 briefly intro-duces the state-of-the-art framework. Section 4.4 compares the performanceof both frameworks through different experiments. Section 4.5 discusses theobtained results.4.2 MethodsThe proposed framework for compressing EEG signals is similar to the onepresented in Chapter 3, with one addition: there is an ICA preprocessingstep applied prior to compression. Compressed sensing is thus applied tothe derived independent components directly. Instead of compressing theEEG signals F as in the previous chapter, we compress S, the independentcomponents: Y = ?S. After CS reconstruction, we reconstruct the EEGsignals from the independent components by using the ICA mixing matrix.A block diagram of the proposed system is shown in Fig. 4.1.Since most of the framework is the same as for the one presented in theprevious chapter, we will not go into the details of the framework again. Werefer the reader to Section 3.2 for a detailed explanation. Instead, we focuson the energy-efficient ICA preprocessing step.38L inear Measurement Matrix  ( ? )C om pressionE poch ingP reprocessing W irel ess T ra nsm issionContinuous E E G  Stream I CAF S Y(a)CS Reconstruction A lgorith mReconstructionFrec? I CA  ReconstructionA(b)Figure 4.1: Block diagrams of (a) the sensor node and (b) the server nodefor the compressed sensing framework with ICA preprocessing4.2.1 Energy-Efficient ICAICA has been around for close to two decades now and was proposed asa solution to the blind source separation problem. The solution to theproblem is found by enforcing statistical independence of the source signals,commonly through maximizing the non-Gaussianity of the signals or throughminimizing the mutual information of the signals [29]. ICA has also beensuccessfully applied to EEG signals (see, for example, the study conductedby Makeig et al. [35]). The underlying basis as to why ICA works in the EEGcase is that the electrical scalp potential measured by an EEG electrode canbe seen as a mixture of a smaller number of ?sources? located in the brainthat give rise to these potentials.The ICA problem can be formulated as F = AS, where F is a matrix39containing the measured mixed signals (each column containing one mixedsignal), A is the mixing matrix, and S is a matrix containing the indepen-dent components (one source per column). The task is to find A and S fromthe observable measurements F . Note that in our case, the mixed signalsF correspond to the raw EEG signals.FastICA (FICA) is a popular algorithm that can solve this problem [35].Summarizing the FICA algorithm:1. Preprocessing : Whiten (decorrelate) the mixed signals.2. Iteration: In a deflationary manner (i.e. one at a time), estimateeach independent component (IC) by maximizing its non-Gaussianitythrough a contrast function. Using Gram-Schmidt, orthogonalize thefound IC with respect to the previously found ICs, and normalize it.Repeat this stage until the component converges.However, this algorithm is computationally intensive and thus not suit-able for WBSN applications. Acharyya et al. proposed an algorithm and anenergy-efficient architecture to calculate n-dimensional (nD) cross-products,and mentioned that one potential application is FICA [3]. After identifyingn ? 1 components with FICA, the nth component can simply be identifiedby taking the cross-product of the first n?1 components since at that point,the direction for maximal independence has already been determined. Wecall this method xFICA. This allows the saving of one full iteration cycle,which is not negligible.404.3 State-of-the-Art SystemBesides comparing the proposed ICA-augmented framework with the simpleframework of Chapter 3, we also compare it with the state-of-the-art systemfrom the literature proposed in [39]. The block diagram of this system is thesame as for ours, although different components and algorithms are used.Second Order Blind Identification (SOBI) is used as the ICA algorithm.The measurement matrix ? is the typical Gaussian random matrix. It ismentioned that the sparsifying dictionary is a Gabor dictionary, and thereconstruction algorithm used is iterative hard thresholding. However, dueto a lack of details about the parameters used for these last two elements,we used our own Gabor dictionary in combination with Basis Pursuit.4.4 Results4.4.1 Data UsedWe randomly selected 100 epochs of N = 512 samples each from dataset #1 of BCI Competition IV [12]. We resampled the data to 128Hz and selected12 channels in the sensorimotor area of the cortex out of the 59 channelsavailable. The selected channels were F1, FZ, F2, FC3, FC1, FCZ, FC2,FC4, C3, C1, CZ and C2.4.4.2 Reconstruction PerformanceIn this experiment, we wish to test the compression and reconstruction per-formance of 1) the simple framework of Chapter 3, 2) the proposed method41and 3) the method presented in [39]. We thus vary the number of indepen-dent components retained as well as the number of measurements M and wecompute the reconstruction error, in terms of the NMSE, against the com-pression ratio (again defined as CR = N/M). To keep the comparison fair,we must include the mixing matrix entries in the total number of measure-ments when ICA is used, and we must ensure that the same total numberof measurements is used for all methods (thus ensuring a constant compres-sion ratio). To select I independent components, we first apply PrincipalComponent Analysis to the EEG data so as to only keep the I componentsthat account for the most variance in the original data.The results are shown in Fig. 4.2. In Fig. 4.3, we show a slice from Fig.4.2 by selecting a compression ratio of 3:1 and showing the reconstructionerror for each block when we vary the number of independent components.As can be seen from Fig. 4.2, adding an ICA preprocessing step de-creases the NMSE for a fixed compression ratio (or, alternatively, it allowsfor an increase in the compression ratio for a fixed NMSE), with our methodand the state-of-the-art yielding similar results. In Fig. 4.3, we can see thatexcept for a few epochs for 8 ICs, using an ICA preprocessing step system-atically yields better results than using CS alone. Of course, selecting fewerindependent components has advantages both in terms of compression ratioand speed (since fewer sources need to be reconstructed). Our experimentsalso suggest that even if we use a small number of ICs (e.g. 3 or 4), weare still able to accurately represent the original signal. However, one mustbe careful to keep enough ICs so that the variance of the data is preserved.Indeed, in our experiments we observed that keeping less than 3 ICs made42246800. ICs6 ICs4 ICs3 ICsCompression Ratio (N/M)NMSE  Our methodMethod from [39]Regular CSFigure 4.2: NMSE as a function of the compression ratio for regular CS,state-of-the-art from [39], and the proposed method.it impossible to reconstruct the original signal faithfully.4.4.3 Energy AnalysisWe now wish to compare the energy performance at the sensor node of thestate-of-the-art and proposed frameworks. We use the number of floatingpoint operations (FLOPS) as a measure analogous to energy consumption.The number of FLOPS required is a measure of the dynamic power con-sumption. Reducing that value has the effect of reducing the overall powerconsumption. We look at the measurement matrices and the ICA algorithmsused in both methods.Measurement Matrix We first compare the random Gaussian matrixused in [39] and our binary sparse sensing matrix. The number of FLOPS4320 40 60 80 10000. NumberNMSE  Our method (3 ICs)Our method (4 ICs)Our method (6 ICs)Our method (8 ICs)Regular CSFigure 4.3: NMSE for the 100 epochs (in decreasing order of magnitude)when the compression ratio is 3:1.(F) for both matrices is given by the following equations:FGM = (2MN ?M) ? IFSBM = (Nd?M) ? Iwhere FGM is the FLOPS count for the Gaussian matrix, FSBM the FLOPScount for the sparse binary matrix, N the epoch length (512), M the numberof measurements per IC, I the number of independent components retainedand d the number of nonzero elements in each column of ?.We compare them for different compression ratios and numbers of ICs.The results are shown in Table 4.1. As can be seen, using a sparse binary44Table 4.1: FLOPS Comparison for Measurement MatricesCR Number of ICs 3 4 6 82 FLOPS (Gaussian Matrix) 3.11 ? 106 3.09 ? 106 3.07 ? 106 3.04 ? 106FLOPS (Sparse Binary Matrix) 9.25 ? 103 1.34 ? 104 2.16 ? 104 2.98 ? 1043 FLOPS (Gaussian Matrix) 2.06 ? 106 2.05 ? 106 2.02 ? 106 2.00 ? 106FLOPS (Sparse Binary Matrix) 1.03 ? 104 1.44 ? 104 2.26 ? 104 3.08 ? 1044 FLOPS (Gaussian Matrix) 1.53 ? 106 1.52 ? 106 1.50 ? 106 1.47 ? 106FLOPS (Sparse Binary Matrix) 1.08 ? 104 1.49 ? 104 2.31 ? 104 3.13 ? 1046 FLOPS (Gaussian Matrix) 1.01 ? 106 9.98 ? 105 9.70 ? 105 9.49 ? 105FLOPS (Sparse Binary Matrix) 1.13 ? 104 1.54 ? 104 2.36 ? 104 3.18 ? 1048 FLOPS (Gaussian Matrix) 7.49 ? 105 7.37 ? 105 7.12 ? 105 6.87 ? 105FLOPS (Sparse Binary Matrix) 1.16 ? 104 1.57 ? 104 2.39 ? 104 3.21 ? 104matrix offers significant advantages, as it is between 1 and 2 order of mag-nitude more efficient than a random Gaussian matrix in terms of FLOPS.ICA Algorithm We then compare the state-of-the-art ICA used in [39]and xFICA. To do so, we modify the numerical complexity models of [34]and then calculate the number of FLOPS (F) required for both algorithms:FSOBI = DNC2/2 + 4C3/3 + (D ? 1)C3/2+ IOCSOBI ? I2[4I(D ? 1) + 17(D ? 1) + 4I + 75]/2FxFICA = NC2/2 + 4C3/3 + ICN+ IOCxFICA ? [2(I ? 1)(I ? 1 +N) + 5N(I ? 1)2/2]+ I + I(I ? 1)3where C is the number of channels (12), D the number of delay lags usedfor SOBI (we used D = 100, as recommended by the authors of [11]), andIOC{SOBI, xFICA} are the number of iterations for convergence (IOC) of eachalgorithm, respectively. To find the number of (IOC) for both algorithms,45Table 4.2: FLOPS Comparison Between xFICA and SOBINumber of ICs 3 4 6 8[39] IOC (SOBI) 3.56 4.62 6.42 7.63FLOPS (SOBI) 3.82 ? 106 3.90 ? 106 4.25 ? 106 4.98 ? 106Proposed IOC (xFICA) 22.36 38.26 77.74 118.68FLOPS (xFICA) 0.218 ? 106 0.623 ? 106 2.97 ? 106 8.40 ? 106% FLOPS Saved 94.29% 84.02% 30.28% -68.45%we took the mean number of iterations over the 100 signals. The results areshown in Table 4.2.Table 4.2 demonstrates that when the number of ICs is six or less, ourproposed xFICA requires significantly less FLOPS than the ICA used in[39]. As the number of ICs decreases, our method offers larger and largerenergy savings.4.5 DiscussionThis chapter addresses the problem of efficiently compressing EEG signalsin wireless body sensor network applications, where efficiency is measured interms of compression ratio, reconstruction accuracy, and energy consump-tion. It proposes the use of compressed sensing after preprocessing the rawdata with an energy-efficient independent component analysis method. ICAimproves the reconstruction accuracy by exploiting the interchannel corre-lations. It was demonstrated that our system provides significant energysavings as compared to the state-of-the-art method, which also uses an ICApreprocessing block. As well, for a fixed compression ratio, our systemachieves similar NMSE performance as the state-of-the-art method, which46is much better than that achieved by CS only.However, the approach suggested still suffers from two important draw-backs. The first one is that the proposed approach only works for a smallnumber of EEG channels. Also, despite the fact that xFICA is more energy-efficient than other ICA algorithms, the computational complexity incurredat the sensor node is still too high for practical systems. In general, it isquestionable as to whether ICA is a practical strategy in the context of WB-SNs. As such, in the next chapter we investigate another strategy to exploitthe interchannel correlations.47Chapter 5An Energy-Efficient,Complete CompressedSensing Framework for theCompression of EEG Signals5.1 Problem DescriptionIn this chapter, we build on the work of previous chapters and propose anovel CS framework that takes advantage of the inherent structure presentin EEG signals (both temporal and spatial correlations) to improve the com-pression performance. To the best of our knowledge, this is also the firsttime that CS frameworks are compared with other state-of-the-art compres-sion frameworks for EEG compression in WBSNs. It is also the first studywhere different types of EEG signals representing a variety of applicationsare used to test the performance of the proposed and existing frameworks,thus providing a more robust answer to the usefulness and validity of thesystems.48L inear Measurement Matrix  ( ? )C om pressionE poch ingP reprocessing W irel ess T ra nsm issionContinuous E E G  Stream F YMeans Removal I nterch annel Redundancy  Removal Scalar Q uantiz ationEncod ingE ntropy  Coding(a)CS Reconstruction A lgorith mReconstruction? P ostprocessingE ntropy  DecodingD ecod ingFrec(b)Figure 5.1: Block diagrams of (a) the sensor node and (b) the server nodefor the proposed CS-based frameworkThis chapter is organized as follows. Sections 5.2 and 5.3 describe ourframework and briefly introduce the benchmarking frameworks, respectively.Section 5.4 presents our results. Finally, section 5.5 provides a short analysis.5.2 MethodsThis section introduces the framework we developed to efficiently compressEEG signals using low energy consumption. Below we present the differentblocks and algorithms that make up our proposed system. We will discussthe preprocessing, the compression, the encoding, the wireless transmission,the decoding and the reconstruction. A block diagram of the proposedsystem is shown in Fig. 5.1.495.2.1 PreprocessingAs in the case of the simple framework of Chapter 3, the data is first epoched,with N = 512. For every epoch, the data is written in matrix form asFN?C = [f1|f2| . . . |fC ], where C is the number of channels and fi is theEEG data from one channel.The mean of each channel is then removed. The resulting matrix isF? = [f?1|f?2| . . . |f?C ]. The means will be added back in the reconstructionphase. Removing the means leads to a higher compression ratio becausethe interchannel redundancy removal module (as will be discussed later)performs better on de-meaned EEG signals. It also reduces the total rangeof the signals, making it easier to quantize and encode them.5.2.2 CompressionTo compress the de-meaned EEG signals contained in one epoch, we firsttake their linear random projections and then apply an interchannel redun-dancy removal module.Measurement Matrix (?)As for the simple framework of Chapter 3, we use a fixed sparse binarysensing matrix ? with d = 8 nonzero entries in each column. The samematrix ? is used for each channel. After applying the measurement matrixto the de-meaned EEG signals, we obtain the compressed measurements:YM?C = [y1|y2| . . . |yC ] = [?f?1|?f?2| . . . |?f?C ]50Interchannel redundancy removalBecause all EEG channels collect information related to the same physiolog-ical signal, there exist large interchannel correlations amongst them. Indeed,channels that are spatially close to one another tend to have signals thatare highly correlated. Because we use the same measurement matrix foreach channel, the linear projections of the signals of these channels are alsocorrelated. To remove the redundancies inherent in the multi-channel EEGacquisition, we compute the differences between the compressed values ofchannel pairs (i.e. between yi,yj for some index pairs (i, j) that we care-fully select). To select the best channel pairs (i.e. those that will lead tothe smallest differences), we run an algorithm that finds these pairs at theserver node. The server node transmits the channel pairs to the sensor node.These pairs are then used to compute the differences for the next epoch. Theredundancy removal algorithm is summarized in Fig. 5.2.The threshold T controls the minimum acceptable correlation betweenchannel pairs. That is, any channel pair whose correlation falls below Twill not be selected. If, for a certain value of T , C or more pairs havecorrelations above T and these channel pairs are non-redundant (i.e. theyare all linearly independent), T is in fact meaningless. However, if this isnot the case, T determines the number of channel pairs i.e. the number ofdifferences to be taken. We selected T = 0.6, as this was the value that gaveus the best results experimentally. Also note that the selected channel pairsare wirelessly sent to the sensor node in a ?predictive? manner - of course,the server node does not know what the compressed values of the current511. Compute R, the matrix of correlation coefficients of the current epoch.Each entry of R is calculated as follows:Ri,j =cov(yi,yj)?(yi)?(yj)where cov(a, b) is the covariance between a and b, and ?(a) is the stan-dard deviation of a.2. Subtract the identity matrix I from R (because the autocorrelations arenot meaningful).3. Define a loop index k that will be used to track the number of differencestaken. Let k = 1.4. Find Rmax, the maximum absolute value in matrix R ? I. The rowand column indices (ik, jk) corresponding to Rmax correspond to the 2channels used for the difference. Remove the channel pair (ik, jk) fromthe correlation pool.5. Check that the selected channel pair is not redundant (i.e. it is linearlyindependent from the previously selected pairs when we look at it as asystem of equations). If it is redundant, discard the current pair. If it isnot, keep the current pair and increment k by 1.6. Repeat steps 4 and 5 while k < C or while Rmax < T , where C is thenumber of channels and T is a user-defined threshold.7. If k < C, we have run out of good channel pairs and will thus needto pick individual channels (not a difference) to complete our system ofequations. Choose a ?pair? (ik, 0) such that channel ik is the channel withthe smallest variance that is linearly independent from the previouslyselected pairs. Increment k by 1. Repeat until k = C.8. Wirelessly transmit the selected channel pairs (ik, jk) , k = 1 to C to thesensor node. These pairs will be used to compute the differences in thenext epoch.Figure 5.2: The interchannel redundancy removal algorithm.epoch are before they are transmitted, and so the channel pairs are basedon the compressed values from previous epochs.We observed experimentally that for a given dataset, the best channel52pairs are mostly stable over time (i.e. the best pairs from one epoch to an-other are fairly constant). As such, there is no need to repeatedly transmitthe best channel pairs. In practice, the best approach might be to period-ically update the best channel pairs, in case there has been a shift in thesignals.The sensor node is only responsible for calculating the differences, whichis computationally inexpensive. They are calculated for k = 1 to C asfollows:y?k = yik ? sign(Rik,jk)yjk .After this stage, we obtain a matrix Y?M?C = [y?1|y?2| . . . |y?C ] whosecolumns are the difference signals.Fig. 5.3 shows the probability density functions (PDFs) of the originalmeasurement signals and of the difference signals, taken over 1000 randomwindows of length 512 of the datasets described at the start of Section 5.4.As can be seen, the pdf of the difference signals has the desirable propertyof being narrower than that of the original measurements. This makes iteasier to encode the values (it requires fewer bits to do so), which results ina higher compression ratio.5.2.3 EncodingAfter removing the interchannel redundancies, the signals are quantized andan entropy coding scheme is applied. The output of the encoding module isa stream of bits to be transmitted wirelessly.53?200 ?100 0 100 20000.0050.010.0150.02Signal Value (?V)Probability  Raw Compressed SignalsDifference SignalsFigure 5.3: pdf of the raw CS measurements and of the difference signalsFor quantization, the columns of Y? T are first stacked as a vector:y?MC?1 = vec(Y?T ) =[y?1(1)y?2(1) . . . y?C(1)y?1(2)y?2(2) . . . y?C(2) . . . . . . y?1(M)y?2(M) . . . y?C(M)]T .This type of vectorization ensures that the compressed samples of asingle channel are interleaved i.e. that they do not all come in one burst.After quantization, the resulting vector is y?q = Q(y?) where Q is the scalarquantization operator. Assuming that the original signals are representedusing 12 bits, we use a 15-bit quantization (to account for the increase insignal range caused by taking the linear projections).Entropy coding is then applied on y?q as this further reduces the numberof bits required to represent the signals. As seen in Fig. 5.3, the distribution54of the difference signals y? is definitely not uniform; in fact, it approximatelyhas the shape of a Gaussian. We exploit this fact to achieve greater com-pression. Huffman encoding is known to perform well in such cases, so itis chosen as the entropy coding strategy. The codebook is generated offlineusing the information from Fig. 5.3 and is then stored in memory at thesensor node. We thus obtain a vector y? = H(y?q) where H is the Huffmanencoding operator. Note that for a fixed value for M , the length of y? variesslightly from one epoch to another, depending on the encoding.5.2.4 Wireless TransmissionAfter compression and encoding, the EEG signals are wirelessly transmittedfrom the sensor node to the server node. As in [36], we assume that eachdata packet has a size of 127 bytes, of which 13 bytes are the MAC overhead.We simply form the data packets by splitting the vector y? in consecutivesegments of 114 bytes. Note that we make no further attempt to model thewireless channel and that we do not study different modulation schemes fortransmission.5.2.5 DecodingUpon receiving the encoded compressed EEG signals y? (assuming that weare dealing with a perfect channel), the server would first decode them.This is a straightforward decoding operation: y?q = H?1(y?) where H?1 isthe Huffman decoding operator. We then form an M ? C matrix: Y?q =[y?q1|y?q2| . . . |y?qC ].555.2.6 ReconstructionThe final step is to reconstruct the original EEG signals from the decodedmeasurements Y?q. Before applying the CS reconstruction algorithm, wefirst reverse the effect of the interchannel redundancy module to obtain Yq,the original compressed measurements (before their differences were taken).Once this is done, we can do CS reconstruction. As in Chapter 3, thesparsifying dictionary ? is a Gabor dictionary and the chosen reconstructionalgorithm is the Basis Pursuit Denoise algorithm implemented in the SPGL1package.PostprocessingThe final step is to add back the means of each EEG channel.5.3 State-of-the-Art SystemsA brief overview of the state-of-the-art systems that will be used to compareour results with is also given.Given that the problem under investigation (EEG compression in aWBSN setting) has only started to be studied recently, there is not a largebody of existing literature around it, and identifying a proper benchmarkis challenging. EEG compression has been studied quite extensively in thelast two decades but algorithms have not generally been designed for low en-ergy consumption or for implementation on simple hardware. As such, someframeworks can achieve high compression ratios, but they require too manycomputations or some operations that are too complex given our target sen-56sor hardware, making these frameworks prohibitive in WBSNs. Similarly,WBSNs started to gain momentum in the last decade but few of them haveaddressed their applications to EEG signals. As a result, there only existvery few papers that have explicitly studied the problem of EEG compres-sion for WBSNs. In order to identify state-of-the-art systems to comparethe performance of our proposed framework with, it is therefore necessary toextrapolate the results of this previous research in order to identify schemesthat can offer good performance in the context we are interested in.We use the following requirements for selecting state-of-the-art systemsto compare our system with:? low energy consumption: the system must not have a high computa-tional requirement at the sensor node in order to conserve energy;? simple sensor hardware: the system must be simple enough to operateusing inexpensive hardware;? high compression ratio: the achievable compression ratio must be highenough to justify the extra computations carried at the sensor node(as compared to wirelessly sending raw, uncompressed data).Based on these requirements, we have selected 2 state-of-the-art compres-sion methods to compare our proposed framework to. These are describedin some details below.5.3.1 JPEG2000-Based CompressionThe JPEG2000-based EEG compression scheme was proposed in [27]. Amongstthe non-CS methods, it is one of the simplest yet most effective compres-57Discrete F orw ard Wavelet TransformC om p ressionE poch ingP rep rocessing W irel ess T ra nsm issionContinuous E E G  Stream H ard Th resh olding Scalar Q uantiz ationE ncod ingE ntropy  Coding(a)I nverse Discrete Wavelet TransformReconstructionE ntropy  DecodingD ecod ingFrec(b)Figure 5.4: Block diagrams of (a) the sensor node and (b) the server nodefor the JPEG2000-based frameworksion schemes. It is a simple wavelet compression algorithm adapted fromthe JPEG2000 algorithm used for image compression. Its block diagram isshown in Fig. 5.4.The wavelet coefficients of the raw EEG signals are first computed usingthe Cohen-Daubechies-Feauveau (CDF) 9/7 discrete wavelet transform. De-pending on the desired compression ratio, a hard threshold is then applied sothat only the largest wavelet coefficients of the signal are kept. These largecoefficients are then encoded using arithmetic coding. At the server side, thereceived signals are decoded and then reconstructed by taking the inversewavelet transform. In our implementation, we used a decomposition level of7 for the wavelet transform. We determined this level experimentally, as itgave the lowest reconstruction error; furthermore, using more levels did notprovide an improvement in reconstruction accuracy.In contrast to CS schemes, this type of algorithm is adaptive in the58L inear Measurement Matrix  ( ? )C om pressionE poch ingP reprocessing W irel ess T ra nsm issionContinuous E E G  Stream Scalar Q uantiz ationEncod ing(a)CS Reconstruction A lgorith mReconstructionFrec?(b)Figure 5.5: Block diagrams of (a) the sensor node and (b) the server nodefor the BSBL frameworksense that it relies on the exact knowledge of the signal to find its largestcoefficients. Also, the bulk of the computations is done at the sensor node.We will discuss the implications of these later on.5.3.2 BSBL CS CompressionBlock-Sparse Bayesian Learning (BSBL) is a reconstruction method thatwas proposed in [64] and then applied to EEG signals in [63]. Its blockdiagram is shown in Fig. 5.5.The main difference between the BSBL framework and our framework isin how the reconstruction is carried. Whereas we assume that EEG signalsare sparse in a transform domain (Gabor frames in our case) and use thatinformation to reconstruct the original signals, the BSBL framework doesnot rely on this assumption. BSBL was first proposed to reconstruct signals59that have a block structure - that is, signals that have few blocks containingnonzero entries, and the remainder of the blocks containing only zeros. Itwas then experimentally shown that the BSBL scheme was effective in re-constructing signals even if their block partition boundaries are unknown orif they do not have any clear block structure. Raw EEG signals fall in thislast category.To carry out the reconstruction, the BSBL framework uses a ?sparsify-ing? matrix ? which is an inverse discrete cosine transform (DCT) operator(EEG signals are not sparse in the DCT basis but as mentioned previously,BSBL does not rely on sparsity). We used the Bounded-Optimization vari-ant of the BSBL family (BSBL-BO) with a block partition of 24. Theexperiments we carried found this value to be the optimal partition size.5.4 ResultsWe now introduce the experimental results. We start by presenting thedatasets used as well as defining the performance metrics selected. We thencarry out experiments to evaluate the energy consumption and reconstruc-tion accuracy.5.4.1 DatasetsIn order to assess the performance of the different algorithms on a widerange of EEG cases, we selected 3 different databases.The first one is Dataset 1 of the BCI Competition IV [12]. This datasetwas recorded from healthy subjects or generated artificially and contains60the data of 7 patients. The recording was made using 59 EEG channels persubject at an initial sampling rate of 1000Hz.The 2 other databases are from Physionet [24]. The first consists ofseizure data of 22 pediatric subjects. The recordings contain 21 EEG chan-nels at a sampling rate of 256Hz [51]. The second database is a collectionof 108 polysomnographic recordings of different sleep disorders monitoring.Each recording contains between 5 and 13 EEG channels sampled at 256Hz[57].For each dataset, we used a resolution of 12 bits for the original EEGsignals. Note that some files in these datasets contain dead channels (i.e.channels where the output is always 0). We removed these channels fromour analyis, as they provide no useful information. We also resampled thedata to 128Hz so as to provide a realistic sampling frequency in the contextof telemedicine as well as ensure uniformity between the datasets. Non-overlapping windows of length N = 512 were used in our experiments. Theexperiments were carried using 100 randomly extracted windows from eachdataset.5.4.2 Performance MetricsTo quantify the compression performance, we used the compression ratio(CR), which is redefined here as follows:CR =bb?(5.1)where b is the total number of bits in the original (uncompressed signal)61and b? is the total number of bits in the compressed signal.To test the reconstruction quality, we used the NMSE metric, as definedin Eq. EnergyWe study the energy performance at the sensor node of the two CS schemes(BSBL and the proposed framework) and the JPEG2000 scheme. As men-tioned previously, energy consumption is paramount in WBSNs due to thelimited battery life at the sensor node. We also note that energy consump-tion is only important at the sensor node. For the server node, we assumean unlimited energy supply. We study the three broad classes of energy atthe sensor node: 1) sensing energy (energy required to acquire and digitizethe EEG samples), 2) computing energy (energy required to carry out thecomputations at the sensors) and 3) transmission energy (energy requiredto wirelessly transmit the data to the server).After being acquired and digitized, the signals are compressed, encodedand transmitted. The sensing energy is constant for all schemes since allsamples must first be acquired. It was also shown in [62] that the sensingenergy is small (about an order of magnitude smaller) compared to the othertwo classes when using ultra-low-power sensors. As such, we can safely omitit in our analysis. For a fixed compression ratio, the transmission energyis the same for all schemes because the number of bits to be transmittedwould be the same. We thus focus our efforts on the computation energy,which is mainly the energy required for compression and encoding.We implemented the code in Embedded C and simulated it on a MICAZ62target platform using the AVRORA simulator ([58]). We evaluated theperformance based on the total cycle count, the run time and the energyconsumption. To estimate the transmission energy, we rely on the experi-mental work done in [36], in which it was calculated that the transmissionof each packet requires 524.72 ?J. The results are presented in the top partof Table 5.1. They refer to one epoch for one EEG channel so that the dif-ference in the number of channels in the different datasets does not impactthe presented results.63Table 5.1: Energy Consumption and Reconstruction Accuracy for all FrameworksCR 8:1 6:1 5:1 4:1 3.5:1 3:1 2.5:1 2:1Energy ConsumptionCycle CountProposed 502,200 507,690 512,448 519,402 524,526 531,480 542,757 557,477BSBL 429,486 429,486 429,486 429,486 429,486 429,486 429,486 429,486JPEG2000 4,908,731 4,908,367 4,908,055 4,907,535 4,907,171 4,906,703 4,905,975 4,904,831Run TimeProposed 68.11 ms 68.86 ms 69.50 ms 70.45 ms 71.14 ms 72.08 ms 73.61 ms 75.61 msBSBL 58.25 ms 58.25 ms 58.25 ms 58.25 ms 58.25 ms 58.25 ms 58.25 ms 58.25 msJPEG2000 665.8 ms 665.7 ms 665.7 ms 665.6 ms 665.6 ms 665.5 ms 665.4 ms 665.3 msComputation EnergyProposed 1.55 mJ 1.56 mJ 1.58 mJ 1.60 mJ 1.61 mJ 1.64 mJ 1.67 mJ 1.72 mJBSBL 1.32 mJ 1.32 mJ 1.32 mJ 1.32 mJ 1.32 mJ 1.32 mJ 1.32 mJ 1.32 mJJPEG2000 15.11 mJ 15.11 mJ 15.11 mJ 15.11 mJ 15.11 mJ 15.11 mJ 15.10 mJ 15.10 mJComputation +Transmission EnergyProposed 1.99 mJ 2.15 mJ 2.29 mJ 2.48 mJ 2.62 mJ 2.82 mJ 3.08 mJ 3.49 mJBSBL 1.76 mJ 1.91 mJ 2.03 mJ 2.20 mJ 2.33 mJ 2.50 mJ 2.73 mJ 3.09 mJJPEG2000 15.55 mJ 15.70 mJ 15.82 mJ 15.99 mJ 16.12 mJ 16.29 mJ 16.51 mJ 16.87 mJReconstruction AccuracyMean NMSE(BCI Dataset)Proposed 0.227 ? 0.147 0.145 ? 0.103 0.105 ? 0.077 0.065 ? 0.050 0.044 ? 0.036 0.025 ? 0.021 0.009 ? 0.009 0.001 ? 0.003BSBL 0.455 ? 0.205 0.312 ? 0.191 0.239 ? 0.159 0.180 ? 0.120 0.148 ? 0.103 0.114 ? 0.083 0.086 ? 0.063 0.059 ? 0.048JPEG2000 0.101 ? 0.079 0.079 ? 0.064 0.066 ? 0.055 0.053 ? 0.045 0.045 ? 0.040 0.037 ? 0.034 0.028 ? 0.026 0.018 ? 0.018Mean NMSE(Seizure Dataset)Proposed 0.552 ? 0.160 0.388 ? 0.148 0.298 ? 0.130 0.202 ? 0.107 0.149 ? 0.084 0.101 ? 0.067 0.056 ? 0.047 0.021 ? 0.025BSBL 0.607 ? 0.136 0.452 ? 0.144 0.371 ? 0.142 0.275 ? 0.127 0.234 ? 0.122 0.181 ? 0.106 0.140 ? 0.091 0.095 ? 0.070JPEG2000 0.159 ? 0.108 0.119 ? 0.098 0.097 ? 0.090 0.074 ? 0.080 0.062 ? 0.073 0.050 ? 0.065 0.038 ? 0.055 0.025 ? 0.042Mean NMSE(Sleep Dataset)Proposed 0.340 ? 0.148 0.192 ? 0.124 0.122 ? 0.101 0.062 ? 0.073 0.039 ? 0.057 0.022 ? 0.040 0.009 ? 0.020 0.003 ? 0.008BSBL 0.472 ? 0.158 0.300 ? 0.154 0.211 ? 0.125 0.134 ? 0.109 0.096 ? 0.092 0.066 ? 0.082 0.040 ? 0.059 0.020 ? 0.040JPEG2000 0.071 ? 0.061 0.043 ? 0.047 0.030 ? 0.040 0.018 ? 0.030 0.013 ? 0.025 0.009 ? 0.020 0.006 ? 0.014 0.003 ? 0.00964As expected, the CS-based schemes significantly outperform JPEG2000in every category. In fact, even if we use the highest compression ratio forJPEG2000, it will never qualify from an energy perspective when comparedwith CS. JPEG2000, being an adaptive scheme, relies on a high amountof computations in order to efficiently compress the data while the randomprojections of CS require a significantly smaller lower number of computa-tions. It is also interesting to note that small gains (decrease in the numberof cycles, run time and energy consumption) are obtained by the proposedframework when the compression ratio increases. This is due to a reductionin redundancy removal and encoding computations. We note that BSBL isslightly more energy-efficient than the proposed framework. However, as wewill see shortly that comes at the expense of a worse reconstruction accu-racy. We will argue that the small increase in energy consumption is wellworth it in this case.Another interesting consideration would be to look at the comparison be-tween sending compressed and uncompressed data. Sending uncompresseddata requires 3.53 mJ of energy per channel (which is the energy requiredfor wireless transmission, as no computations are carried out prior to trans-mission). We thus note that CS is more advantageous at any compressionratio, whereas JPEG2000 always consumes more energy than even sendinguncompressed data.5.4.4 Reconstruction PerformanceWe now look at the reconstruction performance of the three frameworks overthe three datasets. We look at both noiseless and noisy cases.65Noiseless CaseFor this experiment, we assume that the measurements are noise-free (thatis, the output of the entropy decoding block is the same as the output of theinterchannel redundancy removal module), and only look for the algorithms?ability to accurately reconstruct the original signals at different compressionratios. The reconstruction performance of the different frameworks over thethree datasets is shown at the bottom of Table 5.1.In terms of the NMSE, the proposed framework systematically outper-forms BSBL. As expected, the reconstruction accuracy of the JPEG2000framework is generally better than that of the CS-based frameworks, es-pecially at high compression ratios. This is not a surprise because theJPEG2000 framework adaptively encodes the EEG signals, using full knowl-edge about the structure of the signal when finding its largest coefficients. Incontrast, at the sensor nodes, the CS frameworks do not make any attemptat comprehending the signal at play; instead, they nonadaptively take a sub-set of linear random projections, and only rely on the signal structure on theserver side when carrying out the reconstruction. As such, it is obvious thatan adaptive algorithm will perform better than a nonadaptive one. However,it is interesting to note that as the compression ratio decreases, the gap inthe reconstruction error quickly shrinks. At lower compression ratios, theproposed framework can even outperform the adaptive JPEG2000 scheme.6610 15 20 25 3000.  Our frameworkBSBLJPEG2000Figure 5.6: Reconstruction performance under Gaussian noise for a fixedCR (2.5:1), with varying SNRNoisy CaseWe then show that CS-based schemes are more robust than JPEG2000 toGaussian noise and packet loss. Such studies have so far been omitted inEEG-based WBSN studies.Gaussian Noise For this experiment, we arbitrarily fix the compressionratio to 2.5:1, and vary the signal-to-noise ratio (SNR) by adding Gaussiannoise with varying standard deviations. The results are shown in Fig. 5.6.The JPEG2000 framework is the most affected by Gaussian noise, es-pecially when the SNR is low. Comparing our framework with BSBL, wenotice that BSBL performs better at low SNRs, whereas our frameworkperforms better for SNRs higher than 13dB.670 10 20 30 40 5000. Loss (%)NMSE  Our frameworkBSBLJPEG2000Figure 5.7: Reconstruction performance under packet loss for a fixed CR(2.5:1)Packet Loss We then test the impact of packet loss. Again, we select acompression ratio of 2.5:1 and vary the percentage of packets lost througha noisy channel. This is done by randomly choosing the lost packets whichthen cannot be used to reconstruct the original signal. The results are shownin Fig. 5.7.The important thing to note about this figure is the relationship be-tween reconstruction accuracy and packet loss. What we notice is that forboth CS-based methods, the slope of the line is much less steep than for theJPEG2000 framework. This property is desirable since it leads to a moregraceful degradation in signal quality. In fact, we can see that as soon aspacket loss happens, JPEG2000 becomes worse than the proposed frame-work. Similarly, when the percentage of packets lost go above 12%, BSBL68performs better than JPEG2000. Knowing that in WBSN the packet lossrate can be high, we can see that CS becomes an attractive solution evenfrom the perspective of reconstruction accuracy. It is also important to un-derstand the possible implications of packet loss for the different frameworks.Because JPEG2000 is adaptive and only sends the largest coefficients, werun the risk of losing the largest coefficients, and we have no control overthat - it is simply a matter of luck as to which coefficients are affected. Incontrast, because the CS measurement operator takes nonadaptive randomprojections, no resulting measurement bears more weight than another one.In a way, this provides a safety net: even if we have a noisy channel witha high packet loss rate, we know that in all cases the degradation in signalquality will be proportional to the noise in the channel, and won?t dependon the timing or location of these losses.5.5 DiscussionIn this chapter, we proposed a novel CS framework that exploits both thetemporal correlations and the spatial correlations to efficiently compressEEG signals in WBSNs. On the energy front, our proposed CS framework isbetween 5 and 8 times more energy-efficient than the JPEG2000 frameworkin terms of sensor computations and wireless transmission. We also showthat our method achieves a better reconstruction quality than the state-of-the art BSBL method which also uses CS. This was also the first study inwhich a wide range of EEG signals are used to validate the performance ofdifferent frameworks.69We wish to reiterate the advantages of using a CS framework in thecontext of WBSNs. Recall that the core purpose of WBSNs is to provide asimple and practical tool for people to participate in their own treatment. Assuch, energy consumption is paramount, and we also desire to have the sim-plest sensors possible, for affordability. What this means is that operationscarried out at the sensor node must be as basic as possible. This is exactlywhat CS does: the main operations at the sensor node consist of comput-ing random projections, which happen to be simple additions when usinga sparse binary sensing matrix. The computational complexity is shiftedto the server node, which has a lot more computing power than the sensornode. In that sense, a CS framework is an ideal solution that fully takesadvantage of the strengths and limitations of WBSNs.Furthermore, CS provides a simple and universal encoding scheme sincethe random measurement matrix ? is incoherent with any sparsifying basis? with overwhelming probability. What this means is that if for some reasonwe wish to change ? along the way, it does not require us to also change ?.Finally, CS measurements are robust to noise. The use of a larger numberof CS measurements will lead to a progressively better reconstruction. Sim-ilarly, losing a few measurements (e.g. due to wireless channel interference)leads to a progressive reconstruction degradation. This is in contrast witha wavelet compression scheme, where losing one of the large coefficients canhave an important impact on the quality of the reconstruction.70Chapter 6A Compressed SensingFramework for BCI Systemsin WBSNs6.1 Problem DescriptionBrain-Computer Interface (BCI) systems have received significant attentionin the last few decades. BCIs provide a direct pathway between the brainand an external interface such as a computer or a prosthesis. Such devicescan prove to be life-changing for disabled people by allowing them a morenatural interaction with their environment despite their physical limitations.One widely used brain signal in BCIs is the EEG. These signals are pop-ular because they provide high time resolution, a necessary feature to builda practical BCI [37]. A BCI would use the EEG signals to detect patternsassociated with a mental task performed by a person (such as attemptingto move a finger or visualizing some arithmetic task). The person woulduse such a simple mental task to operate a wheel chair, switch a light offor communicate with a caregiver, for instance. One popular mental task71used in BCIs is what is called motor imagery, in which the user performsan imagined motion (e.g. flexing the left or right hand). It has been shownthat motor imagery can be reliably identified when sensors (electrodes) arelocated over the sensorimotor area of the cortex [47].There are two main categories of motor imagery BCI systems: syn-chronous and asynchronous (also called ?self-paced?) systems. A synchronousBCI provides a time window during which the BCI can be operated (i.e. thesystem gives a cue to users to let them know when they can control it).The user?s mental commands must be sent during that time interval spec-ified by the system, and the brain output is ignored at all other times. Incontrast, a self-paced BCI allows a person to control a device or an objectwhenever they wish. The system must figure out whether the user is tryingto control the interface (called ?intentional control?) or not (called ?idle? or?no control?). This additional flexibility is of course desirable from a userexperience perspective but it comes at the cost of increased complexity. Inthis work, we focus on synchronous BCIs.In many cases, it is important to provide a BCI system that is mini-mally obtrusive to users and that allows them to carry on their daily tasksas normally as possible. This is when the concept of WBSN becomes inter-esting. We have shown in the past chapters that a CS framework providesan interesting solution in this setting.In this chapter, we study a practical application of the framework devel-oped in Chapter 5. We apply our framework to compress EEG signals thatare then (after decompression) used to control a BCI.This chapter is organized as follows. Section 6.2 presents the different72L inear Measurement Matrix  ( ? )C om pressionE poch ingP reprocessing W irel ess T ra nsm issionContinuous E E G  Stream F YMeans Removal I nterch annel Redundancy  Removal Scalar Q uantiz ationEncod ingE ntropy  Coding(a)CS Reconstruction A lgorith mReconstructionBCI  output? F eature E x traction F eature Selection ClassificationM a ch ine L ea rningE ntropy  DecodingD ecod ing(b)Figure 6.1: Block diagrams of (a) the sensor node and (b) the server nodefor the proposed CS-based BCI frameworkbuilding blocks of the framework. Section 6.3 shows the performance of thesystem through different experiments, and Section 6.4 discusses the obtainedresults.6.2 MethodsWe now present our proposed framework, shown in Fig. 6.1. The operationsat the sensor node and the reconstruction part are exactly the same as forthose of Chapter 5. As such, we refer the reader to Section 5.2 for furtherdetails. The only difference is that here we use an epoch length of N = 128.This is done to ensure that the BCI system can make classification decisionsat a practical rate. We could also afford to reduce the value of d from 8 to4.After the reconstruction stage, we carry out machine learning steps tobuild the BCI. The features are extracted based on the synchronous cues73given by the system. For each cue, we extract the window that starts 0.5safter the cue and of length 2s. This window has been shown to work wellwhen working with motor imagery EEG data. We then use Common Spa-tial Patterns (CSP), after first having bandpass filtered the selected databetween 8Hz and 35Hz. CSP is a well-known algorithm which looks for fea-tures that maximize the variance with respect to one class while minimizingit with respect to the other one [49]. We extract the log-variance of theCSPs to use as our features. The feature selection method we selected issimple: we select the 2m CSPs that best discriminate between classes (basedon their eigenvalues). We used m = 3. The selected features are then fed toa Linear Discriminant Analysis (LDA) classifier. LDA is a simple classifierthat is fast and that works well for BCIs [9].6.3 Results6.3.1 Data UsedWe used 2 publicly available motor imagery datasets to test the performanceof our proposed method:? BCI Competition IV, dataset # 1 [12]: This data was recorded fromhealthy subjects (subjects a, b, f and g) or generated artificially (sub-jects c, d and e). The initial recording was made using 59 EEG chan-nels per subject at a sampling rate of 1000Hz. The subjects performeda motor imagery task (imagining a left hand movement, a right handmovement or a foot movement). Only 2 of the 3 classes were selectedfor each subject. For each subject, there are 200 cues (100 for each74Table 6.1: Classification Performance for BCI Competition IV, dataset # 1CR 1:1 2:1 3:1 4:1 6:1 8:1Subject ?a? 70.0 63.2 68.4 61.9 55.1 56.3Subject ?b? 56.2 57.9 54.4 51.0 50.1 56.1Subject ?c? 64.6 52.6 53.2 56.5 56.5 52.1Subject ?d? 81.0 75.1 70.6 58.9 55.1 57.4Subject ?e? 87.3 87.5 86.0 77.9 72.1 64.6Subject ?f ? 72.5 72.1 69.2 62.2 63.6 55.4Subject ?g? 82.3 84.0 82.7 80.2 73.4 73.7Mean ? STD 73.4? 10.9 70.3? 13.1 69.2? 12.5 64.1? 10.9 60.8? 9.0 59.3? 7.4class). Out of the original 59 channels we selected 12 channels in thesensorimotor area of the cortex: F1, FZ, F2, FC3, FC1, FCZ, FC2,FC4, C3, C1, CZ, C2. We also resampled the data to 256Hz.? BCI Competition IV, dataset # 2a [42]: This data was recorded from9 subjects performing a motor imagery task (imagining one of 4 move-ments: left hand, right hand, both feet, tongue). We only look at leftand right hands (2 classes). For each subject, there are 72 instances ofeach class, for a total of 144 cues. There is a total of 22 EEG channels,and the sampling rate is 250Hz.6.3.2 Compression and Classification PerformanceWe first look at the classification performance when the system is subjectedto different compression ratios. We tested 6 different compression ratios:1:1 (uncompressed - this is the baseline), 2:1 (50% compression), 3:1 (67%compression), 4:1 (75% compression), 6:1 (83% compression) and 8:1 (87.5%compression). To obtain the classification performance, we used 10-foldcross-validation. The results are shown in Tables 6.1 and 6.2.75Table 6.2: Classification Performance for BCI Competition IV, dataset #2aCR 1:1 2:1 3:1 4:1 6:1 8:1Subject ?A01? 83.0 80.2 75.9 72.3 73.2 66.2Subject ?A02? 59.4 56.2 50.6 55.0 53.9 51.2Subject ?A03? 94.6 93.4 92.0 90.5 87.8 91.5Subject ?A04? 67.3 69.2 56.2 67.5 65.5 61.9Subject ?A05? 62.7 59.2 62.9 50.8 52.9 51.5Subject ?A06? 64.6 64.0 57.6 59.2 58.4 55.6Subject ?A07? 73.3 75.4 67.3 59.7 54.3 53.9Subject ?A08? 96.2 95.8 95.4 91.2 90.6 92.6Subject ?A09? 74.2 73.8 72.7 68.2 66.5 69.7Mean ? STD 75.0? 13.5 74.1? 13.9 70.1? 15.6 68.3? 14.4 67.0? 14.3 66.0? 16.1As expected, there is a degradation in performance when the CR isgreater than 1:1, i.e. when the data is compressed. However, we wouldargue that this degradation in performance is not significant.6.3.3 Energy AnalysisWe now turn our attention to the energy consumption of the proposed frame-work, as compared to that of sending uncompressed data. As mentionedpreviously, it is very important to use as little energy as possible at the sen-sor node since it is battery-powered. We place no constraint on the energysupply of the server node, as we assume it is wired.The energy analysis is carried out in a manner analogous to that ofSection 5.4.3. The results are presented in Table 6.3. They are normalizedover one epoch for one EEG channel so that the difference in the number ofchannels in each dataset does not impact the presented results.As can be seen from the results, interesting energy savings can be real-ized when using our proposed framework. This is especially true when the76Table 6.3: Energy Consumption Under Different Compression RatiosCRComputation Transmission Total EnergyEnergy Energy Energy Savings1:1 0 mJ 0.884 mJ 0.884 mJ -2:1 0.296 mJ 0.442 mJ 0.738 mJ 16.5 %3:1 0.259 mJ 0.295 mJ 0.553 mJ 37.4 %4:1 0.242 mJ 0.221 mJ 0.463 mJ 47.6 %6:1 0.227 mJ 0.147 mJ 0.374 mJ 57.7 %8:1 0.222 mJ 0.111 mJ 0.333 mJ 62.3 %number of channels increases and when the sampling rate is high since thetotal absolute energy consumption of the system is compounded by these 2factors.6.4 DiscussionIn this chapter, we applied our CS framework to efficiently compress EEGsignals in WBSNs, in the context of a BCI application. By providing asimple, nonadaptive compression scheme at the sensor node, CS offers anenergy-efficient solution to compress EEG signals in WBSNs. We verifiedthe performance of the proposed compression framework by testing the clas-sification performance of the BCI systems under different compression levelsand by looking at the overall energy consumption of the sensor node. Ourresults showed that a CS framework leads to important energy savings atthe cost of a lower classification accuracy.Using a compressed sensing framework for WBSN-based BCI systemsinvolves some relatively obvious tradeoffs. On the one hand, as the com-pression ratio increases, the classification accuracy decreases. However, at77the same time the total energy consumption also decreases. It is thus a de-sign choice to determine the appropriate compression ratio. We note that inWBSNs, energy consumption is an important consideration, which promptsus to say that in some situations, compression may be necessary.We also note that using more sophisticated machine learning techniquescould have improved the overall accuracy of our BCI system. Similarly,we could have optimized the parameters for each subject to obtain betterclassification performance. However, our aim was to obtain a simple systemthat worked well on the average case. In fact, we were only interested inthe comparative performance of the BCI system under different compressionratios.78Chapter 7Summary and Conclusions7.1 Main ResultsIn this thesis, we proposed novel, energy-efficient CS frameworks that takeadvantage of the inherent structure present in EEG signals (both temporaland spatial correlations) to efficiently compress these signals in WBSNs.In Chapter 3, we presented a simple CS-based framework that wouldbecome the basis for the bulk of the work in this thesis. We optimized theGabor sparsifying dictionary and demonstrated that using a fixed sparsebinary sensing matrix offered similar performances to optimal matrices whilerequiring far fewer computations. This framework has the advantage ofbeing simple but does not exploit the spatial correlations between EEGchannels.In Chapter 4, we added an energy-efficient ICA preprocessing block tothe simple CS framework to exploit the spatial correlations among EEGchannels. We showed that the proposed framework provides significant en-ergy savings as compared to the state-of-the-art method in [39]. As well, fora fixed compression ratio, our system achieves similar NMSE performanceas the state-of-the-art method, which is much better than that achieved bythe simple CS framework.79In Chapter 5, we proposed a complete, novel CS framework that exploitsboth the temporal correlations and the spatial correlations. On the energyfront, our proposed CS framework is up to 8 times more energy-efficientthan the JPEG2000 framework in terms of sensor computations and wirelesstransmission. The reconstruction accuracy of the JPEG2000 framework isgenerally better than that of the proposed framework but as the compressionratio decreases, the gap in the reconstruction error quickly shrinks. Atlower compression ratios, the proposed framework can even outperform theJPEG2000 scheme. We also showed that our method achieves a betterreconstruction quality than the state-of-the art BSBL method. This wasthe first study that compared CS frameworks with other state-of-the-artcompression frameworks for EEG compression in WBSNs. It was also thefirst study where different types of EEG signals representing a variety ofapplications were used to test the performance of the proposed and existingframeworks, thus providing a more robust answer to the usefulness andvalidity of the systems.Finally, in Chapter 6, we applied the framework of Chapter 5 to compressEEG signals in the context of a BCI application and evaluated its impacton the performance of the system.7.2 Future WorkThere are many possible directions for extending the work presented in thisthesis. It would be interesting to implement the CS framework in hardwareand test its reconstruction and energy performances in practice, under real-80life conditions (instead of relying on simulations and assumptions).It would also be interesting to look at CS reconstruction algorithmsthat can directly exploit the spatial correlations by jointly reconstructingthe channels. Along the same lines, it may be useful to look at the anal-ysis prior formulation for the CS reconstruction problem, as it could yieldan improvement in reconstruction accuracy. These two approaches havebeen gaining momentum recently and have the potential to improve our CSframework.Another avenue would be to use dictionary learning techniques to findadaptive dictionaries in which EEG signals are sparser than in our fixedoptimized Gabor dictionary. We also see potential in combining a dictio-nary learning approach with an improved reconstruction algorithm to builda structured sparsifying dictionary, whose structure can then be exploitedby the reconstruction algorithm to achieve reconstruction using fewer mea-surements.It might also be worth looking for an optimal quantization and encodingstrategy for CS measurements under a sparse binary sensing matrix. Whilethis problem has been investigated for more traditional CS matrices (see e.g.[25]), it has not been explored for sparse binary sensing matrices.Finally, on top of BCIs, other end-user systems (e.g. a seizure detectionsystem) that use the developed compression framework could be developedand investigated.81Bibliography[1] A. M. Abdulghani, A. J. Casson, and E. Rodriguez-Villegas. Quanti-fying the performance of compressive sensing on scalp EEG signals. In3rd International Symposium on Applied Sciences in Biomedical andCommunication Technologies (ISABEL), pages 1 ?5, 2010.[2] A. M. Abdulghani, A. J. Casson, and E. Rodriguez-Villegas. Com-pressive sensing scalp EEG signals: Implementations and practicalperformance. Medical and Biological Engineering and Computing,50(11):1137?1145, 2012.[3] A. Acharyya, K. Maharatna, and B. M. Al-Hashimi. Algorithm andarchitecture for n-D vector cross-product computation. IEEE Transac-tions on Signal Processing, 59(2):812?826, 2011.[4] G. Antoniol and P. Tonella. EEG data compression techniques. IEEETransactions on Biomedical Engineering, 44(2):105?114, 1997.[5] S. Aviyente. Compressed sensing framework for EEG compression. InIEEE/SP 14th Workshop on Statistical Signal Processing (SSP ?07),pages 181?184, 2007.82[6] World Bank. Health expenditure, total (% of GDP). http://data.worldbank.org/indicator/SH.XPD.TOTL.ZS. Accessed: 23/08/2013.[7] World Bank. Population ages 65 and above (% of total). http://data.worldbank.org/indicator/SP.POP.65UP.TO.ZS/countries. Ac-cessed: 23/08/2013.[8] M. Barwin?ski. Product-based metric for Gabor functions and its impli-cations for the matching pursuit analysis. Master?s thesis, Universityof Warsaw, Warsaw, Poland, 2004.[9] A. Bashashati, M. Fatourechi, R. K. Ward, and G. E. Birch. A surveyof signal processing algorithms in brain?computer interfaces based onelectrical brain signals. Journal of Neural Engineering, 4(2):R32, 2007.[10] C. Bazan-Prieto, J. Cardenas-Barrera, M. Blanco-Velasco, and F. Cruz-Roldan. Electroencephalographic compression based on modulated fil-ter banks and wavelet transform. In Annual International Conferenceof the IEEE Engineering in Medicine and Biology Society (EMBC ?11),pages 7067?7070, 2011.[11] A. Belouchrani, K. Abed-Meraim, J. F. Cardoso, and E. Moulines.Second-order blind separation of temporally correlated sources. In Pro-ceedings of the International Conference on Digital Signal Processing,pages 346?351, 1993.[12] B. Blankertz, G. Dornhege, M. Krauledat, K. Muller, and G. Curio.The non-invasive Berlin braincomputer interface: Fast acquisition of83effective performance in untrained subjects. NeuroImage, 37(2):539 ?550, 2007.[13] B. Blankertz, G. Dornhege, M. Krauledat, K. R. Muller, V. Kunzmann,F. Losch, and G. Curio. The Berlin brain-computer interface: EEG-based communication without subject training. IEEE Transactions onNeural Systems and Rehabilitation Engineering, 14(2):147?152, 2006.[14] E. J. Candes and T. Tao. Decoding by linear programming. IEEETransactions on Information Theory, 51(12):4203?4215, 2005.[15] E. J. Candes and M. B. Wakin. An introduction to compressive sam-pling. IEEE Signal Processing Magazine, 25(2):21?30, 2008.[16] J. L. Ca?rdenas-Barrera, J. V. Lorenzo-Ginori, and E. Rodr??guez-Valdivia. A wavelet-packets based algorithm for EEG signal compres-sion. Informatics for Health and Social Care, 29(1):15?27, 2004.[17] J. Dauwels, K. Srinivasan, M.R. Reddy, and A. Cichocki. Near-losslessmultichannel EEG compression based on matrix and tensor decomposi-tions. IEEE Journal of Biomedical and Health Informatics, 17(3):708?714, 2013.[18] V.R. Dehkordi, H. Daou, and F. Labeau. A channel differential EZWcoding scheme for EEG data compression. IEEE Transactions on In-formation Technology in Biomedicine, 15(6):831?838, 2011.[19] D. L. Donoho. Compressed sensing. IEEE Transactions on InformationTheory, 52(4):1289 ?1306, 2006.84[20] Emotiv. EPOC specifications. http://www.emotiv.com/epoc/download_specs.php, 2012. Accessed: 29/08/2013.[21] I. Feinberg, R. L. Koresko, and N. Heller. EEG sleep patterns as a func-tion of normal and pathological aging in man. Journal of PsychiatricResearch, 5(2):107?144, 1967.[22] D. Gabor. Theory of communication. Part 1: The analysis of informa-tion. Journal of the Institution of Electrical Engineers - Part III: Radioand Communication Engineering, 93(26):429?441, 1946.[23] A. Gilbert and P. Indyk. Sparse recovery using sparse matrices. Pro-ceedings of the IEEE, 98(6):937?947, 2010.[24] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. C.Ivanov, R. G. Mark, J. E. Mietus, G. B Moody, C.-K. Peng, and H. E.Stanley. Physiobank, PhysioToolkit, and PhysioNet: Components ofa new research resource for complex physiologic signals. Circulation,101(23):e215?e220, 2000.[25] C.S. Gunturk, M. Lammers, A. Powell, R. Saab, and O. Yilmaz. Sigmadelta quantization for compressed sensing. In 44th Annual Conferenceon Information Sciences and Systems (CISS ?10), pages 1?6, 2010.[26] H. Gu?rkan, U. Guz, and B. S. Yarman. EEG signal compression basedon classified signature and envelope vector sets. International Journalof Circuit Theory and Applications, 37(2):351?363, 2009.[27] G. Higgins, B. McGinley, M. Glavin, and E. Jones. Low power compres-85sion of EEG signals using JPEG2000. In 4th International Conferenceon Pervasive Computing Technologies for Healthcare, pages 1?4, 2010.[28] G. Higgins, B. McGinley, N. Walsh, M. Glavin, and E. Jones.Lossy compression of EEG signals using SPIHT. Electronics Letters,47(18):1017?1018, 2011.[29] A. Hyvarinen. Fast and robust fixed-point algorithms for indepen-dent component analysis. IEEE Transactions on Neural Networks,10(3):626?634, 1999.[30] Imec. Imec, Holst Centre and Panasonic present wireless low-poweractive-electrode EEG headset. http://www2.imec.be/be_en/press/imec-news/imeceeg2012.html, 2012. Accessed: 29/08/2013.[31] Interaxon. Muse: The brainwave sensing headband tech specsheet. http://www.interaxon.ca/muse/Muse_Tech_Spec_Sheet_2013.pdf, 2013. Accessed: 29/08/2013.[32] J. Jeong. EEG dynamics in patients with Alzheimer?s disease. ClinicalNeurophysiology, 115(7):1490?1505, 2004.[33] K. G. Jordan. Emergency EEG and continuous EEG monitoring inacute ischemic stroke. Journal of Clinical Neurophysiology, 21(5):341?352, 2004.[34] A. Kachenoura, L. Albera, L. Senhadji, and P. Comon. ICA: A potentialtool for BCI systems. IEEE Signal Processing Magazine, 25(1):57 ?68,2008.86[35] S. Makeig, A. J. Bell, T.-P. Jung, and T. J. Sejnowski. Independentcomponent analysis of electroencephalographic data. Advances in Neu-ral Information Processing Systems, pages 145?151, 1996.[36] H. Mamaghanian, N. Khaled, D. Atienza, and P. Vandergheynst. Com-pressed sensing for real-time energy-efficient ECG compression on wire-less body sensor nodes. IEEE Transactions on Biomedical Engineering,58(9):2456 ?2466, 2011.[37] D. J. McFarland and J. R. Wolpaw. Brain-computer interfaces forcommunication and control. Communications of the ACM, 54(5):60?66, 2011.[38] N. Memon, X. Kong, and J. Cinkler. Context-based lossless and near-lossless compression of EEG signals. IEEE Transactions on InformationTechnology in Biomedicine, 3(3):231?238, 1999.[39] B. Mijovic, V. Matic, M. De Vos, and S. Van Huffel. Independentcomponent analysis as a preprocessing step for data compression ofneonatal EEG. In 33rd Annual International Conference of the IEEEEngineering in Medicine and Biology Society (EMBS ?11), pages 7316?7319, 2011.[40] A. Milenkovic?, C. Otto, and E. Jovanov. Wireless sensor networks forpersonal health monitoring: Issues and an implementation. ComputerCommunications, 29(13-14):2521?2533, 2006.[41] H. R. Mohseni, A. Maghsoudi, and M. B. Shamsollahi. Seizure detec-tion in EEG signals: A comparison of different approaches. In 28th87Annual International Conference of the IEEE Engineering in Medicineand Biology Society (EMBS ?06), pages 6724?6727, 2006.[42] M. Naeem, C. Brunner, R. Leeb, B. Graimann, and G. Pfurtscheller.Separability of four-class motor imagery data using independent com-ponents analysis. Journal of Neural Engineering, 3(3):208?216, 2006.[43] Neurosky. What is Mindwave Mobile? http://www.neurosky.com/products/mindwavemobile.aspx, 2011. Accessed: 29/08/2013.[44] Public Health Agency of Canada. Chronic diseases ? most significantcause of death globally. http://www.phac-aspc.gc.ca/media/nr-rp/2011/2011_0919-bg-di-eng.php. Accessed: 23/08/2013.[45] World Health Organization. Ageing and life course. http://www.who.int/ageing/en/. Accessed: 23/08/2013.[46] World Health Organization. Global status report on noncommuni-cable diseases 2010. http://www.who.int/chp/ncd_global_status_report/en/. Accessed: 23/08/2013.[47] G. Pfurtscheller, C. Neuper, D. Flotzinger, and M. Pregenzer. EEG-based discrimination between imagination of right and left hand move-ment. Electroencephalography and Clinical Neurophysiology, 103(6):642? 651, 1997.[48] K.-K. Poh and P. Marziliano. Compressive sampling of EEG signalswith finite rate of innovation. EURASIP Journal on Advances in SignalProcessing, 2010(1):183105, 2010.88[49] H. Ramoser, J. Muller-Gerking, and G. Pfurtscheller. Optimal spatialfiltering of single trial EEG during imagined hand movement. IEEETransactions on Rehabilitation Engineering, 8(4):441?446, 2000.[50] S. Senay, L. F. Chaparro, M. Sun, and R. J. Sclabassi. Compressivesensing and random filtering of EEG signals using Slepian basis. Pro-ceedings of the EURASIP EUSIPCO, 8, 2008.[51] A. H. Shoeb. Application of Machine Learning to Epileptic SeizureOnset Detection and Treatment. PhD thesis, Massachusetts Instituteof Technology, 2009.[52] K. Srinivasan, J. Dauwels, and M. R. Reddy. A two-dimensional ap-proach for lossless EEG compression. Biomedical Signal Processing andControl, 6(4):387 ? 394, 2011.[53] K. Srinivasan, J. Dauwels, and M. R. Reddy. Multichannel EEG com-pression: Wavelet-based image and volumetric coding approach. IEEEJournal of Biomedical and Health Informatics, 17(1):113?120, 2013.[54] N. Sriraam. Neural network based near-lossless compression of EEGsignals with non uniform quantization. In 29th Annual InternationalConference of the IEEE Engineering in Medicine and Biology Society(EMBS ?07), pages 3236?3240, 2007.[55] N. Sriraam. A high-performance lossless compression scheme for EEGsignals using wavelet transform and neural network predictors. Inter-national Journal of Telemedicine and Applications, 2012:1?8, 2012.89[56] N. Sriraam and C. Eswaran. An adaptive error modeling scheme for thelossless compression of EEG signals. IEEE Transactions on InformationTechnology in Biomedicine, 12(5):587?594, 2008.[57] M. G. Terzano, L. Parrino, A. Sherieri, R. Chervin, S. Chokroverty,C. Guilleminault, M. Hirshkowitz, M. Mahowald, H. Moldofsky,A. Rosa, et al. Atlas, rules, and recording techniques for the scoringof cyclic alternating pattern (CAP) in human sleep. Sleep Medicine,2(6):537?553, 2001.[58] B. L. Titzer, D. K. Lee, and J. Palsberg. AVRORA: Scalable sensornetwork simulation with precise timing. In Fourth International Sympo-sium on Information Processing in Sensor Networks (IPSN ?05), pages477?482, 2005.[59] A. Upton and J. Gumpert. Electroencephalography in diagnosis ofherpes-simplex encephalitis. The Lancet, 295(7648):650 ? 652, 1970.[60] E. van den Berg and M. P. Friedlander. Probing the pareto frontierfor basis pursuit solutions. SIAM Journal on Scientific Computing,31(2):890?912, 2009.[61] Y. Wongsawat, S. Oraintara, T. Tanaka, and K.R. Rao. Lossless multi-channel EEG compression. In IEEE International Symposium on Cir-cuits and Systems (ISCAS ?06), pages 1614?1617, 2006.[62] R.F. Yazicioglu, T. Torfs, P. Merken, J. Penders, V. Leonov, V. Puers,B. Gyselinckx, and C. Van Hoof. Ultra-low-power biopotential inter-90faces and their applications in wearable and implantable systems. Mi-croelectronics Journal, 40(9):1313 ? 1321, 2009.[63] Z. Zhang, T. Jung, S. Makeig, and B. Rao. Compressed sensing ofEEG for wireless telemonitoring with low energy consumption and in-expensive hardware. IEEE Transactions on Biomedical Engineering,60(1):221?224, 2013.[64] Z. Zhang and B. D. Rao. Extension of SBL algorithms for the recoveryof block sparse signals with intra-block correlation. IEEE Transactionson Signal Processing, 61(8):2009?2015, 2013.91


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items