@prefix vivo: <http://vivoweb.org/ontology/core#> .
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
@prefix ns0: <https://open.library.ubc.ca/terms#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2009/08/skos-reference/skos.html#> .

<http://dx.doi.org/10.14288/1.0355539>
  vivo:departmentOrSchool "Applied Science, Faculty of"@en, "Electrical and Computer Engineering, Department of"@en ;
  edm:dataProvider "DSpace"@en ;
  ns0:degreeCampus "UBCV"@en ;
  dcterms:creator "Zou, Liang"@en ;
  dcterms:issued "2017-09-11T15:45:21Z"@en, "2017"@en ;
  vivo:relatedDegree "Doctor of Philosophy - PhD"@en ;
  ns0:degreeGrantor "University of British Columbia"@en ;
  dcterms:description """Blind Source Separation (BSS) methods have been attracting increasing attention for their promising applications in signal processing. Despite recent progress on the research of BSS, there are still remaining challenges. Specifically, this dissertation focuses on developing novel Underdetermined Blind Source Separation (UBSS) methods that can deal with several specific challenges in real applications, including limited number of observations, self/cross dependence information and source inference in the underdetermined case. First, by taking advantage of the
Noise Assisted Multivariate Empirical Mode Decomposition (NAMEMD) and Multiset Canonical Correlation Analysis (MCCA), we propose a novel BSS framework and apply it to extract the heart beat signal form noisy nano-sensor signals. Furthermore, we generalize the idea of (over)determined joint BSS to that of the underdetermined case. We explore the dependence information between two datasets and propose an underdetermined joint BSS method for two datasets, termed as UJBSS-2. In addition, by exploiting the cross correlation between each pair of datasets, we develop a novel and effective method to jointly estimate the mixing matrices
from multiple datasets, referred to as Underdetermined Joint Blind Source Separation for Multiple Datasets (UJBSS-M). In order to improve the time efficiency and relax the sparsity constraint, we recover the latent sources based on subspace representation when the mixing matrices are estimated. As an example application for noise enhanced signal processing, the proposed UJBSS-M method also can be utilized to solve the single-set UBSS problem when suitable noise is added to the observations. Finally, considering the recent increasing need for biomedical signal processing in the ambulatory environment, we propose a novel UBSS method for removing electromyogram (EMG) from Electroencephalography (EEG) signals. The proposed method for recovering the underlying sources is also applicable to other artifact removal problems. Simulation results demonstrate that the proposed methods yield superior performances over conventional approaches. We also evaluate the proposed methods on real physiological data, and the proposed methods are shown to effectively and efficiently recover the underlying sources."""@en ;
  edm:aggregatedCHO "https://circle.library.ubc.ca/rest/handle/2429/63013?expand=metadata"@en ;
  skos:note "Underdetermined Joint Blind Source Separation withApplication to Physiological DatabyLiang ZouB.Sc., Anhui University, 2010M.Sc., University of Science and Technology of China, 2013A THESIS SUBMITTED IN PARTIAL FULFILLMENTOF THE REQUIREMENTS FOR THE DEGREE OFDoctor of PhilosophyinTHE FACULTY OF GRADUATE AND POSTDOCTORALSTUDIES(Electrical and Computer Engineering)The University of British Columbia(Vancouver)September 2017c© Liang Zou, 2017AbstractBlind Source Separation (BSS) methods have been attracting increasing attentionfor their promising applications in signal processing. Despite recent progress onthe research of BSS, there are still remaining challenges. Specifically, this dis-sertation focuses on developing novel Underdetermined Blind Source Separation(UBSS) methods that can deal with several specific challenges in real applications,including limited number of observations, self/cross dependence information andsource inference in the underdetermined case. First, by taking advantage of theNoise Assisted Multivariate Empirical Mode Decomposition (NAMEMD) and Mul-tiset Canonical Correlation Analysis (MCCA), we propose a novel BSS frameworkand apply it to extract the heart beat signal form noisy nano-sensor signals. Further-more, we generalize the idea of (over)determined joint BSS to that of the underde-termined case. We explore the dependence information between two datasets andpropose an underdetermined joint BSS method for two datasets, termed as UJBSS-2. In addition, by exploiting the cross correlation between each pair of datasets,we develop a novel and effective method to jointly estimate the mixing matricesfrom multiple datasets, referred to as Underdetermined Joint Blind Source Sepa-ration for Multiple Datasets (UJBSS-M). In order to improve the time efficiencyand relax the sparsity constraint, we recover the latent sources based on subspacerepresentation when the mixing matrices are estimated. As an example applicationfor noise enhanced signal processing, the proposed UJBSS-M method also can beutilized to solve the single-set UBSS problem when suitable noise is added to theobservations. Finally, considering the recent increasing need for biomedical sig-nal processing in the ambulatory environment, we propose a novel UBSS methodfor removing electromyogram (EMG) from Electroencephalography (EEG) signals.iiThe proposed method for recovering the underlying sources is also applicable toother artifact removal problems. Simulation results demonstrate that the proposedmethods yield superior performances over conventional approaches. We also eval-uate the proposed methods on real physiological data, and the proposed methodsare shown to effectively and efficiently recover the underlying sources.iiiLay SummaryBlind source separation aims to extract underlying sources when only the mixedobservations are available. In this dissertation, we propose a set of novel underde-termined BSS methods to jointly recover underlying sources from multiple datasetsin underdetermined cases (i.e. the number of sources is greater than that of obser-vations). First, we propose a novel BSS framework and apply it to extract the heartbeat signal form noisy nano-sensor signals. Taking into account the dependence in-formation between datasets, we further propose two Underdetermined Joint BlindSource Separation (UJBSS) methods, termed as UJBSS-2 and UJBSS-m, which areapplicable to two datasets and multiple datasets respectively. Finally, consideringthe recent increasing need for acquiring EEG signals with limited number of sen-sors, we propose a novel underdetermined BSS method for removing EMG artifactsfrom EEG signals. It is shown that our proposed methods outperform conventionalBSS methods on both synthetic data and real physiological signals.ivPrefaceThis dissertation is written based on a collection of manuscripts. The majority ofthe research, including literature study, algorithm development and implementa-tion, numerical studies and report writing, were conducted by the candidate, withsuggestions from Prof. Z. Jane Wang. The manuscripts were primarily drafted bythe candidate, with helpful revisions and comments from Prof. Z. Jane Wang (pa-pers in Chapter 2–5), Prof. Xun Chen (papers in Chapter 2–3) and Prof. PeymanServati (paper in Chapter 2).Chapter 2 is based on the following manuscripts:• L. Zou, X. Chen, A. Servati, S. Soltanian, P. Servati and Z. J. Wang, “ABlind Source Separation Framework for Monitoring Heart Beat Rate UsingNanofiber-Based Strain Sensors,” IEEE Sensors Journal, vol.16, pp. 762-772, 2015.• L. Zou, X. Chen, A. Servati, P. Servati, M. J. McKeown, “A heart beat ratedetection framework using multiple nanofiber sensor signals,” 2014 IEEEChina Summit International Conference on Signal and Information Process-ing (ChinaSIP), pp. 242-246, Xi’an, China, Jul. 2014.Chapter 3 is based on:• L. Zou, X. Chen and Z. J. Wang, “Underdetermined Joint Blind Source Sepa-ration for Two Datasets Based on Tensor Decomposition,” IEEE Signal Pro-cessing Letters. vol.23, pp. 673-677, 2016Chapter 4 is mainly based on the following manuscripts:v• L. Zou, X. Chen, X. Ji and Z. J. Wang, “Underdetermined Joint Blind SourceSeparation of Multiple Datasets,” IEEE Access, in press, 2017, doi: 10.1109/AC-CESS.2017.2695497.• L. Zou, Z. J. Wang, X. Chen and X. Ji, “Underdetermined joint blind sourceseparation based on tensor decomposition,” 2016 IEEE Canadian Confer-ence on Electrical and Computer Engineering (CCECE), pp. 1-4, Vancou-ver, Canada, May 2014.And finally, Chapter 5 is based on the following manuscript:• L. Zou and Z. J. Wang, “Removing Muscle Artifacts From EEG Data Basedon a Novel Underdetermined Blind Source Separation Method,” in submit-ting to IEEE Signal Processing Letters.Other publications related to my PhD work are:• L. Zou, J. Zheng and Z. J. Wang, “3D CNN for Automatic Diagnosis ofADHD Using functional and structural MRI,” submitted to IEEE ACCESS.• Y. Hu, L. Zou, X. Huang and X. Lu, “Detection and quantifcation of oal con-tent in ground beef meat using vibrational spectroscopic-based chemometricanalysis,” submitted to Scientifc Reports.• J. Zheng, L. Zou and Z. J. Wang, “Mid-Level Deep Food Part Mining forFood Image Recognition,” submitted to IET Computer Vision.viTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiLay Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiGlossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviii1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Introduction to Blind Source Separation and Its Applications . . . 11.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.2.1 (Over)Determined Blind Source Separation vs. Underde-termined Blind source Separation . . . . . . . . . . . . . 61.2.2 Single Set Blind Source Separation vs. Joint Blind SourceSeparation . . . . . . . . . . . . . . . . . . . . . . . . . 91.2.3 Instantaneous Blind Source Separation vs. ConvolutionalBlind Source Separation . . . . . . . . . . . . . . . . . . 111.3 Research Objectives and Methodologies . . . . . . . . . . . . . . 13vii1.4 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 A Blind Source Separation Framework Based on NAMEMD andMCCA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192.1 Motivation and Objectives . . . . . . . . . . . . . . . . . . . . . 202.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242.2.1 Related Methods . . . . . . . . . . . . . . . . . . . . . . 242.2.2 Proposed Framework . . . . . . . . . . . . . . . . . . . . 282.2.3 Data Description . . . . . . . . . . . . . . . . . . . . . . 322.3 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . 372.3.1 Simulation Study . . . . . . . . . . . . . . . . . . . . . . 372.3.2 Real Data Study . . . . . . . . . . . . . . . . . . . . . . 392.4 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473 Underdetermined Joint Blind Source Separation for 2 datasets basedon Tensor Decomposition . . . . . . . . . . . . . . . . . . . . . . . . 493.1 Motivation and Objectives . . . . . . . . . . . . . . . . . . . . . 503.2 Problem Formulation and Proposed Method . . . . . . . . . . . . 513.2.1 Signal Generation Model and Problem Statement . . . . . 513.2.2 Mixing Matrices Estimation . . . . . . . . . . . . . . . . 523.2.3 Source Extraction Based on The Estimated Mixing Matrices 543.3 Numerical Study . . . . . . . . . . . . . . . . . . . . . . . . . . 563.3.1 Simulation 1: Audio Signals . . . . . . . . . . . . . . . . 573.3.2 Simulation 2: Physiological Signals . . . . . . . . . . . . 603.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614 Underdetermined Joint Blind Source Separation of Multiple Datasets 634.1 Motivation and Objectives . . . . . . . . . . . . . . . . . . . . . 644.2 Notations and Preliminaries . . . . . . . . . . . . . . . . . . . . . 674.3 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . 694.4 Canonical Polyadic Decomposition of Tensor . . . . . . . . . . . 714.5 Algorithm for Estimating the Mixing Matrices in UJBSS . . . . . 734.5.1 Tensor Construction . . . . . . . . . . . . . . . . . . . . 74viii4.5.2 Joint Tensor Polyadic Decomposition . . . . . . . . . . . 764.6 Source Extraction Based on the Estimated Mixing Matrices . . . . 794.7 Numerical Study for the Multiple Dataset Case . . . . . . . . . . 834.7.1 Simulation 1: Audio Signals . . . . . . . . . . . . . . . . 844.7.2 Simulation 2: Physiological Signals . . . . . . . . . . . . 884.8 A Case Study: Solve A Single Set UBSS Problem Based on UJBSS-m . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 904.9 Conclusions and Discussion . . . . . . . . . . . . . . . . . . . . 945 Removing Muscle Artifacts from EEG Data via UnderdeterminedBlind Source Separation . . . . . . . . . . . . . . . . . . . . . . . . . 965.1 Motivation and Objectives . . . . . . . . . . . . . . . . . . . . . 975.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . 995.3 Proposed Method . . . . . . . . . . . . . . . . . . . . . . . . . . 1005.4 Data Generation and Performance Indices . . . . . . . . . . . . . 1045.5 Numerical Study for the Synthetic EEG Data . . . . . . . . . . . 1075.6 Conclusions and Discussion . . . . . . . . . . . . . . . . . . . . 1096 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . 1126.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1126.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1156.2.1 Multiple Datasets Generation . . . . . . . . . . . . . . . 1156.2.2 Estimate the Number of Source Signals in Determined andUnderdetermined BSS . . . . . . . . . . . . . . . . . . . 1166.2.3 Online Underdetermined BSS . . . . . . . . . . . . . . . 117Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119A Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132A.1 Proof for Proposition 1 . . . . . . . . . . . . . . . . . . . . . . . 133ixList of TablesTable 2.1 The estimated RHBR results when employing three differentmethods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45Table 3.1 PPMCC performance results in Simulation 1. . . . . . . . . . . 60Table 3.2 PPMCC performance results in Simulation 2. . . . . . . . . . . 61Table 4.1 PCC performance results in Simulation 1. . . . . . . . . . . . 89Table 4.2 PCC performance results in Simulation 2. . . . . . . . . . . . 93Table 5.1 Performance comparison between the proposed method and theother three methods (SOBIUM, UJBSS-m, EEMD-CCA) . . . 109xList of FiguresFigure 1.1 Illustration of the cocktail party problem. . . . . . . . . . . . 3Figure 1.2 Illustrative example of a scatter plot of the mixtures with thenumber of observations M=2 and the number of sources N =6.(a) is for all DFT coefficients, and (b) is for samples at singlesource points. . . . . . . . . . . . . . . . . . . . . . . . . . . 8Figure 1.3 Graphical comparison between (a) Group ICA and (b) Joint ICA. 11Figure 1.4 Graphical comparison between instantaneous BSS and convo-lutive BSS: (a) for instantaneous BSS without time delay and(b) for convolutive BSS with reverberation. . . . . . . . . . . 12Figure 1.5 The basic idea of UJBSS for multiple datasets. M is the numberof observations and N is the number of sources in each dataset.T represents the number of data samples, and K represents thenumber of datasets. r(k1,k2)(n) denotes the correlation betweenthe nth source in the kth1 dataset and that in the kth2 dataset. . . . 16Figure 1.6 The overview of the challenges, objectives and contributionsof this research work. . . . . . . . . . . . . . . . . . . . . . . 16Figure 2.1 Three groups of source signals and mixed signals. In each sub-figure, the horizontal axis represents the time index and thevertical axis represents the amplitude of the signal in microvolt. 32Figure 2.2 Photographs of NF sensors: (a) Dimensions of NF sensors;(b) Flexibility of NF sensors; (c) Three NF sensors used formonitoring the RHBR. . . . . . . . . . . . . . . . . . . . . . 35Figure 2.3 Illustration of the measurement system. . . . . . . . . . . . . 36xiFigure 2.4 Extracted pulse waves based on the synthetic data. . . . . . . 39Figure 2.5 The absolute correlation coefficients between the original PPGsignals and the extracted pulse waves on 100 synthetic datasets.The blue asterisks represent the averages and the red lines standfor the medians. The edges of the box are the lower and up-per quartiles. Outliers beyond the upper whisker or under thelower whiskers are indicated by the red plus signs. . . . . . . 40Figure 2.6 Time domain analysis of the NAMEMD results. In each sub-figure, the horizontal axis represents the time index and thevertical axis represents the amplitude of the signal. . . . . . . 41Figure 2.7 Frequency domain analysis of the NAMEMD results. . . . . . 42Figure 2.8 Frequency domain analysis of the EEMD results. . . . . . . . 43Figure 2.9 Extracted heart beat signals when employing three differentmethods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44Figure 3.1 Performance of the proposed UJBSS method. (a) Performancecomparisons between the proposed UJBSS method and thesingle-set SOBIUM method. Here the number of sources N= 6 and the number of observations M = 4. (b) Estimation er-ror of A[1] when employing the proposed UJBSS method. Herethe number of sources N = 8 and the number of observationsM varies from 4 to 8. Similar results are observed for A[2]. . . 58Figure 3.2 The original sources and extracted sources from the first dataset.(a) Simulation 1. (b) Simulation 2. The top subfigures are theoriginal sources and the bottom ones are the extracted sources.Similar results are observed for the second dataset. . . . . . . 59Figure 4.1 Illustration of how to generate tensors by incorporating the de-pendence information between each pair of datasets. . . . . . 74xiiFigure 4.2 Simulation 1: performance comparisons on audio signals whenusing the proposed UJBSS-m method and other UBSS meth-ods, including the single-set UBSS method SOBIUM [31] andthe UJBSS method for two dataset, i.e., UJBSS-2 [124]. Herethe number of sources N = 5 and the number of observationsM = 4. The number of time delays L = 20 and the step size oftime delays (i.e., τl−τl−1) is 2 data samples, corresponding to0.25ms. Similar results are observed for A(2) and A(3). . . . . 86Figure 4.3 Simulation 1: estimation error of A(1) when employing the pro-posed UJBSS method. Here the number of sources N = 8 andthe number of observations M varies from 4 to 7. The num-ber of time delays L = 20 and the step size of time delays (i.e.,τl− τl−1) is 2 data samples, corresponding to 0.25ms. . . . . . 87Figure 4.4 Simulation 1: an illustrative example from the proposed UJBSS-m method. First row: The original 4 sources; Second row: 3channels of the mixed observations; Third row: the recovered4 sources from the first dataset. . . . . . . . . . . . . . . . . . 88Figure 4.5 Simulation 2: performance comparisons on physiological sig-nals between the proposed UJBSS-m method and two othermethods (i.e., the single-set UBSS method SOBIUM [31] andUJBSS method for two dataset, UJBSS-2 [124]). Here thenumber of sources N = 4 and the number of observations M= 3. The number of time delays L = 20 and the step size oftime delays (i.e., τl − τl−1) is 2 data samples. Similar resultsare observed for A(2) and A(3). . . . . . . . . . . . . . . . . . 91Figure 4.6 Simulation 2: performance of the proposed UJBSS-m methodwhen the step size of time delays (i.e., τl − τl−1) varies from1 to 9. Here the number of sources N = 4 and the number ofobservations M = 3. The number of time delays L = 20. Similarresults are observed for A(2) and A(3). . . . . . . . . . . . . . 92xiiiFigure 4.7 The sum of the absolute correlation coefficients between therecovered sources and the original ones. The blue asterisksrepresent the averages and the red lines stand for the medians.The edges of the box are the lower and upper quartiles. . . . 94Figure 5.1 Estimation error of A with the change of signal-to-noise ratios.Here, the number of time delays L equals to 10, and the stepsize of time delays (i.e. τl−τl−1) is 1 data sample correspond-ing to 4 ms. . . . . . . . . . . . . . . . . . . . . . . . . . . . 110Figure 5.2 An illustrative example of the reconstructed EEG signals basedon the proposed EMG removal method. . . . . . . . . . . . . 111xivGlossaryAIC Akaike Information CriterionALS Alternating Least SquaresACC absolute correlation coefficientsANOVA Analysis of VarianceBSS Blind Source SeparationCANDECOMP Canonical DecompositionCCA Canonical Correlation AnalysisCPD Canonical Polyadic DecompositionEEG ElectroencephalographyEEMD Ensemble Empirical Mode DecomposingELS Enhanced Line SearchEMD Empirical Mode DecomposingECG electrocardiogramEMG electromyogramEOG electrooculographyFEEL Flexible Electronics and Energy LabxvFMRI Functional Magnetic Resonance ImagingGGD Generalized Gaussian DistributionHBR Heart Beat RateHOS high order statisticsICA Independent Component AnalysisIVA Independent Vector AnalysisIMFS Intrinsic Mode FunctionsJBSS Joint Blind Source SeparationMAC Mean of Absolute CorrelationMCCA Multiset Canonical Correlation AnalysisMDL Minimum Description LengthMEMD Multivariate Empirical Mode DecompositionMRI Magnetic Resonance ImagingNAMEMD Noise Assisted Multivariate Empirical Mode DecompositionNF nanofiberNCG nonlinear conjugate gradientPAN PolyacrylonitrilePARAFAC Parallel Factor AnalysisPDF probability Density FunctionPPG photoplethysmogramRHBR Resting heart beat rateRRMSE Relative Root Mean Squared ErrorxviSCA Sparse Component AnalysisSCICA Single-Channel Independent Component AnalysisSNRS signal-to-noise ratiosSSPS single source pointsSTFT Short-Time Fourier transformTF Time-FrequencyUBSS Underdetermined Blind Source SeparationUJBSS Underdetermined Joint Blind Source SeparationUJBSS-2 Underdetermined Joint Blind Source Separation for Two DatasetsUJBSS-M Underdetermined Joint Blind Source Separation for Multiple DatasetsWICA Wavelet-Independent Component AnalysisWVD Wigner-Ville distributionWT Wavelet TransformxviiAcknowledgmentsDoctoral studies are challenging. I would like to thank those who have helped andsupported me during these past years.Special thanks go to my supervisor Prof. Z. Jane Wang. Her guidance and pa-tience inspire me to move on not only in my PhD study but also in other aspects ofmy life. Without her generous support and wise suggestions, I would not have beenable to make it to this point. I also would like to thank Dr. Wang’s financial supportthrough Natural Sciences and Engineering Research Council (NSERC) grants.I own a debt of gratitude to my supervisory committee. Suggestions and inspi-ration from these great professors are valued most by a student like me, as someonewho is new to the research field. Particularly, I would like to thank Prof. MartinMcKeown for his valuable suggestions and efforts.I would like to thank my lab-mates, friends and coauthors, Prof. Yuhu Cheng,Prof. Yi Yang, Dr. Ying Ji, Dr. Chunsheng Zhu, Dr. Tingting Wang, KunzhongJian, Jiannan Zheng, Yiming Zhang, Huan Qi, Nandinee Haq, Pegah Kharazmi,Liang Zhao, Jiayue Cai, River Huang. I would also like to express special thanksto my senior lab-mates, including Dr. Xun Chen, Dr. Chen He and Dr. Aiping Liu.Without their professional suggestions and personal encouragement, I would neverhave grown up as fast as I did.I also would like to thank the China Scholarship Council for providing mefinancial support for the last four years.Lastly, I am forever thankful to my beloved parents. I deeply thank them fortheir love and support.xviiiChapter 1IntroductionOne of our most important faculties is our ability to listen to, andfollow, one speaker in the presence of others. — Colin CherryBlind Source Separation (BSS) refers to a general class of signal processing meth-ods, aiming to recover the underlying source signals from their mixtures withoutresorting to any prior information about the mixing matrix. In this chapter we willreview these state-of-the-art BSS methods. We begin by introducing the basic ideaof BSS and its applications. Considering different assumptions of BSS methods,the BSS problems can be categorized in different ways, such as (over)determinedvs. underdetermined BSS, single-set vs. joint BSS, and instantaneous vs. convo-lutional BSS. We further describe the structure of the dissertation at the end of thischapter.1.1 Introduction to Blind Source Separation and ItsApplicationsThe BSS problem appears in many multi-sensor systems, where each sensor con-tains the mixture of several underlying sources. For instances, an electrode forcollecting Electroencephalography (EEG) signals measures a weighted sum of the1electrical activities of many brain regions as well as a microphone measures soundsof different people/devices in the environment. A fundamental goal of BSS is torecover the underlying sources which usually provide important information butcannot be directly seen in the observed signals. The term ‘blind’ emphasizes thefact that BSS exploits only the information carried by the observations themselvesand does not resort to any prior information about the mixing matrix.The BSS problem can be mathematically formulated asX(t) = AS(t)+E(t), (1.1)where the observations are noisy instantaneous linear mixtures of the underlyingsource signals. In this context, X(t) = [x1(t),x2(t), . . . ,xM(t)]T denotes the M-dimensional observations and xm(t) is the mth channel of the observations. S(t) =[s1(t),s2(t), . . . ,sN(t)]T denotes the underlying N-dimensional sources and sn(t) isthe nth source. E(t) = [e1(t),e2(t), . . . ,eM(t)]T represents the additive noise addedinto each channel of the observations.A classical example of BSS is the cocktail party problem. At a cocktail party,a group of people talk at the same time over the background music. Human beingsare able to selectively focus on and recognize one auditory source in a noisy en-vironment, where their ears capture numerous audio sources, such as the voice ofinterest, the background music and many others. This refers to the cocktail partyeffect [10]. The main questions of relevance are: how does the human brain solvethe cocktail party problem, and is it possible to build a machine to fulfill this task?From an engineering perspective, separating the voice of each speaker can beperformed by using recordings of several microphones in the room, as illustratedin Fig. 1.1. In order to pick out each voice from the observed recordings, wecan perform BSS to recover the sources. Essentially, the BSS problem is a prob-2Figure 1.1: Illustration of the cocktail party problem.lem of parameter estimation. This kind of problem always contains the followingthree ingredients: a parametric model, a criterion and an optimization technique.The parametric model provides a simple representation of the mixing model andrestricts the solution to a particular space. Based on the assumption about the dis-tribution of sources, the criterion is a specific cost function, evaluating the qualityof the potential solutions. The optimization technique is the method for obtainingthe optimal solution with respect to the criterion function. Generally, for a BSSproblem, we can find a demixing matrix W to estimate the source signals asS(t) =WX(t). (1.2)3For successful blind source separation, the demixing matrix W satisfiesWA =ΦΛ, (1.3)where Φ is an (N×N)-dimensional permutation matrix with exactly one entry of 1in each row and each column, and 0s elsewhere. Λ is a diagonal matrix with non-zero diagonal elements λ1,λ2, . . . ,λN . This equation explicitly shows the typicalindeterminacies of permutation and scaling. However, it is clear that, without addi-tional assumptions, the BSS problem is still ill-posed, even if a scale-permutationambiguity is allowed. A commonly used BSS method is Independent ComponentAnalysis (ICA), which makes use of the following assumptions:1) The source signals S(t) = [s1(t),s2(t), . . . ,sN(t)]T are assumed to be statisti-cally independent. This assumption is critical for ICA and implies thatp(S) = p(s1,s2, . . . ,sN) = p(s1)p(s2) . . . p(sN), (1.4)where p(S) is the joint probability Density Function (PDF) of the sources S, andp(sn)(n = 1,2, . . . ,N) is the PDF of the nth channel of sources.2) At most one source can be Gaussian distributed. The linear mixtures ofGaussian signals are still Gaussian and therefore people cannot separate the sourcebased on the non-Gaussianity.Besides ICA, many BSS methods based on other properties of sources andmixing models have been proposed. For instance, Sparse Component Analysis(SCA) is a simple but powerful framework to separate source signals from a fewsensors [47] when the sources can be represented sparsely at a given basis. In theconvolutive BSS, the mixing model is much more complex, which assumes themixtures are weighted and delayed. Designing a BSS system is a challenging task4despite the fact that human ability to recover sources in noisy environments appearsa simple process for humans. The selection of a suitable separation method varieswith the assumed mixing process. In order to achieve satisfying results in real-lifeapplications, it is a prerequisite that [25]:1) The real sources satisfy the assumption of the proposed BSS method, suchas independence between sources, sparsity of the sources, non-negative nature ofthe sources, etc.;2) The mixing model of the proposed method is correct, which means that themixing model is identical or closely resembles to the physical model generatingthe observations.If the above conditions are not satisfied, e.g., a wrong mixing model or sources’distribution is utilized, the performance of the BSS method can not be guaranteed.BSS methods have been attracting increasing attention for their promising ap-plications in signal processing. Among the potential applications, four domainshave been studied intensively:1) audio source separation, such as audio signal enhancement by removingunwanted signal components [110];2) biomedical applications, such as denoising of EEG, Magnetic ResonanceImaging (MRI) and surface electromyogram (EMG) signals [80];3) communication applications, such as co-channel interference suppression inmultiple-antenna systems [25];4) surveillance applications, such as discovering interesting signals among aset of mixed signals [16].In addition, BSS methods can also be used in other domains, such as imageprocessing, watermarking, remote sensing, astrophysics and so on [24].51.2 Related WorkDepending on different assumptions of the existing BSS methods, the BSS prob-lem can be categorized in different ways, such as (over)determined vs. underdeter-mined BSS, single-set vs. joint BSS, and instantaneous vs. convolutive BSS. Thissection describes the related work in literature for each of these categories.1.2.1 (Over)Determined Blind Source Separation vs.Underdetermined Blind source SeparationIn order to select an appropriate BSS method, an important concern is the relation-ship between the number of observations and the number of underlying sources. Inthe (over)determined case where a sufficient number of observations are available(i.e., the number of observations M ≥ the number of sources N), the sources can beeffectively recovered without making strong assumptions about the sources and/orthe mixing model. A typical way to solve this type of BSS problems is ICA. A widevariety of ICA algorithms have been developed since the original work of Bell etc.Mixing matrices are optimally estimated based on different criteria, such as kurto-sis, mutual information, negentropy and log-likelihood [51]. Even though they arepresented in different formalisms, these ICA-based BSS methods are shown to bemathematically equivalent [51].However, in the underdetermined case, where the number of sensors M issmaller than the number of sources N, the BSS problem is much more complexthan that in the (over)determined case and ICA-based methods may not work. Themain reason is that the mixing matrix is not invertible and hence the estimation ofthe mixing matrix is not equivalent to the reconstruction of the sources. The perfor-mances of most (over)determined BSS algorithms are often degraded adversely oreven are no longer applicable to Underdetermined Blind Source Separation (UBSS)problems. Instead, additional prior knowledge about the sources is required to6recover the underlying sources, such as the sparsity of the sources.Generally speaking, in the underdetermined case, the recovery of the underly-ing sources consists of two steps: The mixing matrix is first estimated, and thenthe underlying sources are separated based on the estimated result of the mixingmatrix. A large class of algorithms to estimate the mixing matrix in the underdeter-mined cases starts from the assumption that the sources are sparse. The scatter plotshows high signal values in the direction of the mixing vectors [31]. Some clus-tering methods, such as K-means and fuzzy c-means, are widely used to estimatethe mixing matrix. The clustering-based methods generally perform an exhaustivesearch in the mixing vector space. Therefore, they are time-consuming, especiallywhen the number of observations is greater than two [31, 37]. In addition, in reality,sources are usually not sufficiently sparse. They can have sparser representationsin some transformed domains, such as the Time-Frequency (TF) domain via Short-Time Fourier transform (STFT), Wavelet Transform (WT) and Wigner-Ville distri-bution (WVD) etc. However, clustering-based methods cannot estimate the mixingmatrix precisely if the sources are still insufficiently sparse in the transformed do-main. To improve the accuracy of mixing matrix estimation, many algorithms havebeen proposed for detecting the single source points (SSPS) in the TF domain [37].Initially, the SSPs denote the regions where only one source is active. However,this condition is noted as too strict and the assumption is relaxed by many re-searchers. For instance, the SSPS denote the TF regions where only one source isdominant [71]. Several algorithms were proposed to detect the SSPS. Abrard etal. proposed the TIFROM method to detect the single source points by identifyingthe regions where the complex ratio of the TF transform remains constant [1]. In[7] and [9], the SSPs were detected based on the eigenvalue decomposition of thecovariance matrices of the mixtures. After the detection of SSPs, the clusteringmethod can be utilized to estimate the mixing matrix. As an illustrative example,7Figure 1.2: Illustrative example of a scatter plot of the mixtures with the num-ber of observations M =2 and the number of sources N =6. (a) is for allDFT coefficients, and (b) is for samples at single source points.Fig.1.2 shows a scatter diagram of the mixtures taking samples from 40 frequencybins with M = 2 and N = 6 [91]. Fig.1.2(a) shows the distribution of all the dis-crete Fourier transform coefficients whereas Fig.1.2(b) shows the result based onthe SSPs. It is apparent that the scatter plot in Fig.1.2(b) has a clearer orientationtowards the directions of the mixing vectors [91].Another category of methods to estimate the mixing matrix exploit the sta-tistical properties of the sources. Based on the fourth-order cumulant of the sig-nals, De Lathauwer et al. proposed the FOOBI algorithm [32]. He further ex-ploited the second-order correlation of the sources and applied simultaneous matrixdiagonalization-based techniques to estimate the mixing matrix in the underdeter-mined case [31]. Similar algorithms can also be found in more recent references[84, 111, 124].In the UBSS problems, the sources are not obtained easily and must be inferredeven when the mixing matrix is known or estimated. The sources generally are in-ferred via the maximum posterior approach or the maximum-likelihood approach.8In order to model the underlying source distributions, Cemgil et al. utilized thestudent-distribution [15], and Snoussi et al. used the generalized hyperbolic distri-bution [97], respectively. In order to model a wide class of signals, such as audioand biological signals, Kim modeled the sources’ distributions via the General-ized Gaussian Distribution (GGD) and recovered the underlying sources based onsubspace representation [59]. Another commonly used method to infer sources issparse representation. For instance, Li et al. considered the sparse representationof sources via l1-norm minimization [70, 71]. In order to design overcomplete dic-tionaries for sparse representation, Aharon et al. proposed a novel algorithm basedon K-means clustering and singular value decomposition (K-SVD) [6]. Recently,Zhen et al. exploited the sparse coding of TF representations of observations andproposed a UBSS strategy based on sparse coding [119].However, all the UBSS algorithms mentioned above have been developed forextracting the sources from a single dataset. To the best of our knowledge, therehas been very limited work on UBSS methods specifically proposed for exploringthe correlations between multiple datasets and extracting sources simultaneously.1.2.2 Single Set Blind Source Separation vs. Joint Blind SourceSeparationBSS was originally designed for extracting the sources from mixed signals in a sin-gle dataset. However, in recent years blind separation of multiple datasets simulta-neously (i.e., Joint Blind Source Separation (JBSS)), has been increasingly impor-tant in many applications, such as the group data analysis of multi-subject/multi-session Functional Magnetic Resonance Imaging (FMRI) [67]. For the case of Kdatasets, the mixing process can be modeled asX [k] = A[k]S[k], k = 1,2, ...,K (1.5)9where X [k] ∈ RM×T represents the M-dimensional observations, S[k] ∈ RN×T arethe N-dimensional sources, Ak ∈ RM×N is the mixing matrix, and it is assumedthat M ≥ N (i.e., the (over)determined case). T is the number of data samples. Inthis problem formulation, traditional BSS techniques face the following challenge:even if common sources can be extracted when we apply the BSS to each datasetindividually, the mixing matrix of each dataset could have an arbitrary permutation[8]. In addition, this individual strategy also neglects the correlations between themultiple datasets. To address these concerns, JBSS methods have recently beenproposed, aiming to extract underlying sources so that the estimated sources arealigned across datasets [8].There have been a number of methods proposed for JBSS [49, 56, 67, 68, 72].Among them, Canonical Correlation Analysis (CCA) has been a powerful JBSStool, whereas originally it was proposed only to analyze the correlation structurebetween two datasets and not for the joint analysis of multiple datasets [49]. Mul-tiset Canonical Correlation Analysis (MCCA), an extension of CCA, was proposedby Kettenring et al. to analyze the linear relationships between multiple variatesby only using second-order statistics [56]. MCCA has been quite effective for theanalysis of multi-subject fMRI data [67]. Another extension of CCA, the jointdiagonalization using second-order statistics (JDIAG-SOS), provided a computa-tionally efficient way to solve JBSS problems and avoid the deflationary procedureof MCCA [68].The group ICA and the joint ICA are the extensions of the ICA from one tomultiple datasets [14]. The group ICA assumes that different datasets share a com-mon signal subspace and maximizes the statistical independence of the sourceswithin the extracted source subspace [14]. The joint ICA was developed to max-imize the independence of joint sources of multiple datasets by assuming that Kdatasets share the same mixing matrix [14]. Fig. 1.3 shows the difference between10Figure 1.3: Graphical comparison between (a) Group ICA and (b) Joint ICA.these two methods. Both methods were demonstrated to be powerful in a varietyof applications and can provide meaningful results [14]. However, these two ap-proaches rely on stringent assumptions, which may not be satisfied in some studiesand applications [19].It should be noted that the exiting JBSS algorithms mentioned above generallyassume that the number of sources equals to or is less than that of the observationsignals (i.e., (over)determined case). This assumption may not be true in certainapplications. However, to the best of our knowledge, there are no JBSS methodsspecifically designed for the underdetermined case when the number of sources isgreater than that of observation signals.1.2.3 Instantaneous Blind Source Separation vs. Convolutional BlindSource SeparationIn instantaneous BSS, the observations are weighted sums of the individual sourceswithout time delay, as shown in 1.1. However, in many real-life situations, themixing model is much more complex. For instance, in acoustics, the microphonesnot only pick up the original sources but also the attenuated and delayed versionsof the sources corresponding to the sound waves bouncing back from the wall andceiling. Therefore, the assumption of the instantaneous mixing model does nothold in this scenario. Then the instantaneous mixing model is converted to theconvolutive mixing model, as11Figure 1.4: Graphical comparison between instantaneous BSS and convolu-tive BSS: (a) for instantaneous BSS without time delay and (b) for con-volutive BSS with reverberation.X(t) =K−1∑k=0AkS(t− k), (1.6)where X(t) = [x1(t),x2(t), . . . ,xM(t)]T denotes the M-dimensional observations,S(t) = [s1(t),s2(t), . . . ,sN(t)]T denotes the underlying N-dimensional sources and[s1(t − k),s2(t − k), . . . ,sN(t − k)]T is the delayed version of the original sourcesS(t). The observations are the linear mixture of the delayed version of the sources,and the matrix Ak ∈ RM×N denotes the mixing matrix corresponding to the timedelay k. A graphical comparison between instantaneous and convolutive models isshown in Fig.1.4. In convolutive BSS, each microphone records filtered versionsof the sources, which can be written as,xm(t) =K−1∑k=0N∑n=1Ak(m,n)sn(t− k). (1.7)Each source contributes to the sum with multiple delays corresponding to multiplepaths by which the signal from the corresponding speaker propagates to a micro-phone. Various attempts have been made to tackle this type of convolutive BSS12problem [95]. A commonly used way is based on frequency-domain [6][13], whichconverts time-domain observation signals into frequency-domain time-series sig-nals by a STFT.The instantaneous mixing model is a special case of the convolutive modelwhere the delay is set to be 0. The instantaneous BSS methods can be extendedto solve the convolutive BSS problem with more constraints. In this dissertation,the focus is on novel BSS methods for instantaneous mixtures, especially for theunderdetermined cases.1.3 Research Objectives and MethodologiesThe goal of this dissertation is to develop a set of novel underdetermined blindsource separation methods which are able to cope with several challenges presentin real applications. As we introduced previously, the underdetermined BSS prob-lem is generally more difficult than the (over)determined BSS problem where thenumber of observations is equal to or greater than that of underlying sources. It is adifficult topic with some challenges including limited observations, the convolutivemixing model, an unknown number of sources, dependence information, temporaldynamics, efficiency and so on.Considering the recent research progress in underdetermined BSS, we are in-terested in addressing the following concerns and challenges. First, although somerecent overcomplete representation methods have been proposed, the problem torecover the underlying sources from limited number of observations remains a dif-ficulty. Traditional underdetermined BSS methods, both the methods based on thesparse properties and the methods based on the statistical properties of source sig-nals, may fail to work when the number of sources is greatly larger than that ofthe observations. Second, the sources may be mixed with a time delay way ratherthan in an instantaneous way. The sources corresponding to different observations13may be correlated with each other rather than identical as in the instantaneous BSSmixing model. In that case, traditional BSS methods will fail to solve the BSSproblem. Third, the dependence information across multiple datasets should betaken into account in solving the underdetermined BSS problems. Joint analysisin BSS has attracted great attention owing to its ability to simultaneously recoverthe underlying sources from multiple datasets. However, to the best of our knowl-edge, existing JBSS methods in the literature consistently assume that the numberof observations is equal to or greater than that of the underlying sources. There-fore, these methods are not suitable for solving underdetermined BSS problems.In addition, it is a challenge to obtain the underlying sources in the underdeter-mined BSS even when the mixing matrix is known. As a result, methods that caneffectively and efficiently infer the underlying sources are highly desired.In this research work, we attempt to develop novel underdetermined BSS meth-ods to address the aforementioned challenges, including limited number of ob-servations, presence of highly correlated sources across observations (rather thanidentical sources), and self/cross dependence and source inferences. Specifically,the main research contributions of this dissertation are summarized as follows:• For the case of the underdetermined BSS, we propose a two-step source ex-traction method termed as NAMEMD-MCCA. Considering the multichan-nel nature of the measured signals, Noise Assisted Multivariate EmpiricalMode Decomposition (NAMEMD) is utilized to decompose each channel ofthese signals into a series of Intrinsic Mode Functions (IMFS). Then theunderdetermined problem is converted into the determined problem, as wehave a sufficient number of IMFs. In addition, the sources correspondingto different observations may be highly correlated rather than identical. Wedecompose each channel into a series of IMFs, which can be regarded as14a dataset. Considering the correlations between the sources across multi-ple observations, a classical JBSS method is employed to solve this problem.Benefiting from cross-channel information and its increased robustness to ar-tifacts, the proposed method can achieve superior performance in recoveringthe underlying sources.• Inspired by the canonical correlation analysis model and the simultaneousmatrix diagonalization, we exploit the second order statistics of the observa-tions in two datasets and propose a novel UBSS method which can accuratelyestimate the mixing matrices in the underdetermined case. In addition, weemploy a novel time-frequency analysis method to recover the sources. Theproposed method is referred to as the Underdetermined Joint Blind SourceSeparation for Two Datasets (UJBSS-2).• Exploiting the cross correlation between each pair of datasets, we propose anovel and effective method to jointly estimate the mixing matrices for multi-ple datasets in the underdetermined case. We further recover the underlyingsources individually based on subspace representation. We extend the ideaof (over)determined JBSS to that of the underdetermined case. The proposedBSS method is referred to as Underdetermined Joint Blind Source Separa-tion for Multiple Datasets (UJBSS-M). The basic idea of UJBSS for multipledatasets is shown in Fig.1.5.• We further explore the second-order statistics of the underlying sources,including autocorrelation and cross correlation information, and propose anovel EMG artifact removal method. The proposed method can recover thelatent sources and reconstruct noise-free EEG signals with a limited numberof observations.15Figure 1.5: The basic idea of UJBSS for multiple datasets. M is the numberof observations and N is the number of sources in each dataset. T rep-resents the number of data samples, and K represents the number ofdatasets. r(k1,k2)(n) denotes the correlation between the nth source in thekth1 dataset and that in the kth2 dataset.Figure 1.6: The overview of the challenges, objectives and contributions ofthis research work.16Fig.1.6 illustrates the challenges, objectives and contributions of this researchwork.1.4 Thesis OutlineThe remainder of this dissertation is structured as follows:In Chapter 2, in order to address certain concerns of the currently availableEMD-BSS based methods, we propose a novel blind source separation frameworkby combining NAMEMD and MCCA. The proposed method takes advantage of themultivariate data-adaptive nature of the NAMEMD and MCCA, which contributes toaccurate extraction of the desired signal. The experimental results on both sim-ulated data and real data demonstrate the superior performance of the proposedmethod.In Chapter 3, we extend the idea of (over)determined JBSS to that of the un-derdetermined case and introduce a novel blind source separation method, termedas UJBSS-2. Considering the dependence information between two datasets, theproblem of jointly estimating the mixing matrices is tackled via Canonical PolyadicDecomposition (CPD) of a specialized tensor in which a set of spatial covariancematrices are stacked. Furthermore, the estimated mixing matrices are used to re-cover sources from each dataset separately.In Chapter 4, as a generalization of our previous work on two datasets in Chap-ter 3, we exploit the cross-correlation between each pair of datasets and presentanother novel blind source separation method, referred to as UJBSS-m. In thischapter, the cross correlation between each pair of datasets is modeled by a third-order tensor in which a set of spatial covariance matrices corresponding to dif-ferent time delays are stacked. Considering the latent common structure of theseconstructed tensors, the mixing matrices are jointly estimated via joint canoni-cal polyadic decomposition of these specialized tensors. Furthermore, we recover17the sources from each dataset separately based on the estimated mixing matrices.Simulation results demonstrate that the proposed UJBSS-m method yields supe-rior performances when compared to commonly used single-set UBSS and JBSSmethods.In order to efficiently acquire physiological signals in the ambulatory environ-ment, a small number of sensors is usually desired. Conventional artifact removalmethods based on BSS, such as CCA and ICA, could fail to remove the artifactsin this case. In Chapter 5, we further explore the cross correlation and autocorrela-tion of the underlying sources and propose a novel underdetermined BSS methodto remove the EMG artifacts from EEG signals. We conduct a performance com-parison through numerical simulations in which EEG recordings are contaminatedwith muscle artifacts. The results demonstrate that the proposed method can effec-tively and efficiently remove muscle artifacts while successfully preserve the EEGactivity.Finally, Chapter 6 summarizes the contributions of this dissertation, and dis-cusses future research directions.18Chapter 2A Blind Source SeparationFramework Based on NAMEMDand MCCAA recently developed novel nanofiber based strain sensor is introduced as a poten-tial alternative to conventional measurement tools for Heart Beat Rate (HBR) mon-itoring. Since the measured signals in real-life are often contaminated by certainartifacts, in this chapter, to overcome limitations of currently available EMD-BSSbased methods and recover the buried heart beat signal accurately, we propose anovel blind source separation framework by combining NAMEMD and MCCA.The proposed method takes advantage of the multivariate data-adaptive nature ofthe NAMEMD and MCCA, which contributes to accurate extraction of the desiredsignal. The absolute correlation coefficients (ACC) between the extracted signaland the original source signal are adopted to evaluate the performance of the pro-posed method in the simulation study. The average of the ACC yielded by the pro-posed method is 0.902, which is significantly higher than that by state-of-the-art19approaches. We also examine the proposed method on the nano-sensor data col-lected when the subject performs 11 tasks. It is shown that the proposed methodcan achieve better performance, especially for preserving the shape of the heartbeat signal.2.1 Motivation and ObjectivesAs health costs are increasing and the world population is aging [96], monitoringphysiological signals, such as body temperature, oxygen concentration, weight andfat levels, has drawn a lot of attention and has been used as an important tool foraccessing people’s fitness and health [54]. Resting heart beat rate (RHBR), one typeof vital physiological signs, indicates the speed of the heart beat in the relaxed con-dition and can provide an assessment of health status [53]. It has been clinicallyapproved that there is a direct relation between higher RHBR and higher mortal-ity for the elderly and patients with cardiovascular problems, hypertension, andmetabolic syndromes [63, 85, 86], whereas a relatively lower RHBR is probably asignature of better fitness and health condition. Therefore, it is of great importanceto monitor health condition by measuring and tracking the RHBR.For real-world applications such as the emerging wearable health monitoringsystems [118], there is a need for developing and testing novel, inexpensive, non-invasive and miniature sensors for heart rate monitoring purposes. In this work, weadopt a novel type of nanofiber based strain sensors recently developed by FlexibleElectronics and Energy Lab (FEEL). The compliant and conformable nature of thistype of sensors, in addition to their sensitivity, stable behavior and low performancepower [98], makes them a great choice for home health monitoring systems. Itshigh sensitivity of such sensors enables us to detect very small deformations suchas those induced in the wrist by radial pulses.To measure the RHBR, 3 nanofiber based strain sensors are deployed on the20wrist. Yet, the signals measured by the nanofiber based sensors are often con-taminated by instrumentation noise, motion artifacts and other types of interfer-ence. It is challenging to remove artifacts and obtain clean radial pulse signal,especially when the subject might be moving. Although there exists a great va-riety of artifact reducing or removal techniques based on WT, Empirical ModeDecomposing (EMD) and BSS [29, 66, 93, 108], they have certain limitations inrecovering the underlying unknown source signals. Also, it is worth emphasizingthat different systems (sensors) will generate different types of bio-signals, suchas EEG and EMG, and generally will pose different challenges for signal denois-ing, artifact removal and data analysis. For a particular problem, the appropriateapproach is generally sensor-dependent and application dependent. Since the typeof nanofiber-based sensor used in this chapter is novel and its application to heartbeat rate estimation is also new, it is not clear which method is more suitable forour specific problem (i.e., RHBR monitoring using a novel type of nano-sensorsignals) and whether a new method is desired.In wavelet-based approaches [13], researchers need to select a set of basis com-ponents in advance and then calculate the related coefficients for each component.The performance of these methods heavily depends on the predefined basis compo-nents [13]. Given that the ground truth is generally unknown, the predefined basiscomponents can not be guaranteed to match the signals of interest. EMD was firstintroduced by Huang et al. [50] as a technique for processing nonlinear and non-stationary time series, such as biomedical signals [66]. It decomposes a time seriessignal into a number of spectrally independent oscillatory components, defined asIMFS. EMD is nonparametric in the sense that all the IMFs are derived empiricallyfrom the data. Thus, it is a fully data-driven method [77]. However, this methodis very sensitive to noise, making it ineffective in low signal-to-noise ratio (SNR)situations [81]. BSS methods, such as ICA and CCA, are increasingly being used as21artifact removal tools for analyzing multivariate biomedical signals [39]. They canrecover the underlying sources of interest when there are sufficient channels. How-ever, these BSS methods commonly assume that the number of sources cannot begreater than that of channels/sensors. In fact, this assumption may not be satisfiedin practice, especially when minimal instrumentation is required [19], such as theRHBR measurement case.To overcome these limitations, WT and Ensemble Empirical Mode Decom-posing (EEMD) [114] were generally used to decompose the unidimensional signalfrom each sensor into multichannel signals before applying BSS methods. Linet al. proposed a method called wavelet-ICA expanding the 1-D signal into 2-D signal by WT and then extracted independent features from the 2-D signal byFastICA [74]. Considering the advantage of data-driven methods, Mijovic et al.introduced a single-channel technique by combining EEMD and ICA (EEMD-ICA) [81]. It decomposed the single-channel signal into spectrally independentcomponents and then extracted statistically independent source signals. This tech-nique was shown to be more effective compared to wavelet-ICA. Recently, a moreefficient method combining EEMD with CCA (EEMD-CCA) was proposed bySweeney et al. [103]. It provided superior performance for source recovery andartifact suppression. However, these existing methods only focus on one singlechannel each time and neglect the possible inter-channel information when mul-tiple sensors are available. Two significant obstacles of these methods in dealingwith multichannel signals are the problem of uniqueness and mode mixing with thechannel-wise EMD, such as EEMD. There is no guarantee that the decompositionresults of different channel signals are matched [87], either in the number or theirfrequency, making the multichannel analysis often very difficult. In addition, itwas shown that the multivariate signal processing can provide more insights intocomplex and nonstationary real-world process [76]. The essences of dynamical22behavior are intrinsic correlations, within a single-channel and more importantlyacross multiple channels. Therefore, the above methods including EEMD-ICA andEEMD-CCA may not be good choices for multichannel artifacts removal.Considering the above concerns, in this chapter, we propose a novel frame-work for extracting heart beat signals by combining NAMEMD [90] with MCCA[49, 72], termed as NAMEMD-MCCA. NAMEMD, which processes the input sig-nal directly in the high dimensional space and considers the inter-channel infor-mation, can effectively overcome the problems of uniqueness and mode mixing[77]. MCCA, as an extension of CCA, is developed to find linear transforms thatsimplify the correlation structure among a group of random vectors [72]. It canjointly extract the sources from each dataset through maximizing the correlationsof the extracted sources across datasets. The combination of these two methodsbenefit from the use of inter-channel information and increased robustness to arte-facts. Considering the multichannel nature of the measured signals, NAMEMDis first utilized to decompose each channel of these signals into a series of IMFs.The IMFs with dominant frequencies close to that of typical heart beat signals areselected to form corresponding datasets. These datasets are further employed asthe input to MCCA for recovering the heart beat signals, and finally the RHBR canbe detected.We evaluate the performance of the proposed method in both synthetic data andreal data. We first validate it on simulated data and illustrate its performance. Thesesimulations are performed under realistic assumptions in which mixed signals arelinear mixture of photoplethysmogram (PPG), EMG, motion signals, power fre-quency signals and Gaussian white noises. We then apply the method to nanofibersensor signals collected when the subject was performing 11 tasks. RHBRs areestimated by the proposed framework and comparison also suggests its superior-ity. The main contributions of this chapter are manifold: we explore a novel type23of nanofiber based strain sensor for potential RHBR monitoring; we investigatethe state-of-the-art EMD-BSS based methods for exacting RHBR information ac-curately based on the nano-sensor data; and we further propose the NAMEMD-MCCA for improved RHBR monitoring.2.2 MethodsIn this section, we first briefly introduce the EMD related methods and the BSStechniques. We then present a novel heart beat rate detection method based onthe combination of NAMEMD and MCCA. Finally, both synthetic data and realnano-sensor data are described.2.2.1 Related MethodsEmpirical mode decompositionEMD is a fully data driven method which adaptively decomposes a signal into aresidue and a finite set of spectrally independent oscillatory components, calledIMFs. As it is adaptive in nature, the IMFs usually offer a physically meaningfulrepresentation of the underlying process [113], which render its wide applicationin the biomedical signal processing.However, the standard EMD method may suffer from the problem of unique-ness and mode mixing. To overcome these issues, EEMD, a noise assisted versionof EMD, was proposed and has been demonstrated to be more robust in real-lifeapplications. It defines each IMF component as the mean of the IMFs extractedby applying standard EMD on the signal corrupted by added white noise of finiteamplitude. The noise of each trail is different and its effect in the ensemble meancan be canceled out when there are enough trials. This contributes to establish auniformly distributed reference scale and then perform a quasi-dyadic structure of24the corresponding filter bank [90]. However, despite being a significant step for-ward, the EEMD is still a univariate method, which cannot solve the uniquenessproblem. In addition, it is further limited by its computational complexity.Multivariate empirical mode decompositionThe issues of common mode alignment and nonuniqueness have been the majorobstacles for applications of these channel-wise EMD methods in multivariate dataanalysis. The bivariate EMD (BEMD) and rotation-invariant EMD (RI-EMD) wereproposed considering the information between two channels whereas these two al-gorithms can cater only 2-channel signals [11, 92]. To cope with this problem,Multivariate Empirical Mode Decomposition (MEMD) was introduced by Rehmanand Mandic in [89], which also demonstrated its ability to do matched-scale de-composition across multichannel signals, preserving the multichannel properties.The input multivariate signals are mapped into multiple real-valued ‘projected’ sig-nals along directions in m-dimensional spaces. The detail of the MEMD algorithmis described in Algorithm 1.Similar to the EEMD, Rehman et al., taking the advantage of MEMD, proposeda noise assisted extension of the MEMD, namely NAMEMD. Firstly, based on theMEMD method, it decomposes the original signals along with n-channel multi-variate independent Gaussian white noise. At the end of this process, it discardsthe IMFs corresponding to the n-channel white noise and obtains a set of IMFsfor the original signal. It should be noted that it is different from the EEMD, inwhich the noise is directly added to the observed signals, as the noise here residesin a different subspace. The Gaussian white noise can be used to enforce a filterbank structure and provide better definition of frequency bands inherent to the data.Therefore, the NAMEMD can accurately align the common oscillatory modes inthe IMFs from multichannel signals and effectively alleviate the problem of mode25Algorithm 1 The MEMD algorithmInput: m-dimensional signal X(t) = [x1(t),x2(t), ...,xm(t)]TOutput: the IMFs corresponding to each channel of the X(t)1: Generate a suitable point set based on Hammersley sequences for sampling on(m-1) dimension sphere;2: Calculate the projections {pk}Kk=1 of the original signal along all the K direc-tion vectors {vk}Kk=1, where K represents the number of direction vectors;3: Identify the time points {tki }Kk=1 correspond to the maxima of each projects;4: Interpolate each maxima of each projection [tki , pk] and get the multivariateenvelops {ek(t)}Kk = 1;5: Calculate the mean by mean(t) = 1/K∑Kk=1 ek(t);6: Calculate the detail signal d(t) = x(t)−mean(t). If d(t) satisfies the stoppingcriterion for a multivariate IMF, then obtain an IMF c(t) = d(t) and repeat step2 to step 6 on x(t)−d(t), otherwise apply them on d(t).mixing within the resulted IMFs [90].Independent component analysisIn this section, we illustrate the basic idea of ICA. The observed signal can bedenoted as X(t) = [x1(t),x2(t), ...,xm(t)]T whose elements are linear mixtures of nindependent elements of the source component vector S = [s1(t),s2(t), ...,sn(t)]Tand m is greater than or equal to n. We can model the mixing process as follows:X = AS (2.1)where the matrix A represents the linear mixture of the source components. Thegoal of ICA is to find the unmixing matrix W to recover the original source byassuming that they are independent and no more than one component is Gaussian-signal. The FastICA is a commonly used way to do the independent componentanalysis and we also evaluate its performance in this work.26Multiset canonical correlations analysisCCA has been a powerful BSS tool, however, it can only analyze the correlationstructure between two datasets [49]. Therefore, it cannot be used in this work, in-volving three datasets. MCCA, an extension of CCA, was proposed by Kettenringet al. to analyze the linear relationships between multiple variates [56, 72]. Con-sidering the case for K datasets, the observed vector of each dataset contains thelinear mixtures of the corresponding l sources. We can model the mixing processas following,Xk = AkSk k = 1,2, ...,K (2.2)Where Xk and Sk ∈ Rl are l dimensional observed vector and underlying sources re-spectively, Ak ∈ Rl×l is the non-singular mixing matrix. Then the source canonicalvector can be unmixed based on linear combination of the observed vector by:Sk =W kXk k = 1,2, ...,K, (2.3)where W k is the canonical transformation unmixing matrix of the kth dataset. It canbe obtained by optimizing objective functions related to the overall correlation. TheMCCA ensures that the corresponding latent sources are highly correlated acrossdatasets, and meanwhile the latent sources are uncorrelated within each dataset.One thing should be noted is that MCCA reduces to CCA when the number ofdatasets m equals to 2.EEMD-ICAThe use of EEMD in combination with ICA for source separation was first pro-posed by Mijovi et al. in [81] and was employed for the removal of artifactsfrom single channel EEG and EMG signals. The EEMD was used to decom-27pose a single-channel signal into a multichannel signal, comprised of IMFs. Thenthese IMFs can be employed as the input to the ICA method with the aim ofestimating the underlying true source signal. Finally, the sources determined tobe artifacts were discarded and the remaining sources were used to reconstructthe signal-channel signal free of artifacts. Mijovi et al. applied the proposedmethod EEMD-ICA on simulated data and the real EEG as well as EMG signalsand compared it with the other two signal-channel artifacts removing methods,namely Single-Channel Independent Component Analysis (SCICA) and Wavelet-Independent Component Analysis (WICA). It was shown that their proposed methodachieved the best performance [81].EEMD-MCCAThe EEMD-MCCA [123] is an extension of the previously introduced method,namely EEMD-CCA, which was first detailed by Sweeny et al. as a novel single-channel artifact removal technique [103]. Similar to these two signal-channel de-noising methods, including EEMD-ICA and EEMD-CCA, it decomposed the sin-gle channel signal into multichannel IMFs components and then input the IMFs ofinterest into the BSS method. However, one thing should be noted is that MCCAdoes not extract the source signals channel-wisely. Considering the inter-channelinformation, it can jointly extract the underlying source signals from multiple chan-nels.2.2.2 Proposed FrameworkNAMEMD-MCCATo extract the heart beat signal from the nanofiber based strain sensor signals, wepropose taking advantage of both NAMEMD and MCCA by exploring their com-28bination and denote the proposed method as NAMEMD-MCCA, which is shortlyoutlined in Algorithm 2. Firstly, NAMEMD is applied to the observed signals.Each channel of the observed signals is decomposed into multiple IMFs compo-nents. Utilizing the dyadic filter bank property of MEMD, NAMEMD can effec-tively improve the performance of frequency localization and reduce mode-mixing.In addition, it is shown that NAMEMD can alleviate the effects of noise in EEMDmethod [105]. Then the IMFs components determined to be artifacts are discardedand these observed signals are converted into groups of datasets, denoted by X1,X2, . . . , XK . These datasets are selected as input to the MCCA algorithm. MCCAprovides an effective way to jointly extract the group of underlying sources whilemaximizing the correlations among different datasets and retaining the intersubjectsource variability. Then the source signals are extracted and the potential heart beatsignal is selected.We evaluate the effects of the noise size n and the number of interesting IMFsl. Increasing the size of the independent noise in the NAMEMD generally helpsenforce the desired quasi-dyadic structure and reduces the degree of the overlapbetween the IMFs’ spectra. Intuitively a larger noise size n is preferred. However,the computational cost also increases if we add more channels of noise. In ourpreliminary study, we investigated the performances when the noise size n variesfrom 1 to 10 and observed that the estimated RHBR is not sensitive to n. Wetherefore heuristically set n to be 1 for computational efficiency. We also estimatedthe RHBR when l varies from 3 to 6, and obtained the best performance when weselect 4 IMFs for each channel. Therefore, we heuristically set l to be 4 in thischapter when analyzing the nanofiber-based strain sensor signals.29Algorithm 2 The proposed NAMEMD-MCCA algorithmInput: a m-dimensional signalX(t) = [x1(t),x2(t), ...,xm(t)]T , the size of the noise n and the number of interestingIMFs for each channel signal lOutput: The extracted k source components for each channel1: Generate n-channel multivariate independent white noise and combine withthe original signal X(t) to obtain a (m+n)-channel signal X′(t);2: Decompose the noise-combined signal X′(t) into IMFs based on the MEMDmethod and obtain multivariate IMFs;3: Discard the IMFs corresponding to the n-channel white noise and obtain m setof IMFs for the original signal X(t);4: Calculate the frequency spectral for each IMF and select l interesting IMFs foreach channel based on their dominant frequencies. These IMFs should havethe common scale and the same index among channels. Then, combine theseIMFs from each channel to be a single dataset and then get m datasets, each ofwhich contains l channel signals;5: Apply the MCCA method to these m datasets and extract the l source signalsfor each dataset.Heart beat rate estimationWe aim to retrieve the heart beat rate from the obtained heart beat source sig-nal, which can be obtained by different BSS methods mentioned above, includingEEMD-ICA, EEMD-MCCA and the proposed NAMEMD-MCCA. The heart beatrate estimation method [123] can be divided into 3 steps, including peak detection,peak filtering and calculating the mean of the peak intervals. The details of thedetection method are shown in Algorithm 3. We calculate the derivate of the ex-tracted heart beat signal and identify the local extrema. Considering the shape ofthe typical heart beat signal, the peaks should be larger than the surrounding ex-trema to some extent. In this work, we measure the difference between some peaksand the corresponding closest valleys and finally set (2ndmax− 2ndmin)/4 as thethreshold to detect the potential peaks. In addition, we filter the fake peaks and30Algorithm 3 The heart beat rate estimation methodInput: heart beat signal si(t) and sampling frequency f sOutput: heart beat rate HBR1: Peak Detection:i) calculate the derivate of si(t) and detect the local maxima and minima;ii) considering typical heart beat signals, only the maxima satisfying the fol-lowing specific condition can be regarded as a peak,Peak> Nearest minima+ 2ndmax−2ndmin4,in which the nearest minima represents the left and the right minimum whichis closest to the potential peak, the 2ndmax and 2ndmin mean the second largestand the second smallest magnitude of the heart beat signal.2: Peak filtering, including:i) exclude the endpoints from the potential peaks;ii) exclude the fake peaks which are quite close to each other. The thresholdwe set here is 800 while the sampling frequency is 1600Hz. This implies thatthe HBR is no more than 120 bpm.3: Calculate the mean of the peak intervals:i) calculate the peak interval and discard the intervals which are larger than1.2fs, which corresponds to 50 bpm;ii) calculate the mean of the effective peak interval (MPI) as HBR:HBR =f sMPI×60invalid peak intervals according to the normal RHBR ranging from 50-120 bpm.Then we get the estimate of the RHBR based on the mean of the effective peakintervals.312.2.3 Data DescriptionSynthetic DataIn this simulation, we apply the proposed NAMEMD-MCCA method to syntheticdata and also implement EEMD-MCCA and EEMD-ICA for comparison. In or-der to simulate the real physiological signal, we generate m groups of sources us-ing real PPG from [55], EMG from [52], Motion signals from [82] as well aspower frequency signal and Gaussian white noise simulated by MATLAB (ver-sion R2014a). The clean PPG signal was obtained from pulse oximeter [55] andthe EMG signal was monitored by using VEST(Capintec) [52]. The motion sig-nal, which is usually the result of intermittent mechanical forces, was obtained byusing a Holter recorder on an active subject [82]. Considering the difference oftheir sampling frequencies, we scale them into 300Hz. Each group contains theFigure 2.1: Three groups of source signals and mixed signals. In each sub-figure, the horizontal axis represents the time index and the vertical axisrepresents the amplitude of the signal in microvolt.32following 5 sources:s1 = PPG signals2 = EMG signals3 = Motion signals4 = Power frequency signal (60Hz)s5 = Gaussian white noise(2.4)There is 0.01 second time-shift between the PPG signals in successive groups.Considering the different locations of sensors in reality, we make EMG signalsand the Motion signals distinct among groups, as shown in Fig. 2.1. The signalrecorded by each sensor is simulated as the linear mixture of the correspondinggroup of sources. m mixed signals are generated as follows,X i = AiSi, i = 1,2, . . . ,m (2.5)where the Si = [s1;s2;s3;s4;s5] denotes the ith group of sources and the Ai is themixture matrix randomly generated for the ith group of sources. An example withthree channels of the mixed signals is shown in Fig. 2.1. For easy visualization,we just plot the first 100 points of the power frequency signal. As can be seenfrom Fig. 2.1, the mixed signals in channels 2 and 3 are much noisier than that ofchannel 1. The related mixture matrices areA1 =[−0.2435 0.2262 0.3538 −0.0721 −0.1044], (2.6)A2 =[−0.2433 −0.2397 −0.0670 −0.2155 0.2345], (2.7)A3 =[−0.3305 −0.1377 −0.0518 0.1027 −0.3773]. (2.8)33Real DataThe fabrication of core-shell nanofiber (NF) mesh strain sensors was explained indetail in our previous paper [98]. In brief, Polyacrylonitrile (PAN) NFs were fab-ricated using a conventional electrospinning technique. PAN (average molecularweight of 100000 g/mol, Scientific Polymer Products) is dissolved in DMF (99.9%,Fisher Scientific) with concentration of 10 wt% and stirred at 60oC for 24 h to forma homogenous solution. It is then loaded into a plastic syringe with a blunted G18needle. An electrospinning unit (KATO TECH CO. LTD.) is used to prepare thenanofibrous samples from the solution. A constant volume of the solution is deliv-ered to the needle at a flow rate of 0.6 ml/h and a high potential of 17 kV is appliedto the needle. The non-woven fibers are collected on the collector (aluminum foil,17 cm far from the needle) connected to the ground to form NF network. Thenanofiber mesh is then coated with a thin layer of gold by sputtering, which pro-vides a uniform conformal coating on the surface of NFs. Ribbons of coated NFswith desired dimension of about 2 mm width and 10-50 mm length is then cut andtransferred to the surface of a desired elastomer substrate, Polydimethylsiloxane(PDMS), with the thickness of ≈1 mm. Electrical contact pads and wirings areprepared by application of silver pint to both ends of the NF ribbon. A PDMSlayer is then poured on the top of the NF mesh, followed by degassing and curingprocess at 90oC for 45 minutes. It was demonstrated that this novel and comfort-able NF-based sensor is highly sensitive and provides exceptional performance foraccurate health monitoring [98]. Fig. 2.2(a) and (b) shows the dimensions and thecompliant nature of typical NF sensors. The small, soft and comfortable naturehighlights the suitability of the NF sensor for wearable health monitoring.Radial pulse sensing was performed by attaching NF-based sensors to epider-mis on the wrist using a double-sided tape. The minute deformation of the skin34caused by radial pulse results in change in resistance of the sensors. Consideringthe low deformation level, to obtain strong signals, the strain sensor should be at-tached to a location close to the radial artery. Due to the anatomical variation ofdifferent subjects, the location of the radial artery slightly varies in different peo-ple. In order to cover a larger area and ensure that at least one sensor is locatedon the right position, we have used three sensors as depicted in Fig. 2.2 (c) andsimultaneously recorded their responses. The measurement set-up is schematicallyshown in Fig. 2.3. Wires are connected to the end contact pads of the sensors andconnected to a costume-made multichannel resistance measurement system, whichwas designed and developed in our lab. The multichannel resistance measurementsystem, consisting of an analog circuit, a Wheatstone bridge and an amplifier foreach channel, provides the possibility of real-time measurement of multiple sensorssimultaneously. It also includes a USB 1608FS-PLUS (Measurement Computing)to convert the analog signal of the circuit to its digital representation for real-timerecording of the signals. A LabVIEW program was used for operating and record-ing the data.The NF sensor signals were collected when the subject sat or stood steadily. InFigure 2.2: Photographs of NF sensors: (a) Dimensions of NF sensors; (b)Flexibility of NF sensors; (c) Three NF sensors used for monitoring theRHBR.35Figure 2.3: Illustration of the measurement system.addition, to evaluate the robustness of the proposed algorithm, the sensor responsesto pulse when the subject was doing the following 9 tasks were also collected. Thetime duration of each task is 12.5 seconds. The task of ‘Fix type1’ serves as anexample in this chapter.(1) Finger Flex and Wrist Flex: flexion of all fingers together and flexion of thewrist.(2) Fix type key1: type key 1 using the index finger while the wrist and forearmwas fixed.(3) Fix type1 and Fix type2: type key 1 and 2 alternatively using the indexfinger while the forearm was fixed. We repeat this task twice.(4) Fix type: randomly type while keeping the forearm and wrist fixed.(5) Fix writing: randomly write using a pen with fixed forearm.(6) Fix writing1 and Fix writing2: write the number 1 and 2 using a pen whilethe forearm is fixed respectively.To better describe the NAMEMD-MCCA-based procedure for monitoring theRHBR from the NF sensor signals, we summarize the entire procedure in Algo-rithm 4. Since the signals of interest lie in the low frequency band, the originalsignals are preprocessed by low pass filtering (to 20Hz) and down sampling (by5) for computational efficiency. Then the NAMEMD-MCCA method is applied tothe preprocessed signals. 4 source components for each sensor are extracted by36Algorithm 4 The resting heart beat detection method based on NAMEMD-MCCAInput: The raw 3-channel nanofiber signals X(t) = [x1(t),x2(t),x3(t)]T , thenumber of the multivariate independent white noises (n = 1 here), and the numberof interesting IMFs for each channel signal (l = 4 here)Output: The estimated resting heart beat rate RHBR1: Preprocess the raw nanofiber sensor signals x(t) by low-pass filtering (to 20Hz)and down sampling (with decimation factor = 5) to obtain g(t);2: Apply the proposed NAMEMD-MCCA algorithm to the preprocessed datag(t) and extract the source signals s(t);3: Select the heart beat signal si(t) from the extracted source signals s(t);4: Apply the proposed HBR estimation method, based on the peak detection, peakfiltering and calculating the mean of the peak intervals, on the extracted heartbeat signal. Then the RHBR is detected.MCCA and the potential heart beat signal is selected from them. Finally, the HBRestimation method is applied to the potential heart beat signal and the RHBR isdetected.2.3 Experimental ResultsIn this section, the performance of two currently available EMD-BSS methods andthe proposed method is evaluated on synthetic datasets. We then examine thesethree methods on real nano-sensor data to obtain the heart beat signal and estimatethe corresponding RHBR.2.3.1 Simulation StudyWe test the performances of three methods, including EEMD-ICA, EEMD-MCCAand the proposed NAMEMD-MCCA, on synthetic data. An example of the syn-thetic data is shown in Fig.2.1, in which each channel of the mixed signals is sim-ulated as a linear mixture of the corresponding 5 sources. As seen from Fig. 2.1,the signal in channel 2 and 3 are noisier than that in channel 1. It is challenging to37recover the pulse wave signal from these two channels. To this aim, we attempt toincorporate these three channel signals. We test the proposed NAMEMD-MCCAmethod as well as other two existing methods on this simulated data. The extractedpulse waves corresponding to channel 1 are shown in Fig. 2.4. Compared with theother two methods, NAMEMD-MCCA provides the best performance in preserv-ing the shape of the original signal. The extracted signal by EEMD-ICA conveysmore noise and it is nothing like the shape of pulse wave. Although we can getthe heart beat from the result of EEMD-MCCA, it would be more difficult or evenimpossible to get more physical information from this signal.We also utilize the correlation coefficient (CC) between the original PPG andthe extracted pulse waves as an evaluation measure to the effect of these threemethods. The CC corresponding to the NAMEMD-MCCA method is 0.978 whichis higher than that of EEMD-ICA by 36% and that of EEMD-MCCA by 3%. Fordefiniteness and without loss of generality, we generate 100 synthetic datasets andcompare these three methods on them.As shown in Fig. 2.5, NAMEMD-MCCA and EEMD-MCCA can obtain com-parable performance in recovering the pulse wave and they generally performsbetter than EEMD-ICA, although EEMD-ICA performs better than the other twomethods occasionally. As to NAMEMD-MCCA, the average of the absolute CCs is0.902, whereas that of EEMD-ICA and EEMD-MCCA is just 0.623 and 0.873 re-spectively. In addition, it should be noted that the extracted source signals obtainedby ICA are not unique. Even repeating the analysis on the same datasets, we mayget different independent components [20]. In addition, EEMD does not considerthe inter-relationship among the signals while MEMD does. Thus, NAMEMD-MCCA performs better than EEMD-MCCA on average.38Figure 2.4: Extracted pulse waves based on the synthetic data.2.3.2 Real Data StudyIn this section, we apply the proposed novel NAMEMD-MCCA method to extractthe heart beat signal from some noisy nanofiber based strain sensor signals. Thenthe HBR estimation method is used to detect the RHBR from the obtained heartbeat signal.Empirical mode decomposition of the signalBefore performing dedicated analysis, we preprocess the raw data, shown in thefirst row of Fig. 2.6, by low-pass filtering and down sampling. Then we decom-pose the preprocessed signal utilizing the NAMEMD method. For graphical pre-sentation, we just include 4 IMF components for each sensor. The number of thesezero-crossings is a rough indication of the mean frequency of each IMF component39Figure 2.5: The absolute correlation coefficients between the original PPGsignals and the extracted pulse waves on 100 synthetic datasets. Theblue asterisks represent the averages and the red lines stand for the me-dians. The edges of the box are the lower and upper quartiles. Outliersbeyond the upper whisker or under the lower whiskers are indicated bythe red plus signs.[42]. As shown in Fig. 2.6, the mean frequencies decrease with the component’sindexes increase and it is a further indication of the hierarchical structure of theequivalent filter bank. It is suspected that the high frequency noise is appeared inthe first several components whereas relative low frequency noise resides in the lastseveral components. We eliminate those components whose frequency contents areirrelevant to that of the heat beat signal. It acts as the first step of denoising andleads to the major advantages of high efficiency in the next step.The NAMEMD has the ability to align ‘common scales’ present within multi-variate data [90] and it is the main difference with the classical channel-wise EMDmethods. To highlight this, we also decompose the preprocessed signal using the40Figure 2.6: Time domain analysis of the NAMEMD results. In each subfig-ure, the horizontal axis represents the time index and the vertical axisrepresents the amplitude of the signal.powerful EEMD method and compare its results with the results of NAMEMD inthe frequency domain, shown in Fig.2.7 and Fig. 2.8. As can be seen from Fig.2.7,the common scales are shown in the IMF components with the same indexes. Thealleviation of mode mixing mainly benefits from the use of inter-channel informa-tion and the Gaussian white noise which residues in a separated subspace from theuseful data. Such mode alignment property helps to utilize the similar scales frommultiple sources. However, for the EEMD method, the IMF components with thesame index always have different scales, as shown in Fig. 2.8. Take the IMF7 as anexample, the dominant frequency of IMF7 from sensor 1 is about 3.5 Hz, whereasthat from sensor 2 and 3 is about 10 Hz and 7 Hz respectively. Therefore, it is ar-bitrary to select the IMF components with the same index from the EEMD results41Figure 2.7: Frequency domain analysis of the NAMEMD results.as the input of the subsequent analysis, such as BSS methods.Another interesting finding is that we can observe some repetitive signals whoseperiods are similar to the heart beat signals in the decomposition results of sensor 2and sensor 3 by NAMEMD, such as IMF10 from sensor 2 and IMF 7 from sensor3 shown in Fig. 2.6, even though these two sensor signals are polluted by the noiseheavily. However, we could not find that similarity in the decomposition resultsby EEMD. This maybe benefit from the inter-channel information considered byNAMEMD whereas EEMD is channel-wise method. In addition, it also suggeststhat we must put the sensor at the ideal location if we want to get the heart signalfrom only one sensor, otherwise it would be much more difficult or even impossibleto obtain that signal.42Figure 2.8: Frequency domain analysis of the EEMD results.Extracted Heart Beat Signals Based on the MCCAThe artifacts components can be selected and removed from the obtained IMF com-ponents. Then 12 IMF components (4 from each sensor) are used to be the inputs ofthe blind source separation techniques, including MCCA and ICA. For the case ofMCCA, those IMF components from 3 sensors form 3 datasets, acting as the inputof the MCCA. However, for the case of ICA, we combine these 12 IMF compo-nents together and try to extract the heart beat signal from the noisy signals by theFastICA. Then the underlying source signals resembling the heart beat signals areselected and displayed. The separation results of NAMEMD-MCCA and some pre-vious methods including EEMD-ICA and EEMD-MCCA are compared in Fig.2.9.Generally, these three techniques can recover the heart beat signals and the heart43Figure 2.9: Extracted heart beat signals when employing three differentmethods.beat rates can be retrieved from them. However, the results of EEMD-ICA andEEMD-MCCA seem to convey more noise. As we can see from the 2nd subfigureof Fig.2.9, there are stark disparities between the magnitudes of different cycles,which may obscure the physical meaning of the heart beat signal. As to the resultof EEMD-MCCA method, the gaps between peaks and base line are too small andthis may increase the difficulty to detect the peaks sometimes. In contrast, the ex-tracted signal of NAMEMD-MCCA highly resembles the heart beat signal. Thesimilarity between them also suggests the effectiveness of the proposed two-stepstrategy to recover the underlying source signal.44Table 2.1: The estimated RHBR results when employing three differentmethods.Task NAMEMD-MCCA EEMD-ICA EEMD-MCCAFinger Flex 67.6 68.5 65.4Fix type 66.1 62.5 65.1Fix type key1 70.7 60.3 60.9Fix type1 69.6 69.2 69.7Fix type2 68.2 71.1 68.3Fix writing 69.9 68.7 71.3Fix writing1 68.5 65.7 66.1Fix writing2 63.6 64.6 63.8Wrist Flex 70.9 72.1 71.4Fix Sit 72.1 70.1 71.1Fix Stand 70.8 70.1 68.8AVG* 68.9 67.5 67.4STD* 2.47 3.75 3.44* AVG and STD represent the average and the standard deviation re-spectively.Heart Beat EstimationWe apply the heart beat rate estimation method mentioned in Algorithm 3 to theextracted heart beat signals of 11 tasks and the estimated results are shown in Table1. To evaluate the reliability of the results, we also asked an experienced physicianto measure the pulse rate of the subject during each task manually, from whichthe average resting pulse rate of the subject derived to be 70 bpm. For the pro-posed method, the average of the RHBR results is closer to the ground truth andthe related standard variation is 2.47, which is the smallest one among the threevariations.2.4 DiscussionsIn this chapter, we explore the potential of using novel nanofiber based strain sen-sors for monitoring heart beat signals. Three NF sensors are attached to different45locations on the wrist. These sensors can cover a large area to ensure getting goodsignals from the right position. Even though the acquired signals may be contam-inated, it is still possible to recover the heart beat signal from multi-sensors byexploring the inter-channel information. The nano-sensor signals are recorded un-der different conditions. Similar to other types of biomedical measurement tools,detecting and extracting source signals from back ground noises is challenging forwearable technologies. The proposed NAMEMD-MCCA method provides supe-rior performances for detecting RHBR from the measured nano-sensor data, as itmainly benefits from the appropriate combination of the NAMEMD and MCCA.At the first step, the NAMEMD works as adaptive filter banks with scale align-ment of channels and can overcome the limitations of classical channel-wise EMDmethods, such as standard EMD and the EEMD which can only cater for univari-ate signals, when dealing with data from multiple sources. If multiple signals areprocessed separately, the EEMD fails to align the frequency responses from IMFswith same index of multichannel signals, making the subsequent analysis meaning-less [90]. However, the NAMEMD method is able to accurately align the commonoscillatory modes in corresponding IMFs from multivariate signals, and thereforeproviding an intuitive and effective way for the analysis of narrowband but non-stationary rhythms from the multichannel nanofiber based strain sensor signals. Inaddition, unlike the EEMD, the noise in NAMEMD is not added directly to theoriginal signal. Instead it is kept in separate channels of a multivariate signal. Inthis way, the physically disjointed input signal and the added noise can alleviate theeffects of noise interference, a unique property of NAMEMD. Compared with theother two existing algorithms including EEMD-ICA and EEMD-MCCA, the bet-ter performance of the proposed method is credited to the use of the inter-channelinformation, enhanced localization properties and increased robustness to artifactsof the NAMEMD method.46At the second step, the IMFs from each sensor whose dominant frequenciesare close to that of typical heart beat signals are analyzed further by the follow-ing BSS methods. MCCA can be used to efficiently separate the source signals.Under certain assumptions, the group of corresponding sources from each datasetcan be jointly extracted by MCCA through maximizing the correlations among theextracted sources [20, 72]. Yet, ICA always aims to find a linear projection of themultivariate signals that maximize the mutual independence. Comparing with ICAwhich is based on high order statistics (HOS), MCCA uses the second order statis-tics and thus yields higher computational efficiency. For the health monitoringsystem, people always desire a direct feedback, and thus ensuring a lower com-putational complexity is often of great importance. In addition, the ICA methodalways fails to converge which may render that some source signals still are mixedwith noisy components. A few comparisons have shown that MCCA are morecompetitive in solving multivariate blind source separation problems [72].After obtaining the heart beat signal, the HBR estimation method can be ap-plied to detect the subject’s RHBR. It should be noted that the selection of theheart beat signal from the separation results of BSS methods is based on visualinspection. In real-world application, a selection criteria or an automated selectionmethod is needed. We are currently working on this issue based on some machinelearning strategies.2.5 ConclusionsThe ability to accurately detect flexions and extensions is critical for wearablehealth monitoring. In this chapter, we propose performing radial pulse sensingusing three nanofiber based strain sensors, and we aim to extract the heart beatsignal and estimate the corresponding RHBR from the recorded NF-based Sen-sor data. Since the type of nanofiber-based sensors used in this chapter is novel47and its application to heart beat rate estimation is also new, we need to investigatewhich approach is more appropriate to analyze this new type of nano sensor signalsfor estimating the heart beat rate. In addition to investigating two state-of-the-artEMD-BSS based methods, to better address the artifact removal challenge for ourspecific application, a novel framework based on the NAMEMD and MCCA isproposed to extract the heart beat signal from the multi-channel NF-based sensorsignals. It is shown that the proposed method provides the best performance, espe-cially for preserving the shape of the heart beat signal.This chapter is the first work during my PhD study. It was motivated by theparticular application to extract the heart beat signal from the nanofiber sensorsignals. We specifically designed this method for this application. However, it isstill applicable to the general cases when: 1) the number of observations is smallerthan that of sources; 2) there is cross correlation between each pair of observations.48Chapter 3Underdetermined Joint BlindSource Separation for 2 datasetsbased on Tensor DecompositionIn this chapter, we aim to jointly separate the underdetermined mixtures of latentsources from two datasets, where the number of sources exceeds the number ofobservations in each dataset. Currently available BSS methods, including JBSS andUBSS, cannot address this underdetermined problem effectively. We exploit thesecond-order statistics of observations and introduce a novel blind source sepa-ration method, termed as UJBSS-2. Considering the dependence information be-tween two datasets, the problem of jointly estimating the mixing matrices is tack-led via CPD of a specialized tensor in which a set of spatial covariance matricesare stacked. Furthermore, the estimated mixing matrices are used to recover thesources from each dataset separately. Numerical results demonstrate the compet-itive performance of the proposed method when compared to a commonly usedJBSS method, multiset canonical correlation analysis (MCCA), and the single-set49UBSS method, UBSS-FAS.3.1 Motivation and ObjectivesWith the advances of sensor technologies, we now have access to a large amount ofmultiset or multimodal data which needs to be jointly analyzed to extract the latentand physiologically meaningful components [5, 17, 121, 122]. Hence, JBSS algo-rithms which can jointly retrieve the source components are of great interest. JBSSexploits the dependence information across datasets and generally could yield bet-ter performances than that of single-set blind source separation (BSS) methodsapplied to each dataset separately [69]. JBSS also can keep the extracted compo-nents aligned across different datasets, an important feature that is not provided bysingle-set BSS methods.Numerous models have been introduced to generalize the idea of single-setBSS to JBSS. For example, ICA has been extended to handle multiple datasets[5]. Group ICA and joint ICA attempt to concatenate multiple datasets into onedataset in the vertical and horizontal dimension respectively. Then the standardICA can be performed on the concatenated single dataset. Independent VectorAnalysis (IVA) generalizes ICA to multiple datasets by exploring the statisticaldependence across datasets [5]. In addition, CCA has been popular to analyzerelationships between two sets of variables [56]. It seeks a linear transformation ofthe observations such that the obtained corresponding source components acrosstwo datasets are maximally correlated. A generalization of CCA from two datasetsto multiple datasets, MCCA, is shown to be flexible and powerful for discoveringassociations across multiple datasets [72]. Another recent extension of CCA is thejoint diagonalization of many cross-cumulate matrices [69], which is especiallyeffective when there is no explicit source distribution known in advance [69] .It is worth noting that the above mentioned JBSS algorithms were originally50proposed for the determined case, since they generally assume that the number ofsources is equal to or less than that of observations. This assumption may not betrue in some practical applications, due to concerns such as the cost or time issues[60]. However, to our best knowledge, in the current literature there is no JBSSmethod specifically designed for the underdetermined case (i.e., the number ofsources is greater than that of observations), even though there are some single-setunderdetermined BSS (UBSS) methods [31, 62, 116] which can be used to separatethe mixtures from each dataset separately. To fill this gap, in this chapter, we planto extend the idea of JBSS to the underdetermined case for two datasets.More specifically, inspired by the CCA model and the simultaneous matrixdiagonalization [31], we exploit the second-order statistics of the observations intwo datasets and propose a novel BSS method, referred to as the UJBSS-2. Itis based on tensor decomposition which is attractive for our problem due to itsability to estimate the mixing matrix in UBSS for a single dataset[122]. Unlikethe traditional (over)determined JBSS methods, the proposed UJBSS consists oftwo steps. Firstly, the mixing matrices are jointly estimated through a specializedtensor decomposition of the set of spatial covariance matrices of the observations.This step is the main emphasis of this chapter. Then the estimated mixing matricesare used to recover the sources from each dataset. In this work, we employ a noveltime-frequency (TF) analysis method [116] to recover the sources.3.2 Problem Formulation and Proposed Method3.2.1 Signal Generation Model and Problem StatementThe problem of interest here is the underdetermined JBSS problem for 2 datasets.For the case of UJBSS for 2 datasets, the observed vector of each dataset containsthe linear mixtures of the corresponding N sources. We can model the mixing51process as follows,X [k] = A[k]S[k]+ e[k], k = 1,2 (3.1)where X [k] = [x[k]1 ,x[k]2 , . . . ,x[k]M ]T represents the M-dimensional observations, s[k] =[s[k]1 ,s[k]2 , . . . ,s[k]N ]T means the underlying N-dimensional sources which are assumedto be mutually uncorrelated, A[k] ∈ RM×N with M < N (i.e., the underdeterminedcase) represents the unknown mixing matrix for dataset k, E [k] means the possibleadditive noise which is generally assumed to be zero mean, temporally white anduncorrelated with the source signals.The problem under consideration is to estimate the two mixing matrices A[1]and A[2] jointly up to permutation and scaling. In order to demonstrate the perfor-mance of the proposed mixing matrices estimation method, we also implement anapproach for further source recovering based on the estimated A[1] and A[2].3.2.2 Mixing Matrices EstimationWe have the following assumptions concerning the sources:(1) The sources are uncorrelated within each dataset.E{s[k]i (t)(s[k]j (t+ τ))H}= 0∀τ, 1≤ i 6= j ≤ N,k = 1,2,(I)where τ represents the time delay and ·H denotes the complex conjugate transpose.(2) The corresponding sources from two different datasets have non-zero cor-relations and the sources with different indices across datasets are not correlated.D[τ] =E{S[1](t)(S[2](t+ τ))H}=diag[ρ1(τ), ...,ρN(τ)].(II)Considering the correlations within and between the two datasets, we have the52spatial covariance matrices of the observations with delays, e.g., C[l], satisfy,C[1] = E{X [1](t)X [2](t+ τ1)H}= (A[1])D[τ1](A[2])H ,C[2] = E{X [1](t)X [2](t+ τ2)H}= (A[1])D[τ2](A[2])H ,...C[L] = E{X [1](t)X [2](t+ τL)H}= (A[1])D[τL](A[2])H ,(3.2)in which τl means the time delay and the matrix D[τl ] = E{S[1](t)S[2](t + τl)H} isdiagonal, for l = 1,2, ...,L.We stack the matrices C[1],C[2], ...,C[L] in a tensor C ∈ RM×M×L as follows:(C )i, j,l = (C[l])i, j, i = 1,2,...,M, j = 1,2,...,M, l = 1,2,...,L. We define the matrix D ofsize L×N with the element Dl,n = (D[τl ])n,n, for l = 1,2,...,L, n = 1,2,...,N. Then wecan represent C as:C =N∑n=1a[1]n ◦a[2]n ◦dn, (3.3)in which the symbol ◦ denotes the tensor outer product operation, and a[1]n and a[2]nare the nth column of the mixing matrices A[1] and A[2] respectively. With this ten-sor format, now the problem of estimating the mixing matrices A[1] and A[2] canbe reformulated as a problem of CPD, which aims to decompose a higher-ordertensor as a linear combination of a minimal number of rank-one tensors [61, 120].In (3.3), each rank-one tensor is associated with a single source. Therefore, themixing matrices A[1] and A[2] can be estimated via the unique CPD of C . Actu-ally, the power of CPD mainly stems from its uniqueness property. The uniquenessmeans that the decomposition is the only possible combination of rank-one tensorswhich sum to the objective tensor with the exception of permutation and scaling.The uniqueness is based on the rank of tensors. De Lathauwer et al. have studieddifferent methods to determine the rank of a tensor and concluded that the de-53composition of a three-order tensor is unique [30, 61] if the number of sources NsatisfiesN ≤ L and N(N−1)≤M2(M−1)2/2. (3.4)The CPD of the tensor can be calculated by minimizing the Frobenius normof the difference between the original tensor and its estimated results using theAlternating Least Squares (ALS) algorithm due to its programming simplicity andpopularity. In this chapter, the Enhanced Line Search (ELS) is used to enhance theconvergence of the ALS [31, 88]. It should be mentioned that the choices of τ1,τ2, . . . , τL may affect the estimation precision of the mixing matrices. Here, weheuristically choose the time delay as τl ∈ [0,60] samples. If τl is too large, thecorrelation between two related sources with delay will be close to 0 and then thematrix D might be ill conditioned. It is desired to select τ1, τ2, . . . , τL such thatD is well conditioned. In addition, if the time delay τ is too large, the covariancematrix of the sources in two datasets (i.e., D[τ]) will be close to a null matrix andthus the assumption (II) may not hold.3.2.3 Source Extraction Based on The Estimated Mixing MatricesA complete JBSS approach consists of both mixing matrix estimation and sourceextraction, even though our main focus in this chapter is the estimation of mixingmatrices. In order to demonstrate the performance of the proposed mixing matri-ces estimation method, we adopt a recently-developed TF analysis method [116]to recover the latent sources based on the estimated mixing matrices. For any un-derdetermined and instantaneous mixing model, priori knowledge is essential torestore the latent sources. In this chapter, we assume that the number of sourcessatisfies N ≤ 2M− 1. WVD [12, 112] is firstly applied to the observations andsparser representation in the TF domain can be achieved. For any time series sig-54nal with finite-energy, such as y1(t) and y2(t), the auto WVD is defined asWy1(t, f ) =∫ ∞−∞y1(t+ τ/2)y∗1(t− τ/2)e− j2pi f τdτ (3.5)and the cross WVD between the two signals is defined asWy1,y2(t, f ) =∫ ∞−∞y1(t+ τ/2)y∗2(t− τ/2)e− j2pi f τdτ. (3.6)For the linear mixing model shown in (3.1), we can haveW x[k](t, f ) = A[k]W s[k](t, f )(A[k])H , k = 1,2. (3.7)In order to extract the auto WVDs of the sources from the known mixtures,it is vital to identify the auto-term TF points where Ws[k]nshows energy concen-tration. The high TF concentration property of WVD and the sparsity of thesources guarantee that the auto-term TF points of the sources are dominant amongall the TF points [116]. Therefore, we can collect enough information and thenrecover the sources based on the WVD values of all auto-term TF points. Ac-cording to the definition of auto-term TF points, the spatial time frequency distri-bution (STFD) matrix of the sources at any auto-term TF point (ta, fa) is diago-nal. The WVD value of the sources α at auto-term point (ta, fa) can be written as[Ws[k]1 ,s[k]1(ta, fa),Ws[k]2 ,s[k]2(ta, fa), . . . ,Ws[k]N ,s[k]N(ta, fa)]T . Here α satisfies(A[k]A[k])α = rvec(W X [k](ta, fa)), (3.8)where  represents the Khatri-Rao product and rvec means the vectorization of amatrix. The auto WVD value of the sources at each auto-term TF point, denotedas α , can be uniquely determined based on (3.8) under the following assumptions:55any M columns of the mixing matrix A[k] are linearly independent; the number ofsources satisfies the condition N ≤ 2M−1 as in [116], which is more stringent than(3.4). Here, we adopt an auto-term TF points searching strategy proposed in [116]and reconstruct the sources in the time domain based on the auto WVD values atall auto-term TF points.The major steps of the proposed UJBSS algorithm are summarized in Algo-rithm 5.Algorithm 5 The UJBSS algorithm for two datasetsInput: M-dimensional observations X [1] and X [2].Output: the estimated mixing matrices A[1], A[2] and the recovered N-dimensionalsources s[1], s[2].1: Construct a three-order tensor C as in (3.2);2: Calculate the CPD with N components for the tensor C and estimate the mix-ing matrices A[1] and A[2] as in (3.3);3: Calculate the WVD values of the observations W X [1](t, f ) and W X [2](t, f ) as in(3.5) and (3.6);4: Calculate the auto WVD values of the sources at auto-term TF pointsW S[1](t, f ) and W S[2](t, f ) based on the estimated mixing matrices as in (3.8);5: Recover the sources in the time domain.3.3 Numerical StudyIn this section, we present two simulation studies, where different types of sourcesare considered, to demonstrate the separation performance of the proposed UJBSSmethod for two datasets. Two performance indices are used to evaluate the perfor-mance of the proposed method. One is the estimation error, defined as:Error = 10log10{mean( ||A− Aˆ||||A|| )} (3.9)56where Aˆ denotes the optimally ordered estimate of A. The other one measures thePearson product-moment correlation coefficient (PPMCC) between the estimatedsources and the original sources, which is defined asPPMCC =cov(sn, sˆn)σsnσsˆn(3.10)where sˆn means the estimate of the source sn, cov means the covariance betweentwo variables and σ means the standard deviation. In order to ensure the depen-dence between the sources of the two datasets, the sources are synthesized as fol-lows,S[1] =[s[1]1 ,s[1]2 , . . . ,s[1]N ]T ;S[2] =[s[2]1 ,s[2]2 , . . . ,s[2]N ]T=S[1].∗ (uni f rnd(0,1,S[1])(3.11)where uni f rnd(0,1,S[1]) generates a vector with the same size of S[1] and eachelement of the vector is randomly drawn from the continuous uniform distributionon the interval (0,1). The average correlation between the source s[1]n in the firstdataset and the corresponding source s[2]n in the second dataset is about 0.85, whichcan be regarded as highly correlated.3.3.1 Simulation 1: Audio SignalsThe sources used in this simulation include 8 audio signals, such as a piece ofsound from the cable news network (CNN) and a piece of sound of an anonymoussinger. The mixing matrices are generated randomly with elements following theuniform distribution U [−1,1]. For simplicity, each column of the mixing matricesis normalized into a unit vector. In our first setting, 6 sources are mixed into 4observations in each dataset and the corresponding sources in the two datasets arecorrelated. With different signal-to-noise ratios (SNRs), we compare the proposed57Figure 3.1: Performance of the proposed UJBSS method. (a) Performancecomparisons between the proposed UJBSS method and the single-setSOBIUM method. Here the number of sources N = 6 and the numberof observations M = 4. (b) Estimation error of A[1] when employingthe proposed UJBSS method. Here the number of sources N = 8 andthe number of observations M varies from 4 to 8. Similar results areobserved for A[2].UJBSS method with a commonly-used single-set UBSS method, SOBIUM [31],when it is applied to each dataset separately. We repeat the simulation 500 timesand the performance is shown in Fig.3.1(a). Benefiting from the dependence in-formation between two different datasets, the proposed UJBSS can provide moreaccurate estimation of the two mixing matrices, while SOBIUM neglects the pos-sible inter-dataset information. We also note that the Error measure from eitherthe UJBSS or the UBSS decreases with the increase of the SNR.We also test the performance of the proposed method as the number of theobservations increases from 4 to 8, while the number of sources is set to be 8.As noted in Fig.3.1(b), the estimation performance is getting better when moreobservations are available. Furthermore, we also test the performance when thecondition (3.4) is not satisfied. For instance, with M = 3 and N = 8, as expected,the proposed method cannot estimate the mixing matrices correctly. We observethat the Error can be close to 0dB even when the SNR is larger than 40dB.58Figure 3.2: The original sources and extracted sources from the first dataset.(a) Simulation 1. (b) Simulation 2. The top subfigures are the originalsources and the bottom ones are the extracted sources. Similar resultsare observed for the second dataset.We recover the latent sources from each dataset based on the estimated mixingmatrices. In an illustrative example, we linearly mix 4 sources into 3 observationsin each dataset. Fig.3.2(a) shows the separation results of the first dataset in thetime domain. The top four subfigures of Fig.3.2(a) represent the original sourcesand the bottom four subfigures are the recovered sources from the proposed UJBSSmethod. Considering the fact that SOBIUM only can estimate the mixing matri-ces, we further compare the proposed method with a single-set UBSS method, theUBSS-FAS [116], and the JBSS method MCCA [72], in term of the PPMCC be-tween the original sources and the recovered sources. The performance results ofthese three methods are reported in Table 3.1. Although adopting the same technol-ogy in extracting sources, the performance of the proposed method is significantlybetter than that of the single-set UBSS-FAS method. This observation supportsthe importance of estimating mixing matrices accurately. MCCA, which has beensuccessfully used in many fields [27], assumes that the number of sources is equalto the number of observations in each dataset and it could not be used to sepa-59rate sources in the underdetermined case directly. We add one observation in eachdataset so that MCCA can be applied. Therefore it is not really a fair setting andcomparison to the proposed method. However, we note that the performance ofMCCA is not as good as that of the proposed method, even with an additional ob-servation signal. The following reasons could contribute to the worse performanceof MCCA: it is mainly due to the fact that the correlation coefficients betweensources in two datasets are quite close [20, 72]; the performance of MCCA maysuffer from the error accumulation of the deflation-based separation methods [69].Table 3.1: PPMCC performance results in Simulation 1.Methods s1 s2 s3 s4Dataset 1UJBSS 0.935 -0.809 0.840 0.924UBSS-FAS[116] -0.478 0.892 -0.224 0.943MCCA∗[72] 0.717 0.622 0.774 0.649Dataset 2UJBSS 0.866 -0.913 0.919 0.727UBSS-FAS[116] -0.384 -0.867 -0.003 -0.629MCCA∗[72] 0.711 -0.623 0.767 0.650∗We add one additional observation in each dataset when we evaluate the MCCA.3.3.2 Simulation 2: Physiological SignalsIn this experiment, we employ four physiological signals as sources, includingelectrocardiogram (ECG), EEG, electrooculography (EOG) and EMG from a pub-licly available database [45]. The sources corresponding to the second dataset aregenerated following (3.11). We get similar result as in the first simulation. Here,we just show the performances in term of the estimated sources in the time do-main and the PPMCC results between the original sources and the estimated ones.Fig.3.2(b) shows the separation results of the four physiological signals. Althoughthey still seem noisy, the estimated sources from the proposed UJBSS are highly60correlated with the original ones. As shown in Table 3.2, the proposed methodyields promising results when it is used to separate the latent and underdeterminedmixtures from two datasets. Compared to the classical JBSS method MCCA, theproposed UJBSS approach needs fewer number of observations in each dataset,while it still can provide a competitive performance.Table 3.2: PPMCC performance results in Simulation 2.Methods ECG EEG EOG EMGDataset 1UJBSS 0.778 -0.920 0.950 0.807UBSS-FAS[116] -0.544 -0.908 0.222 0.450MCCA∗[72] 0.752 0.885 -0.798 -0.723Dataset 2UJBSS 0.793 0.874 0.857 0.784UBSS-FAS[116] -0.494 -0.751 0.393 -0.534MCCA∗[72] 0.761 -0.885 -0.807 -0.714∗We add one additional observation in each dataset when we evaluate the MCCA.3.4 ConclusionIn this work, we exploit the spatial covariance of the observations in two datasetsand present a novel UJBSS method to jointly estimate the mixing matrices fromtwo datasets when the number of observations is smaller than that of the sources(i.e., the underdetermined case). The mixing matrices are accurately estimatedthrough CPD of a specialized tensor in which a set of covariance matrices arestacked. Further the sources are recovered based on the estimated mixing matri-ces. Numerical results demonstrate the competitive performances of the proposedmethod when compared to the commonly used JBSS method and the single-setUBSS method.This Chapter represents the second work in my PhD study. To the best of ourknowledge, there is no any existing JBSS method which can work in the underde-61termined case. Inspired by CCA, we explore the dependence information betweentwo datasets and propose a novel strategy named as UJBSS for two datasets. Weutilize the physiological signals, including audio signals and biomedical signals, ascase studies. However, it is a general method for recovering the underlying sourcesfrom two datasets in the underdetermined case.62Chapter 4Underdetermined Joint BlindSource Separation of MultipleDatasetsIn this chapter, we tackle the problem of jointly separating instantaneous linear un-derdetermined mixtures of latent sources from multiple datasets, where the numberof sources exceeds that of observations in each dataset. Currently available BSSmethods, including JBSS and UBSS, cannot address this underdetermined problemeffectively. We exploit second-order statistics of observations and present a novelblind source separation method, referred to as UJBSS-M, as a generalization of ourprevious work on two datasets [124]. In this work, the cross correlation betweeneach pair of datasets is modeled by a third-order tensor in which a set of spatial co-variance matrices corresponding to different time delays are stacked. Consideringthe latent common structure of these constructed tensors, the mixing matrices arejointly estimated via joint canonical polyadic decomposition of these specializedtensors. Furthermore, we recover the sources from each dataset separately based on63the estimated mixing matrices. Simulation results demonstrate that the proposedUJBSS-m method yields superior performances when compared to commonly usedsingle-set UBSS and JBSS methods.4.1 Motivation and ObjectivesThe increasing availability of multiset and multimodal signals has posed new chal-lenges for conventional blind source separation (BSS) methods which are origi-nally designed to analyze one data set at a time. There are many applications in-volving multiple datasets which have dependence relationships between them andneed to be jointly analyzed [4, 44, 122], such as EEG, ECG, and MRI data. Hence,JBSS algorithms have attracted great interest in the fields of signal processing ow-ing to their ability to simultaneously recover the underlying and physiologicallymeaningful components from multiple datasets. The crucial difference betweenBSS and JBSS is reflected by the fact that BSS only examines each dataset sepa-rately, whereas JBSS generalizes BSS to consider the dependence across multipledatasets [22]. Compared with conventional single-set BSS methods, JBSS gen-erally could yield better performances. In addition, JBSS can keep the extractedcomponents aligned across different datasets, an important feature that is not pro-vided by single-set BSS methods.The most original JBSS method was likely CCA, which has been popular toanalyze relationships between two sets of variables [56]. It seeks a linear trans-formation of the observations such that the obtained corresponding source compo-nents across two datasets are maximally correlated. A generalization of CCA fromtwo datasets to multiple datasets, MCCA was shown to be flexible and powerful fordiscovering associations across multiple datasets [72]. Another recent extension ofCCA is the joint diagonalization of many cross-cumulate matrices [69], which isespecially effective when there is no explicit source distribution known in advance64[69]. In addition to CCA-type algorithms, numerous models have been introducedto generalize the idea of single-set BSS to JBSS. For example, ICA has been ex-tended to handle multiple datasets [5]. The group ICA and the joint ICA attempt toconcatenate multiple datasets into one dataset in the vertical and horizontal dimen-sion respectively, and then the standard ICA can be performed on the concatenatedsingle dataset [5]. Based on the modular Bayesian framework, Groves et al. pro-posed a novel Linked ICA [48], encapsulating the ideas from both the group ICAand joint ICA. IVA generalizes ICA to multiple datasets by exploring statisticaldependences across datasets [5].It is worth noting that the above mentioned JBSS algorithms were originallyproposed for the determined case, since they generally assume that the numberof sources is equal to or less than that of the observations. This assumption maynot be true in some practical applications, due to concerns such as the cost ortime issues [60]. However, to our best knowledge, in the current literature there isonly very limited work on JBSS methods specifically designed for the underdeter-mined case (i.e., the number of sources is greater than that of observations), eventhough there have been more single-set underdetermined BSS (UBSS) methods[31, 59, 62, 91, 115, 116, 119] which can be used to unmix the mixtures from eachdataset separately. These UBSS methods can be divided mainly into two categories[104]. Most UBSS methods rely upon the sparsity of source signals in a specificdomain, e.g., the time-frequency domain [91, 115, 116]. This category of methodsusually require exhaustive computation, especially when the number of sources islarge. Many algebraic methods were also proposed for unmixing the mixtures inthe underdetermined case, most of which are based on decomposition of differentdata structures, e.g., covariance matrices[31].In our previous paper [124], we proposed an underdetermined joint blind sourceseparation method for two datasets (UJBSS-2) based on the decomposition of a65specialized tensor. However, it can only jointly estimate the mixing matrices fromtwo datasets and cannot be extended directly to unmix the mixtures from multiple(larger than two) datasets [125]. To fill this gap in the literature, in this chapter,we plan to extend the idea of JBSS to the underdetermined case and generalize theidea of underdetermined joint blind source separation (UJBSS) for two datasets tothat for multiple datasets.More specifically, inspired by the MCCA model and the simultaneous diag-onalization of covariance matrices [26, 31, 46], we exploit second-order statis-tics of the observations in each pair of datasets and propose a novel BSS method,termed as the underdetermined joint blind source separation for multiple datasets(UJBSS-m). Unlike the traditional (over)determined JBSS methods, the proposedUJBSS-m consists of two steps: 1) jointly estimate the mixing matrices from mul-tiple datasets, and 2) recover the underlying sources individually based on eachmixing matrix estimated in step 1). The most challenging task is to estimate theunknown mixing matrices precisely, which is the main concern of this chapter. Inthis work, this problem is tackled via joint canonical polyadic decomposition ofspecialized tensors. The dependence information between each pair of datasets ismodeled by a third-order tensor where a set of spatial covariance matrices relatedto different time delays are stacked. Considering the possible combinations of twodatasets, the pairs of the corresponding tensors share a common factor and thenthe mixing matrices (i.e., factor matrices of those tensors) can be jointly estimatedby optimization-based methods. The estimated mixing matrices are further used torecover the sources from each dataset. In this work, we explore a novel subspacerepresentation based method [116] to recover the sources.Our main contributions are summarized as follows:1) This chapter extends the idea of (over)determined JBSS to that of the under-determined case.662) Exploiting the cross correlation between each pair of datasets, we proposea novel and effective method to jointly estimate the mixing matrices for multipledatasets. More precise estimates of the mixing matrices can be achieved via theproposed UJBSS-m method compared to several classical single-set UBSS meth-ods and JBSS methods.3) The proposed UJBSS-m method can be used to solve single set UBSS prob-lems and could achieve better performance in some cases, as demonstrated in thepromising application of noise enhanced signal processing.4) The proposed UJBSS-m method does not rely upon the sparsity of signalsand therefore it can be applied to a wide class of signals, e.g., audio/speech andbiomedical signals.4.2 Notations and PreliminariesIn this chapter, we generally use the notation of [2], which was adapted by [57, 61].A tensor can be interpreted as multi-index numerical array, whereby the order of atensor is the dimensionality of the array. Scalars, denoted as lowercase letters, e.g.,x, are said to be tensors of zero order. Vectors (first-order tensors) are denoted byboldface lowercase letters, e.g., x. Matrices (second-order tensors) are denoted byboldface capital letters X . Third-order or higher-order tensors are denoted by bold-face Euler script letters, e.g., X . The transpose, inverse, Moore-Penrose pseudoinverse, norm are denoted by (·)T , (·)−1, (·)†, ‖ · ‖.The operation of matricization reorders the elements of a higher-order tensorinto a matrix. For example, mode-n matricization of a Nth-order tensor X ∈RI1×I2×···×IN yields a matrix X (n) ∈ RIn×(I1×I2×···×In−1×In+1×···×IN) whose columnsare all mode-n fibers arranged in a specifically predefined order. In this chapter,67tensor element xi1,i2,...,iN corresponds to matrix elements x(n)(in , j) , wherej = 1+N∑l=1l 6=n(il−1)(l−1∏k=1k 6=nIk). (4.1)The inner product of two same-sized tensors X ,Y ∈ RI1×I2×···×IN is the sumof the products of their elements, i.e.,<X ,Y >=I1∑i1=1I2∑i2=1. . .IN∑iN=1xi1,i2,...,iN yi1,i2,...,iN . (4.2)The Frobenius norm of a tensorX is the square root of its inner product with itself,i.e.,‖X ‖=√<X ,X >. (4.3)The outer product of vectors {a(n)} ∈ RIn ,n = 1,2, . . . ,N yields a rank-one tensorX = a(1) ◦ a(2) ◦ . . .a(N) with entries xi1,i2,...,iN = a(1)i1 a(2)i2 . . .a(N)iN , where the ◦ rep-resents the outer product operation. The superscript in parentheses represents oneelement in a sequence, e.g., a(n) represents the nth vector in a sequence of vectors.In order to demonstrate multi-way models, the usual matrix product, such asKronecker product and Khatri-Rao product, is not sufficient. A frequently usedoperation is the mode-n product, denoted by ×n. The mode-n product of a tensorX ∈ RI1×I2×···×IN with a matrix A ∈ RJn×In amounts to the product of all mode-nfibers with A and yields a tensor with the size of (I1×I2×·· ·×In−1×Jn×In+1 · · ·×IN), whose entries are given by(X ×n A)i1,i2,...,in−1, jn,in+1...,iN=In∑in=1xi1,i2,...,in−1,in,in+1...iN a jn,in .(4.4)68The mode-n product of a tensor and a vector is a special case of the mode-nproduct of a tensor and a matrix with the size of (1× In). Note that the order of theresult is (N−1), one less than the order of the original tensor. It is often useful tocalculate the product of a tensor with a sequence of vectors. LetX denote a tensorwith the size of I1× I2× ·· ·× IN , and let {a(n)} (n = 1,2, . . . ,N), be a sequenceof vectors, each with the length of In. Then the product of X with a sequence ofvectors in all modes yields a scalar, i.e.,y =X ×1 a(1)×2 a(2)×3 · · ·×N a(N)=I1∑i1=1I2∑i2=1. . .IN∑iNxi1,i2,...,iN a(1)i1 a(2)i2 . . .a(N)iN .(4.5)We refer the readers to [57, 61] for further details and discussions about varioustensor operations.4.3 Problem FormulationThe problem of interest here is the underdetermined JBSS for multiple datasets,e.g., K datasets. The M observations of each dataset contain the linear mixtures ofthe corresponding N sources. We can model the mixing process as follows,X (k) = A(k)S(k)+E(k), k = 1,2, . . . ,K. (4.6)X (k) = [x(k)1 ,x(k)2 , . . . ,x(k)M ]T denotes the M-dimensional observations with real val-ues and x(k)m is the mth channel of the observations in dataset k. S(k)= [s(k)1 ,s(k)2 , . . . ,s(k)N ]Tmeans the underlying N-dimensional sources with real values and s(k)n is the nthsource for dataset k. A(k) = [a(k)1 ,a(k)2 , . . . ,a(k)N ] ∈ RM×N with M < N (i.e., the un-derdetermined case) denotes the unknown mixing matrix, whose nth column a(k)ncorresponds to the source s(k)n for dataset k. E(k) means the possible additive noise69which is generally assumed to be zero mean, temporally white and uncorrelatedwith the source signals.Similar to several existing JBSS methods, e.g. MCCA [72] and JDAIG-SOS[69], we have the following assumptions regarding the sources:(1) The sources are uncorrelated within each dataset:E{s(k)i (t)(s(k)j (t+ τ))T}= 0∀τ, 1≤ i 6= j ≤ N, k = 1,2, . . . ,K,(I)where s(k)i (t) is the i-th source in dataset k and s(k)j (t+τ) represents the j-th sourcewith the time delay τ in dataset k.(2) The corresponding sources from two different datasets have non-zero cor-relations and the sources with different indices across datasets are not correlated:D(τ) =E{S(k1)(t)(S(k2)(t+ τ))T}=Diag(ρ1(τ),ρ2(τ), . . . ,ρN(τ)),(II)where Diag(·) represents the diagonal matrix, the ρn(τ)=E{s(k1)n (t)(s(k2)n (t+τ))T}denotes the covariance between s(k1)n (t) and s(k2)n (t + τ). This assumption meansthat the corresponding sources in multiple datasets are second-order correlated witheach other. In addition, the sources within [s(1)i ,s(2)i , . . . ,s(K)i ] are uncorrelated withthe sources within [s(1)j ,s(2)j , . . . ,s(K)j ] for 1≤ i 6= j ≤ N.The task of estimating the mixing matrices {A(k)} and retrieving the under-lying sources are not equivalent in the underdetermined case. Therefore, mostUBSS methods consist of two stages: estimate the mixing matrices first and thenretrieve the underlying sources. The major problem under consideration is to esti-mate {A(k)} jointly up to permutation and scaling. In this chapter, this problem isaddressed via a specially designed joint tensor decomposition. In addition, retriev-70ing the underlying sources when the mixing matrices are estimated or known is aclassic inverse problem [62]. In order to further demonstrate the performance ofthe proposed mixing matrices estimation method, we also implement an approachfor source recovering based on the estimated A(k).4.4 Canonical Polyadic Decomposition of TensorA polyadic decomposition aims to decompose a higher-order tensor as a linearcombination of rank-one tensors [61, 120]. For the case of a third-order tensorX ∈ RI×J×K , it can be written in the formX =N∑n=1an ◦bn ◦ cn, (4.7)where N is a positive integer and an ∈ RI , bn ∈ RJ , cn ∈ RK . Equivalently, it can bewritten element wisely asxi, j,k =N∑n=1ai,nb j,nck,n, (4.8)where i = 1,2, . . . , I, j = 1,2, . . . ,J and k = 1,2, . . . ,K. The rank of a tensor isthe smallest number of rank-one tensors that yield the tensor in the way as (4.7).If rank(X ) = N, (4.7) is the CPD of X , which is also known as the CanonicalDecomposition (CANDECOMP) or Parallel Factor Analysis (PARAFAC) [2]. Thecanonical polyadic approximation means thatX ≈ [[A,B,C]]≡N∑n=1an ◦bn ◦ cn,(4.9)where N = rank(X ).71The factor matrices refer to the combination of the vectors corresponding toeach rank-one tensor and can be written asA = [a1,a2, . . . ,aN ] ∈ RI×NB = [b1,b2, . . . ,bN ] ∈ RJ×NC = [c1,c2, . . . ,cN ] ∈ RK×N .(4.10)To a large extent, the power of CPD mainly stems from its uniqueness property.The uniqueness of CPD means that the decomposition is the only possible combi-nation of rank-one tensors which sum to the objective tensor with the exception ofthe indeterminacies of column permutation and scaling. The permutation indeter-minacy refers to the fact that we can permute the rank-one terms arbitrarily. Thescaling indeterminacy means that we can scale the individual column of the factormatrices as long as their product remains the same, i.e.,X =N∑n=1(α1n an)◦ (α2n bn)◦ (α3n cn) if α1nα2nα3n = 1. (4.11)The uniqueness condition is based on the rank of tensors. The most famous resulton uniqueness of CPD was reported by J. Kruskal [64]. Kruskal’s theorem statesthat the CPD of a third-order tensor X ∈ RI×J×K is deterministically unique if N(where N = rank(X )) satisfiesN ≤ kA+ kB+ kC−22, (4.12)where k· denotes the k-rank of a given matrix (·), meaning that k· is the largestinteger that any k· columns of the matrix (·) are linearly independent. Checkingdeterministic conditions can be cumbersome. De Lathauwer et al. have studieddifferent methods to determine the rank of a tensor and concluded that the de-72composition of a third-order tensor X ∈ RI×J×K is generically unique (i.e., withprobability one) [30] provided that N satisfiesN ≤ K and N(N−1)≤ IJ(I−1)(J−1)/2. (4.13)Domanov et al. further complemented the existing bounds for generic uniquenessof the CPD [36] and concluded that the CPD of a third-order tensor X ∈ RI×J×Kof rank N is generically unique if2≤ I ≤ J ≤ K ≤ NN ≤ I+ J+2K−2−√(I− J)2+4K2,(4.14)or3≤ I ≤ J ≤ N ≤ KN ≤ (I−1)(J−1).(4.15)There are two main approaches to compute the CPD of a tensor, namely thelinear algebra [35] and optimization based methods [2, 99]. Both types of methodshave their own strengths and weaknesses. For a thorough study of the uniquenessconditions and computation, we refer to [30, 34, 61] and the references therein.4.5 Algorithm for Estimating the Mixing Matrices inUJBSSHow to estimate the mixing matrix is still a challenging problem, even for under-determined case of single dataset. In this chapter, we propose a novel and effectivealgorithm to jointly estimate the mixing matrices from multiple dataset, which canbe regarded as an extension of the method based on statistical property of signals,e.g., simultaneous diagonalization of the second order autocovariance and CPD of73Figure 4.1: Illustration of how to generate tensors by incorporating the de-pendence information between each pair of datasets.a specialized tensor [31, 46, 124]. For ease of presentation, we take the case of 3datasets as an example, e.g., X (1), X (2) and X (3), and it can be easily generalizedto the case of more than 3 datasets. The problem is reformulated as joint canonicalpolyadic decomposition of a sequence of third-order tensors, which share commonfactor matrices. It should be mentioned that the proposed method is limited toreal-valued problems and cannot be directly generalized to complex-valued cases.4.5.1 Tensor ConstructionThe cross covariance of the observations with time delay τ , such as the observa-tions in dataset k1, X (k1)(t), and the observations in dataset k2 with time delay τ ,X (k2)(t+ τ), can be formulated asE{X (k1)(t)X (k2)(t+ τ)T}=(A(k1))E{S(k1)(t)S(k2)(t+ τ)T}(A(k2))T ,(4.16)where k1 and k2 represent the index of each dataset and range from 1 to 3. Con-sidering the correlations within and between each pair of datasets, the covariance74matrices between X (1) and X (2) with time delay τ satisfyP(1) = E{X (1)(t)X (2)(t+ τ1)T}= (A(1))U (τ1)(A(2))T ,P(2) = E{X (1)(t)X (2)(t+ τ2)T}= (A(1))U (τ2)(A(2))T ,...P(L) = E{X (1)(t)X (2)(t+ τL)T}= (A(1))U (τL)(A(2))T ,(4.17)in which τl means the time delay and the matrix U (τl) = E{S(1)(t)S(2)(t + τl)T} isdiagonal, for l = 1,2, ...,L.We stack the sequence of covariance matrices P(1),P(2), . . . ,P(L), denoted as{P(l)}, in a tensor P ∈ RM×M×L as follows: (P)i, j,l = (P(l))i, j, i = 1,2, . . . ,M,j = 1,2, . . . ,M, l = 1,2, . . . ,L. We define the matrix U of size L×N with theelement U l,n = (U (τl))n,n, for l = 1, 2, ..., L, n = 1, 2, ..., N. Then we can representP as (see Fig. 4.1):P =N∑n=1a(1)n ◦a(2)n ◦un, (4.18)in which a(1)n and a(2)n are the nth column of the mixing matrices A(1) and A(2)respectively, and un is the nth column of the matrix U .Similarly, the covariance matrix between the other two pairs of observationswith time delay τl , denoted as Q(l) and R(l) satisfyQ(l) = E{X (1)(t)X (3)(t+ τl)T}= (A(1))V (τl)(A(3))T ,R(l) = E{X (2)(t)X (3)(t+ τl)T}= (A(2))W (τl)(A(3))T ,(4.19)where V (τl) = E{S(1)(t)S(3)(t+τl)T} and W (τl) = E{S(2)(t)S(3)(t+τl)T} for l = 1,2, . . . , L. Stack these two sequence of covariance matrices {Q} and {R} in tensorsQ ∈RM×M×L andR ∈RM×M×L as follows: (Q)i, j,l = (Q(l))i, j, (R)i, j,l = (R(l))i, j,75i = 1, 2, ..., M, j = 1, 2, ..., M, l = 1, 2, ..., L. To simplify the notation, we furtherdefine the matrix V ∈ RL×N and W ∈ RL×N with the element V l,n = (V (τl))n,n andW l,n = (W (τl))n,n, for l = 1, 2, ..., L, n = 1, 2, ..., N. Then these two tensors can berepresented as (see Fig. 4.1):Q =N∑n=1a(1)n ◦a(3)n ◦ vn,R =N∑n=1a(2)n ◦a(3)n ◦wn,(4.20)in which a(k)n is the nth column of the mixing matrices A(k) for k = 1, 2, 3, vn andwn are the nth column of the matrix V and W respectively.It should be mentioned that the choices of τ1, τ2, . . . , τL may affect the esti-mation precision of the mixing matrices. If τl is too large, the correlation betweentwo related sources with the delay will be close to 0 and then the covariance matrixmight be ill conditioned. It is desired to select τ1, τ2, . . . , τL such that U , V and Ware well conditioned. In addition, if the time delay τ is too large, the covariancematrix of the sources in two datasets (e.g., U (τ)) will be close to a null matrix andthus the assumption (II) may not hold. Here, we heuristically choose the time delayas τl ∈ [0,200] data samples.Fig. 4.1 illustrates how to generate these tensors by incorporating the depen-dence information between each pair of datasets. It is worth noting that each pairof tensors share a common factor matrix, e.g., P and Q are coupled in the modeof A(1).4.5.2 Joint Tensor Polyadic DecompositionConsidering the common latent structure, now the problem of estimating the mix-ing matrices A(k) can be reformulated as a problem of joint CPD of a collection of76tensors, e.g., P , Q and R for the case of three datasets. There are two main ap-proaches to jointly decompose a sequence of tensors, i.e., linear algebra [101] andoptimization based methods [3, 100, 109]. Sørensen et al. took into account thecoupling between multiple tensors and developed a linear algebra based algorithm[101]. This method can provide an explicit solution for exact tensor decomposi-tion. However, in practice data are noisy and consequently the estimation may benot accurate. In addition, it is notable that the linear algebra based method requiresthe full column rank of the common factor matrices whereas the common factorsin our problem are rank deficient [101]. In this chapter, we generalize the idea ofcoupled matrix and tensor factorization (CMTF) and jointly decompose a sequenceof tensors via gradient-based optimization method [3, 100, 109].The uniqueness condition of the joint CPD is important in practice. Simplysaid, the solution of the joint CPD will be generic unique if all the individual CPDsare unique. In this chapter, we can get the unique solution of each mixing matrixgenerically, providing the number of sources satisfies the condition (4.14) or (4.15).It is worth mentioning that this uniqueness condition of the joint CPD might befurther relaxed, but the topic itself deserves a stand-alone theoretical paper and itis out of scope of the current paper.The aim is to find the factor matrices {A(k)} ∈ RM×N and the covariance ofsources between different datasets U ,V and W ∈ RL×N which can minimize thefollowing objective function, a variant of Frobenius norm of the difference betweenthe given tensors and their canonical polyadic approximation, written as77f (A(1),A(2),A(3),U ,V ,W )=12‖P− [[A(1),A(2),U ]]‖2︸ ︷︷ ︸f (1)(A(1),A(2),U)+12‖Q− [[A(1),A(3),V ]]‖2︸ ︷︷ ︸f (2)(A(1),A(3),V )+12‖R− [[A(2),A(3),W ]]‖2︸ ︷︷ ︸f (3)(A(2),A(3),W ).(4.21)where [[·]] denotes the canonical polyadic approximation of a given tensor. Thisequation simultaneously takes the coupling information between different tensorsinto account. We propose to solve this problem via a gradient-based optimizationmethod. Proposition 1 elaborates the partial derivative of the objective function fwith respect to each column of the desired matrices, i.e. {a(k)n }, un, vn and wn forn = 1,2, . . . ,N.The equations in Proposition 1 is proved in the Appendix. Then the gradient off can be assembled via stacking the partial derivatives with respect to each columnof the factor matrices, as∇ f =[∂ f∂a(1)1; ∂ f∂a(1)2; . . . ; ∂ f∂a(1)N; . . . ; ∂ f∂w1 ; . . . ;∂ f∂wN]T. (4.22)Once we get this gradient, we can calculate the factor matrices, including the mix-ing matrices and the covariance matrices, based on any first-order optimizationmethod. In this chapter, we employ the nonlinear conjugate gradient (NCG) algo-rithm implemented in [38] to solve the unconstrained optimization problem andestimate the mixing matrices of multiple datasets simultaneously. Compared withsecond-order optimization methods, such as Newton-based methods, NCG alwaysrequires less computation and memory[109].78Proposition 1. The partial derivative of the objective function f with respect toeach column of the desired matrices , i.e., {a(k)n }, un, vr and wn, are given by∂ f∂a(1)n=−P×2 a(2)n ×3 un−Q×2 a(3)n ×3 vn+N∑c=1[(a(2)n )T a(2)c (un)T uc+(a(3)n )T a(3)c (vn)T vc]a(1)c∂ f∂a(2)n=−P×1 a(1)n ×3 un−R×2 a(3)n ×3 wn+N∑c=1[(a(1)n )T a(1)c (un)T uc+(a(3)n )T a(3)c (wn)T wc]a(2)c∂ f∂a(3)n=−Q×1 a(1)n ×3 vn−R×1 a(2)n ×3 wn+N∑c=1[(a(1)n )T a(1)c (vn)T vc+(a(2)n )T a(2)c (wn)T wc]a(3)c∂ f∂un=−P×1 a(1)n ×2 a(2)n +N∑c=1[(a(1)n )T a(1)c (a(2)n )T a(2)c ]uc∂ f∂vn=−Q×1 a(1)n ×2 a(3)n +N∑c=1[(a(1)n )T a(1)c (a(3)n )T a(3)c ]vc∂ f∂wn=−R×1 a(2)n ×2 a(3)n +N∑c=1[(a(2)n )T a(2)c (a(2)n )T a(3)c ]wc.4.6 Source Extraction Based on the Estimated MixingMatricesUnlike the (over)determined case, the estimation of the mixing matrix is not equiv-alent to recovering the underlying sources in UBSS. A complete UBSS approachalways consists of both mixing matrix estimation and source extraction, even thoughour main focus in this chapter is the estimation of mixing matrices. Extracting thesources when the mixing matrix is estimated is a classic inverse problem. Manytechniques have already been proposed in the literature, including array process-ing techniques [107] and methods exploiting the sparsity of sources in a domain,e.g., the TF domain [116]. In order to demonstrate the performance of the proposedmixing matrices estimation method, we adopt a recently-developed subspace repre-sentation method [59] to recover the latent sources based on the estimated mixingmatrices. For simplicity, the proposed method for extracting sources is derived79without considering the background noise. However, it was shown to be robust tothe background noise [59].For any underdetermined non-homogeneous linear equation, the complete so-lution can be represented as the sum of its particular solution and a general solu-tion of the corresponding homogeneous equation. As to the case in this chapter,A(k)S(k) = X (k), the general solution of source S(k) can be written asS(k) = S(k)p +S(k)h, (4.23)where the S(k)p denotes its particular solution and S(k)h denotes a general solution ofthe corresponding homogeneous equation A(k)S(k) = 0. One particular solution ofthe above mentioned non-homogeneous equation isS(k)p = (A(k))†X (k), (4.24)where (A(k))† denotes the pseudo-inverse of the mixing matrix A(k). In addition,the general solution of the homogeneous equation A(k)S(k) = 0 can be expressed asS(k)h =V Z(k), (4.25)where V is an N ∗ (N−M) matrix whose columns are bases of the nullspace ofA(k) and Z(k) is an arbitrary matrix with the size of (N −M) ∗ T (T representsthe total number of samples in each channel) [102]. The basis matrix V can beobtained from the mixing matrix A(k) and then the problem which aims to estimatethe N dimensional observations boils down to the problem of estimating N −Mdimensional latent variable Z(k).In order to be applicable to a wide class of signals, such as audio and biologicalsignals EEG, EMG, the Generalized Gaussian Distribution (GGD) [58] is utilized80to model the source distributions. Mathematically, it is expressed in the followingequationpy(y;σ ,β ) =v(β )σexp{−c(β )|y−µσ|2/(1+β )}, (4.26)wherec(β ) = (Γ(3/2(1+β ))Γ(1/2(1+β )))1/(1+β )v(β ) =Γ(3/2(1+β ))1/2(1+β )Γ(1/2(1+β ))3/2,(4.27)in which Γ(·) is the Gamma function. σ is the standard derivation and µ is the meanof a continuous random variable y. In this chapter, the mean of source is assumed tobe 0. We define the parameter set θ = {β ,σ} for simplicity, where each componentof β = [β1, . . . ,βN ] and σ = [σ1, . . . ,σN ] correspond to each channel of the sources.The parameters of the GGD θ can be estimated to maximize the likelihood of theobserved mixtures X (k) based on Expectation-maximization (EM) algorithm. ThenZ can be obtained by sampling from p(Z(k)|X (k),θ) asZˆ(k) =1GG∑g=1Z(k)g , (4.28)where {Z(k)1 , . . . ,Z(k)G } are the G samples drawn from p(Z(k)|X (k),θ) using theMarkov Chain Monte Carlo (MCMC) method. Then we recover the underlyingsources based onSˆ(k)= (A(k))†X (k)+V Z(k). (4.29)The major steps of the proposed UJBSS-m algorithm are summarized in Algo-rithm 6. The number of time delays is 20 in default. The step size of time delays,i.e. τl+1−τl , is suggested to be 2 samples (corresponding to 0.25ms) and 5 samples(corresponding to 5ms) for audio signals and physiological signals respectively.81Algorithm 6 The UJBSS-m algorithm based on joint tensor decompositionInput: M-dimensional observations {X (k)} and the number of sources N in eachdatasets, for k= 1, 2,. . . ,K.Output: the estimated mixing matrices {A(k)} and the recovered N-dimensionalsources {S(k)}, for k= 1, 2,. . . ,K.STEP 1: For each pair of datasets, e.g., X (k1) and X (k2)(k1 6= k2), we calculatethe cross covariance matrices as (5.3) and stack them to construct a third-ordertensor as in Section 4.5.1. Considering the combination of datasets, we get(K2)tensors where each pair of tensors share a common factor matrix, as shown inFig. 4.1;STEP 2: Calculate the joint polyadic decomposition of the tensors constructedin step 1 via optimization based method and estimate the mixing matrices {A(k)}as in Section 4.5.2;STEP 3: Estimate the parameters of the Generalized Gaussian distribution basedon the EM algorithm.Initialize: initialize the parameter θ to some random values.E-step: calculate the expected value of the log likelihood function with re-spect to the conditional distribution of Z(k) given the observation X (k) under thecurrent estimate of θ . It can be expressed as Ep(Z(k)|X (k),θ ∗)(log(p(Z(k)|X (k),θ))),where θ ∗ means the parameter value got in the initialization or the previous M-step.M-step: update the parameter set θ to maximize the above expected value.The updated value isθ = argmaxθEp(Z(k)|X (k),θ ∗)(log(p(Z(k)|X (k),θ)))≈ argmaxθ1GG∑g=1log(p(Z(k)g |X (k),θ)),where {Z(k)1 , . . . ,Z(k)G } are G samples drawn from p(Z(k)|X (k),θ) based on theMCMC method.Iterate: iterate the E-step and M-step until convergence.STEP 4: Recover the sources S(k) based on the minimum mean-square errorcriterion as in Equation (4.29).824.7 Numerical Study for the Multiple Dataset CaseTo demonstrate the joint separation performance for multiple datasets, simulationsare performed on both audio and biological signals when applying the proposedUJBSS-m and several commonly used BSS methods. Two performance indicesare used to evaluate the separation performances. One is the estimation error of themixing matrices, defined as:Error = 10log10{mean( ||A− Aˆ||||A|| )}, (4.30)where Aˆ denotes the optimally ordered estimate of A. The other measures the Pear-son correlation coefficient (PCC) between the estimated sources and the originalones, which is defined asPCC(s(k)n , sˆ(k)n ) =cov(s(k)n , sˆ(k)n )σs(k)nσsˆ(k)n, (4.31)where sˆ(k)n means the estimate of the source s(k)n in the kth dataset, cov(·, ·) meansthe covariance between two variables and σ means the standard deviation. In orderto ensure the dependence between the sources of each pair of datasets, the sourcesare synthesized as follows,S(1) =[s(1)1 ,s(1)2 , . . . ,s(1)N ]T ;S(2) =S(1).∗ (uni f rnd(0,1,S(1))S(3) =S(1).∗ (uni f rnd(0,1,S(1)),(4.32)where uni f rnd(0,1,S(1)) generates a matrix with the same size of S(1) and eachelement of the matrix is randomly drawn from the continuous uniform distributionon the interval (0,1). The average correlation between the source s(1)n and the cor-83responding source s(k)n (k = 2,3) is about 0.85; the average correlation between thesource s(2)n and the corresponding source s(3)n is about 0.7, both of which can beregarded as highly correlated.4.7.1 Simulation 1: Audio SignalsThe sources used in this simulation include 8 audio signals, such as two pieces ofsound from the cable news network (CNN) news and a piece of sound of an anony-mous singer, all of which are publicly available1. The sampling rate is 8000Hz.The mixing matrices are generated randomly with elements following the uniformdistribution U [−1,1]. For simplicity, each column of the mixing matrices is nor-malized into a unit vector. Three datasets are generated following (4.32). In ourfirst setting, 5 sources are mixed into 4 observations in each dataset and the cor-responding sources in the different datasets are highly correlated. With differentsignal-to-noise ratios (SNRS)), we compare the proposed UJBSS-m method witha commonly-used single-set UBSS method, SOBIUM [31], when it is applied toeach dataset separately. We also test the performance of our recent work on UJBSSfor two datasets, UJBSS-2 [124], when two datasets are available, e.g., X (1) andX (2). We repeat the simulation 1000 times and the performance is shown in Fig.4.2. Results are given according to the SNR level in the range of -5dB - 40dB.Benefiting from dependence information between different datasets, the proposedUJBSS can provide more accurate estimation of the mixing matrices, while SO-BIUM neglects the possible inter-dataset information. Compared with UJBSS-2,the proposed UJBSS-m takes into account more dependence information, amongthree datasets rather than between two datasets, and yields better performance.We also note that the Error measure from the single set UBSS and UJBSS meth-ods decreases with the increase of the SNR. The proposed UJBSS-m consistently1http://research.ics.aalto.fi/ica/cocktail/sounds.html84provides the best results over the whole SNR range, suggesting the performancestability of the proposed algorithm.We also examine the performance of the proposed method with the decrementof the under-determinacy level, i.e., the number of the observations increases from4 to 7 while the number of sources is set to be 8. As noted in Fig. 4.3, the estima-tion performance is getting better when more observations are available. Besidesthe degraded estimation precision, a higher under-determinacy level also requireshigher computational complexity. The performance is getting better when the SNRis increased from -5dB to 20dB. The change of estimation error is not obviouswhen the SNR is greater than 20dB even there are some fluctuations. It is shownthat the estimation performance relies upon several factors such as the noise levelSNR, under-determinacy level (i.e., the number of sources for a given number ofsensors) and the correlation between each pair of datasets.We recover the latent sources from each dataset based on the estimated mixingmatrices. In an illustrative example, we linearly mix 4 audio sources into 3 obser-vations in each dataset. Fig. 4.4 shows the separation results of the first dataset inthe time domain. The top four subfigures of Fig. 4.4 represent the original sources,the middle three subfigures are the mixed observations and the bottom four sub-figures are the recovered sources via the proposed UJBSS-m method. In addition,we compare the proposed method with other three single-set UBSS methods, in-cluding SOBIUM, UBSS based on subspace representation (UBSS-SR for short)[59] and UBSS based on sparse coding (UBSS-SC for short) [119], as well as theJBSS method MCCA [72] and UJBSS for two datasets UJBSS-2 [124], in termof the PCC between the original sources and the recovered ones. Both UBSS-SR and UBSS-SC are based on single source detection, which assumes that the TFpoints are occupied by a single source or the corresponding single source possessesdominant energy. However, the performance of estimating the mixing matrix de-85−10 −5 0 5 10 15 20 25 30 35 40−14−12−10−8−6−4−20SNR (dB)Error (dB)  UBSS SOBIUM X(1)UJBSS−2 X(1)&X(2)UJBSS−2 X(1)&X(3)UJBSS−m X(1),X(2)&X(3)Figure 4.2: Simulation 1: performance comparisons on audio signals whenusing the proposed UJBSS-m method and other UBSS methods, includ-ing the single-set UBSS method SOBIUM [31] and the UJBSS methodfor two dataset, i.e., UJBSS-2 [124]. Here the number of sources N =5 and the number of observations M = 4. The number of time delays L= 20 and the step size of time delays (i.e., τl − τl−1) is 2 data samples,corresponding to 0.25ms. Similar results are observed for A(2) and A(3).teriorates when this assumption is not satisfied. Furthermore, given that the time-frequency analysis method [116, 124] is memory-intensive and time-consuming,we estimate the mixing matrices via UJBSS-2 and SOBIUM respectively, and thenextract the sources using the same method as in UBSS-SR [59]. The performanceresults of these six methods are reported in Table 4.1.Despite adopting the same technology in extracting sources, the performanceof the proposed method is significantly better than that of the single-set SOBIUM86−5 0 5 10 15 20 25 30 35 40−10−9−8−7−6−5−4−3−2−10SNR (dB)Error (dB)  M = 4, N = 8M = 5, N = 8M = 6, N = 8M = 7, N = 8Figure 4.3: Simulation 1: estimation error of A(1) when employing the pro-posed UJBSS method. Here the number of sources N = 8 and the num-ber of observations M varies from 4 to 7. The number of time delays L= 20 and the step size of time delays (i.e., τl − τl−1) is 2 data samples,corresponding to 0.25ms.and UBSS-SR method. This observation confirms the importance of estimatingmixing matrices accurately. In addition, the proposed method also outperforms arecently proposed UBSS method UBSS-SC. The main reason is that such UBSSmethods always require the sparsity of the sources to some extent, while the as-sumption may not be satisfied in reality. MCCA, which has been successfully usedin many fields [27], assumes that the number of sources is equal to the numberof observations in each dataset and it could not be used to separate sources in theunderdetermined case directly. We add one observation in each dataset so that87AmplitudeTime Index0 1 2 3 4 5x 104−505Original s(1)10 1 2 3 4 5x 104−6−4−20246Original s(1)20 1 2 3 4 5x 104−10−5051015Original s(1)30 1 2 3 4 5x 104−6−4−2024Original s(1)40 1 2 3 4 5x 104−10−50510Observation x(1)10 1 2 3 4 5x 104−505Observation x(1)20 1 2 3 4 5x 104−10−5051015Observation x(1)30 1 2 3 4 5x 104−505Estimated s(1)10 1 2 3 4 5x 104−4−20246Estimated s(1)20 1 2 3 4 5x 104−10−5051015Estimated s(1)30 1 2 3 4 5x 104−6−4−2024Estimated s(1)4Figure 4.4: Simulation 1: an illustrative example from the proposed UJBSS-m method. First row: The original 4 sources; Second row: 3 channelsof the mixed observations; Third row: the recovered 4 sources from thefirst dataset.MCCA can be applied. Therefore it is not really a fair setting and comparison tothe proposed method. However, we note that the performance of MCCA is notas good as that of the proposed method, even with an additional observation sig-nal. The following reasons could contribute to the worse performance of MCCA:it is mainly due to the fact that the correlation coefficients between sources in twodatasets are quite close [20, 72]; the performance of MCCA may suffer from erroraccumulation of the deflation-based separation methods [69].4.7.2 Simulation 2: Physiological SignalsIn this experiment, we employ four physiological signals as sources, includingECG, EEG, EOG and EMG from a publicly available database [45]. The samplingrate is 1000 Hz. The sources corresponding to the other two datasets are generatedfollowing (4.32). We get similar results as that in the simulation 1. As can be seenfrom Fig. 4.5, the estimation performance is getting better with the increase of the88Table 4.1: PCC performance results in Simulation 1.Methods s1 s2 s3 s4Dataset 1UJBSS-m 0.993 1.000 0.881 0.992UJBSS-2[124] 0.989 1.000 0.859 0.990SOBIUM[31] 0.980 1.000 0.512 0.967UBSS-SC[119] 0.951 0.965 -0.066 0.934UBSS-SR[59] 0.901 0.999 0.226 0.967MCCA∗[72] 0.571 0.909 0.675 0.732Dataset 2UJBSS-m 0.910 0.998 0.740 0.944UJBSS-2[124] 0.897 0.998 0.714 0.892SOBIUM[31] 0.877 0.998 0.686 0.962UBSS-SC[119] 0.795 0.940 -0.401 0.607UBSS-SR[59] 0.885 0.981 0.730 0.954MCCA∗[72] -0.574 0.911 -0.678 -0.733Dataset 3UJBSS-m 0.899 1.000 0.783 0.870UJBSS-2[124] 0.720 1.000 0.515 0.869SOBIUM[31] 0.756 -0.951 0.078 0.880UBSS-SC[119] 0.381 -0.434 -0.614 -0.630UBSS-SR[59] -0.557 0.987 0.620 -0.561MCCA∗[72] 0.577 -0.911 -0.679 0.732(1) ∗We add one additional observation in each dataset when we evaluate theMCCA.SNR of observations. In the whole SNR range, the proposed UJBSS-m method es-timates the mixing matrices with higher accuracy than the single-set UBSS methodSOBIUM and UJBSS method for two datasets UJBSS-2.We also investigate the effect of the time delays, as shown in Fig. 4.6. Athigh SNR level, e.g. SNR = 20dB, the average Error of the proposed UJBSS-mis -11.57dB when the step size of the time delays is 5 data samples,correspondingto 5ms. However, the average Error corresponding to 3 data samples is -6.67dB,significantly larger than that of 5 data samples. The main reason is that the changeof the covariance matrices is not obvious for the small step size and the covariance89matrices related to these delays could not provide enough information to estimatethe common factors, i.e. the mixing matrices. If the time delay is too large, suchas more than 500 data samples (corresponding to 500ms), the covariance betweentwo datasets will be close to 0. Here, we select 5 data samples as the step size ofthe time delays. In practice, we should select the time delays empirically based onthe characters of the sources, e.g., we suggest the time delays smaller than 100msfor physiological signals. In addition, we evaluate the role of the number of timedelays and find that it has less impact on the performance. In this chapter, we setthe number of time delays to 20.We further show the performances in term of the PCC results between the orig-inal sources and the estimated ones. As shown in Table 4.2, the proposed methodyields promising results when it is used to separate the latent and underdeterminedmixtures. Compared to the classical JBSS method MCCA, the proposed UJBSSapproach needs fewer number of observations in each dataset, while it yields abetter performance.4.8 A Case Study: Solve A Single Set UBSS ProblemBased on UJBSS-mIn this section, we show that the proposed UJBSS-m method can be employed tosolve a single set UBSS problem, the noise enhanced signal processing problem,with a superior performance. As in Simulation 1, we employ 5 real audio signalsas the sources. These 5 audio signals are mixed into 4 observations with a mixingmatrix A(1) whose elements follow the uniform distribution U [−1,1]. We generatethree datasets asX (1) = A(1)S(1)X (2) = awgn(X (1),20 dB)X (3) = awgn(X (1),20 dB),(4.33)90−10 −5 0 5 10 15 20 25 30 35 40−14−12−10−8−6−4−2SNR (dB)Error (dB)  UBSS SOBIUM X(1)UJBSS−2 X(1)&X(2)UJBSS−2 X(1)&X(3)UJBSS−m X(1),X(2)&X(3)Figure 4.5: Simulation 2: performance comparisons on physiological signalsbetween the proposed UJBSS-m method and two other methods (i.e.,the single-set UBSS method SOBIUM [31] and UJBSS method for twodataset, UJBSS-2 [124]). Here the number of sources N = 4 and thenumber of observations M = 3. The number of time delays L = 20 andthe step size of time delays (i.e., τl − τl−1) is 2 data samples. Similarresults are observed for A(2) and A(3).where awgn(X (1),20 dB) represents adding white Gaussian noise to the signalsX (1) (i.e., the real observations) with SNR of 20dB. Noise, traditionally regardedas the unwanted signal, can play a very important constructive role in estimationproblems, which is known as noise enhanced signal processing. X (2) and X (3) arerandom noise added signals based on X (1).The problem of interest here is to estimate the mixing matrix A(1). Tradition-ally, we can estimate A(1) from the dataset X (1) based on the single set UBSS91−5 0 5 10 15 20−12−10−8−6−4−20SNR (dB)Error (dB)  13579Figure 4.6: Simulation 2: performance of the proposed UJBSS-m methodwhen the step size of time delays (i.e., τl − τl−1) varies from 1 to 9.Here the number of sources N = 4 and the number of observations M =3. The number of time delays L = 20. Similar results are observed forA(2) and A(3).method SOBIUM. Here we also can apply the proposed algorithm as mentionedin Section 4.5. Then we recover the sources via the method mentioned in Section4.6 based on the estimated mixing matrix A(1). We repeat the experiment for 1000times and calculate the sum of the absolute PCC (SAPCC) between the recoveredsources and the original ones, which is calculated asSAPCC =5∑n=1abs(PCC(s(1)n , sˆ(1)n )), (4.34)where abs(·) represents the absolute value function. Fig. 4.7 shows the distributionof the performance for 1000 repeats of the experiments. The average SAPCC for92Table 4.2: PCC performance results in Simulation 2.Methods ECG EOG EEG EMGDataset 1UJBSS-m 0.985 0.994 0.910 0.831UJBSS-2[124] 0.975 0.986 0.872 0.781SOBIUM [31] -0.815 -0.985 0.800 -0.382UBSS-SC[119] 0.090 0.949 -0.707 -0.679UBSS-SR[59] 0.504 -0.906 0.115 0.503MCCA∗[72] 0.613 -0.754 0.695 0.674Dataset 2UJBSS-m 0.956 0.821 1.000 0.832UJBSS-2[124] 0.883 0.705 0.942 0.308SOBIUM [31] 0.851 -0.467 0.999 0.726UBSS-SC[119] -0.420 0.544 0.776 0.726UBSS-SR[59] -0.894 0.288 -0.967 -0.203MCCA∗[72] 0.634 -0.754 0.678 -0.676Dataset 3UJBSS-m 0.779 0.738 0.998 0.997UJBSS-2[124] 0.777 0.738 0.998 0.997SOBIUM [31] 0.443 0.697 -0.993 -0.983UBSS-SC[119] 0.394 0.638 0.722 0.848UBSS-SR[59] 0.582 0.609 0.426 0.871MCCA∗[72] -0.624 -0.756 0.689 0.676(1) ∗We add one additional observation in each dataset when we evaluate theMCCA.UBSS is 4.53 while that for the proposed UJBSS-m is 4.76, even with the samesource extraction technology. The one-way Analysis of Variance (ANOVA) is per-formed on the results provided by these 1000 repeats. The obtained p value is1.5677e-11, which means that the results of the proposed UJBSS-m method andsingle set UBSS method are significantly different. The proposed UJBSS-m algo-rithm demonstrates more robust and better performance. This example illustratesthat the estimation accuracy could be improved by adding suitable noises to theinput signals.93UBSS SOBIUM UJBSS−m44.14.24.34.44.54.64.74.84.95SAPCCFigure 4.7: The sum of the absolute correlation coefficients between the re-covered sources and the original ones. The blue asterisks represent theaverages and the red lines stand for the medians. The edges of the boxare the lower and upper quartiles.4.9 Conclusions and DiscussionThis chapter is the third work in my PhD study. Benefiting from the dependenceinformation between two datasets, the UJBSS method proposed in chapter 3 gainedpromising performance. It is nature to generalize the idea for two datasets intomultiple datasets. However, the UJBSS-2 for two datasets cannot be directly uti-lized to solve the problem in multiple datasets. In this chapter, we generalize theUJBSS for two datasets (i.e., UJBSS-2) to the case of multiple datasets. The basicidea is similar to that in UJBSS-2, which estimate the mixing matrices jointly firstand then restore the source signals. In this chapter, we exploit the cross correlationof the observations between each pair of datasets and present a novel underdeter-94mined joint blind source separation method, namely UJBSS-m, to jointly estimatethe mixing matrices from multiple datasets when the number of observations issmaller than that of the sources. The mixing matrices are accurately estimatedthrough joint canonical polyadic decomposition of a sequence of specialized ten-sors in which a set of covariance matrices are stacked. Further the sources arerecovered based on the estimated mixing matrices. Numerical results on multi-ple datasets demonstrate the superior performances of the proposed method whencompared to the commonly used JBSS and single-set UBSS methods.As an example application for noise enhanced signal processing, we also showthat the proposed UJBSS-m method can be utilized to solve the single-set UBSSproblem when suitable noise is added to the observations. In addition, the proposedUJBSS-m method does not rely upon sparsity of signals and therefore it can beapplied to a wide class of signals when: 1) the sources within each dataset areuncorrelated and 2) the sources across different datasets are correlated only oncorresponding indices.95Chapter 5Removing Muscle Artifacts fromEEG Data via UnderdeterminedBlind Source SeparationEEG recordings are often contaminated by artifacts from EMG. These artifacts re-duce the quality of the EEG signals and disturb further analysis of EEG, such asin brain connectivity modeling. If enough number of EEG recordings are avail-able, then there exists a considerable range of BSS methods which can suppressor remove the distorting effect of such artifacts. However, for many practical ap-plications, such as the ambulatory health-care monitoring, the number of sensorsused to collect EEG is limited. As a result, the conventional BSS methods, suchas CCA and ICA, do not work in such cases. Considering the increasing needfor biomedical signal processing in ambulatory environment, this chapter proposesa novel underdetermined BSS method exploring the cross correlation and auto-correlation of underlying sources. We evaluate the performance of our proposedmethod through numerical simulations in which EEG recordings are contaminated96with muscle artifacts. The results demonstrate that the proposed method can ef-fectively and efficiently remove muscle artifacts meanwhile preserving the EEGactivity successfully. This is a promising tool for real-world biomedical signalprocessing applications.5.1 Motivation and ObjectivesEEG is extensively used in brain science research, such as neuroscience and cogni-tive science [106]. However, it is susceptible to various physiological factors otherthan neural activities. ECG from cardiac activities, EOG from ocular movements,and EMG from muscular activities are the most common artifacts. These unde-sired artifacts interferer with the signal of interest and disturb subsequent analysisof EEG signals. Compared to other types of artifacts, it is generally more chal-lenging to remove artifacts from the contracting head muscles (i.e. EMG signals)[79]. The main reasons for this difficulty are four folds: 1) EMG signals alwayshave higher amplitude than the smaller EEG signals; 2) EMG signals have a widespectral distribution, and especially overlaps with the beta activity in 15-30 Hzof EEG signals; 3) EMG signals have a broad anatomical distribution and can bedetected across the entire scalp; 4) EMG signals also exhibit less repetition andconsequently are harder to stereotype [18, 21, 106].A artifacts removal is clearly an important issue and is a prerequisite step forour subsequent analysis. In order to remove the EMG artifacts, a number of ap-proaches have been proposed, such as filtering, regression and EMD. An alterna-tive strategy is based on BSS, which is more commonly used and demonstratedto be effective for removing artifacts from EEG. As one of the most popular BSSmethods, ICA has been extensively explored for this purpose [21, 28], aiming toseparate multichannel EEG into statistical independent components (ICs, i.e. un-derlying sources). Then the ICs determined to be artifacts can be discarded and the97remaining ICs can be used to reconstruct the artifact-free EEG. However, due to theissue of crosstalk between brain and muscle activity, ICs containing EEG signalsare still contaminated by EMG [65, 83]. Therefore, ICA itself may not performeffectively in removing EMG from EEG.Second order blind identification makes use of the temporal correlation and isshown as an effective alternative to ICA in removing EMG artifacts. However,this method is designed for stationary signal and it may suffer when the under-lying source is nonstationary, such as in the case of transient muscular activities[23]. More recently, CCA has been explored as a more reliable method to re-move EMG from scalp EEG. It aims to find mutually uncorrelated sources whichare maximally autocorrelated. Compared with EEG, EMG has relatively low au-tocorrelation. Taking advantage of this distinguishable feature, sources with highautocorrelation should correspond to EEG while the source with relatively lowautocorrelation is regarded as the EMG artifact. Then the underlying EMG com-ponents are discarded to reconstruct the EEG signals. As recently suggested in[21, 29, 43], CCA can achieve superior performance over ICA.It is worth noting that the above mentioned BSS algorithms generally assumethat the number of sources is equal to or less than that of the observations. How-ever, for many practical applications, such as ambulatory health-care monitoring,it is desirable to collect the mixed signals using fewer sensors. In these cases, theabove assumption does not hold and UBSS is required. Again, considering theincreasing need for biomedical signal processing in ambulatory environment, thischapter proposes a novel UBSS method that investigates second-order statistics ofthe observations. Same as the existing UBSS methods, this proposed method con-sists of two steps: the mixing matrix is estimated first, followed by the separationof underlying sources based on the estimated result of the mixing matrix.More specifically, inspired by stochastic resonance [78], we add tiny random98noise to the EEG recordings and construct multiple datasets across which the un-derlying sources are highly correlated. We further explore the cross correlationand autocorrelation of underlying sources, rather than solely the partial cross-correlation as in our previous paper [111, 124]. The mixing matrix is estimatedaccurately via joint polyadic tensor decompositions of a set of tensors where spa-tial covariance matrices corresponding to different time delays are stacked. Fur-thermore, the underlying sources, including EEG and EMG, are inferred based onthe estimated mixing matrix from the EEG observations. Sources related to mus-cle activity are identified and removed during EEG reconstruction. We evaluatethe performance of the proposed method through numerical simulations in whichEEG recordings are contaminated with muscle artifacts. The results demonstratethat the proposed method can effectively and efficiently remove muscle artifactswhile preserving the EEG successfully.5.2 Problem FormulationThe problem of interest here is to recover the underlying N sources with a limitednumber of observations. In this study, EEG observation signals are denoted bya matrix X(t) = [x1(t);x2(t); . . . ;xM(t)] ∈ RM×T , where M represents the numberof EEG observations and T is the number of data samples. It is assumed that theunderlying N sources S, including the signal of interest (i.e. EEG) and undesiredartifacts (i.e. EMG), are linearly mixed into the M observations X . The mixingprocess is modeled as follows,X = AS+E. (5.1)A ∈RM×N with M < N (i.e., the underdetermined case) denotes the unknown mix-ing matrix. E is the possible additive noise which is generally assumed to be the99zero mean, temporally white and uncorrelated with the source signals.5.3 Proposed MethodIt is suggested in [111] that the UJBSS method can be used to solve the single-set UBSS problem. However, this only utilizes part of the cross-correlation [111]and neglects the autocorrelation and the other parts of cross-correlation. In thischapter, we fully exploit the cross-correlation and autocorrelation of the sourcesand propose a novel UBSS algorithm. We add tiny noise to the observations, X(which is also expressed as X (1)), and construct the other two datasets as,X (1) =X = [x(1)1 ,x(1)2 , . . . ,x(1)N ]TX (2) =awgn(X (1),20dB)X (3) =awgn(X (1),20dB),(5.2)in which awgn(X (1),20dB) adds white Gaussian noise to the measured EEG obser-vation signals X (1), with the signal-to-noise ratio equaling to 20dB. (5.2) ensuresdependence between each pair of datasets. Similar to CCA, it is reasonable to as-sume that:(1) The sources in each dataset are uncorrelated.E{s(k)i (t)(s(k)j (t+ τ))T}= 0∀τ, 1≤ i 6= j ≤ N, k = 1,2,3,(I)where s(k)i (t) is the i-th source in dataset k and s(k)j (t+τ) represents the j-th sourcewith the time delay τ in dataset k.(2) The corresponding sources from two different datasets have non-zero cor-100relations and sources with different indices across datasets are not correlated.D(τ) =E{S(k1)(t)(S(k2)(t+ τ))T}=Diag(ρ1(τ),ρ2(τ), . . . ,ρN(τ)),(II)where Diag(·) represents the diagonal matrix, and the ρn(τ) = E{s(k1)n (t)(s(k2)n (t +τ))T} denotes the covariance between s(k1)n (t) and s(k2)n (t + τ). This assumptionsuggests that the corresponding sources in multiple datasets are second-order cor-related with each other.In this chapter, we fully exploit the second order auto covariance and cross co-variance of EEG signals and propose a novel and effective algorithm to estimatethe mixing matrix. The problem is reformulated as a joint canonical polyadic de-composition of a sequence of third-order tensors, which share the common factormatrix A(1). The auto covariance of the EEG signals can be formulated asE{X (1)(t)X (1)(t+ τ)T}=(A(1))E{S(1)(t)S(1)(t+ τ)T}(A(1))T ,(5.3)where τ represents the time delay. The covariance matrices corresponding to dif-ferent time delays, τ1 to τL, satisfy,B(1) = E{X (1)(t)X (1)(t+ τ1)T}=(A(1))C(τ1)(A(1))T...B(L) = E{X (1)(t)X (1)(t+ τL)T}=(A(1))C(τL)(A(1))T ,(5.4)in which C(τ1) = E{S(1)(t)S(1)(t + τl)T} is diagonal, l = 1 . . .L. We stack the auto101covariance matrices {B(l)} in a tensorB ∈ RM×M×L as follows:(B)i, j,l = (B(l))i, j, (5.5)in which i = 1,2, . . . ,M, j = 1,2, . . . ,M, l = 1,2, . . . ,L. We define the matrix C ofsize L×N with the element Cl,n = (C(τl))n,n, for l = 1, 2, ..., L, n = 1, 2, ..., N. Thenwe have:B =N∑n=1a(1)n ◦a(1)n ◦ cn, (5.6)in which the ◦ denotes the outer product operation, a(1)n is the nth column of themixing matrix A(1), and cn is the nth column of the matrix C.The cross covariance between the EEG signals X (1) and the noise-added signals(i.e. X (2) and X (3)) with time delay τ , can be formulated as:E{X (1)(t)X (k)(t+ τ)T}=(A(1))E{S(1)(t)S(k)(t+ τ)T}(A(k))TE{X (1)(t+ τ)X (k)(t)T}=(A(1))E{S(1)(t+ τ)S(k)(t)T}(A(k))T(5.7)in which k = 2 or 3. Considering the correlations within and between each pair ofdatasets, the cross covariance matrices corresponding to time delay τl satisfy:F(l) = E{X (1)(t)X (2)(t+ τl)T}= (A(1))G(τl)(A(2))TH(l) = E{X (1)(t+ τl)X (2)(t)T}= (A(1))I(τl)(A(2))TJ(l) = E{X (1)(t)X (3)(t+ τl)T}= (A(1))K(τl)(A(3))TP(l) = E{X (1)(t+ τl)X (3)(t)T}= (A(1))Q(τl)(A(3))T ,(5.8)in which the cross variance between the sources across each pair of datasets, G(τl), I(τl),K(τl),Q(τl),102are diagonal. Similar to (5.5), these sets of cross variance matrices are stacked intotensorsF ,H ,J ,P , which can be represented as,F =N∑n=1a(1)n ◦a(2)n ◦gnH =N∑n=1a(1)n ◦a(2)n ◦ inJ =N∑n=1a(1)n ◦a(3)n ◦ knP =N∑n=1a(1)n ◦a(3)n ◦qn.(5.9)Considering the common latent structure where each pair of tensors share thefactor matrix A(1), the mixing matrix can be estimated via joint CPD of a collectionof tensors B,F ,H ,J ,P . In this chapter, we generalize the idea of coupledmatrix and tensor factorization (CMTF) and jointly decompose these tensors viathe gradient-based optimization method [3, 100, 109]. The objective function canbe expressed as:f (A(1),A(2),A(3),C,G, I,K,Q)=12‖B− [[A(1),A(1),C]]‖2+ 12‖F − [[A(1),A(2),G]]‖2+12‖H − [[A(1),A(2), I]]‖2+ 12‖J − [[A(1),A(3),K]]‖2+12‖P− [[A(1),A(3),Q]]‖2(5.10)where [[·]] denotes the canonical polyadic approximation of a given tensor. Thisequation simultaneously takes the coupling information between different tensorsinto account. We propose to solve this problem via a gradient-based optimizationmethod. The partial derivative of the objective function f with respect to each103column of A(1) is:∂ f∂a(1)n=−2N∑n=1(B×3 cn)a(1)n +2N∑d=1(cTn cd)((a(1)n )T a(1)d )a(1)d−F ×2 a(2)n ×3 gn−H ×2 a(2)n ×3 in−J ×2 a(3)n ×3 kn−P×2 a(3)n ×3 qn+N∑d=1[(a(2)n )T a(2)d (gn)T gd +(a(2)n )T a(2)d (in)T id +(a(3)n )T a(3)d (kn)T kd+(a(3)n )T a(3)d (qn)T qd ]a(1)d .(5.11)Similarly, we can calculate the partial derivative of f with respect to other fac-tor matrices and obtain the gradient. Then the mixing matrix A(1) can be calcu-lated based on any first-order optimization method. In this chapter, considering thecompetitive advantage of ‘efficiency’ and ‘requiring less memory’, we employ thenonlinear conjugate gradient algorithm (NCG) implemented in [38] to solve theunconstrained optimization problem and further estimate the mixing matrix A(1).Once the mixing matrix is estimated, extracting the sources is a classic inverseproblem. Here, we adopt a recently-developed subspace representation method[59] to recover the latent sources based on the estimated mixing matrix. The de-tails of this method can be found in Chapter 4. Next, the recovered sources aresorted in terms of their autocorrelations. Due to the relatively low autocorrelationof EMG signals, muscle artifacts are isolated and set to 0 during reconstruction.Subsequently, the cleaned signals Xeeg can be obtained. The major steps of theproposed method are summarized in Algorithm 7.5.4 Data Generation and Performance IndicesIn order to evaluate the performance of the proposed method, obtaining the groundtruth, i.e. the pure EEG and EMG signals, is quite necessary. In previous studies[29], to obtain the ground truth EEG signals, experienced neurophysiologist in-104Algorithm 7 The proposed method for removing muscle artifact from EEG signalsInput: M-dimensional observations XOutput: The artifact-free EEG data Xeeg.1: Create the other two data via adding Gaussian white noise to the EEG obser-vations X ;2: Calculate the auto covariance and the cross covariance as in (5.4) and (5.8)with different time delays, and construct a sequence of third-order tensors;3: Calculate the joint CPD of the tensors constructed in step 2 and estimate themixing matrix A (is also expressed as A(1));4: Recover the underlying sources S(1), including EEG and EMG artifacts;5: Sort the recovered sources S(1) in term of their autocorrelations and recognizethe EMG artifacts from them;6: Set the rows of S(1) corresponding to muscle artifacts to be zero and get S(1)new;7: Reconstruct the artifact-free EEG signals Xeeg via Xeeg = A(1)S(1)new.spected many EEG recordings and select the clean EEG signals from them. How-ever, frequent difficulties have surfaced in acquiring artifact-free EEG signals in re-ality, and it is even more difficult to ensure that the selected signals are completelyfree of muscle activities. In this section, we generate synthetic EEG and EMG sig-nals, and examine the performance of the proposed method when the ground truthis available.The simulated EEG sources are generated according to the phase-resetting the-ory proposed by Markinen et al. [75]. As in [117] and [21], we generate each EEGsource by adding 4 sinusoids with frequencies randomly chosen from the rangeof 4 Hz to 30 Hz. To illustrate the performance of the proposed method, N EEGsources, SEEG, are produced independently. Here, we set N to be 4. Analogous tothe work of Delorme et al. [33], an EMG source, SEMG, is simulated using randomnoise band-pass filtered between 20 and 60 Hz. The sampling rate here is 250 Hzand each channel is 40 seconds long. In addition, the 4-channel EEG observationsare modeled as,105X = AS= A[SEEG;SEMG],(5.12)where S includes 4 EEG sources and 1 EMG source, and A is the mixing matrixgenerated randomly with elements following the uniform distribution U [−1,1]. Forsimplicity, each column of the mixing matrix is normalized into a unit vector.To fairly compare the proposed method with existing EMG artifacts removalmethods, 1000 independent simulations are implemented and three performanceindices are employed. The first performance measurement is the mean relativeestimation error of the mixing matrix A, which is defined as:Error = 10log10{mean( ||A− Aˆ||||A|| )}, (5.13)where Aˆ denotes the optimally ordered estimate of A. The second measure is theMean of Absolute Correlation (MAC) between the estimated sources and the origi-nal ones, which is defined asMAC = mean(1Nn=N∑n=1|cov(sn, sˆn)σsnσsˆn|), (5.14)where sˆn represents the estimate of the source sn, cov(·, ·) represents the covariancebetween two variables and σ denotes the standard deviation. The Relative RootMean Squared Error (RRMSE) is the third measure used to evaluate the effect ofmuscle artifact removal, which is defined as:RRMSE =RMS(XEEG− XˆEEG)RMS(XEEG), (5.15)where RMS(·) denotes the root mean squared (RMS) value of a matrix/vector. For106instance, the RMS value of the EEG observations X is expressed as,RMS(X) =√1M ·TM∑m=1T∑t=1Xm,t , (5.16)where M is the number of EEG observations (which is 4 in this chapter), and Trepresents the number of data samples.5.5 Numerical Study for the Synthetic EEG DataThe original sources in our study include 4 EEG sources (represented by SEEG) and1 EMG source (SEMG), which are linearly mixed into 4 observations X following(5.1). The other two datasets, X (2) and X (3), are generated following (5.2). In ourprevious paper [111], we discussed the effect of the step size and the number oftime delays. Considering the difference in the sampling rate, here we select 1 datasample corresponding to 4ms as the step size of time delays. Compared to the stepsize, the number of time delays has less impact on the performance. To furtherenhance the time efficiency, we set the number of time delays to 10.Fig.5.1 shows the estimation error as a function of SNRS. We compare the pro-posed method with a commonly-used single-set UBSS method (SOBIUM) and thepreviously developed underdetermined joint BSS method (UJBSS-m) [31, 111].SOBIUM exploits the autocorrelation of the sources and reformulate the problemof estimating the mixing matrix as decomposing a higher-order tensor. UJBSS-mmodels the cross correlation between each pair of two datasets and it is consideredas a great alternative to SOBIUM in solving the single-set underdetermined BSSproblem [111]. Compared with the SOBIUM and UJBSS-m, which only utilizedautocorrelation or part of cross correlation, the proposed method fully exploitsthe second-order statistics of the observations. Benefiting from this, the proposedmethod consistently yields the best results over the entire SNRS range, which also107suggests the stability of the proposed method.We estimate the mixing matrix A via the single-set UBSS method SOBIUM,the underdetermined joint BSS method UJBSS-m and our proposed method, andfurther recover the latent sources using the subspace representation method [59].The source with the lowest autocorrelation is then recognized as the EMG source.Next the EMG source is set to 0 and the EEG sources are used to reconstruct theartifact-free EEG signals XEEG. As an illustrative example, Fig.5.2 demonstratesthe original XEEG and the corresponding reconstruction results. It is shown thatour proposed method is able to remove the EMG artifact perfectly and the recon-structed EEG signals are highly correlated with the original artifact-free EEG sig-nals. In addition, we also compared the proposed method with a recently developedEMG artifact removal method, EEMD-CCA [21]. EEMD-CCA is a single-channeltechnique for muscle activity removal and is therefore suitable for removing arti-facts with a limited number of observations. It was shown that this EEMD-CCAtechnique outperforms the multichannel technique based on CCA for removingmuscle artifacts from EEG signals [18, 21]. In this chapter, we apply this EEMD-CCA method to each channel of the EEG observations X and decompose eachchannel into multiple IMFS, which are the input of the CCA method. Given the rel-ative lower autocorrelation, the last canonical variate (CV, i.e. output of the CCA) isset to 0 during EEG reconstruction. We also test the effect of the number of canon-ical variates which are selected as the EMG artifacts and discarded during EEGreconstruction. The best performance is gained when the number of CV resmblingEMG is set to 1.In this chapter, we repeat the experiments 1000 times and the resulting averageperformance is shown in Table 5.1. We further compare the proposed method withthree other artifact removal methods, including SOBIUM, UJBSS-m and EEMD-CCA, in terms of MAC, RRMSE and time efficiency. All three of the compared108BSS methods utilize the same technology to recover the sources when the mix-ing matrix is estimated separately. Despite this, the performance of the proposedmethod is significantly better than that of SOBIUM and UJBSS-m. This also sug-gests the importance of estimating mixing matrices accurately. In addition, we testthe computational cost of all four of these respective artifact removal methods. Toremove the muscle artifact from 10000-datapoint 4-channel EEG observations, theaverage computational time for the proposed method is 52.810s while that of SO-BIUM, UJBSS-m and EEMD-CCA are 11.195s, 26.907s and 52.910s respectively.The implementation is completed in MATLAB on a computer with Intel Core i7-4770 3.40 GHz CPU and 8.00G RAM. All the MATLAB codes used in this chapterare available upon request from the authors via email liangzou@ece.ubc.ca.Table 5.1: Performance comparison between the proposed method and theother three methods (SOBIUM, UJBSS-m, EEMD-CCA)MAC RRMSE Average time cost (second)SOBIUM [31] 0.863 0.147 11.195UJBSS-m [111] 0.930 0.099 26.907EEMD-CCA [21] NA 0.168 52.910The proposed method 0.940 0.085 52.8105.6 Conclusions and DiscussionIn this chapter, we propose an effective and novel method to remove muscle ar-tifacts from EEG signals. Compared with SOBIUM and UJBSS-m, which onlyutilize autocorrelation or a portion of cross-correlation, our proposed method fullyexploits the second-order statistics of observations. The mixing matrices are ac-curately estimated through joint CPD of a set of specialized tensors in whichcovariance matrices corresponding to different time delays are stacked. Subse-1090 5 10 15 20 25 30 35 40−9−8−7−6−5−4−3−2SNR (dB)Error (dB)  SOBIUMUJBSS−mProposed methodFigure 5.1: Estimation error of A with the change of signal-to-noise ratios.Here, the number of time delays L equals to 10, and the step size oftime delays (i.e. τl− τl−1) is 1 data sample corresponding to 4 ms.quently, sources are recovered based on these estimated mixing matrices. Com-pared with EEMD-CCA whose performance relies on the artifact level of the con-taminated EEG signals (ratio between the power of pure EEG and EMG), the pro-posed method is based on the statistical properties of the underlying sources, andtherefore is more robust. We evaluate the performance of the proposed methodthrough numerical simulations in which EEG recordings are contaminated withmuscle artifact. Our results demonstrate that the proposed method can effectivelyand efficiently remove muscle artifacts while preserving the EEG activity success-fully. Therefore, it is a promising tool for real-world biomedical signal processingapplications.1100 50 100 150 200 250 300 350 400 450 500−3−2−101234Channel 1: Correlation = 0.99427  0 50 100 150 200 250 300 350 400 450 500−4−20246Channel 2: Correlation = 0.99940 50 100 150 200 250 300 350 400 450 500−6−4−2024Channel 3: Correlation = 0.999040 50 100 150 200 250 300 350 400 450 500−4−3−2−10123Channel 4: Correlation = 0.99953Original EEGReconstructed EEGFigure 5.2: An illustrative example of the reconstructed EEG signals basedon the proposed EMG removal method.This is the last piece of my PhD work. It is a new and interesting application ofUJBSS. To the best of our knowledge, we are the first to apply the underdeterminedBSS method to remove EMG artifacts from a limited number of EEG observations.Further, we note that it is also applicable to remove other kinds of artifacts, such asECG and EOG.111Chapter 6Conclusion and Future Work6.1 ConclusionIn this dissertation, we have developed a set of novel underdetermined blind sourceseparation approaches for recovering underlying sources when the number of sourcesis greater than that of the observations. The proposed algorithms aim to addressseveral challenges in real applications, including limited number of observations,self/cross dependence information and source inference. The proposed methodswere evaluated on synthetic data and/or real physiological signals. It should benoted that since the underlying ground truth is unavailable in real physiologicalstudies, the performance evaluation in such cases basically depends on visual in-spection by experts in the field. The main contributions and findings of this disser-tation are summarized as follows.In Chapter 2, a novel UBSS framework, termed NAMEMD-MCCA, is pro-posed to extract the heart beat signal from multi-channel NF-based sensor signals.Considering various potential artifacts residing in the measured signals, recover-ing the underlying heart beat signal is an underdetermined problem. We inves-112tigate state-of-the-art EMD-BSS based methods for exacting RHBR informationaccurately based on the nano-sensor data and further propose the NAMEMD-MCCA for improved RHBR monitoring. Considering inter-channel information,NAMEMD processes the input NF-based signals in high dimensional space andcan effectively overcome the problems of uniqueness and mode mixing [77]. Asan extension of CCA, MCCA is able to jointly extract sources through maximizingthe correlations of the extracted sources across datasets. The combination of thesetwo methods (NAMEMD-MCCA) benefits from the use of cross-channel informa-tion and increased robustness to artifacts. We first apply the proposed methods tosynthetic data to illustrate their performance where the underlying truth is known.We then apply the proposed method to real nano-sensor data collected when thesubject performs 11 tasks and it is shown that the proposed NAMEMD-MCCAmethod can achieve superior performance.Another challenging question that we have posed in the underdetermined BSSfield is the underdetermined joint BSS problem, which aims to jointly estimate themixing matrices and/or extract the underlying source from multiple datasets whenthe number of sources is greater than that of observations in each dataset. Tra-ditional joint BSS methods are designed for the determined case, which assumethat the number of sources is equal to or less than that of the observations. Asmentioned, this assumption may not be true in certain practical applications due toconcerns including cost or time [60]. However, to the best of our knowledge, in thecurrent literature only very limited work has been done on JBSS methods specifi-cally designed for the underdetermined case. In order to address this concern, inChapter 3, we propose an underdetermined JBSS for 2 datasets, termed as UJBSS-2.Considering the dependence information between two datasets, we exploit second-order statistics of the observations. The problem of jointly estimating mixing ma-trices is tackled via CPD of a specialized tensor in which a set of spatial covariance113matrices are stacked. Numerical results demonstrate the competitive performanceof UJBSS-2 when compared to a commonly used JBSS method, MCCA, and thesingle-set UBSS method, UBSS-FAS.In Chapter 4, we generalize the idea of UJBSS-2 for two datasets into mul-tiple datasets [124]. In this work, we propose a novel and effective method tojointly estimate the mixing matrices for multiple datasets. Moving from our workin Chapter 3, here the dependence information is modeled in a set of three-ordertensors, rather than one single tensor. Considering the latent common structure ofthese constructed tensors, we jointly estimate the mixing matrices via joint canon-ical polyadic decomposition of these specialized tensors. In order to accuratelyinfer the source signals, we recover them by further utilizing a novel subspace rep-resentation based method. This proposed UJBSS-m method does not rely uponthe sparsity of signals and therefore it can be applied to a wide class of signals.In addition, we also show that UJBSS-m can be utilized to solve the single-setUBSS problem when suitable noise is added to the observations. Numerical resultson both audio and physiological signals demonstrate the superior performances ofthis proposed method.In Chapter 5, we propose a novel underdetermined blind source separationmethod for removing muscle artifacts from EEG signals. EEG recordings are oftencontaminated by various artifacts, among which the artifact from EMG is particu-larly difficult to eliminate. Such EMG artifacts reduce the quality of EEG signalsand disturb further analysis of EEG, as in brain connectivity modeling. If a highenough number of EEG recordings are available, we can remove or to some ex-tent suppress the distortion effect of such artifacts via a considerable range of BSSmethods, such as ICA and CCA. However, for many practical applications, such asambulatory health-care monitoring, a small number of sensors used to collect EEGis preferred and conventional BSS methods like CCA and ICA, will fail. Consider-114ing the recent increasing need for biomedical signal processing in the ambulatoryenvironment, we explore cross correlation and autocorrelation of the underlyingsources and propose a novel underdetermined BSS method. We conduct a perfor-mance comparison through numerical simulations in which 4 EEG recordings arecontaminated with 1 muscle artifact. It is demonstrated that the proposed methodcan effectively and efficiently remove the muscle artifact meanwhile successfullypreserving the EEG activity.6.2 Future Work6.2.1 Multiple Datasets GenerationIn Chapter 4 and Chapter 5, we demonstrate that underdetermined joint BSS meth-ods can be utilized to solve single set underdetermined BSS problems. In order toensure the relative high correlation between sources, we construct multiple datasetsthrough adding a certain amount of weak Gaussian white noise (e.g., with SNR =20dB) to the observations. Then we apply the proposed underdetermined joint BSSmethod to these noise assisted datasets. To make a fair comparison, we repeat thesimulation 1000 times in both studies. It is demonstrated that the proposed meth-ods gain better performance on average when we use these noise assisted datasets.For instance, the performance of the proposed method in Chapter 4 is better thanthat of SOBIUM with high confidence (80 percent probability). However, there isnot currently a strict theoretical analysis on how to add noise to ensure obtainingbetter performance with a higher probability. Thus, one possible research directionis to investigate a more rigorous method to add assisting noise adaptively to thegiven observations.1156.2.2 Estimate the Number of Source Signals in Determined andUnderdetermined BSSThe classical BSS problem includes two aspects of research: an estimation of thenumber of sources and source separation. The ‘blind’ aspect of BSS refers to thefact that there is generally no prior information available on the number of sourcesor on the mixing model. However, for conceptual and computational simplicity,most BSS algorithms usually require that the number of sources is specified in ad-vance. ICA makes the assumption that the number of sources is not greater thanthat of observations. MCCA assumes that the number of sources equals to thatof observations [56]. Further, the matrix diagonalization-based technique for theunderdetermined BSS has an upper limit for the number of sources. We see thatwe can obtain a unique solution only when the number of sources satisfies certainconditions [31]. However, these assumptions may not be the case in reality. Oneclassic example of the source separation problem is the cocktail party problem,which has been previously explained, where a number of people are talking simul-taneously. Without prior knowledge of the number of sources, we do not knowwhich BSS method is most suitable for solving this problem given the recordingsignals. In addition, results can vary if the number of sources is set differently. Indifferent scenarios, we may need to choose different types of BSS methods.An accurate estimation of the number of sources is shown to be of high impor-tance, and it is generally estimated before further source separation. As a result,several approaches have been proposed for estimating the number of sources. Waxand Kailath introduced the eigenvalue-based estimation method and investigatedthe observation that the number of dominant eigenvalues of the correlation ma-trix is equal to the number of sources in determined cases [41]. This method wasimproved by introducing Akaike Information Criterion (AIC) and Minimum De-scription Length (MDL) and some other measures in the estimation [40, 73, 94].116However, it is challenging to estimate the true number of sources when the ob-servations are noisy. To address this concern, sources number estimation methodswhich are robust to noisy observations are highly desired.In addition, to the best of our knowledge, there are only a few researchersdiscussing different ways to estimate the number of sources in UBSS, and theygenerally make use of the sparsity of source signals. For instance, SSPS on thetime domain or frequency domains are detected and similarity-based clusteringmethods are used to estimate the number of sources. However, signals may not beas sparse as assumed in the existing methods. Therefore, it is necessary to relax thesparsity constraint. Furthermore, the existing methods for estimating the numberof sources for UBSS cases are limited to instantaneous mixing models. We plan todevelop advanced methods to estimate the number of sources when the sources aremixed in a convolutional model. Lastly, the number of sources may be related tocertain physiological processes, such as depth of sleep. It would thus be of interestin some applications to monitor the dynamic change of the number of sources andthe corresponding mixing structure.6.2.3 Online Underdetermined BSSFor many practical applications, such as ambulatory health-care monitoring, it isdesirable to collect mixed signals using fewer sensors. In order to recover sourcesor remove unwanted noise, underdetermined BSS is preferred in these situations.Generally underdetermined BSS is more difficult to implement due to the lowernumber of available observations. Underdetermined BSS methods generally con-sist of two separate steps: mixing matrix estimation and underlying source infer-ence, which are very time consuming. To better serve such practical applications,real-time underdetermined BSS methods with light computational complexity arepreferred.117In addition, most existing BSS algorithms assume that the sources are physi-cally stationary, i.e., mixing filters are fixed. However, this assumption does notalways hold in real applications. For instance, in the cocktail party problem, it ishighly possible that both sources and sensors are not stationary in the room andtherefore the mixing model may be time varying. In these situations, it is nec-essary to model the mixing matrix as time-varying and develop UBSS methodsaccordingly.118Bibliography[1] F. Abrard and Y. Deville. A time–frequency blind signal separation methodapplicable to underdetermined mixtures of dependent sources. SignalProcessing, 85(7):1389–1403, 2005. → pages 7[2] E. Acar, D. M. Dunlavy, and T. G. Kolda. A scalable optimizationapproach for fitting canonical tensor decompositions. Journal ofChemometrics, 25(2):67–86, 2011. → pages 67, 71, 73[3] E. Acar, T. G. Kolda, and D. M. Dunlavy. All-at-once optimization forcoupled matrix and tensor factorizations. arXiv preprint arXiv:1105.3422,2011. → pages 77, 103[4] T. Adali, M. Anderson, and G.-S. Fu. Diversity in independent componentand vector analyses: Identifiability, algorithms, and applications in medicalimaging. IEEE Signal Processing Magazine, 31(3):18–33, 2014. → pages64[5] T. Adali, Y. Levin-Schwartz, and V. D. Calhoun. Multimodal data fusionusing source separation: Two effective models based on ica and iva andtheir properties. Proceedings of the IEEE, 103(9):1478–1493, 2015. →pages 50, 65[6] M. Aharon, M. Elad, and A. Bruckstein. rmk-svd: An algorithm fordesigning overcomplete dictionaries for sparse representation. IEEETransactions on signal processing, 54(11):4311–4322, 2006. → pages 9[7] A. Aissa-El-Bey, N. Linh-Trung, K. Abed-Meraim, A. Belouchrani, andY. Grenier. Underdetermined blind separation of nondisjoint sources in thetime-frequency domain. IEEE Transactions on Signal Processing, 55(3):897–907, 2007. → pages 7119[8] M. Anderson, T. Adali, and X.-L. Li. Joint blind source separation withmultivariate gaussian model: Algorithms and performance analysis. IEEETransactions on Signal Processing, 60(4):1672–1683, 2012. → pages 10[9] S. Arberet, R. Gribonval, and F. Bimbot. A robust method to count andlocate audio sources in a stereophonic linear anechoic mixture. InAcoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEEInternational Conference on, volume 3, pages III–745. IEEE, 2007. →pages 7[10] B. Arons. A review of the cocktail party effect. Journal of the AmericanVoice I/O Society, 12(7):35–50, 1992. → pages 2[11] M. Bin Altaf, T. Gautama, T. Tanaka, and D. P. Mandic. Rotation invariantcomplex empirical mode decomposition. In Acoustics, Speech and SignalProcessing, 2007. ICASSP 2007. IEEE International Conference on,volume 3, pages III–1009. IEEE, 2007. → pages 25[12] B. Bouachache and P. Flandrin. Wigner-ville analysis of time-varyingsignals. In Acoustics, Speech, and Signal Processing, IEEE InternationalConference on ICASSP ’82., volume 7, pages 1329–1332, May 1982. →pages 54[13] L. Brechet, M.-F. Lucas, C. Doncarli, and D. Farina. Compression ofbiomedical signals with mother wavelet optimization and best-basiswavelet packet selection. Biomedical Engineering, IEEE Transactions on,54(12):2186–2192, 2007. → pages 21[14] V. D. Calhoun, J. Liu, and T. Adalı. A review of group ica for fmri data andica for joint inference of imaging, genetic, and erp data. Neuroimage, 45(1):S163–S172, 2009. → pages 10, 11[15] A. T. Cemgil, C. Fe´votte, and S. J. Godsill. Variational and stochasticinference for bayesian source separation. Digital Signal Processing, 17(5):891–913, 2007. → pages 9[16] J.-C. Chao. On the design of robust criteria and algorithms for blind sourceseparation. PhD thesis, Southern Methodist University, 2007. → pages 5[17] X. Chen, C. He, Z. J. Wang, and M. J. McKeown. An ic-pls framework forgroup corticomuscular coupling analysis. Biomedical Engineering, IEEETransactions on, 60(7):2022–2033, 2013. → pages 50120[18] X. Chen, C. He, and H. Peng. Removal of muscle artifacts fromsingle-channel eeg based on ensemble empirical mode decomposition andmultiset canonical correlation analysis. Journal of Applied Mathematics,2014, 2014. → pages 97, 108[19] X. Chen, A. Liu, M. J. McKeown, H. Poizner, and Z. J. Wang. Aneemd-iva framework for concurrent multidimensional eeg andunidimensional kinematic data analysis. IEEE Transactions on BiomedicalEngineering, 61(7):2187–2198, 2014. → pages 11, 22[20] X. Chen, Z. J. Wang, and M. J. McKeown. A three-step multimodalanalysis framework for modeling corticomuscular activity with applicationto parkinsons disease. Biomedical and Health Informatics, IEEE Journalof, 18(4):1232–1241, 2014. → pages 38, 47, 60, 88[21] X. Chen, A. Liu, J. Chiang, Z. J. Wang, M. J. McKeown, and R. K. Ward.Removing muscle artifacts from eeg data: Multichannel or single-channeltechniques? IEEE Sensors Journal, 16(7):1986–1997, 2016. → pages 97,98, 105, 108, 109[22] X. Chen, Z. J. Wang, and M. McKeown. Joint blind source separation forneurophysiological data analysis: Multiset and multimodal methods. IEEESignal Processing Magazine, 33(3):86–107, 2016. → pages 64[23] S. Choi, A. Cichocki, and A. Beloucharni. Second order nonstationarysource separation. The Journal of VLSI Signal Processing, 32(1):93–104,2002. → pages 98[24] A. Cichocki and S.-i. Amari. Adaptive blind signal and image processing:learning algorithms and applications, volume 1. John Wiley & Sons, 2002.→ pages 5[25] P. Comon and C. Jutten. Handbook of Blind Source Separation:Independent component analysis and applications. Academic press, 2010.→ pages 5[26] M. Congedo, R. Phlypo, and J. Chatel-Goldman. Orthogonal andnon-orthogonal joint blind source separation in the least-squares sense. InSignal Processing Conference (EUSIPCO), 2012 Proceedings of the 20thEuropean, pages 1885–1889. IEEE, 2012. → pages 66[27] N. M. Correa, T. Adali, Y.-O. Li, and V. D. Calhoun. Canonical correlationanalysis for data fusion and group inferences. Signal Processing Magazine,IEEE, 27(4):39–50, 2010. → pages 59, 87121[28] M. Crespo-Garcia, M. Atienza, and J. L. Cantero. Muscle artifact removalfrom human sleep eeg by using independent component analysis. Annals ofbiomedical engineering, 36(3):467–475, 2008. → pages 97[29] W. De Clercq, A. Vergult, B. Vanrumste, W. Van Paesschen, andS. Van Huffel. Canonical correlation analysis applied to remove muscleartifacts from the electroencephalogram. IEEE transactions on BiomedicalEngineering, 53(12):2583–2587, 2006. → pages 21, 98, 104[30] L. De Lathauwer. A link between the canonical decomposition inmultilinear algebra and simultaneous matrix diagonalization. SIAMJournal on Matrix Analysis and Applications, 28(3):642–666, 2006. →pages 54, 73[31] L. De Lathauwer and J. Castaing. Blind identification of underdeterminedmixtures by simultaneous matrix diagonalization. Signal Processing, IEEETransactions on, 56(3):1096–1105, 2008. → pages xiii, 7, 8, 51, 54, 58, 65,66, 74, 84, 86, 89, 91, 93, 107, 109, 116[32] L. De Lathauwer, J. Castaing, and J.-F. Cardoso. Fourth-ordercumulant-based blind identification of underdetermined mixtures. IEEETransactions on Signal Processing, 55(6):2965–2973, 2007. → pages 8[33] A. Delorme, T. Sejnowski, and S. Makeig. Enhanced detection of artifactsin eeg data using higher-order statistics and independent componentanalysis. Neuroimage, 34(4):1443–1449, 2007. → pages 105[34] I. Domanov and L. De Lathauwer. Canonical polyadic decomposition ofthird-order tensors: relaxed uniqueness conditions and algebraic algorithm.Linear Algebra and its Applications, 513:342–375, 2017. → pages 73[35] I. Domanov and L. D. Lathauwer. Canonical polyadic decomposition ofthird-order tensors: reduction to generalized eigenvalue decomposition.SIAM Journal on Matrix Analysis and Applications, 35(2):636–660, 2014.→ pages 73[36] I. Domanov and L. D. Lathauwer. Generic uniqueness conditions for thecanonical polyadic decomposition and indscal. SIAM Journal on MatrixAnalysis and Applications, 36(4):1567–1589, 2015. → pages 73[37] T. Dong, Y. Lei, and J. Yang. An algorithm for underdetermined mixingmatrix estimation. Neurocomputing, 104:26–34, 2013. → pages 7122[38] D. M. Dunlavy, T. G. Kolda, and E. Acar. Poblano v1. 0: A matlab toolboxfor gradient-based optimization. Sandia National Laboratories,Albuquerque, NM and Livermore, CA, Tech. Rep. SAND2010-1422, 2010.→ pages 78, 104[39] J. Escudero, R. Hornero, D. Aba´solo, and A. Ferna´ndez. Quantitativeevaluation of artifact removal in real magnetoencephalogram signals withblind source separation. Annals of biomedical engineering, 39(8):2274–2286, 2011. → pages 22[40] E. Fishler and H. V. Poor. Estimation of the number of sources inunbalanced arrays via information theoretic criteria. IEEE Transactions onSignal Processing, 53(9):3543–3553, 2005. → pages 116[41] E. Fishler, M. Grosmann, and H. Messer. Detection of signals byinformation theoretic criteria: General asymptotic performance analysis.IEEE Transactions on Signal Processing, 50(5):1027–1036, 2002. →pages 116[42] P. Flandrin, G. Rilling, and P. Goncalves. Empirical mode decompositionas a filter bank. Signal Processing Letters, IEEE, 11(2):112–114, 2004. →pages 40[43] J. Gao, C. Zheng, and P. Wang. Online removal of muscle artifact fromelectroencephalogram signals based on canonical correlation analysis.Clinical EEG and neuroscience, 41(1):53–59, 2010. → pages 98[44] S. Ge, Q. Yang, R. Wang, P. Lin, J. Gao, Y. Leng, Y. Yang, and H. Wang. Abrain-computer interface based on a few-channel eeg-fnirs bimodal system.IEEE Access, 5:208–218, 2017. → pages 64[45] A. L. Goldberger, L. A. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov,R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley.Physiobank, physiotoolkit, and physionet components of a new researchresource for complex physiologic signals. Circulation, 101(23):e215–e220,2000. → pages 60, 88[46] X.-F. Gong, X.-L. Wang, and Q.-H. Lin. Generalized non-orthogonal jointdiagonalization with lu decomposition and successive rotations. IEEETRANSACTIONS ON SIGNAL PROCESSING, 63(5), 2015. → pages 66,74123[47] R. Gribonval and S. Lesage. A survey of sparse component analysis forblind source separation: principles, perspectives, and new challenges. InESANN’06 proceedings-14th European Symposium on Artificial NeuralNetworks, pages 323–330. d-side publi., 2006. → pages 4[48] A. R. Groves, C. F. Beckmann, S. M. Smith, and M. W. Woolrich. Linkedindependent component analysis for multimodal data fusion. Neuroimage,54(3):2198–2217, 2011. → pages 65[49] H. Hotelling. Relations between two sets of variates. Biometrika, 28(3/4):321–377, 1936. → pages 10, 23, 27[50] N. E. Huang, Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, N.-C.Yen, C. C. Tung, and H. H. Liu. The empirical mode decomposition andthe hilbert spectrum for nonlinear and non-stationary time series analysis.In Proceedings of the Royal Society of London A: Mathematical, Physicaland Engineering Sciences, volume 454, pages 903–995. The Royal Society,1998. → pages 21[51] A. Hyva¨rinen and E. Oja. Independent component analysis: algorithms andapplications. Neural networks, 13(4):411–430, 2000. → pages 6[52] Y. Ichimaru and G. Moody. Development of the polysomnographicdatabase on cd-rom. Psychiatry and Clinical Neurosciences, 53(2):175–177, 1999. → pages 32[53] M. T. Jensen, J. L. Marott, P. Lange, J. Vestbo, P. Schnohr, O. W. Nielsen,J. S. Jensen, and G. B. Jensen. Resting heart rate is a predictor of mortalityin copd. European Respiratory Journal, 42(2):341–349, 2013. → pages 20[54] S. Junnila, H. Kailanto, J. Merilahti, A.-M. Vainio, A. Vehkaoja,M. Zakrzewski, and J. Hyttinen. Wireless, multipurpose in-home healthmonitoring platform: Two case trials. Information Technology inBiomedicine, IEEE Transactions on, 14(2):447–455, 2010. → pages 20[55] W. Karlen, S. Raman, J. M. Ansermino, and G. A. Dumont. Multiparameterrespiratory rate estimation from the photoplethysmogram. BiomedicalEngineering, IEEE Transactions on, 60(7):1946–1953, 2013. → pages 32[56] J. R. Kettenring. Canonical analysis of several sets of variables.Biometrika, pages 433–451, 1971. → pages 10, 27, 50, 64, 116[57] H. A. Kiers. Towards a standardized notation and terminology in multiwayanalysis. Journal of chemometrics, 14(3):105–122, 2000. → pages 67, 69124[58] S. Kim and C. D. Yoo. Underdetermined blind source separation based ongeneralized gaussian distribution. In 2006 16th IEEE Signal ProcessingSociety Workshop on Machine Learning for Signal Processing, pages103–108. IEEE, 2006. → pages 80[59] S. Kim and C. D. Yoo. Underdetermined blind source separation based onsubspace representation. IEEE Transactions on Signal processing, 57(7):2604–2614, 2009. → pages 9, 65, 79, 80, 85, 86, 89, 93, 104, 108[60] M. Kleinsteuber and H. Shen. Blind source separation with compressivelysensed linear mixtures. Signal Processing Letters, IEEE, 19(2):107–110,2012. → pages 51, 65, 113[61] T. G. Kolda and B. W. Bader. Tensor decompositions and applications.SIAM review, 51(3):455–500, 2009. → pages 53, 54, 67, 69, 71, 73[62] Z. Koldovsky, P. Tichavsk, A. H. Phan, and A. Cichocki. A two-stagemmse beamformer for underdetermined signal separation. IEEE SignalProcessing Letters, 20(12):1227–1230, 2013. ISSN 1070-9908.doi:10.1109/LSP.2013.2285932. → pages 51, 65, 71[63] E. Kristal-Boneh, H. Silber, G. Harari, and P. Froom. The association ofresting heart rate with cardiovascular, cancer and all-cause mortality. eightyear follow-up of 3527 male israeli employees (the cordis study).European heart journal, 21(2):116–124, 2000. → pages 20[64] J. B. Kruskal. Three-way arrays: rank and uniqueness of trilineardecompositions, with application to arithmetic complexity and statistics.Linear algebra and its applications, 18(2):95–138, 1977. → pages 72[65] D. Labate, F. La Foresta, G. Morabito, I. Palamara, and F. C. Morabito.Entropic measures of eeg complexity in alzheimer’s disease through amultivariate multiscale approach. IEEE Sensors Journal, 13(9):3284–3292,2013. → pages 98[66] J. Lee, D. D. McManus, S. Merchant, and K. H. Chon. Automatic motionand noise artifact detection in holter ecg data using empirical modedecomposition and statistical approaches. Biomedical Engineering, IEEETransactions on, 59(6):1499–1506, 2012. → pages 21[67] J.-H. Lee, T.-W. Lee, F. A. Jolesz, and S.-S. Yoo. Independent vectoranalysis (iva): multivariate approach for fmri group study. Neuroimage, 40(1):86–109, 2008. → pages 9, 10125[68] X.-L. Li, M. Anderson, and T. Adalı. Second and higher-order correlationanalysis of multiple multidimensional variables by joint diagonalization. InInternational Conference on Latent Variable Analysis and SignalSeparation, pages 197–204. Springer, 2010. → pages 10[69] X.-L. Li, T. Adalı, and M. Anderson. Joint blind source separation bygeneralized joint diagonalization of cumulant matrices. Signal Processing,91(10):2314–2322, 2011. → pages 50, 60, 64, 65, 70, 88[70] Y. Li, A. Cichocki, and S.-i. Amari. Analysis of sparse representation andblind source separation. Neural computation, 16(6):1193–1234, 2004. →pages 9[71] Y. Li, S.-I. Amari, A. Cichocki, D. W. Ho, and S. Xie. Underdeterminedblind source separation based on sparse representation. IEEE Transactionson signal processing, 54(2):423–437, 2006. → pages 7, 9[72] Y.-O. Li, T. Adali, W. Wang, and V. D. Calhoun. Joint blind sourceseparation by multiset canonical correlation analysis. Signal Processing,IEEE Transactions on, 57(10):3918–3929, 2009. → pages 10, 23, 27, 47,50, 59, 60, 61, 64, 70, 85, 88, 89, 93[73] A. P. Liavas and P. A. Regalia. On the behavior of information theoreticcriteria for model order selection. IEEE Transactions on Signal Processing,49(8):1689–1695, 2001. → pages 116[74] J. Lin and A. Zhang. Fault feature separation using wavelet-ica filter. NDT& E International, 38(6):421–427, 2005. → pages 22[75] V. Ma¨kinen, H. Tiitinen, and P. May. Auditory event-related responses aregenerated independently of ongoing brain activity. Neuroimage, 24(4):961–968, 2005. → pages 105[76] D. Mandic et al. Filter bank property of multivariate empirical modedecomposition. Signal Processing, IEEE Transactions on, 59(5):2421–2426, 2011. → pages 22[77] D. P. Mandic, N. U. Rehman, Z. Wu, and N. E. Huang. Empirical modedecomposition-based time-frequency analysis of multivariate signals: thepower of adaptive data analysis. Signal Processing Magazine, IEEE, 30(6):74–86, 2013. → pages 21, 23, 113126[78] M. D. McDonnell and D. Abbott. What is stochastic resonance?definitions, misconceptions, debates, and its relevance to biology. PLoSComput Biol, 5(5):e1000348, 2009. → pages 98[79] B. W. McMenamin, A. J. Shackman, L. L. Greischar, and R. J. Davidson.Electromyogenic artifacts and electroencephalographic inferencesrevisited. Neuroimage, 54(1):4–9, 2011. → pages 97[80] L. Mesin, A. Holobar, and R. Merletti. Blind source separation:Application to biomedical signals. 2011. → pages 5[81] B. Mijovic, M. De Vos, I. Gligorijevic, J. Taelman, and S. Van Huffel.Source separation from single-channel recordings by combiningempirical-mode decomposition and independent component analysis.Biomedical Engineering, IEEE Transactions on, 57(9):2188–2196, 2010.→ pages 21, 22, 27, 28[82] G. B. Moody and R. G. Mark. The mit-bih arrhythmia database on cd-romand software for use with it. In Computers in Cardiology 1990,Proceedings., pages 185–188. IEEE, 1990. → pages 32[83] H. Nam, T.-G. Yim, S. K. Han, J.-B. Oh, and S. K. Lee. Independentcomponent analysis of ictal eeg in medial temporal lobe epilepsy.Epilepsia, 43(2):160–164, 2002. → pages 98[84] D. Nion, K. N. Mokios, N. D. Sidiropoulos, and A. Potamianos. Batch andadaptive parafac-based blind separation of convolutive speech mixtures.IEEE Transactions on Audio, Speech, and Language Processing, 18(6):1193–1207, 2010. → pages 8[85] P. Palatini, E. Casiglia, P. Pauletto, J. Staessen, N. Kaciroti, and S. Julius.Relationship of tachycardia with high blood pressure and metabolicabnormalities a study with mixture analysis in three populations.Hypertension, 30(5):1267–1273, 1997. → pages 20[86] P. Palatini, E. Casiglia, S. Julius, and A. C. Pessina. High heart rate: a riskfactor for cardiovascular death in elderly men. Archives of internalmedicine, 159(6):585–592, 1999. → pages 20[87] C. Park, D. Looney, A. Ahrabian, D. Mandic, et al. Classification of motorimagery bci using multivariate empirical mode decomposition. NeuralSystems and Rehabilitation Engineering, IEEE Transactions on, 21(1):10–22, 2013. → pages 22127[88] M. Rajih, P. Comon, and R. A. Harshman. Enhanced line search: A novelmethod to accelerate parafac. SIAM Journal on Matrix Analysis andApplications, 30(3):1128–1147, 2008. → pages 54[89] N. Rehman and D. P. Mandic. Multivariate empirical mode decomposition.In Proceedings of the Royal Society of London A: Mathematical, Physicaland Engineering Sciences, page rspa20090502. The Royal Society, 2009.→ pages 25[90] N. U. Rehman and D. P. Mandic. Filter bank property of multivariateempirical mode decomposition. Signal Processing, IEEE Transactions on,59(5):2421–2426, 2011. → pages 23, 25, 26, 40, 46[91] V. G. Reju, S. N. Koh, and Y. Soon. An algorithm for mixing matrixestimation in instantaneous blind source separation. Signal Processing, 89(9):1762–1773, 2009. → pages 8, 65[92] G. Rilling, P. Flandrin, P. Gonc¸alves, and J. M. Lilly. Bivariate empiricalmode decomposition. Signal Processing Letters, IEEE, 14(12):936–939,2007. → pages 25[93] D. Safieddine, A. Kachenoura, L. Albera, G. Birot, A. Karfoul, A. Pasnicu,A. Biraben, F. Wendling, L. Senhadji, and I. Merlet. Removal of muscleartifact from eeg data: comparison between stochastic (ica and cca) anddeterministic (emd and wavelet-based) approaches. EURASIP Journal onAdvances in Signal Processing, 2012(1):1–15, 2012. → pages 21[94] H. Sawada, R. Mukai, S. Araki, and S. Makino. Estimating the number ofsources using independent component analysis. Acoustical science andtechnology, 26(5):450–452, 2005. → pages 116[95] H. Sawada, S. Araki, and S. Makino. Underdetermined convolutive blindsource separation via frequency bin-wise clustering and permutationalignment. IEEE Transactions on Audio, Speech, and LanguageProcessing, 19(3):516–527, 2011. → pages 13[96] T. Shany, S. J. Redmond, M. R. Narayanan, and N. H. Lovell.Sensors-based wearable systems for monitoring of human movement andfalls. Sensors Journal, IEEE, 12(3):658–670, 2012. → pages 20[97] H. Snoussi and J. Idier. Bayesian blind separation of generalizedhyperbolic processes in noisy and underdeterminate mixtures. IEEETransactions on Signal Processing, 54(9):3257–3269, 2006. → pages 9128[98] S. Soltanian, A. Servati, R. Rahmanian, F. Ko, and P. Servati. Highlypiezoresistive compliant nanofibrous sensors for tactile and epidermalelectronic applications. Journal of Materials Research, 30(01):121–129,2015. → pages 20, 34[99] L. Sorber, M. Van Barel, and L. De Lathauwer. Optimization-basedalgorithms for tensor decompositions: Canonical polyadic decomposition,decomposition in rank-(l r,l r,1) terms, and a new generalization. SIAMJournal on Optimization, 23(2):695–720, 2013. → pages 73[100] L. Sorber, M. Van Barel, and L. De Lathauwer. Structured data fusion.IEEE Journal of Selected Topics in Signal Processing, 9(4):586–600, 2015.→ pages 77, 103[101] M. Sørensen, I. Domanov, and L. De Lathauwer. Coupled canonicalpolyadic decompositions and (coupled) decompositions in multilinearrank-(l r,n,l r,n,1) terms—part ii: Algorithms. SIAM Journal on MatrixAnalysis and Applications, 36(3):1015–1045, 2015. → pages 77[102] G. Strang. Introduction to Linear Algebra. Wellesley-Cambridge Press,2003. ISBN 9780961408893. → pages 80[103] K. T. Sweeney, S. F. McLoone, and T. E. Ward. The use of ensembleempirical mode decomposition with canonical correlation analysis as anovel artifact removal technique. Biomedical Engineering, IEEETransactions on, 60(1):97–105, 2013. → pages 22, 28[104] P. Tichavsky and Z. Koldovsky. Weight adjusted tensor method for blindseparation of underdetermined mixtures of nonstationary sources. IEEETransactions on Signal Processing, 59(3):1037–1047, 2011. → pages 65[105] N. ur Rehman, C. Park, N. E. Huang, and D. P. Mandic. Emd via memd:multivariate noise-aided computation of standard emd. Advances inAdaptive Data Analysis, 5(02), 2013. → pages 29[106] J. A. Urigu¨en and B. Garcia-Zapirain. Eeg artifact removalstate-of-the-artand guidelines. Journal of neural engineering, 12(3):031001, 2015. →pages 97[107] H. L. Van Trees. Detection, estimation, and modulation theory, optimumarray processing. John Wiley & Sons, 2004. → pages 79129[108] R. R. Va´zquez, H. Velez-Perez, R. Ranta, V. L. Dorr, D. Maquin, andL. Maillard. Blind source separation, wavelet denoising and discriminantanalysis for eeg artefacts and noise cancelling. Biomedical SignalProcessing and Control, 7(4):389–400, 2012. → pages 21[109] N. Vervliet, O. Debals, and L. De Lathauwer. Tensorlab 3.0-numericaloptimization strategies for large-scale constrained and coupledmatrix/tensor factorization. In 2016 Conference Record of the 50thAsilomar Conference on Signals, Systems and Computers. IEEE, 2016. →pages 77, 78, 103[110] E. Vincent, R. Gribonval, and C. Fe´votte. Performance measurement inblind audio source separation. IEEE transactions on audio, speech, andlanguage processing, 14(4):1462–1469, 2006. → pages 5[111] L. Z. Wang, X. Chen, X. Ji, and Z. J. Underdetermined joint blind sourceseparation of multiple datasets. IEEE Access, 5:7474–7487, 2017. →pages 8, 99, 100, 107, 109[112] E. Wigner. On the quantum correction for thermodynamic equilibrium.Physical Review, 40(5):749–759, 1932. → pages 54[113] Z. Wu and N. E. Huang. A study of the characteristics of white noise usingthe empirical mode decomposition method. Proceedings of the RoyalSociety of London. Series A: Mathematical, Physical and EngineeringSciences, 460(2046):1597–1611, 2004. → pages 24[114] Z. Wu and N. E. Huang. Ensemble empirical mode decomposition: anoise-assisted data analysis method. Advances in adaptive data analysis, 1(01):1–41, 2009. → pages 22[115] G. Wunder, H. Boche, T. Strohmer, and P. Jung. Sparse signal processingconcepts for efficient 5g system design. IEEE Access, 3:195–208, 2015. →pages 65[116] S. Xie, L. Yang, J.-M. Yang, G. Zhou, and Y. Xiang. Time-frequencyapproach to underdetermined blind source separation. Neural Networksand Learning Systems, IEEE Transactions on, 23(2):306–316, 2012. →pages 51, 54, 55, 56, 59, 60, 61, 65, 66, 79, 86[117] N. Yeung, R. Bogacz, C. B. Holroyd, S. Nieuwenhuis, and J. D. Cohen.Theta phase resetting and the error-related negativity. Psychophysiology,44(1):39–49, 2007. → pages 105130[118] T. Yilmaz, R. Foster, and Y. Hao. Detecting vital signs with wearablewireless sensors. Sensors, 10(12):10837–10862, 2010. → pages 20[119] L. Zhen, D. Peng, Z. Yi, Y. Xiang, and P. Chen. Underdetermined blindsource separation using sparse coding. IEEE Transactions on NeuralNetworks and Learning Systems, 2016. → pages 9, 65, 85, 89, 93[120] G. Zhou and A. Cichocki. Canonical polyadic decomposition based on asingle mode blind source separation. Signal Processing Letters, IEEE, 19(8):523–526, 2012. → pages 53, 71[121] G. Zhou, A. Cichocki, Y. Zhang, and D. P. Mandic. Group componentanalysis for multiblock data: Common and individual feature extraction.IEEE Transactions on Neural Networks and Learning Systems, PP(99):1–14, 2015. ISSN 2162-237X. doi:10.1109/TNNLS.2015.2487364. →pages 50[122] G. Zhou, Q. Zhao, Y. Zhang, T. Adali, S. Xie, and A. Cichocki. Linkedcomponent analysis from matrices to high-order tensors: Applications tobiomedical data. Proceedings of the IEEE, 104(2):310–331, 2016. ISSN0018-9219. doi:10.1109/JPROC.2015.2474704. → pages 50, 51, 64[123] L. Zou, X. Chen, A. Servati, P. Servati, and M. J. McKeown. A heart beatrate detection framework using multiple nanofiber sensor signals. In Signaland Information Processing (ChinaSIP), 2014 IEEE China Summit &International Conference on, pages 242–246. IEEE, 2014. → pages 28, 30[124] L. Zou, X. Chen, and Z. J. Wang. Underdetermined joint blind sourceseparation for two datasets based on tensor decomposition. IEEE SignalProcessing Letters, 23(5):673–677, 2016. → pages xiii, 8, 63, 65, 74, 84,85, 86, 89, 91, 93, 99, 114[125] L. Zou, Z. J. Wang, X. Chen, and X. Ji. Underdetermined joint blind sourceseparation based on tensor decomposition. In Electrical and ComputerEngineering (CCECE), 2016 IEEE Canadian Conference on, pages 1–4.IEEE, 2016. → pages 66131Appendix ADerivationsThe Appendix is the proof of Proposition 1 in Chapter 4, asThe partial derivative of the objective function f with respect to each column ofthe desired matrices , i.e., {a(k)n }, un, vr and wn, are given by∂ f∂a(1)n=−P×2 a(2)n ×3 un−Q×2 a(3)n ×3 vn+N∑c=1[(a(2)n )T a(2)c (un)T uc+(a(3)n )T a(3)c (vn)T vc]a(1)c∂ f∂a(2)n=−P×1 a(1)n ×3 un−R×2 a(3)n ×3 wn+N∑c=1[(a(1)n )T a(1)c (un)T uc+(a(3)n )T a(3)c (wn)T wc]a(2)c∂ f∂a(3)n=−Q×1 a(1)n ×3 vn−R×1 a(2)n ×3 wn+N∑c=1[(a(1)n )T a(1)c (vn)T vc+(a(2)n )T a(2)c (wn)T wc]a(3)c∂ f∂un=−P×1 a(1)n ×2 a(2)n +N∑c=1[(a(1)n )T a(1)c (a(2)n )T a(2)c ]uc∂ f∂vn=−Q×1 a(1)n ×2 a(3)n +N∑c=1[(a(1)n )T a(1)c (a(3)n )T a(3)c ]vc∂ f∂wn=−R×1 a(2)n ×2 a(3)n +N∑c=1[(a(2)n )T a(2)c (a(2)n )T a(3)c ]wc.132A.1 Proof for Proposition 1Proo f . The three components of the objective function in (4.21), i.e., f (1)(A(1),A(2),U),f (2)(A(1),A(3),V ) and f (3)(A(2),A(3),W ), share similar structure, which is the dif-ference between one tensor and the corresponding estimated results. Therefore,we take f (1)(A(1),A(2),U) and its partial derivative with respect to a(1)n for furtheranalysis. It can be rewritten asf (1)(A(1),A(2),U)=‖P− [[A(1),A(2),U ]]‖2=‖P‖2︸ ︷︷ ︸f (1)1−2<P, [[A(1),A(2),U ]]>︸ ︷︷ ︸f (1)2+‖[[A(1),A(2),U ]]‖2︸ ︷︷ ︸f (1)3.(A.1)The first summand f (1)1 does not involve any variable and therefore∂ f (1)1∂a(1)n= 0, (A.2)where 0 is the zero vector with the same length as a(1)n . The second summand f(1)2is the inner product of the tensorP with its its polyadic decomposition, and it canbe computed asf (1)2 =<P, [[A(1),A(2),U ]]>=<P,N∑n=1a(1)n ◦a(2)n ◦un >=N∑n=1M∑i1=1M∑i2=1L∑i3=1pi1,i2,i3a(1)i1,na(2)i2,nui3,n=N∑n=1(P×1 a(1)n ×2 a(2)n ×3 un)=N∑n=1(P×2 a(2)n ×3 un)T a(1)n .(A.3)133The partial derivative of f (1)2 with respect to each column of A(1) is∂ f (1)2∂a(1)n=P×2 a(2)n ×3 un. (A.4)The third summand is the square of the Frobenius norm of P’s polyadic decom-position, and it can be computed asf (1)3 =‖[[A(1),A(2),U ]]‖2=<N∑n=1a(1)n ◦a(2)n ◦un,N∑n=1a(1)n ◦a(2)n ◦un >=N∑b=1N∑c=1((a(1)b )T (a(1)c )(a(2)b )T (a(2)c )(ub)T (uc))︸ ︷︷ ︸F(b,c)=F(n,n)+N∑b=1b6=nN∑c=1c 6=nF(b,c)+2N∑c=1c 6=nF(n,c),(A.5)where b and c denote the indices of the factor matrices. The first summand of f (1)3isF(n,n) = (a(1)n )T (a(1)n )(a(2)n )T (a(2)n )(un)T (un), (A.6)and its partial derivative with respect to the nth column of the factor matrix A(1) is∂F(n,n)∂a(1)n= 2((a(2)n )T a(2)n uTn un)a(1)n . (A.7)The second summand of f (1)3 does not involve the variable a(1)n and therefore thecorresponding partial derivative with respect to a(1)n is the zero vector with the samelength as a(1)n . The third summand of f(1)3 is2N∑c=1c 6=nF(n,c) = 2N∑c=1c 6=n(a(1)n )T (a(1)c )(a(2)n )T (a(2)c )(un)T (uc), (A.8)134and its partial derivative with respect to the a(1)n can be computed as 2∑Nc=1c 6=n[(a(2)n )T a(2)c (un)T uc]a(1)c .Therefore,∂ f (1)3∂a(1)n=2((a(2)n )T a(2)n uTn un)a(1)n+2N∑c=1c 6=n[(a(2)n )T a(2)c (un)T uc]a(1)c=2N∑c=1[(a(2)n )T a(2)c (un)T uc]a(1)c .(A.9)Combining all the above results, i.e., equation (A.2), (A.4) and (A.9), the partialderivative of f (1)(A(1),A(2),U) with respect to the a(1)n can be computed as∂ f (1)(A(1),A(2),U)∂a(1)n=∂ f (1)1∂a(1)n−2∂ f(1)2∂a(1)n+∂ f (1)3∂a(1)n=−2P×2 a(2)n ×3 un+2N∑c=1[(a(2)n )T a(2)c (un)T uc]a(1)c .(A.10)Similarly, we can calculate the partial derivative of f (2)(A(1),A(3),V ) with respectto the a(1)n as∂ f (2)(A(1),A(3),V )∂a(1)n=−2Q×2 a(3)n ×3 vn+2N∑c=1[(a(3)n )T a(3)c (vn)T vc]a(1)c .(A.11)f (3)(A(2),A(3),W ) does not involve the variable a(1)n and therefore∂ f (3)(A(2),A(3),W )∂a(1)n= 0. (A.12)135Consequently, the partial derivative of the objective function with respect to a(1)n is∂ f (A(1),A(2),A(3),U ,V ,W )∂a(1)n=12∂ f (1)∂a(1)n+12∂ f (2)∂a(1)n+12∂ f (3)∂a(1)n=−P×2 a(2)n ×3 un−Q×2 a(2)n ×3 vn+N∑c=1[(a(2)n )T a(2)c (un)T uc+(a(3)n )T a(3)c (vn)T vc]a(1)c .(A.13)This completes the proof of the first equation in Proposition 1. The proof of otherequations is similar to that of (A.13) and thus omitted here.136"@en ;
  edm:hasType "Thesis/Dissertation"@en ;
  vivo:dateIssued "2017-11"@en ;
  edm:isShownAt "10.14288/1.0355539"@en ;
  dcterms:language "eng"@en ;
  ns0:degreeDiscipline "Electrical and Computer Engineering"@en ;
  edm:provider "Vancouver : University of British Columbia Library"@en ;
  dcterms:publisher "University of British Columbia"@en ;
  dcterms:rights "Attribution-NonCommercial-NoDerivatives 4.0 International"@* ;
  ns0:rightsURI "http://creativecommons.org/licenses/by-nc-nd/4.0/"@* ;
  ns0:scholarLevel "Graduate"@en ;
  dcterms:title "Underdetermined joint blind source separation with application to physiological data"@en ;
  dcterms:type "Text"@en ;
  ns0:identifierURI "http://hdl.handle.net/2429/63013"@en .