A Hybrid Precoding and Signal Detection Framework for Future Wireless Systems

by

Kevin Bradley Dsouza

B.Tech., National Institute of Technology Karnataka, India, 2017

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF APPLIED SCIENCE

in

The Faculty of Graduate Studies
(Electrical and Computer Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

December 2018

© Kevin Bradley Dsouza 2018

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, a thesis/dissertation entitled:

A Hybrid Precoding and Signal Detection Framework for Future Wireless Systems

submitted by Kevin Bradley Dsouza in partial fulfillment of the requirements for the degree of Master of Applied Science in Electrical and Computer Engineering.

Examining Committee:

Vijay K. Bhargava, Supervisor
Vincent Wong, Supervisory Committee Member
Sudip Shekhar, Supervisory Committee Member

Abstract

With energy efficiency and spectrum management being major concerns in future wireless systems, this thesis primarily focuses on the precoding and signal detection capabilities of next generation wireless transceivers.

In the first part of the thesis, we present a parallel framework to make hybrid precoding competitive in fast-fading environments.
Specifically, (i) a low-complexity algorithm that exploits the block-diagonal, phase-only nature of the analog precoder in a partially connected structure (PCS) is proposed to arrive at a hybrid precoding solution for a multi-carrier single-user system using orthogonal frequency division multiplexing (OFDM), (ii) the original problem is broken down into independent subproblems of finding the magnitude and the phase components, which are solvable in parallel, (iii) a per-RF chain power constraint, which is much more practical in real systems, is introduced instead of the sum power constraint over all antennas, (iv) an alternating version of this scheme is proposed for increased spectral-efficiency gains, and (v) the wideband PCS architecture is critiqued for its applicability in future wireless systems and possible alternatives are discussed.

In the second part of the thesis, we present a signal detection and time-frequency localization framework for smart transceivers. Although deep learning techniques for image analysis have advanced at a breakneck pace over the past few years, their application to RF data remains relatively unexplored.
Our contributions are as follows: (i) we present a modification of an existing state-of-the-art object detection technique called Faster-RCNN (FRCNN) [1] for detection and time-frequency localization of signals in a spectrogram of a wideband RF capture, (ii) we discuss insights into design choices such as the short-time Fourier transform (STFT) parameters, spectrogram and anchor sizes, and network thresholds, (iii) we generate synthetic data as per the recently proposed WiFi High Throughput (WiFi-HT) protocol [2] and achieve a mean average precision (mAP) of up to 0.9 when the model is trained and tested on positive signal-to-noise ratio (SNR) values, and (iv) we bring to light certain drawbacks of the model with respect to low SNR levels and disparate signal sizes, and discuss possible solutions.

Lay Summary

Future wireless systems are poised for the adoption of high frequency bands in conjunction with multiple antenna systems for increased speed of communication. However, the adoption of these technologies comes with its own set of design challenges. One such challenge is to limit the consumption of energy as complexity increases. While future wireless transceivers need to be energy efficient, they should also aid in security applications. As RF communication becomes pervasive for control and data transmission, from unmanned aerial vehicles (UAVs) to internet of things (IoT) devices, detecting the presence of a wireless device by passive sensing becomes paramount for security purposes. These two concerns are the main focal points of this thesis.

Preface

The following publications have resulted from the research presented in this thesis:

• K. B. Dsouza, K. N. R. S. V. Prasad, and V. K. Bhargava, “Hybrid Precoding with Partially Connected Structure for mmWave Massive MIMO OFDM: A Parallel Framework and Feasibility Analysis,” IEEE Transactions on Wireless Communications, 2018. (Linked to Chapter 2)

• K. B. Dsouza*, K. N. R. S. V. Prasad*, and V. K. Bhargava, “A Deep Learning Framework for Signal Detection and Time-Frequency Localization in Wideband RF Systems,” to be submitted to IEEE Journal on Selected Areas in Communications. (Linked to Chapter 3)

Statement of Authorship

I am the primary author of the first publication and a joint primary author of the second publication listed above. I was responsible for developing the original ideas, deriving the mathematical solutions, and generating the simulation results for these publications. Prof. Vijay K. Bhargava, my research supervisor, shared with me his vast knowledge in the field and was a constant source of support. My colleague, friend, and co-author of both of the above works, K. N. R. Surya Prasad, provided valuable guidance and insights in identifying the research problems, developing solution methodologies, and documenting the results. Some of the simulation results were obtained using MATLAB, the computing environment by MathWorks [3], and CVX, a convex optimization software developed by Grant et al. [4].

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Mathematical Notations
List of Abbreviations
Acknowledgements
Dedication
1 Motivation
  1.1 The Promise of 5G Wireless Systems
  1.2 Need for Hybrid-Precoding
  1.3 Need for Automatic Signal Detection
  1.4 Outline of the Thesis
2 A Per-RF Chain Hybrid Precoding Approach for Wideband PCS Systems
  2.1 Introduction
  2.2 System Model
    2.2.1 Channel Model
  2.3 Magnitude-Phase Approach
    2.3.1 Magnitude Sub-Problem
    2.3.2 Phase Sub-Problem
    2.3.3 Par-ArgMod Algorithm
  2.4 Joint Formulation and Alt-ArgMod
  2.5 Complexity Analysis
  2.6 Simulation Results
    2.6.1 Spectral-Efficiency Evaluation
    2.6.2 Power Analysis
    2.6.3 Run-Time Evaluation
    2.6.4 Realistic Scenarios
  2.7 Critique of Wideband PCS
  2.8 Conclusion and Possible Future Work
3 Signal Detection and Time-Frequency Localization Using Deep Learning
  3.1 Introduction
  3.2 Framework for signal detection and time-frequency localization
  3.3 Faster RCNN Architecture
    3.3.1 Base Network
    3.3.2 Region Proposal Network
    3.3.3 Detector Network
    3.3.4 Loss Functions
    3.3.5 Training procedure
  3.4 Design Choices to Adopt FRCNN for Signal Detection and Time-Frequency Localization
    3.4.1 Choice of STFT parameters
    3.4.2 Spectrogram size
    3.4.3 Choice of base network
    3.4.4 Anchor box sizes and aspect ratios
    3.4.5 RPN max and min overlap
    3.4.6 Detector network min and max overlap
  3.5 Numerical Studies
    3.5.1 Dataset for training and testing
    3.5.2 Spectrogram generation
    3.5.3 Numerical thresholds for the FRCNN model
    3.5.4 Training performance evaluation
    3.5.5 Prediction Performance evaluation
  3.6 Conclusion and Possible Future Work
4 Conclusion
Bibliography
Appendix
A Proof of results in Chapter 2
  A.1 Optimality condition for wideband systems PCS
  A.2 Results with varying number of streams and subcarriers

List of Tables

2.1 Complexity comparison.
2.2 Run time comparison.
3.1 mAP with different number of anchors.

List of Figures

2.1 Hybrid precoding partially connected scheme in mmWave OFDM systems.
2.2 Hybrid precoding in OFDM as tensor factorization.
2.3 Frontal slice of the tensor $\mathbf{F}_{\mathrm{opt}}$.
2.4 Decomposition of the frontal slice of the tensor $\mathbf{F}_{\mathrm{opt}}$ in Fig. 2.3 into the product of an RF precoder $\mathbf{F}_{\mathrm{RF}}$ and a baseband precoder $\mathbf{F}_{\mathrm{BB}}$.
2.5 Spectral efficiency achieved by precoding algorithms when $N_s = N_{\mathrm{RF}}^t = N_{\mathrm{RF}}^r = 4$.
2.6 Spectral efficiency achieved by Par-ArgMod algorithm (a,b) with varying $N_{\mathrm{RF}}$ and (c,d) with varying SNR for different values of $N_{\mathrm{RF}}$.
2.7 Power level achieved by different RF-chains (a,b) with SDR-AltMin and (c,d) with Alt-ArgMod.
2.8 Spectral efficiency achieved by Par-ArgMod using different channel models when $N_s = N_{\mathrm{RF}}^t = N_{\mathrm{RF}}^r = 4$.
2.9 Spectral efficiency achieved by Optimal and hybrid precoders with increasing $N_t$.
3.1 Proposed framework for signal detection and time-frequency localization.
3.2 The time content, frequency content, and spectrogram of an example wideband RF capture, when the capture duration is 633 ms, center frequency is 2.4 GHz, wideband bandwidth is 56 MHz, and sampling rate is 56 MHz.
3.3 Faster RCNN Architecture.
3.4 Training loss convergence for the RPN classification, RPN regression, Detector classification, and Detector regression tasks in the FRCNN model.
3.5 Targets for the RPN and the inputs for the Detector. The red boxes are the computed targets and the RoIs, whereas the blue ones indicate the GT signals.
3.6 Test images of the trained model. The red boxes are the predicted bounding boxes, whereas the blue ones indicate the GT signals.
3.7 mAP for different base networks.
3.8 mAP with different RPN and Detector thresholds.
3.9 mAP vs SNR.
3.10 Test images of model trained on disparate anchor sizes. The red boxes are the predicted bounding boxes, whereas the blue ones indicate the GT signals.
A.1 Spectral efficiency at SNR = 0 dB, with $N_{\mathrm{RF}}^t = N_{\mathrm{RF}}^r = 4$.

Mathematical Notations

The following notations are used in this thesis. Boldface lowercase and uppercase letters, for example, $\mathbf{a}$ and $\mathbf{A}$, refer to vectors and matrices respectively. The notation $\mathbf{A}_{i,j}$ refers to the entry in row $i$ and column $j$ of matrix $\mathbf{A}$. $\mathbf{A}^H$ denotes the complex conjugate transpose of matrix $\mathbf{A}$. $|C|$ denotes the modulus of the complex number $C$ and $\arg(C)$ denotes its argument. $\|\mathbf{A}\|_F$ denotes the Frobenius norm of $\mathbf{A}$. $\mathbf{A}^{\dagger}$ denotes the Moore-Penrose pseudo inverse of $\mathbf{A}$, i.e., $\mathbf{A}^{\dagger} = (\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H$.
A vector with a superscript of *, for example, $\mathbf{x}^{*}$, refers to the ground truth, and a subscript of $a$, for example, $\mathbf{x}_a$, refers to the anchor box.

List of Abbreviations

3GPP : 3rd Generation Partnership Project
5G : Fifth Generation
AB : Anchor Box
ADC : Analog to Digital Converter
AED : Audio Event Detection
BN : Base Network
BS : Base Station
CDL : Clustered Delay Line
CNN : Convolutional Neural Network
CRNN : Convolutional Recurrent Neural Network
CRP : Chinese Restaurant Process
CSI : Channel State Information
CWD : Choi-Williams Distribution
DFT : Discrete Fourier Transform
FCS : Fully Connected Structure
FFT : Fast Fourier Transform
FRCNN : Faster-RCNN
GT : Ground Truth
HT : High Throughput
IFFT : Inverse Fast Fourier Transform
IoT : Internet of Things
IoU : Intersection over Union
ISM : Industrial Scientific and Medical
mAP : Mean Average Precision
MIMO : Multiple-Input Multiple-Output
MSE : Mean Square Error
NMS : Non-Max Suppression
NYU : New York University
OFDM : Orthogonal Frequency Division Multiplexing
OMP : Orthogonal Matching Pursuit
PCS : Partially Connected Structure
PSD : Power Spectral Density
PZF : Phased-Zero-Forcing
QCQP : Quadratically Constrained Quadratic Programming
QP : Quadratic Programming
RF : Radio Frequency
RoI : Region of Interest
RPN : Region Proposal Network
RSR : Ratio of Successful Recognition
SDP : Semi-Definite Programming
SIC : Successive Interference Cancellation
SNR : Signal to Noise Ratio
STFT : Short Time Fourier Transform
SV : Saleh-Valenzuela
SVD : Singular Value Decomposition
TCSLs : Time Clusters and Spatial Lobes
UAV : Unmanned Aerial Vehicle
UMi-Sc : Urban Microcell - Street Canyon
URA : Uniform Rectangular Array
USPA : Uniform Square Planar Array
VGA : Variable Gain Amplifier

Acknowledgements

I would like to thank my supervisor, Professor Vijay K. Bhargava, for his patience, knowledge, and generous financial support. I thank him for providing me with an excellent research atmosphere in his lab.
This thesis would not have been possible without his support, guidance, and encouragement.

I would like to sincerely thank Professor Lutz Lampe and Professor Vincent Wong for their informative courses on wireless systems and convex optimization, which provided me with the background and knowledge I needed to venture into these topics. The second part of this thesis was largely inspired by a collaboration with Skycope Technologies Inc. in association with Mitacs Canada. I am grateful to my colleagues at Skycope for their mentorship and to Mitacs for funding the research.

My colleague and friend, K. N. R. Surya Prasad, provided invaluable insights throughout the duration of this research and helped refine the final manuscripts to a large extent. I'm indebted to him for his kindness and support.

Dedication

To my mom, who taught me that affection is not for the weak, my dad, who constantly reaffirmed that inquisitiveness is always appreciated, and my sister, who demonstrates that resilience is a virtue.

Chapter 1
Motivation

1.1 The Promise of 5G Wireless Systems

Millimeter-Wave (mmWave) is a key technology that will play a pivotal role in 5G wireless communication systems [5]. The high capacity requirements of next-generation systems make the adoption of higher frequency bands such as mmWave inevitable. Recent measurement studies of frequency bands from 5 GHz to 100 GHz in New York City have shown the feasibility of using mmWave technology for cellular communication [6]. The main concern with mmWave is the high path loss and the absorption by atmospheric elements at such high frequencies [7]. On the upside, the small wavelengths at these frequencies make it viable to accommodate a large number of antenna elements at the base station (BS), enabling massive multiple-input multiple-output (MIMO) systems. This combination of mmWave and massive MIMO will provide spatial multiplexing and beamforming gains to make up for the detrimental high frequency effects.
Further, the spatial diversity offered by this system can be exploited for reliable communication.

1.2 Need for Hybrid-Precoding

Conventional MIMO systems use digital precoders in the baseband, which can manipulate both the magnitude and the phase of signals. This fully digital precoder requires RF chains comprising analog-to-digital converters (ADCs) and signal mixers, equal in number to the number of BS antennas. The power consumption and cost of these devices make it prohibitive to implement digital precoders for mmWave massive MIMO systems. These constraints give rise to a hybrid precoding architecture that addresses these issues by restricting the number of RF chains connecting the baseband precoder to the analog precoder [8]. The baseband precoder is reduced to a low-dimensional one while the analog precoder is high-dimensional, which is why low-cost phase shifters are used in place of variable gain amplifiers (VGAs) [6]. The phase shifter network that connects the RF chains to the antennas determines whether the system is fully or partially connected. The fully connected structure (FCS) connects each RF chain to all the antennas, delivering full beamforming gain. The partially connected structure (PCS) significantly reduces the number of phase shifters by connecting each RF chain to only a subset of the available antennas, sacrificing some beamforming gain in the process. In this thesis, we adopt the PCS design, propose a low-complexity algorithm that respects per-RF chain power constraints, and investigate its performance in a single-user mmWave wideband system.

1.3 Need for Automatic Signal Detection

With the emergence of the IoT, we are currently witnessing a steep surge in the number of wireless devices around us.
In future wireless systems, with IoT devices, UAVs, and smart sensors and actuators coexisting alongside traditional mobile phones and access points, it becomes imperative to distinguish between these devices from a privacy and security point of view. Technology that can detect and differentiate such heterogeneous wireless signals and localize their time-frequency span can be commercialized into products for wireless security and spectrum management. From the wireless security perspective, commercial products can be built to make ad-hoc security decisions such as sending emergency alerts on the potential presence of an unexpected wireless device, employing techniques to jam signals from the detected device, and estimating the geolocation of the detected device from the incumbent signals. From the spectrum management perspective, commercial products can be built to dynamically share spectrum among the vast number and variety of heterogeneous devices in the IoT space. Knowing a priori which time-frequency resources are under-utilized and which ones have minimum interference, smart spectrum allocations can be made in densely populated scenarios. In this thesis, we develop a deep learning framework to passively detect devices that are transmitting time-frequency localized content in the wideband RF spectrum of interest. We also estimate the time and frequency span of each detected wireless transmission.

1.4 Outline of the Thesis

In Chapter 2, we introduce a per-RF chain hybrid precoding approach for wideband PCS systems. The original joint formulation is broken down into independent magnitude and phase formulations, which are discussed in Sections 2.3, 2.3.1, and 2.3.2. The Par-ArgMod algorithm is presented in Section 2.3.3. Following this, a joint formulation leading to an alternating approach is discussed in Section 2.4. Complexity analysis and simulation results are presented in Sections 2.5 and 2.6.
Wideband PCS systems are critiqued in Section 2.7, and finally Section 2.8 concludes with a few insights and avenues for future work.

In Chapter 3, we introduce a signal detection and time-frequency localization framework using deep learning. The overall framework and the FRCNN architecture are discussed in Sections 3.2 and 3.3. The various design choices adopted are presented in Section 3.4. These are substantiated by the numerical studies elaborated in Section 3.5, and finally Section 3.6 concludes with the insights obtained and possible avenues for future work.

In Chapter 4, we provide a few concluding remarks, consolidate the insights obtained, and motivate future research directions.

Chapter 2
A Per-RF Chain Hybrid Precoding Approach for Wideband PCS Systems

2.1 Introduction

With massive MIMO systems poised to take over next generation wireless systems, hybrid precoding will become indispensable in these systems from the energy-efficiency point of view. Several authors have investigated the hybrid precoding problem in the recent past [9]. The orthogonal matching pursuit (OMP) algorithm is widely adopted and gives satisfactory performance [8], [10]. A spatially sparse precoding scheme is proposed in [8], which exploits the sparse-scattering structure of the mmWave channel to formulate the precoder design as a matrix reconstruction problem constrained by sparsity. A phased-zero-forcing (PZF) scheme is proposed in [11], where a low-dimensional zero-forcing method is implemented on the equivalent channel obtained by the product of the RF precoder and the actual channel matrix. The above mentioned schemes cater to the FCS, which consumes high power due to the number of phase shifters used [12]. The PCS is seen as a promising alternative in this regard and can provide good performance with higher energy efficiency [13].

Partially connected architectures are looked into in [14]-[19]. The concept of successive interference cancellation (SIC) is used to derive the hybrid precoder in [16].
This work assumes that the baseband precoding matrix is diagonal, implying that power allocation is only done across the different data streams and that the number of RF chains must equal the number of data streams. This leaves the analog precoder to provide the beamforming gain, which might not be an optimum strategy [16], [17]. Codebook-based design of hybrid precoders for narrowband systems is discussed in [14]. This codebook-based method searches exhaustively over the analog and baseband precoders without a given design criterion. The above mentioned methods are developed for narrowband systems and do not consider multi-carrier transmission. Alternating minimization is used as the design criterion in [20] to optimize the analog and digital precoders iteratively. However, the method in [20] is computationally intensive and may not be the best strategy for fast fading channels.

All the above mentioned works consider a sum-power constraint on the baseband precoder. In practical hybrid precoding systems, each RF chain is equipped with its own amplifier, and therefore it would be more natural to consider a per-RF chain power constraint. There are works that use a per-antenna power constraint to perform hybrid precoding; however, most of them, like [21] and [22], use the FCS and their schemes are not extendable to the PCS. PCS design with a per-antenna power constraint is considered in [23], but this design is limited to narrowband systems and the extension of the scheme to wideband systems is not straightforward. Therefore, to the best of our knowledge, there is no existing work that addresses the problem of hybrid precoding with PCS in wideband systems employing per-RF chain power constraints.

In this chapter, we tackle the two main shortcomings in the existing literature. Firstly, we make our design adhere to the practical per-RF chain power constraint.
Secondly, we separate the magnitude and phase formulations to arrive at a much faster precoding solution than the existing state of the art in [20], which gives it the potential to be competitive in fast fading channels. Following this, we recognize that spectral efficiency gains can be achieved by alternating, and extend the proposed algorithm to alternate between the magnitudes and phases of the precoders. This alternating version is faster and converges better than the state-of-the-art alternating minimization in [20] because of the per-RF chain optimization approach, with minimal performance loss at higher SNR. In the coming section, we discuss the system and channel model used in this work and also introduce the metric that will be used to evaluate performance.

2.2 System Model

We consider a single-user mmWave OFDM system as shown in Fig. 2.1a, where $N_s$ data streams are sent over each subcarrier by $N_t$ transmit antennas and received by $N_r$ receive antennas. The number of RF chains at the transmitter and the receiver for each subcarrier are denoted as $N_{\mathrm{RF}}^t$ and $N_{\mathrm{RF}}^r$, respectively.

Figure 2.1: Hybrid precoding partially connected scheme in mmWave OFDM systems. (a) Architecture of hybrid precoding in mmWave OFDM systems. (b) Partially connected mapping structure.

The hybrid precoder at the transmitter comprises an $N_{\mathrm{RF}}^t \times N_s$ digital baseband precoder $\mathbf{F}_{\mathrm{BB}}$ and an $N_t \times N_{\mathrm{RF}}^t$ analog RF precoder $\mathbf{F}_{\mathrm{RF}}$. The transmitted signal can therefore be written as $\mathbf{x} = \mathbf{F}_{\mathrm{RF}}\mathbf{F}_{\mathrm{BB}}\mathbf{s}$, where $\mathbf{s}$ is the $N_s \times 1$ symbol vector such that $\mathbb{E}[\mathbf{s}\mathbf{s}^H] = \frac{1}{N_s}\mathbf{I}_{N_s}$. The baseband precoding is performed in the frequency domain for each subcarrier, followed by an inverse fast Fourier transform (IFFT) operation that consolidates the signals of all the subcarriers. Since the analog precoding is a post-IFFT operation, all the signals have to share a common analog precoder [24]. Therefore, the received signal on each subcarrier $k$ can be expressed as

\[
\mathbf{y}[k] = \sqrt{\rho}\,\mathbf{W}_{\mathrm{BB}}^H[k]\mathbf{W}_{\mathrm{RF}}^H\mathbf{H}[k]\mathbf{F}_{\mathrm{RF}}\mathbf{F}_{\mathrm{BB}}[k]\mathbf{s} + \mathbf{W}_{\mathrm{BB}}^H[k]\mathbf{W}_{\mathrm{RF}}^H\mathbf{n}, \tag{2.1}
\]

where $k \in [0, K-1]$ is the subcarrier index, $\rho$ stands for the average received power, $\mathbf{W}_{\mathrm{BB}}[k]$ is the $N_{\mathrm{RF}}^r \times N_s$ digital baseband decoder for the $k$th subcarrier, $\mathbf{W}_{\mathrm{RF}}$ is the $N_r \times N_{\mathrm{RF}}^r$ analog decoder shared across all the subcarriers at the receiver, $\mathbf{H}[k]$ is the channel matrix for the $k$th subcarrier, $\mathbf{F}_{\mathrm{RF}}$ is the $N_t \times N_{\mathrm{RF}}^t$ shared analog RF precoder, $\mathbf{F}_{\mathrm{BB}}[k]$ is the $N_{\mathrm{RF}}^t \times N_s$ digital baseband precoder for the $k$th subcarrier, $K$ is the total number of subcarriers, and $\mathbf{n}$ is the additive white noise vector, the elements of which are independent and identically distributed (i.i.d.) complex Gaussian random variables with zero mean and variance $\sigma_n^2$. We assume that perfect channel state information (CSI) is available at both the transmitter and the receiver. In practical setups, CSI can be efficiently obtained by channel estimation at the receiver using an adaptive compressed sensing approach with discrete Fourier transform (DFT)-based codebook design [25].

With transmission of Gaussian symbols, the achievable spectral efficiency is given by

\[
R[k] = \log\det\Big(\mathbf{I}_{N_s} + \frac{\rho}{\sigma_n^2 N_s}\,(\mathbf{W}_{\mathrm{RF}}\mathbf{W}_{\mathrm{BB}}[k])^{\dagger}\,\mathbf{H}[k]\mathbf{F}_{\mathrm{RF}}\mathbf{F}_{\mathrm{BB}}[k] \times \mathbf{F}_{\mathrm{BB}}^H[k]\mathbf{F}_{\mathrm{RF}}^H\mathbf{H}^H[k]\,(\mathbf{W}_{\mathrm{RF}}\mathbf{W}_{\mathrm{BB}}[k])\Big). \tag{2.2}
\]

The phase-shifter-only implementation of the analog precoder confines its non-zero elements to satisfy the constant modulus constraint given by $|(\mathbf{F}_{\mathrm{RF}})_{i,j}| = \frac{1}{\sqrt{N_t}}$ and $|(\mathbf{W}_{\mathrm{RF}})_{i,j}| = \frac{1}{\sqrt{N_r}}$. In this work, the phase shifter network that connects the RF chains to the antennas is partially connected. Fig. 2.1b illustrates the partially connected mapping structure considered in our work, where each RF chain is connected to only a subset of $M_t = N_t/N_{\mathrm{RF}}^t$ antennas at the transmitter end. A similar structure is followed on the receiver side, with the number of antennas per RF chain being $M_r = N_r/N_{\mathrm{RF}}^r$. Unless otherwise specified, the use of $M$ alone assumes $M = M_t = M_r$ and the use of $N_{\mathrm{RF}}$ alone assumes $N_{\mathrm{RF}} = N_{\mathrm{RF}}^t = N_{\mathrm{RF}}^r$.
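The partially connected mapping described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the array sizes (16 antennas, 4 RF chains), and the random phases are hypothetical choices and are not taken from the thesis. It builds a block-diagonal analog precoder whose non-zero entries have modulus $1/\sqrt{N_t}$ and compares the number of phase shifters with that of a fully connected structure, in which every RF chain feeds every antenna.

```python
import cmath
import math
import random


def build_pcs_analog_precoder(n_t, n_rf, seed=0):
    """Return an n_t x n_rf block-diagonal analog precoder (nested lists).

    Antennas q*M_t ... (q+1)*M_t - 1 are wired to RF chain q only.
    The phases are random placeholders (assumption), not an optimized design.
    """
    if n_t % n_rf != 0:
        raise ValueError("n_t must be a multiple of n_rf")
    m_t = n_t // n_rf  # antennas per RF chain, M_t = N_t / N_RF
    rng = random.Random(seed)
    f_rf = [[0j] * n_rf for _ in range(n_t)]
    for q in range(n_rf):
        for i in range(q * m_t, (q + 1) * m_t):
            theta = rng.uniform(0.0, 2.0 * math.pi)
            # Constant modulus constraint: |entry| = 1 / sqrt(N_t)
            f_rf[i][q] = cmath.exp(1j * theta) / math.sqrt(n_t)
    return f_rf


def count_phase_shifters(f_rf):
    """Each non-zero entry corresponds to one phase shifter."""
    return sum(1 for row in f_rf for entry in row if entry != 0)


f_rf = build_pcs_analog_precoder(n_t=16, n_rf=4)
pcs_shifters = count_phase_shifters(f_rf)  # one per antenna in the PCS
fcs_shifters = 16 * 4                      # N_t * N_RF in the FCS
print(pcs_shifters, fcs_shifters)          # 16 64
```

In this toy setting the PCS needs $N_t$ phase shifters while the FCS needs $N_t N_{\mathrm{RF}}$, which is the hardware saving that motivates the partially connected design.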
2.2.1 Channel Model

In this work, we consider the clustered channel model, which is known to characterize mmWave channels very well [26], [27]. Specifically, considering the theoretical Saleh-Valenzuela (SV) model [6], the channel matrix for the $k$th subcarrier in the frequency domain is given following [24] as

\[
\mathbf{H}[k] = \beta \sum_{i=1}^{N_{cl}} \sum_{l=1}^{N_{ray}} \alpha_{il}\,\mathbf{a}_r(\phi_{il}^r, \theta_{il}^r)\,\mathbf{a}_t(\phi_{il}^t, \theta_{il}^t)^H\, e^{-j2\pi i k/K}, \tag{2.3}
\]

where $\beta = \sqrt{\frac{N_t N_r}{N_{cl} N_{ray}}}$ is the normalization factor, $N_{cl}$ and $N_{ray}$ represent the number of clusters and the number of rays in each cluster, $K$ is the total number of subcarriers, and $\alpha_{il}$ is the gain of the $l$th ray in the $i$th propagation cluster. It is assumed that the $\alpha_{il}$ terms are i.i.d. across the $N_{ray}$ rays in each cluster $i$ and that they follow a complex Gaussian distribution $\mathcal{CN}(0, \sigma_{\alpha,i}^2)$. The variance terms $\sigma_{\alpha,i}^2$ across the $N_{cl}$ clusters satisfy the condition $\sum_{i=1}^{N_{cl}} \sigma_{\alpha,i}^2 = \gamma$, where $\gamma$ is the normalization factor used to satisfy the constraint $\mathbb{E}[\|\mathbf{H}\|_F^2] = N_t N_r$. In (2.3), $\mathbf{a}_r(\phi_{il}^r, \theta_{il}^r)$ and $\mathbf{a}_t(\phi_{il}^t, \theta_{il}^t)$ refer to the array response vectors for the receiver and the transmitter respectively, where $\phi_{il}^r$ and $\theta_{il}^r$ represent the azimuth and elevation angles of arrival at the receiver, and $\phi_{il}^t$ and $\theta_{il}^t$ represent the corresponding angles of departure at the transmitter. For the antenna array architecture, we consider a uniform square planar array (USPA) at the transmitter with $\sqrt{N_t} \times \sqrt{N_t}$ antenna elements. Hence, the array response vector corresponding to the $l$th ray in the $i$th cluster for the transmitter is given as

\[
\mathbf{a}_t(\phi_{il}^t, \theta_{il}^t) = \frac{1}{\sqrt{N_t}}\Big[1, \ldots, e^{j\frac{2\pi}{\lambda} d\,(p \sin\phi_{il}^t \sin\theta_{il}^t + q \cos\theta_{il}^t)}, \ldots, e^{j\frac{2\pi}{\lambda} d\,((\sqrt{N_t}-1)\sin\phi_{il}^t \sin\theta_{il}^t + (\sqrt{N_t}-1)\cos\theta_{il}^t)}\Big]^T, \tag{2.4}
\]

where $d$ and $\lambda$ are the antenna spacing and signal wavelength, and $0 \le p \le \sqrt{N_t}-1$ and $0 \le q \le \sqrt{N_t}-1$ are the antenna indices in the 2D plane. A USPA with a similar array response vector is used at the receiver with $\sqrt{N_r} \times \sqrt{N_r}$ antenna elements and the same antenna spacing.
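As a rough illustration of the array response in (2.4), the sketch below evaluates the USPA vector for one ray. The function name, the half-wavelength spacing ($d/\lambda = 0.5$), the example angles, and the row-major ordering of the $(p, q)$ index pairs are assumptions made for this example; the thesis does not specify them in this section.

```python
import cmath
import math


def uspa_response(n_t, phi, theta, d_over_lambda=0.5):
    """Array response of a sqrt(n_t) x sqrt(n_t) USPA, following (2.4).

    phi, theta: azimuth and elevation angles in radians.
    d_over_lambda: antenna spacing in wavelengths (0.5 is an assumed value).
    Entries are ordered with p as the outer index and q as the inner index
    (an assumed ordering).
    """
    side = math.isqrt(n_t)
    if side * side != n_t:
        raise ValueError("n_t must be a perfect square for a USPA")
    scale = 1.0 / math.sqrt(n_t)
    response = []
    for p in range(side):
        for q in range(side):
            phase = 2.0 * math.pi * d_over_lambda * (
                p * math.sin(phi) * math.sin(theta) + q * math.cos(theta)
            )
            response.append(scale * cmath.exp(1j * phase))
    return response


a_t = uspa_response(n_t=16, phi=0.3, theta=1.1)
norm_sq = sum(abs(x) ** 2 for x in a_t)
print(len(a_t), round(norm_sq, 12))
```

Because every entry has modulus $1/\sqrt{N_t}$, the vector has unit norm regardless of the angles, which is what keeps the normalization in (2.3) independent of the ray directions.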
The results of this work can be extended to a uniform rectangular array (URA), the response vector for which can be found in [28]. This channel model is extended to conform to practical constraints as given in [26] to obtain the New York University (NYU) model, and as given in [27] to obtain the 3rd Generation Partnership Project (3GPP) model, for evaluation in realistic scenarios.

2.3 Magnitude-Phase Approach

One approach to solving the hybrid precoding problem is to make our design as close to the unconstrained fully digital precoder as possible with respect to a cost function [29]. This means that we have to minimize the Euclidean distance between the two designs in order to maximize performance, as noted in [8], [20], [30]. This poses an interesting problem of tensor factorization with certain constraints. The digital precoder has to deal with power constraints, and the analog precoder faces a constant modulus constraint because of the phase-shifter-only approach. Also, the analog precoder is shared by all the subcarriers, which means it has to be factored out of the tensor. We know that the fully digital optimal precoder and decoder are the first $N_s$ columns of the unitary matrices $\mathbf{V}$ and $\mathbf{U}$ of the channel matrix $\mathbf{H}$ respectively. We can obtain $\mathbf{V}$ and $\mathbf{U}$ from the singular value decomposition (SVD) of the channel matrix, i.e., $\mathbf{H} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H$.

From here on, we mainly focus on the precoder design only. We emphasize that the same design can be extended to the decoder as well because of the similarity in their mathematical formulations (cf. [8], for example). For OFDM systems, hybrid precoding is essentially a tensor factorization problem as shown in Fig. 2.2. The data tensor $\mathbf{F}_{\mathrm{opt}}$ is decomposed into a factor matrix $\mathbf{F}_{\mathrm{RF}}$ and a core tensor $\mathbf{F}_{\mathrm{BB}}$ according to the Tucker-1 decomposition model [29].
The factor matrix is shared by all the subcarriers, introducing an inherent suboptimality in the decomposition approach.

We transform the above-mentioned tensor factorization problem into two subproblems of estimating the magnitudes and phases of the elements of $F_{BB}[k]$ and $F_{RF}$.

Figure 2.2: Hybrid precoding in OFDM as tensor factorization.

Figure 2.3: Frontal slice of the tensor $F^{opt}$.

Each element of $F^{opt}$ can be written as

\[ F^{opt}_{i,j,k} = F_{i,j,k}\, e^{j\psi_{i,j,k}}. \tag{2.5} \]

A frontal slice of the tensor $F^{opt}$, representing a subcarrier, is illustrated in Fig. 2.3, where $F_{i,j,k} = |F^{opt}_{i,j,k}|$ and $\psi_{i,j,k} = \arg(F^{opt}_{i,j,k})$. Since $F_{RF}$ is the analog precoder, we can express each non-zero element in $F_{RF}$ as $F^{RF}_{i,j} = e^{j\theta_{i,j}}$. Note that we do not include the subscript $k$ for the $\theta_{i,j}$ terms in $F_{RF}$ because the analog precoder is shared across all the subcarriers. For the partially-connected precoder architecture under study (see Fig. 2.1b), $F_{RF}$ takes a block-diagonal structure because each RF chain is only connected to a small number $M_t$ out of the total number $N_t$ of antennas at the transmitter. This is also illustrated in Fig. 2.4a, where $|F^{RF}_{i,j}| = 1/\sqrt{N_t}$ for all the non-zero elements of $F_{RF}$ and $\theta_{i,j} = \arg(F^{RF}_{i,j})$.

Figure 2.4: Decomposition of the frontal slice of the tensor $F^{opt}$ in Fig. 2.3 into the product of an RF precoder $F_{RF}$ and a baseband precoder $F_{BB}$. (a) Block diagonal structure of $F_{RF}$. (b) Frontal slice of the tensor $F_{BB}$.

Moving to the baseband precoder $F_{BB}$, each of its elements can be written as

\[ F^{BB}_{i,j,k} = B_{i,j,k}\, e^{j\phi_{i,j,k}}, \tag{2.6} \]

where $B_{i,j,k} = |F^{BB}_{i,j,k}|$ and $\phi_{i,j,k} = \arg(F^{BB}_{i,j,k})$. A frontal slice of the tensor $F_{BB}$, corresponding to the subcarrier $k$, is illustrated in Fig. 2.4b. In order for our hybrid precoder to be as close to the optimal precoder as possible, we take the metric of mean squared error (MSE) following [16] and [20] and therefore, to minimize the approximation error involved, adopt the least squares L2-norm as the design objective.
The optimization problem for the $q$th RF chain turns out to be

\[ \begin{aligned} \underset{\theta, B, \phi}{\text{minimize}} \quad & \sum_{i=(q-1)M_t+1}^{qM_t} \sum_{j=1}^{N_s} \sum_{k=1}^{K} \left\| F_{i,j,k}\, e^{j\psi_{i,j,k}} - C B_{q,j,k}\, e^{j(\theta_i + \phi_{q,j,k})} \right\|^2 \\ \text{subject to} \quad & \|B_{q,:,k}\|^2 \le \frac{N_s}{N_t} \end{aligned} \tag{2.7} \]

where $C = \frac{1}{\sqrt{N_t}}$ is the constant modulus of the non-zero terms of the analog precoder. Omitting the summations, subscripts and constraints for ease of representation, the norm term can be simplified as follows:

\[ \begin{aligned} \|F e^{j\psi} - C B e^{j(\theta+\phi)}\|^2 &= [F\cos\psi - CB\cos(\theta+\phi)]^2 + [F\sin\psi - CB\sin(\theta+\phi)]^2 \\ &= F^2\cos^2\psi + C^2B^2\cos^2(\theta+\phi) - 2CFB\cos\psi\cos(\theta+\phi) \\ &\quad + F^2\sin^2\psi + C^2B^2\sin^2(\theta+\phi) - 2CFB\sin\psi\sin(\theta+\phi) \\ &= F^2 + C^2B^2 - 2CFB[\cos\psi\cos(\theta+\phi) + \sin\psi\sin(\theta+\phi)] \\ &= F^2 + C^2B^2 - 2CFB\cos(\psi - (\theta+\phi)). \end{aligned} \]

Since $\cos(x) \le 1$ and $C$, $F$ and $B$ are non-negative, we get the inequality

\[ F^2 + C^2B^2 - 2CFB \le F^2 + C^2B^2 - 2CFB\cos(\psi - (\theta+\phi)). \tag{2.8} \]

Equation (2.8) will be used to pose the magnitude problem in the coming subsection. For now, plugging this simplified version back into (2.7), the original problem boils down to

\[ \begin{aligned} \underset{\theta, B, \phi}{\text{minimize}} \quad & \sum_{i=(q-1)M_t+1}^{qM_t} \sum_{j=1}^{N_s} \sum_{k=1}^{K} F_{i,j,k}^2 + C^2 B_{q,j,k}^2 - 2 C F_{i,j,k} B_{q,j,k} \cos(\psi_{i,j,k} - (\theta_i + \phi_{q,j,k})) \\ \text{subject to} \quad & \|B_{q,:,k}\|^2 \le \frac{N_s}{N_t}, \quad q = 1, \ldots, N^t_{RF} \end{aligned} \tag{2.9} \]

In the coming subsections, we propose independent formulations for the magnitude and phase of the hybrid precoder. We also provide theoretical justifications on how the formulation given in (2.9), referred to from here on as the original formulation, can be relaxed into the proposed magnitude and phase formulations. Following this, a way to alternate between the magnitude and the phase formulations is discussed.

2.3.1 Magnitude Sub-Problem

From Fig. 2.2, we note that each row of the baseband precoder $F_{BB}$ is applied a different phase shift by the different non-zero entries of the analog precoder $F_{RF}$ to reach different rows of the optimal precoder.
Taking all the $K$ subcarriers into account, we can write the first $M_t K$ equations for the $q$th RF chain as

\[ F^{opt}_{i,:,k} \approx e^{j\theta_i} F^{BB}_{q,:,k}, \quad 1 \le i \le M_t, \; 1 \le k \le K, \; q = \left\lceil \frac{i}{M_t} \right\rceil. \tag{2.10} \]

Each of the $M_t K$ equations listed above can be written as two separate equations: one equating the magnitude of the LHS terms with that of the RHS terms, and the other equating the phase of the LHS terms with that of the RHS terms. The same procedure can be followed for the next $M_t K$ equations and so on. From (2.6) and (2.10), we can mathematically equate the magnitudes in the hybrid precoder for subcarrier $k$ and RF chain $q$ as

\[ F_{i,j,k} \approx B_{q,j,k}, \quad 1 \le i \le M_t, \; 1 \le j \le N_s, \; q = \left\lceil \frac{i}{M_t} \right\rceil. \tag{2.11} \]

As we consider MSE as the optimization metric, we adopt the least squares L2-norm as the design objective and formulate the magnitude sub-problem for each subcarrier $k$ and the $q$th RF chain as

\[ \begin{aligned} \underset{B_{q,:,k}}{\text{minimize}} \quad & \sum_{i=(q-1)M_t+1}^{qM_t} \sum_{j=1}^{N_s} \left\| B_{q,j,k} - F_{i,j,k}\sqrt{N_t} \right\|^2 \\ \text{subject to} \quad & \|B_{q,:,k}\|^2 \le \frac{N_s}{N_t} \end{aligned} \tag{2.12} \]

Remark 1 (On the relation between the original formulation in (2.9) and the magnitude formulation in (2.12)). From (2.8) and (2.9), we know that the lower bound to the original formulation consists of only the magnitude terms and is given by

\[ \begin{aligned} \underset{B_{q,j,k}}{\text{minimize}} \quad & \sum_{i=(q-1)M_t+1}^{qM_t} \sum_{j=1}^{N_s} \left\| F_{i,j,k} - B_{q,j,k} C \right\|^2 \\ \text{subject to} \quad & \|B_{q,:,k}\|^2 \le \frac{N_s}{N_t} \end{aligned} \tag{2.13} \]

Note that the optimization problems in (2.12) and (2.13) are equivalent because $C = \frac{1}{\sqrt{N_t}}$, so the two objectives differ only by the constant factor $N_t$. Consequently, our magnitude formulation minimizes a lower bound to the original formulation. An important aspect to note here is that by separating the magnitude formulation out of the original problem, we are able to do away with the dependence among the subcarriers. This allows us to solve the magnitude problem for each subcarrier independently. Since we operate with a total of $K$ subcarriers, we solve $K$ problems of the form (2.12), one for each subcarrier.
This can be done in parallel because the magnitude subproblems do not have any inter-dependencies between subcarriers.

In (2.12), we have considered a per-RF chain power constraint for the baseband precoder. Beamforming with a per-antenna power constraint has been investigated in [31]-[34]; the philosophy in these works can be extended to a per-RF chain power constraint as well. Most of the works on beamforming adopt a sum-power constraint on the antennas. Following [31], we believe that a per-RF chain power constraint is more realistic because, in practical hybrid precoding systems, each RF chain is equipped with its own power amplifier and is limited by the linearity of that amplifier. Having a per-RF chain power constraint would also aid in achieving equal power allocation among different RF chains, so that all RF chains will be equally active at a given time [35]. This is further elaborated in Section 2.6.2.

The per-RF chain power constraint follows from the formulation of the regular precoding constraint, which limits the Frobenius norm of the precoders, i.e.,

\[ \|F_{RF} F_{BB}\|_F^2 = N_s. \tag{2.14} \]

Due to the special block-diagonal structure of the analog precoder $F_{RF}$ (see Fig. 2.4a), the power constraint can be rewritten as

\[ \frac{N_t}{N^t_{RF}} \|F_{BB}\|_F^2 = N_s. \tag{2.15} \]

Dividing the power equally among the $N^t_{RF}$ RF chains, (2.15) transforms into the per-RF chain power constraint given in (2.12).

To solve (2.12), we first relax the power constraint so as to obtain a power budget that maintains the convexity of the problem at hand. Upon doing so, we end up with a quadratically constrained quadratic programming (QCQP) problem, which is convex and can be solved by standard convex optimization techniques [36]. Results show that the power limit is achieved by this formulation for each RF chain.

A formulation similar to (2.12) can be formed for each of the $N^t_{RF}$ successive rows of the baseband precoder $F_{BB}$ (one per RF chain).
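For intuition, one instance of (2.12) (a single RF chain and a single subcarrier) can be sketched in a few lines. The sketch does not call a general-purpose QCQP solver as described above; it relies instead on the observation that the objective is an isotropic quadratic in the row $B_{q,:,k}$, so its minimizer over a norm ball is the unconstrained least-squares solution ($\sqrt{N_t}$ times the column-wise mean of the $M_t$ magnitudes), rescaled onto the power budget whenever it exceeds it. The function name and array layout are illustrative assumptions.

```python
import numpy as np

def magnitude_row(F_block, n_t, n_s):
    """Solve (2.12) for one RF chain and one subcarrier.

    F_block: (M_t, N_s) magnitudes |F_opt| of the antennas fed by this RF chain.
    Returns the N_s baseband magnitudes B[q, :, k].
    """
    # Unconstrained minimizer of sum_i sum_j (B_j - sqrt(N_t) F_ij)^2.
    target = np.sqrt(n_t) * F_block.mean(axis=0)
    budget = n_s / n_t                       # per-RF chain bound on ||B||^2
    norm_sq = float(target @ target)
    if norm_sq > budget:
        # Project onto the norm ball; the constraint is then met with equality.
        target = target * np.sqrt(budget / norm_sq)
    return target

# Example with N_t = 36, N_s = 4, M_t = 9 and all magnitudes equal to 1/sqrt(N_t).
B_row = magnitude_row(np.full((9, 4), 1.0 / 6.0), n_t=36, n_s=4)
```

In this example the unconstrained solution has squared norm $N_s$, far above the budget $N_s/N_t$, so the returned row sits exactly on the power limit.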
In order to cover the entire precoder, we need to solve a total of $N^t_{RF}$ magnitude problems per subcarrier. We can solve these $N^t_{RF}$ problems simultaneously in parallel because of the block-diagonal nature of the analog precoder $F_{RF}$ (see Fig. 2.4a): the partially connected nature of the hybrid precoder renders each RF chain independent of the others. We can visualize the optimal precoder as a group of $N^t_{RF}$ tensors stacked on top of each other, which allows us to deal with each of them separately. The resulting baseband precoder may not directly maximize the spectral efficiency [8], [24], but it makes for a good substitute because it helps simplify the given problem. The magnitudes of the baseband precoder for all the subcarriers are obtained by solving this independent set of $N^t_{RF}$ formulations per subcarrier.

The system of equations in (2.11) is overdetermined because the number of independent equations ($M_t N_s$) is much higher than the number of variables ($N_s$) involved. Specifically, the magnitude equations in (2.11) reveal that, for a given subcarrier $k$, each of the rows of the magnitude matrix $B$ of the baseband precoder tries to approximate a set of $M_t$ rows of the magnitude matrix $F$ of the optimal precoder. For an optimal solution to exist, we require that

\[ N_s \ge N_s \frac{N_t}{N^t_{RF}}, \tag{2.16} \]

i.e., we need $N^t_{RF} \ge N_t$. This is not practical for hybrid precoding systems, because of which we end up with an overdetermined system with a minimum norm solution.

2.3.2 Phase Sub-Problem

The phase sub-problem can be derived from (2.10) by equating the phase terms on the l.h.s. with those on the r.h.s. We obtain the following set of $M_t K N_s$ equations for the $q$th RF chain:

\[ \begin{aligned} & e^{j\psi_{i,j,k}} \approx e^{j\theta_i} e^{j\phi_{q,j,k}}, \quad \forall\; 1 \le i \le M_t, \; 1 \le j \le N_s, \; 1 \le k \le K, \; q = \left\lceil \frac{i}{M_t} \right\rceil \\ & \text{where} \; -\pi \le \theta_i \le \pi, \; -\pi \le \phi_{q,j,k} \le \pi, \; \forall i, j, k \end{aligned} \tag{2.17} \]

Observe from (2.17) and (2.11) that the main difference between the phase and magnitude formulations is that the phase equations (see
(2.17)) are obtained upon including the constraint that the analog precoder needs to be shared across the $K$ subcarriers. This is unlike the magnitude formulation, where the equations (see (2.11)) are obtained upon including the constraint that the baseband precoders for the $K$ subcarriers are independent of each other. Applying a logarithmic transformation on the l.h.s. and r.h.s. of (2.17), we have the following set of $M_t K N_s$ equations:

\[ \begin{aligned} & \psi_{i,j,k} \approx \theta_i + \phi_{q,j,k}, \quad \forall\; 1 \le i \le M_t, \; 1 \le j \le N_s, \; 1 \le k \le K, \; q = \left\lceil \frac{i}{M_t} \right\rceil \\ & \text{where} \; -\pi \le \theta_i \le \pi, \; -\pi \le \phi_{q,j,k} \le \pi, \; \forall i, j, k \end{aligned} \tag{2.18} \]

Following the method of linear least squares to solve for the phases $\theta_i$ and $\phi_{q,j,k}$ of the analog and digital precoders respectively, we formulate the phase sub-problem corresponding to the RF chain $q$ in the hybrid precoder as

\[ \begin{aligned} \underset{\theta_i, \phi_{q,j,k}}{\text{minimize}} \quad & \sum_{i=1+(q-1)M_t}^{qM_t} \sum_{j=1}^{N_s} \sum_{k=1}^{K} (\theta_i + \phi_{q,j,k} - \psi_{i,j,k})^2 \\ \text{subject to} \quad & -\pi \le \theta_i \le \pi, \; -\pi \le \phi_{q,j,k} \le \pi, \; \forall i, j, k \end{aligned} \tag{2.19} \]

The optimization problem in (2.19) is a linearly constrained quadratic programming (QP) problem and can be solved using standard convex optimization techniques [36]. A formulation similar to (2.19) can be formed for each RF chain $q$. Since we have a total of $N^t_{RF}$ RF chains at the transmitter, we solve a total of $N^t_{RF}$ subproblems to obtain all the phases of the hybrid precoder.

Remark 2 (On the relation between the original formulation in (2.9) and the phase formulation in (2.19)). The phase formulation given in (2.19) tries to reduce the gap between the original formulation in (2.9) and the magnitude formulation in (2.12), as explained below. Note from (2.12) that in the magnitude formulation, we solve for a lower bound to the original formulation given in (2.9). The $\cos(\psi_{i,j,k} - (\theta_i + \phi_{q,j,k}))$ terms in (2.9) are responsible for the gap between the objective functions in (2.9) and (2.12).
To reduce this gap, we can maximize the $\cos(\psi_{i,j,k} - (\theta_i + \phi_{q,j,k}))$ terms with respect to $\theta_i$ and $\phi_{q,j,k}$, i.e., we can solve the following optimization problem:

\[ \begin{aligned} \underset{\theta_i, \phi_{q,j,k}}{\text{maximize}} \quad & \sum_{i=1+(q-1)M_t}^{qM_t} \sum_{j=1}^{N_s} \sum_{k=1}^{K} \cos(\psi_{i,j,k} - (\theta_i + \phi_{q,j,k})) \\ \text{subject to} \quad & -\pi \le \theta_i \le \pi, \; -\pi \le \phi_{q,j,k} \le \pi \end{aligned} \tag{2.20} \]

The problem in (2.20) is non-convex because the cosine function is not concave over $(-\pi, \pi)$. By resorting to the second-order Taylor approximation $\cos x \approx 1 - x^2/2$, we can simplify (2.20) as

\[ \begin{aligned} \underset{\theta, \phi}{\text{maximize}} \quad & \sum_{i=1+(q-1)M_t}^{qM_t} \sum_{j=1}^{N_s} \sum_{k=1}^{K} 1 - \frac{(\psi_{i,j,k} - (\theta_i + \phi_{q,j,k}))^2}{2} \\ \text{subject to} \quad & -\pi \le \theta_i \le \pi, \; -\pi \le \phi_{q,j,k} \le \pi \end{aligned} \tag{2.21} \]

Note that the optimization problem in (2.21) is equivalent to the phase formulation given in (2.19). That is, our phase formulation is trying to minimize the difference between the objective functions in the original formulation and the magnitude formulation.

An important aspect to note from the phase formulation in (2.19) is that, unlike the magnitude formulation in (2.12), (2.19) has to deal with the dependence among the $K$ subcarriers because all the subcarriers share the same analog precoder. Consequently, we cannot solve the phase sub-problems for each subcarrier independently. However, we note from (2.19) that the phase formulation for each RF chain $q$ is independent of the remaining RF chains due to the block-diagonal nature of the analog precoder. Therefore, we can still solve $N^t_{RF}$ phase sub-problems in parallel (one for each RF chain $q$) to obtain the entire analog precoder.

Similar to the magnitude formulation, the system of linear equations in (2.18) is overdetermined. This is because the available degrees of freedom, i.e., the variables $\theta_i$ and $\phi_{q,j,k}$, are fewer in number than the constraints on these degrees, i.e., the number of equations involved. Specifically, the overdetermined system of phase formulations has $M_t K N_s$ equations for a set of $M_t + K N_s$ variables.
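The structure of this overdetermined linear system can be made concrete with a short NumPy sketch for one RF chain. It builds the $M_t K N_s \times (M_t + K N_s)$ coefficient matrix of (2.18) and solves it in the least-squares sense. Two simplifications are assumptions of this sketch and not part of the formulation in (2.19): the box constraints on the phases are not enforced, and phase wrapping is ignored. The function name is illustrative.

```python
import numpy as np

def phase_least_squares(psi):
    """Least-squares fit of psi[i, j, k] ~ theta[i] + phi[j, k] for one RF chain.

    psi: (M_t, N_s, K) phases of the optimal precoder, in radians.
    Returns theta with shape (M_t,) and phi with shape (N_s, K).
    """
    m_t, n_s, n_sub = psi.shape
    n_phi = n_s * n_sub
    # One row per equation in (2.18); columns are [theta_1..theta_Mt, phi_11..phi_NsK].
    A = np.zeros((m_t * n_phi, m_t + n_phi))
    rows = np.arange(m_t * n_phi)
    A[rows, rows // n_phi] = 1.0            # coefficient of theta_i
    A[rows, m_t + rows % n_phi] = 1.0       # coefficient of phi_{j,k}
    # A is rank deficient (a constant can move between theta and phi);
    # lstsq returns the minimum-norm least-squares solution.
    x, *_ = np.linalg.lstsq(A, psi.reshape(-1), rcond=None)
    return x[:m_t], x[m_t:].reshape(n_s, n_sub)

rng = np.random.default_rng(2)
theta_true = rng.uniform(-1.0, 1.0, size=3)
phi_true = rng.uniform(-1.0, 1.0, size=(2, 4))
psi_demo = theta_true[:, None, None] + phi_true[None, :, :]
theta_hat, phi_hat = phase_least_squares(psi_demo)
```

For this synthetic, exactly consistent input the fitted sums `theta_hat[i] + phi_hat[j, k]` reproduce `psi_demo`; for phases taken from an actual optimal precoder a non-zero residual is expected.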
The main cause for the overdetermined nature of the system is that the analog precoder, represented by the phase shifts $\theta_i$, is shared across the $K$ subcarriers. In order to have an optimal solution, we need

\[ \frac{N_t}{N^t_{RF}} + N_s K \ge \frac{N_t}{N^t_{RF}} N_s K \quad \Longleftrightarrow \quad N_t + N_s K (N^t_{RF} - N_t) \ge 0. \]

Therefore, for $N^t_{RF} \ge N_t$, we always have an optimal solution, and for $N^t_{RF} < N_t$, we have an optimal solution if

\[ N_t \ge N_s K\, |N_t - N^t_{RF}|. \tag{2.22} \]

It can be seen that as $N_s$ or $K$ increase, it becomes increasingly difficult to satisfy (2.22). Thus we end up with an overdetermined system with a minimum norm solution, similar to the magnitude formulation.

2.3.3 Par-ArgMod Algorithm

Algorithm 1: Par-ArgMod algorithm
1: Input: $H$
2: Compute SVD to get $F^{opt}$ and set $q = 1$;
3: Solve in parallel for $q = 1, \ldots, N^t_{RF}$
4: For a given subcarrier $k$, find the magnitude values of the $q$th row of $F_{BB}[k]$ using the magnitude formulation (2.12). Solve $K$ such optimization problems, i.e., one for each subcarrier, in parallel.
5: Find the $q$th block diagonal column entry of $F_{RF}$ and the phases of the $q$th row of $F_{BB}$ using the phase formulation (2.19).
6: end when the $N^t_{RF}$ parallel problems are solved
7: Rearrange the magnitude and phase values to form $F_{RF}$ and $F_{BB}$

In Algorithm 1, also referred to from here on as the Par-ArgMod algorithm, we present a summary of the steps followed in our hybrid precoder design. The input to the proposed algorithm is the channel matrix $H$, which is assumed to be perfectly known at the transmitter. We then apply SVD to the channel matrix and obtain the fully digital precoder $F^{opt}$. Next, we consider the $M_t$ transmit antennas connected to the RF chain $q$ and obtain the magnitude and phase terms of the hybrid precoder by solving (2.12) and (2.19) respectively. This procedure is repeated in parallel for the $N^t_{RF}$ RF chains, so as to fully obtain the hybrid precoder matrices $F_{RF}$ and $F_{BB}$.
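The final rearrangement step of Algorithm 1 can be sketched as below. This is only an illustration of how per-RF-chain magnitudes and phases might be packed into a block-diagonal $F_{RF}$ and a baseband tensor; the function name, the contiguous assignment of $M_t$ antennas to each RF chain, and the array layouts are assumptions, while the modulus $1/\sqrt{N_t}$ of the non-zero analog entries follows the text.

```python
import numpy as np

def assemble_hybrid_precoder(theta, B, phi):
    """Build F_RF (N_t x N_RF) and F_BB (K x N_RF x N_s) from per-chain solutions.

    theta: (N_t,) analog phases; antennas q*M_t .. (q+1)*M_t - 1 feed from chain q.
    B, phi: (N_RF, N_s, K) baseband magnitudes and phases.
    """
    n_t = theta.shape[0]
    n_rf = B.shape[0]
    m_t = n_t // n_rf                         # antennas per RF chain
    F_rf = np.zeros((n_t, n_rf), dtype=complex)
    for q in range(n_rf):
        rows = slice(q * m_t, (q + 1) * m_t)
        # Only the q-th block of column q is non-zero (block-diagonal structure).
        F_rf[rows, q] = np.exp(1j * theta[rows]) / np.sqrt(n_t)
    F_bb = np.moveaxis(B * np.exp(1j * phi), -1, 0)   # subcarrier index first
    return F_rf, F_bb

rng = np.random.default_rng(3)
theta_demo = rng.uniform(-np.pi, np.pi, size=36)
B_demo = rng.uniform(0.0, 0.2, size=(4, 4, 12))
phi_demo = rng.uniform(-np.pi, np.pi, size=(4, 4, 12))
F_rf, F_bb = assemble_hybrid_precoder(theta_demo, B_demo, phi_demo)
hybrid_k5 = F_rf @ F_bb[5]                    # hybrid precoder on subcarrier 5
```

Each row of `hybrid_k5` is one row of the baseband precoder rotated by the phase of a single analog entry, which is the relation expressed by (2.10).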
2.4 Joint Formulation and Alt-ArgMod

Introducing the approximation $\cos x \approx 1 - x^2/2$ used in (2.21) into (2.9), the term $-2CFB\cos(\psi - (\theta+\phi))$ becomes $-2CFB + CFB(\psi - (\theta+\phi))^2$, so that

\[ F^2 + C^2B^2 - 2CFB\cos(\psi - (\theta+\phi)) \approx (F - CB)^2 + CFB\,(\psi - (\theta+\phi))^2. \tag{2.23} \]

Alt-ArgMod, given in Algorithm 2, alternates between the magnitude and the phase formulations using (2.23). A good choice for the initial phase values is arrived at by using the phase formulation in (2.19). Essentially, when the phases are fixed in (2.23), the resulting formulation becomes a QCQP problem which can be written as

\[ \begin{aligned} \underset{B_{q,j,k}}{\text{minimize}} \quad & \sum_{i=(q-1)M_t+1}^{qM_t} \sum_{j=1}^{N_s} (F_{i,j,k} - B_{q,j,k} C)^2 + C_1 F_{i,j,k} B_{q,j,k} \\ \text{subject to} \quad & \|B_{q,:,k}\|^2 \le \frac{N_s}{N_t} \end{aligned} \tag{2.24} \]

where $C_1 = C\,(\psi_{i,j,k} - (\theta_i + \phi_{q,j,k}))^2$ is the resulting constant after fixing the phases. Likewise, when the resulting magnitudes from (2.24) are fixed in (2.23), the resulting formulation becomes a QP problem with linear constraints which can be written as

\[ \begin{aligned} \underset{\theta, \phi}{\text{minimize}} \quad & \sum_{i=1+(q-1)M_t}^{qM_t} \sum_{j=1}^{N_s} \sum_{k=1}^{K} C_1 + C_2\,(\psi_{i,j,k} - (\theta_i + \phi_{q,j,k}))^2 \\ \text{subject to} \quad & -\pi \le \theta_i \le \pi, \; -\pi \le \phi_{q,j,k} \le \pi \end{aligned} \tag{2.25} \]

where, in (2.25), $C_1 = (F_{i,j,k} - B_{q,j,k} C)^2$ and $C_2 = C F_{i,j,k} B_{q,j,k}$.

In the coming section, we present a detailed analysis of the computational costs incurred by the proposed Par-ArgMod and Alt-ArgMod algorithms and compare them with existing wideband PCS schemes.

Algorithm 2: Alt-ArgMod algorithm
1: Input: $H$
2: Compute SVD to get $F^{opt}$ and set $q = 1$;
3: Solve in parallel for $q = 1, \ldots, N^t_{RF}$
4: Initialize the phases in (2.23) using the phase formulation (2.19).
5: Solve for the magnitudes using the resulting formulation as in (2.24).
6: Solve for the phases by substituting the resulting magnitudes from step 5 as in (2.25).
7: Rearrange the magnitude and phase values to form $F_{RF}$ and $F_{BB}$ per RF chain.
8: Repeat steps 5, 6 and 7 until the convergence criterion is met.
9: end when the $N^t_{RF}$ parallel problems are solved

2.5 Complexity Analysis

In this section, we compare the computational complexity of our proposed algorithms with two existing wideband schemes, namely the SDR-AltMin algorithm [20] and the Fixed Wideband scheme [18]. Other hybrid precoding algorithms also exist, for example, the SIC-based hybrid precoding scheme [16], but the extensions of such schemes to OFDM systems are not currently available and are therefore skipped here.

In Par-ArgMod, the first step involves the SVD of the $N_r \times N_t$ matrix $H$, which can be computed in $O(N_r^2 N_t)$ operations. The magnitude formulation is a convex QCQP problem, which can be solved using interior point methods in $O(N_s^3)$ computations. As noted in [37], [38], QCQP and QP problems have an iteration complexity of $O(\sqrt{N_s} \log(N_s/\epsilon))$ for an $\epsilon$-approximate solution using the primal-dual interior-point algorithm for small updates. This should be done once per transmit RF chain and subcarrier, leading to a total of $N^t_{RF} K$ problems for obtaining the magnitude terms of the hybrid precoder. The phase formulation in (2.19) is a QP problem which can be solved by using interior point methods with a computational complexity of $O((M_t N_s K)^3)$ and an iteration complexity of $O(\sqrt{M_t N_s K} \log(M_t N_s K/\epsilon))$. This process is followed in $N^t_{RF}$ parallel problems, one per RF chain. Alt-ArgMod takes a similar time, with an additional factor of $L$, which is the number of iterations taken to converge.

The SVD step is common to the Par-ArgMod, Alt-ArgMod and SDR-AltMin algorithms because it is required to obtain the optimal precoder values. The SDR-AltMin algorithm [20] has a semidefinite programming (SDP) problem inside an alternating minimization block, which makes it increasingly complex. Each iteration of the SDR-AltMin algorithm requires $O((N_s N^t_{RF})^3)$ computations for the baseband precoder and $O(N_t N_s K)$ for the analog RF precoder. Assuming the SDR-AltMin algorithm takes $L$ iterations to converge, the total complexity would be $O(L N_t N_s K + L (N_s N^t_{RF})^3)$. SVD is also carried out in the Fixed Wideband algorithm [18] to obtain the singular values of the covariance matrix. The covariance matrix itself can be obtained in $O(M_t^2 N_r K)$ and the baseband precoder can be obtained in $O(M_t^2 N_s)$. The complexities of these algorithms are juxtaposed in Table 2.1.

| Algorithm | Complexity |
|---|---|
| SDR-AltMin [20] | $O(L N_t N_s K + L (N_s N^t_{RF})^3 \sqrt{N_s N^t_{RF}} \log(N_s N^t_{RF}/\epsilon))$ |
| Fixed Wideband [18] | $O(M_t^2 N_r K + M_t^2 N_s)$ |
| Par-ArgMod | $O(\sqrt{M_t N_s K} \log(M_t N_s K/\epsilon) (M_t N_s K)^3 + N_s^3 \sqrt{N_s} \log(N_s/\epsilon))$ |
| Alt-ArgMod | $O(L \sqrt{M_t N_s K} \log(M_t N_s K/\epsilon) (M_t N_s K)^3 + L N_s^3 \sqrt{N_s} \log(N_s/\epsilon))$ |

Table 2.1: Complexity comparison.

Par-ArgMod allows for parallelization: the magnitude and phase problems are solved in parallel, and these two problems are in turn solved as $N^t_{RF}$ parallel subproblems. In addition, each of the $N^t_{RF}$ subproblems for the magnitude formulation is solved for the $K$ subcarriers as $K$ parallel subproblems. Alt-ArgMod also alternates separately for the $N^t_{RF}$ RF chains. The complexity of the SDR-AltMin algorithm is linear in the number of transmit antennas $N_t$, which is the term we vary in massive MIMO studies. In contrast, thanks to the multiple levels of parallelization involved, the complexity of our proposed algorithms grows as $O(M_t^{7/2} \log(M_t/\epsilon))$ and is therefore dependent on $M_t$ rather than $N_t$; as we show in Section 2.7, $M_t$ needs to be kept relatively constant with increase in $N_t$.

The proposed algorithms also offer other distinct advantages over existing schemes.
Alternating minimization does not guarantee convergence to a global minimum or even a stationary point; it only converges to a solution where the cost function ceases to decrease [39], [40], [41]. It also requires multiple runs because it imposes the non-trivial task of choosing good initial values to avoid convergence to bad local optima [20]. These drawbacks are observed in Alt-ArgMod as well; however, convergence is achieved sooner than with SDR-AltMin on average because of a good initialization mechanism. This is discussed further in Section 2.6.2. Also, the per-RF chain power constraint considered in the proposed algorithms is a safer approach when allocating power to the different RF chains than the sum-power constraint used in Fixed Wideband and SDR-AltMin.

2.6 Simulation Results

In this section, we evaluate the performance of our proposed algorithm. $N_s$ data streams are sent over each of the $K = 12$ subcarriers with $N_t = 36$ and $N_r = 16$ antennas. The antenna elements are separated by half a wavelength. For the theoretical Saleh-Valenzuela model, we set $N_{cl} = 5$ clusters and $N_{ray} = 10$ rays, and the average power of each cluster is set to $\sigma^2_{\alpha,i} = 1$. The azimuth and elevation AoDs and AoAs follow the Laplacian distribution with uniformly distributed mean angles over $[0, 2\pi)$ and an angular spread of 10 degrees [20]. All the results are averaged over 1000 independent channel realizations. We compare the spectral efficiency achieved by the proposed algorithm with that of the SDR-Alternating Minimization algorithm [20], the Fixed Wideband algorithm proposed in [18], and the codebook-based [42] methods. These three methods serve as our baseline for hybrid precoding with PCS in massive MIMO OFDM systems. We also consider two fully-connected precoding schemes, namely OMP [8] and PE-AltMin [20], for comparison.

2.6.1 Spectral-Efficiency Evaluation

Firstly, Fig. 2.5 shows the variation of spectral efficiency with respect to SNR.
Note that all baseline schemes presented in the figure consider a sum-power constraint on the RF chains, whereas we consider a more realistic per-RF chain power constraint. We notice that the fully connected architectures follow closely after the optimal precoder, whereas the partially connected structures lag in terms of spectral efficiency. The partially connected architectures exhibit nearly identical performance in the low SNR region, which is rather poor when compared to the fully connected architectures, and deviate slightly when the SNR is high, although the performance gap is not too conspicuous. The DFT-codebook based method is seen to perform poorly in our setting, mainly because of the finite size of the codebook. The size of the codebook for a given dimension is fundamentally limited by the number of mutually unbiased bases that are available and the lower bound to the minimum distance that can exist between the codewords.

[Plot: spectral efficiency (bits/s/Hz) versus SNR (dB), titled "Spectral efficiency with increasing SNR". Curves: Optimal digital (sum-power), PE-AltMin - Full (sum-power), OMP - Full (sum-power), Fixed Wideband - Partial (sum-power), SDR-AltMin - Partial (sum-power), Par-ArgMod (per-RF power), Alt-ArgMod (per-RF power), Codebook Based (sum-power).]

Figure 2.5: Spectral efficiency achieved by precoding algorithms when $N_s = N^t_{RF} = N^r_{RF} = 4$.

Previous works [31], [43] have shown that the achievable capacity with the per-RF chain power constraint is much lower than the achievable capacity with the sum-power constraint on the RF chains. This difference in the maximum achievable spectral efficiency is a major reason for the suboptimal performance of our scheme at high SNR. Although Alt-ArgMod outperforms the non-alternating version considerably, further investigation is required on building non-alternating precoding schemes that can respect the per-RF chain power constraint but not degrade in spectral efficiency at high SNR.
Another reason for the sub-optimality of Par-ArgMod is that we separate the original joint formulation in (2.9) into parallelizable magnitude and phase optimization problems. This separation introduces additional constraints on the degrees of freedom available in the precoder design, thus resulting in decreased spectral efficiency [44]. This argument is validated by the improved performance of Alt-ArgMod, which works with the joint formulation given in (2.23). Lastly, due to the partially connected wideband nature of the hybrid precoder, both our magnitude and phase formulations solve overdetermined systems, and this adds to the sub-optimality.

[Plots: spectral efficiency (bits/s/Hz). (a) versus $N^t_{RF}$ for $N^r_{RF} = 2, 4, 8, 16$, with $N_s = 4$ and SNR = 0 dB; (b) versus $N^t_{RF}$ for $N^r_{RF} = 2, 4, 8, 16$, with $N_s = 4$ and SNR = 5 dB; (c) versus SNR (dB) for $N^t_{RF} = 4, 6, 12, 18$, with $N_s = N^r_{RF} = 4$; (d) versus SNR (dB) for $N^r_{RF} = 2, 8, 16$, FD - Rx and FD - Tx&Rx, with $N_s = N^t_{RF} = 4$.]

Figure 2.6: Spectral efficiency achieved by the Par-ArgMod algorithm (a,b) with varying $N_{RF}$ and (c,d) with varying SNR for different values of $N_{RF}$.

Next, we fix the number of data streams $N_s = 4$ and the signal-to-noise ratio (SNR) $\rho/\sigma_n^2 = 0$ dB, and vary the number of RF chains $N^t_{RF}$ and $N^r_{RF}$ at the transmitter and receiver respectively. It can be observed from Fig. 2.6a that increasing the number of RF chains at the transmitter can increase the spectral efficiency. In order to achieve further gains, increasing the number of RF chains at the receiver is a reasonable solution.
A similar trend is observed when we increase the SNR to 5 dB and redo the experiment, as seen in Fig. 2.6b. There is a jump in spectral efficiency when $N^r_{RF} = N_r$ in both cases. These figures illustrate the trade-off introduced by hybrid precoders having limited RF chains: we achieve energy savings at the cost of a loss in spectral efficiency. It can be observed from Fig. 2.6c that the spectral efficiency increases with increasing $N^t_{RF}$ while $N_s$ and $N^r_{RF}$ are kept constant. Increasing the number of RF chains at the receiver has a similar effect, as seen in Fig. 2.6d. Moreover, using $N^r_{RF} = N_r$ is equivalent to having a fully digital architecture at the receiver.

2.6.2 Power Analysis

The per-RF chain power constraint considered in the proposed algorithm is a safer approach when allocating power to the different RF chains than the sum-power constraint used in Fixed Wideband and SDR-AltMin. This is because unequal power allocation among the RF chains would feed into the non-linearity of the power amplifier and also decrease the overall amplifier efficiency. For further insight, we note that the power expenditure on linear power amplifiers can be expressed as $P_{PA} = P_{in}/\eta$, where $P_{in}$ is the input power to the power amplifiers and $\eta$ is the power amplifier efficiency. Typically, $\eta$ depends on the output power $P_{out}$ of the PA and is given by $\eta = \eta_{max} \sqrt{P_{out}/P_{max}}$ [45], where $\eta_{max}$ is the maximum power amplifier efficiency and $P_{max}$ is the maximum output power of the power amplifier. When $P_{max}$ is higher than $P_{out}$ (which happens in the case of unequal power allocation), $\eta$ is smaller and, consequently, the power expenditure $P_{PA}$ at the power amplifier is higher. Techniques which minimize the power amplifier losses in massive MIMO systems are still an ongoing topic of investigation [46].

The power achieved by different RF chains over time is shown in Fig. 2.7. The maximum allowed sum power is $N_s N^t_{RF}/N_t$ and the maximum per-RF power is $N_s/N_t$.
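As a small worked example of these quantities, the sketch below evaluates the two power budgets for the simulation setting $N_s = 4$, $N_t = 36$, $N^t_{RF} = 4$ and the efficiency model $\eta = \eta_{max}\sqrt{P_{out}/P_{max}}$. The value $\eta_{max} = 0.4$ and the choice of rating each amplifier at the per-RF budget are hypothetical, picked only to show how backing off from $P_{max}$ lowers the efficiency.

```python
import math

def pa_efficiency(p_out, p_max, eta_max):
    """Efficiency model eta = eta_max * sqrt(p_out / p_max) quoted from [45]."""
    return eta_max * math.sqrt(p_out / p_max)

n_s, n_t, n_rf = 4, 36, 4
per_rf_budget = n_s / n_t            # maximum per-RF chain power, about 0.111
sum_budget = n_s * n_rf / n_t        # maximum sum power, about 0.444

eta_max = 0.4                        # hypothetical maximum efficiency
# Equal allocation: every chain transmits at the power its amplifier is rated for.
eta_equal = pa_efficiency(per_rf_budget, per_rf_budget, eta_max)
# Unequal allocation: a chain driven at a quarter of the rated power.
eta_backed_off = pa_efficiency(per_rf_budget / 4, per_rf_budget, eta_max)
```

Under these assumptions the chain running at a quarter of its rated output operates at half the efficiency of a chain running at full output, which illustrates why equal allocation is preferable from an amplifier perspective.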
SDR-AltMin is uneven in the way it assigns power to different RF chains and breaches the per-RF power limit, as seen in Fig. 2.7a, which is not acceptable from a hardware perspective. It also has difficulties converging on some occasions, as observed in Fig. 2.7b. On the contrary, Alt-ArgMod is fair in its allocation of power and does not breach the power limit for any RF chain because of the built-in constraints, as shown in Fig. 2.7c. Also, as the algorithm alternates per RF chain, even when convergence is not achieved for a particular RF chain, other RF chains are still able to converge, as can be seen in Fig. 2.7d. This is in contrast to SDR-AltMin, where a failure to converge affects all RF chains. The above-mentioned arguments emphasize that the per-RF chain approach is a welcome departure from most existing schemes, which use the sum-power constraint and therefore result in unequal power allocation.

[Plots: achieved power of different RF chains over time, each with the maximum per-RF power marked. (a) SDR-AltMin with convergence; (b) SDR-AltMin without convergence; (c) Alt-ArgMod with convergence; (d) Alt-ArgMod without convergence.]

Figure 2.7: Power level achieved by different RF chains (a,b) with SDR-AltMin and (c,d) with Alt-ArgMod.

2.6.3 Run-Time Evaluation

For practical fast-fading systems, it would be of interest to adopt the algorithm that produces faster results by sacrificing some spectral efficiency.
Table 2.2 shows the time taken by the SDR-AltMin, Par-ArgMod and Alt-ArgMod algorithms, averaged over 200 runs. The simulations are conducted on a machine with an Intel(R) Xeon(R) CPU E5-2630 v2 running at 2.6 GHz (24 CPUs). The SDR-AltMin algorithm is run on $K$ parallel workers to find $F_{BB}$. The Par-ArgMod algorithm is run on $N_{RF} K$ parallel workers for the magnitude problem and $N_{RF}$ parallel workers for the phase problem. Alt-ArgMod is run on a similar number of workers as Par-ArgMod.

| Algorithm | Mag time | Phase time | Total time |
|---|---|---|---|
| SDR-AltMin [20] | - | - | 9.534 s |
| Par-ArgMod | 0.9301 s | 1.1241 s | 1.1241 s |
| Alt-ArgMod | - | - | 7.2135 s |

Table 2.2: Run time comparison.

The time taken by the Par-ArgMod algorithm is the maximum of the times taken by the magnitude and phase problems. The phase formulation is solved using the primal-dual interior point method for sparse QP problems [38]. We see a decrease of more than eight-fold in the time taken when using the parallel framework as compared to the alternating SDR-AltMin approach (1.1241 s versus 9.534 s). This time gain could be significant in fast-fading environments where the channel changes rapidly and precoding has to be sped up in tandem. Moreover, the per-RF chain alternating Alt-ArgMod is also faster than SDR-AltMin (7.2135 s versus 9.534 s).

2.6.4 Realistic Scenarios

In this section, the Saleh-Valenzuela (SV) channel model [6] is extended to realistic scenarios. An Urban Microcell - Street Canyon (UMi-Sc) scenario is considered at a frequency $f_c = 30$ GHz with a bandwidth of 100 MHz and $K = 12$ subcarriers. Two practical channel models are considered, namely the 3GPP and NYU channel models. The 3GPP clustered delay line (CDL) channel model is extended to the OFDM case according to [27]. The model assumes that the power angular spectrum in azimuth is a wrapped Gaussian and that in zenith is Laplacian. The concept of time clusters is maintained, with the delay for each cluster generated using an exponential distribution. The delay and the angle are characterized by a joint distribution.
The NYU channel model [26] brings in the concept of time clusters and spatial lobes (TCSLs) in addition to the underlying aspects of the 3GPP model. The TC powers are generated using an exponential function of delay, and each multipath component is assigned a unique lobe according to a uniform distribution.

[Figure 2.8: Spectral efficiency achieved by Par-ArgMod using different channel models when Ns = N_RF^t = N_RF^r = 4. The plot shows spectral efficiency (bits/s/Hz) versus SNR (dB) for the 3GPP, SV, NYU and beam-squint channels and for the channel in [18].]

In addition to this, we demonstrate the performance of our scheme taking into account the beam squint effect in wideband systems using large arrays, which is investigated in [47], [48]. We model beam squint in USPA following [49] and, as noted by the authors, observe that the spectral efficiency does suffer. It is also verified that with increasing Nt and K (i.e., bandwidth), the performance worsens due to increased beam squint. We also show the performance of our scheme with the channel model used in [18], where each ray in a cluster is given a unique phase shift, departing from the channel model used in this work, which assigns a unique phase shift to each cluster.

Fig. 2.8 demonstrates the effect of using different channel models to evaluate the performance of the system. The spectral efficiency is estimated optimistically in the case of the 3GPP clustered delay line channel model because the number of clusters and the number of subrays in each cluster are assigned high values for the given UMi-Sc scenario. The number of clusters and subrays used in this work with the SV channel model is smaller than in the 3GPP specifications and is closer to the NYU channel model, which derives its values from real-world channel measurement data. The NYU model claims to be closer to the mmWave setting, as the mmWave scattering environment is sparse and does not have a large number of multipaths [8].
It is observed that while beam squint affects the spectral efficiency adversely, the use of the channel model in [18] does not change the performance to a large extent. Moreover, that model is more optimistic in its estimation of spectral efficiency values when compared to the channel model used in our work.

2.7 Critique of Wideband PCS

The variation of spectral efficiency with an increasing number of transmit antennas for the optimal precoder in OFDM systems is shown in Fig. 2.9a. The rise in spectral efficiency with an increasing number of transmit antennas is expected from the promise of massive MIMO. Increasing the number of receive antennas also raises the spectral efficiency. This trend is lost in hybrid systems, as can be seen in Fig. 2.9b.

[Figure 2.9: Spectral efficiency achieved by optimal and hybrid precoders with increasing Nt. Panels: (a) optimal precoder when Ns = 4 and SNR = 0 dB, for Nr = 4 and Nr = 16; (b) partial hybrid precoders at SNR = 0 dB, with curves SDR (Ns = 4, Nrf = Nt/2), FWB (Ns = 4, Nrf = Nt/2), Par (Ns = 4, Nrf = Nt/2), Par (Ns = Nrf = 4) and Par (Ns = 4, Nrf = Nt), for Nr = 4 and Nr = 16. Both panels plot spectral efficiency (bits/s/Hz) versus Nt.]

A number of experiments are run to validate this observation. It is seen that keeping Ns and NRF constant while Nt rises leads to rapid deterioration in performance, with the spectral efficiency dropping to nearly zero. In this case we also observe that increasing Nr results in degradation of performance. This could be because an increasing number of antennas (both at the transmitter and the receiver) adds to the suboptimality (increasing M leads to highly overdetermined systems in hybrid precoding) without the benefits being leveraged when NRF is fixed. When the number of RF chains is taken to be half the number of antennas (both at the transmitter and the receiver), the spectral efficiency reduces with increasing Nt, but at a much slower pace.
We see this as a more realistic experiment, as the number of RF chains needs to be scaled with the number of antennas to take advantage of the increased number of degrees of freedom for spatial multiplexing. However, it is seen that the rising number of antennas makes the hybrid system increasingly suboptimal in all three partially connected architectures compared here. Even though the ratio M of the number of transmitter antennas to the number of RF chains is fixed, i.e., M = Nt/N_RF^t is fixed, the spectral efficiency worsens with Nt because the number of subproblems for the magnitude and phase formulations is equal to NRF and the suboptimality in each problem adds up with increasing NRF. Adding to this, each RF chain is limited in power by Ns/Nt, which reduces as Nt increases, hampering the range of values the baseband precoder can take. Lastly, when NRF = Nt, it is observed that the spectral efficiency improves with increasing Nt. In this case, there is very little suboptimality in the system and the performance can add up. This supports our hypothesis that the reduction in the number of RF chains is the major bottleneck in performance scaling with wideband PCS.

The above discussion can be formalized as is done by Proposition 1 in [23]. The extension of Proposition 1 in [23] to wideband systems can be found in the appendix. We follow a similar approach and arrive at conditions for optimality for our system in (2.16) and (2.22). Now, because we use N_RF^t ≤ Nt (violating (2.16)) and also because Nt is not large enough to satisfy (2.22), the solutions that we obtain are not optimal solutions, but minimum norm solutions in both magnitude and phase. From Fig. 2.9b, it can be seen that whenever N_RF^t < Nt, the spectral efficiency grows worse as Nt increases.
This is because (2.16) requires us to have at least NRF = Nt, which, when satisfied, leads to an increase in spectral efficiency with Nt. Even if we manage to satisfy (2.22), we will violate the more stringent (2.16) and obtain a suboptimal solution. Further investigation is required on how the losses in spectral efficiency with increasing Nt can be mitigated. The current state of the art on hybrid precoding with PCS, including the proposed algorithm, suggests that from the spectral efficiency point of view it would be favourable to implement PCS only in systems where the number of RF chains is comparable to the number of antennas.

2.8 Conclusion and Possible Future Work

In this chapter, we have considered a single-user MIMO system. For the case where OFDM is employed, we have proposed a low-complexity algorithm for the design of a partially connected hybrid precoder, which turns out to be much faster than the existing state of the art. The following insights are obtained:

• The run time comparison shows the superiority of the Par-ArgMod algorithm over existing schemes for wideband PCS. In practical fast-fading scenarios this would be very desirable. Also, our proposed Alt-ArgMod outperforms other alternating schemes with respect to run time.

• The per-RF chain power constraint used is more practical in nature than the sum-power constraint because each RF chain is generally equipped with its own power amplifier.

• Departing from the original joint formulation of magnitude and phase in conventional precoding, we solve for the magnitude and phase as two independent subproblems in Par-ArgMod. This allows us to solve the magnitude formulations in parallel for each RF chain and subcarrier. The phase formulations, on the other hand, can be solved in parallel for each RF chain but not for each subcarrier because the analog precoder is shared among the subcarriers.
The alternating version (Alt-ArgMod) alternates in parallel for each RF chain but follows a similar parallelization structure.

• Increasing the number of antennas without increasing the number of RF chains in tandem is seen to reduce the spectral efficiency considerably in wideband PCS, due to the overdetermined nature of the hybrid precoding system and the restrictive power constraint on the baseband precoder. From the spectral efficiency point of view, it would therefore be favourable to implement PCS only in systems where the number of RF chains is comparable to the number of antennas.

• Although we propose algorithms that are faster and adhere to practical constraints under the PCS setting, our observations in Section 2.7 tell us that PCS is not a recommended solution for massive MIMO systems where good spectral efficiency is a requirement, especially those employing multicarrier transmission such as OFDM.

Some possible avenues for future work:

• Investigate further whether the spectral efficiency values in wideband PCS can be improved while respecting the per-RF chain power constraint.

• Mitigate the detrimental effects on spectral efficiency of increasing the number of transmit antennas with limited RF chains in wideband PCS.

• Consider the statistical distribution of phases of the analog precoder in the optimization process and their relationship with the array response vector.

• For massive MIMO systems with a high spectral efficiency requirement, investigate FCS design respecting the per-RF chain power constraint. Extend the design to make it workable in real-life fast-fading scenarios. The resulting design would be the ideal marriage between spectral efficiency, practical implementation and energy savings.

• Extend the proposed hybrid precoding design to the multi-user massive MIMO scenario.
The main challenge in this extension would be to integrate the interference cancellation procedure among the multiple users with the proposed procedure of parallelizing the hybrid precoder.

• Extension to distributed massive MIMO systems would also be an interesting avenue to explore, as the inherent parallelization in the approach lends itself to a distributed setup.

Chapter 3
Signal Detection and Time-Frequency Localization Using Deep Learning

3.1 Introduction

As wireless devices become pervasive in our day-to-day life, being able to passively detect these devices in the spectrum is an important concern from the security as well as the spectrum management perspective. Signal detection techniques have been investigated extensively in the literature. A multi-band joint detection technique, which jointly detects signal energy levels in multiple frequency bands, is introduced in [50], where the spectrum sensing problem is formulated as an optimization problem in an interference-limited network. Wavelet edge detection followed by blind source separation is used to separate the signals in the frequency domain in [51]. In both these works, although the signals can be accurately localized and separated in frequency, the joint time-frequency information is lost. In many applications, such as detection of the hopping pattern of a wireless device and joint frequency and temporal optimization of the shared spectrum, it becomes a necessity to detect both the time and frequency information of the signals present. In [52], periodic signals are detected using blind energy detection followed by a cyclostationary detection method, and the extracted signals are then classified based on a Chinese restaurant process (CRP). Defining custom features based on RF signatures and cyclostationarity properties may be a viable solution but might not be the best approach to detect various types of heterogeneous signals that deviate from cyclostationary assumptions. This limitation is accompanied by the loss of temporal information.

Moving away from cyclostationary assumptions requires an agnostic feature extractor network. With advances in deep learning techniques for time-series and image analysis, we can extract rich features out of RF data for downstream tasks such as detection, localization and classification. Audio event detection (AED) is one example where the application of deep learning has been explored in the recent past. The underlying philosophy is that by converting the time-series data into spectrograms and then employing deep learning techniques, we can extract specific patterns that help detect and localize audio events. In [53], a state-of-the-art object detection framework was adapted to detect monophonic and polyphonic audio events from the spectrograms. A similar approach is proposed in [54], with the added functionality of capturing the long-term temporal context from the extracted features through the use of a convolutional recurrent neural network (CRNN). Both [53] and [54] detect the presence of audio events by converting time-series information into time-frequency spectrograms and then learning from the features present in the spectrograms. However, the same philosophy has not yet been well explored for detecting general-purpose wireless RF signals present in a wideband spectrum.

The idea of using deep learning based frameworks to detect wireless signals has been looked into recently. The work in [55] converts the time-frequency information into power spectral density (PSD) based spectrograms. The spectrogram is then fed into a five-layer convolutional neural network (CNN), which is used to perform multi-class classification over different wireless technologies like WiFi, Bluetooth and ZigBee. Although the approach is able to perform classification over heterogeneous devices, it cannot localize them in time and frequency.
Localization in time and frequency, if achieved, can be used to study various other properties of these devices, like hopping patterns, signal bandwidth and dwell time. This information will be crucial for security purposes because it helps us perform narrowband jamming to mitigate rogue devices without affecting the other friendly devices on the industrial, scientific and medical (ISM) band. A different time-frequency transformation, called the Choi-Williams distribution (CWD), is used in [56] to distinguish between different types of coding schemes like polytime codes, Frank codes and Costas codes. After image preprocessing, this transformation is fed into a two-layer CNN with pooling, and the recorded ratio of successful recognition (RSR) is about 90% for most codes. However, it faces a similar drawback of not being able to localize the signal in time and frequency.

The problem of detecting signals in a spectrogram falls under the more general problem of object detection in images. The state of the art in this regard is to employ CNNs to identify whether an image contains one or more objects and to predict the bounding box of each detected object [57]-[63]. Previous state-of-the-art methods, for example [57], [58], have employed one-stage object detection using a single CNN to simultaneously obtain the category and location of the objects. Such one-stage methods have recently been outperformed by certain two-stage object detection methods [59]-[63]. In these methods, the first stage generates a set of candidate bounding boxes, commonly referred to as the region proposals. Popular region proposal methods include selective search [62], which is based on greedy superpixel merging, and EdgeBoxes [63], which is based on edge maps and edge groups. The second stage performs a classification task on the region proposals to identify the objects and a refining task on the dominant region proposals to provide the bounding boxes.
A major bottleneck of the above-mentioned CNN methods is that they perform supervised machine learning and therefore require large labelled datasets to achieve high accuracy in object detection and localization [57]-[63]. Large labelled datasets are currently available for object detection in day-to-day real-life images containing humans (e.g., PASCAL VOC [64] and MS COCO [65]) and for audio signals (e.g., UrbanSound8k [66] and DCASE [67]). However, there are no standard labelled datasets available online for wireless signals present in the wideband RF spectrum.

In this chapter we introduce a real-time deep learning framework based on the Faster RCNN (FRCNN) [1] for detection and time-frequency localization of narrowband signals present in a wideband RF spectrum. Firstly, we find the most suitable feature extraction network. Our experiments suggest that while weights pretrained on regular images are a good starting point for medium-sized networks, making the weights trainable gives much better performance. Following this, we provide design insights with respect to multiple variables such as the short-time Fourier transform (STFT) parameters, the spectrogram and anchor sizes, and various thresholds of the model. To evaluate the detection and localization performance of the proposed system, we generate synthetic data as per the recently proposed WiFi-HT protocol, adopt the mean average precision (mAP) metric [68] and make the necessary modifications to account for evaluation over varying SNR values.
An mAP of 0.9 is recorded when the model is trained and tested on positive SNR values with single-bandwidth signals. In the coming sections we discuss our signal detection framework and introduce the Faster RCNN architecture.

3.2 Framework for signal detection and time-frequency localization

[Figure 3.1: Proposed framework for signal detection and time-frequency localization. Blocks: RF time-series capture; spectrogram creation and pre-processing; box detection in spectrograms; time-frequency information extraction.]

We propose a deep learning framework to detect and estimate the time-frequency span of all wireless signals present in a wideband RF spectrum. The proposed framework takes the wideband RF time-series data as the input and provides the time and frequency information of each detected signal as the output. An outline of the proposed framework is presented in Fig. 3.1. Details on each stage are presented below.

[Figure 3.2: The time content, frequency content, and spectrogram of an example wideband RF capture, when the capture duration is 633 ms, center frequency is 2.4 GHz, wideband bandwidth is 56 MHz, and sampling rate is 56 MHz. Panels: (a) signal amplitude vs. time (ms); (b) magnitude of FFT components vs. frequency (MHz); (c) spectrogram image created by plotting the PSD values (dB/Hz) against time (ms) and frequency (MHz), with the signals to be detected marked.]

RF time-series capture

In the first stage, we employ a wideband sensor with center frequency fc and bandwidth W to record time-series RF data in fragments of T ms each.
The time and frequency content of an example wideband capture with fc = 2.4 GHz, W = 56 MHz, T = 633 ms, and a sampling rate of 56 MHz is given in Fig. 3.2. While Fig. 3.2a plots the signal amplitude as a function of time, Fig. 3.2b plots the magnitude of the fast Fourier transform (FFT) components as a function of frequency.

Spectrogram creation and pre-processing

For a compact representation of the wideband signal in terms of time and frequency, we apply the STFT to the RF time-series captures and obtain the PSD as a function of time and frequency. Three-dimensional spectrogram images are then created by plotting the PSD values along the time and frequency axes. Fig. 3.2c illustrates the spectrogram image created for the RF capture in Fig. 3.2a-3.2b, when the STFT parameters are chosen as follows: the number of frequency bins is 4096, the number of time bins is 4096, the STFT window is of Hann type, and the window overlap is 2048 time bins. A few insights on the choice of STFT parameters are given in Section 3.4.1. As may be noted from Fig. 3.2c, the signals to be detected appear in the form of rectangular boxes in the spectrogram image.

From the spectrogram in Fig. 3.2c, we may note that the dimensions of each rectangular box within the spectrogram give us the time and frequency information of the corresponding wireless signal. The problem of signal detection and time-frequency localization therefore boils down to the problem of detecting and estimating the dimensions of rectangular boxes present in the spectrogram. Before attempting box detection in the spectrogram image, we may employ some pre-processing steps. For example, we may remove out-of-band transmissions to eliminate unreliable information. We may also employ denoising methods, such as wavelet denoising [69], to improve the SNR of the spectrogram.
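As a rough illustration of the spectrogram creation stage, the following Python sketch computes a PSD spectrogram with a Hann window of 4096 samples and an overlap of 2048 samples, using NumPy only. The function names `psd_spectrogram` and `synthetic_capture`, the synthetic 5 MHz burst, the shortened 5 ms capture, and the reading of the 4096 "time bins" as the window length in samples are all assumptions made for this example; they are not taken from the implementation used in this work.

```python
import numpy as np


def psd_spectrogram(x, fs, n_win=4096, n_overlap=2048):
    """Two-sided PSD spectrogram of a complex baseband capture.

    Returns (freqs_hz, times_s, psd_db), with psd_db of shape
    (n_win, n_frames): rows are frequency bins, columns are time bins.
    """
    hop = n_win - n_overlap
    window = np.hanning(n_win)  # symmetric Hann window (illustrative choice)
    n_frames = 1 + (len(x) - n_win) // hop
    # Stack the overlapping, windowed segments: shape (n_frames, n_win).
    frames = np.stack([x[i * hop:i * hop + n_win] * window
                       for i in range(n_frames)])
    # FFT of each segment, shifted so that frequencies run from -fs/2 to fs/2.
    spectrum = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    psd = np.abs(spectrum) ** 2 / (fs * np.sum(window ** 2))
    freqs = np.fft.fftshift(np.fft.fftfreq(n_win, d=1.0 / fs))
    times = (np.arange(n_frames) * hop + n_win / 2) / fs
    psd_db = 10.0 * np.log10(psd.T + 1e-20)  # small offset avoids log(0)
    return freqs, times, psd_db


def synthetic_capture(fs=56e6, duration=5e-3, f_tone=5e6,
                      t_on=1e-3, t_off=3e-3, seed=0):
    """Hypothetical test signal: one narrowband burst in weak complex noise."""
    rng = np.random.default_rng(seed)
    n = int(round(fs * duration))
    t = np.arange(n) / fs
    noise = 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    burst = np.exp(2j * np.pi * f_tone * t) * ((t >= t_on) & (t < t_off))
    return burst + noise


freqs, times, psd_db = psd_spectrogram(synthetic_capture(), fs=56e6)
print(psd_db.shape)
```

In the resulting array, the burst occupies a rectangular region bounded by its on and off times along the time axis and by its bandwidth along the frequency axis, which is the kind of box the later stages are meant to detect.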
Box detection in spectrograms

To detect the rectangular boxes present in the spectrograms, we take a supervised machine learning approach, wherein we train an FRCNN model [1] with several labelled spectrogram images. The trained FRCNN model, when given a test spectrogram image as input, detects the rectangular boxes present in the image and reports their dimensions. In Section 3.3, we present details on the FRCNN architecture and provide an overview of how the FRCNN model achieves the box detection task at hand.

Time-frequency information extraction

As the final step, we convert the dimensions of each rectangular box reported by the FRCNN model into time and frequency information. For example, using the STFT parameters employed in the spectrogram creation stage, we may scale the x and y dimensions of each box into the time and frequency span of the corresponding signal. The same approach may also be followed to obtain the narrowband center frequency of the signal. In the next section, we present details on the FRCNN architecture and expose several design choices that need to be made to perform the signal detection and time-frequency localization task.

3.3 Faster RCNN Architecture

Faster RCNN is an object detection framework composed of three modules, as illustrated in Fig. 3.3. The first module, which is the base network (BN), takes the image as the input (a spectrogram in our case), extracts features that are relevant to the object detection task at hand and outputs a down-scaled feature image. The second module, which is the region proposal network (RPN), takes as input the down-scaled feature image, a set of anchor boxes (ABs) and the ground truths (GTs). The RPN provides as output the region proposals, which are candidate boxes that are likely to contain the objects of interest. In our case, the objects of interest are the rectangular boxes in the spectrogram. The region proposals from the RPN, along with the feature image, are fed into the detector network, which is the third module. The detector network assigns object class labels to each region proposal from the RPN, performs a regression task to localize the object within the region proposal, and also provides the probabilities with which the assigned labels are true. Essentially, the RPN acts as an attention mechanism [70] over the feature image to help the classifier with the object detection and localization task. When spectrograms are provided as input, the entire FRCNN network can be thought of as a single unified framework for detecting and localizing the rectangular boxes (which are a manifestation of the signals of interest) present in the spectrograms. Details on each module in the FRCNN are presented below.

[Figure 3.3: Faster RCNN Architecture.]

3.3.1 Base Network

The base network is a CNN that can be either shallow or deep depending on the complexity of the features that need to be extracted from the input image. The convolutional layers are interleaved with max pooling layers, and the combination of these layers decides the total down-scaling factor. In Section 3.5, we experiment with multiple feature extraction networks, namely, the VGG-13 [71] with the first 10 convolutional layers, the VGG-13 [71] and the ResNet-50 [72] networks, and report the impact on the performance of the FRCNN model.

3.3.2 Region Proposal Network

In the RPN, the down-scaled feature map obtained from the base network is passed through an n × n convolutional layer, where typically n = 3, to obtain a low-dimensional feature vector. Also, a fixed number of user-defined boxes, known as anchors, are created for each pixel in the input feature map to serve as the raw region proposals.
The low-dimensional feature vector, along with the raw region proposals created from the anchors, is fed into two fully connected layers to perform a classification and a regression task respectively. The classification task assigns probabilistic labels to each raw region proposal as positive or negative, to denote whether or not the proposal is likely to contain an object of interest. For proposals that are deemed positive by the RPN, the regression task tunes the size of the proposals to suit the dimensions of the object. The role of the anchors is explained next.

Anchors

As briefly mentioned earlier, for each pixel of the down-scaled feature map from the BN, the RPN generates a predefined number Na of raw region proposals centered at the pixel, where Na is the number of anchors. Anchors are user-defined raw region proposals whose size and aspect ratio need to be specified before the training process begins. During training, if any anchor box has an intersection over union (IoU) with the ground truth that is greater than a certain threshold (referred to as the RPN max overlap), the RPN treats the anchor box as a positive target. On the other hand, if the IoU is less than a certain threshold (referred to as the RPN min overlap), the RPN treats the anchor box as a negative target. Any anchor box whose IoU with the ground truth falls between the RPN min and RPN max overlap is left unutilized and is not acted upon for any further decision making. If the total number of anchors is Na, we would therefore have Na proposals per pixel in the input feature map. The regression layer in the RPN will have 4Na outputs per pixel, which correspond to the corner coordinates of the Na anchor boxes per pixel. Also, the classification layer will have Na outputs per pixel, to denote the probabilities with which the associated proposals contain the object of interest.
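The IoU-based labelling of anchor boxes described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: boxes are given as corner tuples (x_min, y_min, x_max, y_max), the function names `iou` and `label_anchor` are hypothetical, and the threshold values 0.3 and 0.7 are placeholders for the RPN min and RPN max overlap rather than the values used in this work.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, rpn_min_overlap=0.3, rpn_max_overlap=0.7):
    """Label one anchor as 'positive', 'negative' or 'ignored'.

    The anchor is compared against every ground-truth box and the best
    IoU decides the label. Threshold defaults are illustrative only.
    """
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best > rpn_max_overlap:
        return "positive"
    if best < rpn_min_overlap:
        return "negative"
    return "ignored"  # between the two thresholds: not used for training


ground_truths = [(0, 0, 10, 10)]
for anchor in [(0, 0, 10, 10), (5, 0, 15, 10), (20, 20, 30, 30)]:
    print(anchor, label_anchor(anchor, ground_truths))
```

In the spectrogram setting, the x axis of each box would correspond to time and the y axis to frequency.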
Also, for an input feature map of size W × H, we would have a total of W·H·Na region proposals.

On the W·H·Na region proposals thus obtained, we perform a non-max suppression (NMS) operation, as explained below, and obtain a filtered list comprising a small predefined number, say Nr, of region proposals. The filtered list of proposals is fed into the third module, which is the detector network, for further action.

Non-Max Suppression

From the W·H·Na region proposals, we choose a small predefined number of region proposals, referred to from here on as the regions of interest (RoIs), and feed them into the detector network. The motivation for NMS is that restricting the number of RoIs improves both the performance of the detector network and the overall processing time. During NMS, we first sort all the region proposals in decreasing order of their probabilities. Next, we retain the region proposal with the highest probability and suppress all other proposals whose IoU with the retained RoI is greater than a predefined threshold, referred to as the NMS threshold. The same procedure is followed for the RoI with the next highest probability and so on, until we have retained a small predefined number Nr of RoIs. This process of suppression is agnostic to the anchor type that the RoI belongs to and relies only on the RPN classification probabilities. After NMS, the RoIs which have a high IoU with the ground truth are treated as positive targets and the rest are treated as negative, i.e., background, targets for the classification and regression tasks executed in the detector network.

3.3.3 Detector Network

The detector network performs an RoI convolutional pooling operation, followed by a classification and a regression operation.

RoI Convolutional Pooling

The RoIs provided by the RPN (after the NMS operation) can be of different sizes, depending on our choice of the anchors and the result of the regression task in the RPN.
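For reference, the greedy NMS step that produces these RoIs can be sketched in plain Python as follows. The function names `iou` and `non_max_suppression`, the corner-format boxes (x_min, y_min, x_max, y_max), and the default values for the NMS threshold and for Nr (here `max_rois`) are assumptions chosen for illustration, not the settings used in this work.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, nms_threshold=0.7, max_rois=300):
    """Greedy NMS: return the indices of the retained proposals.

    Proposals are visited in decreasing order of their RPN probability.
    Each retained proposal suppresses the remaining ones whose IoU with
    it exceeds nms_threshold; the loop stops once max_rois are retained.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order and len(keep) < max_rois:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= nms_threshold]
    return keep


proposals = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
probabilities = [0.9, 0.8, 0.7]
print(non_max_suppression(proposals, probabilities))
```

In this toy example the second proposal overlaps heavily with the first and is suppressed, while the third, which does not overlap, is retained.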
The RoIs need to be converted into fixed-size inputs in order to be able to feed them into convolutional layers for classification and regression within the detector network. This action is carried out by a convolutional network known as the RoI pooling network [73]. It takes two inputs, namely, the convolutional feature map from the BN and the filtered RoIs after the NMS operation. For every RoI from the filtered RoI list, the RoI pooling network takes the section of the convolutional feature map that corresponds to the RoI and scales it to some predefined output size (e.g., 7 × 7). The scaling process is carried out as follows: (i) dividing the RoI into equal-sized sections whose number matches the output dimension, and (ii) finding the maximum value in each section and copying these values to the output. The result is that from a list of RoIs of different sizes, we can obtain modified RoIs of a fixed size. The RoI pooling output dimension depends neither on the size of the feature map from the BN nor on the size of the filtered RoIs, but only on the number of sections we divide each RoI into. By yielding fixed-size RoIs as the output, the RoI pooling layer allows us to use densely connected convolutional layers for the ensuing classification and regression tasks.

Classification and Regression in the Detector

After performing RoI convolutional pooling, the fixed-size RoIs are fed into a set of convolutional layers to convert them into low-dimensional feature vectors. These feature vectors are then input to two densely connected networks to perform a classification and a regression task respectively. The classification task focuses on assigning probabilities that the RoIs contain an object of interest. The regression task focuses on fine-tuning the size of each positive-labelled RoI to match the dimensions of the object present in it.

When training the detector, if any RoI has an IoU with the ground truth that is greater than a certain threshold (referred to as the Detector max overlap), the RoI is treated as a positive target. If the IoU is less than the Detector max overlap but greater than a certain threshold, namely the Detector min overlap, the RoI is treated as a negative target. Any RoI whose IoU with the ground truth is less than the Detector min overlap is left unutilized for any further decision making. During test time, if the total number of RoIs after the NMS operation is Nr, the classification task in the detector yields Nr outputs, each denoting the probability with which the associated RoI contains the signal of interest. For all the RoIs with a probability higher than a certain threshold, namely the Detector positiveness threshold, the regression task yields 4 outputs, corresponding to the corner coordinates of the regressed RoI.

3.3.4 Loss Functions

For the classification and regression tasks in both the RPN and the detector networks, we need to optimize appropriate loss functions. A standard multi-task loss function, as defined in [1], can be given as:

\[ L(p_i, t_i) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*), \tag{3.1} \]

where N_cls is the mini-batch size, N_reg = 4 N_cls is the total number of coordinates in the mini-batch, λ is a weight that balances the two terms, i is the anchor index and p_i is the predicted probability of the anchor i containing the object of interest. The term p_i^* is the GT label, which takes a value of 0 or 1 depending on whether the anchor is negative or positive respectively. The term t_i^* is a vector comprising the four parameterized coordinates of the GT box associated with a positive anchor, and t_i is that of the predicted bounding box. The p_i and t_i terms are the outputs from the classification and regression respectively. The classification loss L_cls is a simple binary cross-entropy loss [74] over the two object classes of interest, namely the signal and the background. On the other hand, the regression loss L_reg is given as

\[ L_{reg}(t_i, t_i^*) = R(t_i - t_i^*), \tag{3.2} \]

where R is the robust loss function defined in [75]. The term p_i^* L_reg indicates that the regression loss is turned on only for positive anchors, i.e., when p_i^* = 1, and is quiescent otherwise. Elements of the parameterized coordinate vectors t_i^* and t_i in the regression task are obtained from the coordinates of the ground truth, the anchor box and the bounding box as [1]:

\[
\begin{aligned}
t_x &= (x - x_a)/w_a, & t_y &= (y - y_a)/h_a, \\
t_w &= \log(w/w_a), & t_h &= \log(h/h_a), \\
t_x^* &= (x^* - x_a)/w_a, & t_y^* &= (y^* - y_a)/h_a, \\
t_w^* &= \log(w^*/w_a), & t_h^* &= \log(h^*/h_a),
\end{aligned}
\tag{3.3}
\]

where x, y, w, and h denote the time and frequency coordinates of the center, the time span and the frequency span of the bounding box in the spectrogram respectively. The quantities x_a, y_a, w_a, h_a and x^*, y^*, w^*, h^* are similarly defined for the anchor box and the GT respectively. The regression tasks in both the RPN and the detector networks can be thought of as bounding-box regressions from an anchor box or region proposal to a nearby GT signal, with some transformation of the optimization variables, as given in (3.3), for ease of training. Similar loss functions are used for both the RPN and the detector network, with appropriate targets in each case.

3.3.5 Training procedure

We employ the approximate joint training method proposed in [1] to train our system. In this method, during each forward pass, the RPN generates proposals which are used for the backpropagation. The updated RPN from the backpropagation generates an additional round of proposals, which are then treated as precomputed proposals when training the detector network.
After the forward pass of the detector, the RPN loss and the detector loss are summed and the resulting loss is backpropagated through both networks. As mentioned in [1], this method ignores the derivative of the summed loss with respect to the proposal boxes and is therefore an approximation. However, it reduces the training time significantly, is easier to implement, and the loss in performance due to the approximation is minor.

3.4 Design Choices to Adopt FRCNN for Signal Detection and Time-Frequency Localization

As may be noticed from our description of the FRCNN architecture, there are several design choices that we need to make in order to adopt the FRCNN model for the signal detection and time-frequency localization task at hand. Below we provide insights on the major design choices that we have made.

3.4.1 Choice of STFT parameters

To generate the spectrogram images from the raw RF time-series data, we need to apply the discrete-time STFT, which in turn requires us to choose a few hyperparameters, namely the window size, the window type, the window overlap, and the FFT size. We may choose these STFT parameters based on the minimum time and frequency resolutions that we need to achieve. The window size governs the maximum achievable frequency resolution. If the window is $T$ seconds long, the minimum detectable narrowband bandwidth is $1/T$ Hz. For example, when the sampling rate is $f_s = 56 \times 10^6$ Hz and the window size is 5600 samples, the window spans $T = 5600/f_s = 100$ µs and the minimum detectable bandwidth is 10 kHz. While respecting this lower limit, the FFT size allows us to control the number of frequency bins in the spectrogram. For example, an FFT size of 1500 divides the spectrogram into 1500 frequency bins.
Therefore, if the wideband bandwidth is 56 MHz, the minimum detectable signal bandwidth would be 56 MHz / 1500 ≈ 37.3 kHz.

The window size, the window overlap, and the sampling rate govern the maximum achievable time resolution. For example, if the captured signal duration is $T_{sig} = 0.633$ seconds, the sampling rate is $f_s = 56 \times 10^6$ Hz, the window size is $T_{win} = 5600$ samples and the window overlap is $T_{ov} = 2800$ samples, the maximum achievable time resolution is $T_{sig}(T_{win} - T_{ov})/(T_{sig} f_s - T_{ov}) \approx 50$ µs. Lastly, the window type governs the amount of discontinuity between successive window segments. For example, windows which are tapered at the ends, such as the Hann and Hamming windows [76], [77], introduce far fewer unnatural discontinuities in the time domain than windows with non-tapering ends, such as the rectangular window.

3.4.2 Spectrogram size

When feeding the spectrogram into the FRCNN model, care should be taken to ensure that the size of the input image is not too large, for two reasons. Firstly, when we use large images as input to the model, each pass of training takes a long time to complete because of the large input feature map, the number of anchors, and the resulting set of computations. This leads to a long convergence time. Secondly, the base network that extracts the features has a particular receptive field that depends on the base network used, such as VGG or ResNet. When this receptive field is large compared to the size of the signals, it is hard for the network to make sense of the extracted features, as they contain more of the background than of any given signal. In light of these two observations, we restrict the length of our input image to 600 pixels in time. This is achieved by chopping the input spectrogram into chunks of 600 pixels each and recalculating the relative positions of the ground truths accordingly.
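The chunking step just described can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the function name `chunk_spectrogram`, the box format `(t_min, f_min, t_max, f_max)` in bin units, and the choice to clip ground truths that straddle a chunk boundary are all assumptions made for the example.

```python
import numpy as np

def chunk_spectrogram(spec, gt_boxes, chunk_len=600):
    """Split a (freq_bins, time_bins) spectrogram into fixed-length time chunks.

    gt_boxes holds ground truths as (t_min, f_min, t_max, f_max) in bins of
    the full spectrogram (hypothetical format). Returns a list of
    (chunk, local_boxes) pairs with time coordinates relative to each chunk.
    """
    n_time = spec.shape[1]
    chunks = []
    for start in range(0, n_time, chunk_len):
        stop = min(start + chunk_len, n_time)  # the last chunk may be shorter
        local_boxes = []
        for t_min, f_min, t_max, f_max in gt_boxes:
            # Assumption: a box crossing the chunk boundary is clipped to it.
            lo, hi = max(t_min, start), min(t_max, stop)
            if hi > lo:
                local_boxes.append((lo - start, f_min, hi - start, f_max))
        chunks.append((spec[:, start:stop], local_boxes))
    return chunks
```

For a spectrogram with 12599 time bins this yields 21 chunks, the last one being 599 bins long. Whether a short final chunk is padded or discarded is left open here.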
Doing so ensures that the size of the signals in the spectrogram is comparable to the full spectrogram size. This approach is also seen to decrease the training time significantly.

3.4.3 Choice of base network

As may be noted from Section 3.3.1, the base network in the FRCNN model handles the crucial task of feature extraction and therefore needs to be chosen based on the type of input data used and the type of object detection task at hand. Standard feature extraction models used in computer vision include VGG-13 [71], ResNet-50 [72], and MobileNet [78], among many other popular ones [79]. These models have been built to detect objects in benchmark datasets for computer vision, such as PASCAL VOC2007 [64] and MS COCO [65]. Example objects to be detected in these datasets include humans, animals, and automobiles. It is therefore not clear whether standard base network models such as VGG-13 and ResNet-50 can perform feature extraction for the signal detection task on RF datasets. We may conduct a few experiments to choose the base network for the task at hand. Firstly, we may take the publicly available pretrained weights of the VGG-13 and ResNet-50 models and verify whether the extracted features are useful for the signal detection task. Secondly, we may use the pretrained weights as initialization points and optimize the weights of the VGG-13 and ResNet-50 models to perform tailor-made feature extraction for the task. Thirdly, we may keep only the architecture of the VGG-13 and ResNet-50 models, initialize the weights randomly and optimize them for the feature extraction. We pursue these three experiments in Section 3.5 and provide insights based on the observations we make.

3.4.4 Anchor box sizes and aspect ratios

As we may recall from Fig. 3.3 and Section 3.3.2, anchor boxes aid the RPN in generating region proposals for the detector network. The RPN acts upon the anchor boxes generated per pixel of the input feature map by performing (i) a classification task which assigns to each anchor the probability of it containing the signal of interest and (ii) a regression task which regresses the corner coordinates of the anchor boxes to generate the region proposals. When training the RPN for the classification and regression, anchor boxes whose IoU with the GT is larger than the RPN max overlap are treated as positive targets and the ones having an IoU lower than the RPN min overlap are treated as negative targets. Consequently, for the training to be successful, we need to carefully choose the anchor sizes and aspect ratios, as well as the RPN min and max overlap values.

Since the anchor boxes serve as raw region proposals for the RPN to act upon, we may choose the anchor sizes and aspect ratios to match the dimensions of the signals that we wish to detect and localize. For example, if the narrowband signals are known to follow the IEEE 802.11n HT protocol, we know that the signals have a bandwidth of either 20 MHz or 40 MHz and that their duration ranges from about 300 microseconds to the order of 15 milliseconds [2], [80]. We may therefore choose multiple anchor boxes, each with a frequency span of either 20 or 40 MHz and with a time span chosen uniformly at random in the range [0.3, 15] ms.

3.4.5 RPN max and min overlap

The RPN max overlap needs to be chosen such that at least a few anchor boxes per image have a high enough IoU with the GT to be considered as positive targets for the RPN training. Typically, the RPN max overlap is chosen to be greater than or equal to 0.5 because an IoU of 0.5 or more gives us confidence that the anchor box indeed contains the signal of interest. On a similar note, the RPN min overlap needs to be chosen such that at least a few anchor boxes per image have a low enough IoU with the GT to be considered as negative targets for the RPN training.
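As a concrete illustration of how the two RPN thresholds partition the anchors, and of how the regression targets of (3.3) follow for a positive anchor, consider the sketch below. The function names, the box format `(t_min, f_min, t_max, f_max)`, the default thresholds of 0.3 and 0.7, and the treatment of an IoU exactly equal to the max overlap as positive are assumptions of this example rather than details of the trained model.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes (t_min, f_min, t_max, f_max)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def label_anchor(anchor, gt_boxes, min_overlap=0.3, max_overlap=0.7):
    """Return 1 (positive), 0 (negative) or -1 (ignored during training)."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best >= max_overlap:
        return 1
    if best < min_overlap:
        return 0
    return -1  # between the two thresholds: not used as a target

def regression_targets(anchor, gt):
    """Targets (t_x*, t_y*, t_w*, t_h*) of (3.3) for one anchor/GT pair."""
    wa, ha = anchor[2] - anchor[0], anchor[3] - anchor[1]
    xa, ya = anchor[0] + wa / 2, anchor[1] + ha / 2
    w, h = gt[2] - gt[0], gt[3] - gt[1]
    x, y = gt[0] + w / 2, gt[1] + h / 2
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))
```

An anchor shifted by half its width along the time axis has an IoU of 1/3 with the ground truth, so with the illustrative thresholds above it is neither a positive nor a negative target. Raising the max overlap or lowering the min overlap widens this ignored band.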
Typically, the RPN min overlap needs to be less than 0.5 because an IoU less than 0.5 denotes that the anchor box may not contain the signal of interest. Higher RPN max overlap and lower RPN min overlap values deem fewer anchors as positive and negative targets respectively. This consequently results in slower but smoother convergence.

3.4.6 Detector network min and max overlap

The detector max overlap needs to be chosen such that at least a few RoIs per image have a high enough IoU with the GT to be considered as positive targets for training the classification and regression layers of the detector. Typically, the detector max overlap is chosen to be greater than or equal to 0.5 because an IoU of 0.5 or more gives us confidence that the RoI indeed contains the signal of interest. On a similar note, the detector min overlap needs to be chosen such that at least a few RoIs per image have an IoU value between the detector min and max overlaps to be considered as negative targets for the training. Typically, the detector min overlap needs to be less than 0.5 because an IoU less than 0.5 denotes that the RoI may not contain the signal of interest. The higher the detector max overlap, the fewer the positive target RoIs. Also, the smaller the difference between the detector max and min overlap values, the fewer the negative target RoIs. For the training to converge faster, we need to make sure that the pool of positive and negative targets per image is as large as possible and choose the detector max and min overlaps accordingly. In Section 3.5, we conduct simple grid search experiments to select the best possible values for these thresholds.

3.5 Numerical Studies

We now present numerical studies on the performance of the Faster RCNN model for the signal detection and time-frequency localization task under study.
Below we provide details on the training and test datasets, the spectrogram generation, the various numerical thresholds chosen for the FRCNN model, and the metric for performance evaluation.

3.5.1 Dataset for training and testing

We consider RF transmissions as per the IEEE 802.11n HT mode protocol [2], popularly known as the WiFi-HT protocol, and generate the time-series data synthetically using the MATLAB WLAN toolbox [81]. All the generated RF captures are centered around the 5.8 GHz range, have a time duration of 630 ms, a wideband bandwidth of 56 MHz, and an SNR drawn uniformly at random from the set {0, 10, 20, 30} dB. The data sampling rate is 56 MHz and the total useful bandwidth, after removing unreliable out-of-band transmissions, is 44.8 MHz. On average, each RF capture contains about 90 WiFi-HT signal packets, each having a narrowband bandwidth of 20 MHz. All the signals are randomly subject either to line-of-sight or to non-line-of-sight small-scale fading effects. In total, we generate a dataset of 7 captures per SNR, amounting to around 3780 signals. Out of the 7 captures per SNR, we randomly choose 5 for training and the remaining 2 for testing.

3.5.2 Spectrogram generation

For each RF capture, we apply the STFT with the following choice of parameters: the window size is 5600, the window overlap is 2800, the window type is Hann, and the FFT size is 1500. The resulting spectrogram, after removal of out-of-band transmissions, contains 1200 frequency bins and 12599 time bins. Each spectrogram image is chopped into fixed chunks of 600 bins in time, both to speed up the training process, as bigger spectrograms take longer to train on, and to maximize the performance of the FRCNN (cf. Section 3.4.2 for details). This allows us to detect signals that span a minimum of 0.05 ms in time and 37.3 kHz in frequency (cf. Section 3.4.1). Each spectrogram image therefore results in a total of 21 input images to the FRCNN model.
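The bin counts and resolutions quoted above follow from simple arithmetic on the STFT parameters, reproduced in the sketch below. The function name `spectrogram_geometry` and the framing convention (no padding, so that the number of frames is the integer part of (N - overlap)/hop) are assumptions of this example; the convention was chosen because it reproduces the 12599 time bins reported here.

```python
import math

def spectrogram_geometry(fs=56e6, duration=0.630, win=5600, overlap=2800,
                         nfft=1500, useful_bw=44.8e6, chunk_len=600):
    """Derive spectrogram dimensions and resolutions from the STFT settings."""
    hop = win - overlap                      # samples between successive windows
    n_samples = round(fs * duration)         # samples in one RF capture
    time_bins = (n_samples - overlap) // hop # assumed: no padding at the edges
    freq_bins = round(nfft * useful_bw / fs) # bins left after out-of-band removal
    return {
        "time_bins": time_bins,
        "freq_bins": freq_bins,
        "time_res_s": hop / fs,              # time span of one time bin
        "freq_res_hz": fs / nfft,            # frequency span of one frequency bin
        "n_chunks": math.ceil(time_bins / chunk_len),
    }

geometry = spectrogram_geometry()
```

With the default arguments, `geometry` holds 12599 time bins, 1200 frequency bins, a time resolution of 50 µs, a frequency resolution of about 37.3 kHz, and 21 chunks per capture.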
Consequently, the FRCNN model encounters a total of 630 input images during training and a total of 252 input images during testing.

3.5.3 Numerical thresholds for the FRCNN model

The convolutional layer at the start of the RPN (cf. Section 3.3.2), which is used to create a low-dimensional feature vector from the feature map generated by the base network, is chosen as per [1] to be of size 3 × 3. The anchors are defined such that (i) the time-axis sizes are chosen from the set {20, 40, 80, 120}, to represent signal time spans of {1, 2, 4, 6} ms respectively, and (ii) the frequency-axis size is chosen to be 17.92, which corresponds to the 90% useful bandwidth of the 20 MHz narrowband transmissions. In total, we therefore have a maximum of $N_a = 4$ anchor boxes. The RPN max overlap and the detector max overlap are chosen from the set {0.5, 0.7, 0.9}. The RPN min overlap and the detector min overlap are chosen from the set {0.1, 0.3}. Also, following [1], we set the number of RoIs $N_r$ from the NMS operation to 300, the output RoI size from the RoI pooling network to 7 × 7, and the Detector positiveness threshold to 0.5. The $\lambda$ value in the loss function is set to 1 for all four classification and regression tasks in the RPN and the detector. The mini-batch size $N_{cls}$ is set to 256 for the RPN and 32 for the detector.

3.5.4 Training performance evaluation

As may be recalled from Sections 3.3.2-3.3.3, there are four main machine learning tasks within the FRCNN model, namely the RPN classification, RPN regression, Detector classification, and Detector regression.
To evaluate the training performance, we consider an example experiment with the following settings: the base network is VGG-13 (initialized with pretrained weights and configured as trainable), each training step is carried out with one spectrogram chunk, the RPN min overlap is 0.1, the RPN max overlap is 0.9, the Detector min overlap is 0.1, the Detector max overlap is 0.5, the weighted-sum loss function is chosen as in (3.1) with $\lambda = 1$, and training runs for a total of 20 epochs. In Fig. 3.4, we plot the training loss for the four tasks mentioned above as a function of time. To focus on the trends, a moving average was applied over a window of 5 time steps. We notice that all four training losses converge towards zero over time. The convergence rates and the fluctuations in the training loss depend on the quantity and quality of the positive and negative targets available per mini-batch to the RPN and the Detector. Example positive targets on a spectrogram chunk for the RPN and the positive RoI inputs for the detector network are shown in Fig. 3.5.
As may be noted from Sections 3.3.2 and 3.3.3, the RPN targets are the anchor boxes whose IoU with the ground truth is ≥ RPN max overlap and the detector inputs are the RoI proposals chosen after non-max suppression.

[Figure 3.4, four panels of training loss versus time steps: (a) RPN classification loss, (b) RPN regression loss, (c) Detector classification loss, (d) Detector regression loss]
Figure 3.4: Training loss convergence for the RPN classification, RPN regression, Detector classification, and Detector regression tasks in the FRCNN model.

[Figure 3.5, two panels: (a) RPN Targets, (b) Positive Regions of Interest]
Figure 3.5: Targets for the RPN and the inputs for the Detector. The red boxes are the computed targets and the RoIs, whereas the blue ones indicate the GT signals.

Algorithm 3 Calculation of the mean average precision (mAP) metric
Require: SNR levels, ground truths, bounding box predictions, detection probabilities
1: for each SNR level $s$ do
2:   Sort the predictions in decreasing order of detection probabilities
3:   for each bounding box prediction $i$ do
4:     Assign a True label if it has IoU ≥ 0.5 with any GT (or False otherwise)
5:     Calculate the number of true positives (TP), false positives (FP), and false negatives (FN) so far
6:     Calculate the precision $\mathrm{prec}_i$ until the current prediction as $\mathrm{prec}_i = \mathrm{TP}/(\mathrm{TP} + \mathrm{FP})$
7:     Calculate the recall $r_i$ until the current prediction as $r_i = \mathrm{TP}/(\mathrm{TP} + \mathrm{FN})$
8:   end for
9:   for recall levels $r_j \in \{0, 0.1, 0.2, \ldots, 1\}$ do
10:    Calculate the maximum achieved precision $\widetilde{\mathrm{prec}}_j$ for recall $r_j$ as $\widetilde{\mathrm{prec}}_j = \max_{r_k \geq r_j} \mathrm{prec}(r_k)$, where $\mathrm{prec}(r_k)$ is the precision at recall $r_k$
11:  end for
12:  Calculate the average of the maximum precisions for the 11 recall levels as $AP_s = \frac{1}{11} \sum_{j=1}^{11} \widetilde{\mathrm{prec}}_j$
13: end for
14: return the mean average precision (mAP), calculated as $\frac{1}{S} \sum_{s=1}^{S} AP_s$, where $S$ is the number of SNR levels.

3.5.5 Prediction performance evaluation

To evaluate the performance of the FRCNN model in signal detection and time-frequency localization, we consider the mAP metric [68]. The mAP is a standard and widely used metric for performance evaluation of object detection algorithms in computer vision. An overview of the mAP calculation is given in Algorithm 3. We begin by sorting all the bounding box predictions from the FRCNN model in decreasing order of the detector classification probabilities. Next, we assign a True or False label to each prediction, depending on whether it has an IoU ≥ 0.5 with any ground truth or not. We also calculate the precision and recall values until the current prediction using the formulae given in Steps 6 and 7 of Algorithm 3. We then consider 11 specific recall levels $r_j$ ranging from 0 to 1 in steps of 0.1 and record the maximum achieved precision $\widetilde{\mathrm{prec}}_j$ for each recall level. The average precision for each SNR level is obtained as the average of the $\widetilde{\mathrm{prec}}_j$ values recorded for the 11 recall levels, and the mAP is the mean of these averages over the SNR levels. Higher mAP values denote better prediction performance.

Figure 3.6: Test images of the trained model. The red boxes are the predicted bounding boxes, whereas the blue ones indicate the GT signals.

Sample prediction images of the model are shown in Fig. 3.6. It should be noted that these images are the result of a particular combination of network parameters. In the next few subsections, we analyze the mAP performance of the trained FRCNN model for various parameters such as different base networks, anchors, RPN min and max overlaps, detector min and max overlaps, and SNR values.

Impact of different base networks

[Figure 3.7, bar chart of mAP versus base network (VGG-10, VGG-13, ResNet-50) for three settings: random init with BN trainable, pretrained init with BN non-trainable, pretrained init with BN trainable]
Figure 3.7: mAP for different base networks.

In Fig. 3.7, we plot the mAP values achieved by the FRCNN model when the base network (BN) is chosen to be VGG-10, VGG-13, and ResNet-50, where the numbers 10, 13, and 50 denote the number of convolutional layers present. While the architectures for VGG-13 and ResNet-50 are available online [71], [72], the VGG-10 architecture is obtained from VGG-13 by removing the last three convolutional layers. We try three different combinations of feature extraction: (i) use pretrained weights as the initialization for the BN and configure it as non-trainable, (ii) use pretrained weights as the initialization for the BN and configure it as trainable, and (iii) use random weights as the initialization for the BN and configure it as trainable. With the VGG-10, we only attempt the third method because there are no pretrained weights available for this architecture. We notice that the VGG-13, with the BN set as trainable and the pretrained weights used as the initialization, gives the best mAP performance. The performance drops slightly when the initialization is random and even more when the BN is set as non-trainable. Also, we notice that a very deep network such as the ResNet-50 may not provide any improvement in mAP over a moderately deep network such as the VGG-13, because the training complexity increases with the depth and a moderate depth may actually be sufficient to extract all the features necessary for the task at hand.
The VGG-10, which is a slightly shallower network than the VGG-13, achieved poor mAP performance, possibly because it fails to perform feature extraction. For all the experiments presented henceforth, we fix the base network to VGG-13, initialize it with pretrained weights and configure it as trainable.

Impact of the number of anchor boxes

We now analyze the effect of using different numbers of anchors on the mAP performance of the FRCNN model. In Table 3.1, we list the mAP values achieved with $N_a = 1, 2, 3, 4$. It is observed that the mAP values are fairly constant across different $N_a$ values, with only marginal improvements when $N_a$ is increased from 1 to 4. This reveals that the regression tasks in the RPN and the Detector network are powerful enough to regress from arbitrarily close anchors to the ground truths. Also, the chosen anchors need not have a very high IoU with the ground truths for the training to be successful. For all the experiments presented henceforth, we fix the number of anchors to 3, with the anchor dimensions as given in Table 3.1.

Number of anchors   Anchor dimensions                                                 mAP
1                   [1 ms, 20 MHz]                                                    0.791322
2                   [1 ms, 20 MHz], [2 ms, 20 MHz]                                    0.79436
3                   [1 ms, 20 MHz], [2 ms, 20 MHz], [3 ms, 20 MHz]                    0.8176363
4                   [1 ms, 20 MHz], [2 ms, 20 MHz], [3 ms, 20 MHz], [4 ms, 20 MHz]    0.8172389

Table 3.1: mAP with different number of anchors.
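The mAP values reported in this section follow Algorithm 3. A minimal sketch of that computation is given below for reference. It assumes that the True/False label of Step 4 has already been assigned to each prediction, that each ground truth is matched by at most one True prediction (so that TP + FN equals the number of ground truths), and that a recall level which is never reached contributes a precision of zero. The function names are illustrative only.

```python
def average_precision_11pt(is_true, n_ground_truth):
    """11-point interpolated AP for one SNR level.

    is_true: True/False labels of the predictions, already sorted in
             decreasing order of detection probability.
    n_ground_truth: number of GT signals, i.e. TP + FN (assumed > 0).
    """
    tp = fp = 0
    precisions, recalls = [], []
    for hit in is_true:
        if hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / n_ground_truth)

    total = 0.0
    for j in range(11):
        r_j = j / 10
        # Maximum precision over all operating points with recall >= r_j.
        reachable = [p for p, r in zip(precisions, recalls) if r >= r_j]
        total += max(reachable, default=0.0)
    return total / 11

def mean_average_precision(ap_per_snr):
    """Mean of the per-SNR AP values, given as a dict {snr_db: AP}."""
    return sum(ap_per_snr.values()) / len(ap_per_snr)
```

For instance, with two ground truths and three predictions labelled True, False, True, the six recall levels up to 0.5 see a maximum precision of 1 and the five levels above see 2/3, giving an AP of 28/33.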
Impact of RPN and Detector Min and Max Overlaps

[Figure 3.8, three panels of mAP versus min overlap (0.1, 0.3), with one curve per max overlap (0.5, 0.7, 0.9): (a) mAP for different RPN thresholds, with DET Min Overlap = 0.1 and DET Max Overlap = 0.5; (b) mAP for different DET thresholds, with RPN Min Overlap = 0.1 and RPN Max Overlap = 0.9; (c) mAP for different DET thresholds, with RPN Min Overlap = 0.3 and RPN Max Overlap = 0.5]
Figure 3.8: mAP with different RPN and Detector thresholds.

We now analyze the impact of the RPN and DET min and max thresholds. Since a grid search can be computationally expensive, we employ an alternate-once strategy for finding the optimal threshold values. We first fix the DET min and max overlaps to be 0.1 and 0.5 respectively and search over different combinations of RPN min and max overlap values, as shown in Fig. 3.8a. Among the different thresholds, we observe that the (RPN min, RPN max) combinations (0.1, 0.9) and (0.3, 0.5) achieve better mAP values than the rest. We therefore fix the RPN min and max values to these two combinations and search over the different DET min and max overlap values, as shown in Fig. 3.8b and 3.8c. We notice that the best (DET min, DET max) combination among these runs is (0.1, 0.5) and fix this combination for all ensuing experiments.

The alternate-once strategy presented above is a simplified search approach to obtain the RPN and DET min and max overlap thresholds. Naturally, a more exhaustive grid search may help us choose better threshold values but would require many more experimental runs.
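The alternate-once strategy can be sketched as a two-stage search. In the sketch below, `evaluate_map` is a hypothetical callable that trains and tests the model for one threshold configuration and returns its mAP; it stands in for a full experimental run and is not part of any library. The parameter `keep=2` mirrors the choice above of carrying two RPN combinations into the second stage.

```python
from itertools import product

MIN_OVERLAPS = (0.1, 0.3)
MAX_OVERLAPS = (0.5, 0.7, 0.9)

def alternate_once_search(evaluate_map, det_start=(0.1, 0.5), keep=2):
    """Two-stage threshold search.

    evaluate_map(rpn, det) -> mAP, where rpn and det are (min, max) tuples.
    Returns ((best_rpn, best_det), best_map).
    """
    # Stage 1: sweep the RPN thresholds with the detector thresholds fixed.
    rpn_scores = {rpn: evaluate_map(rpn, det_start)
                  for rpn in product(MIN_OVERLAPS, MAX_OVERLAPS)}
    best_rpn = sorted(rpn_scores, key=rpn_scores.get, reverse=True)[:keep]

    # Stage 2: sweep the detector thresholds for the retained RPN settings,
    # reusing the stage-1 result where the configuration was already run.
    results = {}
    for rpn in best_rpn:
        for det in product(MIN_OVERLAPS, MAX_OVERLAPS):
            if det == det_start:
                results[(rpn, det)] = rpn_scores[rpn]
            else:
                results[(rpn, det)] = evaluate_map(rpn, det)
    best = max(results, key=results.get)
    return best, results[best]
```

With six candidate pairs for each of the RPN and the detector, this sketch needs 6 + 2 × 5 = 16 training runs, compared with 36 for the full grid.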
Note that the optimal threshold values, obtained either via grid search or via the alternate-once strategy, may not be universal for the FRCNN model because the optimal values can depend on the type and quality of the training data.

Impact of SNR

We now proceed to evaluate the performance of the FRCNN model for different SNR levels. We begin with a training dataset comprising 5 captures per SNR level in the set {0, 10, 20, 30} dB. In Fig. 3.9, we plot the mAP values achieved when the test dataset comprises captures with SNR = -10, 0, 10, 20, and 30 dB respectively. We notice that the mAP values are consistently around 0.9 for positive SNR levels while dropping to a little over 0.5 for -10 dB. To verify whether this trend is universal, we next consider a training dataset comprising 5 captures per SNR level in the set {-10, 0, 10, 20, 30} dB, i.e., we now include a negative SNR level. The mAP values for each SNR level in the test dataset are given in Fig. 3.9. When compared to the case with non-negative SNR levels, we notice a drop in the mAP performance for all the SNR levels except -10 dB. The improvement at -10 dB is expected, as the model should perform well on data it has seen before. However, the drop in the mAP values at positive SNR when the model is trained with images of negative SNR points to a lack of generalization of the model across positive and negative SNR. It is also observed that the mAP on test captures with SNR less than -10 dB is very poor irrespective of the training strategy. This exposes the need for a denoising mechanism as a preprocessing step on the spectrograms before we feed them into the FRCNN model. We have conducted experiments with general-purpose wavelet denoising [69] as a simple preprocessing step and have observed no significant improvement in the mAP values.
This observation motivates the need for an advanced custom-made denoising function that can take captures with strongly negative SNR and bring the SNR level at least up to -10 dB before feeding them to the FRCNN model. The design of such a custom-made denoising function is an interesting topic of investigation for future work.

[Figure 3.9, plot of mAP versus test SNR (-10 to 30 dB) for two models: trained on 0 to 30 dB, and trained on -10 to 30 dB]
Figure 3.9: mAP vs SNR.

Impact of disparity in signal sizes

In all our experiments so far, we have considered signals with a fixed bandwidth of 20 MHz. We now consider an artificially generated WiFi dataset comprising two different signal bandwidths, namely 5 MHz and 40 MHz, and SNR levels in the set {-10, 0, 10, 20, 30} dB. Due to the nature of the WiFi protocol, both the 5 MHz and 40 MHz signals have a variable dwell time, but their bandwidths differ by a factor of 8. We conduct this experiment to verify whether the trained FRCNN model is indifferent to the disparity in the ground truth sizes. The number of anchor boxes $N_a$ is set to 6, with the time spans drawn from {1, 2, 3} ms and the frequency spans drawn from {5, 40} MHz. We notice that the mAP drops considerably, to 0.5320695 from 0.7917968 in the single-bandwidth case. Sample test images showing the prediction performance of the model are given in Fig. 3.10. The drop in performance could have come from multiple sources: (i) the use of a single non-max suppression threshold may favor region proposals arising from a particular class of ground truth signals, thereby ignoring the other types, (ii) the base network may not be able to simultaneously extract all the important features for both the small and large bandwidth signals, (iii) since the anchor sizes are vastly different, the use of a single RPN min overlap (and RPN max overlap) may result in the RPN assigning higher classification probabilities to region proposals from one particular type of signal.
A deeper investigation needs to be conducted on these probable causes, and appropriate architectural modifications need to be developed in order to achieve uniform mAP performance across different signal sizes. We are currently exploring this research direction.

Figure 3.10: Test images of model trained on disparate anchor sizes. The red boxes are the predicted bounding boxes, whereas the blue ones indicate the GT signals.

3.6 Conclusion and Possible Future Work

In this chapter, we have proposed a deep learning framework to perform real-time signal detection and time-frequency localization in a wideband RF spectrum of interest. We convert the data capture into a spectrogram and transform the given problem into an object detection problem. The following insights are obtained:

• Our experiments suggest that while weights pretrained on regular images are a good starting point for medium-sized networks, making the weights trainable gives much better performance. It is also verified that deeper feature extraction networks do not necessarily improve performance.
• An mAP of up to 0.9 is recorded when the model is trained and tested on positive SNR values with single-bandwidth signals. This is seen to deteriorate as negative SNR signals are included in the training process, thus motivating the need for a custom denoiser. The experiments also reveal that the model is agnostic to the number of anchor boxes because of the two-stage regression framework.
• When the model is trained with disparate signal sizes, the mAP decreases noticeably. We hypothesize that this could be because of the inadequacy of the non-max suppression technique, the feature extraction network and the RPN to handle anchors at disparate scales.

Some of the avenues for future work are:

• The performance on negative SNR captures is observed to be unsatisfactory with respect to the mAP.
A custom denoising mechanism can be investigated to make sure the SNR level is acceptable before feeding the spectrogram to the signal detection framework.
• It is seen that generalizing over SNR values is not easy for the model. Custom-made models corresponding to different SNR ranges can be introduced and their performance improvement over general-purpose detectors can be observed.
• It is observed that the employed non-max suppression technique may favour a particular anchor box over another during the training process. A non-max suppression technique could be devised that does not favour any particular anchor box type.
• The performance of the model is seen to deteriorate when disparate signals are introduced in the training process. It would be interesting to change the architecture to accommodate disparity in anchor sizes and come up with custom feature extractors and region proposal networks for disparate sizes.
• The experiments conducted in this work ignore the diversity and distribution in data. It would be interesting to investigate the impact of these two variations on the performance of the model.
• The Faster RCNN model is capable of classifying distinct signal patches. It would be natural to extend the proposed framework to perform modulation classification.
• While models pretrained with regular image datasets such as PASCAL VOC2007 [64] and MS COCO [65] are good starting points for training, it would be favorable to have a model pretrained on RF data to enable enhanced feature extraction and a multitude of diverse downstream tasks.

Chapter 4
Conclusion

In this thesis, we focused on two important directions involving next generation wireless systems. The first chapter of this thesis addressed an energy efficient partially connected transceiver architecture considering practical hardware as well as run-time constraints.
While a case could be made for the energy efficiency of partially connected architectures, it should be noted that spectral efficiency does take a hit as a result of the approximations in the design. While the promise of massive MIMO in terms of spectral efficiency is real, this promise, it would appear, does not translate very well to the energy efficient hybrid precoding regime. The experiments also strongly suggest that carefully designed alternating algorithms outperform non-alternating ones, although this comes with an increased run-time that hinders real-time implementation of these algorithms. Nevertheless, in applications where energy efficiency is an important requirement and spectral efficiency is not, these architectures could still play a vital role.

The second chapter of this thesis proposed a real-time signal detection and time-frequency localization framework. It is demonstrated that deep learning indeed has the ability to perform high quality signal detection over a range of positive SNR. It would of course be unrealistic to demand that the model perform well across negative SNR as well, as this task would be equivalent to detecting objects in blurred images. A custom denoiser before the signal detection framework would aid in enhancing performance in such a scenario. While the model is very good with single-bandwidth signals, it falls short when multiple signals of disparate sizes are included. We are looking to address this issue in our future research work.

Bibliography

[1] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” in NIPS, 2015.
[2] E. Perahia and R. Stacey, Next Generation Wireless LANs: 802.11n and 802.11ac, 2nd Ed., United Kingdom, Cambridge University Press, 2013.
[3] MATLAB 2018a, The MathWorks, Inc., Natick, Massachusetts, United States.
[4] M. Grant and S. Boyd. (2014). CVX: Matlab Software for Disciplined Convex Programming (version 2.1) [Online].
Available: http://cvxr.com/cvx.
[5] T. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang, G. Wong, J. Schulz, M. Samimi, and F. Gutierrez, “Millimeter wave mobile communications for 5G cellular: It will work!,” IEEE Access, vol. 1, pp. 335-349, May 2013.
[6] T. S. Rappaport, R. W. Heath, R. C. Daniels, and J. N. Murdock, Millimeter Wave Wireless Communication, Englewood Cliffs, NJ, USA: Prentice-Hall, 2014.
[7] M. R. Akdeniz, Y. Liu, M. K. Samimi, S. Sun, S. Rangan, T. S. Rappaport, and Elza Erkip, “Millimeter wave channel modeling and cellular capacity evaluation,” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1164-1179, Jun. 2014.
[8] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Trans. Wireless Commun., vol. 13, no. 3, pp. 1499-1513, Mar. 2014.
[9] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen, L. Li, and K. Haneda, “Hybrid beamforming for massive MIMO-A survey,” IEEE Commun. Mag., vol. 55, no. 9, Sep. 2017.
[10] C. Rusu, R. Mendez-Rial, N. Gonzalez-Prelcicy, and R. W. Heath, “Low complexity hybrid sparse precoding and combining in millimeter wave MIMO systems,” Proc. IEEE Int. Conf. Commun. (ICC), London, U.K., Jun. 2015, pp. 1340-1345.
[11] L. Liang, W. Xu, and X. Dong, “Low-complexity hybrid precoding in massive multiuser MIMO systems,” IEEE Wireless Commun. Lett., vol. 3, no. 6, pp. 653-656, Dec. 2014.
[12] W. Tan, D. Xie, J. Xia, W. Tan, L. Fan, and S. Jin, “Spectral and energy efficiency of massive MIMO for hybrid architectures based on phase shifters,” IEEE Access, vol. 6, pp. 11751-11759, 2018.
[13] S. Buzzi, and C. D’Andrea, “Are mmwave low-complexity beamforming structures energy-efficient? Analysis of the downlink MU-MIMO,” Proc. IEEE Globecom Workshops (GC Wkshps), pp. 1-6, Dec. 2016.
[14] J. Singh, and S. Ramakrishna, “On the feasibility of codebook-based beamforming in millimeter wave systems with multiple antenna arrays,” IEEE Trans. 
Wireless Commun., vol. 14, no. 5, pp. 2670-2683, May 2015.
[15] C. Kim, T. Kim, and J.-Y. Seol, “Multi-beam transmission diversity with hybrid beamforming for MIMO-OFDM systems,” Proc. IEEE Global Commun. Conf. Workshops (GLOBECOM), Atlanta, GA, USA, Dec. 2013, pp. 61-65.
[16] L. Dai, X. Gao, J. Quan, S. Han, and C.-L. I, “Near-optimal hybrid analog and digital precoding for downlink mmwave massive MIMO systems,” Proc. IEEE Int. Conf. Commun. (ICC), London, U.K., Jun. 2015, pp. 1334-1339.
[17] S. Han, C.-L. I, Z. Xu, and C. Rowell, “Large-scale antenna systems with hybrid analog and digital beamforming for millimeter wave 5G,” IEEE Commun. Mag., vol. 53, no. 1, pp. 186-194, Jan. 2015.
[18] S. Park, A. Alkhateeb, and R. W. Heath, Jr., “Dynamic subarrays for hybrid precoding in wideband mmWave MIMO systems,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 2907-2920, May 2017.
[19] T. Kim, C. Kim, and J.-Y. Seol, “A low complexity hybrid beamforming algorithm for multi-beam diversity transmission,” IEEE Global Communications Conference (GLOBECOM’14), Dec. 2014.
[20] X. Yu, J.-C. Shen, J. Zhang, and K. B. Letaief, “Alternating minimization algorithms for hybrid precoding in millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 485-500, Apr. 2016.
[21] F. Sohrabi, W. Yu, “Hybrid digital and analog beamforming design for large-scale MIMO systems,” Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 2929-2933, Apr. 2015.
[22] R. Lopez-Valcarce, N. Gonzalez-Prelcic, C. Rusu, R. W. Heath, “Hybrid precoders and combiners for mmWave MIMO systems with per-antenna power constraints,” Proc. GLOBECOM, 2016.
[23] N. Li, Z. Wei, H. Yang, X. Zhang, D. Yang, “Hybrid precoding for mmWave massive MIMO systems with partially connected structure,” IEEE Access, vol. 5, pp. 15142-15151, Jun. 2017.
[24] J. Lee, and Y. Lee, “AF relaying for millimeter wave communication systems with hybrid RF/baseband MIMO processing,” Proc. IEEE Int. Conf. 
Commun. (ICC), Sydney, Australia, Jun. 2014, pp. 5838-5842.
[25] M. Soleimani, R. C. Elliott, W. A. Krzymien, J. Melzer, and P. Mousavi, “Hybrid beamforming and DFT-based channel estimation for millimeter wave MIMO systems,” Proc. Personal, Indoor, and Mobile Radio Communications (PIMRC), pp. 1-7, Oct. 2017.
[26] M. K. Samimi and T. S. Rappaport, “3-D millimeter-wave statistical channel model for 5G wireless system design,” IEEE Transactions on Microwave Theory and Techniques, vol. 64, no. 7, pp. 2207-2225, July 2016.
[27] 3GPP, “Study on channel model for frequency spectrum above 6 GHz,” 3rd Generation Partnership Project (3GPP), TR 38.900 V14.2.0, Jun. 2017.
[28] W. Tan, S. Jin, C. K. Wen, and J. Tao, “Spectral efficiency of multi-user millimeter wave systems under single path with uniform rectangular arrays,” EURASIP Journal on Wireless Communications and Networking, vol. 181, pp. 1-13, Oct. 2017.
[29] A. Cichocki, “Tensors decompositions: New concepts for brain data analysis?,” J. Control Measure. Syst. Integr. (SICE), vol. 47, no. 7, pp. 507-517, 2011.
[30] X. Yu, J. Zhang, and K. B. Letaief, “Hybrid precoding in millimeter wave systems: How many phase shifters are needed?,” Proc. IEEE Global Commun. Conf., pp. 1-6, Dec. 2017.
[31] W. Yu, and T. Lan, “Transmitter optimization for the multi-antenna downlink with per-antenna power constraints,” IEEE Trans. Signal Process., vol. 55, no. 6, pp. 2646-2660, June 2007.
[32] W. Yu, and T. Lam, “Downlink beamforming with per-antenna power constraints,” Proc. IEEE SPAWC, pp. 1058-1062, Jun. 2005.
[33] Z. Pi, “Optimal transmitter beamforming with per-antenna power constraints,” Proc. 2012 IEEE International Conf. Commun., pp. 3779-3784, Jun. 2012.
[34] Z. Rezaei, E. Yazdian, and F. S. Tabataba, “Optimal energy beamforming under per-antenna power constraint,” [online] Available: http://arXiv:1702.07545, Feb. 2017.
[35] A. C. 
Jenifer, “Beamforming with per-antenna power constraint and transmit antenna selection using convex optimization technique,” Signal and Image Processing, 5(3), 59, 2014.
[36] S. Boyd, and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[37] X. Cai, G. Wang, and Z. Zhang, “Complexity analysis and numerical implementation of primal-dual interior point methods for convex quadratic optimization based on a finite barrier,” Numerical Algorithms, 62(2):289-306, Feb. 2013.
[38] M. Achache, “A new primal-dual path-following method for convex quadratic programming,” Computational and Applied Mathematics, vol. 25, no. 1, pp. 97-110, 2006.
[39] A. Cichocki, A. Phan, “Fast Local Algorithms for Large Scale Nonnegative Matrix and Tensor Factorizations,” IEICE Trans. Fundamentals of Electronics, vol. 92, pp. 708-721, Mar. 2009.
[40] B. M. Browne, A. Langville, P. Pauca, and R. Plemmons, “Algorithms and applications for approximate nonnegative matrix factorization,” Computational Statistics and Data Analysis, vol. 52, no. 1, pp. 155-173, Sep. 2007.
[41] T. G. Kolda, and B. Bader, “Tensor decompositions and applications,” SIAM Review, vol. 51, no. 3, pp. 455-500, June 2008.
[42] K. Satyanarayana, M. El-Hajjar, P.-H. Kuo, A. Mourad, and L. Hanzo, “Millimeter wave hybrid beamforming with DFT-MUB aided precoder codebook design,” Proc. Veh. Technol. Conf., pp. 1-5, Sep. 2017.
[43] M. Vu, “MISO capacity with per-antenna power constraint,” IEEE Trans. Commun., vol. 59, no. 5, pp. 1268-1274, May 2011.
[44] D. P. Palomar, and Y. C. Eldar, Convex Optimization in Signal Processing and Communications, Cambridge University Press, 2010.
[45] M. M. A. Hossain, and R. Jantti, “Impact of efficient power amplifiers in wireless access,” Proc. IEEE GreenCom, pp. 36-40, Sep. 2011.
[46] K. N. R. S. V. Prasad, E. Hossain, and V. K. 
Bhargava, “Energy-efficiency in massive MIMO-based 5G networks: opportunities and challenges,” IEEE Wireless Commun., vol. 3, no. 24, pp. 86-94, Jun. 2017.
[47] M. Cai, K. Gao, D. Nie, B. Hochwald, J. N. Laneman, H. Huang, and K. Liu, “Effect of Wideband Beam Squint on Codebook Design in Phased-Array Wireless Systems,” Proc. IEEE Global Commun. Conf. (GLOBECOM), pp. 1-6, Dec. 2016.
[48] B. Wang, F. Gao, S. Jin, H. Lin, and G. Y. Li, “Spatial- and Frequency-Wideband Effects in Millimeter-Wave Massive MIMO Systems,” IEEE Transactions on Signal Processing, 2018.
[49] M. Cai, “Modeling and mitigating beam squint in millimeter wave wireless communication,” PhD diss., University of Notre Dame, 2018.
[50] Z. Quan, S. Cui, A. H. Sayed, and H. V. Poor, “Wideband spectrum sensing in cognitive radio networks,” Proc. IEEE Int. Conf. Commun., pp. 901-906, May 2008.
[51] D. Liu, C. Li, J. Liu, and K. Long, “A novel signal separation algorithm for wideband spectrum sensing in cognitive networks,” in Proc. IEEE Glob. Telecommun. Conf., Dec. 2010, pp. 1-6.
[52] M. Bkassiny, S. Jayaweera, Y. Li, and K. Avery, “Wideband spectrum sensing and non-parametric signal classification for autonomous self-learning cognitive radios,” IEEE Trans. Wireless Commun., vol. 11, no. 7, pp. 2596-2605, Jul. 2012.
[53] P. Pham, J. Li, J. Szurley, and S. Das, “Eventness: Object Detection on Spectrograms for Temporal Localization of Audio Events,” in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr. 2018, pp. 2491-2495.
[54] C. Kao, W. Wang, M. Sun, and C. Wang, “R-CRNN: Region-based convolutional recurrent neural network for audio event detection,” in Proc. of Interspeech’18, Hyderabad, India, Sep. 2-6 2018.
[55] N. Bitar, S. Muhammad, H. H. Refai, “Wireless Technology Identification Using Deep Convolutional Neural Networks,” Proceedings of IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), 2017.
[56] M. Zhang, M. 
Diao, and L. Guo, “Convolutional neural networks for automatic cognitive radio waveform recognition,” IEEE Access, vol. 5, pp. 11074-11082, 2017.
[57] P. Sermanet, et al., “Overfeat: integrated recognition, localization and detection using convolutional networks,” arXiv preprint arXiv:1312.6229, Feb. 2014.
[58] D. Erhan, et al., “Scalable object detection using deep neural networks,” in Proc. of IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Jun. 2014, pp. 2155-2162.
[59] R. Girshick, et al., “Rich feature hierarchies for accurate object detection and semantic segmentation,” in IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Jun. 2014, pp. 580-587.
[60] K. He, et al., “Spatial pyramid pooling in deep convolutional networks for visual recognition,” in European Conf. Computer Vision (ECCV), 2014, pp. 346-361.
[61] R. Girshick, “Fast R-CNN,” in IEEE Int. Conf. Computer Vision (ICCV), 2015, pp. 1440-1448.
[62] J. R. Uijlings, et al., “Selective search for object recognition,” Int. J. Computer Vision (IJCV), vol. 104, no. 2, pp. 154-171, Apr. 2013.
[63] C. L. Zitnick and P. Dollar, “Edge boxes: Locating object proposals from edges,” in European Conference on Computer Vision (ECCV), Sept. 2014, pp. 391-405.
[64] M. Everingham, et al., “The PASCAL visual object classes challenge 2007 (VOC2007) Results,” 2007.
[65] T.-Y. Lin, et al., “Microsoft COCO: common objects in context,” in European Conf. Computer Vision (ECCV), 2014.
[66] J. Salamon, C. Jacoby, and J. P. Bello, “A dataset and taxonomy for urban sound research,” in Proc. of the ACM Int. Conf. on Multimedia (ACM MM), 2014, pp. 1041-1044.
[67] Annamaria Mesaros, et al., “DCASE 2017 challenge setup: tasks, datasets and baseline system,” in Proc. of the Detection and Classification of Acoustic Scenes and Events Workshop (DCASE), 2017.
[68] C. D. Manning, P. Raghavan, and H. Schutze, “Chapter 8: Evaluation in information retrieval” in Introduction to Information Retrieval, Cambridge University Press. 
2008.
[69] R. Rangarajan, R. Venkataramanan, and S. Shah, “Image denoising using wavelets,” Wavelet and Time Frequencies, 2002.
[70] J. K. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, and Y. Bengio, “Attention-based models for speech recognition,” in Neural Information Processing Systems (NIPS), 2015.
[71] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in International Conference on Learning Representations (ICLR), 2015.
[72] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” arXiv:1512.03385, 2015.
[73] K. He, X. Zhang, S. Ren, and J. Sun, “Spatial pyramid pooling in deep convolutional networks for visual recognition,” in ECCV, 2014.
[74] Cross entropy, [Online]. Available: https://en.wikipedia.org/wiki/Cross_entropy
[75] R. Girshick, “Fast R-CNN,” in IEEE International Conference on Computer Vision (ICCV), 2015.
[76] F. J. Harris, “On the use of windows for harmonic analysis with the discrete Fourier transform,” in Proc. of the IEEE, vol. 66, no. 1, pp. 51-83, Jan. 1978.
[77] R. B. Blackman, and J. W. Tukey, “The measurement of power spectra from the point of view of communications engineering-Part I,” Bell Syst. Tech. J., vol. 37, no. 1, pp. 185-282, 1958.
[78] A. G. Howard, et al., “Mobilenets: efficient convolutional neural networks for mobile vision applications,” [Online] arXiv preprint arXiv:1704.04861, Apr. 2017.
[79] Pre-Trained Models, [Online]. Available: https://keras.rstudio.com/articles/applications.html
[80] Packet transmission time in 802.11, [Online]. Available: https://sarwiki.informatik.hu-berlin.de/Packet_transmission_time_in_802.11.
[81] WLAN System Toolbox, [Online]. 
Available: https://www.mathworks.com/help/wlan/index.html
Appendix A
Proof of results in Chapter 2
A.1 Optimality condition for wideband systems PCS
Extending Proposition 1 in [23] to wideband systems, to achieve the maximum data rate in PCS, we have K sets of identical equations, where K is the total number of sub-carriers. Each set of equations is of the form $Ax = b$ and has an optimal solution only if
$$N^t_{RF} \geq r_H[k] \qquad \text{(A.1)}$$
where $r_H[k]$ is the rank of the channel matrix corresponding to the kth subcarrier. Assuming that we have a full column rank channel matrix for all sub-carriers, we have $r_H[k] = N_t, \ \forall k \in \{1, 2, \ldots, K\}$. Therefore, it is required that $N^t_{RF} \geq N_t$ for an optimal solution to exist.
A.2 Results with varying number of streams and subcarriers
We note from equation (2.22) that the system becomes increasingly overdetermined with increase in number of data streams (Ns) and number of sub-carriers (K). We verify this by observing the spectral efficiency performance for different values of Ns and K as shown in Fig. A.1.
[Figure A.1: Spectral efficiency (bits/s/Hz) at SNR = 0 dB, with $N^t_{RF} = N^r_{RF} = 4$. (a) Spectral efficiency vs Ns with K = 2. (b) Spectral efficiency vs K with Ns = 1.]
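The condition in A.1 and the remark in A.2 about the system becoming overdetermined both rest on a standard linear algebra fact: a system Ax = b has an exact solution only when b lies in the column space of A, that is, when rank(A) = rank([A | b]). The following is a minimal Python sketch of that check on small toy systems. The matrices, the helper names `rank` and `is_consistent`, and the use of exact rational arithmetic are illustrative assumptions; they are not taken from the thesis and do not reproduce equation (2.22).

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (given as a list of rows) via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        # Look for a non-zero pivot in column c, at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate column c from all rows below the pivot row.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_consistent(A, b):
    """Ax = b has an exact solution iff rank(A) == rank([A | b])."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(A) == rank(augmented)


# Hypothetical toy systems (not from the thesis).
A_square = [[1, 2], [3, 4]]        # 2 equations, 2 unknowns, full rank
b_square = [5, 6]

A_tall = [[1, 2], [3, 4], [5, 6]]  # 3 equations, 2 unknowns (overdetermined)
b_generic = [1, 0, 0]              # not in the column space of A_tall
b_special = [3, 7, 11]             # equals A_tall applied to x = (1, 1)

print(is_consistent(A_square, b_square))   # True
print(is_consistent(A_tall, b_generic))    # False
print(is_consistent(A_tall, b_special))    # True
```

With more equations than unknowns, an exact solution exists only for special right-hand sides such as `b_special`. This is the sense in which adding constraints (more streams or sub-carriers in the thesis setting) makes an exact fit harder to obtain.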
Title | A hybrid precoding and signal detection framework for future wireless systems |
Creator | Dsouza, Kevin Bradley |
Publisher | University of British Columbia |
Date Issued | 2018 |
Description | With energy efficiency and spectrum management being major concerns in future wireless systems, this thesis primarily focuses on the precoding and signal detection capabilities of next-generation wireless transceivers. In the first part of the thesis, we present a parallel framework to make hybrid precoding competitive in fast-fading environments. To enumerate, (i) a low-complexity algorithm which exploits the block diagonal phase-only nature of the analog precoder in a partially connected structure is proposed to arrive at a hybrid precoding solution for a multi-carrier single-user system using orthogonal frequency division multiplexing (OFDM), (ii) the original problem is broken down into independent subproblems of finding the magnitude and the phase components which are solvable in parallel, (iii) a per-RF chain power constraint is introduced instead of the sum power constraint over all antennas, which is much more practical in real systems, (iv) an alternating version of this scheme is proposed for increased spectral-efficiency gains, (v) wideband PCS architecture is critiqued for its applicability in future wireless systems and possible alternatives are discussed. In the second part of the thesis, we present a signal detection and time-frequency localization framework for smart transceivers. Although deep learning techniques for image analysis have been advancing at a breakneck pace for the past few years, their application to RF data has been relatively less explored. 
To enumerate our contributions, (i) we present a modification of an existing state-of-the-art object classification technique called Faster-RCNN (FRCNN) \cite{C108} for detection and time-frequency localization of the signal in a spectrogram of a wideband RF capture, (ii) insights into the design choices pertaining to the variables such as short-time Fourier transform (STFT) parameters, spectrogram and anchor sizes and network thresholds are discussed, (iii) synthetic data as per the recently proposed WiFi High Throughput (WiFi-HT) protocol \cite{wifi_ht} is generated and a mean average precision (mAP) of up to 0.9 is achieved when the model is trained and tested on positive signal to noise ratio (SNR) values, (iv) certain drawbacks of the model with respect to low SNR levels and disparate signal sizes are brought to light and possible solutions are discussed. |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2020-01-31 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0375815 |
URI | http://hdl.handle.net/2429/68137 |
Degree | Master of Applied Science - MASc |
Program | Electrical and Computer Engineering |
Affiliation | Applied Science, Faculty of; Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2019-02 |
Campus | UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |